Computer System Architecture Lab Report: Walkthrough

3.1 Hazards in the Pipeline

I. Objectives

1. Learn to operate the WinDLX simulator and become familiar with the DLX instruction set architecture and its characteristics.
2. Deepen the understanding of the basic concepts of pipelining.
3. Learn what each stage of the basic DLX pipeline does and how it operates.
4. Deepen the understanding of data hazards and structural hazards, and of how both affect CPU performance.
5. Learn how data hazards are resolved, in particular how forwarding reduces the stalls they cause.

II. Platform

WinDLX simulator.

III. Content, Steps, and Results

1. Run the following three programs in WinDLX: the factorial program fact.s, the greatest common measure (greatest common divisor) program gcm.s, and the prime number program prim.s. Run each of them in single-step mode, continuously, and with breakpoints. Observe how the instructions move through the pipeline and inspect the contents of the CPU registers and of memory, until operating WinDLX is routine.

Summary of results. The three run modes are:

Single step: press F7, or choose Single Cycle in the Execute menu.
Continuous: press F5, or choose Run in the Execute menu.
Breakpoint: choose Code in the Window menu, which adds a Code item to the menu bar. Select the instruction at which the breakpoint should be placed, choose Set Breakpoint in the new Code menu, then press F5 to run.

Taking fact.s as the example: the Pipeline window shows which instruction each pipeline stage is currently processing, and clicking an instruction shows further details about it. The Clock Cycle Diagram shows more directly which instruction occupies which stage in a given clock cycle.

[Figure: Pipeline window and Clock Cycle Diagram of fact.s]

[Figure: Register window of fact.s]

The Register window shows the current value of every register. For example, after the first add instruction has executed, R1 = 0x00001000.

[Figure: Statistics window of fact.s after 6 cycles]

    Total: 6 cycle(s) executed; ID executed by 4 instruction(s); 4 instruction(s) currently in pipeline
    Hardware configuration: memory size 32768 bytes
        faddEX stages: 1, required cycles: 2
        fmulEX stages: 1, required cycles: 5
        fdivEX stages: 1, required cycles: 19
        Forwarding enabled
    Stalls: RAW 0 (0.00% of all cycles); WAW 0; structural 0; control 1 (16.67%); trap 0; total 1 (16.67%)
    Conditional branches: total 0

The Statistics window summarises the run, for example that 6 cycles have been executed and that 4 instructions are in the pipeline.

[Figure: Statistics window and DLX-Standard-I/O window after the complete run of fact.s; for the input 6 the program prints Factorial = 720]

Execution result of fact.s. gcm.s and prim.s behave similarly, so only their result screenshots are given.

[Figure: gcm.s result (Clock Cycle Diagram and DLX-Standard-I/O window)]
[Figure: prim.s result]

2. Run structure_d.s in WinDLX. Use the simulation to find the instruction pairs that have a structural (resource) hazard and the unit that causes it. Record the number of stall cycles caused by structural hazards and compute their share of the total execution cycles. Discuss how structural hazards affect CPU performance and how they can be resolved.

[Figure: Clock Cycle Diagram of structure_d.s]

Instructions with a structural hazard:

    addd f0, f0, f4
    addd f2, f0, f2

A read-after-write (RAW) data hazard occurs between the same two instructions, and it ends up removing the structural hazard. The diagram shows why. There is only one faddEX unit and an addition occupies it for two clock cycles, so while the first addd is in its EX stage the second addd has to wait one cycle until faddEX is free. At the same time the two instructions have a RAW dependence (the second reads the f0 that the first writes), and after the one-cycle stall no structural conflict remains.
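The effect of a single two-cycle faddEX unit can be illustrated with a very small timing model. This is only a sketch under simplifying assumptions (in-order issue, one operation reaching EX per cycle, a non-pipelined unit); the function name `ex_start_cycles` is hypothetical and the model is not WinDLX's actual scheduling logic.

```python
def ex_start_cycles(n_ops, latency, units=1):
    """Cycle in which each of n_ops back-to-back operations enters EX.

    Assumes in-order issue and non-pipelined functional units that stay
    busy for `latency` cycles once an operation has started.
    """
    free_at = [0] * units   # cycle at which each unit becomes free again
    starts = []
    ready = 0               # earliest cycle the next operation could enter EX
    for _ in range(n_ops):
        unit = min(range(units), key=free_at.__getitem__)
        start = max(ready, free_at[unit])   # wait if the unit is still busy
        free_at[unit] = start + latency
        starts.append(start)
        ready = start + 1   # the following operation is one stage behind
    return starts


if __name__ == "__main__":
    one_unit = ex_start_cycles(n_ops=2, latency=2, units=1)
    two_units = ex_start_cycles(n_ops=2, latency=2, units=2)
    # Stall of the second operation = actual start minus ideal start (cycle 1).
    print("one unit :", one_unit, "stall:", one_unit[1] - 1)
    print("two units:", two_units, "stall:", two_units[1] - 1)
```

With one unit the second addition starts one cycle late; with two units it does not, which is the "add hardware" remedy in miniature.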

[Figure: Information window for addd f2,f0,f2 (address loop+0xc, code 0x04021004)]

    IF      2 cycles   1 stall because of structural hazard
    ID      2 cycles   1 stall because of RAW hazard with addd f0,f0,f4
    faddEX  2 cycles   no stalls required; forwarding applied from addd f0,f0,f4
    MEM     1 cycle    nothing to do; no stalls required
    WB      1 cycle    no stalls required

Detailed resource-conflict information for a single instruction.

The Statistics window reports no structural stalls, but the diagram shows that a resource conflict does occur at addd f2,f0,f2. The loop runs about 6 times and the program takes 139 clock cycles in total, so structural hazards cause about 6 stall cycles, which is 6/139 = 4.3% of all execution cycles.

[Figure: Statistics window of structure_d.s]

    Total: 139 cycles executed
    Hardware configuration: faddEX stages 1, required cycles 2; fmulEX stages 1, required cycles 5; fdivEX stages 1, required cycles 19; forwarding enabled
    Stalls: RAW stalls, of which LD stalls 10 (33.33% of RAW stalls) and floating point stalls 10 (33.33% of RAW stalls); structural 0 (0.00%); control 9 (6.47%); trap 3 (2.16%); total 42 (30.22% of all cycles)
    Conditional branches: total 10 (11.63% of all instructions)
    Load/store instructions: loads 20 (100.00%), stores 0
    Floating point instructions: total 20, all additions
    Traps: 1

Statistics analysis.

Structural hazards lower CPU performance because they reduce the rate at which instructions can execute in parallel. They can be resolved by stalling for a few clock cycles, so that instructions competing for memory access or for a functional unit use it in turn, or by adding hardware so that the contested unit is no longer shared.

3. Without forwarding (clear the check mark in front of Enable Forwarding in the Configuration menu), run data_d.s in WinDLX. Record the number of stall cycles caused by data hazards and the total number of execution cycles, and compute the share of stall cycles in the total. Then, with forwarding (check Enable Forwarding), run data_d.s again, repeat the same measurements, and compute the speedup obtained from forwarding.

1) Without forwarding: total clock cycles = 202; stall cycles caused by data hazards = 104; share of stall cycles in total cycles = 51.48%.

[Figure: Statistics window of data_d.s with forwarding disabled]

    Hardware configuration: memory size 32768 bytes; faddEX stages 1, required cycles 2; fmulEX stages 1, required cycles 5; fdivEX stages 1, required cycles 19; forwarding disabled
    Stalls: WAW 0 (0.00% of all cycles); structural 0 (0.00%); control 9 (4.46%); trap 3; total 116 (57.42% of all cycles)
    Conditional branches: total 10 (11.76% of all instructions), taken 9 (90.00%), not taken 1 (10.00%)
    Load/store instructions: total 30

2) With forwarding: total clock cycles = 128; stall cycles caused by data hazards = 30; share of stall cycles in total cycles = 23.44%.
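The percentages and the speedup for data_d.s follow directly from the four measured numbers. The sketch below simply redoes that arithmetic; the helper names `stall_share` and `speedup` are hypothetical, and the cycle counts are the ones reported in this section.

```python
def stall_share(stall_cycles, total_cycles):
    """Stall cycles as a percentage of all executed cycles."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return 100.0 * stall_cycles / total_cycles


def speedup(cycles_before, cycles_after):
    """Speedup as the ratio of cycle counts (same clock assumed)."""
    return cycles_before / cycles_after


# Measured values for data_d.s as reported in this section.
NO_FORWARDING = {"total": 202, "raw_stalls": 104}
FORWARDING = {"total": 128, "raw_stalls": 30}

if __name__ == "__main__":
    for name, run in (("no forwarding", NO_FORWARDING), ("forwarding", FORWARDING)):
        share = stall_share(run["raw_stalls"], run["total"])
        print(f"{name}: {share:.2f}% of cycles are data-hazard stalls")
    print(f"speedup: {speedup(NO_FORWARDING['total'], FORWARDING['total']):.3f}")
    # Cycles saved and RAW stalls removed by forwarding, printed side by side.
    print(NO_FORWARDING["total"] - FORWARDING["total"],
          NO_FORWARDING["raw_stalls"] - FORWARDING["raw_stalls"])
```

The last line prints the same number twice: the 74 cycles saved are exactly the 74 data-hazard stalls that forwarding removes.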

[Figure: Statistics window of data_d.s with forwarding enabled]

    Hardware configuration: memory size 32768 bytes; faddEX stages 1, required cycles 2; fmulEX stages 1, required cycles 5; fdivEX stages 1, required cycles 19; forwarding enabled
    Stalls: RAW stalls, of which branch/jump stalls 10 (33.33% of RAW stalls) and floating point stalls 0 (0.00%); WAW 0 (0.00%); structural 0 (0.00%); control 9 (7.03%); trap 3 (2.34%); total 42
    Conditional branches: total 10 (11.76% of all instructions), taken 9 (90.00%), not taken 1 (10.00%)

Speedup from forwarding = 202/128 = 1.578.

IV. Reflections

Simulating and analysing instructions with WinDLX made us more familiar with how a pipeline executes. It also gave us a clear picture of why problems such as structural hazards and data hazards arise during execution, and deepened our understanding of pipelining.

3.2 Loop Unrolling and Instruction Scheduling

I. Objectives

1. Deepen the understanding of loop-level parallelism, instruction scheduling, loop unrolling, and register renaming.
2. Become familiar with using instruction scheduling to resolve data hazards in the pipeline.
3. Understand how loop unrolling and instruction scheduling improve CPU performance.

II. Platform

WinDLX simulator.

III. Content, Steps, and Results

1. Resolving structural and data hazards in the pipeline with instruction scheduling

(1) Write a DLX assembly source file *.s that contains both data hazards and structural hazards (assumption: there are 2 addition units, 2 multiplication units, and 2 division units, each with a latency of 3 clock cycles).
(2) In the Configuration menu, use the Floating Point Stages option to set the number of addition, multiplication, and division units to 2 and every latency to 3 clock cycles.
(3) Run the program in WinDLX. Record how often each kind of hazard occurs, which instruction combinations cause it, and the total number of clock cycles.
(4) Schedule the instructions of the program to remove the hazards.
(5) Run the scheduled program in WinDLX, observe how it moves through the pipeline, and record the total number of clock cycles.
(6) Compare the performance before and after scheduling, and discuss what instruction scheduling contributes to CPU performance.

1) Code:

    divf  f2, f5, f6
    divf  f1, f2, f6
    divf  f3, f1, f5
    divf  f0, f4, f7
    addf  f14, f0, f6
    addf  f15, f5, f7
    multf f20, f4, f6
    multf f21, f5, f7

2) Set the number of functional units and their cycle counts.

[Figure: Floating Point Stage Configuration dialog; Addition Units, Multiplication Units, and Division Units each set to Count 2, Delay 3]

The following data hazards occur.

[Figure: instruction information, ID stage: 2 stall(s) because of a RAW hazard with an earlier divf]

Read-after-write (RAW) hazard.

[Figure: Pipeline window]

[Figure: instruction information, ID stage: 2 stall(s) because of a RAW hazard with divf f0,f4,f7]

Because there are only two division units, a functional-unit conflict occurs as well.

The total execution time is 38 cycles.

Code after instruction scheduling. Independent instructions are placed next to each other and dependent instructions are moved apart, to avoid data hazards as far as possible:

    divf  f2, f5, f6
    multf f20, f4, f6
    multf f21, f5, f7
    divf  f1, f2, f6
    addf  f15, f5, f7
    divf  f3, f1, f5
    divf  f0, f4, f7
    addf  f14, f0, f6
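One way to see what the reordering buys is to measure, for every RAW dependence, how many instruction slots separate the producer from the consumer. The following is a minimal sketch assuming the three-operand form `op dst,src1,src2` used in the listings; the helper names `parse` and `raw_distances` are hypothetical, and the distance is only a rough proxy for the stalls WinDLX actually reports.

```python
def parse(line):
    """Split 'op dst,src1,src2' into (op, dst, [src1, src2])."""
    op, operands = line.split(None, 1)
    dst, *srcs = [reg.strip() for reg in operands.split(",")]
    return op, dst, srcs


def raw_distances(program):
    """Return (producer index, consumer index, distance) for each RAW dependence."""
    last_writer = {}   # register name -> index of the latest instruction writing it
    deps = []
    for i, line in enumerate(program):
        _, dst, srcs = parse(line)
        for src in dict.fromkeys(srcs):      # each source register once
            if src in last_writer:
                deps.append((last_writer[src], i, i - last_writer[src]))
        last_writer[dst] = i
    return deps


ORIGINAL = [
    "divf f2,f5,f6", "divf f1,f2,f6", "divf f3,f1,f5", "divf f0,f4,f7",
    "addf f14,f0,f6", "addf f15,f5,f7", "multf f20,f4,f6", "multf f21,f5,f7",
]
SCHEDULED = [
    "divf f2,f5,f6", "multf f20,f4,f6", "multf f21,f5,f7", "divf f1,f2,f6",
    "addf f15,f5,f7", "divf f3,f1,f5", "divf f0,f4,f7", "addf f14,f0,f6",
]

if __name__ == "__main__":
    print("original :", [d for _, _, d in raw_distances(ORIGINAL)])
    print("scheduled:", [d for _, _, d in raw_distances(SCHEDULED)])
```

In the original order all three dependences are between adjacent instructions; after scheduling only the last pair (divf f0 followed by addf f14) is still adjacent, which is consistent with some RAW stalls remaining in the scheduled run.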

[Figure: Pipeline window of the scheduled program]

[Figure: Statistics window of the scheduled program]

    Hardware configuration: memory size 32768 bytes; 2 faddEX, 2 fmulEX, and 2 fdivEX stages as configured above; forwarding enabled
    Stalls: RAW 3 (8.57% of all cycles), all of them floating point stalls (100.00% of RAW stalls); WAW 0; structural 0 (0.00%); control 0 (0.00%); trap 7 (20.00%); total 10 (28.57% of all cycles)
    Conditional branches: total 0

The total execution time is 35 clock cycles.

(6) After scheduling there are fewer data hazards and fewer clock cycles in total, so performance improves. The program takes 38 cycles before scheduling and 35 cycles after, a speedup of 38/35 ≈ 1.09.

2. Improving performance with loop unrolling, register renaming, and instruction scheduling

(1) Write a DLX assembly source file *.s that contains a simple loop whose iteration count is a multiple of 4.
(2) Run the program in WinDLX. Record how often each kind of hazard occurs and the total number of clock cycles.
(3) Unroll the loop 3 times, so that code made of 4 copies of the loop body replaces the original body, and adjust the rest of the program accordingly. Then apply register renaming and instruction scheduling to the new loop body.
(4) Run the modified program in WinDLX. Record how often each kind of hazard occurs and the total number of clock cycles.
(5) Compare the performance before and after loop unrolling and instruction scheduling.

Code with the loop. It adds 1 four times and leaves the sum in r2:

        .text
        .global main
    main:
        addi r1, r0, #4
        addi r2, r0, #0
    Loop:
        sgt  r3, r1, r0
        bnez r3, Sub1
        trap 0
    Sub1:
        addi r2, r2, #1
        subi r1, r1, #1
        j    Loop

Result: 42 clock cycles in total, 5 RAW stalls, 4 loop iterations, r2 = 4.

[Figure: Clock Cycle Diagram and Statistics window of the loop version; 42 cycles executed, ID executed by 25 instructions]

[Figure: Register window and Statistics window of the loop version; RAW stalls: 5 (11.90% of all cycles), all of them branch/jump stalls]

Loop unrolled. Code:

        .text
        .global main
    main:
        addi r1, r0, #4
        addi r2, r0, #0
        addi r2, r2, #1
        subi r1, r1, #1
        addi r2, r2, #1
        subi r1, r1, #1
        addi r2, r2, #1
        subi r1, r1, #1
        addi r2, r2, #1
        subi r1, r1, #1
        trap 0

Result: the loop body is executed 4 times and r2 = 4; 15 clock cycles in total, 0 RAW stalls.
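The two versions can be checked against each other with a small register-level sketch. It mirrors the two listings statement by statement and counts executed instructions; it does not model the pipeline, so the cycle counts (42 and 15) are the measured values from this report rather than something the sketch computes. The function names are hypothetical.

```python
def run_loop(n=4):
    """Mirror of the loop version: returns (r2, executed instruction count)."""
    r1, r2 = n, 0                    # addi r1,r0,#n ; addi r2,r0,#0
    executed = 2
    while True:
        r3 = 1 if r1 > 0 else 0      # sgt  r3,r1,r0
        executed += 2                # sgt + bnez
        if r3 == 0:
            executed += 1            # trap 0
            return r2, executed
        r2 += 1                      # addi r2,r2,#1
        r1 -= 1                      # subi r1,r1,#1
        executed += 3                # addi + subi + j Loop


def run_unrolled(n=4):
    """Mirror of the fully unrolled version: no compare, branch, or jump."""
    r1, r2 = n, 0
    executed = 2
    for _ in range(n):               # the n copies written out in the listing
        r2 += 1
        r1 -= 1
        executed += 2
    return r2, executed + 1          # + trap 0


if __name__ == "__main__":
    print("loop    :", run_loop())
    print("unrolled:", run_unrolled())
    # Measured cycle counts reported above (not computed here).
    print("speedup from unrolling:", 42 / 15)
```

Both versions leave 4 in r2, and the instruction counts (25 and 11) agree with the "ID executed by" figures visible in the two Statistics screenshots.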

[Figure: Clock Cycle Diagram, Statistics window, and Register window of the unrolled version; 15 cycles executed, ID executed by 11 instructions, RAW stalls: 0 (0.00% of all cycles)]

[Figure: Register window of the unrolled version; R2 = 0x00000004]

Comparison of causes: in the loop version, every jump back to Loop is followed by a one-instruction delay. The instruction fetched after the jump is aborted and appears in the diagram as a nop.

[Figure: Clock Cycle Diagram of the loop version showing the aborted fetch (nop) after j Loop]

IV. Summary

a. Instruction scheduling noticeably improves the efficiency of instruction execution: it keeps the functional units as busy as possible.
b. When a loop executes, dependences arise between different iterations and the number of hazards grows. Techniques such as register renaming reduce the number of these dependences.

3.3 Cache Performance Analysis

I. Objectives

1. Deepen the understanding of the basic concepts, basic organisation, and basic working principle of a cache.
2. Understand how capacity, associativity, and block size affect cache performance.
3. Learn the methods for lowering the cache miss rate and what each of them contributes to cache performance.
4. Understand why cache misses occur and the three kinds of cache miss.
5. Understand the basic ideas of LRU and random replacement and how they affect cache performance.

II. Platform

SimpleScalar simulator.

III. Content and Steps

1. Run a program under the default configuration (state which test program is used). Count the total number of cache misses and the number of misses of each of the three kinds.

After the environment has been set up, the a.out file built from hello.c is used for the demonstration.

    [root@localhost root]# cd simplescalar
    [root@localhost simplescalar]# ./simplesim-3.0/sim-cache a.out
    dl1.accesses         4207   # total number of accesses
    dl1.hits             3749   # total number of hits
    dl1.misses            458   # total number of misses
    dl1.replacements      202   # total number of replacements
    dl1.writebacks        193   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.1089   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.0480   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.0459   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)

From the output: the total number of cache misses is 458. Capacity misses and conflict misses both cause a replacement, and there are 202 replacements in total, so the number of compulsory misses is 458 - 202 = 256.

2. Change the cache capacity (x2, x4, x8, x64) and run the program (state which test program is used). Count the misses of each kind and analyse how capacity affects cache performance.

The test program is test-math. Example of a cache setting: -cache:dl1 dl1:2:32:4:r. After the cache name, the first parameter is the number of sets, the second is the block size, the third is the associativity, and the last is the replacement policy (l selects LRU and r selects random; the example shows r). To study the effect of capacity, the number of sets is varied while the block size is fixed at 32 B, the associativity at 4-way, and the policy at LRU.

x2: capacity set to 2*32*4 B. Result:

    dl1.accesses        57466   # total number of accesses
    dl1.hits            50201   # total number of hits
    dl1.misses           7265   # total number of misses
    dl1.replacements     7257   # total number of replacements
    dl1.writebacks       4598   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.1264   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.1263   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.0800   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)

x4: capacity set to 4*32*4 B. Result:

    dl1.accesses        57466   # total number of accesses
    dl1.hits            53189   # total number of hits
    dl1.misses           4277   # total number of misses
    dl1.replacements     4261   # total number of replacements
    dl1.writebacks       2692   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.0744   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.0741   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.0468   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)

x8: capacity set to 8*32*4 B. Result:

    dl1.accesses        57466   # total number of accesses
    dl1.hits            55280   # total number of hits
    dl1.misses           2186   # total number of misses
    dl1.replacements     2154   # total number of replacements
    dl1.writebacks       1493   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.0380   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.0375   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.0260   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)

x64: capacity set to 64*32*4 B. Result:

    dl1.accesses        57466   # total number of accesses
    dl1.hits            56891   # total number of hits
    dl1.misses            575   # total number of misses
    dl1.replacements      319   # total number of replacements
    dl1.writebacks        307   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.0100   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.0056   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.0053   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)

The useful figures from the output above are collected in the following table.

    Capacity   Miss rate   Total misses   Capacity + conflict misses   Compulsory misses
    x2         0.1264      7265           7257                         8
    x4         0.0744      4277           4261                         16
    x8         0.0380      2186           2154                         32
    x64        0.0100      575            319                          256

Conclusion: as the cache capacity grows, the miss rate, the total number of misses, and the number of capacity and conflict misses all fall, while the number of compulsory misses rises. The compulsory count is computed here as misses minus replacements, so it counts the misses that filled a still-empty line; that is why in every row it equals the number of lines in the cache (sets x associativity) and grows with capacity.

3. Change the cache associativity (1-way, 2-way, 4-way, 8-way, 64-way) and run the program (state which test program is used). Count the misses of each kind and analyse how associativity affects cache performance.

All other parameters are fixed and the test program is test-printf. The setting is dl1:2:32:A:l with A = 1, 2, 4, 8, 64; only the associativity parameter is changed, to observe its effect on cache performance.

1-way:

    dl1.accesses       531424   # total number of accesses
    dl1.hits           308140   # total number of hits
    dl1.misses         223284   # total number of misses
    dl1.replacements   223282   # total number of replacements
    dl1.writebacks      83743   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.4202   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.4202   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.1576   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)

2-way:

    dl1.accesses       531424   # total number of accesses
    dl1.hits           395208   # total number of hits
    dl1.misses         136216   # total number of misses
    dl1.replacements   136212   # total number of replacements
    dl1.writebacks      62880   # total number of writebacks
    dl1.invalidations       0   # total number of invalidations
    dl1.miss_rate      0.2563   # miss rate (i.e., misses/ref)
    dl1.repl_rate      0.2563   # replacement rate (i.e., repls/ref)
    dl1.wb_rate        0.1183   # writeback rate (i.e., wrbks/ref)
    dl1.inv_rate       0.0000   # invalidation rate (i.e., invs/ref)
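The derived quantities used in this section (capacity, miss rate, and misses minus replacements) can be recomputed from the raw counters. The sketch below assumes the `<name>:<nsets>:<bsize>:<assoc>:<repl>` layout of the configuration string described above; the class `CacheRun` and the helper `capacity_bytes` are hypothetical names, and the counters are the ones from the capacity experiment.

```python
from dataclasses import dataclass


def capacity_bytes(config):
    """Capacity in bytes of a cache described as name:nsets:bsize:assoc:repl."""
    _name, nsets, bsize, assoc, _repl = config.split(":")
    return int(nsets) * int(bsize) * int(assoc)


@dataclass
class CacheRun:
    config: str
    accesses: int
    misses: int
    replacements: int

    @property
    def miss_rate(self):
        return self.misses / self.accesses

    @property
    def fills_without_replacement(self):
        # Misses that filled an empty line: the report's "compulsory" estimate.
        return self.misses - self.replacements


# Counters from the test-math runs (4-way, 32-byte blocks, LRU).
RUNS = [
    CacheRun("dl1:2:32:4:l", 57466, 7265, 7257),
    CacheRun("dl1:4:32:4:l", 57466, 4277, 4261),
    CacheRun("dl1:8:32:4:l", 57466, 2186, 2154),
    CacheRun("dl1:64:32:4:l", 57466, 575, 319),
]

if __name__ == "__main__":
    for run in RUNS:
        print(f"{run.config:14s} {capacity_bytes(run.config):5d} B  "
              f"miss rate {run.miss_rate:.4f}  "
              f"misses - replacements = {run.fills_without_replacement}")
```

The same two properties can be applied to the associativity runs by adding their counters to `RUNS`.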
