DCN-FatTree: Advanced Computer Networking (complete original courseware)

Advanced Computer Networking

Scope
• Cutting-edge technological trends in computer networking over the past few years.
• Time: Tuesday 3, 4, 5 (9:45-12:10, weeks 1-16); Thursday 1, 2 (7:50-9:25, weeks 8-12)
• Venue:
  • Online:
  • Offline: 3B202
• Score:
  • 60%: ~3 course projects
  • 40%: final exam (USTC mandatory requirement)

A Scalable, Commodity Data Center Network Architecture
Mohammad Al-Fares, Alexander Loukissas, Amin Vahdat
SIGCOMM 2008
Presented by Ye Tian for Course CS05112

Outline
• Background
• Fat tree based solution
• Implementation and evaluation
• Review

Data centers
• Clusters of thousands of computers
• Where the Internet lives
• Google data center: /about/datacenters/inside/streetview/
• Microsoft underwater data center
• Tencent Gui'an Qixing Data Center
• Alibaba Qiandao Lake Data Center
• Data center racks

What is a data center?

DC Communications
• Plenty of M2M communications; the principal bottleneck in large-scale clusters is often inter-node communication bandwidth.
• MapReduce: must perform significant data shuffling to transport the output of its map phase before proceeding with its reduce phase.
• Web search engine: often requires parallel communication with every node in the cluster hosting the inverted index to return the most relevant results.
• Managed by one single authority.

Two approaches for DC network
• Approach 1: Specialized hardware and communication protocols
  • For example: InfiniBand, Myrinet
  • Do not leverage commodity parts; expensive
  • Not compatible with TCP/IP applications
• Approach 2: Leverage commodity Ethernet switches and routers to interconnect cluster machines
  • Unmodified applications, OS, and hardware
  • But how?

Desired Properties for a DC Network Architecture
• Scalable interconnection bandwidth: an arbitrary host in the data center can communicate with any other host in the network at the full bandwidth of its local network interface.
• Economies of scale: make cheap off-the-shelf Ethernet switches the basis for large-scale data center networks.
• Backward compatibility: the entire system should be backward compatible with hosts running Ethernet and IP.

Current Data Center Network Topologies
• Three tiers: core, aggregation, edge (ToR switch)
• Two types of switches:
  • 48-port GigE switch with four 10 GigE uplinks, used at the edge of the tree
  • 128-port 10 GigE switch, used for the higher levels of the communication hierarchy

Problems of the Topology
• Oversubscription: the ratio of the worst-case achievable aggregate bandwidth among the end hosts to the total bisection bandwidth of a particular communication topology.
  • Ideal: 1:1, all hosts may potentially communicate with arbitrary other hosts at the full bandwidth of their network interface.
  • Typical designs are oversubscribed by a factor of 2.5:1 to 8:1.
• Multi-path routing: delivering full bandwidth between arbitrary hosts in larger clusters requires a "multi-rooted" tree with multiple core switches.
  • ECMP performs static load-splitting among flows.
  • Path multiplicity is limited to 8-16.
• Cost: to guarantee a given oversubscription ratio, cost rises sharply with scale.
  • Using the largest 10 GigE and GigE switches to build a data center with 1:1 oversubscription, a cluster can have at most 27,648 hosts.

Fat-tree
• k pods, each containing two layers of k/2 switches.
• Each k-port switch in the lower layer is directly connected to k/2 hosts.
• Each of the remaining k/2 ports is connected to k/2 of the k ports in the aggregation layer.
• (k/2)^2 k-port core switches. Each has one port connected to each of the k pods.
• The i-th port of any core switch is connected to pod i such that consecutive ports in the aggregation layer of each pod switch are connected to core switches on (k/2) strides.
• Focus on designs up to k = 48, using identical 48-port GigE switches.
  • The network supports 27,648 hosts, made up of 1,152 subnets with 24 hosts each.
  • There are 576 equal-cost paths between any given pair of hosts in different pods.
  • The cost of deploying such a network architecture would be $8.64M, compared to $37M for the traditional techniques.

Architecture Design Motivation
• There are (k/2)^2 shortest paths between any two hosts in different pods, but only one is chosen.
  • Each path has 5 hops.
  • Protocols like OSPF select paths based on hop count.
• It is possible for a small subset of core switches, perhaps only one, to be chosen as the intermediate links between pods.
• Need a simple, fine-grained method of traffic diffusion.

Addressing
• All IP addresses in the network are within the private /8 block.
• The pod switches are given addresses of the form 10.pod.switch.1, where pod denotes the pod number (in [0, k-1]) and switch denotes the position of that switch in the pod (in [0, k-1], starting from left to right, bottom to top).
• The core switches are given addresses of the form 10.k.j.i, where j and i denote that switch's coordinates in the (k/2)^2 core switch grid (each in [1, k/2], starting from top-left).

Two-level Routing Table
• Each entry in the main routing table will potentially have an additional pointer to a small secondary table of (suffix, port) entries.
• A first-level prefix is terminating if it does not contain any second-level suffixes.
• A secondary table may be pointed to by more than one first-level prefix.
• Entries in the primary table are left-handed (i.e., /m prefix masks of the form 1^m 0^(32-m)); entries in the secondary tables are right-handed (i.e., /m suffix masks of the form 0^(32-m) 1^m).
• If the longest-matching prefix search yields a non-terminating prefix, then the longest-matching suffix in the secondary table is found and used.
• The routing table of any pod switch will contain no more than k/2 prefixes and k/2 suffixes.

Routing Algorithm
• Pod switches
  • If a host sends a packet to another host in the same pod but on a different subnet, then all upper-level switches in that pod will have a terminating prefix pointing to the destination subnet's switch.
  • For all other outgoing inter-pod traffic, the pod switches have a default /0 prefix with a secondary table matching host IDs.
  • The host IDs are employed as a source of deterministic entropy; they cause traffic to be spread evenly upward among the outgoing links to the core switches.
• Aggregation switches
  • Once a packet reaches its destination pod, the receiving upper-level pod switch will also include a (10.pod.switch.0/24, port) prefix to direct that packet to its destination subnet switch, where it is finally switched to its destination host.
  • Generating the routing table of an upper (aggregation) switch; for lower switches, omit lines 3-5.
• Core switches
  • Once a packet reaches a core switch, there is exactly one link to its destination pod, and that switch will include a terminating /16 prefix for the pod of that packet (10.pod.0.0/16, port).

An Example
• Source: ; destination:
• At the gateway switch (), the packet matches the /0 first-level prefix, then the /8 second-level suffix; it is forwarded on port 2 and routed to the pod switch. (i=3, z=1)
• At the gateway switch (), the packet matches the /0 first-level prefix, then the /8 second-level suffix; it is forwarded on port 2 and routed to the pod switch. (i=3, z=2)
• At , the packet matches a terminating /16 prefix, which points to pod 2 on port 2, and switch.
• At , the packet matches a terminating /24 prefix, which points to the switch responsible for that subnet, on port 0.
• What if the destination becomes ?
• Centralized algorithm, not a distributed one. Why is this feasible?

Flow Classification
• The two-level routing technique is static, but traffic is not evenly distributed among the hosts.
• Edge switch
  • Recognize subsequent packets of the same flow, and forward them on the same outgoing port. (Packets with the same <srcIP, dstIP, srcport, dstport, proto> belong to the same flow.)
  • Periodically reassign a minimal number of flow output ports to minimize any disparity between the aggregate flow capacity of different ports.

Flow Scheduling
• Traffic is dominated by a few large, long-lived flows.
• Edge switch
  • Additionally detect any outgoing flow whose size grows above a predefined threshold, and periodically send notifications to a central scheduler specifying the source and destination for all active large flows.
• Central scheduler
  • Maintains boolean state for all links in the network, signifying their availability to carry large flows.
  • When the scheduler receives a notification of a new flow, it linearly searches through the core switches to find one whose corresponding path components do not include a reserved link.
  • Upon finding such a path, the scheduler marks those links as reserved, and notifies the relevant lower- and upper-layer switches in the source pod with the correct outgoing port that corresponds to that flow's chosen path.

Power and Heat Issues
• Energy efficiency of different switches; the last three are 10 GigE switches.
• Although the fat-tree employs more individual switches, its power consumption and heat dissipation are superior to those of current data center designs, with 56.6% less power consumption and 56.5% less heat dissipation.

Implementation
• Implement a router prototype with Click.
  • Supports the two-level routing table
  • 4 ports
• The Click Modular Router Project
  • A software architecture for building flexible and configurable routers
  • /kohler/click

Experiment Description
• Implement a 4-port fat-tree (k = 4): there are 16 hosts, four pods (each with four switches), and four core switches.
• Multiplex these 36 elements onto ten physical machines, interconnected by a 48-port ProCurve 2900 switch with 1 Gigabit Ethernet links.
• Each pod of switches is hosted on one machine; each pod's hosts are hosted on one machine; and the two remaining machines run two core switches each.
• For the comparison case of the hierarchical tree network: four machines running four hosts each, and four machines each running four pod switches with one additional uplink.

Benchmark Suite
• Random: A host sends to any other host in the network with uniform probability.
• Stride(i): A host with index x will send to the host with index (x + i) mod 16.
• Staggered Prob (SubnetP, PodP): Where a host will se
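
The sizing figures quoted on the Fat-tree and Experiment Description slides (27,648 hosts, 1,152 subnets, and 576 paths for k = 48; 36 elements for k = 4) follow directly from the construction rules. The sketch below recomputes them; the function name `fat_tree_size` and the dictionary keys are illustrative choices, not names from the slides or the paper.

```python
def fat_tree_size(k: int) -> dict:
    """Element counts for a fat-tree built from identical k-port switches.

    Hypothetical helper: the counting rules come from the slides, the
    function and key names are assumptions made for illustration.
    """
    if k < 2 or k % 2:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    edge = k * half          # k pods, each with k/2 lower-layer (edge) switches
    aggregation = k * half   # k pods, each with k/2 upper-layer switches
    core = half * half       # (k/2)^2 core switches
    hosts = edge * half      # each edge switch serves k/2 hosts, i.e. k^3/4
    return {
        "pods": k,
        "edge_switches": edge,
        "aggregation_switches": aggregation,
        "core_switches": core,
        "hosts": hosts,
        "subnets": edge,              # one subnet per edge switch
        "hosts_per_subnet": half,
        "inter_pod_paths": core,      # one shortest path through each core switch
    }


if __name__ == "__main__":
    for k in (4, 48):
        size = fat_tree_size(k)
        switches = (size["edge_switches"] + size["aggregation_switches"]
                    + size["core_switches"])
        print(k, size["hosts"], switches, size["inter_pod_paths"])
```

For k = 4 the helper gives 16 hosts and 20 switches, matching the 36 elements multiplexed onto the test machines.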

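The two-level lookup described on the Two-level Routing Table slides (longest-matching prefix first, then longest-matching suffix when the prefix is non-terminating) can be sketched in a few lines. This is a minimal software model, not the Click implementation: the table layout, the function names, and the sample entries for an upper-level switch in pod 2 of a k = 4 fat-tree are assumptions made for illustration. In particular, the host IDs 2 and 3 assume hosts are numbered from 2 within each subnet, which the slides do not state.

```python
import ipaddress


def ip_to_int(addr: str) -> int:
    """Convert dotted-quad IPv4 text to a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def prefix_match(dst: int, prefix: int, m: int) -> bool:
    """Left-handed /m mask: m one-bits followed by 32-m zero-bits."""
    mask = ((1 << m) - 1) << (32 - m)
    return (dst & mask) == (prefix & mask)


def suffix_match(dst: int, suffix: int, m: int) -> bool:
    """Right-handed /m mask: 32-m zero-bits followed by m one-bits."""
    mask = (1 << m) - 1
    return (dst & mask) == (suffix & mask)


def lookup(table, dst_addr: str):
    """Return the output port for dst_addr, or None if nothing matches."""
    dst = ip_to_int(dst_addr)
    best = None
    for entry in table:
        if prefix_match(dst, ip_to_int(entry["prefix"]), entry["len"]):
            if best is None or entry["len"] > best["len"]:
                best = entry
    if best is None:
        return None
    if best["port"] is not None:      # terminating prefix: done
        return best["port"]
    chosen = None                     # non-terminating: consult secondary table
    for suffix, m, port in best["secondary"]:
        if suffix_match(dst, ip_to_int(suffix), m):
            if chosen is None or m > chosen[0]:
                chosen = (m, port)
    return chosen[1] if chosen else None


# Illustrative table for an upper-level switch in pod 2, k = 4 (assumed values).
TABLE = [
    {"prefix": "10.2.0.0", "len": 24, "port": 0, "secondary": []},
    {"prefix": "10.2.1.0", "len": 24, "port": 1, "secondary": []},
    {"prefix": "0.0.0.0", "len": 0, "port": None,
     "secondary": [("0.0.0.2", 8, 2), ("0.0.0.3", 8, 3)]},
]

if __name__ == "__main__":
    for dst in ("10.2.0.3", "10.2.1.2", "10.3.1.2", "10.0.0.3"):
        print(dst, "-> port", lookup(TABLE, dst))
```

Intra-pod destinations hit a terminating /24 prefix, while inter-pod destinations fall through to the /0 entry and are spread across uplinks by host ID, which is the "deterministic entropy" the Routing Algorithm slides refer to.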