版權(quán)說明:本文檔由用戶提供并上傳,收益歸屬內(nèi)容提供方,若內(nèi)容存在侵權(quán),請進(jìn)行舉報或認(rèn)領(lǐng)
文檔簡介
軟件工程基礎(chǔ)Hadoop生態(tài)系統(tǒng)劉馳軟件工程基礎(chǔ)1AnEcosystemforCloudComputing222ProblemBatch(offline)processingofhugedatasetusingcommodityhardwareisnotenoughforreal-timeapplicationsStrongdesireforlinearscalabilityNeedinfrastructuretohandleallmechanicsallowdeveloperstofocusontheprocessinglogic/algorithms3ProblemBatch(offline)process3ExplosiveData!–StorageNewYorkStockExchange:1TBdataperdayFacebook:100billionphotos,1PB(1000TB)InternetArchive:2PBdata,growingby20TBpermonthCan’tputdataonaSINGLEnodeStrongneedsfordistributedfilesystemsExplosiveData!–StorageNewY45Java/Python/Cinterfaces5Java/Python/Cinterfaces5666CommercialHardware典型的2層構(gòu)架–節(jié)點(diǎn)是普通的商業(yè)PC機(jī)–30-40節(jié)點(diǎn)/Rack–頂層到Rack帶寬3-4Gbps–Rack到節(jié)點(diǎn)帶寬1Gbps7CommercialHardware典型的2層構(gòu)架77Whois(was)UsingHadoop?8Whois(was)UsingHadoop?88Example:Facebook的Hadoop集群產(chǎn)品集群4800個內(nèi)核,600個機(jī)器,每個機(jī)器16GB—2009年4月8000個內(nèi)核,1000個機(jī)器,每個機(jī)器32GB—2009年7月每個機(jī)器擁有4個1TB大小的SATA硬盤兩層網(wǎng)絡(luò)結(jié)構(gòu),每個Rack有40個機(jī)器整個集群大小為2PB,未來還會不斷增加測試集群800個內(nèi)核,每個16GB9Example:Facebook的Hadoop集群產(chǎn)品集群9ADistributedFileSystem軟件工程基礎(chǔ)Hadoop生態(tài)系統(tǒng)劉馳AnEcosystemforCloudComputing課件10Single-NodeArchitectureMemoryDiskCPUMachineLearning,Statistics“Classical”DataMining11Single-NodeArchitectureMemory11CommodityClustersWebdatasetscanbeverylargeTenstohundredsofTBCannotmineonasingleserverStandardarchitectureemerging:ClusterofcommodityLinuxnodesGigabitEthernetinterconnectHowtoorganizecomputationsonthisarchitecture?Maskissuessuchashardwarefailure12CommodityClustersWebdataset12ClusterArchitectureMemDiskCPUMemDiskCPU…SwitchEachrackcontains16-64nodesMemDiskCPUMemDiskCPU…SwitchSwitch1Gbpsbetweenanypairofnodesinarack2-10Gbpsbackbonebetweenracks13ClusterArchitectureMemDiskCPU13StablestorageFirstorderproblem:ifnodescanfail,howcanwestoredatapersistently?Answer:DistributedFileSystemProvidesglobalfilenamespaceGoogleGFS;HadoopHDFS;KosmixKFSTypicalusagepatternHugefiles(100sofGBtoTB)DataisrarelyupdatedinplaceReadsandappendsarecommon14StablestorageFirstorderprob14151515161616NamenodeandDatanodesMaster/slavearchitecture1Namenode,amasterserverthatmanagesthefilesystemnamespaceandregulatesaccesstofilesbyclients.manyDataNodesusuallyonepernodeinacluster.managestorageservesread,writerequests,performsblockcreation,deletion,andreplicationuponinstructionfromNamenode.HDFSexposesafilesystemnamespaceandallowsuserdatatobestoredinfiles.AfileissplitintooneormoreblocksandsetofblocksarestoredinDataNodes.2023/9/2717NamenodeandDatanodesMaster/s17Namespace2023/9/27HierarchicalfilesystemwithdirectoriesandfilesCreate,remove,move,renameetc.NamenodemaintainsthefilesystemAnymetainformationchangestothefilesystemrecordedbytheNamenode.Anapplicationcanspecifythenumberofreplicasofthefileneeded:replicationfactorofthefile.ThisinformationisstoredintheNamenode.18Namespace2023/8/8Hierarchical18DataReplication2023/9/27Storeverylargefilesacrossmachinesinalargecluster.Eachfileisasequenceofblocksofsamesize.Blocksarereplicated2-3times.Blocksizeandreplicasareconfigurableperfile.NamenodereceivesaHeartbeatandaBlockReportfromeachDataNodeinthecluster.BlockReportcontainsalltheblocksonaDatanode.19DataReplication2023/8/8Store19ReplicaPlacement2023/9/27Rack-aware:Goal:improvereliability,availabilityandnetworkbandwidthutilizationResearchtopicNamenodedeterminestherackidforeachDataNode.Replicasareplaced:1inalocalrack,1onadifferentnodeinthelocalrackand1onanodeinadifferentrack.1/3ofthereplicaonanode,2/3onarackand1/3distributedevenlyacrossremainingracks.20ReplicaPlacement2023/8/8Rack-20HDFS:DataNodeDistance21HDFS:DataNodeDistance2121ReplicationPipeliningWhentheclientreceivesresponsefromNamenode,itflushesitsblockinsmallpieces(4K)tothefirstreplica,thatinturncopiesittothenextreplicaandsoon.ThusdataispipelinedfromDatanodetothenext.2023/9/2722ReplicationPipeliningWhenthe22ReplicaSelection2023/9/27ReplicaselectionforREADoperation:HDFStriestominimizethebandwidthconsumptionandlatency.IfthereisareplicaontheReadernodethenthatispreferred.HDFSclustermayspanmultipledatacenters:replicainthelocaldatacenterispreferredovertheremoteone.23ReplicaSelection2023/8/8Repl23Datanode2023/9/27ADatanodestoresdatainfilesinitslocalfilesystem.DatanodehasnoknowledgeaboutHDFSfilesystemItstoreseachblockofHDFSdatainaseparatefile.Datanodedoesnotcreateallfilesinthesamedirectory.Itusesheuristicstodetermineoptimalnumberoffilesperdirectoryandcreatesdirectoriesappropriately:Researchissue?WhenthefilesystemstartsupitgeneratesalistofallHDFSblocksandsendthisreporttoNamenode:Blockreport.24Datanode2023/8/8ADatanodesto24HDFS:FileRead25HDFS:FileRead2525HDFS:FileWrite26HDFS:FileWrite2626CommunicationProtocol2023/9/27AllprotocolsarelayeredontopoftheTCP/IPprotocolAclientestablishesaconnectiontoaconfigurableTCPportontheNamenodemachine.IttalksClientProtocolwiththeNamenode.DatanodestalktotheNamenodeusingDatanodeprotocol.RPCabstractionwrapsbothClientProtocolandDatanodeprotocol.Namenodeissimplyaserverandneverinitiatesarequest;itonlyrespondstoRPCrequestsissuedbyDataNodesorclients.27CommunicationProtocol2023/8/827DataNodeFailureandHeartbeatDatanodesloseconnectivitywithNamenode.NamenodedetectsthisconditionbytheabsenceofaHeartbeatmessage.NamenodemarksDatanodeswithoutHearbeatanddoesnotsendanyIOrequeststothem.AnydataregisteredtothefailedDatanodeisnotavailabletotheHDFS.2023/9/2728DataNodeFailureandHeartbeat28ClusterRebalancingHDFSarchitectureiscompatiblewithdatarebalancingschemes.AschememightmovedatafromoneDatanodetoanotherifthefreespaceonaDatanodefallsbelowacertainthreshold.Intheeventofasuddenhighdemandforaparticularfile,aschememightdynamicallycreateadditionalreplicasandrebalanceotherdatainthecluster.Thesetypesofdatarebalancingarenotyetimplemented:researchissue.2023/9/2729ClusterRebalancingHDFSarchit29APIsHDFSprovidesJavaAPIforapplicationtouse.Pythonaccessisalsousedinmanyapplications.AClanguagewrapperforJavaAPIisalsoavailable.AHTTPbrowsercanbeusedtobrowsethefilesofaHDFSinstance.2023/9/2730APIsHDFSprovidesJavaAPIfor30FSShell,AdminandBrowserInterfaceHDFSorganizesitsdatainfilesanddirectories.ItprovidesacommandlineinterfacecalledtheFSshellthatletstheuserinteractwithdataintheHDFS.Thesyntaxofthecommandsissimilartobashandcsh.Example:tocreateadirectory/foodir/bin/hadoopdfs–mkdir/foodirThereisalsoDFSAdmininterfaceavailableBrowserinterfaceisalsoavailabletoviewthenamespace.2023/9/2731FSShell,AdminandBrowserIn31ADistributedComputationFrameworkforBatchProcessing軟件工程基礎(chǔ)Hadoop生態(tài)系統(tǒng)劉馳AnEcosystemforCloudComputing課件32WhatisMap/Reduce?AProgrammingModelDecomposeaprocessingjobintoMapandReducestagesDeveloperneedtoprovidecodesforMapandReducefunctionsconfigurethejobletHadoophandletherest33WhatisMap/Reduce?AProgrammi33MapReduceModel34MapReduceModel3434DistributedExecutionOverviewUserProgramWorkerWorkerMasterWorkerWorkerWorkerforkforkforkassignmapassignreducereadlocalwriteremoteread,sortOutputFile0OutputFile1writeSplit0Split1Split2InputData35DistributedExecutionOverview35Example:WordCountWehavealargefileofwords,onewordtoalineCountthenumberofappearancesforeachdistinctwordSampleapplication:analyzewebserverlogstofindpopularURLs36Example:WordCountWehaveal36Pseudo-Code:WordCountmap(key,value)://key:documentname;value:textofdocument foreachwordwinvalue: emit(w,1)reduce(key,values)://key:aword;values:aniteratorovercounts result=0 foreachcountvinvalues: result+=v emit(key,result)37Pseudo-Code:WordCountmap(key37map(key=url,val=contents):Foreachwordwincontents,emit(w,“1”)reduce(key=word,values=uniq_counts):Sumall“1”sinvalueslistEmitresult“(word,sum)”seebobrunseespotthrowsee 1bob 1run 1see 1spot 1throw 1bob 1run 1see 2spot 1throw 138WordCountmap(key=url,val=contents):see38MapReduceInput:asetofkey/valuepairsUsersuppliestwofunctions:map(k,v)list(k1,v1)reduce(k1,list(v1))v2(k1,v1)isanintermediatekey/valuepairOutputisthesetof(k1,v2)pairs39MapReduceInput:asetofkey/v39WhatisMAP?
Mapeachdataentryintoapair<key,value>ExamplesMapeachlogfileentryinto<URL,1>Mapdaystocktradingrecordinto<STOCK,Price>40WhatisMAP?Mapeachdataent40WhatisShuffle/Mergephase?
Hadoopmerges(shuffles)outputoftheMAPstageinto:<key,valulue1,value2,value3>Examples<URL,1,1,1,1,11><STOCK,PriceOnday1,PriceOnday2,…...>41WhatisShuffle/Mergephase?H41WhatisReduce?
ReduceentriesproducesbyHadoopmergingprocessinginto<key,value>pairExamplesMap<URL,1,1,1>into<URL,3>Map<Stock,3,2,10>into<Stock,10>42WhatisReduce?Reduceentries42Exampleuses:distributedgrep
distributedsort
weblink-graphreversalterm-vector/hostwebaccesslogstatsinvertedindexconstructiondocumentclusteringmachinelearningstatisticalmachinetranslation.........WidelyApplicable
MapReduceProgramsinGoogleSourceTree43Exampleuses:distributedgrep43100s/1000sof2-CPUx86machines,2-4GBofmemoryLimitedbisectionbandwidthStorageisonlocalIDEdisksGFS:distributedfilesystemmanagesdata(SOSP'03)Jobschedulingsystem:jobsmadeupoftasks, schedulerassignstaskstomachines
ImplementationisaC++librarylinkedintouserprograms
ImplementationOverview44ImplementationOverview4444ImplementationOverviewJobtrackerTasktrackerTasktrackerTasktrackerMasterNodeSlavenode1Slavenode2SlavenodeNWorkersuserWorkersWorkers45ImplementationOverviewJobtra45HowItWorks?46HowItWorks?4646DataFlowInput,finaloutputarestoredonHDFSSchedulertriestoschedulemaptasks“close”tophysicalstoragelocationofinputdataIntermediateresultsarestoredonlocalFSofmapandreduceworkersOutputisofteninputtoanothermapreducetask47DataFlowInput,finaloutputa47CoordinationMasterdatastructuresTaskstatus:(idle,in-progress,completed)IdletasksgetscheduledasworkersbecomeavailableWhenamaptaskcompletes,itsendsthemasterthelocationandsizesofitsRintermediatefiles,oneforeachreducerMasterpushesthisinfotoreducersMasterpingsworkersperiodicallytodetectfailures48CoordinationMasterdatastruct48FailuresMapworkerfailureMaptaskscompletedorin-progressatworkerareresettoidleReduceworkersarenotifiedwhentaskisrescheduledonanotherworkerReduceworkerfailureOnlyin-progresstasksareresettoidleMasterfailureMapReducetaskisabortedandclientisnotified49FailuresMapworkerfailure4949Execution
50Execution5050ParallelExecution51ParallelExecution5151HowManyMapandReduceJobs?Mmaptasks,RreducetasksRuleofthumb:M,R>>(#ofnodes)inclusterOneDFSchunkpermapiscommonImprovesdynamicloadbalancingandspeedsrecoveryfromworkerfailureUsuallyRissmallerthanM,becauseoutputisspreadacrossRfiles52HowManyMapandReduceJobs?M52CombinersOftenamaptaskwillproducemanypairsoftheform(k,v1),(k,v2),…forthesamekeyke.g.,popularwordsinWordCountCansavenetworktimebypre-aggregatingatmappercombine(k1,list(v1))v2sameasreducefunction53CombinersOftenamaptaskwill53PartitionFunctionInputstomaptasksarecreatedbycontiguoussplitsofinputfileForreduce,weneedtoensurethatrecordswiththesameintermediatekeyendupatthesameworkerSystemcanuseadefaultpartitionfunctione.g.,hash(key)modRSometimesusefultooverridee.g.,hash(hostname(URL))modRensuresURLsfromahostendupinthesameoutputfile54PartitionFunctionInputstoma54ExecutionSummaryHowisthisdistributed?Partitioninputkey/valuepairsintochunks,runmap()tasksinparallelAfterallmap()sarecomplete,consolidateallemittedvaluesforeachuniqueemittedkeyNowpartitionspaceofoutputmapkeys,andrunreduce()inparallelIfmap()orreduce()fails,re-execute!55ExecutionSummaryHowisthisd55Example:TradingDataProcessingInput:Historica
溫馨提示
- 1. 本站所有資源如無特殊說明,都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請下載最新的WinRAR軟件解壓。
- 2. 本站的文檔不包含任何第三方提供的附件圖紙等,如果需要附件,請聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
- 3. 本站RAR壓縮包中若帶圖紙,網(wǎng)頁內(nèi)容里面會有圖紙預(yù)覽,若沒有圖紙預(yù)覽就沒有圖紙。
- 4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
- 5. 人人文庫網(wǎng)僅提供信息存儲空間,僅對用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理,對用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯,并不能對任何下載內(nèi)容負(fù)責(zé)。
- 6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容,請與我們聯(lián)系,我們立即糾正。
- 7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時也不承擔(dān)用戶因使用這些下載資源對自己和他人造成任何形式的傷害或損失。
最新文檔
- 淮南市壽縣輔警招聘考試題庫 (答案+解析)
- 耳鼻咽喉科試題及答案
- 醫(yī)療機(jī)構(gòu)面試題型及答案
- 煤礦安全生產(chǎn)管理人員考試及答案
- 消防設(shè)施操作員(初級)習(xí)題(含參考答案)
- 基礎(chǔ)護(hù)理習(xí)題庫(附答案)
- 商品選品員突發(fā)故障應(yīng)對考核試卷及答案
- 成人護(hù)理學(xué)試題及答案
- 護(hù)理組感染防控考核試題及答案
- 河南黨建考試題庫及答案
- 2025-2026學(xué)年北京市西城區(qū)初二(上期)期末考試物理試卷(含答案)
- 公路工程施工安全技術(shù)與管理課件 第09講 起重吊裝
- 河南省2025年普通高等學(xué)校對口招收中等職業(yè)學(xué)校畢業(yè)生考試語文試題 答案
- 《中醫(yī)藥健康知識講座》課件
- 中國地級市及各省份-可編輯標(biāo)色地圖
- 產(chǎn)科品管圈成果匯報降低產(chǎn)后乳房脹痛發(fā)生率課件
- 急性消化道出血的急診處理
- 馬口鐵印鐵制罐工藝流程詳解課件
- 狼蒲松齡原文及翻譯
- 預(yù)應(yīng)力管樁-試樁施工方案
- GB/T 3500-1998粉末冶金術(shù)語
評論
0/150
提交評論