Slide 1: Diffusion models
Laura Sánchez García, Julio Antonio Soto Vicente
IE University (C4__466671 - Advanced Artificial Intelligence)
IE University - L. Sánchez, J.A. Soto · Diffusion models · Fall 2024

Slide 2: Contents
1 Intro
2 Denoising Diffusion Probabilistic Models
3 Advancements and improvements
4 Large diffusion models
5 Beyond image generation

Slide 4: A (certainly not complete) list:
• Latent Variable models (incl. VAEs)
• Autoregressive models (incl. GPT-style Language Models)
• GANs
• Flow-based models (incl. Normalizing Flows)
• Energy-Based Models (incl. Score-based models)
• Diffusion models (kind of a mix of all the previous points)
• Combinations

Slide 5: Image generation
[Figure: generated samples. Source: Ho et al. [2020]]
"A photo of a Corgi dog riding a bike in Times Square. It is wearing sunglasses and a beach hat." [Figure. Source: Saharia et al. [2022]]

Slide 6: [Figure. Source: Brooks et al. [2024]]

Slide 8: Outline
• The forward process
• The Nice? property
• The reverse process
• Loss function
• Training algorithm
• The model
• Sampling algorithm

Slide 10
"Creating noise from data is easy; creating data from noise is generative modeling." (Song et al., 2020)
DDPMs progressively generate images out of noise. [Figure. Source: Ho et al. [2020]]

Slide 11
In order for a model to learn how to do that, we need a process to generate suitable training data in the form of (noise, image) pairs. We will refer to the process of creating the training data as the forward diffusion process, which will progressively make an image noisier.

Slide 12
Our model will learn how to revert that noising process through what is known as the reverse diffusion process → progressively denoising a noisy image. Once trained, the model should therefore have learned how to denoise images. So we can generate some purely random noise, run it through our model, and get back an image generated out of pure noise!

Slide 13
More formally, DDPMs work through many steps t = 0, 1, …, T:
• x_0 is the original image
• q(x_t | x_{t-1}) is the forward diffusion process
• p_θ(x_{t-1} | x_t) will be the reverse diffusion process (learned by our model with weights θ)
During forward diffusion we add Gaussian (Normal) noise to the image at every step t, producing noisy images x_1, x_2, …, x_T. As t becomes higher, the image becomes more and more noisy.

Slide 14: The forward process
q(x_t | x_{t-1}) ≈ √(1 − β_t) · x_{t-1} + N(0, β_t I)
1. Take an image at some point t−1
2. Generate Gaussian noise from an isotropic multivariate Normal of the size of x_t
3. Scale the x_{t-1} values by √(1 − β_t) (so the data scale does not grow as we add noise)
4. Add the noise to the scaled image
It can be directly computed as q(x_t | x_{t-1}) = N(x_t; √(1 − β_t) x_{t-1}, β_t I). More and more noise is added in each step t.¹
¹ In the paper β_t is made to grow linearly from β_1 = 10⁻⁴ to β_T = 0.02 for T = 1000.
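The per-step noising rule above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the course: the image is a random stand-in tensor, and the β schedule is the linear one quoted from the paper.

```python
import numpy as np

def forward_step(x_prev, beta_t, rng):
    """One forward diffusion step: q(x_t | x_{t-1}) = N(x_t; sqrt(1-beta_t) x_{t-1}, beta_t I)."""
    noise = rng.standard_normal(x_prev.shape)      # isotropic Gaussian, same size as x_t
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 32, 32))              # stand-in for a (channels, H, W) image
betas = np.linspace(1e-4, 0.02, 1000)              # linear schedule: beta_1=1e-4 .. beta_T=0.02
x = x0
for t in range(1000):
    x = forward_step(x, betas[t], rng)
# after many steps, x is approximately pure N(0, I) noise: the image info is gone
```

Note how the √(1 − β_t) scaling keeps the overall variance near 1 throughout, which is exactly why the final state looks like a standard Normal.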
Slide 15
The full forward process is therefore:
q(x_{1:T} | x_0) = ∏_{t=1}^{T} q(x_t | x_{t-1})
For a large T, the final image is basically only noise (all of the original image's information is essentially lost), so it becomes roughly x_T ~ N(0, I). Demo here!

Slide 16: The Nice? property
A trick to get any x_t from x_0 without having to compute the intermediate steps. Define α_t = 1 − β_t and ᾱ_t = ∏_{s=1}^{t} α_s. We can use the reparametrization trick for the Normal distribution to get:
q(x_t | x_0) = N(x_t; √ᾱ_t · x_0, (1 − ᾱ_t) I)
(Details in Appendix A!)
• Easier, faster computation
• Any image state x_t comes from a (Normal) probability distribution, drastically simplifying derivations
Demo here!
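The closed form above can be sketched numerically: precompute the ᾱ_t products once, then jump from x_0 straight to any x_t. A minimal NumPy illustration (x0 is a random stand-in image; the linear β schedule is the one from the paper):

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)      # linear schedule from the paper
alphas = 1.0 - betas                       # alpha_t = 1 - beta_t
alpha_bars = np.cumprod(alphas)            # alpha_bar_t = prod_{s<=t} alpha_s

def q_sample(x0, t, rng):
    """Sample x_t directly from x_0 via the Nice? property:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 32, 32))
x500 = q_sample(x0, 500, rng)              # jump straight to t=500, no intermediate steps
```

Since ᾱ_T is vanishingly small for T = 1000, the same one-liner also confirms that x_T is essentially pure noise.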
In the paper this is described as "a notable property". I believe the first to call this a nice property was Weng [2021]; we will call it the Nice? property.

Slide 17: The reverse process
We will train a model p_θ to learn to perform the reverse process. Starting from p(x_T) = N(0, I), it will try to recreate the image!
p_θ(x_{t-1} | x_t) = N(x_{t-1}; μ_θ(x_t, t), Σ_θ(x_t, t))
μ_θ(x_t, t) will be a neural network prediction; Σ_θ(x_t, t) will be set to a value σ_t² I based on β_t. The full reverse process is then:
p_θ(x_{0:T}) = p(x_T) ∏_{t=1}^{T} p_θ(x_{t-1} | x_t)

Slide 18: Summary
The forward process posterior is the ground-truth reverse diffusion process that the model will learn to approximate!

Slide 19: Loss function
Just like in VAEs, the loss function is based on the Evidence Lower Bound (ELBO):
ELBO = E_{q(x_{1:T} | x_0)} [ log ( p_θ(x_{0:T}) / q(x_{1:T} | x_0) ) ]
Which becomes a sum of terms L_T + Σ_{t=2}^{T} L_{t-1} + L_0 (the negative ELBO; details in Appendix B!):
• L_T → prior matching term. Has no learnable parameters, so it can be ignored.
• L_{t-1} → denoising matching terms (details in Appendix B!)
• L_0 → reconstruction term. It only learns how to go from x_1 to x_0, so the authors ended up ignoring it (simpler and better results).

Slide 20
The loss therefore focuses on L_{t-1}:
L_{t-1} = D_KL( q(x_{t-1} | x_t, x_0) ∥ p_θ(x_{t-1} | x_t) )
Where:
• q(x_{t-1} | x_t, x_0) is the forward process posterior (i.e. what would be the perfect, ground-truth reverse process), conditioned on x_0
• p_θ(x_{t-1} | x_t) will be our learned reverse process, as seen in slide 17
The forward process posterior is tractable and can be computed as:
q(x_{t-1} | x_t, x_0) = N(x_{t-1}; μ̃_t(x_t, x_0), β̃_t I)
where μ̃_t(x_t, x_0) = (√ᾱ_{t-1} β_t / (1 − ᾱ_t)) x_0 + (√α_t (1 − ᾱ_{t-1}) / (1 − ᾱ_t)) x_t and β̃_t = ((1 − ᾱ_{t-1}) / (1 − ᾱ_t)) β_t. (Details in Appendix C!)

Slide 21
The loss is therefore the KL divergence between two Normals: the forward process posterior q(x_{t-1} | x_t, x_0) and the reverse process that our model will learn, p_θ(x_{t-1} | x_t). Since both are Normal distributions, this KL divergence is:
E_q [ (1 / (2σ_t²)) ∥ μ̃_t(x_t, x_0) − μ_θ(x_t, t) ∥² ]
with μ̃_t the forward process posterior mean (slide 20) and μ_θ the model's prediction (slide 17).
However, the authors decide instead to predict the noise added during the forward process. Reformulating:
E_{x_0, ε} [ (β_t² / (2σ_t² α_t (1 − ᾱ_t))) ∥ ε − ε_θ(√ᾱ_t x_0 + √(1 − ᾱ_t) ε, t) ∥² ]
where ε is the noise added in the forward pass and ε_θ is the model predicting that noise using x_t and t as features. (Details in Appendix D!)
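The noise-prediction objective (with the weighting constant dropped, as in Ho et al. [2020]'s simplified loss) can be sketched as follows. This is a hedged illustration: the lambda stands in for the trained U-Net ε_θ, and the function name `simple_loss` is our own.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)
alpha_bars = np.cumprod(1.0 - betas)

def simple_loss(model, x0, rng):
    """MSE between the true noise eps and the model's prediction eps_theta(x_t, t),
    with the weighting constant dropped (the paper's simplified objective)."""
    t = rng.integers(1, len(betas))                      # t sampled uniformly from {1..T}
    eps = rng.standard_normal(x0.shape)                  # noise added in the forward pass
    # x_t built through the Nice? property
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    eps_pred = model(x_t, t)                             # eps_theta(x_t, t)
    return float(np.mean((eps - eps_pred) ** 2))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 32, 32))
# toy stand-in for the U-Net: "predict" zero noise everywhere
loss = simple_loss(lambda x_t, t: np.zeros_like(x_t), x0, rng)
```

With the zero-predicting stand-in, the loss is just the mean of ε², i.e. close to 1; a trained ε_θ drives it toward 0.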
Slide 22: Training algorithm
[Algorithm 1. Source: Ho et al. [2020]]
Where √ᾱ_t x_0 + √(1 − ᾱ_t) ε is just x_t computed through the Nice? property!

Slide 23: The model
The proposed model is a U-Net architecture (Ronneberger et al. [2015]) that includes self-attention blocks. They also include GroupNorm (Wu and He [2018]) in the ResNet and self-attention blocks. t is added on every ResNet block through positional encoding (Vaswani et al. [2017]).

Slide 24: Sampling algorithm
Once the model is trained, we can generate new images by: [Algorithm 2. Source: Ho et al. [2020]]
Step 4 just applies the reparametrization trick to the learned reverse process p_θ(x_{t-1} | x_t) = N(x_{t-1}; μ_θ(x_t, t), Σ_θ(x_t, t)).
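The sampling loop of Ho et al. [2020] (their Algorithm 2) can be sketched in NumPy. Assumptions are flagged in the comments: a dummy ε_θ that always predicts zero stands in for the trained U-Net, and σ_t = √β_t is one of the two variance choices discussed in the paper.

```python
import numpy as np

betas = np.linspace(1e-4, 0.02, 1000)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def sample(eps_model, shape, rng):
    """DDPM sampling: start from pure noise x_T ~ N(0, I), then repeatedly apply
    x_{t-1} = (x_t - beta_t / sqrt(1 - alpha_bar_t) * eps_theta(x_t, t)) / sqrt(alpha_t) + sigma_t * z."""
    x = rng.standard_normal(shape)                       # x_T ~ N(0, I)
    for t in range(len(betas) - 1, -1, -1):
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)  # no noise on the last step
        eps = eps_model(x, t)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        x = mean + np.sqrt(betas[t]) * z                 # sigma_t = sqrt(beta_t), one valid choice
    return x

rng = np.random.default_rng(0)
img = sample(lambda x, t: np.zeros_like(x), (3, 16, 16), rng)   # dummy eps_theta ≡ 0
```

The inner update is exactly step 4 of the algorithm: the reparametrization trick applied to p_θ(x_{t-1} | x_t), with the predicted noise partially removed at every t.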
Slide 25
Sampling is an iterative process: we progressively remove predicted noise. You may wonder: if at any single step we are predicting the full added noise ε, why don't we remove it completely in a single step? Answer: details in Appendix E!

Slide 26: Advancements and improvements

Slide 27: Advancements and improvements: Outline
• Variance/noise schedulers
• Learning the reverse process variance
• Faster sampling: DDIMs
• Conditional generation
  • Classifier Guidance
  • Classifier-Free Guidance
• Conditioning on images
  • ControlNet
• Conditioning on text

Slide 28: Variance/noise schedulers
Nichol and Dhariwal [2021] propose a cosine scheduler, with β_t being tiny when t is close to 0 (they set the offset to s = 0.008).
[Figure: comparison between the scheduler in DDPM and Nichol & Dhariwal's cosine scheduler proposal. Source: Nichol & Dhariwal [2021]]
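The cosine schedule from Nichol and Dhariwal [2021] can be written down directly: it defines ᾱ_t = f(t)/f(0) with f(t) = cos²((t/T + s)/(1 + s) · π/2), and recovers the per-step β_t from consecutive ᾱ ratios (clipped to 0.999, as in the paper). A sketch, with function names of our own choosing:

```python
import numpy as np

def cosine_alpha_bars(T, s=0.008):
    """Cosine schedule: alpha_bar_t = f(t)/f(0), f(t) = cos((t/T + s)/(1 + s) * pi/2)^2."""
    t = np.arange(T + 1)
    f = np.cos((t / T + s) / (1.0 + s) * np.pi / 2.0) ** 2
    return f / f[0]

def betas_from_alpha_bars(alpha_bars, max_beta=0.999):
    """Recover per-step noise levels: beta_t = 1 - alpha_bar_t / alpha_bar_{t-1}, clipped."""
    betas = 1.0 - alpha_bars[1:] / alpha_bars[:-1]
    return np.clip(betas, 0.0, max_beta)

alpha_bars = cosine_alpha_bars(1000)
betas = betas_from_alpha_bars(alpha_bars)
# betas near t=0 come out tiny, so the first steps destroy very little image information
```

Compared with the linear schedule, ᾱ_t decays much more gently at both ends, which is exactly the behaviour the comparison figure on this slide illustrates.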
