Diffusion models
Laura Sánchez García, Julio Antonio Soto Vicente
IE University (C4__466671 - Advanced Artificial Intelligence), Fall 2024

Outline
1. Intro
2. Denoising Diffusion Probabilistic Models
3. Advancements and improvements
4. Large diffusion models
5. Beyond image generation

1. Intro

A (certainly not complete) list of generative model families:
• Latent Variable models (incl. VAEs)
• Autoregressive models (incl. GPT-style Language Models)
• GANs
• Flow-based models (incl. Normalizing Flows)
• Energy-Based Models (incl. Score-based models)
• Diffusion models (kind of a mix of all previous points)
• Combinations

Image generation examples (figures in the original slides):
• Source: Ho et al. [2020]
• "A photo of a Corgi dog riding a bike in Times Square. It is wearing sunglasses and a beach hat." — Source: Saharia et al. [2022]
• Source: Brooks et al. [2024]

2. Denoising Diffusion Probabilistic Models

Outline
• The forward process
• The Nice property
• The reverse process
• Loss function
• Training algorithm
• The model
• Sampling algorithm

"Creating noise from data is easy; creating data from noise is generative modeling." — Song et al., 2020

DDPMs progressively generate images out of noise. Source: Ho et al. [2020]

In order for a model to learn how to do that, we need a process to generate suitable training data in the form of (noise, image) pairs.

We will refer to the process of creating the training data as the forward diffusion process, which will progressively make an image noisier.

Our model will learn how to revert that noising process, through what is known as the reverse diffusion process → progressively denoising a noisy image.

Once trained, the model should therefore have learned how to denoise images. So we can generate some purely random noise, run it through our model and get back an image generated out of pure noise!

More formally, DDPMs work through many steps t, which are 0, 1, …, T:
• x_0 is the original image
• q(x_t | x_{t−1}) is the forward diffusion process
• p_θ(x_{t−1} | x_t) will be the reverse diffusion process (learned by our model with weights θ)

The forward process

During forward diffusion we add Gaussian (Normal) noise to the image in every step t, producing noisy images x_1, x_2, …, x_T. As t becomes higher, the image becomes more and more noisy.

q(x_t | x_{t−1}) ≈ √(1 − β_t) · x_{t−1} + N(0, β_t I)

That is:
1. Take an image at some point t − 1.
2. Generate Gaussian noise from an isotropic multivariate Normal of the size of x_t.
3. Scale the x_{t−1} values by √(1 − β_t) (so data scale does not grow as we add noise).
4. Add the noise to the scaled image.

It can be directly computed as

q(x_t | x_{t−1}) = N(x_t; √(1 − β_t) x_{t−1}, β_t I)

More and more noise is added in each step t.¹

¹ In the paper β_t is made to grow linearly from β_1 = 10⁻⁴ to β_T = 0.02 for T = 1000.

The full forward process is therefore:

q(x_{1:T} | x_0) = ∏_{t=1}^{T} q(x_t | x_{t−1})

For a large T, the final image is basically only noise (all original image info is essentially lost), so it becomes roughly x_T ~ N(0, I). Demo here!
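As a quick illustration (not from the original slides), here is a minimal PyTorch sketch of the stepwise forward process, assuming the linear β schedule from the footnote above; the toy image tensor and the helper name `forward_step` are made up for the example:

```python
import torch

# One forward diffusion step: q(x_t | x_{t-1}) = N(x_t; sqrt(1 - beta_t) x_{t-1}, beta_t I).
# Linear beta schedule as in the DDPM paper: beta_1 = 1e-4 ... beta_T = 0.02, T = 1000.
T = 1000
betas = torch.linspace(1e-4, 0.02, T)

def forward_step(x_prev: torch.Tensor, t: int) -> torch.Tensor:
    """Noise x_{t-1} into x_t (t is 1-indexed)."""
    beta_t = betas[t - 1]
    noise = torch.randn_like(x_prev)                        # sample from N(0, I)
    return torch.sqrt(1.0 - beta_t) * x_prev + torch.sqrt(beta_t) * noise

# Running all T steps progressively destroys the image:
x = torch.rand(3, 64, 64)                                   # a toy "image"
for t in range(1, T + 1):
    x = forward_step(x, t)
# after T steps, x is roughly distributed as N(0, I)
```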
The Nice property¹

This is a trick to get any x_t from x_0 without having to compute the intermediate steps. Defining α_t = 1 − β_t and ᾱ_t = ∏_{s=1}^{t} α_s, we can use the reparametrization trick for the Normal distribution to get:

q(x_t | x_0) = N(x_t; √ᾱ_t x_0, (1 − ᾱ_t) I)

Details in Appendix A!

• Easier, faster computation
• Any image state x_t comes from a (Normal) probability distribution, drastically simplifying derivations

Demo here!

¹ In the paper this is described as "a notable property". I believe that the first to call this a nice property was Weng [2021]. We will call it the Nice property.
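As a sketch of why this property is convenient (again not from the slides; `q_sample` is a hypothetical helper name), jumping straight from x_0 to any x_t is a single line instead of t sequential steps:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # linear schedule from the paper
alphas = 1.0 - betas                          # alpha_t = 1 - beta_t
alpha_bars = torch.cumprod(alphas, dim=0)     # alpha_bar_t = prod_{s<=t} alpha_s

def q_sample(x0: torch.Tensor, t: int):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I)."""
    eps = torch.randn_like(x0)                # the noise that is added (and later predicted)
    a_bar = alpha_bars[t - 1]
    x_t = torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * eps
    return x_t, eps

x0 = torch.rand(3, 64, 64)
x_500, eps = q_sample(x0, t=500)              # one call instead of 500 sequential steps
```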
The reverse process

We will train a model p_θ to learn to perform the reverse process. Starting from p(x_T) = N(0, I), it will try to recreate the image!

p_θ(x_{t−1} | x_t) = N(x_{t−1}; μ_θ(x_t, t), Σ_θ(x_t, t))

• μ_θ(x_t, t) will be a neural network prediction
• Σ_θ(x_t, t) will be fixed to a value σ_t² I based on β_t

And the full reverse process is:

p_θ(x_{0:T}) = p(x_T) ∏_{t=1}^{T} p_θ(x_{t−1} | x_t)

Summary: the forward process posterior is the ground-truth reverse diffusion process that the model will learn to approximate!

Loss function

Just like in VAEs, the loss function is based on the Evidence Lower Bound (ELBO):

ELBO = E_{q(x_{1:T} | x_0)} [ log ( p_θ(x_{0:T}) / q(x_{1:T} | x_0) ) ]

which decomposes into a sum of terms L_T, L_{t−1} (for t = 2, …, T) and L_0. Details in Appendix B!
• L_T → prior matching term. Has no learnable parameters, so it can be ignored.
• L_{t−1} → denoising matching terms; these are what the model actually optimises.
• L_0 → reconstruction term. Only learning how to go from x_1 to x_0, so the authors ended up ignoring it (simpler and better results).

The loss therefore focuses on L_{t−1}:

L_{t−1} = E_q [ D_KL( q(x_{t−1} | x_t, x_0) ‖ p_θ(x_{t−1} | x_t) ) ]

where:
• q(x_{t−1} | x_t, x_0) is the forward process posterior (i.e. what would be the perfect, ground-truth reverse process) conditioned on x_0
• p_θ(x_{t−1} | x_t) will be our learned reverse process as seen above

The forward process posterior is tractable and can be computed as:

q(x_{t−1} | x_t, x_0) = N(x_{t−1}; μ̃_t(x_t, x_0), β̃_t I)

Details in Appendix C!

The loss is therefore the KL divergence between two Normals: the forward process posterior q(x_{t−1} | x_t, x_0) and the reverse process that our model will learn, p_θ(x_{t−1} | x_t). Since both are Normal distributions, this KL divergence is:

E_q [ 1/(2σ_t²) ‖ μ̃_t(x_t, x_0) − μ_θ(x_t, t) ‖² ]

where μ̃_t(x_t, x_0) is the forward process posterior mean (above) and μ_θ(x_t, t) comes from the model's prediction.

However: the authors decide instead to predict the noise added during the forward process. Reformulating (details in Appendix D!), the objective becomes

E [ ‖ ε − ε_θ(x_t, t) ‖² ]

where ε is the noise added in the forward pass and ε_θ is the model predicting that noise using x_t and t as features.

Training algorithm

Source: Ho et al. [2020] (Algorithm 1). Repeat until converged: sample an image x_0 from the data, sample t uniformly from {1, …, T}, sample ε ~ N(0, I), and take a gradient step on ‖ ε − ε_θ(√ᾱ_t x_0 + √(1 − ᾱ_t) ε, t) ‖², where √ᾱ_t x_0 + √(1 − ᾱ_t) ε is just x_t computed through the Nice property!
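Putting the Nice property and the noise-prediction objective together, here is a condensed sketch of one training step (an illustration of Algorithm 1, not the authors' code; `eps_model` is an assumed noise-prediction network and `alpha_bars` the cumulative product from the earlier sketch, all assumed to live on the same device):

```python
import torch
import torch.nn.functional as F

def training_step(eps_model, optimizer, x0, alpha_bars, T=1000):
    """One DDPM gradient step on a batch of clean images x0 of shape (B, C, H, W)."""
    b = x0.shape[0]
    t = torch.randint(1, T + 1, (b,), device=x0.device)          # t ~ Uniform({1, ..., T})
    eps = torch.randn_like(x0)                                    # eps ~ N(0, I)
    a_bar = alpha_bars[t - 1].view(b, 1, 1, 1)
    x_t = torch.sqrt(a_bar) * x0 + torch.sqrt(1.0 - a_bar) * eps  # x_t via the Nice property
    loss = F.mse_loss(eps_model(x_t, t), eps)                     # ||eps - eps_theta(x_t, t)||^2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

At sampling time the same ε_θ is reused to progressively denoise, as described next.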
The model

The proposed model is a U-Net architecture (Ronneberger et al. [2015]) that includes self-attention blocks. They also include GroupNorm (Wu and He [2018]) in the ResNet and self-attention blocks, and t is added on every ResNet block through a positional encoding (Vaswani et al. [2017]).

Sampling algorithm

Once the model is trained, we can generate new images with the sampling algorithm of Ho et al. [2020] (Algorithm 2): start from x_T ~ N(0, I) and, for t = T, …, 1, sample x_{t−1} from the learned reverse process. Step 4 of the algorithm just applies the reparametrization trick to the learned reverse process p_θ(x_{t−1} | x_t) = N(x_{t−1}; μ_θ(x_t, t), Σ_θ(x_t, t)).

Sampling is an iterative process: we progressively remove predicted noise.

You may wonder: if at any single step we are predicting the full added noise ε, why don't we remove it completely in a single step? Answer: details in Appendix E!

3. Advancements and improvements

Outline
• Variance/noise schedulers
• Learning the reverse process variance
• Faster sampling: DDIMs
• Conditional generation
  • Classifier Guidance
  • Classifier-Free Guidance
• Conditioning on images
  • ControlNet
• Conditioning on text

Variance/noise schedulers

Nichol and Dhariwal [2021] propose a cosine schedule,

ᾱ_t = f(t) / f(0),  with f(t) = cos²( ((t/T + s) / (1 + s)) · π/2 )

where the small offset s keeps β_t from being tiny when t is close to 0 (they set it to s = 0.008).

Figure: comparison between the scheduler used in DDPM and Nichol & Dhariwal's cosine scheduler proposal. Source: Nichol & Dhariwal [2021].
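For illustration, a small sketch of the two ᾱ_t curves being compared in that figure (the comparison code and variable names are ours, not from the slides):

```python
import math
import torch

T = 1000

# DDPM linear schedule: alpha_bar_t = prod_{s<=t} (1 - beta_s)
betas_linear = torch.linspace(1e-4, 0.02, T)
alpha_bar_linear = torch.cumprod(1.0 - betas_linear, dim=0)

# Cosine schedule (Nichol & Dhariwal): alpha_bar_t = f(t) / f(0)
def f(t: torch.Tensor, s: float = 0.008) -> torch.Tensor:
    return torch.cos(((t / T + s) / (1 + s)) * math.pi / 2) ** 2

t = torch.arange(T + 1, dtype=torch.float32)
alpha_bar_cosine = f(t) / f(torch.zeros(1))

# The cosine schedule destroys information more gradually: alpha_bar stays near 1
# for small t and decays smoothly, instead of collapsing early as with the linear schedule.
print(alpha_bar_linear[T // 2].item(), alpha_bar_cosine[T // 2].item())
```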