Trustworthy Large Visual Generation Models Constrained by Physical Conditions

Visual generative model
[Figure: Input, Output]
- VAE: maximize the variational lower bound

Video generative methods
- The field of video generation has seen rapid development, reaching several milestones...
- VAE: maximize the variational lower bound
- GAN: adversarial training
- Flow-based models: invertible transform of distributions
- Diffusion models: gradually add Gaussian noise and then reverse
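
The variational lower bound that a VAE maximizes can be written out explicitly. The notation below is the standard one and is not taken from the slides: $x$ is an observed image or video, $z$ a latent code, $q_\phi$ the encoder, $p_\theta$ the decoder, and $p(z)$ the prior.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
\;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```

The first term rewards accurate reconstruction of $x$ from $z$; the second keeps the encoder's posterior close to the prior so that sampling $z \sim p(z)$ yields plausible outputs.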
Diffusion for visual generation (1)
- Denoising Diffusion Probabilistic Models (DDPMs)
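
The forward (noising) half of a DDPM has a closed form, which a short sketch can make concrete. This is a minimal illustration in plain Python, not code from the talk: the linear schedule from 1e-4 to 0.02 over 1000 steps follows the original DDPM paper, while the function names and the toy three-element "image" are assumptions made for the example.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alpha_bar(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I).

    Returns the noisy sample and the noise that produced it; the noise
    is the regression target when training the denoiser.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha_bar(betas)
rng = random.Random(0)          # fixed seed so the run is repeatable
x0 = [1.0, -0.5, 0.25]          # toy stand-in for pixel values
x_t, eps = q_sample(x0, 999, alpha_bars, rng)
```

At the last step `alpha_bars[-1]` is close to zero, so `x_t` is almost pure Gaussian noise; generation runs the learned reverse process starting from such noise.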
Diffusion for visual generation (2)
- Stochastic Differential Equations (Score SDEs)
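
In the score-SDE view, noising and generation are a pair of stochastic differential equations. The symbols below follow the usual convention and are not from the slides: $f$ is the drift, $g$ the diffusion coefficient, $w$ and $\bar{w}$ are forward-time and reverse-time Wiener processes, and $p_t$ is the marginal density of $x$ at time $t$.

```latex
% forward (noising) SDE
\mathrm{d}x = f(x, t)\,\mathrm{d}t + g(t)\,\mathrm{d}w

% reverse (generative) SDE, integrated from t = T down to t = 0
\mathrm{d}x = \big[f(x, t) - g(t)^2\,\nabla_x \log p_t(x)\big]\,\mathrm{d}t + g(t)\,\mathrm{d}\bar{w}
```

The only unknown in the reverse equation is the score $\nabla_x \log p_t(x)$, which is what the network is trained to approximate.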
Key elements of visual diffusion models
- Pixel diffusion (original input)
- Latent space diffusion
- U-Net
- Transformer
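
The practical difference between pixel diffusion and latent space diffusion is the size of the tensor the denoiser has to process. The sketch below does that arithmetic; the downsampling factor of 8 and the 4 latent channels are assumptions borrowed from Stable-Diffusion-style autoencoders, not numbers given in the slides.

```python
def tensor_elements(height, width, channels, frames=1):
    """Number of scalar values in a (frames, height, width, channels) tensor."""
    return frames * height * width * channels


def latent_shape(height, width, spatial_factor=8, latent_channels=4):
    """Shape of the latent produced by an autoencoder that downsamples
    each spatial axis by `spatial_factor` (assumed values)."""
    if height % spatial_factor or width % spatial_factor:
        raise ValueError("height and width must be divisible by spatial_factor")
    return height // spatial_factor, width // spatial_factor, latent_channels


pixel = tensor_elements(512, 512, 3)        # diffusion directly on RGB pixels
h, w, c = latent_shape(512, 512)            # 64 x 64 x 4 latent
latent = tensor_elements(h, w, c)
ratio = pixel / latent                      # how much smaller the latent is
print(pixel, latent, ratio)
```

Under these assumptions the latent holds 48 times fewer values than the image, which is the main reason latent diffusion is cheaper to train and sample.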
Sora, breakthrough
- Consistency: consistency in 3D rendering, long-range coherence, and object permanence.
- High fidelity.
- Surprising length: extended video length capability (Sora: 1 minute vs. previous systems: seconds).
- Flexible resolution: generation of videos across various durations, aspect ratios, and resolutions.
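Sora, key technologies
- The DiT (Diffusion Transformer) framework by Meta (2022.12) provides the transformer-based diffusion backbone; it was introduced for image generation and is extended to video processing in Sora.
- Google's MAGViT (2022.12) focuses on video tokenization.
- Google DeepMind introduced NaViT (2023.07) to support various resolutions and aspect ratios.
- OpenAI's DALL-E 3 (2023.09) contributes a re-captioning technique that improves video caption generation for better text-conditioned video creation.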
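
Video tokenization and support for variable resolutions can be illustrated with a small token-count calculation. The sketch below cuts a clip into spacetime patches and packs clips of different sizes into fixed-length sequences. Everything specific here is an assumption for illustration: Sora's actual patch sizes are not public, and the first-fit packing is a simplified stand-in for the packing idea in NaViT, not its implementation.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def count_spacetime_tokens(frames, height, width, patch_t=4, patch_h=16, patch_w=16):
    """Tokens produced by cutting a clip into patch_t x patch_h x patch_w blocks.

    Partial blocks at the borders are counted as full tokens (as if padded).
    The patch sizes are hypothetical defaults.
    """
    return ceil_div(frames, patch_t) * ceil_div(height, patch_h) * ceil_div(width, patch_w)


def pack_first_fit(token_counts, max_len):
    """Greedy first-fit packing of variable-length token sequences.

    Returns a list of rows; each row lists the clip indices that share
    one fixed-length sequence of at most `max_len` tokens.
    """
    rows, loads = [], []
    for idx, n in enumerate(token_counts):
        if n > max_len:
            raise ValueError(f"clip {idx} needs {n} tokens, more than max_len")
        for r, load in enumerate(loads):
            if load + n <= max_len:
                rows[r].append(idx)
                loads[r] += n
                break
        else:
            rows.append([idx])
            loads.append(n)
    return rows


# (frames, height, width) for three clips with different durations and aspect ratios
clips = [(16, 256, 256), (8, 144, 256), (1, 512, 288)]
counts = [count_spacetime_tokens(*clip) for clip in clips]
rows = pack_first_fit(counts, max_len=1536)
print(counts, rows)
```

Because every clip becomes a flat list of tokens, nothing in the transformer depends on a fixed frame count or aspect ratio; only the sequence length budget matters.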
Modeling the physical world
- We know that the real physical world is very complicated to model.
Probabilistic
- Bayesian inference;
- probabilistic graphical models.
Deterministic
- mathematical equations;
- physics-based simulation;
- control theory.
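
The contrast between the two families can be shown with one toy example of each. The sketch below is only an illustration: a closed-form free-fall equation stands in for the deterministic side, and a conjugate Gaussian update stands in for Bayesian inference. The function names and the numbers are assumptions, not material from the talk.

```python
def free_fall_height(y0, v0, t, g=9.81):
    """Deterministic model: height after t seconds under constant gravity.

    The same inputs always give the same output.
    """
    return y0 + v0 * t - 0.5 * g * t * t


def gaussian_posterior(prior_mean, prior_var, observations, noise_var):
    """Probabilistic model: Bayesian update of an unknown mean.

    Prior N(prior_mean, prior_var), observations with known Gaussian
    noise variance. Returns the posterior mean and variance.
    """
    n = len(observations)
    precision = 1.0 / prior_var + n / noise_var
    mean = (prior_mean / prior_var + sum(observations) / noise_var) / precision
    return mean, 1.0 / precision


height = free_fall_height(y0=10.0, v0=0.0, t=1.0)
post_mean, post_var = gaussian_posterior(0.0, 1.0, [1.0, 1.0], noise_var=1.0)
print(height, post_mean, post_var)
```

The deterministic model returns a single value with no uncertainty, while the probabilistic model returns a distribution that narrows as more observations arrive.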

Key elements of a physical world
- Given a Sora demo (the woman walking down a Tokyo street), the key elements of a physical world, from a graphics perspective...
- Appearance
- Geometry
- Lighting
- Motion & Animation
- Audio

Modeling the physical world
- [CVPR] Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle
[Figure: example scenes Espresso, Chick-Chicken, Split-Cookie, Flame-Steak]

It is hard to model the physical world
- In fact, the world is hard to model in a probabilistic way.
- Sora resource consumption...
  - 1 billion images;
  - 1 million hours of video data;
  - 10 trillion tokens after tokenizing images and videos;
  - training with ~5,000 A100s in parallel.

It is hard to model the physical world
- Sora failure case in geometry and appearance.
- Sora failure case in lighting.
- Sora failure case in motion and animation.

It is hard to model the physical world
- VideoMV: Consistent Multi-View Generation Based on Large Video Generative Model
- Geometric enhancement is still needed for multi-view images.
- From a static aspect, SVD is able to model multi-view images.

It is hard to model the physical world
- STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians
- From a temporal aspect...

It is hard to model the physical world
- Ilya Sutskever: compression is generalization.
- The best lossless compression for a dataset is the best generalization for data outside the dataset.
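
The link between compression and generalization can be made tangible with ideal code lengths: a model that assigns probability p to a symbol can encode it in about -log2(p) bits. The sketch below is a toy illustration under stated assumptions, not a demonstration of the general claim: the two-letter alphabet, the strings, and the smoothed unigram model are all invented for the example.

```python
import math
from collections import Counter


def fit_unigram(symbols, alphabet, smoothing=1.0):
    """Estimate symbol probabilities with add-one (Laplace) smoothing."""
    counts = Counter(symbols)
    total = len(symbols) + smoothing * len(alphabet)
    return {s: (counts[s] + smoothing) / total for s in alphabet}


def code_length_bits(symbols, model):
    """Ideal lossless code length of `symbols` under `model`, in bits."""
    return sum(-math.log2(model[s]) for s in symbols)


alphabet = "ab"
train = "aaabaaabaaabaaab"      # data the model is fitted on
held_out = "aabaaaabaaab"       # data outside the training set, same source

uniform = {s: 1.0 / len(alphabet) for s in alphabet}   # knows nothing
learned = fit_unigram(train, alphabet)                 # fitted to train

for name, model in (("uniform", uniform), ("learned", learned)):
    print(name,
          round(code_length_bits(train, model), 2),
          round(code_length_bits(held_out, model), 2))
```

The learned model needs fewer bits than the uniform one on the training string and also on the held-out string, which is the sense in which a better compressor of the dataset predicts unseen data better.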

Apply the deterministic conditions
- Different representations of deterministic conditions in the physical world.
- Much less data and far fewer parameters!
[Figure panels: Geometry, Lighting, Motion & Animation]
- There are two ways to inject deterministic information.
[Figure panels: deterministic #1, deterministic #2]

Image Human Animation
- Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance

Image Portrait Animation
- Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Dynamic Protein Structure Prediction
- 4D Diffusion for Dynamic Protein Structure Prediction with Reference Guided Temporal Alignment