Information Technology, Artificial Intelligence Series No. 1: From RNN to ChatGPT: The Development and Application of Large Models (Northeast Securities, 2023-11-14)

Report date: 2023-11-14
Analyst practising certificate no.: S0550521100001
Related reports dated: 2023-11-03, 2023-10-30, 2023-10-27, 2023-10-26

Contents
1. Principles and technical development of large models
1.1. Introduction: from RNN to ChatGPT
1.2. Autoregressive language models
1.2.1. Recurrent neural network: RNN
1.2.2. Long short-term memory network (LSTM) and gated recurrent unit (GRU)
1.2.3. Encoder-decoder architecture
1.3. Attention mechanism
1.3.1. Attention mechanism
1.3.2. Transformer architecture
1.4. Pre-trained language models
1.4.1. Pre-trained word embedding models
1.4.2. Context-dependent pre-trained word embedding models
1.4.3. [...] pre-trained model: GPT
1.4.4. Pre-trained model [...]: BERT
1.5. [...] large models [...]
1.5.1. [...]
1.5.2. [...]: Prompt
1.5.3. [...]: GPT-3.5 and ChatGPT
1.5.4. [...]
1.6. [...]
2. Applications of large models and [...]
2.1. Applications of large models
2.2. [...]
2.3. Multimodal models
3. Applications of large models in [...]
3.1. to 3.4. [...] (3.3.3. [...] debug)
4. Risk warning

Figures
Figure 1: Autoregressive model
Figure 2: Latent autoregressive model diagram
Figure 3: Recurrent neural network diagram
Figure 4: [...] recurrent neural network diagram
Figure 5: Long short-term memory network [...]
Figure 6: Gated recurrent unit [...]
Figure 7: Encoder-decoder architecture
Figure 8: Attention mechanism diagram
Figure 9: Self-attention mechanism diagram
Figure 10: Transformer architecture
Figure 11: Multi-head attention mechanism diagram
Figure 12: Residual connection diagram
Figure 13: ELMo [...] diagram
Figure 14: GPT pre-trained model [...] diagram
Figure 15: Comparison of ELMo, GPT and BERT
Figure 16: GPT-2 [...] NLP [...]
Figure 17: In-Context Learning and Fine-Tuning [...]
Figure 19: From GPT-3 to ChatGPT [...]
Figure 21: [...] GPT-3 [...]
Figures 23 to 27: ChatGPT [...]
Figures 28 to 32: Multimodal [...]
Figures 33 to 39: ChatGPT [...] (Figures 33 and 34 involve PDF [...]; Figure 39 involves GPTs [...])
1. Principles and technical development of large models

1.1. Introduction: from RNN to ChatGPT

Natural language processing (NLP) is a field spanning linguistics, computer science and artificial intelligence. Its goal is to let computers understand, process and generate human language [...]. Advances in deep learning have driven much of its recent progress.

NLP dates back to the 1950s, with further developments in the 1980s [...]. In the 21st century, progress in deep learning accelerated the field. In 2010 Tomas Mikolov and colleagues proposed a language model based on the recurrent neural network (RNN). In 2015 Dzmitry Bahdanau et al., in "Neural machine translation by jointly learning to align and translate", proposed the attention mechanism, which lets a model focus on the most relevant parts of its input. The Transformer, built on attention, followed. BERT, proposed in 2018, showed the strength of pre-trained models: a model is first pre-trained on large amounts of data and then adapted to specific tasks. OpenAI's GPT-3 reached 175 billion parameters [...]. With GPT-4.0 the model developed from a pure language model into a multimodal model that can handle inputs from different modalities such as images, speech and text.

1.2. Autoregressive language models

Consider a sequence $\{x_1, x_2, x_3, \dots, x_t\}$. The task is to predict the next element $x_{t+1}$:

$$x_{t+1} \sim P(x_{t+1} \mid x_t, \dots, x_1)$$

The difficulty is that the amount of conditioning data $(x_t, \dots, x_1)$ grows with $t$, so an approximation is needed. There are two common strategies.

The first assumes that only a recent window matters and conditions on $(x_t, \dots, x_{t-\tau})$ for some span $\tau$. The number of inputs, and therefore the number of parameters, is then fixed, at least for $t > \tau$. Such models are called autoregressive models, because they regress a sequence on its own past values.

Figure 1: Autoregressive model
Source: Northeast Securities

The second strategy keeps a summary $h_t$ of past observations and updates both the prediction and the summary:

$$\hat{x}_t = P(x_t \mid h_t), \qquad h_t = g(h_{t-1}, x_{t-1})$$

Because $h_t$ is never observed, these models are called latent autoregressive models.

Figure 2: Latent autoregressive model diagram
Source: Northeast Securities
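Autoregressive models can be implemented with the recurrent neural network (RNN). Its gated variants, the gated recurrent unit (GRU) and the long short-term memory network (LSTM), are likewise latent autoregressive models. For sequence-to-sequence (seq2seq) problems, where both input and output are variable-length sequences, the encoder-decoder architecture was proposed. This section introduces RNN, LSTM, GRU and the encoder-decoder architecture.

1.2.1. Recurrent neural network: RNN

• Recurrent neural network

The RNN-based language model dates from 2010 [...]. At every time step an RNN combines the current input with the hidden state carried over from the previous time step [...].

Figure 3: Recurrent neural network diagram
Source: Northeast Securities

For time step $t$ with input $X_t$, the hidden state $H_t$ and the output $O_t$ are:

$$H_t = \phi(X_t W_{xh} + H_{t-1} W_{hh} + b_h)$$
$$O_t = H_t W_{ho} + b_o$$

Here $\phi$ is the activation function, and $H_t$ and $H_{t-1}$ are the hidden states of the current and previous time step. The hidden-layer weights $W_{xh}$ and $W_{hh}$ determine how the current input $X_t$ and the previous hidden state $H_{t-1}$ are used, and $b_h$ is the bias. $W_{ho}$ is the output-layer weight and $b_o$ its bias. The same parameters are used at every time step, so the parameter count of an RNN does not grow with the number of time steps.

• Backpropagation through time

RNNs are trained with backpropagation through time (BPTT): the network is unrolled one time step at a time, and the gradient is propagated backwards through the unrolled steps [...]. With long sequences this is costly and numerically unstable, which is why the sum is often truncated [...].

Figure 4: [...] recurrent neural network diagram
Source: Northeast Securities

To describe training in simplified form, write the hidden state, input and output at time step $t$ as $h_t$, $x_t$ and $o_t$, and let $w_h$ and $w_o$ be the weights of the hidden layer and the output layer:

$$h_t = f(x_t, h_{t-1}, w_h), \qquad o_t = g(h_t, w_o)$$

where $f$ and $g$ are the transformations of the hidden layer and the output layer. The forward pass produces the chain $\{\dots, (x_{t-1}, h_{t-1}, o_{t-1}), (x_t, h_t, o_t), \dots\}$. The objective over $T$ time steps is

$$L(x_1, \dots, x_T, y_1, \dots, y_T, w_h, w_o) = \frac{1}{T} \sum_{t=1}^{T} l(y_t, o_t)$$

Backpropagation needs the gradient with respect to $w_h$:

$$\frac{\partial L}{\partial w_h} = \frac{1}{T} \sum_{t=1}^{T} \frac{\partial l(y_t, o_t)}{\partial w_h} = \frac{1}{T} \sum_{t=1}^{T} \frac{\partial l(y_t, o_t)}{\partial o_t} \frac{\partial g(h_t, w_o)}{\partial h_t} \frac{\partial h_t}{\partial w_h}$$

where the last factor unrolls recursively:

$$\frac{\partial h_t}{\partial w_h} = \frac{\partial f(x_t, h_{t-1}, w_h)}{\partial w_h} + \sum_{i=1}^{t-1} \left( \prod_{j=i+1}^{t} \frac{\partial f(x_j, h_{j-1}, w_h)}{\partial h_{j-1}} \right) \frac{\partial f(x_i, h_{i-1}, w_h)}{\partial w_h}$$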
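The RNN recurrence for $H_t$ and $O_t$ can be illustrated with a short plain-Python sketch. This is an illustration under assumptions rather than the report's implementation: tanh is assumed for the activation $\phi$, and the helper names (`vec_mat`, `rnn_step`, `rnn_forward`) are hypothetical.

```python
import math


def vec_mat(x, W):
    """Row vector x (length n) times matrix W (n x m), giving a length-m vector."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]


def add(*vectors):
    """Element-wise sum of equally long vectors."""
    return [sum(values) for values in zip(*vectors)]


def rnn_step(x_t, h_prev, W_xh, W_hh, b_h, W_ho, b_o):
    """One time step: H_t = tanh(X_t W_xh + H_{t-1} W_hh + b_h), O_t = H_t W_ho + b_o."""
    pre_activation = add(vec_mat(x_t, W_xh), vec_mat(h_prev, W_hh), b_h)
    h_t = [math.tanh(v) for v in pre_activation]
    o_t = add(vec_mat(h_t, W_ho), b_o)
    return h_t, o_t


def rnn_forward(xs, h0, params):
    """Run the same parameters over every time step of the sequence xs."""
    h, outputs = h0, []
    for x_t in xs:
        h, o_t = rnn_step(x_t, h, *params)
        outputs.append(o_t)
    return h, outputs
```

Calling `rnn_forward` with a longer sequence reuses the same `params` tuple at every step, which is the parameter sharing described in the text: only the loop gets longer, not the model.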

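The product in this expression gets longer as the sequence grows. When its factors are larger than 1 the gradient can explode, and when they are smaller than 1 it can vanish. Exploding gradients make the weight updates unstable, so training may not converge. Vanishing gradients leave the weights almost unchanged, so the model stops learning. In practice the sum is usually truncated to limit both problems, at the price that the model mainly captures short-range effects. As a result RNNs struggle with long-term dependencies: information from many steps back has little influence on the current output.

1.2.2. Long short-term memory network (LSTM) and gated recurrent unit (GRU)

To address the long-term dependency problem of RNNs, the long short-term memory network (LSTM) was proposed, followed later by a simplified variant, the gated recurrent unit (GRU). LSTM and GRU have been widely used for sequence data in natural language processing, speech recognition, time-series forecasting and related areas.

• Long short-term memory network: LSTM

LSTM is a variant of the RNN designed to retain information over long spans and to mitigate vanishing gradients. It introduces a memory cell whose content is controlled by gates: the output gate reads from the cell, the input gate decides when new data is written into the cell, and the forget gate resets the cell's content.

Figure 5: Long short-term memory network [...]
Source: Northeast Securities

As in an RNN, the inputs at each step are the current input $X_t$ and the previous hidden state $H_{t-1}$. The three gates use the sigmoid function as activation, so their values lie in $(0, 1)$. The candidate memory cell $\tilde{C}_t$ uses tanh as activation:

$$I_t = \sigma(X_t W_{xi} + H_{t-1} W_{hi} + b_i)$$
$$F_t = \sigma(X_t W_{xf} + H_{t-1} W_{hf} + b_f)$$
$$O_t = \sigma(X_t W_{xo} + H_{t-1} W_{ho} + b_o)$$
$$\tilde{C}_t = \tanh(X_t W_{xc} + H_{t-1} W_{hc} + b_c)$$

where $W_{xi}, W_{xf}, W_{xo}, W_{xc}, W_{hi}, W_{hf}, W_{ho}, W_{hc}$ are weight parameters and $b_i, b_f, b_o, b_c$ are bias parameters.

The memory cell $C_t$ is controlled by the input gate $I_t$ and the forget gate $F_t$: the input gate controls how much of the candidate is taken in, and the forget gate controls how much of the old cell content $C_{t-1}$ is kept:

$$C_t = F_t \odot C_{t-1} + I_t \odot \tilde{C}_t$$

where $\odot$ denotes the element-wise (Hadamard) product.

The output gate $O_t$ controls how much of the memory cell reaches the hidden state $H_t$:

$$H_t = O_t \odot \tanh(C_t)$$

The hidden state $H_t$ is then used as in an RNN, for example to compute the output. Because the cell content can be carried forward largely unchanged, LSTM retains long-range information better than a plain RNN and alleviates vanishing and exploding gradients.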
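The LSTM update can be traced with a short sketch of one time step. This is a minimal illustration assuming small dense lists instead of a tensor library; the function names and the parameter dictionary layout (`W_xi`, `W_hi`, `b_i` and so on, mirroring the symbols in the equations) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(x, h, W_x, W_h, b):
    """Compute x W_x + h W_h + b for row vectors x and h."""
    return [
        sum(x[i] * W_x[i][j] for i in range(len(x)))
        + sum(h[k] * W_h[k][j] for k in range(len(h)))
        + b[j]
        for j in range(len(b))
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p maps names such as 'W_xi' or 'b_f' to parameters."""
    i_t = [sigmoid(z) for z in affine(x_t, h_prev, p["W_xi"], p["W_hi"], p["b_i"])]
    f_t = [sigmoid(z) for z in affine(x_t, h_prev, p["W_xf"], p["W_hf"], p["b_f"])]
    o_t = [sigmoid(z) for z in affine(x_t, h_prev, p["W_xo"], p["W_ho"], p["b_o"])]
    c_tilde = [math.tanh(z) for z in affine(x_t, h_prev, p["W_xc"], p["W_hc"], p["b_c"])]
    # C_t = F_t * C_{t-1} + I_t * C~_t (element-wise)
    c_t = [f * c + i * ct for f, c, i, ct in zip(f_t, c_prev, i_t, c_tilde)]
    # H_t = O_t * tanh(C_t) (element-wise)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


def zero_lstm_params(n_in, n_hidden):
    """All-zero parameters, handy for checking the gate arithmetic by hand."""
    p = {}
    for gate in "ifoc":
        p["W_x" + gate] = [[0.0] * n_hidden for _ in range(n_in)]
        p["W_h" + gate] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b_" + gate] = [0.0] * n_hidden
    return p
```

With a large positive forget-gate bias and a large negative input-gate bias, `c_t` stays close to `c_prev` regardless of the input, which shows how the cell can carry information across many steps.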
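• Gated recurrent unit: GRU

Compared with LSTM, the gated recurrent unit (GRU) is a simplified variant. It usually achieves comparable results to LSTM and is faster to compute.

Figure 6: Gated recurrent unit [...]
Source: Northeast Securities

GRU has two gates, the reset gate and the update gate, computed in the same way as the LSTM gates:

$$R_t = \sigma(X_t W_{xr} + H_{t-1} W_{hr} + b_r)$$
$$Z_t = \sigma(X_t W_{xz} + H_{t-1} W_{hz} + b_z)$$

where $W_{xr}, W_{xz}, W_{hr}, W_{hz}$ are weight parameters and $b_r, b_z$ are bias parameters.

The candidate hidden state $\tilde{H}_t$ is computed from the current input $X_t$ and the previous hidden state $H_{t-1}$, with the reset gate $R_t$ controlling how much of the previous state is used:

$$\tilde{H}_t = \tanh(X_t W_{xh} + (R_t \odot H_{t-1}) W_{hh} + b_h)$$

The update gate $Z_t$ then controls how far the new hidden state $H_t$ keeps the previous hidden state $H_{t-1}$ and how far it takes over the candidate $\tilde{H}_t$:

$$H_t = Z_t \odot H_{t-1} + (1 - Z_t) \odot \tilde{H}_t$$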
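The GRU step can be sketched in the same plain-Python style. Again this is an illustration under assumptions, not the report's code: the parameter dictionary keys (`W_xr`, `W_hr`, `b_r` and so on) simply mirror the symbols of the GRU equations, and the helper names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(x, h, W_x, W_h, b):
    """Compute x W_x + h W_h + b for row vectors x and h."""
    return [
        sum(x[i] * W_x[i][j] for i in range(len(x)))
        + sum(h[k] * W_h[k][j] for k in range(len(h)))
        + b[j]
        for j in range(len(b))
    ]


def gru_step(x_t, h_prev, p):
    """One GRU step returning the new hidden state H_t."""
    r_t = [sigmoid(z) for z in affine(x_t, h_prev, p["W_xr"], p["W_hr"], p["b_r"])]
    z_t = [sigmoid(z) for z in affine(x_t, h_prev, p["W_xz"], p["W_hz"], p["b_z"])]
    # The reset gate scales the previous state before it enters the candidate.
    gated_h = [r * h for r, h in zip(r_t, h_prev)]
    h_tilde = [math.tanh(z) for z in affine(x_t, gated_h, p["W_xh"], p["W_hh"], p["b_h"])]
    # H_t = Z_t * H_{t-1} + (1 - Z_t) * H~_t (element-wise)
    return [z * h + (1.0 - z) * ht for z, h, ht in zip(z_t, h_prev, h_tilde)]


def zero_gru_params(n_in, n_hidden):
    """All-zero parameters, handy for checking the gate arithmetic by hand."""
    p = {}
    for name in "rzh":
        p["W_x" + name] = [[0.0] * n_hidden for _ in range(n_in)]
        p["W_h" + name] = [[0.0] * n_hidden for _ in range(n_hidden)]
        p["b_" + name] = [0.0] * n_hidden
    return p
```

When the update gate saturates near 1, the function returns the previous hidden state almost unchanged; when it is near 0, the candidate state replaces it.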

GRU simplifies LSTM: the roles of the LSTM input gate and forget gate are taken over by a single update gate, and there is no separate memory cell in addition to the hidden state. GRU therefore has fewer parameters and is faster to compute. In many practical tasks its results are no worse than those of LSTM.

Neither variant removes the limits of recurrent networks entirely. With very long input sequences, training LSTM and GRU models remains difficult [...].

1.2.3. Encoder-decoder architecture

Converting an input sequence into an output sequence (sequence transduction) is at the core of many applications [...]. In problems such as machine translation both the input and the output are variable-length sequences. The encoder-decoder architecture was designed for this kind of problem.

Figure 7: Encoder-decoder architecture
Source: Northeast Securities

The architecture consists of an encoder and a decoder. The encoder takes a variable-length sequence as input and converts it into an encoded state of fixed shape, also called the context variable. The decoder maps this fixed-shape state to a variable-length output sequence. It generates the output one element at a time: the element produced at the current step becomes the input of the next step, until the output sequence is complete.

The encoder-decoder architecture is applied to machine translation, [...], speech recognition and similar tasks. Its main limitation is that the whole input sequence is compressed into a context variable of fixed length, so information is lost when the input is long [...].

1.3. Attention mechanism

Gated models such as LSTM and GRU improve how well a sequence model retains information, but the problem is not fully solved. When the input is encoded into a single fixed-length context, the decoder has to rely on the same summary at every output step, and early parts of a long input tend to be forgotten [...]. The attention mechanism was introduced in response.

In 2015 Dzmitry Bahdanau et al., in "Neural machine translation by jointly learning to align and translate", proposed the attention mechanism (Attention Mechanism), which assigns different weights to different parts of the input so that the model can focus on the most relevant parts [...]. In 2017 Vaswani et al., in "Attention is all you need", proposed the Transformer architecture, which is based on the self-attention mechanism (self-attention mechanism) and abandons the recurrent structure of RNNs. The Transformer improves parallelism and efficiency considerably. Pre-trained models such as BERT and GPT, and large models more generally, build on it [...].
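1.3.1. Attention mechanism

The mechanism is motivated by human attention: faced with a large amount of information, people concentrate on a small, important part of it and ignore the rest [...]. In deep learning, the attention mechanism likewise selects the information most relevant to the current task from a large set of inputs.

An attention mechanism involves three elements: the query (query), the key (key) and the value (value), where keys and values come in pairs. The query is compared with every key to obtain a similarity, and the value of each key then contributes to the result in proportion to that similarity [...]. The output is the weighted sum of the values: the more similar a key is to the query, the larger the weight of its value.

Figure 8: Attention mechanism diagram
Source: Northeast Securities

As Figure 8 shows, attention is computed in three steps: 1) compute the attention scores; 2) normalise them into weights; 3) take the weighted sum. The attention scoring function measures how well a query matches each key [...]. Common scoring functions include additive attention, dot-product attention and scaled dot-product attention, and different choices suit different applications. The Transformer uses scaled dot-product attention (scaled dot-product attention), which is simple and computationally efficient. For a query $\mathbf{q}$ and a key $\mathbf{k}$ it is

$$a(\mathbf{q}, \mathbf{k}) = \frac{\mathbf{q}^\top \mathbf{k}}{\sqrt{d}}$$

where $d$ is the length of the query and key vectors. Dividing by $\sqrt{d}$ keeps the scores in a stable range as $d$ grows [...].

In practice the computation is done for many queries at once. With $n$ queries and $m$ key-value pairs, where queries and keys have length $d$ and values have length $v$, the queries, keys and values are $Q \in \mathbb{R}^{n \times d}$, $K \in \mathbb{R}^{m \times d}$ and $V \in \mathbb{R}^{m \times v}$. After normalising the scores and taking the weighted sum, scaled dot-product attention is

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^\top}{\sqrt{d}}\right) V$$

1.3.2. Transformer architecture

• Self-attention mechanism

The attention mechanism relates one element (the query) to other elements (the key-value pairs) and produces a weighted result. In an encoder-decoder model with attention, the decoder's state at each step serves as the query and is compared with the encoder's hidden state at every step; the weighted sum of those states is the context used for that step [...]. The encoder and decoder, however, still process the sequence step by step, which limits parallel training [...].

Figure 9: Self-attention mechanism diagram
Source: Northeast Securities

In the self-attention mechanism (self-attention mechanism), every element of the input sequence acts as query, key and value at the same time. Each element is compared with all other elements of the same sequence, so the model can capture dependencies between elements regardless of their distance. Self-attention involves no recurrence, so it can be computed in parallel and makes good use of hardware such as GPUs and TPUs, which speeds up training and inference [...].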
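The three steps of attention (score, normalise, weighted sum) can be written down directly. The following sketch implements scaled dot-product attention on nested Python lists; it is a minimal illustration, and the function names are hypothetical. Self-attention corresponds to passing the same sequence as `Q`, `K` and `V`.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Q is n x d, K is m x d, V is m x v; returns (n x v output, n x m weights)."""
    d = len(K[0])
    outputs, all_weights = [], []
    for q in Q:
        # Step 1: score the query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        # Step 2: normalise the scores into weights that sum to 1.
        weights = softmax(scores)
        # Step 3: weighted sum of the values.
        out = [sum(weights[j] * V[j][c] for j in range(len(V))) for c in range(len(V[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


def self_attention(X):
    """Self-attention without learned projections: X serves as Q, K and V."""
    return scaled_dot_product_attention(X, X, X)
```

A query that is orthogonal to all keys yields uniform weights, so its output is the plain average of the values; a query aligned with one key shifts the weight towards that key's value.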
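• Transformer

In 2017 Vaswani et al., in "Attention is all you need", proposed the Transformer, an architecture based on the self-attention mechanism (self-attention mechanism) that abandons the recurrent structure of RNNs. It improved parallelism, efficiency and the handling of long-range dependencies [...].

Figure 10: Transformer architecture
Source: "Attention is all you need"

The Transformer keeps the encoder-decoder architecture, but both the encoder and the decoder are built from stacked self-attention modules. The embeddings of the input sequence and of the output sequence have positional encodings added before they enter the encoder and the decoder respectively, as shown in Figure 10.

The encoder is a stack of identical blocks (block), each with two sublayers. The first is multi-head self-attention (multi-head self-attention), and the second is a position-wise feed-forward network (position-wise feed-forward network). In the encoder's self-attention, the queries, keys and values all come from the output of the previous encoder layer. To ease the training of deep stacks, each sublayer is wrapped in a residual connection (residual connection). For this to work, the output of every sublayer must have the same dimension as its input: for an input $x \in \mathbb{R}^d$ at any position, $\mathrm{Sublayer}(x) \in \mathbb{R}^d$. The residual addition is followed by layer normalization (layer normalization), so the output of each sublayer is

$$\mathrm{LayerNorm}(x + \mathrm{Sublayer}(x)) \in \mathbb{R}^d$$

The decoder is also a stack of identical blocks that use residual connections and layer normalization. Besides the two sublayers found in the encoder, each decoder block has a third sublayer between them, called encoder-decoder attention (encoder-decoder attention). In this layer the queries come from the output of the previous decoder layer, while the keys and values come from the output of the encoder. In the decoder's self-attention, the queries, keys and values all come from the previous decoder layer, but each position may attend only to positions up to and including itself. This masked (masked) attention preserves the autoregressive property: a prediction depends only on the outputs that have already been generated.

• Multi-head attention mechanism

Together with self-attention, Vaswani et al. proposed multi-head attention (multi-head attention) and applied it in the Transformer. The idea is that, for the same queries, keys and values, the model can learn several different attention patterns and combine them, for example short-range and long-range dependencies within a sequence. To this end the queries, keys and values are first transformed with $h$ independently learned linear projections (linear projections). The $h$ projected sets are fed into attention in parallel, the $h$ outputs are concatenated, and the result passes through one more learned linear projection to give the final output. Each of the $h$ attention outputs is called a head (head), hence the name.

Figure 11: Multi-head attention mechanism diagram
Source: Northeast Securities

With $h$ heads the computation is

$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h) W^O$$
$$\mathrm{head}_i = \mathrm{Attention}(Q W_i^Q, K W_i^K, V W_i^V)$$

where $\mathrm{Concat}()$ is the concatenation operation and $\mathrm{Attention}()$ is the scaled dot-product attention function. $Q$, $K$ and $V$ are the inputs; $W_i^Q$, $W_i^K$ and $W_i^V$ are the projection matrices of head $i$ into the query, key and value spaces; and $W^O$ is the projection matrix applied to the concatenated heads. All of these matrices are learned.

Multi-head attention lets the model attend to information from different representation subspaces (representation subspaces), which improves its results. Compared with a single attention function, it captures relations between elements more fully [...].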
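A compact sketch of the multi-head computation follows, written with nested lists so that every projection is visible. It is a sketch under assumptions: the projection matrices are passed in by the caller rather than learned, and the names `matmul`, `attention` and `multi_head_attention` are hypothetical.

```python
import math


def matmul(A, B):
    """Plain matrix product of A (n x k) and B (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(K[0])
    K_T = [list(col) for col in zip(*K)]
    scores = matmul(Q, K_T)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, V)


def multi_head_attention(Q, K, V, heads, W_o):
    """heads is a list of (W_q, W_k, W_v) projection triples, one per head."""
    head_outputs = [
        attention(matmul(Q, W_q), matmul(K, W_k), matmul(V, W_v))
        for W_q, W_k, W_v in heads
    ]
    # Concatenate the heads row by row, then apply the output projection W_o.
    concat = [
        [value for head in head_outputs for value in head[r]]
        for r in range(len(Q))
    ]
    return matmul(concat, W_o)
```

In this toy setup each head can be given a projection that selects a different coordinate of the input, which makes it easy to check by hand that the heads work on separate subspaces before `W_o` mixes them.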
ù?ìo?yu2?,?,?~?^?uuo?W??ìo?o~t\?W???:1,?;1,?<1o?Nnìo?}?T?uuo?g÷?W~???·1??·1|?·?gYóu??9o^t?^?ìo???g?W~????ntt??gYóu???gYóuyo?n?2T}[}ìo?v?y_?k?\÷wk[y?Ws?ü{??óúk??èó2t??o??ìo?v?[}ìo?y_wkr??WOoO\S}????s??[o?ìo?v]v??W÷??v2÷_???[}ìo?oO\}?V÷?y_N?{tN?}wv?k?ó??{?óú?k?????g2??O\}?V÷?_???O\}?Yg?ìN?o×O?ó??×_N?VYgO?}???}wék??_1ot??èó. ?????????ResidualConnection?o?tyt?t?Yyw??yt?t?ˉ?ügywo?[y÷?}]?w??g_g?_zn{tó?????ó??k?^??n??n 17/58????ó???_o?y{|~t?ResNet?2ResNet2015~?ImageNetík~?é?{sz?{??~y{wY?^g?ytt???2??g??n??tWN??W?{{??W?t???u~??^?tnr?vóf\~?(?)?nW?gYn??{???\?vóf\?(?)???W[o?\???f\?(?)=?(?)2?2t12???????{otuogt?^W?z?n?{O??yo???f\?o?[_?2yt?to?_?t?Uoí÷????[n^k?gY?·2??gY?·~W??g?yu??ReLU?sigmoid{2???yu?y???tíy???t?_r??????y??o?o2???????^u?oN??otò?^??u_~????o?^?^k??????V?gY?·ü?yu2?y??o?c??t?t?_??]??tíy?T?2?????O??tíy??Y?\÷??f\??\t????f\??w_W?ˉ??t?O??}?èwy_??^k?y1?]g?t??óúk??gy2????y_yt?tk?woN?{t??b]g2sNn??Wo?gYíor?ro}?{???????w??o??u?o??~????tò2?yb]g?k??w?g?s÷??V?tíorom?è{??w??^?^ko?k?2No?NLP<?m=?k???O\?Transformero??block?^ó???^?<ty=?t?????UV{}????g?r_Wt?t?. ?]uüOy?ó??k?^??n??n 18/58????ó???_o?y{|~NLPaoo??]u}g^?o?W??oWS<?=f\~~y_v??<?U?=2g?W??U?è{yyotyU??one-hotvector?2W??yoO\??u???y_?~N??n??TNn?0rN-1?O\uu?_?2~??r_~???o??tyU?è{?s?[t?Nny~0??t~?U??~^O??S}?~12???n?y?è{~Nn?t~N?U??????u?t}2tyU??w?~w??t?O~OoNn}??2Ny÷o?~tyU??ttN?y?_t{??y??÷Ywtt_?~??o?tw?O2?Y?o?×}tyU??y×~è?O\?O??tt2??~}?é?U?tsg?go<_tt=???nnt\tt?U??ü?,w_tt???}_?O???????????=?,?*[21,1]?yo?onn??tyU?_tty~0??yé??N??tsg2uoù??ó??word2vecg?ó?n??^?n?f\rNnt??t?U????U?y?}wè?O\?O??tgs?2W??y?_~N?to?U??tt~d?3N???Nntt~?×??gY?·g^tt~N?tyU?f\~tt~d??U???n?{?ù~?]u?wordembedding?2word2vec?aotüˉ?yyOo?o??2w_rTransformerk?o?s??]u?Yù?Oy?positionalencoding?2o??RNNk?okú?n]v?WS}??Transformer??ìo??~~ì???\÷]vug?WboS}??y·×?÷??o2u?o???]uyy?ó???n???O?o?Oó??÷[v?oo?O??oO}?Transformerk?2~??ó?n??Transformer_u?Oy?wttN?U?t\?s???n?r?_??}t?y2Transformer×}k_t?Oy?w??}_?O???????????????????(???,2?)=sin(?????????????????????(???,2?+1)???10000wo?pos?? 
?Wo?Oi?Oy?ttd??U??tt2?yOy???è{S}?Wo???O?o?_??è{S}O??t?O?o?^t?O?è{O×r??O?Y2^?WS}??]u^÷?r??U?NS}?Wo?Oytò?s???ry_}??ìo?u?~oO?o??U?. ??O?_?t??ìo?^O_?Transformer?y~ü?y~o??nblockyWNny??_t^??}]~ù~??O?_?t?position-wisefeed-forwardnetwork?2?n??O?_t?s???]v?Wuo÷??~?nO?S}V{T}t\??c?Oo^boO?S}?~t\2?oww??ó??k?^??n??n 19/58????ó???_o?y{|~y_w·O\O?O\{?ü\??o2?WS}?O?o?NN_o???Oyg_u2?t??N???O?_?t}oNny??t??nFFN^oW?nk?g?c?nk?cO?oNn??gyuReLU?ì?2?w?g?cO\ONot\??O???N^O?}O\?óu2???(?)=????(??!+?!)?"+?"wo??!ü?!oNn?g^?gYóuü_óu??"ü?"o?n?g^?gYóuü_óu2Transformerk???ó??w?]v?owR÷??oO?t?o??k????W]v1~ì??1k?ík?vük?}gy÷yoY??|2Transformer?r?_???íkˉ??NLPk??u??^??[×]k??????BERT?BidirectionalEncoderRepresentationsfromTransformers?1GPT?GenerativePre-trainedTransformer?{W?y??Transformer??2o_?t?]v1÷?]v{wT???[^?k??ViT?VisionTransformer?1TimeSformer?Time-Spacetransformer?{k?_y?y??Transformer??21.4. ˉ??k?1.4.1. ˉ??]uk?NN?W????]u?????W??}o}Nn?]u?·?^?yobo?S?tyU?è{f\~t?tt^y??O\?Stg?U?è{?wYù??_·?Nnom?÷?·2?Nn?yˉ?Nnom??]u?·O_???ü÷?ro2?oío????Nn?k?yY???]ukW?ìˉ??? Yˉ?Nnom??]u?·?wordembedding????T}?k?o}???rO?m???}og_?ˉ?k?2??word2veck?üGloVe?GlobalVectors?y\U??k??yoˉ??]uk?2^ˉ?}?wordembeddingT}rO~?ó??{?t_?}??]u?·??ˉ??t?N^?ìgY_?W2^?ˉ?k?onyy_?{{?ov?embedding^?gYóu2×}??óu?yy?Frozen?ˉ?÷?t]}O^embedding^?óu??ˉ??{of\?·Oó??2×}?y?Fine-Tuning?ˉ?÷?O?embedding^?óu?ˉ?÷yoto?ó?uo???gY2Frozenˉ?t_?^ˉ??embedding^_rNn_y?}???w^tyU??W~~o?Os???U?2?NN????_}o??{{?????[??S?or?U??f\s??Frozen?embedding[?o{?{?uo?~ì??2Fine-Tuningˉ?[t_?ó??Ntt_O?gY_?Wóu|?y_omw?Ok?ˉ??ro???O~?ó?k?m?_oN??~?21.4.2. 
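1.4.2. Context-dependent pre-trained word embedding models

word2vec and GloVe assign the same pre-trained vector to a word regardless of its context. Natural language, however, is full of polysemy: the same word can mean different things in different contexts. Context-independent word representations therefore have clear limits, which motivated context-dependent word representations. A context-independent representation of a word $x$ is a function $f(x)$ of the word alone, whereas a context-dependent representation is a function $f(x, \mathrm{context}(x))$ of the word and its context.

Well-known context-dependent representations include TagLM (language-model-augmented sequence tagger), CoVe (Context Vectors) and ELMo (Embeddings from Language Models) [...]. ELMo addresses polysemy with a two-stage approach. In the first stage a bidirectional LSTM network is pre-trained as a language model [...]. After pre-training, every word of an input sentence has three embedding vectors: the word embedding and the outputs of the two LSTM layers. These are combined with weights into the representation of the word in its context.

Figure 13: ELMo [...] diagram
Source: Northeast Securities

Figure 13 shows the second stage, in which the pre-trained ELMo model is applied to a downstream task. An input sentence is passed through the pre-trained bidirectional LSTM, each word's embedding vectors are combined with learned weights, and the result is supplied to the downstream model as an additional, context-dependent feature. Because the pre-trained network only supplies features and is not itself updated, this approach is called feature-based pre-training (Feature-based Pre-Training).

ELMo improved results on a range of NLP tasks such as question answering, sentiment analysis and entity recognition [...]. The downstream model, however, still has to be designed for each task; ELMo only provides its input features [...].

1.4.3. [...] pre-trained model: GPT

ELMo improves many NLP tasks, but each task still needs its own architecture, and designing one per task is costly [...]. GPT (Generative Pre Training) is a pre-trained model that provides context-dependent representations while staying largely task-agnostic. GPT is built on the Transformer decoder and pre-trains an autoregressive language model for representing text sequences. When GPT is applied to a downstream task, the model output is fed into an added linear output layer [...]. Unlike ELMo, which freezes the parameters of the pre-trained model, GPT fine-tunes all parameters of the pre-trained Transformer decoder on the downstream task.

Figure 14: GPT pre-trained model [...] diagram
Source: "Improving Language Understanding by Generative Pre-Training"

Like ELMo, GPT works in two stages: 1. pre-train a language model; 2. adapt it to downstream tasks by fine-tuning (Fine-tuning). The figure shows how different downstream tasks are handled. GPT uses Transformer decoder blocks as feature extractor, [...] and predicts the next word from the preceding context. For different downstream tasks (classification, entailment, similarity, multiple choice), shown in Figure 14, the inputs are converted into a sequence format, passed through the same Transformer and then through a final linear layer [...].

GPT's fine-tuning (Fine-tuning) differs from ELMo's feature-based pre-training (Feature-based Pre-Training). ELMo is a word embedding model: it supplies context-dependent features, and a separate task model makes the predictions. During downstream training, the parameters of ELMo's bidirectional LSTM are not trained. GPT is an autoregressive model that can itself serve as the task model. When it is applied to a downstream task, the pre-trained GPT parameters are the starting point and are updated during downstream training [...].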
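1.4.4. Pre-trained model [...]: BERT

ELMo and GPT each have strengths. ELMo is bidirectional and can encode the context on both the left and the right of a word, but it needs a separate model for each downstream task. GPT works across different downstream tasks with the same model, but it encodes context in one direction only [...]. In 2018 Devlin et al. at Google proposed BERT (Bidirectional Encoder Representations from Transformers) [...]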
