Mach Learn (2006) 63:211–215
DOI 10.1007/s10994-006-8919-x

GUEST EDITORIAL

Machine learning and games

Michael Bowling · Johannes Fürnkranz · Thore Graepel · Ron Musick

Published online: 10 May 2006
Springer Science + Business Media, LLC 2006

M. Bowling, e-mail: bowling@cs.ualberta.ca
J. Fürnkranz, e-mail: fuernkranz@informatik.tu-darmstadt.de
T. Graepel
R. Musick

The history of the interaction of machine learning and computer game-playing goes back to the earliest days of Artificial Intelligence, when Arthur Samuel worked on his famous checker-playing program, pioneering many machine-learning and game-playing techniques (Samuel, 1959, 1967). Since then, both fields have advanced considerably, and research in the intersection of the two can be found regularly in conferences in their respective fields and in general AI conferences. For surveys of the field we refer to Ginsberg (1998), Schaeffer (2000), and Fürnkranz (2001); edited volumes have been compiled by Schaeffer and van den Herik (2002) and by Fürnkranz and Kubat (2001).

In recent years, the computer games industry has discovered AI as a necessary ingredient to make games more entertaining and challenging and, vice versa, AI has discovered computer games as an interesting and rewarding application area. The industry's perspective is witnessed by a plethora of recent books offering gentle introductions to AI techniques for game programmers (Collins, 2002; Champanard, 2003; Bourg & Seemann, 2004; Schwab, 2004) and by a series of edited collections of articles (Rabin, 2002, 2003, 2006). AI research on computer games began to follow developments in the games industry early on, but since John Laird's keynote address at the AAAI 2000 conference, in which he advocated Interactive Computer Games as a challenging and rewarding application area for AI (Laird & van Lent, 2001), numerous workshops (Fu & Orkin, 2004; Aha et al., 2005), conferences, and special issues of journals (Forbus & Laird, 2002) have demonstrated the growing importance of game-playing applications for Artificial Intelligence.

Games, whether created for entertainment, simulation, or education, provide great opportunities for machine learning. The variety of possible virtual worlds and the subsequent ML-relevant problems posed for the agents in those worlds is limited only by the imagination. Furthermore, not only is the games industry large and growing (having surpassed the movie industry in revenue a few years back), but it is faced with a tremendous demand for novelty that it struggles to provide. Against this backdrop, machine-learning-driven successes would draw high-profile attention to the field. Surprisingly, however, the more commercial the game to date, the less impact learning has made. This is quite unlike other great matches between application and data-driven analytics such as data mining and OLAP.

Topics of particular importance for successful game applications include learning how to play the game well, player modeling, adaptivity, model interpretation and, of course, performance. These needs can be recast as a call for new practical and theoretical tools to help with:

learning to play the game: Game worlds provide excellent test beds for investigating the potential to improve agents' capabilities via learning. The environment can be constructed with varying characteristics, from deterministic and discrete as in classical board and card games to non-deterministic and continuous as in action computer games. Learning algorithms for such tasks have been studied quite thoroughly. Probably the best-known instance of a learning game-playing agent is the Backgammon-playing program TD-Gammon (Tesauro, 1995).

learning about players: Opponent modeling, partner modeling, team modeling, and multiple team modeling are fascinating, interdependent and largely unsolved challenges that aim at improving play by trying to discover and exploit the plans, strengths, and weaknesses of a player's opponents and/or partners. One of the grand challenges in this line of work is games like Poker, where opponent modeling is crucial to improve over game-theoretically optimal play (Billings et al., 2002).

behavior capture of players: Creating a convincing avatar based on a player's in-game behavior is an interesting and challenging supervised learning task. For example, in Massive Multiplayer Online Role-playing Games (MMORGs) an avatar that is trained to simulate a user's game-playing behavior could take its creator's place at times when the human player cannot attend to the game character. First steps in this area have been made in commercial video games such as Forza Motorsport (Xbox), where the player can train a "Drivatar" that learns to go around the track in the style of the player by observing and learning from the driving style of that player and generalizing to new tracks and cars.

model selection and stability: Online settings lead to what is effectively the unsupervised construction of models by supervised algorithms. Methods for biasing the proposed model space without significant loss of predictive power are critical not just for learning efficiency, but also for interpretive ability and end-user confidence.

optimizing for adaptivity: Building opponents that can just barely lose in interesting ways is just as important for the game world as creating world-class opponents. This requires building highly adaptive models that can substantively personalize to adversaries or partners with a wide range of competence and rapid shifts in play style. By introducing a very different set of update and optimization criteria for learners, a wealth of new research targets is created.

model interpretation: "What's my next move" is not the only query desired of models in a game, but it is certainly the one which gets the most attention. Creating the illusion of intelligence requires "painting a picture" of an agent's thinking process. The ability to describe the current state of a model and the process of inference in that model from decision to decision enables queries that provide the foundation for a host of social actions in a game such as predictions, contracts, counter-factual assertions, advice, justification, negotiation, and demagoguery. These can have as much or more influence on outcomes as actual in-game actions.

performance: Resource requirements for update and inference will always be of great importance. The AI does not get the bulk of the CPU or memory, and the machines driving the market will always be underpowered compared to typical desktops at any point in time.

This special issue contains three articles and one research note that span the wide range of research in the intersection of game playing and machine learning.

In the first contribution, "Adaptive Game AI with Dynamic Scripting", Spronck et al. tackle the problem of adaptivity by dynamically modifying the rules which govern character behavior in-game. This paper is targeted at the commercial games industry, and provides some good insight into problems faced by the creators of today's role-playing games. The authors propose four functional and four computational requirements for on-line learning in games. They then proceed to show how dynamic scripting fits into those requirements, and provide experimental evidence of the potential promise of this approach. Dynamic scripting can be characterized as stochastic optimization. The authors evaluate dynamic scripting both on the task of providing the toughest opponent possible and on the task of difficulty scaling. Good difficulty scaling underpins what makes most games fun; solving this problem is often very challenging, and the solutions are almost always ad hoc. The authors present experimental data that compare dynamic scripting to static opponents and to those controlled by Q-Learning and Monte Carlo. The test environments include both simulated games and an actual commercial game (Neverwinter Nights), and help to present a very interesting study which is sure to blaze a path for further interesting research.

The second paper, "Universal Parameter Optimization in Games Based on SPSA" by Szepesvári and Kocsis, considers the problem of optimizing parameters to improve the performance of parameterized policies for game play. They consider the Simultaneous Perturbation Stochastic Approximation (SPSA) method introduced by Spall (1992), which is a general gradient-free optimization method that is applicable to a wide range of optimization problems. The authors demonstrate that SPSA is applicable to a wide range of typical optimization problems in games and propose several methods to enhance the performance of SPSA. These enhancements include the use of common random numbers and antithetic variables, a combination with RPROP, and the reuse of samples. The application to games considers the domain of learning to play Omaha Hi-Lo Poker with their poker program McRaise. SPSA combined with their proposed enhancements leads to poker performance competitive with TD-learning, the method so successfully used by Tesauro (1995) for learning a world-class evaluation function for Backgammon and still used in today's world-class backgammon programs such as JellyFish and Snowie.

The third contribution, "Learning to Bid in Bridge" by Markovitch and Amit, addresses the problem of bidding in the game of Bridge. While research in Bridge playing has pioneered Monte Carlo search algorithms for the playing phase of card games and resulted in programs of considerable strength (Ginsberg, 1999), the bidding phase, in which the goal (the so-called contract) of the subsequent playing phase is determined, is still a major weakness of existing Bridge programs. This paper is about an approach that supports the difficult bidding phase in the game of Bridge with techniques from machine learning, in particular opponent modeling via the learning of decision nets and via model-based Monte Carlo sampling to address the problem of hidden information. The evaluation clearly establishes that the system improves with learning, and it seems that the level of play achieved by this program surpasses the level of the bidding module of current state-of-the-art programs and approaches that of an expert player.

Finally, Sadikov and Bratko present a research note on "Learning Long-term Chess Strategies from Databases". They address the problem of knowledge discovery in game databases. For many games or subgames (such as chess endgames), there are game databases available, which contain perfect information about the game in the sense that for every possible position, the game-theoretic outcome is stored in a database. However, although these databases contain all information to allow perfect play, they are not amenable to human analysis, and are typically not very well understood. For example, chess Grandmaster John Nunn analyzed several simple chess endgame databases, resulting in a series of widely acknowledged endgame books (Nunn, 1992, 1994b, 1995), but readily admitted that he does not yet understand all aspects of the databases he analyzed (Nunn, 1994a). This paper reports on an attempt to make headway by automatically constructing playing strategies from chess endgame databases. It describes a method for breaking up the problem into different game phases. For each phase, it is then proposed to learn a separate evaluation function via linear regression. Experiments in the king and rook vs. king, or king and queen vs. king and rook endgames show encouraging results, but also illustrate the difficulty of the problem.

Machine learning has been instrumental to date in building some of the world's best players in Backgammon and has led to interesting results in games like Chess and Go. To move into mainstream commercial games, machine learning research has to face what in many ways are the harder problems of losing in interesting ways, creating more useful illusions of intelligence, hyper-fast adaptation, and taking on persona. The articles in this special issue provide a glimpse into different facets of all of these problems.

References

Aha, D. W., Muñoz-Avila, H. M., & van Lent, M. (Eds.) (2005). Reasoning, representation, and learning in computer games: Proceedings of the IJCAI workshop. Edinburgh, Scotland: Naval Research Laboratory, Navy Center for Applied Research in Artificial Intelligence. Technical Report AIC-05-127.

Billings, D., Peña, L., Schaeffer, J., & Szafron, D. (2002). The challenge of poker. Artificial Intelligence, 134(1–2), 201–240. Special Issue on Games, Computers and Artificial Intelligence.

Bourg, D. M., & Seemann, G. (2004). AI for game developers: Creating intelligent behavior in games. O'Reilly.

Champanard, A. (2003). AI game d
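
Illustrative sketch. The gradient-free character of SPSA, discussed above in connection with the second contribution, may be easier to see in code. The following Python sketch is not the implementation used in that paper; it shows only the basic SPSA update, in which all parameters are perturbed simultaneously by a random ±1 vector and the gradient is estimated from two loss evaluations. The names spsa_minimize and quadratic_loss, the gain constants a, c and A, and the toy objective are assumptions made for illustration; the decay exponents 0.602 and 0.101 are commonly cited defaults. None of the enhancements mentioned above (common random numbers, antithetic variables, RPROP, sample reuse) is included.

```python
import random


def spsa_minimize(loss, theta, iterations=2000, a=0.1, c=0.1,
                  A=100.0, alpha=0.602, gamma=0.101, seed=0):
    """Basic SPSA minimization of loss(theta); theta is a list of floats.

    All gain constants are illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)
    theta = list(theta)
    for k in range(iterations):
        a_k = a / (k + 1 + A) ** alpha   # decaying step size
        c_k = c / (k + 1) ** gamma       # decaying perturbation size
        # Perturb every coordinate at once with an independent +/-1 draw.
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + c_k * d for t, d in zip(theta, delta)]
        minus = [t - c_k * d for t, d in zip(theta, delta)]
        # Two loss evaluations give a gradient estimate for all coordinates.
        diff = loss(plus) - loss(minus)
        theta = [t - a_k * diff / (2.0 * c_k * d)
                 for t, d in zip(theta, delta)]
    return theta


def quadratic_loss(theta, target=(1.0, -2.0, 0.5)):
    """Toy objective (hypothetical): squared distance to a fixed target."""
    return sum((t - g) ** 2 for t, g in zip(theta, target))


if __name__ == "__main__":
    start = [0.0, 0.0, 0.0]
    result = spsa_minimize(quadratic_loss, start)
    print("initial loss:", quadratic_loss(start))
    print("final loss:", quadratic_loss(result))
```

In a game setting the loss would typically be a noisy estimate, for example the negative average payoff over a batch of games played with the perturbed parameters. The update is unchanged in that case, because only loss values, never gradients, are required.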