Lexical Acquisition
Fu lei, 2007-05-14

Outline
- Introduction
- Evaluation measures
- Verb subcategorization
- Attachment ambiguity
- Selectional preferences
- Semantic similarity between words
- Significance and further reading

Introduction
General goal: to develop algorithms and statistical techniques for filling the holes in existing machine-readable dictionaries by looking at the occurrence patterns of words in large text corpora.

Lexical acquisition problems:
- Collocations
- Verb subcategorization: the recipient of "contribute" is expressed as a prepositional phrase with "to".
- Attachment ambiguity: "The children ate the cake with their hands" versus "The children ate the cake with blue icing".
- Semantic categorization: what is the semantic category of a new word that is not covered in our dictionary?
- Selectional preference: the verb "eat" usually takes food items as direct objects.

Evaluation measures
- Precision
- Recall
- F-measure
- Fallout

With tp, fp, fn, and tn the counts of true positives, false positives, false negatives, and true negatives:
- Target = tp + fn
- Selected = tp + fp
- Total = tn + fp + tp + fn

- Precision = tp / selected = tp / (tp + fp): the proportion of selected items that the system got right.
- Recall = tp / target = tp / (tp + fn): the proportion of the target items that the system selected.
- F = 1 / (α/P + (1 − α)/R); with α = 0.5, F = 2PR / (P + R).
- Fallout = fp / (fp + tn): the proportion of non-targeted items that were mistakenly selected.
- Acc = (tp + tn) / total
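The evaluation measures above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original slides: the function name `evaluation_measures` and the counts passed to it are hypothetical, and the code assumes every denominator is non-zero.

```python
def evaluation_measures(tp, fp, fn, tn, alpha=0.5):
    """Compute precision, recall, F, fallout, and accuracy from the four counts.

    Assumes tp + fp, tp + fn, and fp + tn are non-zero, and that precision
    and recall are both non-zero (otherwise F is undefined).
    """
    precision = tp / (tp + fp)          # tp / selected
    recall = tp / (tp + fn)             # tp / target
    # Weighted harmonic mean; alpha = 0.5 reduces to 2PR / (P + R).
    f_measure = 1.0 / (alpha / precision + (1.0 - alpha) / recall)
    fallout = fp / (fp + tn)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "fallout": fallout,
        "accuracy": accuracy,
    }


# Hypothetical counts, chosen only for illustration.
scores = evaluation_measures(tp=8, fp=2, fn=8, tn=82)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is high even though recall is only 0.5, because true negatives dominate the total. That is one reason precision and recall are usually reported alongside accuracy.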
Verb subcategorization
Definition: the classification of verbs according to the types of complements they permit is called subcategorization. We say that a verb subcategorizes for a particular complement.

Subcategorization frame: a particular set of arguments that a verb can appear with is referred to as a subcategorization frame.

Why is it called subcategorization? We can think of the verbs with a particular set of semantic arguments as one category. Each such category has several subcategories that express these semantic arguments using different syntactic means.
Example: the class of verbs with semantic arguments theme and recipient has a subcategory that expresses these arguments with an object and a prepositional phrase (for example, "donate" in "He donated a large sum of money to the church"), and another subcategory that in addition permits a double-object construction (for example, "give" in "He gave the church a large sum of money").

Importance for parsing (example)

Algorithm for learning subcategorization frames: Lerner, by Brent (1993). Two steps:
- Cues
- Hypothesis testing

Example:
- greet-V Peter-CAP ,-PUNC
- I came-V Thursday-CAP ,-PUNC before the storm started
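The hypothesis-testing step can be sketched as a binomial test. The slides do not spell out the test, so this sketch assumes the usual formulation of Brent's approach: the null hypothesis is that the verb does not permit the frame, so every cue seen with the verb is an error that occurs with some probability per occurrence. The function names, the error rate, and the 0.02 threshold are illustrative assumptions.

```python
from math import comb


def prob_cues_are_errors(n, m, error_rate):
    """Probability of seeing m or more cues in n verb occurrences
    if every cue were a false indication (binomial tail)."""
    return sum(
        comb(n, r) * error_rate**r * (1.0 - error_rate) ** (n - r)
        for r in range(m, n + 1)
    )


def permits_frame(n, m, error_rate, alpha=0.02):
    """Reject the null hypothesis (verb does not take the frame)
    when the tail probability falls below the assumed threshold alpha."""
    return prob_cues_are_errors(n, m, error_rate) < alpha


# Hypothetical counts: a verb seen 20 times, with a cue that is wrong 5% of the time.
print(permits_frame(n=20, m=5, error_rate=0.05))  # five cues
print(permits_frame(n=20, m=1, error_rate=0.05))  # a single cue
```

A single cue in twenty occurrences is easily explained by cue errors, so the frame is not accepted. Five cues are unlikely to be errors alone, so the frame is accepted.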
Attachment ambiguity
Mainly discussed: PP attachment.
Example: "The children ate the cake with a spoon".

Models:
- A simple model based on co-occurrence statistics
- Probabilistic model

Probabilistic model, by Hindle and Rooth (1993)
The slides introduce the model and then derive it.
Example: "He put the book on World War II on the table".

Problems:
- The model considers only v, n, and pp, but other information in a real sentence can also matter a great deal for deciding PP attachment. For example, Hindle and Rooth found that a superlative adjective before the noun makes noun phrase attachment very likely.
- The model covers only the basic case of whether a PP immediately following an NP modifies that NP or the VP. In practice PP attachment takes many forms, for example the dependence of a PP on a noun compound (door bell manufacturer).
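A probabilistic attachment decision of this kind can be sketched as a log likelihood ratio. The derivation did not survive in the slides, so this sketch assumes the commonly cited form of the Hindle and Rooth score: compare the probability that the preposition attaches to the verb (and not to the noun) with the probability that it attaches to the noun. The function name and all counts are hypothetical, and no smoothing is applied.

```python
from math import log2


def attachment_score(count_v_p, count_v, count_n_p, count_n):
    """Log likelihood ratio for verb versus noun attachment of preposition p.

    count_v_p: times p attaches to verb v;  count_v: occurrences of v.
    count_n_p: times p attaches to noun n;  count_n: occurrences of n.
    Assumes all counts are non-zero (unsmoothed estimates).
    """
    p_verb = count_v_p / count_v    # estimate of P(VA_p = 1 | v)
    p_noun = count_n_p / count_n    # estimate of P(NA_p = 1 | n)
    return log2(p_verb * (1.0 - p_noun) / p_noun)


def choose_attachment(score):
    """Positive scores favour the verb, negative scores favour the noun."""
    if score > 0:
        return "verb"
    if score < 0:
        return "noun"
    return "undecided"


# Hypothetical counts for a verb that often takes the preposition
# and a noun that rarely does.
score = attachment_score(count_v_p=86, count_v=1742, count_n_p=1, count_n=1478)
print(choose_attachment(score))
```

The score uses only v, n, and p, which is exactly the limitation listed under Problems: any other evidence in the sentence is ignored.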
Selectional preferences
Definition: most verbs prefer arguments of a particular type. Such regularities are called selectional preferences or selectional restrictions.

Importance of the acquisition of selectional preferences (SP):
- Inference: "Susan had never eaten a fresh durian before".
- Ranking the possible parses

Model, by Resnik (1993, 1996)
In principle, the model can be applied to any class of words that imposes semantic constraints on a grammatically dependent phrase: verb-subject, verb-direct object, verb-PP, adjective-noun, and so on. Here we consider only the verb-direct object case.
- Selectional preference strength (SPS): measures how strongly the verb constrains its direct object.
- Selectional association: the proportion that the summand for a class c contributes to the verb's overall preference strength.

How to estimate P(c|v)
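The two quantities can be sketched as follows. The slide formulas were lost, so this assumes the usual definition from Resnik's model: preference strength is the KL divergence between P(c|v) and the prior P(c), and the association of a class is its term in that sum divided by the total. The class names and probabilities are toy values for illustration, not corpus estimates.

```python
from math import log2


def preference_strength(p_c_given_v, p_c):
    """SPS as KL divergence D(P(C|v) || P(C)); zero-probability classes contribute 0."""
    return sum(p * log2(p / p_c[c]) for c, p in p_c_given_v.items() if p > 0)


def selectional_association(p_c_given_v, p_c, c):
    """Share of the preference strength contributed by class c."""
    strength = preference_strength(p_c_given_v, p_c)
    p = p_c_given_v.get(c, 0.0)
    if p == 0 or strength == 0:
        return 0.0
    return p * log2(p / p_c[c]) / strength


# Toy distributions over four hypothetical noun classes.
prior = {"people": 0.25, "furniture": 0.25, "food": 0.25, "action": 0.25}
eat = {"people": 0.01, "furniture": 0.01, "food": 0.97, "action": 0.01}
see = {"people": 0.25, "furniture": 0.25, "food": 0.25, "action": 0.25}

print(round(preference_strength(eat, prior), 2))  # strongly selective verb
print(round(preference_strength(see, prior), 2))  # verb with no preference
print(selectional_association(eat, prior, "food") > 0)
```

A verb whose object distribution matches the prior has strength 0. Classes the verb disprefers get a negative association, because their summand is negative.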
Selectional preferences: SPS example; association example

Semantic similarity
The holy grail of lexical acquisition is the acquisition of meaning.
Semantic similarity: judgments of semantic similarity can be explained by the degree of contextual interchangeability, or the degree to which one word can be substituted for another in context (Miller and Charles 1991).

Application:
- Generalization, under the assumption that semantically similar words behave similarly: "Susan had never eaten a fresh durian before."
- Query expansion
- KNN classification

Computing methods:
- Vector space measures
- Probabilistic measures

Vector space measures
- Binary valued: matching coefficient, Dice coefficient, Jaccard (or Tanimoto) coefficient, overlap coefficient, cosine
- Real valued: cosine

Definition of similarity measures for binary vectors

Comparison:
- Matching coefficient: simply counts the number of dimensions on which both vectors are non-zero.
- Dice coefficient: normalizes for length by dividing by the total number of non-zero entries. Ranges from 0 to 1.
- Jaccard coefficient: penalizes a small number of shared entries more than the Dice coefficient.
- Overlap coefficient: has the flavor of a measure of inclusion.
- Cosine: identical to the Dice coefficient for vectors with the same number of non-zero entries, but it penalizes less in cases where the number of non-zero entries is very different.

Real valued cosine: for normalized vectors, cosine gives the same ranking of similarities as Euclidean distance does.

Advantages of vector measures:
- Simple representation
- Easy to compute

Probabilistic measures
Why introduce this method: vector space based measures is that, excep
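The binary vector measures compared earlier can be sketched with Python sets, treating each word as the set of dimensions on which its vector is non-zero. The definitions table did not survive in the slides, so the formulas below are the standard set-based forms, and the feature sets are hypothetical. The code assumes both sets are non-empty.

```python
from math import sqrt


def matching(x, y):
    """Number of dimensions on which both vectors are non-zero."""
    return len(x & y)


def dice(x, y):
    return 2 * len(x & y) / (len(x) + len(y))


def jaccard(x, y):
    return len(x & y) / len(x | y)


def overlap(x, y):
    return len(x & y) / min(len(x), len(y))


def cosine(x, y):
    return len(x & y) / sqrt(len(x) * len(y))


# Hypothetical context features for two words.
a = {"eat", "fresh", "ripe"}
b = {"fresh", "ripe", "peel"}
for measure in (matching, dice, jaccard, overlap, cosine):
    print(measure.__name__, round(measure(a, b), 3))

# Very different set sizes: cosine penalizes less than Dice.
small = {"f0"}
large = {f"f{i}" for i in range(1000)}
print(dice(small, large) < cosine(small, large))
```

Because `a` and `b` have the same number of non-zero entries, Dice and cosine coincide for that pair. The overlap coefficient reaches 1 for `small` and `large`, since the smaller set is fully included in the larger one.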