Source: Neves R. F. P., Zanchettin C., Filho A. N. G. L. An Efficient Way of Combining SVMs for Handwritten Digit Recognition [M]// Artificial Neural Networks and Machine Learning - ICANN 2012. Springer Berlin Heidelberg, 2012: 229-237.

An Efficient Way of Combining SVMs for Handwritten Digit Recognition

Abstract. This paper presents a method of combining SVMs (support vector machines) for multiclass problems that ensures a high recognition
rate and a short processing time when compared to other classifiers. This hierarchical SVM combination considers the high recognition rate and short processing time as evaluation criteria. The case study used was the handwritten digit recognition problem, with promising results.

Keywords: pattern recognition, handwriting digit classifier, support vector machine.

1 Introduction

Nowadays the world is digital. Technology has become ubiquitous in people's lives, and some human tasks, such as handwriting recognition, voice recognition, face recognition and others, are now machine tasks. The main recognition process [1, 2] used in this kind of application requires the following steps: data acquisition; pre-processing of the data to eliminate noise; segmentation, where the objects (text, numbers, faces, etc.) to be recognized are located and separated from the background; feature extraction, where the main features of each object are extracted; and finally recognition, or classification, where the objects are labeled based on their features.

This paper focuses on the classification task, and we used the handwritten digit recognition problem as a case study because this task can represent some classification issues. For example, the patterns can be ambiguous, or some features are similar in more than one group of classes. An example of this problem is presented in Fig. 1. In Fig. 1a and Fig. 1c the correct value of the image is seven, and in Fig. 1b, four. But Fig. 1a and 1b are similar and could be the same digit. Fig. 1c could be confused with a digit one. Because of this, building a classifier that generalizes well is a hard task. In some cases the best choice is to use context information to differentiate one class from the other. The Hidden Markov Model (HMM) [3] is a
technique frequently used to analyze the context and improve the classifier recognition rate, but its main disadvantage is the processing time. Context-modeling techniques are usually slower. Thus, our research focuses on studying the optimization and combination of classical approaches and on trying to introduce more knowledge into the classifier.

A brief overview of handwritten digit recognition research in recent years shows that classical classifiers such as the multilayer perceptron (MLP) [5], k-nearest neighbors (kNN) [2] and the support vector machine (SVM) [6] are extensively used. Some authors tried combinations of these classifiers to improve results [7, 8, 10, 11, 12]. The main problem of combining different techniques is that although we are combining the advantages, we are also joining the disadvantages of both.

The MLP [5] is a powerful classifier for multiclass problems, but there is a disadvantage when using back-propagation as the learning algorithm: it can stop training in a local minimum. It is possible to use the momentum strategy to escape from local minima; however, if we try to continue the training phase, the network can overfit the weights, decreasing the generalization capability. kNN [2] classifies a sample based on the distance from the patterns in the training set nearest to the sample. Thus, the more patterns there are in the training set, equally distributed between the classes, the higher the recognition rate. But the time to classify a sample depends on the number
of patterns in the training database. Therefore, this technique is usually slow.

The SVM [6] is considered the best binary classifier, because it finds the best margin of separation between two classes. The fact that the SVM is a binary classifier is its greatest disadvantage, as most recognition tasks are multiclass problems. To solve this, some authors try to combine SVMs [8] or use the SVM as a decision-maker classifier [9]. Based on these assumptions, this paper introduces a hierarchical SVM combination that provides a highly accurate recognition rate with a short time to answer when applied to handwritten digit recognition.

The present study is structured as follows: related works are presented in Section 2; the proposed SVM combination architecture in Section 3; the experiments and results in Section 4; and the final conclusions of this paper in Section 5.

2 Related Works

The Support Vector Machine (SVM) [6, 5] is a binary classification technique. The training phase consists of finding the support vectors for each class and creating a function that represents an optimal separation margin between the support vectors of different classes. Consequently, it is possible to obtain an optimal hyperplane for class separation. Analyzing the SVM and its previously presented characteristics, it seems similar to the perceptron [1], because it also tries to find a linear function to separate classes. But there are two main differences: the SVM discovers the optimal linear function, while the perceptron seeks to discover any linear separation function; and the second difference is that the SVM can deal with non-linearly separable data. To do that, the SVM uses a kernel function to increase the feature dimensionality, consequently turning the data linearly separable.

There are two classical ways to work with multiple classes using SVMs: one-against-all and one-against-one [13]. In the one-against-all approach, one SVM is created for each class. If we have 10 classes, for example, as in digit recognition, we will have 10 SVMs, one for each digit. In this way we train the SVM(0) to differentiate the class
0 from the other classes, labeling it as 1 and the other patterns as 0; the SVM(1) differentiates the class 1 from the other classes in the same way, and so on. In the recognition phase, the pattern is submitted to the 10 SVMs, and the SVM that replies with label 1 indicates the class of the pattern.
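As a concrete illustration of the one-against-all scheme just described, here is a minimal sketch. It is not the paper's code: the binary SVMs are replaced by a tiny 1-nearest-neighbor stub (`BinaryStub`) and the patterns are one-dimensional toy values, both invented here so the example stays self-contained; with a real SVM library, `train` and `decide` would wrap the SVM calls.

```python
class BinaryStub:
    """Stands in for one binary SVM: a 1-nearest-neighbor rule over 0/1-labeled data."""
    def train(self, features, labels):
        self.data = list(zip(features, labels))

    def decide(self, x):
        # Reply 1 if the nearest training point carries the positive label.
        return min(self.data, key=lambda fl: abs(fl[0] - x))[1]

def train_one_against_all(features, labels, classes):
    # One binary machine per class; the whole training set is reused,
    # only the labels change (class i -> 1, every other class -> 0).
    machines = {}
    for c in classes:
        relabeled = [1 if l == c else 0 for l in labels]
        m = BinaryStub()
        m.train(features, relabeled)
        machines[c] = m
    return machines

def classify(machines, x):
    # The machine replying 1 names the class; zero or several replies of 1
    # mean the pattern cannot be assigned a single class.
    answers = [c for c, m in machines.items() if m.decide(x) == 1]
    return answers[0] if len(answers) == 1 else None

# Toy 1-D "digits": class 0 near 0, class 1 near 5, class 2 near 10.
feats = [0.1, 0.2, 4.9, 5.1, 9.8, 10.2]
labs = [0, 0, 1, 1, 2, 2]
machines = train_one_against_all(feats, labs, classes=[0, 1, 2])
print(classify(machines, 5.0))  # -> 1
```

Note how `classify` returns `None` when zero or several machines answer 1: in that case a real system must either reject the pattern or break the tie.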
The training set is the same database for all SVMs, changing just the labels of the patterns: if the set is used to train the SVM(i), patterns classified as i are replaced by 1 and the other patterns by 0.

Neves et al. [8] presented the one-against-one combination applied to handwritten digit recognition. The authors used a different SVM for each possible pair of classes. In this approach, symmetric pairs, such as 0-1 and 1-0, are considered the same pair, whereas pairs of the same class, such as 1-1, are not considered. In the case of handwriting digit recognition, there are 10 classes
: 0 to 9. Pairing the classes, we have a grand total of 45 pairs, and each class appears in 9 SVMs. For example, the class 0 is present in the following pairs: 0-1, 0-2, 0-3, 0-4, 0-5, 0-6, 0-7, 0-8, 0-9. The training phase consists of the following steps: separate the database, creating 45 data subsets, one for
each pair of SVMs, where each subset contains only the patterns of its respective classes (for example, if the pair is responsible for differentiating 0 from 1, the subset (0,1) contains only 0 and 1 patterns); find the kernel function that best separates the classes; and train all 45 SVMs. The classification phase consists of submitting the pattern to all 45 SVMs and identifying the most frequent class among the outputs of the SVMs.

The main advantage of this algorithm comes from the optimal hyperplane separation given by each SVM: it produces a highly accurate recognition rate. However, the number of SVMs combined depends on the number of classes; if the problem has many classes, the system processing time increases.

Another way of using an SVM for handwriting digit recognition is as a decision classifier, as proposed by Bellili et al. [9]. In this work the authors observed that the correct recognition class is present in the highest MLP output in 97.45% of cases. However, if we consider the two highest MLP outputs, the percentage of correct classes increases to 99% of cases. The authors then verified which main pairs of classes were being confused. In their proposed method, when these pairs are present in the two highest MLP outputs, the authors use an SVM to decide which output corresponds to the correct class. For all other cases, they use the MLP output.

In [7] the main idea is to increase the kNN recognition rate using the SVM as a decision-maker classifier, similar to Bellili
et al. [9]. The adaptation in this case is to take the two most frequent classes among the k nearest neighbors and to use the SVM to decide between these two classes. It is a satisfactory technique for applications where a misclassification results in high costs and processing time or computational cost are not essential constraints.

In Ciresan et al. [17] the digit recognition task is performed using an MLP. The authors' argument is that the main problem in finding the correct MLP architecture is to produce a robust classifier from the training dataset; the proposal is to provide an MLP with enough hidden layers and neurons. Camastra [18] combines SVMs with neural gas. The method uses a neural gas network to verify whether the characters are capitalized or not, and then defines whether the upper- and lower-case forms of a character can be included in a single class. The characters are then submitted to the SVM recognizer
to obtain the final classification. In [19] the authors create an ensemble classifier, using gating networks to assemble the outputs given by three different neural networks. In [20] a method is proposed that uses a new feature extraction technique based on recursive subdivision of the character image. Combinations of MLPs and SVMs are also used to recognize non-Western languages.

3 Proposed Algorithm

After analyzing the state of the art, we propose another simple SVM combination. The main idea is to create a hierarchical SVM structure. The first level is composed of a set of SVMs. There is one SVM for
each class pair, but if one class is in one pair it cannot be in the other pairs. For example, in the case of digit recognition, we have 10 possible classes (outputs): 0 to 9. The first level will have five SVMs, one for each of these pairs: 0-1, 2-3, 4-5, 6-7 and 8-9. The pattern will be classified by each SVM in the first level. It is expected that the SVM trained with the correct classification set correctly classifies the sample; the others can choose any class of their pair. The second level will combine the outputs obtained in the first level, using the same strategy as the previous level. The process continues until there is only one output. An example of this hierarchical structure is shown in Fig. 2, where the letters a, b, x, y and i represent the outputs given by the SVMs and the numbers in parentheses represent the pair that each SVM can differentiate.

4 Methods, Experiments and Results

The images used in the experiments were extracted from the NIST SD19 database [3], which is a numerical database available from the American National Institute of Standards and Technology. Each image in this database contains a varied number of digits, as presented in Fig. 3. The images were separated into isolated digits using an algorithm which employed connected component labeling [10]; each label corresponds to an isolated digit. After segmentation, vertical and horizontal projections [15] were used to centralize the digit in the image. Images larger than 20x25 were cropped, removing only the extra white borders: if the object in the image is larger than 20x25, the white border is eliminated and the digit is resized. The size (20x25) was selected because the majority of digits are approximately of this size.

Each digit was manually separated and labeled into classes to be used in the supervised training of the classifiers. The final database contains a total of 11,377 digits; each class contains on average 1,150 digits. This digit database was separated into a training set and a test set. The training set contains 7,925 samples, approximately 800 digits per class. The test set contains 3,452 samples, approximately 350 digits per class. The feature vector is the same for all classifiers: it is the matrix image structured as a vector. However, before converting it into a vector, the image is resized to 12x15 to reduce the dimensionality of the feature vector, generating a vector with 180 binary features. This size was defined based on previous experiments.

4.1 Algorithms: Training and Configuration

The methods used in the experiments were MLP, kNN, MLP-SVM, 45 SVMs (one-against-one), kNN-SVM, SVM (one-against-all) and the proposed hierarchical SVM structure. In the experiments we tried to find the best configuration for each classifier. The configurations used are described below.

a) Multilayer perceptron. Number of inputs: 180 (based on the feature vector); number of hidden layers: one; number of hidden-layer nodes: 180 (during the experiments this number was decreased and increased; this was the best topology found); number of outputs: 10 (the number of possible classes); hyperbolic tangent as the activation function of all nodes; number of training epochs: 30,000; back-propagation (gradient descent) as the training algorithm.

b) SVMs (pairs). As mentioned before, a set of training databases was created for each possible pair, and 45 pairs were used. Polynomial and Gaussian radial basis function (RBF) kernels were used as kernel functions. The polynomial kernel presented the best results and was selected for the SVM pairs and the one-against-all SVMs.

c) kNN. The main parameters of kNN are the value of k and the distance measure used. The value of k was varied between 3 and 11. The distances used in the experiments were the Euclidean, Manhattan and Minkowski distances. The best results were found with k equal to 3 and the Euclidean distance.

All algorithms were implemented using Matlab [16], version R2010a, and were trained after parameter selection. The same trained MLPs and SVMs, and the same kNN parameters, were used throughout the methods.

4.2 Experiments and Results

Two criteria were used to compare the seven algorithms (MLP, kNN, MLP-SVM [9], 45 SVMs one-against-one [8], the hierarchical SVM combination, kNN-SVM [7] and SVM one-against-all): processing time and recognition rate. Table 1 presents the recognition rates obtained by the algorithms on the test set. Table 2 presents the average and standard deviation of the processing time (in seconds) to classify one pattern with each algorithm. In a statistical test of the hypothesis that the hierarchical SVM algorithm is the fastest (Table 2), the results showed that the hypothesis holds with 98% confidence.

Analyzing the SVM combinations, one-against-one and one-against-all, they present promising results, but they may be insufficient if the criteria for choosing a classifier are both short processing time and a high recognition rate. In one-against-all, the main difficulty is to find a kernel function that increases the feature space so that one class becomes linearly separable from all the others. If the number of classes increases, the complexity of finding an effective kernel function also increases. In some cases this combination does not return a valid output: for example, after submitting a pattern to all the SVMs, every output is labeled 0 and the pattern cannot be classified. In one-against-one, the number of SVMs used is based on the number of classes and can be obtained by formula (1) (for n classes, n(n-1)/2 pairs). The architecture proposed in this paper also depends on the number of classes, but the number of SVMs is significantly smaller.

The kNN-based techniques obtained high recognition rates and long processing times; such methods are therefore insufficient, for example, for online recognition tasks. Considering the recognition rate criterion, the SVM-based techniques obtained good results. The one-against-all approach was the only SVM method without a high recognition rate. However, analyzing carefully, we can see that of its 634 errors, 629 were not labeled: when all SVMs return the same classification it is impossible to classify the pattern (the sample is rejected by the classifier). Of the 634 errors, only 15 were misclassifications. When rejection is preferable to classification error, this approach can be a good choice.

The proposed hierarchical SVM was the third best classifier for the handwritten digits, but the difference between the first and the second is very small, so they are equivalent in the statistical tests. In this scenario the processing time becomes the main criterion: the kNN-SVM and 45 SVMs techniques are the slowest, and the proposed method is the fastest. Therefore, the hierarchical approach stands out for its high recognition rate within a short processing time. In Fig. 4, the processing times and error rates of the evaluated methods are normalized and plotted together; in this case, the best method is the one whose result is closest to the origin of the graph. This analysis shows that the method proposed in this paper is the best evaluated method.

5 Conclusion

This paper presented a method of applying SVMs to the handwriting recognition problem that considers both a short processing time and a highly accurate recognition rate. After a brief study of the related literature, it was found that classical classifiers are still used to recognize handwritten text because of their short processing times and high recognition rates. Newer approaches, such as Neves et al. [8] and Zanchettin et al. [7], increase the recognition rate but also increase the processing time and computational cost. Based on these criteria, classical classifiers and classifier combinations for handwritten digit recognition were implemented and tested. The proposed new method achieved the best results considering both processing time and recognition rate. The kNN-based techniques had the longest processing times and the highest recognition rates; on the other hand, the hierarchical SVM combination obtained a high recognition rate and the shortest processing time. The experiments showed that when both criteria are requirements, this SVM combination is the best choice. Future work will consider handwritten characters and try to combine the proposed method with word classification methods.
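The hierarchical combination of Section 3 reduces the surviving classes level by level, tournament-style. The sketch below is an illustration under stated assumptions, not the authors' implementation: each pairwise SVM is replaced by a nearest-neighbor stub over that pair's subset, the toy features are one-dimensional, and odd-sized levels pass the last class through unchanged (how the paper handles odd counts is not spelled out here, so the bye is an assumption).

```python
# Sketch of the proposed hierarchical SVM combination, with the pairwise
# SVMs replaced by a 1-nearest-neighbor stub so the example runs without
# an SVM library.

def make_pair_classifier(features, labels, a, b):
    # A real implementation would train an SVM on the (a, b) subset only;
    # here we keep just those points and answer with the nearest one's label.
    subset = [(f, l) for f, l in zip(features, labels) if l in (a, b)]
    def decide(x):
        return min(subset, key=lambda fl: abs(fl[0] - x))[1]
    return decide

def hierarchical_classify(features, labels, classes, x):
    # Level 1 pairs the classes 0-1, 2-3, ...; each later level pairs the
    # surviving outputs the same way until a single class remains.
    survivors = list(classes)
    while len(survivors) > 1:
        next_level = []
        for i in range(0, len(survivors) - 1, 2):
            a, b = survivors[i], survivors[i + 1]
            decide = make_pair_classifier(features, labels, a, b)
            next_level.append(decide(x))
        if len(survivors) % 2 == 1:      # odd count: last class gets a bye
            next_level.append(survivors[-1])
        survivors = next_level
    return survivors[0]

# Toy data: class c clusters around 10*c.
feats = [10 * c + d for c in range(10) for d in (-0.3, 0.2)]
labs = [c for c in range(10) for _ in (0, 1)]
print(hierarchical_classify(feats, labs, list(range(10)), 37.0))  # -> 4
```

For 10 classes a pattern is evaluated by 9 pairwise machines in total (5 + 2 + 1 + 1), against 45 in the one-against-one scheme, which is consistent with the shorter processing time the hierarchical combination is designed for.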