Powerset Convolutional Neural Networks

Chris Wendler
Department of Computer Science
ETH Zurich, Switzerland
chris.wendler@inf.ethz.ch

Dan Alistarh
IST Austria
dan.alistarh@ist.ac.at

Markus Püschel
Department of Computer Science
ETH Zurich, Switzerland
pueschel@inf.ethz.ch

Abstract

We present a novel class of convolutional neural networks (CNNs) for set functions, i.e., data indexed with the powerset of a finite set. The convolutions are derived as linear, shift-equivariant functions for various notions of shifts on set functions. The framework is fundamentally different from graph convolutions based on the Laplacian, as it provides not one but several basic shifts, one for each element in the ground set. Prototypical experiments with several set function classification tasks on synthetic datasets and on datasets derived from real-world hypergraphs demonstrate the potential of our new powerset CNNs.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

1 Introduction

Deep learning-based methods provide state-of-the-art approaches for various image learning and natural language processing tasks, such as image classification [22, 28], object detection [41], semantic image segmentation [42], image synthesis [20], language translation and understanding [23, 62], and speech synthesis [58].
However, an artifact of many of these models is that regularity priors are hidden in their fundamental neural building blocks, which makes it impossible to apply them directly to irregular data domains. For instance, image convolutional neural networks (CNNs) are based on parametrized 2D convolutional filters with local support, while recurrent neural networks share model parameters across different time stamps. Both architectures share parameters in a way that exploits the symmetries of the underlying data domains. In order to port deep learners to novel domains, corresponding parameter sharing schemes reflecting the symmetries of the target data have to be developed [40]. An example are neural architectures for graph data, i.e., data indexed by the vertices of a graph. Graph CNNs define graph convolutional layers by utilizing results from algebraic graph theory for the graph Laplacian [9, 51], and message passing neural networks [18, 47] generalize recurrent neural architectures from chain graphs to general graphs. With these building blocks in place, neural architectures for supervised [16, 18, 50], semi-supervised [25], and generative learning [52, 59] on graphs have been deployed. These research endeavors fall under the umbrella term of geometric deep learning (GDL) [10].

In this work, we want to open the door for deep learning on set functions, i.e., data indexed by the powerset of a finite set. There are (at least) three ways to do so. First, set functions can be viewed as data indexed by a hypercube graph, which makes graph neural nets applicable. Second, results from the Fourier analysis of set functions based on the Walsh-Hadamard transform (WHT) [15, 33, 54] can be utilized to formulate a convolution for set functions in a similar way to [51]. Third, [36] introduces several novel notions of convolution for set functions (powerset convolution), as linear, equivariant functions for different notions of shift on set functions. This derivation parallels the standard 2D convolution (equivariant to translations) and graph convolutions (equivariant to the Laplacian or adjacency shift) [34]. A general theory for deriving new forms of convolutions, associated Fourier transforms, and other signal processing tools is outlined in [38].
Contributions  Motivated by the work on generalized convolutions and by the potential utility of deep learning on novel domains, we propose a method-driven approach for deep learning on irregular data domains and, in particular, set functions: We formulate novel powerset CNN architectures by integrating recent convolutions [36] and proposing novel pooling layers for set functions. As a prototypical application, we consider the set function classification task. Since there is little prior work in this area, we evaluate our powerset CNNs on three synthetic classification tasks (submodularity and spectral properties) and two classification tasks on data derived from real-world hypergraphs [5]. For the latter, we design classifiers to identify the origin of the extracted subhypergraph. To deal with hypergraph data, we introduce several set-function-based hypergraph representations. We validate our architectures experimentally, and show that they generally outperform the natural fully-connected and graph-convolutional baselines on a range of scenarios and hyperparameter values.

2 Convolutions on Set Functions

We introduce background and definitions for set functions and associated convolutions.
For context and analogy, we first briefly review prior convolutions for 2D and graph data. From the signal processing perspective, 2D convolutions are linear, shift-equivariant functions on images $s: \mathbb{Z}^2 \to \mathbb{R};\ (i,j) \mapsto s_{i,j}$, where the shifts are the translations $T_{(k,l)}s = (s_{i-k,j-l})_{i,j \in \mathbb{Z}^2}$. The 2D convolution thus becomes

$$(h * s)_{i,j} = \sum_{k,l \in \mathbb{Z}^2} h_{k,l}\, s_{i-k,j-l}. \quad (1)$$

Equivariance means that all convolutions commute with all shifts: $h * (T_{(k,l)}s) = T_{(k,l)}(h * s)$. Convolutions on vertex-indexed graph signals $s: V \to \mathbb{R};\ v \mapsto s_v$ are linear and equivariant with respect to the Laplacian shifts $T_k s = L^k s$, where $L$ is the graph Laplacian [51].
Set functions  With this intuition in place, we now consider set functions. We fix a finite set $N = \{x_1, \dots, x_n\}$. An associated set function is a signal on the powerset of $N$:

$$s: 2^N \to \mathbb{R};\ A \mapsto s_A. \quad (2)$$

Powerset convolution  A convolution for set functions is obtained by specifying the shifts to which it is equivariant. The work in [36] specifies $T_Q s = (s_{A \setminus Q})_{A \subseteq N}$ as one possible choice of shifts, for $Q \subseteq N$. Note that in this case the shift operators are parametrized by the monoid $(2^N, \cup)$, since for all $s$

$$T_Q(T_R s) = (s_{(A \setminus R) \setminus Q})_{A \subseteq N} = (s_{A \setminus (R \cup Q)})_{A \subseteq N} = T_{R \cup Q}\, s,$$

which implies $T_Q T_R = T_{Q \cup R}$. The corresponding linear, shift-equivariant powerset convolution is given by [36] as

$$(h * s)_A = \sum_{Q \subseteq N} h_Q\, s_{A \setminus Q}. \quad (3)$$

Note that the filter $h$ is itself a set function. Table 1 contains an overview of generalized convolutions and the associated shift operations to which they are equivariant.
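For concreteness, (3) can be evaluated directly by encoding subsets of $N$ as $n$-bit masks, so that $A \setminus Q$ becomes the bit operation `A & ~Q`. The following NumPy sketch is ours (an illustration, not the paper's implementation); it also checks shift-equivariance numerically:

```python
import numpy as np

def powerset_convolve(h, s, n):
    """Direct evaluation of the powerset convolution (3).
    Subsets of N = {x_1, ..., x_n} are n-bit masks (bit i <-> x_{i+1});
    h and s are arrays of length 2**n indexed by these masks."""
    t = np.zeros(2**n)
    for A in range(2**n):
        for Q in range(2**n):
            t[A] += h[Q] * s[A & ~Q]  # A & ~Q is the bitmask of A \ Q
    return t

rng = np.random.default_rng(0)
n = 3
h, s = rng.standard_normal(2**n), rng.standard_normal(2**n)
t = powerset_convolve(h, s, n)

# Shift-equivariance: h * (T_Q s) == T_Q (h * s) with (T_Q s)_A = s_{A \ Q}.
Q = 0b011
TQs = np.array([s[A & ~Q] for A in range(2**n)])
assert np.allclose(powerset_convolve(h, TQs, n),
                   np.array([t[A & ~Q] for A in range(2**n)]))
```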
Fourier transform  Each filter $h$ defines a linear operator $\bar{h} := (h * \cdot)$, obtained by fixing $h$ in (3). It is diagonalized by the powerset Fourier transform

$$F = \begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix}^{\otimes n} = \underbrace{\begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix} \otimes \cdots \otimes \begin{pmatrix} 1 & 0 \\ 1 & -1 \end{pmatrix}}_{n \text{ times}}, \quad (4)$$

where $\otimes$ denotes the Kronecker product. Note that $F^{-1} = F$ in this case and that the spectrum is also indexed by subsets $B \subseteq N$. In particular, we have

$$F \bar{h} F^{-1} = \mathrm{diag}\big((\hat{h}_B)_{B \subseteq N}\big), \quad (5)$$

in which $\hat{h}$ denotes the frequency response of the filter $h$ [36]. We denote the linear mapping from $h$ to its frequency response $\hat{h}$ by $\bar{F}$, i.e., $\hat{h} = \bar{F} h$.

| | signal | shifted signal | convolution | reference | CNN |
|---|---|---|---|---|---|
| image | $(s_{i,j})_{i,j}$ | $(s_{i-k,j-l})_{i,j \in \mathbb{Z}^2}$ | $\sum_{k,l} h_{k,l}\, s_{i-k,j-l}$ | standard | standard |
| graph Laplacian | $(s_v)_{v \in V}$ | $(L^k s)_{v \in V}$ | $(\sum_k h_k L^k s)_v$ | [51] | [9] |
| graph adjacency | $(s_v)_{v \in V}$ | $(A^k s)_{v \in V}$ | $(\sum_k h_k A^k s)_v$ | [44] | [55] |
| group | $(s_g)_{g \in G}$ | $(s_{q^{-1}g})_{g \in G}$ | $\sum_q h_q\, s_{q^{-1}g}$ | [53] | [13] |
| group spherical | $(s_R)_{R \in SO(3)}$ | $(s_{Q^{-1}R})_{R \in SO(3)}$ | $\int h_Q\, s_{Q^{-1}R}\, d\mu(Q)$ | [12] | [12] |
| powerset | $(s_A)_{A \subseteq N}$ | $(s_{A \setminus Q})_{A \subseteq N}$ | $\sum_Q h_Q\, s_{A \setminus Q}$ | [36] | this paper |

Table 1: Generalized convolutions and their shifts.
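As a small illustration (ours), the following sketch builds $F$ from (4) via Kronecker products and verifies the diagonalization (5) for a random filter; the last lines show how (3) can be computed in the frequency domain:

```python
import numpy as np

def powerset_ft(n):
    """Powerset Fourier transform of eq. (4): F = [[1, 0], [1, -1]]^(kron n)."""
    F1 = np.array([[1.0, 0.0], [1.0, -1.0]])
    F = np.array([[1.0]])
    for _ in range(n):
        F = np.kron(F, F1)
    return F

n = 3
F = powerset_ft(n)
assert np.allclose(F @ F, np.eye(2**n))  # F is its own inverse (F^{-1} = F)

# Build the operator h_bar = (h * .) as a matrix and check eq. (5).
rng = np.random.default_rng(1)
h = rng.standard_normal(2**n)
M = np.zeros((2**n, 2**n))
for A in range(2**n):
    for Q in range(2**n):
        M[A, A & ~Q] += h[Q]  # (h * s)_A collects s_{A \ Q} with weight h_Q
D = F @ M @ F  # = F h_bar F^{-1}; diagonal, with the frequency response on it
assert np.allclose(D, np.diag(np.diag(D)))

# Convolution via the frequency domain: h * s = F^{-1} diag(h_hat) F s.
s = rng.standard_normal(2**n)
assert np.allclose(F @ (np.diag(D) * (F @ s)), M @ s)
```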
Other shifts and convolutions  There are several other possible definitions of set shifts, each coming with its respective convolutions and Fourier transforms [36]. Two additional examples are the union shift $T^{\cup}_Q s = (s_{A \cup Q})_{A \subseteq N}$ and the symmetric difference shift $T^{\triangle}_Q s = (s_{(A \setminus Q) \cup (Q \setminus A)})_{A \subseteq N}$ [54]. The associated convolutions are, respectively,

$$(h * s)_A = \sum_{Q \subseteq N} h_Q\, s_{A \cup Q} \quad \text{and} \quad (h * s)_A = \sum_{Q \subseteq N} h_Q\, s_{(A \setminus Q) \cup (Q \setminus A)}. \quad (6)$$

Localized filters  Filters $h$ with $h_Q = 0$ for $|Q| > k$ are $k$-localized in the sense that the evaluation of $(h * s)_A$ only depends on evaluations of $s$ on sets differing by at most $k$ elements from $A$. In particular, 1-localized filters $(h * s)_A = h_\emptyset s_A + \sum_{x \in N} h_{\{x\}} s_{A \setminus \{x\}}$ are the counterpart of the one-hop filters typically used in graph CNNs [25]. In contrast to the omnidirectional one-hop graph filters, these one-hop filters have one direction per element in $N$.
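A sketch (ours, same bitmask encoding as above) of how 1-localized filters reduce the cost of evaluating (3) from $O(4^n)$ to $O(n\,2^n)$ operations:

```python
import numpy as np

def one_hop_convolve(h_empty, h_single, s, n):
    """1-localized filter: (h * s)_A = h_empty * s_A + sum_i h_single[i] * s_{A \ {x_{i+1}}}.
    h_empty: coefficient for Q = {}; h_single: length-n array for the singletons."""
    t = h_empty * s
    for A in range(2**n):
        for i in range(n):
            t[A] += h_single[i] * s[A & ~(1 << i)]  # clearing bit i removes x_{i+1}
    return t
```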
2.1 Applications of Set Functions

Set functions are of practical importance across a range of research fields. Several optimization tasks, such as cost-effective sensor placement [27], optimal ad placement [19], and tasks such as semantic image segmentation [35], can be reduced to subset selection tasks, in which a set function determines the value of every subset and has to be maximized to find the best one. In combinatorial auctions, set functions can be used to describe bidding behavior. Each bidder is represented as a valuation function that maps each subset of goods to its subjective value to the customer [14]. Cooperative games are set functions [8]. A coalition is a subset of players, and a coalition game assigns a value to every subset of players. In the simplest case, the value one is assigned to winning and the value zero to losing coalitions. Further, graphs and hypergraphs also admit set function representations:

Definition 1 (Hypergraph). A hypergraph is a triple $H = (V, E, w)$, where $V = \{v_1, \dots, v_n\}$ is a set of vertices, $E \subseteq \mathcal{P}(V) \setminus \{\emptyset\}$ is a set of hyperedges, and $w: E \to \mathbb{R}$ is a weight function.

The weight function of a hypergraph is a set function on $V$, obtained by setting $s_A = w_A$ if $A \in E$ and $s_A = 0$ otherwise. Additionally, hypergraphs induce two set functions, namely the hypergraph cut and association score functions:

$$\mathrm{cut}_A = \sum_{\substack{B \in E,\ B \cap A \neq \emptyset,\\ B \cap (V \setminus A) \neq \emptyset}} w_B \quad \text{and} \quad \mathrm{assoc}_A = \sum_{B \in E,\ B \subseteq A} w_B. \quad (7)$$
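The following sketch (ours) builds the cut and association score functions of (7) from a hyperedge list, with hyperedges encoded as bitmasks over the vertex set; the example edge list and weights are made up for illustration:

```python
import numpy as np

def hypergraph_set_functions(edges, weights, n):
    """Cut and association score functions of eq. (7) for a hypergraph on
    vertices {v_1, ..., v_n}; edges are bitmasks, returned arrays are
    indexed by the subsets A of the vertex set."""
    cut, assoc = np.zeros(2**n), np.zeros(2**n)
    for A in range(2**n):
        comp = (2**n - 1) & ~A  # V \ A as a bitmask
        for B, w in zip(edges, weights):
            if (B & A) and (B & comp):  # B touches both A and V \ A
                cut[A] += w
            if (B & ~A) == 0:           # B is a subset of A
                assoc[A] += w
    return cut, assoc

# Example: one 3-edge and one 2-edge on three vertices.
edges, weights = [0b111, 0b011], [1.0, 2.0]
cut, assoc = hypergraph_set_functions(edges, weights, n=3)
```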
2.2 Convolutional Pattern Matching

The powerset convolution in (3) raises the question of which patterns are "detected" by a filter $(h_Q)_{Q \subseteq N}$. In other words, to which signal does the filter $h$ respond strongest when evaluated at a given subset $A$? We call this signal $p^A$ (the pattern matched at position $A$). Formally,

$$p^A = \operatorname*{argmax}_{s:\ \|s\| = 1} (h * s)_A. \quad (8)$$

For $p^N$, the answer is $p^N = (1/\|h\|)(h_{N \setminus B})_{B \subseteq N}$. This is because the dot product $\langle h, \check{s} \rangle$, with $\check{s}_A = s_{N \setminus A}$, is maximal if $h$ and $\check{s}$ are aligned. Slightly rewriting (3) yields the answer for the general case $A \subseteq N$:

$$(h * s)_A = \sum_{Q \subseteq N} h_Q\, s_{A \setminus Q} = \sum_{Q_1 \subseteq A} \Big( \underbrace{\sum_{Q_2 \subseteq N \setminus A} h_{Q_1 \cup Q_2}}_{=:\ h'_{Q_1}} \Big)\, s_{A \setminus Q_1}. \quad (9)$$

Namely, (9) shows that the powerset convolution evaluated at position $A$ can be seen as the convolution of a new filter $h'$ with $s$ restricted to the powerset $2^A$, evaluated at position $A$, the case for which we know the answer: $p^A_B = (1/\|h'\|)\, h'_{A \setminus B}$ if $B \subseteq A$ and $p^A_B = 0$ otherwise.

Example 1 (One-hop patterns). For a one-hop filter $h$, i.e., $(h * s)_A = h_\emptyset s_A + \sum_{x \in N} h_{\{x\}} s_{A \setminus \{x\}}$, the pattern matched at position $A$ takes the form

$$p^A_B = \begin{cases} \frac{1}{\|h'\|}\big(h_\emptyset + \sum_{x \in N \setminus A} h_{\{x\}}\big) & \text{if } B = A, \\[2pt] \frac{1}{\|h'\|}\, h_{\{x\}} & \text{if } B = A \setminus \{x\} \text{ with } x \in A, \\[2pt] 0 & \text{else.} \end{cases} \quad (10)$$

Here, $h'$ corresponds to the filter restricted to the powerset $2^A$ as in (9). Notice that this behavior is different from 1D and 2D convolutions: there, the underlying shifts (translations) are invertible and thus the detected patterns are again shifted versions of each other. For example, the 1D convolutional filter $(h_q)_{q \in \mathbb{Z}}$ matches $p^0 = (h_{-q})_{q \in \mathbb{Z}}$ at position 0 and $p^t = T_t p^0 = (h_{-q+t})_{q \in \mathbb{Z}}$ at position $t$, and the group convolutional filter $(h_q)_{q \in G}$ matches $p^e = (h_{q^{-1}})_{q \in G}$ at the unit element $e$ and $p^g = T_{g^{-1}} p^e = (h_{gq^{-1}})_{q \in G}$ at position $g$. Since powerset shifts are not invertible, the patterns detected by a filter are not just (set-)shifted versions of each other, as shown above. A similar behavior can be expected with graph convolutions, since the Laplacian shift is never invertible and the adjacency shift is not always invertible.
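A short numerical check of (8)-(9) (our sketch): we compute $h'$ and the pattern $p^A$, and verify that it responds at least as strongly at position $A$ as random unit-norm signals:

```python
import numpy as np

def matched_pattern(h, A, n):
    """Pattern p^A from eq. (9): h'_{Q1} = sum_{Q2 subset of N\A} h_{Q1 u Q2},
    p^A_B = h'_{A\B} / ||h'|| for B subset of A, and p^A_B = 0 otherwise."""
    hp = np.zeros(2**n)
    for Q in range(2**n):
        hp[Q & A] += h[Q]  # Q splits uniquely into Q1 = Q n A and Q2 = Q \ A
    p = np.zeros(2**n)
    for B in range(2**n):
        if (B & ~A) == 0:  # B is a subset of A
            p[B] = hp[A & ~B]
    return p / np.linalg.norm(p)

rng = np.random.default_rng(2)
n, A = 3, 0b101
h = rng.standard_normal(2**n)
p = matched_pattern(h, A, n)
respond = lambda sig: sum(h[Q] * sig[A & ~Q] for Q in range(2**n))  # (h * sig)_A
for _ in range(100):
    r = rng.standard_normal(2**n)
    assert respond(p) >= respond(r / np.linalg.norm(r)) - 1e-9
```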
3 Powerset Convolutional Neural Networks

Convolutional layers  We define a convolutional layer by extending the convolution to multiple channels, summing up the feature maps obtained by channel-wise convolution as in [10]:

Definition 2 (Powerset convolutional layer). A powerset convolutional layer is defined as follows:

1. The input is given by $n_c$ set functions $s = (s^{(1)}, \dots, s^{(n_c)}) \in \mathbb{R}^{2^N \times n_c}$;
2. The output is given by $n_f$ set functions $t = L_\Phi(s) = (t^{(1)}, \dots, t^{(n_f)}) \in \mathbb{R}^{2^N \times n_f}$;
3. The layer applies a bank of set function filters $\Phi = (h^{(i,j)})_{i,j}$, with $i \in \{1, \dots, n_c\}$ and $j \in \{1, \dots, n_f\}$, and a point-wise non-linearity $\sigma$, resulting in

$$t^{(j)}_A = \sigma\Big(\sum_{i=1}^{n_c} \big(h^{(i,j)} * s^{(i)}\big)_A\Big). \quad (11)$$
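A vectorized NumPy sketch (ours) of the layer in Definition 2; ReLU is our choice for illustration, since the definition leaves the point-wise non-linearity $\sigma$ unspecified:

```python
import numpy as np

def powerset_conv_layer(S, H, n):
    """Powerset convolutional layer, eq. (11). S: (2**n, n_c) input channels;
    H: (2**n, n_c, n_f) filter bank. Returns (2**n, n_f) feature maps."""
    idx = np.arange(2**n)
    diff = idx[:, None] & ~idx[None, :]       # diff[A, Q] = bitmask of A \ Q
    T = np.einsum('qij,aqi->aj', H, S[diff])  # sum_i sum_Q H[Q,i,j] * S[A\Q, i]
    return np.maximum(T, 0.0)                 # ReLU as the point-wise sigma

rng = np.random.default_rng(3)
n, n_c, n_f = 4, 2, 3
S = rng.standard_normal((2**n, n_c))
H = rng.standard_normal((2**n, n_c, n_f))
T = powerset_conv_layer(S, H, n)  # shape (16, 3)
```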
Pooling layers  As in conventional CNNs, we define powerset pooling layers to gain additional robustness with respect to input perturbations, and to control the number of features extracted by the convolutional part of the powerset CNN. From a signal processing perspective, the crucial aspect of the pooling operation is that the pooled signal lives on a valid signal domain, i.e., a powerset. One way to achieve this is by combining elements of the ground set.

[Figure 1 (pipeline: input → conv1 → pool1 → conv2 → pool2 → MLP)] Figure 1: Forward pass of a simple powerset CNN with two convolutional and two pooling layers. Set functions are depicted as signals on the powerset lattice.

Definition 3 (Powerset pooling). Let $N'(X)$ be the ground set of size $n - |X| + 1$ obtained by combining all the elements in $X \subseteq N$ into a single element. E.g., for $X = \{x_1, x_2\}$ we get $N'(X) = \{x_{\{x_1, x_2\}}, x_3, \dots, x_n\}$. Therefore every subset $X \subseteq N$ defines a pooling operation

$$P_X: \mathbb{R}^{2^N} \to \mathbb{R}^{2^{N'(X)}}:\ (s_A)_{A \subseteq N} \mapsto (s_B)_{B:\ B \cap X = X \text{ or } B \cap X = \emptyset}. \quad (12)$$

In our experiments we always use $P := P_{\{x_1, x_2\}}$. It is also possible to pool a set function by combining elements of the powerset as in [48], or by the simple rule $s_B = \max(s_B, s_{B \cup \{x\}})$ for $B \subseteq N \setminus \{x\}$.
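A sketch (ours) of the pooling operation $P_{\{x_1, x_2\}}$ from (12) in the bitmask encoding, where $x_1$ and $x_2$ correspond to bits 0 and 1:

```python
import numpy as np

def powerset_pool(s, n):
    """Pooling P_{{x_1, x_2}} of eq. (12): merge x_1 and x_2 into one element by
    keeping only subsets B with B n {x_1, x_2} equal to the full pair or empty.
    Input has length 2**n, output length 2**(n-1)."""
    out = []
    for B in range(2**n):
        low = B & 0b11
        if low == 0b00 or low == 0b11:  # B contains both merged elements or neither
            out.append(s[B])
    return np.array(out)  # a set function on the ground set N'({x_1, x_2})

s = np.arange(2**3, dtype=float)  # toy set function on n = 3 elements
print(powerset_pool(s, 3))        # [0., 3., 4., 7.], indexed by subsets of N'
```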
Then, a pooling layer is obtained by applying our pooling strategy to every channel.

Definition 4 (Powerset pooling layer). A powerset pooling layer takes $n_c$ set functions as input $s = (s^{(1)}, \dots, s^{(n_c)}) \in \mathbb{R}^{2^N \times n_c}$ and outputs $n_c$ pooled set functions $t = L_P(s) = (t^{(1)}, \dots, t^{(n_c)}) \in \mathbb{R}^{2^{N'} \times n_c}$, with $|N'| = |N| - 1$, by applying the pooling operation to every channel:

$$t^{(i)} = P(s^{(i)}). \quad (13)$$

Powerset CNN  A powerset CNN is a composition of several powerset convolutional and pooling layers. Depending on the task, the outputs of the convolutional component can be fed into a multi-layer perceptron, e.g., for classification. Fig. 1 illustrates a forward pass of a powerset CNN with two convolutional layers, each of which is followed by a pooling layer. The first convolutional layer is parameterized by three one-hop filters and the second one is parameterized by fifteen (three times five) one-hop filters. The filter coefficients were initialized with random weights for this illustration.
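Putting the pieces together, a minimal forward pass mirroring Fig. 1 (our sketch, reusing `powerset_conv_layer` and `powerset_pool` from the sketches above; it follows the 1 → 3 → 5 channel progression of the figure but uses dense rather than one-hop filters):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4                                  # small example ground set, |N| = 4
x = rng.standard_normal((2**n, 1))     # one input set function

H1 = rng.standard_normal((2**n, 1, 3))  # conv1: 1 -> 3 channels
x = powerset_conv_layer(x, H1, n)
x = np.stack([powerset_pool(x[:, c], n) for c in range(x.shape[1])], axis=1)
n -= 1                                  # pool1 merges x_1 and x_2

H2 = rng.standard_normal((2**n, 3, 5))  # conv2: 3 -> 5 channels
x = powerset_conv_layer(x, H2, n)
x = np.stack([powerset_pool(x[:, c], n) for c in range(x.shape[1])], axis=1)
n -= 1                                  # pool2 merges two more elements

W = rng.standard_normal((x.size, 4))    # MLP head, e.g., for 4 classes
logits = x.reshape(-1) @ W
```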
Implementation¹  We implemented the powerset convolutional and pooling layers in TensorFlow [1]. Our implementation supports various definitions of powerset shifts, and utilizes the respective Fourier transforms to compute the convolutions in the frequency domain.
4 Experimental Evaluation

Our powerset CNN is built on the premise that the successful components of conventional CNNs are domain-independent and only rely on the underlying concepts of shift and shift-equivariant convolutions. In particular, if we use only one-hop filters, our powerset CNN satisfies locality and compositionality. Thus, similar to image CNNs, it should be able to learn localized hierarchical features. To understand whether this is useful when applied to set function classification problems, we evaluate our powerset CNN architectures on three synthetic tasks and on two tasks based on real-world hypergraph data.

Problem formulation  Intuitively, our set function classification task requires the models to learn to classify a collection of set functions sampled from some natural distributions. One such example would be to classify (hyper-)graphs coming from some underlying data distributions. Formally, the set function classification problem is characterized by a training set $\{(s^{(i)}, t^{(i)})\}_{i=1}^m \subseteq (\mathbb{R}^{2^N} \times C)$ composed of (set function, label) pairs, as well as a test set. The learning task is to utilize the training set to learn a mapping from the space of set functions $\mathbb{R}^{2^N}$ to the label space $C = \{1, \dots, k\}$.

¹Sample implementations are provided at

4.1 Synthetic Datasets

Unless stated otherwise, we consider the ground set $N = \{x_1, \dots, x_n\}$ with $n = 10$, and sample 10,000 set functions per class. We use 80% of the samples for training, and the remaining 20% for testing. We only use one random split per dataset. Given this, we generated the following three synthetic datasets, meant to illustrate specific applications of our framework.
Spectral patterns  In order to obtain non-trivial classes of set functions, we define a sampling procedure based on the Fourier expansion associated with the shift $T_Q s = (s_{A \setminus Q})_{A \subseteq N}$. In particular, we sample Fourier-sparse set functions $s = F^{-1}\hat{s}$ with $\hat{s}$ sparse. We implement this by associating each target "class" with a collection of frequencies, and sample normally distributed Fourier coefficients for these frequencies. In our example, we defined four classes, where the Fourier support of the first and second class is obtained by randomly selecting roughly half of the frequencies. For the third class we use the entire spectrum, while for the fourth we use the frequencies that are either in both class one's and class two's Fourier support, or in neither of them.
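A sketch (ours) of this sampling procedure: draw normally distributed Fourier coefficients on a class-specific support and transform back, using $F^{-1} = F$ from (4):

```python
import numpy as np

def sample_fourier_sparse(support, n, rng):
    """Sample s = F^{-1} s_hat, where s_hat has normally distributed entries on
    the given frequency support (subsets B as bitmasks) and zeros elsewhere."""
    F1 = np.array([[1.0, 0.0], [1.0, -1.0]])
    F = np.array([[1.0]])
    for _ in range(n):
        F = np.kron(F, F1)  # eq. (4); F is its own inverse
    s_hat = np.zeros(2**n)
    s_hat[list(support)] = rng.standard_normal(len(support))
    return F @ s_hat

# Example: a class whose support is a random half of all 2**n frequencies.
rng = np.random.default_rng(5)
n = 5
support = rng.choice(2**n, size=2**(n - 1), replace=False)
s = sample_fourier_sparse(support, n, rng)
```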
k-junta classification  A $k$-junta [33] is a boolean function defined on $n$ variables $x_1, \dots, x_n$ that only depends on $k$ of the variables: $x_{i_1}, \dots, x_{i_k}$. In the same spirit, we call a set function a $k$-junta if its evaluations only depend on the presence or absence of $k$ of the $n$ elements of the ground set:

Definition 5 ($k$-junta). A set function $s$ on the ground set $N$ is called a $k$-junta if there exists a subset $N' \subseteq N$ with $|N'| = k$ such that $s(A) = s(A \cap N')$ for all $A \subseteq N$.

We generate a $k$-junta classification dataset by sampling random $k$-juntas for $k \in \{3, \dots, 7\}$. We do so by utilizing the fact that shifting a set function by $\{x\}$ eliminates its dependency on $x$: for $A$ with $x \in A$ we have $(T_{\{x\}} s)_A = s_{A \setminus \{x\}} = (T_{\{x\}} s)_{A \setminus \{x\}}$, because $(A \setminus \{x\}) \setminus \{x\} = A \setminus \{x\}$. Therefore, sampling a random $k$-junta amounts to first sampling a random value for every subset $A \subseteq N$ and then performing $n - k$ set shifts by randomly selected singleton sets.
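A sketch (ours) of this generation procedure; in the bitmask encoding, each singleton shift is a simple gather:

```python
import numpy as np

def sample_k_junta(n, k, rng):
    """Sample a random k-junta: draw a random value for every subset, then apply
    n - k singleton shifts (T_{{x}} s)_A = s_{A \ {x}}, each of which removes
    the dependency on the chosen element x."""
    s = rng.standard_normal(2**n)
    dropped = rng.choice(n, size=n - k, replace=False)  # distinct elements to eliminate
    for i in dropped:
        s = s[np.arange(2**n) & ~(1 << i)]  # gather s_{A \ {x_{i+1}}} for all A
    return s

s = sample_k_junta(n=10, k=3, rng=np.random.default_rng(6))
```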
Submodularity classification  A set function $s$ is submodular if it satisfies the diminishing returns property: for all $A, B \subseteq N$ with $A \subseteq B$ and all $x \in N \setminus B$,

$$s_{A \cup \{x\}} - s_A \geq s_{B \cup \{x\}} - s_B.$$
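A brute-force check of the diminishing returns property (our sketch; the test function $s_A = \sqrt{|A|}$ is a standard submodular example, as its marginal gains decrease with $|A|$):

```python
import numpy as np

def is_submodular(s, n, tol=1e-9):
    """Brute-force diminishing-returns check:
    s_{A u {x}} - s_A >= s_{B u {x}} - s_B for all A subset of B and x not in B."""
    for B in range(2**n):
        A = B
        while True:  # enumerate all submasks A of B
            for i in range(n):
                x = 1 << i
                if not B & x and s[A | x] - s[A] < s[B | x] - s[B] - tol:
                    return False
            if A == 0:
                break
            A = (A - 1) & B
    return True

n = 4
s = np.sqrt([bin(A).count("1") for A in range(2**n)])  # s_A = sqrt(|A|)
assert is_submodular(s, n)
```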