iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations

上傳人：我*** IP屬地：北京上傳時(shí)間：2020-06-15 格式：PDF 頁(yè)數(shù)：12 大?。?.53MB 積分：12 舉報(bào) 版權(quán)申訴

iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations_第2頁(yè)

iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations_第3頁(yè)

iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations_第4頁(yè)

iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations_第5頁(yè)

已閱讀5頁(yè)，還剩7頁(yè)未讀，繼續(xù)免費(fèi)閱讀

版權(quán)說(shuō)明：本文檔由用戶提供并上傳，收益歸屬內(nèi)容提供方，若內(nèi)容存在侵權(quán)，請(qǐng)進(jìn)行舉報(bào)或認(rèn)領(lǐng)

文檔簡(jiǎn)介

Certifi able Robustness to Graph Perturbations Aleksandar Bojchevski Technical University of Munich a.bojchevskiin.tum.de Stephan Gnnemann Technical University of Munich guennemannin.tum.de Abstract Despite the exploding interest in graph neural networks there has been little effort to verify and improve their robustness. This is even more alarming given recent fi ndings showing that they are extremely vulnerable to adversarial attacks on both the graph structure and the node attributes. We propose the fi rst method for verifying certifi able (non-)robustness to graph perturbations for a general class of models that includes graph neural networks and label/feature propagation. By exploiting connections to PageRank and Markov decision processes our certifi cates can be effi ciently (and under many threat models exactly) computed. Furthermore, we investigate robust training procedures that increase the number of certifi ably robust nodes while maintaining or improving the clean predictive accuracy. 1Introduction As the number of machine learning models deployed in the real world grows, questions regarding their robustness become increasingly important. In particular, it is critical to assess their vulnerability to adversarial attacks deliberate perturbations of the data designed to achieve a specifi c (malicious) goal. Graph-based models suffer from poor adversarial robustness 13,60, yet in domains where they are often deployed (e.g. the Web) 50, adversaries are pervasive and attacks have a low cost 9,26. Even in scenarios where adversaries are not present such analysis is important since it allows us to reason about the behavior of our models in the worst case (i.e. treating nature as an adversary). Here we focus on semi-supervised node classifi cation given a single large (attributed) graph and the class labels of a few nodes the goal is to predict the labels of the remaining unlabelled nodes. Graph Neural Networks (GNNs) have emerged as the de-facto way to tackle this task, signifi cantly improving performance over the previous state-of-the-art. They are used for various high impact applications across many domains such as: protein interface prediction 20 , classifi cation of scientifi c papers 28, fraud detection 44 , and breast cancer classifi cation 36. Therefore, it is crucial to asses their sensitivity to adversaries and ensure they behave as expected. However, despite their popularity there is scarcely any work on certifying or improving the robustness of GNNs. As shown in Zgner et al.60 node classifi cation with GNNs is not robust and can even be attacked on multiple fronts slight perturbations of either the node features or the graph structure can lead to wrong predictions. Moreover, since we are dealing with non i.i.d. data by taking the graph structure into account, robustifying GNNs is more diffi cult compared to traditional models perturbing only a few edges affects the predictions for all nodes. What can we do to fortify GNNs and make sure they produce reliable predictions in the presence of adversarial perturbations? We propose the fi rst method for provable robustness regarding perturbations of the graph structure. Our approach is applicable to a general family of models where the predictions are a linear function of (personalized) PageRank. This family includes GNNs 29 and other graph-based models such as label/feature propagation 7,53 . Specifi cally, we provide: 1. Certifi cates: Given a trained model and a general set of admissible graph perturbations we can effi ciently verify whether a node is certifi ably robust there exists no perturbation that can change its prediction. We also provide non- 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. robustness certifi cates via adversarial examples.2. Robust training: We investigate robust training schemes based on our certifi cates and show that they improve both robustness and clean accuracy. Our theoretical fi ndings are empirically demonstrated and the code is provided for reproducibility1. Interestingly, in contrast to existing works on provable robustness 23,46,59 that derive bounds (by relaxing the problem), we can effi ciently compute exact certifi cates for some threat models. 2Related work Neural networks 41,21, and recently graph neural networks 13,60,58 and node embeddings 5 were shown to be highly sensitive to small adversarial perturbations. There exist many (heuristic) approaches aimed at robustifying these models, however, they have only limited usefulness since there is always a new attack able to break them, leading to a cat-and-mouse game between attackers and defenders. A more promising line of research studies certifi able robustness 23,35,46 . Certifi cates provide guarantees that no perturbation regarding a specifi c threat model will change the prediction of an instance. So far, there has been almost no work on certifying graph-based models. Different heuristics have been explored in the literature to improve robustness of graph-based models: (virtual) adversarial training 49,16,10,40, trainable edge weights 48 , graph encoder refi ning and adversarial contrastive learning 45, transfer learning 42, smoothing distillation 10, decoupling structure from attributes 31, measuring logit discrepancy 51, allocating reliable queries 56, representing nodes as Gaussian distributions 57, and Bayesian graph neural networks 52. Other robustness aspects of graph-based models (e.g. noise or anomalies) have also been investigated 3, 6, 24. However, none of these works provide provable guarantees or certifi cates. Zgner & Gnnemann59 is the only work that proposes robustness certifi cates for graph neural networks (GNNs). However, their approach can handle perturbations only to the node attributes. Our approach is completely orthogonal to theirs since we consider adversarial perturbations to the graph structure instead. Furthermore, our certifi cates are also valid for other semi-supervised learning approaches such as label/feature propagation. Nonetheless, there is a critical need for both types of certifi cates given that GNNs are shown to be vulnerable to attacks on both the attributes and the structure. As future work, we aim to consider perturbations of the node features and the graph jointly. 3Background and preliminaries LetG = (V,E)be an attributed graph withN = |V|nodes and edge setE V V. We denote with A 0,1NNthe adjacency matrix andX RNDthe matrix ofD-dimensional node features for each node. Given a subsetVL V = 1,.,Nof labelled nodes the goal of semi-supervised node classifi cation is to predict for each nodev Vone class inC = 1,.,K. We focus on deriving (exact) robustness certifi cates for graph neural networks via optimizing personalized PageRank. We also show (Appendix 8.1) how to apply our approach for label/feature propagation 7. Topic-sensitive PageRank.The topic-sensitive PageRank 22,27 vectorG(z)for a graphGand a probability distribution over nodesz is defi ned asG(z) = (1 )(IN D1A)1z. 2 Here Dis a diagonal matrix of node out-degrees withDii= P jAij. Intuitively,(z)u represent the probability of random walker on the graph to land at nodeuwhen it follows edges at random with probabilityand teleports back to the nodevwith probability(1 )zv. Thus, we have(z)u 0 and P u(z)u = 1. Forz = ev, thev-th canonical basis vector, we get the personalized PageRank vector for nodev. We drop the index onG,andzinG,(z)when they are clear from the context. Graph neural networks.As an instance of graph neural network (GNN) methods we consider an adaptation of the recently proposed PPNP approach 29 since it shows superior performance on the semi-supervised node classifi cation task 19. PPNP unlike message-passing GNNs decouples the feature transformation from the propagation. We have: Y = softmax?symH?,Hv,:= f(Xv,:),sym= (1 )(IN D1/2AD1/2)1(1) whereINis the identity,sym RNNis a symmetric propagation matrix,H RNCcollects the individual per-node logits, andY RNC collects the fi nal predictions after propagation. A 1Code, data, and further supplementary material available at https:/www.kdd.in.tum.de/graph-cert 2In practice we do not invert the matrix, but rather we solve the associated sparse linear system of equations. 2 neural networkfoutputs the logitsHv,:by processing the featuresXv,:of every nodevindepen- dently. Multiplying them withsymwe obtain the diffused logitsHdiff:= symHwhich implicitly incorporate the graph structure and avoid the expensive multi-hop message-passing procedure. To make PPNP more amenable to theoretical analysis we replacesymwith the personalized PageRank matrix = (1 )(IN D1A)1which has a similar spectrum. Here each rowv,:= (ev)equals to the personalized PageRank vector of nodev. This model which we denote as-PPNP has similar prediction performance to PPNP. We can see that the diffused logit after propagation for classcof nodevis a linear function of its personalized PageRank score: Hdiff v,c = (ev)H:,c, i.e. a weighted combination of the logits of all nodes for classc. Similarly, the marginmc1,c2(v) = Hdiff v,c1 Hdiffv,c2 = (ev)(H:,c1 H:,c2) defi ned as the difference in logits for nodevfor two given classesc1andc2is also linear in(ev). Ifmincmyv,c(v) 0, node t is certifi ably robust w.r.t. the logits H, and the set QF. Our goal is to verify whether no admissible G QFcan change the prediction for a target nodet. FromProblem1weseethatiftheworstmarginoverallclassesm yt,(t)ispositive, thenmyt,c(t) 0, for allyt6= c, which implies that there exists no adversarial example withinQFthat leads to a change in the prediction to some other class c, that is, the logit for the given class ytis always largest. 3 Challenges and core idea. From a cursory look at Eq. 3 it appears that fi nding the minimum is intractable. After all, our domain is discrete and we are optimizing over exponentially many confi gurations. Moreover, the margin is a function of the personalized PageRank which has a non- trivial dependency on the perturbed graph. But there is hope: For a fi xedH, the marginmyt,c(t)is a linear function of(et). Thus, Problem 1 reduces to optimizing a linear function of personalized PageRank over a specifi c constraint set. This is the core idea of our approach. As we will show, if we consider only local budget constraints the exact certifi cate can be effi ciently computed. This is in contrast to most certifi cates for neural networks that rely on different relaxations to make the problem tractable. Including the global budget constraint, however, makes the problem hard. For this case, we derive an effi cient to compute lower bound on the worst-case margin. Thus, if the lower bound is positive we can still guarantee that our classifi er is robust w.r.t. the set of admissible perturbations. 4.3Optimizing topics-sensitive PageRank with global and local constraints We are interested in optimizing a linear function of the topic-sensitive PageRank vector of a graph by modifying its structure. That is, we want to confi gure a set of fragile edges into included/excluded to obtain a perturbed graph G maximizing the objective. Formally, we study the general problem: Problem 2.Given a graphG, a set of admissible perturbationsQF as in Problem 1, and any fi xed z,r RN, (0,1) solve the following optimization problem: max GQF rT G,(z). Settingr = (H:,yt H:,c)andz = et, we see that Problem 1 is a special case of Problem 2. We can think ofras a reward/cost vector, i.e.rvis the reward that a random walker obtains when visiting nodev. The objective valuerT(z) is proportional to the overall reward obtained during an infi nite random walk with teleportation since (z)vexactly equals to the frequency of visits to v. Variations and special cases of this problem have been previously studied 2,11,12,15,18,25,32. Notably, Fercoq et al.18 cast the problem as an average cost infi nite horizon Markov decision process (MDP), also called ergodic control problem, where each node corresponds to a state and the actions correspond to choosing a subset of included fragile edges, i.e. we have 2|F v| actions at each statev(see also Fig. 2a). They show that despite the exponential number of actions, the problem can be effi ciently solved in polynomial time, and they derive a value iteration algorithm with different local constraints. They enforce that the fi nal perturbed graph has at mostbvtotal number of edges per node, while we enforce that at most bvedges per node are perturbed (see Sec. 4.1). Our approach for local budget only.Inspired by the MDP idea we derive a policy iteration (PI) algorithm which also runs in polynomial time 25. Intuitively, every policy corresponds to a perturbed graph inQF, and each iteration improves the policy. The PI algorithm allows us to: incorporate our local constraints easily, take advantage of effi cient solvers for sparse systems of linear equations (line 3 in Alg. 1), and implement the policy improvement step in parallel (lines 4-6 in Alg. 1). It can easily handle very large sets of fragile edges and it scales to large graphs. Proposition 1. Algorithm 1 which greedily selects the fragile edges fi nds an optimal solution for Problem 2 with only local constraints in a number of steps independent of the size of the graph. Algorithm 1 POLICYITERATION WITHLOCALBUDGET Require: Graph G = (V,E), reward r, set of fi xed Efand fragile F edges, local budgets bv 1:Initialization: W0 F as any arbitrary subset, AGcorresponding to G 2:while Wk6= Wk1do 3:Solve (IN D1A)x = r for x, where Aij= 1 AG ijif (i,j) Wk # fl ip the edges 4:Let lij (1 2AG ij)(xj xiri ) for all (i,j) F# calculate the improvement 5:Let Lv (v,j) F | lvj 0 lvj top bvlargest lvj, v V 6:Wk S vLv, k k + 1 7:end while 8:return Wk# optimal graph G QF obtained by fl ipping all (i,j) Wkof G We provide the proof in Sec. 8.3 in the appendix. The main idea for Alg. 1 is starting from a random policy, in each iteration we fi rst compute the mean reward before teleportationxfor the current policy (line 3), and then greedily select the topbvedges that improve the policy (lines 4-6). This algorithm is guaranteed to converge to the optimal policy, and thus to the optimal confi guration of fragile edges. 4 Figure 1: The upper part outlines our approach for local budget only: the exact certifi cate is effi ciently computed with policy iteration. The lower part outlines our 3 step approach for local and global budget: (a) forumlate an MDP on an auxiliary graph, (b) augment the corresponding LP with quadratic constraints to enforce the global budget, and (c) apply the RLT relaxation to the resulting QCLP. Certifi cate for local budget only.Proposition 1 implies that for local constraints only, the optimal solution does not depend on the teleport vectorz. Regardless of the nodet(i.e. whichz = etin Eq. 3), the optimal edges to perturb are the same if the admissible setQFand the rewardrare the same. This means that for a fi xedQFwe only need to run the algorithmK Ktimes to obtain the certifi cates for allNnodes: For each pair of classesc1,c2we have a different reward vector r = (H:,c1 H:,c2), and we can recover the exact worst-case marginsm yt,()for allNnodes by just computingon the resultingK Kmany perturbed graphs G. Now,m yt,() 0implies certifi able robustness, whilem yt,() 0) in the optimal solution. max X vV xvrv X (i,j)F x0 ijri (4a) xv X (i,v)Ef xi di |z incoming fi xed edges X (j,v)F x1 jv |z incoming on edges X (v,k)F x0 vk |z returning off edges = (1 )zvv V(4b) x0 ij+ x 1 ij = xi di ,x0 ij 0,x1 ij 0(i,j) F(4c) X (v,i)F (v,i) Ex0 ij |z removed existing edges +(v,i) / Ex1 ij |z newly added edges xv dv bv,xv 0v V(4d) x0 ij 1 ij = 0,x1 ij 0 ij = 0,1 ij = 1 0 ij, 0 0 ij 1(i,j) F(4e) X (i,j)F(i,j) E 0 ij+ (i,j) / E 1 ij B(4f) Key idea and insights.Eqs. 4b and 4c correspond to the LP of the unconstrained MDP. Intuitively, the variablexvmaps to the PageRank score of nodev, and from the variablesx0 ij/x1ijwe can recover the optimal policy: if the variablex0 ij (respectivelyx1 ij) is non-zero then in the optimal policy the fragile edge(i,j)is turned off (respectively on). Since there exists a deterministic optimal policy, only one of them is non-zero but never both. Eq. 4d corresponds to the local budget. Remarkably, despite the variablesx0 ij/x1ij not being integral, since they share the factorxid1 i from Eq. 4c we can exactly count the number of edges that are turned off or on using only linear constraints. Eqs. 4e and 4f enforce the global budget. From Eq. 4e we have that wheneverx0 ijis nonzero it follows that 1 ij = 0and0 ij = 1 since that is the only confi guration that satisfi es the constraints (similarly for x1 ij). Intuitively, this effectively makes the0ij/1ijvariables counters and we can utilize them in Eq. 4f to enforce the total number of perturbed edges to not exceedB. See detailed proof in Sec. 8.3. (c) Effi cient Reformulation Linearization Technique (RLT).The quadratic constraints in our QCLP make the problem non-convex and diffi cult to solve. We relax the problem using the Refor- mulation Linearization Technique (RLT) 38 which gives us an upper bound on the objective. The alternative SDP-relaxation 43 based on semidefi nite programming is not suitable for our problem since the constraints are trivially satisfi ed (see Appendix 8.4 for details). While in general, the RLT introduces many new variables (replacing each product termmimjwith a variableMij) along with multiple new linear inequality constraints, it turns out that in our case the solution is highly compact: Proposition 3. Given a fi xed upper boundxvforxvand using the RLT relaxation, the

人人文庫(kù)> 全部分類> 應(yīng)用文書

溫馨提示

1. 本站所有資源如無(wú)特殊說(shuō)明，都需要本地電腦安裝OFFICE2007和PDF閱讀器。圖紙軟件為CAD,CAXA,PROE,UG,SolidWorks等.壓縮文件請(qǐng)下載最新的WinRAR軟件解壓。
2. 本站的文檔不包含任何第三方提供的附件圖紙等，如果需要附件，請(qǐng)聯(lián)系上傳者。文件的所有權(quán)益歸上傳用戶所有。
3. 本站RAR壓縮包中若帶圖紙，網(wǎng)頁(yè)內(nèi)容里面會(huì)有圖紙預(yù)覽，若沒有圖紙預(yù)覽就沒有圖紙。
4. 未經(jīng)權(quán)益所有人同意不得將文件中的內(nèi)容挪作商業(yè)或盈利用途。
5. 人人文庫(kù)網(wǎng)僅提供信息存儲(chǔ)空間，僅對(duì)用戶上傳內(nèi)容的表現(xiàn)方式做保護(hù)處理，對(duì)用戶上傳分享的文檔內(nèi)容本身不做任何修改或編輯，并不能對(duì)任何下載內(nèi)容負(fù)責(zé)。
6. 下載文件中如有侵權(quán)或不適當(dāng)內(nèi)容，請(qǐng)與我們聯(lián)系，我們立即糾正。
7. 本站不保證下載資源的準(zhǔn)確性、安全性和完整性, 同時(shí)也不承擔(dān)用戶因使用這些下載資源對(duì)自己和他人造成任何形式的傷害或損失。

iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations

文檔簡(jiǎn)介

溫馨提示

最新文檔

評(píng)論

iccv2019論文全集9041-certifiable-robustness-to-graph-perturbations

文檔簡(jiǎn)介

溫馨提示

最新文檔

評(píng)論

相關(guān)文檔