Adversarial Example Detection
姜育剛, 馬興軍, 吳祖煊

Recap: Week 3
1. Adversarial Examples
2. Adversarial Attacks
3. Adversarial Vulnerability Understanding

In-class Adversarial Attack Competition
https://codalab.lisn.upsaclay.fr/competitions/15669?secret_key=77cb8986-d5bd-4009-82f0-7dde2e819ff8
In-class Adversarial Attack Competition
- The competition accounts for 30% of the course grade.
- You must register with your university email address (otherwise your result will not be counted).
- Schedule: Phase 1: October 1 – October 28; Phase 2: evaluation stage, no student participation.
- Students without a GPU can use Google Colab.
- Scoring is by rank: 1st place receives 30 points, last place 15 points.
Adversarial Example Detection (AED)
- A binary classification problem: clean (y=0) or adversarial (y=1)?
- An anomaly detection problem: benign (y=0) or abnormal (y=1)?
Principles for AED
- All binary classification methods can be applied to AED.
- All anomaly detection methods can be applied to AED.
- Use as much information as you can: input statistics, manual features, training data, attention maps, transformations, Mixup, denoising, …; activations, deep features, probabilities, logits, gradients, loss landscape, uncertainty, …
- Leverage the unique characteristics of adversarial examples: twins (extremely close to the clean sample) yet strangers (far away in prediction).
- Build detectors based on existing understandings: high-dimensional pockets, local linearity, boundary tilting.
- It is still feature engineering!
Challenges in AED
- The diversity of the adversarial examples used to train a detector determines its detection performance.
- Detectors are themselves machine learning models, so they are also vulnerable to adversarial attacks.
- Detectors need to detect both existing and unknown attacks.
- Detectors need to be robust to adaptive attacks.
Existing Methods
- Secondary Classification Methods (二級分類法)
- Principal Component Analysis (主成分分析法, PCA)
- Distribution Detection Methods (分布檢測法)
- Prediction Inconsistency (預測不一致性)
- Reconstruction Inconsistency (重建不一致性)
- Trapping Based Detection (誘捕檢測法)
Secondary Classification Methods
- Adversarial Retraining (對抗重訓練): take adversarial examples as a new class.
  Grosse et al. "On the (Statistical) Detection of Adversarial Examples." arXiv:1702.06280.
- Adversarial Classification (對抗分類): clean samples as class 0, adversarial samples as class 1 (a minimal sketch follows this list).
  Gong et al. "Adversarial and clean data are not twins." arXiv:1704.04960.
- Cascade Classifiers (級聯分類器): train a detector for each intermediate layer.
  Metzen, Jan Hendrik, et al. "On detecting adversarial perturbations." arXiv:1702.04267 (2017).
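Both variants reduce to supervised training on labeled clean/adversarial data. A minimal sketch of the binary (class 0 / class 1) variant, assuming deep features have already been extracted (all arrays below are stand-ins, not data from the cited papers):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-ins for deep features extracted from an intermediate layer
# of the target model, for clean and adversarial examples.
clean_feats = np.random.randn(1000, 512)
adv_feats = np.random.randn(1000, 512)

X = np.vstack([clean_feats, adv_feats])
y = np.concatenate([np.zeros(len(clean_feats)),   # 0 = clean
                    np.ones(len(adv_feats))])     # 1 = adversarial

# Any binary classifier works; logistic regression is the simplest choice.
detector = LogisticRegression(max_iter=1000).fit(X, y)

# At test time, flag inputs whose adversarial probability exceeds a threshold.
scores = detector.predict_proba(X)[:, 1]
```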
Principal Component Analysis (PCA)
- The last few principal components differentiate adversarial examples.
Hendrycks, Dan, and Kevin Gimpel. "Early methods for detecting adversarial images." arXiv:1608.00530 (2016); Carlini and Wagner. "Adversarial examples are not easily detected: Bypassing ten detection methods." AISec 2017.
[Figure: coefficients of the trailing principal components. Blue: a clean sample; yellow: an adversarial example. The separation is an artifact caused by the black background.]
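A minimal sketch of the idea, assuming flattened image inputs (the data, the number of tail components k, and the threshold rule are placeholders):

```python
import numpy as np
from sklearn.decomposition import PCA

# Fit PCA on flattened clean training images (stand-in random data here).
X_train = np.random.rand(5000, 784)   # e.g. flattened 28x28 images
pca = PCA().fit(X_train)

def tail_energy(x, k=50):
    """Energy on the last k principal components; adversarial inputs
    tend to place abnormally large coefficients there."""
    coeffs = pca.transform(x.reshape(1, -1))[0]
    return np.sum(coeffs[-k:] ** 2)

# Flag an input if its tail energy exceeds a threshold fit on clean data.
clean_scores = [tail_energy(x) for x in X_train[:100]]
threshold = np.percentile(clean_scores, 95)
```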
Dimensionality Reduction
Bhagoji, Arjun Nitin, Daniel Cullina, and Prateek Mittal. "Dimensionality reduction as a defense against evasion attacks on machine learning classifiers." arXiv:1704.02654 (2017).
- Train the classifier on PCA-reduced data.
Distribution Detection
Grosse et al. "On the (Statistical) Detection of Adversarial Examples." arXiv:1702.06280.
Maximum Mean Discrepancy (MMD). Given two datasets $X = \{x_1, \dots, x_n\}$ and $Z = \{z_1, \dots, z_m\}$ and a function class $\mathcal{F}$:
$$\mathrm{MMD}(\mathcal{F}, X, Z) = \sup_{f \in \mathcal{F}} \Big( \frac{1}{n} \sum_{i=1}^{n} f(x_i) - \frac{1}{m} \sum_{j=1}^{m} f(z_j) \Big)$$
A large discrepancy between a suspect dataset and clean data indicates adversarial examples.
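In practice the supremum is replaced by a kernel estimate. A minimal sketch of a Gaussian-kernel MMD statistic (the kernel width and the stand-in arrays are assumptions; Grosse et al. pair the statistic with a permutation-based hypothesis test):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # Pairwise Gaussian kernel matrix between the rows of a and b.
    d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(x, z, sigma=1.0):
    """Biased estimate of the squared kernel MMD between samples x and z."""
    kxx = gaussian_kernel(x, x, sigma).mean()
    kzz = gaussian_kernel(z, z, sigma).mean()
    kxz = gaussian_kernel(x, z, sigma).mean()
    return kxx + kzz - 2 * kxz

# Large MMD between a suspect batch and clean data suggests adversarial inputs.
x_clean = np.random.randn(200, 64)
z_suspect = np.random.randn(200, 64) + 0.5
print(mmd2(x_clean, z_suspect))
```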
Distribution Detection
Feinman, Reuben, et al. "Detecting adversarial samples from artifacts." arXiv:1703.00410 (2017).
Kernel Density Estimation (KDE): adversarial examples lie in low-density regions.
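A minimal sketch of a KDE score, assuming last-hidden-layer features of the predicted class's training points are available (the feature arrays and bandwidth are stand-ins):

```python
import numpy as np

def kde_score(feat, train_feats, sigma=1.0):
    """Gaussian-kernel density of a test feature vector under the
    training features of its predicted class."""
    d2 = np.sum((train_feats - feat) ** 2, axis=1)
    return np.mean(np.exp(-d2 / sigma**2))

# Stand-ins: features of class-c training points, plus one test point.
train_feats = np.random.randn(500, 128)
score = kde_score(np.random.randn(128), train_feats)
# Low density -> likely adversarial (threshold fit on clean validation data).
```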
Bypassing 10 Detection Methods
"Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods." Carlini and Wagner, AISec 2017.
Local Intrinsic Dimensionality (LID)
"Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality." Ma et al., ICLR 2018.
- Definition (Local Intrinsic Dimensionality): see below.
- Adversarial examples lie in high-dimensional subspaces.
- Adversarial Subspaces and Expansion Dimension: [illustration]
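The definition, restated from the cited paper: let $F(r)$ be the cumulative distribution function of the distance from a sample $x$ to a random neighbor, assumed smooth and positive near $r = 0$. Then

$$\mathrm{LID}_F(r) = \lim_{\epsilon \to 0^+} \frac{\ln\big(F((1+\epsilon)\, r) \,/\, F(r)\big)}{\ln(1+\epsilon)} = \frac{r\, F'(r)}{F(r)}, \qquad \mathrm{LID}(x) = \lim_{r \to 0^+} \mathrm{LID}_F(r).$$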
Estimation of LID
"Estimating local intrinsic dimensionality." Amsaleg et al., KDD 2015.
Hill (MLE) estimator (Hill 1975; Amsaleg et al. 2015), based on Extreme Value Theory: nearest-neighbor distances are extreme events, and the lower tail of the distance distribution follows a Generalized Pareto Distribution (GPD).
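The estimator itself is only a few lines. A minimal sketch over a reference minibatch (arrays are stand-ins; x is assumed not to be in the reference set):

```python
import numpy as np

def lid_mle(x, reference, k=20):
    """Hill/MLE estimate of LID at x from its k nearest neighbors:
    LID_hat = -( (1/k) * sum_i log(r_i / r_k) )^{-1}"""
    dists = np.sort(np.linalg.norm(reference - x, axis=1))[:k]
    return -1.0 / np.mean(np.log(dists / dists[-1]))

# Stand-ins: deep features of a test point and of a reference minibatch.
reference = np.random.randn(128, 64)
print(lid_mle(np.random.randn(64), reference))
```

In the paper, LID is estimated within minibatches at every layer of the network, and a logistic-regression detector is trained on the resulting per-layer LID vectors.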
Interpretation of LID for Adversarial Subspaces
- LID directly measures the expansion rate of local distance distributions.
- The expansion of an adversarial subspace is higher than that of the normal data subspace.
- LID assesses the space-filling capability of the subspace, based on the distance distribution from the example to its neighbors.
[Figure: LID scores of adversarial examples (red) are higher; LID at deeper layers is more discriminative.]
Experiments & Results (detection AUC, %):

Dataset    Feature   FGM     BIM-a   BIM-b   JSMA    Opt
MNIST      KD        78.12   98.14   98.61   68.77   95.15
           BU        32.37   91.55   25.46   88.74   71.30
           LID       96.89   99.60   99.83   92.24   99.24
CIFAR-10   KD        64.92   68.38   98.70   85.77   91.35
           BU        70.53   81.60   97.32   87.36   91.39
           LID       82.38   82.51   99.78   95.87   98.94
SVHN       KD        70.39   77.18   99.57   86.46   87.41
           BU        86.78   84.07   86.93   91.33   87.13
           LID       97.61   87.55   99.72   95.07   97.60
Generalization: train on FGSM, test on other attacks (AUC, %):

Train\Test   FGM     BIM-a   BIM-b   JSMA    Opt
FGSM   KD    64.92   69.15   89.71   85.72   91.22
       BU    70.53   81.60   72.65   86.79   91.27
       LID   82.38   82.30   91.61   89.93   93.32

Detectors trained on the simple FGSM attack can also detect more complex attacks.
An Improved Detector over LID
arXiv:2212.06776
Mahalanobis Distance (MD)
Mahalanobis, Prasanta Chandra. "On the generalized distance in statistics." National Institute of Sciences of India, 1936.
The MD between two data points $x$ and $y$, given the covariance matrix $\Sigma$ of the data:
$$d_M(x, y) = \sqrt{(x - y)^\top \Sigma^{-1} (x - y)}$$
Lee et al. "A simple unified framework for detecting out-of-distribution samples and adversarial attacks." NeurIPS 2018.
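A minimal sketch of the Lee et al. score under simplifying assumptions (class-conditional Gaussians with a tied covariance fitted on deep features; the arrays are stand-ins, and the paper's input preprocessing and layer-ensemble logistic regression are omitted):

```python
import numpy as np

# Stand-ins: per-class deep features from an intermediate layer.
feats_by_class = {c: np.random.randn(500, 128) for c in range(10)}

means = {c: f.mean(axis=0) for c, f in feats_by_class.items()}
centered = np.vstack([f - means[c] for c, f in feats_by_class.items()])
cov = centered.T @ centered / len(centered)            # shared ("tied") covariance
prec = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))

def mahalanobis_score(feat):
    """Negative squared Mahalanobis distance to the closest class mean;
    low scores indicate adversarial / out-of-distribution inputs."""
    dists = [(feat - mu) @ prec @ (feat - mu) for mu in means.values()]
    return -min(dists)

print(mahalanobis_score(np.random.randn(128)))
```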
Bayesian Uncertainty (BU)
Feinman, Reuben, et al. "Detecting adversarial samples from artifacts." arXiv:1703.00410 (2017).
Uncertainty is estimated from stochastic (dropout) forward passes; adversarial examples tend to show higher uncertainty.
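A minimal sketch of a dropout-based uncertainty score (one common form; the exact statistic in the paper differs slightly, and `model` is assumed to be a network containing dropout layers):

```python
import torch

def bayesian_uncertainty(model, x, n_samples=50):
    """Variance of stochastic (dropout) predictions, averaged over classes.
    Higher values indicate likely adversarial inputs."""
    model.train()  # keep dropout active at inference time
    with torch.no_grad():
        probs = torch.stack([torch.softmax(model(x), dim=1)
                             for _ in range(n_samples)])
    return probs.var(dim=0).mean(dim=1)  # one score per input in the batch
```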
Feature Squeezing
Xu et al. "Feature squeezing: Detecting adversarial examples in deep neural networks." arXiv:1704.01155 (2017).
- Bit depth reduction: squeezing clean and adversarial examples.
- Reducing input dimensionality improves robustness.
- The prediction inconsistency before and after squeezing can be used to detect adversarial examples (see the sketch below).
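A minimal sketch of bit-depth squeezing and the resulting detection score (`predict_fn` is a hypothetical function returning softmax probabilities):

```python
import numpy as np

def reduce_bit_depth(x, bits=4):
    """Squeeze pixel values in [0, 1] down to the given bit depth."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def squeeze_score(predict_fn, x):
    """L1 distance between predictions before and after squeezing;
    a large distance flags a likely adversarial input."""
    p_orig = predict_fn(x)
    p_sq = predict_fn(reduce_bit_depth(x))
    return np.abs(p_orig - p_sq).sum()
```

Xu et al. additionally use spatial-smoothing squeezers and take the maximum L1 distance across squeezers as the final detection score.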
Random Transformation
Tian et al. "Detecting adversarial examples through image transformation." AAAI 2018.
- The predictions on adversarial examples change after random transformations (see the sketch below).
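A minimal sketch using random pixel shifts as the transformation (`predict_fn` is hypothetical; the paper uses a broader set of transformations such as rotation and shifting):

```python
import numpy as np

def random_shift(x, max_shift=3):
    """Randomly shift an image of shape (H, W, C) by a few pixels (wrap-around)."""
    dx, dy = np.random.randint(-max_shift, max_shift + 1, size=2)
    return np.roll(np.roll(x, dx, axis=0), dy, axis=1)

def prediction_flips(predict_fn, x, n_trials=20):
    """Fraction of random transformations that change the predicted label;
    a high flip rate indicates a likely adversarial input."""
    base = np.argmax(predict_fn(x))
    flips = sum(np.argmax(predict_fn(random_shift(x))) != base
                for _ in range(n_trials))
    return flips / n_trials
```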
Log-Odds
Roth et al. "The odds are odd: A statistical test for detecting adversarial examples." ICML 2019.
- Add random noise to the input and test how the logits shift (see the sketch below).
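A minimal sketch of the underlying statistic (the paper's per-class normalization and calibrated thresholds are omitted; `logit_fn` is a hypothetical function returning a logit vector):

```python
import numpy as np

def logodds_shift(logit_fn, x, noise_std=0.1, n_samples=64):
    """Expected change of pairwise logit gaps under random noise;
    large positive shifts away from the predicted class flag adversarial inputs."""
    logits = logit_fn(x)
    y = int(np.argmax(logits))
    gaps = logits - logits[y]                       # f_z(x) - f_y(x) for all z
    noisy = np.stack([logit_fn(x + noise_std * np.random.randn(*x.shape))
                      for _ in range(n_samples)])
    noisy_gaps = noisy - noisy[:, [y]]              # f_z(x+eta) - f_y(x+eta)
    return noisy_gaps.mean(axis=0) - gaps
```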
Log-Odds
Hu et al. "A new defense against adversarial images: Turning a weakness into a strength." NeurIPS 2019.
- Principle 1: adversarial examples have more uniform gradients.
- Principle 2: adversarial examples are hard to attack a second time.
- Test criterion 1: random noise should not change the prediction.
- Test criterion 2: attacking the input again should require more perturbation.