
Case Studies in Information Defense: Fundamentals, Frontiers, and Related Resources of Textual Adversarial Attacks


Figure 8: Example of attack results printed by OpenAttack


Figure 9: All attack methods currently covered by OpenAttack

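Textual adversarial attack toolkits of this kind have many uses: they provide ready-made baseline attack models, support comprehensive evaluation of textual adversarial attacks, help in designing new attack models, let users evaluate the robustness of their own models, support adversarial training, and more. We believe they will greatly advance the field of textual adversarial attacks, just as adversarial attack toolkits have done in the image domain.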
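One of the uses listed above, evaluating the robustness of your own model, can be illustrated with a small sketch. It does not use the OpenAttack API. The victim classifier `toy_victim`, the synonym table `SYNONYMS`, the attack `substitution_attack`, and the sample data are all hypothetical stand-ins, chosen only to show how an attack success rate might be computed over correctly classified samples.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy victim: a keyword-count sentiment classifier (1 = positive, 0 = negative).
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "awful", "terrible"}


def toy_victim(text: str) -> int:
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return 1 if score > 0 else 0


# Hypothetical synonym table used as the word-substitution search space.
SYNONYMS: Dict[str, List[str]] = {
    "good": ["fine", "decent"],
    "great": ["superb"],
    "awful": ["dreadful"],
}


def substitution_attack(
    predict: Callable[[str], int], text: str, synonyms: Dict[str, List[str]]
) -> Optional[str]:
    """Swap words for their first listed synonym, left to right, keeping each swap.

    Returns the modified text as soon as the predicted label changes,
    or None if no label change is found.
    """
    original_label = predict(text)
    words = text.split()
    for i, word in enumerate(words):
        candidates = synonyms.get(word.lower())
        if not candidates:
            continue
        words[i] = candidates[0]
        candidate_text = " ".join(words)
        if predict(candidate_text) != original_label:
            return candidate_text
    return None


def attack_success_rate(
    predict: Callable[[str], int],
    attack: Callable[[str], Optional[str]],
    dataset: List[Tuple[str, int]],
) -> float:
    """Fraction of correctly classified samples for which the attack flips the label."""
    attacked = 0
    succeeded = 0
    for text, label in dataset:
        if predict(text) != label:
            continue  # skip samples the victim already gets wrong
        attacked += 1
        if attack(text) is not None:
            succeeded += 1
    return succeeded / attacked if attacked else 0.0


# Hypothetical evaluation data: (text, gold label).
DATASET: List[Tuple[str, int]] = [
    ("a good movie", 1),
    ("a great and excellent film", 1),
    ("an awful plot", 0),
    ("great fun", 1),
    ("not bad at all", 1),  # misclassified by the toy victim, so it is skipped
]

if __name__ == "__main__":
    rate = attack_success_rate(
        toy_victim, lambda t: substitution_attack(toy_victim, t, SYNONYMS), DATASET
    )
    print(f"attack success rate: {rate:.2f}")
```

Because the attack only needs a `predict` function, the same loop could in principle be pointed at a real model and a real attack method. The samples on which the attack succeeds are also the natural candidates to add back into the training data for adversarial training.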

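Author: Fanchao Qi (岂凡超), a PhD researcher in the Department of Computer Science at Tsinghua University, advised by Professor Maosong Sun. His main research area is natural language processing, and his work has been published at venues including EMNLP.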

