Case Studies in Information Defense: Fundamentals, Frontiers, and Resources for Textual Adversarial Attacks

Posted by 小君, 2022-11-05 12:58:51

Figure 8: Example of attack results printed by OpenAttack

Figure 9: All attack methods currently covered by OpenAttack

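Textual adversarial attack toolkits of this kind have many applications: providing ready-made baseline attack models, comprehensively evaluating textual adversarial attacks, assisting in the design of new attack models, evaluating the robustness of one's own models, and performing adversarial training. We expect that, like the adversarial attack toolkits in the image domain, they will greatly advance research on textual adversarial attacks.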

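Author: Fanchao Qi (岂凡超), Ph.D., Department of Computer Science and Technology, Tsinghua University, advised by Professor Maosong Sun. His main research area is natural language processing, and his work has been published at venues including EMNLP.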

References:

[1] Intriguing properties of neural networks. Christian Szegedy Wojciech Zaremba Ilya Sutskever Joan Bruna Dumitru Erhan Ian Goodfellow Rob Fergus. ICLR 2014.

[2] Explaining and Harnessing Adversarial Examples. Ian J Goodfellow Jonathon Shlens Christian Szegedy. ICLR 2015.

[3] Adversarial attacks and defences: A survey. Anirban Chakraborty Manaar Alam Vishal Dey Anupam Chattopadhyay Debdeep Mukhopadhyay. arXiv 2018.

[4] Semantically Equivalent Adversarial Rules for Debugging NLP Models. Marco Tulio Ribeiro Sameer Singh Carlos Guestrin. ACL 2018.

[5] Adversarial Example Generation with Syntactically Controlled Paraphrase Networks. Mohit Iyyer John Wieting Kevin Gimpel Luke Zettlemoyer. NAACL-HLT 2018.

[6] Generating Natural Adversarial Examples. Zhengli Zhao Dheeru Dua Sameer Singh. ICLR 2018.

[7] Adversarial Examples for Evaluating Reading Comprehension Systems. Robin Jia Percy Liang. EMNLP 2017.

[8] Generating Natural Language Adversarial Examples. Moustafa Alzantot Yash Sharma Ahmed Elgohary Bo-Jhang Ho Mani Srivastava Kai-Wei Chang. EMNLP 2018.

[9] Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment. Di Jin Zhijing Jin Joey Tianyi Zhou Peter Szolovits. AAAI 2020.

[10] Generating Natural Language Adversarial Examples through Probability Weighted Word Saliency. Shuhuai Ren Yihe Deng Kun He Wanxiang Che. ACL 2019.

[11] Word-level Textual Adversarial Attacking as Combinatorial Optimization. Yuan Zang Fanchao Qi Chenghao Yang Zhiyuan Liu Meng Zhang Qun Liu Maosong Sun. ACL 2020.

[12] Generating Fluent Adversarial Examples for Natural Languages. Huangzhao Zhang Hao Zhou Ning Miao Lei Li. ACL 2019.

[13] Deep Text Classification Can be Fooled. Bin Liang Hongcheng Li Miaoqiang Su Pan Bian Xirong Li Wenchang Shi. IJCAI 2018.

[14] Synthetic and Natural Noise Both Break Neural Machine Translation. Yonatan Belinkov Yonatan Bisk. ICLR 2018.

[15] HotFlip: White-Box Adversarial Examples for Text Classification. Javid Ebrahimi Anyi Rao Daniel Lowd Dejing Dou. ACL 2018.

[16] Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems. Steffen Eger Gözde Gül Şahin Andreas Rücklé Ji-Ung Lee Claudia Schulz Mohsen Mesgar Krishnkant Swarnkar Edwin Simpson Iryna Gurevych. NAACL-HLT 2019.

[17] TextBugger: Generating Adversarial Text Against Real-world Applications. Jinfeng Li Shouling Ji Tianyu Du Bo Li Ting Wang. NDSS 2019.

[18] Combating Adversarial Misspellings with Robust Word Recognition. Danish Pruthi Bhuwan Dhingra Zachary C. Lipton. ACL 2019.

[19] Crafting Adversarial Input Sequences For Recurrent Neural Networks. Nicolas Papernot Patrick McDaniel Ananthram Swami Richard Harang. MILCOM 2016.

[20] HowNet and the computation of meaning. Zhendong Dong Qiang Dong. 2006.

[21] Particle swarm optimization. Russell Eberhart James Kennedy. IEEE International Conference on Neural Networks 1995.

[22] A comparison of particle swarm optimization and the genetic algorithm. Rania Hassan Babak Cohanim Olivier De Weck Gerhard Venter. 46th AIAA/ASME/ASCE/AHS/ASC Structures Structural Dynamics and Materials Conference 2005.

[23] Beyond Accuracy: Behavioral Testing of NLP Models with CheckList. Marco Tulio Ribeiro Tongshuang Wu Carlos Guestrin Sameer Singh. ACL 2020.
