What Is the Best Approach to NLP? (Next Research Goals)
Before starting your own research, it is important to understand the most significant open research directions in the target field. In this article, Sebastian Ruder, a PhD student in natural language processing at the Insight Centre, National University of Ireland, Galway, introduces several of the most promising research directions in NLP.
Table of Contents

- Task-independent data augmentation for NLP
- Few-shot learning for NLP
- Transfer learning for NLP
- Multi-task learning
- Cross-lingual learning
- Task-independent architecture improvements
When you start research in a new field, finding compelling topics and learning to ask the right questions is hard. This is especially true in a field that moves as fast as machine learning, where it is difficult to spot an opening for a breakthrough.