Original post: The Best and Most Current of Modern Natural Language Processing
Author: Victor Sanh
Over the past two years, the NLP community has witnessed an acceleration of progress across a wide variety of tasks and applications 🚀. This progress was enabled by a shift in the way we traditionally build NLP systems: for a long time, we used pre-trained word embeddings such as word2vec or GloVe to initialize the first layer of a neural network, and then trained a task-specific architecture on a single dataset in a supervised way.
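To make that older recipe concrete, here is a minimal sketch (my own illustration, not code from the original post) in PyTorch: the first layer of a classifier is initialized from pre-trained GloVe vectors, and the task-specific architecture on top is trained with supervised learning. The GloVe file path and the toy vocabulary are hypothetical placeholders.

```python
# A minimal sketch of the "pre-trained word embeddings + task-specific architecture"
# recipe. Hypothetical placeholders: the GloVe file path and the toy vocabulary.
import numpy as np
import torch
import torch.nn as nn

vocab = {"<pad>": 0, "<unk>": 1, "movie": 2, "great": 3}  # toy vocabulary
embedding_dim = 300

# Build the embedding matrix, copying in GloVe vectors for words we know.
# GloVe's text format is: "word v1 v2 ... v300" (one word per line).
embedding_matrix = np.random.normal(scale=0.1, size=(len(vocab), embedding_dim))
with open("glove.840B.300d.txt", encoding="utf-8") as f:  # hypothetical path
    for line in f:
        parts = line.rstrip().split(" ")
        word, vector = parts[0], parts[1:]
        if word in vocab:
            embedding_matrix[vocab[word]] = np.asarray(vector, dtype=np.float32)

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        # First layer initialized from the pre-trained word embeddings
        self.embedding = nn.Embedding.from_pretrained(
            torch.tensor(embedding_matrix, dtype=torch.float), freeze=False
        )
        # Task-specific architecture trained in a supervised way on a single dataset
        self.encoder = nn.LSTM(embedding_dim, 128, batch_first=True)
        self.head = nn.Linear(128, 2)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)     # (batch, seq_len, 300)
        _, (hidden, _) = self.encoder(embedded)  # hidden[-1]: (batch, 128)
        return self.head(hidden[-1])             # (batch, num_classes)
```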
Recently, several works have shown that we can learn hierarchical contextualized representations on web-scale datasets 📖 by leveraging unsupervised (or semi-supervised) signals such as language modeling, and then transfer this pre-training to downstream tasks (transfer learning). Excitingly, this shift has led to significant advances on a wide range of downstream applications, from question answering, to natural language inference, to syntactic parsing.
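By contrast, a minimal sketch of the transfer-learning recipe might look like the following (again my own illustration, written with the current Hugging Face `transformers` library rather than anything prescribed by the post; the model name, example sentence, and label are arbitrary): load a Transformer pre-trained with a language-modeling objective and fine-tune it on a downstream classification task.

```python
# A minimal sketch of transfer learning from a pre-trained language model:
# load the pre-trained weights, add a classification head, fine-tune end to end.
# Requires `pip install transformers`.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

# One toy fine-tuning step on a single sentiment example (label 1 = positive)
inputs = tokenizer(["a surprisingly good movie"], return_tensors="pt",
                   padding=True, truncation=True)
labels = torch.tensor([1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
outputs = model(**inputs, labels=labels)  # classification loss is computed internally
outputs.loss.backward()
optimizer.step()
```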
Which papers should I read to catch up on the latest trends in modern NLP?
A few weeks ago, a friend of mine decided to dive into NLP. He already has a background in machine learning and deep learning, so he sincerely asked me: "Which papers should I read to catch up on the latest trends in modern NLP?" 👩‍🎓👨‍🎓
That's a really good question, especially when you consider that NLP conferences (and machine learning conferences in general) are seeing exponential growth in the number of submissions: NAACL received 80% more submissions in 2019 than in 2018, and ACL 90% more.
I compiled this list of papers and resources 📚 for him, and I thought it would be great to share it with the NLP community, as I believe it can help many more people.
Disclaimer: this list is not meant to be exhaustive, and it does not cover every area of NLP (for example, there is nothing on semantic parsing, adversarial learning, or reinforcement learning applied to NLP). It gathers the latest and most influential works of the past few years and months (as of May 2019), mostly biased by what I have been reading.
In general, a good way to start is to read introductory or summary blog posts (for instance, this post or this one), which give you enough background from a high-level perspective before you actually invest time in reading papers ✋.
🌊 A New Paradigm: Transfer Learning
-
Deep contextualized word representations (NAACL 2018)
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
-
Universal Language Model Fine-tuning for Text Classification (ACL 2018)
Jeremy Howard, Sebastian Ruder
-
Improving Language Understanding by Generative Pre-Training
Alec Radford, Karthik Narasimhan, Tim Salimans, Ilya Sutskever
-
Language Models are Unsupervised Multitask Learners
Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever
-
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (NAACL 2019)
Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
-
Cloze-driven Pretraining of Self-attention Networks (arXiv 2019)
Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli
-
Unified Language Model Pre-training for Natural Language Understanding and Generation (arXiv 2019)
Li Dong, Nan Yang, Wenhui Wang, Furu Wei, Xiaodong Liu, Yu Wang, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon
-
MASS: Masked Sequence to Sequence Pre-training for Language Generation (ICML 2019)
Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu
🖼 Representation Learning
-
What you can cram into a single vector: Probing sentence embeddings for linguistic properties (ACL 2018)
Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni
-
No Training Required: Exploring Random Encoders for Sentence Classification (ICLR 2019)
John Wieting, Douwe Kiela
-
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding (ICLR 2019)
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
and
SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems (arXiv 2019)
Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
-
Linguistic Knowledge and Transferability of Contextual Representations (NAACL 2019)
Nelson F. Liu, Matt Gardner, Yonatan Belinkov, Matthew E. Peters, Noah A. Smith
-
To Tune or Not to Tune? Adapting Pretrained Representations to Diverse Tasks (arXiv 2019)
Matthew Peters, Sebastian Ruder, Noah A. Smith
🗣 Neural Dialogue
-
A Neural Conversational Model (ICML Deep Learning Workshop 2015)
Oriol Vinyals, Quoc Le
-
A Persona-Based Neural Conversation Model (ACL 2016)
Jiwei Li, Michel Galley, Chris Brockett, Georgios P. Spithourakis, Jianfeng Gao, Bill Dolan
-
A Simple, Fast Diverse Decoding Algorithm for Neural Generation (arXiv 2017)
Jiwei Li, Will Monroe, Dan Jurafsky
-
Neural Approaches to Conversational AI (arXiv 2018)
Jianfeng Gao, Michel Galley, Lihong Li
-
TransferTransfo: A Transfer Learning Approach for Neural Network Based Conversational Agents (NeurIPS 2018 CAI Workshop)
Thomas Wolf, Victor Sanh, Julien Chaumond, Clement Delangue
Disclaimer: I am one of the authors of this work.
Step-by-step explanatory blog post
-
Wizard of Wikipedia: Knowledge-Powered Conversational agents (ICLR 2019)
Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston
-
Learning to Speak and Act in a Fantasy Text Adventure Game (arXiv 2019)
Jack Urbanek, Angela Fan, Siddharth Karamcheti, Saachi Jain, Samuel Humeau, Emily Dinan, Tim Rocktäschel, Douwe Kiela, Arthur Szlam, Jason Weston
🍱 Take Your Pick
-
Pointer Networks (NIPS 2015)
Oriol Vinyals, Meire Fortunato, Navdeep Jaitly
-
End-To-End Memory Networks (NIPS 2015)
Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, Rob Fergus
-
Get To The Point: Summarization with Pointer-Generator Networks (ACL 2017)
Abigail See, Peter J. Liu, Christopher D. Manning
-
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data (EMNLP 2017)
Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes
-
End-to-end Neural Coreference Resolution (EMNLP 2017)
Kenton Lee, Luheng He, Mike Lewis, Luke Zettlemoyer
-
StarSpace: Embed All The Things! (AAAI 2018)
Ledell Wu, Adam Fisch, Sumit Chopra, Keith Adams, Antoine Bordes, Jason Weston
-
The Natural Language Decathlon: Multitask Learning as Question Answering (arXiv 2018)
Bryan McCann, Nitish Shirish Keskar, Caiming Xiong, Richard Socher
-
Character-Level Language Modeling with Deeper Self-Attention (arXiv 2018)
Rami Al-Rfou, Dokook Choe, Noah Constant, Mandy Guo, Llion Jones
-
Linguistically-Informed Self-Attention for Semantic Role Labeling (EMNLP 2018)
Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum
-
Phrase-Based & Neural Unsupervised Machine Translation (EMNLP 2018)
Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc’Aurelio Ranzato
-
Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning (ICLR 2018)
Sandeep Subramanian, Adam Trischler, Yoshua Bengio, Christopher J Pal
-
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context (arXiv 2019)
Zihang Dai, Zhilin Yang, Yiming Yang, Jaime Carbonell, Quoc V. Le, Ruslan Salakhutdinov
-
Universal Transformers (ICLR 2019)
Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser
-
An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models (NAACL 2019)
Alexandra Chronopoulou, Christos Baziotis, Alexandros Potamianos
-
…For older papers, citation counts are usually a reasonable guide when choosing what to read.
A good rule of thumb: read the works that interest you and spark joy! 🤷🌟
🌍 General Resources
There are also plenty of resources to choose from that are not necessarily papers, for example:
Books:
-
Speech and Language Processing (3rd ed. draft)
Dan Jurafsky and James H. Martin
-
Neural Network Methods for Natural Language Processing
Yoav Goldberg
Course materials:
-
Natural Language Understanding and Computational Semantics with Katharina Kann and Sam Bowman at NYU
-
CS224n: Natural Language Processing with Deep Learning with Chris Manning and Abigail See at Stanford
-
Contextual Word Representations: A Contextual Introduction from Noah A. Smith’s teaching material at UW
Blogs and podcasts:
-
NLP Highlights hosted by Matt Gardner and Waleed Ammar
Other:
-
Twitter🐦
-
arXiv daily newsletter
-
Survey papers
-
…
🎅 Final Words
That's it! Reading a selection of these resources should give you a good picture of the latest trends in modern NLP and, hopefully, help you build your own NLP systems! 🎮
One last thing that I haven't talked much about in this post, but that I find extremely important (and sometimes overlooked): hands-on practice beats pure reading! 👩‍💻 You will often learn more by diving into the code that accompanies a paper or by trying to implement something yourself. Practical resources include the amazing blog posts and courses from fast.ai, as well as our open-source library 🤗.
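As a first hands-on experiment, something as small as the following already teaches you a lot (a sketch of my own, assuming the Hugging Face `transformers` library is installed; the example sentence is arbitrary):

```python
# A tiny starting point: run a pre-trained model through the high-level pipeline
# API and inspect what it returns before digging into the internals.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pre-trained model
print(classifier("Reading papers is great, but running the code teaches you even more!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```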
What do you think? Which works have influenced you the most? Let us know! ⌨️
As always, if you liked this post, 👏 let us know and spread the word around you!
Many thanks to Lysandre Debut, Clément Delangue, Thibault Févry, Peter Martigny, Anthony Moi and Thomas Wolf for their comments and feedback.