EDITOR: An Edit-Based Transformer with Repositioning

EDITOR: An Edit-Based Transformer with Repositioning for Neural Machine Translation with Soft Lexical Constraints. Weijia Xu, University of Maryland, weijia@cs.umd.edu; Marine Carpuat, University of Maryland, marine@cs.umd.edu. Abstract: We introduce an Edit-Based TransfOrmer with Repositioning (EDITOR), which makes sequence generation flexible by seamlessly allowing users to specify preferences in output lexical choice. Building on recent models for non-autoregressive sequence generation (Gu et al., 2019), EDITOR…

Read more »

Aligning Faithful Interpretations with their Social Attribution

Aligning Faithful Interpretations with their Social Attribution. Alon Jacovi, Bar Ilan University, alonjacovi@gmail.com; Yoav Goldberg, Bar Ilan University and Allen Institute for AI, yoav.goldberg@gmail.com. Abstract: We find that the requirement of model interpretations to be faithful is vague and incomplete. With interpretation by textual highlights as a case study, we present several failure cases. Borrowing concepts from social science, we identify that the…

Read more »

Morphology Matters: A Multilingual Language Modeling Analysis

Morphology Matters: A Multilingual Language Modeling Analysis. Hyunji Hayley Park, University of Illinois, hpark129@illinois.edu; Katherine J. Zhang, Carnegie Mellon University, kjzhang@cmu.edu; Coleman Haley, Johns Hopkins University, chaley7@jhu.edu; Kenneth Steimel, Indiana University, ksteimel@iu.edu; Han Liu, University of Chicago∗, hanliu@uchicago.edu; Lane Schwartz, University of Illinois, lanes@illinois.edu. Abstract: Prior studies in multilingual language modeling (e.g., Cotterell et al., 2018; Mielke et al., 2019) disagree on whether or…

Read more »

Supertagging the Long Tail with Tree-Structured Decoding

Supertagging the Long Tail with Tree-Structured Decoding of Complex Categories. Jakob Prange, Nathan Schneider, Georgetown University; Vivek Srikumar, University of Utah. {jp1724, nathan.schneider}@georgetown.edu, svivek@cs.utah.edu…

Read more »

Infusing Finetuning with Semantic Dependencies

Infusing Finetuning with Semantic Dependencies. Zhaofeng Wu♠, Hao Peng♠, Noah A. Smith♠♦; ♠Paul G. Allen School of Computer Science & Engineering, University of Washington; ♦Allen Institute for Artificial Intelligence. {zfw7,hapeng,nasmith}@cs.washington.edu. Abstract: For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on application-inspired benchmarks (Peters et al., 2018, inter…

Read more »

WikiAsp: A Dataset for Multi-domain Aspect-based Summarization

WikiAsp: A Dataset for Multi-domain Aspect-based Summarization. Prashant Budania2, Hiroaki Hayashi1, Peng Wang2, Chris Ackerson2, Raj Neervannan2, Graham Neubig1; 1Language Technologies Institute, Carnegie Mellon University; 2AlphaSense. {hiroakih,gneubig}@cs.cmu.edu, {pbudania,pwang,cackerson,rneervannan}@alpha-sense.com. Abstract: Aspect-based summarization is the task of generating focused summaries based on specific points of interest. Such summaries aid efficient analysis of text, such as quickly understanding reviews or opinions from different angles. However, due to…

Read more »

Latent Compositional Representations Improve Systematic

Latent Compositional Representations Improve Systematic Generalization in Grounded Question Answering. Ben Bogin1, Sanjay Subramanian2, Matt Gardner2, Jonathan Berant1,2; 1Tel-Aviv University; 2Allen Institute for AI. {ben.bogin,joberant}@cs.tau.ac.il, {sanjays,mattg}@allenai.org. Abstract: Answering questions that involve multi-step reasoning requires decomposing them and using the answers of intermediate steps to reach the final answer. However, state-of-the-art models in grounded question answering often do not explicitly perform decomposition, leading to difficulties…
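
As a toy illustration of the decomposition described in this abstract, a multi-step question can be executed as a chain of intermediate steps whose outputs feed the next step. The mini-scene and operations below are invented for illustration (a minimal Python sketch), not the paper's grounded QA model.

```python
# Toy illustration of multi-step decomposition: each step consumes the
# previous step's answer. The scene and operations are invented examples,
# not the paper's model.
scene = [
    {"id": 0, "shape": "cube", "color": "red", "x": 3},
    {"id": 1, "shape": "ball", "color": "blue", "x": 1},
]

def find(shape):          # step 1: locate an object by shape
    return next(o for o in scene if o["shape"] == shape)

def left_of(obj):         # step 2: an object with a smaller x coordinate
    return next(o for o in scene if o["x"] < obj["x"])

def query_color(obj):     # step 3: read off an attribute
    return obj["color"]

# "What color is the object left of the cube?" decomposed into three steps:
print(query_color(left_of(find("cube"))))   # -> blue
```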

Read more »

KEPLER: A Unified Model for Knowledge Embedding and

KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation. Xiaozhi Wang1, Tianyu Gao3, Zhaocheng Zhu4,5, Zhengyan Zhang1, Zhiyuan Liu1,2∗, Juanzi Li1,2, and Jian Tang4,6,7∗; 1Department of CST, BNRist; 2KIRC, Institute for AI, Tsinghua University, Beijing, China; {wangxz20,zy-z19}@mails.tsinghua.edu.cn, {liuzy,lijuanzi}@tsinghua.edu.cn; 3Department of Computer Science, Princeton University, Princeton, NJ, USA, tianyug@princeton.edu; 4Mila-Québec AI Institute; 5Université de Montréal; 6HEC, Montréal, Canada; zhaocheng.zhu@umontreal.ca, jian.tang@hec.ca; 7CIFAR AI…

Read more »

Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals. Yanai Elazar1,2, Shauli Ravfogel1,2, Alon Jacovi1, Yoav Goldberg1,2; 1Computer Science Department, Bar Ilan University; 2Allen Institute for Artificial Intelligence. {yanaiela,shauli.ravfogel,alonjacovi,yoav.goldberg}@gmail.com. Abstract: A growing body of work makes use of probing in order to investigate the working of neural models, often considered black boxes. Recently, an ongoing debate emerged surrounding the limitations of the probing paradigm. In this work,…
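
For readers unfamiliar with the probing setup this abstract debates, a conventional probe is simply a small classifier trained on frozen representations to test whether a property is recoverable from them. The sketch below shows that baseline setup with scikit-learn and random stand-in features; it is not the amnesic counterfactual method itself.

```python
# Conventional probing baseline: fit a linear classifier on frozen
# representations to check whether a property (e.g., POS) is recoverable.
# Features and labels are random stand-ins; amnesic probing goes further
# by removing the property and measuring the behavioral effect.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
reps = rng.normal(size=(500, 768))       # stand-in for frozen LM representations
labels = rng.integers(0, 5, size=500)    # stand-in property labels (5 classes)

probe = LogisticRegression(max_iter=1000).fit(reps[:400], labels[:400])
print("probe accuracy:", probe.score(reps[400:], labels[400:]))
```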

Read more »

Recursive Non-Autoregressive Graph-to-Graph Transformer

Recursive Non-Autoregressive Graph-to-Graph Transformer for Dependency Parsing with Iterative Refinement. Alireza Mohammadshahi, Idiap Research Institute / EPFL, alireza.mohammadshahi@idiap.ch; James Henderson, Idiap Research Institute, james.henderson@idiap.ch. Abstract: We propose the Recursive Non-autoregressive Graph-to-Graph Transformer architecture (RNGTr) for the iterative refinement of arbitrary graphs through the recursive application of a non-autoregressive Graph-to-Graph Transformer, and apply it to syntactic dependency parsing. We demonstrate the power and effectiveness…
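
The iterative-refinement idea in this abstract can be sketched generically: start from an initial graph and repeatedly let a refinement step predict a corrected graph conditioned on the previous one, stopping at a fixed point or a step budget. The refinement function below is a hypothetical placeholder, not the RNGTr model.

```python
# Generic iterative-refinement loop: a (hypothetical) refine_step maps the
# previous graph prediction to a new one, conditioned on the sentence,
# until the output stops changing or the step budget is exhausted.
def refine_until_converged(sentence, initial_graph, refine_step, max_steps=5):
    graph = initial_graph
    for _ in range(max_steps):
        new_graph = refine_step(sentence, graph)  # non-autoregressive: whole graph at once
        if new_graph == graph:                    # fixed point reached
            break
        graph = new_graph
    return graph

# Toy usage: a dependency graph as a tuple of head indices (-1 = unattached);
# this toy "model" attaches every non-initial token to token 0.
toy_refine = lambda sent, g: tuple(0 if i else -1 for i in range(len(g)))
print(refine_until_converged(["a", "toy", "sentence"], (-1, -1, -1), toy_refine))
```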

Read more »

Modeling Content and Context with Deep Relational Learning

Modeling Content and Context with Deep Relational Learning. Maria Leonor Pacheco and Dan Goldwasser, Department of Computer Science, Purdue University, West Lafayette, IN 47907. {pachecog, dgoldwas}@purdue.edu. Abstract: Building models for realistic natural language tasks requires dealing with long texts and accounting for complicated structural dependencies. Neural-symbolic representations have emerged as a way to combine the reasoning capabilities of symbolic methods with the expressiveness…

Read more »

Augmenting Transformers with KNN-Based

Augmenting Transformers with KNN-Based Composite Memory for Dialog. Angela Fan, Facebook AI Research / Université de Lorraine, LORIA, angelafan@fb.com; Claire Gardent, CNRS/LORIA, claire.gardent@loria.fr; Chloé Braud, CNRS/IRIT, chloe.braud@irit.fr; Antoine Bordes, Facebook AI Research, abordes@fb.com. Abstract: Various machine learning tasks can benefit from access to external information of different modalities, such as text and images. Recent work has focused on learning architectures with large memories capable of storing…
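
To make the "large memory" idea in this abstract concrete, the sketch below performs a plain k-nearest-neighbor lookup over an external table of stored vectors. The memory contents and the (implicit) query encoder are illustrative stand-ins, not the paper's dialog architecture.

```python
# Minimal sketch of a KNN lookup over an external memory of vectors.
# Memory contents are random stand-ins for encoded knowledge or images.
import numpy as np

def knn_retrieve(query_vec, memory_keys, memory_values, k=3):
    """Return the k memory values whose keys are most cosine-similar to the query."""
    keys = memory_keys / np.linalg.norm(memory_keys, axis=1, keepdims=True)
    q = query_vec / np.linalg.norm(query_vec)
    sims = keys @ q
    top = np.argsort(-sims)[:k]
    return [memory_values[i] for i in top]

rng = np.random.default_rng(0)
memory_keys = rng.normal(size=(100, 64))                  # 100 stored items, 64-dim keys
memory_values = [f"memory item {i}" for i in range(100)]  # what gets returned to the model
print(knn_retrieve(rng.normal(size=64), memory_keys, memory_values))
```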

Read more »

On the Relationships Between the Grammatical Genders of Inanimate

On the Relationships Between the Grammatical Genders of Inanimate Nouns and Their Co-Occurring Adjectives and Verbs. Adina Williams∗1, Ryan Cotterell∗,2,3, Lawrence Wolf-Sonkin4, Damián Blasi5, Hanna Wallach6; 1Facebook AI Research; 2ETH Zürich; 3University of Cambridge; 4Johns Hopkins University; 5Universität Zürich; 6Microsoft Research. adinawilliams@fb.com, ryan.cotterell@inf.ethz.ch, lawrencews@jhu.edu, damian.blasi@uzh.ch, wallach@microsoft.com. Abstract: We use large-scale corpora in six different gendered languages, along with tools from NLP and information theory, to…
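
One standard information-theoretic tool for questions like the one in this abstract is the mutual information between a noun's grammatical gender and a co-occurring word. The toy plug-in estimate below uses invented counts and is only meant to show the kind of quantity involved, not necessarily the paper's exact estimator.

```python
# Toy plug-in estimate of mutual information I(gender; adjective) from
# co-occurrence counts. Counts are invented for illustration.
import math
from collections import Counter

pairs = [("f", "pretty"), ("f", "small"), ("m", "big"),
         ("m", "small"), ("f", "pretty"), ("m", "big")]

joint = Counter(pairs)
gender = Counter(g for g, _ in pairs)
adj = Counter(a for _, a in pairs)
n = len(pairs)

mi = sum((c / n) * math.log2((c / n) / ((gender[g] / n) * (adj[a] / n)))
         for (g, a), c in joint.items())
print(f"I(gender; adjective) ~ {mi:.3f} bits")
```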

Read more »

Deciphering Undersegmented Ancient Scripts Using Phonetic Prior

Deciphering Undersegmented Ancient Scripts Using Phonetic Prior. Jiaming Luo, CSAIL, MIT, j luo@csail.mit.edu; Frederik Hartmann, University of Konstanz, frederik.hartmann@uni-konstanz.de; Enrico Santus, Bayer, enrico.santus@bayer.com; Regina Barzilay, CSAIL, MIT, regina@csail.mit.edu; Yuan Cao, Google Brain, yuancao@google.com. Abstract: Most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges: (1) the scripts are not fully segmented into words; (2) the closest known language is not determined.

Read more »

Efficient Content-Based Sparse Attention with Routing Transformers

Efficient Content-Based Sparse Attention with Routing Transformers. Aurko Roy, Mohammad Saffar, Ashish Vaswani, David Grangier, Google Research. {aurkor,msaffar,avaswani,grangier}@google.com. Abstract: Self-attention has recently been adopted for a wide range of sequence modeling problems. Despite its effectiveness, self-attention suffers from quadratic computation and memory requirements with respect to sequence length. Successful approaches to reduce this complexity focused on attending to local sliding windows or…
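
The quadratic cost mentioned in this abstract, and the local sliding-window workaround, can be made concrete with a small sketch. The NumPy code below implements plain windowed attention with illustrative shapes; it is the baseline sparsity pattern the abstract refers to, not the Routing Transformer's content-based routing.

```python
# Minimal NumPy sketch of local sliding-window attention: each query attends
# only to the most recent `window` keys, so cost is O(n * window), not O(n^2).
# Shapes and window size are illustrative assumptions.
import numpy as np

def local_window_attention(q, k, v, window=4):
    n, d = q.shape
    out = np.zeros_like(v)
    for i in range(n):
        lo = max(0, i - window + 1)
        scores = q[i] @ k[lo:i + 1].T / np.sqrt(d)   # at most `window` scores
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        out[i] = weights @ v[lo:i + 1]
    return out

rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 8)) for _ in range(3))
print(local_window_attention(q, k, v).shape)   # (16, 8)
```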

Read more »

Revisiting Multi-Domain Machine Translation

Revisiting Multi-Domain Machine Translation. MinhQuang Pham†‡, Josep Maria Crego†, François Yvon‡; ‡Université Paris-Saclay, CNRS, LIMSI, 91400 Orsay, France, francois.yvon@limsi.fr; †SYSTRAN, 5 rue Feydeau, 75002 Paris, France, {minhquang.pham,josep.crego}@systrangroup.com. Abstract: When building machine translation systems, one often needs to make the best out of heterogeneous sets of parallel data in training, and to robustly handle inputs from unexpected domains in testing. This multi-domain scenario has…

Read more »

Reducing Confusion in Active Learning for Part-Of-Speech Tagging

Reducing Confusion in Active Learning for Part-Of-Speech Tagging. Aditi Chaudhary1, Antonios Anastasopoulos2,∗, Zaid Sheikh1, Graham Neubig1; 1Language Technologies Institute, Carnegie Mellon University; 2Department of Computer Science, George Mason University. {aschaudh,zsheikh,gneubig}@cs.cmu.edu, antonis@gmu.edu. Abstract: Active learning (AL) uses a data selection algorithm to select useful training samples to minimize annotation cost. This is now an essential tool for building low-resource syntactic analyzers such as part-of-speech (POS)…
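
The data-selection loop this abstract mentions is often instantiated with uncertainty sampling: pick the unlabeled sentences the current tagger is least confident about. The sketch below shows that generic baseline; the predict_proba interface is a hypothetical placeholder, and this is not the confusion-reducing strategy the paper itself proposes.

```python
# Generic uncertainty sampling for active learning: choose the unlabeled
# sentences with the lowest average per-token confidence under the current
# tagger. `predict_proba` is a hypothetical model interface.
import numpy as np

def select_for_annotation(unlabeled_sents, predict_proba, budget=10):
    scores = []
    for sent in unlabeled_sents:
        probs = predict_proba(sent)             # (num_tokens, num_tags)
        confidence = probs.max(axis=1).mean()   # average per-token confidence
        scores.append(confidence)
    order = np.argsort(scores)                  # least confident first
    return [unlabeled_sents[i] for i in order[:budget]]
```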

Read more »

A Primer in BERTology: What We Know About How BERT Works

A Primer in BERTology: What We Know About How BERT Works. Anna Rogers, Center for Social Data Science, University of Copenhagen, arogers@sodas.ku.dk; Olga Kovaleva, Dept. of Computer Science, University of Massachusetts Lowell, okovalev@cs.uml.edu; Anna Rumshisky, Dept. of Computer Science, University of Massachusetts Lowell, arum@cs.uml.edu. Abstract: Transformer-based models have pushed state of the art in many areas of NLP, but our understanding of what is…

Read more »