Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets Julia Kreutzer1,2, Isaac Caswell3, Lisa Wang3,4, Ahsan Wahab5,47, Daan van Esch6, Nasanbayar Ulzii-Orshikh7, Allahsera Tapo8,9, Nishant Subramani10,11, Artem Sokolov4, Claytone Sikasote12,13, Monang Setyawan14, Supheakmungkol Sarin14, Sokhar Samb15,16, Benoît Sagot17, Clara Rivera18, Annette Rios19, Isabel Papadimitriou20, Salomey Osei21,22, Pedro Ortiz Suarez17,23, Iroro Orife10,24, Kelechi Ogueji2,25, Andre Niyongabo Rubungo26,27, Toan Q. Nguyen28, Mathias Müller19, André M
Decomposing and Recomposing Event Structure
Decomposing and Recomposing Event Structure William Gantt University of Rochester, USA wgantt@cs.rochester.edu Lelia Glass Georgia Institute of Technology, USA lelia.glass@modlangs.gatech.edu Aaron Steven White University of Rochester, USA aaron.white@rochester.edu Abstract We present an event structure classification empirically derived from inferential properties annotated on sentence- and document-level Universal Decompositional Semantics (UDS) graphs. We induce this classification jointly with semantic role, entity, and event-event relation classifications using
Word Acquisition in Neural Language Models
Word Acquisition in Neural Language Models Tyler A. Chang1,2, Benjamin K. Bergen1 1Department of Cognitive Science 2Halıcıoğlu Data Science Institute University of California, San Diego, USA {tachang, bkbergen}@ucsd.edu Abstract We investigate how neural language models acquire individual words during training, extracting learning curves and ages of acquisition for over 600 words on the MacArthur-Bates Communicative Development Inventory (Fenson et al., 2007). Drawing
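To make the method concrete: a word's learning curve is its surprisal measured at successive training checkpoints, and an age of acquisition can be read off as the step where that curve crosses a criterion. The sketch below is a simplified, hypothetical variant (the paper fits curves to the checkpoints rather than using a raw threshold crossing); the function name and example values are illustrative only.

```python
# A minimal sketch (not the authors' code) of extracting an "age of
# acquisition" from a learning curve: the first training step at which a
# word's surprisal falls below the midpoint between its initial and final
# surprisal. This threshold-crossing rule is a simplification of curve
# fitting over checkpoints.
import numpy as np

def age_of_acquisition(steps: np.ndarray, surprisal: np.ndarray) -> float:
    """Return the first step where surprisal crosses the midpoint threshold.

    steps     -- 1D array of training steps (checkpoints), ascending.
    surprisal -- 1D array of the word's mean surprisal at each checkpoint.
    """
    threshold = (surprisal[0] + surprisal[-1]) / 2.0  # midpoint criterion
    below = np.where(surprisal <= threshold)[0]
    if len(below) == 0:
        return float("inf")  # never acquired within training
    return float(steps[below[0]])

# Hypothetical usage: surprisal for one word across five checkpoints.
steps = np.array([0, 1_000, 10_000, 100_000, 1_000_000])
surprisal = np.array([15.2, 14.8, 9.1, 5.3, 4.9])
print(age_of_acquisition(steps, surprisal))  # -> 10000.0
```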
Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation
Word Representation Learning in Multimodal Pre-Trained Transformers: An Intrinsic Evaluation Sandro Pezzelle, Ece Takmaz, Raquel Fernández Institute for Logic, Language and Computation University of Amsterdam, The Netherlands {s.pezzelle|e.takmaz|raquel.fernandez}@uva.nl Abstract This study carries out a systematic intrinsic evaluation of the semantic representations learned by state-of-the-art pre-trained multimodal Transformers. These representations are claimed to be task-agnostic and shown to help on many downstream language-and-vision tasks.
Idiomatic Expression Identification using Semantic Compatibility
Idiomatic Expression Identification using Semantic Compatibility Ziheng Zeng and Suma Bhat Department of Electrical and Computer Engineering University of Illinois at Urbana-Champaign Champaign, IL USA {zzeng13, spbhat2}@illinois.edu Abstract Idiomatic expressions are an integral part of natural language and are constantly being added to a language. Owing to their non-compositionality and their ability to take on a figurative or literal meaning depending on the sentential context, they
Quantifying Cognitive Factors in Lexical Decline
Quantifying Cognitive Factors in Lexical Decline David Francis1 Ella Rabinovich1 Farhan Samir1 David Mortensen2 Suzanne Stevenson1 1Department of Computer Science, University of Toronto, Canada 2Language Technologies Institute, Carnegie Mellon University, USA {dfrancis, ella, fsamir, suzanne}@cs.toronto.edu dmortens@cs.cmu.edu Abstract We adopt an evolutionary view on language change in which cognitive factors (in addition to social ones) affect the fitness of words and their success in the linguistic
Explanation-Based Human Debugging of NLP Models: A Survey
Explanation-Based Human Debugging of NLP Models: A Survey Piyawat Lertvittayakumjorn and Francesca Toni Department of Computing Imperial College London, UK {pl1515, ft}@imperial.ac.uk Abstract Debugging a machine learning model is hard since the bug usually involves the training data and the learning process. This becomes even harder for an opaque deep learning model if we have no clue about how the model actually works. In this
Instance-Based Neural Dependency Parsing
Instance-Based Neural Dependency Parsing Hiroki Ouchi1,3 Jun Suzuki2,3 Sosuke Kobayashi2,4 Sho Yokoi2,3 Tatsuki Kuribayashi2,5 Masashi Yoshikawa2,3 Kentaro Inui2,3 1 NAIST, Japan 2 Tohoku University, Japan 3 RIKEN, Japan 4 Preferred Networks, Inc., Japan 5 Langsmith, Inc., Japan hiroki.ouchi@is.naist.jp, sosk@preferred.jp, {jun.suzuki, yokoi, kuribayashi, yoshikawa, inui}@tohoku.ac.jp Abstract Interpretable rationales for model predictions are crucial in practical applications. We develop neural models that possess an interpretable
Planning with Learned Entity Prompts for Abstractive Summarization
Planning with Learned Entity Prompts for Abstractive Summarization Shashi Narayan Google Research shashinarayan@google.com Yao Zhao Google Brain yaozhaoyz@google.com Joshua Maynez Google Research joshuahm@google.com Gonçalo Simões Google Research gsimoes@google.com Vitaly Nikolaev Google Research vitalyn@google.com Ryan McDonald∗ ASAPP ryanmcd@asapp.com Abstract We introduce a simple but flexible mechanism to learn an intermediate plan to ground the generation of abstractive summaries. Specifically, we prepend (or prompt) target summaries
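The core mechanism here is easy to picture: during training, the target sequence becomes an entity chain (the plan) prepended to the gold summary, so the model learns to emit its plan before the summary itself. The sketch below illustrates that construction; the separator tokens are hypothetical placeholders, not necessarily the markers used in the paper.

```python
# A sketch of the "content plan as prompt" idea: the training target is the
# ordered entity chain prepended to the gold summary. Marker strings below
# are hypothetical.
from typing import List

PLAN_TOKEN = "[ENTITYCHAIN]"   # hypothetical marker opening the plan
SUMMARY_TOKEN = "[SUMMARY]"    # hypothetical marker opening the summary

def build_prompted_target(entities: List[str], summary: str) -> str:
    """Prepend an ordered entity chain to the target summary."""
    chain = " | ".join(entities)  # entities in order of appearance
    return f"{PLAN_TOKEN} {chain} {SUMMARY_TOKEN} {summary}"

# Hypothetical example:
print(build_prompted_target(
    ["Frozen 2", "Disney"],
    "Frozen 2 is Disney's highest-grossing animated sequel.",
))
# -> [ENTITYCHAIN] Frozen 2 | Disney [SUMMARY] Frozen 2 is Disney's ...
```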
Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation
Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation Markus Freitag George Foster David Grangier Viresh Ratnakar Qijun Tan Wolfgang Macherey Google Research {freitag, fosterg, grangier, vratnakar, qijuntan, wmach}@google.com Abstract Human evaluation of modern high-quality machine translation systems is a difficult problem, and there is increasing evidence that inadequate evaluation procedures can lead to erroneous conclusions. While there has been
Differentiable Subset Pruning of Transformer Heads
Differentiable Subset Pruning of Transformer Heads Jiaoda Li♣ Ryan Cotterell♣♠ Mrinmaya Sachan♣ ♠University of Cambridge, UK ♣ETH Zürich, Switzerland {jiaoda.li,ryan.cotterell,mrinmaya.sachan}@inf.ethz.ch Abstract Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer. Recent work has shown, however, that a large proportion of the heads in a Transformer’s multi-head attention mechanism
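The structure that head-pruning methods operate on can be made explicit with per-head gates: each head's output is scaled by a gate, and pruning a head amounts to fixing its gate to zero (a differentiable subset pruner would constrain exactly k gates to be active). The code below is a generic illustration of gated multi-head attention, not the paper's implementation.

```python
# Generic gated multi-head attention: each head's output is scaled by a
# learnable gate; pruning a head sets its gate to zero.
import torch
import torch.nn as nn

class GatedMultiHeadAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # One gate per head; a hard subset pruner would keep exactly k nonzero.
        self.gates = nn.Parameter(torch.ones(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape each to (B, heads, T, d_head).
        q, k, v = (t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))
        attn = torch.softmax(q @ k.transpose(-2, -1) / self.d_head**0.5, dim=-1)
        heads = attn @ v                              # (B, heads, T, d_head)
        heads = heads * self.gates.view(1, -1, 1, 1)  # scale each head
        return self.out(heads.transpose(1, 2).reshape(B, T, -1))

# Pruning head 0 is simply:
mha = GatedMultiHeadAttention(d_model=64, n_heads=4)
with torch.no_grad():
    mha.gates[0] = 0.0
```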
Weisfeiler-Leman in the BAMBOO: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity
Weisfeiler-Leman in the BAMBOO: Novel AMR Graph Metrics and a Benchmark for AMR Graph Similarity Juri Opitz1 Angel Daza2 Anette Frank1 1Dept. of Computational Linguistics, Heidelberg University, Germany 2CLTL, Vrije Universiteit Amsterdam, The Netherlands {opitz, frank}@cl.uni-heidelberg.de, j.a.dazaarevalo@vu.nl Abstract Several metrics have been proposed for assessing the similarity of (abstract) meaning representations (AMRs), but little is known about how they relate to human similarity
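The primitive behind Weisfeiler-Leman graph metrics is a relabeling pass: each node's label is replaced by a hash of its own label plus the sorted multiset of its neighbors' labels, and graph similarity is computed by comparing label histograms across iterations. The sketch below is the textbook node-labeled version of this algorithm, not the paper's exact AMR metric (which also handles edge labels).

```python
# One Weisfeiler-Leman relabeling pass plus a histogram-intersection
# similarity, on plain node-labeled graphs.
from collections import Counter
from typing import Dict, List

def wl_iteration(labels: Dict[int, str], adj: Dict[int, List[int]]) -> Dict[int, str]:
    """Relabel each node by its label + sorted neighbor labels."""
    return {
        v: f"{labels[v]}({','.join(sorted(labels[u] for u in adj[v]))})"
        for v in labels
    }

def wl_similarity(l1, adj1, l2, adj2, iters: int = 2) -> float:
    """Histogram-intersection similarity over labels from all WL iterations."""
    h1, h2 = Counter(l1.values()), Counter(l2.values())
    for _ in range(iters):
        l1, l2 = wl_iteration(l1, adj1), wl_iteration(l2, adj2)
        h1.update(l1.values())
        h2.update(l2.values())
    overlap = sum((h1 & h2).values())  # shared label counts
    return 2 * overlap / (sum(h1.values()) + sum(h2.values()))

# Hypothetical AMR-like graphs: (want-01 :arg0 boy) vs. (want-01 :arg0 girl).
labels_a, adj_a = {0: "want-01", 1: "boy"}, {0: [1], 1: [0]}
labels_b, adj_b = {0: "want-01", 1: "girl"}, {0: [1], 1: [0]}
print(wl_similarity(labels_a, adj_a, labels_b, adj_b))  # partial overlap < 1.0
```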
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP
Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP Timo Schick∗ Sahana Udupa† Hinrich Schütze∗ ∗Center for Information and Language Processing (CIS), LMU Munich, Germany †Institute of Social and Cultural Anthropology, LMU Munich, Germany schickt@cis.lmu.de, sahana.udupa@lmu.de, inquiries@cislmu.org Abstract This paper contains prompts and model outputs that are offensive in nature. When trained on large, unfiltered crawls from the Internet, language models
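The self-diagnosis recipe can be sketched as follows: append a diagnostic question to a model's own output and compare the probabilities the language model assigns to "Yes" versus "No" as the next token. The template below paraphrases this general recipe; the exact prompts and attribute descriptions in the paper differ.

```python
# Sketch of self-diagnosis: ask the model whether its own output exhibits a
# given attribute, and read the answer off the next-token distribution.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def self_diagnose(text: str, attribute: str) -> float:
    """Return P(Yes) / (P(Yes) + P(No)) for the diagnostic question."""
    prompt = (f'"{text}"\n'
              f"Question: Does the above text contain {attribute}?\n"
              f"Answer:")
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]  # next-token logits
    probs = torch.softmax(logits, dim=-1)
    p_yes = probs[tok.encode(" Yes")[0]].item()
    p_no = probs[tok.encode(" No")[0]].item()
    return p_yes / (p_yes + p_no)

# Hypothetical usage:
print(self_diagnose("You are a wonderful person.", "a threat"))  # near 0
```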
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering
MKQA: A Linguistically Diverse Benchmark for Multilingual Open Domain Question Answering Shayne Longpre Apple Inc. slongpre@mit.edu Yi Lu Apple Inc. ylu7@apple.com Joachim Daiber Apple Inc. jodaiber@apple.com Abstract Progress in cross-lingual modeling depends on challenging, realistic, and diverse evaluation sets. We introduce Multilingual Knowledge Questions and Answers (MKQA), an open-domain question answering evaluation set comprising 10k question-answer pairs aligned across 26 typologically diverse languages (260k
A Biologically Plausible Parser
A Biologically Plausible Parser Daniel Mitropolsky Department of Computer Science Columbia University New York, New York 10027, USA Michael J. Collins Google Research New York, New York 10011, USA Christos H. Papadimitriou Department of Computer Science Columbia University New York, New York 10027, USA Abstract We describe a parser of English effectuated by biologically plausible neurons and synapses, and implemented through the Assembly Calculus, a recently proposed
Model Compression for Domain Adaptation through Causal Effect
Model Compression for Domain Adaptation through Causal Effect Estimation Guy Rotman∗, Amir Feder∗, Roi Reichart Faculty of Industrial Engineering and Management, Technion, IIT, Israel grotman@campus.technion.ac.il feder@campus.technion.ac.il roiri@technion.ac.il Abstract Recent improvements in the predictive quality of natural language processing systems are often dependent on a substantial increase in the number of model parameters. This has led to various attempts of compressing such models, but existing
On Generative Spoken Language Modeling from Raw Audio
On Generative Spoken Language Modeling from Raw Audio Kushal Lakhotia∗, Eugene Kharitonov∗, Wei-Ning Hsu, Yossi Adi, Adam Polyak, Benjamin Bolte§, Tu-Anh Nguyen†, Jade Copet, Alexei Baevski, Abdelrahman Mohamed, Emmanuel Dupoux‡ Facebook AI Research textlessNLP@fb.com Abstract We introduce Generative Spoken Language Modeling, the task of learning the acoustic and linguistic characteristics of a language from raw audio (no text, no labels), and a set of
Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss
Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss Thomas Effland Columbia University, USA teffland@cs.columbia.edu Michael Collins Google Research, USA mjcollins@google.com Abstract We study learning named entity recognizers in the presence of missing entity annotations. We approach this setting as tagging with latent variables and propose a novel loss, the Expected Entity Ratio, to learn models in the presence of systematically missing
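The idea of supervising an expected ratio rather than individual tokens can be sketched as a regularizer: given per-token entity marginals from a tagger, penalize the model when its expected proportion of entity tags drifts outside a margin around a prior ratio. This is a hedged illustration of that general idea; the paper's exact formulation may differ.

```python
# Hedged sketch of an "expected entity ratio" style regularizer over
# per-token entity marginals; rho and margin are hypothetical priors.
import torch

def expected_entity_ratio_loss(entity_marginals: torch.Tensor,
                               rho: float = 0.15,
                               margin: float = 0.05) -> torch.Tensor:
    """entity_marginals: (num_tokens,) P(token is part of an entity)."""
    expected_ratio = entity_marginals.mean()
    # Zero loss inside [rho - margin, rho + margin], linear outside.
    return torch.clamp(torch.abs(expected_ratio - rho) - margin, min=0.0)

# Hypothetical usage: marginals for six tokens, two likely entities.
marginals = torch.tensor([0.9, 0.1, 0.05, 0.02, 0.8, 0.03])
print(expected_entity_ratio_loss(marginals))  # tensor(0.1167)
```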