Continual Learning for Grounded Instruction Generation
Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
Noriyuki Kojima, Alane Suhr, Yoav Artzi
Department of Computer Science and Cornell Tech, Cornell University, USA
nk654@cornell.edu, {suhr, yoav}@cs.cornell.edu
Abstract: We study continual learning for natural language instruction generation, by observing human users' instruction execution. We focus on a collaborative scenario, where the system both acts and delegates tasks to human users
Lexically Aware Semi-Supervised Learning for OCR Post-Correction
Lexically Aware Semi-Supervised Learning for OCR Post-Correction
Shruti Rijhwani1, Daisy Rosenblum2, Antonios Anastasopoulos3, Graham Neubig1
1Language Technologies Institute, Carnegie Mellon University, USA; 2University of British Columbia, Canada; 3Department of Computer Science, George Mason University, USA
srijhwan@cs.cmu.edu, daisy.rosenblum@ubc.ca, antonis@gmu.edu, gneubig@cs.cmu.edu
Abstract: Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents. Optical character recognition
Structured Self-Supervised Pretraining for Commonsense Knowledge Graph Completion
Jiayuan Huang•∗, Yangkai Du•∗, Shuting Tao•∗, Kun Xu♦, Pengtao Xie♠†
•Zhejiang University, China; ♦Tencent AI Lab, USA; ♠UC San Diego, USA
p1xie@eng.ucsd.edu
Abstract: To develop commonsense-grounded NLP applications, a comprehensive and accurate commonsense knowledge graph (CKG) is needed. It is time-consuming to manually construct CKGs and many research efforts have been devoted to the
Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics
Paula Czarnowska♠ (University of Cambridge, UK), pjc211@cam.ac.uk; Yogarshi Vyas (Amazon AI, USA), yogarshi@amazon.com; Kashif Shah (Amazon AI, USA), shahkas@amazon.com
Abstract: Measuring bias is key for better understanding and addressing unfairness in NLP/ML models. This is often done via fairness metrics, which quantify the differences in a model's behaviour across a range
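As a hedged illustration of what an extrinsic fairness metric computes, the sketch below measures the largest gap in per-group accuracy. The toy labels, group assignments, and max-gap aggregation are assumptions for illustration, not the paper's generalized formulation.

    # A minimal sketch of a gap-style extrinsic fairness metric: the
    # difference in a per-group performance score (here, accuracy).
    # Toy data and the max-gap aggregation are illustrative assumptions.
    from collections import defaultdict
    from itertools import combinations

    def group_accuracy(y_true, y_pred, groups):
        """Accuracy computed separately for each demographic group."""
        hits, counts = defaultdict(int), defaultdict(int)
        for t, p, g in zip(y_true, y_pred, groups):
            counts[g] += 1
            hits[g] += int(t == p)
        return {g: hits[g] / counts[g] for g in counts}

    def max_gap(scores):
        """Largest pairwise difference in per-group scores; 0 means parity."""
        return max(abs(scores[a] - scores[b]) for a, b in combinations(scores, 2))

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    groups = ["A", "A", "A", "B", "B", "B", "B", "B"]
    print(max_gap(group_accuracy(y_true, y_pred, groups)))  # ~0.133: B outscores A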
On the Difficulty of Translating Free-Order Case-Marking Languages
On the Difficulty of Translating Free-Order Case-Marking Languages
Arianna Bisazza, Ahmet Üstün, Stephan Sportel
Center for Language and Cognition, University of Groningen, The Netherlands
{a.bisazza, a.ustun}@rug.nl, research@spor.tel
Abstract: Identifying factors that make certain languages harder to model than others is essential to reach language equality in future Natural Language Processing technologies. Free-order case-marking languages, such as Russian, Latin, or Tamil, have proved more challenging
Controllable Summarization with Constrained Markov Decision Process
Controllable Summarization with Constrained Markov Decision Process
Hou Pong Chan1, Lu Wang2, and Irwin King3
1University of Macau, Macau SAR, China; 2University of Michigan, Ann Arbor, MI, USA; 3The Chinese University of Hong Kong, Hong Kong SAR, China
1hpchan@um.edu.mo, 2wangluxy@umich.edu, 3king@cse.cuhk.edu.hk
Abstract: We study controllable text summarization, which allows users to gain control over a particular attribute (e.g., length limit) of the generated summaries.
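To make the constrained-MDP framing concrete, here is a generic sketch of Lagrangian relaxation for a length constraint: the quality reward is penalized by a multiplier-weighted constraint violation, and the multiplier is raised by dual ascent while the constraint is violated. The quality scores, length budget, and update rule are illustrative assumptions, not the paper's training setup.

    # A generic sketch of the constrained-MDP idea: maximize summary
    # quality subject to a length limit, via a Lagrangian penalty.
    # Quality scores, the budget, and the learning rate are assumptions.
    def lagrangian_reward(quality, length, max_len, lam):
        """Penalized reward: quality minus lam times the constraint violation."""
        violation = max(0, length - max_len)
        return quality - lam * violation

    def update_multiplier(lam, length, max_len, lr=0.01):
        """Dual ascent: increase lam while the length constraint is violated."""
        return max(0.0, lam + lr * (length - max_len))

    lam = 0.0
    for quality, length in [(0.8, 60), (0.7, 45), (0.9, 70)]:
        print(lagrangian_reward(quality, length, max_len=50, lam=lam))
        lam = update_multiplier(lam, length, max_len=50)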
Memory-Based Semantic Parsing
Memory-Based Semantic Parsing
Parag Jain and Mirella Lapata
Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
parag.jain@ed.ac.uk, mlap@inf.ed.ac.uk
Abstract: We present a memory-based model for context-dependent semantic parsing. Previous approaches focus on enabling the decoder to copy or modify the parse from the previous utterance, assuming there is a dependency between
Identity-Based Patterns in Deep Convolutional Networks: Generative Adversarial Phonology and Reduplication
Gašper Beguš
University of California, Berkeley, USA
begus@berkeley.edu
Abstract: This paper models unsupervised learning of an identity-based pattern (or copying) in speech called reduplication from raw continuous data with deep convolutional neural networks. We use the ciwGAN architecture (Beguš, 2021a) in which learning of meaningful representations in speech emerges from a requirement that
What Helps Transformers Recognize Conversational Structure?
What Helps Transformers Recognize Conversational Structure? Importance of Context, Punctuation, and Labels in Dialog Act Recognition
Piotr Żelasko†‡, Raghavendra Pappagari†‡, Najim Dehak†‡
†Center for Language and Speech Processing, ‡Human Language Technology Center of Excellence, Johns Hopkins University, Baltimore, MD, USA
piotr.andrzej.zelasko@gmail.com
Abstract: Dialog acts can be interpreted as the atomic units of a conversation, more fine-grained than utterances, characterized by a specific communicative function.
PARSINLU: A Suite of Language Understanding Challenges for Persian
PARSINLU: A Suite of Language Understanding Challenges for Persian
Daniel Khashabi1, Arman Cohan1, Siamak Shakeri2, Pedram Hosseini3, Pouya Pezeshkpour4, Malihe Alikhani5, Moin Aminnaseri6, Marzieh Bitaab7, Faeze Brahman8, Sarik Ghazarian9, Mozhdeh Gheini9, Arman Kabiri10, Rabeeh Karimi Mahabadi11, Omid Memarrast12, Ahmadreza Mosallanezhad7, Sepideh Sadeghi2, Erfan Noury13, Shahab Raji14, Mohammad Sadegh Rasooli15, Erfan Sadeqi Azer2, Niloofar Safi Samghabadi16, Mahsa Shafaei17, Saber Sheybani18, Ali Tazarv4, Yadollah Yaghoobzadeh19
1Allen Institute
A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods
Daniel Deutsch, Rotem Dror, and Dan Roth
Department of Computer and Information Science, University of Pennsylvania, USA
{ddeutsch,rtmdrr,danroth}@seas.upenn.edu
Abstract: The quality of a summarization evaluation metric is quantified by calculating the correlation between its scores and human annotations across a large number of summaries. Currently, it is unclear how precise these correlation estimates
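As a hedged sketch of the resampling idea in the title, the snippet below bootstraps a confidence interval for the correlation between a metric's scores and human annotations. The toy scores, the percentile method, and the choice of Pearson correlation are illustrative assumptions, not the paper's exact protocol.

    # A minimal sketch: percentile-bootstrap confidence interval for the
    # metric-human correlation. Toy scores and Pearson r are assumptions.
    import random
    from statistics import correlation  # Pearson r, Python 3.10+

    metric_scores = [0.31, 0.55, 0.42, 0.78, 0.12, 0.66, 0.49, 0.83]
    human_scores = [2.0, 3.5, 3.0, 4.5, 1.5, 4.0, 2.5, 5.0]

    def bootstrap_ci(xs, ys, n_resamples=10_000, alpha=0.05):
        """Resample summary indices with replacement; take percentile bounds."""
        n, stats = len(xs), []
        for _ in range(n_resamples):
            idx = [random.randrange(n) for _ in range(n)]
            xs_b = [xs[i] for i in idx]
            if len(set(xs_b)) == 1:  # degenerate resample: r is undefined
                continue
            stats.append(correlation(xs_b, [ys[i] for i in idx]))
        stats.sort()
        k = len(stats)
        return stats[int(alpha / 2 * k)], stats[int((1 - alpha / 2) * k) - 1]

    print(bootstrap_ci(metric_scores, human_scores))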
MasakhaNER: Named Entity Recognition for African Languages
MasakhaNER: Named Entity Recognition for African Languages
David Ifeoluwa Adelani1∗, Jade Abbott2∗, Graham Neubig3, Daniel D'souza4∗, Julia Kreutzer5∗, Constantine Lignos6∗, Chester Palen-Michel6∗, Happy Buzaaba7∗, Shruti Rijhwani3, Sebastian Ruder8, Stephen Mayhew9, Israel Abebe Azime10∗, Shamsuddeen H. Muhammad11,12∗, Chris Chinenye Emezue13∗, Joyce Nakatumba-Nabende14∗, Perez Ogayo15∗, Aremu Anuoluwapo16∗, Catherine Gitau∗, Derguene Mbaye∗, Jesujoba Alabi17∗, Seid Muhie Yimam18, Tajuddeen Rabiu Gwadabe19∗, Ignatius Ezeani20∗, Rubungo Andre Niyongabo21∗, Jonathan Mukiibi14, Verrah Otiende22∗, Iroro Orife23∗,
PAQ: 65 Million Probably-Asked Questions and What You Can Do With Them
Patrick Lewis†‡, Yuxiang Wu‡, Linqing Liu‡, Pasquale Minervini‡, Heinrich Küttler†, Aleksandra Piktus†, Pontus Stenetorp‡, Sebastian Riedel†‡
†Facebook AI Research; ‡University College London, UK
{plewis,hnr,piktus,sriedel}@fb.com, {yuxiang.wu,linqing.liu,p.minervini,p.stenetorp}@cs.ucl.ac.uk
Abstract: Open-domain Question Answering models that directly leverage question-answer (QA) pairs, such as closed-book QA (CBQA) models and QA-pair retrievers, show promise in terms of speed and
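As a hedged illustration of a QA-pair retriever, the sketch below answers a new question by returning the answer paired with the most lexically similar stored question. The toy QA pairs and token-overlap similarity are assumptions, not PAQ's learned retriever.

    # A minimal sketch of a QA-pair retriever: retrieve the stored
    # question most similar to the query and return its paired answer.
    # Toy pairs and Jaccard token overlap are illustrative assumptions.
    def overlap(q1: str, q2: str) -> float:
        """Jaccard similarity over lowercased token sets."""
        a, b = set(q1.lower().split()), set(q2.lower().split())
        return len(a & b) / len(a | b)

    qa_pairs = [
        ("who wrote hamlet", "William Shakespeare"),
        ("what is the capital of france", "Paris"),
    ]

    def answer(question: str) -> str:
        _, best_answer = max(qa_pairs, key=lambda qa: overlap(question, qa[0]))
        return best_answer

    print(answer("who was the writer of hamlet"))  # -> William Shakespeare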
He Thinks He Knows Better than the Doctors: BERT for Event Factuality Fails on Pragmatics
Nanjiang Jiang, Marie-Catherine de Marneffe
Department of Linguistics, The Ohio State University, USA
jiang.1879@osu.edu, demarneffe.1@osu.edu
Abstract: We investigate how well BERT performs on predicting factuality in several existing English datasets, encompassing various linguistic constructions. Although BERT obtains a strong performance on
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Prakhar Ganesh1∗, Yao Chen1∗, Xin Lou1, Mohammad Ali Khan1, Yin Yang2, Hassan Sajjad3, Preslav Nakov3, Deming Chen4, Marianne Winslett4
1Advanced Digital Sciences Center, Singapore; 2College of Science and Engineering, Hamad Bin Khalifa University, Qatar; 3Qatar Computing Research Institute, Hamad Bin Khalifa University, Qatar; 4University of Illinois at Urbana-Champaign, USA
{prakhar.g,yao.chen,lou.xin,mohammad.k}@adsc-create.edu.sg, {yyang,hsajjad,pnakov}@hbku.edu.qa, {dchen,winslett}@illinois.edu
Abstract: Pre-trained Transformer-based models
Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
William Merrill∗, Yoav Goldberg∗†, Roy Schwartz‡, Noah A. Smith∗§
∗Allen Institute for AI, United States; †Bar Ilan University, Israel; ‡Hebrew University of Jerusalem, Israel; §University of Washington, United States
{willm,yoavg,roys,noah}@allenai.org
Abstract: Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success
Narrative Question Answering with Cutting-Edge Open-Domain QA Techniques: A Comprehensive Study
Xiangyang Mou∗, Chenghao Yang∗, Mo Yu∗, Bingsheng Yao, Xiaoxiao Guo, Saloni Potdar, Hui Su
Rensselaer Polytechnic Institute & IBM, United States
moux4@rpi.edu, gflfof@gmail.com
Abstract: Recent advancements in open-domain question answering (ODQA), that is, finding answers from a large open-domain corpus like Wikipedia, have led to human-level performance on many datasets. However, progress
Measuring and Improving Consistency in Pretrained Language Models
Measuring and Improving Consistency in Pretrained Language Models
Yanai Elazar1,2, Nora Kassner3, Shauli Ravfogel1,2, Abhilasha Ravichander4, Eduard Hovy4, Hinrich Schütze3, Yoav Goldberg1,2
1Computer Science Department, Bar Ilan University, Israel; 2Allen Institute for Artificial Intelligence, United States; 3Center for Information and Language Processing (CIS), LMU Munich, Germany; 4Language Technologies Institute, Carnegie Mellon University, United States
{yanaiela,shauli.ravfogel,yoav.goldberg}@gmail.com, kassner@cis.lmu.de, {aravicha,hovy}@cs.cmu.edu
Abstract: Consistency of a model—that is, the
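As a hedged sketch of the consistency notion the title suggests, the snippet below scores a model as consistent when paraphrases of the same cloze query receive identical answers. The predict stub and the prompts are illustrative assumptions, not the paper's benchmark or models.

    # A minimal sketch: consistency as the fraction of paraphrase pairs
    # that receive the same answer. The predict stub is an assumption
    # standing in for a pretrained LM's top fill-in prediction.
    from itertools import combinations

    def predict(prompt: str) -> str:
        """Stand-in for a masked LM's top answer to a cloze prompt."""
        canned = {
            "Dante was born in [MASK].": "Florence",
            "The birthplace of Dante is [MASK].": "Florence",
            "Dante is a native of [MASK].": "Rome",
        }
        return canned[prompt]

    def consistency(prompts):
        """Fraction of paraphrase pairs with identical predictions."""
        answers = [predict(p) for p in prompts]
        pairs = list(combinations(answers, 2))
        return sum(a == b for a, b in pairs) / len(pairs)

    paraphrases = [
        "Dante was born in [MASK].",
        "The birthplace of Dante is [MASK].",
        "Dante is a native of [MASK].",
    ]
    print(consistency(paraphrases))  # 1/3: only one pair agrees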