Documentation

LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text

LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation Jian Guan1, Zhuoer Feng1, Yamei Chen1, Ruilin He2, Xiaoxi Mao3, Changjie Fan3, Minlie Huang1∗ 1The CoAI group, DCST, China; 2Huawei Technologies Co., Ltd., China; 3Netease Fuxi AI Lab., China {j-guan19,fze17}@mails.tsinghua.edu.cn, chenziym4132013@163.com, {maoxiaoxi,fanchangjie}@corp.netease.com, heruilin@huawei.com, aihuang@tsinghua.edu.cn Abstract Standard multi-task benchmarks are essential for developing pretraining models that can generalize to various downstream tasks. Existing

Read more »

PADA: Example-based Prompt Learning

PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains Eyal Ben-David∗ Nadav Oved∗ Roi Reichart {eyalbd12@campus.|nadavo@campus.|roiri@}technion.ac.il Technion–Israel Institute of Technology, Israel Abstract Natural Language Processing algorithms have made incredible progress, but they still struggle when applied to out-of-distribution examples. We address a challenging and under-explored version of this domain adaptation problem, where an algorithm is trained on several source domains,

Read more »

Data-driven Model Generalizability in Crosslinguistic Low-resource

Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation Zoey Liu Department of Computer Science Boston College, USA zoey.liu@bc.edu Emily Prud’hommeaux Department of Computer Science Boston College, USA prudhome@bc.edu Abstract Common designs of model evaluation typically focus on monolingual settings, where different models are compared according to their performance on a single data set that is assumed to be representative of all possible data for

Read more »

VILA: Improving Structured Content Extraction from Scientific PDFs

VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups Zejiang Shen1 Kyle Lo1 Lucy Lu Wang1 Bailey Kuehl1 Daniel S. Weld1,2 Doug Downey1,3 1Allen Institute for AI, USA 2University of Washington, USA 3Northwestern University, USA {shannons,kylel,lucyw,baileyk,danw,dougd}@allenai.org Abstract Accurately extracting structured content from PDFs is a critical first step for NLP over scientific papers. Recent work has improved extraction accuracy by incorporating elemen-

Read more »

Evaluating Explanations: How Much Do Explanations

Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students? Danish Pruthi1∗ Rachit Bansal2 Bhuwan Dhingra3 Livio Baldini Soares3 Michael Collins3 Zachary C. Lipton1 Graham Neubig1 William W. Cohen3 1 Carnegie Mellon University, USA 2 Delhi Technological University, India 3 Google Research, USA {ddanish, zlipton, gneubig}@cs.cmu.edu, racbansa@gmail.com {bdhingra, liviobs, mjcollins, wcohen}@google.com Abstract While many methods purport to explain predictions by highlighting salient features,

Read more »

A Multi-Level Optimization Framework for End-to-End

A Multi-Level Optimization Framework for End-to-End Text Augmentation Sai Ashish Somayajula UC San Diego, USA ssomayaj@ucsd.edu Linfeng Song Tencent AI Lab, USA lfsong@tencent.com Pengtao Xie∗ UC San Diego, USA p1xie@eng.ucsd.edu Abstract Text augmentation is an effective technique in alleviating overfitting in NLP tasks. In existing methods, text augmentation and downstream tasks are mostly performed separately. Consequently, the augmented texts may not

Read more »

Towards General Natural Language Understanding

Towards General Natural Language Understanding with Probabilistic Worldbuilding Abulhair Saparov and Tom M. Mitchell Machine Learning Department, Carnegie Mellon University, USA {asaparov, tom.mitchell}@cs.cmu.edu Abstract We introduce the Probabilistic Worldbuilding Model (PWM), a new fully symbolic Bayesian model of semantic parsing and reasoning, as a first step in a research program toward more domain- and task-general NLU and AI. Humans create internal mental models of

Read more »

Designing an Automatic Agent for Repeated Language-based

Designing an Automatic Agent for Repeated Language-based Persuasion Games Maya Raifer, Guy Rotman, Reut Apel, Moshe Tennenholtz, Roi Reichart Technion–Israel Institute of Technology, Israel {mayatarno, grotman, reutapel}@campus.technion.ac.il {roiri, moshet}@technion.ac.il Abstract Persuasion games are fundamental in economics and AI research and serve as the basis for important applications. However, work on this setup assumes communication with stylized messages that do not consist of rich

Read more »

Time-Aware Language Models as Temporal Knowledge Bases

Time-Aware Language Models as Temporal Knowledge Bases Bhuwan Dhingra∗ Jeremy R. Cole∗ Jacob Eisenstein William W. Cohen Google Research Julian Martin Eisenschlos Daniel Gillick {bdhingra,jrcole,eisenjulian,dgillick,jeisenstein,wcohen}@google.com Abstract Many facts come with an expiration date, from the name of the President to the basketball team LeBron James plays for. However, most language models (LMs) are trained on snapshots of data collected at a specific moment in

Read more »

ABNIRML: Analyzing the Behavior of Neural IR Models

ABNIRML: Analyzing the Behavior of Neural IR Models Sean MacAvaney†∗ Sergey Feldman‡ Nazli Goharian† Doug Downey‡ Arman Cohan‡§ †IR Lab, Georgetown University, Washington, DC, USA ‡Allen Institute for AI, Seattle, WA, USA §Paul G. Allen School of Computer Science, University of Washington, WA, USA {sean,nazli}@ir.cs.georgetown.edu {sergey,dougd,armanc}@allenai.org Abstract Pretrained contextualized language models such as BERT and T5 have established a new state-of-the-art for ad-hoc search. How-

Read more »

Predicting Document Coverage for Relation Extraction

Predicting Document Coverage for Relation Extraction Sneha Singhania, Simon Razniewski, Gerhard Weikum Max Planck Institute for Informatics, Germany {ssinghan,srazniew,weikum}@mpi-inf.mpg.de Abstract This paper presents a new task of predicting the coverage of a text document for relation extraction (RE): Does the document contain many relational tuples for a given entity? Coverage predictions are useful in selecting the best documents for knowledge base construction with large

Read more »

A Survey on Automated Fact-Checking

A Survey on Automated Fact-Checking Zhijiang Guo∗, Michael Schlichtkrull∗, Andreas Vlachos Department of Computer Science and Technology University of Cambridge, UK {zg283,mss84,av308}@cam.ac.uk Abstract Fact-checking has become increasingly important due to the speed with which both information and misinformation can spread in the modern media ecosystem. Therefore, researchers have been exploring how fact-checking can be automated, using techniques based on natural language processing, machine

Read more »

SUMMAC: Re-Visiting NLI-based Models for

SUMMAC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization Philippe Laban Tobias Schnabel Paul N. Bennett Marti A. Hearst UC Berkeley, USA Microsoft, USA Microsoft, USA UC Berkeley, USA∗ Abstract In the summarization domain, a key requirement for summaries is to be factually consistent with the input document. Previous work has found that natural language inference (NLI) models do not perform competitively when

Read more »

Samanantar: The Largest Publicly Available

Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages Gowtham Ramesh1∗ Sumanth Doddapaneni1∗ Aravinth Bheemaraj2,5 Mayank Jobanputra3 Raghavan AK4 Ajitesh Sharma2,5 Sujit Sahoo2,5 Harshita Diddee4 Mahalakshmi J4 Divyanshu Kakwani3,4 Navneet Kumar2,5 Aswin Pradeep2,5 Srihari Nagaraj2,5 Kumar Deepak2,5 Vivek Raghavan5 Anoop Kunchukuttan4,6 Pratyush Kumar1,3,4 Mitesh Shantadevi† Khapra1,3,4‡ 1RBCDSAI, India 2Tarento Technologies, India 3IIT Madras, India 4AI4Bharat, India 5EkStep Foundation, India 6Microsoft, India Abstract

Read more »

Out-of-Domain Discourse Dependency Parsing via Bootstrapping:

Out-of-Domain Discourse Dependency Parsing via Bootstrapping: An Empirical Analysis on Its Effectiveness and Limitation Noriki Nishida and Yuji Matsumoto RIKEN Center for Advanced Intelligence Project, Japan {noriki.nishida, yuji.matsumoto}@riken.jp Abstract Discourse parsing has been studied for decades. However, it still remains challenging to utilize discourse parsing for real-world applications because the parsing accuracy degrades significantly on out-of-domain text. In this paper, we report

Read more »

Break, Perturb, Build: Automatic Perturbation of Reasoning Paths

Break, Perturb, Build: Automatic Perturbation of Reasoning Paths Through Question Decomposition Mor Geva, Tomer Wolfson, Jonathan Berant School of Computer Science, Tel Aviv University, Israel Allen Institute for Artificial Intelligence {morgeva@mail,tomerwol@mail,joberant@cs}.tau.ac.il Abstract Recent efforts to create challenge benchmarks that test the abilities of natural language understanding models have largely depended on human annotations. In this work, we introduce the "Break, Perturb, Build" (BPB)

Read more »

Dealing with Disagreements:

Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations Aida Mostafazadeh Davani University of Southern California, USA mostafaz@usc.edu Mark Díaz Google Research, USA markdiaz@google.com Vinodkumar Prabhakaran Google Research, USA vinodkpg@google.com Abstract Majority voting and averaging are common approaches used to resolve annotator disagreements and derive single ground truth labels from multiple annotations. However, annotators may systematically disagree with one another, often reflecting

Read more »

CANINE: Pre-training an Efficient Tokenization-Free Encoder

CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting Google Research, USA {jhclark,dhgarrette,iuliaturc,jwieting}@google.com Abstract Pipelined NLP systems have largely been superseded by end-to-end neural modeling, yet nearly all commonly used models still require an explicit tokenization step. While recent tokenization approaches based on data-derived subword lexicons are less brittle than manually engineered tokenizers, these

Read more »