A Neighborhood Framework for Resource-Lean Content Flagging Sheikh Muhammad Sarwar2,5,∗ and Dimitrina Zlatkova1 and Momchil Hardalov1,6 and Yoan Dinkov1 and Isabelle Augenstein1,3 and Preslav Nakov1,4 1Checkstep, United Kingdom, 2University of Massachusetts, Amherst, 3University of Copenhagen, Denmark,…
TopiOCQA: Open-domain Conversational Question Answering
TopiOCQA: Open-domain Conversational Question Answering with Topic Switching Vaibhav Adlakha1,4 Shehzaad Dhuliawala2 Kaheer Suleman3 Harm de Vries4 Siva Reddy1,5 2ETH Zürich, Switzerland 3Microsoft Montréal, Canada 1Mila, McGill University, Canada 4ServiceNow Research, Canada 5Facebook CIFAR AI…
Czech Grammar Error Correction with a Large and Diverse Corpus
Czech Grammar Error Correction with a Large and Diverse Corpus Jakub Náplava† Milan Straka† Jana Straková† Alexandr Rosen‡ †Charles University, Faculty of Mathematics and Physics Institute of Formal and Applied Linguistics, Czech Republic {naplava,straka,strakova}@ufal.mff.cuni.cz ‡Charles…
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text
LOT: A Story-Centric Benchmark for Evaluating Chinese Long Text Understanding and Generation Jian Guan1, Zhuoer Feng1, Yamei Chen1, Ruilin He2, Xiaoxi Mao3, Changjie Fan3, Minlie Huang1∗ 1The CoAI group, DCST, China; 2Huawei Technologies Co., Ltd.,…
PADA: Example-based Prompt Learning
PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains Eyal Ben-David∗ Nadav Oved∗ Roi Reichart {eyalbd12@campus.|nadavo@campus.|roiri@}technion.ac.il Technion – Israel Institute of Technology, Israel Abstract Natural Language Processing algorithms have made incredible progress, but they…
Data-driven Model Generalizability in Crosslinguistic Low-resource
Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation Zoey Liu Department of Computer Science Boston College, USA zoey.liu@bc.edu Emily Prud’hommeaux Department of Computer Science Boston College, USA prudhome@bc.edu Abstract Common designs of model evaluation typi-…
VILA: Improving Structured Content Extraction from Scientific PDFs
VILA: Improving Structured Content Extraction from Scientific PDFs Using Visual Layout Groups Zejiang Shen1 Kyle Lo1 Lucy Lu Wang1 Bailey Kuehl1 Daniel S. Weld1,2 Doug Downey1,3 1Allen Institute for AI, USA 2University of Washington, USA…
Evaluating Explanations: How Much Do Explanations
Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students? Danish Pruthi1∗ Rachit Bansal2 Bhuwan Dhingra3 Livio Baldini Soares3 Michael Collins3 Zachary C. Lipton1 Graham Neubig1 William W. Cohen3 1 Carnegie Mellon University, USA…
A Multi-Level Optimization Framework for End-to-End
A Multi-Level Optimization Framework for End-to-End Text Augmentation Sai Ashish Somayajula UC San Diego, USA ssomayaj@ucsd.edu Linfeng Song Tencent AI Lab, USA lfsong@tencent.com Pengtao Xie∗ UC San Diego, USA p1xie@eng.ucsd.edu Abstract Text augmentation is an…
Towards General Natural Language Understanding
Towards General Natural Language Understanding with Probabilistic Worldbuilding Abulhair Saparov and Tom M. Mitchell Machine Learning Department, Carnegie Mellon University, USA {asaparov, tom.mitchell}@cs.cmu.edu Abstract We introduce the Probabilistic Worldbuilding Model (PWM), a new fully symbolic…