Maintaining Common Ground in Dynamic Environments Takuma Udagawa1 and Akiko Aizawa1,2 The University of Tokyo, Tokyo, Japan1 National Institute of Informatics, Tokyo, Japan2 {takuma_udagawa,aizawa}@nii.ac.jp Abstract Common grounding is the process of creating and maintaining mutual understandings, which is a critical aspect of sophisticated human communication. While various task settings have been proposed in existing literature, they mostly focus on creating common ground
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs Emanuele Bugliarello Ryan Cotterell Naoaki Okazaki Desmond Elliott University of Copenhagen University of Cambridge ETH Zürich Tokyo Institute of Technology emanuele@di.ku.dk, rcotterell@inf.ethz.ch, okazaki@c.titech.ac.jp, de@di.ku.dk Abstract Large-scale pretraining and task-specific fine-tuning is now the standard methodology for many tasks in computer vision and natural language processing. Recently, a multitude of methods have been
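How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering Zhengbao Jiang†, Jun Araki‡, Haibo Ding‡, Graham Neubig† †Language Technologies Institute, Carnegie Mellon University, United States ‡Bosch Research, United States {zhengbaj,gneubig}@cs.cmu.edu {jun.araki,haibo.ding}@us.bosch.com Abstract Recent works have shown that language models (LMs) capture different types of knowledge regarding facts or common sense. However, because no model is perfect,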
Unsupervised Abstractive Opinion Summarization
Unsupervised Abstractive Opinion Summarization by Generating Sentences with Tree-Structured Topic Guidance Masaru Isonuma1 Junichiro Mori1,2 Danushka Bollegala3 Ichiro Sakata1 1The University of Tokyo, Japan 2 RIKEN, Japan 3 University of Liverpool, United Kingdom isonuma@ipr-ctr.t.u-tokyo.ac.jp mori@mi.u-tokyo.ac.jp danushka@liverpool.ac.uk isakata@ipr-ctr.t.u-tokyo.ac.jp Abstract This paper presents a novel unsupervised abstractive summarization method for opinionated texts. While the basic variational autoencoder-based models assume a unimodal Gaussian prior for the latent
Relevance-guided Supervision for OpenQA with ColBERT
Relevance-guided Supervision for OpenQA with ColBERT Omar Khattab Stanford University, United States okhattab@stanford.edu Christopher Potts Stanford University, United States cgpotts@stanford.edu Matei Zaharia Stanford University, United States matei@cs.stanford.edu Abstract Systems for Open-Domain Question Answering (OpenQA) generally depend on a retriever for finding candidate passages in a large corpus and a reader for extracting answers from those passages. In much recent work, the retriever is a
Neural Modeling for Named Entities and Morphology (NEMO2)
Neural Modeling for Named Entities and Morphology (NEMO2) Dan Bareket1,2 and Reut Tsarfaty1 1Bar Ilan University, Ramat-Gan, Israel 2Open Media and Information Lab (OMILab), The Open University of Israel, Israel dbareket@gmail.com, reut.tsarfaty@biu.ac.il Abstract Named Entity Recognition (NER) is a fundamental NLP task, commonly formulated as classification over a sequence of tokens. Morphologically rich languages (MRLs) pose a challenge to this basic formulation, as
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Sensitivity as a Complexity Measure for Sequence Classification Tasks Michael Hahn Stanford University, United States mhahn2@stanford.edu Dan Jurafsky Stanford University, United States jurafsky@stanford.edu Richard Futrell University of California, Irvine, United States rfutrell@uci.edu Abstract We introduce a theoretical framework for understanding and predicting the complexity of sequence classification tasks, using a novel extension of the theory of Boolean function sensitivity. The sensitivity of a function, given
Neural Event Semantics for Grounded Language Understanding
Neural Event Semantics for Grounded Language Understanding Shyamal Buch Li Fei-Fei Noah D. Goodman {shyamal,feifeili}@cs.stanford.edu ngoodman@stanford.edu Stanford University, United States Abstract We present a new conjunctivist framework, neural event semantics (NES), for compositional grounded language understanding. Our approach treats all words as classifiers that compose to form a sentence meaning by multiplying output scores. These classifiers apply to spatial regions (events) and NES
Gender Bias in Machine Translation
Gender Bias in Machine Translation Beatrice Savoldi1,2, Marco Gaido1,2, Luisa Bentivogli2, Matteo Negri2, Marco Turchi2 1University of Trento, Italy 2Fondazione Bruno Kessler, Italy {bsavoldi,mgaido,bentivo,negri,turchi}@fbk.eu Abstract Machine translation (MT) technology has facilitated our daily tasks by providing accessible shortcuts for gathering, processing, and communicating information. However, it can suffer from biases that harm users and society at large. As a relatively new field of
Let’s Play Mono-Poly: BERT Can Reveal Words’ Polysemy Level
Let’s Play Mono-Poly: BERT Can Reveal Words’ Polysemy Level and Partitionability into Senses Aina Garí Soler Université Paris-Saclay CNRS, LISN 91400, Orsay, France aina.gari@limsi.fr Marianna Apidianaki Department of Digital Humanities University of Helsinki Helsinki, Finland marianna.apidianaki@helsinki.fi Abstract Pre-trained language models (LMs) encode rich information about linguistic structure but their knowledge about lexical polysemy remains unclear. We propose a novel experimental setup for analyzing this
SOLOIST: Building Task Bots at Scale with
SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching Baolin Peng, Chunyuan Li, Jinchao Li Shahin Shayandeh, Lars Liden, Jianfeng Gao Microsoft Research, Redmond, United States {bapeng,chunyl,jincli,shahins,lars.liden,jfgao}@microsoft.com Abstract We present a new method, SOLOIST, that uses transfer learning and machine teaching to build task bots at scale. We parameterize classical modular task-oriented dialog systems using a Transformer-based auto-regressive language model, which subsumes
Classifying Argumentative Relations
Classifying Argumentative Relations Using Logical Mechanisms and Argumentation Schemes Yohan Jo1 Seojin Bang1 Chris Reed2 Eduard Hovy1 1School of Computer Science, Carnegie Mellon University, United States 2Centre for Argument Technology, University of Dundee, United Kingdom 1{yohanj,seojinb,ehovy}@andrew.cmu.edu, 2c.a.reed@dundee.ac.uk Abstract While argument mining has achieved significant success in classifying argumentative relations between statements (support, attack, and neutral), we have a limited computational understanding of logical
Strong Equivalence of TAG and CCG
Strong Equivalence of TAG and CCG Lena Katharina Schiffer and Andreas Maletti Faculty of Mathematics and Computer Science, Universität Leipzig, Germany P.O. Box 100 920, D-04009 Leipzig, Germany {schiffer,maletti}@informatik.uni-leipzig.de Abstract Tree-adjoining grammar (TAG) and combinatory categorial grammar (CCG) are two well-established mildly context-sensitive grammar formalisms that are known to have the same expressive power on strings (i.e., generate the same class of string
Revisiting Few-shot Relation Classification:
Revisiting Few-shot Relation Classification: Evaluation Data and Classification Schemes Ofer Sabo1 Yanai Elazar1,2 Yoav Goldberg1,2 Ido Dagan1 1Computer Science Department, Bar-Ilan University, Israel 2Allen Institute for Artificial Intelligence {ofersabo,yanaiela,yoav.goldberg,ido.k.dagan}@gmail.com Abstract We explore few-shot learning (FSL) for relation classification (RC). Focusing on the realistic scenario of FSL, in which a test instance might not belong to any of the target categories (none-of-the-above, [NOTA]), we
Efficient Computation of Expectations under Spanning Tree Distributions
Efficient Computation of Expectations under Spanning Tree Distributions Ran Zmigrod University of Cambridge, United Kingdom Tim Vieira Johns Hopkins University, United States Ryan Cotterell ETH Zürich, Switzerland rz279@cam.ac.uk tim.f.vieira@gmail.com ryan.cotterell@inf.ethz.ch Abstract We give a general framework for inference in spanning tree models. We propose unified algorithms for the important cases of first-order expectations and second-order expectations in edge-factored, non-projective spanning-tree
Pretraining the Noisy Channel Model for Task-Oriented Dialogue
Pretraining the Noisy Channel Model for Task-Oriented Dialogue Qi Liu2∗, Lei Yu1, Laura Rimell1, and Phil Blunsom1,2 1DeepMind, United Kingdom 2University of Oxford, United Kingdom qi.liu@cs.ox.ac.uk {leiyu,laurarimell,pblunsom}@google.com Abstract Direct decoding for task-oriented dialogue is known to suffer from the explaining-away effect, manifested in models that prefer short and generic responses. Here we argue for the use of Bayes’ theorem to factorize the dialogue task
Self-supervised Regularization for Text Classification
Self-supervised Regularization for Text Classification Meng Zhou∗ Shanghai Jiao Tong University, China Zechen Li∗ Northeastern University, United States Pengtao Xie† UC San Diego, United States p1xie@eng.ucsd.edu zhoumeng9904@sjtu.edu.cn li.zec@northeastern.edu Abstract Text classification is a widely studied problem and has broad applications. In many real-world problems, the number of texts for training classification models is limited, which renders these models prone to overfitting. To address this problem,
Evaluating Document Coherence Modeling
Evaluating Document Coherence Modeling Aili Shen♣, Meladel Mistica♣, Bahar Salehi♣, Hang Li♦, Timothy Baldwin♣, Jianzhong Qi♣ ♣ The University of Melbourne, Australia ♦ AI Lab at ByteDance, China {aili.shen, misticam, tbaldwin, jianzhong.qi}@unimelb.edu.au baharsalehi@gmail.com, lihang.lh@bytedance.com Abstract While pretrained language models (LMs) have driven impressive gains over morpho-syntactic and semantic tasks, their ability to model discourse and pragmatic phenomena is less clear. As a step towards