Analysis Methods in Neural Language Processing: A Survey
Yonatan Belinkov1,2 and James Glass1. 1MIT Computer Science and Artificial Intelligence Laboratory; 2Harvard School of Engineering and Applied Sciences; Cambridge, MA, USA. {belinkov, glass}@mit.edu
Abstract
The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed,
Joint Transition-Based Models for Morpho-Syntactic Parsing: Parsing Strategies for MRLs and a Case Study from Modern Hebrew
Amir More, Open University, Ra'anana, Israel, habeanf@gmail.com; Victoria Basmova, Open University, Ra'anana, Israel, vicbas@openu.ac.il; Amit Seker, Open University, Ra'anana, Israel, amitse@openu.ac.il; Reut Tsarfaty, Open University, Ra'anana, Israel, reutts@openu.ac.il
Abstract
In standard NLP pipelines, morphological analysis and disambiguation (MA&D) precedes syntactic and semantic downstream tasks. However, for languages
Semantic Neural Machine Translation Using AMR
Linfeng Song,1 Daniel Gildea,1 Yue Zhang,2 Zhiguo Wang,3 and Jinsong Su4. 1Department of Computer Science, University of Rochester, Rochester, NY 14627; 2School of Engineering, Westlake University, China; 3IBM T.J. Watson Research Center, Yorktown Heights, NY 10598; 4Xiamen University, Xiamen, China. 1{lsong10,gildea}@cs.rochester.edu, 2yue.zhang@wias.org.cn, 3zgw.tomorrow@gmail.com, 4jssu@xmu.edu.cn
Abstract
It is intuitive that semantic representations can be useful for machine translation, mainly be-
Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
Alla Rozovskaya, Queens College, City University of New York, arozovskaya@qc.cuny.edu; Dan Roth, University of Pennsylvania, danroth@seas.upenn.edu
Abstract
Until now, most of the research in grammar error correction focused on English, and the problem has hardly been explored for other languages. We address the task of correcting writing mistakes in morphologically rich languages, with
Learning Typed Entailment Graphs with Global Soft Constraints
Mohammad Javad Hosseini*§, Nathanael Chambers**, Siva Reddy†, Xavier R. Holt‡, Shay B. Cohen*, Mark Johnson‡ and Mark Steedman*. *University of Edinburgh; §The Alan Turing Institute, UK; **United States Naval Academy; †Stanford University; ‡Macquarie University. javad.hosseini@ed.ac.uk, nchamber@usna.edu, sivar@stanford.edu, {xavier.ricketts-holt,mark.johnson}@mq.edu.au, {scohen,steedman}@inf.ed.ac.uk
Abstract
This paper presents a new method for learning typed entailment graphs from text. We extract predicate-argument
Surface Statistics of an Unknown Language Indicate How to Parse It
Dingquan Wang and Jason Eisner, Department of Computer Science, Johns Hopkins University, {wdd,jason}@cs.jhu.edu
Abstract
We introduce a novel framework for delexicalized dependency parsing in a new language. We show that useful features of the target language can be extracted automatically from an unparsed corpus, which consists only of gold part-of-speech
Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
Wenpeng Yin, Department of Computer and Information Science, University of Pennsylvania, wenpeng@seas.upenn.edu; Hinrich Schütze, Center for Information and Language Processing, LMU Munich, Germany, inquiries@cislmu.org
Abstract
In NLP, convolutional neural networks (CNNs) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that this is because the attention in CNNs has been mainly
Errata: "Improving Topic Models with Latent Feature Word Representations"
Dat Quoc Nguyen, Richard Billingsley, Lan Du and Mark Johnson
Abstract
FROM (a part of Table 10 in the original published article): F1 scores for TMN and TMNtitle datasets. Change in clustering and classification results due to the DMM and LF-DMM bugs. Data TMN. 4.3 Document clustering evaluation. FROM (in the original published article): For
Transactions of the Association for Computational Linguistics, 1 (2013) 429–440. Action Editor: Philipp Koehn. Submitted 3/2013; Revised 8/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Measuring Machine Translation Errors in New Domains
Ann Irvine, Johns Hopkins University, anni@jhu.edu; John Morgan, University of Maryland, jjm@cs.umd.edu; Marine Carpuat, National Research Council Canada, marine.carpuat@nrc.gc.ca; Hal Daumé III, University of Maryland, me@hal3.name; Dragos Munteanu, SDL Research, dmunteanu@sdl.com

Abstract
We develop two techniques for analyzing the effect of porting a machine translation system to a new domain. One is a macro-level analysis that measures how domain shift affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. We apply these methods to understand what happens when a Parliament-trained phrase-based machine translation system is applied in four very different domains: news, medical texts, scientific articles and movie subtitles. We present quantitative and qualitative experiments that highlight opportunities for future research in domain adaptation for machine translation.

1 Introduction
When building a statistical machine translation (SMT) system, the expected use case is often limited to a specific domain, genre and register (henceforth "domain" refers to this set, in keeping with standard, imprecise, terminology), such as a particular type of legal or medical document. Unfortunately, it is expensive to obtain enough parallel data to reliably estimate translation models in a new domain. Instead, one can hope that large amounts of data from another, "old domain," might be close enough to stand as a proxy. This is the de facto standard: we train SMT systems on Parliament proceedings, but then use them to translate all sorts of new text. Unfortunately, this results in significantly degraded translation quality. In this paper, we present two complementary methods for quantifiably measuring the source of translation errors (§5.1 and §5.2) in a novel taxonomy (§4). We show quantitative (§7.1) and qualitative (§7.2) results obtained from our methods on

Table 1: Example inputs, references and system outputs. There are three types of errors: unseen words (blue), incorrect sense selection (red) and unknown sense (green).
Old Domain (Hansard)
Inp: monsieur le président, les pêcheurs de homard de la région de l'atlantique sont dans une situation catastrophique.
Ref: mr. speaker, lobster fishers in atlantic canada are facing a disaster.
Out: mr. speaker, the lobster fishers in atlantic canada are in a mess.
New Domain (Medical)
Inp: mode et voie(s) d'administration
Ref: method and route(s) of administration
Out: fashion and voie(s) of directors

four very different new domains: newswire, medical texts, scientific abstracts, and movie subtitles. Our basic approach is to think of translation errors in the context of a novel taxonomy of error categories, "S4." Our taxonomy contains categories for the errors shown in Table 1, in which an SMT system trained on the Hansard parliamentary proceedings is applied to a new domain (in this case, medical texts). Our categorization focuses on the following: new French words, new French senses, and incorrectly chosen translations. The first methodology we develop for studying such errors is a micro-level study of the frequency and distribution of these error types in real translation output at the level of individual words (§5.1), without respect to how these errors affect overall translation quality. The second is a macro-level study of how these errors affect translation performance (measured by BLEU; §5.2). One important feature of our methodologies is that we focus on errors that could possibly be fixed given access to data from a new domain, rather than all errors that might arise because the particular translation model used is inadequate to capture the required
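The unseen-word category of the taxonomy can be approximated with a simple vocabulary check against the old-domain training data. Below is a minimal sketch (not the paper's exact procedure) using the toy medical example from Table 1; the function name and data are hypothetical.

```python
def unseen_word_rate(new_domain_sentences, old_domain_vocab):
    """Fraction of new-domain source tokens absent from the old-domain vocabulary."""
    tokens = [tok for sent in new_domain_sentences for tok in sent.split()]
    unseen = [tok for tok in tokens if tok not in old_domain_vocab]
    return len(unseen) / len(tokens)

# Toy old-domain vocabulary and a toy new-domain input (4 tokens, 1 unseen).
old_vocab = {"mode", "et", "d'administration"}
new_sents = ["mode et voie(s) d'administration"]
rate = unseen_word_rate(new_sents, old_vocab)  # "voie(s)" is unseen -> 0.25
```

In a real analysis the vocabulary would come from the Hansard training corpus and sense-level errors would require alignment information, not just token lookup.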
Transactions of the Association for Computational Linguistics, 1 (2013) 415–428. Action Editor: Brian Roark. Submitted 7/2013; Revised 9/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Joint Morphological and Syntactic Analysis for Richly Inflected Languages
Bernd Bohnet∗, Joakim Nivre⋆, Igor Boguslavsky•◦, Richárd Farkas§, Filip Ginter†, Jan Hajič‡. ∗University of Birmingham, School of Computer Science; ⋆Uppsala University, Department of Linguistics and Philology; •Universidad Politécnica de Madrid, Departamento de Inteligencia Artificial; ◦Russian Academy of Sciences, Institute for Information Transmission Problems; §University of Szeged, Institute of Informatics; †University of Turku, Department of Information Technology; ‡Charles University in Prague, Institute of Formal and Applied Linguistics

Abstract
Joint morphological and syntactic analysis has been proposed as a way of improving parsing accuracy for richly inflected languages. Starting from a transition-based model for joint part-of-speech tagging and dependency parsing, we explore different ways of integrating morphological features into the model. We also investigate the use of rule-based morphological analyzers to provide hard or soft lexical constraints and the use of word clusters to tackle the sparsity of lexical features. Evaluation on five morphologically rich languages (Czech, Finnish, German, Hungarian, and Russian) shows consistent improvements in both morphological and syntactic accuracy for joint prediction over a pipeline model, with further improvements thanks to lexical constraints and word clusters. The final results improve the state of the art in dependency parsing for all languages.

1 Introduction
Syntactic parsing of natural language has witnessed a tremendous development during the last twenty years, especially through the use of statistical models for robust and accurate broad-coverage parsing. However, as statistical parsing techniques have been applied to more and more languages, it has also been observed that typological differences between languages lead to new challenges. In particular, it has been found over and over again that languages exhibiting rich morphological structure, often together with a relatively free word order, usually obtain lower parsing accuracy, especially in comparison to English. One striking demonstration of this tendency can be found in the CoNLL shared tasks on multilingual dependency parsing, organized in 2006 and 2007, where richly inflected languages clustered at the lower end of the scale with respect to parsing accuracy (Buchholz and Marsi, 2006; Nivre et al., 2007). These and similar observations have led to an increased interest in the special challenges posed by parsing morphologically rich languages, as evidenced most clearly by a new series of workshops devoted to this topic (Tsarfaty et al., 2010), as well as a special issue in Computational Linguistics (Tsarfaty et al., 2013) and a shared task on parsing morphologically rich languages.1 One hypothesized explanation for the lower parsing accuracy observed for richly inflected languages is the strict separation of morphological and syntactic analysis assumed in many parsing frameworks (Tsarfaty et al., 2010; Tsarfaty et al., 2013). This is true in particular for data-driven dependency parsers, which tend to assume that all morphological disambiguation has been performed before syntactic analysis begins. However, as argued by Lee et al. (2011), in morphologically rich languages there is often considerable interaction between morphology and syntax, such that neither can be disambiguated without the other. Lee et al. (2011) go on to show that a discriminative model for joint morphological disambiguation and dependency parsing gives consistent improvements in morphological and syntactic accuracy, compared to a pipeline model, for Ancient Greek, Czech, Hungarian and Latin. Similarly, Bohnet and Nivre (2012) propose a model for

1 See https://sites.google.com/site/spmrl2013/home/sharedtask.
Transactions of the Association for Computational Linguistics, 1 (2013) 403–414. Action Editor: Jason Eisner. Submitted 6/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Training Deterministic Parsers with Non-Deterministic Oracles
Yoav Goldberg, Bar-Ilan University, Department of Computer Science, Ramat-Gan, Israel, yoav.goldberg@gmail.com; Joakim Nivre, Uppsala University, Department of Linguistics and Philology, Uppsala, Sweden, joakim.nivre@lingfil.uu.se

Abstract
Greedy transition-based parsers are very fast but tend to suffer from error propagation. This problem is aggravated by the fact that they are normally trained using oracles that are deterministic and incomplete in the sense that they assume a unique canonical path through the transition system and are only valid as long as the parser does not stray from this path. In this paper, we give a general characterization of oracles that are nondeterministic and complete, present a method for deriving such oracles for transition systems that satisfy a property we call arc decomposition, and instantiate this method for three well-known transition systems from the literature. We say that these oracles are dynamic, because they allow us to dynamically explore alternative and non-optimal paths during training, in contrast to oracles that statically assume a unique optimal path. Experimental evaluation on a wide range of datasets clearly shows that using dynamic oracles to train greedy parsers gives substantial improvements in accuracy. Moreover, this improvement comes at no cost in terms of efficiency, unlike other techniques like beam search.

1 Introduction
Greedy transition-based parsers are easy to implement and are very efficient, but they are generally not as accurate as parsers that are based on global search (McDonald et al., 2005; Koo and Collins, 2010) or as transition-based parsers that use beam search (Zhang and Clark, 2008) or dynamic programming (Huang and Sagae, 2010; Kuhlmann et al., 2011). This work is part of a line of research trying to push the boundaries of greedy parsing and narrow the accuracy gap of 2–3% between search-based and greedy parsers, while maintaining the efficiency and incremental nature of greedy parsers. One reason for the lower accuracy of greedy parsers is error propagation: once the parser makes an error in decoding, more errors are likely to follow. This behavior is closely related to the way in which greedy parsers are normally trained. Given a treebank oracle, a gold sequence of transitions is derived, and a predictor is trained to predict transitions along this gold sequence, without considering any parser state outside this sequence. Thus, once the parser strays from the golden path at test time, it ventures into unknown territory and is forced to react to situations it has never been trained for. In recent work (Goldberg and Nivre, 2012), we introduced the concept of a dynamic oracle, which is non-deterministic and not restricted to a single golden path, but instead provides optimal predictions for any possible state the parser might be in. Dynamic oracles are non-deterministic in the sense that they return a set of valid transitions for a given parser state and gold tree. Moreover, they are well-defined and optimal also for states from which the gold tree cannot be derived, in the sense that they return the set of transitions leading to the best tree derivable from each state. We showed experimentally that, using a dynamic oracle for the arc-eager transition system (Nivre, 2003), a greedy parser can be trained to perform well also after incurring a mistake, thus alleviating the effect of error propagation and resulting in consistently better parsing accuracy.
Transactions of the Association for Computational Linguistics, 1 (2013) 379–390. Action Editor: Lillian Lee. Submitted 6/2013; Revised 9/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Data-Driven Metaphor Recognition and Explanation
Hongsong Li, Microsoft Research Asia, hongsli@microsoft.com; Kenny Q. Zhu, Shanghai Jiao Tong University, kzhu@cs.sjtu.edu.cn; Haixun Wang, Google Research, haixun@google.com

Abstract
Recognizing metaphors and identifying the source-target mappings is an important task as metaphorical text poses a big challenge for machine reading. To address this problem, we automatically acquire a metaphor knowledge base and an isA knowledge base from billions of web pages. Using the knowledge bases, we develop an inference mechanism to recognize and explain the metaphors in the text. To our knowledge, this is the first purely data-driven approach of probabilistic metaphor acquisition, recognition, and explanation. Our results show that it significantly outperforms other state-of-the-art methods in recognizing and explaining metaphors.

1 Introduction
A metaphor is a way of communicating. It enables us to comprehend one thing in terms of another. For example, the metaphor, Juliet is the sun, allows us to see Juliet much more vividly than if Shakespeare had taken a more literal approach. We utter about one metaphor for every ten to twenty-five words, or about six metaphors a minute (Geary, 2011). Specifically, a metaphor is a mapping of concepts from a source domain to a target domain (Lakoff and Johnson, 1980). The source domain is often concrete and based on sensory experience, while the target domain is usually abstract. Two concepts are connected by this mapping because they share some common or similar properties, and as a result, the meaning of one concept can be transferred to another. For example, in "Juliet is the sun," the sun is the source concept while Juliet is the target concept. One interpretation of this metaphor is that both concepts share the property that their existence brings about warmth, life, and excitement. In a metaphorical sentence, at least one of the two concepts must be explicitly present. This leads to three types of metaphors:
1. Juliet is the sun. Here, both the source (sun) and the target (Juliet) are explicit.
2. Please wash your claws before scratching me. Here, the source (claws) is explicit, while the target (hands) is implicit, and the context of wash is in terms of the target.
3. Your words cut deep. Here, the target (words) is explicit, while the source (possibly, knife) is implicit, and the context of cut is in terms of the source.
In this paper, we focus on the recognition and explanation of metaphors. For a given sentence, we first check whether it contains a metaphoric expression (which we call metaphor recognition), and if it does, we identify the source and the target concepts of the metaphor (which we call metaphor explanation). Metaphor explanation is important for understanding metaphors. Explaining type 2 and 3 metaphors is particularly challenging, and, to the best of our knowledge, has not been attempted for nominal concepts1 before. In our examples, knowing that life and hands are the target concepts avoids the confusion that may arise if source concepts sun and claws are used literally in understanding the sentences. This, however, does not mean that the source

1 Nominal concepts are those represented by noun phrases.
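For the type-1 pattern "X is the Y," the knowledge-base idea can be sketched in a few lines: flag the sentence as metaphoric when the pair does not match a literal isA relation but does match a known source-to-target-category mapping. This is a toy stand-in for the paper's probabilistic inference; both knowledge bases and the function name below are hypothetical.

```python
# Hypothetical metaphor KB: (source concept, target category) pairs.
METAPHOR_KB = {("sun", "person"), ("knife", "speech")}
# Hypothetical isA KB: concept -> literal category.
ISA_KB = {"juliet": "person", "sun": "star", "words": "speech"}

def recognize_is_a_metaphor(target_word, source_word):
    """Toy check for 'X is the Y': metaphoric if not literal isA but the
    (source, category-of-target) pair appears in the metaphor KB."""
    if ISA_KB.get(target_word) == source_word:
        return False  # literal statement, e.g. "the sun is a star"
    target_cat = ISA_KB.get(target_word)
    return (source_word, target_cat) in METAPHOR_KB

result = recognize_is_a_metaphor("juliet", "sun")  # person <- sun: metaphoric
```

The actual system scores candidate mappings probabilistically from web-scale counts rather than using exact set membership, and also handles the harder type-2 and type-3 cases where one concept is implicit.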
Transactions of the Association for Computational Linguistics, 1 (2013) 391–402. Action Editor: Rada Mihalcea. Submitted 5/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Powergrading: a Clustering Approach to Amplify Human Effort for Short Answer Grading
Sumit Basu, Chuck Jacobs, Lucy Vanderwende; Microsoft Research, One Microsoft Way, Redmond, WA. sumitb@microsoft.com, cjacobs@microsoft.com
Transactions of the Association for Computational Linguistics, 1 (2013) 367–378. Action Editor: Kristina Toutanova. Submitted 7/2013; Revised 8/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Modeling Missing Data in Distant Supervision for Information Extraction
Alan Ritter, Machine Learning Department, Carnegie Mellon University, rittera@cs.cmu.edu; Luke Zettlemoyer, Mausam, Computer Sci. & Eng., University of Washington, {lsz,mausam}@cs.washington.edu; Oren Etzioni, Vulcan Inc., Seattle, WA, orene@vulcan.com

Abstract
Distant supervision algorithms learn information extraction models given only large readily available databases and text collections. Most previous work has used heuristics for generating labeled data, for example assuming that facts not contained in the database are not mentioned in the text, and facts in the database must be mentioned at least once. In this paper, we propose a new latent-variable approach that models missing data. This provides a natural way to incorporate side information, for instance modeling the intuition that text will often mention rare entities which are likely to be missing in the database. Despite the added complexity introduced by reasoning about missing data, we demonstrate that a carefully designed local search approach to inference is very accurate and scales to large datasets. Experiments demonstrate improved performance for binary and unary relation extraction when compared to learning with heuristic labels, including on average a 27% increase in area under the precision recall curve in the binary case.

1 Introduction
This paper addresses the issue of missing data (Little and Rubin, 1986) in the context of distant supervision. The goal of distant supervision is to learn to process unstructured data, for instance to extract binary or unary relations from text (Bunescu and Mooney, 2007; Snyder and Barzilay, 2007; Wu and Weld, 2007; Mintz et al., 2009; Collins and Singer, 1999), using a large database of propositions as a distant source of supervision.

Figure 1: A small hypothetical database and heuristically labeled training data for the EMPLOYER relation.
Person / EMPLOYER: Bibb Latané / UNC Chapel Hill; Tim Cook / Apple; Susan Wojcicki / Google
True Positive: "Bibb Latané, a professor at the University of North Carolina at Chapel Hill, published the theory in 1981."
False Positive: "Tim Cook praised Apple's record revenue…"
False Negative: "John P. McNamara, a professor at Washington State University's Department of Animal Sciences…"

In the case of binary relations, the intuition is that any sentence which mentions a pair of entities (e1 and e2) that participate in a relation, r, is likely to express the proposition r(e1, e2), so we can treat it as a positive training example of r. Figure 1 presents an example of this process. One question which has received little attention in previous work is how to handle the situation where information is missing, either from the text corpus, or the database. As an example, suppose the pair of entities (John P. McNamara, Washington State University) is absent from the EMPLOYER relation. In this case, the sentence in Figure 1 (and others which mention the entity pair) is effectively treated as a negative example of the relation. This is an issue
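The heuristic labeling scheme that the paper improves upon can be sketched directly: a sentence mentioning an entity pair found in the database becomes a positive example; a mentioned pair absent from the database becomes a negative one, which is exactly how the false-negative case in Figure 1 arises. A minimal sketch with the toy database from the figure (data and function name are illustrative, not the paper's code):

```python
# Toy EMPLOYER relation from Figure 1.
DATABASE = {("Tim Cook", "Apple"), ("Susan Wojcicki", "Google")}

def heuristic_label(sentence, e1, e2):
    """Distant-supervision heuristic: label a sentence by database membership
    of the entity pair it mentions; None if the pair is not mentioned."""
    if not (e1 in sentence and e2 in sentence):
        return None
    return "positive" if (e1, e2) in DATABASE else "negative"

# This sentence is the false positive from Figure 1: the pair is in the
# database, so the heuristic labels it positive regardless of what it says.
label = heuristic_label("Tim Cook praised Apple's record revenue.",
                        "Tim Cook", "Apple")
```

The paper's latent-variable model replaces these hard labels with variables whose assignments are inferred jointly, so that a pair missing from the database is not forced to be a negative example.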
Transactions of the Association for Computational Linguistics, 1 (2013) 353–366. Action Editor: Patrick Pantel. Submitted 5/2013; Revised 7/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Distributional Semantics Beyond Words: Supervised Learning of Analogy and Paraphrase
Peter D. Turney, National Research Council Canada, Information and Communications Technologies, Ottawa, Ontario, Canada, K1A 0R6, peter.turney@nrc-cnrc.gc.ca

Abstract
There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to extend beyond words is to compare two tuples using a function that combines pairwise similarities between the component words in the tuples. A strength of this approach is that it works with both relational similarity (analogy) and compositional similarity (paraphrase). However, past work required hand-coding the combination function for different tasks. The main contribution of this paper is that combination functions are generated by supervised learning. We achieve state-of-the-art results in measuring relational similarity between word pairs (SAT analogies and SemEval 2012 Task 2) and measuring compositional similarity between noun-modifier phrases and unigrams (multiple-choice paraphrase questions).

1 Introduction
Harris (1954) and Firth (1957) hypothesized that words that appear in similar contexts tend to have similar meanings. This hypothesis is the foundation for distributional semantics, in which words are represented by context vectors. The similarity of two words is calculated by comparing the two corresponding context vectors (Lund et al., 1995; Landauer and Dumais, 1997; Turney and Pantel, 2010). Distributional semantics is highly effective for measuring the semantic similarity between individual words. On a set of eighty multiple-choice synonym questions from the test of English as a foreign language (TOEFL), a distributional approach recently achieved 100% accuracy (Bullinaria and Levy, 2012). However, it has been difficult to extend distributional semantics beyond individual words, to word pairs, phrases, and sentences. Moving beyond individual words, there are various types of semantic similarity to consider. Here we focus on paraphrase and analogy. Paraphrase is similarity in the meaning of two pieces of text (Androutsopoulos and Malakasiotis, 2010). Analogy is similarity in the semantic relations of two sets of words (Turney, 2008a). It is common to study paraphrase at the sentence level (Androutsopoulos and Malakasiotis, 2010), but we prefer to concentrate on the simplest type of paraphrase, where a bigram paraphrases a unigram. For example, dog house is a paraphrase of kennel. In our experiments, we concentrate on noun-modifier bigrams and noun unigrams. Analogies map terms in one domain to terms in another domain (Gentner, 1983). The familiar analogy between the solar system and the Rutherford-Bohr atomic model involves several terms from the domain of the solar system and the domain of the atomic model (Turney, 2008a). The simplest type of analogy is proportional analogy, which involves two pairs of words (Turney, 2006b). For example, the pair ⟨cook, raw⟩ is analogous to the pair ⟨decorate, plain⟩. If we cook a thing, it is no longer raw; if we decorate a thing, it
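One very simple way to score a proportional analogy such as ⟨cook, raw⟩ : ⟨decorate, plain⟩ from context vectors is to compare the two pairs' vector offsets with cosine similarity. This is a hand-coded stand-in for the learned combination functions the paper contributes; the tiny vectors below are made up for illustration.

```python
import math

# Hypothetical 3-dimensional context vectors; real ones come from corpus counts.
VEC = {
    "cook":     [2.0, 0.0, 1.0],
    "raw":      [1.0, 0.0, 0.0],
    "decorate": [2.0, 0.1, 1.0],
    "plain":    [1.0, 0.1, 0.0],
    "dog":      [0.0, 3.0, 0.2],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def relation_similarity(pair1, pair2):
    """Compare the relations of two word pairs via their vector offsets."""
    (a, b), (c, d) = pair1, pair2
    off1 = [x - y for x, y in zip(VEC[a], VEC[b])]
    off2 = [x - y for x, y in zip(VEC[c], VEC[d])]
    return cosine(off1, off2)

good = relation_similarity(("cook", "raw"), ("decorate", "plain"))
bad = relation_similarity(("cook", "raw"), ("dog", "plain"))
```

With these toy vectors the analogous pair scores higher than the non-analogous one; the paper instead learns, from labeled tuples, how to combine the pairwise word similarities rather than fixing an offset-based formula by hand.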
Transactions of the Association for Computational Linguistics, 1 (2013) 341–352. Action Editor: Mirella Lapata. Submitted 12/2012; Revised 3/2013, 5/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain
Annie Louis, University of Pennsylvania, Philadelphia, PA 19104, lannie@seas.upenn.edu; Ani Nenkova, University of Pennsylvania, Philadelphia, PA 19104, nenkova@seas.upenn.edu

Abstract
Great writing is rare and highly admired. Readers seek out articles that are beautifully written, informative and entertaining. Yet information-access technologies lack capabilities for predicting article quality at this level. In this paper we present first experiments on article quality prediction in the science journalism domain. We introduce a corpus of great pieces of science journalism, along with typical articles from the genre. We implement features to capture aspects of great writing, including surprising, visual and emotional content, as well as general features related to discourse organization and sentence structure. We show that the distinction between great and typical articles can be detected fairly accurately, and that the entire spectrum of our features contributes to the distinction.

1 Introduction
Measures of article quality would be hugely beneficial for information retrieval and recommendation systems. In this paper, we describe a dataset of New York Times science journalism articles which we have categorized for quality differences and present a system that can automatically make the distinction. Science journalism conveys complex scientific ideas, entertaining and educating at the same time. Consider the following opening of a 2005 article by David Quammen from Harper's magazine:

One morning early last winter a small item appeared in my local newspaper announcing the birth of an extraordinary animal. A team of researchers at Texas A&M University had succeeded in cloning a whitetail deer. Never done before. The fawn, known as Dewey, was developing normally and seemed to be healthy. He had no mother, just a surrogate who had carried his fetus to term. He had no father, just a "donor" of all his chromosomes. He was the genetic duplicate of a certain trophy buck out of south Texas whose skin cells had been cultured in a laboratory. One of those cells furnished a nucleus that, transplanted and rejiggered, became the DNA core of an egg cell, which became an embryo, which in time became Dewey. So he was wildlife, in a sense, and in another sense elaborately synthetic. This is the sort of news, quirky but epochal, that can cause a person with a mouthful of toast to pause and marvel. What a dumb idea, I marveled.

The writing is clear and well-organized but the text also contains creative use of language and a clever story-like explanation of the scientific contribution. Such properties make science journalism an attractive genre for studying writing quality. Science journalism is also a highly relevant domain for information retrieval in the context of educational as well as entertaining applications. Article quality measures can hugely benefit such systems. Prior work indicates that three aspects of article quality can be successfully predicted: a) whether a text meets the acceptable standards for spelling (Brill and Moore, 2000), grammar (Tetreault and Chodorow, 2008; Rozovskaya and Roth, 2010) and discourse organization (Barzilay et al., 2002; Lapata, 2003); b) has a topic that is interesting to a particular user. For example, content-based recommendation systems standardly represent user interest using frequent words from articles in a user's history and retrieve other articles on the same topics (Paz-
Transactions of the Association for Computational Linguistics, 1 (2013) 327–340. Action Editor: Philipp Koehn. Submitted 1/2013; Revised 5/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Dynamically Shaping the Reordering Search Space of Phrase-Based Statistical Machine Translation
Arianna Bisazza and Marcello Federico, Fondazione Bruno Kessler, Trento, Italy, {bisazza,federico}@fbk.eu

Abstract
Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages. In fact, the optimal trade-off between accuracy and complexity of decoding is nowadays reached by harshly limiting the input permutation space. We propose a method to dynamically shape such space and, thus, capture long-range word movements without hurting translation quality nor decoding time. The space defined by loose reordering constraints is dynamically pruned through a binary classifier that predicts whether a given input word should be translated right after another. The integration of this model into a phrase-based decoder improves a strong Arabic-English baseline already including state-of-the-art early distortion cost (Moore and Quirk, 2007) and hierarchical phrase orientation models (Galley and Manning, 2008). Significant improvements in the reordering of verbs are achieved by a system that is notably faster than the baseline, while BLEU and METEOR remain stable, or even increase, at a very high distortion limit.

1 Introduction
Word order differences are among the most important factors determining the performance of statistical machine translation (SMT) on a given language pair (Birch et al., 2009). This is particularly true in the framework of phrase-based SMT (PSMT) (Zens et al., 2002; Koehn et al., 2003; Och and Ney, 2002), an approach that remains highly competitive despite the recent advances of the tree-based approaches. During the PSMT decoding process, the output sentence is built from left to right, while the input sentence positions can be covered in different orders. Thus, reordering in PSMT can be viewed as the problem of choosing the input permutation that leads to the highest-scoring output sentence. Due to efficiency reasons, however, the input permutation space cannot be fully explored, and is therefore limited with hard reordering constraints. Although many solutions have been proposed to explicitly model word reordering during decoding, PSMT still largely fails to handle long-range word movements in language pairs with different syntactic structures.1 We believe this is mostly not due to deficiencies of the existing reordering models, but rather to a very coarse definition of the reordering search space. Indeed, the existing reordering constraints are rather simple and typically based on word-to-word distances. Moreover, they are uniform throughout the input sentence and insensitive to the actual words being translated. Relaxing this kind of constraints means dramatically increasing the size of the search space and making the reordering model's task extremely complex. As a result, even in language pairs where long reordering is regularly observed, PSMT quality degrades when long word movements are allowed to the decoder. We address this problem by training a binary classifier to predict whether a given input position should be translated right after another, given the words at those positions and their contexts. When this model is integrated into the decoder, its predic-

1 For empirical evidence, see for instance (Birch et al., 2009; Galley and Manning, 2008; Bisazza and Federico, 2012).
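The pruning idea can be illustrated in a few lines: keep a jump from input position i to position j when it is within a fixed distortion limit, or when a classifier predicts that the word at j should be translated right after the word at i. The "classifier" here is a hypothetical rule standing in for the paper's trained model, and the `_V` suffix marking verbs is an invented convention for the sketch.

```python
def toy_classifier(words, i, j):
    """Stand-in for the learned binary classifier: here, allow a long jump
    only when it targets a verb (marked with a hypothetical _V suffix)."""
    return words[j].endswith("_V")

def allowed_jumps(words, i, distortion_limit=2):
    """Positions reachable from i: short jumps always, long jumps only if
    the classifier approves them."""
    return [j for j in range(len(words))
            if j != i and (abs(j - i) <= distortion_limit
                           or toy_classifier(words, i, j))]

words = ["w0", "w1", "w2", "w3", "w4", "w5_V"]
jumps = allowed_jumps(words, 0)  # short jumps to 1 and 2, plus the verb at 5
```

The point of the design is that the distortion limit can be set very loosely (capturing long verb movements in Arabic-English) while the classifier prunes away most of the resulting permutation space, keeping decoding fast.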
Transactions of the Association for Computational Linguistics, 1 (2013) 315–326. Action Editor: Mark Steedman. Submitted 2/2013; Revised 6/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning
Minh-Thang Luong, Department of Computer Science, Stanford University, Stanford, California, lmthang@stanford.edu; Michael C. Frank, Department of Psychology, Stanford University, Stanford, California, mcfrank@stanford.edu; Mark Johnson, Department of Computing, Macquarie University, Sydney, Australia, Mark.Johnson@MQ.edu.au

Abstract
Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years. In most work on this topic, however, utterances in a conversation are treated independently and discourse structure information is largely ignored. In the context of language acquisition, this independence assumption discards cues that are important to the learner, e.g., the fact that consecutive utterances are likely to share the same referent (Frank et al., 2013). The current paper describes an approach to the problem of simultaneously modeling grounded language at the sentence and discourse levels. We combine ideas from parsing and grammar induction to produce a parser that can handle long input strings with thousands of tokens, creating parse trees that represent full discourses. By casting grounded language learning as a grammatical inference task, we use our parser to extend the work of Johnson et al. (2012), investigating the importance of discourse continuity in children's language acquisition and its interaction with social cues. Our model boosts performance in a language acquisition task and yields good discourse segmentations compared with human annotators.

1 Introduction
Learning mappings between natural language (NL) and meaning representations (MR) is an important goal for both computational linguistics and cognitive science. Accurately learning novel mappings is crucial in grounded language understanding tasks and such systems can suggest insights into the nature of children's language learning. Two influential examples of grounded language learning tasks are the sportscasting task, RoboCup, where the NL is the set of running commentary and the MR is the set of logical forms representing actions like kicking or passing (Chen and Mooney, 2008), and the cross-situational word-learning task, where the NL is the caregiver's utterances and the MR is the set of objects present in the context (Siskind, 1996; Yu and Ballard, 2007). Work in these domains suggests that, based on the co-occurrence between words and their referents in context, it is possible to learn mappings between NL and MR even under substantial ambiguity. Nevertheless, contexts like RoboCup, where every single utterance is grounded, are extremely rare. Much more common are cases where a single topic is introduced and then discussed at length throughout a discourse. In a television news show, for example, a topic might be introduced by presenting a relevant picture or video clip. Once the topic is introduced, the anchors can discuss it by name or even using a pronoun without showing a picture. The discourse is grounded without having to ground every utterance. Moreover, although previous work has largely treated utterance order as independent, the order of utterances is critical in grounded discourse contexts: if the order is scrambled, it can become impossible to recover the topic. Supporting this idea, Frank et al. (2013) found that topic continuity, the tendency to talk about the same topic in multiple utterances that are contiguous in time, is both prevalent and informative for word learning. This paper examines the importance of topic continuity through a grammatical inference problem. We build on Johnson et al. (2012)'s work that used grammatical inference to
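The topic-continuity observation that motivates the paper can be quantified very simply: measure how often adjacent utterances share the same intended referent. The sketch below is an illustrative statistic, not the paper's grammatical-inference model, and the per-utterance referent annotations are hypothetical.

```python
def topic_continuity(referents):
    """Fraction of adjacent utterance pairs whose referents coincide.
    `referents` lists one (hypothetical) intended referent per utterance."""
    pairs = list(zip(referents, referents[1:]))
    same = sum(1 for a, b in pairs if a == b)
    return same / len(pairs)

# Toy discourse: the caregiver talks about a dog, then switches to a ball.
refs = ["dog", "dog", "ball", "ball", "ball"]
score = topic_continuity(refs)  # 3 of the 4 adjacent pairs match
```

A high value of this statistic is what makes utterance order informative: scrambling the utterances destroys the runs of shared referents that the discourse-level parser exploits.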