Analysis Methods in Neural Language Processing: A Survey
Yonatan Belinkov1,2 and James Glass1. 1MIT Computer Science and Artificial Intelligence Laboratory; 2Harvard School of Engineering and Applied Sciences; Cambridge, MA, USA. {belinkov, glass}@mit.edu
Abstract
The field of natural language processing has seen impressive progress in recent years, with neural network models replacing many of the traditional systems. A plethora of new models have been proposed,
Joint Transition-Based Models for Morpho-Syntactic Parsing: Parsing Strategies for MRLs and a Case Study from Modern Hebrew
Amir More, Open University, Ra'anana, Israel, habeanf@gmail.com; Victoria Basmova, Open University, Ra'anana, Israel, vicbas@openu.ac.il; Amit Seker, Open University, Ra'anana, Israel, amitse@openu.ac.il; Reut Tsarfaty, Open University, Ra'anana, Israel, reutts@openu.ac.il
Abstract
In standard NLP pipelines, morphological analysis and disambiguation (MA&D) precedes syntactic and semantic downstream tasks. However, for languages
Semantic Neural Machine Translation Using AMR
Linfeng Song,1 Daniel Gildea,1 Yue Zhang,2 Zhiguo Wang,3 and Jinsong Su4. 1Department of Computer Science, University of Rochester, Rochester, NY 14627; 2School of Engineering, Westlake University, China; 3IBM T.J. Watson Research Center, Yorktown Heights, NY 10598; 4Xiamen University, Xiamen, China. 1{lsong10,gildea}@cs.rochester.edu, 2yue.zhang@wias.org.cn, 3zgw.tomorrow@gmail.com, 4jssu@xmu.edu.cn
Abstract
It is intuitive that semantic representations can be useful for machine translation, mainly be-
Grammar Error Correction in Morphologically Rich Languages: The Case of Russian
Alla Rozovskaya, Queens College, City University of New York, arozovskaya@qc.cuny.edu; Dan Roth, University of Pennsylvania, danroth@seas.upenn.edu
Abstract
Until now, most of the research in grammar error correction focused on English, and the problem has hardly been explored for other languages. We address the task of correcting writing mistakes in morphologically rich languages, with
Learning Typed Entailment Graphs with Global Soft Constraints
Mohammad Javad Hosseini*§, Nathanael Chambers**, Siva Reddy†, Xavier R. Holt‡, Shay B. Cohen*, Mark Johnson‡ and Mark Steedman*. *University of Edinburgh; §The Alan Turing Institute, UK; **United States Naval Academy; †Stanford University; ‡Macquarie University. javad.hosseini@ed.ac.uk, nchamber@usna.edu, sivar@stanford.edu, {xavier.ricketts-holt,mark.johnson}@mq.edu.au, {scohen,steedman}@inf.ed.ac.uk
Abstract
This paper presents a new method for learning typed entailment graphs from text. We extract predicate-argument
Surface Statistics of an Unknown Language Indicate How to Parse It
Dingquan Wang and Jason Eisner, Department of Computer Science, Johns Hopkins University, {wdd,jason}@cs.jhu.edu
Abstract
We introduce a novel framework for delexicalized dependency parsing in a new language. We show that useful features of the target language can be extracted automatically from an unparsed corpus, which consists only of gold part-of-speech
Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
Wenpeng Yin, Department of Computer and Information Science, University of Pennsylvania, wenpeng@seas.upenn.edu; Hinrich Schütze, Center for Information and Language Processing, LMU Munich, Germany, inquiries@cislmu.org
Abstract
In NLP, convolutional neural networks (CNNs) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that this is because the attention in CNNs has been mainly
Errata: "Improving Topic Models with Latent Feature Word Representations"
Dat Quoc Nguyen, Richard Billingsley, Lan Du and Mark Johnson
Abstract
FROM (a part of Table 10 in the original published article): F1 scores for TMN and TMNtitle datasets. Change in clustering and classification results due to the DMM and LF-DMM bugs. Data TMN. 4.3 Document clustering evaluation. FROM (in the original published article): For
Transactions of the Association for Computational Linguistics, 1 (2013) 429–440. Action Editor: Philipp Koehn. Submitted 3/2013; Revised 8/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Measuring Machine Translation Errors in New Domains
Ann Irvine, Johns Hopkins University, anni@jhu.edu; John Morgan, University of Maryland, jjm@cs.umd.edu; Marine Carpuat, National Research Council Canada, marine.carpuat@nrc.gc.ca; Hal Daumé III, University of Maryland, me@hal3.name; Dragos Munteanu, SDL Research, dmunteanu@sdl.com

Abstract
We develop two techniques for analyzing the effect of porting a machine translation system to a new domain. One is a macro-level analysis that measures how domain shift affects corpus-level evaluation; the second is a micro-level analysis for word-level errors. We apply these methods to understand what happens when a Parliament-trained phrase-based machine translation system is applied in four very different domains: news, medical texts, scientific articles and movie subtitles. We present quantitative and qualitative experiments that highlight opportunities for future research in domain adaptation for machine translation.

1 Introduction
When building a statistical machine translation (SMT) system, the expected use case is often limited to a specific domain, genre and register (henceforth "domain" refers to this set, in keeping with standard, imprecise, terminology), such as a particular type of legal or medical document. Unfortunately, it is expensive to obtain enough parallel data to reliably estimate translation models in a new domain. Instead, one can hope that large amounts of data from another, "old domain," might be close enough to stand as a proxy. This is the de facto standard: we train SMT systems on Parliament proceedings, but then use them to translate all sorts of new text. Unfortunately, this results in significantly degraded translation quality. In this paper, we present two complementary methods for quantifiably measuring the source of translation errors (§5.1 and §5.2) in a novel taxonomy (§4). We show quantitative (§7.1) and qualitative (§7.2) results obtained from our methods on

Table 1: Example inputs, references and system outputs. There are three types of errors: unseen words (blue), incorrect sense selection (red) and unknown sense (green).
Old Domain (Hansard)
Inp: monsieur le président, les pêcheurs de homard de la région de l'atlantique sont dans une situation catastrophique.
Ref: mr. speaker, lobster fishers in atlantic canada are facing a disaster.
Out: mr. speaker, the lobster fishers in atlantic canada are in a mess.
New Domain (Medical)
Inp: mode et voie(s) d'administration
Ref: method and route(s) of administration
Out: fashion and voie(s) of directors

four very different new domains: newswire, medical texts, scientific abstracts, and movie subtitles. Our basic approach is to think of translation errors in the context of a novel taxonomy of error categories, "S4." Our taxonomy contains categories for the errors shown in Table 1, in which an SMT system trained on the Hansard parliamentary proceedings is applied to a new domain (in this case, medical texts). Our categorization focuses on the following: new French words, new French senses, and incorrectly chosen translations. The first methodology we develop for studying such errors is a micro-level study of the frequency and distribution of these error types in real translation output at the level of individual words (§5.1), without respect to how these errors affect overall translation quality. The second is a macro-level study of how these errors affect translation performance (measured by BLEU; §5.2). One important feature of our methodologies is that we focus on errors that could possibly be fixed given access to data from a new domain, rather than all errors that might arise because the particular translation model used is inadequate to capture the required
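The unseen-word category of the taxonomy can be approximated with a simple vocabulary check against the old-domain training data. Below is a minimal sketch (not the paper's exact procedure) using the toy medical example from Table 1; the function name and data are hypothetical.

```python
def unseen_word_rate(new_domain_sentences, old_domain_vocab):
    """Fraction of new-domain source tokens absent from the old-domain vocabulary."""
    tokens = [tok for sent in new_domain_sentences for tok in sent.split()]
    unseen = [tok for tok in tokens if tok not in old_domain_vocab]
    return len(unseen) / len(tokens)

# Toy old-domain vocabulary and a toy new-domain input (4 tokens, 1 unseen).
old_vocab = {"mode", "et", "d'administration"}
new_sents = ["mode et voie(s) d'administration"]
rate = unseen_word_rate(new_sents, old_vocab)  # "voie(s)" is unseen -> 0.25
```

In a real analysis the vocabulary would come from the Hansard training corpus and sense-level errors would require alignment information, not just token lookup.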
Transactions of the Association for Computational Linguistics, 1 (2013) 415–428. Action Editor: Brian Roark. Submitted 7/2013; Revised 9/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Joint Morphological and Syntactic Analysis for Richly Inflected Languages
Bernd Bohnet∗, Joakim Nivre⋆, Igor Boguslavsky•◦, Richárd Farkas§, Filip Ginter†, Jan Hajič‡. ∗University of Birmingham, School of Computer Science; ⋆Uppsala University, Department of Linguistics and Philology; •Universidad Politécnica de Madrid, Departamento de Inteligencia Artificial; ◦Russian Academy of Sciences, Institute for Information Transmission Problems; §University of Szeged, Institute of Informatics; †University of Turku, Department of Information Technology; ‡Charles University in Prague, Institute of Formal and Applied Linguistics

Abstract
Joint morphological and syntactic analysis has been proposed as a way of improving parsing accuracy for richly inflected languages. Starting from a transition-based model for joint part-of-speech tagging and dependency parsing, we explore different ways of integrating morphological features into the model. We also investigate the use of rule-based morphological analyzers to provide hard or soft lexical constraints and the use of word clusters to tackle the sparsity of lexical features. Evaluation on five morphologically rich languages (Czech, Finnish, German, Hungarian, and Russian) shows consistent improvements in both morphological and syntactic accuracy for joint prediction over a pipeline model, with further improvements thanks to lexical constraints and word clusters. The final results improve the state of the art in dependency parsing for all languages.

1 Introduction
Syntactic parsing of natural language has witnessed a tremendous development during the last twenty years, especially through the use of statistical models for robust and accurate broad-coverage parsing. However, as statistical parsing techniques have been applied to more and more languages, it has also been observed that typological differences between languages lead to new challenges. In particular, it has been found over and over again that languages exhibiting rich morphological structure, often together with a relatively free word order, usually obtain lower parsing accuracy, especially in comparison to English. One striking demonstration of this tendency can be found in the CoNLL shared tasks on multilingual dependency parsing, organized in 2006 and 2007, where richly inflected languages clustered at the lower end of the scale with respect to parsing accuracy (Buchholz and Marsi, 2006; Nivre et al., 2007). These and similar observations have led to an increased interest in the special challenges posed by parsing morphologically rich languages, as evidenced most clearly by a new series of workshops devoted to this topic (Tsarfaty et al., 2010), as well as a special issue in Computational Linguistics (Tsarfaty et al., 2013) and a shared task on parsing morphologically rich languages.1 One hypothesized explanation for the lower parsing accuracy observed for richly inflected languages is the strict separation of morphological and syntactic analysis assumed in many parsing frameworks (Tsarfaty et al., 2010; Tsarfaty et al., 2013). This is true in particular for data-driven dependency parsers, which tend to assume that all morphological disambiguation has been performed before syntactic analysis begins. However, as argued by Lee et al. (2011), in morphologically rich languages there is often considerable interaction between morphology and syntax, such that neither can be disambiguated without the other. Lee et al. (2011) go on to show that a discriminative model for joint morphological disambiguation and dependency parsing gives consistent improvements in morphological and syntactic accuracy, compared to a pipeline model, for Ancient Greek, Czech, Hungarian and Latin. Similarly, Bohnet and Nivre (2012) propose a model for

1 See https://sites.google.com/site/spmrl2013/home/sharedtask.
Transactions of the Association for Computational Linguistics, 1 (2013) 403–414. Action Editor: Jason Eisner. Submitted 6/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Training Deterministic Parsers with Non-Deterministic Oracles
Yoav Goldberg, Bar-Ilan University, Department of Computer Science, Ramat-Gan, Israel, yoav.goldberg@gmail.com; Joakim Nivre, Uppsala University, Department of Linguistics and Philology, Uppsala, Sweden, joakim.nivre@lingfil.uu.se

Abstract
Greedy transition-based parsers are very fast but tend to suffer from error propagation. This problem is aggravated by the fact that they are normally trained using oracles that are deterministic and incomplete in the sense that they assume a unique canonical path through the transition system and are only valid as long as the parser does not stray from this path. In this paper, we give a general characterization of oracles that are nondeterministic and complete, present a method for deriving such oracles for transition systems that satisfy a property we call arc decomposition, and instantiate this method for three well-known transition systems from the literature. We say that these oracles are dynamic, because they allow us to dynamically explore alternative and non-optimal paths during training, in contrast to oracles that statically assume a unique optimal path. Experimental evaluation on a wide range of datasets clearly shows that using dynamic oracles to train greedy parsers gives substantial improvements in accuracy. Moreover, this improvement comes at no cost in terms of efficiency, unlike other techniques like beam search.

1 Introduction
Greedy transition-based parsers are easy to implement and are very efficient, but they are generally not as accurate as parsers that are based on global search (McDonald et al., 2005; Koo and Collins, 2010) or as transition-based parsers that use beam search (Zhang and Clark, 2008) or dynamic programming (Huang and Sagae, 2010; Kuhlmann et al., 2011). This work is part of a line of research trying to push the boundaries of greedy parsing and narrow the accuracy gap of 2–3% between search-based and greedy parsers, while maintaining the efficiency and incremental nature of greedy parsers. One reason for the lower accuracy of greedy parsers is error propagation: once the parser makes an error in decoding, more errors are likely to follow. This behavior is closely related to the way in which greedy parsers are normally trained. Given a treebank oracle, a gold sequence of transitions is derived, and a predictor is trained to predict transitions along this gold sequence, without considering any parser state outside this sequence. Thus, once the parser strays from the golden path at test time, it ventures into unknown territory and is forced to react to situations it has never been trained for. In recent work (Goldberg and Nivre, 2012), we introduced the concept of a dynamic oracle, which is non-deterministic and not restricted to a single golden path, but instead provides optimal predictions for any possible state the parser might be in. Dynamic oracles are non-deterministic in the sense that they return a set of valid transitions for a given parser state and gold tree. Moreover, they are well-defined and optimal also for states from which the gold tree cannot be derived, in the sense that they return the set of transitions leading to the best tree derivable from each state. We showed experimentally that, using a dynamic oracle for the arc-eager transition system (Nivre, 2003), a greedy parser can be trained to perform well also after incurring a mistake, thus alleviating the effect of error propagation and resulting in consistently better parsing accuracy.
Transactions of the Association for Computational Linguistics, 1 (2013) 379–390. Action Editor: Lillian Lee. Submitted 6/2013; Revised 9/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Data-Driven Metaphor Recognition and Explanation
Hongsong Li, Microsoft Research Asia, hongsli@microsoft.com; Kenny Q. Zhu, Shanghai Jiao Tong University, kzhu@cs.sjtu.edu.cn; Haixun Wang, Google Research, haixun@google.com

Abstract
Recognizing metaphors and identifying the source-target mappings is an important task as metaphorical text poses a big challenge for machine reading. To address this problem, we automatically acquire a metaphor knowledge base and an isA knowledge base from billions of web pages. Using the knowledge bases, we develop an inference mechanism to recognize and explain the metaphors in the text. To our knowledge, this is the first purely data-driven approach of probabilistic metaphor acquisition, recognition, and explanation. Our results show that it significantly outperforms other state-of-the-art methods in recognizing and explaining metaphors.

1 Introduction
A metaphor is a way of communicating. It enables us to comprehend one thing in terms of another. For example, the metaphor, Juliet is the sun, allows us to see Juliet much more vividly than if Shakespeare had taken a more literal approach. We utter about one metaphor for every ten to twenty-five words, or about six metaphors a minute (Geary, 2011). Specifically, a metaphor is a mapping of concepts from a source domain to a target domain (Lakoff and Johnson, 1980). The source domain is often concrete and based on sensory experience, while the target domain is usually abstract. Two concepts are connected by this mapping because they share some common or similar properties, and as a result, the meaning of one concept can be transferred to another. For example, in "Juliet is the sun," the sun is the source concept while Juliet is the target concept. One interpretation of this metaphor is that both concepts share the property that their existence brings about warmth, life, and excitement. In a metaphorical sentence, at least one of the two concepts must be explicitly present. This leads to three types of metaphors:
1. Juliet is the sun. Here, both the source (sun) and the target (Juliet) are explicit.
2. Please wash your claws before scratching me. Here, the source (claws) is explicit, while the target (hands) is implicit, and the context of wash is in terms of the target.
3. Your words cut deep. Here, the target (words) is explicit, while the source (possibly, knife) is implicit, and the context of cut is in terms of the source.
In this paper, we focus on the recognition and explanation of metaphors. For a given sentence, we first check whether it contains a metaphoric expression (which we call metaphor recognition), and if it does, we identify the source and the target concepts of the metaphor (which we call metaphor explanation). Metaphor explanation is important for understanding metaphors. Explaining type 2 and 3 metaphors is particularly challenging, and, to the best of our knowledge, has not been attempted for nominal concepts1 before. In our examples, knowing that life and hands are the target concepts avoids the confusion that may arise if source concepts sun and claws are used literally in understanding the sentences. This, however, does not mean that the source

1 Nominal concepts are those represented by noun phrases.
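For the type-1 pattern "X is the Y," the knowledge-base idea can be sketched in a few lines: flag the sentence as metaphoric when the pair does not match a literal isA relation but does match a known source-to-target-category mapping. This is a toy stand-in for the paper's probabilistic inference; both knowledge bases and the function name below are hypothetical.

```python
# Hypothetical metaphor KB: (source concept, target category) pairs.
METAPHOR_KB = {("sun", "person"), ("knife", "speech")}
# Hypothetical isA KB: concept -> literal category.
ISA_KB = {"juliet": "person", "sun": "star", "words": "speech"}

def recognize_is_a_metaphor(target_word, source_word):
    """Toy check for 'X is the Y': metaphoric if not literal isA but the
    (source, category-of-target) pair appears in the metaphor KB."""
    if ISA_KB.get(target_word) == source_word:
        return False  # literal statement, e.g. "the sun is a star"
    target_cat = ISA_KB.get(target_word)
    return (source_word, target_cat) in METAPHOR_KB

result = recognize_is_a_metaphor("juliet", "sun")  # person <- sun: metaphoric
```

The actual system scores candidate mappings probabilistically from web-scale counts rather than using exact set membership, and also handles the harder type-2 and type-3 cases where one concept is implicit.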
Transactions of the Association for Computational Linguistics, 1 (2013) 391–402. Action Editor: Rada Mihalcea. Submitted 5/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Powergrading: a Clustering Approach to Amplify Human Effort for Short Answer Grading
Sumit Basu, Chuck Jacobs, Lucy Vanderwende; Microsoft Research, One Microsoft Way, Redmond, WA. sumitb@microsoft.com, cjacobs@microsoft.com
Transactions of the Association for Computational Linguistics, 1 (2013) 367–378. Action Editor: Kristina Toutanova. Submitted 7/2013; Revised 8/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Modeling Missing Data in Distant Supervision for Information Extraction
Alan Ritter, Machine Learning Department, Carnegie Mellon University, rittera@cs.cmu.edu; Luke Zettlemoyer, Mausam, Computer Sci. & Eng., University of Washington, {lsz,mausam}@cs.washington.edu; Oren Etzioni, Vulcan Inc., Seattle, WA, orene@vulcan.com

Abstract
Distant supervision algorithms learn information extraction models given only large readily available databases and text collections. Most previous work has used heuristics for generating labeled data, for example assuming that facts not contained in the database are not mentioned in the text, and facts in the database must be mentioned at least once. In this paper, we propose a new latent-variable approach that models missing data. This provides a natural way to incorporate side information, for instance modeling the intuition that text will often mention rare entities which are likely to be missing in the database. Despite the added complexity introduced by reasoning about missing data, we demonstrate that a carefully designed local search approach to inference is very accurate and scales to large datasets. Experiments demonstrate improved performance for binary and unary relation extraction when compared to learning with heuristic labels, including on average a 27% increase in area under the precision recall curve in the binary case.

1 Introduction
This paper addresses the issue of missing data (Little and Rubin, 1986) in the context of distant supervision. The goal of distant supervision is to learn to process unstructured data, for instance to extract binary or unary relations from text (Bunescu and Mooney, 2007; Snyder and Barzilay, 2007; Wu and Weld, 2007; Mintz et al., 2009; Collins and Singer, 1999), using a large database of propositions as a distant source of supervision.

Figure 1: A small hypothetical database and heuristically labeled training data for the EMPLOYER relation.
Person / EMPLOYER: Bibb Latané / UNC Chapel Hill; Tim Cook / Apple; Susan Wojcicki / Google
True Positive: "Bibb Latané, a professor at the University of North Carolina at Chapel Hill, published the theory in 1981."
False Positive: "Tim Cook praised Apple's record revenue…"
False Negative: "John P. McNamara, a professor at Washington State University's Department of Animal Sciences…"

In the case of binary relations, the intuition is that any sentence which mentions a pair of entities (e1 and e2) that participate in a relation, r, is likely to express the proposition r(e1, e2), so we can treat it as a positive training example of r. Figure 1 presents an example of this process. One question which has received little attention in previous work is how to handle the situation where information is missing, either from the text corpus, or the database. As an example, suppose the pair of entities (John P. McNamara, Washington State University) is absent from the EMPLOYER relation. In this case, the sentence in Figure 1 (and others which mention the entity pair) is effectively treated as a negative example of the relation. This is an issue
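The heuristic labeling scheme that the paper improves upon can be sketched directly: a sentence mentioning an entity pair found in the database becomes a positive example; a mentioned pair absent from the database becomes a negative one, which is exactly how the false-negative case in Figure 1 arises. A minimal sketch with the toy database from the figure (data and function name are illustrative, not the paper's code):

```python
# Toy EMPLOYER relation from Figure 1.
DATABASE = {("Tim Cook", "Apple"), ("Susan Wojcicki", "Google")}

def heuristic_label(sentence, e1, e2):
    """Distant-supervision heuristic: label a sentence by database membership
    of the entity pair it mentions; None if the pair is not mentioned."""
    if not (e1 in sentence and e2 in sentence):
        return None
    return "positive" if (e1, e2) in DATABASE else "negative"

# This sentence is the false positive from Figure 1: the pair is in the
# database, so the heuristic labels it positive regardless of what it says.
label = heuristic_label("Tim Cook praised Apple's record revenue.",
                        "Tim Cook", "Apple")
```

The paper's latent-variable model replaces these hard labels with variables whose assignments are inferred jointly, so that a pair missing from the database is not forced to be a negative example.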
Transactions of the Association for Computational Linguistics, 1 (2013) 353–366. Action Editor: Patrick Pantel. Submitted 5/2013; Revised 7/2013; Published 10/2013. © 2013 Association for Computational Linguistics.

Distributional Semantics Beyond Words: Supervised Learning of Analogy and Paraphrase
Peter D. Turney, National Research Council Canada, Information and Communications Technologies, Ottawa, Ontario, Canada, K1A 0R6, peter.turney@nrc-cnrc.gc.ca

Abstract
There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to extend beyond words is to compare two tuples using a function that combines pairwise similarities between the component words in the tuples. A strength of this approach is that it works with both relational similarity (analogy) and compositional similarity (paraphrase). However, past work required hand-coding the combination function for different tasks. The main contribution of this paper is that combination functions are generated by supervised learning. We achieve state-of-the-art results in measuring relational similarity between word pairs (SAT analogies and SemEval 2012 Task 2) and measuring compositional similarity between noun-modifier phrases and unigrams (multiple-choice paraphrase questions).

1 Introduction
Harris (1954) and Firth (1957) hypothesized that words that appear in similar contexts tend to have similar meanings. This hypothesis is the foundation for distributional semantics, in which words are represented by context vectors. The similarity of two words is calculated by comparing the two corresponding context vectors (Lund et al., 1995; Landauer and Dumais, 1997; Turney and Pantel, 2010). Distributional semantics is highly effective for measuring the semantic similarity between individual words. On a set of eighty multiple-choice synonym questions from the test of English as a foreign language (TOEFL), a distributional approach recently achieved 100% accuracy (Bullinaria and Levy, 2012). However, it has been difficult to extend distributional semantics beyond individual words, to word pairs, phrases, and sentences. Moving beyond individual words, there are various types of semantic similarity to consider. Here we focus on paraphrase and analogy. Paraphrase is similarity in the meaning of two pieces of text (Androutsopoulos and Malakasiotis, 2010). Analogy is similarity in the semantic relations of two sets of words (Turney, 2008a). It is common to study paraphrase at the sentence level (Androutsopoulos and Malakasiotis, 2010), but we prefer to concentrate on the simplest type of paraphrase, where a bigram paraphrases a unigram. For example, dog house is a paraphrase of kennel. In our experiments, we concentrate on noun-modifier bigrams and noun unigrams. Analogies map terms in one domain to terms in another domain (Gentner, 1983). The familiar analogy between the solar system and the Rutherford-Bohr atomic model involves several terms from the domain of the solar system and the domain of the atomic model (Turney, 2008a). The simplest type of analogy is proportional analogy, which involves two pairs of words (Turney, 2006b). For example, the pair ⟨cook, raw⟩ is analogous to the pair ⟨decorate, plain⟩. If we cook a thing, it is no longer raw; if we decorate a thing, it
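One very simple way to score a proportional analogy such as ⟨cook, raw⟩ : ⟨decorate, plain⟩ from context vectors is to compare the two pairs' vector offsets with cosine similarity. This is a hand-coded stand-in for the learned combination functions the paper contributes; the tiny vectors below are made up for illustration.

```python
import math

# Hypothetical 3-dimensional context vectors; real ones come from corpus counts.
VEC = {
    "cook":     [2.0, 0.0, 1.0],
    "raw":      [1.0, 0.0, 0.0],
    "decorate": [2.0, 0.1, 1.0],
    "plain":    [1.0, 0.1, 0.0],
    "dog":      [0.0, 3.0, 0.2],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def relation_similarity(pair1, pair2):
    """Compare the relations of two word pairs via their vector offsets."""
    (a, b), (c, d) = pair1, pair2
    off1 = [x - y for x, y in zip(VEC[a], VEC[b])]
    off2 = [x - y for x, y in zip(VEC[c], VEC[d])]
    return cosine(off1, off2)

good = relation_similarity(("cook", "raw"), ("decorate", "plain"))
bad = relation_similarity(("cook", "raw"), ("dog", "plain"))
```

With these toy vectors the analogous pair scores higher than the non-analogous one; the paper instead learns, from labeled tuples, how to combine the pairwise word similarities rather than fixing an offset-based formula by hand.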
Transactions of the Association for Computational Linguistics, 1 (2013) 341–352. Action Editor: Mirella Lapata. Submitted 12/2012; Revised 3/2013, 5/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

What Makes Writing Great? First Experiments on Article Quality Prediction in the Science Journalism Domain
Annie Louis, University of Pennsylvania, Philadelphia, PA 19104, lannie@seas.upenn.edu; Ani Nenkova, University of Pennsylvania, Philadelphia, PA 19104, nenkova@seas.upenn.edu

Abstract
Great writing is rare and highly admired. Readers seek out articles that are beautifully written, informative and entertaining. Yet information-access technologies lack capabilities for predicting article quality at this level. In this paper we present first experiments on article quality prediction in the science journalism domain. We introduce a corpus of great pieces of science journalism, along with typical articles from the genre. We implement features to capture aspects of great writing, including surprising, visual and emotional content, as well as general features related to discourse organization and sentence structure. We show that the distinction between great and typical articles can be detected fairly accurately, and that the entire spectrum of our features contributes to the distinction.

1 Introduction
Measures of article quality would be hugely beneficial for information retrieval and recommendation systems. In this paper, we describe a dataset of New York Times science journalism articles which we have categorized for quality differences and present a system that can automatically make the distinction. Science journalism conveys complex scientific ideas, entertaining and educating at the same time. Consider the following opening of a 2005 article by David Quammen from Harper's magazine:

One morning early last winter a small item appeared in my local newspaper announcing the birth of an extraordinary animal. A team of researchers at Texas A&M University had succeeded in cloning a whitetail deer. Never done before. The fawn, known as Dewey, was developing normally and seemed to be healthy. He had no mother, just a surrogate who had carried his fetus to term. He had no father, just a "donor" of all his chromosomes. He was the genetic duplicate of a certain trophy buck out of south Texas whose skin cells had been cultured in a laboratory. One of those cells furnished a nucleus that, transplanted and rejiggered, became the DNA core of an egg cell, which became an embryo, which in time became Dewey. So he was wildlife, in a sense, and in another sense elaborately synthetic. This is the sort of news, quirky but epochal, that can cause a person with a mouthful of toast to pause and marvel. What a dumb idea, I marveled.

The writing is clear and well-organized but the text also contains creative use of language and a clever story-like explanation of the scientific contribution. Such properties make science journalism an attractive genre for studying writing quality. Science journalism is also a highly relevant domain for information retrieval in the context of educational as well as entertaining applications. Article quality measures can hugely benefit such systems. Prior work indicates that three aspects of article quality can be successfully predicted: a) whether a text meets the acceptable standards for spelling (Brill and Moore, 2000), grammar (Tetreault and Chodorow, 2008; Rozovskaya and Roth, 2010) and discourse organization (Barzilay et al., 2002; Lapata, 2003); b) has a topic that is interesting to a particular user. For example, content-based recommendation systems standardly represent user interest using frequent words from articles in a user's history and retrieve other articles on the same topics (Paz-
Transactions of the Association for Computational Linguistics, 1 (2013) 327–340. Action Editor: Philipp Koehn. Submitted 1/2013; Revised 5/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Dynamically Shaping the Reordering Search Space of Phrase-Based Statistical Machine Translation
Arianna Bisazza and Marcello Federico, Fondazione Bruno Kessler, Trento, Italy, {bisazza,federico}@fbk.eu

Abstract
Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages. In fact, the optimal trade-off between accuracy and complexity of decoding is nowadays reached by harshly limiting the input permutation space. We propose a method to dynamically shape such space and, thus, capture long-range word movements without hurting translation quality nor decoding time. The space defined by loose reordering constraints is dynamically pruned through a binary classifier that predicts whether a given input word should be translated right after another. The integration of this model into a phrase-based decoder improves a strong Arabic-English baseline already including state-of-the-art early distortion cost (Moore and Quirk, 2007) and hierarchical phrase orientation models (Galley and Manning, 2008). Significant improvements in the reordering of verbs are achieved by a system that is notably faster than the baseline, while BLEU and METEOR remain stable, or even increase, at a very high distortion limit.

1 Introduction
Word order differences are among the most important factors determining the performance of statistical machine translation (SMT) on a given language pair (Birch et al., 2009). This is particularly true in the framework of phrase-based SMT (PSMT) (Zens et al., 2002; Koehn et al., 2003; Och and Ney, 2002), an approach that remains highly competitive despite the recent advances of the tree-based approaches. During the PSMT decoding process, the output sentence is built from left to right, while the input sentence positions can be covered in different orders. Thus, reordering in PSMT can be viewed as the problem of choosing the input permutation that leads to the highest-scoring output sentence. Due to efficiency reasons, however, the input permutation space cannot be fully explored, and is therefore limited with hard reordering constraints. Although many solutions have been proposed to explicitly model word reordering during decoding, PSMT still largely fails to handle long-range word movements in language pairs with different syntactic structures.1 We believe this is mostly not due to deficiencies of the existing reordering models, but rather to a very coarse definition of the reordering search space. Indeed, the existing reordering constraints are rather simple and typically based on word-to-word distances. Moreover, they are uniform throughout the input sentence and insensitive to the actual words being translated. Relaxing this kind of constraints means dramatically increasing the size of the search space and making the reordering model's task extremely complex. As a result, even in language pairs where long reordering is regularly observed, PSMT quality degrades when long word movements are allowed to the decoder. We address this problem by training a binary classifier to predict whether a given input position should be translated right after another, given the words at those positions and their contexts. When this model is integrated into the decoder, its predic-

1 For empirical evidence, see for instance (Birch et al., 2009; Galley and Manning, 2008; Bisazza and Federico, 2012).
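The pruning idea can be illustrated in a few lines: keep a jump from input position i to position j when it is within a fixed distortion limit, or when a classifier predicts that the word at j should be translated right after the word at i. The "classifier" here is a hypothetical rule standing in for the paper's trained model, and the `_V` suffix marking verbs is an invented convention for the sketch.

```python
def toy_classifier(words, i, j):
    """Stand-in for the learned binary classifier: here, allow a long jump
    only when it targets a verb (marked with a hypothetical _V suffix)."""
    return words[j].endswith("_V")

def allowed_jumps(words, i, distortion_limit=2):
    """Positions reachable from i: short jumps always, long jumps only if
    the classifier approves them."""
    return [j for j in range(len(words))
            if j != i and (abs(j - i) <= distortion_limit
                           or toy_classifier(words, i, j))]

words = ["w0", "w1", "w2", "w3", "w4", "w5_V"]
jumps = allowed_jumps(words, 0)  # short jumps to 1 and 2, plus the verb at 5
```

The point of the design is that the distortion limit can be set very loosely (capturing long verb movements in Arabic-English) while the classifier prunes away most of the resulting permutation space, keeping decoding fast.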
Transactions of the Association for Computational Linguistics, 1 (2013) 315–326. Action Editor: Mark Steedman. Submitted 2/2013; Revised 6/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Parsing entire discourses as very long strings: Capturing topic continuity in grounded language learning
Minh-Thang Luong, Department of Computer Science, Stanford University, Stanford, California, lmthang@stanford.edu; Michael C. Frank, Department of Psychology, Stanford University, Stanford, California, mcfrank@stanford.edu; Mark Johnson, Department of Computing, Macquarie University, Sydney, Australia, Mark.Johnson@MQ.edu.au

Abstract
Grounded language learning, the task of mapping from natural language to a representation of meaning, has attracted more and more interest in recent years. In most work on this topic, however, utterances in a conversation are treated independently and discourse structure information is largely ignored. In the context of language acquisition, this independence assumption discards cues that are important to the learner, e.g., the fact that consecutive utterances are likely to share the same referent (Frank et al., 2013). The current paper describes an approach to the problem of simultaneously modeling grounded language at the sentence and discourse levels. We combine ideas from parsing and grammar induction to produce a parser that can handle long input strings with thousands of tokens, creating parse trees that represent full discourses. By casting grounded language learning as a grammatical inference task, we use our parser to extend the work of Johnson et al. (2012), investigating the importance of discourse continuity in children's language acquisition and its interaction with social cues. Our model boosts performance in a language acquisition task and yields good discourse segmentations compared with human annotators.

1 Introduction
Learning mappings between natural language (NL) and meaning representations (MR) is an important goal for both computational linguistics and cognitive science. Accurately learning novel mappings is crucial in grounded language understanding tasks and such systems can suggest insights into the nature of children's language learning. Two influential examples of grounded language learning tasks are the sportscasting task, RoboCup, where the NL is the set of running commentary and the MR is the set of logical forms representing actions like kicking or passing (Chen and Mooney, 2008), and the cross-situational word-learning task, where the NL is the caregiver's utterances and the MR is the set of objects present in the context (Siskind, 1996; Yu and Ballard, 2007). Work in these domains suggests that, based on the co-occurrence between words and their referents in context, it is possible to learn mappings between NL and MR even under substantial ambiguity. Nevertheless, contexts like RoboCup, where every single utterance is grounded, are extremely rare. Much more common are cases where a single topic is introduced and then discussed at length throughout a discourse. In a television news show, for example, a topic might be introduced by presenting a relevant picture or video clip. Once the topic is introduced, the anchors can discuss it by name or even using a pronoun without showing a picture. The discourse is grounded without having to ground every utterance. Moreover, although previous work has largely treated utterance order as independent, the order of utterances is critical in grounded discourse contexts: if the order is scrambled, it can become impossible to recover the topic. Supporting this idea, Frank et al. (2013) found that topic continuity, the tendency to talk about the same topic in multiple utterances that are contiguous in time, is both prevalent and informative for word learning. This paper examines the importance of topic continuity through a grammatical inference problem. We build on Johnson et al. (2012)'s work that used grammatical inference to
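The topic-continuity observation that motivates the paper can be quantified very simply: measure how often adjacent utterances share the same intended referent. The sketch below is an illustrative statistic, not the paper's grammatical-inference model, and the per-utterance referent annotations are hypothetical.

```python
def topic_continuity(referents):
    """Fraction of adjacent utterance pairs whose referents coincide.
    `referents` lists one (hypothetical) intended referent per utterance."""
    pairs = list(zip(referents, referents[1:]))
    same = sum(1 for a, b in pairs if a == b)
    return same / len(pairs)

# Toy discourse: the caregiver talks about a dog, then switches to a ball.
refs = ["dog", "dog", "ball", "ball", "ball"]
score = topic_continuity(refs)  # 3 of the 4 adjacent pairs match
```

A high value of this statistic is what makes utterance order informative: scrambling the utterances destroys the runs of shared referents that the discourse-level parser exploits.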