Documentation

What topic do you need documentation on?


Transactions of the Association for Computational Linguistics, vol. 4, pp. 47–60, 2016. Action Editor: David Chiang. Submission batch: 11/2015; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Detecting Cross-Cultural Differences Using a Multilingual Topic Model

E.D. Gutiérrez (University of California, San Diego), Ekaterina Shutova (Computer Laboratory, University of Cambridge), Patricia Lichtenstein (University of California, Merced), Gerard de Melo (IIIS, Tsinghua University), Luca Gilardi (ICSI, Berkeley). edg@icsi.berkeley.edu, es407@cam.ac.uk, tricia1@uchicago.edu, gdm@demelo.org, lucag@icsi.berkeley.edu

Abstract: Understanding cross-cultural differences has important implications for world affairs and many aspects of the life of society. Yet, the majority of text-mining methods to date focus on the analysis of monolingual texts. In contrast, we present a statistical model that simultaneously learns a set of common topics from multilingual, non-parallel data and automatically discovers the differences in perspectives on these topics across linguistic communities. We perform a behavioural evaluation of a subset of the differences identified by our model in English and Spanish to investigate their psychological validity.

1 Introduction

Recent years have seen a growing interest in text-mining applications aimed at uncovering public opinions and social trends (Fader et al., 2007; Monroe et al., 2008; Gerrish and Blei, 2011; Pennacchiotti and Popescu, 2011). They rest on the assumption that the language we use is indicative of our underlying worldviews. Research in cognitive and sociolinguistics suggests that linguistic variation across communities systematically reflects differences in their cultural and moral models and goes beyond lexicon and grammar (Kövecses, 2004; Lakoff and Wehling, 2012). Cross-cultural differences manifest themselves in text in a multitude of ways, most prominently through the use of explicit opinion vocabulary with respect to a certain topic (e.g. "policies that benefit the poor"), idiomatic and metaphorical language (e.g. "the company is spinning its wheels") and other types of figurative language, such as irony or sarcasm.

The connection between language, culture and reasoning remains one of the central research questions in psychology. Thibodeau and Boroditsky (2011) investigated how metaphors affect our decision-making. They presented two groups of human subjects with two different texts about crime. In the first text, crime was metaphorically portrayed as a virus and in the second as a beast. The two groups were then asked a set of questions on how to tackle crime in the city. As a result, while the first group tended to opt for preventive measures (e.g. stronger social policies), the second group converged on punishment- or restraint-oriented measures. According to Thibodeau and Boroditsky, their results demonstrate that metaphors have profound influence on how we conceptualize and act with respect to societal issues. This suggests that in order to gain a full understanding of social trends across populations, one needs to identify subtle but systematic linguistic differences that stem from the groups' cultural backgrounds, expressed both literally and figuratively. Performing such an analysis by hand is labor-intensive and often impractical, particularly in a multilingual setting where expertise in all of the languages of interest may be rare.

With the rise of blogging and social media, NLP techniques have been successfully used for a number of tasks in political science, including automatically estimating the influence of particular politicians in the US senate (Fader et al., 2007), identifying lexical features that differentiate political rhetoric of opposing parties (Monroe et al., 2008), predicting voting patterns of politicians based on their use of language (Gerrish and Blei, 2011), and predicting political affiliation of Twitter users (Pennacchiotti and Popescu, 2011). Fang et al. (2012) addressed…

Read more »


Transactions of the Association for Computational Linguistics, vol. 4, pp. 31–45, 2016. Action Editor: Tim Baldwin. Submission batch: 12/2015; Revision batch: 2/2016; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

A Bayesian Model of Diachronic Meaning Change

Lea Frermann and Mirella Lapata (Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB). l.frermann@ed.ac.uk, mlap@inf.ed.ac.uk

Abstract: Word meanings change over time and an automated procedure for extracting this information from text would be useful for historical exploratory studies, information retrieval or question answering. We present a dynamic Bayesian model of diachronic meaning change, which infers temporal word representations as a set of senses and their prevalence. Unlike previous work, we explicitly model language change as a smooth, gradual process. We experimentally show that this modeling decision is beneficial: our model performs competitively on meaning change detection tasks whilst inducing discernible word senses and their development over time. Application of our model to the SemEval-2015 temporal classification benchmark datasets further reveals that it performs on par with highly optimized task-specific systems.

1 Introduction

Language is a dynamic system, constantly evolving and adapting to the needs of its users and their environment (Aitchison, 2001). Words in all languages naturally exhibit a range of senses whose distribution or prevalence varies according to the genre and register of the discourse as well as its historical context. As an example, consider the word cute which according to the Oxford English Dictionary (OED, Stevenson 2010) first appeared in the early 18th century and originally meant clever or keen-witted.¹ By the late 19th century cute was used in the same sense as cunning. Today it mostly refers to objects or people perceived as attractive, pretty or sweet. Another example is the word mouse which initially was only used in the rodent sense. The OED dates the computer pointing device sense of mouse to 1965. The latter sense has become particularly dominant in recent decades due to the ever-increasing use of computer technology. (¹Throughout this paper we denote words in truetype, their senses in italics, and sense-specific context words as {lists}.)

The arrival of large-scale collections of historic texts (Davies, 2010) and online libraries such as the Internet Archive and Google Books have greatly facilitated computational investigations of language change. The ability to automatically detect how the meaning of words evolves over time is potentially of significant value to lexicographic and linguistic research but also to real world applications. Time-specific knowledge would presumably render word meaning representations more accurate, and benefit several downstream tasks where semantic information is crucial. Examples include information retrieval and question answering, where time-related information could increase the precision of query disambiguation and document retrieval (e.g., by returning documents with newly created senses or filtering out documents with obsolete senses).

In this paper we present a dynamic Bayesian model of diachronic meaning change. Word meaning is modeled as a set of senses, which are tracked over a sequence of contiguous time intervals. We infer temporal meaning representations, consisting of a word's senses (as a probability distribution over words) and their relative prevalence. Our model is thus able to detect that mouse had one sense until the mid-20th century (characterized by words such as {cheese, tail, rat}) and subsequently acquired a…

Read more »


Transactions of the Association for Computational Linguistics, vol. 4, pp. 17–30, 2016. Action Editor: Chris Callison-Burch. Submission batch: 9/2015; revised 12/2015; revised 1/2016; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Learning to Understand Phrases by Embedding the Dictionary

Felix Hill (Computer Laboratory, University of Cambridge, felix.hill@cl.cam.ac.uk), Kyunghyun Cho* (Courant Institute of Mathematical Sciences and Centre for Data Science, New York University, kyunghyun.cho@nyu.edu), Anna Korhonen (Department of Theoretical and Applied Linguistics, University of Cambridge, alk23@cam.ac.uk), Yoshua Bengio (CIFAR Senior Fellow, Université de Montréal, yoshua.bengio@umontreal.ca). *Work mainly done at the University of Montreal.

Abstract: Distributional models that learn rich semantic word representations are a success story of recent NLP research. However, developing models that learn useful representations of phrases and sentences has proved far harder. We propose using the definitions found in everyday dictionaries as a means of bridging this gap between lexical and phrasal semantics. Neural language embedding models can be effectively trained to map dictionary definitions (phrases) to (lexical) representations of the words defined by those definitions. We present two applications of these architectures: reverse dictionaries that return the name of a concept given a definition or description and general-knowledge crossword question answerers. On both tasks, neural language embedding models trained on definitions from a handful of freely-available lexical resources perform as well or better than existing commercial systems that rely on significant task-specific engineering. The results highlight the effectiveness of both neural embedding architectures and definition-based training for developing models that understand phrases and sentences.

1 Introduction

Much recent research in computational semantics has focussed on learning representations of arbitrary-length phrases and sentences. This task is challenging partly because there is no obvious gold standard of phrasal representation that could be used in training and evaluation. Consequently, it is difficult to design approaches that could learn from such a gold standard, and also hard to evaluate or compare different models.

In this work, we use dictionary definitions to address this issue. The composed meaning of the words in a dictionary definition (a tall, long-necked, spotted ruminant of Africa) should correspond to the meaning of the word they define (giraffe). This bridge between lexical and phrasal semantics is useful because high quality vector representations of single words can be used as a target when learning to combine the words into a coherent phrasal representation.

This approach still requires a model capable of learning to map between arbitrary-length phrases and fixed-length continuous-valued word vectors. For this purpose we experiment with two broad classes of neural language models (NLMs): Recurrent Neural Networks (RNNs), which naturally encode the order of input words, and simpler (feed-forward) bag-of-words (BOW) embedding models. Prior to training these NLMs, we learn target lexical representations by training the Word2Vec software (Mikolov et al., 2013) on billions of words of raw text.

We demonstrate the usefulness of our approach by building and releasing two applications. The first is a reverse dictionary or concept finder: a system that returns words based on user descriptions or definitions (Zock and Bilac, 2004). Reverse dictionaries are used by copywriters, novelists, translators and other professional writers to find words for notions or ideas that might be on the tip of their tongue.
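The reverse-dictionary idea can be made concrete with a small sketch. In the snippet below, random vectors stand in for the Word2Vec targets and the learned composition is reduced to an identity projection; in the paper the mapping is an RNN or BOW network trained with a cosine loss, so this is only a minimal illustration of ranking vocabulary words against a composed definition embedding:

import numpy as np

# Toy stand-ins: in the paper, target vectors come from Word2Vec trained on
# billions of words, and the projection is learned; here both are placeholders.
rng = np.random.default_rng(0)
dim = 50
vocab = ["giraffe", "clock", "ocean", "axe"]
target = {w: rng.normal(size=dim) for w in vocab}  # pre-trained lexical targets
W = np.eye(dim)                                    # stand-in for the learned map

def embed_definition(words):
    """BOW composition: sum the vectors of the definition words, then project."""
    return W @ sum((target.get(w, np.zeros(dim)) for w in words), np.zeros(dim))

def reverse_dictionary(definition, k=2):
    """Rank vocabulary words by cosine similarity to the definition embedding."""
    q = embed_definition(definition.lower().split())
    q = q / (np.linalg.norm(q) + 1e-9)
    score = {w: float(v @ q) / (np.linalg.norm(v) + 1e-9)
             for w, v in target.items()}
    return sorted(score, key=score.get, reverse=True)[:k]

print(reverse_dictionary("tall long-necked spotted ruminant of africa"))

With trained vectors, the top-ranked word for the definition above would ideally be "giraffe"; with the random placeholders here the ranking is of course meaningless.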

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 529–542, 2017. Action Editor: Diana McCarthy. Submission batch: 7/2017; Published 12/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge

Ryan J. Gallagher (Information Sciences Institute, University of Southern California; Vermont Complex Systems Center, Computational Story Lab, University of Vermont; ryan.gallagher@uvm.edu), Kyle Reing, David Kale, and Greg Ver Steeg (Information Sciences Institute, University of Southern California; {reing,kale,gregv}@isi.edu)

Abstract: While generative models such as Latent Dirichlet Allocation (LDA) have proven fruitful in topic modeling, they often require detailed assumptions and careful specification of hyperparameters. Such model complexity issues only compound when trying to generalize generative models to incorporate human input. We introduce Correlation Explanation (CorEx), an alternative approach to topic modeling that does not assume an underlying generative model, and instead learns maximally informative topics through an information-theoretic framework. This framework naturally generalizes to hierarchical and semi-supervised extensions with no additional modeling assumptions. In particular, word-level domain knowledge can be flexibly incorporated within CorEx through anchor words, allowing topic separability and representation to be promoted with minimal human intervention. Across a variety of datasets, metrics, and experiments, we demonstrate that CorEx produces topics that are comparable in quality to those produced by unsupervised and semi-supervised variants of LDA.

1 Introduction

The majority of topic modeling approaches utilize probabilistic generative models, models which specify mechanisms for how documents are written in order to infer latent topics. These mechanisms may be explicitly stated, as in Latent Dirichlet Allocation (LDA) (Blei et al., 2003), or implicitly stated, as with matrix factorization techniques (Hofmann, 1999; Ding et al., 2008; Buntine and Jakulin, 2006). The core generative mechanisms of LDA, in particular, have inspired numerous generalizations that account for additional information, such as the authorship (Rosen-Zvi et al., 2004), document labels (McAuliffe and Blei, 2008), or hierarchical structure (Griffiths et al., 2004).

However, these generalizations come at the cost of increasingly elaborate and unwieldy generative assumptions. While these assumptions allow topic inference to be tractable in the face of additional metadata, they progressively constrain topics to a narrower view of what a topic can be. Such assumptions are undesirable in contexts where one wishes to minimize model complexity and learn topics without preexisting notions of how those topics originated.

For these reasons, we propose topic modeling by way of Correlation Explanation (CorEx),¹ an information-theoretic approach to learning latent topics over documents. Unlike LDA, CorEx does not assume a particular data generating model, and instead searches for topics that are "maximally informative" about a set of documents. By learning informative topics rather than generated topics, we avoid specifying the structure and nature of topics ahead of time. In addition, the lightweight framework underlying CorEx is versatile and naturally extends to hierarchical and semi-supervised variants with no additional modeling assumptions. More specifically, we… (¹Open source, documented code for the CorEx topic model is available at https://github.com/gregversteeg/corex_topic.)
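The repository linked in the paper ships a Python package; the following minimal sketch is based on that package's documented usage (the package name corextopic, the Corex constructor, and the anchors/anchor_strength arguments are assumptions that may drift across versions). It anchors one word to steer one of two topics on a toy document-word matrix:

import numpy as np
import scipy.sparse as ss
from corextopic import corextopic as ct  # pip install corextopic (assumed)

# Toy binary document-word matrix (documents x words).
words = ["goal", "match", "league", "stock", "market", "shares"]
X = ss.csr_matrix(np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 0, 1],
]))

# Two latent topics; anchoring "goal" to topic 0 nudges that topic toward
# sports vocabulary (anchor_strength > 1 upweights the anchor word).
model = ct.Corex(n_hidden=2, seed=1)
model.fit(X, words=words, anchors=[["goal"]], anchor_strength=3)

for i, topic in enumerate(model.get_topics()):
    print(i, [w for w, *_ in topic])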

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 487–500, 2017. Action Editor: Chris Quirk. Submission batch: 3/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Phrase Table Induction Using In-Domain Monolingual Data for Domain Adaptation in Statistical Machine Translation

Benjamin Marie and Atsushi Fujita (National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan). {bmarie,atsushi.fujita}@nict.go.jp

Abstract: We present a new framework to induce an in-domain phrase table from in-domain monolingual data that can be used to adapt a general-domain statistical machine translation system to the targeted domain. Our method first compiles sets of phrases in source and target languages separately and generates candidate phrase pairs by taking the Cartesian product of the two phrase sets. It then computes inexpensive features for each candidate phrase pair and filters them using a supervised classifier in order to induce an in-domain phrase table. We experimented on the language pair English–French, both translation directions, in two domains and obtained consistently better results than a strong baseline system that uses an in-domain bilingual lexicon. We also conducted an error analysis that showed the induced phrase tables proposed useful translations, especially for words and phrases unseen in the parallel data used to train the general-domain baseline system.

1 Introduction

In phrase-based statistical machine translation (SMT), translation models are estimated over a large amount of parallel data. In general, using more data leads to a better translation model. When no specific domain is targeted, general-domain¹ parallel data from various domains may be used to train a general-purpose SMT system. (¹As in Axelrod et al. (2011), in this paper, we use the term general-domain instead of the commonly used out-of-domain because we assume that the parallel data may contain some in-domain sentence pairs.) However, it is well-known that, in training a system to translate texts from a specific domain, using in-domain parallel data can lead to a significantly better translation quality (Carpuat et al., 2012). Indeed, when only general-domain parallel data are used, it is unlikely that the translation model can learn expressions and their translations specific to the targeted domain. Such expressions will then remain untranslated in the in-domain texts to translate.

So far, in-domain parallel data have been harnessed to cover domain-specific expressions and their translations in the translation model. However, even if we can assume the availability of a large quantity of general-domain parallel data, at least for resource-rich language pairs, finding in-domain parallel data specific to a particular domain remains challenging. In-domain parallel data may not exist for the targeted language pairs or may not be available at hand to train a good translation model.

In order to circumvent the lack of in-domain parallel data, this paper presents a new method to adapt an existing SMT system to a specific domain by inducing an in-domain phrase table, i.e., a set of phrase pairs associated with features for decoding, from in-domain monolingual data. As we review in Section 2, most of the existing methods for inducing phrase tables are not designed, and may not perform as expected, to induce a phrase table for a specific domain for which only limited resources are available. Instead of relying on large quantity of parallel data or highly comparable corpora, our method induces an in-domain phrase table from unaligned in-domain monolingual data through a three-step pro…
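A highly simplified sketch of the candidate-generation-and-filtering idea follows. Everything concrete here (the tiny phrase sets, the seed_lexicon, the two features, and the accept rule standing in for the trained supervised classifier) is invented for illustration; the paper uses richer inexpensive features and a classifier trained on known phrase pairs:

import itertools

# Monolingual phrase sets compiled separately for each language (toy data).
src_phrases = ["réseau de neurones", "apprentissage profond"]
tgt_phrases = ["neural network", "deep learning", "the cat"]

# Hypothetical seed lexicon used to compute a cheap lexical-coverage feature.
seed_lexicon = {("réseau", "network"), ("neurones", "neural"),
                ("apprentissage", "learning"), ("profond", "deep")}

def features(src, tgt):
    s, t = src.split(), tgt.split()
    overlap = sum((a, b) in seed_lexicon for a, b in itertools.product(s, t))
    return [abs(len(s) - len(t)), overlap / max(len(s), len(t))]

def accept(feats):
    # Stand-in for the supervised classifier that filters candidate pairs.
    length_diff, lex_cov = feats
    return lex_cov > 0.3 and length_diff <= 2

# Cartesian product of the two phrase sets gives the candidate pairs.
phrase_table = [(s, t) for s, t in itertools.product(src_phrases, tgt_phrases)
                if accept(features(s, t))]
print(phrase_table)  # keeps the two plausible pairs, drops ("…", "the cat")

The design point the sketch illustrates is that the Cartesian product is cheap to enumerate but extremely noisy, which is why inexpensive features plus a trained filter are essential before the surviving pairs can be used as a phrase table.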

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 441–454, 2017. Action Editor: Marco Kuhlmann. Submission batch: 4/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Parsing with Traces: An O(n^4) Algorithm and a Structural Representation

Jonathan K. Kummerfeld and Dan Klein (Computer Science Division, University of California, Berkeley, Berkeley, CA 94720, USA). {jkk,klein}@cs.berkeley.edu

Abstract: General treebank analyses are graph structured, but parsers are typically restricted to tree structures for efficiency and modeling reasons. We propose a new representation and algorithm for a class of graph structures that is flexible enough to cover almost all treebank structures, while still admitting efficient learning and inference. In particular, we consider directed, acyclic, one-endpoint-crossing graph structures, which cover most long-distance dislocation, shared argumentation, and similar tree-violating linguistic phenomena. We describe how to convert phrase structure parses, including traces, to our new representation in a reversible manner. Our dynamic program uniquely decomposes structures, is sound and complete, and covers 97.3% of the Penn English Treebank. We also implement a proof-of-concept parser that recovers a range of null elements and trace types.

1 Introduction

Many syntactic representations use graphs and/or discontinuous structures, such as traces in Government and Binding theory and f-structure in Lexical Functional Grammar (Chomsky 1981; Kaplan and Bresnan 1982). Sentences in the Penn Treebank (PTB, Marcus et al. 1993) have a core projective tree structure and trace edges that represent control structures, wh-movement and more. However, most parsers and the standard evaluation metric ignore these edges and all null elements. By leaving out parts of the structure, they fail to provide key relations to downstream tasks such as question answering. While there has been work on capturing some parts of this extra structure, it has generally either been through post-processing on trees (Johnson 2002; Jijkoun 2003; Campbell 2004; Levy and Manning 2004; Gabbard et al. 2006) or has only captured a limited set of phenomena via grammar augmentation (Collins 1997; Dienes and Dubey 2003; Schmid 2006; Cai et al. 2011).

We propose a new general-purpose parsing algorithm that can efficiently search over a wide range of syntactic phenomena. Our algorithm extends a non-projective tree parsing algorithm (Pitler et al. 2013; Pitler 2014) to graph structures, with improvements to avoid derivational ambiguity while maintaining an O(n^4) runtime. Our algorithm also includes an optional extension to ensure parses contain a directed projective tree of non-trace edges.

Our algorithm cannot apply directly to constituency parses; it requires lexicalized structures similar to dependency parses. We extend and improve previous work on lexicalized constituent representations (Shen et al. 2007; Carreras et al. 2008; Hayashi and Nagata 2016) to handle traces. In this form, traces can create problematic structures such as directed cycles, but we show how careful choice of head rules can minimize such issues.

We implement a proof-of-concept parser, scoring 88.1 on trees in section 23 and 70.6 on traces. Together, our representation and algorithm cover 97.3% of sentences, far above the coverage of projective tree parsers (43.9%).

2 Background

This work builds on two areas: non-projective tree parsing, and parsing with null elements. Non-projectivity is important in syntax for rep…

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 413–424, 2017. Action Editor: Brian Roark. Submission batch: 6/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

In-Order Transition-based Constituent Parsing

Jiangming Liu and Yue Zhang (Singapore University of Technology and Design, 8 Somapah Road, Singapore, 487372). jmliunlp@gmail.com, yuezhang@sutd.edu.sg

Abstract: Both bottom-up and top-down strategies have been used for neural transition-based constituent parsing. The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree, where bottom-up strategies and top-down strategies take post-order and pre-order traversal over trees, respectively. Bottom-up parsers benefit from rich features from readily built partial parses, but lack lookahead guidance in the parsing process; top-down parsers benefit from non-local guidance for local decisions, but rely on a strong encoder over the input to predict a constituent hierarchy before its construction. To mitigate both issues, we propose a novel parsing system based on in-order traversal over syntactic trees, designing a set of transition actions to find a compromise between bottom-up constituent information and top-down lookahead information. Based on stack-LSTM, our psycholinguistically motivated constituent parsing system achieves 91.8 F1 on the WSJ benchmark. Furthermore, the system achieves 93.6 F1 with supervised reranking and 94.2 F1 with semi-supervised reranking, which are the best results on the WSJ benchmark.

1 Introduction

Transition-based constituent parsing employs sequences of local transition actions to construct constituent trees over sentences. There are two popular transition-based constituent parsing systems, namely bottom-up parsing (Sagae and Lavie, 2005; Zhang and Clark, 2009; Zhu et al., 2013; Watanabe and Sumita, 2015) and top-down parsing (Dyer et al., 2016; Kuncoro et al., 2017). The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree.

The process of bottom-up parsing can be regarded as post-order traversal over a constituent tree. For example, given the sentence in Figure 1, a bottom-up shift-reduce parser takes the action sequence in Table 2(a)¹ to build the output, where the word sequence "The little boy" is first read, and then an NP recognized for the word sequence. (¹The action sequence is taken on unbinarized trees.) After the system reads the verb "likes" and its subsequent NP, a VP is recognized. The full order of recognition for the tree nodes is ③→④→⑤→②→⑦→⑨→⑩→⑧→⑥→⑪→①. When making local decisions, rich information is available from readily built partial trees (Zhu et al., 2013; Watanabe and Sumita, 2015; Cross and Huang, 2016), which contributes to local disambiguation. However, there is a lack of top-down guidance from lookahead information, which can be useful (Johnson, 1998; Roark and Johnson, 1999; Charniak, 2000; Liu and Zhang, 2017). In addition, binarization must be applied to trees, as shown in Figure 1(b), to ensure a constant number of actions (Sagae and Lavie, 2005), and to take advantage of lexical head information (Collins, 2003). However, such binarization requires a set of language-specific rules, which hampers adaptation of parsing to other languages.

On the other hand, the process of top-down parsing can be regarded as pre-order traversal over a tree. Given the sentence in Figure 1, a top-down…
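The three traversal orders the abstract contrasts are easy to see on a toy tree. The sketch below uses plain (label, children) tuples rather than the paper's transition system, and the example sentence is completed with an arbitrary object NP ("red tomatoes"), since the excerpt truncates before Figure 1; for in-order traversal, a node's first child is visited before the node's own label, and the remaining children after it:

# Post-order (bottom-up), pre-order (top-down), and in-order traversal of a
# constituent tree represented as (label, children) tuples.
def post_order(t):
    label, children = t
    return [n for c in children for n in post_order(c)] + [label]

def pre_order(t):
    label, children = t
    return [label] + [n for c in children for n in pre_order(c)]

def in_order(t):
    label, children = t
    if not children:
        return [label]
    first, rest = children[0], children[1:]
    return in_order(first) + [label] + [n for c in rest for n in in_order(c)]

# (S (NP The little boy) (VP likes (NP red tomatoes))) -- toy example
tree = ("S", [("NP", [("The", []), ("little", []), ("boy", [])]),
              ("VP", [("likes", []), ("NP", [("red", []), ("tomatoes", [])])])])

print(post_order(tree))  # bottom-up recognition order
print(pre_order(tree))   # top-down recognition order
print(in_order(tree))    # the in-order compromise: NP is built bottom-up,
                         # but S is predicted before the VP is constructed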

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 379–395, 2017. Action Editor: Mark Steedman. Submission batch: 12/2016; Revision batch: 3/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Ordinal Common-sense Inference

Sheng Zhang (Johns Hopkins University, zsheng2@jhu.edu), Rachel Rudinger (Johns Hopkins University, rudinger@jhu.edu), Kevin Duh (Johns Hopkins University, kevinduh@cs.jhu.edu), Benjamin Van Durme (Johns Hopkins University, vandurme@cs.jhu.edu)

Abstract: Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural sequence-to-sequence model on this dataset, which we use to score and generate possible inferences. Further, we annotate subsets of previously established datasets via our ordinal annotation protocol in order to then analyze the distinctions between these and what we have constructed.

1 Introduction

"We use words to talk about the world. Therefore, to understand what words mean, we must have a prior explication of how we view the world." (Hobbs, 1987)

Researchers in Artificial Intelligence and (Computational) Linguistics have long cited the requirement of common-sense knowledge in language understanding.¹ This knowledge is viewed as a key component in filling in the gaps between the telegraphic style of natural language statements. We are able to convey considerable information in a relatively sparse channel, presumably owing to a partially shared model at the start of any discourse.² (¹Schank (1975): "It has been apparent … within … natural language understanding that the eventual limit to our solution would be our ability to characterize world knowledge." ²McCarthy (1959): "a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows.")

[Figure 1: Examples of common-sense inference ranging from very likely, likely, plausible, technically possible, to impossible: "Sam bought a new clock; The clock runs." "Dave found an axe in his garage; A car is parked in the garage." "Tom was accidentally shot by his teammate in the army; The teammate dies." "Two friends were in a heated game of checkers; A person shoots the checkers." "My friends and I decided to go swimming in the ocean; The ocean is carbonated."]

Common-sense inference, that is, inference based on common-sense knowledge, is possibilistic: things everyone more or less would expect to hold in a given context, but without the necessary strength of logical entailment.³ (³Many of the bridging inferences of Clark (1975) make use of common-sense knowledge, such as the following example of "Probable part": "I walked into the room. The windows looked out to the bay." To resolve the definite reference "the windows", one needs to know that rooms have windows is probable.) Because natural language corpora exhibit human reporting bias (Gordon and Van Durme, 2013), systems that derive knowledge exclusively from such corpora may be more accurately considered models of language, rather than of the…

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 365–378, 2017. Action Editor: Adam Lopez. Submission batch: 11/2016; Revision batch: 2/2017; Published 10/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Fully Character-Level Neural Machine Translation without Explicit Segmentation

Jason Lee* (ETH Zürich, jasonlee@inf.ethz.ch), Kyunghyun Cho (New York University, kyunghyun.cho@nyu.edu), Thomas Hofmann (ETH Zürich, thomas.hofmann@inf.ethz.ch). *The majority of this work was completed while the author was visiting New York University.

Abstract: Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT'15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of the BLEU score and human judgment.

1 Introduction

Nearly all previous work in machine translation has been at the level of words. Aside from our intuitive understanding of word as a basic unit of meaning (Jackendoff, 1992), one reason behind this is that sequences are significantly longer when represented in characters, compounding the problem of data sparsity and modeling long-range dependencies. This has driven NMT research to be almost exclusively word-level (Bahdanau et al., 2015; Sutskever et al., 2014).

Despite their remarkable success, word-level NMT models suffer from several major weaknesses. For one, they are unable to model rare, out-of-vocabulary words, making them limited in translating languages with rich morphology such as Czech, Finnish and Turkish. If one uses a large vocabulary to combat this (Jean et al., 2015), the complexity of training and decoding grows linearly with respect to the target vocabulary size, leading to a vicious cycle.

To address this, we present a fully character-level NMT model that maps a character sequence in a source language to a character sequence in a target language. We show that our model outperforms a baseline with a subword-level encoder on DE-EN and CS-EN, and achieves a comparable result on FI-EN and RU-EN. A purely character-level NMT model with a basic encoder was proposed as a baseline by Luong and Manning (2016), but training it was prohibitively slow. We were able to train our model at a reasonable speed by drastically reducing the length of source sentence representation using a stack of convolutional, pooling and highway layers.

One advantage of character-level models is that they are better suited for multilingual translation than their word-level counterparts which require a separate word vocabulary for each language. We…
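A minimal PyTorch sketch of the length-reduction idea follows. The layer sizes, kernel width, and pooling stride here are invented for illustration; the paper's encoder stacks several convolution widths, pooling, and highway layers, which this sketch omits:

import torch
import torch.nn as nn

class CharEncoder(nn.Module):
    """Embed characters, extract local n-gram features with a convolution,
    then max-pool with a stride so the output sequence is several times
    shorter than the raw character sequence."""
    def __init__(self, n_chars=100, emb=32, channels=64, pool_stride=5):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.conv = nn.Conv1d(emb, channels, kernel_size=3, padding=1)
        self.pool = nn.MaxPool1d(kernel_size=pool_stride, stride=pool_stride)

    def forward(self, char_ids):                  # (batch, seq_len)
        x = self.embed(char_ids).transpose(1, 2)  # (batch, emb, seq_len)
        x = torch.relu(self.conv(x))              # local character n-gram features
        return self.pool(x).transpose(1, 2)       # (batch, seq_len // 5, channels)

enc = CharEncoder()
out = enc(torch.randint(0, 100, (2, 300)))  # 300 characters in...
print(out.shape)                            # ...60 pooled segments out

The pooled segments, not the raw characters, are what the recurrent layers and attention then operate on, which is what makes training speed comparable to subword-level models.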

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 353–364, 2017. Action Editor: Eric Fosler-Lussier. Submission batch: 10/2016; Revision batch: 12/2016; Published 10/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Unsupervised Learning of Morphological Forests

Jiaming Luo (CSAIL, MIT, j_luo@mit.edu), Karthik Narasimhan (CSAIL, MIT, karthikn@mit.edu), Regina Barzilay (CSAIL, MIT, regina@csail.mit.edu)

Abstract: This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edge-wise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of tight morphological families. The resulting objective is solved using Integer Linear Programming (ILP) paired with contrastive estimation. We train the model by alternating between optimizing the local log-linear model and the global ILP objective. We evaluate our system on three tasks: root detection, clustering of morphological families, and segmentation. Our experiments demonstrate that our model yields consistent gains in all three tasks compared with the best published results.¹ (¹Code is available at https://github.com/j-luo93/MorphForest.)

1 Introduction

The morphological study of a language inherently draws upon the existence of families of related words. All words within a family can be derived from a common root via a series of transformations, whether inflectional or derivational. Figure 1 depicts one such family, originating from the word faith. This representation can benefit a range of applications, including segmentation, root detection and clustering of morphological families.

[Figure 1: An illustration of a single tree in a morphological forest. pre and suf represent prefixation and suffixation. Each edge has an associated probability for the morphological change.]

Using graph terminology, a full morphological assignment of the words in a language can be represented as a forest.² Valid forests of morphological families exhibit a number of well-known regularities. At the global level, the number of roots is limited, and roots only constitute a small fraction of the vocabulary. A similar constraint applies to the number of possible affixes, shared across families. At the local edge level, we prefer derivations that follow regular orthographic patterns and preserve semantic relatedness. We hypothesize that enforcing these constraints as part of the forest induction pro… (²The correct mathematical term for the structure in Figure 1 is a directed 1-forest or functional graph. For simplicity, we shall use the terms forest and tree to refer to a directed 1-forest or a directed 1-tree because of the cycle at the root.)

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 309–324, 2017. Action Editor: Sebastian Padó. Submission batch: 4/2017; Revision batch: 7/2017; Published 9/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

Nikola Mrkšić, Ivan Vulić, Diarmuid Ó Séaghdha, Ira Leviant, Roi Reichart, Milica Gašić, Anna Korhonen, Steve Young (University of Cambridge; Apple Inc.; Technion, IIT)

Abstract: We present ATTRACT-REPEL, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. ATTRACT-REPEL facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialized cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that ATTRACT-REPEL-specialized vectors boost performance in the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.

1 Introduction

Word representation learning has become a research area of central importance in modern natural language processing. The common techniques for inducing distributed word representations are grounded in the distributional hypothesis, relying on co-occurrence information in large textual corpora to learn meaningful word representations (Mikolov et al., 2013b; Pennington et al., 2014; Ó Séaghdha and Korhonen, 2014; Levy and Goldberg, 2014). Recently, methods that go beyond stand-alone unsupervised learning have gained increased popularity. These models typically build on distributional ones by using human- or automatically-constructed knowledge bases to enrich the semantic content of existing word vector collections. Often this is done as a post-processing step, where the distributional word vectors are refined to satisfy constraints extracted from a lexical resource such as WordNet (Faruqui et al., 2015; Wieting et al., 2015; Mrkšić et al., 2016). We term this approach semantic specialization.

In this paper we advance the semantic specialization paradigm in a number of ways. We introduce a new algorithm, ATTRACT-REPEL, that uses synonymy and antonymy constraints drawn from lexical resources to tune word vector spaces using linguistic information that is difficult to capture with conventional distributional training. Our evaluation shows that ATTRACT-REPEL outperforms previous methods which make use of similar lexical resources, achieving state-of-the-art results on two word similarity datasets: SimLex-999 (Hill et al., 2015) and SimVerb-3500 (Gerz et al., 2016).

We then deploy the ATTRACT-REPEL algorithm in a multilingual setting, using semantic relations extracted from BabelNet (Navigli and Ponzetto, 2012; Ehrmann et al., 2014), a cross-lingual lexical resource, to inject constraints between words of different languages into the word representations. This allows us to embed vector spaces of multiple languages into a single vector space, exploiting information from high-resource languages to improve the word representations of lower-resource ones. Table 1 illustrates the effects of cross-lingual ATTRACT-REPEL specialization by showing the nearest neighbors for three English words across three cross-lingual spaces.
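A toy sketch of the attract/repel intuition follows. This is not the ATTRACT-REPEL objective itself (which uses margin-based costs over mini-batches with negative examples and a regularization term pulling vectors toward their distributional starting point); it only shows the two opposing update directions on a few made-up vectors:

import numpy as np

rng = np.random.default_rng(0)
vecs = {w: rng.normal(size=10) for w in ["east", "eastern", "west"]}
synonyms = [("east", "eastern")]   # constraint: should be close
antonyms = [("east", "west")]      # constraint: should be far apart

def step(vecs, lr=0.1):
    for a, b in synonyms:          # attract: move the pair together
        diff = vecs[a] - vecs[b]
        vecs[a] -= lr * diff
        vecs[b] += lr * diff
    for a, b in antonyms:          # repel: move the pair apart
        diff = vecs[a] - vecs[b]
        vecs[a] += lr * diff
        vecs[b] -= lr * diff
    for w in vecs:                 # keep vectors on the unit sphere
        vecs[w] /= np.linalg.norm(vecs[w])

for _ in range(10):
    step(vecs)
# Cosine of the synonym pair rises; cosine of the antonym pair falls.
print(vecs["east"] @ vecs["eastern"], vecs["east"] @ vecs["west"])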

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 295–307, 2017. Action Editor: Christopher Potts. Submission batch: 10/2016; Revision batch: 12/2016; Published 8/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Overcoming Language Variation in Sentiment Analysis with Social Attention

Yi Yang and Jacob Eisenstein (School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA 30308). {yiyang+jacobe}@gatech.edu

Abstract: Variation in language is ubiquitous, particularly in newer forms of writing such as social media. Fortunately, variation is not random; it is often linked to social properties of the author. In this paper, we show how to exploit social networks to make sentiment analysis more robust to social language variation. The key idea is linguistic homophily: the tendency of socially linked individuals to use language in similar ways. We formalize this idea in a novel attention-based neural network architecture, in which attention is divided among several basis models, depending on the author's position in the social network. This has the effect of smoothing the classification function across the social network, and makes it possible to induce personalized classifiers even for authors for whom there is no labeled data or demographic metadata. This model significantly improves the accuracies of sentiment analysis on Twitter and on review data.

1 Introduction

Words can mean different things to different people. Fortunately, these differences are rarely idiosyncratic, but are often linked to social factors, such as age (Rosenthal and McKeown, 2011), gender (Eckert and McConnell-Ginet, 2003), race (Green, 2002), geography (Trudgill, 1974), and more ineffable characteristics such as political and cultural attitudes (Fischer, 1958; Labov, 1963). In natural language processing (NLP), social media data has brought variation to the fore, spurring the development of new computational techniques for characterizing variation in the lexicon (Eisenstein et al., 2010), orthography (Eisenstein, 2015), and syntax (Blodgett et al., 2016). However, aside from the focused task of spelling normalization (Sproat et al., 2001; Aw et al., 2006), there have been few attempts to make NLP systems more robust to language variation across speakers or writers.

One exception is the work of Hovy (2015), who shows that the accuracies of sentiment analysis and topic classification can be improved by the inclusion of coarse-grained author demographics such as age and gender. However, such demographic information is not directly available in most datasets, and it is not yet clear whether predicted age and gender offer any improvements. On the other end of the spectrum are attempts to create personalized language technologies, as are often employed in information retrieval (Shen et al., 2005), recommender systems (Basilico and Hofmann, 2004), and language modeling (Federico, 1996). But personalization requires annotated data for each individual user, something that may be possible in interactive settings such as information retrieval, but is not typically feasible in natural language processing.

We propose a middle ground between group-level demographic characteristics and personalization, by exploiting social network structure. The sociological theory of homophily asserts that individuals are usually similar to their friends (McPherson et al., 2001). This property has been demonstrated for language (Bryden et al., 2013) as well as for the demographic properties targeted by Hovy (2015), which are more likely to be shared by friends than by random pairs of individuals (Thelwall, 2009). Social…
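The mixture-of-basis-models idea can be sketched in a few lines. All shapes and weight matrices below are made up, and the author embedding is taken as given; in the paper, author embeddings are derived from the social graph so that linked authors receive similar attention weights:

import numpy as np

rng = np.random.default_rng(0)
K, d_text, d_auth = 3, 20, 8
W_basis = rng.normal(size=(K, d_text))  # one linear sentiment scorer per basis
W_attn = rng.normal(size=(K, d_auth))   # maps author embedding to basis logits

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def predict(text_vec, author_vec):
    attn = softmax(W_attn @ author_vec)  # attention over the K basis models
    scores = W_basis @ text_vec          # each basis model scores the text
    return float(attn @ scores)          # author-dependent, smoothed score

print(predict(rng.normal(size=d_text), rng.normal(size=d_auth)))

Because the attention depends only on the author embedding, an author with no labeled data still gets an effectively personalized classifier: a mixture determined by where they sit in the social network.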

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 279–293, 2017. Action Editor: Yuji Matsumoto. Submission batch: 5/2016; Revision batch: 10/2016; 2/2017; Published 8/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Cross-Lingual Syntactic Transfer with Limited Resources

Mohammad Sadegh Rasooli and Michael Collins* (Department of Computer Science, Columbia University, New York, NY 10027, USA). {rasooli,mcollins}@cs.columbia.edu. *On leave at Google Inc. New York.

Abstract: We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available. This method makes use of three steps: 1) a method for deriving cross-lingual word clusters, which can then be used in a multilingual parser; 2) a method for transferring lexical information from a target language to source language treebanks; 3) a method for integrating these steps with the density-driven annotation projection method of Rasooli and Collins (2015). Experiments show improvements over the state-of-the-art in several languages used in previous work, in a setting where the only source of translation data is the Bible, a considerably smaller corpus than the Europarl corpus used in previous work. Results using the Europarl corpus as a source of translation data show additional improvements over the results of Rasooli and Collins (2015). We conclude with results on 38 datasets from the Universal Dependencies corpora.

1 Introduction

Creating manually-annotated syntactic treebanks is an expensive and time consuming task. Recently there has been a great deal of interest in cross-lingual syntactic transfer, where a parsing model is trained for some language of interest, using only treebanks in other languages. There is a clear motivation for this in building parsing models for languages for which treebank data is unavailable. Methods for syntactic transfer include annotation projection methods (Hwa et al., 2005; Ganchev et al., 2009; McDonald et al., 2011; Ma and Xia, 2014; Rasooli and Collins, 2015; Lacroix et al., 2016; Agić et al., 2016), learning of delexicalized models on universal treebanks (Zeman and Resnik, 2008; McDonald et al., 2011; Täckström et al., 2013; Rosa and Zabokrtsky, 2015), treebank translation (Tiedemann et al., 2014; Tiedemann, 2015; Tiedemann and Agić, 2016) and methods that leverage cross-lingual representations of word clusters, embeddings or dictionaries (Täckström et al., 2012; Durrett et al., 2012; Duong et al., 2015a; Zhang and Barzilay, 2015; Xiao and Guo, 2015; Guo et al., 2015; Guo et al., 2016; Ammar et al., 2016a).

This paper considers the problem of cross-lingual syntactic transfer with limited resources of monolingual and translation data. Specifically, we use the Bible corpus of Christodoulopoulos and Steedman (2014) as a source of translation data, and Wikipedia as a source of monolingual data. We deliberately limit ourselves to the use of Bible translation data because it is available for a very broad set of languages: the data from Christodoulopoulos and Steedman (2014) includes data from 100 languages. The Bible data contains a much smaller set of sentences (around 24,000) than other translation corpora, for example Europarl (Koehn, 2005), which has around 2 million sentences per language pair. This makes it a considerably more challenging corpus to work with. Similarly, our choice of Wikipedia as the source of monolingual data is motivated by the availability of Wikipedia data in a very broad set of languages.

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 247–261, 2017. Action Editor: Hinrich Schütze. Submission batch: 12/2015; Revision batch: 5/2016; 11/2016; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling

Gábor Berend (Department of Informatics, University of Szeged, 2 Árpád tér, 6720 Szeged, Hungary). berendg@inf.u-szeged.hu

Abstract: In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained at 1.2% of the total available training data, i.e. 150 sentences per language.

1 Introduction

Determining the linguistic structure of natural language texts based on rich hand-crafted features has a long-going history in natural language processing. The focus of traditional approaches has mostly been on building linguistic analyzers for a particular kind of analysis, which often leads to the incorporation of extensive linguistic and/or domain knowledge for defining the feature space. Consequently, traditional models easily become language and/or task specific, resulting in improper generalization properties.

A new research direction has emerged recently that aims at building more general models that require far less feature engineering or none at all. These advancements in natural language processing, pioneered by Bengio et al. (2003), followed by Collobert and Weston (2008), Collobert et al. (2011), and Mikolov et al. (2013a) among others, employ a different philosophy. The objective of these works is to find representations for linguistic phenomena in an unsupervised manner by relying on large amounts of text. Natural language phenomena are extremely sparse by their nature, whereas continuous word embeddings employ dense representations of words. In our paper we empirically verify via rigorous experiments that turning these dense representations into a much sparser (yet denser than one-hot encoding) form can keep the most salient parts of word representations that are highly suitable for sequence models.

Furthermore, our experiments reveal that our proposed model performs substantially better than traditional feature-rich models in the absence of abundant training data. Our proposed model also has the advantage of performing well on multiple sequence labeling tasks without any modification in the applied word representations, thanks to the sparse features derived from continuous word representations. Our work aims at introducing a novel sequence labeling model solely utilizing features derived from the sparse coding of continuous word embeddings. Even though sparse coding had previously been utilized in NLP prior to us (Faruqui et al., 2015; Chen et al., 2016), to the best of our knowledge, we are the first to propose a sequence labeling framework incorporating it, with the following contributions:

• We show that the proposed sparse representation is general, as sequence labeling models trained on them achieve (near) state-of-the-art performances for both POS tagging and NER.
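A minimal sketch of the dense-to-sparse-indicator step follows. Random vectors stand in for pre-trained embeddings, and scikit-learn's DictionaryLearning stands in for whatever sparse-coding solver the paper used; the signed indicator naming scheme (dim17+/dim42-) is likewise only illustrative:

import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 50))  # |vocab| x dense dimension (toy data)

# Learn an overcomplete dictionary and sparse-code each embedding against it.
coder = DictionaryLearning(n_components=100, transform_algorithm="lasso_lars",
                           transform_alpha=1.0, max_iter=10, random_state=0)
codes = coder.fit_transform(embeddings)  # mostly-zero coefficient matrix

def indicator_features(word_idx):
    """Binary feature names for the non-zero coefficients of one word,
    split by sign; these feed a standard sequence labeler."""
    alpha = codes[word_idx]
    return [f"dim{j}{'+' if alpha[j] > 0 else '-'}"
            for j in np.flatnonzero(alpha)]

print(indicator_features(0))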

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 233–246, 2017. Action Editor: Patrick Pantel. Submission batch: 11/2016; Revision batch: 2/2017; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Domain-Targeted, High Precision Knowledge Extraction

Bhavana Dalvi Mishra, Niket Tandon, Peter Clark (Allen Institute for Artificial Intelligence, 2157 N Northlake Way Suite 110, Seattle, WA 98103). {bhavanad,nikett,peterc}@allenai.org

Abstract: Our goal is to construct a domain-targeted, high precision knowledge base (KB), containing general (subject, predicate, object) statements about the world, in support of a downstream question-answering (QA) application. Despite recent advances in information extraction (IE) techniques, no suitable resource for our task already exists; existing resources are either too noisy, too named-entity centric, or too incomplete, and typically have not been constructed with a clear scope or purpose. To address these, we have created a domain-targeted, high precision knowledge extraction pipeline, leveraging Open IE, crowdsourcing, and a novel canonical schema learning algorithm (called CASI), that produces high precision knowledge targeted to a particular domain (in our case, elementary science). To measure the KB's coverage of the target domain's knowledge (its "comprehensiveness" with respect to science) we measure recall with respect to an independent corpus of domain text, and show that our pipeline produces output with over 80% precision and 23% recall with respect to that target, a substantially higher coverage of tuple-expressible science knowledge than other comparable resources. We have made the KB publicly available.¹ (¹This KB, named "Aristo Tuple KB", is available for download at http://data.allenai.org/tuple-kb)

1 Introduction

While there have been substantial advances in knowledge extraction techniques, the availability of high precision, general knowledge about the world remains elusive. Specifically, our goal is a large, high precision body of (subject, predicate, object) statements relevant to elementary science, to support a downstream QA application task. Although there are several impressive, existing resources that can contribute to our endeavor, e.g., NELL (Carlson et al., 2010), ConceptNet (Speer and Havasi, 2013), WordNet (Fellbaum, 1998), WebChild (Tandon et al., 2014), Yago (Suchanek et al., 2007), FreeBase (Bollacker et al., 2008), and ReVerb-15M (Fader et al., 2011), their applicability is limited by both limited coverage of general knowledge (e.g., FreeBase and NELL primarily contain knowledge about Named Entities; WordNet uses only a few […]

[…] at >80% precision over that corpus (its "comprehensiveness" with respect to science). This measure is similar to recall at the point P=80% on the PR curve, except measured against a domain-specific sample of data that reflects the distribution of the target domain knowledge. Comprehensiveness thus gives us an approximate notion of the completeness of the KB for (tuple-expressible) facts in our target domain, something that has been lacking in earlier KB construction research. We show that our KB has comprehensiveness (recall of domain facts at >80% precision) of 23% with respect to science, a substantially higher coverage² of tuple-expressible science knowledge than other comparable resources. We are making the KB publicly available. (²Aristo Tuple KB is available for download at http://allenai.org/data/aristo-tuple-kb)

Outline: We discuss the related work in Section 2. In Section 3, we describe the domain-targeted pipeline, including how the domain is characterized to the algorithm and the sequence of filters and predictors used. In Section 4, we describe how the relationships between predicates in the domain are identified and the more general predicates further populated. Finally in Section 5, we evaluate our approach, including evaluating its comprehensiveness (high-precision coverage of science knowledge).

2 Related Work

There has been substantial, recent progress in knowledge bases that (primarily) encode knowledge about Named Entities, including Freebase (Bollacker et al., 2008), Knowledge Vault (Dong et al., 2014), DBPedia (Auer et al., 2007), and others that hierarchically organize nouns and named entities, e.g., Yago (Suchanek et al., 2007). While these KBs are rich in facts about named entities, they are sparse in general knowledge about common nouns (e.g., that bears have fur). KBs covering general knowledge have received less attention, although there are some notable exceptions constructed using manual methods, e.g., WordNet (Fellbaum, 1998), crowdsourcing, e.g., ConceptNet (Speer and Havasi, 2013), and, more recently, using automated methods, e.g., WebChild (Tandon et al., 2014). While useful, these resources have been constructed to target only a small set of relations, providing only limited coverage for a domain of interest.

To overcome relation sparseness, the paradigm of Open IE (Banko et al., 2007; Soderland et al., 2013) extracts knowledge from text using an open set of relationships, and has been used to successfully build large-scale (arg1, relation, arg2) resources such as ReVerb-15M (containing 15 million general triples) (Fader et al., 2011). Although broad coverage, however, Open IE techniques typically produce noisy output. Our extraction pipeline can be viewed as an extension of the Open IE paradigm: we start with targeted Open IE output, and then apply a sequence of filters to substantially improve the…

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 205–218, 2017. Action Editor: Stefan Riezler. Submission batch: 12/2016; Revision batch: 2/2017; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Pushing the Limits of Translation Quality Estimation

André F. T. Martins (Unbabel; Instituto de Telecomunicações, Lisbon, Portugal; andre.martins@unbabel.com), Marcin Junczys-Dowmunt (Adam Mickiewicz University in Poznań, Poland; junczys@amu.edu.pl), Fabio N. Kepler (Unbabel; L2F/INESC-ID, Lisbon, Portugal; University of Pampa, Alegrete, Brazil; kepler@unbabel.com), Ramón Astudillo (Unbabel; L2F/INESC-ID, Lisbon, Portugal; ramon@unbabel.com), Chris Hokamp (Dublin City University, Dublin, Ireland; chokamp@computing.dcu.ie), Roman Grundkiewicz (Adam Mickiewicz University in Poznań, Poland; romang@amu.edu.pl)

Abstract: Translation quality estimation is a task of growing importance in NLP, due to its potential to reduce post-editing human effort in disruptive ways. However, this potential is currently limited by the relatively low accuracy of existing systems. In this paper, we achieve remarkable improvements by exploiting synergies between the related tasks of word-level quality estimation and automatic post-editing. First, we stack a new, carefully engineered, neural model into a rich feature-based word-level quality estimation system. Then, we use the output of an automatic post-editing system as an extra feature, obtaining striking results on WMT16: a word-level F1-MULT score of 57.47% (an absolute gain of +7.95% over the current state of the art), and a Pearson correlation score of 65.56% for sentence-level HTER prediction (an absolute gain of +13.36%).

1 Introduction

The goal of quality estimation (QE) is to evaluate a translation system's quality without access to reference translations (Blatz et al., 2004; Specia et al., 2013). This has many potential usages: informing an end user about the reliability of translated content; deciding if a translation is ready for publishing or if it requires human post-editing; highlighting the words that need to be changed. QE systems are particularly appealing for crowd-sourced and professional translation services, due to their potential to dramatically reduce post-editing times and to save labor costs (Specia, 2011). The increasing interest in this problem from an industrial angle comes as no surprise (Turchi et al., 2014; de Souza et al., 2015; Martins et al., 2016; Kozlova et al., 2016).

In this paper, we tackle word-level QE, whose goal is to assign a label of OK or BAD to each word in the translation (Figure 1). Past approaches to this problem include linear classifiers with handcrafted features (Ueffing and Ney, 2007; Biçici, 2013; Shah et al., 2013; Luong et al., 2014), often combined with feature selection (Avramidis, 2012; Beck et al., 2013), recurrent neural networks (de Souza et al., 2014; Kim and Lee, 2016), and systems that combine linear and neural models (Kreutzer et al., 2015; Martins et al., 2016). We start by proposing a "pure" QE system (§3) consisting of a new, carefully engineered neural model (NEURALQE), stacked into a linear feature-rich classifier (LINEARQE). Along the way, we provide a rigorous empirical analysis to better understand the contribution of the several groups of features and to justify the architecture of the neural system.

A second contribution of this paper is bringing in the related task of automatic post-editing (APE; Simard et al. (2007)), which aims to au…
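The stacking idea itself is generic and easy to illustrate. The toy below uses synthetic data and plain scikit-learn models (the paper's LINEARQE is a feature-rich sequential classifier and its stacked features are produced by jackknife/cross-fold prediction to avoid overfitting, both simplified away here): a neural model's probability for BAD becomes one extra feature of the linear classifier.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                  # stand-in handcrafted features
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0)  # stand-in OK/BAD labels

# First-level neural model; in proper stacking, its training-set predictions
# would come from held-out folds (jackknifing), not in-sample fitting.
neural = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                       random_state=0).fit(X, y)
neural_feat = neural.predict_proba(X)[:, [1]]   # stacked probability feature

# Second-level linear model sees the handcrafted features plus the stacked one.
linear = LogisticRegression().fit(np.hstack([X, neural_feat]), y)
print(linear.score(np.hstack([X, neural_feat]), y))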

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 135–146, 2017. Action Editor: Hinrich Schütze. Submission batch: 9/2016; Revision batch: 12/2016; Published 6/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Enriching Word Vectors with Subword Information

Piotr Bojanowski*, Edouard Grave*, Armand Joulin, Tomas Mikolov (Facebook AI Research). {bojanowski,egrave,ajoulin,tmikolov}@fb.com. *The first two authors contributed equally.

Abstract: Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. A vector representation is associated to each character n-gram; words are represented as the sum of these representations. Our method is fast, allowing models to be trained on large corpora quickly, and it allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, on both word similarity and analogy tasks. Comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.

1 Introduction

Learning continuous representations of words has a long history in natural language processing (Rumelhart et al., 1988). These representations are typically derived from large unlabeled corpora using co-occurrence statistics (Deerwester et al., 1990; Schütze, 1992; Lund and Burgess, 1996). A large body of work, known as distributional semantics, has studied the properties of these methods (Turney et al., 2010; Baroni and Lenci, 2010). In the neural network community, Collobert and Weston (2008) proposed to learn word embeddings using a feedforward neural network, predicting a word based on the two words to its left and the two words to its right. More recently, Mikolov et al. (2013b) proposed simple log-bilinear models to learn continuous representations of words on very large corpora efficiently.

Most of these techniques represent each word of the vocabulary by a distinct vector, without parameter sharing. In particular, they ignore the internal structure of words, which is an important limitation for morphologically rich languages, such as Turkish or Finnish. For example, in French or Spanish, most verbs have more than forty different inflected forms, while Finnish has fifteen cases for nouns. These languages contain many word forms that occur rarely (or not at all) in the training corpus, making it difficult to learn good word representations. Because many word formations follow rules, it is possible to improve vector representations for morphologically rich languages by using character-level information.

In this paper, we propose to learn representations for character n-grams, and to represent words as the sum of these n-gram vectors. Our main contribution is to introduce an extension of the continuous skipgram model (Mikolov et al., 2013b) which takes into account subword information. We evaluate this model on nine languages exhibiting different morphologies, showing the benefit of our approach.
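The bag-of-character-n-grams representation is compact enough to sketch directly. In the snippet below the n-gram table is random (in practice it is learned with the skipgram objective), the n-gram range and boundary symbols follow the paper, and the hashing trick mirrors the released fastText implementation in spirit only (fastText uses its own FNV-based hash and about 2 million buckets, not Python's hash):

import numpy as np

rng = np.random.default_rng(0)
TABLE, DIM = 100_000, 64                 # hash buckets x vector dimension (toy sizes)
ngram_table = rng.normal(size=(TABLE, DIM)).astype(np.float32)

def char_ngrams(word, nmin=3, nmax=6):
    w = f"<{word}>"                      # boundary symbols, as in the paper
    return [w[i:i + n] for n in range(nmin, nmax + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    """A word's vector is the sum of its character n-gram vectors
    (plus the special whole-word sequence <word>)."""
    grams = char_ngrams(word) + [f"<{word}>"]
    rows = [hash(g) % TABLE for g in grams]  # hashing trick for n-gram ids
    return ngram_table[rows].sum(axis=0)

# Out-of-vocabulary words still get vectors, since they share n-grams with
# seen words ("unhappiness" overlaps heavily with "happiness").
print(word_vector("unhappiness").shape)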

Read more »


Transactions of the Association for Computational Linguistics, vol. 5, pp. 101–115, 2017. Action Editor: Mark Johnson. Submission batch: 10/2016; Revision batch: 4/2017; Published 4/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Cross-Sentence N-ary Relation Extraction with Graph LSTMs

Nanyun Peng* (Center for Language and Speech Processing, Computer Science Department, Johns Hopkins University, Baltimore, Maryland, USA; npeng1@jhu.edu), Hoifung Poon, Chris Quirk, Wen-tau Yih (Microsoft Research, Redmond, Washington, USA; {hoifung,chrisq,scottyih}@microsoft.com), Kristina Toutanova* (Google Research, Seattle, Washington, USA; kristout@google.com). *This research was conducted when the authors were at Microsoft Research.

Abstract: Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential dependencies, such as sequential, syntactic, and discourse relations. A robust contextual representation is learned for the entities, which serves as input to the relation classifier. This simplifies handling of relations with arbitrary arity, and enables multi-task learning with related relations. We evaluate this framework in two important precision medicine settings, demonstrating its effectiveness with both conventional supervised learning and distant supervision. Cross-sentence extraction produced larger knowledge bases, and multi-task learning significantly improved extraction accuracy. A thorough analysis of various LSTM approaches yielded useful insight into the impact of linguistic analysis on extraction accuracy.

1 Introduction

Relation extraction has made great strides in newswire and Web domains. Recently, there has been increasing interest in applying relation extraction to high-value domains such as biomedicine. The advent of the $1000 human genome¹ heralds the dawn of precision medicine, but progress in personalized cancer treatment has been hindered by the arduous task of interpreting genomic data using prior knowledge. For example, given a tumor sequence, a molecular tumor board needs to determine which genes and mutations are important, and what drugs are available to treat them. Already the research literature has a wealth of relevant knowledge, and it is growing at an astonishing rate. PubMed², the online repository of biomedical articles, adds two new papers per minute, or one million each year. It is thus imperative to advance relation extraction for machine reading. (¹http://www.illumina.com/systems/hiseq-x-sequencing-system.html ²https://www.ncbi.nlm.nih.gov/pubmed)

In the vast literature on relation extraction, past work focused primarily on binary relations in single sentences, limiting the available information. Consider the following example: "The deletion mutation on exon-19 of EGFR gene was present in 16 patients, while the L858E point mutation on exon-21 was noted in 10. All patients were treated with gefitinib and showed a partial response.". Collectively, the two sentences convey the fact that there is a ternary interaction between the three entities in bold, which is not expressed in either sentence alone. Namely, tumors with L858E mutation in EGFR gene can be treated with gefitinib. Extracting such knowledge clearly requires moving beyond binary relations and single sentences. N-ary relations and cross-sentence extraction have received relatively little attention in the past. Prior…

Read more »