Transactions of the Association for Computational Linguistics, 1 (2013) 301–314. Action Editor: Jason Eisner.
Submitted 6/2012; Revised 10/2012; Published 7/2013. © 2013 Association for Computational Linguistics.

Data-driven, PCFG-based and Pseudo-PCFG-based Models for Chinese Dependency Parsing

Weiwei Sun and Xiaojun Wan
Institute of Computer Science and Technology, Peking University
The MOE Key Laboratory of Computational Linguistics, Peking University
{ws, wanxiaojun}@pku.edu.cn

Abstract

We present a comparative study of transition-, graph- and PCFG-based models aimed at illuminating more precisely the likely contribution of CFGs in improving Chinese dependency parsing accuracy, especially by combining heterogeneous models. Inspired by the impact of a constituency grammar on dependency parsing, we propose several strategies to acquire pseudo CFGs only from dependency annotations. Compared to linguistic grammars learned from rich phrase-structure treebanks, well designed pseudo grammars achieve similar parsing accuracy and have equivalent contributions to parser ensemble. Moreover, pseudo grammars increase the diversity of base models; therefore, together with all other models, further improve system combination. Based on automatic POS tagging, our final model achieves a UAS of 87.23%, resulting in a significant improvement of the state of the art.

1 Introduction

Popular approaches to dependency parsing can be divided into two classes: grammar-free and grammar-based. Data-driven, grammar-free approaches make essential use of machine learning from linguistic annotations in order to parse new sentences. Such approaches, e.g. transition-based (Nivre, 2008) and graph-based (McDonald, 2006; Torres Martins et al., 2009), have attracted the most attention in recent years. In contrast, grammar-based approaches rely on linguistic grammars (in either dependency or constituency formalisms) to shape the search space for possible syntactic analysis. In particular, CFG-based dependency parsing exploits a mapping between dependency and constituency representations and reuses parsing algorithms developed for CFG to produce dependency structures. In previous work, data-driven, discriminative approaches have been widely discussed for Chinese dependency parsing. On the other hand, various PCFG-based constituent parsing methods have been applied to obtain phrase-structures as well. With rich linguistic rules, phrase-structures of Chinese sentences can be well transformed to their corresponding dependency structures (Xue, 2007). Therefore, PCFG parsers with such conversion rules can be taken as another type of dependency parser. We call them PCFG-based parsers in this paper.

Explicitly defining linguistic rules to express precisely generic grammatical regularities, a constituency grammar can be applied to arrange sentences into a hierarchy of nested phrases, which determines constructions between larger phrases and their smaller component phrases. This type of information is different from, but highly related to, the information captured by a dependency representation. A constituency grammar, therefore, has great possible contributions to dependency parsing. In order to pave the way for new and better methods, we study the impact of CFGs on Chinese dependency parsing. A series of empirical analyses of state-of-the-art graph-, transition- and PCFG-based parsers is presented to illuminate more precisely the properties of heterogeneous models. We show that CFGs have a great impact on dependency parsing and PCFG-based models have complementary predictive powers to data-driven models.

System ensemble is an effective and important technique to build more accurate parsers based on multiple, diverse, weaker models. Exploiting differ-
Transactions of the Association for Computational Linguistics, 1 (2013) 291–300. Action Editor: Chris Callison-Burch.
Submitted 5/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Large-scale Word Alignment Using Soft Dependency Cohesion Constraints

Zhiguo Wang and Chengqing Zong
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
{zgwang, cqzong}@nlpr.ia.ac.cn

Abstract

Dependency cohesion refers to the observation that phrases dominated by disjoint dependency subtrees
Transactions of the Association for Computational Linguistics, 1 (2013) 279–290. Action Editor: Lillian Lee.
Submitted 11/2012; Revised 1/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Good, Great, Excellent: Global Inference of Semantic Intensities

Gerard de Melo, ICSI, Berkeley, demelo@icsi.berkeley.edu
Mohit Bansal, CS Division, UC Berkeley, mbansal@cs.berkeley.edu

Abstract

Adjectives like good, great, and excellent are similar in meaning, but differ in intensity. Intensity order information is very useful for language learners as well as in several NLP tasks, but is missing in most lexical resources (dictionaries, WordNet, and thesauri). In this paper, we present a primarily unsupervised approach that uses semantics from Web-scale data (e.g., phrases like good but not excellent) to rank words by assigning them positions on a continuous scale. We rely on Mixed Integer Linear Programming to jointly determine the ranks, such that individual decisions benefit from global information. When ranking English adjectives, our global algorithm achieves substantial improvements over previous work on both pairwise and rank correlation metrics (specifically, 70% pairwise accuracy as compared to only 56% by previous work). Moreover, our approach can incorporate external synonymy information (increasing its pairwise accuracy to 78%) and extends easily to new languages. We also make our code and data freely available.¹

1 Introduction

Current lexical resources such as dictionaries and thesauri do not provide information about the intensity order of words. For example, both WordNet (Miller, 1995) and Roget's 21st Century Thesaurus (thesaurus.com) present acceptable, great, and superb as synonyms of the adjective good. However, a native speaker knows that these words represent varying intensity and can in fact generally be ranked by intensity as acceptable
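The global objective de Melo and Bansal describe can be illustrated in miniature. The sketch below replaces their Mixed Integer Linear Program with exhaustive search over total orders, which is only feasible for a handful of words but optimizes the same kind of globally consistent criterion; the pattern counts and word set are invented for illustration, not taken from their data:

```python
from itertools import permutations

def rank_globally(words, weaker_than):
    """Pick the total intensity order most consistent with pairwise evidence.

    `weaker_than[(a, b)]` is an (invented) count of Web patterns such as
    'a but not b' suggesting a is weaker than b.  Every candidate order is
    scored by the total evidence it agrees with, so each individual pair
    decision benefits from global information, as in the paper's MILP.
    """
    def score(order):
        pos = {w: i for i, w in enumerate(order)}
        return sum(c for (a, b), c in weaker_than.items() if pos[a] < pos[b])
    return max(permutations(words), key=score)
```

For example, counts favoring good < great, good < excellent, and great < excellent yield the order ("good", "great", "excellent") even when a few noisy counts point the other way.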
Transactions of the Association for Computational Linguistics, 1 (2013) 267–278. Action Editor: Brian Roark.
Submitted 3/2013; Published 7/2013. © 2013 Association for Computational Linguistics.

Efficient Parsing for Head-Split Dependency Trees

Giorgio Satta, Dept. of Information Engineering, University of Padua, Italy, satta@dei.unipd.it
Marco Kuhlmann, Dept. of Linguistics and Philology, Uppsala University, Sweden, marco.kuhlmann@lingfil.uu.se

Abstract

Head splitting techniques have been successfully exploited to improve the asymptotic runtime of parsing algorithms for projective dependency trees, under the arc-factored model. In this article we extend these techniques to a class of non-projective dependency trees, called well-nested dependency trees with block-degree at most 2, which has been previously investigated in the literature. We define a structural property that allows head splitting for these trees, and present two algorithms that improve over the runtime of existing algorithms at no significant loss in coverage.

1 Introduction

Much of the recent work on dependency parsing has been aimed at finding a good balance between accuracy and efficiency. For one end of the spectrum, Eisner (1997) showed that the highest-scoring projective dependency tree under an arc-factored model can be computed in time O(n³), where n is the length of the input string. Later work has focused on making projective parsing viable under more expressive models (Carreras, 2007; Koo and Collins, 2010). At the same time, it has been observed that for many standard data sets, the coverage of projective trees is far from complete (Kuhlmann and Nivre, 2006), which has led to an interest in parsing algorithms for non-projective trees. While non-projective parsing under an arc-factored model can be done in time O(n²) (McDonald et al., 2005), parsing with more informed models is intractable (McDonald and Satta, 2007). This has led several authors to investigate 'mildly non-projective' classes of trees, with the goal of achieving a balance between expressiveness and complexity (Kuhlmann and Nivre, 2006).

In this article we focus on a class of mildly non-projective dependency structures called well-nested dependency trees with block-degree at most 2. This class was first introduced by Bodirsky et al. (2005), who showed that it corresponds, in a natural way, to the class of derivation trees of lexicalized tree-adjoining grammars (Joshi and Schabes, 1997). While there are linguistic arguments against the restriction to this class (Maier and Lichte, 2011; Chen-Main and Joshi, 2010), Kuhlmann and Nivre (2006) found that it has excellent coverage on standard data sets. Assuming an arc-factored model, well-nested dependency trees with block-degree ≤ 2 can be parsed in time O(n⁷) using the algorithm of Gómez-Rodríguez et al. (2011). Recently, Pitler et al. (2012) have shown that if an additional restriction called 1-inherit is imposed, parsing can be done in time O(n⁶), without any additional loss in coverage on standard data sets.

Standard context-free parsing methods, when adapted to the parsing of projective trees, provide O(n⁵) time complexity. The O(n³) time result reported by Eisner (1997) has been obtained by exploiting more sophisticated dynamic programming techniques that 'split' dependency trees at the position of their heads, in order to save bookkeeping. Splitting techniques have also been exploited to speed up parsing time for other lexicalized formalisms, such as bilexical context-free grammars and head automata (Eisner and Satta, 1999). However, to our knowledge no attempt has been made in the literature to extend these techniques to non-projective dependency parsing. In this article we leverage the central idea from Eisner's algorithm and extend it to the class of well-nested dependency trees with block-degree at most 2.
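The O(n³) head-splitting result for projective trees that this article builds on is compact enough to implement directly. Below is a minimal sketch of Eisner's algorithm under an arc-factored model, with an invented toy score matrix; it is the classical projective baseline, not the authors' extension to well-nested non-projective trees:

```python
def eisner(score):
    """Highest-scoring projective dependency tree (Eisner-style DP).

    score[h][m] is the score of attaching word m to head h; index 0 is
    the artificial ROOT.  Runs in O(n^3) time by 'splitting' each head's
    left and right dependents into separate half-spans.
    Returns (best_score, heads) with heads[0] = -1.
    """
    n = len(score)
    NEG = float("-inf")
    # inc/com[s][t][d]: incomplete/complete span over [s, t];
    # d = 1 means the head is at s, d = 0 means the head is at t.
    inc = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    com = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    inc_bp = [[[0, 0] for _ in range(n)] for _ in range(n)]
    com_bp = [[[0, 0] for _ in range(n)] for _ in range(n)]
    for s in range(n):
        com[s][s][0] = com[s][s][1] = 0.0
    for length in range(1, n):
        for s in range(n - length):
            t = s + length
            # incomplete spans: add the arc s -> t or t -> s
            for r in range(s, t):
                val = com[s][r][1] + com[r + 1][t][0]
                if val + score[t][s] > inc[s][t][0]:
                    inc[s][t][0], inc_bp[s][t][0] = val + score[t][s], r
                if val + score[s][t] > inc[s][t][1]:
                    inc[s][t][1], inc_bp[s][t][1] = val + score[s][t], r
            # complete spans: incomplete half plus a complete half
            for r in range(s, t):
                if com[s][r][0] + inc[r][t][0] > com[s][t][0]:
                    com[s][t][0], com_bp[s][t][0] = com[s][r][0] + inc[r][t][0], r
            for r in range(s + 1, t + 1):
                if inc[s][r][1] + com[r][t][1] > com[s][t][1]:
                    com[s][t][1], com_bp[s][t][1] = inc[s][r][1] + com[r][t][1], r
    heads = [-1] * n
    def backtrack(s, t, d, complete):
        if s == t:
            return
        if complete:
            r = com_bp[s][t][d]
            if d == 1:
                backtrack(s, r, 1, False); backtrack(r, t, 1, True)
            else:
                backtrack(s, r, 0, True); backtrack(r, t, 0, False)
        else:
            r = inc_bp[s][t][d]
            heads[t if d == 1 else s] = s if d == 1 else t
            backtrack(s, r, 1, True); backtrack(r + 1, t, 0, True)
    backtrack(0, n - 1, 1, True)
    return com[0][n - 1][1], heads
```

The four-table split (complete/incomplete, head-left/head-right) is exactly the bookkeeping trick the article generalizes to block-degree-2 trees.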
Transactions of the Association for Computational Linguistics, 1 (2013) 255–266. Action Editor: Kristina Toutanova.
Submitted 11/2012; Published 5/2013. © 2013 Association for Computational Linguistics.

Minimally-Supervised Morphological Segmentation using Adaptor Grammars

Kairit Sirts, Institute of Cybernetics, Tallinn University of Technology, sirts@phon.ioc.ee
Sharon Goldwater, School of Informatics, The University of Edinburgh, sgwater@inf.ed.ac.uk

Abstract

This paper explores the use of Adaptor Grammars, a nonparametric Bayesian modelling framework, for minimally supervised morphological segmentation. We compare three training methods: unsupervised training, semi-supervised training, and a novel model selection method. In the model selection method, we train unsupervised Adaptor Grammars using an over-articulated metagrammar, then use a small labelled data set to select which potential morph boundaries identified by the metagrammar should be returned in the final output. We evaluate on five languages and show that semi-supervised training provides a boost over unsupervised training, while the model selection method yields the best average results over all languages and is competitive with state-of-the-art semi-supervised systems. Moreover, this method provides the potential to tune performance according to different evaluation metrics or downstream tasks.

1 Introduction

Research into unsupervised learning of morphology has a long history, starting with the work of Harris (1951). While early research was mostly motivated by linguistic interests, more recent work in NLP often aims to reduce data sparsity in morphologically rich languages for tasks such as automatic speech recognition, statistical machine translation, or automatic text generation. For these applications, however, completely unsupervised systems may not be ideal if even a small amount of segmented training data is available. In this paper, we explore the use of Adaptor Grammars (Johnson et al., 2007) for minimally supervised morphological segmentation.

Adaptor Grammars (AGs) are a nonparametric Bayesian modelling framework that can learn latent tree structures over an input corpus of strings. For example, they can be used to define a morphological grammar where each word consists of zero or more prefixes, a stem, and zero or more suffixes; the actual forms of these morphs (and the segmentation of words into morphs) are learned from the data. In this general approach AGs are similar to many other unsupervised morphological segmentation systems, such as Linguistica (Goldsmith, 2001) and the Morfessor family (Creutz and Lagus, 2007). A major difference, however, is that the morphological grammar is specified as an input to the program, rather than hard-coded, which allows different grammars to be explored easily. For the task of segmenting utterances into words, for example, Johnson and colleagues have experimented with grammars encoding different kinds of sub-word and super-word structure (e.g., syllables and collocations), showing that the best grammars far outperform other systems on the same corpora (Johnson, 2008a; Johnson and Goldwater, 2009; Johnson and Demuth, 2010).

These word segmentation papers demonstrated both the power of the AG approach and the synergistic behavior that occurs when learning multiple levels of structure simultaneously. However, the best-performing grammars were selected using the same corpus that was used for final testing, and each paper dealt with only one language. The ideal unsupervised learner would use a single grammar tuned on
Transactions of the Association for Computational Linguistics, 1 (2013) 231–242. Action Editor: Noah Smith.
Submitted 11/2012; Revised 2/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Modeling Semantic Relations Expressed by Prepositions

Vivek Srikumar and Dan Roth
University of Illinois, Urbana-Champaign, Urbana, IL 61801
{vsrikum2, danr}@illinois.edu

Abstract

This paper introduces the problem of predicting semantic relations expressed by prepositions and develops statistical learning models for predicting the relations, their arguments and the semantic types of the arguments. We define an inventory of 32 relations, building on the word sense disambiguation task for prepositions and collapsing related senses across prepositions. Given a preposition in a sentence, our computational task is to jointly model the preposition relation and its arguments along with their semantic types, as a way to support the relation prediction. The annotated data, however, only provides labels for the relation label, and not the arguments and types. We address this by presenting two models for preposition relation labeling. Our generalization of latent structure SVM gives close to 90% accuracy on relation labeling. Further, by jointly predicting the relation, arguments, and their types along with preposition sense, we show that we can not only improve the relation accuracy, but also significantly improve sense prediction accuracy.

1 Introduction

This paper addresses the problem of predicting semantic relations conveyed by prepositions in text. Prepositions express many semantic relations between their governor and object. Predicting these can help advancing text understanding tasks like question answering and textual entailment. Consider the sentence:

(1) The book of Prof. Alexander on primary school methods is a valuable teaching resource.

Here, the preposition on indicates that the book and primary school methods are connected by the relation Topic and of indicates the Creator-Creation relation between Prof. Alexander and the book. Predicting these relations can help answer questions about the subject of the book and also recognize the entailment of sentences like Prof. Alexander has written about primary school methods.

Being highly polysemous, the same preposition can indicate different kinds of relations, depending on its governor and object. Furthermore, several prepositions can indicate the same semantic relation. For example, consider the sentence:

(2) Poor care led to her death from pneumonia.

The preposition from in this sentence expresses the relation Cause(death, pneumonia). In a different context, it can denote other relations, as in the phrases copied from the film (Source) and recognized from the start (Temporal). On the other hand, the relation Cause can be expressed by several prepositions; for example, the following phrases express a Cause relation: died of pneumonia and tired after the surgery.

We characterize semantic relations expressed by transitive prepositions and develop accurate models for predicting the relations, identifying their arguments and recognizing the semantic types of the arguments. Building on the word sense disambiguation task for prepositions, we collapse semantically related senses across prepositions to derive our relation inventory. These relations act as predicates in a predicate-argument representation, where the arguments are the governor and the object of the
Transactions of the Association for Computational Linguistics, 1 (2013) 219–230. Action Editor: Brian Roark.
Submitted 1/2013; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Joint Arc-factored Parsing of Syntactic and Semantic Dependencies

Xavier Lluís and Xavier Carreras and Lluís Màrquez
TALP Research Center, Universitat Politècnica de Catalunya
Jordi Girona 1–3, 08034 Barcelona
{xlluis, carreras, lluism}@lsi.upc.edu

Abstract

In this paper we introduce a joint arc-factored model for syntactic and semantic dependency parsing. The semantic role labeler predicts the full syntactic paths that connect predicates with their arguments. This process is framed as a linear assignment task, which allows us to control some well-formedness constraints. For the syntactic part, we define a standard arc-factored dependency model that predicts the full syntactic tree. Finally, we employ dual decomposition techniques to produce consistent syntactic and predicate-argument structures while searching over a large space of syntactic configurations. In experiments on the CoNLL-2009 English benchmark we observe very competitive results.

1 Introduction

Semantic role labeling (SRL) is the task of identifying the arguments of lexical predicates in a sentence and labeling them with semantic roles (Gildea and Jurafsky, 2002; Màrquez et al., 2008). SRL is an important shallow semantic task in NLP since predicate-argument relations directly represent semantic properties of the type "who" did "what" to "whom", "how", and "why" for events expressed by predicates (typically verbs and nouns).

Predicate-argument relations are strongly related to the syntactic structure of the sentence: the majority of predicate arguments correspond to some syntactic constituent, and the syntactic structure that connects an argument with the predicate is a strong indicator of its semantic role. Actually, semantic roles represent an abstraction of the syntactic form of a predicative event. While syntactic functions of arguments change with the form of the event (e.g., active vs. passive forms), the semantic roles of arguments remain invariant to their syntactic realization. Consequently, since the first works, SRL systems have assumed access to the syntactic structure of the sentence (Gildea and Jurafsky, 2002; Carreras and Màrquez, 2005). A simple approach is to obtain the parse trees as a pre-process to the SRL system, which allows the use of unrestricted features of the syntax. However, as in other pipeline approaches in NLP, it has been shown that the errors of the syntactic parser severely degrade the predictions of the SRL model (Gildea and Palmer, 2002). A common approach to alleviate this problem is to work with multiple alternative syntactic trees and let the SRL system optimize over any input tree or part of it (Toutanova et al., 2008; Punyakanok et al., 2008). As a step further, more recent work has proposed parsing models that predict syntactic structure augmented with semantic predicate-argument relations (Surdeanu et al., 2008; Hajič et al., 2009; Johansson, 2009; Titov et al., 2009; Lluís et al., 2009), which is the focus of this paper. These joint models should favor the syntactic structure that is most consistent with the semantic predicate-argument structures of a sentence. In principle, these models can exploit syntactic and semantic features simultaneously, and could potentially improve the accuracy for both syntactic and semantic relations.

One difficulty in the design of joint syntactic-semantic parsing models is that there exist important structural divergences between the two layers.
Transactions of the Association for Computational Linguistics, 1 (2013) 207–218. Action Editor: Ben Taskar.
Submitted 10/2012; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Dual Coordinate Descent Algorithms for Efficient Large Margin Structured Prediction

Ming-Wei Chang and Wen-tau Yih
Microsoft Research, Redmond, WA 98052, USA
{minchang, scottyih}@microsoft.com

Abstract

Due to the nature of complex NLP problems, structured prediction algorithms have been important modeling tools for a wide range of tasks. While there exists evidence showing that the linear Structural Support Vector Machine (SSVM) algorithm performs better than the structured Perceptron, the SSVM algorithm is still less frequently chosen in the NLP community because of its relatively slow training speed. In this paper, we propose a fast and easy-to-implement dual coordinate descent algorithm for SSVMs. Unlike algorithms such as Perceptron and stochastic gradient descent, our method keeps track of dual variables and updates the weight vector more aggressively. As a result, this training process is as efficient as existing online learning methods, and yet derives consistently better models, as evaluated on four benchmark NLP datasets for part-of-speech tagging, named-entity recognition and dependency parsing.

1 Introduction

Complex natural language processing tasks are inherently structured. From sequence labeling problems like part-of-speech tagging and named entity recognition to tree construction tasks like syntactic parsing, strong dependencies exist among the labels of individual components. By modeling such relations in the output space, structured output prediction algorithms have been shown to outperform significantly simple binary or multi-class classifiers (Lafferty et al., 2001; Collins, 2002; McDonald et al., 2005).

Among the existing structured output prediction algorithms, the linear Structural Support Vector Machine (SSVM) algorithm (Tsochantaridis et al., 2004; Joachims et al., 2009) has shown outstanding performance in several NLP tasks, such as bilingual word alignment (Moore et al., 2007), constituency and dependency parsing (Taskar et al., 2004b; Koo et al., 2007), sentence compression (Cohn and Lapata, 2009) and document summarization (Li et al., 2009). Nevertheless, as a learning method for NLP, the SSVM algorithm has been less popular than algorithms such as the structured Perceptron (Collins, 2002). This may be due to the fact that current SSVM implementations often suffer from several practical issues. First, state-of-the-art implementations of SSVM such as cutting plane methods (Joachims et al., 2009) are typically complicated.¹ Second, while methods like stochastic gradient descent are simple to implement, tuning learning rates can be difficult. Finally, while SSVM models can achieve superior accuracy, this often requires long training time.

In this paper, we propose a novel optimization method for efficiently training linear SSVMs. Our method not only is easy to implement, but also has excellent training speed, competitive with both structured Perceptron (Collins, 2002) and MIRA (Crammer et al., 2005). When evaluated on several NLP tasks, including POS tagging, NER and dependency parsing, this optimization method also outperforms other approaches in terms of prediction accuracy. Our final algorithm is a dual coordinate

¹Our algorithm is easy to implement mainly because we use the squared hinge loss function.
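The "keep track of dual variables, update the weight vector aggressively" idea is easiest to see in the binary special case. The sketch below is a standard dual coordinate descent solver for a linear SVM with the squared hinge loss (the loss the footnote mentions), not the authors' structured algorithm; the toy data and hyperparameters are invented:

```python
import random

def dcd_train(X, y, C=1.0, epochs=50, seed=0):
    """Dual coordinate descent for a binary linear SVM with squared hinge loss:
    min_w 0.5*||w||^2 + C * sum_i max(0, 1 - y_i * w.x_i)^2.

    One dual variable alpha_i is kept per example; each coordinate step
    solves for alpha_i exactly and updates w in place, so no learning
    rate needs to be tuned (unlike stochastic gradient descent)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    alpha = [0.0] * n
    diag = 1.0 / (2.0 * C)  # extra diagonal term from the squared loss
    qii = [sum(v * v for v in x) + diag for x in X]
    rng = random.Random(seed)
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)  # visit coordinates in random order
        for i in order:
            xi, yi = X[i], y[i]
            # gradient of the dual objective with respect to alpha_i
            g = yi * sum(wj * xj for wj, xj in zip(w, xi)) - 1.0 + diag * alpha[i]
            new_a = max(alpha[i] - g / qii[i], 0.0)
            delta = new_a - alpha[i]
            if delta != 0.0:
                alpha[i] = new_a
                for j in range(d):  # maintain w = sum_i alpha_i * y_i * x_i
                    w[j] += delta * yi * xi[j]
    return w

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
```

Each update is as cheap as a Perceptron step, but because alpha is retained across passes the solver converges to the regularized large-margin optimum rather than merely a separating hyperplane.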
Transactions of the Association for Computational Linguistics, 1 (2013) 193–206. Action Editor: Jason Eisner.
Submitted 10/2012; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Jointly Learning to Parse and Perceive: Connecting Natural Language to the Physical World

Jayant Krishnamurthy, Computer Science Department, Carnegie Mellon University, jayantk@cs.cmu.edu
Thomas Kollar, Computer Science Department, Carnegie Mellon University, tkollar@andrew.cmu.edu

Abstract

This paper introduces Logical Semantics with Perception (LSP), a model for grounded language acquisition that learns to map natural language statements to their referents in a physical environment. For example, given an image, LSP can map the statement "blue mug on the table" to the set of image segments showing blue mugs on tables. LSP learns physical representations for both categorical ("blue," "mug") and relational ("on") language, and also learns to compose these representations to produce the referents of entire statements. We further introduce a weakly supervised training procedure that estimates LSP's parameters using annotated referents for entire statements, without annotated referents for individual words or the parse structure of the statement. We perform experiments on two applications: scene understanding and geographical question answering. We find that LSP outperforms existing, less expressive models that cannot represent relational language. We further find that weakly supervised training is competitive with fully supervised training while requiring significantly less annotation effort.

1 Introduction

Learning the mapping from natural language to physical environments is a central problem for natural language semantics. Understanding this mapping is necessary to enable natural language interactions with robots and other embodied systems. For example, for an autonomous robot to understand the sentence "The blue mug is on the table," it must be able to identify (1) the objects in its environment corresponding to "blue mug" and "table," and (2) the objects which participate in the spatial relation denoted by "on." If the robot can successfully identify these objects, it understands the meaning of the sentence.

The problem of learning to map from natural language expressions to their referents in an environment is known as grounded language acquisition. In embodied settings, environments consist of raw sensor data – for example, an environment could be an image collected from a robot's camera. In such applications, grounded language acquisition has two subproblems: parsing, learning the compositional structure of natural language; and perception, learning the environmental referents of individual words. Acquiring both kinds of knowledge is necessary to understand novel language in novel environments.

Unfortunately, perception is often ignored in work on language acquisition. Other variants of grounded language acquisition eliminate the need for perception by assuming access to a logical representation of the environment (Zettlemoyer and Collins, 2005; Wong and Mooney, 2006; Matuszek et al., 2010; Chen and Mooney, 2011; Liang et al., 2011). The existing work which has jointly addressed both parsing and perception has significant drawbacks, including: (1) fully supervised models requiring large amounts of manual annotation and (2) limited semantic representations (Kollar et al., 2010; Tellex et al., 2011; Matuszek et al., 2012).

This paper introduces Logical Semantics with Perception (LSP), a model for grounded language acquisition that jointly learns to semantically parse language and perceive the world. LSP models a mapping from natural language queries to sets of objects in a real-world environment. The input to LSP is an environment containing objects, such as a seg-
Transactions of the Association for Computational Linguistics, 1 (2013) 179–192. Action Editor: Johan Bos.
Submitted 1/2013; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Combined Distributional and Logical Semantics

Mike Lewis, School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK, mike.lewis@ed.ac.uk
Mark Steedman, School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK, steedman@inf.ed.ac.uk

Abstract

We introduce a new approach to semantics which combines the benefits of distributional and formal logical semantics. Distributional models have been successful in modelling the meanings of content words, but logical semantics is necessary to adequately represent many function words. We follow formal semantics in mapping language to logical representations, but differ in that the relational constants used are induced by offline distributional clustering at the level of predicate-argument structure. Our clustering algorithm is highly scalable, allowing us to run on corpora the size of Gigaword. Different senses of a word are disambiguated based on their induced types. We outperform a variety of existing approaches on a wide-coverage question answering task, and demonstrate the ability to make complex multi-sentence inferences involving quantifiers on the FraCaS suite.

1 Introduction

Mapping natural language to meaning representations is a central challenge of NLP. There has been much recent progress in unsupervised distributional semantics, in which the meaning of a word is induced based on its usage in large corpora. This approach is useful for a range of key applications including question answering and relation extraction (Lin and Pantel, 2001; Poon and Domingos, 2009; Yao et al., 2011). Because such a semantics can be automatically induced, it escapes the limitation of depending on relations from hand-built training data, knowledge bases or ontologies, which have proved of limited use in capturing the huge variety of meanings that can be expressed in language.

However, distributional semantics has largely developed in isolation from the formal semantics literature. Whilst distributional semantics has been effective in modelling the meanings of content words such as nouns and verbs, it is less clear that it can be applied to the meanings of function words. Semantic operators, such as determiners, negation, conjunctions, modals, tense, mood, aspect, and plurals are ubiquitous in natural language, and are crucial for high performance on many practical applications – but current distributional models struggle to capture even simple examples. Conversely, computational models of formal semantics have shown low recall on practical applications, stemming from their reliance on ontologies such as WordNet (Miller, 1995) to model the meanings of content words (Bobrow et al., 2007; Bos and Markert, 2005).

For example, consider what is needed to answer a question like Did Google buy YouTube? from the following sentences:

1. Google purchased YouTube
2. Google's acquisition of YouTube
3. Google acquired every company
4. YouTube may be sold to Google
5. Google will buy YouTube or Microsoft
6. Google didn't take over YouTube

All of these require knowledge of lexical semantics (e.g. that buy and purchase are synonyms), but some also need interpretation of quantifiers, negatives, modals and disjunction. It seems unlikely that
Transactions of the Association for Computational Linguistics, 1 (2013) 165–178. Action Editor: David Chiang.
Submitted 11/2012; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Learning to translate with products of novices: a suite of open-ended challenge problems for teaching MT

Adam Lopez,¹ Matt Post,¹ Chris Callison-Burch,¹² Jonathan Weese, Juri Ganitkevitch, Narges Ahmidi, Olivia Buzek, Leah Hanson, Beenish Jamil, Matthias Lee, Ya-Ting Lin, Henry Pao, Fatima Rivera, Leili Shahriyari, Debu Sinha, Adam Teichert, Stephen Wampler, Michael Weinberger, Daguang Xu, Lin Yang, and Shang Zhao*
Department of Computer Science, Johns Hopkins University
¹ Human Language Technology Center of Excellence, Johns Hopkins University
² Computer and Information Science Department, University of Pennsylvania

Abstract

Machine translation (MT) draws from several different disciplines, making it a complex subject to teach. There are excellent pedagogical texts, but problems in MT and current algorithms for solving them are best learned by doing. As a centerpiece of our MT course, we devised a series of open-ended challenges for students in which the goal was to improve performance on carefully constrained instances of four key MT tasks: alignment, decoding, evaluation, and reranking. Students brought a diverse set of techniques to the problems, including some novel solutions which performed remarkably well. A surprising and exciting outcome was that student solutions or their combinations fared competitively on some tasks, demonstrating that even newcomers to the field can help improve the state-of-the-art on hard NLP problems while simultaneously learning a great deal. The problems, baseline code, and results are freely available.

1 Introduction

A decade ago, students interested in natural language processing arrived at universities having been exposed to the idea of machine translation (MT) primarily through science fiction. Today, incoming students have been exposed to services like Google Translate since they were in secondary school or earlier. For them, MT is science fact. So it makes sense to teach statistical MT, either on its own or as a unit in a class on natural language processing (NLP), machine learning (ML), or artificial intelligence (AI). A course that promises to show students how Google Translate works and teach them how to build something like it is especially appealing, and several universities and summer schools now offer such classes. There are excellent introductory texts – depending on the level of detail required, instructors can choose from a comprehensive MT textbook (Koehn, 2010), a chapter of a popular NLP textbook (Jurafsky and Martin, 2009), a tutorial survey (Lopez, 2008), or an intuitive tutorial on the IBM Models (Knight, 1999b), among many others.

But MT is not just an object of academic study. It's a real application that isn't fully perfected, and the best way to learn about it is to build an MT system. This can be done with open-source toolkits such as Moses (Koehn et al., 2007), cdec (Dyer et al., 2010), or Joshua (Ganitkevitch et al., 2012), but these systems are not designed for pedagogy. They are mature codebases featuring tens of thousands of source code lines, making it difficult to focus on their core algorithms. Most tutorials present them as black boxes. But our goal is for students to learn the key techniques in MT, and ideally to learn by doing. Black boxes are incompatible with this goal.

We solve this dilemma by presenting students with concise, fully-functioning, self-contained components of a statistical MT system: word alignment, decoding, evaluation, and reranking. Each implementation consists of a naïve baseline algorithm in less than 150 lines of Python code. We assign them to students as open-ended challenges in which the goal is to improve performance on objective evaluation metrics as much as possible. This setting mirrors evaluations conducted by the NLP research

* The first five authors were instructors and the remaining authors were students in the work described here. This research was conducted while Chris Callison-Burch was at Johns Hopkins University.
计算语言学协会会刊, 1 (2013) 151–164. 动作编辑器: Patrick Pantel.
Transactions of the Association for Computational Linguistics, 1 (2013) 151–164. Action Editor: Patrick Pantel. Submitted 12/2012; Revised 2/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Dijkstra-WSA: A Graph-Based Approach to Word Sense Alignment

Michael Matuschek‡ and Iryna Gurevych†‡
† Ubiquitous Knowledge Processing Lab (UKP-DIPF), German Institute for Educational Research and Educational Information, Schloßstr. 29, 60486 Frankfurt, Germany
‡ Ubiquitous Knowledge Processing Lab (UKP-TUDA), Department of Computer Science, Technische Universität Darmstadt, Hochschulstr. 10, 64289 Darmstadt, Germany
http://www.ukp.tu-darmstadt.de

Abstract

In this paper, we present Dijkstra-WSA, a novel graph-based algorithm for word sense alignment. We evaluate it on four different pairs of lexical-semantic resources with different characteristics (WordNet-OmegaWiki, WordNet-Wiktionary, GermaNet-Wiktionary and WordNet-Wikipedia) and show that it achieves competitive performance on 3 out of 4 datasets. Dijkstra-WSA outperforms the state of the art on every dataset if it is combined with a back-off based on gloss similarity. We also demonstrate that Dijkstra-WSA is not only flexibly applicable to different resources but also highly parameterizable to optimize for precision or recall.

1 Introduction

Lexical-semantic resources (LSRs) are a cornerstone for many Natural Language Processing (NLP) applications such as word sense disambiguation (WSD) and information extraction. However, the growing demand for large-scale resources in different languages is hard to meet. The Princeton WordNet (WN) (Fellbaum, 1998) is widely used for English, but for most languages corresponding resources are considerably smaller or missing. Collaboratively constructed resources like Wiktionary (WKT) and OmegaWiki (OW) provide a viable option for such cases and seem especially suitable for smaller languages (Matuschek et al., 2013), but there are still considerable gaps in coverage which need to be filled. A related problem is that there usually does not exist a single resource which works best for all purposes, as different LSRs cover different words, senses and information types.

These considerations have sparked increasing research efforts in the area of word sense alignment (WSA). It has been shown that aligned resources can indeed lead to better performance than using the resources individually. Examples include semantic parsing using FrameNet (FN), WN, and VerbNet (VN) (Shi and Mihalcea, 2005), word sense disambiguation using an alignment of WN and Wikipedia (WP) (Navigli and Ponzetto, 2012) and semantic role labeling using a combination of PropBank, VN and FN in the SemLink project (Palmer, 2009). Some of these approaches to WSA either rely heavily on manual labor (e.g. Shi and Mihalcea (2005)) or on information which is only present in few resources such as the most frequent sense (MFS) (Suchanek et al., 2008). This makes it difficult to apply them to a larger set of resources.

In earlier work, we presented the large-scale resource UBY (Gurevych et al., 2012). It contains nine resources in two languages which are mapped to a uniform representation using the LMF standard (Eckle-Kohler et al., 2012). They are thus structurally interoperable. UBY contains pairwise sense alignments between a subset of these resources, and this work also presented a framework for creating alignments based on the similarity of glosses (Meyer and Gurevych, 2011). However, it is not clear to what extent this approach can be applied to resources which lack this kind of information (see Section 3). In summary, aligning senses is a key requirement for semantic interoperability of LSRs to increase the
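The algorithm itself is not included in this excerpt; as a rough, hypothetical illustration of the core idea — treating the senses of two resources as nodes of one joint graph and aligning a sense to the candidate reachable by the shortest path — a Dijkstra-based sketch might look like this (the toy graph, node names, and uniform edge weights are invented for illustration):

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source over an undirected weighted
    graph given as {node: [(neighbor, weight), ...]}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def align_sense(graph, sense, candidates, max_dist=4):
    """Align `sense` to the closest candidate sense of the other resource,
    subject to a distance threshold (a precision/recall knob)."""
    dist = dijkstra(graph, sense)
    scored = [(dist[c], c) for c in candidates
              if dist.get(c, float("inf")) <= max_dist]
    return min(scored)[1] if scored else None

# Toy joint graph: 'a:*' senses from resource A, 'b:*' from resource B;
# edges are semantic relations plus cross-resource links (all weight 1).
g = {
    "a:bank_river": [("a:water", 1)],
    "a:water": [("a:bank_river", 1), ("b:stream", 1)],
    "b:stream": [("a:water", 1), ("b:bank_geo", 1)],
    "b:bank_geo": [("b:stream", 1)],
    "b:bank_fin": [],
}
print(align_sense(g, "a:bank_river", ["b:bank_geo", "b:bank_fin"]))
# -> "b:bank_geo"
```

The `max_dist` threshold loosely mirrors the parameterizability the abstract mentions: raising it favors recall, lowering it favors precision.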
Transactions of the Association for Computational Linguistics, 1 (2013) 139–150. Action Editor: Joakim Nivre.
Submitted 12/2012; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Efficient Stacked Dependency Parsing by Forest Reranking

Katsuhiko Hayashi and Shuhei Kondo and Yuji Matsumoto
Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
{katsuhiko-h, shuhei-k, matsu}@is.naist.jp

Abstract

This paper proposes a discriminative forest reranking algorithm for dependency parsing that can be seen as a form of efficient stacked parsing. A dynamic programming shift-reduce parser produces a packed derivation forest which is then scored by a discriminative reranker, using the 1-best tree output by the shift-reduce parser as guide features in addition to third-order graph-based features. To improve efficiency and accuracy, this paper also proposes a novel shift-reduce parser that eliminates the spurious ambiguity of arc-standard transition systems. Testing on the English Penn Treebank data, forest reranking gave a state-of-the-art unlabeled dependency accuracy of 93.12.

1 Introduction

There are two main approaches of data-driven dependency parsing — one is graph-based and the other is transition-based.

In the graph-based approach, global optimization algorithms find the highest-scoring tree with locally factored models (McDonald et al., 2005). While third-order graph-based models achieve state-of-the-art accuracy, it has O(n^4) time complexity for a sentence of length n. Recently, some pruning techniques have been proposed to improve the efficiency of third-order models (Rush and Petrov, 2012; Zhang and McDonald, 2012).

The transition-based approach usually employs the shift-reduce parsing algorithm with linear-time complexity (Nivre, 2008). It greedily chooses the transition with the highest score and the resulting transition sequence is not always globally optimal. The beam search algorithm improves parsing flexibility in deterministic parsing (Zhang and Clark, 2008; Zhang and Nivre, 2011), and dynamic programming makes beam search more efficient (Huang and Sagae, 2010).

There is also an alternative approach that integrates graph-based and transition-based models (Sagae and Lavie, 2006; Zhang and Clark, 2008; Nivre and McDonald, 2008; Martins et al., 2008). Martins et al. (2008) formulated their approach as stacking of parsers where the output of the first-stage parser is provided to the second as guide features. In particular, they used a transition-based parser for the first stage and a graph-based parser for the second stage. The main drawback of this approach is that the efficiency of the transition-based parser is sacrificed because the second-stage employs full parsing.

This paper proposes an efficient stacked parsing method through discriminative reranking with higher-order graph-based features, which works on the forests output by the first-stage dynamic programming shift-reduce parser and integrates non-local features efficiently with cube-pruning (Huang and Chiang, 2007). The advantages of our method are as follows:

• Unlike the conventional stacking approach, the first-stage shift-reduce parser prunes the search space of the second-stage graph-based parser.

• In addition to guide features, the second-stage graph-based parser can employ the scores of the first-stage parser which cannot be incorpo-
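The arc-standard transition system this paper builds on can be sketched as a plain deterministic simulator (this is not the paper's dynamic-programming parser; the example sentence and transition sequence are invented):

```python
def arc_standard(words, transitions):
    """Run an arc-standard derivation: SHIFT moves the next buffer word
    onto the stack; LEFT/RIGHT add an arc between the top two stack items
    and pop the dependent. Returns the set of (head, dependent) pairs."""
    stack, buffer, arcs = [], list(range(len(words))), set()
    for t in transitions:
        if t == "SHIFT":
            stack.append(buffer.pop(0))
        elif t == "LEFT":          # second-from-top becomes dependent of top
            s0, s1 = stack.pop(), stack.pop()
            arcs.add((s0, s1))
            stack.append(s0)
        elif t == "RIGHT":         # top becomes dependent of second-from-top
            s0, s1 = stack.pop(), stack.pop()
            arcs.add((s1, s0))
            stack.append(s1)
    return arcs

# "economic news dominated": economic <- news <- dominated
words = ["economic", "news", "dominated"]
seq = ["SHIFT", "SHIFT", "LEFT", "SHIFT", "LEFT"]
print(arc_standard(words, seq))
```

The spurious ambiguity the paper targets is visible even here: distinct transition sequences (e.g. ones that attach right dependents earlier or later) can derive the same arc set, so a naive packed forest double-counts trees.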
Transactions of the Association for Computational Linguistics, 1 (2013) 125–138. Action Editor: Sharon Goldwater.
Submitted 10/2012; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Modeling Child Divergences from Adult Grammar

Sam Sahakian, University of Wisconsin-Madison, sahakian@cs.wisc.edu
Benjamin Snyder, University of Wisconsin-Madison, bsnyder@cs.wisc.edu

Abstract

During the course of first language acquisition, children produce linguistic forms that do not conform to adult grammar. In this paper, we introduce a dataset and approach for systematically modeling this child-adult grammar divergence. Our corpus consists of child sentences with corrected adult forms. We bridge the gap between these forms with a discriminatively reranked noisy channel model that translates child sentences into equivalent adult utterances. Our method outperforms MT and ESL baselines, reducing child error by 20%. Our model allows us to chart specific aspects of grammar development in longitudinal studies of children, and investigate the hypothesis that children share a common developmental path in language acquisition.

1 Introduction

Since the publication of the Brown Study (1973), the existence of standard stages of development has been an underlying assumption in the study of first language learning. As a child moves towards language mastery, their language use grows predictably to include more complex syntactic structures, eventually converging to full adult usage. In the course of this process, children may produce linguistic forms that do not conform to the grammatical standard. From the adult point of view these are language errors, a label which implies a faulty production. Considering the work-in-progress nature of a child language learner, these divergences could also be described as expressions of the structural differences between child and adult grammar. The predictability of these divergences has been observed by psychologists, linguists and parents (Owens, 2008).1

Our work leverages the differences between child and adult language to make two contributions towards the study of language acquisition. First, we provide a corpus of errorful child sentences annotated with adult-like rephrasings. This data will allow researchers to test hypotheses and build models relating the development of child language to adult forms. Our second contribution is a probabilistic model trained on our corpus that predicts a grammatical rephrasing given an errorful child sentence.

The generative assumption of our model is that sentences begin in underlying adult forms, and are then stochastically transformed into observed child utterances. Given an observed child utterance s, we calculate the probability of the corrected adult translation t as

P(t|s) ∝ P(s|t)P(t),

where P(t) is an adult language model and P(s|t) is a noise model crafted to capture child grammar errors like omission of certain function words and corruptions of tense or declension. The parameters of this noise model are estimated using our corpus of child and adult-form utterances, using EM to handle unobserved word alignments. We use this generative model to produce n-best lists of candidate corrections which are then reranked using long range sentence features in a discriminative framework (Collins and Roark, 2004).

1 For the remainder of this paper we use "error" and "divergence" interchangeably.
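The decision rule P(t|s) ∝ P(s|t)P(t) can be illustrated with a toy candidate scorer (the candidates and all log-probabilities below are made up; the paper estimates its noise model with EM and reranks n-best lists discriminatively):

```python
def best_correction(s, candidates, noise_logp, lm_logp):
    """argmax over t of log P(s|t) + log P(t): channel model times LM."""
    return max(candidates, key=lambda t: noise_logp[(s, t)] + lm_logp[t])

s = "me want cookie"
candidates = ["me want cookie", "I want a cookie"]
# Invented scores: the noise model easily explains dropping "a" and the
# me/I case error, while the adult LM strongly prefers the grammatical form.
noise_logp = {(s, "me want cookie"): -0.1, (s, "I want a cookie"): -2.0}
lm_logp = {"me want cookie": -9.0, "I want a cookie": -3.0}

print(best_correction(s, candidates, noise_logp, lm_logp))
# -0.1 + -9.0 = -9.1  vs.  -2.0 + -3.0 = -5.0  ->  "I want a cookie"
```

The identity candidate is always available, so the model only "corrects" when the LM gain outweighs the noise-model cost.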
Transactions of the Association for Computational Linguistics, 1 (2013) 99–110. Action Editor: Chris Callison-Burch.
Submitted 12/2012; Published 5/2013. © 2013 Association for Computational Linguistics.

Using Pivot-Based Paraphrasing and Sentiment Profiles to Improve a Subjectivity Lexicon for Essay Data

Beata Beigman Klebanov, Nitin Madnani, Jill Burstein
Educational Testing Service, 660 Rosedale Road, Princeton, NJ 08541, USA
{bbeigmanklebanov, nmadnani, jburstein}@ets.org

Abstract

We demonstrate a method of improving a seed sentiment lexicon developed on essay data by using a pivot-based paraphrasing system for lexical expansion coupled with sentiment profile enrichment using crowdsourcing. Profile enrichment alone yields up to 15% improvement in the accuracy of the seed lexicon on 3-way sentence-level sentiment polarity classification of essay data. Using lexical expansion in addition to sentiment profiles provides a further 7% improvement in performance. Additional experiments show that the proposed method is also effective with other subjectivity lexicons and in a different domain of application (product reviews).

1 Introduction

In almost any sub-field of computational linguistics, creation of working systems starts with an investment in manually-generated or manually-annotated data for computational exploration. In subjectivity and sentiment analysis, annotation of training and testing data and construction of subjectivity lexicons have been the loci of costly labor investment.

Many subjectivity lexicons are mentioned in the literature. The two large manually-built lexicons for English — the General Inquirer (Stone et al., 1966) and the lexicon provided with the OpinionFinder distribution (Wiebe and Riloff, 2005) — are available for research and education only1 and under GNU GPL license that disallows their incorporation into proprietary materials,2 respectively. Those wishing to integrate sentiment analysis into products, along with those studying subjectivity in languages other than English, or for specific domains such as finance, or for particular genres such as MySpace comments, reported construction of lexicons (Taboada et al., 2011; Loughran and McDonald, 2011; Thelwall et al., 2010; Rao and Ravichandran, 2009; Jijkoun and Hofmann, 2009; Pitel and Grefenstette, 2008; Mihalcea et al., 2007).

In this paper, we address the step of expanding a small-scale, manually-built subjectivity lexicon (a seed lexicon, typically for a domain or language in question) into a much larger but noisier lexicon using an automatic procedure. We present a novel expansion method using a state-of-the-art paraphrasing system. The expansion yields a 4-fold increase in lexicon size; however, the expansion alone is insufficient in order to improve performance on sentence-level sentiment polarity classification.

In this paper we test the following hypothesis. We suggest that the effectiveness of the expansion is hampered by (1) introduction of opposite-polarity items, such as introducing resolute as an expansion of forceful, or remarkable as an expansion of peculiar; (2) introduction of weakly polar, neutral, or ambiguous words as expansions of polar seed words, such as generating concern as an expansion of anxiety or future as an expansion of aftermath;3 (3) inability to distinguish between stronger or clear-cut versus weaker or ambiguous sentiment and to make a differential use of those. We address items (1) and (2) by enriching the lexicon with sentiment profiles (section 3), and propose

1 http://www.wjh.harvard.edu/inquirer/j11/manual/
2 http://www.gnu.org/copyleft/gpl.html
3 Table 2 and Figure 1 provide support to these assessments.
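The two steps described above — paraphrase-based expansion followed by filtering with sentiment profiles — can be sketched schematically (the paraphrase table, profile scores, and agreement threshold are invented; the paper obtains profiles via crowdsourcing):

```python
def expand_lexicon(seed, paraphrases, profiles, min_agreement=0.7):
    """Expand a polarity lexicon via paraphrases, keeping only expansions
    whose sentiment profile agrees with the seed word's polarity."""
    expanded = dict(seed)
    for word, polarity in seed.items():
        for cand in paraphrases.get(word, []):
            pos, neg = profiles.get(cand, (0.0, 0.0))
            agreement = pos if polarity == "+" else neg
            if agreement >= min_agreement:
                expanded.setdefault(cand, polarity)
    return expanded

seed = {"anxiety": "-"}
paraphrases = {"anxiety": ["concern", "dread"]}
# (positive, negative) shares of hypothetical crowd judgments per word:
profiles = {"concern": (0.30, 0.45), "dread": (0.05, 0.90)}

print(expand_lexicon(seed, paraphrases, profiles))
```

With these invented numbers, "concern" — the paper's own example of a weakly polar expansion of "anxiety" — is filtered out, while the clearly negative "dread" is admitted.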
Transactions of the Association for Computational Linguistics, 1 (2013) 75–88. Action Editor: Sharon Goldwater.
Submitted 11/2012; Revised 1/2013; Published 3/2013. © 2013 Association for Computational Linguistics.

An HDP Model for Inducing Combinatory Categorial Grammars

Yonatan Bisk and Julia Hockenmaier
Department of Computer Science, The University of Illinois at Urbana-Champaign, 201 N Goodwin Ave, Urbana, IL 61801
{bisk1, juliahmr}@illinois.edu

Abstract

We introduce a novel nonparametric Bayesian model for the induction of Combinatory Categorial Grammars from POS-tagged text. It achieves state of the art performance on a number of languages, and induces linguistically plausible lexicons.

1 Introduction

What grammatical representation is appropriate for unsupervised grammar induction? Initial attempts with context-free grammars (CFGs) were not very successful (Carroll and Charniak, 1992; Charniak, 1993). One reason may be that CFGs require the specification of a finite inventory of nonterminal categories and rewrite rules, but unless one adopts linguistic principles such as X-bar theory (Jackendoff, 1977), these nonterminals are essentially arbitrary labels that can be combined in arbitrary ways. While further CFG-based approaches have been proposed (Clark, 2001; Kurihara and Sato, 2004), most recent work has followed Klein and Manning (2004) in developing models for the induction of projective dependency grammars. It has been shown that more sophisticated probability models (Headden III et al., 2009; Gillenwater et al., 2011; Cohen and Smith, 2010) and learning regimes (Spitkovsky et al., 2010), as well as the incorporation of prior linguistic knowledge (Cohen and Smith, 2009; Berg-Kirkpatrick and Klein, 2010; Naseem et al., 2010) can lead to significant improvement over Klein and Manning's baseline model. The use of dependency grammars circumvents the question of how to obtain an appropriate inventory of categories, since dependency parses are simply defined by unlabeled edges between the lexical items in the sentence. But dependency grammars make it also difficult to capture non-local structures, and Blunsom and Cohn (2010) show that it may be advantageous to reformulate the underlying dependency grammar in terms of a tree-substitution grammar (TSG) which pairs words with treelets that specify the number of left and right dependents they have.

In this paper, we explore yet another option: instead of dependency grammars, we use Combinatory Categorial Grammar (CCG, Steedman (1996; 2000)), a linguistically expressive formalism that pairs lexical items with rich categories that capture all language-specific information. This may seem a puzzling choice, since CCG requires a significantly larger inventory of categories than is commonly assumed for CFGs. However, unlike CFG nonterminals, CCG categories are not arbitrary symbols: they encode, and are determined by, the basic word order of the language and the number of arguments each word takes. CCG is very similar to TSG in that it also pairs lexical items with rich items that capture all language-specific information. Like TSG and projective dependency grammars, we restrict ourselves to a weakly context-free fragment of CCG. But while TSG does not distinguish between argument and modifier dependencies, CCG makes an explicit distinction between the two. And while the elementary trees of Blunsom and Cohn (2010)'s TSG and their internal node labels have no obvious linguistic interpretation, the syntactic behavior of any CCG constituent can be directly inferred from its category. To see whether
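To make the claim that "syntactic behavior can be directly inferred from the category" concrete, here is a deliberately tiny sketch of the two CCG application rules, with categories as flat strings (real CCG induction also uses composition and other combinators, and these string operations do not handle nested categories like (S\NP)/NP):

```python
def forward_apply(left, right):
    """Forward application: X/Y applied to Y yields X."""
    if left.endswith("/" + right):
        return left[: -len("/" + right)]
    return None

def backward_apply(left, right):
    """Backward application: Y followed by X\\Y yields X."""
    if right.endswith("\\" + left):
        return right[: -len("\\" + left)]
    return None

# "the dog barks" with the := NP/N, dog := N, barks := S\NP
np = forward_apply("NP/N", "N")   # the + dog  -> NP
s = backward_apply(np, "S\\NP")   # NP + barks -> S
print(np, s)
```

The lexical categories alone dictate how constituents may combine — which is exactly why an induced CCG lexicon is interpretable in a way that arbitrary CFG nonterminals are not.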
Transactions of the Association for Computational Linguistics, 1 (2013) 63–74. Action Editor: Brian Roark.
Submitted 9/2012; Published 3/2013. © 2013 Association for Computational Linguistics.

Unsupervised Dependency Parsing with Acoustic Cues

John K Pate†‡ j.k.pate@sms.ed.ac.uk
Sharon Goldwater† sgwater@inf.ed.ac.uk
† ILCC, School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
‡ Department of Computing, Macquarie University, Sydney, NSW 2109, Australia

Abstract

Unsupervised parsing is a difficult task that infants readily perform. Progress has been made on this task using text-based models, but few computational approaches have considered how infants might benefit from acoustic cues. This paper explores the hypothesis that word duration can help with learning syntax. We describe how duration information can be incorporated into an unsupervised Bayesian dependency parser whose only other source of information is the words themselves (without punctuation or parts of speech). Our results, evaluated on both adult-directed and child-directed utterances, show that using word duration can improve parse quality relative to words-only baselines. These results support the idea that acoustic cues provide useful evidence about syntactic structure for language-learning infants, and motivate the use of word duration cues in NLP tasks with speech.

1 Introduction

Unsupervised learning of syntax is difficult for NLP systems, yet infants perform this task routinely. Previous work in NLP has focused on using the implicit syntactic information available in part-of-speech (POS) tags (Klein and Manning, 2004), punctuation (Seginer, 2007; Spitkovsky et al., 2011b; Ponvert et al., 2011), and syntactic similarities between related languages (Cohen and Smith, 2009; Cohen et al., 2011). However, these approaches likely use the data in a very different way from children: neither POS tags nor punctuation are observed during language acquisition (although see Spitkovsky et al. (2011a) and Christodoulopoulos et al. (2012) for encouraging results using unsupervised POS tags), and many children learn in a broadly monolingual environment. This paper explores a possible source of information that NLP systems typically ignore: word duration, or the length of time taken to pronounce each word.

There are good reasons to think that word duration might be useful for learning syntax. First, the well-established Prosodic Bootstrapping hypothesis (Gleitman and Wanner, 1982) proposes that infants use acoustic-prosodic cues (such as word duration) to help them identify syntactic structure, because prosodic and syntactic structures sometimes coincide. More recently, we proposed (Pate and Goldwater, 2011) that infants might use word duration as a direct cue to syntactic structure (i.e., without requiring intermediate prosodic structure), because words in high-probability syntactic structures tend to be pronounced more quickly (Gahl and Garnsey, 2004; Gahl et al., 2006; Tily et al., 2009).

Like most recent work on unsupervised parsing, we focus on learning syntactic dependencies. Our work is based on Headden et al. (2009)'s Bayesian version of the Dependency Model with Valence (DMV) (Klein and Manning, 2004), using interpolated backoff techniques to incorporate multiple information sources per token. However, whereas Headden et al. used words and POS tags as input, we use words and word duration information, presenting three variants of their model that use this information in slightly different ways.1

1 By using neither gold-standard nor learned POS tags as input, our work differs from nearly all previous work on unsupervised dependency parsing. While learned tags might be plausible
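The interpolated-backoff idea — mixing a sparse, fine-grained distribution with a coarser one, trusting the fine counts more as they grow — can be sketched as follows (the counts, the word/duration contexts, and the pseudo-count are invented; the paper's DMV variants are considerably more involved):

```python
from collections import Counter

def interpolated_prob(event, fine_counts, coarse_counts, alpha=2.0):
    """P(event) in a fine context (e.g. head word + duration bin), backing
    off to a coarse context (head word only), with interpolation weight
    lam = n / (n + alpha) growing with the fine context's total count n."""
    fine_total = sum(fine_counts.values())
    coarse_total = sum(coarse_counts.values())
    lam = fine_total / (fine_total + alpha)
    p_fine = fine_counts[event] / fine_total if fine_total else 0.0
    p_coarse = coarse_counts[event] / coarse_total if coarse_total else 0.0
    return lam * p_fine + (1 - lam) * p_coarse

# Dependents observed under head "saw" overall (coarse) vs. under "saw"
# when pronounced quickly (fine); all counts invented.
coarse = Counter({"dog": 6, "quickly": 2})
fine = Counter({"dog": 2})
p = interpolated_prob("dog", fine, coarse, alpha=2.0)
print(round(p, 3))
# lam = 2/4 = 0.5; 0.5 * 1.0 + 0.5 * 0.75 = 0.875
```

When the fine context has been seen only rarely, the estimate falls back to the robust word-only distribution; as duration-specific evidence accumulates, it dominates.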
Transactions of the Association for Computational Linguistics, 1 (2013) 49–62. Action Editor: Jason Eisner.
Submitted 11/2012; Published 3/2013. © 2013 Association for Computational Linguistics.

Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions

Yoav Artzi and Luke Zettlemoyer
Computer Science & Engineering, University of Washington, Seattle, WA 98195
{yoav, lsz}@cs.washington.edu

Abstract

The context in which language is used provides a strong signal for learning to recover its meaning. In this paper, we show it can be used within a grounded CCG semantic parsing approach that learns a joint model of meaning and context for interpreting and executing natural language instructions, using various types of weak supervision. The joint nature provides crucial benefits by allowing situated cues, such as the set of visible objects, to directly influence learning. It also enables algorithms that learn while executing instructions, for example by trying to replicate human actions. Experiments on a benchmark navigational dataset demonstrate strong performance under differing forms of supervision, including correctly executing 60% more instruction sets relative to the previous state of the art.

1 Introduction

The context in which natural language is used provides a strong signal to reason about its meaning. However, using such a signal to automatically learn to understand unrestricted natural language remains a challenging, unsolved problem.

For example, consider the instructions in Figure 1. Correct interpretation requires us to solve many sub-problems, such as resolving all referring expressions to specific objects in the environment (including, "the corner" or "the third intersection"), disambiguating word sense based on context (e.g., "the chair" could refer to a chair or sofa), and finding executable action sequences that satisfy stated constraints (such as "twice" or "to face the blue hall").

move forward twice to the chair
λa.move(a) ∧ dir(a, forward) ∧ len(a, 2) ∧ to(a, ιx.chair(x))
at the corner turn left to face the blue hall
λa.pre(a, ιx.corner(x)) ∧ turn(a) ∧ dir(a, left) ∧ post(a, front(you, ιx.blue(x) ∧ hall(x)))
move to the chair in the third intersection
λa.move(a) ∧ to(a, ιx.sofa(x)) ∧ intersect(order(λy.junction(y), frontdist, 3), x)

Figure 1: A sample navigation instruction set, paired with lambda-calculus meaning representations.

We must also understand implicit requests, for example from the phrase "at the corner," that describe goals to be achieved without specifying the specific steps. Finally, to do all of this robustly without prohibitive engineering effort, we need grounded learning approaches that jointly reason about meaning and context to learn directly from their interplay, with as little human intervention as possible.

Although many of these challenges have been studied separately, as we will review in Section 3, this paper represents, to the best of our knowledge, the first attempt at a comprehensive model that addresses them all. Our approach induces a weighted Combinatory Categorial Grammar (CCG), including both the parameters of the linear model and a CCG lexicon. To model complex instructional language, we introduce a new semantic modeling approach that can represent a number of key linguistic constructs that are common in spatial and instructional language. To learn from indirect supervision, we define the notion of a validation function, for example that tests the state of the agent after interpreting an instruction. We then show how this function can be used to drive online learning. For
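The validation-function idea can be illustrated abstractly: execute a candidate interpretation in the world, and accept it only if a test on the resulting agent state passes (the toy grid world, action vocabulary, and goal test below are invented; the paper uses this kind of signal to drive online updates to a weighted CCG):

```python
def execute(state, actions):
    """Apply a sequence of simple grid actions to an (x, y, heading) state."""
    moves = {"N": (0, 1), "E": (1, 0), "S": (0, -1), "W": (-1, 0)}
    x, y, heading = state
    for act in actions:
        if act == "move":
            dx, dy = moves[heading]
            x, y = x + dx, y + dy
        elif act.startswith("turn:"):
            heading = act.split(":")[1]
    return (x, y, heading)

def validate(final_state, goal_position):
    """Weak supervision: did the agent end up at the goal position?"""
    return final_state[:2] == goal_position

# Two candidate action sequences for "turn left and move forward twice",
# starting at (0, 0) facing north; only the second reaches the goal (-2, 0).
start = (0, 0, "N")
candidates = [["move", "move"], ["turn:W", "move", "move"]]
accepted = [c for c in candidates if validate(execute(start, c), (-2, 0))]
print(accepted)
```

Note that the validator never reveals the correct logical form — only whether the end state is right — which is what makes this supervision weak.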