Documents


Transactions of the Association for Computational Linguistics, 1 (2013) 37–48. Action Editor: Ryan McDonald. Submitted 11/2012; Revised 2/2013; Published 3/2013. © 2013 Association for Computational Linguistics.

Branch and Bound Algorithm for Dependency Parsing with Non-local Features

Xian Qian and Yang Liu
Computer Science Department, The University of Texas at Dallas
{qx,yangl}@hlt.utdallas.edu

Abstract

Graph-based dependency parsing is inefficient when handling non-local features due to the high computational complexity of inference. In this paper, we propose an exact and efficient decoding algorithm based on the Branch and Bound (B&B) framework, where non-local features are bounded by a linear combination of local features. Dynamic programming is used to search the upper bound. Experiments are conducted on English PTB and Chinese CTB datasets. We achieved competitive Unlabeled Attachment Score (UAS) when no additional resources are available: 93.17% for English and 87.25% for Chinese. Parsing speed is 177 words per second for English and 97 words per second for Chinese. Our algorithm is general and can be adapted to non-projective dependency parsing or other graphical models.

1 Introduction

For graph-based projective dependency parsing, dynamic programming (DP) is popular for decoding due to its efficiency when handling local features. It performs cubic time parsing for arc-factored models (Eisner, 1996; McDonald et al., 2005a) and biquadratic time for higher order models with richer sibling and grandchild features (Carreras, 2007; Koo and Collins, 2010). However, for models with general non-local features, DP is inefficient.

There have been numerous studies on global inference algorithms for general higher order parsing. One popular approach is reranking (Collins, 2000; Charniak and Johnson, 2005; Hall, 2007). It typically has two steps: the low level classifier generates the top k hypotheses using local features, then the high level classifier reranks these candidates using global features. Since the reranking quality is bounded by the oracle performance of candidates, some work has combined candidate generation and reranking steps using cube pruning (Huang, 2008; Zhang and McDonald, 2012) to achieve higher oracle performance. They parse a sentence in bottom-up order and keep the top k derivations for each span using k-best parsing (Huang and Chiang, 2005). After merging the two spans, non-local features are used to rerank the top k combinations. This approach is very efficient and flexible enough to handle various non-local features. The disadvantage is that it tends to compute non-local features as early as possible so that the decoder can utilize that information at internal spans, hence it may miss long historical features such as long dependency chains.

Smith and Eisner modeled dependency parsing using Markov Random Fields (MRFs) with global constraints and applied loopy belief propagation (LBP) for approximate learning and inference (Smith and Eisner, 2008). Similar work was done for Combinatory Categorial Grammar (CCG) parsing (Auli and Lopez, 2011). They used posterior marginal beliefs for inference to satisfy the tree constraint: for each factor, only legal messages (satisfying global constraints) are considered in the partition function.

A similar line of research investigated the use of integer linear programming (ILP) based parsing (Riedel and Clarke, 2006; Martins et al., 2009). This


Transactions of the Association for Computational Linguistics, 1 (2013) 25–36. Action Editor: Hal Daumé III. Submitted 10/2012; Published 3/2013. © 2013 Association for Computational Linguistics.

Grounding Action Descriptions in Videos

Michaela Regneri∗, Marcus Rohrbach⋄, Dominikus Wetzel∗, Stefan Thater∗, Bernt Schiele⋄ and Manfred Pinkal∗
∗ Department of Computational Linguistics, Saarland University, Saarbrücken, Germany
(regneri|dwetzel|stth|pinkal)@coli.uni-saarland.de
⋄ Max Planck Institute for Informatics, Saarbrücken, Germany
(rohrbach|schiele)@mpi-inf.mpg.de

Abstract

Recent work has shown that the integration of visual information into text-based models can substantially improve model predictions, but so far only visual information extracted from static images has been used. In this paper, we consider the problem of grounding sentences describing actions in visual information extracted from videos. We present a general purpose corpus that aligns high quality videos with multiple natural language descriptions of the actions portrayed in the videos, together with an annotation of how similar the action descriptions are to each other. Experimental results demonstrate that a text-based model of similarity between actions improves substantially when combined with visual information from videos depicting the described actions.

1 Introduction

The estimation of semantic similarity between words and phrases is a basic task in computational semantics. Vector-space models of meaning are one standard approach. Following the distributional hypothesis, frequencies of context words are recorded in vectors, and semantic similarity is computed as a proximity measure in the underlying vector space. Such distributional models are attractive because they are conceptually simple, easy to implement and relevant for various NLP tasks (Turney and Pantel, 2010). At the same time, they provide a substantially incomplete picture of word meaning, since they ignore the relation between language and extra-linguistic information, which is constitutive for linguistic meaning. In the last few years, a growing amount of work has been devoted to the task of grounding meaning in visual information, in particular by extending the distributional approach to jointly cover texts and images (Feng and Lapata, 2010; Bruni et al., 2011). As a clear result, visual information improves the quality of distributional models. Bruni et al. (2011) show that visual information drawn from images is particularly relevant for concrete common nouns and adjectives.

A natural next step is to integrate visual information from videos into a semantic model of event and action verbs. Psychological studies have shown the connection between action semantics and videos (Glenberg, 2002; Howell et al., 2005), but to our knowledge, we are the first to provide a suitable data source and to implement such a model.

The contribution of this paper is three-fold:

• We present a multimodal corpus containing textual descriptions aligned with high-quality videos. Starting from the video corpus of Rohrbach et al. (2012b), which contains high-resolution video recordings of basic cooking tasks, we collected multiple textual descriptions of each video via Mechanical Turk. We also provide an accurate sentence-level alignment of the descriptions with their respective videos. We expect the corpus to be a valuable resource for computational semantics, and moreover helpful for a variety of purposes, including video understanding and generation of text from videos.

• We provide a gold-standard dataset for the evaluation of similarity models for action verbs and phrases. The dataset has been designed as analogous to the Usage Similarity dataset of


Transactions of the Association for Computational Linguistics, 1 (2013) 13–24. Action Editor: Giorgio Satta. Submitted 11/2012; Published 3/2013. © 2013 Association for Computational Linguistics.

Finding Optimal 1-Endpoint-Crossing Trees

Emily Pitler, Sampath Kannan, Mitchell Marcus
Computer and Information Science, University of Pennsylvania, Philadelphia, PA 19104
epitler,kannan,mitch@seas.upenn.edu

Abstract

Dependency parsing algorithms capable of producing the types of crossing dependencies seen in natural language sentences have traditionally been orders of magnitude slower than algorithms for projective trees. For 95.8–99.8% of dependency parses in various natural language treebanks, whenever an edge is crossed, the edges that cross it all have a common vertex. The optimal dependency tree that satisfies this 1-Endpoint-Crossing property can be found with an O(n^4) parsing algorithm that recursively combines forests over intervals with one exterior point. 1-Endpoint-Crossing trees also have natural connections to linguistics and another class of graphs that has been studied in NLP.

1 Introduction

Dependency parsing is one of the fundamental problems in natural language processing today, with applications such as machine translation (Ding and Palmer, 2005), information extraction (Culotta and Sorensen, 2004), and question answering (Cui et al., 2005). Most high-accuracy graph-based dependency parsers (Koo and Collins, 2010; Rush and Petrov, 2012; Zhang and McDonald, 2012) find the highest-scoring projective trees (in which no edges cross), despite the fact that a large proportion of natural language sentences are non-projective. Projective trees can be found in O(n^3) time (Eisner, 2000), but cover only 63.6% of sentences in some natural language treebanks (Table 1).

The class of directed spanning trees covers all treebank trees and can be parsed in O(n^2) with edge-based features (McDonald et al., 2005), but it is NP-hard to find the maximum scoring such tree with grandparent or sibling features (McDonald and Pereira, 2006; McDonald and Satta, 2007).

There are various existing definitions of mildly non-projective trees with better empirical coverage than projective trees that do not have the hardness of extensibility that spanning trees do. However, these have had parsing algorithms that are orders of magnitude slower than the projective case or the edge-based spanning tree case. For example, well-nested dependency trees with block degree 2 (Kuhlmann, 2013) cover at least 95.4% of natural language structures, but have a parsing time of O(n^7) (Gómez-Rodríguez et al., 2011).

No previously defined class of trees simultaneously has high coverage and low-degree polynomial algorithms for parsing, allowing grandparent or sibling features.

We propose 1-Endpoint-Crossing trees, in which for any edge that is crossed, all other edges that cross that edge share an endpoint. While simple to state, this property covers 95.8% or more of dependency parses in natural language treebanks (Table 1). The optimal 1-Endpoint-Crossing tree can be found in faster asymptotic time than any previously proposed mildly non-projective dependency parsing algorithm. We show how any 1-Endpoint-Crossing tree can be decomposed into isolated sets of intervals with one exterior point (Section 3). This is the key insight that allows efficient parsing; the O(n^4) parsing algorithm is presented in Section 4. 1-Endpoint-Crossing trees are a subclass of 2-planar graphs (Section 5.1), a class that has been studied
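
To make the 1-Endpoint-Crossing property from the excerpt above concrete, here is a minimal Python sketch that checks it for a set of edges. It is an illustration only and not the authors' O(n^4) parsing algorithm: the function names `crosses` and `is_one_endpoint_crossing` are hypothetical, edges are assumed to be undirected pairs of word positions, and the check is a brute-force comparison of all edge pairs.

```python
def crosses(e, f):
    """Return True if edges e and f cross when both are drawn above the sentence.

    Each edge is a pair of word positions; direction (head vs. dependent)
    is ignored, which is an assumption of this sketch.
    """
    a, b = sorted(e)
    c, d = sorted(f)
    # Two edges cross when exactly one endpoint of one edge lies strictly
    # inside the span of the other.
    return a < c < b < d or c < a < d < b


def is_one_endpoint_crossing(edges):
    """Check that, for every crossed edge, all edges crossing it share a vertex."""
    edges = [tuple(e) for e in edges]
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        if not crossing:
            continue  # an uncrossed edge imposes no constraint
        # Intersect the endpoint sets of all edges that cross e.
        common = set(crossing[0])
        for f in crossing[1:]:
            common &= set(f)
        if not common:
            return False
    return True
```

For instance, the edge set `[(0, 3), (1, 4), (1, 5)]` passes the check because both edges crossing `(0, 3)` share position 1, while `[(0, 3), (1, 4), (2, 5)]` fails because the two edges crossing `(0, 3)` have no endpoint in common.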


Transactions of the Association for Computational Linguistics, 1 (2013) 1–12. Action Editor: Sharon Goldwater. Submitted 11/2012; Revised 1/2013; Published 3/2013. © 2013 Association for Computational Linguistics.

Token and Type Constraints for Cross-Lingual Part-of-Speech Tagging

Oscar Täckström⋄†∗, Dipanjan Das‡, Slav Petrov‡, Ryan McDonald‡, Joakim Nivre†∗
⋄ Swedish Institute of Computer Science
† Department of Linguistics and Philology, Uppsala University
‡ Google Research, New York
oscar@sics.se, {dipanjand|slav|ryanmcd}@google.com, joakim.nivre@lingfil.uu.se
∗ Work primarily carried out while at Google Research.

Abstract

We consider the construction of part-of-speech taggers for resource-poor languages. Recently, manually constructed tag dictionaries from Wiktionary and dictionaries projected via bitext have been used as type constraints to overcome the scarcity of annotated data in this setting. In this paper, we show that additional token constraints can be projected from a resource-rich source language to a resource-poor target language via word-aligned bitext. We present several models to this end; in particular a partially observed conditional random field model, where coupled token and type constraints provide a partial signal for training. Averaged across eight previously studied Indo-European languages, our model achieves a 25% relative error reduction over the prior state of the art. We further present successful results on seven additional languages from different families, empirically demonstrating the applicability of coupled token and type constraints across a diverse set of languages.

1 Introduction

Supervised part-of-speech (POS) taggers are available for more than twenty languages and achieve accuracies of around 95% on in-domain data (Petrov et al., 2012). Thanks to their efficiency and robustness, supervised taggers are routinely employed in many natural language processing applications, such as syntactic and semantic parsing, named-entity recognition and machine translation. Unfortunately, the resources required to train supervised taggers are expensive to create and unlikely to exist for the majority of written languages. The necessity of building NLP tools for these resource-poor languages has been part of the motivation for research on unsupervised learning of POS taggers (Christodoulopoulos et al., 2010).

In this paper, we instead take a weakly supervised approach towards this problem. Recently, learning POS taggers with type-level tag dictionary constraints has gained popularity. Tag dictionaries, noisily projected via word-aligned bitext, have bridged the gap between purely unsupervised and fully supervised taggers, resulting in an average accuracy of over 83% on a benchmark of eight Indo-European languages (Das and Petrov, 2011). Li et al. (2012) further improved upon this result by employing Wiktionary¹ as a tag dictionary source, resulting in the hitherto best published result of almost 85% on the same setup.

Although the aforementioned weakly supervised approaches have resulted in significant improvements over fully unsupervised approaches, they have not exploited the benefits of token-level cross-lingual projection methods, which are possible with word-aligned bitext between a target language of interest and a resource-rich source language, such as English. This is the setting we consider in this paper (§2).

While prior work has successfully considered both token- and type-level projection across word-aligned bitext for estimating the model parameters of generative tagging models (Yarowsky and Ngai, 2001; Xi and Hwa, 2005, inter alia), a key observation underlying the present work is that token- and type-level information offer different and complementary signals. On the one hand, high confidence token-level projections offer precise constraints on a tag in a particular context. On the other hand, manually cre-

¹ http://www.wiktionary.org/.


Transactions of the Association for Computational Linguistics, vol. 2, pp. 561–572, 2014. Action Editor: Ryan McDonald. Submission batch: 10/2014; Revision batch 11/2014; Published 12/2014. © 2014 Association for Computational Linguistics.

Exploring Compositional Architectures and Word Vector Representations for Prepositional Phrase Attachment

Yonatan Belinkov, Tao Lei, Regina Barzilay
Massachusetts Institute of Technology
{belinkov,taolei,regina}@csail.mit.edu

Amir Globerson
The Hebrew University
gamir@cs.huji.ac.il

Abstract

Prepositional phrase (PP) attachment disambiguation is a known challenge in syntactic parsing. The lexical sparsity associated with PP attachments motivates research in word representations that can capture pertinent syntactic and semantic features of the word. One promising solution is to use word vectors induced from large amounts of raw text. However, state-of-the-art systems that employ such representations yield modest gains in PP attachment accuracy.

In this paper, we show that word vector representations can yield significant PP attachment performance gains. This is achieved via a non-linear architecture that is discriminatively trained to maximize PP attachment accuracy. The architecture is initialized with word vectors trained from unlabeled data, and relearns those to maximize attachment accuracy. We obtain additional performance gains with alternative representations such as dependency-based word vectors. When tested on both English and Arabic datasets, our method outperforms both a strong SVM classifier and state-of-the-art parsers. For instance, we achieve 82.6% PP attachment accuracy on Arabic, while the Turbo and Charniak self-trained parsers obtain 76.7% and 80.8% respectively.¹

¹ The code and data for this work are available at http://groups.csail.mit.edu/rbg/code/pp.

She ate spaghetti with butter
She ate spaghetti with chopsticks

Figure 1: Two sentences illustrating the importance of lexicalization in PP attachment decisions. In the top sentence, the PP "with butter" attaches to the noun "spaghetti". In the bottom sentence, the PP "with chopsticks" attaches to the verb "ate".

1 Introduction

The problem of prepositional phrase (PP) attachment disambiguation has been under investigation for a long time. However, despite at least two decades of research (Brill and Resnik, 1994; Ratnaparkhi et al., 1994; Collins and Brooks, 1995), it remains a major source of errors for state-of-the-art parsers. For instance, in a comparative evaluation of parser performance on the Wall Street Journal corpus, Kummerfeld et al. (2012) report that PP attachment is the largest source of errors across all parsers. Moreover, the extent of improvement over time has been rather limited, amounting to about 32% error reduction since the work of Collins (1997).

PP attachments are inherently lexicalized and part-of-speech (POS) tags are not sufficient for their correct disambiguation. For example, the two sentences in Figure 1 vary by a single noun: butter vs. chopsticks. However, this word determines the structure of the whole PP attachment. If the corre-


Transactions of the Association for Computational Linguistics, vol. 2, pp. 547–559, 2014. Action Editor: Sharon Goldwater, Alexander Koller. Submission batch: 3/2014; Revision batch 8/2014; Published 12/2014. © 2014 Association for Computational Linguistics.

A New Corpus and Imitation Learning Framework for Context-Dependent Semantic Parsing

Andreas Vlachos
Computer Science Department, University College London
a.vlachos@cs.ucl.ac.uk

Stephen Clark
Computer Laboratory, University of Cambridge
sc609@cam.ac.uk

Abstract

Semantic parsing is the task of translating natural language utterances into a machine-interpretable meaning representation. Most approaches to this task have been evaluated on a small number of existing corpora which assume that all utterances must be interpreted according to a database and typically ignore context. In this paper we present a new, publicly available corpus for context-dependent semantic parsing. The MRL used for the annotation was designed to support a portable, interactive tourist information system. We develop a semantic parser for this corpus by adapting the imitation learning algorithm DAGGER without requiring alignment information during training. DAGGER improves upon independently trained classifiers by 9.0 and 4.8 points in F-score on the development and test sets respectively.

1 Introduction

Semantic parsing is the task of translating natural language utterances into a machine-interpretable meaning representation (MR). Progress in semantic parsing has been facilitated by the existence of corpora containing utterances annotated with MRs, the most commonly used being ATIS (Dahl et al., 1994) and GeoQuery (Zelle, 1995). As these corpora cover rather narrow application domains, recent work has developed corpora to support natural language interfaces to the Freebase database (Cai and Yates, 2013), as well as the development of MT systems (Banarescu et al., 2013).

However, these existing corpora have some important limitations. The MRs accompanying the utterances are typically restricted to some form of database query. Furthermore, in most cases each utterance is interpreted in isolation; thus utterances that use coreference or whose semantics are context-dependent are typically ignored. In this paper we present a new corpus for context-dependent semantic parsing to support the development of an interactive navigation and exploration system for tourism-related activities. The new corpus was annotated with MRs that can handle dialog context such as coreference and can accommodate utterances that are not interpretable according to a database, e.g. repetition requests. The utterances were collected in experiments with human subjects, and contain phenomena such as ellipsis and disfluency. We developed guidelines and annotated 17 dialogs containing 2,374 utterances, with 82.9% exact match agreement between two annotators.

We also develop a semantic parser for this corpus. As the output MRs are rather complex, instead of adopting an approach that searches the output space exhaustively, we use the imitation learning algorithm DAGGER (Ross et al., 2011) that converts learning a structured prediction model into learning a set of classification models. We take advantage of its ability to learn with non-decomposable loss functions and extend it to handle the absence of alignment information during training by developing a randomized expert policy. Our approach improves upon independently trained classifiers by 9.0 and 4.8 F-score on the development and test sets.

2 Meaning Representation Language

Our proposed MR language (MRL) was designed in the context of the portable, interactive naviga-


Transactions of the Association for Computational Linguistics, vol. 2, pp. 531–545, 2014. Action Editor: Janyce Wiebe. Submission batch: 3/2014; Revision batch 9/2014; Published 12/2014. © 2014 Association for Computational Linguistics.

A Large Scale Evaluation of Distributional Semantic Models: Parameters, Interactions and Model Selection

Gabriella Lapesa (2,1)
(1) Universität Osnabrück, Institut für Kognitionswissenschaft, Albrechtstr. 28, Osnabrück, Germany
gabriella.lapesa@fau.de

Stefan Evert (2)
(2) FAU Erlangen-Nürnberg, Professur für Korpuslinguistik, Bismarckstr. 6, Erlangen, Germany
stefan.evert@fau.de

Abstract

This paper presents the results of a large-scale evaluation study of window-based Distributional Semantic Models on a wide variety of tasks. Our study combines a broad coverage of model parameters with a model selection methodology that is robust to overfitting and able to capture parameter interactions. We show that our strategy allows us to identify parameter configurations that achieve good performance across different datasets and tasks.¹

¹ The analysis presented in this paper is complemented by supplementary materials, which are available for download at http://www.linguistik.fau.de/dsmeval/. This page will also be kept up to date with the results of follow-up experiments.

1 Introduction

Distributional Semantic Models (DSMs) are employed to produce semantic representations of words from co-occurrence patterns in texts or documents (Sahlgren, 2006; Turney and Pantel, 2010). Building on the Distributional Hypothesis (Harris, 1954), DSMs quantify the amount of meaning shared by words as the degree of overlap of the sets of contexts in which they occur.

A widely used approach operationalizes the set of contexts as co-occurrences with other words within a certain window (e.g., 5 words). A window-based DSM can be represented as a co-occurrence matrix in which rows correspond to target words, columns correspond to context words, and cells store the co-occurrence frequencies of target words and context words. The co-occurrence information is usually weighted by some scoring function and the rows of the matrix are normalized. Since the co-occurrence matrix tends to be very large and sparsely populated, dimensionality reduction techniques are often used to obtain a more compact representation. Landauer and Dumais (1997) claim that dimensionality reduction also improves the semantic representation encoded in the co-occurrence matrix. Finally, distances between the row vectors of the matrix are computed and, according to the Distributional Hypothesis, interpreted as a correlate of the semantic similarities between the corresponding target words.

The construction and use of a DSM involves many design choices, such as: selection of a source corpus; size of the co-occurrence window; choice of a suitable scoring function, possibly combined with an additional transformation; whether to apply dimensionality reduction, and the number of reduced dimensions; metric for measuring distances between vectors. Different design choices (technically, the DSM parameters) can result in quite different similarities for the same words (Sahlgren, 2006).

DSMs have already proven successful in modeling lexical meaning: they have been applied in Natural Language Processing (Schütze, 1998; Lin, 1998), Information Retrieval (Salton et al., 1975), and Cognitive Modeling (Landauer and Dumais, 1997; Lund and Burgess, 1996; Padó and Lapata, 2007; Baroni and Lenci, 2010). Recently, the field of Distributional Semantics has moved towards new challenges, such as predicting brain activation (Mitchell et al., 2008; Murphy et al., 2012; Bullinaria and Levy, 2013) and modeling meaning composition (Baroni et al., 2014, and references therein).

Despite such progress, a full understanding of the different parameters governing a DSM and their influence on model performance has not been achieved yet. The present paper is a contribution towards this
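
The window-based pipeline described in the introduction above (count co-occurrences, weight them with a scoring function, compare row vectors) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the configuration evaluated in the paper: the function names `cooccurrence`, `ppmi` and `cosine` are hypothetical, positive pointwise mutual information is just one possible scoring function, cosine is just one possible similarity measure, and dimensionality reduction is omitted.

```python
import math
from collections import Counter, defaultdict


def cooccurrence(tokens, window=2):
    """Count context words within `window` positions on either side of each target."""
    counts = defaultdict(Counter)
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[target][tokens[j]] += 1
    return counts


def ppmi(counts):
    """Weight raw counts by positive pointwise mutual information (assumed scoring function)."""
    total = sum(sum(row.values()) for row in counts.values())
    row_sums = {t: sum(row.values()) for t, row in counts.items()}
    col_sums = Counter()
    for row in counts.values():
        col_sums.update(row)
    weighted = {}
    for t, row in counts.items():
        weighted[t] = {}
        for c, n in row.items():
            pmi = math.log((n * total) / (row_sums[t] * col_sums[c]))
            if pmi > 0:  # keep only positive associations
                weighted[t][c] = pmi
    return weighted


def cosine(u, v):
    """Cosine similarity between two sparse row vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0
```

A typical call chain would be `vectors = ppmi(cooccurrence(tokens, window=5))` followed by `cosine(vectors[w1], vectors[w2])`. Each step corresponds to one of the DSM parameters listed in the excerpt (window size, scoring function, distance metric), so swapping a function changes exactly one design choice.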


Transactions of the Association for Computational Linguistics, vol. 2, pp. 505–516, 2014. Action Editor: Janyce Wiebe. Submitted 4/2014; Revised 8/2014; Published November 1, 2014. © 2014 Association for Computational Linguistics.

Joint Modeling of Opinion Expression Extraction and Attribute Classification

Bishan Yang
Department of Computer Science, Cornell University
bishan@cs.cornell.edu

Claire Cardie
Department of Computer Science, Cornell University
cardie@cs.cornell.edu

Abstract

In this paper, we study the problems of opinion expression extraction and expression-level polarity and intensity classification. Traditional fine-grained opinion analysis systems address these problems in isolation and thus cannot capture interactions among the textual spans of opinion expressions and their opinion-related properties. We present two types of joint approaches that can account for such interactions during 1) both learning and inference or 2) only during inference. Extensive experiments on a standard dataset demonstrate that our approaches provide substantial improvements over previously published results. By analyzing the results, we gain some insight into the advantages of different joint models.

1 Introduction

Automatic extraction of opinions from text has attracted considerable attention in recent years. In particular, significant research has focused on extracting detailed information for opinions at the fine-grained level, e.g. identifying opinion expressions within a sentence and predicting phrase-level polarity and intensity. The ability to extract fine-grained opinion information is crucial in supporting many opinion-mining applications such as opinion summarization, opinion-oriented question answering and opinion retrieval.

In this paper, we focus on the problem of identifying opinion expressions and classifying their attributes. We consider as an opinion expression any subjective expression that explicitly or implicitly conveys emotions, sentiments, beliefs, opinions (i.e. private states) (Wiebe et al., 2005), and consider two key attributes, polarity and intensity, for characterizing the opinions. Consider the sentence in Figure 1, for example. The phrases "a bias in favor of" and "being severely criticized" are opinion expressions containing positive sentiment with medium intensity and negative sentiment with high intensity, respectively.

Most existing approaches tackle the tasks of opinion expression extraction and attribute classification in isolation. The first task is typically formulated as a sequence labeling problem, where the goal is to label the boundaries of text spans that correspond to opinion expressions (Breck et al., 2007; Yang and Cardie, 2012). The second task is usually treated as a binary or multi-class classification problem (Wilson et al., 2005; Choi and Cardie, 2008; Yessenalina and Cardie, 2011), where the goal is to assign a class label to a text fragment (e.g. a phrase or a sentence). Solutions to the two tasks can be applied in a pipeline architecture to extract opinion expressions and their attributes. However, pipeline systems suffer from error propagation: opinion expression errors propagate and lead to unrecoverable errors in attribute classification.

Limited work has been done on the joint modeling of opinion expression extraction and attribute classification. Choi and Cardie (2010) first proposed a joint sequence labeling approach to extract opinion expressions and label them with polarity and intensity. Their approach treats both expression extraction and attribute classification as token-level se-


Transactions of the Association for Computational Linguistics, 2 (2014) 465–476. Action Editor: Kristina Toutanova. Submitted 11/2013; Revised 5/2014; Revised 9/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

Online Adaptor Grammars with Hybrid Inference

Ke Zhai
Computer Science and UMIACS, University of Maryland, College Park, MD USA
zhaike@cs.umd.edu

Jordan Boyd-Graber
Computer Science, University of Colorado, Boulder, CO USA
jordan.boyd.graber@colorado.edu

Shay B. Cohen
School of Informatics, University of Edinburgh, Edinburgh, Scotland, UK
scohen@inf.ed.ac.uk

Abstract

Adaptor grammars are a flexible, powerful formalism for defining nonparametric, unsupervised models of grammar productions. This flexibility comes at the cost of expensive inference. We address the difficulty of inference through an online algorithm which uses a hybrid of Markov chain Monte Carlo and variational inference. We show that this inference strategy improves scalability without sacrificing performance on unsupervised word segmentation and topic modeling tasks.

1 Introduction

Nonparametric Bayesian models are effective tools to discover latent structure in data (Müller and Quintana, 2004). These models have had great success in text analysis, especially syntax (Shindo et al., 2012). Nonparametric distributions provide support over the countably infinite long-tailed distributions common in natural language (Goldwater et al., 2011).

We focus on adaptor grammars (Johnson et al., 2006), syntactic nonparametric models based on probabilistic context-free grammars. Adaptor grammars weaken the strong statistical independence assumptions PCFGs make (Section 2).

The weaker statistical independence assumptions that adaptor grammars make come at the cost of expensive inference. Adaptor grammars are not alone in this trade-off. For example, nonparametric extensions of topic models (Teh et al., 2006) have substantially more expensive inference than their parametric counterparts (Yao et al., 2009).

A common approach to address this computational bottleneck is through variational inference (Wainwright and Jordan, 2008). One of the advantages of variational inference is that it can be easily parallelized (Nallapati et al., 2007) or transformed into an online algorithm (Hoffman et al., 2010), which often converges in fewer iterations than batch variational inference.

Past variational inference techniques for adaptor grammars assume a preprocessing step that looks at all available data to establish the support of these nonparametric distributions (Cohen et al., 2010). Therefore, these past approaches are not directly amenable to online inference.

Markov chain Monte Carlo (MCMC) inference, an alternative to variational inference, does not have this disadvantage. MCMC is easier to implement, and it discovers the support of nonparametric models during inference rather than assuming it a priori.

We apply stochastic hybrid inference (Mimno et al., 2012) to adaptor grammars to get the best of both worlds. We interleave MCMC inference inside variational inference. This preserves the scalability of variational inference while adding the sparse statistics and improved exploration MCMC provides.

Our inference algorithm for adaptor grammars starts with a variational algorithm similar to Cohen et al. (2010) and adds hybrid sampling within variational inference (Section 3). This obviates the need for expensive preprocessing and is a necessary step to create an online algorithm for adaptor grammars.

Our online extension (Section 4) processes examples in small batches taken from a stream of data. As data arrive, the algorithm dynamically extends the underlying approximate posterior distributions as more data are observed. This makes the algorithm flexible, scalable, and amenable to datasets that cannot be examined exhaustively because of their size (e.g., terabytes of social media data appear every second) or their nature (e.g., speech acquisition, where a language learner is limited to the bandwidth of the human perceptual system and cannot acquire data in a monolithic batch) (Börschinger and Johnson, 2012).

We show our approach's scalability and effective-


Transactions of the Association for Computational Linguistics, 2 (2014) 435–448. Action Editor: Sharon Goldwater. Submitted 8/2014; Revised 10/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

Extracting Lexically Divergent Paraphrases from Twitter

Wei Xu (1), Alan Ritter (2), Chris Callison-Burch (1), William B. Dolan (3) and Yangfeng Ji (4)
(1) University of Pennsylvania, Philadelphia, PA, USA
{xwe,ccb}@cis.upenn.edu
(2) The Ohio State University, Columbus, OH, USA
ritter.1492@osu.edu
(3) Microsoft Research, Redmond, WA, USA
billdol@microsoft.com
(4) Georgia Institute of Technology, Atlanta, GA, USA
jiyfeng@gatech.edu

Abstract

We present MULTIP (Multi-instance Learning Paraphrase Model), a new model suited to identify paraphrases within the short messages on Twitter. We jointly model paraphrase relations between word and sentence pairs and assume only sentence-level annotations during learning. Using this principled latent variable model alone, we achieve performance competitive with a state-of-the-art method which combines a latent space model with a feature-based supervised classifier. Our model also captures lexically divergent paraphrases that differ from yet complement previous methods; combining our model with previous work significantly outperforms the state-of-the-art. In addition, we present a novel annotation methodology that has allowed us to crowdsource a paraphrase corpus from Twitter. We make this new dataset available to the research community.

1 Introduction

Paraphrases are alternative linguistic expressions of the same or similar meaning (Bhagat and Hovy, 2013). Twitter engages millions of users, who naturally talk about the same topics simultaneously and frequently convey similar meaning using diverse linguistic expressions. The unique characteristics of this user-generated text present new challenges and opportunities for paraphrase research (Xu et al., 2013b; Wang et al., 2013). For many applications, like automatic summarization, first story detection (Petrović et al., 2012) and search (Zanzotto et al., 2011), it is crucial to resolve redundancy in tweets (e.g. oscar nom'd doc ↔ Oscar-nominated documentary).

In this paper, we investigate the task of determining whether two tweets are paraphrases. Previous work has exploited a pair of shared named entities to locate semantically equivalent patterns from related news articles (Shinyama et al., 2002; Sekine, 2005; Zhang and Weld, 2013). But short sentences in Twitter do not often mention two named entities (Ritter et al., 2012) and require nontrivial generalization from named entities to other words. For example, consider the following two sentences about basketball player Brook Lopez from Twitter:

◦ That boy Brook Lopez with a deep 3
◦ brook lopez hit a 3 and i missed it

Although these sentences do not have many words in common, the identical word "3" is a strong indicator that the two sentences are paraphrases.

We therefore propose a novel joint word-sentence approach, incorporating a multi-instance learning assumption (Dietterich et al., 1997) that two sentences under the same topic (we highlight topics in bold) are paraphrases if they contain at least one word pair (we call it an anchor and highlight with underscores; the words in the anchor pair need not be identical) that is indicative of sentential paraphrase. This at-least-one-anchor assumption might be ineffective for long or randomly paired sentences, but holds up better for short sentences that are temporally and topically related on Twitter. Moreover, our model design (see Figure 1) allows exploitation of arbitrary features and linguistic resources, such as part-of-speech features and a normalization lex-
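
The at-least-one-anchor assumption described above can be read as a logical OR over word pairs. The following Python fragment is only a schematic illustration of that reading, not the MULTIP model itself: the function name `sentence_pair_is_paraphrase`, the per-pair scores and the 0.5 threshold are all hypothetical, and the real model learns which word pairs are anchors as latent variables.

```python
def sentence_pair_is_paraphrase(word_pair_scores, threshold=0.5):
    """Deterministic OR over word pairs.

    `word_pair_scores` maps (word_from_sentence_1, word_from_sentence_2) to a
    hypothetical anchor score in [0, 1]. The sentence pair is labeled a
    paraphrase if at least one word pair scores above the threshold.
    """
    return any(score > threshold for score in word_pair_scores.values())
```

Under this reading, a single strongly indicative pair such as ("3", "3") in the Brook Lopez example is enough to label the sentence pair a paraphrase, regardless of how many other word pairs are unrelated.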


Transactions of the Association for Computational Linguistics, 2 (2014) 419–434. Action Editor: Alexander Koller.

Submitted 10/2013; Revised 6/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

Building a State-of-the-Art Grammatical Error Correction System

Alla Rozovskaya, Center for Computational Learning Systems, Columbia University, New York, NY 10115, alla@ccls.columbia.edu
Dan Roth, Department of Computer Science, University of Illinois, Urbana, IL 61801, danr@illinois.edu

Abstract

This paper identifies and examines the key principles underlying building a state-of-the-art grammatical error correction system. We do this by analyzing the Illinois system that placed first among seventeen teams in the recent CoNLL-2013 shared task on grammatical error correction. The system focuses on five different types of errors common among non-native English writers. We describe four design principles that are relevant for correcting all of these errors, analyze the system along these dimensions, and show how each of these dimensions contributes to the performance.

1 Introduction

The field of text correction has seen an increased interest in the past several years, with a focus on correcting grammatical errors made by English as a Second Language (ESL) learners. Three competitions devoted to error correction for non-native writers took place recently: HOO-2011 (Dale and Kilgarriff, 2011), HOO-2012 (Dale et al., 2012), and the CoNLL-2013 shared task (Ng et al., 2013). The most recent and most prominent among these, the CoNLL-2013 shared task, covers several common ESL errors, including article and preposition usage mistakes, mistakes in noun number, and various verb errors, as illustrated in Fig. 1.¹ Seventeen teams that participated in the task developed a wide array of approaches that include discriminative classifiers, language models, statistical machine-translation systems, and rule-based modules. Many of the systems also made use of linguistic resources such as additional annotated learner corpora, and defined high-level features that take into account syntactic and semantic knowledge.

Nowadays *phone/phones *has/have many functionalities, *included/including *∅/a camera and *∅/a Wi-Fi receiver.
Figure 1: Examples of representative ESL errors.

Even though the systems incorporated similar resources, the scores varied widely. The top system, from the University of Illinois, obtained an F1 score of 31.20,² while the second team scored 25.01 and the median result was 8.48 points.³ These results suggest that there is not enough understanding of what works best and what elements are essential for building a state-of-the-art error correction system.

In this paper, we identify key principles for building a robust grammatical error correction system and show their importance in the context of the shared task. We do this by analyzing the Illinois system and evaluating it along several dimensions: choice

¹ The CoNLL-2014 shared task that completed at the time of writing this paper was an extension of the CoNLL-2013 competition (Ng et al., 2014) but addressed all types of errors. The Illinois-Columbia submission, a slightly extended version of the Illinois CoNLL-2013 system, ranked at the top. For a description of the Illinois-Columbia submission, we refer the reader to Rozovskaya et al. (2014a).
² The state-of-the-art performance of the Illinois system discussed here is with respect to individual components for different errors. Improvements in Rozovskaya and Roth (2013) over the Illinois system that are due to joint learning and inference are orthogonal, and the analysis in this paper still applies there.
³ F1 might not be the ideal metric for this task but this was the one chosen in the evaluation. See more in Sec. 6.


Transactions of the Association for Computational Linguistics, 2 (2014) 405–418. Action Editor: Mark Steedman.

Submitted 4/2014; Revised 8/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

A New Parsing Algorithm for Combinatory Categorial Grammar

Marco Kuhlmann, Department of Computer and Information Science, Linköping University, Sweden, marco.kuhlmann@liu.se
Giorgio Satta, Department of Information Engineering, University of Padua, Italy, satta@dei.unipd.it

Abstract

We present a polynomial-time parsing algorithm for CCG, based on a new decomposition of derivations into small, shareable parts. Our algorithm has the same asymptotic complexity, O(n^6), as a previous algorithm by Vijay-Shanker and Weir (1993), but is easier to understand, implement, and prove correct.

1 Introduction

Combinatory Categorial Grammar (CCG; Steedman and Baldridge (2011)) is a lexicalized grammar formalism that belongs to the class of so-called mildly context-sensitive formalisms, as characterized by Joshi (1985). CCG has been successfully used for a wide range of practical tasks including data-driven parsing (Clark and Curran, 2007), wide-coverage semantic construction (Bos et al., 2004; Kwiatkowski et al., 2010; Lewis and Steedman, 2013) and machine translation (Weese et al., 2012).

Several parsing algorithms for CCG have been presented in the literature. Earlier proposals show running time exponential in the length of the input string (Pareschi and Steedman, 1987; Tomita, 1988). A breakthrough came with the work of Vijay-Shanker and Weir (1990) and Vijay-Shanker and Weir (1993), who report the first polynomial-time algorithm for CCG parsing. Until this day, this algorithm, which we shall refer to as the V&W algorithm, remains the only published polynomial-time parsing algorithm for CCG. However, we are not aware of any practical parser for CCG that actually uses it. We speculate that this has two main reasons: First, some authors have argued that linguistic resources available for CCG can be covered with context-free fragments of the formalism (Fowler and Penn, 2010), for which more efficient parsing algorithms can be given. Second, the V&W algorithm is considerably more complex than parsing algorithms for equivalent mildly context-sensitive formalisms, such as Tree-Adjoining Grammar (Joshi and Schabes, 1997), and is quite hard to understand, implement, and prove correct.

The V&W algorithm is based on a special decomposition of CCG derivations into smaller parts that can then be shared among different derivations. This sharing is the key to the polynomial runtime. In this article we build on the same idea, but develop an alternative polynomial-time algorithm for CCG parsing. The new algorithm is based on a different decomposition of CCG derivations, and is arguably simpler than the V&W algorithm in at least two respects: First, the new algorithm uses only three basic steps, against the nine basic steps of the V&W parser. Second, the correctness proof of the new algorithm is simpler than the one reported by Vijay-Shanker and Weir (1993). The new algorithm runs in time O(n^6), where n is the length of the input string, the same as the V&W parser.

We organize our presentation as follows. In Section 2 we introduce CCG and the central notion of derivation trees. In Section 3 we start with a simple but exponential-time parser for CCG, from which we derive our polynomial-time parser in Section 4. Section 5 further simplifies the algorithm and proves its correctness. We then provide a discussion of our algorithm and possible extensions in Section 6. Section 7 concludes the article.


Transactions of the Association for Computational Linguistics, 2 (2014) 393–404. Action Editor: Robert C. Moore.

Submitted 2/2014; Revised 6/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

Locally Non-Linear Learning for Statistical Machine Translation via Discretization and Structured Regularization

Jonathan H. Clark*, Chris Dyer†, Alon Lavie†
*Microsoft Research, Redmond, WA 98052, USA, jonathan.clark@microsoft.com
†Carnegie Mellon University, Pittsburgh, PA 15213, USA, {cdyer,alavie}@cs.cmu.edu

Abstract

Linear models, which support efficient learning and inference, are the workhorses of statistical machine translation; however, linear decision rules are less attractive from a modeling perspective. In this work, we introduce a technique for learning arbitrary, rule-local, non-linear feature transforms that improve model expressivity, but do not sacrifice the efficient inference and learning associated with linear models. To demonstrate the value of our technique, we discard the customary log transform of lexical probabilities and drop the phrasal translation probability in favor of raw counts. We observe that our algorithm learns a variation of a log transform that leads to better translation quality compared to the explicit log transform. We conclude that non-linear responses play an important role in SMT, an observation that we hope will inform the efforts of feature engineers.

1 Introduction

Linear models using log-transformed probabilities as features have emerged as the dominant model in MT systems. This practice can be traced back to the IBM noisy channel models (Brown et al., 1993), which decompose decoding into the product of a translation model (TM) and a language model (LM), motivated by Bayes' Rule. When Och and Ney (2002) introduced a log-linear model for translation (a linear sum of log-space features), they noted that the noisy channel model was a special case of their model using log probabilities. This same formulation persisted even after the introduction of MERT (Och, 2003), which optimizes a linear model; again, using two log probability features (TM and LM) with equal weight recovered the noisy channel model. Yet systems now use many more features, some of which are not even probabilities. We no longer believe that equal weights between the TM and LM provide optimal translation quality; the probabilities in the TM do not obey the chain rule nor Bayes' rule, nullifying several theoretical mathematical justifications for multiplying probabilities. The story of multiplying probabilities may just amount to heavily penalizing small values.

The community has abandoned the original motivations for a linear interpolation of two log-transformed features. Is there empirical evidence that we should continue using this particular transformation? Do we have any reason to believe it is better than other non-linear transformations? To answer these, we explore the issue of non-linearity in models for MT. In the process, we will discuss the impact of linearity on feature engineering and develop a general mechanism for learning a class of non-linear transformations of real-valued features.

Applying a non-linear transformation such as log to features is one way of achieving a non-linear response function, even if those features are aggregated in a linear model. Alternatively, we could achieve a non-linear response using a natively non-linear model such as a SVM (Wang et al., 2007) or RankBoost (Sokolov et al., 2012). However, MT is a structured prediction problem, in which a full hypothesis is composed of partial hypotheses. MT decoders take advantage of the fact that the model

* This work was conducted as part of the first author's Ph.D. work at Carnegie Mellon University.


Transactions of the Association for Computational Linguistics, 2 (2014) 351–362. Action Editor: Hal Daume III.

Submitted 2/2014; Revised 5/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

TREETALK: Composition and Compression of Trees for Image Descriptions

Polina Kuznetsova†, Vicente Ordonez‡, Tamara L. Berg‡, Yejin Choi††
†Stony Brook University, Stony Brook, NY, pkuznetsova@cs.stonybrook.edu
‡UNC Chapel Hill, Chapel Hill, NC, {vicente,tlberg}@cs.unc.edu
††University of Washington, Seattle, WA, yejin@cs.washington.edu

Abstract

We present a new tree based approach to composing expressive image descriptions that makes use of naturally occurring web images with captions. We investigate two related tasks: image caption generalization and generation, where the former is an optional subtask of the latter. The high-level idea of our approach is to harvest expressive phrases (as tree fragments) from existing image descriptions, then to compose a new description by selectively combining the extracted (and optionally pruned) tree fragments. Key algorithmic components are tree composition and compression, both integrating tree structure with sequence structure. Our proposed system attains significantly better performance than previous approaches for both image caption generalization and generation. In addition, our work is the first to show the empirical benefit of automatically generalized captions for composing natural image descriptions.

1 Introduction

The web is increasingly visual, with hundreds of billions of user contributed photographs hosted online. A substantial portion of these images have some sort of accompanying text, ranging from keywords, to free text on web pages, to textual descriptions directly describing depicted image content (i.e. captions). We tap into the last kind of text, using naturally occurring pairs of images with natural language descriptions to compose expressive descriptions for query images via tree composition and compression.

Such automatic image captioning efforts could potentially be useful for many applications: from automatic organization of photo collections, to facilitating image search with complex natural language queries, to enhancing web accessibility for the visually impaired. On the intellectual side, by learning to describe the visual world from naturally existing web data, our study extends the domains of language grounding to the highly expressive language that people use in their everyday online activities.

There has been a recent spike in efforts to automatically describe visual content in natural language (Yang et al., 2011; Kulkarni et al., 2011; Li et al., 2011; Farhadi et al., 2010; Krishnamoorthy et al., 2013; Elliott and Keller, 2013; Yu and Siskind, 2013; Socher et al., 2014). This reflects the long standing understanding that encoding the complexities and subtleties of image content often requires more expressive language constructs than a set of tags. Now that visual recognition algorithms are beginning to produce reliable estimates of image content (Perronnin et al., 2012; Deng et al., 2012a; Deng et al., 2010; Krizhevsky et al., 2012), the time seems ripe to begin exploring higher level semantic tasks.

There have been two main complementary directions explored for automatic image captioning. The first focuses on describing exactly those items (e.g., objects, attributes) that are detected by vision recognition, which subsequently confines what should be described and how (Yao et al., 2010; Kulkarni et al., 2011; Kojima et al., 2002). Approaches in this direction could be ideal for various practical applications such as image description for the visually impaired. However, it is not clear whether the semantic expressiveness of these approaches can eventually scale up to the casual, but highly expressive language peo-


Transactions of the Association for Computational Linguistics, 2 (2014) 297–310. Action Editor: Hal Daume III.

Submitted 5/2014; Revised 6/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

Exploiting Social Network Structure for Person-to-Person Sentiment Analysis

Robert West, Stanford University, west@cs.stanford.edu
Hristo S. Paskov, Stanford University, hpaskov@stanford.edu
Jure Leskovec, Stanford University, jure@cs.stanford.edu
Christopher Potts, Stanford University, cgpotts@stanford.edu

Abstract

Person-to-person evaluations are prevalent in all kinds of discourse and important for establishing reputations, building social bonds, and shaping public opinion. Such evaluations can be analyzed separately using signed social networks and textual sentiment analysis, but this misses the rich interactions between language and social context. To capture such interactions, we develop a model that predicts individual A's opinion of individual B by synthesizing information from the signed social network in which A and B are embedded with sentiment analysis of the evaluative texts relating A to B. We prove that this problem is NP-hard but can be relaxed to an efficiently solvable hinge-loss Markov random field, and we show that this implementation outperforms text-only and network-only versions in two very different datasets involving community-level decision-making: the Wikipedia Requests for Adminship corpus and the Convote U.S. Congressional speech corpus.

1 Introduction

People's evaluations of one another are prevalent in all kinds of discourse, public and private, across ages, genders, cultures, and social classes (Dunbar, 2004). Such opinions matter for establishing reputations and reinforcing social bonds, and they are especially consequential in political contexts, where they take the form of endorsements, accusations, and assessments intended to sway public opinion.

The significance of such person-to-person evaluations means that there is a pressing need for computational models and technologies that can analyze them. Research on signed social networks suggests one path forward: how one person will evaluate another can often be predicted from the network they are embedded in. Linguistic sentiment analysis suggests another path forward: one could leverage textual features to predict the valence of evaluative texts describing people. Such independent efforts have been successful, but they generally neglect the ways in which social and linguistic features complement each other. In some settings, textual data is sparse but the network structure is largely observed; in others, text is abundant but the network is partly or unreliably recorded. In addition, we often see rich interactions between the two kinds of information: political allies might tease each other with negative language to enhance social bonds, and opponents often use sarcastically positive language in their criticisms. Separate sentiment or signed-network models will miss or misread these signals.

We develop (Sec. 3) a graphical model that synthesizes network and linguistic information to make more and better predictions about both. The objective of the model is to predict A's opinion of B using a synthesis of the structural context around A and B inside the social network and sentiment analysis of the evaluative texts relating A to B. We prove that the problem is NP-hard but that it can be relaxed to an efficiently solvable hinge-loss Markov random field (Broecheler et al., 2010), and we show that this implementation outperforms text-only and network-only versions in two very different datasets involving community-level decision-making: the Wikipedia Requests for Adminship corpus, in which Wikipedia editors discuss and vote on who should be


Transactions of the Association for Computational Linguistics, 2 (2014) 245–258. Action Editor: Patrick Pantel.

Submitted 11/2013; Revised 3/2014; Published 10/2014. © 2014 Association for Computational Linguistics.

Crosslingual and Multilingual Construction of Syntax-Based Vector Space Models

Jason Utt and Sebastian Padó
Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart
[uttjn|pado]@ims.uni-stuttgart.de

Abstract

Syntax-based distributional models of lexical semantics provide a flexible and linguistically adequate representation of co-occurrence information. However, their construction requires large, accurately parsed corpora, which are unavailable for most languages.

In this paper, we develop a number of methods to overcome this obstacle. We describe (a) a crosslingual approach that constructs a syntax-based model for a new language requiring only an English resource and a translation lexicon; and (b) multilingual approaches that combine crosslingual with monolingual information, subject to availability. We evaluate on two lexical semantic benchmarks in German and Croatian. We find that the models exhibit complementary profiles: crosslingual models yield higher accuracies while monolingual models provide better coverage. In addition, we show that simple multilingual models can successfully combine their strengths.

1 Introduction

Building on the Distributional Hypothesis (Harris, 1954; Miller and Charles, 1991), which states that words occurring in similar contexts are similar in meaning, distributional semantic models (DSMs) represent a word's meaning via its occurrence in context in large corpora. Vector spaces, the most widely used type of DSMs, represent words as vectors in a high-dimensional space whose dimensions correspond to features of the words' contexts. Word spaces represent the simplest case of DSMs in which the dimensions are simply the context words (Schütze, 1992).

A notable subclass of DSMs are syntax-based models (Lin, 1998; Baroni and Lenci, 2010) which use (lexicalized) syntactic relations as dimensions. They are able to model more fine-grained distinctions than word spaces and have been found to be useful for tasks such as selectional preference learning (Erk et al., 2010), verb class induction (Schulte im Walde, 2006), analogical reasoning (Turney, 2006), and alternation discovery (Joanis et al., 2006). Despite their flexibility and usefulness, syntax-based DSMs are used less often than word-based spaces. An important reason is that their construction requires accurate parsers, which are unavailable for many languages. In addition, syntax-based DSMs are inherently more sparse than word spaces, which calls for a large corpus of well parsable data. It is thus not surprising that besides English (Baroni and Lenci, 2010), only few other languages possess large-scale syntax-based DSMs (Padó and Utt, 2012; Šnajder et al., 2013).

This paper develops methods that take advantage of the resource gradient between English and other languages, exploiting the higher-quality resources of the former to induce resources for target languages among the latter, by translating the word-link-word co-occurrences that underlie syntax-based DSMs. This directly provides a crosslingual method to construct syntax-based DSMs for target languages without any target language data, requiring only an English syntax-based DSM and a translation lexicon. Such lexicons are available for many language pairs, and we outline a method to reduce ambiguity inherent in such dictionaries. We describe a set of multilingual methods that can combine corpus evidence from English and the target language to further improve the performance of the obtained DSM.

We consider two target languages, German and Croatian, as examples of one close and one more remote target language. For evaluation, we use two


Transactions of the Association for Computational Linguistics, 2 (2014) 219–230. Action Editor: Alexander Koller.

Submitted 11/2013; Revised 1/2014; Published 5/2014. © 2014 Association for Computational Linguistics.

Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence

Md Arafat Sultan†, Steven Bethard‡ and Tamara Sumner†
†Institute of Cognitive Science and Department of Computer Science, University of Colorado Boulder
‡Department of Computer and Information Sciences, University of Alabama at Birmingham
arafat.sultan@colorado.edu, bethard@cis.uab.edu, sumner@colorado.edu

Abstract

We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources. Based on the hypothesis that words with similar meanings represent potential pairs for alignment if located in similar contexts, we propose a system that operates by finding such pairs. In two intrinsic evaluations on alignment test data, our system achieves F1 scores of 88–92%, demonstrating 1–3% absolute improvement over the previous best system. Moreover, in two extrinsic evaluations our aligner outperforms existing aligners, and even a naive application of the aligner approaches state-of-the-art performance in each extrinsic task.

1 Introduction

Monolingual alignment is the task of discovering and aligning similar semantic units in a pair of sentences expressed in a natural language. Such alignments provide valuable information regarding how and to what extent the two sentences are related. Consequently, alignment is a central component of a number of important tasks involving text comparison: textual entailment recognition, textual similarity identification, paraphrase detection, question answering and text summarization, to name a few.

The high utility of monolingual alignment has spawned significant research on the topic in the recent past. Major efforts that have treated alignment as a standalone problem (MacCartney et al., 2008; Thadani and McKeown, 2011; Yao et al., 2013a) are primarily supervised, thanks to the manually aligned corpus with training and test sets from Microsoft Research (Brockett, 2007). Primary concerns of such work include both quality and speed, due to the fact that alignment is frequently a component of larger NLP tasks.

Driven by similar motivations, we seek to devise a lightweight, easy-to-construct aligner that produces high-quality output and is applicable to various end tasks. Amid a variety of problem formulations and ingenious approaches to alignment, we take a step back and examine closely the effectiveness of two frequently made assumptions: 1) Related semantic units in two sentences must be similar or related in their meaning, and 2) Commonalities in their semantic contexts in the respective sentences provide additional evidence of their relatedness (MacCartney et al., 2008; Thadani and McKeown, 2011; Yao et al., 2013a; Yao et al., 2013b). Alignment, based solely on these two assumptions, reduces to finding the best combination of pairs of similar semantic units in similar contexts.

Exploiting existing resources to identify similarity of semantic units, we search for robust techniques to identify contextual commonalities. Dependency trees are a commonly used structure for this purpose. While they remain a central part of our aligner, we expand the horizons of dependency-based alignment beyond exact matching by systematically exploiting the notion of “type equivalence” with a small hand-crafted set of equivalent dependency types. In addition, we augment dependency-based alignment with surface-level text analysis.

While phrasal alignments are important and have


Transactions of the Association for Computational Linguistics, 2 (2014) 207–218. Action Editor: Alexander Clark.

Submitted 10/2013; Revised 3/2014; Published 4/2014. © 2014 Association for Computational Linguistics.

Grounded Compositional Semantics for Finding and Describing Images with Sentences

Richard Socher, Andrej Karpathy, Quoc V. Le*, Christopher D. Manning, Andrew Y. Ng
Stanford University, Computer Science Department, *Google Inc.
richard@socher.org, karpathy@cs.stanford.edu, qvl@google.com, manning@stanford.edu, ang@cs.stanford.edu

Abstract

Previous work on Recursive Neural Networks (RNNs) shows that these models can produce compositional feature vectors for accurately representing and classifying sentences or images. However, the sentence vectors of previous models cannot accurately represent visually grounded meaning. We introduce the DT-RNN model which uses dependency trees to embed sentences into a vector space in order to retrieve images that are described by those sentences. Unlike previous RNN-based models which use constituency trees, DT-RNNs naturally focus on the action and agents in a sentence. They are better able to abstract from the details of word order and syntactic expression. DT-RNNs outperform other recursive and recurrent neural networks, kernelized CCA and a bag-of-words baseline on the tasks of finding an image that fits a sentence description and vice versa. They also give more similar representations to sentences that describe the same image.

1 Introduction

Single word vector spaces are widely used (Turney and Pantel, 2010) and successful at classifying single words and capturing their meaning (Collobert and Weston, 2008; Huang et al., 2012; Mikolov et al., 2013). Since words rarely appear in isolation, the task of learning compositional meaning representations for longer phrases has recently received a lot of attention (Mitchell and Lapata, 2010; Socher et al., 2010; Socher et al., 2012; Grefenstette et al., 2013). Similarly, classifying whole images into a fixed set of classes also achieves very high performance (Le et al., 2012; Krizhevsky et al., 2012). However, similar to words, objects in images are often seen in relationships with other objects which are not adequately described by a single label.

In this work, we introduce a model, illustrated in Fig. 1, which learns to map sentences and images into a common embedding space in order to be able to retrieve one from the other. We assume word and image representations are first learned in their respective single modalities but finally mapped into a jointly learned multimodal embedding space.

Our model for mapping sentences into this space is based on ideas from Recursive Neural Networks (RNNs) (Pollack, 1990; Costa et al., 2003; Socher et al., 2011b). However, unlike all previous RNN models which are based on constituency trees (CT-RNNs), our model computes compositional vector representations inside dependency trees. The compositional vectors computed by this new dependency tree RNN (DT-RNN) capture more of the meaning of sentences, where we define meaning in terms of similarity to a “visual representation” of the textual description. DT-RNN induced vector representations of sentences are more robust to changes in the syntactic structure or word order than related models such as CT-RNNs or Recurrent Neural Networks since they naturally focus on a sentence's action and its agents.

We evaluate and compare DT-RNN induced representations on their ability to use a sentence such as “A man wearing a helmet jumps on his bike near a beach.” to find images that show such a scene. The goal is to learn sentence representations that capture
