Transactions of the Association for Computational Linguistics, 1 (2013) 13–24. Action Editor: Giorgio Satta.
Submitted 11/2012; Published 3/2013. © 2013 Association for Computational Linguistics.
Finding Optimal 1-Endpoint-Crossing Trees

Emily Pitler, Sampath Kannan, Mitchell Marcus
Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104
epitler,kannan,mitch@seas.upenn.edu

Abstract

Dependency parsing algorithms capable of producing the types of crossing dependencies seen in natural language sentences have traditionally been orders of magnitude slower than algorithms for projective trees. For 95.8-99.8% of dependency parses in various natural language treebanks, whenever an edge is crossed, the edges that cross it all have a common vertex. The optimal dependency tree that satisfies this 1-Endpoint-Crossing property can be found with an O(n^4) parsing algorithm that recursively combines forests over intervals with one exterior point. 1-Endpoint-Crossing trees also have natural connections to linguistics and another class of graphs that has been studied in NLP.

1 Introduction

Dependency parsing is one of the fundamental problems in natural language processing today, with applications such as machine translation (Ding and Palmer, 2005), information extraction (Culotta and Sorensen, 2004), and question answering (Cui et al., 2005). Most high-accuracy graph-based dependency parsers (Koo and Collins, 2010; Rush and Petrov, 2012; Zhang and McDonald, 2012) find the highest-scoring projective trees (in which no edges cross), despite the fact that a large proportion of natural language sentences are non-projective. Projective trees can be found in O(n^3) time (Eisner, 2000), but cover only 63.6% of sentences in some natural language treebanks (Table 1).

The class of directed spanning trees covers all treebank trees and can be parsed in O(n^2) with edge-based features (McDonald et al., 2005), but it is NP-hard to find the maximum scoring such tree with grandparent or sibling features (McDonald and Pereira, 2006; McDonald and Satta, 2007).

There are various existing definitions of mildly non-projective trees with better empirical coverage than projective trees that do not have the hardness of extensibility that spanning trees do. However, these have had parsing algorithms that are orders of magnitude slower than the projective case or the edge-based spanning tree case. For example, well-nested dependency trees with block degree 2 (Kuhlmann, 2013) cover at least 95.4% of natural language structures, but have a parsing time of O(n^7) (Gómez-Rodríguez et al., 2011).

No previously defined class of trees simultaneously has high coverage and low-degree polynomial algorithms for parsing, allowing grandparent or sibling features.

We propose 1-Endpoint-Crossing trees, in which for any edge that is crossed, all other edges that cross that edge share an endpoint. While simple to state, this property covers 95.8% or more of dependency parses in natural language treebanks (Table 1). The optimal 1-Endpoint-Crossing tree can be found in faster asymptotic time than any previously proposed mildly non-projective dependency parsing algorithm. We show how any 1-Endpoint-Crossing tree can be decomposed into isolated sets of intervals with one exterior point (Section 3). This is the key insight that allows efficient parsing; the O(n^4) parsing algorithm is presented in Section 4. 1-Endpoint-Crossing trees are a subclass of 2-planar graphs (Section 5.1), a class that has been studied
in NLP. 1-Endpoint-Crossing trees also have some linguistic interpretation (pairs of cross serial verbs produce 1-Endpoint-Crossing trees, Section 5.2).

2 Definitions of Non-Projectivity

Definition 1. Edges e and f cross if e and f have distinct endpoints and exactly one of the endpoints of f lies between the endpoints of e.

Definition 2. A dependency tree is 1-Endpoint-Crossing if for any edge e, all edges that cross e share an endpoint p.

Table 1 shows the percentage of dependency parses in the CoNLL-X training sets that are 1-Endpoint-Crossing trees. Across six languages with varying amounts of non-projectivity, 95.8-99.8% of dependency parses in treebanks are 1-Endpoint-Crossing trees.¹

We next review and compare other relevant definitions of non-projectivity from prior work: well-nested with block degree 2, gap-minding, projective, and 2-planar.

The definitions of block degree and well-nestedness are given below:

Definition 3. For each node u in the tree, a block of the node is "a longest segment consisting of descendants of u." (Kuhlmann, 2013). The block-degree of u is "the number of distinct blocks of u". The block degree of a tree is the maximum block degree of any of its nodes. The gap degree is the number of gaps between these blocks, and so by definition is one less than the block degree. (Kuhlmann, 2013)

Definition 4. Two trees "T1 and T2 interleave iff there are nodes l1, r1 ∈ T1 and l2, r2 ∈ T2 such that l1
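Definitions 1 and 2 above translate directly into a small checker. The following is a minimal sketch and not code from the paper: it assumes edges are given as pairs of integer word positions (direction is ignored when testing for crossings), and the names `crosses`, `is_one_endpoint_crossing`, and `find_violation` are illustrative. `find_violation` looks for the three-edge certificate mentioned in the Conclusions (two vertex-disjoint edges that both cross a third).

```python
from itertools import combinations


def crosses(e, f):
    """Definition 1: e and f have four distinct endpoints and exactly one
    endpoint of f lies strictly between the endpoints of e."""
    a, b = sorted(e)
    c, d = sorted(f)
    if len({a, b, c, d}) < 4:
        return False
    return (a < c < b) != (a < d < b)


def is_one_endpoint_crossing(edges):
    """Definition 2: for every edge e, all edges crossing e share an endpoint."""
    for e in edges:
        crossing = [set(f) for f in edges if crosses(e, f)]
        if crossing and not set.intersection(*crossing):
            return False
    return True


def find_violation(edges):
    """Return (e, f, g) with f and g vertex-disjoint and both crossing e,
    or None if no such triple exists."""
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        for f, g in combinations(crossing, 2):
            if not set(f) & set(g):
                return e, f, g
    return None
```

For instance, the edge set `[(0, 2), (1, 3), (1, 4)]` passes the check because both edges crossing `(0, 2)` share vertex 1, whereas `[(0, 3), (1, 4), (2, 5)]` fails it.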
[Figure 8: An example of wh-movement over a potentially unbounded number of clauses. The edges between the heads of each clause cross the edges from trace to trace, but all obey 1-Endpoint-Crossing.]

Endpoint-Crossing. Psycholinguistically, between two and three verbs is exactly where there is a large change in the sentence processing abilities of human listeners (based on both grammatical judgments and scores on a comprehension task) (Bach et al., 1986).

More speculatively, there may be a connection between the form of 1-Endpoint-Crossing trees and phases (roughly, propositional units such as clauses) in Minimalism (Chomsky et al., 1998). Figure 8 shows an example of wh-movement over a potentially unbounded number of clauses. The phase-impenetrability condition (PIC) states that only the head of the phase and elements that have moved to its edge are accessible to the rest of the sentence (Chomsky et al., 1998, p. 22). Movement is therefore required to be successive cyclic, with a moved element leaving a chain of traces at the edge of each clause on its way to its final pronounced location (Chomsky, 1981). In Figure 8, notice that the crossing edges form a repeated pattern that obeys the 1-Endpoint-Crossing property. More generally, we suspect that trees satisfying the PIC will tend to also be 1-Endpoint-Crossing. Furthermore, if the traces were not at the edge of each clause, and instead were positioned between a head and one of its arguments, 1-Endpoint-Crossing would be violated. For example, if t2 in Figure 8 were between C and said2, then the edge (t1, t2) would cross (say, said1), (said1, said2), and (C, said2), which do not all share an endpoint. An exploration of these linguistic connections may be an interesting avenue for further research.

6 Conclusions

1-Endpoint-Crossing trees characterize over 95% of structures found in natural language treebanks, and can be parsed in only a factor of n more time than projective trees. The dynamic programming algorithm for projective trees (Eisner, 2000) has been extended to handle higher order factors (McDonald and Pereira, 2006; Carreras, 2007; Koo and Collins, 2010), adding at most a factor of n to the edge-based running time; it would be interesting to extend the algorithm presented here to include higher order factors. 1-Endpoint-Crossing is a condition on edges, while properties such as well-nestedness or block degree are framed in terms of subtrees. Three edges will always suffice as a certificate of a 1-Endpoint-Crossing violation (two vertex-disjoint edges that both cross a third). In contrast, for a property like ill-nestedness, two nodes might have a least common ancestor arbitrarily far away, and so one might need the entire graph to verify whether the sub-trees rooted at those nodes are disjoint and ill-nested. We have discussed cross-serial dependencies; a further exploration of which linguistic phenomena would and would not have 1-Endpoint-Crossing dependency trees may be revealing.

Acknowledgments

We would like to thank Julie Legate for an interesting discussion. This material is based upon work supported under a National Science Foundation Graduate Research Fellowship, NSF Award CCF 1137084, and Army Research Office MURI grant W911NF-07-1-0216.

A Dynamic Program to find the maximum scoring 1-Endpoint-Crossing Tree

Input: Matrix S: S[i, j] is the score of the directed edge (i, j)
Output: Maximum score of a 1-Endpoint-Crossing tree over vertices [0, n], rooted at 0
Init: ∀i Int[i, i, F, F] = Int[i, i+1, F, F] = 0
      Int[i, i, T, F] = Int[i, i, F, T] = Int[i, i, T, T] = −∞
Final: Int[0, n, F, T]

Shorthand for booleans:
TF(x, S) := if x = T, exactly one of the set S is true;
            if x = F, all of the set S must be false.

b_i, b_j, b_x are true iff the corresponding boundary point has its incoming edge (parent) in that sub-problem. For the LR sub-problem, b_i and b_j are always false, and so omitted. For all sub-problems with the suffix AFromB, the boundary point A has its parent edge in the sub-problem solution; the other two boundary points do not. For example, L_XFromI would correspond to having booleans b_i = b_j = F and b_x = T, with the restriction that x must be a descendant of i.
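The TF shorthand and the Init step can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names `tf`, `splits`, and `init_int` are hypothetical, booleans T/F are represented as Python `True`/`False`, and the Int table is assumed to be a dictionary keyed by `(i, j, b_i, b_j)`.

```python
from itertools import product

NEG_INF = float("-inf")


def tf(x, flags):
    """TF(x, S): if x is True, exactly one flag in S is True;
    if x is False, every flag in S is False."""
    return sum(bool(b) for b in flags) == (1 if x else 0)


def splits(x, k):
    """Enumerate all assignments of k booleans that satisfy TF(x, .).
    These are the boolean combinations a max over TF(...) ranges over."""
    return [bs for bs in product((False, True), repeat=k) if tf(x, bs)]


def init_int(n):
    """Base cases of the Int table over vertices 0..n (the Init step)."""
    table = {}
    for i in range(n + 1):
        table[i, i, False, False] = 0.0
        for bi, bj in ((True, False), (False, True), (True, True)):
            table[i, i, bi, bj] = NEG_INF
        if i < n:
            table[i, i + 1, False, False] = 0.0
    return table
```

A recurrence that maximizes over TF(T, {b_l, b_m, b_r}) would then iterate over `splits(True, 3)`, which assigns the shared point's single parent edge to exactly one of the three sub-problems.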
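\[
\mathrm{Int}[i,j,F,b_j] \leftarrow \max
\begin{cases}
\mathrm{Int}[i+1,j,T,F] & \text{if } b_j = F\\
S[i,j] + \mathrm{Int}[i,j,F,F] & \text{if } b_j = T\\
\max_{k\in(i,j)} S[i,k] + \max
\begin{cases}
\mathrm{Int}[i,k,F,F] + \mathrm{Int}[k,j,F,b_j]\\
\max_{TF(b_j,\{b_l,b_r\})} \mathrm{LR}[i,k,j,b_l] + \mathrm{Int}[k,j,F,b_r]\\
\max_{l\in(k,j),\,TF(T,\{b_l,b_m,b_r\})}
\begin{cases}
\mathrm{R}[i,k,l,F,F,b_l] + \mathrm{Int}[k,l,F,b_m] + \mathrm{L}[l,j,k,b_r,b_j,F]\\
\mathrm{LR}[i,k,l,b_l] + \mathrm{Int}[k,l,F,b_m] + \mathrm{Int}[l,j,b_r,b_j]
\end{cases}\\
\max_{l\in(i,k),\,TF(T,\{b_l,b_m,b_r\})}
\begin{cases}
\mathrm{Int}[i,l,F,b_l] + \mathrm{L}[l,k,i,b_m,F,F] + \mathrm{N}[k,j,l,F,b_j,b_r]\\
\mathrm{R}[i,l,k,F,b_l,F] + \mathrm{Int}[l,k,b_m,F] + \mathrm{L}[k,j,l,F,b_j,b_r]
\end{cases}
\end{cases}
\end{cases}
\]

\[
\mathrm{Int}[i,j,T,F] \leftarrow \text{symmetric to } \mathrm{Int}[i,j,F,T]
\qquad
\mathrm{Int}[i,j,T,T] \leftarrow -\infty
\]

\[
\mathrm{LR}[i,j,x,b_x] \leftarrow \max
\begin{cases}
\mathrm{L}[i,j,x,F,F,b_x]\\
\mathrm{R}[i,j,x,F,F,b_x]\\
\max_{k\in(i,j),\,TF(b_x,\{b_{xl},b_{xr}\}),\,TF(T,\{b_{kl},b_{kr}\})} \mathrm{L}[i,k,x,F,b_{kl},b_{xl}] + \mathrm{R}[k,j,x,b_{kr},F,b_{xr}]
\end{cases}
\]

\[
\mathrm{N}[i,j,x,b_i,b_j,F] \leftarrow \max
\begin{cases}
\mathrm{Int}[i,j,b_i,b_j]\\
S[x,i] + \mathrm{N}[i,j,x,F,b_j,F] & \text{if } b_i = T\\
S[x,j] + \mathrm{N}[i,j,x,b_i,F,F] & \text{if } b_j = T\\
\max_{k\in(i,j)} S[x,k] + \mathrm{N}[i,k,x,b_i,F,F] + \mathrm{Int}[k,j,F,b_j]
\end{cases}
\]

\[
\mathrm{N}[i,j,x,F,b_j,T] \leftarrow \max
\begin{cases}
S[i,x] + \mathrm{N}[i,j,x,F,b_j,F]\\
S[x,j] + \mathrm{N\_XFromI}[i,j,x] & \text{if } b_j = T\\
S[j,x] + \mathrm{N}[i,j,x,F,F,F] & \text{if } b_j = F\\
S[j,x] + \mathrm{Int}[i,j,F,T] & \text{if } b_j = T\\
\max_{k\in(i,j)} S[x,k] + \mathrm{N\_XFromI}[i,k,x] + \mathrm{Int}[k,j,F,b_j]\\
\max_{k\in(i,j)} S[k,x] + \max
\begin{cases}
\mathrm{Int}[i,k,F,T] + \mathrm{Int}[k,j,F,b_j]\\
\mathrm{N}[i,k,x,F,F,F] + \mathrm{Int}[k,j,T,b_j]
\end{cases}
\end{cases}
\]

\[
\mathrm{N}[i,j,x,T,F,T] \leftarrow \text{symmetric to } \mathrm{N}[i,j,x,F,T,T]
\qquad
\mathrm{N}[i,j,x,T,T,T] \leftarrow -\infty
\]

\[
\mathrm{N\_XFromI}[i,j,x] \leftarrow \max
\begin{cases}
S[i,x] + \mathrm{N}[i,j,x,F,F,F]\\
\max_{k\in(i,j)} \max
\begin{cases}
S[x,k] + \mathrm{N\_XFromI}[i,k,x] + \mathrm{Int}[k,j,F,F]\\
S[k,x] + \mathrm{Int}[i,k,F,T] + \mathrm{Int}[k,j,F,F]
\end{cases}
\end{cases}
\]

\[
\mathrm{N\_IFromX}[i,j,x] \leftarrow \max
\begin{cases}
S[x,i] + \mathrm{N}[i,j,x,F,F,F]\\
\max_{k\in(i,j)} S[x,k] + \mathrm{N}[i,k,x,T,F,F] + \mathrm{Int}[k,j,F,F]
\end{cases}
\]

\[
\mathrm{N\_XFromJ}[i,j,x] \leftarrow \text{symmetric to } \mathrm{N\_XFromI}[i,j,x]
\qquad
\mathrm{N\_JFromX}[i,j,x] \leftarrow \text{symmetric to } \mathrm{N\_IFromX}[i,j,x]
\]

\[
\mathrm{L}[i,j,x,b_i,b_j,F] \leftarrow \max
\begin{cases}
\mathrm{Int}[i,j,b_i,b_j]\\
S[x,i] + \mathrm{L}[i,j,x,F,b_j,F] & \text{if } b_i = T\\
S[x,j] + \mathrm{L}[i,j,x,b_i,F,F] & \text{if } b_j = T\\
\max_{k\in(i,j),\,TF(b_i,\{b_l,b_r\})} S[x,k] + \max
\begin{cases}
\mathrm{L}[i,k,x,b_l,F,F] + \mathrm{N}[k,j,i,F,b_j,b_r]\\
\mathrm{Int}[i,k,b_l,F] + \mathrm{L}[k,j,i,F,b_j,b_r]
\end{cases}
\end{cases}
\]

\[
\mathrm{L}[i,j,x,F,b_j,T] \leftarrow \max
\begin{cases}
S[i,x] + \mathrm{L}[i,j,x,F,b_j,F]\\
S[x,j] + \mathrm{L\_XFromI}[i,j,x] & \text{if } b_j = T\\
S[j,x] + \mathrm{L}[i,j,x,F,F,F] & \text{if } b_j = F\\
S[j,x] + \mathrm{L\_JFromI}[i,j,x] & \text{if } b_j = T\\
\max_{k\in(i,j)} S[x,k] + \mathrm{L\_XFromI}[i,k,x] + \mathrm{N}[k,j,i,F,b_j,F]\\
\max_{k\in(i,j)} S[k,x] + \max
\begin{cases}
\mathrm{L\_JFromI}[i,k,x] + \mathrm{N}[k,j,i,F,b_j,F]\\
\mathrm{L}[i,k,x,F,F,F] + \mathrm{N}[k,j,i,T,b_j,F]\\
\max_{TF(T,\{b_l,b_r\})} \mathrm{Int}[i,k,F,b_l] + \mathrm{L}[k,j,i,b_r,b_j,F]
\end{cases}
\end{cases}
\]

\[
\mathrm{L}[i,j,x,T,b_j,T] \leftarrow \text{not reachable}
\]

\[
\mathrm{L\_XFromI}[i,j,x] \leftarrow \max
\begin{cases}
S[i,x] + \mathrm{L}[i,j,x,F,F,F]\\
\max_{k\in(i,j)} S[x,k] + \mathrm{L\_XFromI}[i,k,x] + \mathrm{N}[k,j,i,F,F,F]\\
\max_{k\in(i,j)} S[k,x] + \max
\begin{cases}
\mathrm{L\_JFromI}[i,k,x] + \mathrm{N}[k,j,i,F,F,F]\\
\mathrm{L}[i,k,x,F,F,F] + \mathrm{N\_IFromX}[k,j,i]\\
\mathrm{Int}[i,k,F,T] + \mathrm{L}[k,j,i,F,F,F]\\
\mathrm{Int}[i,k,F,F] + \mathrm{L\_IFromX}[k,j,i]
\end{cases}
\end{cases}
\]

\[
\mathrm{L\_IFromX}[i,j,x] \leftarrow \max
\begin{cases}
S[x,i] + \mathrm{L}[i,j,x,F,F,F]\\
\max_{k\in(i,j)} S[x,k] + \max
\begin{cases}
\mathrm{L}[i,k,x,T,F,F] + \mathrm{N}[k,j,i,F,F,F]\\
\mathrm{L}[i,k,x,F,F,F] + \mathrm{N\_XFromI}[k,j,i]\\
\mathrm{Int}[i,k,T,F] + \mathrm{L}[k,j,i,F,F,F]\\
\mathrm{Int}[i,k,F,F] + \mathrm{L\_XFromI}[k,j,i]
\end{cases}
\end{cases}
\]

\[
\mathrm{L\_JFromX}[i,j,x] \leftarrow \max
\begin{cases}
S[x,j] + \mathrm{L}[i,j,x,F,F,F]\\
\max_{k\in(i,j)} S[x,k] + \max
\begin{cases}
\mathrm{L}[i,k,x,F,F,F] + \mathrm{Int}[k,j,F,T]\\
\mathrm{Int}[i,k,F,F] + \mathrm{L\_JFromI}[k,j,i]
\end{cases}
\end{cases}
\]

\[
\mathrm{L\_JFromI}[i,j,x] \leftarrow \max
\begin{cases}
\mathrm{Int}[i,j,F,T]\\
\max_{k\in(i,j)} S[x,k] + \max
\begin{cases}
\mathrm{L}[i,k,x,F,F,F] + \mathrm{N\_JFromX}[k,j,i]\\
\mathrm{Int}[i,k,F,F] + \mathrm{L\_JFromX}[k,j,i]
\end{cases}
\end{cases}
\]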
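Because the recurrences are intricate, an implementation is easier to trust when it can be compared against exhaustive search on tiny inputs. The sketch below is such a reference oracle and is explicitly not the O(n^4) dynamic program: it enumerates every head assignment, so it is only usable for very small n. It assumes the Input/Output convention of this appendix (S[h][c] scores the directed edge from head h to child c, vertex 0 is the root), and additionally assumes that edges leaving the root are treated like any other edge when testing crossings. The function names are illustrative.

```python
from itertools import product


def crosses(e, f):
    """Definition 1: four distinct endpoints, exactly one endpoint of f
    strictly between the endpoints of e."""
    a, b = sorted(e)
    c, d = sorted(f)
    if len({a, b, c, d}) < 4:
        return False
    return (a < c < b) != (a < d < b)


def is_one_endpoint_crossing(edges):
    """Definition 2: all edges crossing any given edge share an endpoint."""
    for e in edges:
        crossing = [set(f) for f in edges if crosses(e, f)]
        if crossing and not set.intersection(*crossing):
            return False
    return True


def reaches_root(heads, c):
    """Follow head pointers from c; True iff vertex 0 is reached without a cycle."""
    seen = set()
    while c != 0:
        if c in seen:
            return False
        seen.add(c)
        c = heads[c]
    return True


def brute_force_best(S):
    """Return (score, heads) of the best 1-Endpoint-Crossing tree rooted at 0.
    heads[c] is the parent of vertex c; heads[0] is None."""
    n = len(S) - 1
    best_score, best_heads = float("-inf"), None
    for parents in product(range(n + 1), repeat=n):
        heads = (None,) + parents
        if not all(reaches_root(heads, c) for c in range(1, n + 1)):
            continue  # not a tree rooted at 0
        edges = [(heads[c], c) for c in range(1, n + 1)]
        if not is_one_endpoint_crossing(edges):
            continue
        score = sum(S[h][c] for h, c in edges)
        if score > best_score:
            best_score, best_heads = score, heads
    return best_score, best_heads
```

A dynamic-programming implementation of the recurrences should return the same maximum score as `brute_force_best` on every small score matrix; random matrices with up to six or seven vertices make a practical regression test.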
\[
\begin{aligned}
\mathrm{R}[i,j,x,b_i,b_j,F] &\leftarrow \text{symmetric to } \mathrm{L}[i,j,x,b_i,b_j,F]\\
\mathrm{R}[i,j,x,b_i,F,T] &\leftarrow \text{symmetric to } \mathrm{L}[i,j,x,F,b_j,T]\\
\mathrm{R}[i,j,x,b_i,T,T] &\leftarrow \text{not reachable}\\
\mathrm{R\_XFromJ}[i,j,x] &\leftarrow \text{symmetric to } \mathrm{L\_XFromI}[i,j,x]\\
\mathrm{R\_JFromX}[i,j,x] &\leftarrow \text{symmetric to } \mathrm{L\_IFromX}[i,j,x]\\
\mathrm{R\_IFromX}[i,j,x] &\leftarrow \text{symmetric to } \mathrm{L\_JFromX}[i,j,x]\\
\mathrm{R\_IFromJ}[i,j,x] &\leftarrow \text{symmetric to } \mathrm{L\_JFromI}[i,j,x]
\end{aligned}
\]

References

E. Bach, C. Brown, and W. Marslen-Wilson. 1986. Crossed and nested dependencies in German and Dutch: A psycholinguistic study. Language and Cognitive Processes, 1(4):249–262.

F. Bernhart and P. C. Kainen. 1979. The book thickness of a graph. Journal of Combinatorial Theory, Series B, 27(3):320–331.

M. Bodirsky, M. Kuhlmann, and M. Möhl. 2005. Well-nested drawings as models of syntactic structure. In Tenth Conference on Formal Grammar and Ninth Meeting on Mathematics of Language, pages 88–1. University Press.

X. Carreras. 2007. Experiments with a higher-order projective dependency parser. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL, volume 7, pages 957–961.

N. Chomsky, Massachusetts Institute of Technology. Dept. of Linguistics, and Philosophy. 1998. Minimalist inquiries: the framework. MIT occasional papers in linguistics. Distributed by MIT Working Papers in Linguistics, MIT, Dept. of Linguistics.

N. Chomsky. 1981. Lectures on Government and Binding. Dordrecht: Foris.

F. Chung, F. Leighton, and A. Rosenberg. 1987. Embedding graphs in books: A layout problem with applications to VLSI design. SIAM Journal on Algebraic Discrete Methods, 8(1):33–58.

H. Cui, R. Sun, K. Li, M. Y. Kan, and T. S. Chua. 2005. Question answering passage retrieval using dependency relations. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 400–407. ACM.

A. Culotta and J. Sorensen. 2004. Dependency tree kernels for relation extraction. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, page 423. Association for Computational Linguistics.

Y. Ding and M. Palmer. 2005. Machine translation using probabilistic synchronous dependency insertion grammars. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 541–548. Association for Computational Linguistics.

J. Eisner. 2000. Bilexical grammars and their cubic-time parsing algorithms. In Harry Bunt and Anton Nijholt, editors, Advances in Probabilistic and Other Parsing Technologies, pages 29–62. Kluwer Academic Publishers, October.

S. Even and A. Itai. 1971. Queues, stacks, and graphs. In Proc. International Symp. on Theory of Machines and Computations, pages 71–86.

C. Gómez-Rodríguez and J. Nivre. 2010. A transition-based parser for 2-planar dependency structures. In Proceedings of ACL, pages 1492–1501.

C. Gómez-Rodríguez, J. Carroll, and D. Weir. 2011. Dependency parsing schemata and mildly non-projective dependency parsing. Computational Linguistics, 37(3):541–586.

T. Koo and M. Collins. 2010. Efficient third-order dependency parsers. In Proceedings of ACL, pages 1–11.

M. Kuhlmann. 2013. Mildly non-projective dependency grammar. Computational Linguistics, 39(2).

R. McDonald and F. Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of EACL, pages 81–88.

R. McDonald and G. Satta. 2007. On the complexity of non-projective data-driven dependency parsing. In Proceedings of the 10th International Conference on Parsing Technologies, pages 121–132.

R. McDonald, F. Pereira, K. Ribarov, and J. Hajič. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 523–530. Association for Computational Linguistics.

E. Pitler, S. Kannan, and M. Marcus. 2012. Dynamic programming for higher order parsing of gap-minding trees. In Proceedings of EMNLP, pages 478–488.

L. A. Ringenberg. 1967. College geometry. Wiley.

A. Rush and S. Petrov. 2012. Vine pruning for efficient multi-pass dependency parsing. In Proceedings of NAACL, pages 498–507.

S. M. Shieber. 1985. Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8(3):333–343.

H. Zhang and R. McDonald. 2012. Generalized higher-order dependency parsing with cube pruning. In Proceedings of EMNLP, pages 320–331.