Transactions of the Association for Computational Linguistics, 1 (2013) 13–24. Action Editor: Giorgio Satta.
Submitted 11/2012; Published 3/2013. © 2013 Association for Computational Linguistics.
Finding Optimal 1-Endpoint-Crossing Trees

Emily Pitler, Sampath Kannan, Mitchell Marcus
Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104
epitler,kannan,mitch@seas.upenn.edu

Abstract

Dependency parsing algorithms capable of producing the types of crossing dependencies seen in natural language sentences have traditionally been orders of magnitude slower than algorithms for projective trees. For 95.8-99.8% of dependency parses in various natural language treebanks, whenever an edge is crossed, the edges that cross it all have a common vertex. The optimal dependency tree that satisfies this 1-Endpoint-Crossing property can be found with an O(n^4) parsing algorithm that recursively combines forests over intervals with one exterior point. 1-Endpoint-Crossing trees also have natural connections to linguistics and another class of graphs that has been studied in NLP.

1 Introduction

Dependency parsing is one of the fundamental problems in natural language processing today, with applications such as machine translation (Ding and Palmer, 2005), information extraction (Culotta and Sorensen, 2004), and question answering (Cui et al., 2005). Most high-accuracy graph-based dependency parsers (Koo and Collins, 2010; Rush and Petrov, 2012; Zhang and McDonald, 2012) find the highest-scoring projective trees (in which no edges cross), despite the fact that a large proportion of natural language sentences are non-projective. Projective trees can be found in O(n^3) time (Eisner, 2000), but cover only 63.6% of sentences in some natural language treebanks (Table 1).

The class of directed spanning trees covers all treebank trees and can be parsed in O(n^2) with edge-based features (McDonald et al., 2005), but it is NP-hard to find the maximum scoring such tree with grandparent or sibling features (McDonald and Pereira, 2006; McDonald and Satta, 2007).

There are various existing definitions of mildly non-projective trees with better empirical coverage than projective trees that do not have the hardness of extensibility that spanning trees do. However, these have had parsing algorithms that are orders of magnitude slower than the projective case or the edge-based spanning tree case. For example, well-nested dependency trees with block degree 2 (Kuhlmann, 2013) cover at least 95.4% of natural language structures, but have a parsing time of O(n^7) (Gómez-Rodríguez et al., 2011).

No previously defined class of trees simultaneously has high coverage and low-degree polynomial algorithms for parsing, allowing grandparent or sibling features.

We propose 1-Endpoint-Crossing trees, in which for any edge that is crossed, all other edges that cross that edge share an endpoint. While simple to state, this property covers 95.8% or more of dependency parses in natural language treebanks (Table 1). The optimal 1-Endpoint-Crossing tree can be found in faster asymptotic time than any previously proposed mildly non-projective dependency parsing algorithm. We show how any 1-Endpoint-Crossing tree can be decomposed into isolated sets of intervals with one exterior point (Section 3). This is the key insight that allows efficient parsing; the O(n^4) parsing algorithm is presented in Section 4.

1-Endpoint-Crossing trees are a subclass of 2-planar graphs (Section 5.1), a class that has been studied
in NLP. 1-Endpoint-Crossing trees also have some linguistic interpretation (pairs of cross serial verbs produce 1-Endpoint-Crossing trees, Section 5.2).

2 Definitions of Non-Projectivity

Definition 1. Edges e and f cross if e and f have distinct endpoints and exactly one of the endpoints of f lies between the endpoints of e.

Definition 2. A dependency tree is 1-Endpoint-Crossing if for any edge e, all edges that cross e share an endpoint p.

Table 1 shows the percentage of dependency parses in the CoNLL-X training sets that are 1-Endpoint-Crossing trees. Across six languages with varying amounts of non-projectivity, 95.8-99.8% of dependency parses in treebanks are 1-Endpoint-Crossing trees.

We next review and compare other relevant definitions of non-projectivity from prior work: well-nested with block degree 2, gap-minding, projective, and 2-planar.

The definitions of block degree and well-nestedness are given below:

Definition 3. For each node u in the tree, a block of the node is “a longest segment consisting of descendants of u.” (Kuhlmann, 2013). The block-degree of u is “the number of distinct blocks of u”. The block degree of a tree is the maximum block degree of any of its nodes. The gap degree is the number of gaps between these blocks, and so by definition is one less than the block degree. (Kuhlmann, 2013)

Definition 4. Two trees “T1 and T2 interleave iff there are nodes l1, r1 ∈ T1 and l2, r2 ∈ T2 such that l1
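Definitions 1 and 2 above can be checked directly on a set of edges. The following Python sketch is only an illustration and not the authors' implementation: the function names `crosses` and `is_one_endpoint_crossing` are hypothetical, edges are assumed to be pairs of integer word positions, and the quadratic scan neither aims at efficiency nor verifies that the edges form a tree.

```python
def crosses(e, f):
    """Definition 1: e and f have four distinct endpoints and exactly
    one endpoint of f lies strictly between the endpoints of e."""
    a, b = sorted(e)
    c, d = sorted(f)
    if len({a, b, c, d}) < 4:
        return False  # a shared endpoint means the edges do not cross
    return (a < c < b) != (a < d < b)


def is_one_endpoint_crossing(edges):
    """Definition 2: for every edge e, all edges crossing e share an endpoint."""
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        if not crossing:
            continue
        # Intersect the endpoint sets of all edges that cross e.
        common = set(crossing[0])
        for f in crossing[1:]:
            common &= set(f)
        if not common:
            return False
    return True


# Both edges crossing (0, 3) end at vertex 4, so the property holds.
ok_edges = [(0, 3), (1, 4), (2, 4)]
# (1, 4) and (2, 5) both cross (0, 3) but are vertex-disjoint: a violation.
bad_edges = [(0, 3), (1, 4), (2, 5)]
```

Under these assumptions, `is_one_endpoint_crossing(ok_edges)` holds and `is_one_endpoint_crossing(bad_edges)` does not; a projective edge set passes trivially because no edge is crossed.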
Figure 8: An example of wh-movement over a potentially unbounded number of clauses. The edges between the heads of each clause cross the edges from trace to trace, but all obey 1-Endpoint-Crossing.

Endpoint-Crossing. Psycholinguistically, between two and three verbs is exactly where there is a large change in the sentence processing abilities of human listeners (based on both grammatical judgments and scores on a comprehension task) (Bach et al., 1986).

More speculatively, there may be a connection between the form of 1-Endpoint-Crossing trees and phases (roughly, propositional units such as clauses) in Minimalism (Chomsky et al., 1998). Figure 8 shows an example of wh-movement over a potentially unbounded number of clauses. The phase-impenetrability condition (PIC) states that only the head of the phase and elements that have moved to its edge are accessible to the rest of the sentence (Chomsky et al., 1998, p. 22). Movement is therefore required to be successive cyclic, with a moved element leaving a chain of traces at the edge of each clause on its way to its final pronounced location (Chomsky, 1981). In Figure 8, notice that the crossing edges form a repeated pattern that obeys the 1-Endpoint-Crossing property. More generally, we suspect that trees satisfying the PIC will tend to also be 1-Endpoint-Crossing. Furthermore, if the traces were not at the edge of each clause, and instead were positioned between a head and one of its arguments, 1-Endpoint-Crossing would be violated. For example, if t2 in Figure 8 were between C and said2, then the edge (t1, t2) would cross (say, said1), (said1, said2), and (C, said2), which do not all share an endpoint. An exploration of these linguistic connections may be an interesting avenue for further research.

6 Conclusions

1-Endpoint-Crossing trees characterize over 95% of structures found in natural language treebanks, and can be parsed in only a factor of n more time than projective trees. The dynamic programming algorithm for projective trees (Eisner, 2000) has been extended to handle higher order factors (McDonald and Pereira, 2006; Carreras, 2007; Koo and Collins, 2010), adding at most a factor of n to the edge-based running time; it would be interesting to extend the algorithm presented here to include higher order factors.

1-Endpoint-Crossing is a condition on edges, while properties such as well-nestedness or block degree are framed in terms of subtrees. Three edges will always suffice as a certificate of a 1-Endpoint-Crossing violation (two vertex-disjoint edges that both cross a third). In contrast, for a property like ill-nestedness, two nodes might have a least common ancestor arbitrarily far away, and so one might need the entire graph to verify whether the subtrees rooted at those nodes are disjoint and ill-nested. We have discussed cross-serial dependencies; a further exploration of which linguistic phenomena would and would not have 1-Endpoint-Crossing dependency trees may be revealing.

Acknowledgments

We would like to thank Julie Legate for an interesting discussion. This material is based upon work supported under a National Science Foundation Graduate Research Fellowship, NSF Award CCF 1137084, and Army Research Office MURI grant W911NF-07-1-0216.

A Dynamic Program to find the maximum scoring 1-Endpoint-Crossing Tree

Input: Matrix S: S[i,j] is the score of the directed edge (i,j)
Output: Maximum score of a 1-Endpoint-Crossing tree over vertices [0,n], rooted at 0
Init: ∀i: Int[i,i,F,F] = Int[i,i+1,F,F] = 0
      Int[i,i,T,F] = Int[i,i,F,T] = Int[i,i,T,T] = −∞
Final: Int[0,n,F,T]
Shorthand for booleans: TF(x,S) :=
      if x = T, exactly one of the set S is true
      if x = F, all of the set S must be false

bi, bj, bx are true iff the corresponding boundary point has its incoming edge (parent) in that sub-problem. For the LR sub-problem, bi and bj are always false, and so omitted.

For all sub-problems with the suffix AFromB, the boundary point A has its parent edge in the sub-problem solution; the other two boundary points do not. For example, L_XFromI would correspond to having booleans bi = bj = F and bx = T, with the restriction that x must be a descendant of i.
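The TF shorthand restricts which sub-problems may supply a boundary point's parent edge. A minimal Python sketch of it follows, assuming the booleans are represented as Python `bool` values; the helper names `tf_holds` and `tf_assignments` are hypothetical and do not come from the paper.

```python
from itertools import product


def tf_holds(x, values):
    """TF(x, S): if x is True, exactly one element of S is True;
    if x is False, every element of S is False."""
    true_count = sum(1 for v in values if v)
    return true_count == 1 if x else true_count == 0


def tf_assignments(x, size):
    """List every boolean tuple of the given length that satisfies TF(x, .).
    These are the assignments a 'max over TF(...)' would range over."""
    return [combo for combo in product((False, True), repeat=size)
            if tf_holds(x, combo)]
```

For a constraint such as TF(T, {bl, bm, br}), `tf_assignments(True, 3)` yields the three assignments in which exactly one of the sub-problems contains the parent edge, so a point never receives two parents.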
Int[i,j,F,bj] ← max of:
    Int[i+1,j,T,F]    if bj = F
    S[i,j] + Int[i,j,F,F]    if bj = T
    max over k ∈ (i,j) of S[i,k] + max of:
        Int[i,k,F,F] + Int[k,j,F,bj]
        max over TF(bj,{bl,br}) of LR[i,k,j,bl] + Int[k,j,F,br]
        max over l ∈ (k,j), TF(T,{bl,bm,br}) of:
            R[i,k,l,F,F,bl] + Int[k,l,F,bm] + L[l,j,k,br,bj,F]
            LR[i,k,l,bl] + Int[k,l,F,bm] + Int[l,j,br,bj]
        max over l ∈ (i,k), TF(T,{bl,bm,br}) of:
            Int[i,l,F,bl] + L[l,k,i,bm,F,F] + N[k,j,l,F,bj,br]
            R[i,l,k,F,bl,F] + Int[l,k,bm,F] + L[k,j,l,F,bj,br]

Int[i,j,T,F] ← symmetric to Int[i,j,F,T]

Int[i,j,T,T] ← −∞

LR[i,j,x,bx] ← max of:
    L[i,j,x,F,F,bx]
    R[i,j,x,F,F,bx]
    max over k ∈ (i,j), TF(bx,{bxl,bxr}), TF(T,{bkl,bkr}) of L[i,k,x,F,bkl,bxl] + R[k,j,x,bkr,F,bxr]

N[i,j,x,bi,bj,F] ← max of:
    Int[i,j,bi,bj]
    S[x,i] + N[i,j,x,F,bj,F]    if bi = T
    S[x,j] + N[i,j,x,bi,F,F]    if bj = T
    max over k ∈ (i,j) of S[x,k] + N[i,k,x,bi,F,F] + Int[k,j,F,bj]

N[i,j,x,F,bj,T] ← max of:
    S[i,x] + N[i,j,x,F,bj,F]
    S[x,j] + N_XFromI[i,j,x]    if bj = T
    S[j,x] + N[i,j,x,F,F,F]    if bj = F
    S[j,x] + Int[i,j,F,T]    if bj = T
    max over k ∈ (i,j) of S[x,k] + N_XFromI[i,k,x] + Int[k,j,F,bj]
    max over k ∈ (i,j) of S[k,x] + max of:
        Int[i,k,F,T] + Int[k,j,F,bj]
        N[i,k,x,F,F,F] + Int[k,j,T,bj]

N[i,j,x,T,F,T] ← symmetric to N[i,j,x,F,T,T]

N[i,j,x,T,T,T] ← −∞

N_XFromI[i,j,x] ← max of:
    S[i,x] + N[i,j,x,F,F,F]
    max over k ∈ (i,j) of:
        S[x,k] + N_XFromI[i,k,x] + Int[k,j,F,F]
        S[k,x] + Int[i,k,F,T] + Int[k,j,F,F]

N_IFromX[i,j,x] ← max of:
    S[x,i] + N[i,j,x,F,F,F]
    max over k ∈ (i,j) of S[x,k] + N[i,k,x,T,F,F] + Int[k,j,F,F]

N_XFromJ[i,j,x] ← symmetric to N_XFromI[i,j,x]

N_JFromX[i,j,x] ← symmetric to N_IFromX[i,j,x]

L[i,j,x,bi,bj,F] ← max of:
    Int[i,j,bi,bj]
    S[x,i] + L[i,j,x,F,bj,F]    if bi = T
    S[x,j] + L[i,j,x,bi,F,F]    if bj = T
    max over k ∈ (i,j), TF(bi,{bl,br}) of S[x,k] + max of:
        L[i,k,x,bl,F,F] + N[k,j,i,F,bj,br]
        Int[i,k,bl,F] + L[k,j,i,F,bj,br]

L[i,j,x,F,bj,T] ← max of:
    S[i,x] + L[i,j,x,F,bj,F]
    S[x,j] + L_XFromI[i,j,x]    if bj = T
    S[j,x] + L[i,j,x,F,F,F]    if bj = F
    S[j,x] + L_JFromI[i,j,x]    if bj = T
    max over k ∈ (i,j) of S[x,k] + L_XFromI[i,k,x] + N[k,j,i,F,bj,F]
    max over k ∈ (i,j) of S[k,x] + max of:
        L_JFromI[i,k,x] + N[k,j,i,F,bj,F]
        L[i,k,x,F,F,F] + N[k,j,i,T,bj,F]
        max over TF(T,{bl,br}) of Int[i,k,F,bl] + L[k,j,i,br,bj,F]

L[i,j,x,T,bj,T] ← not reachable

L_XFromI[i,j,x] ← max of:
    S[i,x] + L[i,j,x,F,F,F]
    max over k ∈ (i,j) of S[x,k] + L_XFromI[i,k,x] + N[k,j,i,F,F,F]
    max over k ∈ (i,j) of S[k,x] + max of:
        L_JFromI[i,k,x] + N[k,j,i,F,F,F]
        L[i,k,x,F,F,F] + N_IFromX[k,j,i]
        Int[i,k,F,T] + L[k,j,i,F,F,F]
        Int[i,k,F,F] + L_IFromX[k,j,i]

L_IFromX[i,j,x] ← max of:
    S[x,i] + L[i,j,x,F,F,F]
    max over k ∈ (i,j) of S[x,k] + max of:
        L[i,k,x,T,F,F] + N[k,j,i,F,F,F]
        L[i,k,x,F,F,F] + N_XFromI[k,j,i]
        Int[i,k,T,F] + L[k,j,i,F,F,F]
        Int[i,k,F,F] + L_XFromI[k,j,i]

L_JFromX[i,j,x] ← max of:
    S[x,j] + L[i,j,x,F,F,F]
    max over k ∈ (i,j) of S[x,k] + max of:
        L[i,k,x,F,F,F] + Int[k,j,F,T]
        Int[i,k,F,F] + L_JFromI[k,j,i]

L_JFromI[i,j,x] ← max of:
    Int[i,j,F,T]
    max over k ∈ (i,j) of S[x,k] + max of:
        L[i,k,x,F,F,F] + N_JFromX[k,j,i]
        Int[i,k,F,F] + L_JFromX[k,j,i]
R[i,j,x,bi,bj,F] ← symmetric to L[i,j,x,bi,bj,F]

R[i,j,x,bi,F,T] ← symmetric to L[i,j,x,F,bj,T]

R[i,j,x,bi,T,T] ← not reachable

R_XFromJ[i,j,x] ← symmetric to L_XFromI[i,j,x]

R_JFromX[i,j,x] ← symmetric to L_IFromX[i,j,x]

R_IFromX[i,j,x] ← symmetric to L_JFromX[i,j,x]

R_IFromJ[i,j,x] ← symmetric to L_JFromI[i,j,x]

References

E. Bach, C. Brown, and W. Marslen-Wilson. 1986. Crossed and nested dependencies in German and Dutch: A psycholinguistic study. Language and Cognitive Processes, 1(4):249–262.

F. Bernhart and P. C. Kainen. 1979. The book thickness of a graph. Journal of Combinatorial Theory, Series B, 27(3):320–331.

M. Bodirsky, M. Kuhlmann, and M. Möhl. 2005. Well-nested drawings as models of syntactic structure. In Tenth Conference on Formal Grammar and Ninth Meeting on Mathematics of Language, pages 88–1. University Press.

X. Carreras. 2007. Experiments with a higher-order projective dependency parser. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL, volume 7, pages 957–961.

N. Chomsky, Massachusetts Institute of Technology. Dept. of Linguistics, and Philosophy. 1998. Minimalist inquiries: the framework. MIT occasional papers in linguistics. Distributed by MIT Working Papers in Linguistics, MIT, Dept. of Linguistics.

N. Chomsky. 1981. Lectures on Government and Binding. Dordrecht: Foris.

F. Chung, F. Leighton, and A. Rosenberg. 1987. Embedding graphs in books: A layout problem with applications to VLSI design. SIAM Journal on Algebraic Discrete Methods, 8(1):33–58.

H. Cui, R. Sun, K. Li, M. Y. Kan, and T. S. Chua. 2005. Question answering passage retrieval using dependency relations. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 400–407. ACM.

A. Culotta and J. Sorensen. 2004. Dependency tree kernels for relation extraction. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, page 423. Association for Computational Linguistics.

Y. Ding and M. Palmer. 2005. Machine translation using probabilistic synchronous dependency insertion grammars. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 541–548. Association for Computational Linguistics.

J. Eisner. 2000. Bilexical grammars and their cubic-time parsing algorithms. In Harry Bunt and Anton Nijholt, editors, Advances in Probabilistic and Other Parsing Technologies, pages 29–62. Kluwer Academic Publishers, October.

S. Even and A. Itai. 1971. Queues, stacks, and graphs. In Proc. International Symp. on Theory of Machines and Computations, pages 71–86.

C. Gómez-Rodríguez and J. Nivre. 2010. A transition-based parser for 2-planar dependency structures. In Proceedings of ACL, pages 1492–1501.

C. Gómez-Rodríguez, J. Carroll, and D. Weir. 2011. Dependency parsing schemata and mildly non-projective dependency parsing. Computational Linguistics, 37(3):541–586.

T. Koo and M. Collins. 2010. Efficient third-order dependency parsers. In Proceedings of ACL, pages 1–11.

M. Kuhlmann. 2013. Mildly non-projective dependency grammar. Computational Linguistics, 39(2).

R. McDonald and F. Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of EACL, pages 81–88.

R. McDonald and G. Satta. 2007. On the complexity of non-projective data-driven dependency parsing. In Proceedings of the 10th International Conference on Parsing Technologies, pages 121–132.

R. McDonald, F. Pereira, K. Ribarov, and J. Hajič. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 523–530. Association for Computational Linguistics.

E. Pitler, S. Kannan, and M. Marcus. 2012. Dynamic programming for higher order parsing of gap-minding trees. In Proceedings of EMNLP, pages 478–488.

L. A. Ringenberg. 1967. College geometry. Wiley.

A. Rush and S. Petrov. 2012. Vine pruning for efficient multi-pass dependency parsing. In Proceedings of NAACL, pages 498–507.

S. M. Shieber. 1985. Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8(3):333–343.

H. Zhang and R. McDonald. 2012. Generalized higher-order dependency parsing with cube pruning. In Proceedings of EMNLP, pages 320–331.