Transactions of the Association for Computational Linguistics, 1 (2013) 13–24. Action Editor: Giorgio Satta.
Submitted 11/2012; Published 3/2013. © 2013 Association for Computational Linguistics.
Finding Optimal 1-Endpoint-Crossing Trees

Emily Pitler, Sampath Kannan, Mitchell Marcus
Computer and Information Science
University of Pennsylvania
Philadelphia, PA 19104
epitler,kannan,mitch@seas.upenn.edu

Abstract

Dependency parsing algorithms capable of producing the types of crossing dependencies seen in natural language sentences have traditionally been orders of magnitude slower than algorithms for projective trees. For 95.8-99.8% of dependency parses in various natural language treebanks, whenever an edge is crossed, the edges that cross it all have a common vertex. The optimal dependency tree that satisfies this 1-Endpoint-Crossing property can be found with an O(n^4) parsing algorithm that recursively combines forests over intervals with one exterior point. 1-Endpoint-Crossing trees also have natural connections to linguistics and another class of graphs that has been studied in NLP.

1 Introduction

Dependency parsing is one of the fundamental problems in natural language processing today, with applications such as machine translation (Ding and Palmer, 2005), information extraction (Culotta and Sorensen, 2004), and question answering (Cui et al., 2005). Most high-accuracy graph-based dependency parsers (Koo and Collins, 2010; Rush and Petrov, 2012; Zhang and McDonald, 2012) find the highest-scoring projective trees (in which no edges cross), despite the fact that a large proportion of natural language sentences are non-projective. Projective trees can be found in O(n^3) time (Eisner, 2000), but cover only 63.6% of sentences in some natural language treebanks (Table 1).

The class of directed spanning trees covers all treebank trees and can be parsed in O(n^2) with edge-based features (McDonald et al., 2005), but it is NP-hard to find the maximum scoring such tree with grandparent or sibling features (McDonald and Pereira, 2006; McDonald and Satta, 2007).

There are various existing definitions of mildly non-projective trees with better empirical coverage than projective trees that do not have the hardness of extensibility that spanning trees do. However, these have had parsing algorithms that are orders of magnitude slower than the projective case or the edge-based spanning tree case. For example, well-nested dependency trees with block degree 2 (Kuhlmann, 2013) cover at least 95.4% of natural language structures, but have a parsing time of O(n^7) (Gómez-Rodríguez et al., 2011).

No previously defined class of trees simultaneously has high coverage and low-degree polynomial algorithms for parsing, allowing grandparent or sibling features.

We propose 1-Endpoint-Crossing trees, in which for any edge that is crossed, all other edges that cross that edge share an endpoint. While simple to state, this property covers 95.8% or more of dependency parses in natural language treebanks (Table 1). The optimal 1-Endpoint-Crossing tree can be found in faster asymptotic time than any previously proposed mildly non-projective dependency parsing algorithm. We show how any 1-Endpoint-Crossing tree can be decomposed into isolated sets of intervals with one exterior point (Section 3). This is the key insight that allows efficient parsing; the O(n^4) parsing algorithm is presented in Section 4.

1-Endpoint-Crossing trees are a subclass of 2-planar graphs (Section 5.1), a class that has been studied
in NLP. 1-Endpoint-Crossing trees also have some linguistic interpretation (pairs of cross serial verbs produce 1-Endpoint-Crossing trees, Section 5.2).

2 Definitions of Non-Projectivity

Definition 1. Edges e and f cross if e and f have distinct endpoints and exactly one of the endpoints of f lies between the endpoints of e.

Definition 2. A dependency tree is 1-Endpoint-Crossing if for any edge e, all edges that cross e share an endpoint p.

Table 1 shows the percentage of dependency parses in the CoNLL-X training sets that are 1-Endpoint-Crossing trees. Across six languages with varying amounts of non-projectivity, 95.8-99.8% of dependency parses in treebanks are 1-Endpoint-Crossing trees.

We next review and compare other relevant definitions of non-projectivity from prior work: well-nested with block degree 2, gap-minding, projective, and 2-planar.

The definitions of block degree and well-nestedness are given below:

Definition 3. For each node u in the tree, a block of the node is "a longest segment consisting of descendants of u." (Kuhlmann, 2013). The block-degree of u is "the number of distinct blocks of u". The block degree of a tree is the maximum block degree of any of its nodes. The gap degree is the number of gaps between these blocks, and so by definition is one less than the block degree. (Kuhlmann, 2013)

Definition 4. Two trees "T1 and T2 interleave iff there are nodes l1, r1 ∈ T1 and l2, r2 ∈ T2 such that l1
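Definitions 1 and 2 can be checked directly on a set of edges. The following Python sketch is only an illustration, not the parsing algorithm of Section 4: the function names, the representation of an edge as an unordered pair of word positions, and the two example edge sets are assumptions made here.

```python
from itertools import combinations


def crosses(e, f):
    """Definition 1: e and f cross if their four endpoints are distinct and
    exactly one endpoint of f lies between the endpoints of e."""
    lo, hi = sorted(e)
    if len({*e, *f}) < 4:  # a shared endpoint rules out a crossing
        return False
    return sum(lo < v < hi for v in f) == 1


def is_one_endpoint_crossing(edges):
    """Definition 2: for every edge e, the edges crossing e share an endpoint."""
    for e in edges:
        crossing = [set(f) for f in edges if crosses(e, f)]
        if crossing and not set.intersection(*crossing):
            return False
    return True


def find_violation(edges):
    """Return (e, f, g) where f and g are vertex-disjoint and both cross e,
    or None if no such triple exists."""
    for e in edges:
        crossing = [f for f in edges if crosses(e, f)]
        for f, g in combinations(crossing, 2):
            if not set(f) & set(g):
                return e, f, g
    return None


# Hypothetical examples (word positions 0..5), not taken from a treebank.
tree_ok = [(0, 2), (1, 3), (1, 4), (2, 3)]           # edges crossing (0, 2) share vertex 1
tree_bad = [(0, 3), (1, 4), (2, 5), (0, 1), (1, 2)]  # (1, 4) and (2, 5) both cross (0, 3)
```

The check is quadratic in the number of edges per edge examined, which is adequate for inspecting individual trees; it makes no attempt at the efficiency of the dynamic program.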
Figure 8: An example of wh-movement over a potentially unbounded number of clauses. The edges between the heads of each clause cross the edges from trace to trace, but all obey 1-Endpoint-Crossing.

Endpoint-Crossing. Psycholinguistically, between two and three verbs is exactly where there is a large change in the sentence processing abilities of human listeners (based on both grammatical judgments and scores on a comprehension task) (Bach et al., 1986).

More speculatively, there may be a connection between the form of 1-Endpoint-Crossing trees and phases (roughly, propositional units such as clauses) in Minimalism (Chomsky et al., 1998). Figure 8 shows an example of wh-movement over a potentially unbounded number of clauses. The phase-impenetrability condition (PIC) states that only the head of the phase and elements that have moved to its edge are accessible to the rest of the sentence (Chomsky et al., 1998, p. 22). Movement is therefore required to be successive cyclic, with a moved element leaving a chain of traces at the edge of each clause on its way to its final pronounced location (Chomsky, 1981). In Figure 8, notice that the crossing edges form a repeated pattern that obeys the 1-Endpoint-Crossing property. More generally, we suspect that trees satisfying the PIC will tend to also be 1-Endpoint-Crossing. Furthermore, if the traces were not at the edge of each clause, and instead were positioned between a head and one of its arguments, 1-Endpoint-Crossing would be violated. For example, if t2 in Figure 8 were between C and said2, then the edge (t1, t2) would cross (Say, said1), (said1, said2), and (C, said2), which do not all share an endpoint. An exploration of these linguistic connections may be an interesting avenue for further research.

6 Conclusions

1-Endpoint-Crossing trees characterize over 95% of structures found in natural language treebanks, and can be parsed in only a factor of n more time than projective trees. The dynamic programming algorithm for projective trees (Eisner, 2000) has been extended to handle higher order factors (McDonald and Pereira, 2006; Carreras, 2007; Koo and Collins, 2010), adding at most a factor of n to the edge-based running time; it would be interesting to extend the algorithm presented here to include higher order factors.

1-Endpoint-Crossing is a condition on edges, while properties such as well-nestedness or block degree are framed in terms of subtrees. Three edges will always suffice as a certificate of a 1-Endpoint-Crossing violation (two vertex-disjoint edges that both cross a third). In contrast, for a property like ill-nestedness, two nodes might have a least common ancestor arbitrarily far away, and so one might need the entire graph to verify whether the sub-trees rooted at those nodes are disjoint and ill-nested. We have discussed cross-serial dependencies; a further exploration of which linguistic phenomena would and would not have 1-Endpoint-Crossing dependency trees may be revealing.

Acknowledgments

We would like to thank Julie Legate for an interesting discussion. This material is based upon work supported under a National Science Foundation Graduate Research Fellowship, NSF Award CCF 1137084, and Army Research Office MURI grant W911NF-07-1-0216.

A Dynamic Program to find the maximum scoring 1-Endpoint-Crossing Tree

Input: Matrix S: S[i, j] is the score of the directed edge (i, j)
Output: Maximum score of a 1-Endpoint-Crossing tree over vertices [0, n], rooted at 0
Init: ∀i: Int[i, i, F, F] = Int[i, i+1, F, F] = 0
      Int[i, i, T, F] = Int[i, i, F, T] = Int[i, i, T, T] = −∞
Final: Int[0, n, F, T]

Shorthand for booleans:
TF(x, S) := if x = T, exactly one of the set S is true
            if x = F, all of the set S must be false

bi, bj, bx are true iff the corresponding boundary point has its incoming edge (parent) in that sub-problem. For the LR sub-problem, bi and bj are always false, and so omitted.

For all sub-problems with the suffix AFromB, the boundary point A has its parent edge in the sub-problem solution; the other two boundary points do not. For example, L_XFromI would correspond to having booleans bi = bj = F and bx = T, with the restriction that x must be a descendant of i.
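The TF shorthand and the Init line above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the names tf, tf_assignments and init_int, the dictionary keyed by (i, j, bi, bj), and the guard that skips Int[i, i+1] when i+1 would exceed n are choices made here and are not part of the appendix.

```python
from itertools import product

NEG_INF = float("-inf")


def tf(x, flags):
    """TF(x, S): if x is true, exactly one member of S is true;
    if x is false, every member of S must be false."""
    flags = list(flags)
    return sum(flags) == 1 if x else not any(flags)


def tf_assignments(x, k):
    """All k-tuples of booleans that satisfy TF(x, .); a maximization
    constrained by TF(x, {b1, ..., bk}) ranges over exactly these tuples."""
    return [bits for bits in product((False, True), repeat=k) if tf(x, bits)]


def init_int(n):
    """Base cases of the Int table over vertices 0..n (assumed dict layout)."""
    table = {}
    for i in range(n + 1):
        table[i, i, False, False] = 0.0
        if i + 1 <= n:  # assumption: skip the out-of-range interval [n, n+1]
            table[i, i + 1, False, False] = 0.0
        for bi, bj in ((True, False), (False, True), (True, True)):
            table[i, i, bi, bj] = NEG_INF
    return table
```

For instance, a maximization subject to TF(T, {bl, bm, br}) considers three assignments, one for each flag being the single true one, while TF(F, {bl, br}) leaves only the all-false assignment.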
In the recurrences below, each line indented under a max is one alternative. Where a line ends in "+", that term is added to the maximum of the alternatives indented beneath it.

Int[i, j, F, bj] ← max
    Int[i+1, j, T, F]    if bj = F
    S[i, j] + Int[i, j, F, F]    if bj = T
    max_{k ∈ (i, j)} S[i, k] +
        Int[i, k, F, F] + Int[k, j, F, bj]
        max_{TF(bj, {bl, br})} LR[i, k, j, bl] + Int[k, j, F, br]
        max_{l ∈ (k, j), TF(T, {bl, bm, br})}
            R[i, k, l, F, F, bl] + Int[k, l, F, bm] + L[l, j, k, br, bj, F]
            LR[i, k, l, bl] + Int[k, l, F, bm] + Int[l, j, br, bj]
        max_{l ∈ (i, k), TF(T, {bl, bm, br})}
            Int[i, l, F, bl] + L[l, k, i, bm, F, F] + N[k, j, l, F, bj, br]
            R[i, l, k, F, bl, F] + Int[l, k, bm, F] + L[k, j, l, F, bj, br]

Int[i, j, T, F] ← symmetric to Int[i, j, F, T]

Int[i, j, T, T] ← −∞

LR[i, j, x, bx] ← max
    L[i, j, x, F, F, bx]
    R[i, j, x, F, F, bx]
    max_{k ∈ (i, j), TF(bx, {bxl, bxr}), TF(T, {bkl, bkr})} L[i, k, x, F, bkl, bxl] + R[k, j, x, bkr, F, bxr]

N[i, j, x, bi, bj, F] ← max
    Int[i, j, bi, bj]
    S[x, i] + N[i, j, x, F, bj, F]    if bi = T
    S[x, j] + N[i, j, x, bi, F, F]    if bj = T
    max_{k ∈ (i, j)} S[x, k] + N[i, k, x, bi, F, F] + Int[k, j, F, bj]

N[i, j, x, F, bj, T] ← max
    S[i, x] + N[i, j, x, F, bj, F]
    S[x, j] + N_XFromI[i, j, x]    if bj = T
    S[j, x] + N[i, j, x, F, F, F]    if bj = F
    S[j, x] + Int[i, j, F, T]    if bj = T
    max_{k ∈ (i, j)} S[x, k] + N_XFromI[i, k, x] + Int[k, j, F, bj]
    max_{k ∈ (i, j)} S[k, x] +
        Int[i, k, F, T] + Int[k, j, F, bj]
        N[i, k, x, F, F, F] + Int[k, j, T, bj]

N[i, j, x, T, F, T] ← symmetric to N[i, j, x, F, T, T]

N[i, j, x, T, T, T] ← −∞

N_XFromI[i, j, x] ← max
    S[i, x] + N[i, j, x, F, F, F]
    max_{k ∈ (i, j)}
        S[x, k] + N_XFromI[i, k, x] + Int[k, j, F, F]
        S[k, x] + Int[i, k, F, T] + Int[k, j, F, F]

N_IFromX[i, j, x] ← max
    S[x, i] + N[i, j, x, F, F, F]
    max_{k ∈ (i, j)} S[x, k] + N[i, k, x, T, F, F] + Int[k, j, F, F]

N_XFromJ[i, j, x] ← symmetric to N_XFromI[i, j, x]

N_JFromX[i, j, x] ← symmetric to N_IFromX[i, j, x]

L[i, j, x, bi, bj, F] ← max
    Int[i, j, bi, bj]
    S[x, i] + L[i, j, x, F, bj, F]    if bi = T
    S[x, j] + L[i, j, x, bi, F, F]    if bj = T
    max_{k ∈ (i, j), TF(bi, {bl, br})} S[x, k] +
        L[i, k, x, bl, F, F] + N[k, j, i, F, bj, br]
        Int[i, k, bl, F] + L[k, j, i, F, bj, br]

L[i, j, x, F, bj, T] ← max
    S[i, x] + L[i, j, x, F, bj, F]
    S[x, j] + L_XFromI[i, j, x]    if bj = T
    S[j, x] + L[i, j, x, F, F, F]    if bj = F
    S[j, x] + L_JFromI[i, j, x]    if bj = T
    max_{k ∈ (i, j)} S[x, k] + L_XFromI[i, k, x] + N[k, j, i, F, bj, F]
    max_{k ∈ (i, j)} S[k, x] +
        L_JFromI[i, k, x] + N[k, j, i, F, bj, F]
        L[i, k, x, F, F, F] + N[k, j, i, T, bj, F]
        max_{TF(T, {bl, br})} Int[i, k, F, bl] + L[k, j, i, br, bj, F]

L[i, j, x, T, bj, T] ← not reachable

L_XFromI[i, j, x] ← max
    S[i, x] + L[i, j, x, F, F, F]
    max_{k ∈ (i, j)} S[x, k] + L_XFromI[i, k, x] + N[k, j, i, F, F, F]
    max_{k ∈ (i, j)} S[k, x] +
        L_JFromI[i, k, x] + N[k, j, i, F, F, F]
        L[i, k, x, F, F, F] + N_IFromX[k, j, i]
        Int[i, k, F, T] + L[k, j, i, F, F, F]
        Int[i, k, F, F] + L_IFromX[k, j, i]

L_IFromX[i, j, x] ← max
    S[x, i] + L[i, j, x, F, F, F]
    max_{k ∈ (i, j)} S[x, k] +
        L[i, k, x, T, F, F] + N[k, j, i, F, F, F]
        L[i, k, x, F, F, F] + N_XFromI[k, j, i]
        Int[i, k, T, F] + L[k, j, i, F, F, F]
        Int[i, k, F, F] + L_XFromI[k, j, i]

L_JFromX[i, j, x] ← max
    S[x, j] + L[i, j, x, F, F, F]
    max_{k ∈ (i, j)} S[x, k] +
        L[i, k, x, F, F, F] + Int[k, j, F, T]
        Int[i, k, F, F] + L_JFromI[k, j, i]

L_JFromI[i, j, x] ← max
    Int[i, j, F, T]
    max_{k ∈ (i, j)} S[x, k] +
        L[i, k, x, F, F, F] + N_JFromX[k, j, i]
        Int[i, k, F, F] + L_JFromX[k, j, i]
R[i, j, x, bi, bj, F] ← symmetric to L[i, j, x, bi, bj, F]

R[i, j, x, bi, F, T] ← symmetric to L[i, j, x, F, bj, T]

R[i, j, x, bi, T, T] ← not reachable

R_XFromJ[i, j, x] ← symmetric to L_XFromI[i, j, x]

R_JFromX[i, j, x] ← symmetric to L_IFromX[i, j, x]

R_IFromX[i, j, x] ← symmetric to L_JFromX[i, j, x]

R_IFromJ[i, j, x] ← symmetric to L_JFromI[i, j, x]

References

E. Bach, C. Brown, and W. Marslen-Wilson. 1986. Crossed and nested dependencies in German and Dutch: A psycholinguistic study. Language and Cognitive Processes, 1(4):249–262.

F. Bernhart and P. C. Kainen. 1979. The book thickness of a graph. Journal of Combinatorial Theory, Series B, 27(3):320–331.

M. Bodirsky, M. Kuhlmann, and M. Möhl. 2005. Well-nested drawings as models of syntactic structure. In Tenth Conference on Formal Grammar and Ninth Meeting on Mathematics of Language, pages 88–1. University Press.

X. Carreras. 2007. Experiments with a higher-order projective dependency parser. In Proceedings of the CoNLL Shared Task Session of EMNLP-CoNLL, volume 7, pages 957–961.

N. Chomsky, Massachusetts Institute of Technology. Dept. of Linguistics, and Philosophy. 1998. Minimalist inquiries: the framework. MIT occasional papers in linguistics. Distributed by MIT Working Papers in Linguistics, MIT, Dept. of Linguistics.

N. Chomsky. 1981. Lectures on Government and Binding. Dordrecht: Foris.

F. Chung, F. Leighton, and A. Rosenberg. 1987. Embedding graphs in books: A layout problem with applications to VLSI design. SIAM Journal on Algebraic Discrete Methods, 8(1):33–58.

H. Cui, R. Sun, K. Li, M. Y. Kan, and T. S. Chua. 2005. Question answering passage retrieval using dependency relations. In Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, pages 400–407. ACM.

A. Culotta and J. Sorensen. 2004. Dependency tree kernels for relation extraction. In Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, page 423. Association for Computational Linguistics.

Y. Ding and M. Palmer. 2005. Machine translation using probabilistic synchronous dependency insertion grammars. In Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, pages 541–548. Association for Computational Linguistics.

J. Eisner. 2000. Bilexical grammars and their cubic-time parsing algorithms. In Harry Bunt and Anton Nijholt, editors, Advances in Probabilistic and Other Parsing Technologies, pages 29–62. Kluwer Academic Publishers, October.

S. Even and A. Itai. 1971. Queues, stacks, and graphs. In Proc. International Symp. on Theory of Machines and Computations, pages 71–86.

C. Gómez-Rodríguez and J. Nivre. 2010. A transition-based parser for 2-planar dependency structures. In Proceedings of ACL, pages 1492–1501.

C. Gómez-Rodríguez, J. Carroll, and D. Weir. 2011. Dependency parsing schemata and mildly non-projective dependency parsing. Computational Linguistics, 37(3):541–586.

T. Koo and M. Collins. 2010. Efficient third-order dependency parsers. In Proceedings of ACL, pages 1–11.

M. Kuhlmann. 2013. Mildly non-projective dependency grammar. Computational Linguistics, 39(2).

R. McDonald and F. Pereira. 2006. Online learning of approximate dependency parsing algorithms. In Proceedings of EACL, pages 81–88.

R. McDonald and G. Satta. 2007. On the complexity of non-projective data-driven dependency parsing. In Proceedings of the 10th International Conference on Parsing Technologies, pages 121–132.

R. McDonald, F. Pereira, K. Ribarov, and J. Hajič. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing, pages 523–530. Association for Computational Linguistics.

E. Pitler, S. Kannan, and M. Marcus. 2012. Dynamic programming for higher order parsing of gap-minding trees. In Proceedings of EMNLP, pages 478–488.

L. A. Ringenberg. 1967. College geometry. Wiley.

A. Rush and S. Petrov. 2012. Vine pruning for efficient multi-pass dependency parsing. In Proceedings of NAACL, pages 498–507.

S. M. Shieber. 1985. Evidence against the context-freeness of natural language. Linguistics and Philosophy, 8(3):333–343.

H. Zhang and R. McDonald. 2012. Generalized higher-order dependency parsing with cube pruning. In Proceedings of EMNLP, pages 320–331.