Transactions of the Association for Computational Linguistics, vol. 3, pp. 559–570, 2015. Action Editor: Joakim Nivre.

Transactions of the Association for Computational Linguistics, vol. 3, pp. 559–570, 2015. Action Editor: Joakim Nivre.
Submission batch: 9/2015;Published 11/2015.
c(cid:13)2015 Association for Computational Linguistics. Distributed under a CC-BY 4.0 Licence.

ParsingtoNoncrossingDependencyGraphsMarcoKuhlmannandPeterJonssonDepartmentofComputerandInformationScienceLinköpingUniversity,Swedenmarco.kuhlmann@liu.seandpeter.jonsson@liu.seAbstractWestudythegeneralizationofmaximumspan-ningtreedependencyparsingtomaximumacyclicsubgraphs.Becausetheunderlyingop-timizationproblemisintractableevenunderanarc-factoredmodel,weconsidertherestric-tiontononcrossingdependencygraphs.Ourmaincontributionisacubic-timeexactinfer-encealgorithmforthisclass.Weextendthisal-gorithmintoapracticalparserandevaluateitsperformanceonfourlinguisticdatasetsusedinsemanticdependencyparsing.Wealsoexploreageneralizationofourparsingframeworktodependencygraphswithpagenumberatmostkandshowthattheresultingoptimizationprob-lemisNP-hardfork(cid:21)2.1IntroductionDependencyparsersprovidelightweightrepresen-tationsforthesyntacticandthesemanticstructureofnaturallanguage.Syntacticdependencyparsing(Kübleretal.,2009)hasbeenanextremelyactiveresearchareaforthelastdecadeorso,resultinginaccurateandefficientparsersforawiderangeoflanguages.Semanticdependencyparsinghasonlyrecentlybeenaddressedintheliterature(Oepenetal.,2014;Oepenetal.,2015;Duetal.,2015a).Syntacticdependencyparsinghasbeenformal-izedasthesearchformaximumspanningtreesinweighteddigraphs(McDonaldetal.,2005b).Forsemanticdependencyparsing,wheretargetrepresen-tationsarenotnecessarilytree-shaped,itisnaturaltogeneralizethisviewtomaximumacyclicsubgraphs,withorwithouttheadditionalrequirementofweakconnectivity(Schluter,2014).Whileamaximumspanningtreeofaweighteddigraphcanbefoundinpolynomialtime(Tarjan,1977),computingamaximumacyclicsubgraphisintractable,andevengoodapproximatesolutionsarehardtofind(Guruswamietal.,2011).Inthispa-perwethereforeaddressmaximumacyclicsubgraphparsingundertherestrictionthatthesubgraphshouldbenoncrossing,whichinformallymeansthatitsarcscanbedrawnonthehalf-planeabovethesentenceinsuchawaythatnotwoarcscross(andwithoutchang-ingtheorderofthewords).Themaincontributionofthispaperisanalgorithmthatfindsamaximumnoncrossingacyclicsubgraphofa(vertex-ordered)weighteddigraphonnverticesintimeO.n3/.Aftergivingsomebackground(Section2)wein-troducethenoncrossingcondition,compareittootherstructuralconstraintsfromtheliterature,andstudyitsempiricalcoverage(Section3).Wethenpresentourparsingalgorithm(Section4).Toturnthisalgorithmintoapracticalparser,wecombineitwithanoff-the-shelffeaturemodelandonlinetrain-ing(Section5);thesourcecodeofoursystemisre-leasedwiththispaper.1Weevaluatetheperformanceofourparseronfourlinguisticdatasets:thoseusedintherecentSemEvaltaskonsemanticdependencyparsing(Oepenetal.,2015),andthedependencygraphsextractedfromCCGbank(HockenmaierandSteedman,2007).Enfin,weexplorethelimitsofourapproachbyshowingthatfindingthemaximumacyclicsubgraphunderanaturalgeneralizationofthenoncrossingcondition,pagenumberatmostk,isNP-hardfork(cid:21)2(Section6).Weconcludethepaperbydiscussingrelatedandfuturework(Section7).1https://github.com/liu-nlp/gamma

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

560

Thegenethuscanpreventaplantfromfertilizingitselfbvarg1arg1arg1arg2bvarg1arg3arg2Figure1:Anoncrossingdependencygraph(Oepen,2014,DM#20209003).UsingtheterminologyofSagaeandTsujii(2008),theverticesthe,thus,aandfromareroots;ofthese,thus,aandfromarecoveredbyarcs.2BackgroundDependencyparsingisthetaskofmappinganaturallanguagesentenceintoaformalrepresentationofitssyntacticorsemanticstructureintheformofadependencygraph.2.1DependencyGraphsAdirectedgraphordigraphisapairGD.V;A/whereVisasetofverticesandA(cid:18)V(cid:2)Visasetofarcs.Weconsideranarc.u;v/tobedirectedfromutovandwriteitasu!v.A(directed)pathfromutovisasequenceofarcsoftheformv0!v1;v1!v2;:::;vm(cid:0)1!vm;m(cid:21)0;wherev0DuandvmDv.Notethatmisal-lowedtobezero;inthiscasethepathiscalledempty.Adigraphisacyclicifthereisnovertexuwithanonemptypathfromutoitself.Adigraphisatreeifthereexistsavertexr,theroot,suchthatforeveryvertexuthereisexactlyonepathfromrtou.Everytreeisacyclic,butnoteveryacyclicgraphisatree.Throughoutthispaperwewritexforthegenericsentencewithnwords.AdependencygraphforxisanacyclicdigraphGD.V;A/whoseverticescorrespondone-to-onetothewordsinx.Thiscor-respondenceimposesatotalorderonthevertices;werepresentthisorderbyequatingVwiththesetofpositionsinx,VDf1;:::;ng.Anexampledepen-dencygraphisshowninFigure1.2Adependencytreeisadependencygraphthatformsatree.Theexamplegraphisnotatree.2.2MaximumSpanningTreeParsingMcDonaldetal.(2005b)castdependencyparsingasthesearchforamaximumspanningtreeofan2Notethatthearcsoftheexamplegrapharelabeled.Tosimplifythepresentationwemostlyignorethisaspectinthispaper;butourimplementationsupportsparsingtolabeledarcs.arc-weighteddigraph.Theystartfromthecompletegraphonnverticeswhereeacharci!jcarriesareal-valuedweightwij,definedasadotproductwijDw(cid:1)˚.x;je!j/where˚isafunctionthatmapsthesentence–arcpairintoafeaturevectorandwisaglobalweightvectorthatislearnedfromtrainingdata.Takingthefeaturerepresentationandtheweightvectortobefixed,parsingunderthismodelamountstofindingaspanningtreeofmaximumtotalweight.2.3MaximumSubgraphParsingForsemanticdependencyparsing,wherethetargetrepresentationsarenotnecessarilytrees(viz.Fig-ure1),wegeneralizethemodelofMcDonaldetal.(2005b)toothertypesofsubgraphs.Ingeneralweareinterestedinthefollowingoptimizationproblem:MaximumSubgraphforGraphClassGGivenanarc-weighteddigraphGD.V;A/,findasubsetA0(cid:18)AwithmaximumtotalweightsuchthattheinducedsubgraphG0D.V;A0/belongstoG.ThecomputationalcomplexityofthisproblemvarieswiththechoiceofG.IfGisthesetofalldependencytrees,thentheproblemcanbesolvedintimeO.jVj2/(Tarjan,1977).IfGistheunrestrictedsetofalldependencygraphs,thentheMaximumSubgraphproblemisequivalenttothefollowing:MaximumAcyclicSubgraphGivenanarc-weighteddigraphGD.V;A/,findasubsetA0(cid:18)AwithmaximumtotalweightsuchthattheinducedsubgraphG0D.V;A0/isacyclic.ThisproblemisknowntobeNP-hard,andalsohardtoapproximate(Guruswamietal.,2011).

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

561

3NoncrossingDependencyGraphsBecauseoftheNP-hardnessofMaximumAcyclicSubgraphwecannotexpecttofindapolyno-mial-timeparsingalgorithmforgeneraldependencygraphs.Inthissectionwethereforeintroducetherestrictedclassofnoncrossingdependencygraphs.3.1TheNoncrossingConditionLetGD.V;A/beadependencygraph.RecallthatGisvertex-orderedbythecorrespondenceofverticestopositionsinx.Twoarcsa1;a22Acrossifeithermin.a1/l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 5 8 1 5 6 6 8 0 6 / / t l a c _ a _ 0 0 1 5 8 p d . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 563 4.1.1ItemsWeconsidersubproblemswhereweconstructamax-imumnoncrossingdependencygraphonagiven(closed)intervalŒi;k(cid:141)(cid:18)Vofvertices,usingonlyarcsinA.Basedonthestructureofthesubgraph,wedistinguishfivedifferenttypesofsubproblems:1.Constructagraphthatcontainsthearci!k.Wesaythatsuchagraphismin–max-covered.2.Constructagraphthatcontainsthearck!i.Wesaythatsuchagraphismax–min-covered.Notethattype1andtype2aremutuallyexclusive,asthearcsi!kandk!itogetherwouldformacycle.Ifagraphiseithermin–max-coveredormax–min-covered,wesaythatitisarc-covered.3.Constructagraphthatisnotarc-coveredbutcon-tainsanonemptydirectedpathfromitok.Wesaythatsuchagraphismin–max-connected.4.Constructagraphthatisnotarc-coveredbutcon-tainsanonemptydirectedpathfromktoi.Wesaythatsuchagraphismax–min-connected.Type3andtype4aremutuallyexclusive,asthetwopathstogetherwouldformacycle.5.Constructagraphthatisnotarc-coveredanddoesnotcontainanonemptydirectedpathbetweeniandk.Wesaythatsuchagraphisbland.Werepresentthesedifferenttypesofsubproblemsbythefollowingitems:ikik!ik ikikWeshallsetuptheweightofaniteminsuchawaythatitcorrespondstothesumofthearcweightsoftheconstructedsubgraph.4.1.2AxiomsForeachvertexiinG,thegraphontheone-vertexintervalŒi;je(cid:141)isablandgraph.Therefore,theitemsoftypewithiDkaresoundaxiomsofthedeductionsystem.Becauseone-vertexdependencygraphshavenoarcs,weassignzeroweighttothese.4.1.3GoalItemsOurobjectiveistoconstructamaximumnoncrossingdependencygraphonthefullvertexset.Therefore,thegoalitemsofthedeductionsystemareallitemsoverthefullintervalŒ1;n(cid:141).4.1.4InferenceRulesTheinferencerulesofthedeductionsystemareshowninFigure2.Foreachrule,welettheweightoftheconsequentbethesumoftheweightsoftheantecedents,plus(forR16–R19)theweightofthearcrequiredbythesidecondition.Thus,ruleR01forexamplestatesthatwheneverwehaveconstructedamin–max-coveredgraph()ontheintervalŒi;j(cid:141)withsomeweightw1andanothermin–max-coveredgraphontheintervalŒj;k(cid:141)withsomeweightw2,itissoundtoconcludethatwecanconstructamin–max-connectedgraph(!)ontheintervalŒi;k(cid:141)withweightw1Cw2.Similarly,ruleR19statesthatwhen-everwehaveconstructedablandgraph()ontheintervalŒi;k(cid:141)withsomeweightw,thenwemayaddthearck!je(ifitexistsinG)andthusconstructamax–min-coveredgraph()whoseweightisthesumofwandtheweightofthenewarc.4.2CorrectnessWehavealreadyarguedinformallythattheaxiomsandtheinferencerulesofthedeductionsystemaresound.Inordertoshowcompleteness,weprovethefollowinglemma:ForeachofthefivetypeslistedinSection4.1.1,wheneverthecorrespondingmaximumnoncrossingdependencygraphontheintervalŒi;k(cid:141)hasweightw,theappropriateitemisderived.Weonlysketchthe(straightforward)proofofthislemma.Wedefinethesizeofagraphasthetotalnumberofitsverticesandarcs.Theproofisbyinductiononthesize.Ifthegraphhassizeone(thatis,consistsofasinglevertex),thenitisblandandhasweightzero;thiscaseiscoveredbytheaxioms.Fortheinductivestep,weconsideragraphHwithsizem>1andassumethatthelemmaholdstrueforallgraphsofsmallersize.WethenshowhowtoderivetherelevantitemforHfromthepreviouslyderiveditemsforsubgraphsofHusingtheinferencerules.4.3ImplementationandRuntimeInaViterbi-stylesearchforoptimalderivationsinthedeductionsystem,weenumerateitemsbysizeandderivetheweightsofitemswithlargersizesfromtheweightsofitemswithsmallersizes.Inspectingthemaximalnumberofvariablesperrule(whichisthree,forR01–R08),weseethatsuchasearchcanbeimplementedtorunintimeO.n3/.

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

564

R01ijjk!ikR02ijjkikR03ijjkikR04ijjk ikR05!ijjk!ikR06!ijjkikR07 ijjkikR08 ijjk ikR09ijjkikR10ijjkikR11ik(cid:0)ikR12ik(cid:0)ikR13!ik(cid:0)ikR14 ik(cid:0)ikR15ik(cid:0)ikR16!ikiki!kR17ikiki!kR18 ikikk!iR19ikikk!iFigure2:Deductionsystemfornoncrossingdependencygraphs(1(cid:20)jel D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 5 8 1 5 6 6 8 0 6 / / t l a c _ a _ 0 0 1 5 8 p d . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 565 4.4UniquenessofDerivationsBesidesbeingsoundandcomplete,thedeductionsystemalsohasthepropertythatitassignsauniquederivationtoeverynoncrossingdependencygraph.Theproof,again,isbyinductiononthesize.Toillustratetheargument,supposethatwewanttocon-structamin–max-connectedgraphHonsomeinter-valŒi;k(cid:141).Theonlyrulesthatcanbeusedtoderivethecorrespondingitem(oftype!)areR01andR05.Inbothrules,thegraphcorrespondingtothesecondantecedentmustbeamin–max-coveredgraph()onsomeintervalŒj;k(cid:141).Thismeansthatthevertexjisuniquelydetermined;itmustbetheleftend-pointofthelongestarcj!kinH.Withthis,thegraphontheremainingintervalŒi;j(cid:141)isuniquelyde-terminedaswell,andbothgraphsaresmallerthanH;hencewemayassumethattheyhaveuniquederiva-tions.Notethatjwouldnotbeuniquelydeterminedifthedeductionsystemincluded(say)aninferencerulethatderivesanitem!fromtwoothersuchitems:Sucharulewouldbesound,butitwouldcreatederivationalambiguity.Animmediateconsequenceoftheuniquenessofderivationsisthatourparsingalgorithmcanbeusedforcountingnoncrossingdependencygraphs,whichisusefulfor,amongotherthings,testingthecorrect-nessofimplementations.Tocount,wechangethesystemsuchthattheweightofanitemgivesthenum-berofitsderivations.5ThemodifiedsystemyieldssequenceA246756intheOn-LineEncyclopediaofIntegerSequences(OEISFoundationInc.,2011).4.5EnforcingWeakConnectivityOurdeductionsystemcanbemodifiedtoparseinto(andcount)variousotherclassesofnoncrossinggraphs.Forexample,wecanadaptoursystemtofindamaximumnoncrossingacyclicsubgraphundertheadditionalrestrictionthatthissubgraphshouldbeweaklyconnected(Schluter,2015).Todosowedistinguishbetweentwotypesofblandsubgraphs,(un)weaklyconnectedgraphsand(b)graphswithex-actlytwoweaklyconnectedcomponents,andadapttheinferencerulesandgoalitems.Thechangecanbeimplementedwithoutaffectingtheasymptoticrun-timeofthealgorithm.5Usingtheparlanceofsemiringparsing(Homme bon,1999),weswitchfromthemax–plussemiringtothecountingsemiring.Notethatwhenwetakethemodifieddeductionsys-temandconsiderundirectededgesinsteadofdirectedarcs,weobtainanalgorithmforfindingmaximumconnectednoncrossinggraphs,whicharecountedbysequenceA007297intheOEIS.ThesegraphsarethetargetrepresentationsofLinkGrammar(SleatorandTemperley,1993).5PracticalParsingWhilethemainfocusofthispaperistheoretical,inthissectionweextendourparsingalgorithmintoapracticalparser.Inthecontextofourgeneralmodel(Section2),thisrequirestwoadditionalcomponents:afeaturerepresentationandatrainingalgorithm.5.1FeaturesWeusethearc-basedfeaturesofTurboParser(Mar-tinsetal.,2009),whichdescendfromseveralotherfeaturemodelsfromtheliteratureonsyntacticde-pendencyparsing(McDonaldetal.,2005a;Carrerasetal.,2006;KooandCollins,2010).Inthesemod-els,thefeaturevectorforanarci!jrepresentsinformationaboutvariouscombinationsoftheexactforms,lemmasandpart-of-speechtagsofthewordsatpositionsiandj;thetagsoftheimmediatelysur-roundingwordsandthewordsbetweeniandj;aswellasthelengthofthearcanditsdirection.Tosupportparsingtolabeleddependencygraphs,weadditionallyconjoinsomeofthesefeatureswiththearclabel.Fordetailswerefertothesourcecode.5.2TrainingTolearnthefeatureweightsintheweightvectorfromdataweuseonlinepassive–aggressivetrainingasdescribedbyCrammeretal.(2006).Foreachgoldinstance.x;G/inthetrainingdatawelettheparsingalgorithmfindthemaximumnoncrossingdependencygraphOGgiventhecurrentweightvectorwandupdatetheweightvectorasw wC(cid:28).˚.x;G/(cid:0)˚.x;OG//where˚.x;G/and˚.x;OG/arethesumvectorsofthearc-specificfeaturevectorsforthedependencygraphsGandOG,andthescalar(cid:28)iscomputedasmin0B@C;w(cid:1).˚.x;OG/(cid:0)˚.x;G//Cq`.G;OG/(cid:13)(cid:13)(cid:13)˚.x;G/(cid:0)˚.x;OG/(cid:13)(cid:13)(cid:13)21Californie: l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 5 8 1 5 6 6 8 0 6 / / t l a c _ a _ 0 0 1 5 8 p d . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 566 Inthisformula,`.G;OG/isauser-definedlossfunc-tionandC>0isaparameterthatcontrolsthetrade-offbetweenoptimizingthecurrentlossandbeingclosetotheoldweightvector.WeuseHammingloss,definedasthenumberof(labeled)arcsthatareexclu-sivetoeitherGorOG.Followingcustompractice,weapplyweightvectoraveraging(FreundandSchapire,1999;Collins,2002).5.3ParsingExperimentsWereportparsingexperimentsonthefourdatasetsdescribedinSection3.3.1anddiscusstheirresults.Weusegold-standardlemmasandpart-of-speechtags,traineachparserfor10epochs,andreportre-sultsforthefinalmodelonthetestdata.Weusethesplitsrecommendedfortherespectivedatasets.FollowingCarreras(2007),priortotrainingwetrans-formeachdependencygraphinthetrainingdatatoaclosestnoncrossingdependencygraph.6Inapre-studyusingtheDMdevelopmentdatawefoundthebestvalueforthetradeoffparameterCtobe0:01.5.3.1ResultsandDiscussionTheexperimentalresultsareshowninTable2.FortheSDPdata,wereportstandardmetricsusedintheSemEvaltask(Oepenetal.,2015):precision,recall,andF1onlabeleddependencies.FortheCCGdata,wereportthesamemetricsforunlabeleddependen-cies;thesetakeintoaccountonlythetwodependentwordsbutnotthelexicalcategorycontainingthede-pendencyrelationortheargumentslot.7Giventhelowcoverageofnoncrossingdepen-dencygraphsonthefourdatasets(recallSec-tion3.3.2)andtheuseofasimplearc-factoredmodelwithitsoff-the-shelffeaturesoriginallydevelopedforsyntacticparsing,itisnotsurprisingthattheparserdoesnotachievestate-of-theartresults.Westillconsiderourresultstobeausefulreferencefortheemergingfieldofsemanticdependencyparsing.OntheSDPdata,theaveragedlabeledF1oftheparseris79.63,whichis5.70pointsbelowthecorrespond-ingscoreforthebest-performingsysteminthetask,Peking(Duetal.,2015b).LabeledF1ishighestonPAS(86.24)andlowestonPSD(70.01).Onthe6ThisisdonebyrunningtheparserwithanoraclemodelthatassignsascoreofC1tocorrectand(cid:0)1toincorrectarcs.7ThehighnumberofdifferentarclabelsintheCCGdataexceedswhatcanbehandledbythecurrentversionofourparser.CCGdata,theparserachievesanunlabeledF1of89.74,4.24pointsbelowthebestreportedresultforparsingwithgold-standardpart-of-speechtags(AuliandLopez,2011).Onallfourdatasets,precisionexceedsrecallbyasignificantmargin,morethanonemightexpectgiventherelativelyhighupperboundsonarc-basedrecallthatweobservedinSection3.3.2.Thespeedoftheparserisabout0.01secondspersentenceor110sentencespersecondoneachofthefourtestsets.Trainingfor10epochstakesaround40minutespertrainingset.Experimentswereper-formedonaniMac3.4GHzIntelCorei5CPUwith4coresandJava1.8.0.6IntractabilityforHigherPagenumbersTheparsingalgorithmpresentedinSection4hasmanyattractivetheoreticalproperties,butitspracti-calusefulnessislimitedbytherelativelylowcov-erageofnoncrossingdependencygraphsonthelin-guisticdata.Itwouldbedesirabletogeneralizethealgorithmtomoreexpressiveclassesofgraphs.Anaturalcandidateistheclassofdependencygraphswithpagenumberatmostk,whosecoverageisexcel-lentalreadyforkD2,aswesawinSection3.3.2.However,weshallnowprovethefollowing:Theorem2Fordependencygraphswithpagenum-beratmostk,MaximumSubgraphisNP-hardwheneverk(cid:21)2.FortheproofofthistheoremwedonotconsiderthemaximizationproblemdefinedinSection2.3butthefollowingdecisionversion:MaximumSubgraphforGraphClassG,DecisionVersionGivenadigraphGD.V;A/andanintegerm(cid:21)0,isthereasubsetA0(cid:18)AwithjA0j(cid:21)msuchthattheinducedsubgraphG0D.V;A0/belongstoG?Notethatanypolynomial-timealgorithmforthemaximizationproblemcanbeturnedintoapolyno-mial-timealgorithmforthedecisionproblem:Givenaninstanceforthedecisionproblem,assigneacharcweight1andsolvethemaximizationproblem;alors,testwhetherthesolutioncontainsatleastmarcs.ToshowtheNP-hardnessofthemaximizationproblemitthereforesufficestoshowitforthedecisionversionoftheproblem.

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

567

c1c2c3c4c5c5c1c2c4c1c5c3c2c3c4c1c2c4c1c5c3c2c3c4c5Figure3:Acirclegraph(gauche),acorrespondingchorddrawing(middle),andadependencygraphasconstructedbythealgorithmspecifiedinSection6.2(droite).TheexampleisadaptedfromUnger(1992).6.1CircleGraphsToshowthat(thedecisionversionof)MaximumSub-graphfordependencygraphswithpagenumberatmostkisNP-hardwepresentapolynomialreductionfromadecisionproblemoncirclegraphs.Acirclegraphisanundirectedgraphthatrepre-sentstheintersectiongraphofasetofchordsofacircle.Thatis,foreverycirclegraphGwecandrawacircleandasetofchordsofthatcirclesuchthatthechordsareinone-to-onecorrespondencewiththeverticesofGandtwochordscrosseachotherifandonlyifthecorrespondingverticesareadjacentinG.AnexampleofacirclegraphandacorrespondingchorddrawingisgiveninFigure3.Notethatoneandthesamecirclegraphmayhavemanydifferentchorddrawings.Here,withoutlossofgenerality,werestrictourattentiontodrawingsinwhichnotwochordshaveacommonendpoint.Inthesedrawings,thereareexactlytwiceasmanypointsonthecircleasthereareverticesinthecirclegraphG.6.2ReductionTherelevantdecisionproblemoncirclegraphsisgivenbelow.ForagraphGD.V;E/andasubsetV0(cid:18)V,weletGjV0denotethesubgraphofGinducedbyV0,thatis,GjV0hasvertexsetV0andcontainseachedgeinEbetweenverticesinV0.k-ColorableInducedSubgraphforCircleGraphs(k-CIG)GivenacirclegraphGD.V;E/andanintegerm(cid:21)0,isthereasubsetV0(cid:18)VwithjV0j(cid:21)msuchthattheinducedgraphGjV0isk-colorable?CongandLiu(1991)showthatthisproblemisNP-completeifk(cid:21)2.ForkD2andk(cid:21)4theirproofisbasedonearlierresultsbySarrafzadehandLee(1989)andUnger(1988),respectively.ThefollowingproceduretransformsanarbitrarycirclegraphGintoadependencygraphH.Foranillustration,seeFigure3.81.ConstructachorddrawingforG.Recallthatforthisdrawingthereisaone-to-onecorrespondencebetweenthechordsandtheverticesofG.2.Cutthecircleofthechorddrawingandstraightenitoutintoaline.Thisyieldsatotalorderontheendpointsofthechords.WeidentifyeachendpointwithitspositioninthatorderanddenotetheleftandtherightendpointofachordcbycLandcR,respectively.Becauseweassumethatnotwochordshaveacommonendpoint,therewillbeexactlytwiceasmanypointsonthelineastherearechordsandthereforeverticesinG.3.DirecteachchordcfromcLtocR.Thisdefinesadependencygraphwhosearcsarethedirectedchordsandwhoseverticesare(thepositionsof)thechords’endpoints.Thecrucialpropertyofthisconstructionisthates-tablishesaone-to-onecorrespondencebetweentheverticesofGandthearcsofH.Moreformally,letGD.VG;E/andHD.VH;A/.TheconstructionestablishesabijectionaWVG!Abya.v/Dc.v/L!c.v/RwherecistheuniquechordcorrespondingtothevertexvinthechorddrawingofG.ThedependencygraphHcanbecomputedinpoly-nomialtimeinthesizeofG.Thenon-obviouspartisstep1;thiscanbecarriedoutintimequadraticinthenumberofverticesofGusingthealgorithmofSpinrad(1994).8Thedependencygraphobtainedbythisconstructionises-sentiallywhatUnger(1992)callstheoverlapgraphmodelofacirclegraph,exceptthatitsedgesaredirected.

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

568

Wenowclaimthataninstance.G;m/ofk-CIGyieldsa“yes”answerifandonlyifthecorrespondinginstance.H;m/ofMaximumSubgraphfordepen-dencygraphswithpagenumberatmostkdoes.Assumethat.G;m/yieldsapositiveanswer.ThismeansthatthereexistsasubsetofverticesV0G(cid:18)VGwithjV0Gj(cid:21)mandsuchthattheverticesinthissetcanbecoloredwithatmostkcolorsinsuchawaythatnotwoadjacentverticeshavethesamecolor.Withoutlossofgeneralityweassumethatthecolorsarenumbersbetween1andk.Letf.v/denotethecolorassignedtov.Considerthesetofarcscorre-spondingtoV0G,A0Dfa.v/jv2V0Gg.WeclaimthatA0isasolutionsetfor.H;m/.ClearlyjA0j(cid:21)mandH0D.VH;A0/isadependencygraph.WeshowthatH0canbeembeddedintoak-book.Placeeveryarca.v/2A0onpagef.v/.Assumenowthattwoarcsa1Da.v1/anda2Da.v2/areplacedonthesamepageandcrosseachother.Thisimpliesthatf.v1/Df.v2/andthatthereisanedgebetweenv1andv2inG.Becausev1;v22V0G,weseethatthisedgealsoappearsinGjV0G.ThiscontradictstheassumptionthatthesetV0Gisk-colorable.Assumethat.H;m/yieldsapositiveanswer.ThismeansthatthereexistsasetofarcsA0withjA0j(cid:21)mandsuchthatthereisanassignmentofpagenumberstothearcsinthissetsuchthatnotwoarcswiththesamepagenumbercrosseachother.Letf.a/denotethepagenumberassignedtoa.ConsiderthesetofverticescorrespondingtoA0,V0GDfv2VGja.v/2A0g.WeclaimthatV0Gisasolutionsetfor.G;m/.ClearlyjV0j(cid:21)m.ToshowthatGjV0Gisk-colorable,weinterpretthepageassignmentfasak-coloringoftheverticesinV0Gintheobviousway:Wecoloreachvertexv2V0Gwiththepagenumberofthecorrespondingarca.v/2A0.Nowconsidertwoarbitraryverticesv1andv2thatareadjacentinGjV0.Byconstruction,thearcsa.v1/anda.v2/crossinH;thereforef.a.v1//¤f.a.v1//andv1andv2areassigneddifferentcolorsinGjV0.ThiscompletestheproofofTheorem2.7ConclusionThegoalofthispaperwastogeneralizemaximumspanningtreedependencyparsingtotargetstructuresthatarenotnecessarilytree-shaped.Becausepars-ingtounrestricteddependencygraphsisintractable,westudiedtheproblemundertherestrictiontonon-crossinggraphs,whichgeneralizeprojectivedepen-dencytreesastheyareknownfromsyntacticparsing.Wepresentedacubic-timeparsingalgorithmforthisclassofgraphs,extendedthealgorithmintoaprac-ticalparserthatweevaluatedonfourlinguisticdatasets,andfinallyprovedthatthe(naturel)stepbeyondthenoncrossingconditiontodependencygraphswithpagenumberatmostkrendersparsingintractable.Themaincontributionsofthispaperaretheoret-ical.Tothebestofourknowledge,oursisthefirstexact-inferencealgorithmforthefullclassofnon-crossingdependencygraphs.ThesimilaralgorithmbySchluter(2015),whichwasdevelopedcontempo-raneouslybutindependentlyofours,isrestrictedto(weakly)connectedgraphs.Thisisaseverepracti-callimitationwhensemanticallyvacuoustokensareanalyzedasunconnectednodes,assuchananalysisrendersmostgraphsunconnected.9Anotherdiffer-encebetweenouralgorithmandthealgorithmofSchluter(2015)isthatthelatterdoesnothavetheuniquenesspropertydiscussedinSection4.4.Aninterestingfollow-upquestiontoourresultinSection6.2iswhetherMaximumSubgraphremainsNP-hardwhenthecandidatespaceisrestrictedtotreeswithpagenumberatmostk.ThisquestionwasraisedbyGómez-RodríguezandNivre(2013),whopresentagreedyparserforthecasekD2.Weviewouralgorithmasacanonicalgeneraliza-tionoftheEisnerandSatta(1999)parsingalgorithmforprojectivedependencytrees,andexpectittoserveasasimilarpointofdepartureforfutureextensionsoftheparadigm.Ontheonehand,itseemsinter-estingtoexploretheuseofmoreexpressivefeaturemodels,includingthegeneralizationofourarc-fac-toredmodeltothegrandparentandsiblingfeaturesofCarreras(2007)andKooandCollins(2010).Ontheotherhand,giventhelowcoverageofnoncrossingdependencygraphs,itseemsnecessarytoexplorethegeneralizationtonew,“mildly”crossingclassesofgraphs,suchasagraphversionofthe1-endpointcrossingtreesofPitleretal.(2013).Pitler(2014)showsthatthetwodirectionsmayalsobecombined,whichwehopewillleadtonew,moreaccuratealgo-rithmsforsemanticdependencyparsing.9ForeachofthedatasetsintroducedinSection3.3.1,lessthan1%ofthegraphsarebothnoncrossingandconnected.

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

569

AcknowledgmentsWearegratefulto:JordanTirrell,whoseanswertoaquestiononMathOverflowprovidedtheseedfortheparsingalgorithminSection4;EmilyPitler,whoproposedapresentationalsimplificationofthealgorithm;theparticipantsoftheDagstuhlSeminar15122“FormalModelsofGraphTransformationinNaturalLanguageProcessing”,fordiscussionsthateventuallyledtothehardnessresultinSection6;andtheanonymousreviewersandtheactioneditorforconstructivefeedback.TheresearchofMarcoKuhlmannisfundedbytheLinköpingInstituteofTechnologyundertheCENIITprogram.ReferencesMichaelAuliandAdamLopez.2011.Trainingalog-linearparserwithlossfunctionsviasoftmax-margin.InProceedingsofthe2011ConferenceonEmpiricalMethodsinNaturalLanguageProcessing(EMNLP),pages333–343,Edinburgh,UK.FrankBernhartandPaulC.Kainen.1979.Thebookthicknessofagraph.JournalofCombinatorialTheory,SeriesB,27(3):320–331.XavierCarreras,MihaiSurdeanu,andLluísMàrquez.2006.Projectivedependencyparsingwithperceptron.InProceedingsoftheTenthConferenceonCompu-tationalNaturalLanguageLearning(CoNLL),pages181–185,NewYork,USA.XavierCarreras.2007.Experimentswithahigher-orderprojectivedependencyparser.InProceedingsoftheCoNLLSharedTaskSessionofEMNLP-CoNLL2007,pages957–961,Prague,CzechRepublic.MichaelCollins.2002.DiscriminativetrainingmethodsforHiddenMarkovModels:Theoryandexperimentswithperceptronalgorithms.InProceedingsoftheCon-ferenceonEmpiricalMethodsinNaturalLanguageProcessing(EMNLP),pages1–8,Philadelphia,USA.JasonCongandChungLaungLiu.1991.Onthek-layerplanarsubsetandtopologicalviaminimizationproblems.IEEETrans.onCADofIntegratedCircuitsandSystems,10(8):972–981.KobyCrammer,OferDekel,JosephKeshet,ShaiShalev-Shwartz,andYoramSinger.2006.Onlinepassive-aggressivealgorithms.JournalofMachineLearningResearch,7:551–585.YantaoDu,WeiweiSun,andXiaojunWan.2015a.Adata-driven,factorizationparserforCCGdependencystructures.InProceedingsofthe53rdAnnualMeet-ingoftheAssociationforComputationalLinguisticsandthe7thInternationalJointConferenceofNaturalLanguageProcessingoftheAsianFederationofNat-uralLanguageProcessing,pages1545–1555,Beijing,China.YantaoDu,FanZhang,XunZhang,WeiweiSun,andXiaojunWan.2015b.Peking:Buildingsemanticde-pendencygraphswithahybridparser.InProceedingsofthe9thInternationalWorkshoponSemanticEval-uation(SemEval2015),pages927–931,Denver,CO,USA.JasonEisnerandGiorgioSatta.1999.Efficientparsingforbilexicalcontext-freegrammarsandHeadAutoma-tonGrammars.InProceedingsofthe37thAnnualMeetingoftheAssociationforComputationalLinguis-tics(ACL),pages457–464,CollegePark,MARYLAND,USA.PhilippeFlajoletandMarcNoy.1999.Analyticcom-binatoricsofnon-crossingconfigurations.DiscreteMathematics,204(1–3):203–229.YoavFreundandRobertE.Schapire.1999.Largemarginclassificationusingtheperceptronalgorithm.MachineLearning,37(3):277–296.CarlosGómez-RodríguezandJoakimNivre.2013.Di-visibletransitionsystemsandmultiplanardependencyparsing.ComputationalLinguistics,39(4):799–845.JoshuaGoodman.1999.Semiringparsing.Computa-tionalLinguistics,25(4):573–605.VenkatesanGuruswami,JohanHåstad,RajsekarManokaran,PrasadRaghavendra,andMosesCharikar.2011.Beatingtherandomorderingishard:EveryorderingCSPisapproximationresistant.SIAMJournalonComputing,40(3):878–914.JuliaHockenmaierandMarkSteedman.2005.CCGbankLDC2005T13.WebDownload.JuliaHockenmaierandMarkSteedman.2007.CCG-bank.AcorpusofCCGderivationsanddependencystructuresextractedfromthePennTreebank.Computa-tionalLinguistics,33:355–396.TerryKooandMichaelCollins.2010.Efficientthird-orderdependencyparsers.InProceedingsofthe48thAnnualMeetingoftheAssociationforComputationalLinguistics(ACL),pages1–11,Uppsala,Sweden.SandraKübler,RyanMcDonald,andJoakimNivre.2009.DependencyParsing.SynthesisLecturesonHumanLanguageTechnologies.MorganandClaypool.AndréF.T.Martins,NoahA.Smith,andEricP.Xing.2009.Conciseintegerlinearprogrammingformula-tionsfordependencyparsing.InProceedingsoftheJointConferenceofthe47thAnnualMeetingoftheAssociationforComputationalLinguistics(ACL)andtheFourthInternationalJointConferenceonNaturalLanguageProcessingoftheAsianFederationofNat-uralLanguageProcessing(IJCNLP),pages342–350,Singapore.

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
1
5
8
1
5
6
6
8
0
6

/

/
t

je

un
c
_
un
_
0
0
1
5
8
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

570

RyanMcDonald,KobyCrammer,andFernandoPereira.2005a.Onlinelarge-margintrainingofdependencyparsers.InProceedingsofthe43rdAnnualMeetingoftheAssociationforComputationalLinguistics(ACL),pages91–98,AnnArbor,USA.RyanMcDonald,FernandoPereira,KirilRibarov,andJanHajiˇc.2005b.Non-projectivedependencypars-ingusingspanningtreealgorithms.InHumanLan-guageTechnologyConference(HLT)andConferenceonEmpiricalMethodsinNaturalLanguageProcessing(EMNLP),pages523–530,Vancouver,Canada.Mark-JanNederhof.2003.WeighteddeductiveparsingandKnuth’salgorithm.ComputationalLinguistics,29(1):135–143.OEISFoundationInc.2011.Theon-lineencyclopediaofintegersequences.http://oeis.org.StephanOepen,MarcoKuhlmann,YusukeMiyao,DanielZeman,DanFlickinger,JanHajiˇc,AngelinaIvanova,andYiZhang.2014.SemEval2014Task8:Broad-coveragesemanticdependencyparsing.InProceedingsofthe8thInternationalWorkshoponSemanticEvalua-tion(SemEval2014),pages63–72,Dublin,RepublicofIreland.StephanOepen,MarcoKuhlmann,YusukeMiyao,DanielZeman,SilvieCinková,DanFlickinger,JanHajiˇc,andZdeˇnkaUrešová.2015.SemEval2015Task18:Broad-coveragesemanticdependencyparsing.InProceedingsofthe9thInternationalWorkshoponSemanticEval-uation(SemEval2015),pages915–926,Denver,CO,USA.StephanOepen.2014.SemEval2015Task18evalua-tionLDC2014E100.WebDownload.Datasetonlyavailabletotaskparticipants.EmilyPitler,SampathKannan,andMitchellMarcus.2013.Findingoptimal1-endpoint-crossingtrees.TransactionsoftheAssociationforComputationalLin-guistics,1:13–24.EmilyPitler.2014.Acrossing-sensitivethird-orderfac-torizationfordependencyparsing.TransactionsoftheAssociationforComputationalLinguistics,2:41–54.KenjiSagaeandJun’ichiTsujii.2008.Shift-reducede-pendencyDAGparsing.InProceedingsofthe22ndInternationalConferenceonComputationalLinguistics(COLING),pages753–760,Manchester,UK.MajidSarrafzadehandDer-TsaiLee.1989.Anewap-proachtotopologicalviaminimization.IEEETrans-actionsonCADofIntegratedCircuitsandSystems,8(8):890–900.NatalieSchluter.2014.OnmaximumspanningDAGalgorithmsforsemanticDAGparsing.InProceedingsoftheACL2014WorkshoponSemanticParsing,pages61–65.NatalieSchluter.2015.ThecomplexityoffindingthemaximumspanningDAGandotherrestrictionsforDAGparsingofnaturallanguage.InProceedingsoftheFourthJointConferenceonLexicalandComputationalSemantics,pages259–268,Denver,CO,USA.DanielSleatorandDavyTemperley.1993.ParsingEn-glishwithaLinkGrammar.InProceedingsoftheThirdInternationalWorkshoponParsingTechnologies(IWPT),pages277–292,Tilburg,TheNetherlands,andDurbuy,Belgium.JeremyP.Spinrad.1994.Recognitionofcirclegraphs.J.Algorithms,16(2):264–282.RobertE.Tarjan.1977.Findingoptimumbranchings.Networks,7:25–35.IvanTitov,JamesHenderson,PaolaMerlo,andGabrieleMusillo.2009.Onlinegraphplanarisationforsyn-chronousparsingofsemanticandsyntacticdependen-cies.InInternationalJointConferencesonArtificialIntelligence,pages1562–1567,Pasadena,Californie,USA.WalterUnger.1988.Onthek-colouringofcircle-graphs.InRobertCoriandMartinWirsing,editors,STACS88,5thAnnualSymposiumonTheoreticalAspectsofComputerScience,Bordeaux,France,February11–13,1988,Proceedings,volume294ofLectureNotesinComputerScience,pages61–72.Springer.WalterUnger.1992.Thecomplexityofcolouringcir-clegraphs(extendedabstract).InAlainFinkelandMatthiasJantzen,editors,STACS92,9thAnnualSym-posiumonTheoreticalAspectsofComputerScience,Cachan,France,February13-15,1992,Proceedings,volume577ofLectureNotesinComputerScience,pages389–400.Springer.AndrewJ.Viterbi.1967.Errorboundsforconvolu-tionalcodesandanasymptoticallyoptimumdecodingalgorithm.IEEETransactionsonInformationTheory,13(2):260–269.
Télécharger le PDF