Transactions of the Association for Computational Linguistics, 2 (2014) 405–418. Action Editor: Mark Steedman.

Transactions of the Association for Computational Linguistics, 2 (2014) 405–418. Action Editor: Mark Steedman.
Submitted 4/2014; Überarbeitet 8/2014; Published 10/2014. C(cid:13)2014 Verein für Computerlinguistik.

405

ANewParsingAlgorithmforCombinatoryCategorialGrammarMarcoKuhlmannDepartmentofComputerandInformationScienceLinköpingUniversity,Swedenmarco.kuhlmann@liu.seGiorgioSattaDepartmentofInformationEngineeringUniversityofPadua,Italysatta@dei.unipd.itAbstractWepresentapolynomial-timeparsingalgo-rithmforCCG,basedonanewdecompositionofderivationsintosmall,shareableparts.Ouralgorithmhasthesameasymptoticcomplex-ity,O.n6/,asapreviousalgorithmbyVijay-ShankerandWeir(1993),butiseasiertoun-derstand,implement,andprovecorrect.1IntroductionCombinatoryCategorialGrammar(CCG;SteedmanandBaldridge(2011))isalexicalizedgrammarfor-malismthatbelongstotheclassofso-calledmildlycontext-sensitiveformalisms,ascharacterizedbyJoshi(1985).CCGhasbeensuccessfullyusedforawiderangeofpracticaltasksincludingdata-drivenparsing(ClarkandCurran,2007),wide-coveragese-manticconstruction(Bosetal.,2004;Kwiatkowskietal.,2010;LewisandSteedman,2013)andmachinetranslation(Weeseetal.,2012).SeveralparsingalgorithmsforCCGhavebeenpresentedintheliterature.Earlierproposalsshowrunningtimeexponentialinthelengthoftheinputstring(PareschiandSteedman,1987;Tomita,1988).AbreakthroughcamewiththeworkofVijay-ShankerandWeir(1990)andVijay-ShankerandWeir(1993)whoreportthefirstpolynomial-timealgorithmforCCGparsing.Untilthisday,thisalgorithm,whichweshallrefertoastheV&Walgorithm,remainstheonlypublishedpolynomial-timeparsingalgorithmforCCG.However,wearenotawareofanypracticalparserforCCGthatactuallyusesit.Wespeculatethatthishastwomainreasons:Erste,someauthorshavearguedthatlinguisticresourcesavailableforCCGcanbecoveredwithcontext-freefragmentsoftheformalism(FowlerandPenn,2010),forwhichmoreefficientparsingalgorithmscanbegiven.Sec-ond,theV&Walgorithmisconsiderablymorecom-plexthanparsingalgorithmsforequivalentmildlycontext-sensitiveformalisms,suchasTree-Adjoin-ingGrammar(JoshiandSchabes,1997),andisquitehardtounderstand,implement,andprovecorrect.TheV&Walgorithmisbasedonaspecialdecom-positionofCCGderivationsintosmallerpartsthatcanthenbesharedamongdifferentderivations.Thissharingisthekeytothepolynomialruntime.Inthisarticlewebuildonthesameidea,butdevelopanalternativepolynomial-timealgorithmforCCGparsing.ThenewalgorithmisbasedonadifferentdecompositionofCCGderivations,andisarguablysimplerthantheV&Walgorithminatleasttwore-spects:Erste,thenewalgorithmusesonlythreebasicsteps,againsttheninebasicstepsoftheV&Wparser.Second,thecorrectnessproofofthenewalgorithmissimplerthantheonereportedbyVijay-ShankerandWeir(1993).ThenewalgorithmrunsintimeO.n6/wherenisthelengthoftheinputstring,thesameastheV&Wparser.Weorganizeourpresentationasfollows.InSec-tion2weintroduceCCGandthecentralnotionofderivationtrees.InSection3westartwithasimplebutexponential-timeparserforCCG,fromwhichwederiveourpolynomial-timeparserinSection4.Section5furthersimplifiesthealgorithmandprovesitscorrectness.WethenprovideadiscussionofouralgorithmandpossibleextensionsinSection6.Sec-tion7concludesthearticle.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

406

2CombinatoryCategorialGrammarWeassumebasicfamiliaritywithCCGingeneralandtheformalismofWeirandJoshi(1988)inparticular.Inthissectionwesetupourterminologyandnotation.ACCGhastwomainparts:alexiconthatassociateswordswithcategories,andrulesthatspecifyhowcategoriescanbecombinedintoothercategories.Together,thesecomponentsgiverisetoderivationssuchastheoneshowninFigure1.2.1LexiconTheCCGlexiconisafinitesetofword–categorypairswWDX.1Categoriesarebuiltfromafinitesetofatomiccategoriesandtwobinaryoperators:forwardslash(=)andbackwardslash(=).Atomiccategoriesrepresentthesyntactictypesofcompleteconstituents;theyincludeadistinguishedcategorySforcompletesentences.Aconstituentwiththecom-plexcategoryX=YrepresentsafunctionthatseeksaconstituentofcategoryYimmediatelytoitsrightandreturnsaconstituentofcategoryX;similarly,X=YrepresentsafunctionthatseeksaYtoitsleft.Wetreatslashesasleft-associativeoperatorsandomitunnecessaryparentheses.Bythisconvention,everycategoryXcanbewrittenasXDAjmXm(cid:1)(cid:1)(cid:1)j1X1wherem(cid:21)0,AisanatomiccategorycalledthetargetofXandthejiXiareslash–categorypairscalledtheargumentsofX.Weviewtheseargumentsasbeingarrangedinastackwithj1X1atthetopandjmXmatthebottom.ThusanotherwayofwritingthecategoryXaboveisasXDA˛,where˛isa(possiblyempty)stackofmarguments.ThenumbermiscalledthearityofX;wedenoteitbyar.X/.2.2RulesTherulesofCCGaredirectedversionsof(general-ized)functionalcomposition.Therearetwoforms,forwardrulesandbackwardrules:X=YYjdYd(cid:1)(cid:1)(cid:1)j1Y1)XjdYd(cid:1)(cid:1)(cid:1)j1Y1.>d/YjdYd(cid:1)(cid:1)(cid:1)j1Y1X=Y)XjdYd(cid:1)(cid:1)(cid:1)j1Y1.1Harry…….NPS=NP>0S<0Figure1:Asamplederivationtree.Everyruleisobtainedbychoosingaspecificdegreed(cid:21)0andspecificdirections(forwardorbackward)foreachoftheslashesji,whileX,YandtheYiarevariablesrangingovercategories.Thusforeverydegreed(cid:21)0thereare2dforwardrulesand2dbackwardrules.Therulesofdegree0arecalledapplicationrules.Incontextswherewerefertobothapplicationandcomposition,weusethelattertermfor“proper”compositionrulesofdegreed>0.NotethatinmostofthisarticleweignoreadditionalrulesrequiredforlinguisticanalysiswithCCG,inparticulartype-raisingandsubstitution.WebrieflydiscusstheserulesinSection6.EveryCCGgrammarrestrictsitselftoafinitesetofrules,buteachsuchrulemaygiverisetoinfinitelymanyruleinstances.Aruleisinstantiatedbysub-stitutingconcretecategoriesforthevariables.Forexample,thederivationinFigure1containsthefol-lowinginstanceofforwardcomposition(>1):S=NP=.S=NP/S=NP=NP)S=NP=NPNotethatweoverloadthedoublearrowtodenotenotonlyrulesbutalsoruleinstances.Givenarulein-stance,thecategorythatinstantiatesthepatternX=Y(forward)orX=Y(backward)iscalledtheprimaryinput,andthecategorythatinstantiatesthepatternYjdYd(cid:1)(cid:1)(cid:1)j1Y1iscalledthesecondaryinput.Adopt-ingourstack-basedview,eachrulecanbeunderstoodasanoperationonargumentstacks:popjYoffthestackoftheprimaryinput;popthejiYioffthestackofthesecondaryinputandpushthemtothestackoftheprimaryinput(preservingtheirorder).TheformalismofWeirandJoshi(1988)allowstorestrictthevalidinstancesofindividualrules.Similartoourtreatmentofadditionalcombinatoryrules,inmostofthisarticleweignoretheserulerestrictions;butseethediscussioninSection6.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

407

w….XiffwWDXX=YYˇXˇt1t2iffX=YYˇ)XˇYˇX=YXˇt2t1iffYˇX=Y)XˇFigure2:Recursivedefinitionofderivationtrees.Nodeslabeledwithprimaryinputcategoriesareshaded.2.3DerivationTreesThesetofderivationtreesofaCCGcanbeformallydefinedasinFigure2.ThereandintheremainderofthisarticleweuseˇandothersymbolsfromthebeginningoftheGreekalphabettodenotea(pos-siblyempty)stackofarguments.Derivationtreesconsistofunarybranchingsandbinarybranchings:unarybranchings(drawnwithdottedlines)corre-spondtolexiconentries;binarybranchingscorre-spondto(validinstancesof)compositionrules.Theyieldofaderivationtreeistheleft-to-rightsequenceofitsleaves.Thetypeofaderivationtreeisthecategoryatitsroot.3CKY-StyleParsingAlgorithmAsthepointofdepartureforourownwork,wenowintroduceastraightforward,CKY-styleparsingalgo-rithmforCCGs.ItisasimplegeneralizationofthealgorithmpresentedbyShieberetal.(1995),whichisrestrictedtogrammarswithrulesofdegree0or1.Asinthatarticle,wespecifyouralgorithmintermsofagrammaticaldeductionsystem.3.1DeductionSystemWearegivenaCCGandastringwDw1(cid:1)(cid:1)(cid:1)wntobeparsed,whereeachwiisalexicaltoken.Asageneralnotation,forintegersi;jwith0(cid:20)ich(cid:20)J(cid:20)nwewritewŒi;J(cid:141)todenotethesubstringwiC1(cid:1)(cid:1)(cid:1)wjofw.Asusual,wetakewŒi;ich(cid:141)tobetheemptystring.ItemsTheCKY-stylealgorithmusesalogicwithitemsoftheformŒX;ich;J(cid:141)whereXisacategoryandi;jarefencepostpositionsinw.TheintendedinterpretationofsuchanitemistoassertthatwecanbuildaderivationtreewithyieldwŒi;J(cid:141)andtypeX.ThegoalofthealgorithmistheconstructionoftheitemŒS;0;N(cid:141),whichassertstheexistenceofaderivationtreefortheentireinputstring.(RecallthatSisthedistinguishedcategoryforsentences.)AxiomsandInferenceRulesThestepsoftheal-gorithmarespecifiedbymeansofinferencerulesoveritems.Theserulesimplementtherecursivedefi-nitionofderivationtreesgiveninFigure2.Thecon-structionstartswithaxiomsoftheformŒX;ich;iC1(cid:141)wherewiC1WDXisalexiconentry;theseitemsasserttheexistenceofaunary-branchingderivationtreeoftheformshownintheleftofFigure2foreachlexicaltokenwiC1.Thereisoneinferenceruleforeveryforwardrule(applicationorcomposition):ŒX=Y;ich;J(cid:141)ŒYˇ;J;k(cid:141)ŒXˇ;ich;k(cid:141)X=YYˇ)(1)Asymmetricalruleisusedforbackwardapplicationandcomposition.However,hereandintheremainderofthearticleweonlyspecifytheforwardversionofeachruleandleavethebackwardversionimplicit.3.2CorrectnessandRuntimeThesoundnessandcompletenessoftheCKY-stylealgorithmcanbeprovedbyinductiononthenumberofinferencesandthenumberofnodesinaderivationtree,respectively.Itisnothardtoseethat,inthegeneralcase,thealgorithmusesanamountoftimeandspaceexponen-tialwithrespecttothelengthoftheinputstring,n.Thisisbecauserule(1)maybeusedtogrowthearityofprimaryinputcategoriesuptosomelinearfunctionofn,resultinginexponentiallymanycate-gories.2Notethatthisisonlypossibleifthereareruleswithdegree2ormore.Forgrammarsrestrictedtoruleswithdegree0or1,suchasthoseconsideredbyShieberetal.(1995),theruntimeofthealgorithmiscubicinn.Thisrestrictedclassofgrammarsonlyholdscontext-freegenerativepower,whilethepowerofgeneralCCGisbeyondthatofcontext-freegram-mars(Vijay-ShankerandWeir,1994).2Categorieswhosearityisnotboundedbyalinearfunc-tionofnarenotuseful,inthesensethattheycannotoccurincompletederivations.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

408

wŒi0;j0(cid:141)t0wŒi;i0(cid:141)wŒj0;J(cid:141)X=YcXˇ(A)treetwŒi00;i0(cid:141)wŒj0;j00(cid:141)Xj1Yc1wŒi;i00(cid:141)wŒj00;J(cid:141)Xˇj2Zc2Xˇ(cid:13)(B)contextcFigure3:Decompositionofderivations.4APolynomial-TimeAlgorithmWenowintroduceourpolynomial-timealgorithm.ThisalgorithmusesthesameaxiomsandthesamegoalitemastheCKY-stylealgorithm,butfeaturesnewitemsandnewinferencerules.4.1NewItemsInafirststep,inordertoavoidruntimeexponentialinnwerestrictouritemsettocategorieswhosearityisboundedbysomegrammarconstantcG:ŒX;ich;J(cid:141)wherear.X/(cid:20)cGTheexactchoiceoftheconstantwillbediscussedinSection5.Withtherestricteditemset,thenewalgorithmbehavesliketheoldoneaslongasthearityofcategoriesdoesnotexceedcG.However,rule(1)aloneisnolongercomplete:Derivationswithcate-gorieswhosearityexceedscGcannotbesimulatedanymore.Toremedythisdeficiency,weintroduceanewtypeofitemtoimplementaspecificdecomposi-tionoflongderivationsintosmallerpieces.ConsideraderivationtoftheformshowninFig-ure3(A).NotethattheyieldoftiswŒi;J(cid:141).Thederivationconsistsoftwoparts,namedt0andc;theseshareacommonnodewithacategoryoftheformX=Y.NowassumethatchasthespecialpropertythatnoneofthecombinatoryrulesthatitusespopstheargumentstackofthecategoryX.Thismeansthatc,afterpoppingtheargument=Y,maypushnewargumentsandpopthemagain,butmaynever“touch”X.Wecallafragmentwiththisspecialprop-ertyaderivationcontext.(AformaldefinitionwillbegiveninSection5.2.)Thespecialpropertyofcisusefulbecauseitim-pliesthatccanbecarriedoutforanychoiceofX.Tobemorespecific,letuswriteˇforthe(possiblyempty)sequenceofargumentsthatcpushestotheargumentstackofXinplaceof=Y.Weshallreferto=Yasthebridgingargumentandtothesequenceˇastheexcessofc.Supposenowthatwereplacet0byaderivationtreewiththesameyieldbutwithatypeX0=YwhereX0¤X.ThenbecausecdoesnottouchX0weobtainanothervalidderivationtreewiththesameyieldast;thetypeofthistreewillbeX0ˇ.Forthecombinationwithc,theinternalstructureoft0isofnoimportance;theonlyimportantinforma-tionistheextentoftheyieldoft0andtheidentityofthebridgingargument=Y.Intermsofourdeductionsystem,thiscanbeexpressedasfollows:Thederiva-tioncontextccanbecombinedwithanytreet0thatisassociatedwithanitemoftheformŒX=Y;i0;j0(cid:141),whereXisanycategory.Similarly,theinternalstruc-tureofcisofnoimportanceeither,aslongastheargumentstackofthecategoryXremainsuntouched.Itsufficestorecordthefollowing:(cid:15)theextentoftheyieldoft,specifiedintermsofthepositionsiandj;(cid:15)theextentoftheyieldoft0,specifiedintermsofthepositionsi0andj0;(cid:15)thebridgingargument=Y;Und(cid:15)theexcessˇ.WerepresentthesepiecesofinformationinanewtypeofitemoftheformŒ=Y;ˇ;ich;i0;j0;J(cid:141).Thein-tendedinterpretationoftheseitemsistoassertthat,foranychoiceofX,ifwecanbuildaderivationtreet0withyieldwŒi0;j0(cid:141)andtypeX=Y,thenwecanalsobuildaderivationtreet0withyieldwŒi;J(cid:141)andtypeXˇ.WealsouseitemsŒ=Y;ˇ;ich;i0;j0;J(cid:141)withabackwardslash,withasimilarmeaning.Likeitemsthatrepresentderivationtrees,ouritemsforderivationcontextsarearity-restricted:ŒjY;ˇ;ich;i0;j0;J(cid:141)wherear.Yˇ/(cid:20)cGAswewillseeinSection5,theserestricteditemssufficetosimulateallderivationsofaCCG.Further-more,thiscanbedoneintimepolynomialinn,be-causeourencodingallowssharingofthesameitemsamongseveralderivations.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

409

ŒA;0;1(cid:141)ŒS=H=A=F;2;5(cid:141)ŒB;1;2(cid:141)ŒC=A=F;2;3(cid:141)ŒS=E;3;4(cid:141)ŒE=H=C;4;5(cid:141)ŒS=H=C;3;5(cid:141)(1)ŒS=H=A=F;2;5(cid:141)(1)ŒF=G=B;5;6(cid:141)Œ=F;=G=B;2;2;5;6(cid:141)(2)(cid:142)Œ=F;=G;1;2;5;6(cid:141)(4)ŒG;6;7(cid:141)Œ=F;”;1;2;5;7(cid:141)(4)ŒS=H=A;1;7(cid:141)(3)(cid:143)ŒS=H;0;7(cid:141)(1)ŒH;7;8(cid:141)ŒS;0;8(cid:141)(1)Figure4:AsamplederivationofthegrammaticaldeductionsystemofSection4.Inference(cid:142)triggersanewcontextitemfromatreeitem;inference(cid:143)reusesthetreeitem(asindicatedbythearrow),recombiningitwiththe(modified)contextitem.4.2NewInferenceRulesInourparsingalgorithm,contextitemsareintroducedwheneverthecompositionoftwocategorieswhosearitiesareboundedbycGwouldresultinacategorywhosearityexceedsthisbound:ŒX=Y;ich;J(cid:141)ŒYˇ;J;k(cid:141)Œ=Y;ˇ;ich;ich;J;k(cid:141)8<:X=YYˇ)Xˇar.Xˇ/>cG(2)Thenewrulehasthesameantecedentsasrule(1),butratherthanextendingthederivationassertedbythefirstantecedentŒX=Y;ich;J(cid:141),whichisnotpossiblebe-causeofthearitybound,ittriggersanewderivationcontext,assertedbytheitemŒ=Y;ˇ;ich;ich;J;k(cid:141).Fur-therapplicationsandcompositionswillextendthenewcontext,andonlywhentheexcessofthiscontexthasbecomesufficientlysmallwillitberecombinedwiththederivationthatoriginallytriggeredit.Thisisdonebythefollowingrule:ŒXjY;i0;j0(cid:141)ŒjY;ˇ;ich;i0;j0;J(cid:141)ŒXˇ;ich;J(cid:141)(3)Notethatthisrule(likeallrulesinthedeductionsystem)isonlydefinedonvaliditems;inparticularitonlyfiresifthearityofthecategoryXˇisboundedbycG.Theremainingrulesofthealgorithmparallelthethreerulesthatwehaveintroducedsofarbuttakeitemsthatrepresentderivationcontextsratherthanderivationtreesastheirfirstantecedents.Firstout,rule(4)extendsaderivationcontextinthesamewayasrule(1)extendsaderivationtree.ŒjY;ˇ=Z;ich;i0;j0;J(cid:141)ŒZ(cid:13);J;k(cid:141)ŒjY;ˇ(cid:13);ich;i0;j0;k(cid:141)8<:X=ZZ(cid:13))X(cid:13)ar.Yˇ(cid:13)/(cid:20)cG(4)Rule(5)istheobviouscorrespondentofrule(2):Ittriggersanewcontextwhentheantecedentcontextcannotbeextendedbecauseofthearitybound.ŒjY;ˇ=Z;i;i0;j0;j(cid:141)ŒZ(cid:13);j;k(cid:141)Œ=Z;(cid:13);i;i;j;k(cid:141)8<:X=ZZ(cid:13))X(cid:13)ar.Yˇ(cid:13)/>cG(5)Endlich,andparalleltorule(3),weneedaruletorecombineacontextwiththecontextthatoriginallytriggeredit.Asitwillturnout,weonlyneedthisincaseswherethetriggeredcontexthasnoexcess.Œj1Y;ˇj2Z;i00;i0;j0;j00(cid:141)Œj2Z;”;ich;i00;j00;J(cid:141)Œj1Y;ˇ;ich;i0;j0;J(cid:141)(6)4.3SampleDerivationWenowillustrateouralgorithmonatoygrammar.Thegrammarhasthefollowinglexicon:w1WDAw5WDE=H=Cw2WDBw6WDF=G=Bw3WDC=A=Fw7WDGw4WDS=Ew8WDHThestartsymbolisS.Thegrammarallowsallin-stancesofapplicationandallinstancesofcomposi-tionwithdegreeboundedby2.WeletcGD3(asexplainedlaterinSection5.2).Aderivationofourdeductionsystemonthein-putstringw1(cid:1)(cid:1)(cid:1)w8isgiveninFigure4.Westartbyapplyingrule(1)twice(onceforward,onceback-ward)toobtaintheitemŒS=H=A=F;2;5(cid:141).Com-biningthisitemwiththeaxiomŒF=G=B;5;6(cid:141)isnotpossibleusingrule(1),asthiswouldresultin

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

410

acategorywitharity4,exceedingthearitybound.Wethereforeuserule(2)totriggerthecontextitemŒ=F;=G=B;2;2;5;6(cid:141)((cid:142)).Successively,weuserule(4)twicetoobtaintheitemŒ=F;”;1;2;5;7(cid:141).Atthispointweuserule(3)(withˇD”)torecombinethecontextitemwiththetreeitemthatoriginallytriggeredit((cid:143));thisyieldstheitemŒS=H=A;1;7(cid:141).Notethattherecombinationeffectivelyretrievestheportionofthestackthatwasbelowtheargument=Fwhenthecontextitemwastriggeredin(cid:142).Doubleap-plicationofrule(1)producesthegoalitemŒS;0;8(cid:141).4.4RuntimeAnalysisWenowturntoananalysisoftheruntimecomplexityofouralgorithm.Wefirstconsiderruntimecomplex-itywithrespecttothelengthoftheinputstring,n.Theruntimeisdominatedbythenumberofinstantia-tionsofrule(6)whichinvolvestwocontextitemsasantecedents.Byinspectionofthisrule,weseethatthenumberofpossibleinstantiationsisboundedbyn6.ThereforeweconcludethatthealgorithmrunsintimeO.n6/.Wenowconsiderruntimecomplexitywithrespecttothesizeoftheinputgrammar.Heretheruntimeisdominatedbythenumberofinstantiationsofrules(1)–(5).Forexample,rule(5)combinesitemsŒjY;ˇ=Z;ich;i0;j0;J(cid:141)andŒZ(cid:13);J;k(cid:141):Byourrestrictionsonitems,boththearityofYˇ=ZandthearityofZ(cid:13)areupper-boundedbythecon-stantcG.NowrecallthateverycategoryXcanbewrittenasXDA˛forsomeatomiccategoryAandstackofarguments˛.LetAbethesetofatomiccategoriesintheinputgrammar,andletLbethesetofallargumentsoccurringinanycategoryofthelexicon.ByaresultofVijay-ShankerandWeir(1994,Lemma3.1),everyargumentthatmayoccurinaderivedcategoryoccursinL.Thenthenumberofpossibleinstantiationsofrule(5)aswellasrules(1)–(4),andhencetheruntimeofthealgorithm,isinO.jAjjLjcG(cid:1)jLjcG/DO.jAjjLj2cG/:NotethatbothAandLmaygrowwiththegrammarsize.AswewillseeinSection5.2,theconstantcGalsodependsonthegrammarsize.Thismeansthattheworst-caseruntimecomplexityofourparserisexponentialinthesizeoftheinputgrammar.WewillreturntothispointinSection7.5CorrectnessInthissectionweprovethecorrectnessofourparsingalgorithm.Inordertosimplifytheproofs,westartbysimplifyingouralgorithm,atthecostofmakingitlessefficient:(cid:15)Weremovetherulesforextendingtreesandcontexts,rule(1)andrule(4).(cid:15)Weconflatetherulesfortriggeringcontexts,rule(2)andrule(5),intothesingleruleŒYˇ;J;k(cid:141)Œ=Y;ˇ;ich;ich;J;k(cid:141)X=YYˇ)(7)Thisruleblindlyguessestheextensionofthetriggeringtreeorcontext(specifiedbyposi-tionsiandj),ratherthanwaitingforacor-respondingitemtobederived.ThesimplifiedalgorithmisspecifiedinFigure5.WenowarguethatthisalgorithmandthealgorithmfromSection4parseexactlythesamederivationtrees,althoughtheyusedifferentparsingstrategies.First,weobservethatrule(1)intheoldalgorithmcanbesimulatedbyacombinationofotherrulesinthesimplifiedalgorithm,asfollows:ŒX=Y;ich;J(cid:141)ŒYˇ;J;k(cid:141)Œ=Y;ˇ;ich;ich;J;k(cid:141)(7)ŒXˇ;ich;k(cid:141)(3)Außerdem,thesimplifiedalgorithmdoesnolongerneedrule(4),whoseroleisnowtakenoverbyrule(7).Toseethis,recallthatrule(4)extendsanexistingcontextwheneverthecompositionoftwocategoriesresultsinanewcategorywhosearitydoesnotexceedcG.Incontrast,rule(7)alwaystriggersanewcontextc,eveniftheresultofthecompositionofcwithsomeexistingcontextsatisfiestheabovearityrestriction.Despitethedifferenceintheadoptedstrategy,thesetworulesareequivalentintermsofstackcontent,leadingtothesamederivationtrees.5.1DefinitionsWeintroducesomeadditionalterminologyandnota-tionthatwewilluseintheproofs.Foraderivationtreetandanodeuoft,wewritetŒu(cid:141)todenotethecategoryatu,andwewritetjutodenotethesub-treeoftatu.Formally,tjuistherestrictionofttonodeuandallofitsdescendants.Eachsubtreeofaderivationtreeisanotherderivationtree.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

411

Itemsoftype1:ŒX;ich;J(cid:141),ar.X/(cid:20)cGItemsoftype2:ŒjY;ˇ;ich;i0;j0;J(cid:141),ar.Yˇ/(cid:20)cGAxioms:ŒX;ich;iC1(cid:141),wiC1WDXGoal:ŒS;0;N(cid:141)Inferencerules:ŒYˇ;J;k(cid:141)Œ=Y;ˇ;ich;ich;J;k(cid:141)X=YYˇ)(7)ŒX=Y;i0;j0(cid:141)Œ=Y;ˇ;ich;i0;j0;J(cid:141)ŒXˇ;ich;J(cid:141)(3)Œj1Y;ˇj2Z;i00;i0;j0;j00(cid:141)Œj2Z;”;ich;i00;j00;J(cid:141)Œj1Y;ˇ;ich;i0;j0;J(cid:141)(6)Figure5:Itemsandinferencerulesofthesimplifiedalgorithmforaninputstringw1(cid:1)(cid:1)(cid:1)wn.Definition1Lettbeaderivationtreewithrootr.ThenthassignatureŒX;ich;J(cid:141)if1.theyieldoftiswŒi;J(cid:141)and2.thetypeoftisX,thatis,tŒr(cid:141)DX.Notethatwhileweusethesamenotationforsig-naturesasforitems,thesignatureofaderivationtreeisapurelystructuralconcept,whereasanitemisanobjectinthealgorithm.Acentralconceptinourproofisthenotionofspine.Recallthataderivationtreeconsistsofunarybranchingsandbinarybranchings.Ineachbinarybranching,werefertothetwochildrenofthebranch-ing’srootnodeastheprimarychildandthesec-ondarychild,dependingonwhichofthetwoisla-beledwiththeprimaryandsecondaryinputcategoryofthecorrespondingruleinstance.InFigure2,theprimarychildrenoftherootnodeareshaded.Definition2Foraderivationtreet,thespineoftistheuniquepaththatstartsattherootnodeoftandateachnodeucontinuestotheprimarychildofu.Thespineofaderivationtreealwaysendsatanodethatislabeledwithacategoryfromthelexicon.Definition3Lettbeaderivationtreewithrootr.Aderivationcontextcisobtainedbyremovingallproperdescendantsofsomenodef¤ronthespineoft,undertherestrictionthatar.tŒu(cid:141)/>ar.tŒr(cid:141)/foreverynodeuonthespineproperlybetweenfandr.Thenodefiscalledthefootnodeofc.Theyieldofcisthepairwhosefirstcomponentistheyieldofttotheleftoffandwhosesecondcomponentistheyieldofttotherightoff.Foraderivationcontextcandanodeuofc,wewritecŒu(cid:141)todenotethecategoryatu.Definition3formalizestheconceptofderivationcontextsthatweintroducedinSection4.1.First,becausefisonthespineandf¤r,thecate-gorycŒf(cid:141)takestheformX=Y.Thearityrestric-tionimpliesthatthecategoryofeverynodeuonthespineproperlybetweenfandrtakestheformXˇu,jˇuj>0,andthatthecategoryattheroottakestheformXˇ,jˇj(cid:21)0.ThusthecategoryXisneverexposedinc,exceptperhapsatr.AswewillseeinSection5.4,thisproperty,togetherwithacarefulselectionof“splitnodes”,willallowustodecomposederivationsintosmaller,shareableparts.Thebasicideaisthesameasinthetabulationofpushdownautomata(Lang,1974;NederhofandSatta,2004),wherethepushdowninourcaseistheargumentstackoftheprimaryinputcategoriesalongaspine.Theconceptsofsignatureandspinearegeneral-izedtoderivationcontextsasfollows:Definition4Letcbeaderivationcontextwithrootnoderandfootnodef.ThenchassignatureŒjY;ˇ;ich;i0;j0;J(cid:141)if1.theyieldofcis.wŒi;i0(cid:141);wŒj0;J(cid:141)/;2.forsomeX,cŒf(cid:141)DXjYandcŒr(cid:141)DXˇ.Definition5Foraderivationcontextc,thespineofcisthepathfromitsrootnodetoitsfootnode.5.2GrammarConstantBeforewestartwiththeproofassuch,weturntothechoiceofthegrammarconstantcG,whichwasleftpendinginprevioussections.RecallthatweareusingcGasaboundonthearityofXintype1itemsŒX;ich;J(cid:141).Sincetheseitemsareproducedbyouraxiomsfromthesetofcategoriesinthelexicon,cGmustnotbesmallerthanthemaximumarity`ofacategoryinthisfiniteset.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

412

WealsousecGasaboundonthearityofthecat-egoryYˇintype2itemsŒjY;ˇ;ich;i0;j0;J(cid:141).Theseitemsareproducedbyinferencerule(7)tosimulateinstancesofcompositionoftheformX=YYˇ)Xˇ.Herethelengthofˇisboundedbythemaxi-mumdegreedofacompositionruleinthegrammar,andar.Y/isboundedbythemaximumarityaofanargumentfromthe(endlich)setLofargumentsinthelexicon(recallSection4.4).ThereforecGcannotbesmallerthanaCd.Puttingeverythingtogether,weobtaintheconditioncG(cid:21)maxf`;aCdg:(8)Thenextlemmawillbeusedinseveralplaceslater.Lemma1Letcbeaderivationcontextwithsigna-tureŒjY;ˇ;ich;i0;j0;J(cid:141).Thenar.Yˇ/(cid:20)cG.Proof.Letrandfbetherootandthefootnode,jeweils,ofc.Fromthedefinitionofsignature,theremustbesomeXsuchthatcŒr(cid:141)DXˇandcŒf(cid:141)DXjY.Nowletpbetheparentnodeoff,andassumethattheruleusedatpisinstantiatedasX=YYˇ0)Xˇ0,sothatcŒp(cid:141)DXˇ0.IfpDrthenˇ0Dˇ;ansonsten,becauseofthearityrestric-tioninthedefinitionofderivationcontexts(Defini-tion3)wehavejˇ0j>jˇj.Thenar.Yˇ/(cid:20)ar.Yˇ0/(cid:20)cG;wheretherightinequalityfollowsfromtheassump-tionthatX=YYˇ0)Xˇ0isaruleinstanceofthegrammar,andfrominequality(8).5.3SoundnessWestartthecorrectnessproofbyarguingforthesoundnessofthedeductionsysteminFigure5.Morespecifically,weshowthatforeveryitemoftype1thereexistsaderivationtreewiththesamesigna-ture,andthatforeveryitemoftype2thereexistsaderivationcontextwiththesamesignature.Thesoundnessoftheaxiomsisobvious.Rule(7)statesthat,ifwehavebuiltaderivationtreetwithsignatureŒYˇ;J;k(cid:141)thenwecanbuildaderivationcontextcwithsignatureŒ=Y;ˇ;ich;ich;J;k(cid:141).UndertheconditionthatthegrammaradmitstheruleinstanceX=YYˇ),thisinferenceissound;thecontextcanbebuiltasshowninFigure6.wŒj;k(cid:141)YˇX=YXˇtFigure6:Soundnessofrule(7).Rule(3)statesthat,ifwehavebuiltaderivationtreet0withsignatureŒX=Y;i0;j0(cid:141)andacontextcwithsignatureŒ=Y;ˇ;ich;i0;j0;J(cid:141),thenwecanbuildanewtreetwithsignatureŒXˇ;ich;J(cid:141).Weobtaintbysubstitutingt0forthefootnodeofc(seeFigure3(A)).Rule(6)statesthat,ifwehavebuiltaderivationcontextc1withsignatureŒj1Y;ˇj2Z;i00;i0;j0;j00(cid:141)andanothercontextc2withsignatureŒj2Z;”;ich;i00;j00;J(cid:141),thenwecanbuildaderivationcontextcwithsignatureŒj1Y;ˇ;ich;i0;j0;J(cid:141).Weobtaincbysubstitutingc1forthefootnodeofc2;thisisillustratedbyFigure3(B)ifweassume(cid:13)D”.5.4CompletenessInthefinalpartofourcorrectnessproofwenowprovethecompletenessofthedeductionsysteminFigure5.Specificallyweshowthefollowingstrongerstatement:ForeveryderivationtreeandforeveryderivationcontextwithasignatureIsatisfyingthearityboundsforitemsofFigure5,thedeductionsysteminfersthecorrespondingitemI.Fromthisstatementwecanimmediatelyconcludethatthesys-temconstructsthegoalitemwheneverthereexistsaderivationtreewhoseyieldisthecompleteinputstringandwhosetypeisthedistinguishedcategoryS.Ourproofisbyinductiononameasurethatwecallrank.Therankofaderivationtreeorcontextisthenumberofitsnon-leafnodes.Notethatthisdefinitionimpliesthatthefootnodeofacontextisnotcountedagainstitsrank.Therankofatreeorcontextisalwaysatleast1,withrank1onlyrealizedforderivationtreesconsistingofasinglenode.5.4.1BaseCaseConsideraderivationtreetwithsignatureŒX;ich;J(cid:141)andrank.t/D1.ThetreettakestheformshownintheleftofFigure2,andwehavejDiC1andwjWDX.TheitemŒX;ich;J(cid:141)isthenproducedbyoneoftheaxiomsofourdeductionsystem.

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

413

5.4.2InductiveCaseThegeneralideaunderlyingtheinductivecasecanbestatedasfollows.Weconsideraderivationtreeorcontext’withsignatureIsatisfyingtheboundsstatedinFigure5foritemsoftype1or2.Wethenidentifyaspecialnodesin’’sspine,whichwecallthesplitnode.Weusesto“split”’intotwopartsthatareeitherderivationtreesorcontexts,thatbothsatisfytheboundsforitemsoftype1or2,andthatbothhaveranksmallerthantherankof’.Wethenapplytheinductionhypothesistoobtaintwoitemsthatcanbecombinedbyoneoftheinferencerulesofouralgorithm,resultinginthedesireditemIfor’.Wefirstconsiderthecaseinwhich’isatree,andlaterthecaseinwhich’isacontext.5.4.3SplittingTreesConsideraderivationtreetwithsignatureŒX0;ich;J(cid:141),rootnoder,andrank.t/>1.Thenthespineoftconsistsofatleast2nodes.Nowassumethatar.X0/(cid:20)cG,thatis,ŒX0;ich;J(cid:141)isavaliditem.Choosethesplitnodestobethehighest(closesttotheroot)non-rootnodeonthespineforwhichar.tŒs(cid:141)/(cid:20)cG.Nodesalwaysexists,asthearityconstraintissatisfiedatleastforthelowest(farthestfromtheroot)nodeonthespine,whichislabeledwithacategoryfromthelexicon.Considerthesubtreet0Dtjs;thussistherootnodeoft0.Becausesisaprimarynodeint,thecategoryatshasatleastoneargument.WedealherewiththecasewherethiscategorytakestheformtŒs(cid:141)DX=Y;thecasetŒs(cid:141)DX=Yissym-metrical.Thusthesignatureoft0takestheformŒX=Y;i0;j0(cid:141)wherei0;j0areintegerswithi(cid:20)i0ar.tŒr(cid:141)/foreverynodeuthatisproperlybetweensandr:Iftherewasanodeusuchthatar.tŒu(cid:141)/(cid:20)ar.tŒr(cid:141)/thenbecauseoftŒr(cid:141)DX0andourassumptionthatar.X0/(cid:20)cGwewouldhavechosenuinsteadofs.Nowletˇbetheexcessofc;thenX0DtŒr(cid:141)DtŒs(cid:141)ˇDXˇ.ThusthesignatureofctakestheformŒ=Y;ˇ;ich;i0;j0;J(cid:141).ApplyingLemma1toc,wegetar.Yˇ/(cid:20)cG,andthereforeŒ=Y;ˇ;ich;i0;j0;J(cid:141)isalsoavaliditem.Furthermore,rank.c/ar.cŒr(cid:141)/.Nowassumethatar.Yˇ/(cid:20)cG,thatis,Œ=Y;ˇ;ich;i0;j0;J(cid:141)isavaliditem.Wedistinguishtwocasesbelow.Case1Supposethatthespineofcconsistsofex-actly2nodes.InthiscasethefootfistheleftchildoftherootrandiDi0.Letf0betherightsiblingoffandconsiderthesubtreet0Dtjf0;thusf0istherootnodeoft0.Thesignatureoft0takestheformŒYˇ;j0;J(cid:141).Byourassumption,ar.Yˇ/(cid:20)cG,andthenŒYˇ;j0;J(cid:141)isavaliditem.Furthermore,rank.t0/ar.cŒr(cid:141)/forevery

l

D
Ö
w
N
Ö
A
D
e
D

F
R
Ö
M
H

T
T

P

:
/
/

D
ich
R
e
C
T
.

M

ich
T
.

e
D
u

/
T

A
C
l
/

l

A
R
T
ich
C
e

P
D

F
/

D
Ö

ich
/

.

1
0
1
1
6
2

/
T

l

A
C
_
A
_
0
0
1
9
2
1
5
6
6
9
0
3

/

/
T

l

A
C
_
A
_
0
0
1
9
2
P
D

.

F

B
j
G
u
e
S
T

T

Ö
N
0
8
S
e
P
e
M
B
e
R
2
0
2
3

414

nodeuinthespine,andatthesametime,nocombi-natoryrulecanreducethearityofitsprimaryinputcategorybymorethanoneunit.Considerthecontextc1thatisobtainedbyrestrict-ingctonodesandallofitsdescendants;thussistherootnodeofc1andfisthefootnode.Toseethatc1iswell-defined,notethatourchoiceofsguaranteesthatar.cŒu(cid:141)/>ar.cŒs(cid:141)/foreverynodeuthatisproperlybetweenfands.Toseethis,supposethattherewasanodeu¤ssuchthatar.cŒu(cid:141)/(cid:20)ar.cŒs(cid:141)/.Sincear.cŒs(cid:141)/Dar.cŒr(cid:141)/C1andar.cŒu(cid:141)/¤ar.cŒs(cid:141)/,byourdefinitionofs,wewouldhavear.cŒu(cid:141)/(cid:20)ar.cŒr(cid:141)/,whichcannotbebecauseinc,everynodeuproperlybetweenfandshasarityar.cŒu(cid:141)/>ar.cŒr(cid:141)/.Becausefisaprimarynodeinc,thecategoryatfhasatleastoneargument;callitj1Y.Thenodesisaprimarynodeincaswell,sotheexcessofc1takestheformˇj2Z,wherej2Zisthetopmostargumentofthecategoryats.Thusthesignatureofc1takestheformŒj1Y;ˇj2Z;i00;i0;j0;j00(cid:141)wherei00;j00areintegerswithi(cid:20)i00(cid:20)i0andj0(cid:20)j00(cid:20)j.ApplyingLemma1toc1,wegetar.Yˇj2Z/(cid:20)cG,andthereforeŒj1Y;ˇj2Z;i00;i0;j0;j00(cid:141)isavaliditem.Finally,wenotethatrank.c1/ar.cŒr(cid:141)/foreverynodeuthatisproperlybetweensandrsimplybecauseeverysuchnodeisalsoproperlybetweenfandr.Theexcessofc2istheemptystackbyourchoiceofs.Thusthesignatureofc2isŒj2Z;”;ich;i00;j00;J(cid:141).WeapplyLemma1oncemore,thistimetoc2,toshowthatar.Z/(cid:20)cG,andconcludethatŒj2Z;”;ich;i00;j00;J(cid:141)isalsoavaliditem.Finallywenotethatrank.c2/PDF Herunterladen