Transactions of the Association for Computational Linguistics, 1 (2013) 139–150. Action Editor: Joakim Nivre.

Submitted 12/2012; Revised 3/2013; Published 5/2013.
© 2013 Association for Computational Linguistics.

Efficient Stacked Dependency Parsing by Forest Reranking

Katsuhiko Hayashi and Shuhei Kondo and Yuji Matsumoto
Graduate School of Information Science
Nara Institute of Science and Technology
8916-5, Takayama, Ikoma, Nara 630-0192, Japan
{katsuhiko-h,shuhei-k,matsu}@is.naist.jp

Abstract

This paper proposes a discriminative forest reranking algorithm for dependency parsing that can be seen as a form of efficient stacked parsing. A dynamic programming shift-reduce parser produces a packed derivation forest which is then scored by a discriminative reranker, using the 1-best tree output by the shift-reduce parser as guide features in addition to third-order graph-based features. To improve efficiency and accuracy, this paper also proposes a novel shift-reduce parser that eliminates the spurious ambiguity of arc-standard transition systems. In tests on the English Penn Treebank data, forest reranking gave a state-of-the-art unlabeled dependency accuracy of 93.12.

1 Introduction

There are two main approaches to data-driven dependency parsing: one is graph-based and the other is transition-based.

In the graph-based approach, global optimization algorithms find the highest-scoring tree with locally factored models (McDonald et al., 2005). While third-order graph-based models achieve state-of-the-art accuracy, they have O(n^4) time complexity for a sentence of length n. Recently, some pruning techniques have been proposed to improve the efficiency of third-order models (Rush and Petrov, 2012; Zhang and McDonald, 2012).

The transition-based approach usually employs the shift-reduce parsing algorithm with linear-time complexity (Nivre, 2008). It greedily chooses the transition with the highest score, and the resulting transition sequence is not always globally optimal. The beam search algorithm improves parsing flexibility in deterministic parsing (Zhang and Clark, 2008; Zhang and Nivre, 2011), and dynamic programming makes beam search more efficient (Huang and Sagae, 2010).

There is also an alternative approach that integrates graph-based and transition-based models (Sagae and Lavie, 2006; Zhang and Clark, 2008; Nivre and McDonald, 2008; Martins et al., 2008). Martins et al. (2008) formulated their approach as stacking of parsers, where the output of the first-stage parser is provided to the second as guide features. In particular, they used a transition-based parser for the first stage and a graph-based parser for the second stage. The main drawback of this approach is that the efficiency of the transition-based parser is sacrificed because the second stage employs full parsing.

This paper proposes an efficient stacked parsing method through discriminative reranking with higher-order graph-based features, which works on the forests output by the first-stage dynamic programming shift-reduce parser and integrates non-local features efficiently with cube-pruning (Huang and Chiang, 2007). The advantages of our method are as follows:

• Unlike the conventional stacking approach, the first-stage shift-reduce parser prunes the search space of the second-stage graph-based parser.

• In addition to guide features, the second-stage graph-based parser can employ the scores of the first-stage parser which cannot be incorpo-


axiom (c_0):   0 : (0, 1, w_0) : ∅
goal (c_2n):   2n : (0, n, s_0) : ∅
shift:         state p:  ℓ : (_, j, s_d | s_d−1 | … | s_1 | s_0) : _
               ⟹  ℓ+1 : (j, j+1, s_d−1 | s_d−2 | … | s_0 | w_j) : (p)
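
The axiom and shift items above can be read as operations on a small immutable state record. The following is a minimal sketch under stated assumptions, not the authors' implementation: the names State, axiom, shift and the kernel depth D are hypothetical, stack items are simplified to bare word strings rather than partial trees, and the check that refuses to shift on an empty queue is an assumed side condition.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple

D = 2  # hypothetical kernel depth: a state keeps only the top D+1 stack items


@dataclass(frozen=True)
class State:
    step: int                  # l: number of transitions taken so far
    i: int                     # left boundary of the span covered by s_0
    j: int                     # index of the next word in the queue
    stack: Tuple[str, ...]     # s_d | ... | s_1 | s_0 (rightmost item is s_0)
    preds: FrozenSet["State"]  # predictor states, kept for later reduce steps


def axiom(words: Sequence[str]) -> State:
    """Initial item 0 : (0, 1, w_0) : empty predictor set."""
    return State(step=0, i=0, j=1, stack=(words[0],), preds=frozenset())


def shift(p: State, words: Sequence[str]) -> State:
    """From state p at step l, build l+1 : (j, j+1, ... | s_0 | w_j) : {p}."""
    if p.j >= len(words):  # assumed side condition: the queue must be non-empty
        raise ValueError("queue is empty; cannot shift")
    # Push w_j and keep only the top D+1 items, dropping s_d from the kernel.
    new_stack = (p.stack + (words[p.j],))[-(D + 1):]
    return State(step=p.step + 1, i=p.j, j=p.j + 1,
                 stack=new_stack, preds=frozenset({p}))
```

For example, with words = ["<root>", "I", "saw", "her"], calling shift(axiom(words), words) gives a state at step 1 whose span is (1, 2), whose stack is ("<root>", "I"), and whose only predictor is the axiom state. Because the record is frozen and hashable, states with identical fields compare equal, which is the property that lets equivalent states be merged in dynamic programming.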