Transactions of the Association for Computational Linguistics, 1 (2013) 139–150. Action Editor: Joakim Nivre.

Submitted 12/2012; Revised 3/2013; Published 5/2013. c
(cid:13)

2013 Association for Computational Linguistics.

EfﬁcientStackedDependencyParsingbyForestRerankingKatsuhikoHayashiandShuheiKondoandYujiMatsumotoGraduateSchoolofInformationScienceNaraInstituteofScienceandTechnology8916-5,Takayama,Ikoma,Nara630-0192,Japan{katsuhiko-h,shuhei-k,matsu}@is.naist.jpAbstractThispaperproposesadiscriminativefor-estrerankingalgorithmfordependencypars-ingthatcanbeseenasaformofefﬁcientstackedparsing.Adynamicprogrammingshift-reduceparserproducesapackedderiva-tionforestwhichisthenscoredbyadiscrim-inativereranker,usingthe1-besttreeoutputbytheshift-reduceparserasguidefeaturesinadditiontothird-ordergraph-basedfeatures.Toimproveefﬁciencyandaccuracy,thispa-peralsoproposesanovelshift-reduceparserthateliminatesthespuriousambiguityofarc-standardtransitionsystems.TestingontheEnglishPennTreebankdata,forestrerankinggaveastate-of-the-artunlabeleddependencyaccuracyof93.12.1IntroductionTherearetwomainapproachesofdata-drivende-pendencyparsing–oneisgraph-basedandtheotheristransition-based.Inthegraph-basedapproach,globaloptimiza-tionalgorithmsﬁndthehighest-scoringtreewithlocallyfactoredmodels(McDonaldetal.,2005).Whilethird-ordergraph-basedmodelsachievestate-of-the-artaccuracy,ithasO(n4)timecomplexityforasentenceoflengthn.Recently,someprun-ingtechniqueshavebeenproposedtoimprovetheefﬁciencyofthird-ordermodels(RushandPetrov,2012;ZhangandMcDonald,2012).Thetransition-basedapproachusuallyemploystheshift-reduceparsingalgorithmwithlinear-timecomplexity(Nivre,2008).Itgreedilychoosesthetransitionwiththehighestscoreandtheresult-ingtransitionsequenceisnotalwaysgloballyop-timal.Thebeamsearchalgorithmimprovespars-ingﬂexibilityindeterministicparsing(ZhangandClark,2008;ZhangandNivre,2011),anddy-namicprogrammingmakesbeamsearchmoreefﬁ-cient(HuangandSagae,2010).Thereisalsoanalternativeapproachthatin-tegratesgraph-basedandtransition-basedmodels(SagaeandLavie,2006;ZhangandClark,2008;NivreandMcDonald,2008;Martinsetal.,2008).Martinsetal.(2008)formulatedtheirapproachasstackingofparserswheretheoutputoftheﬁrst-stageparserisprovidedtothesecondasguidefeatures.Inparticular,theyusedatransition-basedparserfortheﬁrststageandagraph-basedparserforthesecondstage.Themaindrawbackofthisapproachisthattheefﬁciencyofthetransition-basedparserissacri-ﬁcedbecausethesecond-stageemploysfullparsing.Thispaperproposesanefﬁcientstackedpars-ingmethodthroughdiscriminativererankingwithhigher-ordergraph-basedfeatures,whichworksontheforestsoutputbytheﬁrst-stagedynamicpro-grammingshift-reduceparserandintegratesnon-localfeaturesefﬁcientlywithcube-pruning(HuangandChiang,2007).Theadvantagesofourmethodareasfollows:•Unliketheconventionalstackingapproach,theﬁrst-stageshift-reduceparserprunesthesearchspaceofthesecond-stagegraph-basedparser.•Inadditiontoguidefeatures,thesecond-stagegraph-basedparsercanemploythescoresoftheﬁrst-stageparserwhichcannotbeincorpo-

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

:
/
/

d
i
r
e
c
t
.

i
t
.

e
d
u

/
t

a
c
l
/

a
r
t
i
c
e
–
p
d

f
/

d
o

i
/

1
0
1
1
6
2

/
t

a
c
_
a
_
0
0
2
1
6
1
5
6
6
6
2
9

/
t

a
c
_
a
_
0
0
2
1
6
p
d

b
y
g
u
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

140

axiom(c0):0:(0,1,w0):∅goal(c2n):2n:(0,n,s0):∅shift:statepz}|{ℓ:(,j,sd|sd−1|…|s1|s0):ℓ+1:(j,j+1,sd−1|sd−2|…|s0|wj):(p)iDownload pdf