Operazioni dell'Associazione per la Linguistica Computazionale, 1 (2013) 403–414. Redattore di azioni: Jason Eisner.

Submitted 6/2013; Pubblicato 10/2013. C
(cid:13)

2013 Associazione per la Linguistica Computazionale.

TrainingDeterministicParserswithNon-DeterministicOraclesYoavGoldbergBar-IlanUniversityDepartmentofComputerScienceRamat-Gan,Israelyoav.goldberg@gmail.comJoakimNivreUppsalaUniversityDepartmentofLinguisticsandPhilologyUppsala,Swedenjoakim.nivre@lingﬁl.uu.seAbstractGreedytransition-basedparsersareveryfastbuttendtosufferfromerrorpropagation.Thisproblemisaggravatedbythefactthattheyarenormallytrainedusingoraclesthataredeter-ministicandincompleteinthesensethattheyassumeauniquecanonicalpaththroughthetransitionsystemandareonlyvalidaslongastheparserdoesnotstrayfromthispath.Inthispaper,wegiveageneralcharacterizationoforaclesthatarenondeterministicandcom-plete,presentamethodforderivingsuchora-clesfortransitionsystemsthatsatisfyaprop-ertywecallarcdecomposition,andinstanti-atethismethodforthreewell-knowntransi-tionsystemsfromtheliterature.Wesaythattheseoraclesaredynamic,becausetheyallowustodynamicallyexplorealternativeandnon-optimalpathsduringtraining–incontrasttooraclesthatstaticallyassumeauniqueoptimalpath.Experimentalevaluationonawiderangeofdatasetsclearlyshowsthatusingdynamicoraclestotraingreedyparsersgivessubstan-tialimprovementsinaccuracy.Moreover,thisimprovementcomesatnocostintermsofefﬁciency,unlikeothertechniqueslikebeamsearch.1IntroductionGreedytransition-basedparsersareeasytoimple-mentandareveryefﬁcient,buttheyaregenerallynotasaccurateasparsersthatarebasedonglobalsearch(McDonaldetal.,2005;KooandCollins,2010)orastransition-basedparsersthatusebeamsearch(ZhangandClark,2008)ordynamicpro-gramming(HuangandSagae,2010;Kuhlmannetal.,2011).Thisworkispartofalineofresearchtryingtopushtheboundariesofgreedyparsingandnarrowtheaccuracygapof2–3%betweensearch-basedandgreedyparsers,whilemaintainingtheef-ﬁciencyandincrementalnatureofgreedyparsers.Onereasonfortheloweraccuracyofgreedyparsersiserrorpropagation:oncetheparsermakesanerrorindecoding,moreerrorsarelikelytofol-low.Thisbehavioriscloselyrelatedtothewayinwhichgreedyparsersarenormallytrained.Givenatreebankoracle,agoldsequenceoftransitionsisderived,andapredictoristrainedtopredicttransi-tionsalongthisgoldsequence,withoutconsideringanyparserstateoutsidethissequence.Thus,oncetheparserstraysfromthegoldenpathattesttime,itventuresintounknownterritoryandisforcedtoreacttosituationsithasneverbeentrainedfor.Inrecentwork(GoldbergandNivre,2012),weintroducedtheconceptofadynamicoracle,whichisnon-deterministicandnotrestrictedtoasinglegoldenpath,butinsteadprovidesoptimalpredic-tionsforanypossiblestatetheparsermightbein.Dynamicoraclesarenon-deterministicinthesensethattheyreturnasetofvalidtransitionsforagivenparserstateandgoldtree.Moreover,theyarewell-deﬁnedandoptimalalsoforstatesfromwhichthegoldtreecannotbederived,inthesensethattheyreturnthesetoftransitionsleadingtothebesttreederivablefromeachstate.Weshowedexperimen-tallythat,usingadynamicoracleforthearc-eagertransitionsystem(Nivre,2003),agreedyparsercanbetrainedtoperformwellalsoafterincurringamis-take,thusalleviatingtheeffectoferrorpropagationandresultinginconsistentlybetterparsingaccuracy.

D
o
w
N
o
UN
D
e
D

F
R
o
M
H

T
T

:
/
/

D
io
R
e
C
T
.

io
T
.

e
D
tu

/
T

UN
C
l
/

UN
R
T
io
C
e
–
P
D

F
/

D
o

io
/

1
0
1
1
6
2

/
T

UN
C
_
UN
_
0
0
2
3
7
1
5
6
6
6
8
1

/
T

UN
C
_
UN
_
0
0
2
3
7
P
D

B
sì
G
tu
e
S
T

o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3

404

Inthispaper,weextendtheworkofGoldbergandNivre(2012)bygivingageneralcharacteri-zationofdynamicoraclesasoraclesthatarenon-deterministic,inthattheyreturnsetsoftransitions,andcomplete,inthattheyaredeﬁnedforallpossiblestates.Wethendeﬁneaformalpropertyoftransitionsystemswhichwecallarcdecomposition,andin-troduceaframeworkforderivingdynamicoraclesforarc-decomposablesystems.Usingthisframe-work,wederivenoveldynamicoraclesforthehy-brid(Kuhlmannetal.,2011)andeasy-ﬁrst(Gold-bergandElhadad,2010)transitionsystems,whicharearc-decomposable(asisthearc-eagersystem).Wealsoshowthatthepopulararc-standardsystem(Nivre,2004)isnotarc-decomposable,andsoderiv-ingadynamicoracleforitremainsanopenresearchquestion.Finally,weperformasetofexperimentsontheCoNLL2007datasets,validatingthattheuseofdynamicoraclesforexploringstatesthatresultfromparsingmistakesduringtrainingisbeneﬁcialacrosstransitionsystems.2Transition-BasedDependencyParsingWebeginwithaquickreviewoftransition-baseddependencyparsing,presentingthearc-eager,arc-standard,hybridandeasy-ﬁrsttransitionssystemsinacommonnotation.Thetransition-basedpars-ingframework(Nivre,2008)assumesatransitionsystem,anabstractmachinethatprocessessentencesandproducesparsetrees.Thetransitionsystemhasasetofconﬁgurationsandasetoftransitionswhichareappliedtoconﬁgurations.Whenparsingasen-tence,thesystemisinitializedtoaninitialconﬁgu-rationbasedontheinputsentence,andtransitionsarerepeatedlyappliedtothisconﬁguration.Afteraﬁnitenumberoftransitions,thesystemarrivesataterminalconﬁguration,andaparsetreeisreadofftheterminalconﬁguration.Inagreedyparser,aclas-siﬁerisusedtochoosethetransitiontotakeineachconﬁguration,basedonfeaturesextractedfromtheconﬁgurationitself.Transitionsystemsdifferbythewaytheydeﬁneconﬁgurations,andbytheparticularsetoftransitionsavailable.2.1DependencyTreesWedeﬁneadependencytreeforasentenceW=w1,…,wntobealabeleddirectedtreeT=(V,UN),whereV={w1,…,wn}isasetofnodesgivenbythetokensoftheinputsentence,andA⊆V×L×V(forsomedependencylabelsetL)isasetoflabeleddirectedarcsoftheform(H,lb,D),whereh∈Vissaidtobethehead,d∈Vthedependent,andlb∈Lthedependencylabel.Whendealingwithunlabeledparsing,orwhenthelabelidentityisirrelevant,wetakeA⊆V×Vtobeasetofordinarydirectedarcsoftheform(H,D).Notethat,sincethenodesofthetreearegivenbytheinputsentence,adependencytreeT=(V,UN)forasentenceWisuniquelydeﬁnedbythearcsetA.Forconvenience,wewillthereforeequatethetreewiththearcsetandandusethesymbolTforthelatter,reservingthesymbolAforarcsetsthatarenotnecessarilytrees.Inthecontextofthisworkitisassumedthatallthedependencytreesareprojective.Althoughthegeneraldeﬁnitionofadependencytreedoesnotmakeanyassumptionsaboutwhichnodeistherootofthetree,itiscommonpracticeindependencyparsingtoaddadummynodeROOT,whichispreﬁxedorsufﬁxedtothesentenceandwhichalwaysactsastherootofthetree.Wewillfollowthispracticeinourdescriptionofdifferenttransitionsystemsbelow.2.2TransitionSystemsArc-EagerInthearc-eagersystem(Nivre,2003),aconﬁgurationc=(P,β,UN)consistsofastackσ,abufferβ,andasetAofdependencyarcs.1GivenasentenceW=w1,…,wn,thesystemisinitializedwithanemptystack,anemptyarcset,andβ=w1,…,wn,ROOT,whereROOTisthespecialrootnode.AnyconﬁgurationcwithanemptystackandabuffercontainingonlyROOTisterminal,andtheparsetreeisgivenbythearcsetAcofc.2Thesystemhas4transitions:RIGHTlb,LEFTlb,SHIFT,REDUCE,deﬁnedasfollows:SHIFT[(P,B|β,UN)]=(P|B,β,UN)RIGHTlb[(P|S,B|β,UN)]=(P|S|B,β,A∪{(S,lb,B)})LEFTlb[(P|S,B|β,UN)]=(P,B|β,A∪{(B,lb,S)})REDUCE[(P|S,β,UN)]=(P,β,UN)1Weuseσ|xtodenoteastackwithtopelementxandre-mainderσ,andx|βtodenoteabufferwithaheadxfollowedbytheelementsinβ.2ThisdeﬁnitionofaterminalconﬁgurationdiffersfromthatinNivre(2003)butguaranteesthatthesetAcisadependencytreerootedinROOT.

D
o
w
N
o
UN
D
e
D

F
R
o
M
H

T
T

:
/
/

D
io
R
e
C
T
.

io
T
.

e
D
tu

/
T

UN
C
l
/

UN
R
T
io
C
e
–
P
D

F
/

D
o

io
/

1
0
1
1
6
2

/
T

UN
C
_
UN
_
0
0
2
3
7
1
5
6
6
6
8
1

/
T

UN
C
_
UN
_
0
0
2
3
7
P
D

B
sì
G
tu
e
S
T

o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3

405

ThereisapreconditionontheRIGHTandSHIFTtransitionstobelegalonlywhenb6=ROOT,andforLEFT,RIGHTandREDUCEtobelegalonlywhenthestackisnon-empty.Moreover,LEFTisonlyle-galwhensdoesnothaveaparentinA,andREDUCEwhensdoeshaveaparentinA.Ingeneral,weuseLEGAL(C)torefertothesetoftransitionsthatarele-galinaconﬁgurationc.Thearc-eagersystembuildstreeseagerlyinthesensethatarcsareaddedattheearliesttimepossible.Inaddition,eachwordwillcollectallofitsleftdependentsbeforecollectingitsrightdependents.Arc-StandardThearc-standardsystem(Nivre,2004)hasconﬁgurationsofthesameformc=(P,β,UN)asthearc-eagersystem.Theinitialcon-ﬁgurationforasentenceW=w1,…,wnhasanemptystackandarcsetandβ=ROOT,w1,…,wn.AconﬁgurationcisterminalifithasanemptybufferandastackcontainingthesinglenodeROOT;theparsetreeisgivenbyAc.Thesystemhas3transi-tions:RIGHTlb,LEFTlb,SHIFT,deﬁnedasfollows:SHIFT[(P,B|β,UN)]=(P|B,β,UN)RIGHTlb[(P|s1|s0,β,UN)]=(P|s1,β,A∪{(s1,lb,s0)})LEFTlb[(P|s1|s0,β,UN)]=(P|s0,β,A∪{(s0,lb,s1)})ThereisapreconditionontheLEFTtransitiontobelegalonlywhens16=ROOT,andforLEFTandRIGHTtobelegalonlywhenthestackhasatleasttwoelements.Thearc-standardsystembuildstreesinabottom-upfashion:eachwordmustcollectallitsdependentsbeforebeingattachedtoitshead.Thesystemdoesnotposeanyrestrictionwithregardtotheorderofcollectingleftandrightdependents.HybridThehybridsystem(Kuhlmannetal.,2011)hasthesameconﬁgurationsandthesameinitializationandterminationconditionsasthearc-standardsystem.Thesystemhas3transitions:RIGHTlb,LEFTlb,SHIFT,deﬁnedasfollows:SHIFT[(P,B|β,UN)]=(P|B,β,UN)RIGHTlb[(P|s1|s0,β,UN)]=(P|s1,β,A∪{(s1,lb,s0)})LEFTlb[(P|S,B|β,UN)]=(P,B|β,A∪{(B,lb,S)})ThereisapreconditiononRIGHTtobelegalonlywhenthestackhasatleasttwoelements,andonLEFTtobelegalonlywhenthestackisnon-emptyands6=ROOT.Thehybridsystemcanbeseenasacombinationofthearc-standardandarc-eagerAlgorithm1Greedytransition-basedparsing1:Input:sentenceW,parameter-vectorw2:c←INITIAL(W)3:whilenotTERMINAL(C)do4:tp←argmaxt∈LEGAL(C)w·φ(C,T)5:c←tp(C)6:returnAcsystems,usingtheLEFTactionofarc-eagerandtheRIGHTactionofarc-standard.Likearc-standard,itbuildstreesinabottom-upfashion.Butlikearc-eager,itrequiresawordtocollectallitsleftdepen-dentsbeforecollectinganyrightdependent.Easy-FirstIntheeasy-ﬁrstsystem(GoldbergandElhadad,2010),aconﬁgurationc=(λ,UN)consistsofalistλandasetAofdependencyarcs.Weuselitodenotetheithmemberofλandwrite|λ|forthelengthofλ.GivenasentenceW=w1,…,wn,thesystemisinitializedwithanemptyarcsetandλ=ROOT,w1,…,wn.Aconﬁgurationcister-minalwithparsetreeAcifλ=ROOT.Thesetoftransitionsforagivenconﬁgurationc=(λ,UN)È:{LEFTilb|1kandRAND()l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 2 3 7 1 5 6 6 6 8 1 / / t l a c _ a _ 0 0 2 3 7 p d . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 410 Easy-FirstFortheeasy-ﬁrstsystem,weonlyneedtopartitionarcsintoL={(H,D)|d6∈λ}andL={(H,D)|H,d∈λ}.TheformermustalreadybeinA,andforthelattertherecanbenoconﬂictbetweenarcsaslongaswerespectthebottom-upordering.Arc-StandardUnfortunately,arcdecompositiondoesnotholdforthearc-standardsystem.Toseewhy,consideraconﬁgurationwiththestackσ=a,B,c.Thearc(C,B)isreachableviaLEFT,thearc(B,UN)isreachableviaRIGHT,LEFT,thearcsetA={(C,B),(B,UN)}formsaprojectivetreeandisthustreeconsistent,butitiseasytoconvinceoneselfthatAisnotreachablefromthisconﬁguration.Thereasonthattheaboveprooftechniquefailsforthearc-standardsystemisthatthearcsetcorrespond-ingtoBinthearc-eagersystemmayinvolvearcswherebothnodesarestillonthestack,andwecan-notguaranteethatallprojectivetreesconsistentwiththesearcscanbederived.Intheverysimilarhybridsystem,sucharcsexistaswellbuttheyarelimitedtoarcsoftheform(H,D)whereh0withrespecttoatreeTifoneofthefollowingholds:12Noteagainthats0maybemissinginthecaseofSHIFTcaseands1inthecaseofSHIFTandLEFT.

D
o
w
N
o
UN
D
e
D

F
R
o
M
H

T
T

:
/
/

D
io
R
e
C
T
.

io
T
.

e
D
tu

/
T

UN
C
l
/

UN
R
T
io
C
e
–
P
D

F
/

D
o

io
/

1
0
1
1
6
2

/
T

UN
C
_
UN
_
0
0
2
3
7
1
5
6
6
6
8
1

/
T

UN
C
_
UN
_
0
0
2
3
7
P
D

B
sì
G
tu
e
S
T

o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3

412

0.3.6.9012345arabic81.6283.880.3.6.9012345basque75.1976.180.3.6.9012345catalan91.2492.100.3.6.9012345chinese84.8485.890.3.6.9012345czech79.0380.370.3.6.9012345english86.2988.690.3.6.9012345greek79.3681.190.3.6.9012345hungarian76.3577.580.3.6.9012345italian84.0184.510.3.6.9012345turkish77.0277.67Figure1:Effectofk(yaxis)andp(xaxis)valuesonparsingaccuraciesforthearc-eagersystemonthevariousCoNLL-2007shared-tasklanguages.EachpointisanaverageUASof5runswithdifferentseeds.Thegeneraltrendisthatsmallerkandhigherparebetter.(1)itaddsanarc(H,D)suchthat(h0,d)∈Tforsomeh0∈λ,h06=h;(2)itaddsanarc(H,D)suchthat(D,d0)∈Tforsomed0∈λ.Theexactcostcanbecalculatedbycountingthenumberofsucharcs.5ExperimentsandResultsSetup,dataandparametersThegoalofourex-perimentsistoevaluatetheutilityofthedynamicoraclesfortraining,bycomparingatrainingsce-nariowhichonlyseesconﬁgurationsthatcanleadtothegoldtree(followingastaticoraclefortheleft-to-rightsystemsandanon-deterministicbutin-completeoraclefortheeasy-ﬁrstsystem),againstatrainingscenariothatinvolvesexplorationofincor-rectstates,usingthedynamicoracles.Asourtrainingalgorithminvolvesarandomcom-ponent(weshufﬂethesentencespriortoeachitera-tion,andrandomlyselectwhethertofollowacor-rectorincorrectaction),weevaluateeachsetupﬁvetimesusingdifferentrandomseeds,andreporttheaveragedresults.Weperformalloftheexperimentsonthemulti-lingualCoNLL-2007datasets.Weuse15trainingiterationsfortheleft-to-rightparsers,and20trainingiterationsfortheeasy-ﬁrstparser.Weusethestan-dardperceptronupdateasourupdateruleintraining,andusetheaveragedweightvectorforpredictionintesttime.Thefeaturesetsdifferbytransitionsys-tembutarekeptthesameacrossdatasets.Theex-actfeature-setdeﬁnitionsforthedifferentsystemsareavailableintheaccompanyingsoftware,whichisavailableonlineattheﬁrstauthor’shomepage.EffectofexplorationparametersInaninitialsetofexperiments,weinvestigatetheeffectoftheex-plorationparameterskandponthearc-eagersys-tem.TheresultsarepresentedinFigure1.Whiletheoptimalparametersvarybydataset,thereisacleartrendtowardlowervaluesofkandhighervaluesofp.ThisisconsistentwiththereportofGoldbergandNivre(2012)whousedaﬁxedsmallvalueofkandlargevalueofpthroughouttheirexperiments.TrainingwithexplorationforthevarioussystemsForthesecondexperiment,inwhichwecomparedtrainingwithastaticoracletotrainingwithexplo-ration,weﬁxedtheexplorationparameterstok=1andp=0.9foralldatasetsandtransition-systemcombinations.Theresultsintermsoflabeledaccu-racies(fortheleft-to-rightsystems)andunlabeledaccuracies(forallsystems)arepresentedinTable1.Trainingwithexplorationusingthedynamicoraclesyieldsimprovedaccuracyforthevastmajorityofthesetups.Thenotableexceptionsarethearc-eagerandeasy-ﬁrstsystemsforunlabeledItalianandthearc-hybridsysteminCatalan,whereweobserveasmalldropinaccuracy.However,wecansafelyconcludethattrainingwithexplorationisbeneﬁcialandnotethatwemaygetevenfurthergainsinthefutureusingbettermethodsfortuningtheexplorationparametersorbettertrainingmethods.

D
o
w
N
o
UN
D
e
D

F
R
o
M
H

T
T

:
/
/

D
io
R
e
C
T
.

io
T
.

e
D
tu

/
T

UN
C
l
/

UN
R
T
io
C
e
–
P
D

F
/

D
o

io
/

1
0
1
1
6
2

/
T

UN
C
_
UN
_
0
0
2
3
7
1
5
6
6
6
8
1

/
T

UN
C
_
UN
_
0
0
2
3
7
P
D

B
sì
G
tu
e
S
T

o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3

413

system/languagehungarianchinesegreekczechbasquecatalanenglishturkisharabicitalianUASeager:static76.4285.0179.5378.7075.1491.3086.1077.3881.5984.40eager:dynamic77.4885.8980.9880.2575.9792.0288.6977.3983.6284.30hybrid:static76.3984.9679.4079.7173.1891.3086.4375.9183.4383.43hybrid:dynamic77.5485.1080.4980.0773.7091.0687.6276.9084.0483.83easyﬁrst:static81.2787.0181.2882.0075.0192.5088.5778.9282.7385.31easyﬁrst:dynamic81.5287.4882.2582.3975.8792.8589.4179.2983.7085.11LASeager:static66.7281.2472.4471.0865.3486.0284.9366.5972.1080.17eager:dynamic68.4182.2373.8172.9966.6386.9387.6967.0573.9280.43hybrid:static66.5480.1770.9971.8862.8485.5784.9664.8073.1678.78hybrid:dynamic68.0580.5972.0772.1563.5285.4786.2866.1274.1079.25Table1:ResultsontheCoNLL2007dataset.UAS,includingpunctuation.Eachnumberisanaverageover5runswithdifferentrandomizationseeds.Allexperimentsusedthesameexplorationparametersofk=1,p=0.9.6RelatedWorkTheerrorpropagationproblemforgreedytransition-basedparsingwasdiagnosedbyMcDonaldandNivre(2007)andhasbeentackledwithavarietyoftechniquesincludingparserstacking(NivreandMc-Donald,2008;Martinsetal.,2008)andbeamsearchandstructuredprediction(ZhangandClark,2008;ZhangandNivre,2011).Thetechniquecalledboot-strappinginChoiandPalmer(2011)issimilarinspirittotrainingwithexplorationbutisappliediter-ativelyinbatchmodeandisonlyapproximateduetotheuseofstaticoracles.DynamicoracleswereﬁrstexploredbyGoldbergandNivre(2012).Inmachinelearningmoregenerally,ourapproachcanbeseenasaproblem-speciﬁcinstanceofimita-tionlearning(AbbeelandNg,2004;Vlachos,2012;Heetal.,2012;Daum´eIIIetal.,2009;Rossetal.,2011),wherethedynamicoracleisusedtoim-plementtheoptimalexpertneededintheimitationlearningsetup.Indeed,ourtrainingprocedureiscloselyrelatedtoDAgger(Rossetal.,2011),whichalsotrainsaclassiﬁertomatchanexpertonadis-tributionofpossiblysuboptimalstatesobtainedbyrunningthesystemitself.OurtrainingprocedurecanbeviewedasanonlineversionofDAgger(Heetal.,2012)withtwoextensions:Primo,ourlearn-ingalgorithminvolvesastochasticpolicyparame-terizedbyk,pforchoosingbetweentheoracleorthemodelprediction,whereasDAggeralwaysfol-lowsthesystem’sownprediction(essentiallyrun-ningwithk=0,p=1).TheheatmapsinFigure1showthatthisparameterizationisbeneﬁcial.Sec-ond,whileDAggerassumesanexpertprovidingasinglelabelateachstate,ouroracleisnondetermin-isticandallowsmultiplecorrectlabels(transitions)whichourtrainingproceduretie-breaksaccordingtothemodel’scurrentprediction,atechniquethathasrecentlybeenproposedinanextensiontoDAggerbyHeetal.(2012).Otherrelatedapproachesinthemachinelearningliteratureincludestackedsequen-tiallearning(CohenandCarvalho,2005),LaSO(Daum´eIIIandMarcu,2005),Searn(Daum´eIIIetal.,2009)andSMILe(RossandBagnell,2010).7ConclusionInthispaper,wehaveextendedtheworkondynamicoraclespresentedinGoldbergandNivre(2012)inseveraldirectionsbygivingformalcharacterizationsofnon-deterministicandcompleteoracles,deﬁningthearc-decompositionpropertyfortransitionsys-tems,andusingthispropertytoderivenovelcom-pletenon-deterministicoraclesforthehybridandeasy-ﬁrstsystems(aswellasacorrectedoracleforthearc-eagersystem).Wehavethenusedthecom-pletenessoftheseneworaclestoimprovethetrain-ingprocedureofgreedyparserstoincludeexplo-rationsofconﬁgurationswhichresultfromincor-recttransitions.Forallthreetransitionsystems,wegetsubstantialaccuracyimprovementsonmanylan-guages.Asthechangesalltakeplaceattrainingtime,theveryfastrunningtimeofthegreedyalgo-rithmattesttimeismaintained.

D
o
w
N
o
UN
D
e
D

F
R
o
M
H

T
T

:
/
/

D
io
R
e
C
T
.

io
T
.

e
D
tu

/
T

UN
C
l
/

UN
R
T
io
C
e
–
P
D

F
/

D
o

io
/

1
0
1
1
6
2

/
T

UN
C
_
UN
_
0
0
2
3
7
1
5
6
6
6
8
1

/
T

UN
C
_
UN
_
0
0
2
3
7
P
D

B
sì
G
tu
e
S
T

o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3

414

ReferencesPieterAbbeelandAndrewYNg.2004.Apprenticeshiplearningviainversereinforcementlearning.InPro-ceedingsofthe21stInternationalConferenceonMa-chineLearning(ICML),page1.JinhoD.ChoiandMarthaPalmer.2011.Gettingthemostoutoftransition-baseddependencyparsing.InProceedingsofthe49thAnnualMeetingoftheAsso-ciationforComputationalLinguistics:HumanLan-guageTechnologies,pages687–692.WilliamW.CohenandVitorR.Carvalho.2005.Stackedsequentiallearning.InProceedingsoftheInter-nationalJointConferenceonArtiﬁcialIntelligence,pages671–676.HalDaum´eIIIandDanielMarcu.2005.Learningassearchoptimization:Approximatelargemarginmeth-odsforstructuredprediction.InProceedingsofthe22ndInternationalConferenceonMachineLearning,pages169–176.HalDaum´eIII,JohnLangford,andDanielMarcu.2009.Search-basedstructuredprediction.MachineLearn-ing,75:297–325.YoavGoldbergandMichaelElhadad.2010.Anefﬁ-cientalgorithmforeasy-ﬁrstnon-directionaldepen-dencyparsing.InHumanLanguageTechnologies:The2010AnnualConferenceoftheNorthAmericanChapteroftheAssociationforComputationalLinguis-tics(NAACLHLT),pages742–750.YoavGoldbergandJoakimNivre.2012.Adynamicor-acleforarc-eagerdependencyparsing.InProceed-ingsofthe24thInternationalConferenceonCompu-tationalLinguistics(COLING),pages959–976.HeHe,HalDaum´eIII,andJasonEisner.2012.Imitationlearningbycoaching.InAdvancesinNeuralInforma-tionProcessingSystems25.LiangHuangandKenjiSagae.2010.Dynamicprogram-mingforlinear-timeincrementalparsing.InProceed-ingsofthe48thAnnualMeetingoftheAssociationforComputationalLinguistics(ACL),pages1077–1086.TerryKooandMichaelCollins.2010.Efﬁcientthird-orderdependencyparsers.InProceedingsofthe48thAnnualMeetingoftheAssociationforComputationalLinguistics(ACL),pages1–11.MarcoKuhlmann,CarlosG´omez-Rodr´ıguez,andGior-gioSatta.2011.Dynamicprogrammingalgorithmsfortransition-baseddependencyparsers.InProceed-ingsofthe49thAnnualMeetingoftheAssociationforComputationalLinguistics(ACL),pages673–682.Andr´eFilipeMartins,DipanjanDas,NoahA.Smith,andEricP.Xing.2008.Stackingdependencyparsers.InProceedingsoftheConferenceonEmpiricalMeth-odsinNaturalLanguageProcessing(EMNLP),pages157–166.RyanMcDonaldandJoakimNivre.2007.Charac-terizingtheerrorsofdata-drivendependencyparsingmodels.InProceedingsofthe2007JointConferenceonEmpiricalMethodsinNaturalLanguageProcess-ingandComputationalNaturalLanguageLearning(EMNLP-CoNLL),pages122–131.RyanMcDonald,KobyCrammer,andFernandoPereira.2005.Onlinelarge-margintrainingofdependencyparsers.InProceedingsofthe43rdAnnualMeetingoftheAssociationforComputationalLinguistics(ACL),pages91–98.JoakimNivreandRyanMcDonald.2008.Integratinggraph-basedandtransition-baseddependencyparsers.InProceedingsofthe46thAnnualMeetingoftheAs-sociationforComputationalLinguistics(ACL),pages950–958.JoakimNivre.2003.Anefﬁcientalgorithmforpro-jectivedependencyparsing.InProceedingsofthe8thInternationalWorkshoponParsingTechnologies(IWPT),pages149–160.JoakimNivre.2004.Incrementalityindeterministicde-pendencyparsing.InProceedingsoftheWorkshoponIncrementalParsing:BringingEngineeringandCog-nitionTogether(ACL),pages50–57.JoakimNivre.2008.Algorithmsfordeterministicincre-mentaldependencyparsing.ComputationalLinguis-tics,34:513–553.St´ephaneRossandJ.AndrewBagnell.2010.Efﬁcientreductionsforimitationlearning.InProceedingsofthe13thInternationalConferenceonArtiﬁcialIntelli-genceandStatistics,pages661–668.St´ephaneRoss,GeoffreyJ.Gordon,andJ.AndrewBag-nell.2011.Areductionofimitationlearningandstructuredpredictiontono-regretonlinelearning.InProceedingsofthe14thInternationalConferenceonArtiﬁcialIntelligenceandStatistics,pages627–635.AndreasVlachos.2012.Aninvestigationofimitationlearningalgorithmsforstructuredprediction.InPro-ceedingsoftheEuropeanWorkshoponReinforcementLearning(EWRL),pages143–154.YueZhangandStephenClark.2008.Ataleoftwoparsers:Investigatingandcombininggraph-basedandtransition-baseddependencyparsing.InProceedingsoftheConferenceonEmpiricalMethodsinNaturalLanguageProcessing(EMNLP),pages562–571.YueZhangandJoakimNivre.2011.Transition-baseddependencyparsingwithrichnon-localfeatures.InProceedingsofthe49thAnnualMeetingoftheAsso-ciationforComputationalLinguistics:HumanLan-guageTechnologies,pages188–193. Operazioni dell'Associazione per la Linguistica Computazionale, 1 (2013) 403–414. Redattore di azioni: Jason Eisner. Immagine

Scarica il pdf