Transactions of the Association for Computational Linguistics, 1 (2013) 327–340. Action Editor: Philipp Koehn.

Submitted 1/2013; Revised 5/2013; Published 7/2013.

© 2013 Association for Computational Linguistics.

Dynamically Shaping the Reordering Search Space of Phrase-Based Statistical Machine Translation

Arianna Bisazza and Marcello Federico
Fondazione Bruno Kessler
Trento, Italy
{bisazza,federico}@fbk.eu

Abstract

Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages. In fact, the optimal trade-off between accuracy and complexity of decoding is nowadays reached by harshly limiting the input permutation space. We propose a method to dynamically shape such space and, thus, capture long-range word movements without hurting translation quality nor decoding time. The space defined by loose reordering constraints is dynamically pruned through a binary classifier that predicts whether a given input word should be translated right after another. The integration of this model into a phrase-based decoder improves a strong Arabic-English baseline already including state-of-the-art early distortion cost (Moore and Quirk, 2007) and hierarchical phrase orientation models (Galley and Manning, 2008). Significant improvements in the reordering of verbs are achieved by a system that is notably faster than the baseline, while BLEU and METEOR remain stable, or even increase, at a very high distortion limit.

1 Introduction

Word order differences are among the most important factors determining the performance of statistical machine translation (SMT) on a given language pair (Birch et al., 2009). This is particularly true in the framework of phrase-based SMT (PSMT) (Zens et al., 2002; Koehn et al., 2003; Och and Ney, 2002), an approach that remains highly competitive despite the recent advances of the tree-based approaches.

During the PSMT decoding process, the output sentence is built from left to right, while the input sentence positions can be covered in different orders. Thus, reordering in PSMT can be viewed as the problem of choosing the input permutation that leads to the highest-scoring output sentence. Due to efficiency reasons, however, the input permutation space cannot be fully explored, and is therefore limited with hard reordering constraints.

Although many solutions have been proposed to explicitly model word reordering during decoding, PSMT still largely fails to handle long-range word movements in language pairs with different syntactic structures.1 We believe this is mostly not due to deficiencies of the existing reordering models, but rather to a very coarse definition of the reordering search space. Indeed, the existing reordering constraints are rather simple and typically based on word-to-word distances. Moreover, they are uniform throughout the input sentence and insensitive to the actual words being translated. Relaxing this kind of constraints means dramatically increasing the size of the search space and making the reordering model's task extremely complex. As a result, even in language pairs where long reordering is regularly observed, PSMT quality degrades when long word movements are allowed to the decoder.

We address this problem by training a binary classifier to predict whether a given input position should be translated right after another, given the words at those positions and their contexts.

1 For empirical evidence, see for instance (Birch et al., 2009; Galley and Manning, 2008; Bisazza and Federico, 2012).


When this model is integrated into the decoder, its predictions can be used not only as an additional feature function, but also as an early indication of whether or not a given reordering path should be further explored. More specifically, at each hypothesis expansion, we consider the set of input positions that are reachable according to the usual reordering constraints, and prune it based only on the reordering model score. Then, the hypothesis can be expanded normally by covering the non-pruned positions. This technique makes it possible to dynamically shape the search space while decoding with a very high distortion limit, which can improve translation quality and efficiency at the same time.

The remainder of the paper is organized as follows. After an overview of the relevant literature, we describe in detail our word reordering model. In the following section, we introduce early pruning of reordering steps as a way to dynamically shape the input permutation space. Finally, we present an empirical analysis of our approach, including intrinsic evaluation of the model and SMT experiments on a well-known Arabic-English news translation task.

2 Previous Work

In this paper, we focus on methods that guide the reordering search during the phrase-based decoding process. See for instance (Costa-jussà and Fonollosa, 2009) for a review of pre- and post-reordering approaches that are not treated here.

Assuming a one-to-one correspondence between source and target phrases, reordering in PSMT can be viewed as the problem of searching through a set of permutations of the input sentence. Thus, two sub-problems arise: defining the set of allowed permutations (reordering constraints) and scoring the allowed permutations according to some likelihood criterion (reordering model). We begin with the latter, returning to the constraints later in this section.

2.1 Reordering modeling

In its original formulation, the PSMT approach includes a basic reordering model, called distortion cost, that exponentially penalizes longer jumps among consecutively translated phrases (f̃_{i−1}, f̃_i):

d(f̃_{i−1}, f̃_i) = e^{−|start(f̃_i) − end(f̃_{i−1}) − 1|}

A number of more sophisticated solutions have been proposed to explicitly model word reordering during decoding. These can mostly be grouped into three families: phrase orientation models, jump models and source decoding sequence models.

Phrase orientation models (Tillmann, 2004; Koehn et al., 2005; Zens and Ney, 2006; Galley and Manning, 2008), also known as lexicalized reordering models, predict the orientation of a phrase with respect to the last translated one, by classifying it as monotone, swap or discontinuous. These models have proven very useful for short and medium-range reordering and are among the most widely used in PSMT. However, their coarse classification of reordering steps makes them unsuitable to predict long-range reorderings.

Jump models (Al-Onaizan and Papineni, 2006; Green et al., 2010; Yahyaei and Monz, 2010) predict the direction and length of a jump to perform after a given input word.2 Both these works achieve their best Arabic-English results within a rather small DL: namely, 8 in (Al-Onaizan and Papineni, 2006) and 5 in (Green et al., 2010), thus failing to capture the rare but crucial long reorderings that were their main motivation. A drawback of this approach is that long jumps are typically penalized because of their low frequency compared to short jumps. This strong bias is undesirable, given that we are especially interested in detecting probable long reorderings.

Source decoding sequence models predict which input word is likely to be translated at a given state of decoding. For instance, reordered source language models (Feng et al., 2010) are smoothed n-gram models trained on a corpus of source sentences reordered to match the target word order. When integrated into the SMT system, they assign a probability to each newly translated word given the n−1 previously translated words. Finally, source word pair reordering models (Visweswariah et al., 2011) estimate, for each pair of input words i and j, the cost of translating j right after i given various features of i, j and their respective contexts. Differently from reordered source LMs, these models are discriminative and can profit from richer feature sets. At the same time, they do not employ decoding history-based features, which allows for more effective hypothesis recombination. The model we are going to present belongs to this last sub-group, which we find especially suitable to predict long reorderings.

2 In this paper, a source word denotes the word at a given position of the input sentence, rather than a word type.
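As a concrete illustration of the basic distortion cost defined above, the following minimal sketch computes it from the source-side span boundaries of consecutively translated phrases (function names are ours; positions are 1-based, as in the formula):

```python
import math

def distortion(prev_end: int, next_start: int) -> int:
    """D(f_{i-1}, f_i) = |start(f_i) - end(f_{i-1}) - 1|.
    Zero when translation proceeds monotonically over the source."""
    return abs(next_start - prev_end - 1)

def distortion_cost(prev_end: int, next_start: int) -> float:
    """Basic PSMT distortion cost d = e^{-D}: longer jumps are
    penalized exponentially, regardless of the words involved."""
    return math.exp(-distortion(prev_end, next_start))

# Monotone continuation: previous phrase ends at position 3, next starts at 4.
assert distortion(3, 4) == 0 and distortion_cost(3, 4) == 1.0
# Jumping backward over the source (previous phrase ended at 9, next starts at 4).
assert distortion(9, 4) == 6
```

The word-insensitivity visible here is exactly the coarseness that the models discussed in this section try to overcome.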


2.2 Reordering constraints

The reordering constraint originally included in the PSMT framework and implemented in our reference toolkit, Moses (Koehn et al., 2007), is called distortion limit (DL). This consists in allowing the decoder to skip, or jump, at most k words from the last translated phrase to the next one. More precisely, the limit is imposed on the distortion D between consecutively translated phrases (f̃_{i−1}, f̃_i):

D(f̃_{i−1}, f̃_i) = |start(f̃_i) − end(f̃_{i−1}) − 1| ≤ DL

Limiting the input permutation space is necessary for beam-search PSMT decoders to function in linear time. Reordering constraints are also important for translation quality because the existing models are typically not discriminative enough to guide the search over very large sets of reordering hypotheses. Despite their crucial effects on the complexity of reordering modeling, however, reordering constraints have drawn less attention in the literature. The existing reordering constraints are typically based on word-to-word distances – IBM (Berger et al., 1996) and DL (Koehn et al., 2007) – or on permutation patterns – ITG (Wu, 1997). Both kinds of constraints are uniform throughout the input sentence, and insensitive to the word being translated and to its context. This results in a very coarse definition of the reordering search space, which is problematic in language pairs with different syntactic structures.

To address this problem, Yahyaei and Monz (2010) present a technique to dynamically set the DL: they train a classifier to predict the most probable jump length after each input word, and use the predicted value as the DL after that position. Unfortunately, this method can generate inconsistent constraints leading to decoding dead-ends. As a solution, the dynamic DL is relaxed when needed to reach the first uncovered position. Translation improvements are reported only on a small-scale task with short sentences (BTEC), over a baseline that includes a very simple reordering model. In our work we develop this idea further and use a reordering model to predict which specific input words, rather than input intervals, are likely to be translated next. Moreover, our solution is not affected by the constraint inconsistency problem (see Sect. 4).

In another related work, Bisazza and Federico (2012) generate likely reorderings of the input sentence by means of language-specific fuzzy rules based on shallow syntax. Long jumps are then suggested to the PSMT decoder by reducing the distortion cost for specific pairs of input words. In comparison to the dynamic DL, that is a much finer way to define the reordering space, leading to consistent improvements of both translation quality and efficiency over a strong baseline. However, the need of specific reordering rules makes the method harder to apply to new language pairs.

3 The WaW reordering model

We model reordering as the problem of deciding whether a given input word should be translated after another (Word-after-Word). This formulation is particularly suitable to help the decoder decide whether a reordering path is promising enough to be further explored. Moreover, when translating a sentence, choosing the next source word to translate appears as a more natural problem than guessing how much to the left or to the right we should move from the current source position.

The WaW reordering model addresses a binary decision task through the following maximum-entropy classifier:

P(R_ij = Y | f_1^J, i, j) = exp[ Σ_m λ_m h_m(f_1^J, i, j, R_ij = Y) ] / Σ_{Y′} exp[ Σ_m λ_m h_m(f_1^J, i, j, R_ij = Y′) ]

where f_1^J is a source sentence of J words, h_m are feature functions and λ_m the corresponding feature weights. The outcome Y can be either 1 or 0, with R_ij = 1 meaning that the word at position j is translated right after the word at position i.
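For a binary outcome, the maximum-entropy classifier above can be scored as in the following sketch, in which features are simplified to plain strings and the trained weights λ are assumed to be given (feature names and weight values are invented for illustration):

```python
import math

def waw_probability(active_features, weights):
    """P(R_ij = 1 | f_1^J, i, j) for a binary maximum-entropy model:
    each outcome Y in {0, 1} gets exp(sum of lambda_m over features h_m
    active with that outcome); the two scores are then normalized."""
    unnorm = {
        y: math.exp(sum(weights.get((feat, y), 0.0) for feat in active_features))
        for y in (0, 1)
    }
    return unnorm[1] / (unnorm[0] + unnorm[1])

# Hypothetical active features for a candidate position pair (i, j):
feats = ["p_i=VB", "p_j=NN", "same_chunk=no"]
weights = {("p_i=VB", 1): 1.2, ("p_j=NN", 1): 0.4}  # invented values
p = waw_probability(feats, weights)
# Total weight 1.6 for Y=1 and 0 for Y=0, so p = sigmoid(1.6) ~ 0.83.
assert abs(p - 1 / (1 + math.exp(-1.6))) < 1e-9
```

With no informative features the model backs off to P = 0.5, since both outcomes receive the same unnormalized score.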


Our WaW reordering model is strongly related to that of Visweswariah et al. (2011) – hereby called Travelling Salesman Problem (TSP) model – with a few important differences: (i) we do not include in the features any explicit indication of the jump length, in order to avoid the bias on short jumps; (ii) they train a linear model with MIRA (Crammer and Singer, 2003) by minimizing the number of input words that get placed after the wrong position, while we use a maximum-entropy classifier trained by maximum likelihood; (iii) they use an off-the-shelf TSP solver to find the best source sentence permutation and apply it as pre-processing to training and test data. By contrast, we integrate the maximum-entropy classifier directly into the SMT decoder and let all its other models (phrase orientation, translation, target LM, etc.) contribute to the final reordering decision.

3.1 Features

Like the TSP model (Visweswariah et al., 2011), the WaW model builds on binary features similar to those typically employed for dependency parsing (McDonald et al., 2005): namely, combinations of surface forms or POS tags of the words i and j and their context. Our feature templates are presented in Table 1. The main novelties with respect to the TSP model are the mixed word-POS templates (rows 16-17) and the shallow syntax features. In particular, we use the chunk types of i, j and their context (18-19), as well as the chunk head words of i and j (20). Finally we add a feature to indicate whether the words i and j belong to the same chunk (21). The jump orientation – forward/backward – is included in the features that represent the words comprised between i and j (rows 6, 7, 14, 15). No explicit indication of the jump length is included in any feature.

Table 1: Feature templates used to learn whether a source position j is to be translated right after i, defined over the positions i−2, i−1, i, i+1, b, j−1, j, j+1, where b denotes the positions comprised between i and j (w: word identity, p: POS tag, c: chunk type, h: chunk head). Rows 1-7 are word-identity templates, with one template per b position (row 6) and one for the concatenation of them all (row 7); rows 8-15 are the corresponding POS-tag templates (rows 14 and 15 over b); rows 16-17 are mixed word/POS templates; rows 18-19 are chunk-type templates; row 20 combines the chunk heads of i and j; row 21 asks whether i and j belong to the same chunk.

3.2 Training data

To generate training data for the classifier, we first extract reference reorderings from a word-aligned parallel corpus. Given a parallel sentence, different heuristics may be used to convert arbitrary word alignments to a source permutation (Birch et al., 2010; Feng et al., 2010; Visweswariah et al., 2011). Similarly to this last work, we compute for each source word f_i the mean a_i of the target positions aligned to f_i, then sort the source words according to this value.3 As a difference, however, we do not discard unaligned words but assign them the mean of their neighbouring words' alignment means, so that a complete permutation of the source sentence (σ) is obtained. Table 2(a) illustrates this procedure.

3 Using the mean of the aligned indices makes the generation of reference permutations more robust to alignment errors. Admittedly, this heuristic does not handle well the case of source words that are correctly aligned to non-consecutive target words. However, this phenomenon is also not captured by standard PSMT models, which only learn continuous phrases.

Given the reference permutation, we then generate positive and negative training samples by simulating the decoding process. We traverse the source positions in the order defined by σ, keeping track of the positions that have already been covered and, for each t: 1 ≤ t ≤ J, generate:

• one positive sample (R_{σt, σt+1} = 1) for the source position that comes right after it,
• a negative sample (R_{σt, u} = 0) for each source position in {u: σt − δ + 1 […]

[…] with D > 7 and forward jumps with D > 6. In Table 4, results are compared with the ranking produced by standard distortion, which always favors shorter jumps. Two conditions are considered: DL=10, corresponding to the sampling window δ used to produce the training data, and DL=18, that is the maximum distortion of jumps that will be considered in our early-pruning SMT experiment.

Table 4: Word-to-word jump ranking accuracy (%) of standard distortion and the WaW reordering model, in different DL conditions. DL-err is the percentage of correct jumps beyond DL. The test set consists of 40K reordering decisions: one for each source word in TIDES-MT04.

Model       DL   DL-err   Top-1   Top-3   Top-3-long (back)   Top-3-long (forw.)
Distortion  10   2.4      61.8    79.6    50.7                66.0
            18   0.8      62.0    80.0    18.9                52.3
WaW         10   2.4      71.2    91.2    76.4                69.3
            18   0.8      71.2    91.8    68.0                51.8

We can see that, in terms of overall accuracies, the WaW reordering model outperforms standard distortion by a large margin (about 10% absolute). This is an important result, considering that the jump length, strongly correlating with the jump likelihood, is not directly known to our model. As regards the DL, the higher limit naturally results in a lower DL-error rate (percentage of correct jumps beyond DL): namely 0.8% instead of 2.4%. However, jump prediction becomes much harder: Top-3 accuracy of long jumps by distortion drops from 50.7% to 18.9% (backward) and from 66.0% to 52.3% (forward). Our model is remarkably robust to this effect on backward jumps, where it achieves 68.0% accuracy.

Due to the syntactic characteristics of Arabic and English, the typical long reordering pattern consists in (i) skipping a clause-initial Arabic verb, (ii) covering a long subject, then finally (iii) jumping back to translate the verb and (iv) jumping forward to continue translating the rest of the sentence (see Fig. 3 for an example).12 Deciding when to jump back to cover the verb (iii) is the hardest part of this process, and that is precisely where our model seems more helpful, while distortion always prefers to proceed monotonically, achieving a very low accuracy of 18.9%. In the case of long forward jumps (iv), instead, distortion is advantaged, as the correct choice typically corresponds to translating the first uncovered position, that is the shortest jump available from the last translated word. Even here, our model achieves an accuracy of 51.8%, only slightly lower than that of distortion (52.3%).

In summary, the WaW reordering model significantly outperforms distortion in the ranking of long jumps. In the large majority of cases, it is able to rank a correct long jump in the top 3 reordering options, which suggests that it can be effectively used for early reordering pruning.

12 Clearly, we would expect different figures from testing the model on another language pair like German-English, where the verb is often postponed in the source with respect to the target.

5.2 SMT experimental setup

Our SMT systems are built with the Moses toolkit, while word alignment is produced by the Berkeley Aligner (Liang et al., 2006). The baseline decoder includes a phrase translation model, a lexicalized reordering model, a 6-gram target language model, distortion cost, and word and phrase penalties. More specifically, the baseline reordering model is a hierarchical phrase orientation model (Tillmann, 2004; Koehn et al., 2005; Galley and Manning, 2008) trained on all the available parallel data. This variant was shown to outperform the default word-based one on an Arabic-English task. To make our baseline even more competitive, we apply early distortion cost, as proposed by Moore and Quirk (2007). This function has the same value as the standard one over a complete translation hypothesis, but it anticipates the gradual accumulation of the cost, making hypotheses of the same length more comparable to one another. Note that this option has no effect on the distortion limit, but only on the distortion cost feature function. As proposed by Johnson et al. (2007), statistically improbable phrase pairs are removed from the translation model. The language models are estimated by the IRSTLM toolkit (Federico et al., 2008) with modified Kneser-Ney smoothing (Chen and Goodman, 1999).

Feature weights are optimized by minimum BLEU-error training (Och, 2003) on dev06-nw. To reduce the effects of the optimizer instability, we tune each configuration four times and use the average of the resulting weight vectors to translate the test sets, as suggested by Cettolo et al. (2011). Finally, eval08-nw is used to select the early pruning parameters for the last experiment, while eval09-nw is always reserved as blind test.

5.3 Evaluation metrics

We evaluate global translation quality with BLEU (Papineni et al., 2002) and METEOR (Banerjee and Lavie, 2005). These metrics, however, are only indirectly sensitive to word order, and especially unlikely to capture improvements at the level of long-range reordering. For this reason, we also compute the Kendall Reordering Score, or KRS (Birch et al., 2010), which is a positive score based on the Kendall's Tau distance between the source-output permutation π and the source-reference permutation σ:

KRS(π, σ) = (1 − K(π, σ)) · BP

K(π, σ) = [ Σ_{i=1}^{n} Σ_{j=1}^{n} d(i, j) ] / [ ½ n(n−1) ]

d(i, j) = 1 if π_i < π_j and σ_i > σ_j, 0 otherwise

where BP is a sentence-level brevity penalty, similar to that of BLEU. The KRS is robust to lexical choice because it performs no comparison between output and reference words, but only between the positions of their translations. Besides, it was shown to correlate strongly with human judgements of fluency.

Our work specifically addresses long-range reordering phenomena in language pairs where these are quite rare, although crucial for preserving the source text meaning. Hence, an improvement at this level may not be detected by the general-purpose metrics.
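The KRS defined above can be sketched directly from its definition, as a naive O(n²) count of discordant position pairs (the brevity penalty is passed in rather than derived, and both permutations map each source position to its target-side rank):

```python
def krs(pi, sigma, bp=1.0):
    """Kendall Reordering Score between the source-output permutation pi
    and the source-reference permutation sigma: 1 minus the normalized
    number of discordant pairs, times a brevity penalty bp."""
    n = len(pi)
    discordant = sum(
        1
        for i in range(n)
        for j in range(n)
        if pi[i] < pi[j] and sigma[i] > sigma[j]
    )
    return (1.0 - discordant / (0.5 * n * (n - 1))) * bp

# Output ordered exactly like the reference: no discordant pairs.
assert krs([0, 1, 2, 3], [0, 1, 2, 3]) == 1.0
# One swapped pair out of n(n-1)/2 = 6 pairs: KRS = 1 - 1/6.
assert abs(krs([0, 2, 1, 3], [0, 1, 2, 3]) - (1 - 1 / 6)) < 1e-9
```

Because only positions are compared, a hypothesis with wrong lexical choices but correct word order still scores 1.0, which is exactly the robustness property noted above.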

ngtimeofallmodels.Medium-scaleevaluation.IntegratingtheWaWmodelasanadditionalfeaturefunctionresultsinsmallbutconsistentimprovementsinallDLcondi-tions,whichshowsthatthistypeofmodelconveysinformationthatismissingfromthestate-of-the-artreorderingmodels.Asregardsefficiency,thenewmodelmakesdecodingtimeincreaseby8%.AmongtheDLsettingsconsidered,DL=8iscon-firmedastheoptimalone–withorwithoutWaWmodel.RaisingtheDLto18withnospecialprun-inghasanegativeimpactonbothtranslationqualityandefficiency.Theeffectisespeciallyvisibleonthereorderingscores:thatis,from84.7to83.9KRSandfrom86.2to85.8KRS-Voneval09-nw.Runtimesarealmostdoubled:from87to164andfrom94to178ms/word,thatisa89%increase.WethenproceedtothelastexperimentwherethereorderingspaceisdynamicallyprunedbasedontheWaWmodelscore.AsexplainedinSect.4,anon-prunable-zoneofwidthϑ=5issetaroundthelastcoveredposition.Tosettheearlypruningpa-rameters,weperformagridsearchoverthevalues(1,2,3,4,5)forhistogramand(0.5,0.25,0.1)forrelativethreshold,andselectthevaluesthatachievethebestBLEUandKRSoneval08-nw,namely3(his-togram)and0.1(límite).Theresultingconfigu-rationisthenre-optimizedbyMERTondev06-nw.Thissettingimpliesthat,atagivenpointofdecod-ingwhereiisthelastcoveredposition,anewwordcanbetranslatedonlyif:•itlieswithinaDLof5fromi,or•itlieswithinaDLof18fromianditsWaWreorderingscoreisamongthetop3andatleastequalto1/10ofthebestscore(inlinearspace).AsshowninTable5,earlypruningachievesthebestresultsoverall:despitethehighDL,wereport l D o w n o a d e d f r o m h t t p : / / directo . mi t . e d u / t a c l / lartice – pdf / ¿yo? / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 2 3 1 1 5 6 6 6 5 7 / / t l a c _ a _ 0 0 2 3 1 pd . 
f por invitado 0 7 septiembre 2 0 2 3 336 DLReo.modelseval08-nweval09-nwvs-09ms/bleumetkrskrs-Vbleumetkrskrs-Vkrs-VwordUsingthemedium-sizeLM(147MEnglishtokens):5hier.lexreo,earlydisto44.735.1!83.0!84.7!50.3″38.184.685.984.759+WaWmodel44.835.183.785.451.0#38.3#85.1#86.6$85.5#648hier.lexreo,earlydisto[B]44.835.283.485.650.638.184.786.284.887+WaWmodel45.035.283.7$85.951.1#38.3#85.1#86.8#85.8#9418hier.lexreo,earlydisto44.734.9!82.4!84.9!50.338.0″83.9!85.8″84.3″164+WaWmodel44.835.282.7!85.551.0$38.3#84.2″86.285.2178+earlyreo.pruning(ϑ=5)45.035.383.7$86.3#50.938.3#84.987.0#86.2#68UsingthelargeinterpolatedLM(2130MEnglishtokens)anddoublebeam-size:8hier.lexreo,earlydisto[B]46.335.083.285.051.638.384.585.884.5257918hier.lexreo,earlydisto45.9″34.9!81.7!84.1!51.438.1!83.0!84.6!83.1!5462+WaW+reo.pruning(ϑ=5)46.335.283.485.7#52.8#38.6#84.686.6#85.5#1588Table5:EffectsofWaWreorderingmodelingandearlyreorderingpruningontranslationquality,measuredwith%BLEU,METEOR,andKendallReorderingScore:regular(KRS)andverb-specific(KRS-V).Statisticallysignificantdifferenceswithrespecttothebaseline[B]aremarkedwith#!atthep≤.05leveland$»atthep≤.10level.Decodingtimeismeasuredinmillisecondsperinputword.nolossinBLEU,METEORandKRS,butweactuallyseeseveralimprovements.Inparticular,thegainsontheblindtesteval09-nware+0.3BLEU,+0.2ME-TEORand+0.2KRS(onlyMETEORissignificant).Whilethesegainsareadmittedlysmall,werecallthatourtechniquesaffectrareandisolatedeventswhichcanhardlyemergefromthegeneral-purposeevaluationmetrics.Moreover,toourknowledge,thisisthefirsttimethataPSMTsystemisshowntomaintainagoodperformanceonthislanguagepairwhileadmittingverylong-rangereorderings.Finallyandmoreimportantly,thereorderingofverbsimprovessignificantlyonbothgenerictestsandontheVS-sentencesubset(vs-09):a 
saber,inthelatter,weachieveanotablegainof1.4KRS-V.Efficiencyisalsolargelyimprovedbyourearlyreorderingpruningtechnique:decodingtimeisre-ducedto68ms/word,correspondingtoa22%speed-upoverthebaseline.Large-scaleevaluation.Wealsoinvestigatewhetherourmethodscanbeusefulinascenariowhereefficiencyislessimportantandmoredataisavailablefortraining.Tothisend,webuildaverylargeLMbyinterpolatingthemainLMwiththreeotherLMstrainedondifferentGigawordsec-tions(seeSect.5).Además,werelaxthedecoder’sbeamsizefromthedefaultvalueof200to400hy-potheses,toreducetheriskofsearcherrorsandob-tainthebestpossiblebaselineperformance.Bycomparingthelarge-scalewiththemedium-scalebaselineinTable5,wenotethattheadditionofLMdataisespeciallybeneficialforBLEU(+1.5oneval08-nwand+1.0oneval09-nw),butnotasmuchfortheothermetrics,whichchallengesthecommonlyheldideathatmoredataalwaysimprovestranslationquality.Heretoo,relaxingtheDLwithoutspecialpruninghurtsnotonlyefficiencybutalsotranslationqual-ity:allthescoresdecreaseconsiderably,showingthateventhestrongerLMisnotsufficienttoguidesearchthroughaverylargereorderingsearchspace.Asforourenhancedsystem,itachievessimi-largainsasinthemedium-scalescenario:thatis,BLEUandMETEORarepreservedorslightlyim-proveddespitetheveryhighDL,whileallthere-orderingscoresincrease.Inparticular,wereportsta-tisticallysignificantimprovementsinthereorderingofverbs,whichiswheretheimpactofourmethodisexpectedtoconcentrate(+0.7,+0.8and+1.0KRS-Voneval08-nw,eval09-nwandvs-09,respectivamente).Theseresultsconfirmtheusefulnessofourmethodnotonlyasanoptimizationtechnique,butalsoasawaytoimprovetranslationqualityontopofaverystrongbaseline.
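Putting the two bullet conditions above together, the early pruning rule can be sketched as follows. This is a simplification of the actual decoder integration (coverage bookkeeping and hypothesis scoring are omitted), and `waw_score` stands for the classifier's probability that a candidate position is translated next; the toy scores are invented:

```python
def prune_candidates(last, candidates, waw_score, theta=5, dl=18, hist=3, thr=0.1):
    """Return the input positions that may be expanded next.

    A candidate j survives if it lies within the non-prunable zone of
    width `theta` around the last covered position `last`, or if it lies
    within the loose distortion limit `dl` AND its WaW score is among
    the top `hist` candidates and at least `thr` times the best score.
    """
    in_zone = [j for j in candidates if abs(j - last - 1) <= theta]
    outside = [j for j in candidates if theta < abs(j - last - 1) <= dl]
    if not outside:
        return sorted(in_zone)
    ranked = sorted(outside, key=waw_score, reverse=True)
    best = waw_score(ranked[0])
    kept = [j for j in ranked[:hist] if waw_score(j) >= thr * best]
    return sorted(in_zone + kept)

# Toy scores favoring a long backward jump to position 1 (e.g., a skipped verb):
scores = {1: 0.9, 20: 0.05, 25: 0.4}
keep = prune_candidates(last=12, candidates=[1, 13, 14, 20, 25],
                        waw_score=lambda j: scores.get(j, 0.0))
assert 13 in keep and 14 in keep   # non-prunable zone always survives
assert 1 in keep and 25 in keep    # high-scoring long jumps survive
assert 20 not in keep              # pruned: 0.05 < 0.1 * 0.9
```

The defaults mirror the settings selected above (ϑ=5, DL=18, histogram 3, relative threshold 0.1); the non-prunable zone is what guarantees that, unlike a dynamic DL, the constraints can never produce a decoding dead-end.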


Figure 3: Two Arabic sentences from eval09-nw whose clause-initial verbs require long-range reordering. First example, SRC (transliterated): ywASl sfyr Almmlkp AlErbyp AlsEwdyp ldY lbnAn EbdAlEzyz xwjp tHrk-h fy AtjAh (verb / subject / object / complement; gloss: continues | ambassador | Kingdom | Arabian | Saudi | to | Lebanon | Abdulaziz | Khawja | move-his | in | direction). REF: "The Kingdom of Saudi Arabia's ambassador to Lebanon Abdulaziz Khawja continues his moves towards". BASE: "continue to Saudi Arabian ambassador to Lebanon, Abdulaziz Khwja its move in the direction of". NEW: "The Kingdom of Saudi Arabia's ambassador to Lebanon, Abdulaziz Khwja continue its move in the direction of". […]

[…] jumps with D > 5 were performed to translate the 586 sentences of eval09-nw; 38 out of these were longer than 8 and mostly concentrated on the VS-sentence subset (27 jumps with D > 8 performed in vs-09).13 This and the higher reordering scores suggest that long jumps are mainly carried out to correctly reorder clause-initial verbs over long subjects.

Fig. 3 shows two Arabic sentences taken from eval09-nw that were erroneously reordered by the baseline system. The system including the WaW model and early reordering pruning, instead, produced the correct translation. The first sentence is a typical example of VSO order with a long subject: while the baseline system left the verb in its Arabic position, producing an incomprehensible translation, the new system placed it rightly between the English subject and object. This reordering involved two long jumps: one with D=9 backward and one with D=8 forward.

The second sentence displays another, less common, Arabic construction: namely VOS, with a personal pronoun object. In this case, a backward jump with D=10 and a forward jump with D=8 were necessary to achieve the correct reordering.

13 Statistics computed on the medium-LM system.

6 Conclusions

We have trained a discriminative model to predict likely reordering steps in a way that is complementary to state-of-the-art PSMT reordering models. We have effectively integrated it into a PSMT decoder as an additional feature, ensuring that its total score over a complete translation hypothesis is consistent across different phrase segmentations. Lastly, we have proposed early reordering pruning as a novel method to dynamically shape the input reordering space and capture long-range reordering phenomena that are often critical when translating between languages with different syntactic structures.

Evaluated on a popular Arabic-English news translation task against a strong baseline, our approach leads to similar or even higher BLEU, METEOR and KRS scores at a very high distortion limit (18), which is by itself an important achievement. At the same time, the reordering of verbs, measured with a novel version of the KRS, is consistently improved, while decoding gets significantly faster. The improvements are also confirmed when a very large LM is used and the decoder's beam size is doubled, which shows that our method reduces not only search errors but also model errors even when baseline models are very strong.

Word reordering is probably the most difficult aspect of SMT and an important factor of both its quality and efficiency.


word reordering during decoding, rather than before or after it. To date, however, this objective was only partially achieved. We believe there is a promising way to go between fully-integrated reordering models and monolingual pre-ordering methods. This work has started to explore it.

Acknowledgments

This work was partially funded by the European Union FP7 grant agreement 287658 (EU-BRIDGE).

References

Yaser Al-Onaizan and Kishore Papineni. 2006. Distortion models for statistical machine translation. In Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics, pages 529–536, Sydney, Australia, July.

Jacob Andreas, Nizar Habash, and Owen Rambow. 2011. Fuzzy syntactic reordering for phrase-based statistical machine translation. In Proceedings of the Sixth Workshop on Statistical Machine Translation, pages 227–236, Edinburgh, Scotland, July.

Satanjeev Banerjee and Alon Lavie. 2005. METEOR: An automatic metric for MT evaluation with improved correlation with human judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 65–72, Ann Arbor, Michigan, June.

A. L. Berger, P. F. Brown, S. A. Della Pietra, V. J. Della Pietra, J. R. Gillett, A. S. Kehler, and R. L. Mercer. 1996. Language translation apparatus and method of using context-based translation models. United States Patent, No. 5510981, Apr.

Alexandra Birch, Phil Blunsom, and Miles Osborne. 2009. A quantitative analysis of reordering phenomena. In StatMT '09: Proceedings of the Fourth Workshop on Statistical Machine Translation, pages 197–205, Morristown, New Jersey, USA.

Alexandra Birch, Miles Osborne, and Phil Blunsom. 2010. Metrics for MT evaluation: evaluating reordering. Machine Translation, 24(1):15–26.

Arianna Bisazza and Marcello Federico. 2012. Modified distortion matrices for phrase-based statistical machine translation. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 478–487, Jeju Island, Korea, July.

Arianna Bisazza, Daniele Pighin, and Marcello Federico. 2012. Chunk-lattices for verb reordering in Arabic-English statistical machine translation. Machine Translation, Special Issue on MT for Arabic, 26(1-2):85–103.

Arianna Bisazza. 2013. Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation. Ph.D. thesis, University of Trento. http://eprints-phd.biblio.unitn.it/1019/.

Marine Carpuat, Yuval Marton, and Nizar Habash. 2012. Improved Arabic-to-English statistical machine translation by reordering post-verbal subjects for word alignment. Machine Translation, Special Issue on MT for Arabic, 26(1-2):105–120.

Mauro Cettolo, Nicola Bertoldi, and Marcello Federico. 2011. Methods for smoothing the optimizer instability in SMT. In MT Summit XIII: the Thirteenth Machine Translation Summit, pages 32–39, Xiamen, China.

Stanley F. Chen and Joshua Goodman. 1999. An empirical study of smoothing techniques for language modeling. Computer Speech and Language, 4(13):359–393.

Marta R. Costa-jussà and José A. R. Fonollosa. 2009. State-of-the-art word reordering approaches in statistical machine translation: A survey. IEICE Transactions on Information and Systems, E92-D(11):2179–2185.

Koby Crammer and Yoram Singer. 2003. Ultraconservative online algorithms for multiclass problems. J. Mach. Learn. Res., 3:951–991, March.

Hal Daumé III. 2004. Notes on CG and LM-BFGS optimization of logistic regression. Paper available at http://pub.hal3.name, implementation available at http://hal3.name/megam.

Mona Diab, Kadri Hacioglu, and Daniel Jurafsky. 2004. Automatic Tagging of Arabic Text: From Raw Text to Base Phrase Chunks. In Daniel Marcu, Susan Dumais, and Salim Roukos, editors, HLT-NAACL 2004: Short Papers, pages 149–152, Boston, Massachusetts, USA.

Marcello Federico, Nicola Bertoldi, and Mauro Cettolo. 2008. IRSTLM: an Open Source Toolkit for Handling Large Scale Language Models. In Proceedings of Interspeech, pages 1618–1621, Melbourne, Australia.

Minwei Feng, Arne Mauser, and Hermann Ney. 2010. A source-side decoding sequence model for statistical machine translation. In Conference of the Association for Machine Translation in the Americas (AMTA), Denver, Colorado, USA.

Michel Galley and Christopher D. Manning. 2008. A simple and effective hierarchical phrase reordering model. In EMNLP '08: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 848–856, Morristown, New Jersey, USA.

Spence Green, Conal Sathi, and Christopher D. Manning. 2009. NP subject detection in verb-initial Arabic clauses. In Proceedings of the Third Workshop

yo

D
oh
w
norte
oh
a
d
mi
d

F
r
oh
metro
h

t
t

pag

:
/
/

d
i
r
mi
C
t
.

metro

i
t
.

mi
d
tu

/
t

a
C
yo
/

yo

a
r
t
i
C
mi

pag
d

F
/

d
oh

i
/

.

1
0
1
1
6
2

/
t

yo

a
C
_
a
_
0
0
2
3
1
1
5
6
6
6
5
7

/

/
t

yo

a
C
_
a
_
0
0
2
3
1
pag
d

.

F

b
y
gramo
tu
mi
s
t

t

oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3

339

on Computational Approaches to Arabic Script-based Languages (CAASL3), Ottawa, Canada.

Spence Green, Michel Galley, and Christopher D. Manning. 2010. Improved models of distortion cost for statistical machine translation. In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 867–875, Los Angeles, California.

Magnus R. Hestenes and Eduard Stiefel. 1952. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards, 49(6):409–436.

H. Johnson, J. Martin, G. Foster, and R. Kuhn. 2007. Improving translation quality by discarding most of the phrasetable. In Proceedings of EMNLP-CoNLL 07, pages 967–975.

Philipp Koehn, Franz Josef Och, and Daniel Marcu. 2003. Statistical phrase-based translation. In Proceedings of HLT-NAACL 2003, pages 127–133, Edmonton, Canada.

Philipp Koehn, Amittai Axelrod, Alexandra Birch Mayne, Chris Callison-Burch, Miles Osborne, and David Talbot. 2005. Edinburgh system description for the 2005 IWSLT speech translation evaluation. In Proc. of the International Workshop on Spoken Language Translation, October.

P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, C. Dyer, O. Bojar, A. Constantin, and E. Herbst. 2007. Moses: Open Source Toolkit for Statistical Machine Translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177–180, Prague, Czech Republic.

Ravi Kumar and Sergei Vassilvitskii. 2010. Generalized distances between rankings. In Proceedings of the 19th International Conference on World Wide Web, pages 571–580, New York, New York, USA. ACM.

Percy Liang, Ben Taskar, and Dan Klein. 2006. Alignment by agreement. In Proceedings of the Human Language Technology Conference of the NAACL, Main Conference, pages 104–111, New York City, USA, June.

Ryan McDonald, Fernando Pereira, Kiril Ribarov, and Jan Hajič. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing, HLT '05, pages 523–530, Stroudsburg, Pennsylvania, USA.

Robert C. Moore and Chris Quirk. 2007. Faster beam-search decoding for phrasal statistical machine translation. In Proceedings of MT Summit XI, pages 321–327, Copenhagen, Denmark.

F. Och and H. Ney. 2002. Discriminative training and maximum entropy models for statistical machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL), pages 295–302, Philadelphia, PA.

Franz Josef Och. 2003. Minimum Error Rate Training in Statistical Machine Translation. In Erhard Hinrichs and Dan Roth, editors, Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, pages 160–167.

Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association of Computational Linguistics (ACL), pages 311–318, Philadelphia, PA.

Stefan Riezler and John T. Maxwell. 2005. On some pitfalls in automatic evaluation and significance testing for MT. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pages 57–64, Ann Arbor, Michigan, June.

Christoph Tillmann. 2004. A Unigram Orientation Model for Statistical Machine Translation. In Proceedings of the Joint Conference on Human Language Technologies and the Annual Meeting of the North American Chapter of the Association of Computational Linguistics (HLT-NAACL).

Karthik Visweswariah, Rajakrishnan Rajkumar, Ankur Gandhe, Ananthakrishnan Ramanathan, and Jiri Navratil. 2011. A word reordering model for improved machine translation. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, pages 486–496, Edinburgh, Scotland, UK, July.

Dekai Wu. 1997. Stochastic inversion transduction grammars and bilingual parsing of parallel corpora. Computational Linguistics, 23(3):377–403.

Sirvan Yahyaei and Christof Monz. 2010. Dynamic distortion in a discriminative reordering model for statistical machine translation. In International Workshop on Spoken Language Translation (IWSLT), Paris, France.

Richard Zens and Hermann Ney. 2006. Discriminative reordering models for statistical machine translation. In Proceedings on the Workshop on Statistical Machine Translation, pages 55–63, New York City, June.

R. Zens, F. J. Och, and H. Ney. 2002. Phrase-based statistical machine translation. In 25th German Conference on Artificial Intelligence (KI 2002), pages 18–32, Aachen, Germany. Springer Verlag.

yo

D
oh
w
norte
oh
a
d
mi
d

F
r
oh
metro
h

t
t

pag

:
/
/

d
i
r
mi
C
t
.

metro

i
t
.

mi
d
tu

/
t

a
C
yo
/

yo

a
r
t
i
C
mi

pag
d

F
/

d
oh

i
/

.

1
0
1
1
6
2

/
t

yo

a
C
_
a
_
0
0
2
3
1
1
5
6
6
6
5
7

/

/
t

yo

a
C
_
a
_
0
0
2
3
1
pag
d

.

F

b
y
gramo
tu
mi
s
t

t

oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3

