Transactions of the Association for Computational Linguistics, vol. 3, pp. 271–282, 2015. Action Editor: Hal Daumé III.

Submission batch: 3/2015; Published 5/2015.

© 2015 Association for Computational Linguistics. Distributed under a CC-BY-NC-SA 4.0 Licence.

Domain Adaptation for Syntactic and Semantic Dependency Parsing Using Deep Belief Networks

Haitong Yang, Tao Zhuang and Chengqing Zong
National Laboratory of Pattern Recognition
Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China
{htyang, tao.zhuang, cqzong}@nlpr.ia.ac.cn

Abstract

In current systems for syntactic and semantic dependency parsing, people usually define a very high-dimensional feature space to achieve good performance. But these systems often suffer severe performance drops on out-of-domain test data due to the diversity of features of different domains. This paper focuses on how to relieve this domain adaptation problem with the help of unlabeled target domain data. We propose a deep learning method to adapt both syntactic and semantic parsers. With additional unlabeled target domain data, our method can learn a latent feature representation (LFR) that is beneficial to both domains. Experiments on English data in the CoNLL 2009 shared task show that our method largely reduces the performance drop on out-of-domain test data. Moreover, we get a Macro F1 score that is 2.32 points higher than the best system in the CoNLL 2009 shared task in out-of-domain tests.

1 Introduction

Both syntactic and semantic dependency parsing are standard tasks in the NLP community. State-of-the-art models perform well if the test data comes from the domain of the training data. But if the test data comes from a different domain, the performance drops severely. The results of the shared tasks of CoNLL 2008 and 2009 (Surdeanu et al., 2008; Hajič et al., 2009) also substantiate this argument. To relieve the domain adaptation problem, in this paper we propose a deep learning method for both syntactic and semantic parsers. We focus on the situation in which, besides source domain training data and target domain test data, we also have some unlabeled target domain data.

Many syntactic and semantic parsers are developed using a supervised learning paradigm, where each data sample is represented as a vector of features, usually a high-dimensional one. The performance degradation on target domain test data is mainly caused by the diversity of features of different domains, i.e., many features in target domain test data are never seen in source domain training data.

Previous work has shown that using word clusters to replace the sparse lexicalized features (Koo et al., 2008; Turian et al., 2010) helps relieve the performance degradation on the target domain. But for syntactic and semantic parsing, people also use a lot of syntactic features, i.e., features extracted from syntactic trees. For example, the relation path between a predicate and an argument is a syntactic feature used in semantic dependency parsing (Johansson and Nugues, 2008). Figure 1 shows an example of this relation path feature. Obviously, syntactic features like this are also very sparse and usually specific to each domain. The method of clustering fails to generalize these kinds of features. Our method, however, is very different from clustering specific features and substituting these features with their clusters. Instead, we attack the domain adaptation problem by learning a latent feature representation (LFR) for different domains, which is similar to Titov (2011). Formally, we propose a Deep Belief Network (DBN) model to represent a data sample using a vector of latent features. This latent feature vector is inferred by our DBN model
based on the data sample's original feature vector.

[Figure 1: A path feature example for the sentence "She wants to pay you a visit." The red edges are the path between "She" and "visit", and thus the relation path feature between them is SBJ↑OPRD↓IM↓OBJ↓.]

Our DBN model is trained in an unsupervised manner on the original feature vectors of data in both domains: training data from the source domain, and unlabeled data from the target domain. So our DBN model can produce a common feature representation for data from both domains. A common feature representation can make two domains more similar and thus is very helpful for domain adaptation (Blitzer et al., 2006). Discriminative models using our latent features adapt better to the target domain than models using original features.

Discriminative models in syntactic and semantic parsers usually use millions of features. Applying a typical DBN to learn a sensible LFR on that many original features is computationally too expensive and impractical (Raina et al., 2009). Therefore, we constrain the DBN by splitting the original features into groups. In this way, we largely reduce the computational cost and make LFR learning practical.

We carried out experiments on the English data of the CoNLL 2009 shared task. We use a basic pipelined system and compare the effectiveness of the two feature representations: the original feature representation and our LFR. Using the original features, the performance drop on out-of-domain test data is 10.58 points in Macro F1 score. In contrast, using the LFR, the performance drop is only 4.97 points. And we have achieved a Macro F1 score of 80.83% on the out-of-domain test data. As far as we know, this is the best result on this data set to date.

2 Related Work

Dependency parsing and semantic role labeling (SRL) are two standard tasks in the NLP community. There has been much work on the two tasks (McDonald et al., 2005; Gildea and Jurafsky, 2002; Yang and Zong, 2014; Zhuang and Zong, 2010a; Zhuang and Zong, 2010b, etc.). Among them, studies on domain adaptation for dependency parsing and SRL are directly related to our work. Dredze et al. (2007) show that domain adaptation is hard for dependency parsing based on results in the CoNLL 2007 shared task (Nivre et al., 2007). Chen et al. (2008) adapted a syntactic dependency parser by learning reliable information on shorter dependencies in unlabeled target domain data. But they do not consider the task of semantic dependency parsing. Huang et al. (2010) used an HMM-based latent variable language model to adapt an SRL system. Their method is tailored for a chunking-based SRL system and can hardly be applied to our dependency-based task. Weston et al. (2008) used deep neural networks to improve an SRL system. But their tests are on in-domain data.

On methodology, the work in Glorot et al. (2011) and Titov (2011) is closely related to ours. They also focus on learning LFRs for domain adaptation. However, their work deals with domain adaptation for sentiment classification, which uses far fewer features and training samples. So they do not need to worry about computational cost as much as we do. Titov (2011) used a graphical model that has only one layer of hidden variables. In contrast, we need to use a model with two layers of hidden variables and split the first hidden layer to reduce computational cost. The model of Titov (2011) also embodies a specific classifier. But our model is independent of the classifier to be used. Glorot et al. (2011) used a model called Stacked Denoising Auto-Encoders, which also contains multiple hidden layers. However, they do not exploit the hierarchical structure of their model to reduce computational cost. By splitting, our model contains far fewer parameters than theirs. In fact, the models in Glorot et al. (2011) and Titov (2011) cannot be applied to our task simply because of the high computational cost.

3 Our DBN Model for LFR

In discriminative models, each data sample is represented as a vector of features. Our DBN model maps this original feature vector to a vector of latent
features. And we use this latent feature vector to represent the sample, i.e., we replace the whole original feature vector by the latent feature vector. In this section, we introduce how our DBN model represents a data sample as a vector of latent features. Before introducing our DBN model, we first review a simpler model called the Restricted Boltzmann Machine (RBM) (Hinton et al., 2006). When training a DBN model, the RBM is used as a basic unit of the DBN.

3.1 Restricted Boltzmann Machines

An RBM is an undirected graphical model with a layer of visible variables $v = (v_1, \ldots, v_m)$ and a layer of hidden variables $h = (h_1, \ldots, h_n)$. These variables are binary. Figure 2 shows a graphical representation of an RBM.

[Figure 2: Graphical representations of an RBM: (a) represents an RBM; (b) is a more compact representation.]

The parameters of an RBM are $\theta = (W, a, b)$, where $W = (W_{ij})_{m \times n}$ is a matrix with $W_{ij}$ being the weight for the edge between $v_i$ and $h_j$, and $a = (a_1, \ldots, a_m)$, $b = (b_1, \ldots, b_n)$ are bias vectors for $v$ and $h$ respectively.

The probabilistic model of an RBM is:

$$p(v, h \mid \theta) = \frac{1}{Z(\theta)} \exp(-E(v, h)) \tag{1}$$

where

$$E(v, h) = -\sum_{i=1}^{m} a_i v_i - \sum_{j=1}^{n} b_j h_j - \sum_{i=1}^{m} \sum_{j=1}^{n} v_i W_{ij} h_j$$

$$Z(\theta) = \sum_{v, h} \exp(-E(v, h))$$

Because the connections in an RBM are only between visible and hidden variables, the conditional distribution over a hidden or a visible variable is quite simple:

$$p(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i=1}^{m} v_i W_{ij}\Big) \tag{2}$$

$$p(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j=1}^{n} h_j W_{ij}\Big) \tag{3}$$

where $\sigma(x) = 1 / (1 + \exp(-x))$ is the logistic sigmoid function. An RBM can be efficiently trained on a sequence of visible vectors using the Contrastive Divergence method (Hinton, 2002).

3.2 The Problem of Large Scale

In our syntactic and semantic parsing task, all features are binary. So each data sample (a shift action in syntactic parsing or an argument candidate in semantic parsing) is represented as a binary feature vector. By treating a sample's feature vector as the visible variable vector in an RBM, and taking hidden variables as latent features, we could get the LFR of this sample using the RBM.

However, for our syntactic and semantic parsing tasks, training such an RBM is computationally impractical due to the following considerations. Let $m$, $n$ denote respectively the number of visible and hidden variables in the RBM. Then there are $O(mn)$ parameters in this RBM. If we train the RBM on $d$ samples, then the time complexity for Contrastive Divergence training is $O(mnd)$. For syntactic or semantic parsing, there are over 1 million unique binary features, and millions of training samples. That means both $m$ and $d$ are on the order of $10^6$. With $m$ and $d$ of that order, $n$ should not be chosen too small to get a sensible LFR (Hinton, 2010). Our experience indicates that $n$ should be at least on the order of $10^3$. The product $mnd$ is then on the order of $10^{15}$, which is why the $O(mnd)$ complexity is formidable for our task.

3.3 Our DBN Model

A DBN is a probabilistic generative model that is composed of multiple layers of stochastic, latent variables (Hinton et al., 2006). The motivation for using a DBN is two-fold. First, previous research has shown that a deep network can capture high-level correlations between visible variables better than an RBM (Bengio, 2009). Second, as shown in the preceding subsection, the large scale of our task poses
a great challenge for learning an LFR. By manipulating the hierarchical structure of a DBN, we can significantly reduce the number of parameters in the DBN model. This largely reduces the computational cost for training the DBN. Without this technique, it is impractical to learn a DBN model with that many parameters on large training sets.

[Figure 3: Our DBN model. The blue nodes stand for the visible variables ($v$) and the blank nodes stand for the hidden variables ($h^1$ and $h^2$). The same symbols are also used in the figures of the following subsections.]

As shown in Figure 3, our DBN model contains two layers of hidden variables, $h^1$ and $h^2$, and a visible vector $v$. The visible vector corresponds to a sample's original feature vector. The second-layer hidden variable vector $h^2$ is used as the LFR of this sample.

Suppose there are $m$, $n_1$, $n_2$ variables in $v$, $h^1$, $h^2$ respectively. To reduce the number of parameters in the DBN, we split its first layer ($h^1 - v$) into $k$ groups, as we will explain in the following subsection. We confine the connections in this layer to variables within the same group. So there are only about $m n_1 / k$ parameters in the first layer. Without splitting, the number of parameters would be $m n_1$, and learning that many parameters requires too much computation. By splitting, we reduce the number of parameters by a factor of $k$. If we choose $k$ big enough, learning is feasible.

The second layer ($h^2 - h^1$) is fully connected, so that the variables in the second layer can capture the relations between variables in different groups in the first layer. There are $n_1 n_2$ parameters in the second layer. Because $n_1$ and $n_2$ are relatively small, learning the parameters in the second layer is also feasible.

In summary, by splitting the first layer into groups, we have largely reduced the number of parameters in our DBN model. This makes learning our DBN model practical for our task. In our task, visible variables correspond to original binary features and the second-layer hidden variables are used as the LFR of these original features. One deficiency of splitting is that the relationships between original features in different groups cannot be captured by hidden variables in the first layer. However, this deficiency is compensated for by using the second layer to capture relationships between all variables in the first layer. In this way, the second layer still captures the relationships between all original features indirectly.

3.3.1 Splitting Features into Groups

When we split the first layer into $k$ groups, every group, except the last one, contains $\lfloor m/k \rfloor$ visible variables and $\lfloor n_1/k \rfloor$ hidden variables. The last group contains the remaining visible and hidden variables. But how should we split the visible variables, i.e., the original features, into these groups? Of course there are many ways to split the original features. But it is difficult to find a good principle for splitting. So we tried two splitting strategies in this paper.

The first strategy is very simple. We arrange all features in the order in which they appear in the training data. Suppose each group contains $r$ original features. We just put the first $r$ unique features of the training data into the first group, the following $r$ unique features into the second group, and so on.

The second strategy is more sophisticated. All features can be divided into three categories: the common features, the source-specific features and the target-specific features. The main idea of this strategy is to make each group contain the three categories of features evenly, which we think makes the distribution of features close to the 'true' distribution over domains.

Let $F_s$ and $F_t$ denote the sets of features that appear in source and target domain data respectively. We collect $F_s$ and $F_t$ from our training data. The features in $F_s$ and $F_t$ are ordered in the order in which they appear in the training data. Let $F_{s \cap t} = F_s \cap F_t$ (the common features), $F_{s \setminus t} = F_s \setminus F_t$ (the source-specific features), and $F_{t \setminus s} = F_t \setminus F_s$ (the target-specific features). So, to evenly distribute features in $F_{s \cap t}$, $F_{s \setminus t}$ and $F_{t \setminus s}$ to each group, each group should consist of $|F_{s \cap t}|/k$, $|F_{s \setminus t}|/k$ and $|F_{t \setminus s}|/k$ features from $F_{s \cap t}$, $F_{s \setminus t}$ and $F_{t \setminus s}$ respectively. Therefore, we put the first $|F_{s \cap t}|/k$ features from $F_{s \cap t}$, the first $|F_{s \setminus t}|/k$ features from $F_{s \setminus t}$ and the first $|F_{t \setminus s}|/k$ features from $F_{t \setminus s}$ into the first group. Similarly, we put the second $|F_{s \cap t}|/k$ features from $F_{s \cap t}$, the second $|F_{s \setminus t}|/k$ features from $F_{s \setminus t}$ and the second $|F_{t \setminus s}|/k$ features from $F_{t \setminus s}$ into the second group, and so on. The intuition of this strategy is to let features in $F_{s \cap t}$ act as pivot features that link features in $F_{s \setminus t}$ and $F_{t \setminus s}$ in each group. In this way, the first hidden layer might capture better relationships between features from source and target domains.

3.3.2 LFR of a Sample

Given a sample represented as a vector of original features, our DBN model will represent it as a vector of latent features. The sample's original feature vector corresponds to the visible vector $v$ in our DBN model in Figure 3. Our DBN model uses the second-layer hidden variable vector $h^2$ to represent this sample. Therefore, we must infer the values of the hidden variables in the second layer given the visible vector. This inference can be done using the methods in Hinton et al. (2006). Given the visible vector, the values of the hidden variables in every layer can be efficiently inferred in a single, bottom-up pass.

3.4 Training Our DBN Model

Inference in a DBN is simple and fast. Nonetheless, training a DBN is more complicated. A DBN can be trained in two stages: greedy layer-wise pretraining and fine-tuning (Hinton et al., 2006).

3.4.1 Greedy Layer-wise Pretraining

In this stage, the DBN is treated as a stack of RBMs as shown in Figure 4.

[Figure 4: Stack of RBMs in pretraining.]

The second layer is treated as a single RBM. The first layer is treated as $k$ parallel RBMs with each group being one RBM. These $k$ RBMs are parallel because their visible variable vectors constitute a partition of the original feature vector. In this stage, we train these constituent RBMs in a bottom-up layer-wise manner.

To learn parameters in the first layer, we only need to learn the parameters of each RBM in the first layer. With the original feature vector $v$ given, these $k$ RBMs can be trained using the Contrastive Divergence method (Hinton, 2002). After the first layer is trained, we fix the parameters in the first layer and start to train the second layer.

For the RBM of the second layer, its visible variables are the hidden variables in the first layer. Given an original feature vector $v$, we first infer the activation probabilities for the hidden variables in the first layer using equation (2). And we use these activation probabilities as values for the visible variables in the second-layer RBM. Then we train the second-layer RBM using the contrastive divergence algorithm. Note that the activation probabilities are not binary values. But this is only a trick for training, because using probabilities generally produces better models (Hinton et al., 2006). This trick does not change our assumption that each variable is binary.

3.4.2 Fine-Tuning

The greedy layer-wise pretraining initializes the parameters of our DBN to sensible values. But these values are not optimal and the parameters need to be fine-tuned. For fine-tuning, we unroll the DBN to form an autoencoder as in Hinton and Salakhutdinov (2006), which is shown in Figure 5. In this autoencoder, the stochastic activities of binary hidden variables are replaced by their activation probabilities. So the autoencoder is in essence a feed-forward neural network. We tune the parameters of our DBN model on this autoencoder using the backpropagation algorithm.

[Figure 5: Unrolling the DBN.]

4 Domain Adaptation with Our DBN Model

In this section, we introduce how to use our DBN model to adapt a basic syntactic and semantic dependency parsing system to the target domain.

4.1 The Basic Pipelined System

We build a typical pipelined system, which first analyzes syntactic dependencies, and then analyzes semantic dependencies. This basic system only serves as a platform for experimenting with different feature representations. So we only briefly introduce our basic system in this subsection.

4.1.1 Syntactic Dependency Parsing

For syntactic dependency parsing, we use a deterministic shift-reduce method as in Nivre et al. (2006). It has four basic actions: left-arc, right-arc, shift, and reduce. A classifier is used to determine an action at each step. To decide the label for each dependency link, we extend the left/right-arc actions to their corresponding multi-label actions, leading to 31 left-arc and 66 right-arc actions. Together with shift and reduce, this yields a 99-class problem for parsing action classification. We add arcs to the dependency graph in an arc-eager manner as in Hall et al. (2007). We also projectivize the non-projective sequences in training data using the transformation from Nivre and Nilsson (2005). A maximum entropy classifier is used to make decisions at each step. The features utilized are the same as those in Zhao et al. (2008).

4.1.2 Semantic Dependency Parsing

Our semantic dependency parser is similar to the one in Che et al. (2009). We first train a predicate sense classifier on training data, using the same features as in Che et al. (2009). Again, a maximum entropy classifier is employed. Given a predicate, we need to decide its semantic dependency relation with each word in the sentence. To reduce the number of argument candidates, we adopt the pruning strategy in Zhao et al. (2009), which is adapted from the strategy in Xue and Palmer (2004). In the semantic role classification stage, we use a maximum entropy classifier to predict the probability of a candidate filling each semantic role. We train two different classifiers for verb and noun predicates using the same features as in Che et al. (2009). We use a simple method for postprocessing. If there are duplicate arguments for ARG0∼ARG5, we preserve the one with the highest classification probability and remove its duplicates.

4.2 Adapting the Basic System to Target Domain

In our basic pipeline system, both the syntactic and semantic dependency parsers are built using discriminative models. We train a syntactic parsing model and a semantic parsing model using the original feature representation. We will refer to this syntactic parsing model as OriSynModel, and the semantic parsing model as OriSemModel. However, these two models do not adapt well to the target domain. So we use the LFR of our DBN model to train new syntactic and semantic parsing models. We will refer to the new syntactic parsing model as LatSynModel, and the new semantic parsing model as LatSemModel. Details of using our DBN model are as follows.

4.2.1 Adapting the Syntactic Parser

The input data for training our DBN model are the original feature vectors on training and unlabeled data. Therefore, to train our DBN model, we first need to extract the original features for syntactic parsing on these data. Features on training data can be directly extracted using gold-standard annotations. On unlabeled data, however, some features cannot be directly extracted. This is because our syntactic parser uses history-based features which depend on previous actions taken when parsing a sentence. Therefore, features on unlabeled data can only be extracted after the data are parsed. To solve this problem, we first parse the unlabeled data using the already trained OriSynModel. In this way, we
can obtain the features on the unlabeled data. Because of the poor performance of the OriSynModel on the target domain, the extracted features on unlabeled data contain some noise. However, experiments show that our DBN model can still learn a good LFR despite the noise in the extracted features. Using the LFR, we can train the syntactic parsing model LatSynModel. Then by applying the LFR on test and unlabeled data, we can parse the data using LatSynModel. Experiments in later sections show that the LatSynModel adapts much better to the target domain than the OriSynModel.

4.2.2 Adapting the Semantic Parser

The situation here is similar to the adaptation of the syntactic parser. Features on training data can be directly extracted. To extract features on unlabeled data, we need to have syntactic dependency trees on this data. So we use our LatSynModel to parse the unlabeled data first. And we automatically identify predicates on unlabeled data using a classifier as in Che et al. (2008). Then we extract the original features for semantic parsing on unlabeled data. By feeding original features extracted on these data to our DBN model, we learn the LFR for semantic dependency parsing. Using the LFR, we can train the semantic parsing model LatSemModel.

5 Experiments

5.1 Experiment Setup

5.1.1 Experiment Data

We use the English data in the CoNLL 2009 shared task for experiments. The training data and in-domain test data are from the WSJ corpus, whereas the out-of-domain test data is from the Brown corpus. We also use unlabeled data consisting of the following sections of the Brown corpus: K, L, M, N, P. The test data are excerpts from fiction. The unlabeled data are also excerpts from fiction or stories, which are similar to the test data. Although the unlabeled data is actually annotated in Release 3 of the Penn Treebank, we do not use any information contained in the annotation, only the raw texts. The training, test and unlabeled data contain 39279, 425, and 16407 sentences respectively.

5.1.2 Settings of Our DBN Model

For the syntactic parsing task, there are 748,598 original features in total. We use 7,486 hidden variables in the first layer and 3,743 hidden variables in the second layer. For semantic parsing, there are 1,074,786 original features. We use 10,748 hidden variables in the first layer and 5,374 hidden variables in the second layer.

In our DBN models, we need to determine the number of groups $k$. Because larger $k$ means less computational cost, $k$ should not be set too small. We empirically set $k$ as follows: according to our experience, each group should contain about 5000 original features. We have about $10^6$ original features in our tasks. So we estimate $k \approx 10^6 / 5000 = 200$. And we set $k$ to be 200 in the DBN models for both syntactic and semantic parsing. As for the splitting strategy, we use the more sophisticated one in subsection 3.3.1 because it should generate better results than the simple one.

5.1.3 Details of DBN Training

In greedy pretraining of the DBN, the contrastive divergence algorithm is configured as follows: the training data is divided into mini-batches, each containing 100 samples. The weights are updated with a learning rate of 0.3, momentum of 0.9, and weight decay of 0.0001. Each layer is trained for 30 passes (epochs) over the entire training data.

In fine-tuning, the backpropagation algorithm is configured as follows: the training data is divided into mini-batches, each containing 50 samples. The weights are updated with a learning rate of 0.1, momentum of 0.9, and weight decay of 0.0001. The fine-tuning is repeated for 50 epochs over the entire training data.

We use the fast computing technique in Raina et al. (2009) to learn the LFRs. Moreover, in greedy pretraining, we train the RBMs in the first layer in parallel.

5.2 Results and Discussion

We use the official evaluation measures of the CoNLL 2009 shared task, which consist of three different scores: (i) syntactic dependencies are scored using the labeled attachment score, (ii) semantic dependencies are evaluated using a labeled F1 score, and (iii) the overall task is scored with a macro average of the two previous scores. The three scores above are represented by LAS, Sem F1, and Macro F1 respectively in this paper.

Test data   System   LAS     Sem F1   Macro F1
WSJ         Ori      87.63   84.82    86.24
WSJ         Lat      87.30   84.25    85.80
Brown       Ori      79.72   71.57    75.67
Brown       Lat      82.84   78.75    80.83

Table 1: The results of our basic and adapted systems

5.2.1 Comparison with Un-adapted System

Our basic system uses the OriSynModel for syntactic parsing, and the OriSemModel for semantic parsing. Our adapted system uses the LatSynModel for syntactic parsing, and the LatSemModel for semantic parsing. The results of these two systems are shown in Table 1, in which our basic and adapted systems are denoted as Ori and Lat respectively.

From the results in Table 1, we can see that Lat performs slightly worse than Ori on in-domain WSJ test data. But on the out-of-domain Brown test data, Lat performs much better than Ori, with an improvement of about 5 points in Macro F1 score (75.67 to 80.83). This shows the effectiveness of our method for domain adaptation tasks.

5.2.2 Different Splitting Configurations

As described in subsection 5.1.2, we have empirically set the number of groups $k$ to be 200 and chosen the more sophisticated splitting strategy. In this subsection, we experiment with different splitting configurations to see their effects.

Under each splitting configuration, we learn the LFRs using our DBN models. Using the LFRs, we test our adapted systems on both in-domain and out-of-domain data. Therefore we get many test results, each corresponding to a splitting configuration. The in-domain and out-of-domain test results are reported in Table 2 and Table 3 respectively. In these two tables, 's1' and 's2' represent the simple and the more sophisticated splitting strategies in subsection 3.3.1 respectively. 'k' represents the number of groups in our DBN models. For both syntactic and semantic parsing, we use the same $k$ in their DBN models. The 'Time' column reports the training time, in hours, of our DBN models for both syntactic and semantic parsing. Please note that we only need to train our DBN models once. We report the training time in Table 2 and, for easy viewing, repeat those training times in Table 3. But this does not mean we need to train new DBN models for the out-of-domain test.

Str   k     Time (h)   LAS     Sem F1   Macro F1
s1    100   392        85.95   82.42    84.19
s1    200   261        85.76   82.14    83.95
s1    300   218        85.48   81.68    83.58
s1    400   196        84.80   80.24    82.52
s2    100   392        86.22   83.03    84.63
s2    200   261        86.10   82.89    84.50
s2    300   218        85.72   82.24    83.98
s2    400   196        84.96   81.13    83.05

Table 2: Results of different splitting configurations on in-domain WSJ development data

Str   k     Time (h)   LAS     Sem F1   Macro F1
s1    100   392        82.81   78.77    80.82
s1    200   261        82.73   78.49    80.63
s1    300   218        82.44   77.90    80.37
s1    400   196        81.83   76.72    79.31
s2    100   392        82.95   79.03    81.03
s2    200   261        82.84   78.75    80.83
s2    300   218        82.63   78.34    80.50
s2    400   196        81.97   76.98    79.51

Table 3: Results of different splitting configurations on out-of-domain Brown test data

From Tables 2 and 3 we get the following observations:

First, although the more sophisticated splitting strategy 's2' generates slightly better results than the simple strategy 's1', the difference is not significant. This means that the hierarchical structure of our DBN model can robustly capture the relationships between features. Even with the simple splitting strategy 's1', we still get quite good results.

Second, the 'Time' column in Table 2 shows that different splitting strategies with the same $k$ value have the same training time. This is reasonable because training time only depends on the number of parameters in our DBN model. And different splitting strategies do not affect the number of parameters in our DBN model.
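The claim that the parameter count depends on the number of groups and not on which features go into which group can be checked with a small counting sketch. The Python snippet below is illustrative only and is not the authors' code: the function names are hypothetical, it assumes the group sizes described in subsection 3.3.1 (every group but the last gets the floor of an even share, the last group gets the remainder), and it counts weights only, ignoring bias terms. The sizes are the syntactic parsing settings from subsection 5.1.2.

```python
def group_sizes(total, k):
    """Split `total` variables into k groups: floor(total / k) each,
    with the last group taking whatever remains."""
    base = total // k
    return [base] * (k - 1) + [total - base * (k - 1)]


def first_layer_weights(m, n1, k):
    """Weights in the split first layer: a visible variable is connected
    only to the hidden variables of its own group."""
    visible = group_sizes(m, k)
    hidden = group_sizes(n1, k)
    return sum(v * h for v, h in zip(visible, hidden))


def total_weights(m, n1, n2, k):
    """Split first layer plus the fully connected second layer."""
    return first_layer_weights(m, n1, k) + n1 * n2


if __name__ == "__main__":
    # Syntactic parsing settings reported in subsection 5.1.2.
    m, n1, n2 = 748_598, 7_486, 3_743
    for k in (1, 100, 200, 300, 400):
        first = first_layer_weights(m, n1, k)
        print(f"k={k:>3}  first layer: {first:>13,}  total: {total_weights(m, n1, n2, k):>13,}")
```

The assignment of individual features to groups never enters the computation, only the group sizes do, so two strategies with the same group count yield the same number of weights. Note also that the second-layer term does not depend on the group count, so the total shrinks more slowly than the first-layer term alone.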

Third, the number of groups $k$ affects both the training time and the final results. When $k$ increases, the training time decreases but the results degrade. As $k$ gets larger, the time reduction gets less obvious, but the degradation of results gets more obvious. When $k$ = 100, 200, 300, there is not much difference between the results. This shows that the results of our DBN model are not sensitive to the value of $k$ within a range of 100 around our initial estimate of 200. But when $k$ is further away from our estimate, e.g. $k$ = 400, the results get significantly worse.

Please note that the results in Tables 2 and 3 are not used to tune the parameter $k$ or to choose a splitting strategy in our DBN model. As mentioned in subsection 5.1.2, we have chosen $k$ = 200 and the more sophisticated splitting strategy beforehand. In this paper, we always use the results with $k$ = 200 and the 's2' strategy as our main results, even though the results with $k$ = 100 are better.

5.3 The Size of Unlabeled Target Domain Data

An interesting question for our method is how much unlabeled target domain data should be used. To empirically answer this question, we learn several LFRs by gradually adding more unlabeled data to train our DBN model. We compared the performance of these LFRs as shown in Figure 6.
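The incremental protocol can be pictured as training on growing prefixes of the unlabeled corpus. The following is a minimal sketch under stated assumptions, not the authors' procedure: the step of 3000 sentences is our guess based only on the axis ticks of Figure 6, the function name `nested_subsets` is hypothetical, and integer IDs stand in for the 16407 unlabeled sentences. Training a DBN on each subset is not shown.

```python
def nested_subsets(sentences, step):
    """Yield growing prefixes of `sentences`: the first `step` items,
    the first 2 * step items, and so on, ending with the full list."""
    for end in range(step, len(sentences) + step, step):
        yield sentences[:min(end, len(sentences))]


if __name__ == "__main__":
    # Placeholder IDs for the unlabeled target domain sentences.
    unlabeled = list(range(16407))
    for subset in nested_subsets(unlabeled, step=3000):  # step size is an assumption
        # In the experiment, one LFR would be learned per subset
        # (together with the source domain training data).
        print(len(subset))
```

Using prefixes keeps the subsets nested, so each larger run only adds data to the previous one, which makes the resulting curve easier to interpret.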
[Figure 6: Macro F1 scores on test data with respect to the size of unlabeled target domain data used in DBN training. The horizontal axis is the number of sentences in unlabeled target domain data (0 to 18000) and the vertical axis is the Macro F1 score (74 to 88). The two curves are "Target Domain Test" and "Source Domain Test".]

From Figure 6, we can see that by adding more unlabeled target domain data, our system adapts better to the target domain with only small degradation of results on the source domain. However, as more unlabeled data is used, the improvement on the target domain gradually gets smaller.

5.4 Comparison with Other Methods

In this subsection, we compare our method with several systems. These are described below.

Daume07. Daumé III (2007) proposed a simple and effective adaptation method based on augmenting the feature vector. They took each feature in the original problem and made three versions of it: a general version, a source-specific version and a target-specific version. Thus, the augmented source data contains only general and source-specific versions; the augmented target data contains general and target-specific versions. In the baseline system, we adopt the same technique for dependency and semantic parsing.

Chen. The participating system of Zhao et al. (2009), which reached the best result in the out-of-domain test of the CoNLL 2009 shared task.

Daumé III and Marcu (2006) presented and discussed several 'obvious' ways to attack the domain adaptation problem without developing new algorithms. Following their idea, we construct similar systems.

OnlySrc. The system is trained on only the data of the source domain (News).

OnlyTgt. The system is trained on only the data of the target domain (Fiction).

All. The system is trained on all data of the source domain and the target domain.

It is worth noting that training the systems of Daume07, OnlyTgt and All needs labeled data of the target domain. We utilize OnlySrc to parse the unlabeled data of the target domain to generate the labeled data.

All comparison results are shown in Table 4, in which the 'Diff' column is the difference of scores on in-domain and out-of-domain test data.

First, we compare OnlySrc, OnlyTgt and All. We can see that OnlyTgt performs very poorly both in the source domain and in the target domain. It is not hard to understand that OnlyTgt performs poorly in the source domain because of the adaptation problem. OnlyTgt also performs poorly in the target domain. We think the main reason is that OnlyTgt is trained on the auto-parsed data in which there are
ScoreSystemWSJBrownDiffLASOnlySrc87.6379.727.91OnlyTgt73.2578.305.05All87.4180.546.87Daume0787.4780.467.01Chen89.1982.386.81Ours87.3082.844.46SemF1OnlySrc84.8271.5713.25OnlyTgt73.7470.343.40All84.6872.7511.93Daume0784.5272.9011.62Chen86.1574.5811.57Ours84.2578.755.50MacroF1OnlySrc86.2475.6710.57OnlyTgt73.5074.320.82All86.0476.659.40Daume0786.0076.689.32Chen87.6978.519.18Ours85.8080.834.97Table4:Comparisonwithothermethods.manyparsingerrors.ButwenotethatAllperformsbetterthanbothOnlySrcandOnlyTgtonthetargetdomaintest,althoughitstrainingdatacontainssomeautoparseddata.Therefore,thedataofthetargetdomain,labeledorunlabeled,arepotentialinalle-viatingtheadaptationproblemofdifferentdomains.ButAlljustputstheautoparseddataofthetargetdomainintothetrainingset.Thus,itsimprovementonthetestdataofthetargetdomainislimited.Infact,howtousethedataofthetargetdomain,espe-ciallytheunlabeleddata,intheadaptationproblemisstillanopenandhottopicinNLPandmachinelearning.Second,wecompareDaume07,Allandourmethod.InDaume07,theyreportedimprovementonthetargetdomaintest.Butonepointtonoteisthatthetargetdomaindatausedintheirexperi-mentsislabeledwhileinourcasethereisonlyun-labeleddata.WecanseeDaume07havecompara-bleperformancewithAllinwhichthereisnotanyadaptationstrategybesidesaddingmoredataofthetargetdomain.Wethinkthemainreasonisthattherearemanyparsingerrorsinthedataofthetar-getdomain.ButourmethodperformsmuchbetterthanDaume07andAlleventhoughsomefaultydataarealsoutilizedinoursystem.Thissuggeststhatourmethodsuccessfullylearnsnewrobustrepresen-tationsfordifferentdomains,evenwhentherearesomenoisydata.Third,wecompareChenwithourmethod.Chenreachedthebestresultintheout-of-domaintestoftheCoNLL2009sharedtask.TheresultsinTable4showthatChen’ssystemperformsbetterthanoursonin-domaintestdata,especiallyonLASscore.Chen’ssystemusesasophisticatedgraph-basedsyn-tacticdependencyparser.Graph-basedparsersusesubstantiallymorefeatures,e.g.morethan1.3×107featuresareusedinMcDonaldetal.,(2005).LearninganLFRforthatmanyfeatureswouldtake
months of time using our DBN model. So at present we only use a transition-based parser. The better performance of Chen's system mainly comes from their sophisticated syntactic parsing method.

To reduce the sparsity of features, Chen's system uses word cluster features as in Koo et al. (2008). On out-of-domain tests, however, our system still performs much better than Chen's, especially on semantic parsing. To our knowledge, on out-of-domain tests on this data set, our system has obtained the best performance to date. More importantly, the performance difference between in-domain and out-of-domain tests is much smaller in our system. This shows that our system adapts much better to the target domain.

6 Conclusions

In this paper, we propose a DBN model to learn LFRs for syntactic and semantic parsers. These LFRs are common representations of the original features in both source and target domains. Syntactic and semantic parsers using the LFRs adapt to the target domain much better than the same parsers using the original feature representation. Our model provides a unified method that adapts both syntactic and semantic dependency parsers to a new domain. In the future, we hope to further scale up our method to adapt parsing models using substantially more features, such as graph-based syntactic dependency parsing models. We will also search for better splitting strategies for our DBN model. Finally, although our experiments are conducted on syntactic and semantic parsing, it is expected that the proposed approach can be applied to the domain adaptation of other tasks with little adaptation effort.

Acknowledgements

The research work has been partially funded by the Natural Science Foundation of China under Grant No. 61333018 and supported by the West Light Foundation of Chinese Academy of Sciences under Grant No. LHXZ201301. We thank the three anonymous reviewers and the Action Editor for their helpful comments and suggestions.

References

Yoshua Bengio. 2009. Learning Deep Architectures for AI. In Foundations and Trends in Machine Learning, 2(1):1-127.

John Blitzer, Ryan McDonald and Fernando Pereira. 2006. Domain Adaptation with structural correspondence learning. In Proceedings of ACL-2006.

Wanxiang Che, Zhenghua Li, Yuxuan Hu, Yongqiang Li, Bing Qin, Ting Liu and Sheng Li. 2008. A Cascaded Syntactic and Semantic Dependency Parsing System. In Proceedings of CoNLL-2008 shared task.

Wanxiang Che, Zhenghua Li, Yongqiang Li, Yuhang Guo, Bing Qin and Ting Liu. 2009. Multilingual Dependency-based Syntactic and Semantic Parsing. In Proceedings of CoNLL-2009 shared task.

Wenliang Chen, Youzheng Wu and Hitoshi Isahara. 2008. Learning reliable information for dependency parsing adaptation. In Proceedings of COLING-2008.

Hal Daumé III. 2007. Frustratingly Easy Domain Adaptation. In Proceedings of ACL-2007.

Hal Daumé III and Daniel Marcu. 2006. Domain Adaptation for Statistical Classifiers. In Journal of Artificial Intelligence Research, 26(2006), 101-126.

Mark Dredze, John Blitzer, Partha P. Talukdar, Kuzman Ganchev, Joao Graca and Fernando Pereira. 2007. Frustratingly Hard Domain Adaptation for Dependency Parsing. In Proceedings of EMNLP-CoNLL-2007.

Xavier Glorot, Antoine Bordes and Yoshua Bengio. 2011. Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach. In Proceedings of International Conference on Machine Learning (ICML) 2011.

Daniel Gildea and Daniel Jurafsky. 2002. Automatic labeling for semantic roles. In Computational Linguistics, 28(3):245-288.

I. Goodfellow, Q. Le, A. Saxe and A. Ng. 2009. Measuring invariances in deep networks. In Proceedings of Advances in Neural Information Processing Systems (NIPS) 2011.

Jan Hajič, Massimiliano Ciaramita, Richard Johansson, Daisuke Kawahara, Maria Antònia Martí, Lluís Màrquez, Adam Meyers, Joakim Nivre, Sebastian Padó, Jan Štěpánek, Pavel Straňák, Mihai Surdeanu, Nianwen Xue and Yi Zhang. 2009. The CoNLL-2009 Shared Task: Syntactic and Semantic Dependencies in Multiple Languages. In Proceedings of CoNLL-2009.

J. Hall, J. Nilsson, J. Nivre, G. Eryiğit, B. Megyesi, M. Nilsson, and M. Saers. 2007. Single Malt or Blended? A Study in Multilingual Parser Optimization. In Proceedings of EMNLP-CoNLL-2007.

Geoffrey Hinton. 2010. A Practical Guide to Training Restricted Boltzmann Machines. In Technical report 2010-003, Machine Learning Group, University of Toronto.

Geoffrey Hinton. 2002. Training products of experts by minimizing contrastive divergence. In Neural Computation, 14(8):1711-1800.

Geoffrey Hinton, Simon Osindero and Yee-Whye Teh. 2006. A fast learning algorithm for deep belief nets. In Neural Computation, 18(7):1527-1554.

Geoffrey Hinton and R. Salakhutdinov. 2006. Reducing the dimensionality of data with neural networks. In Science, 313(5786), 504-507.

Richard Johansson and Pierre Nugues. 2008. Dependency-based semantic role labeling of PropBank. In Proceedings of EMNLP-2008.

Terry Koo, Xavier Carreras and Michael Collins. 2008. Simple Semi-supervised Dependency Parsing. In Proceedings of ACL-HLT-2008.

Lluís Màrquez, Xavier Carreras, Kenneth C. Litkowski and Suzanne Stevenson. 2008. Semantic Role Labeling: An Introduction to the Special Issue. In Computational Linguistics, 34(2):145-159.

Ryan McDonald, Fernando Pereira, Jan Hajič, and Kiril Ribarov. 2005. Non-projective dependency parsing using spanning tree algorithms. In Proceedings of NAACL-HLT-2005.

J. Nivre, J. Hall, S. Kübler, R. McDonald, J. Nilsson, S. Riedel, and D. Yuret. 2007. The CoNLL 2007 Shared Task on Dependency Parsing. In Proceedings of CoNLL-2007.

J. Nivre, J. Hall, J. Nilsson, G. Eryiğit and S. Marinov. 2006. Labeled Pseudo-Projective Dependency Parsing with Support Vector Machines. In Proceedings of CoNLL-2006.

J. Nivre and J. Nilsson. 2005. Pseudo-projective dependency parsing. In Proceedings of ACL-2005.

Rajat Raina, Anand Madhavan, and Andrew Y. Ng. 2009. Large-scale Deep Unsupervised Learning using Graphics Processors. In Proceedings of the 26th Annual International Conference on Machine Learning (ICML), pages 152-164.

Mihai Surdeanu, Richard Johansson, Adam Meyers, Lluís Màrquez and Joakim Nivre. 2008. The CoNLL-2008 Shared Task on Joint Parsing of Syntactic and Semantic Dependencies. In Proceedings of CoNLL-2008.

Ivan Titov. 2011. Domain Adaptation by Constraining Inter-Domain Variability of Latent Feature Representation. In Proceedings of ACL-2011.

Joseph Turian, Lev Ratinov and Yoshua Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Proceedings of ACL-2010.

J. Weston, F. Ratle, and R. Collobert. 2008. Deep Learning via Semi-Supervised Embedding. In Proceedings of International Conference on Machine Learning (ICML).

Nianwen Xue and Martha Palmer. 2004. Calibrating features for semantic role labeling. In Proceedings of EMNLP-2004.

Haitong Yang and Chengqing Zong. 2014. Multi-Predicate Semantic Role Labeling. In Proceedings of EMNLP-2014.

Hai Zhao, Wenliang Chen, Chunyu Kit, Guodong Zhou. 2009. Multilingual Dependency Learning: Exploiting Rich Features for Tagging Syntactic and Semantic Dependencies. In Proceedings of CoNLL-2009 shared task.

Hai Zhao and Chunyu Kit. 2008. Parsing Syntactic and Semantic Dependencies with Two Single-Stage Maximum Entropy Models. In Proceedings of CoNLL-2008.

Tao Zhuang and Chengqing Zong. 2010a. A Minimum Error Weighting Combination Strategy for Chinese Semantic Role Labeling. In Proceedings of COLING-2010.

Tao Zhuang and Chengqing Zong. 2010b. Joint Inference for Bilingual Semantic Role Labeling. In Proceedings of EMNLP-2010.
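As a supplementary note on Table 4, the Diff column appears to be the absolute gap between the in-domain (WSJ) and out-of-domain (Brown) scores. The following minimal sketch recomputes that gap for the Macro F1 rows. The scores are copied from the table, while the dictionary layout and the function name `domain_gap` are illustrative choices and not part of the original system.

```python
# Macro F1 scores copied from Table 4 as (WSJ, Brown) pairs.
MACRO_F1 = {
    "OnlySrc": (86.24, 75.67),
    "OnlyTgt": (73.50, 74.32),
    "All": (86.04, 76.65),
    "Daume07": (86.00, 76.68),
    "Chen": (87.69, 78.51),
    "Ours": (85.80, 80.83),
}


def domain_gap(wsj, brown):
    """Absolute in-domain vs. out-of-domain difference, rounded to two decimals."""
    return round(abs(wsj - brown), 2)


# Recompute the Diff column for every system.
gaps = {name: domain_gap(*scores) for name, scores in MACRO_F1.items()}

for name, gap in gaps.items():
    print(f"{name:8s} {gap:5.2f}")
```

Recomputed from the rounded table entries, the All row gives 9.39 rather than the reported 9.40; a plausible explanation, which is an assumption here, is that the published value was derived from unrounded scores.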
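For readers unfamiliar with the Daume07 baseline compared in Table 4, the feature augmentation of Daumé III (2007) maps each feature vector into a space three times as large, holding a shared copy, a source-only copy and a target-only copy. The sketch below illustrates that mapping on plain Python lists. The function name `augment` and the toy vectors are hypothetical, and this is not the implementation used in the experiments.

```python
def augment(features, domain):
    """Map x to <x, x, 0> for source samples and <x, 0, x> for target samples."""
    zeros = [0.0] * len(features)
    if domain == "source":
        # shared copy, source-specific copy, empty target-specific block
        return features + features + zeros
    if domain == "target":
        # shared copy, empty source-specific block, target-specific copy
        return features + zeros + features
    raise ValueError(f"unknown domain: {domain!r}")


# Toy three-dimensional feature vector seen in both domains.
src = augment([1.0, 0.0, 1.0], "source")
tgt = augment([1.0, 0.0, 1.0], "target")
print(src)
print(tgt)
```

A classifier trained on the augmented vectors can learn shared weights from both domains and domain-specific corrections from each block. In the comparison above, the target-domain samples are automatically parsed rather than gold-labeled, which the discussion gives as the likely reason this baseline performs close to All.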