Transactions of the Association for Computational Linguistics, vol. 4, pp. 31–45, 2016. Action Editor: Tim Baldwin.

Submission batch: 12/2015; Revision batch: 2/2016; Published 2/2016.

© 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

A Bayesian Model of Diachronic Meaning Change

Lea Frermann and Mirella Lapata
Institute for Language, Cognition and Computation
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB
l.frermann@ed.ac.uk, mlap@inf.ed.ac.uk

Abstract

Word meanings change over time and an automated procedure for extracting this information from text would be useful for historical exploratory studies, information retrieval or question answering. We present a dynamic Bayesian model of diachronic meaning change, which infers temporal word representations as a set of senses and their prevalence. Unlike previous work, we explicitly model language change as a smooth, gradual process. We experimentally show that this modeling decision is beneficial: our model performs competitively on meaning change detection tasks whilst inducing discernible word senses and their development over time. Application of our model to the SemEval-2015 temporal classification benchmark datasets further reveals that it performs on par with highly optimized task-specific systems.

1 Introduction

Language is a dynamic system, constantly evolving and adapting to the needs of its users and their environment (Aitchison, 2001). Words in all languages naturally exhibit a range of senses whose distribution or prevalence varies according to the genre and register of the discourse as well as its historical context. As an example, consider the word cute which according to the Oxford English Dictionary (OED, Stevenson 2010) first appeared in the early 18th century and originally meant clever or keen-witted.¹ By the late 19th century cute was used in the same sense as cunning. Today it mostly refers to objects or people perceived as attractive, pretty or sweet. Another example is the word mouse which initially was only used in the rodent sense. The OED dates the computer pointing device sense of mouse to 1965. The latter sense has become particularly dominant in recent decades due to the ever-increasing use of computer technology.

¹Throughout this paper we denote words in truetype, their senses in italics, and sense-specific context words as {liza}.

The arrival of large-scale collections of historic texts (Davies, 2010) and online libraries such as the Internet Archive and Google Books have greatly facilitated computational investigations of language change. The ability to automatically detect how the meaning of words evolves over time is potentially of significant value to lexicographic and linguistic research but also to real world applications. Time-specific knowledge would presumably render word meaning representations more accurate, and benefit several downstream tasks where semantic information is crucial. Examples include information retrieval and question answering, where time-related information could increase the precision of query disambiguation and document retrieval (e.g., by returning documents with newly created senses or filtering out documents with obsolete senses).

In this paper we present a dynamic Bayesian model of diachronic meaning change. Word meaning is modeled as a set of senses, which are tracked over a sequence of contiguous time intervals. We infer temporal meaning representations, consisting of a word's senses (as a probability distribution over words) and their relative prevalence. Our model is thus able to detect that mouse had one sense until the mid-20th century (characterized by words such as {cheese, tail, rat}) and subsequently acquired a

second sense relating to computer device. Moreover, it infers subtle changes within a single sense. For instance, in the 1970s the words {cable, ball, mousepad} were typical for the computer device sense, whereas nowadays the terms {optical, laser, usb} are more typical. Contrary to previous work (Mitra et al., 2014; Mihalcea and Nastase, 2012; Gulordava and Baroni, 2011) where temporal representations are learnt in isolation, our model assumes that adjacent representations are co-dependent, thus capturing the nature of meaning change being fundamentally smooth and gradual (McMahon, 1994). This also serves as a form of smoothing: temporally neighboring representations influence each other if the available data is sparse.

Experimental evaluation shows that our model (a) induces temporal representations which reflect word senses and their development over time, (b) is able to detect meaning change between two time periods, and (c) is expressive enough to obtain useful features for identifying the time interval in which a piece of text was written. Overall, our results indicate that an explicit model of temporal dynamics is advantageous for tracking meaning change. Comparisons across evaluations and against a variety of related systems show that despite not being designed with any particular task in mind, our model performs competitively across the board.

2 Related Work

Most work on diachronic language change has focused on detecting whether and to what extent a word's meaning changed (e.g., between two epochs) without identifying word senses and how these vary over time. A variety of methods have been applied to the task ranging from the use of statistical tests in order to detect significant changes in the distribution of terms from two time periods (Popescu and Strapparava, 2013; Cook and Stevenson, 2010), to training distributional similarity models on time slices (Gulordava and Baroni, 2011; Sagi et al., 2009), and neural language models (Kim et al., 2014; Kulkarni et al., 2015). Other work (Mihalcea and Nastase, 2012) takes a supervised learning approach and predicts the time period to which a word belongs given its surrounding context.

Bayesian models have been previously developed for various tasks in lexical semantics (Brody and Lapata, 2009; Ó Séaghdha, 2010; Ritter et al., 2010) and word meaning change detection is no exception. Using techniques from non-parametric topic modeling, Lau et al. (2012) induce word senses (aka topics) for a given target word over two time periods. Novel senses are then detected based on the discrepancy between sense distributions in the two periods. Follow-up work (Cook et al., 2014; Lau et al., 2014) further explores methods for how to best measure this sense discrepancy. Rather than inferring word senses, Wijaya and Yeniterzi (2011) use a Topics-over-Time model and k-means clustering to identify the periods during which selected words move from one topic to another.

A non-Bayesian approach is put forward in Mitra et al. (2014, 2015) who adopt a graph-based framework for representing word meaning (see Tahmasebi et al. (2011) for a similar earlier proposal). In this model words correspond to nodes in a semantic network and edges are drawn between words sharing contextual features (extracted from a dependency parser). A graph is constructed for each time interval, and nodes are clustered into senses with Chinese Whispers (Biemann, 2006), a randomized graph clustering algorithm. By comparing the induced senses for each time slice and observing inter-cluster differences, their method can detect whether senses emerge or disappear.

Our work draws ideas from dynamic topic modeling (Blei and Lafferty, 2006b) where the evolution of topics is modeled via (smooth) changes in their associated distributions over the vocabulary. Although the dynamic component of our model is closely related to previous work in this area (Mimno et al., 2008), our model is specifically constructed for capturing sense rather than topic change. Our approach is conceptually similar to Lau et al. (2012). We also learn a joint sense representation for multiple time slices. However, in our case the number of time slices is not restricted to two and we explicitly model temporal dynamics. Like Mitra et al. (2014, 2015), we model how senses change over time. In our model, temporal representations are not independent, but influenced by their temporal neighbors, encouraging smooth change over time. We therefore induce a global and consistent set of temporal representations for each word. Our model is knowledge-lean (it does not make use of a parser) and language-

independent (all that is needed is a time-stamped corpus and tools for basic pre-processing). Contrary to Mitra et al. (2014, 2015), we do not treat the tasks of inferring a semantic representation for words and their senses as two separate processes.

Evaluation of models which detect meaning change is fraught with difficulties. There is no standard set of words which have undergone meaning change or benchmark corpus which represents a variety of time intervals and genres, and is thematically consistent. Previous work has generally focused on a few hand-selected words and models were evaluated qualitatively by inspecting their output, or the extent to which they can detect meaning changes from two time periods. For example, Cook et al. (2014) manually identify 13 target words which undergo meaning change in a focus corpus with respect to a reference corpus (both news text). They then assess how their models fare at learning sense differences for these targets compared to distractors which did not undergo meaning change. They also underline the importance of using thematically comparable reference and focus corpora to avoid spurious differences in word representations.

In this work we evaluate our model's ability to detect and quantify meaning change across several time intervals (not just two). Instead of relying on a few hand-selected target words, we use larger sets sampled from our learning corpus or found to undergo meaning change in a judgment elicitation study (Gulordava and Baroni, 2011). In addition, we adopt the evaluation paradigm of Mitra et al. (2014) and validate our findings against WordNet. Finally, we apply our model to the recently established SemEval-2015 diachronic text evaluation subtasks (Popescu and Strapparava, 2015). In order to present a consistent set of experiments, we use our own corpus throughout which covers a wider range of time intervals and is compiled from a variety of genres and sources and is thus thematically coherent (see Section 4 for details). Wherever possible, we compare against prior art, with the caveat that the use of a different underlying corpus unavoidably influences the obtained semantic representations.

3 A Bayesian Model of Sense Change

In this section we introduce SCAN, our dynamic Bayesian model of Sense ChANge. SCAN captures how a word's senses evolve over time (e.g., whether new senses emerge), whether some senses become more or less prevalent, as well as phenomena pertaining to individual senses such as meaning extension, shift, or modification. We assume that time is discrete, divided into contiguous intervals. Given a word, our model infers its senses for each time interval and their probability. It captures the gradual nature of meaning change explicitly, through dependencies between temporally adjacent meaning representations. Senses themselves are expressed as a probability distribution over words, which can also change over time.

3.1 Model Description

We create a SCAN model for each target word c. The input to the model is a corpus of short text snippets, each consisting of a mention of the target word c and its local context w (in our experiments this is a symmetric context window of ±5 words). Each snippet is annotated with its year of origin. The model is parametrized with regard to the number of senses k ∈ [1…K] of the target word c, and the length of time intervals ∆T which might be finely or coarsely defined (e.g., spanning a year or a decade).

We conflate all documents originating from the same time interval t ∈ [1…T] and infer a temporal representation of the target word per interval. A temporal meaning representation for time t is (a) a K-dimensional multinomial distribution over word senses φt and (b) a V-dimensional distribution over the vocabulary ψt,k for each word sense k. In addition, our model infers a precision parameter κφ, which controls the extent to which sense prevalence changes for word c over time (see Section 3.2 for details on how we model temporal dynamics).

We place individual logistic normal priors (Blei and Lafferty, 2006a) on our multinomial sense distributions φ and sense-word distributions ψk. A draw from the logistic normal distribution consists of (a) a draw of an n-dimensional random vector x from the multivariate normal distribution parametrized by an n-dimensional mean vector µ and an n×n variance-covariance matrix Σ, x ∼ N(x|µ, Σ); and (b) a mapping of the drawn parameters to the simplex through the logistic transformation φn = exp(xn)/Σn′ exp(xn′), which ensures a draw of valid multinomial parameters. The normal distributions are parametrized to encourage smooth

[Figure 1: Left: plate diagram for the dynamic sense model for three time steps {t−1, t, t+1}. Constant parameters are shown as dashed nodes, latent variables as clear nodes, and observed variables as gray nodes. Right: the corresponding generative story:

    Draw κφ ∼ Gamma(a, b)
    for time interval t = 1..T do
        Draw sense distribution φt | φ−t, κφ ∼ N(½(φt−1 + φt+1), κφ)
        for sense k = 1..K do
            Draw word distribution ψt,k | ψ−t, κψ ∼ N(½(ψt−1,k + ψt+1,k), κψ)
        for document d = 1..D do
            Draw sense zd ∼ Mult(φt)
            for context position i = 1..I do
                Draw word wd,i ∼ Mult(ψt,zd)]

change in multinomial parameters over time (see Section 3.2 for details), and the extent of change is controlled through a precision parameter κ. We learn the value of κφ during inference, which allows us to model the extent of temporal change in sense prevalence individually for each target word. We draw κφ from a conjugate Gamma prior. We do not infer the sense-word precision parameter κψ on all ψk. Instead, we fix it at a high value, triggering little variation of word distributions within senses. This leads to senses being thematically coherent over time.

We now describe the generative story of our model, which is depicted in Figure 1 (right), alongside its plate diagram representation (left). First, we draw the sense precision parameter κφ from a Gamma prior. For each time interval t we draw (a) a multinomial distribution over senses φt from a logistic normal prior; and (b) a multinomial distribution over the vocabulary ψt,k for each sense k, from another logistic normal prior. Next, we generate time-specific text snippets. For each snippet d, we first observe the time interval t, and draw a sense zd from Mult(φt). Finally, we generate I context words wd,i independently from Mult(ψt,zd).

3.2 Background on iGMRFs

Let φ = {φ1 … φT} denote a T-dimensional random vector, where each φt might for example correspond to a sense probability at time t. We define a prior which encourages smooth change of parameters at neighboring times, in terms of a first order random walk on the line (graphically shown in Figure 2, and the chains of φ and ψ in Figure 1 (left)). Specifically, we define this prior as an intrinsic Gaussian Markov Random Field (iGMRF; Rue and Held 2005), which allows us to model the change of adjacent parameters as drawn from a normal distribution, e.g.:

    ∆φt ∼ N(0, κ⁻¹).    (1)

The iGMRF is defined with respect to the graph in Figure 2; it is sparsely connected with only first-order dependencies which allows for efficient inference. A second feature, which makes iGMRFs popular as priors in Bayesian modeling, is the fact that they can be defined purely in terms of the local changes between dependent (i.e., adjacent) variables, without the need to specify an overall mean of the model. The full conditionals explicitly capture these intuitions:

    φt | φ−t, κ ∼ N(½(φt−1 + φt+1), (2κ)⁻¹),    (2)

for 1 < t < T.

[Figure 2: A linear chain iGMRF.]

The precision parameter κ determines the extent of change, in our case: how tightly coupled are temporally adjacent meaning representations of a word c? We estimate the precision parameter κφ during inference. This allows us to flexibly capture sense variation over time individually for each target word. For a detailed introduction to (i)GMRFs we refer the interested reader to Rue and Held (2005). For an application of iGMRFs to topic models see Mimno et al. (2008).

3.3 Inference

We use a blocked Gibbs sampler for approximate inference. The logistic normal prior is not conjugate to the multinomial distribution. This means that the straightforward parameter updates known for sampling standard Dirichlet-multinomial topic models do not apply. However, sampling-based methods for logistic normal topic models have been proposed in the literature (Mimno et al., 2008; Chen et al., 2013). At each iteration, we sample: (a) document-sense assignments, (b) multinomial parameters from the logistic normal prior, and (c) the sense precision parameter from a Gamma prior. Our blocked sampler first iterates over all input text snippets d with context w, and re-samples their sense assignments under the current model parameters {φ}T and {ψ}K×T:

    p(zd | w, t, φ, ψ) ∝ p(zd | t) p(w | t, zd) = φt,zd ∏w∈w ψt,zd,w    (3)

Next, we re-sample parameters {φ}T and {ψ}K×T from the logistic normal prior, given the current sense assignments. We use the auxiliary variable method proposed in Mimno et al. (2008) (see also Groenewald and Mokgatlhe (2005)). Intuitively, each individual parameter (e.g., sense k's prevalence at time t, φt,k) is 'shifted' within a weighted region which is bounded by the number of times sense k was observed at time t. The weights of the region are determined by the prior, in our case the normal distributions defined by the iGMRF, which ensure an influence of temporal neighbors φt−1,k and φt+1,k on the new parameter value φt,k, and smooth temporal variation as desired. The same procedure applies to each word parameter under each {time, sense} pair ψt,k,w (see Mimno et al. 2008 for a more detailed description of the sampler). Finally, we periodically re-sample the sense precision parameter κφ from its conjugate Gamma prior.

Table 1: Size and coverage of our three training corpora (after pre-processing).

    Corpus    | Years covered | # words
    COHA      | 1810–2009     | 142,587,656
    DTE       | 1700–2010     | 124,771
    CLMET3.0  | 1710–1810     | 4,531,505

4 The DATE Corpus

Before presenting our evaluation we describe the corpus used as a basis for the experiments performed in this work. We applied our model to a DiAchronic TExt corpus (DATE) which collates documents spanning the years 1700–2010 from three sources: (a) the COHA corpus² (Davies, 2010), a large collection of texts from various genres covering the years 1810–2010; (b) the training data provided by the DTE task³ organizers (see Section 8); and (c) the portion of the CLMET3.0⁴ corpus (Diller et al., 2011) corresponding to the period 1710–1810 (which is not covered by the COHA corpus and thus underrepresented in our training data). CLMET3.0 contains texts representative of a range of genres including narrative fiction, drama, and letters, and was collected from various online archives. Table 1 provides details on the size of our corpus.

²http://corpus.byu.edu/coha/
³http://alt.qcri.org/semeval2015/task7/index.php?id=data-and-tools
⁴http://www.kuleuven.be/˜u0044428/clmet3_0.htm

Documents were clustered by their year of publication as indicated in the original corpora. In the CLMET3.0 corpus, occasionally a range of years would be provided. In this case we used the final year of the range. We tokenized, lemmatized, and part-of-speech tagged DATE using the NLTK (Bird et al., 2009). We removed stop words and function words. After preprocessing, we extracted target
word-specific input corpora for our models. These consisted of mentions of a target c and its surrounding context, a symmetric window of ±5 words.

5 Experiment 1: Temporal Dynamics

As discussed earlier, our model departs from previous approaches (e.g., Mitra et al. 2014) in that it learns globally consistent temporal representations for each word. In order to assess whether temporal dependencies are indeed beneficial, we implemented a stripped-down version of our model (SCAN-NOT) which does not have any temporal dependencies between individual time steps (i.e., without the chain iGMRF priors). Word meaning is still represented as senses and sense prevalence is modeled as a distribution over senses for each time interval. However, time intervals are now independent. Inference works as described in Section 3.3, without having to learn the κ precision parameters.

Models and Parameters  We compared the two models in terms of their predictive power. We split the DATE corpus into a training period {d1, …, dt} of time slices 1 through t and computed the likelihood p(dt+1 | φt, ψt) of the data at test time slice t+1, under the parameters inferred for the previous time slice. The time slice size was set to ∆T = 20 years. We set the number of senses to K = 8 and the word precision parameter κψ = 10, a high value which enforces individual senses to remain thematically consistent across time. We set the initial sense precision parameter κφ = 4, and the Gamma parameters a = 7 and b = 3. These parameters were optimized once on the development data used for the task-based evaluation discussed in Section 8. Unless otherwise specified, all experiments use these values. No parameters were tuned on the test set for any task. In all experiments we ran the Gibbs sampler for 1,000 iterations, and resampled κφ after every 50 iterations, starting from iteration 150. We used the final state of the sampler throughout. We randomly selected 50 mid-frequency target concepts from a larger set of target concepts described in Section 8. Predictive log likelihood scores were averaged across concepts and were calculated as the average under 10 parameter samples {φt, ψt} from the trained models.

[Figure 3: Predictive log likelihood of SCAN and a version without temporal dependencies (SCAN-NOT) across various test time periods.]

Results  Figure 3 displays predictive log likelihood scores for four test time intervals. SCAN outperforms its stripped-down version throughout (higher is better). Since the representations learnt by SCAN are influenced (or smoothed) by neighboring representations, they overfit specific time intervals less, which leads to better predictive performance. Figure 4 further shows how SCAN models meaning change for the words band, power, transport and bank. The sense distributions over time are shown as a sequence of stacked histograms; senses themselves are color-coded (and enumerated) below, in the same order as in the histograms. Each sense k is illustrated as the 10 words w assigned the highest posterior probability, marginalizing over the time-specific representations p(w|k) = Σt ψt,k,w. Words representative of prevalent senses are highlighted in boldface.

Figure 4 (top left) demonstrates that the model is able to capture various senses of the word band, such as strip used for binding (yellow bars/number 3 in the figure) or musical band (grey/1, orange/7). Our model predicts an increase in prevalence over the modeled time period for both senses. This is corroborated by the OED which provides the majority of references for the binding strip sense for the 20th century and dates the musical band sense to 1812. In addition a social band sense (violet/6, dark green/8; in the sense of bonding) emerges, which is present across time slices. The sense colored brown/2 refers to the British Band, a group of native Americans involved in the Black Hawk War in 1832, and the model indeed indicates a prevalence of this sense around this time (see bars 1800–1840 in the figure). For the word power (Figure 4 (top right)),
[Figure 4: Tracking meaning change for the words band, power, transport and bank over 20-year time intervals between 1700 and 2010. Each bar shows the proportion of each sense (color-coded) and is labeled with the start year of the respective time interval. Senses are shown as the 10 most probable words, and particularly representative words are highlighted for illustration.]

[Figure 5: Sense-internal temporal dynamics for the energy sense of the word power (violet/6 in Figure 4). Columns show the ten most highly associated words for each time interval for the period between 1700 and 2010 (ordered by decreasing probability). We highlight how four terms characteristic of the sense develop over time (see {water, steam, plant, nuclear} in the figure).]

three senses emerge: the institutional power sense (colors gray/1, brown/2, pink/5, orange/7 in the figure), mental power (yellow/3, light green/4, dark green/8), and power as supply of energy (violet/6). The latter is an example of a "sense birth" (Mitra et al., 2014): the sense was hardly present before the mid-19th century. This is corroborated by the OED which dates the sense to 1889, whereas the OED contains
references to the remaining senses for the whole modeled time period, as predicted by our model.

[Figure 6: Precision results for the SCAN and SCAN-NOT models on the WordNet-based novel sense detection (Experiment 2). Results are shown for a selection of reference times (t1) and focus times (t2).]

Similar trends of meaning change emerge for transport (Figure 4, bottom left). The bottom right plot shows the sense development for the word bank. Although the well-known senses river bank (brown/2, light green/4) and monetary institution (rest) emerge clearly, the overall sense pattern appears comparatively stable across intervals, indicating that the meaning of the word has not changed much over time.

Besides tracking sense prevalence over time, our model can also detect changes within individual senses. Because we are interested in tracking semantically stable senses, we fixed the precision parameter κψ to a high value, to discourage too much variance within each sense. Figure 5 illustrates how the energy sense of the word power (violet/6 in Figure 4) has changed over time. Characteristic terms for a given sense are highlighted in boldface. For example, the term "water" is initially prevalent, while the term "steam" rises in prevalence towards the middle of the modeled period, and is superseded by the terms "plant" and "nuclear" towards the end.

6 Experiment 2: Novel Sense Detection

In this section and the next we will explicitly evaluate the temporal representations (i.e., probability distributions) induced by our model, and discuss its performance in the context of previous work. Large-scale evaluation of meaning change is notoriously difficult, and many evaluations are based on limited hand-annotated gold standard datasets. Mitra et al. (2015), however, bypass this issue by evaluating the output of their system against WordNet (Fellbaum, 1998). Here, we consider their automatic evaluation of sense-births, i.e., the emergence of novel senses. We assume that novel senses are detected at a focus time t2 whilst being compared to a reference time t1. WordNet is used to confirm that the proposed novel sense is indeed distinct from all other induced senses for a given word.

Method  Mitra et al.'s (2015) evaluation method presupposes a system which is able to detect senses for a set of target words and identify which ones are novel. Our model does not automatically yield novelty scores for the induced senses. However, Cook et al. (2014) propose several ways to perform this task post-hoc. We use their relevance score, which is based on the intuition that keywords (or collocations) which characterize the difference of a focus corpus from a reference corpus are indicative of word sense novelty.

We identify keywords for a focus corpus with respect to a reference corpus using Kilgarriff's (2009) method which is based on smoothed relative frequencies.⁵ The novelty of an induced sense s can then be defined in terms of the aggregate keyword probabilities given that sense (and focus time of interest):

    rel(s) = Σw∈W p(w|s, t2),    (4)

where W is a keyword list and t2 the focus time. Cook et al. (2014) suggest a straightforward extrapolation from sense novelty to word novelty:

    rel(c) = maxs rel(s),    (5)

⁵We set the smoothing parameter to n = 10, and like Cook et al. (2014) retrieve the top 1000 keywords.
[Table 2: Example target terms (left) with novel senses (right) as identified by SCAN in focus corpus t2 (when compared against reference corpus t1). Top: terms used in the novel sense detection study (Experiment 2). Bottom: terms from the Gulordava and Baroni (2011) gold standard (Experiment 3).]

where rel(c) is the highest novelty score assigned to any of the target word's senses. A high rel(c) score suggests that a word has undergone meaning change.

We obtained candidate terms and their associated novel senses from the DATE corpus, using the relevance metric described above. The novel senses from the focus period, and all senses induced for the reference period except for the one corresponding to the novel sense, were passed on to Mitra et al.'s (2015) WordNet-based evaluator which proceeds as follows. Firstly, each induced sense s is mapped to the WordNet synset u with the maximum overlap:

    synset(s) = argmaxu overlap(s, u).    (6)

Next, a predicted novel sense n is deemed truly novel if its mapped synset is distinct from any synset mapped to a different induced sense:

    ∀s′: synset(s′) ≠ synset(n).    (7)

Finally, overall precision is calculated as the fraction of sense-births confirmed by WordNet over all birth-candidates proposed by the model. Like Mitra et al. (2015) we only report results on target words for which all induced senses could be successfully mapped to a synset.

Models and Parameters  We obtained the broad set of target words used for the task-based evaluation (in Section 8) and trained models on the DATE corpus. We set the number of senses to K = 4, following Mitra et al. (2015) who note that the WordNet mapper works best for words with a small number of senses, and the time intervals to ∆T = 20 as in Experiment 1. We identified the 200 words⁶ with the highest novelty score (Equation (5)) as sense birth candidates. We compared the performance of the full SCAN model against SCAN-NOT, which learns senses independently for time intervals. We trained both models on the same data with identical parameters. For SCAN-NOT, we must post-hoc identify corresponding senses across time intervals. We used the Jensen-Shannon divergence between the reference- and focus-time specific word distributions, JS(p(w|s,t1) || p(w|s,t2)), and assigned each focus-time sense to the sense with smallest divergence at reference time.

Results  Figure 6 shows the performance of our models on the task of sense birth detection. SCAN performs better than SCAN-NOT, underscoring the importance of joint modeling of senses across time slices and incorporation of temporal dynamics. Our accuracy scores are in the same ballpark as Mitra et al. (2014, 2015). Note, however, that the scores are not directly comparable due to differences in training corpora, focus and reference times, and candidate words. Mitra et al. (2015) use the larger Google syntactic n-gram corpus, as well as richer linguistic information in terms of syntactic dependencies. We show that our model, which does not rely on syntactic annotations, performs competitively even when trained on smaller data. Table 2 (top) displays examples of words assigned the highest novelty scores for the reference period 1900–1919 and focus period 1980–1999.

⁶This threshold was tuned on one reference-focus time pair.
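To make the scoring above concrete, here is a minimal Python sketch of the relevance-based novelty measure (Equations (4) and (5)) and of a Jensen-Shannon sense alignment of the kind used for SCAN-NOT. Sense-word distributions are assumed to be plain dictionaries mapping a word to p(w|s,t); all function names and the toy distributions are illustrative, not taken from the authors' implementation.

```python
import math

def rel_sense(p_w_given_s, keywords):
    """Equation (4): probability mass a sense assigns to focus-period keywords W."""
    return sum(p_w_given_s.get(w, 0.0) for w in keywords)

def rel_word(senses, keywords):
    """Equation (5): the highest novelty score over a word's induced senses."""
    return max(rel_sense(p, keywords) for p in senses)

def js_divergence(p, q):
    """Jensen-Shannon divergence between two word distributions."""
    vocab = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in vocab}
    def kl(a):  # KL(a || m); m[w] > 0 whenever a[w] > 0
        return sum(pw * math.log(pw / m[w]) for w, pw in a.items() if pw > 0)
    return 0.5 * kl(p) + 0.5 * kl(q)

def align_senses(senses_t1, senses_t2):
    """Assign each focus-time sense to the closest reference-time sense."""
    return [min(range(len(senses_t1)),
                key=lambda k1: js_divergence(s2, senses_t1[k1]))
            for s2 in senses_t2]

# Hypothetical sense-word distributions for "mouse" at the focus time:
senses_1990s = [{"cheese": 0.5, "tail": 0.4, "rat": 0.1},
                {"usb": 0.5, "optical": 0.3, "cheese": 0.2}]
keywords = ["usb", "optical", "laser"]  # focus-period keywords
print(rel_word(senses_1990s, keywords))  # ~0.8: the device sense scores as novel
```

Note that js_divergence compares each distribution to the mixture m rather than directly to the other distribution, which keeps the measure finite even when the two senses have disjoint vocabularies.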
7 Experiment 3: Word Meaning Change

In this experiment we evaluate whether model-induced temporal word representations capture perceived word novelty. Specifically, we adopt the evaluation framework (and dataset) introduced in Gulordava and Baroni (2011) [7] and discussed below.

Method. Gulordava and Baroni (2011) do not model word senses directly; instead they obtain distributional representations of words from the Google Books (bigram) data for two time slices, namely the 1960s (reference corpus) and the 1990s (focus corpus). To detect change in meaning, they measure cosine similarity between the vector representations of a target word in the reference and focus corpus. It is assumed that low similarity indicates significant meaning change. To evaluate the output of their system, they created a test set of 100 target words (nouns, verbs, and adjectives), and asked five annotators to rate each word with respect to its degree of meaning change between the 1960s and the 1990s. The annotators used a 4-point ordinal scale (0: no change, 1: almost no change, 2: somewhat changed, 3: changed significantly). Words were subsequently ranked according to the mean rating given by the annotators. Inter-annotator agreement on the novel sense detection task was 0.51 (pairwise Pearson correlation) and can be regarded as an upper bound on model performance.

Models and Parameters. We trained models for all words in Gulordava and Baroni's (2011) gold standard. We used the DATE subcorpus covering the years 1960 through 1999, partitioned by decade (∆T = 10). The first and last time intervals were defined as reference and focus time, respectively (t1 = 1960–1969, t2 = 1990–1999). As in Experiment 2, a novelty score was assigned to each target word (using Equation (5)). We computed Spearman's ρ rank correlations between gold standard and model rankings (Gulordava and Baroni, 2011). We trained SCAN models setting the number of senses to K = 8. We also trained SCAN-NOT models with identical parameters. We report results averaged over five independent parameter estimates. Finally, as in Gulordava and Baroni (2011), we compare against a frequency baseline which ranks words

[7] We thank Kristina Gulordava for sharing their evaluation dataset of target words and human
judgments.

by their log relative frequency in the reference and focus corpus.

system              corpus   Spearman's ρ
Gulordava (2011)    Google   0.386
SCAN                DATE     0.377
SCAN-NOT            DATE     0.255
frequency baseline  DATE     0.325

Table 3: Spearman's ρ rank correlations between system novelty rankings and the human-produced ratings. All correlations are statistically significant (p < 0.02). Results for SCAN and SCAN-NOT are averages over five trained models.

Results. The results of this evaluation are shown in Table 3. As can be seen, SCAN outperforms SCAN-NOT and the frequency baseline. For reference, we also report the correlation coefficient obtained in Gulordava and Baroni (2011), but emphasize that the scores are not directly comparable due to differences in training data: Gulordava and Baroni (2011) use the Google bigrams corpus (which is much larger compared to DATE). Table 2 (bottom) displays examples of words which achieved the highest novelty scores in this evaluation, and their associated novel senses.

8 Experiment 4: Task-based Evaluation

In the previous sections we demonstrated how SCAN captures meaning change between two periods. In this section, we assess our model on an extrinsic task which relies on meaning representations spanning several time slices. We quantitatively evaluate our model on the SemEval-2015 benchmark datasets released as part of the Diachronic Text Evaluation exercise (Popescu and Strapparava 2015; DTE). In the following we first present the DTE subtasks, and then move on to describe our training data, parameter settings, and the systems used for comparison with our model.

SemEval DTE Tasks. Diachronic text evaluation is an umbrella term used by the SemEval-2015 organizers for three subtasks that assess the performance of computational methods used to identify when a piece of text was written.
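The frequency baseline from Experiment 3 above ranks words by their log relative frequency in the reference and focus corpora. A minimal sketch, with invented counts and corpus sizes:

```python
import math

def log_rel_freq_score(word, ref_counts, ref_total, focus_counts, focus_total):
    """Score a word by the absolute log ratio of its relative frequencies
    in the focus and reference corpora; a large value signals a change
    in usage. All counts here are hypothetical."""
    p_ref = ref_counts[word] / ref_total
    p_focus = focus_counts[word] / focus_total
    return abs(math.log(p_focus / p_ref))

ref_counts = {"mouse": 50, "wheat": 40}      # toy 1960s corpus counts
focus_counts = {"mouse": 400, "wheat": 35}   # toy 1990s corpus counts

scores = {w: log_rel_freq_score(w, ref_counts, 1_000_000,
                                focus_counts, 1_000_000)
          for w in ref_counts}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # "mouse" rises sharply in frequency and ranks first
```

With these toy counts, "mouse" (whose frequency grows eightfold) outranks "wheat" (whose frequency barely moves), mirroring how the baseline flags candidates for meaning change.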
A similar problem is tackled in Chambers (2012), who labels documents with timestamps whilst focusing on explicit time expressions and their discriminatory power. The SemEval data consists of news snippets, which range between a few words and multiple sentences. A set of training snippets, as well as gold-annotated development and test datasets, are provided. DTE subtasks 1 and 2 involve temporal classification: given a news snippet and a set of non-overlapping time intervals covering the period 1700 through 2010, the system's task is to select the interval corresponding to the snippet's year of origin. Temporal intervals are consecutive and constructed such that the correct interval is centered around the actual year of origin. For both tasks temporal intervals are created at three levels of granularity (fine, medium, and coarse).

Subtask 1 involves snippets which contain an explicit cue for time of origin. The presence of a temporal cue was determined by the organizers by checking the entities' informativeness in external resources. Consider the example below:

(8) President de Gaulle favors an independent European nuclear striking force [...]

The mentions of French president de Gaulle and nuclear warfare suggest that the snippet was written after the mid-1950s, and indeed it was published in 1962. A hypothetical system would then have to decide amongst the following classes:

{1700–1702, 1703–1705, ..., 1961–1963, ..., 2012–2014}
{1699–1706, 1707–1713, ..., 1959–1965, ..., 2008–2014}
{1696–1708, 1709–1721, ..., 1956–1968, ..., 2008–2020}

The first set of classes corresponds to fine-grained intervals of 2 years, the second set to medium-grained intervals of 6 years, and the third set to coarse-grained intervals of 12 years. For the snippet in example (8), classes 1961–1963, 1959–1965, and 1956–1968 are the correct ones.

Subtask 2 involves temporal classification of snippets which lack explicit temporal cues but contain implicit ones, e.g., as indicated by lexical choice or spelling. The snippet in example (9) was published in 1891, and the spelling of to-day, which was common up to the early 20th century, is an implicit cue:

(9) The local wheat market was not quite so strong to-day as yesterday.

Analogously to subtask 1, systems must select the right temporal interval from a set of contiguous time intervals of differing granularity. For this task, which is admittedly harder, levels of temporal granularity are coarser, corresponding to 6-year, 12-year and 20-year intervals.

Participating SemEval Systems. We compared our model against three other systems which participated in the SemEval task. [8] AMBRA (Zampieri et al., 2015) adopts a learning-to-rank modeling approach and uses several stylistic, grammatical, and lexical features. IXA (Salaberri et al., 2015) uses a combination of approaches to determine the period of time in which a piece of news was written. This involves searching for specific mentions of time within the text, searching for named entities present in the text and then establishing their reference time by linking these to Wikipedia, using Google n-grams, and linguistic features indicative of language change. Finally, UCD (Szymanski and Lynch, 2015) employs SVMs for classification using a variety of informative features (e.g., POS-tag n-grams, syntactic phrases), which were optimized for the task through automatic feature selection.

Models and Parameters. We trained our model for individual words and obtained representations of their meaning for different points in time. Our set of target words consisted of all nouns which occurred in the development datasets for DTE subtasks 1 and 2, as well as all verbs which occurred at least twice in this dataset. After removing infrequent words we were left with 883 words (out of 1,116), which we used in this evaluation. Target words were not optimized with respect to the test data in any way; it is thus reasonable to expect better performance with an adjusted set of words. We set the model time interval to ∆T = 5 years and the number of senses per word to K = 8. We also evaluated SCAN-NOT, the stripped-down version of SCAN, with identical parameters. Both SCAN and SCAN-NOT predict the time of origin for a test snippet as follows. We first detect mentions of target words in the snippet. Then, for each mention c we construct a document, akin to the training documents, consisting of c and its context w, the ±5 words surrounding c.

[8] We do not report results for the system USAAR, which achieved close to 100% accuracy by searching for the test snippets on the web, without performing any temporal inference.
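The DTE interval classes above are contiguous and built so that the correct class contains the snippet's year of origin. The helper below is a hypothetical illustration, not the organizers' code; note that each "fine-grained" class such as 1700–1702 spans three calendar years even though its endpoints differ by two.

```python
def make_intervals(start, end, step):
    """Build contiguous, non-overlapping classes [y, y+step-1], as in
    the DTE fine-grained set {1700-1702, 1703-1705, ..., 2012-2014}."""
    return [(y, y + step - 1) for y in range(start, end + 1, step)]

def interval_of(year, intervals):
    """Return the class whose span contains the snippet's year of origin."""
    return next((lo, hi) for lo, hi in intervals if lo <= year <= hi)

fine = make_intervals(1700, 2014, 3)
print(interval_of(1962, fine))  # (1961, 1963), the correct class for example (8)
```

Running the same construction with step 3 reproduces both the first class (1700–1702) and the last class (2012–2014) listed for subtask 1, and places the 1962 de Gaulle snippet in its gold class 1961–1963.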
Given {c, w}, we approximate a distribution over time intervals as:

    p(c)(t | w) ∝ p(c)(w | t) × p(c)(t)   (10)

where the superscript (c) indicates parameters from the word-specific model; we marginalize over senses and assume a uniform distribution over time slices p(c)(t). Finally, we combine the word-wise predictions into a final distribution p(t) = ∏_c p(c)(t | w), and predict the time t with the highest probability.

Supervised Classification. We also apply our model in a supervised setting, i.e., by extracting features for classifier prediction. Specifically, we trained a multiclass SVM (Chang and Lin, 2011) on the training data provided by the SemEval organizers (for DTE tasks 1 and 2). For each observed word within each snippet, we added as a feature its most likely sense k given t, the true time of origin:

    argmax_k p(c)(k | t).   (11)

We also trained a multiclass SVM which uses character n-gram (n ∈ {1,2,3}) features in addition to the model features. Szymanski and Lynch (2015) identified character n-grams as the most predictive feature for temporal text classification using SVMs. Their system (UCD) achieved the best published scores in DTE subtask 2. Following their approach, we included all n-grams that were observed more than 20 times in the DTE training data.

                   Task 1                            Task 2
              2yr         6yr         12yr       6yr         12yr        20yr
              acc    p    acc    p    acc   p    acc    p    acc    p    acc    p
Baseline      .097 .010  .214 .017  .383 .046  .199 .025  .343 .047  .499 .057
SCAN-NOT      .265 .086  .435 .139  .609 .169  .259 .041  .403 .056  .567 .098
SCAN          .353 .049  .569 .112  .748 .206  .376 .053  .572 .091  .719 .135
IXA           .187 .020  .375 .041  .557 .090  .261 .037  .428 .067  .622 .098
AMBRA         .167 .037  .367 .071  .554 .074  .605 .143  .767 .143  .868 .292
UCD            –    –     –    –     –    –    .759 .463  .846 .472  .910 .542
SVMSCAN       .192 .034  .417 .097  .545 .127  .573 .331  .667 .368  .790 .428
SVMSCAN+ngram .222 .030  .467 .079  .627 .142  .747 .481  .821 .500  .897 .569

Table 4: Results on Diachronic Text Evaluation tasks 1 and 2 for a random baseline, our SCAN model, its stripped-down version without iGMRFs (SCAN-NOT), the SemEval submissions (IXA, AMBRA and UCD), and SVMs trained with SCAN features (SVMSCAN) and with additional character n-gram features (SVMSCAN+ngram). Results are shown for three levels of granularity, a strict precision measure p, and a distance-discounting measure acc.

Results. We employed two evaluation measures proposed by the DTE organizers: precision p, i.e., the percentage of times a system has predicted the correct time period, and accuracy acc, which is more lenient and penalizes system predictions proportionally to their distance from the true interval. We computed the p and acc scores for our models using the evaluation script provided by the SemEval organizers. Table 4 summarizes our results for DTE subtasks 1 and 2. We compare SCAN against a baseline which selects a time interval at random, [9] averaged over five runs. We also show results for a stripped-down version of our model without the iGMRFs (SCAN-NOT) and for the systems which participated in SemEval.

For subtask 1, the two versions of SCAN outperform all SemEval systems across the board. SCAN-NOT occasionally outperforms SCAN in the strict precision metric; however, the full SCAN model consistently achieves better accuracy scores, which are more representative since they factor in the proximity of the prediction to the true value. In subtask 2, the UCD and SVMSCAN+ngram systems perform comparably. They both use SVMs for the classification task; however, our own model employs a less expressive feature set based on SCAN and character n-grams, and does not take advantage of feature selection, which would presumably enhance performance. With the exception of AMBRA, all other participating systems used external resources (such as Wikipedia and Google n-grams); it is thus fair to assume they had access to at least as much training data as our SCAN model.

[9] We recomputed the baseline scores for subtasks 1 and 2 due to inconsistencies in the results provided by the DTE organizers.
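Equation (10) and the product over mentions can be sketched as follows. The per-mention likelihoods are invented, and accumulating log probabilities (rather than multiplying raw probabilities) is our implementation choice for numerical stability, not something specified in the paper.

```python
import math

TIMES = ["1940s", "1960s", "1990s"]

def time_posterior(likelihoods):
    """Equation (10): with a uniform prior p(t), the posterior p(t|w)
    is simply the normalized likelihood p(w|t)."""
    total = sum(likelihoods)
    return [l / total for l in likelihoods]

# Invented per-mention likelihoods p(w|t) for two target words
# detected in one test snippet.
mentions = [
    [0.1, 0.3, 0.6],   # e.g. the mention "mouse" in a computing context
    [0.2, 0.3, 0.5],   # e.g. the mention "program"
]

# p(t) = prod_c p^(c)(t|w), accumulated as a sum of log posteriors.
log_p = [0.0] * len(TIMES)
for lik in mentions:
    for i, pt in enumerate(time_posterior(lik)):
        log_p[i] += math.log(pt)

# Predict the time slice with the highest combined probability.
prediction = TIMES[max(range(len(TIMES)), key=log_p.__getitem__)]
print(prediction)  # "1990s"
```

Both toy mentions place most of their probability mass on the latest slice, so the combined distribution peaks there.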
Consequently, the gap in performance cannot solely be attributed to a difference in the size of the training data. We also observe that IXA and SCAN, given identical granularity, perform better on subtask 1, while AMBRA and our own SVM-based systems exhibit the opposite trend. The IXA system uses a combination of knowledge sources in order to determine when a piece of news was written, including explicit mentions of temporal expressions within the text, named entities, and information linked to those named entities from Wikipedia. AMBRA, on the other hand, exploits more shallow stylistic, grammatical and lexical features within the learning-to-rank paradigm. An interesting direction for future work would be to investigate which features are most appropriate for different DTE tasks. Overall, it is encouraging to see that the generic temporal word representations inferred by SCAN lead to competitively performing models on both temporal classification tasks without any explicit tuning.

9 Conclusion

In this paper we introduced SCAN, a dynamic Bayesian model of diachronic meaning change. Our model learns a coherent set of co-dependent, time-specific senses for individual words and their prevalence. Evaluation of the model's output showed that the learnt representations reflect (a) different senses of ambiguous words, (b) different kinds of meaning change (such as new senses being established), and (c) connotational changes within senses. SCAN departs from previous work in that it models temporal dynamics explicitly. We demonstrated that this feature yields more general semantic representations, as indicated by predictive log likelihood and a variety of extrinsic evaluations. We also experimentally evaluated SCAN on novel sense detection and the SemEval DTE task, where it performed on par with the best published results, without any extensive feature engineering or task-specific tuning.

We conclude by discussing limitations of our model and directions for future work. In our experiments we fix the number of senses K for all words across all time periods. Although this approach did not harm performance (even in the case of SemEval, where we handled more than 800 target concepts), it is at odds with the fact that words vary in their degree of ambiguity, and that word senses continuously appear and disappear. A non-parametric version of our model would infer an appropriate number of senses from the data, individually for each time period. Also note that in our experiments we used context as a bag of words. It would be interesting to explore more systematically how different kinds of contexts (e.g., named entities, multiword expressions, verbs vs. nouns) influence the representations the model learns. Furthermore, while SCAN captures the temporal dynamics of word senses, it cannot do so for words themselves. Put differently, the model cannot identify whether a new word is used which did not exist before, or that a word ceased to exist after a specific point in time. A model-internal way of detecting word (dis)appearance would be desirable, especially since new terms are continuously being introduced thanks to popular culture and various new media sources.

In the future, we would like to apply our model to different text genres and levels of temporal granularity. For example, we could work with Twitter data, an increasingly popular source for opinion tracking, and use our model to identify short-term changes in word meanings or connotations.

Acknowledgments

We are grateful to the anonymous reviewers whose feedback helped to substantially improve the present paper. We thank Charles Sutton and Iain Murray for helpful discussions, and acknowledge the support of EPSRC through project grant EP/I037415/1.

References

Aitchison, Jean. 2001. Language Change: Progress or Decay?. Cambridge Approaches to Linguistics. Cambridge University Press.

Biemann, Chris. 2006. Chinese Whispers - an Efficient Graph Clustering Algorithm and its Application to Natural Language Processing Problems. In Proceedings of TextGraphs: the 1st Workshop on Graph Based Methods for Natural Language Processing. New York City, NY, USA, pages 73–80.

Bird, Steven, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python. O'Reilly Media, Inc., 1st edition.

Blei, David M. and John D. Lafferty. 2006a. Correlated Topic Models. In Advances in Neural Information Processing Systems 18. Vancouver, BC, Canada, pages 147–154.

Blei, David M. and John D. Lafferty. 2006b. Dynamic Topic Models. In Proceedings of the 23rd International Conference on Machine Learning. Pittsburgh, PA, USA, pages 113–120.

Brody, Samuel and Mirella Lapata. 2009. Bayesian Word Sense Induction. In Proceedings of the 12th Conference of the European Chapter of the ACL. Athens, Greece, pages 103–111.

Chambers, Nathanael. 2012. Labeling Documents with Timestamps: Learning from their Time Expressions. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Jeju Island, Korea, pages 98–106.

Chang, Chih-Chung and Chih-Jen Lin. 2011. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2:27:1–27:27. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

Chen, Jianfei, Jun Zhu, Zi Wang, Xun Zheng, and Bo Zhang. 2013. Scalable Inference for Logistic-Normal Topic Models. In Advances in Neural Information Processing Systems. Lake Tahoe, NV, USA, pages 2445–2453.

Cook, Paul, Jey Han Lau, Diana McCarthy, and Timothy Baldwin. 2014. Novel Word-sense Identification. In Proceedings of the 25th International Conference on Computational Linguistics: Technical Papers. Dublin, Ireland, pages 1624–1635.

Cook, Paul and Suzanne Stevenson. 2010. Automatically Identifying Changes in the Semantic Orientation of Words. In Proceedings of the Seventh International Conference on Language Resources and Evaluation. Valletta, Malta, pages 28–34.

Davies, Mark. 2010. The Corpus of Historical American English: 400 million words, 1810–2009. Available online at http://corpus.byu.edu/coha/.

Diller, Hans-Jürgen, Hendrik de Smet, and Jukka Tyrkkö. 2011. A European database of descriptors of English electronic texts. The European English Messenger 19(2):29–35.

Fellbaum, Christiane. 1998. WordNet: An Electronic Lexical Database. Bradford Books.

Groenewald, Pieter C. N. and Lucky Mokgatlhe. 2005. Bayesian Computation for Logistic Regression. Computational Statistics & Data Analysis 48(4):857–868.

Gulordava, Kristina and Marco Baroni. 2011. A Distributional Similarity Approach to the Detection of Semantic Change in the Google Books Ngram Corpus. In Proceedings of the Workshop on GEometrical Models of Natural Language Semantics. Edinburgh, Scotland, pages 67–71.

Kilgarriff, Adam. 2009. Simple maths for keywords. In Proceedings of the Corpus Linguistics Conference.

Kim, Yoon, Yi-I Chiu, Kentaro Hanaki, Darshan Hegde, and Slav Petrov. 2014. Temporal Analysis of Language through Neural Language Models. In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science. Baltimore, MD, USA, pages 61–65.

Kulkarni, Vivek, Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2015. Statistically Significant Detection of Linguistic Change. In Proceedings of the 24th International Conference on World Wide Web. Geneva, Switzerland, pages 625–635.

Lau, Jey Han, Paul Cook, Diana McCarthy, Spandana Gella, and Timothy Baldwin. 2014. Learning Word Sense Distributions, Detecting Unattested Senses and Identifying Novel Senses using Topic Models. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, MD, USA, pages 259–270.

Lau, Jey Han, Paul Cook, Diana McCarthy, David Newman, and Timothy Baldwin. 2012. Word Sense Induction for Novel Sense Detection. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics. Avignon, France, pages 591–601.

McMahon, April M. S. 1994. Understanding Language Change. Cambridge University Press.

Mihalcea, Rada and Vivi Nastase. 2012. Word Epoch Disambiguation: Finding How Words Change over Time. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics. Jeju Island, Korea, pages 259–263.

Mimno, David, Hanna Wallach, and Andrew McCallum. 2008. Gibbs Sampling for Logistic Normal Topic Models with Graph-Based Priors. In NIPS Workshop on Analyzing Graphs. Vancouver, Canada.

Mitra, Sunny, Ritwik Mitra, Suman Kalyan Maity, Martin Riedl, Chris Biemann, Pawan Goyal, and Animesh Mukherjee. 2015. An automatic approach to identify word sense changes in text media across timescales. Natural Language Engineering 21:773–798.

Mitra, Sunny, Ritwik Mitra, Martin Riedl, Chris Biemann, Animesh Mukherjee, and Pawan Goyal. 2014. That's sick dude!: Automatic identification of word sense change across different timescales. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. Baltimore, MD, USA, pages 1020–1029.

Ó Séaghdha, Diarmuid. 2010. Latent Variable Models of Selectional Preference. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Uppsala, Sweden, pages 435–444.

Popescu, Octavian and Carlo Strapparava. 2013. Behind the Times: Detecting Epoch Changes using Large Corpora. In Proceedings of the Sixth International Joint Conference on Natural Language Processing. Nagoya, Japan, pages 347–355.

Popescu, Octavian and Carlo Strapparava. 2015. SemEval 2015, Task 7: Diachronic Text Evaluation. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, CO, USA, pages 869–877.

Ritter, Alan, Mausam, and Oren Etzioni. 2010. A Latent Dirichlet Allocation Method for Selectional Preferences. In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics. Uppsala, Sweden, pages 424–434.

Rue, Håvard and Leonhard Held. 2005. Gaussian Markov Random Fields: Theory and Applications. Chapman & Hall/CRC Monographs on Statistics & Applied Probability. CRC Press.

Sagi, Eyal, Stefan Kaufmann, and Brady Clark. 2009. Semantic Density Analysis: Comparing Word Meaning across Time and Phonetic Space. In Proceedings of the Workshop on Geometrical Models of Natural Language Semantics. Athens, Greece, pages 104–111.

Salaberri, Haritz, Iker Salaberri, Olatz Arregi, and Beñat Zapirain. 2015. IXAGroupEHUDiac: A Multiple Approach System towards the Diachronic Evaluation of Texts. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, CO, USA, pages 840–845.

Stevenson, Angus, editor. 2010. The Oxford English Dictionary. Oxford University Press, third edition.

Szymanski, Terrence and Gerard Lynch. 2015. UCD: Diachronic Text Classification with Character, Word, and Syntactic N-grams. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, CO, USA, pages 879–883.

Tahmasebi, Nina, Thomas Risse, and Stefan Dietze. 2011. Towards automatic language evolution tracking: A study on word sense tracking. In Proceedings of the Joint Workshop on Knowledge Evolution and Ontology Dynamics (EvoDyn 2011). Bonn, Germany.

Wijaya, Derry Tanti and Reyyan Yeniterzi. 2011. Understanding Semantic Change of Words over Centuries. In Proceedings of the 2011 International Workshop on DETecting and Exploiting Cultural diversiTy on the Social Web. Glasgow, Scotland, UK, pages 35–40.

Zampieri, Marcos, Alina Maria Ciobanu, Vlad Niculae, and Liviu P. Dinu. 2015. AMBRA: A Ranking Approach to Temporal Text Classification. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015). Denver, CO, USA, pages 851–855.