Transactions of the Association for Computational Linguistics, vol. 6, pp. 77–89, 2018. Action Editor: Patrick Pantel.

Submission batch: 6/2017; Revision batch: 10/2017; Published 2/2018.

© 2018 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Event Time Extraction with a Decision Tree of Neural Classifiers

Nils Reimers†, Nazanin Dehghani‡∗, Iryna Gurevych†
†Ubiquitous Knowledge Processing Lab (UKP) and Research Training Group AIPHES, Department of Computer Science, Technische Universität Darmstadt
‡School of Electrical and Computer Engineering, University of Tehran
www.ukp.tu-darmstadt.de

∗ During author's internship in the research training group AIPHES at UKP Lab, TU Darmstadt.

Abstract

Extracting from text the information about when an event happened is challenging. Documents do not only report on current events, but also on past events as well as on future events. Often, the relevant time information for an event is scattered across the document. In this paper we present a novel method to automatically anchor events in time. To our knowledge it is the first approach that takes temporal information from the complete document into account. We created a decision tree that applies neural network based classifiers at its nodes. We use this tree to incrementally infer, in a stepwise manner, at which time frame an event happened. We evaluate the approach on the TimeBank-EventTime Corpus (Reimers et al., 2016) achieving an accuracy of 42.0% compared to an inter-annotator agreement (IAA) of 56.7%. For events that span over a single day we observe an accuracy improvement of 33.1 points compared to the state-of-the-art CAEVO system (Chambers et al., 2014). Without retraining, we apply this model to the SemEval-2015 Task 4 on automatic timeline generation and achieve an improvement of 4.01 points F1-score compared to the state-of-the-art. Our code is publicly available.¹

¹ https://github.com/ukplab/tacl2017-event-time-extraction

1 Introduction

Knowing when an event happened is useful for a lot of use cases. Examples are in the fields of time-aware information retrieval, text summarization, automated timeline generation, and automatic knowledge base population. Many facts in a knowledge base are only true for a certain time period, for example the presidency of a person. Hence, the population of a knowledge base can highly benefit from high quality event and event time² extraction (Surdeanu, 2013).

² We will refer to the temporal information when an event happened as event time.

Inherent to events is the connection to time. Allan (2002) defines an event as "something that happens at some specific time and place". The challenges for automatic event time extraction are manifold. The temporal information in news articles which states when an event happened is, in most cases, not in the same or in neighboring sentences with the event (Reimers et al., 2016). It can be mentioned far before the event or far after the event. Even worse, for more than 60% of events, the specific day at which the event happened is not mentioned. However, from world knowledge and causal relations, the reader can infer a lot of temporal information about those events and can often infer that the event happened before or after some specific point in time.

In this paper we describe a new classifier for automatic event time extraction. We use the TimeBank-EventTime Corpus (Reimers et al., 2016) to train and evaluate our proposed architecture. In contrast to other corpora on temporal relations, the annotation of the TimeBank-EventTime Corpus does not make restrictions where, and in which form, temporal information for an event must be provided. The annotators were allowed to take the whole document into account and were asked to answer, to the best of their ability, the question at which date or time period the event happened. The event time annotation for some sample events is shown in the following:

• He was [sent]_{1980-05-26} into space on May 26,


1980. He [spent]_{beginPoint=1980-05-26, endPoint=1980-06-01} six days aboard the Salyut 6 spacecraft.
• […] two areas [expected]_{beginPoint=before 1998-02-06, endPoint=before 1998-02-06} to be hardest [hit]_{after 1998-01-01 before 1998-01-31} when the effects of the Asian crisis […].

This annotation poses several challenges for an automatic approach:
1. The number of possible labels is infinite, as date values are part of the labels.
2. Due to the diverse types of events and due to varying temporal information for events, the structure of the labels varies.
3. Temporal information from the whole document must be taken into account.
4. For 12.6% of the events, the event time label is a combination of several temporal clues. An example could be that the annotator combined the information that the person went missing on the 15th and that the person went missing in the month of August. However, nowhere in the text is the 15th of August explicitly mentioned.

The main contribution of this paper is the proposal of a novel combination of a decision tree with neural network classifiers for the nodes to solve the aforementioned challenges. To our knowledge, this is the first system that works on the complete document and can extract long-range relations between events and temporal expressions. Further, it is the first system that focuses on extracting begin and end points for events that span over multiple days.

Evaluated on the TimeBank-EventTime Corpus (Reimers et al., 2016), it achieves an accuracy of 42.0% compared to an inter-annotator agreement (IAA) of 56.7%. Compared to the state-of-the-art CAEVO system (Chambers et al., 2014), we observe a substantial improvement in accuracy of 33.7 percentage points for events that happened on a single day. For Multi-Day Events, we observe an accuracy of 24.3% using a strict metric.

We show that the proposed model generalizes well to new tasks and textual domains. We applied it without retraining to the SemEval-2015 Task 4 on automatic timeline generation. There, it achieves an improvement of 4.01 points F1-score compared to the state-of-the-art.

2 Related Work

We start with a review of common annotation schemes to capture temporal information for events in documents. Afterwards, we present related work on automatically extracting temporal information for events.

2.1 Annotation of Events and Temporal Information

One of the most widely used specifications for events and temporal expressions is TimeML (Saurí et al., 2004). It provides specifications for the annotation of events, temporal expressions, and the temporal links (TLINK). An event is defined as a term for situations that happen or occur. Temporal expressions, such as times, dates, or durations, are annotated and their temporal values are normalized using the definitions of Ferro (2002). A TLINK is the relation between two events, between an event and a temporal expression, or between two temporal expressions. TimeML defines 14 different relation types; however, most corpora which are using the TimeML specification restrict the number of relations to a smaller set.

A prominent corpus using the TimeML specifications is the TimeBank Corpus (Pustejovsky et al., 2003), which was also the basis for the three shared tasks TempEval-1 (Verhagen et al., 2007), TempEval-2 (Verhagen et al., 2010) and TempEval-3 (UzZaman et al., 2013).

A drawback of TLINKs is the quadratic growth of possible TLINKs with the number of events and temporal expressions, resulting in more than 10,000 possible TLINKs for several documents in the TimeBank Corpus. As the annotation of such a large number of TLINKs would be impractical, annotation of those is always restricted in some form. For the TimeBank Corpus, only salient TLINKs were annotated. Which links are salient is not well defined and a low agreement between annotators can be observed. The three TempEval shared tasks tried to improve the coverage and added some further temporal links for mentions in the same sentence.

More dense annotations were applied by Bramsen et al. (2006), Kolomiyets et al. (2012), Do et al. (2012) and Cassidy et al. (2014). While Bramsen et al., Kolomiyets et al., and Do et al. only annotated some temporal links, Cassidy et al. annotated all Event-Event, Event-Time, and Time-Time


pairs in the same sentence as well as in the directly succeeding sentence, leading to the densest annotation for the TimeBank Corpus. They used six different relation types: BEFORE, AFTER, INCLUDES, IS INCLUDED, SIMULTANEOUS, and VAGUE, where VAGUE encodes that the annotators were not able to make a statement on the temporal relation of the pair.

2.2 Existent Event Time Extraction Systems

Most automatic approaches use the previously introduced TLINKs to train and evaluate systems for extracting temporal information about events. For a new document, the system first extracts the temporal relations between events and temporal expressions. In a post-processing step, those TLINKs are used to retrieve the information when an event happened.

Extracting the relations is often formulated as a pair-wise classification task. Each pair of events and/or temporal expressions is examined and classified according to the available relation classes. Ensuring transitivity is a big challenge when formulating this task as a pair-wise classification task. One simple but nonetheless frequently used solution is to automatically infer all temporal relations that can be derived from transitivity. Some systems have tried to take advantage of global information to ensure transitivity using Markov logic networks or integer linear programming (Bramsen et al., 2006; Chambers and Jurafsky, 2008; Yoshikawa et al., 2009; UzZaman and Allen, 2010). However, the gains were minor.

Chambers et al. (2014) propose the CAEVO system, a sieve-based architecture that blends multiple classifiers into a precision-ranked cascade of sieves. The system was trained and evaluated on the TimeBank-Dense Corpus and created a dense TLINK annotation for all pairs of events and/or temporal expressions in the same and in adjacent sentences. The code is publicly available.³

³ http://www.usna.edu/Users/cs/nchamber/caevo/

A bottleneck of current systems is the limitation to TLINKs for pairs that are in the same or in adjacent sentences. According to Reimers et al. (2016), 28.3% of the events happen at the document creation time (DCT). For the remaining 71.7% of events, the event time must be inferred via TLINKs. However, for 58.7% of those events the most informative time expression⁴ is neither in the same nor in the previous/next sentence. In conclusion, for 42.1% of all the events in a text it would be necessary to take long-range TLINKs into account to correctly retrieve the event time. Extending existing systems to take long-range relations into account is difficult due to a lack of training and evaluation data.

⁴ The most informative temporal expression is defined as the temporal expression giving the reader the information at which date, or in which time frame, the event happened.

3 Event Time Annotation

We use the TimeBank-EventTime Corpus (Reimers et al., 2016) to evaluate our architecture for automatic event time extraction. The TimeBank-EventTime Corpus does not use the concept of TLINKs. Instead, for every event, the annotators were asked to anchor the event in time as precisely as possible.

The annotation distinguishes between events that happened on a Single Day and Multi-Day Events that span over multiple days. For Single Day Events, the annotators provide the day the event happened in the format YYYY-MM-DD. In the case the exact date is not mentioned in the document, the annotators were asked to anchor the event in time as precisely as possible using the annotation before YYYY-MM-DD and after YYYY-MM-DD. Before denotes that the event must have happened before the stated date, and after that the event must have happened after the date. A combination of before and after is possible.

For Multi-Day Events, the annotators were asked to provide the begin and the end point of the event. As for Single Day Events, they were allowed to use the before and after notation in the case the explicit begin/end point is not mentioned in the document.

The annotated corpus contains news articles and TV broadcast transcripts from various sources written mainly between January and April 1998. The shortest document has five sentences, while the longest has 63 sentences. A label distribution can be found in (Reimers et al., 2016).

4 Automatic Event Time Extraction

In this section we first present our hierarchical tree approach to automatically infer the event times in


a document. In Section 4.3 we present two baselines that we use for comparison: the first uses dense TLINKs extracted by the CAEVO system and the second baseline is a reduced version of the presented tree approach.

4.1 Event Time Extraction using Trees

We use the tree structure depicted in Figure 1 to extract the event time for a given target event. The structure was inspired by how annotators labeled the data. When annotating the text, the first decision is typically whether the event is a Single Day Event or a Multi-Day Event. In the case that it is a Single Day Event, the next question is whether the event happened at the Document Creation Time (DCT) or not. As the annotated data comes from the news domain, a large set of events (48.28% of the Single Day Events) happened at the document creation time. In the case the event did not happen at DCT, the annotator scanned the text to decide whether the date when the event happened is explicitly mentioned or not. If it is not mentioned, the annotator used the before and after notation to define the time frame when the event happened as precisely as possible. For Multi-Day Events, the process is similar to determine the begin and end point of the event.

The first classifier is a binary classifier to decide whether the event is a Single or a Multi-Day Event. In the case it is a Single Day Event, the next classifier decides the relation between the event and the Document Creation Time (DCT). In the case the event happened at DCT, the architecture stops. If the event happened before or after DCT, the next classifier is invoked, detecting which temporal expressions are relevant. For all relevant temporal expressions, it is then determined whether the event happened simultaneously with, before, or after the temporal expressions. The final step (2.4) outputs a single event time by narrowing down the information it receives from the relation to DCT (2.1) and the pool of relevant temporal expressions and relations (2.3).

For Multi-Day Events the process is similar; however, the system must return the begin and the end points. The system runs three processes in parallel: it extracts the relations to relevant time expressions for the begin point (3.1.1 and 3.1.2); it extracts the relation to DCT (3.2); and it extracts the relations to relevant time expressions for the end point (3.3.1 and 3.3.2). There are three possible relations between a Multi-Day Event and the DCT: the event started and ended before the DCT; it started and ended after the DCT; or it started before DCT and ended after DCT. This information is taken into account in steps 3.1.3 and 3.3.3 when producing single begin point and end point information for the given event.

4.2 Local Classifiers

This section describes the different local classifiers applied in our tree structure. For all except the Narrow Down classifier, we used the Convolutional Neural Network architecture (Lecun, 1989) depicted in Figure 2. The Narrow Down classifier is a simple, hand-crafted, rule-based classifier described in Section 4.2.6.

4.2.1 Neural Network Architecture

We use the same neural network architecture with slightly different configurations for the different local classifiers. The architecture is depicted in Figure 2 and is described in the following sections.

The neural network architecture is based on the design proposed by Zeng et al. (2014), which can achieve state-of-the-art performance on relation classification tasks (Zeng et al., 2014; dos Santos et al., 2015). The neural network applies a convolution over the word representations and position embeddings of the input text followed by a max-over-time pooling layer. We call the output of this layer Input Text Features. Those Input Text Features are merged with the word embedding for the event and time expression token. The merged input is fed into a hidden layer using either the hyperbolic tangent tanh(·) or a rectified linear unit (ReLU) as activation function. The choice of the activation function is a hyperparameter and was optimized on a development set. The final layer is either a single sigmoid neuron, in the case of binary classification, or a softmax layer. To avoid overfitting, we used two dropout layers (Srivastava et al., 2014), the first before the dense hidden layer and the second after the dense hidden layer. The percentages of the dropouts were set as hyperparameters.

Word Embeddings. We used the pre-trained word embeddings presented by Levy and Goldberg (2014). The embedding layer of our neural networks maps each token from the input text to its respective word embedding. Out-of-vocabulary tokens are


replaced with a special UNKNOWN token, for which the word embedding was randomly initialized.

Figure 1: Tree structure used to extract the temporal information for an event. Rectangles are local classifiers based on deep convolutional neural networks except for the Narrow Down rectangles, which are simple rule-based classifiers.

Figure 2: The neural network architecture used for the different local classifiers.

Position Embeddings. Collobert et al. (2011) propose the use of position embeddings to keep track of how close words in the input text are to certain target words. For each input text, we specify certain words as targets. For example, we specify the event and the temporal expression as target words and train the network to learn the temporal relation between those. Each word in the input text is then augmented with the relative distances. Let $pos_1, pos_2, \ldots$ be the positions of the target words in the input text. Then, a word at position $j$ is augmented with the features $j - pos_1, j - pos_2, \ldots$. These augmented position features are then mapped in the embedding layer to a randomly initialized vector. The dimension of this vector is a hyperparameter of the network.

The word embeddings and the position embeddings are concatenated to form the input for the convolutional layer. In the case of two target words, the input for the convolutional layer would be:

$emb_{output} = \{[we_{w_1}, pe_{1-pos_1}, pe_{1-pos_2}], [we_{w_2}, pe_{2-pos_1}, pe_{2-pos_2}], \ldots, [we_{w_n}, pe_{n-pos_1}, pe_{n-pos_2}]\}$

with $we_{w_j}$ the embedding of the j-th word in the


input text, $pe_{j-pos_k}$ the embedding for the distance between the j-th word and the target word $k$.

Convolutional & Max-Over-Time Layer. A challenge for the classifier is the variable length of the input text and that important information can be anywhere in the input text. To tackle this issue, we use a convolutional layer to compute a distributed vector representation of the input text. Let us define a vector $x_k$ as the concatenation of the word and position embeddings for the position $k$ as well as for $m$ positions to the left and to the right:

$x_k = ([we_{w_{k-m}}, pe_{k-m-pos_1}, pe_{k-m-pos_2}] \,\|\, \ldots \,\|\, [we_{w_k}, pe_{k-pos_1}, pe_{k-pos_2}] \,\|\, \ldots \,\|\, [we_{w_{k+m}}, pe_{k+m-pos_1}, pe_{k+m-pos_2}])$

The convolutional layer multiplies all $x_k$ by a weight matrix $W_1$ and applies the activation function component-wise. After that, a max-over-time pooling is applied, i.e., the max-function is applied component-wise. The j-th entry of the convolutional and max-over-time layer output is defined as:

$[conv_{output}]_j = \max_{1 \le k \le n} [\tanh(W_1 x_k)]_j$

Lexical Features. Previous approaches heavily rely on lexical features. For example, the CAEVO system (Chambers et al., 2014) uses, for the classification of event-time edges, the token, the lemma, the POS tag, the tense⁵, the grammatical aspect⁶ and the class of event⁷ as well as the parse tree between event and time expression. In our evaluation, we did not observe that these features have a significant impact on the performance. Hence, we decided to use the event and time tokens as the only features besides the dense vector representation of the input text. For multi-token expressions, we only use the first token.

⁵ Defined tenses: simple, perfect, and progressive
⁶ Defined aspects in TimeBank: past, present, future
⁷ Defined classes in TimeBank: occurrence, perception, reporting, aspectual, state, i_state, i_action

Our architecture focuses on extracting the event time when event annotations and temporal expressions are provided. In order to evaluate the accuracy of this isolated step, we decided to use the provided annotations in the corpus. The baselines we compared against use these gold annotations as well.

Output. The distributed vector representation of the input text and the embeddings of the event/time token are concatenated and passed through a dense layer. As the activation function, we allowed either the hyperbolic tangent or the rectified linear unit (ReLU). The choice is a parameter of the network. The final layer is either a single sigmoid neuron, in the case of binary classification, or a softmax layer to compute the probabilities of the different tags.

4.2.2 Single vs. Multi-Day Event Classification

The first local classifier, which decides whether an event is a Single Day Event or a Multi-Day Event, uses the event word as the target word.

4.2.3 DCT Classification

A Single Day Event can happen either before the document was created (Before-class), on the same day (Simultaneous-class), or it will happen at least one day after the document was created (After-class). The configuration of this local classifier is as in the previous section. Note that, to classify the relation to the DCT, in most cases it was not important to know the concrete Document Creation Time. Therefore, we did not pass the DCT as a value to the network.

For Multi-Day Events, we decided to group the events into three categories: first, events that began and ended before the Document Creation Time (Before-class); second, events that began before DCT and ended after DCT (Includes-class); and third, events that will begin and end after DCT (After-class).

4.2.4 Detecting Relevant Time Expressions

In the case the event did not happen at the DCT, it is important to take the surrounding text and potentially the whole document into account to figure out at which date the event happened. For our classifier, we assume that temporal expressions are already detected in the document. To detect temporal expressions, tools like HeidelTime⁸ can be used, which achieves an F1-score of 0.919 on extracting temporal expressions in the TimeBank Corpus (Strötgen and Gertz, 2015).

⁸ https://github.com/HeidelTime
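The position features and the convolution with max-over-time pooling from Section 4.2.1 can be illustrated with a small NumPy sketch. This is not the released implementation: the function names, the zero padding at the text borders, and the random toy dimensions are assumptions made only for this illustration, and the token vectors stand in for the already concatenated word and position embeddings.

```python
import numpy as np

def relative_positions(n_tokens, target_positions):
    """Distance features j - pos_k for every token j and every target word k."""
    return np.array([[j - p for p in target_positions] for j in range(n_tokens)])

def window_vectors(token_vectors, m):
    """Build x_k: the vectors at positions k-m..k+m, concatenated.

    Assumption: positions outside the text are filled with zero vectors.
    """
    n, d = token_vectors.shape
    pad = np.zeros((m, d))
    padded = np.vstack([pad, token_vectors, pad])
    return np.stack([padded[k:k + 2 * m + 1].reshape(-1) for k in range(n)])

def conv_max_over_time(token_vectors, W1, m):
    """[conv_output]_j = max over k of tanh(W1 x_k)_j."""
    X = window_vectors(token_vectors, m)   # shape (n, (2m+1) * d)
    H = np.tanh(X @ W1.T)                  # shape (n, n_filters)
    return H.max(axis=0)                   # shape (n_filters,), independent of n

# Toy example: 7 tokens, 5-dimensional token vectors, 4 filters, window m = 1.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(7, 5))
W1 = rng.normal(size=(4, 3 * 5))
features = conv_max_over_time(tokens, W1, m=1)
```

The pooled vector has one entry per filter regardless of the input length, which is what allows the same dense layer to be applied to input texts of any size.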


As an intermediate step to detect when an event happened, we first decide whether the temporal expression is relevant for the event or not. We define a temporal expression to be relevant if the (normalized) value of the temporal expression is part of the event time annotation. The value of the temporal expression can either be the event time, or it can appear in the before or after notation.

The classifier is executed for all event and temporal expression pairs. The input text for the distributed text representation is the text between the event and the temporal expression.

4.2.5 Temporal Relation Classification

Given the relevant temporal expression for an event from the previous step, the next local classifier establishes the temporal relation between the event and the temporal expression. For a given, relevant event-temporal expression pair, it outputs BEFORE when the event happened before the temporal expression, AFTER when it happened after, or SIMULTANEOUS when it happened on the mentioned date. This local classifier has the same configuration as the network used to detect relevant temporal expressions.

4.2.6 Narrow Down Classifier

The goal of the Narrow Down Classifier, which is used in steps 2.4, 3.1.3 and 3.3.3 in Figure 1, is to derive the final label given the information on the relevant temporal expressions, their relation to the event, and the relation to the document creation time. For most events in the corpus, this information was unambiguous, e.g., only one temporal expression was classified as relevant for the event. The proposed approach returns multiple relevant temporal expressions only for a small fraction of events. However, this number was too small to train and to validate a learning algorithm for this stage. Hence, we decided to implement a straightforward, rule-based classifier. This classifier is depicted in Algorithm 1.

It takes all relations to relevant temporal expressions as well as the relation to the Document Creation Time to derive the final output. In the case a SIMULTANEOUS relation exists, the classifier stops and the appropriate temporal expression is used as event time. If no such relation exists, a frequency distribution of the linked dates and relations is created for BEFORE as well as for AFTER relations. For example, when the system extracts three relevant BEFORE relations of different mentions of date1 throughout the text and two relevant BEFORE relations of different mentions of date2, then the system would choose date1 as a slot-filler for the before property. If there are as many relevant BEFORE relations for date1 as for date2, the system will choose the lowest date for the before property (lines 13-18). For AFTER relations, we use the same logic, except that we choose the largest date (line 23).

Algorithm 1 Narrow Down Classifier
 1: function NARROWDOWN(times)
 2:   fd_before, fd_after = FreqDistribution()
 3:   for [relation, time] in times do
 4:     if relation is SIMULTANEOUS then
 5:       return time
 6:     else if relation is BEFORE then
 7:       fd_before.new_sample(time)
 8:     else if relation is AFTER then
 9:       fd_after.new_sample(time)
10:     end if
11:   end for
12:   // fd_before elements have the fields .num = #samples and .time = time value
13:   if fd_before.size > 0 then
14:     // find the largest number of samples of a time
15:     max_samples = fd_before.max(.num)
16:     // take minimum over all times having max_samples
17:     before_time = fd_before.filter(.num == max_samples).min(.time)
18:   end if
19:   if fd_after.size > 0 then
20:     // find the largest number of samples of a time
21:     max_samples = fd_after.max(.num)
22:     // take maximum over all times having max_samples
23:     after_time = fd_after.filter(.num == max_samples).max(.time)
24:   end if
25:   return after + after_time + before + before_time
26: end function

4.3 Baseline

We use two baselines to compare our system against. As the first baseline, we use the system presented in Reimers et al. (2016). The baseline is based on the multi-pass architecture CAEVO introduced by Chambers et al. (2014) and extracts all TLINKs between event mentions and temporal expressions. The system by Chambers et al. applies multiple rules and trained classifiers to extract those TLINKs. The different stages are ranked by precision and are executed consecutively. A shortcoming of the system is that it does not produce temporal information if an event lasted for more than a day. Hence, the system cannot be used to distinguish between Single Day and Multi-Day Events, nor can it extract the begin/end points for Multi-Day Events.

Our previously presented baseline uses the extracted relations for Single Day Events and generates a set of tuples in which the event is involved. We use the narrow down classifier from Section 4.2.6 to extract the final label. When all extracted relations are of type VAGUE, the baseline returns that it cannot infer the time for the event.

The second baseline is a reduced version of the hierarchical tree. For this baseline, we first apply the classifier to decide whether it is a Single Day or Multi-Day Event. When it is a Single Day Event, we classify the relation to the document creation time (DCT) (classifier 2.1). When the event did not happen at DCT, we link it to the closest temporal expression in the document. For Multi-Day Events, we only run the classifier 3.2 to extract the relation to DCT. When the event happened before DCT, we set the begin and end point to BEFORE DCT; when it happened after DCT, we set both to AFTER DCT; and, when the relation was Includes, we set the begin point to BEFORE DCT and the end point to AFTER DCT.

5 Experimental Setup

We conduct our experiments on the TimeBank-EventTime Corpus (Reimers et al., 2016). The corpus comprises 36 documents and 1498 annotated events. We use the same split into training, development, and test set as Chambers et al. (2014), resulting in 22 documents for training, 5 documents for hyperparameter optimization, and 9 documents for the final evaluation. Using this split allows a fair comparison to the CAEVO system.

Hyperparameters for the individual local classifiers were chosen using random search (Bergstra and Bengio, 2012) with at least 1000 iterations per local classifier.

6 Experimental Results

We evaluate our system using two different metrics. The strict metric requires an exact match between the predicted label and the gold label. A disadvantage of this metric is that it does not allow partial agreement. The strict agreement between two annotators is fairly low for events where the exact date of the event was not mentioned.

In order to allow partial matches, we also use a relaxed metric, which judges two different labels as an error only if they are mutually exclusive. Two labels are mutually exclusive if there is no event date which could satisfy both labels at the same time. If the event happened on August 5th, 1998, the two annotations before 1998-08-31 and after 1998-08-01 before 1998-08-31 would both be satisfied. Therefore, these two different labels would be considered as correct. In contrast, the two annotations after 1998-02-01 and before 1997-12-31 can never be satisfied at the same time and are therefore mutually exclusive.

The score of the relaxed metric must be seen in combination with the strict metric. A system could trick the relaxed metric by returning a before date that is far in the future, which results in a high relaxed score but a negligible strict score. Future research is necessary to judge the quality of different kinds of partial matches and to design an appropriate metric.

6.1 System Performance

Following the recommendations in (Reimers and Gurevych, 2017), we train the system with 25 different random seed values, and compute the mean performance score and the standard deviation. Table 1 shows the results in comparison to the observed inter-annotator agreement (IAA). The inter-annotator agreement is based on two full annotations of the corpus. The chance-corrected agreement is α = 0.617 using Krippendorff's α (Krippendorff, 2004). The two annotations were merged into a final gold label annotation of the corpus, which we used for training and evaluation.

The accuracy to distinguish between Single Day and Multi-Day Events is 78.2% on the test set, in comparison to an inter-annotator agreement of 81.8%. The overall performance is 42.0%, compared to an IAA of 56.7% using the strict metric.

For Multi-Day Events, we observe an accuracy with the strict metric of 24.5%, compared to an IAA of 52.0%. Breaking it down to the begin- and end-point extraction, we observe a much lower accuracy for the begin point extraction of just 28.5%, compared to 66.7% accuracy for the end point extraction. However, using the relaxed metric, we see an accuracy of 94.9% for the begin point and 80.2% for the end point. We can conclude that the extraction of the begin point works well; however, in a large set of cases (66.7%) the extracted begin point is less precise than the gold annotation.

                          System          IAA
Single vs. Multi-Day      78.2% ± 1.33    81.8%
Single Day (Strict)       74.6% ± 1.04    80.5%
Single Day (Relaxed)      92.5% ± 0.60    98.0%
Multi-Day (Strict)        24.5% ± 1.61    52.0%
  Begin (Strict)          28.5% ± 0.73    63.8%
  End (Strict)            66.5% ± 1.02    74.9%
Multi-Day (Relaxed)       74.6% ± 0.55    94.6%
  Begin (Relaxed)         94.9% ± 0.38    98.6%
  End (Relaxed)           80.2% ± 0.73    96.1%
Overall Acc. (Strict)     42.0% ± 1.21    56.7%
Overall Acc. (Relaxed)    84.6% ± 0.71    95.3%

Table 1: Accuracy for the different stages of our system in comparison to the observed inter-annotator agreement (IAA). The strict metric requires an exact match between the labels. The relaxed metric requires that the two annotations are not mutually exclusive.

The baseline based on the CAEVO system from Chambers et al. (2014) can only be applied to Single Day Events, as TLINK types that define the start or the end of an event do not exist. We ran this baseline on all events that were correctly identified as Single Day Events. The performance of this baseline is depicted in Table 2. For the proposed approach we observe a performance increase from 41.2% to 74.6%. For 18.3% of the events, the retrieved label of the proposed approach was less precise than the gold label. An example of a less precise label would be before 1998-12-31 while the gold label was before 1998-08-15. A clearly wrong label was observed for 7.1% of the generated labels.

A big disadvantage of a dense TLINK annotation is the restriction of TLINKs to events and temporal expressions that are in the same, or in adjacent, sentences. For 32.0% of the events, the baseline was not able to infer any event time information. As our system outputs a label for every event, we see a slightly increased number of wrong labels in comparison to the baseline.

Single Day Events     Ours     CAEVO
Exact match           74.6%    41.2%
Less precise          18.3%    21.5%
Wrong label            7.1%     5.4%
Cannot infer time     -        32.0%

Table 2: Distribution of the retrieved labels for the proposed system and for the baseline. Less precise are labels where the time frame when the event has happened is larger than for the gold label. Wrong label are labels which are in clear contradiction to the gold standard.

Table 3 compares the proposed system against the reduced tree that only classifies the type of the event (Single Day or Multi-Day) and the relation to the document creation time. We observe a significant drop in accuracy for Single Day Events, indicating that just classifying the relation to the document creation time is insufficient for this task.

System          SD       MD       Overall
Full system     74.6%    24.3%    42.0%
Reduced tree    40.4%    19.6%    24.2%
CAEVO           41.2%    -        18.1%

Table 3: Comparison of the accuracy (strict metric) for Single Day Events (SD), Multi-Day Events (MD) and overall. Reduced tree uses only the local classifiers 1, 2.1 and 3.2.

6.2 Error Analysis

Error propagation is an important factor in a decision tree. Table 4 depicts the accuracy of the different local classifiers. We compare those to a Majority Vote baseline. For all local classifiers we can see a large performance increase over the baseline. We observe the lowest accuracy for the classifiers of the begin point (3.1.1 and 3.1.2). This is in line with the previous observation of the low accuracy for begin point labels as well as with the low IAA for begin point annotations.

The root classifier, which decides whether the event is a Single Day or a Multi-Day Event, is the most critical classifier. This classifier is responsible for 21.7% of the erroneously labeled events. However, with an accuracy of 78.3% it is already fairly close to the IAA of 81.6% and it is unclear if this classifier could be improved substantially.
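The strict and relaxed metrics introduced at the start of Section 6 can be made concrete with a short sketch for Single Day Event labels. This is an illustrative reading rather than the evaluation code used in the paper: the `Label` structure and the function names are hypothetical, and before/after are assumed to be exclusive bounds at day granularity.

```python
from datetime import date, timedelta
from typing import NamedTuple, Optional

class Label(NamedTuple):
    """Hypothetical Single Day Event label: an exact day, or after/before bounds."""
    on: Optional[date] = None
    after: Optional[date] = None
    before: Optional[date] = None

ONE_DAY = timedelta(days=1)

def feasible_range(label):
    """Earliest and latest day that satisfy the label (date.min/max mean unbounded)."""
    if label.on is not None:
        return label.on, label.on
    # Assumption: "after X" excludes X itself, and likewise for "before X".
    earliest = label.after + ONE_DAY if label.after is not None else date.min
    latest = label.before - ONE_DAY if label.before is not None else date.max
    return earliest, latest

def strict_match(predicted, gold):
    """Strict metric: the two labels must be identical."""
    return predicted == gold

def relaxed_match(predicted, gold):
    """Relaxed metric: an error only if no single day satisfies both labels."""
    lo_p, hi_p = feasible_range(predicted)
    lo_g, hi_g = feasible_range(gold)
    return max(lo_p, lo_g) <= min(hi_p, hi_g)

# The two examples from the text.
a = Label(before=date(1998, 8, 31))
b = Label(after=date(1998, 8, 1), before=date(1998, 8, 31))
c = Label(after=date(1998, 2, 1))
d = Label(before=date(1997, 12, 31))
compatible = relaxed_match(a, b)   # both can hold, e.g. for 1998-08-05
exclusive = relaxed_match(c, d)    # no day satisfies both
```

Treating each label as an interval of admissible days reduces the mutual-exclusivity test to an interval overlap check, which also shows why a very late before date always passes the relaxed metric while failing the strict one.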


Classifier              System     Majority Vote
1. Event Type           78.3%      54.5%
Single Day Event
2.1. DCT Rel.           84.2%      55.6%
2.2. Relevant           79.1%      66.0%
2.3. Relation           81.0%      72.7%
Multi-Day Event
3.1. Begin Point
  3.1.1. Relevant       79.0%      68.9%
  3.1.2. Relation       63.1%      42.9%
3.2. DCT Rel.           65.2%      46.8%
3.3. End Point
  3.3.1. Relevant       83.8%      65.1%
  3.3.2. Relation       85.1%      79.0%

Table 4: Accuracy for the different local classifiers vs. a Majority Vote baseline. Local classifiers are numbered as depicted in Figure 1.

As mentioned in the introduction, the annotators were not restricted to the dates that are explicitly mentioned in the document but could also create new dates. For example, in the sentence

It's the [second day]_date:1998-03-06 of an [offensive]_beginPoint=1998-03-05 …

it is clear for the annotator that the offensive started on 1998-03-05. However, this date is not explicitly mentioned in the text; only the date 1998-03-06 is mentioned. We call such dates out-of-document dates. Handling those cases is extremely difficult and our system is currently not capable of creating such out-of-document dates. Table 5 depicts the error rate introduced by those dates.

Out-of-document dates
Single Day Events      3.0%
Multi-Day Events      24.1%
  Begin Point         17.0%
  End Point            9.9%
Overall               12.6%

Table 5: Percentage of labels in the test set affected by out-of-document dates.

As the table depicts, 12.6% of the event time labels are affected by out-of-document dates. An especially high percentage of such dates is observed for the begin point of Multi-Day Events. In a lot of these cases the document states either an explicit or a rough estimate of the duration of the event. In the previous example, the text stated that the offensive had already lasted for two days. In another example, the document gives the information that the event started in recent years or that it lasted for roughly 2 1/2 years.

6.3 Ablation Test

Table 6 presents the changes in accuracy in percentage points when individual components of the proposed system are changed. We observe a slight drop of 2.3 percentage points if bidirectional LSTM networks with 100 recurrent units are used instead of Convolutional Neural Networks. LSTM networks have shown state-of-the-art performance on other NLP tasks; however, for this task they were not able to improve the performance. One reason could be the comparably small training set of 22 documents. A further disadvantage of the BiLSTM networks was the significantly longer training time, which prohibited extensive hyperparameter tuning.

Configuration              Accuracy
Full system                42.0%
BiLSTM instead of CNN      -2.3
Rnd. word embeddings       -7.7
No input text feature      -9.7
No position feature        -3.9
No narrow down             -1.3

Table 6: Change in accuracy (strict metric) in percentage points when replacing individual components of the architecture.

An important factor for the performance was the pre-trained word embeddings. Replacing those with randomly initialized embeddings decreased the performance by 7.7 percentage points. As before, we think this is due to the small training size. A large number of test tokens do not appear in the training set and several tokens only appear infrequently in the training set. Hence, the network is not able to learn meaningful representations for those words.

Our system successfully uses the text between the event and the temporal expression (Input Text Features) for classifying the relation between those. Removing this part of the architecture decreases the accuracy by 9.7 percentage points. Further, it appears that not only the token itself, but also the position of the token relative to the event / time token is taken


into account. Removing this position information from the input text feature reduces the performance by 3.9 percentage points.

Replacing the narrow down classifier with a classifier that randomly selects one of the relevant temporal expressions reduces the performance by only 1.3 percentage points. For most events, there was only one relevant temporal expression extracted. We analyzed the parameter settings for the top five performing local classifiers for each stage. The activation function (tanh and ReLU) appears to have a negligible impact on the performance.

6.4 Event Timeline Construction

We evaluated our system on the shared task SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering (Minard et al., 2015). The goal is to construct an event timeline for a target entity given a set of 30 documents from Wikinews on certain topics. We use the setting of Track B, where the events are provided. We used HeidelTime to detect and normalize time expressions. We then ran our system out of the box, i.e., without retraining for the new dataset.

For the shared task, an event can occur either on a specific day, in a specific month, or in a specific year. Events that cannot be anchored in time are removed from the evaluation. We implemented simple rules that transform our system output to the format of the shared task: if an event is simultaneous with a specific time expression, we output this date. If our system returns that the event happened after one date and before another, we output the year and month if both dates are in the same month. If both dates are in the same year but in different months, we output the year. Events with predicted time spans of more than one year are rejected. For Multi-Day Events, we only use the begin point, as only this information was annotated for this shared task.

Two teams participated in the shared task (GPLSIUA and HeidelToul). Currently, the best published performance was achieved by Cornegruta and Vlachos (2016) with an F1-score of 28.58. Our system was able to improve the total F1-score by 4.01 points as depicted in Table 7.

System           Airbus     GM        Stock     Total
Our approach     30.37      28.83     38.01     32.59
Cornegruta       25.65      26.64     32.35     28.58
GPLSIUA_1        22.35      19.28     33.59     25.36
HeidelToul_2     16.50      10.94     25.89     18.34

Table 7: Performance of our system on the SemEval-2015 Task 4 Track B for the topics Airbus, General Motors, and stock market.

A challenge for our system is the different anchoring of events in time: while our system can anchor events at two arbitrary dates, the SemEval-2015 Task 4 only anchors events either at a specific day, month or year. When our system returns the event time value after 2010-10-01 and before 2010-11-30, we had to decide how to anchor this event for the generated timeline. For such an event, three final labels would be plausible: 2010-10-xx, 2010-11-xx, and 2010-xx-xx. A similar challenge occurs for events that received a label like before 2010-11-30. If we anchor it in 2010-11-xx, we must be certain that the event happened in November. Similarly, if we anchor it in 2010-xx-xx, we must be certain that the event happened in 2010. Such information cannot be inferred directly from the returned label of our system. As only 30 documents on a single topic were provided for training, we could not tune the transformation accordingly. A manual analysis revealed that this transformation caused around 15% of the errors.

7 Conclusion

Event Time Extraction is a challenging classification task as the set of labels is infinite and the label depends on information that is scattered across the document. The presented classifier is able to take the whole document into account and to infer the date when an event happened. We applied the system to the TimeBank-EventTime Corpus and achieved an accuracy of 42.0% in comparison to an inter-annotator agreement of 56.7% using a strict metric. For 74.6% of the Single Day events, the exact event time could be extracted. This is a 33.1 percentage point improvement in comparison to the state-of-the-art approach by Chambers et al. (2014). We demonstrated the generalizability by applying it to the SemEval-2015 Task 4 on timeline generation, where it improved the F1-score by 4.01 percentage points compared to the state-of-the-art.
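As an illustration of the output transformation described in Section 6.4, the rules can be sketched as a small function. The function name `to_timeline_anchor`, its parameters, and the `xx` placeholder formatting are assumptions made for this sketch and are not taken from the released code. Two details are interpretations rather than statements from the text: a label with only one bound (for example before 2010-11-30) is rejected here, and "more than one year" is approximated as "the two dates fall in different years".

```python
from datetime import date
from typing import Optional


def to_timeline_anchor(simultaneous: Optional[date] = None,
                       after: Optional[date] = None,
                       before: Optional[date] = None) -> Optional[str]:
    """Map a predicted event time to a day, month, or year anchor.

    Returns None when the event is rejected, i.e. not placed on the timeline.
    For Multi-Day Events, the begin point label would be passed in.
    """
    # Rule 1: simultaneous with a specific time expression -> output that date.
    if simultaneous is not None:
        return simultaneous.isoformat()

    # Assumption of this sketch: labels with a single bound are rejected,
    # since neither month nor year can be inferred with certainty.
    if after is None or before is None:
        return None

    # Rule 4 (approximated): time spans reaching across years are rejected.
    if after.year != before.year:
        return None

    # Rule 2: both dates in the same month -> output year and month.
    if after.month == before.month:
        return f"{after.year:04d}-{after.month:02d}-xx"

    # Rule 3: same year, different months -> output the year only.
    return f"{after.year:04d}-xx-xx"


# Example from the text: after 2010-10-01 and before 2010-11-30.
print(to_timeline_anchor(after=date(2010, 10, 1), before=date(2010, 11, 30)))  # 2010-xx-xx
```

The example shows why the transformation loses information: the rule falls back to the year even though 2010-10-xx and 2010-11-xx would also be plausible anchors.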


Acknowledgements

This work has been supported by the German Research Foundation as part of the Research Training Group Adaptive Preparation of Information from Heterogeneous Sources (AIPHES) under grant No. GRK 1994/1. We would like to thank the TACL editors and reviewers for their effort and the valuable feedback we received from them.

References

James Allan. 2002. Topic Detection and Tracking: Event-based Information Organization. pages 1–16. Kluwer Academic Publishers, Norwell, MA, USA.

James Bergstra and Yoshua Bengio. 2012. Random Search for Hyper-parameter Optimization. J. Mach. Learn. Res., 13:281–305, February.

Philip Bramsen, Pawan Deshpande, Yoong Keok Lee, and Regina Barzilay. 2006. Inducing Temporal Graphs. In Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing, EMNLP '06, pages 189–198, Stroudsburg, PA, USA. Association for Computational Linguistics.

Taylor Cassidy, Bill McDowell, Nathanael Chambers, and Steven Bethard. 2014. An Annotation Framework for Dense Event Ordering. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 501–506, Baltimore, Maryland, USA. Association for Computational Linguistics.

Nathanael Chambers and Dan Jurafsky. 2008. Jointly combining implicit constraints improves temporal ordering. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP '08, pages 698–706, Stroudsburg, PA, USA. Association for Computational Linguistics.

Nathanael Chambers, Taylor Cassidy, Bill McDowell, and Steven Bethard. 2014. Dense Event Ordering with a Multi-Pass Architecture. Transactions of the Association for Computational Linguistics, 2:273–284.

Ronan Collobert, Jason Weston, Léon Bottou, Michael Karlen, Koray Kavukcuoglu, and Pavel Kuksa. 2011. Natural language processing (almost) from scratch. J. Mach. Learn. Res., 12:2493–2537, November.

Savelie Cornegruta and Andreas Vlachos. 2016. Timeline extraction using distant supervision and joint inference. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, Austin, Texas, USA, November 1-4, 2016, pages 1936–1942.

Quang Xuan Do, Wei Lu, and Dan Roth. 2012. Joint Inference for Event Timeline Construction. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL '12, pages 677–687, Stroudsburg, PA, USA. Association for Computational Linguistics.

Cícero Nogueira dos Santos, Bing Xiang, and Bowen Zhou. 2015. Classifying Relations by Ranking with Convolutional Neural Networks. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, pages 626–634.

Lisa Ferro. 2002. TIDES. Instruction Manual for the Annotation of Temporal Expressions. Technical report, MITRE TECHNICAL REPORT.

Oleksandr Kolomiyets, Steven Bethard, and Marie-Francine Moens. 2012. Extracting Narrative Timelines As Temporal Dependency Structures. In Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1, ACL '12, pages 88–97, Stroudsburg, PA, USA. Association for Computational Linguistics.

Klaus Krippendorff. 2004. Content Analysis: An Introduction to Its Methodology (second edition). Sage Publications.

Yann Lecun, 1989. Generalization and network design strategies. Elsevier.

Omer Levy and Yoav Goldberg. 2014. Dependency-Based Word Embeddings. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, ACL 2014, June 22-27, 2014, Baltimore, MD, USA, Volume 2: Short Papers, pages 302–308.

Anne-Lyse Minard, Manuela Speranza, Eneko Agirre, Itziar Aldabe, Marieke van Erp, Bernardo Magnini, German Rigau, and Ruben Urizar. 2015. SemEval-2015 Task 4: TimeLine: Cross-Document Event Ordering. In Proceedings of the 9th International Workshop on Semantic Evaluation, SemEval@NAACL-HLT 2015, Denver, Colorado, USA, June 4-5, 2015, pages 778–786.

James Pustejovsky, Patrick Hanks, Roser Sauri, Andrew See, Robert Gaizauskas, Andrea Setzer, Dragomir Radev, Beth Sundheim, David Day, Lisa Ferro, and Marcia Lazo. 2003. The TIMEBANK Corpus. In Proceedings of Corpus Linguistics 2003, pages 647–656, Lancaster, UK.

Nils Reimers and Iryna Gurevych. 2017. Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging. In Proceedings of the 2017 Conference on Empirical Methods in


Natural Language Processing (EMNLP), pages 338–348, Copenhagen, Denmark, September.

Nils Reimers, Nazanin Dehghani, and Iryna Gurevych. 2016. Temporal Anchoring of Events for the TimeBank Corpus. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL 2016), volume 1: Long Papers, pages 2195–2204. Association for Computational Linguistics, August.

Roser Saurí, Jessica Littman, Robert Gaizauskas, Andrea Setzer, and James Pustejovsky. 2004. TimeML Annotation Guidelines, Version 1.2.1.

Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. J. Mach. Learn. Res., 15(1):1929–1958, January.

Jannik Strötgen and Michael Gertz. 2015. A Baseline Temporal Tagger for all Languages. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 541–547, Lisbon, Portugal, September. Association for Computational Linguistics.

Mihai Surdeanu. 2013. Overview of the TAC2013 Knowledge Base Population Evaluation: English Slot Filling and Temporal Slot Filling. In Proceedings of the TAC-KBP 2013 Workshop, Gaithersburg, Maryland, USA.

Naushad UzZaman and James F. Allen. 2010. TRIPS and TRIOS System for TempEval-2: Extracting Temporal Information from Text. In Proceedings of the 5th International Workshop on Semantic Evaluation, SemEval '10, pages 276–283, Stroudsburg, PA, USA. Association for Computational Linguistics.

Naushad UzZaman, Hector Llorens, Leon Derczynski, Marc Verhagen, James F. Allen, and James Pustejovsky. 2013. SemEval-2013 Task 1: TempEval-3: Evaluating Time Expressions, Events, and Temporal Relations. In Proceedings of the 7th International Workshop on Semantic Evaluation (SemEval 2013), pages 1–9, Atlanta, Georgia, USA.

Marc Verhagen, Robert Gaizauskas, Frank Schilder, Mark Hepple, Graham Katz, and James Pustejovsky. 2007. SemEval-2007 Task 15: TempEval Temporal Relation Identification. In Proceedings of the 4th International Workshop on Semantic Evaluations, SemEval '07, pages 75–80, Stroudsburg, PA, USA. Association for Computational Linguistics.

Marc Verhagen, Roser Saurí, Tommaso Caselli, and James Pustejovsky. 2010. SemEval-2010 Task 13: TempEval-2. In Proceedings of the 5th International Workshop on Semantic Evaluation, SemEval '10, pages 57–62, Stroudsburg, PA, USA. Association for Computational Linguistics.

Katsumasa Yoshikawa, Sebastian Riedel, Masayuki Asahara, and Yuji Matsumoto. 2009. Jointly Identifying Temporal Relations with Markov Logic. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 1, ACL '09, pages 405–413, Stroudsburg, PA, USA. Association for Computational Linguistics.

Daojian Zeng, Kang Liu, Siwei Lai, Guangyou Zhou, and Jun Zhao. 2014. Relation Classification via Convolutional Deep Neural Network. In COLING 2014, 25th International Conference on Computational Linguistics, Proceedings of the Conference: Technical Papers, August 23-29, 2014, pages 2335–2344, Dublin, Ireland.

