Transactions of the Association for Computational Linguistics, vol. 6, pp. 121–132, 2018. Action Editor: Ani Nenkova.
Submission batch: 11/2016; Revision batch: 3/2017; Published 2/2018.
© 2018 Association for Computational Linguistics. Distributed under a CC-BY 4.0 License.

Conversation Modeling on Reddit Using a Graph-Structured LSTM

Victoria Zayats, Electrical Engineering Department, University of Washington, vzayats@uw.edu
Mari Ostendorf, Electrical Engineering Department, University of Washington, ostendor@uw.edu

Abstract

This paper presents a novel approach for modeling threaded discussions on social media using a graph-structured bidirectional LSTM (long short-term memory) which represents both hierarchical and temporal conversation structure. In experiments with a task of predicting popularity of comments in Reddit discussions, the proposed model outperforms a node-independent architecture for different sets of input features. Analyses show a benefit to the model over the full course of the discussion, improving detection in both early and late stages. Further, the use of language cues with the bidirectional tree state updates helps with identifying controversial comments.

1 Introduction

Social media provides a convenient and widely used platform for discussions among users. When the comment-response links are preserved, those conversations can be represented in a tree structure where comments represent nodes, the root is the original post, and each new reply to a previous comment is added as a child of that comment. Some examples of popular services with tree-like structures include Facebook, Reddit, Quora, and StackExchange. Figure 1 shows an example conversation on Reddit, where bigger nodes indicate higher upvoting of a comment.¹ In services like Twitter, tweets and their retweets can also be viewed as forming a tree structure. When timestamps are available with a contribution, the nodes of the tree can be ordered and annotated with that information. The tree structure is useful for seeing how a discussion unfolds into different subtopics and showing differences in the level of activity in different branches of the discussion.

¹ The tool https://whichlight.github.io/reddit-network-vis was used to obtain this visualization.

Figure 1: Visualization of a sample thread on Reddit.

Predicting popularity of comments in social media is a task of growing interest. Popularity has been defined in terms of the volume of the response, but when the social media platform has a mechanism for readers to like or dislike comments (or, upvote/downvote), then the difference in positive/negative votes provides a more informative score for popularity prediction.

Figure 2: An example of model propagation in a graph-structured LSTM. Here, the node names are shown in chronological order, e.g. comment t1 was made earlier than t2. (a) Forward hierarchical and timing structure: propagation of the graph-structured LSTM in the forward direction, where blue arrows represent hierarchical propagation and green arrows represent timing propagation. (b) Backward hierarchical (blue) and timing (green) propagation of the graph-LSTM.

This definition of popularity, which has also been called community endorsement (Fang et al., 2016), is the task of interest in our work on tree-structured modeling of discussions.

Previous studies found that the time when the comment/post was published has a big impact on its popularity (Lakkaraju et al., 2013). In addition, the number of immediate responses can be predictive of the popularity, but some comments with a high number of replies can be either controversial or have a highly negative score. Language should be extremely important for distinguishing these cases. Indeed, community style matching is shown to be correlated to comment popularity in Reddit (Tran and Ostendorf, 2016). However, learning useful language cues can be difficult due to the low frequency of these events and the dominance of time, topic and other factors. Thus, in several prior studies, authors constrained the problem to reduce the effect of those factors (Lakkaraju et al., 2013; Tan et al., 2014; Jaech et al., 2015). In this study, we have no such constraints, but attempt to use the tree structure to capture the flow of information in order to better model the context in which a comment is submitted, including both the history it responds to as well as the subsequent response to that comment.

To capture discussion dynamics, we introduce a novel approach to modeling the discussion using a bidirectional graph-structured LSTM, where each comment in the tree corresponds to a single LSTM unit. In one direction, we capture the prior history of contributions leading up to a node, and in the other, we characterize the response to that comment. Motivated by prior findings that both response structure and timing are important in predicting popularity (Fang et al., 2016), the LSTM units include both hierarchical and temporal components to the update, which distinguishes this work from prior tree-structured LSTM models. We assess the utility of the model in experiments on popularity prediction with Reddit discussions, comparing to a neural network baseline that treats comments independently but leverages information about the graph context and timing of the comment. We analyze the results to show that the graph LSTM provides a useful summary representation of the language context of the comment.

As in Fang et al. (2016), but unlike other work (He et al., 2016), our model makes use of the full discussion thread in predicting popularity. While knowledge of the full discussion is only useful for post-hoc analysis of past discussions, it is reasonable to consider initial responses to a comment, particularly given that many responses occur within minutes of someone posting a comment. Comments are often popular because of witty analogies made, which requires knowledge of the world beyond what is captured in current models. Responses to these comments, as well as to controversial comments, can improve popularity prediction.

Responses of others clearly influence the likelihood of someone to like or dislike a comment, but also whether they even read a comment. By introducing a forward-backward tree-structured model, we provide a mechanism for leveraging early responses in predicting popularity, as well as a framework for better understanding the relative importance of these responses.

The main contributions of this paper include: a novel approach for representing tree-structured language processes (e.g., social media discussions) with LSTMs; evaluation of the model on the popularity prediction task using Reddit discussions; and analysis of the performance gains, particularly with respect to the role of language context.

2 Method

The proposed model is a bidirectional graph LSTM that characterizes a full threaded discussion, assuming a tree-structured response network and accounting for the relative order of the comments. Each comment in a conversation corresponds to a node in the tree, where its parent is the comment that it is responding to and its children are the responding comments that it spurs, ordered in time. Each node in the tree is represented with a single recurrent neural network (RNN) unit that outputs a vector (embedding) that characterizes the interim state of the discussion, analogous to the vector output of an RNN unit which characterizes the word history in a sentence. In the forward direction, the state vector can be thought of as a summary of the discussion pursued in a particular branch of the tree, while in the backward direction the state vector summarizes the full response subtree that followed a particular comment. The state vectors for the forward and backward directions are concatenated for the purpose of predicting comment karma. The RNN updates, both forward and backward, incorporate both temporal and hierarchical (tree-structured) dependencies, since commenters typically consider what has already been said in response to a parent comment. Hence, we refer to it as a graph-structured RNN rather than a tree-structured RNN. Figures 2(a) and 2(b) show an example of the state connections associated with hierarchical and timing structures for the forward and backward RNNs, respectively.

The supervision signal in training will impact the character of the state vector, and the forward and backward state sub-vectors are likely to capture different phenomena. Here, the objective is to predict quantized comment karma. We anticipate that the forward state will capture relevance and informativeness of the comment, and the backward process will capture sentiment and richness of the ensuing discussion. The specific form of the RNN used in this work is an LSTM. The detailed implementation of the model is described in the sections to follow.

2.1 Graph-structured LSTM

Each node in the tree is associated with an LSTM unit. The input x_t is an embedding that can incorporate both comment text and local submission context features associated with thread structure and timing, described further in Section 2.2. The node state vector h_t is generated using a modification of the standard LSTM equations to include both hierarchical and timing structures for each comment. Specifically, we use two forget gates: one for the previous (or subsequent) hierarchical layer, and one for the previous (or subsequent) timing layer.

In order to describe the update equations, we introduce notation for the hierarchical and timing structure. In Figure 2, the nodes in the tree are numbered in the order that the comments are contributed in time. To characterize graph structure, let π(t) denote the parent of t and κ(t) its first child. Time structure is represented only among a set of siblings: p(t) is the sibling predecessor in time, and s(t) is the sibling successor. The pointers κ(t), p(t) and s(t) are set to ∅ when t has no child, predecessor, or successor, respectively. For example, in Figure 2(a), the node t2 will have π(t2) = t1, κ(t2) = t4, p(t2) = ∅ and s(t2) = t3, and the node t3 will have π(t3) = t1, κ(t3) = ∅, p(t3) = t2 and s(t3) = t5.

Below we provide the update equations for the forward process, using the subscripts i, f, g, c, and o for the input gate, temporal forget gate, hierarchical forget gate, cell, and output, respectively. The vectors i_t, f_t, and g_t are the weights for new information, remembering old information from siblings, and remembering old information from the parent, respectively. σ is a sigmoid function, and ◦ indicates the Hadamard product. If p(t) = ∅, then h_{p(t)} and c_{p(t)} are set to the initial state value.

i_t = σ(W_i x_t + U_i h_{p(t)} + V_i h_{π(t)} + b_i)
f_t = σ(W_f x_t + U_f h_{p(t)} + V_f h_{π(t)} + b_f)
g_t = σ(W_g x_t + U_g h_{p(t)} + V_g h_{π(t)} + b_g)
c̃_t = W_c x_t + U_c h_{p(t)} + V_c h_{π(t)} + b_c
c_t = f_t ◦ c_{p(t)} + g_t ◦ c_{π(t)} + i_t ◦ c̃_t
o_t = σ(W_o x_t + U_o h_{p(t)} + V_o h_{π(t)} + b_o)
h_t = o_t ◦ tanh(c_t)
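To make the update concrete, here is a minimal NumPy sketch of a single forward-direction node update. This is our reading of the equations above, not the authors' released Theano code; the function name and the layout of the params dictionary are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_update(x_t, h_p, c_p, h_pi, c_pi, params):
    """One forward graph-LSTM update for node t.

    x_t: input embedding for the comment at node t.
    h_p, c_p: state and cell of the sibling predecessor p(t);
        zero vectors when p(t) = ∅, per the initialization above.
    h_pi, c_pi: state and cell of the parent π(t).
    params: dict of weight matrices W*, U*, V* and biases b*
        for * in {i, f, g, c, o}.
    """
    def affine(k):
        return (params['W' + k] @ x_t + params['U' + k] @ h_p
                + params['V' + k] @ h_pi + params['b' + k])

    i_t = sigmoid(affine('i'))   # input gate
    f_t = sigmoid(affine('f'))   # temporal forget gate (sibling chain)
    g_t = sigmoid(affine('g'))   # hierarchical forget gate (parent link)
    c_tilde = affine('c')        # candidate cell; linear, as in the equations
    o_t = sigmoid(affine('o'))   # output gate
    c_t = f_t * c_p + g_t * c_pi + i_t * c_tilde
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```

Visiting nodes in the order the comments were posted guarantees that h_{p(t)} and h_{π(t)} are already computed when node t is updated; the backward pass has the same form with κ(t) and s(t) in place of π(t) and p(t), and its own parameter set.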

When the whole tree structure is known, we can take advantage of the full response subtree to better represent the node state. To that end, we define a backward LSTM that has a similar set of update equations except that only the first child will pass the hidden state to its parent. Specifically, the update equations are the same except that π(t) is replaced with κ(t), p(t) is replaced with s(t), and a different set of weight matrices and bias vectors are learned.

Let + and − indicate forward and backward embeddings, respectively. On top of the LSTM unit, the forward and backward state vectors are concatenated and passed to a softmax layer to predict 8 quantized karma levels:

P(y_t = j | x, h) = exp(W^s_j [h^+_t; h^−_t]) / Σ^8_{k=1} exp(W^s_k [h^+_t; h^−_t])

where x and h correspond to the set of input features and state vectors (respectively) for all nodes in the discussion.

2.2 Input Features

The full model includes two types of features in the input vector: non-textual features associated with the submission context and the textual features of the comment at that node.

The submission context features are extracted from the graph and metadata associated with the comment, motivated by prior work showing that context factors such as the forum, timing and author of a post are very useful in predicting popularity. The submission context features include:

• Timing: time since root, time since parent (in hours), number of later comments, and number of previous comments
• Author: a binary indicator as to whether the author is the original poster, and number of comments made by the author in the conversation
• Graph-location: depth of the comment (distance from the root), and number of siblings
• Graph-response: number of children (direct replies to the comment), height of the subtree rooted at the node, size of that subtree, number of children normalized for each thread (2 normalization techniques), subtree size normalized for each thread (2 normalization techniques).

Two methods are used to normalize the subtree size and number of children to compensate for variation associated with the size of the discussion, specifically: i) subtract the mean feature value in the thread, and ii) divide by the square root of the rank of the feature value in the thread. These features are a superset of those used in Fang et al. (2016). The subvector including all these features is denoted x^s_t.

The comment text features, denoted x^c_t, are generated using a simple average bag-of-words representation learned during the training:

x^c_t = (1/N) Σ^N_{i=1} W^e_i

where W^e_i is an embedding of the i-th word in the comment, and N is the number of words in the comment. Comments longer than 100 words were truncated to reduce noise associated with long comments, assuming that the early portion carries the most information. The percentage of the comments that exceed 100 words is around 11-14% for the subreddits used in the study. In all experiments, the word embedding dimension is d = 100, and the vocabulary includes only words that occurred at least 10 times in the dataset. The input vector x_t is set to either x^s_t or [x^s_t; x^c_t], depending on whether the experiment uses text.
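As a concrete illustration of the text features, here is a small sketch of computing x^c_t under the constraints above (the function and the embedding-lookup dictionary are our framing for illustration, not the released code):

```python
import numpy as np

def comment_text_features(tokens, embed, dim=100, max_len=100):
    """Average bag-of-words embedding x^c_t for one comment.

    tokens: the comment's words, in order.
    embed: dict mapping vocabulary words (those occurring at least
        10 times in the training data) to d=100 dimensional vectors.
    Comments are truncated to their first max_len words, assuming
    the early portion carries the most information.
    """
    vecs = [embed[w] for w in tokens[:max_len] if w in embed]
    if not vecs:
        return np.zeros(dim)        # no in-vocabulary words
    return np.mean(vecs, axis=0)    # (1/N) * sum of word embeddings
```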

2.3 Pruning

Often the number of comments in a single subtree can be large, which leads to high training costs. A large percentage of the comments are low karma and minimally relevant for predicting karma of neighbors, and many can be easily identified with simple graph and timing features (e.g. having no replies or contributed late in the discussion). Therefore, we introduce a preprocessing step that identifies comments that are highly likely to be low karma to decrease the computation cost. We then assign these nodes to be level 0 and prune them out of the tree, but retain a count of nodes pruned for use in a count-weighted bias term in the update to capture information about response volume.

For detecting low karma comments, we train a simple SVM classifier to identify comments at the 0 karma level based on the submission context features. If a pruned comment leads to a disconnected graph (e.g., an internal node is pruned but not its children), then the comment is retained in the tree. In testing, all pruned comments are given a predicted level of 0 and accounted for in the evaluation.

The state updates have an additional bias term for any nodes that have subsequent sibling or children comments pruned. For example, consider Figure 2: if nodes {t5, t6, t7, t9} are pruned, then t8 will have a modified forward update, and t3 and t4 will have a modified backward update. At node t, define M^κ_t to be the number of levels pruned below it, M^p_t as the number of immediately preceding comments pruned in its subgroup (responding to the same parent), and M^s_t as the number of subsequent comments pruned in its subgroup plus the non-initial comments in the associated subtrees. In the example above, M^κ_3 = 1, M^s_3 = 2, M^s_4 = 1, M^p_8 = 1, and all other M^*_t = 0. The pointers are updated to reflect the structure of the pruned tree, so p(8) = 4, s(4) = 8, s(3) = ∅. The bias vectors r^κ, r^p and r^s are associated with the different sets of nodes pruned.

Let + and − indicate forward and backward embeddings, respectively. The forward update has an adjusted predecessor contribution (h^+_{p(t)} + M^p_t r^p). The backward update adds M^s_t r^s + M^κ_t r^κ to either h^−_{s(t)} or h^−_{κ(t)}, depending on whether it is a time or hierarchical update, respectively.
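The pruning pass itself can be sketched as follows; the tree representation and the is_low_karma predicate (standing in for the trained SVM) are assumptions for illustration:

```python
def prune_tree(children, root, is_low_karma):
    """Prune comments that are highly likely to be at karma level 0.

    children: dict mapping a comment id to the ids of its direct replies.
    is_low_karma: predicate over comment ids, e.g. the SVM prediction
        from submission context features.
    A node is pruned only if it is predicted low karma and its whole
    subtree is pruned too, so the remaining tree stays connected.
    Returns the set of pruned ids, all assigned a predicted level of 0.
    """
    pruned = set()

    def visit(node):
        kept_below = False                   # does any descendant survive?
        for child in children.get(node, []):
            kept_below = visit(child) or kept_below
        keep = kept_below or not is_low_karma(node)
        if not keep:
            pruned.add(node)
        return keep

    for top_level in children.get(root, []):  # the root post is not a comment
        visit(top_level)
    return pruned
```

The counts M^p_t, M^s_t and M^κ_t for each surviving node are then tallied from this pruned set and drive the bias terms described above.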

2.4 Training

The objective function is minimum cross-entropy over the quantized levels. All model parameters are jointly trained using the adadelta optimization algorithm (Zeiler, 2012). Word embeddings are initialized using word2vec skip-gram embeddings (Mikolov et al., 2013) trained on all comments from the corresponding subreddit. The code is implemented in Theano (Team et al., 2016) and is available at https://github.com/vickyzayats/graph-LSTM. We tune the model over different dimensions of the LSTM unit, and use the performance on the development set as a stopping criterion for the training.

3 Experiments

3.1 Data

Reddit² is a popular discussion forum platform consisting of a large number of subreddits focusing on different topics and interests. In our study, we experimented with 3 subreddits: askwomen, askmen, and politics. All the data consists of discussions made in the period between January 1, 2014 and January 31, 2015. Table 1 shows the total amount of data used for each of the subreddits. For each subreddit, the threads were randomly distributed between training, development (dev) and test sets with the proportions of 6:2:2. The performance of the pruning classifier on the dev set is presented in Table 2.

subreddit    comments    threads    vocab size
askwomen     0.8M        3.5K       32K
askmen       1.1M        4.5K       35K
politics     2.2M        4.9K       55K

Table 1: Data statistics.

subreddit    Prec    Rec     % pruned
askwomen     67.9    72.4    36.9
askmen       60.1    75.3    36.1
politics     49.6    60.3    47.5

Table 2: Precision and recall of the pruning classifier and percentage of comments pruned.

² https://reddit.com

3.2 Task and Evaluation Metrics

Reddit karma has a Zipfian distribution, highly skewed toward the low-karma comments. Since the rare high karma comments are of greatest interest in popularity prediction, Fang et al. (2016) propose a task of predicting quantized karma (using a nonlinear head-tail break rule for binning) with evaluation using a macro average of the F1 scores for predicting whether a comment exceeds each different level. Experiments reported here use this framework. Specifically, all the comments with karma lower than 1 are assigned to level 0, and each subsequent level corresponds to karma less than or equal to the median karma in the rest of the comments based on the training data statistics. Each subreddit has 8 quantized karma levels based on its karma distribution. There are 7 binary subtasks (does the comment have karma at level j or higher, for j = 1, ..., 7), and the scoring metric is the macro average of F1(j). For tuning hyperparameters and as a stopping criterion, we use a linearly weighted average of F1 scores to increase the weight on high karma comments, which gives slightly better performance for the high karma cases but has only a small effect on the macro average.
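For reference, a short sketch of this metric, assuming integer level labels 0-7 (the helper names are ours):

```python
import numpy as np

def f1_at_level(y_true, y_pred, j):
    """F1 for the binary subtask: does the comment reach level j or higher?"""
    t = np.asarray(y_true) >= j
    p = np.asarray(y_pred) >= j
    tp = np.sum(t & p)
    if tp == 0:
        return 0.0
    precision = tp / np.sum(p)
    recall = tp / np.sum(t)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred, levels=8):
    """Macro average of F1(j) over the 7 subtasks j = 1, ..., 7."""
    return float(np.mean([f1_at_level(y_true, y_pred, j)
                          for j in range(1, levels)]))
```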

3.3 Baseline and Contrast Systems

We compare the graph LSTM to a node-independent baseline, which is a feedforward neural network model consisting of input, hidden and softmax layers. This model is a simplification of the graph-LSTM model where there is no connection between nodes. The node-independent model characterizes a comment without reference to either the text of the comment that it is responding to or the comments reacting to it. However, the model does have information on the size of the response subtree via the submission context input features. Both node-independent and graph-structured models are trained with the same cost function and tuned over the same set of hidden layer dimensions.

We contrast performance of both architectures with and without using the text of the comment itself. As shown in Fang et al. (2016), simply using submission context features (graph, timing, author) gives a strong baseline. In order to evaluate the role of each direction (forward or backward) in the graph-structured model, we also present results using only the forward direction graph-LSTM for comparison to the bidirectional model. In addition, in order to evaluate the importance of the language of the comment itself vs. the language used in the rest of the tree, we perform an interpolation between the graph-LSTM with no language features and the node-independent model with language features. The relative weight for the two models is tuned on the development set.

Model       Text   askwomen   askmen   politics
indep       no     53.2       48.3     46.6
graph       no     54.6       52.1     47.9
indep       yes    52.8       50.7     47.4
interp      mix    54.7       52.1     48.2
graph (F)   yes    55.0       53.3     49.9
graph       yes    56.4       54.8     50.4

Table 3: Average F1 score of karma level prediction for node-independent (indep) vs. graph-structured (graph) models with and without text features; interp corresponds to an interpolation of the graph-structured model without text and the node-independent model with text; and graph (F) corresponds to a graph-structured model containing the forward direction only.

3.4 Karma Level Prediction

The results for the average F1 scores on the test set are presented in Table 3. In experiments for all the subreddits, graph-structured models outperform the corresponding node-independent models both with and without language features. Language features also give a greater performance gain when used in the graph-LSTM models. The fact that the forward graph improves over the interpolated models shows that it is not simply the information in the current node that matters for karma of that node. Finally, while the full model outperforms the forward-only version for all the subreddits, the gain is smaller than that obtained by the forward direction alone over the node-independent model, so the forward direction seems to be more important.

The karma prediction results (F1 score) at the different levels are shown in Figure 3. While in the askmen and askwomen subreddits the overall performance decreases for higher levels, the politics subreddit has an opposite trend. This may be due in part to the lower pruning recall in the politics subreddit, but Fang et al. (2016) also observe higher performance for high karma levels in the politics subreddit.

Figure 3: F1 scores as a function of the quantized levels for different model configurations.

4 Analysis

Here, we present analyses aimed at better understanding the behavior of the graph-structured model and the role of language in prediction. All analyses are performed on the development set. The analyses are motivated by considering possible scenarios that are exceptions to the easy cases, which are: i) comments that are contributed early in the discussion and spawn large subtrees, likely to have high karma, and ii) comments with small subtrees that typically have low karma. We hypothesized three scenarios where the bidirectional graph-LSTM with text might be useful. One case is controversial comments, which have large subtrees but do not have high karma because of downvotes; these tend to have overprediction of karma when using only submission context. The other two scenarios involve underprediction of karma when using only submission context. Early comments associated with few children and a more narrow subtree (see the downward chain in Figure 1) may spawn popular new threads and benefit from the popularity of other comments in the thread (more readers attracted), thus having higher popularity than the number of children suggests. Lastly, comments that are clever or humorous discussion endpoints might have high popularity but small subtrees. These two cases tend to differ in their relative timing in the discussion.

4.1 Karma Prediction vs. Time

The first study looked at where the graph-LSTM provides benefits in terms of timing. We plot the average F1 score as a function of the contribution time in Figure 4. As an approximation for time, we use the quantized number of comments made prior to the current comment. The plots show that the graph-structured model improves over the node-independent model throughout the discussion. Relative gains are larger towards the end of discussions where the node-independent performance is lower. A similar trend is observed when plotting average F1 as a function of depth in the discussion tree.

Figure 4: Average F1 scores as a function of time, approximated using the number of previous comments quantized in increments of 20.

While the use of text in the graph-LSTM seems to help throughout the discussion, we hypothesized that there would be different cases where it might help, and these would occur at different times.

Indeed, 93% of the comments that are overpredicted by more than 2 levels by the node-independent model without text (controversial comments) occur in the first 20% of the discussion. Comments that are underpredicted by more than 2 occur throughout the discussion and are roughly uniform (13-19%) over the first half of the discussion, but then quickly ramp down. High-karma comments are rare at the end of the discussion; less than 5% of the underpredicted comments are in the last 30%.

4.2 Importance of Responses

In order to see how the model benefits from using the language cues in underpredicted and overpredicted scenarios, we look at the size of errors made by the graph-LSTM model with and without text features. In Figure 5, the x-axis indicates the error between the actual karma level and the karma level predicted by the graph-LSTM using submission context features only. The negative errors represent the overpredicted comments, and the positive errors represent the underpredicted comments. The y-axis represents the average error between the actual karma level and the karma level predicted by the model using both submission context and language features. The x = y identity line corresponds to no benefit from language features. Results are presented for the politics subreddit; other subreddits have similar trends but smaller differences for the underpredicted cases.

Figure 5: The error between the actual karma level and the karma level predicted by the model using both submission context and language features. Negative errors correspond to over-prediction; positive errors correspond to under-prediction.

We compare two models, the bidirectional and the forward-direction graph-structured LSTM, in order to understand the role of the language of the replies vs. the comment and its history. We find that, for the bidirectional graph-LSTM model, language is helping identify overpredicted cases more than underpredicted ones. The forward direction model also outperforms the node-independent model, but has less benefit in overpredicted cases, consistent with our intuition that controversy is identifiable based on the responses. Although the comment text input is simply a bag of words, it can capture the mixed sentiment of the responses.

While it is not represented in the plot, larger errors are much less frequent. Looking at average F1 as a function of the number of children (direct responses), we found that the graph-LSTM mainly benefits nodes that have a small number of children, consistent with the two underprediction scenarios hypothesized. However, many underpredicted cases are not impacted, since errors due to pruning contribute to 15-40% of the underpredicted cases, depending on the subreddit (highest for politics). This explains the smaller gains for the positive side of Figure 5.

4.3 Language Use Analysis

To provide insights into what the model is learning about language, we looked at individual words associated with different categories of comments, as well as examples of the different error cases.

For the word level analysis, we classified words in two different ways, again using the politics subreddit. First, we associate words in comments with zero or positive karma. For each word in the vocabulary, we calculate the probability of a single-word comment being level zero using the trained model with a simplified graph structure (a post and a comment) where all the inputs were set to zero except the comment text. The lists of positive-karma and zero-karma words correspond to the 300 words associated with the lowest and highest probability of zero-karma, respectively. We identified 300 positive-karma and zero-karma reply words in a similar fashion, using a simplified graph with individual words as inputs for the reply while predicting the comment karma.
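This probing can be sketched as follows; predict_level_probs is a hypothetical wrapper around the trained model that scores a two-node post-comment graph whose inputs are all zero except the single-word comment text:

```python
def rank_words_by_karma(vocab, predict_level_probs, k=300):
    """Split the vocabulary into positive-karma and zero-karma word lists.

    vocab: list of vocabulary words.
    predict_level_probs: callable returning the softmax over the 8
        quantized karma levels for a single-word comment (hypothetical).
    Returns the k words least and most associated with karma level 0.
    """
    p_zero = {w: predict_level_probs(w)[0] for w in vocab}
    ranked = sorted(vocab, key=lambda w: p_zero[w])
    positive_karma = ranked[:k]     # lowest probability of level 0
    zero_karma = ranked[-k:]        # highest probability of level 0
    return positive_karma, zero_karma
```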

Second, we identified words that may be indicative of comments that are over- and underpredicted by the graph-structured model without text and for which the graph-LSTM model with text reduced the error by more than 2 levels. Specifically, we choose those words w in comments having the highest ratio r = p(w|t)/p(w), where t indicates an over- or underpredicted comment, subject to minimum occurrence constraints (5 for overpredicted comments, 15 for underpredicted comments). The 50 words with the highest ratio were chosen for each case and any words in both over- and underpredicted sets were eliminated, leaving 47 words. Again, this was repeated for words in replies to over- vs. underpredicted comments, but with a minimum count threshold of 20, resulting in 45 words.

The lists are noisy, similar to what is often found with topic models, and colored by the language of the subreddit community, but a few trends can be observed. Looking at the list of words associated with replies to positive-karma comments we noticed words that indicate humor ("LOL", "hilarious"), positive feedback ("Like", "Right"), and emotion indicators ("!!", swearing). Words in comments and replies associated with overpredicted (controversial) cases are related to controversial topics (sexual, regulate, liberals), named political parties, and mentions of downvoting or indication that the comment has been edited with the word "Edit."

Since the two sets of lists were generated separately, there are words in the over/under-predicted lists that overlap with the zero/non-zero karma lists (12 in the reply lists, 20 in the comment lists). The majority of the overlap (26/32 words) is consistent with the intuition that words on the underpredicted list should be associated with positive-karma, and words on the overpredicted list might overlap with the zero-karma list.

Rather than providing word lists, many neural network studies illustrate trends using word embedding visualization. The embeddings of the words from the union of lists for positive-karma, zero-karma, underpredicted and overpredicted comments and replies were together used to learn a t-SNE mapping. The results are plotted for comments in Figure 6, which shows that the words that are associated with underpredicted comments (red) are aligned with positive-karma words (green) for both comment text and text in replies.

Figure 6: The mapping of the words in the comments to the shared space using t-SNE in the politics subreddit. Shown are the words that are highly associated with positive-karma, negative-karma, underpredicted and overpredicted comments.

Words associated with overpredicted comments (blue) are more scattered, but they are somewhat more like the zero-karma words (yellow). The trends for words in replies are similar.

Ex   ref   no text   graph(F)   graph   Comment
1    0     7         7          0       Republicans are fundamentally dishonest. (politics, id: 1x9pcx)
2    0     7         4          0       That is rape. She was drunk and could not consent. Period. Any of the supposed evidence otherwise is nothing but victim blaming. (askwomen, id: 2h8pyh)
3    7     0         0          7       The liberals keep saying the titanic is sinking but my side is 500 feet in the air. (politics, id: 1upfgl)
4    7     3         7          7       I miss your show, Stephen Colbert. (askmen, id: 2qmpzm)
5    7     3         7          7       that is terrifying. they were given the orders to bust down the door without notice to the residents, thereby placing themselves in danger. and ultimately, placing the lives of the residents in danger (who would be acting out of fear and self-defense) (politics, id: 1wzwg6)
6    7     0         5          6       It's something, and also would change the way that Police unions and State Prosecutors work. I don't fundamentally agree with the move, since it still necessitates abuse by the State, but it's something. (politics, id: 27chxr)
7    6     0         0          0       Chickenhawks always talk a big game as long as someone else is doing the fighting. (politics, id: 1wbgpd)
8    6     0         0          0       [They] use statistics in the same way that a drunk uses lampposts: for support, rather than illumination. - Andrew Lang. (politics, id: 1yc2fj)

Table 4: Example comments and karma level predictions: reference (ref), no text, graph(F), graph.

Table 4 lists examples of the different error scenarios with the reference karma and predictions of different models (node-independent without text, feedforward graph-LSTM with text, and the full biLSTM). The first two examples are overpredicted (controversial) cases, where ignoring text leads to a high karma prediction, but the reference is zero. In the first case, the forward model incorrectly predicts high karma because "Republican" tends to be associated with positive karma. The model leveraging reply text correctly predicts the low karma. In the second case, the forward model reduces the prediction, but again having the replies is more helpful. The next two cases are examples of underprediction due to small subtrees. Example 3 is incorrectly labeled as level 0 by the forward and no-text models, but because the responses mention "nice joke" and "accurate analogy," the bidirectional model is able to identify it as level 7. Example 4 has only one child, but both models using language correctly predict level 7, probably because the model has learned that references to "Colbert" are popular. The next two examples are underpredicted cases from early in the discussion, many of which expressed an opinion that in some way provided multiple perspectives. Finally, the last two examples represent instances where neither model successfully identifies a high karma comment, which often involve analogies. Unlike the "titanic" analogy, these did not have sufficient cues in the replies.

5 Related Work

The problem of predicting popularity in social media platforms has been the subject of several studies. Popularity as defined in terms of volume of response has been explored for shares on Facebook (Cheng et al., 2014) and Twitter (Bandari et al., 2012), and for Twitter retweets (Tan et al., 2014; Zhao et al., 2015; Bi and Cho, 2016). Studies on Reddit predict karma as popularity (Lakkaraju et al., 2013; Jaech et al., 2015; He et al., 2016) or as community endorsement (Fang et al., 2016). Popularity prediction is a difficult task where many factors can play a role, which is why most prior studies control for specific factors, including topic (Tan et al., 2014; Weninger et al., 2013), timing (Tan et al., 2014; Jaech et al., 2015), and/or comment content (Lakkaraju et al., 2013). Controlling for specific factors is useful in understanding the components of a successful post, but it does not reflect a realistic scenario. Studies that do not include such constraints have looked at Twitter retweets (Bi and Cho, 2016) and Reddit karma (He et al., 2016; Fang et al., 2016).

The work in He et al. (2016) uses reinforcement learning to identify popular threads to track given the past comment history, so it is learning language cues relevant to high karma but it does not explicitly predict karma. In addition, it models relevance via an inner-product of past and new comment embeddings, and uses an LSTM to model inter-comment dependencies among a collection of comments irrespective of their sibling-parent relationship, whereas the LSTM in our work is over a graph that accounts for this relationship.

The work most closely related to our study is Fang et al. (2016). The node-independent baseline implemented in our study is equivalent to their feedforward network baseline, but the results are not directly comparable because of differences in training (we use more data) and input features. The most important difference in our approach is the representation of textual context using a bidirectional graph-LSTM, including the history behind and responses to a comment. Other differences are: i) Fang et al. (2016) use an LSTM to characterize comments, while our model uses a simple bag-of-words approach, and ii) they learn latent submission context models to determine the relative importance of textual cues, while our approach uses a submission context SVM to prune low karma comments (ignoring their text). Allowing for differences in baselines, we note that the absolute gain in performance from using text features is larger for our model, which represents language context.

Tree LSTMs are a modification of sequential LSTMs that have been proposed for a variety of sentence-level NLP tasks (Tai et al., 2015; Zhu et al., 2015; Zhang et al., 2016; Le and Zuidema, 2015). The architecture of tree LSTMs varies depending on the task.

Some options include summarizing over the children, adding a separate forget gate for each child (Tai et al., 2015), recurrent propagation among siblings (Zhang et al., 2016), or use of stack LSTMs (Dyer et al., 2015). Our work differs from these studies in two respects: the tree structure here characterizes a discussion rather than a single sentence; and our architecture incorporates both hierarchical and temporal recursions in one LSTM unit.

6 Conclusion

In summary, this paper presents a novel approach for modeling threaded discussions on social media using a graph-structured bidirectional LSTM which represents both hierarchical and temporal conversation structure. The propagation of hidden state information in the graph provides a mechanism for representing contextual language, including the history that a comment is responding to as well as the ensuing discussion it spawns. Experiments on Reddit discussions show that the graph-structured LSTM leads to improved results in predicting comment popularity compared to a node-independent model. Analyses show that the model benefits prediction over the extent of the discussion, and that language cues are particularly important for distinguishing controversial comments from those that are very positively received. Responses from even a small number of comments seem to be useful, so it is likely that the bidirectional model would still be useful with a short-time lookahead for early prediction of popularity.

While we evaluate the model on predicting the popularity of comments in specific forums on Reddit, it can be applied to other social media platforms that maintain a threaded structure or possibly to citation networks. In addition to popularity prediction, we expect the model would be useful for other tasks for which the responses to comments are informative, such as detecting topic or opinion shift, influence, or trolls. With the more fine-grained feedback increasingly available on social media platforms (e.g. laughter, love, anger, tears), it may be possible to distinguish different types of popularity as well as levels, e.g. shared sentiment vs. humor.

In this study, the model uses a simple bag-of-words representation of the text in a comment; more sophisticated attention-based models and/or feature engineering may improve performance. In addition, performance of the model on underpredicted comments appears to be limited by the pruning mechanism that we introduced. It would be useful to explore the tradeoffs of reducing the amount of pruning vs. using a more complex classifier for pruning. Finally, it would be useful to evaluate performance using a short window lookahead for responses, rather than the full discussion tree.

Acknowledgments

This paper is based on work supported by the DARPA DEFT Program. Views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government. We thank the reviewers for their helpful feedback.

References

Roja Bandari, Sitaram Asur, and Bernardo Huberman. 2012. The pulse of news in social media: Forecasting popularity. In Proc. ICWSM.

Bin Bi and Junghoo Cho. 2016. Modeling a retweet network via an adaptive Bayesian approach. In Proc. WWW.

Justin Cheng, Lada Adamic, P. Alex Dow, Jon Michael Kleinberg, and Jure Leskovec. 2014. Can cascades be predicted? In Proc. WWW.

Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, and Noah A. Smith. 2015. Recurrent neural network grammars. In Proc. NAACL.

Hao Fang, Hao Cheng, and Mari Ostendorf. 2016. Learning latent local conversation modes for predicting community endorsement in online discussions. In Proc. SocialNLP.

Ji He, Mari Ostendorf, Xiaodong He, Jianshu Chen, Jianfeng Gao, Lihong Li, and Li Deng. 2016. Deep reinforcement learning with a combinatorial action space for predicting popular Reddit threads. In Proc. EMNLP.

Aaron Jaech, Victoria Zayats, Hao Fang, Mari Ostendorf, and Hannaneh Hajishirzi. 2015. Talking to the crowd: What do people react to in online discussions? In Proc. EMNLP.

Himabindu Lakkaraju, Julian J. McAuley, and Jure Leskovec. 2013. What's in a name? Understanding the interplay between titles, content, and communities in social media. In Proc. ICWSM.

Phong Le and Willem Zuidema. 2015. Compositional distributional semantics with long short term memory. In Proc. *SEM.

Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient estimation of word representations in vector space. In Proc. ICLR.

Kai Sheng Tai, Richard Socher, and Christopher D. Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. In Proc. ACL.

Chenhao Tan, Lillian Lee, and Bo Pang. 2014. The effect of wording on message propagation: Topic- and author-controlled natural experiments on Twitter. In Proc. ACL.

The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, et al. 2016. Theano: A Python framework for fast computation of mathematical expressions. arXiv preprint arXiv:1605.02688.

Trang Tran and Mari Ostendorf. 2016. Characterizing the language of online communities and its relation to community recognition. In Proc. EMNLP.

Tim Weninger, Xihao Avi Zhu, and Jiawei Han. 2013. An exploration of discussion threads in social news sites: A case study of the Reddit community. In Proc. ASONAM.

Matthew D. Zeiler. 2012. ADADELTA: An adaptive learning rate method. arXiv preprint arXiv:1212.5701.

Xingxing Zhang, Liang Lu, and Mirella Lapata. 2016. Top-down tree long short-term memory networks. In Proc. NAACL.

Qingyuan Zhao, Murat A. Erdogdu, Hera Y. He, Anand Rajaraman, and Jure Leskovec. 2015. SEISMIC: A self-exciting point process model for predicting tweet popularity. In Proc. SIGKDD.

Xiaodan Zhu, Parinaz Sobhani, and Hongyu Guo. 2015. Long short-term memory over recursive structures. In Proc. ICML.
