Transactions of the Association for Computational Linguistics, 2 (2014) 297–310. Action Editor: Hal Daume III.

Transactions of the Association for Computational Linguistics, 2 (2014) 297–310. Action Editor: Hal Daume III.
Submitted 5/2014; Revised 6/2014; Published 10/2014. c(cid:13)2014 Association for Computational Linguistics.

297

ExploitingSocialNetworkStructureforPerson-to-PersonSentimentAnalysisRobertWestStanfordUniversitywest@cs.stanford.eduHristoS.PaskovStanfordUniversityhpaskov@stanford.eduJureLeskovecStanfordUniversityjure@cs.stanford.eduChristopherPottsStanfordUniversitycgpotts@stanford.eduAbstractPerson-to-personevaluationsareprevalentinallkindsofdiscourseandimportantfores-tablishingreputations,buildingsocialbonds,andshapingpublicopinion.Suchevaluationscanbeanalyzedseparatelyusingsignedso-cialnetworksandtextualsentimentanalysis,butthismissestherichinteractionsbetweenlanguageandsocialcontext.Tocapturesuchinteractions,wedevelopamodelthatpre-dictsindividualA’sopinionofindividualBbysynthesizinginformationfromthesignedsocialnetworkinwhichAandBareembed-dedwithsentimentanalysisoftheevaluativetextsrelatingAtoB.Weprovethatthisprob-lemisNP-hardbutcanberelaxedtoanef-ficientlysolvablehinge-lossMarkovrandomfield,andweshowthatthisimplementationoutperformstext-onlyandnetwork-onlyver-sionsintwoverydifferentdatasetsinvolvingcommunity-leveldecision-making:theWiki-pediaRequestsforAdminshipcorpusandtheConvoteU.S.Congressionalspeechcorpus.1IntroductionPeople’sevaluationsofoneanotherareprevalentinallkindsofdiscourse,publicandprivate,acrossages,genders,cultures,andsocialclasses(Dunbar,2004).Suchopinionsmatterforestablishingrep-utationsandreinforcingsocialbonds,andtheyareespeciallyconsequentialinpoliticalcontexts,wheretheytaketheformofendorsements,accusations,andassessmentsintendedtoswaypublicopinion.Thesignificanceofsuchperson-to-personevalu-ationsmeansthatthereisapressingneedforcom-putationalmodelsandtechnologiesthatcananalyzethem.Researchonsignedsocialnetworkssuggestsonepathforward:howonepersonwillevaluatean-othercanoftenbepredictedfromthenetworktheyareembeddedin.Linguisticsentimentanalysissug-gestsanotherpathforward:onecouldleveragetex-tualfeaturestopredictthevalenceofevaluativetextsdescribingpeople.Suchindependenteffortshavebeensuccessful,buttheygenerallyneglectthewaysinwhichsocialandlinguisticfeaturescomplementeachother.Insomesettings,textualdataissparsebutthenetworkstructureislargelyobserved;inoth-ers,textisabundantbutthenetworkispartlyorun-reliablyrecorded.Inaddition,weoftenseerichin-teractionsbetweenthetwokindsofinformation—politicalalliesmightteaseeachotherwithnegativelanguagetoenhancesocialbonds,andopponentsof-tenusesarcasticallypositivelanguageintheircriti-cisms.Separatesentimentorsigned-networkmod-elswillmissormisreadthesesignals.Wedevelop(Sec.3)agraphicalmodelthatsyn-thesizesnetworkandlinguisticinformationtomakemoreandbetterpredictionsaboutboth.Theobjec-tiveofthemodelistopredictA’sopinionofBusingasynthesisofthestructuralcontextaroundAandBinsidethesocialnetworkandsentimentanalysisoftheevaluativetextsrelatingAtoB.WeprovethattheproblemisNP-hardbutthatitcanberelaxedtoanefficientlysolvablehinge-lossMarkovrandomfield(Broecheleretal.,2010),andweshowthatthisimplementationoutperformstext-onlyandnetwork-onlyversionsintwoverydifferentdatasetsinvolv-ingcommunity-leveldecision-making:theWikipe-diaRequestsforAdminshipcorpus,inwhichWi-kipediaeditorsdiscussandvoteonwhoshouldbe

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

298

promotedwithintheWikipediahierarchy(Sec.4),andtheConvoteU.S.Congressionalspeechcorpus(Thomasetal.,2006),inwhichelectedofficialsdis-cusspoliticaltopics(Sec.5).Thesecorporadifferdramaticallyinsize,inthestyleandqualityoftheirtextualdata,andinthestructureandobservabilityoftheirnetworks.Together,theyprovideaclearpic-tureofhowjointmodelsoftextandnetworkstruc-turecanexcelwheretheircomponentpartscannot.2Backgroundandrelatedwork2.1SentimentanalysisInNLP,thelabelsentimentanalysiscoversdiversephenomenaconcerninghowinformationaboutemo-tions,attitudes,perspectives,andsocialidentitiesisconveyedinlanguage(PangandLee,2008).Mostworkassumesadimensionalmodelinwhichemo-tionsaredefinedprimarilybyvalence/polarityandarousal/intensity(Russell,1980;FeldmanBarrettandRussell,1998;RubinandTalerico,2009),andthedominantapplicationispredictingthevalenceofproduct,company,andservicereviews.Weadopttheconceptualassumptionsofthisworkforourbasicsentimentmodel,butourfocusisonperson-to-personevaluationsandtheirsocialconse-quences.Thisinvolveselementsofworkonmod-elingpoliticalaffiliation(Agrawaletal.,2003;Mal-oufandMullen,2008;Yuetal.,2008),bias(Yanoetal.,2010;Recasensetal.,2013),andstanceonde-batetopics(Thomasetal.,2006;SomasundaranandWiebe,2010;Linetal.,2006;Anandetal.,2011),buttheseaspectsofbeliefandsocialidentityarenotourprimaryconcern.Rather,weexpectthemtobepredictiveofthesentimentclassificationsweaimtomake—e.g.,iftwopeoplesharepoliticalviews,theywilltendtoevaluateeachotherpositively.Recentworkinsentimentanalysishasbroughtintopical,contextual,andsocialinformationtomakemorenuancedpredictionsaboutlanguage(Jurafskyetal.,2014;Wilsonetal.,2005;Blitzeretal.,2007).Webuildontheseinsightswithourmodel,whichseekstomodulatesentimentpredictionsbasedonnetworkinformation(andviceversa).2.2Signed-networkanalysisManysocialnetworksencodeperson-to-personsen-timentinformationviasignededgesbetweenuserssummarizingtheiropinionsofeachother.Inthisset-ting,onecanleveragesociologicaltheoriesofpair-wiserelationshipsandgroup-levelorganizationtoidentifyandunderstandpatternsintheserelation-ships(Heider,1946;CartwrightandHarary,1956).Balancetheoryisbasedonsimpleintuitionslike‘afriendofmyfriendismyfriend’,‘anenemyofmyenemyismyfriend’,and‘anenemyofmyfriendismyenemy’.Ingraphtheory,thesearestatementsabouttheedgesignsoftrianglesofconnectednodes:giventhesignsoftwoedges,balancetheorypredictsthethird,assummarizedinFig.1(a),wherethetwogivenedges(gray)determinethethird(black).Fordirectedrelationships,Leskovecetal.(2010b)formulateanalternativecalledstatustheory,whichpositsthatnetworksorganizeaccordingtosocialsta-tus:anodehaspositiveedgestootherswithhigherstatusandnegativeedgestothosewithlowersta-tus.Fig.1(a)illustratesthestructureofvariousdi-rectedsignedtriangles,wherethesignofthethirdedge(black)canbeinferredbasedonthesignsanddirectionsoftheothertwo(gray).Leskovecetal.(2010b)showthatsignededgesinnetworksemergeinamannerthatisbroadlycon-sistentwithbothofthesetheoriesandthatsocial-networkstructurealonecansupportaccurateedge-signpredictions(Leskovecetal.,2010a).Kunegisetal.(2013)predicthiddenpositiveandnegativeedgesinascenariowhereallobservededgesarepositive.Bachetal.(2013)andHuangetal.(2013)framesignpredictionasahinge-lossMarkovrandomfield,atypeofprobabilisticgraphicalmodelintroducedbyBroecheleretal.(2010).Ourmodelcombinestheseideaswithasentimentmodeltoachieveevenmorerobustpredictions.2.3Synthesisofsentiment&networkanalysisModelsofsentimentandsignednetworkshavebeensuccessfulatavarietyoftasks.However,missingfromthecurrentscientificpictureisadeepunder-standingofthewaysinwhichsentimentexpres-sionandsocialnetworksinteract.Tosomeextent,theseinteractionsarecapturedbyaddingcontextualanddemographicfeaturestoatext-basedsentimentmodel,butthosefeaturesonlyapproximatetherichrelationalstructureencodedinasignednetwork.Thomasetal.(2006)andTanetal.(2011)cap-italizeonthisinsightusinganelaborationofthe

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

299

+++–+–––+–––+++SOCIAL BALANCE THEORYSOCIAL STATUS THEORY+–?(a)Theoriesofsocialbalanceandstatus++––??+?“You’re onecrazy mofo!”“Love u! :)”“To whom itmay concern”(b)DesiderataFigure1:(a)Predictionsofsocialbalanceandstatustheoriesfortheboldblackedge,giventhethingrayedges.Balancetheoryreasonsaboutundirected,statustheoryaboutdirected,triangles.Inthestatusdiagrams,nodesizesignifiessocialstatus.Apositiveedgemaybereplacedbyanegativeedgeintheoppositedirection,andviceversa,withoutchangingtheprediction.Statustheorymakesnopredictionintherightmostcase.(b)Situationsofthesortweaimtocapture.Atleft,thenetworkresolvestextualambiguity.Atright,thetextcompensatesforedge-labelsparsity.graph-cutsapproachofPangandLee(2004).Theyareguidedbyanassumptionofhomophily,i.e.,thatcertainsocialrelationshipscorrelatewithagreementoncertaintopics:Thomasetal.(2006)usepartyaf-filiationandmentionsinspeechestopredictvotingpatterns,andTanetal.(2011)useTwitterfollowsandmentionstopredictattitudesaboutpoliticalandsocialevents.RelatedideasarepursuedbyMaetal.(2011)andHuetal.(2013),whoaddtermstotheirmodelsenforcinghomophilybetweenfriendswithregardtotheirpreferences.Weadoptsomeoftheassumptionsoftheaboveauthors,butourtaskisfundamentallydifferentintworespects.First,whereastheymodelperson-to-itemevaluations,wemodelperson-to-personevalu-ations;thesearealsothefocusofTangetal.(2013),who,though,useanunsignednetwork,whereasourworkisgearedtowarddistinguishingpositiveandnegativeedgelabels.Second,theabovemodelsmakeoverarchinghomophilyassumptions,whereasweallowourmodeltoexplorethefullsetoftriangleconfigurationssuggestedbyFig.1(a).3ModelHere,wearguethatcombiningtextualandstructuralfeaturescanhelppredictedgesigns.Weformu-lateamodel,showthatitiscomputationallyhard,andprovidearelaxedversionthatiscomputationallytractable,buildingonworkbyBachetal.(2013).3.1DesiderataInmanyreal-worldscenarios,richfeaturesareasso-ciatedwithedgesbetweentwopeople,suchascom-mentstheymadeabouteachother,messagestheyexchanged,orotherbehavioralfeatures.Suchfea-turesmaycontainastrongsentimentsignalusefulforpredictingedgesignsandmaybeusedtofitaconventionalsentimentmodel(Sec.2.1).However,thesignofanedgealsodependsonthesignsofsurroundingedgesinthenetwork(Sec.2.2).Apurelyedge-feature–basedsentimentmodelcan-notaccountforthenetworkstructure,sinceitrea-sonsaboutedgesasindependentofeachother.Wearguethatconsideringsentimentandnetworkstructurejointlycanresultinbetterpredictionsthaneitheroneonitsown.Fig.1(b)providestwoillus-trativeexamples.Here,thegrayedgesignsareob-served,whilethepolaritiesoftheblackedgesaretobepredicted.Intheleftnetwork,thetextoftheblackedge(‘You’reonecrazymofo!’)mightsug-gestanegativepolarity.However,anegativeblackedgewouldmakebothtrianglesinconsistentwithbalancetheory(Fig.1(a)),whereasapositiveblackedgemakesthemconsistentwiththetheory.So,inthiscase,thenetworkcontexteffectivelyhelpsde-tecttheteasing,non-literaltoneofthestatement.IntherightnetworkofFig.1(b),onlyoneofthreeedgesignsisobserved.Predictingtwopos-itiveedgeswouldbeconsistentwithbalancethe-ory,butthesamewouldbetrueforpredictingtwonegativeedges.Thetextonthelowerblackedge(‘Towhomitmayconcern’)doesnotcarryanyclearsentimentsignal,butthe‘Loveu!:)’ontheotheredgestronglysuggestsapositivepolarity.Thisletsusconcludethatthebottomedgeshouldprobablybepositive,too,sinceotherwisethetrianglewouldcontradictbalancetheory.Thisshowsthatcombin-ingsentimentandnetworkfeaturescanhelpwhenjointlyreasoningaboutseveralunknownedgesigns.

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

300

3.2ProblemformulationWenowformulateamodelcapableofsynthesizingtextualandnetworkfeatures.Notation.WerepresentthegivensocialnetworkasasignedgraphG=(V,E,x),wheretheverticesVrepresentpeople;theedgesE,relationshipsbetweenpeopleinV;andthesignvectorx∈{0,1}|E|repre-sentsedgepolarities,i.e.,xe=1(xe=0)indicatesapositive(negative)polarityforedgee∈E.Sometypesofrelationshipsimplydirectededges(e.g.,followingauseronTwitter,orvotingonacan-didateinanelection),whereasothersimplyundi-rectededges(e.g.,friendshiponFacebook,oragree-mentinanetworkofpoliticians).Weformulateourproblemforundirectedgraphshere,buttheexten-siontodirectedgraphsisstraightforward.Wede-fineatrianglet={e1,e2,e3}⊆Etobeasetofthreeedgesthatformacycle,anduseTtoindi-catethesetofalltrianglesinG.Finally,weusext=(xe1,xe2,xe3)∈{0,1}3torefertot’sedgesigns.Optimizationproblem.Weassumethatthestruc-tureofthenetwork(i.e.,VandE)isfullyobserved,whereastheedgesignsxareonlypartiallyobserved.Further,weassumethatwehaveasentimentmodelthatoutputs,foreachedgeeindependently,anes-timatepeoftheprobabilitythateisofpositivepo-larity,basedontextualfeaturesassociatedwithe.Thetask,then,istoinfertheunobservededgesignsbasedontheobservedinformation.Thehigh-levelideaisthatwewanttoinferedgesignsthat(1)agreewiththepredictionsofthesen-timentmodel,and(2)formtrianglesthatagreewithsocialtheoriesofbalanceandstatus.Itisnotalwayspossibletomeetbothobjectivessimultaneouslyforalledgesandtriangles,soweneedtofindatrade-off.Thisgivesrisetoacombinatorialoptimizationproblem,whichwetermTRIANGLEBALANCE,thatseekstofindedgesignsx∗thatminimizeanobjec-tiveconsistingofbothedgeandtrianglecosts:1x∗=argminx∈{0,1}|E|Xe∈Ec(xe,pe)+Xt∈Td(xt).(1)Thefirsttermisthetotaledgecost,inwhicheachedgeecontributesacostcapturinghowmuchitsin-ferredsignxedeviatesfromthepredictionpeofthe1Ofcourse,theentriesofx∗correspondingtoobservededgesarenotvariablebutfixedtotheirobservedvaluesinEq.1.sentimentmodel.Thesecondterm,thetotaltrianglecost,penalizeseachtriangletaccordingtohowun-desirableitsconfigurationisunderitsinferredsignsxt(e.g.,ifitcontradictsstatusorbalancetheory).Weusethefollowingedgecostfunction:c(xe,pe)=λ1(1−pe)xe+λ0pe(1−xe).(2)Here,λ1,λ0∈R+aretunableparametersthatallowforasymmetriccostsforpositiveandnegativeedges,respectively,andpeistheprobabilityofedgeebe-ingpositiveaccordingtothesentimentmodelalone.Intuitively,themoretheinferrededgesignxedevi-atesfromthepredictionpeofthesentimentmodel,thehighertheedgecost.(NotethatatmostoneofthetwosumfactorsofEq.2isnon-zero.)Thetrianglecostfortriangletissignifiedbyd(xt),whichcanonlytakeon8distinctvaluesbe-causext∈{0,1}3(inpractice,therearesymmetriesthatdecreasethisnumberto4).Theparametersd(xt)maybetunedsothattriangleconfigurationsthatagreewithsocialtheoryhavelowcosts,whilethosethatdisagreewithit(e.g.,‘theenemyofmyfriendismyfriend’)havehighcosts.3.3ComputationalcomplexityTheproblemdefinedinEq.1isintuitive,but,aswithmanycombinatorialoptimizationproblems,itishardtofindagoodsolution.Inparticular,wesketchaproofofthistheoreminAppendixA:Theorem1.TRIANGLEBALANCEisNP-hard.3.4RelaxationasaMarkovrandomfieldTheobjectivefunctionofEq.1maybeseenasdefin-ingaMarkovrandomfield(MRF)overtheunderly-ingsocialnetworkG,withedgepotentials(definedbyc)andtrianglepotentials(definedbyd).Infer-enceinMRFs(i.e.,computingx∗)isawell-stud-iedtaskforwhichavarietyofmethodshavebeenproposed(KollerandFriedman,2009).However,sinceourproblemisNP-hard,nomethodcanbeex-pectedtofindx∗efficiently.Onewayofdealingwiththecomputationalhardnesswouldbetofindanap-proximatebinarysolution,usingtechniquessuchasGibbssamplingorbeliefpropagation.Anotherop-tionistoconsideracontinuousrelaxationofthebi-naryproblemandfindanexactnon-binarysolutionwhoseedgesignsarecontinuous,i.e.,xe∈[0,1].

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

301

Wetakethislatterapproachandcastourproblemasahinge-lossMarkovrandomfield(HL-MRF).ThisisinspiredbyBachetal.(2013),whoalsouseanHL-MRFtopredictedgesignsbasedontrianglestructure,butdonotuseanyedgefeatures.AnHL-MRFisanMRFwithcontinuousvariablesandwithpotentialsthatcanbeexpressedassumsofhinge-losstermsoflinearfunctionsofthevariables(cf.Broecheleretal.(2010)fordetails).HL-MRFshavetheadvantagethattheirobjectivefunctionisconvexsothat,unlikebinaryMRFs(asdefinedbyEq.1),exactinferenceisefficient(Bachetal.,2013).Weachievearelaxationbyusingsumsofhinge-losstermstointerpolatecoverthecontinuousdo-main[0,1]andd,over[0,1]3(eventhoughtheyaredefinedonlyforbinarydomains).Asaresult,theHL-MRFformulationisequivalenttoEq.1whenallxearebinary,butitalsohandlescontinuousval-uesgracefully.Wenowinterpretareal-valued‘sign’xe∈[0,1]asthedegreetowhicheispositive.Westartbyshowinghowtotransformc:eventhoughitcouldbeusedinitscurrentform(Eq.2),wecreateatighterrelaxationbyusing˜c(xe,pe)=λ1kxe−pek++λ0kpe−xek+,(3)wherekyk+=max{0,y}isthehingeloss.Atmostonetermcanbeactiveforanyxe∈[0,1]duetothehingeloss,andc(xe,pe)=˜c(xe,pe)forbinaryxe.Torewrited,noticethat,foranyxt∈{0,1}3,wecanwritedasd(xt)=Xz∈{0,1}3d(z)δ(xt,z),(4)whereδ(xt,z)=1ifxt=zand0otherwise.Whileδisnotconvex,wecanusef(xt,z)=k1−kxt−zk1k+(5)asaconvexsurrogate.Whenxtisbinary,eitherxt=zsokxt−zk1=0orxt6=zsokxt−zk1≥1,andhencef(xt,z)=δ(xt,z).Toproveconvexity,notethat,foranyfixedbinaryz∈{0,1}3,kx−zk1=P3i=1|xi−zi|islinearinx∈[0,1]3,since|xi−zi|equalseitherxi(ifzi=0)or1−xi(ifzi=1).Itfol-lowsthatfisahinge-lossfunctionofalineartrans-formationofxtandthereforeconvexinxt.TrainingTestingEvidence To inferRANDOM SAMPLINGBFS SAMPLINGuvFigure2:Optionsfortrainingandtestingourmodel.Requiringthetrianglecostd(z)tobenonnegativeforalltriangletypesz∈{0,1}3,wecanuse˜d(xt)=Xz∈{0,1}3d(z)f(xt,z)(6)asaconvexsurrogateford.OuroveralloptimizationproblemisthenthefollowingrelaxationofEq.1:x∗=argminx∈[0,1]|E|Xe∈E˜c(xe,pe)+Xt∈T˜d(xt).(7)ThisobjectivehastheexactformofanHL-MRF,sinceitisaweightedsumofhingelossesoflin-earfunctionsofx.WeusetheProbabilisticSoftLogicpackage2toperformtheoptimization,whichisinturnbasedonthealternating-directionmethodofmultipliers(ADMM)(Boydetal.,2011).Learning.Clearly,asolutionisonlyusefulifthecostparameters(λ1,λ0,andd(z)forallz∈{0,1}3)aresetappropriately.Oneoptionwouldbetosetthevaluesheuristically,basedonthepredictionsmadebythesocialbalanceandstatustheories(Sec.2.2).However,itismoreprincipledtolearntheseparam-etersfromdata.Forthispurpose,weleveragethelearningproceduresincludedintheHL-MRFimple-mentationweemploy,whichusesthevoted-percep-tronalgorithmtoperformmaximum-likelihoodesti-mation(Bachetal.,2013).Sinceourdatapoints(edges)interactwitheachotherviathenetwork,somewordsonhowweper-formtrainingandtestingareinorder.Fig.2showstwooptionsforobtainingtrainingandtestingsets(weusebothoptionsinourexperiments).Inthe‘randomsampling’paradigm,werandomlychooseasetofedgesfortraining(blue),andadisjointsetofedgesfortesting(yellow).In‘BFSsampling’,we2http://psl.umiacs.umd.edu

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

302

runabreadth-firstsearchfromseednodeutoobtainacoherenttrainingset(blue),andlikewisefromaseednodevtoobtainacoherenttestingset(yellow),takingcarethatnoedgesfromthetrainingsetarealsoincludedinthetestingset.Duringbothtrainingandtesting,anarbitrarypor-tionoftheedgesignsmaybefixedtoobservedval-uesandneednotbeinferred.ThesearethesolidedgesinFig.2;werefertothemasevidence.Fur-ther,wedefinetheevidenceratioasthenumberofevidenceedges,dividedbythenumberofalledgesconsidered(solidanddashed).Thelearningalgorithmmayusethestructure(VandE)ofthetraininggraphinducedbyallblueedges(solidanddashed),thepredictionspeofthesentimentmodelforallblueedges,andthesignsofthesolidblueedgestopredictthedashedblueedges.Duringtesting,thenetworkstructureofallyel-lowedges,thesentimentpredictionsforallyellowedges,andthesignsofthesolidyellowedgesmaybeusedtopredictthedashedyellowedgesigns.Inprinciple,alltrainingedgescouldbeusedasextraevidencefortesting(i.e.,allblueedgesmaybemadesolidyellow).However,inourexperiments,wekeepthetrainingandtestingsetsfullydisjoint.Technicaldetails.Forclarity,wegivefurtherde-tails.First,thedistributionofpositiveandnegativesignsmaybeskewed;e.g.,weobserveapriorprob-abilityof76%positivesignsinourWikipediacor-pus(Sec.4).Therefore,asalsodonebyBachetal.(2013),weaddacosttermtoourobjective(Eq.7)thatpenalizesdeviationsfromthispriorprobabil-ity(asestimatedonthetrainingset).Thisensuresthatthemodelcandefaulttoareasonablepredictionforedgesthatarenotembeddedinanytrianglesandaboutwhichthesentimentmodelisuncertain.Second,intuitively,weshouldnotpenalizedevi-atingfromthesentimentmodelwhenitisitselfun-certainaboutitsprediction(i.e.,whenpeisfarfromboth0and1).Rather,wewanttorelymoreheavilyonsignalsfromthenetworkstructureinsuchcases.Toachievethis,weintroduce10pairsofcostparam-eters(λ(1)1,λ(1)0),…,(λ(10)1,λ(10)0).Then,wedividetheinterval[0,1]into10bins,andwhenpefallsintothei-thbin,weuseλ(i)1andλ(i)0inEq.3.Thisway,largercostscanbelearnedfortheextremebinscloseto0and1thanfortheintermediatebinsaround0.5.Finally,hinge-losstermsmayoptionallybesquaredinHL-MRFs.WeusethesquaredhingelossinEq.5,sinceinitialexperimentationshowedthistoperformslightlybetterthanthelinearhingeloss.4WikipediaexperimentsOurfirstsetofexperimentsisconductedontheWi-kipediaRequestsforAdminshipcorpus,whichal-lowsustoevaluateourmodel’sabilitytopredictperson-to-personevaluationsinWebtextsthatareinformalbutpertaintoimportantsocialoutcomes.4.1DatasetdescriptionForaWikipediaeditortobecomeanadministrator,arequestforadminship(RfA)3mustbesubmitted,eitherbythecandidateorbyanothercommunitymember.Subsequently,anyWikipediamembermaycastasupporting,neutral,oropposingvote.Thisinducesadirected,signednetworkinwhichnodesrepresentWikipediamembersandedgesrepresentvotes(wediscardneutralvotes).Wecrawledandparsedallvotessincetheadop-tionoftheRfAprocessin2003throughMay2013.4ThissignednetworkwaspreviouslyanalyzedbyLeskovecetal.(2010b;2010a).However,thereisalsoarichtextualcomponentthathassofarre-maineduntappedforedge-signprediction:eachvoteistypicallyaccompaniedbyashortcomment(me-dian/mean:19/34tokens).Atypicalpositivecom-mentreads,‘I’venoconcerns,willmakeanexcel-lentadditiontotheadmincorps’,whileanexampleofanegativecommentis,‘Littleevidenceofcollab-orationwithothereditorsandlimitedcontentcre-ation.’ThepresenceofavotingnetworkalongsidetextualedgefeaturesmakesourmethodofSec.3well-suitedforthisdataset.TheRfAnetworkcontains11Knodes,160Kedges(76%positive),andcloseto1Mtriangles.4.2ExperimentalsetupTrain/testsets.Wefollowthetrain–testparadigmtermed‘BFSsampling’inSec.3.4andFig.2,choos-ing10randomseednodes,fromeachofwhichweperformabreadth-firstsearch(followingbothin-andout-links)untilwehavevisited350nodes.We3http://en.wikipedia.org/wiki/Wikipedia:RfA4Dataavailableonline(West,2014).

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

303

thusobtain10subgraphswith350nodeseach.Wetrainamodelforeachsubgraphiandtestitonsub-graphi+1(mod10),ensuringthatedgesfromthetraininggraphareremovedfromthetestinggraph.TheBFSsamplingparadigmwasusedbecausethealternative(‘randomsampling’inFig.2)pro-ducessubgraphswithmostlyisolatededgesandonlyafewtriangles—anunrealisticscenario.Evaluatedmodels.Weevaluatethreemodels:1.Astandard,text-basedsentimentmodelthattreatsedgesasindependentdatapoints;2.ourfullmodelasspecifiedinSec.3.4,whichcombinesedgecostsbasedonthepredictionsofthetext-basedsentimentmodelwithtrianglecostscapturingnetworkcontext;3.aversionofourmodelthatconsidersonlytrian-glecosts,whileignoringthepredictionsofthetext-basedsentimentmodel(akintothemodelproposedbyBachetal.(2013)).Werefertothesemodelsas‘sentiment’,‘combined’,and‘network’,respectively.Sentimentmodel.Ourtext-basedsentimentmodelisanL2-regularizedlogistic-regressionclassifierwhosefeaturesaretermfrequenciesofthe10,000overallmostfrequentwords.TheL2-penaltyischo-senviacross-validationonthetrainingset.Sincecommentsoftenexplicitlycontainthelabel(‘sup-port’or‘oppose’),weremoveallwordswithpre-fixes‘support’or‘oppos’.Wetrainthemodelonlyonce,onarandomsampleof1,000commentsdrawnfromthesetofall160Kcomments(thevastmajorityofwhichwillnotappearinour10subgraphs).Evidenceratio.Regardingtheothertwomodels,recallfromSec.3.4ourdefinitionoftheevidenceratio,thefractionofedgesignsthatarefixedasevi-denceandneednotbeinferred.Inourexperiments,weexploretheimpactoftheevidenceratioduringtrainingandtesting,sinceweexpectperformancetoincreaseasmoreevidenceisavailable.(Weusethesameevidenceratiofortrainingandtesting,butthisneednotnecessarilybeso.)Metrics.Asourprincipalevaluationmetrics,weusetheareasunderthecurve(AUC)ofthereceiveroperatingcharacteristic(ROC)curveaswellastheprecision–recall(PR)curves.TherearetwoPRllllEvidence ratioArea under the curve12.5%25%50%75%0.50.60.70.80.91.0llllllll(a)AUC/ROCllllEvidence ratio12.5%25%50%75%0.20.40.60.81.0llllllllCombinedSentimentNetworkRandom(b)AUC/negPRFigure3:AUCasfunctionofevidenceratio(Wikipedia),withstandarderrors.curves,oneforthepositiveclass,theotherforthenegativeone.Ofthesetwo,thepositiveclassislessinteresting:duetotheclassimbalanceof76%posi-tiveedges,evenarandomguesserwouldachieveanAUCof0.76.ThePRcurveofthenegativeclassismoreinformative:here,itismuchhardertoachievehighAUC,sincerandomguessingyieldsonly0.24.Moreover,thenegativeedgesarearguablymoreim-portant,notonlybecausetheyarerarer,butalsobe-causetheyindicatetensionsinthenetwork,whichwemightbeinterestedindetectingandresolving.Forthesereasons,wereportonlytheAUCunderthenegativePRcurve(AUC/negPR)here.Additionally,wereporttheareaundertheROCcurve(AUC/ROC),astandardmetricforquantify-ingclassificationperformanceonunbalanceddata.Itcapturestheprobabilityofarandompositivetestexamplereceivingahigherscorethanarandomneg-ativeone(soguessinggivesanAUC/ROCof0.5).4.3ResultsPerformanceasafunctionofevidenceratio.TheAUCsasfunctionsoftheevidenceratioareshowninFig.3(a).(WeemphasizethattheseplotsarenotthemselvesROCandPRcurves;rather,theyarede-rivedfromthosecurvesbymeasuringAUCforarangeofmodels,parametrizedbyevidenceratio.)Sinceweusethesamesentimentmodelinallcases(Sec.4.2),itsperformance(yellow)doesnotdependontheevidenceratio.Itisremarkablyhigh,atanAUC/ROCof0.88,asaconsequenceofthehighlyindicative,sometimesevenformulaic,lan-guageusedinthecomments(examplesinSec.4.2).Thenetwork-onlymodel(blue)workspoorlyonverylittleevidence(AUC/ROC0.56for12.5%ev-

l

D
o
w
n
o
a
d
e
d

f
r
o
m
h

t
t

p

:
/
/

d
i
r
e
c
t
.

m

i
t
.

e
d
u

/
t

a
c
l
/

l

a
r
t
i
c
e

p
d

f
/

d
o

i
/

.

1
0
1
1
6
2

/
t

l

a
c
_
a
_
0
0
1
8
4
1
5
6
6
8
9
9

/

/
t

l

a
c
_
a
_
0
0
1
8
4
p
d

.

f

b
y
g
u
e
s
t

t

o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3

304

lllllllNumber of dropped featuresArea under the curve01050100500100020000.50.60.70.80.91.0llllllllllllll(a)AUC/ROClllllllNumber of dropped features01050100500100020000.20.40.60.81.0llllllllllllllCombinedSentimentNetworkRandom(b)AUC/negPRFigure4:AUCasfunctionofnumberofdroppedfeatures(Wikipedia),withstandarderrors.Evidenceratio75%.idence)butimprovessteadilyasmoreevidenceisused(AUC/ROC0.82for75%evidence):thisisin-tuitive,sincemoreevidencemeansstrongercontextforeachedgesigntobepredicted.Althoughthenetwork-onlymodelworkspoorlyonlittleevidence,ourfullmodel(black),whichsyn-thesizesthesentimentandnetworkmodels,isnotaf-fectedbythisandeffectivelydefaultstothebehaviorofthesentiment-onlymodel.Furthermore,althoughthenetwork-onlymodelneverattainstheperfor-manceofthesentiment-onlymodel,combiningthetwoinourfullmodel(black)nonethelessyieldsasmallperformanceboostintermsofAUC/ROCto0.89for75%evidence.Thegainsaresignifi-cantlylargerwhenweconsiderAUC/negPRinsteadofAUC/ROC:whilethesentimentmodelachieves0.60,thecombinedmodelimprovesonthisby13%,to0.68,at75%evidenceratio.Performanceasafunctionofsentiment-modelquality.Itseemshardtoimprovebymuchonasen-timentmodelthatachievesanAUC/ROCof0.88onitsown;theWikipediacorpusoffersanexception-allyexplicitlinguisticsignal.Hence,inournextex-periment,weexploresystematicallyhowourmodelbehavesunderalesspowerfulsentimentmodel.First,wemeasure,foreachfeature(i.e.,word),howinformativeitisonitsownforpredictingthesignsofedges(quantifiedbyitsmutualinformationwiththeedgesign),whichinducesarankingoffea-turesintermsofinformativeness.Now,tomakethesentimentmodellesspowerfulinacontrolledway,wedropthetopmfeaturesandrepeattheexperimentdescribedaboveforarangeofmvalues(wherewekeeptheevidenceratiofixedat75%).llllllllll1e−041e−021e+00Sentiment−model prediction peNorm. cost for deviating from pe0.10.20.30.40.50.60.70.80.91.0Figure5:Normalizedcostλ(i)(definedinSec.4.3;log-arithmicscale)fordeviatingfromsentiment-modelpre-dictionspe,forbinsi=1,…,10(Wikipedia).Upperbinboundariesonthex-axis.Valuesshownareaveragesover10folds.Evidenceratio75%.Fig.4showsthattheperformanceofthesenti-mentmodel(yellow)declinesdrasticallyasmorefeaturesareremoved.Thecombinedmodel(black),onthecontrary,ismuchlessaffected:whentheper-formanceofthesentimentmodeldropstothatofthenetwork-only(blue;ROC/AUC0.81),thecombinedmodelisstillstrongerthanboth(0.86).Evenasthesentimentmodelapproachesrandomperformance,thecombinedmodelstillneverdropsbelowthenet-work-onlymodel—itsimplylearnstodisregardthepredictionsofthesentimentmodelaltogether.Learnededgecosts.RecallfromthefinalpartofSec.3.4thateachoutputpeofthesentimentmodelfallsintooneof10bins[0,0.1],…,[0.9,1],withseparateedge-costparametersλ(i)1andλ(i)0learnedforeachbucketi.Therationalewastogivethemodelthefreedomtotradeoffedgeandtrianglecostsdifferentlyforeachedgee,dependingonhowinformativethesentimentmodel’spredictionpeis.Thegoalofthissectionistounderstandwhetherourmodelindeedexposessuchbehavior.RecallfromEq.3thatλ1istheconstantwithwhichtheab-solutedifferencebetweenpeandtheinferrededgesignxeismultipliedwhenxe>pe,whileλ0istheconstantwhenxe
Transactions of the Association for Computational Linguistics, 2 (2014) 297–310. Action Editor: Hal Daume III. image

Download pdf