Operazioni dell'Associazione per la Linguistica Computazionale, 2 (2014) 297–310. Redattore di azioni: Hal Daume III.
Submitted 5/2014; Revised 6/2014; Pubblicato 10/2014. C(cid:13)2014 Associazione per la Linguistica Computazionale.
297
ExploitingSocialNetworkStructureforPerson-to-PersonSentimentAnalysisRobertWestStanfordUniversitywest@cs.stanford.eduHristoS.PaskovStanfordUniversityhpaskov@stanford.eduJureLeskovecStanfordUniversityjure@cs.stanford.eduChristopherPottsStanfordUniversitycgpotts@stanford.eduAbstractPerson-to-personevaluationsareprevalentinallkindsofdiscourseandimportantfores-tablishingreputations,buildingsocialbonds,andshapingpublicopinion.Suchevaluationscanbeanalyzedseparatelyusingsignedso-cialnetworksandtextualsentimentanalysis,butthismissestherichinteractionsbetweenlanguageandsocialcontext.Tocapturesuchinteractions,wedevelopamodelthatpre-dictsindividualA’sopinionofindividualBbysynthesizinginformationfromthesignedsocialnetworkinwhichAandBareembed-dedwithsentimentanalysisoftheevaluativetextsrelatingAtoB.Weprovethatthisprob-lemisNP-hardbutcanberelaxedtoanef-ficientlysolvablehinge-lossMarkovrandomfield,andweshowthatthisimplementationoutperformstext-onlyandnetwork-onlyver-sionsintwoverydifferentdatasetsinvolvingcommunity-leveldecision-making:theWiki-pediaRequestsforAdminshipcorpusandtheConvoteU.S.Congressionalspeechcorpus.1IntroductionPeople’sevaluationsofoneanotherareprevalentinallkindsofdiscourse,publicandprivate,acrossages,genders,cultures,andsocialclasses(Dunbar,2004).Suchopinionsmatterforestablishingrep-utationsandreinforcingsocialbonds,andtheyareespeciallyconsequentialinpoliticalcontexts,wheretheytaketheformofendorsements,accusations,andassessmentsintendedtoswaypublicopinion.Thesignificanceofsuchperson-to-personevalu-ationsmeansthatthereisapressingneedforcom-putationalmodelsandtechnologiesthatcananalyzethem.Researchonsignedsocialnetworkssuggestsonepathforward:howonepersonwillevaluatean-othercanoftenbepredictedfromthenetworktheyareembeddedin.Linguisticsentimentanalysissug-gestsanotherpathforward:onecouldleveragetex-tualfeaturestopredictthevalenceofevaluativetextsdescribingpeople.Suchindependenteffortshavebeensuccessful,buttheygenerallyneglectthewaysinwhichsocialandlinguisticfeaturescomplementeachother.Insomesettings,textualdataissparsebutthenetworkstructureislargelyobserved;inoth-ers,textisabundantbutthenetworkispartlyorun-reliablyrecorded.Inaddition,weoftenseerichin-teractionsbetweenthetwokindsofinformation—politicalalliesmightteaseeachotherwithnegativelanguagetoenhancesocialbonds,andopponentsof-tenusesarcasticallypositivelanguageintheircriti-cisms.Separatesentimentorsigned-networkmod-elswillmissormisreadthesesignals.Wedevelop(Sec.3)agraphicalmodelthatsyn-thesizesnetworkandlinguisticinformationtomakemoreandbetterpredictionsaboutboth.Theobjec-tiveofthemodelistopredictA’sopinionofBusingasynthesisofthestructuralcontextaroundAandBinsidethesocialnetworkandsentimentanalysisoftheevaluativetextsrelatingAtoB.WeprovethattheproblemisNP-hardbutthatitcanberelaxedtoanefficientlysolvablehinge-lossMarkovrandomfield(Broecheleretal.,2010),andweshowthatthisimplementationoutperformstext-onlyandnetwork-onlyversionsintwoverydifferentdatasetsinvolv-ingcommunity-leveldecision-making:theWikipe-diaRequestsforAdminshipcorpus,inwhichWi-kipediaeditorsdiscussandvoteonwhoshouldbe
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
298
promotedwithintheWikipediahierarchy(Sec.4),andtheConvoteU.S.Congressionalspeechcorpus(Thomasetal.,2006),inwhichelectedofficialsdis-cusspoliticaltopics(Sec.5).Thesecorporadifferdramaticallyinsize,inthestyleandqualityoftheirtextualdata,andinthestructureandobservabilityoftheirnetworks.Together,theyprovideaclearpic-tureofhowjointmodelsoftextandnetworkstruc-turecanexcelwheretheircomponentpartscannot.2Backgroundandrelatedwork2.1SentimentanalysisInNLP,thelabelsentimentanalysiscoversdiversephenomenaconcerninghowinformationaboutemo-tions,attitudes,perspectives,andsocialidentitiesisconveyedinlanguage(PangandLee,2008).Mostworkassumesadimensionalmodelinwhichemo-tionsaredefinedprimarilybyvalence/polarityandarousal/intensity(Russell,1980;FeldmanBarrettandRussell,1998;RubinandTalerico,2009),andthedominantapplicationispredictingthevalenceofproduct,company,andservicereviews.Weadopttheconceptualassumptionsofthisworkforourbasicsentimentmodel,butourfocusisonperson-to-personevaluationsandtheirsocialconse-quences.Thisinvolveselementsofworkonmod-elingpoliticalaffiliation(Agrawaletal.,2003;Mal-oufandMullen,2008;Yuetal.,2008),bias(Yanoetal.,2010;Recasensetal.,2013),andstanceonde-batetopics(Thomasetal.,2006;SomasundaranandWiebe,2010;Linetal.,2006;Anandetal.,2011),buttheseaspectsofbeliefandsocialidentityarenotourprimaryconcern.Rather,weexpectthemtobepredictiveofthesentimentclassificationsweaimtomake—e.g.,iftwopeoplesharepoliticalviews,theywilltendtoevaluateeachotherpositively.Recentworkinsentimentanalysishasbroughtintopical,contextual,andsocialinformationtomakemorenuancedpredictionsaboutlanguage(Jurafskyetal.,2014;Wilsonetal.,2005;Blitzeretal.,2007).Webuildontheseinsightswithourmodel,whichseekstomodulatesentimentpredictionsbasedonnetworkinformation(andviceversa).2.2Signed-networkanalysisManysocialnetworksencodeperson-to-personsen-timentinformationviasignededgesbetweenuserssummarizingtheiropinionsofeachother.Inthisset-ting,onecanleveragesociologicaltheoriesofpair-wiserelationshipsandgroup-levelorganizationtoidentifyandunderstandpatternsintheserelation-ships(Heider,1946;CartwrightandHarary,1956).Balancetheoryisbasedonsimpleintuitionslike‘afriendofmyfriendismyfriend’,‘anenemyofmyenemyismyfriend’,and‘anenemyofmyfriendismyenemy’.Ingraphtheory,thesearestatementsabouttheedgesignsoftrianglesofconnectednodes:giventhesignsoftwoedges,balancetheorypredictsthethird,assummarizedinFig.1(UN),wherethetwogivenedges(gray)determinethethird(black).Fordirectedrelationships,Leskovecetal.(2010B)formulateanalternativecalledstatustheory,whichpositsthatnetworksorganizeaccordingtosocialsta-tus:anodehaspositiveedgestootherswithhigherstatusandnegativeedgestothosewithlowersta-tus.Fig.1(UN)illustratesthestructureofvariousdi-rectedsignedtriangles,wherethesignofthethirdedge(black)canbeinferredbasedonthesignsanddirectionsoftheothertwo(gray).Leskovecetal.(2010B)showthatsignededgesinnetworksemergeinamannerthatisbroadlycon-sistentwithbothofthesetheoriesandthatsocial-networkstructurealonecansupportaccurateedge-signpredictions(Leskovecetal.,2010a).Kunegisetal.(2013)predicthiddenpositiveandnegativeedgesinascenariowhereallobservededgesarepositive.Bachetal.(2013)andHuangetal.(2013)framesignpredictionasahinge-lossMarkovrandomfield,atypeofprobabilisticgraphicalmodelintroducedbyBroecheleretal.(2010).Ourmodelcombinestheseideaswithasentimentmodeltoachieveevenmorerobustpredictions.2.3Synthesisofsentiment&networkanalysisModelsofsentimentandsignednetworkshavebeensuccessfulatavarietyoftasks.However,missingfromthecurrentscientificpictureisadeepunder-standingofthewaysinwhichsentimentexpres-sionandsocialnetworksinteract.Tosomeextent,theseinteractionsarecapturedbyaddingcontextualanddemographicfeaturestoatext-basedsentimentmodel,butthosefeaturesonlyapproximatetherichrelationalstructureencodedinasignednetwork.Thomasetal.(2006)andTanetal.(2011)cap-italizeonthisinsightusinganelaborationofthe
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
299
+++–+–––+–––+++SOCIAL BALANCE THEORYSOCIAL STATUS THEORY+–?(UN)Theoriesofsocialbalanceandstatus++––??+?“You’re onecrazy mofo!”“Love u! :)”“To whom itmay concern”(B)DesiderataFigure1:(UN)Predictionsofsocialbalanceandstatustheoriesfortheboldblackedge,giventhethingrayedges.Balancetheoryreasonsaboutundirected,statustheoryaboutdirected,triangles.Inthestatusdiagrams,nodesizesignifiessocialstatus.Apositiveedgemaybereplacedbyanegativeedgeintheoppositedirection,andviceversa,withoutchangingtheprediction.Statustheorymakesnopredictionintherightmostcase.(B)Situationsofthesortweaimtocapture.Atleft,thenetworkresolvestextualambiguity.Atright,thetextcompensatesforedge-labelsparsity.graph-cutsapproachofPangandLee(2004).Theyareguidedbyanassumptionofhomophily,i.e.,thatcertainsocialrelationshipscorrelatewithagreementoncertaintopics:Thomasetal.(2006)usepartyaf-filiationandmentionsinspeechestopredictvotingpatterns,andTanetal.(2011)useTwitterfollowsandmentionstopredictattitudesaboutpoliticalandsocialevents.RelatedideasarepursuedbyMaetal.(2011)andHuetal.(2013),whoaddtermstotheirmodelsenforcinghomophilybetweenfriendswithregardtotheirpreferences.Weadoptsomeoftheassumptionsoftheaboveauthors,butourtaskisfundamentallydifferentintworespects.First,whereastheymodelperson-to-itemevaluations,wemodelperson-to-personevalu-ations;thesearealsothefocusofTangetal.(2013),who,Anche se,useanunsignednetwork,whereasourworkisgearedtowarddistinguishingpositiveandnegativeedgelabels.Second,theabovemodelsmakeoverarchinghomophilyassumptions,whereasweallowourmodeltoexplorethefullsetoftriangleconfigurationssuggestedbyFig.1(UN).3ModelHere,wearguethatcombiningtextualandstructuralfeaturescanhelppredictedgesigns.Weformu-lateamodel,showthatitiscomputationallyhard,andprovidearelaxedversionthatiscomputationallytractable,buildingonworkbyBachetal.(2013).3.1DesiderataInmanyreal-worldscenarios,richfeaturesareasso-ciatedwithedgesbetweentwopeople,suchascom-mentstheymadeabouteachother,messagestheyexchanged,orotherbehavioralfeatures.Suchfea-turesmaycontainastrongsentimentsignalusefulforpredictingedgesignsandmaybeusedtofitaconventionalsentimentmodel(Sec.2.1).Tuttavia,thesignofanedgealsodependsonthesignsofsurroundingedgesinthenetwork(Sec.2.2).Apurelyedge-feature–basedsentimentmodelcan-notaccountforthenetworkstructure,sinceitrea-sonsaboutedgesasindependentofeachother.Wearguethatconsideringsentimentandnetworkstructurejointlycanresultinbetterpredictionsthaneitheroneonitsown.Fig.1(B)providestwoillus-trativeexamples.Here,thegrayedgesignsareob-served,whilethepolaritiesoftheblackedgesaretobepredicted.Intheleftnetwork,thetextoftheblackedge(‘You’reonecrazymofo!’)mightsug-gestanegativepolarity.However,anegativeblackedgewouldmakebothtrianglesinconsistentwithbalancetheory(Fig.1(UN)),whereasapositiveblackedgemakesthemconsistentwiththetheory.So,inthiscase,thenetworkcontexteffectivelyhelpsde-tecttheteasing,non-literaltoneofthestatement.IntherightnetworkofFig.1(B),onlyoneofthreeedgesignsisobserved.Predictingtwopos-itiveedgeswouldbeconsistentwithbalancethe-ory,butthesamewouldbetrueforpredictingtwonegativeedges.Thetextonthelowerblackedge(‘Towhomitmayconcern’)doesnotcarryanyclearsentimentsignal,butthe‘Loveu!:)’ontheotheredgestronglysuggestsapositivepolarity.Thisletsusconcludethatthebottomedgeshouldprobablybepositive,pure,sinceotherwisethetrianglewouldcontradictbalancetheory.Thisshowsthatcombin-ingsentimentandnetworkfeaturescanhelpwhenjointlyreasoningaboutseveralunknownedgesigns.
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
300
3.2ProblemformulationWenowformulateamodelcapableofsynthesizingtextualandnetworkfeatures.Notation.WerepresentthegivensocialnetworkasasignedgraphG=(V,E,X),wheretheverticesVrepresentpeople;theedgesE,relationshipsbetweenpeopleinV;andthesignvectorx∈{0,1}|E|repre-sentsedgepolarities,i.e.,xe=1(xe=0)indicatesapositive(negative)polarityforedgee∈E.Sometypesofrelationshipsimplydirectededges(e.g.,followingauseronTwitter,orvotingonacan-didateinanelection),whereasothersimplyundi-rectededges(e.g.,friendshiponFacebook,oragree-mentinanetworkofpoliticians).Weformulateourproblemforundirectedgraphshere,buttheexten-siontodirectedgraphsisstraightforward.Wede-fineatrianglet={e1,e2,e3}⊆Etobeasetofthreeedgesthatformacycle,anduseTtoindi-catethesetofalltrianglesinG.Finally,weusext=(xe1,xe2,xe3)∈{0,1}3torefertot’sedgesigns.Optimizationproblem.Weassumethatthestruc-tureofthenetwork(i.e.,VandE)isfullyobserved,whereastheedgesignsxareonlypartiallyobserved.Further,weassumethatwehaveasentimentmodelthatoutputs,foreachedgeeindependently,anes-timatepeoftheprobabilitythateisofpositivepo-larity,basedontextualfeaturesassociatedwithe.Thetask,Poi,istoinfertheunobservededgesignsbasedontheobservedinformation.Thehigh-levelideaisthatwewanttoinferedgesignsthat(1)agreewiththepredictionsofthesen-timentmodel,E(2)formtrianglesthatagreewithsocialtheoriesofbalanceandstatus.Itisnotalwayspossibletomeetbothobjectivessimultaneouslyforalledgesandtriangles,soweneedtofindatrade-off.Thisgivesrisetoacombinatorialoptimizationproblem,whichwetermTRIANGLEBALANCE,thatseekstofindedgesignsx∗thatminimizeanobjec-tiveconsistingofbothedgeandtrianglecosts:1x∗=argminx∈{0,1}|E|Xe∈Ec(xe,pe)+Xt∈Td(xt).(1)Thefirsttermisthetotaledgecost,inwhicheachedgeecontributesacostcapturinghowmuchitsin-ferredsignxedeviatesfromthepredictionpeofthe1Ofcourse,theentriesofx∗correspondingtoobservededgesarenotvariablebutfixedtotheirobservedvaluesinEq.1.sentimentmodel.Thesecondterm,thetotaltrianglecost,penalizeseachtriangletaccordingtohowun-desirableitsconfigurationisunderitsinferredsignsxt(e.g.,ifitcontradictsstatusorbalancetheory).Weusethefollowingedgecostfunction:C(xe,pe)=λ1(1−pe)xe+λ0pe(1−xe).(2)Here,λ1,λ0∈R+aretunableparametersthatallowforasymmetriccostsforpositiveandnegativeedges,rispettivamente,andpeistheprobabilityofedgeebe-ingpositiveaccordingtothesentimentmodelalone.Intuitively,themoretheinferrededgesignxedevi-atesfromthepredictionpeofthesentimentmodel,thehighertheedgecost.(NotethatatmostoneofthetwosumfactorsofEq.2isnon-zero.)Thetrianglecostfortriangletissignifiedbyd(xt),whichcanonlytakeon8distinctvaluesbe-causext∈{0,1}3(inpractice,therearesymmetriesthatdecreasethisnumberto4).Theparametersd(xt)maybetunedsothattriangleconfigurationsthatagreewithsocialtheoryhavelowcosts,whilethosethatdisagreewithit(e.g.,‘theenemyofmyfriendismyfriend’)havehighcosts.3.3ComputationalcomplexityTheproblemdefinedinEq.1isintuitive,Ma,aswithmanycombinatorialoptimizationproblems,itishardtofindagoodsolution.Inparticular,wesketchaproofofthistheoreminAppendixA:Theorem1.TRIANGLEBALANCEisNP-hard.3.4RelaxationasaMarkovrandomfieldTheobjectivefunctionofEq.1maybeseenasdefin-ingaMarkovrandomfield(MRF)overtheunderly-ingsocialnetworkG,withedgepotentials(definedbyc)andtrianglepotentials(definedbyd).Infer-enceinMRFs(i.e.,computingx∗)isawell-stud-iedtaskforwhichavarietyofmethodshavebeenproposed(KollerandFriedman,2009).Tuttavia,sinceourproblemisNP-hard,nomethodcanbeex-pectedtofindx∗efficiently.Onewayofdealingwiththecomputationalhardnesswouldbetofindanap-proximatebinarysolution,usingtechniquessuchasGibbssamplingorbeliefpropagation.Anotherop-tionistoconsideracontinuousrelaxationofthebi-naryproblemandfindanexactnon-binarysolutionwhoseedgesignsarecontinuous,i.e.,xe∈[0,1].
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
301
Wetakethislatterapproachandcastourproblemasahinge-lossMarkovrandomfield(HL-MRF).ThisisinspiredbyBachetal.(2013),whoalsouseanHL-MRFtopredictedgesignsbasedontrianglestructure,butdonotuseanyedgefeatures.AnHL-MRFisanMRFwithcontinuousvariablesandwithpotentialsthatcanbeexpressedassumsofhinge-losstermsoflinearfunctionsofthevariables(cf.Broecheleretal.(2010)fordetails).HL-MRFshavetheadvantagethattheirobjectivefunctionisconvexsothat,unlikebinaryMRFs(asdefinedbyEq.1),exactinferenceisefficient(Bachetal.,2013).Weachievearelaxationbyusingsumsofhinge-losstermstointerpolatecoverthecontinuousdo-main[0,1]andd,Sopra[0,1]3(eventhoughtheyaredefinedonlyforbinarydomains).Asaresult,theHL-MRFformulationisequivalenttoEq.1whenallxearebinary,butitalsohandlescontinuousval-uesgracefully.Wenowinterpretareal-valued‘sign’xe∈[0,1]asthedegreetowhicheispositive.Westartbyshowinghowtotransformc:eventhoughitcouldbeusedinitscurrentform(Eq.2),wecreateatighterrelaxationbyusing˜c(xe,pe)=λ1kxe−pek++λ0kpe−xek+,(3)wherekyk+=max{0,sì}isthehingeloss.Atmostonetermcanbeactiveforanyxe∈[0,1]duetothehingeloss,andc(xe,pe)=˜c(xe,pe)forbinaryxe.Torewrited,noticethat,foranyxt∈{0,1}3,wecanwritedasd(xt)=Xz∈{0,1}3D(z)δ(xt,z),(4)whereδ(xt,z)=1ifxt=zand0otherwise.Whileδisnotconvex,wecanusef(xt,z)=k1−kxt−zk1k+(5)asaconvexsurrogate.Whenxtisbinary,eitherxt=zsokxt−zk1=0orxt6=zsokxt−zk1≥1,andhencef(xt,z)=δ(xt,z).Toproveconvexity,notethat,foranyfixedbinaryz∈{0,1}3,kx−zk1=P3i=1|xi−zi|islinearinx∈[0,1]3,since|xi−zi|equalseitherxi(ifzi=0)or1−xi(ifzi=1).Itfol-lowsthatfisahinge-lossfunctionofalineartrans-formationofxtandthereforeconvexinxt.TrainingTestingEvidence To inferRANDOM SAMPLINGBFS SAMPLINGuvFigure2:Optionsfortrainingandtestingourmodel.Requiringthetrianglecostd(z)tobenonnegativeforalltriangletypesz∈{0,1}3,wecanuse˜d(xt)=Xz∈{0,1}3D(z)F(xt,z)(6)asaconvexsurrogateford.OuroveralloptimizationproblemisthenthefollowingrelaxationofEq.1:x∗=argminx∈[0,1]|E|Xe∈E˜c(xe,pe)+Xt∈T˜d(xt).(7)ThisobjectivehastheexactformofanHL-MRF,sinceitisaweightedsumofhingelossesoflin-earfunctionsofx.WeusetheProbabilisticSoftLogicpackage2toperformtheoptimization,whichisinturnbasedonthealternating-directionmethodofmultipliers(ADMM)(Boydetal.,2011).Learning.Clearly,asolutionisonlyusefulifthecostparameters(λ1,λ0,andd(z)forallz∈{0,1}3)aresetappropriately.Oneoptionwouldbetosetthevaluesheuristically,basedonthepredictionsmadebythesocialbalanceandstatustheories(Sec.2.2).Tuttavia,itismoreprincipledtolearntheseparam-etersfromdata.Forthispurpose,weleveragethelearningproceduresincludedintheHL-MRFimple-mentationweemploy,whichusesthevoted-percep-tronalgorithmtoperformmaximum-likelihoodesti-mation(Bachetal.,2013).Sinceourdatapoints(edges)interactwitheachotherviathenetwork,somewordsonhowweper-formtrainingandtestingareinorder.Fig.2showstwooptionsforobtainingtrainingandtestingsets(weusebothoptionsinourexperiments).Inthe‘randomsampling’paradigm,werandomlychooseasetofedgesfortraining(blue),andadisjointsetofedgesfortesting(yellow).In‘BFSsampling’,we2http://psl.umiacs.umd.edu
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
302
runabreadth-firstsearchfromseednodeutoobtainacoherenttrainingset(blue),andlikewisefromaseednodevtoobtainacoherenttestingset(yellow),takingcarethatnoedgesfromthetrainingsetarealsoincludedinthetestingset.Duringbothtrainingandtesting,anarbitrarypor-tionoftheedgesignsmaybefixedtoobservedval-uesandneednotbeinferred.ThesearethesolidedgesinFig.2;werefertothemasevidence.Fur-ther,wedefinetheevidenceratioasthenumberofevidenceedges,dividedbythenumberofalledgesconsidered(solidanddashed).Thelearningalgorithmmayusethestructure(VandE)ofthetraininggraphinducedbyallblueedges(solidanddashed),thepredictionspeofthesentimentmodelforallblueedges,andthesignsofthesolidblueedgestopredictthedashedblueedges.Duringtesting,thenetworkstructureofallyel-lowedges,thesentimentpredictionsforallyellowedges,andthesignsofthesolidyellowedgesmaybeusedtopredictthedashedyellowedgesigns.Inprinciple,alltrainingedgescouldbeusedasextraevidencefortesting(i.e.,allblueedgesmaybemadesolidyellow).Tuttavia,inourexperiments,wekeepthetrainingandtestingsetsfullydisjoint.Technicaldetails.Forclarity,wegivefurtherde-tails.First,thedistributionofpositiveandnegativesignsmaybeskewed;e.g.,weobserveapriorprob-abilityof76%positivesignsinourWikipediacor-pus(Sec.4).Therefore,asalsodonebyBachetal.(2013),weaddacosttermtoourobjective(Eq.7)thatpenalizesdeviationsfromthispriorprobabil-ity(asestimatedonthetrainingset).Thisensuresthatthemodelcandefaulttoareasonablepredictionforedgesthatarenotembeddedinanytrianglesandaboutwhichthesentimentmodelisuncertain.Second,intuitively,weshouldnotpenalizedevi-atingfromthesentimentmodelwhenitisitselfun-certainaboutitsprediction(i.e.,whenpeisfarfromboth0and1).Piuttosto,wewanttorelymoreheavilyonsignalsfromthenetworkstructureinsuchcases.Toachievethis,weintroduce10pairsofcostparam-eters(λ(1)1,λ(1)0),…,(λ(10)1,λ(10)0).Then,wedividetheinterval[0,1]into10bins,andwhenpefallsintothei-thbin,weuseλ(io)1andλ(io)0inEq.3.Thisway,largercostscanbelearnedfortheextremebinscloseto0and1thanfortheintermediatebinsaround0.5.Finally,hinge-losstermsmayoptionallybesquaredinHL-MRFs.WeusethesquaredhingelossinEq.5,sinceinitialexperimentationshowedthistoperformslightlybetterthanthelinearhingeloss.4WikipediaexperimentsOurfirstsetofexperimentsisconductedontheWi-kipediaRequestsforAdminshipcorpus,whichal-lowsustoevaluateourmodel’sabilitytopredictperson-to-personevaluationsinWebtextsthatareinformalbutpertaintoimportantsocialoutcomes.4.1DatasetdescriptionForaWikipediaeditortobecomeanadministrator,arequestforadminship(RfA)3mustbesubmitted,eitherbythecandidateorbyanothercommunitymember.Subsequently,anyWikipediamembermaycastasupporting,neutro,oropposingvote.Thisinducesadirected,signednetworkinwhichnodesrepresentWikipediamembersandedgesrepresentvotes(wediscardneutralvotes).Wecrawledandparsedallvotessincetheadop-tionoftheRfAprocessin2003throughMay2013.4ThissignednetworkwaspreviouslyanalyzedbyLeskovecetal.(2010B;2010UN).Tuttavia,thereisalsoarichtextualcomponentthathassofarre-maineduntappedforedge-signprediction:eachvoteistypicallyaccompaniedbyashortcomment(me-dian/mean:19/34gettoni).Atypicalpositivecom-mentreads,‘I’venoconcerns,willmakeanexcel-lentadditiontotheadmincorps’,whileanexampleofanegativecommentis,‘Littleevidenceofcollab-orationwithothereditorsandlimitedcontentcre-ation.’ThepresenceofavotingnetworkalongsidetextualedgefeaturesmakesourmethodofSec.3well-suitedforthisdataset.TheRfAnetworkcontains11Knodes,160Kedges(76%positive),andcloseto1Mtriangles.4.2ExperimentalsetupTrain/testsets.Wefollowthetrain–testparadigmtermed‘BFSsampling’inSec.3.4andFig.2,choos-ing10randomseednodes,fromeachofwhichweperformabreadth-firstsearch(followingbothin-andout-links)untilwehavevisited350nodes.We3http://en.wikipedia.org/wiki/Wikipedia:RfA4Dataavailableonline(West,2014).
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
303
thusobtain10subgraphswith350nodeseach.Wetrainamodelforeachsubgraphiandtestitonsub-graphi+1(mod10),ensuringthatedgesfromthetraininggraphareremovedfromthetestinggraph.TheBFSsamplingparadigmwasusedbecausethealternative(‘randomsampling’inFig.2)pro-ducessubgraphswithmostlyisolatededgesandonlyafewtriangles—anunrealisticscenario.Evaluatedmodels.Weevaluatethreemodels:1.Astandard,text-basedsentimentmodelthattreatsedgesasindependentdatapoints;2.ourfullmodelasspecifiedinSec.3.4,whichcombinesedgecostsbasedonthepredictionsofthetext-basedsentimentmodelwithtrianglecostscapturingnetworkcontext;3.aversionofourmodelthatconsidersonlytrian-glecosts,whileignoringthepredictionsofthetext-basedsentimentmodel(akintothemodelproposedbyBachetal.(2013)).Werefertothesemodelsas‘sentiment’,‘combined’,and‘network’,respectively.Sentimentmodel.Ourtext-basedsentimentmodelisanL2-regularizedlogistic-regressionclassifierwhosefeaturesaretermfrequenciesofthe10,000overallmostfrequentwords.TheL2-penaltyischo-senviacross-validationonthetrainingset.Sincecommentsoftenexplicitlycontainthelabel(‘sup-port’or‘oppose’),weremoveallwordswithpre-fixes‘support’or‘oppos’.Wetrainthemodelonlyonce,onarandomsampleof1,000commentsdrawnfromthesetofall160Kcomments(thevastmajorityofwhichwillnotappearinour10subgraphs).Evidenceratio.Regardingtheothertwomodels,recallfromSec.3.4ourdefinitionoftheevidenceratio,thefractionofedgesignsthatarefixedasevi-denceandneednotbeinferred.Inourexperiments,weexploretheimpactoftheevidenceratioduringtrainingandtesting,sinceweexpectperformancetoincreaseasmoreevidenceisavailable.(Weusethesameevidenceratiofortrainingandtesting,butthisneednotnecessarilybeso.)Metrics.Asourprincipalevaluationmetrics,weusetheareasunderthecurve(AUC)ofthereceiveroperatingcharacteristic(ROC)curveaswellastheprecision–recall(PR)curves.TherearetwoPRllllEvidence ratioArea under the curve12.5%25%50%75%0.50.60.70.80.91.0llllllll(UN)AUC/ROCllllEvidence ratio12.5%25%50%75%0.20.40.60.81.0llllllllCombinedSentimentNetworkRandom(B)AUC/negPRFigure3:AUCasfunctionofevidenceratio(Wikipedia),withstandarderrors.curves,oneforthepositiveclass,theotherforthenegativeone.Ofthesetwo,thepositiveclassislessinteresting:duetotheclassimbalanceof76%posi-tiveedges,evenarandomguesserwouldachieveanAUCof0.76.ThePRcurveofthenegativeclassismoreinformative:here,itismuchhardertoachievehighAUC,sincerandomguessingyieldsonly0.24.Moreover,thenegativeedgesarearguablymoreim-portant,notonlybecausetheyarerarer,butalsobe-causetheyindicatetensionsinthenetwork,whichwemightbeinterestedindetectingandresolving.Forthesereasons,wereportonlytheAUCunderthenegativePRcurve(AUC/negPR)here.Additionally,wereporttheareaundertheROCcurve(AUC/ROC),astandardmetricforquantify-ingclassificationperformanceonunbalanceddata.Itcapturestheprobabilityofarandompositivetestexamplereceivingahigherscorethanarandomneg-ativeone(soguessinggivesanAUC/ROCof0.5).4.3ResultsPerformanceasafunctionofevidenceratio.TheAUCsasfunctionsoftheevidenceratioareshowninFig.3(UN).(WeemphasizethattheseplotsarenotthemselvesROCandPRcurves;Piuttosto,theyarede-rivedfromthosecurvesbymeasuringAUCforarangeofmodels,parametrizedbyevidenceratio.)Sinceweusethesamesentimentmodelinallcases(Sec.4.2),itsperformance(yellow)doesnotdependontheevidenceratio.Itisremarkablyhigh,atanAUC/ROCof0.88,asaconsequenceofthehighlyindicative,sometimesevenformulaic,lan-guageusedinthecomments(examplesinSec.4.2).Thenetwork-onlymodel(blue)workspoorlyonverylittleevidence(AUC/ROC0.56for12.5%ev-
l
D
o
w
N
o
UN
D
e
D
F
R
o
M
H
T
T
P
:
/
/
D
io
R
e
C
T
.
M
io
T
.
e
D
tu
/
T
UN
C
l
/
l
UN
R
T
io
C
e
–
P
D
F
/
D
o
io
/
.
1
0
1
1
6
2
/
T
l
UN
C
_
UN
_
0
0
1
8
4
1
5
6
6
8
9
9
/
/
T
l
UN
C
_
UN
_
0
0
1
8
4
P
D
.
F
B
sì
G
tu
e
S
T
T
o
N
0
7
S
e
P
e
M
B
e
R
2
0
2
3
304
lllllllNumber of dropped featuresArea under the curve01050100500100020000.50.60.70.80.91.0llllllllllllll(UN)AUC/ROClllllllNumber of dropped features01050100500100020000.20.40.60.81.0llllllllllllllCombinedSentimentNetworkRandom(B)AUC/negPRFigure4:AUCasfunctionofnumberofdroppedfeatures(Wikipedia),withstandarderrors.Evidenceratio75%.idence)butimprovessteadilyasmoreevidenceisused(AUC/ROC0.82for75%evidence):thisisin-tuitive,sincemoreevidencemeansstrongercontextforeachedgesigntobepredicted.Althoughthenetwork-onlymodelworkspoorlyonlittleevidence,ourfullmodel(black),whichsyn-thesizesthesentimentandnetworkmodels,isnotaf-fectedbythisandeffectivelydefaultstothebehaviorofthesentiment-onlymodel.Furthermore,althoughthenetwork-onlymodelneverattainstheperfor-manceofthesentiment-onlymodel,combiningthetwoinourfullmodel(black)nonethelessyieldsasmallperformanceboostintermsofAUC/ROCto0.89for75%evidence.Thegainsaresignifi-cantlylargerwhenweconsiderAUC/negPRinsteadofAUC/ROC:whilethesentimentmodelachieves0.60,thecombinedmodelimprovesonthisby13%,to0.68,at75%evidenceratio.Performanceasafunctionofsentiment-modelquality.Itseemshardtoimprovebymuchonasen-timentmodelthatachievesanAUC/ROCof0.88onitsown;theWikipediacorpusoffersanexception-allyexplicitlinguisticsignal.Hence,inournextex-periment,weexploresystematicallyhowourmodelbehavesunderalesspowerfulsentimentmodel.First,wemeasure,foreachfeature(i.e.,word),howinformativeitisonitsownforpredictingthesignsofedges(quantifiedbyitsmutualinformationwiththeedgesign),whichinducesarankingoffea-turesintermsofinformativeness.Now,tomakethesentimentmodellesspowerfulinacontrolledway,wedropthetopmfeaturesandrepeattheexperimentdescribedaboveforarangeofmvalues(wherewekeeptheevidenceratiofixedat75%).llllllllll1e−041e−021e+00Sentiment−model prediction peNorm. cost for deviating from pe0.10.20.30.40.50.60.70.80.91.0Figure5:Normalizedcostλ(io)(definedinSec.4.3;log-arithmicscale)fordeviatingfromsentiment-modelpre-dictionspe,forbinsi=1,…,10(Wikipedia).Upperbinboundariesonthex-axis.Valuesshownareaveragesover10folds.Evidenceratio75%.Fig.4showsthattheperformanceofthesenti-mentmodel(yellow)declinesdrasticallyasmorefeaturesareremoved.Thecombinedmodel(black),onthecontrary,ismuchlessaffected:whentheper-formanceofthesentimentmodeldropstothatofthenetwork-only(blue;ROC/AUC0.81),thecombinedmodelisstillstrongerthanboth(0.86).Evenasthesentimentmodelapproachesrandomperformance,thecombinedmodelstillneverdropsbelowthenet-work-onlymodel—itsimplylearnstodisregardthepredictionsofthesentimentmodelaltogether.Learnededgecosts.RecallfromthefinalpartofSec.3.4thateachoutputpeofthesentimentmodelfallsintooneof10bins[0,0.1],…,[0.9,1],withseparateedge-costparametersλ(io)1andλ(io)0learnedforeachbucketi.Therationalewastogivethemodelthefreedomtotradeoffedgeandtrianglecostsdifferentlyforeachedgee,dependingonhowinformativethesentimentmodel’spredictionpeis.Thegoalofthissectionistounderstandwhetherourmodelindeedexposessuchbehavior.RecallfromEq.3thatλ1istheconstantwithwhichtheab-solutedifferencebetweenpeandtheinferrededgesignxeismultipliedwhenxe>pe,whileλ0istheconstantwhenxe