Transactions of the Association for Computational Linguistics, vol. 2, pp. 505–516, 2014. Action Editor: Janyce Wiebe.

Submitted 4/2014; Revised 8/2014; Published November 1, 2014. © 2014 Association for Computational Linguistics.

Joint Modeling of Opinion Expression Extraction and Attribute Classification

Bishan Yang
Department of Computer Science
Cornell University
bishan@cs.cornell.edu

Claire Cardie
Department of Computer Science
Cornell University
cardie@cs.cornell.edu

Abstract

In this paper, we study the problems of opinion expression extraction and expression-level polarity and intensity classification. Traditional fine-grained opinion analysis systems address these problems in isolation and thus cannot capture interactions among the textual spans of opinion expressions and their opinion-related properties. We present two types of joint approaches that can account for such interactions during 1) both learning and inference or 2) only during inference. Extensive experiments on a standard dataset demonstrate that our approaches provide substantial improvements over previously published results. By analyzing the results, we gain some insight into the advantages of different joint models.

1 Introduction

Automatic extraction of opinions from text has attracted considerable attention in recent years. In particular, significant research has focused on extracting detailed information for opinions at the fine-grained level, e.g. identifying opinion expressions within a sentence and predicting phrase-level polarity and intensity. The ability to extract fine-grained opinion information is crucial in supporting many opinion-mining applications such as opinion summarization, opinion-oriented question answering and opinion retrieval.

In this paper, we focus on the problem of identifying opinion expressions and classifying their attributes. We consider as an opinion expression any subjective expression that explicitly or implicitly conveys emotions, sentiment, beliefs, opinions (i.e. private states) (Wiebe et al., 2005), and consider two key attributes (polarity and intensity) for characterizing the opinions. Consider the sentence in Figure 1, for example. The phrases “a bias in favor of” and “being severely criticized” are opinion expressions containing positive sentiment with medium intensity and negative sentiment with high intensity, respectively.

He demonstrated [a bias in favor of](positive, medium) the rebels despite [being severely criticized](negative, high).

Figure 1: An example sentence annotated with opinion expressions and their polarity and intensity. We use colored boxes to mark the textual spans of opinion expressions where green (red) denotes positive (negative) polarity, and use subscripts to denote intensity.

Most existing approaches tackle the tasks of opinion expression extraction and attribute classification in isolation. The first task is typically formulated as a sequence labeling problem, where the goal is to label the boundaries of text spans that correspond to opinion expressions (Breck et al., 2007; Yang and Cardie, 2012). The second task is usually treated as a binary or multi-class classification problem (Wilson et al., 2005; Choi and Cardie, 2008; Yessenalina and Cardie, 2011), where the goal is to assign a class label to a text fragment (e.g. a phrase or a sentence). Solutions to the two tasks can be applied in a pipeline architecture to extract opinion expressions and their attributes. However, pipeline systems suffer from error propagation: opinion expression errors propagate and lead to unrecoverable errors in attribute classification.

Limited work has been done on the joint modeling of opinion expression extraction and attribute classification. Choi and Cardie (2010) first proposed a joint sequence labeling approach to extract opinion expressions and label them with polarity and intensity. Their approach treats both expression extraction and attribute classification as token-level sequence labeling tasks, and thus cannot model the label distribution over expressions even though the annotations are given at the expression level. Johansson and Moschitti (2011) considered a pipeline of opinion extraction followed by polarity classification and proposed re-ranking its k-best outputs using global features. One key issue, however, is that the approach enumerates the k-best outputs in a pipeline manner, and thus these do not necessarily correspond to the k-best global decisions. Moreover, as the number of opinion attributes grows, it is not clear how to identify the best k for each attribute.

In contrast to existing approaches, we formulate opinion expression extraction as a segmentation problem and attribute classification as segment-level attribute labeling. To capture their interactions, we present two types of joint approaches: (1) joint learning approaches, which combine opinion segment detection and attribute labeling into a single probabilistic model, and estimate parameters for this joint model; and (2) joint inference approaches, which build separate models for opinion segment detection and attribute labeling at training time, and jointly apply these (via a single objective function) only at test time to identify the best “combined” decision of the two models.

To investigate the effectiveness of our approaches, we conducted extensive experiments on a standard corpus for fine-grained opinion analysis (the MPQA corpus (Wiebe et al., 2005)). We found that all of our proposed approaches provide substantial improvements over the previously published results. We also compared our approaches to a strong pipeline baseline and observed that joint learning results in a significant boost in precision while joint inference, with an appropriate objective, can significantly boost both precision and recall and obtain the best overall performance. Error analysis provides additional understanding of the differences between the joint learning and joint inference approaches, and suggests that joint inference can be more effective and more efficient for the task in practice.

2 Related Work

Significant research effort has been invested in the task of fine-grained opinion analysis in recent years (Wiebe et al., 2005; Wilson et al., 2009). Wilson et al. (2005) first motivated and studied phrase-level polarity classification on an open-domain corpus. Choi and Cardie (2008) developed inference rules to capture compositional effects at the lexical level on phrase-level polarity classification. Yessenalina and Cardie (2011) and Socher et al. (2013) learn continuous-valued phrase representations by combining the representations of words within an opinion expression and using them as features for classifying polarity and intensity. All of these approaches assume the opinion expressions are available before training the classifiers. However, in real-world settings, the spans of opinion expressions within the sentence are not available. In fact, Choi and Cardie (2008) demonstrated that the performance of expression-level polarity classification degrades as more surrounding (but irrelevant) context is considered. This motivates the additional task of identifying the spans of opinion expressions.

Opinion expression extraction has been successfully tackled via sequence tagging methods. Breck et al. (2007) applied conditional random fields to assign each token a label indicating whether it belongs to an opinion expression or not. Yang and Cardie (2012) employed a segment-level sequence labeler based on semi-CRFs with rich phrase-level syntactic features. In this work, we also utilize semi-CRFs to model opinion expression extraction.

There has been limited work on the joint modeling of opinion expression extraction and attribute classification. Choi and Cardie (2010) first developed a joint sequence labeler that jointly tags opinions, polarity and intensity by training CRFs with hierarchical features (Zhao et al., 2008). One major drawback of their approach is that it models both opinion extraction and attribute labeling as token-level sequence labeling tasks, and thus cannot model their interactions at the expression level. Johansson and Moschitti (2011) and Johansson and Moschitti (2013) propose a joint approach to opinion expression extraction and polarity classification by re-ranking its k-best output using global features. One major issue with their approach is that the k-best candidates were obtained without global reasoning about the relative uncertainty in the individual stages. As the number of considered attributes grows, it also becomes harder to decide how many predictions to select from each attribute classifier.

Compared to the existing approaches, our joint models have the advantage of modeling opinion expression extraction and attribute classification at the segment level, and more importantly, they provide a principled way of combining the segmentation and classification components.

Our work follows a long line of joint modeling research that has demonstrated great success for various NLP tasks (Roth and Yih, 2004; Punyakanok et al., 2004; Finkel and Manning, 2010; Rush et al., 2010; Choi et al., 2006; Yang and Cardie, 2013). Methods tend to fall into one of two joint modeling frameworks: the first learns a joint model that captures global dependencies; the other uses independently-learned models and considers global dependencies only during inference. In this work, we study both types of joint approaches for opinion expression extraction and opinion attribute classification.

3 Approach

In this section, we present our approaches for the joint modeling of opinion expression extraction and attribute classification. Specifically, given a sentence, our goal is to identify the spans of opinion expressions, and simultaneously assign their polarity and intensity. Training data consists of a collection of sentences with manually annotated opinion expression spans, each associated with a polarity label that takes values from {positive, negative, neutral}, and an intensity label, taking values from {high, medium, low}.

In the following, we first describe how we model opinion expression extraction as a segment-level sequence labeling problem and model attribute prediction as a classification problem. Then we propose our joint models for combining opinion segmentation and attribute classification.

3.1 Opinion Expression Extraction

The problem of opinion expression extraction assumes tokenized sentences as input and outputs the spans of the opinion expressions in each sentence. Previous work has tackled this problem using token-based sequence labeling methods such as CRFs (e.g. Breck et al. (2007), Yang and Cardie (2012)). However, semi-Markov CRFs (Sarawagi and Cohen, 2004) (henceforth semi-CRF) have been shown to be more appropriate for the task than CRFs since they allow contiguous spans in the input sequence (e.g. a noun phrase) to be treated as a group rather than as distinct tokens. Thus, they can easily capture segment-level information like syntactic constituent structure (Yang and Cardie, 2012). Therefore we adopt the semi-CRF model for opinion expression extraction here.

Given a sentence $x$, denote an opinion segmentation as $y_s = \langle (s_0, b_0), \ldots, (s_k, b_k) \rangle$, where the $s_{0:k}$ are consecutive segments that form a segmentation of $x$; each segment $s_i = (t_i, u_i)$ consists of the positions of the start token $t_i$ and an end token $u_i$; and each $s_i$ is associated with a binary variable $b_i \in \{I, O\}$, which indicates whether it is an opinion expression (I) or not (O). Take the sentence in Figure 1, for example. The corresponding opinion segmentation is

$$y_s = \langle ((0,0),O), ((1,1),O), ((2,6),I), ((7,8),O), ((9,9),O), ((10,12),I), ((13,13),O) \rangle,$$

where each segment corresponds to an opinion expression or to a phrase unit that does not express any opinion.

Using a semi-Markov CRF, we model the conditional distribution over all possible opinion segmentations given the input $x$:

$$P(y_s \mid x) = \frac{\exp\{\sum_{i=1}^{|y_s|} \theta \cdot f(y_{s_i}, y_{s_{i-1}}, x)\}}{\sum_{y'_s \in Y} \exp\{\sum_{i=1}^{|y'_s|} \theta \cdot f(y'_{s_i}, y'_{s_{i-1}}, x)\}} \qquad (1)$$

where $\theta$ denotes the model parameters, $y_{s_i} = (s_i, b_i)$ and $f$ denotes a feature function that encodes the potentials of the boundaries for opinion segments and the potentials of transitions between two consecutive labeled segments.
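To make the segmentation notation concrete, the following is a minimal sketch of how the Figure 1 segmentation could be held in code. The tokenization and the helper names `is_segmentation` and `opinion_expressions` are illustrative assumptions, not part of the paper; only the index pairs and labels are taken from the example above.

```python
from typing import List, Tuple

# One labelled segment: ((start, end), label), with inclusive token positions
# and label "I" (opinion expression) or "O" (not an opinion expression).
Segment = Tuple[Tuple[int, int], str]

# Assumed whitespace tokenization of the Figure 1 sentence (14 tokens).
TOKENS = ("He demonstrated a bias in favor of the rebels "
          "despite being severely criticized .").split()

# The opinion segmentation y_s given in the text for this sentence.
FIGURE1_SEGMENTATION: List[Segment] = [
    ((0, 0), "O"), ((1, 1), "O"), ((2, 6), "I"), ((7, 8), "O"),
    ((9, 9), "O"), ((10, 12), "I"), ((13, 13), "O"),
]


def is_segmentation(y_s: List[Segment], n: int) -> bool:
    """Check that the segments are consecutive and cover tokens 0..n-1 exactly."""
    expected_start = 0
    for (start, end), label in y_s:
        if start != expected_start or end < start or label not in ("I", "O"):
            return False
        expected_start = end + 1
    return expected_start == n


def opinion_expressions(tokens: List[str], y_s: List[Segment]) -> List[str]:
    """Return the text of every segment labelled as an opinion expression."""
    return [" ".join(tokens[t:u + 1]) for (t, u), b in y_s if b == "I"]


print(is_segmentation(FIGURE1_SEGMENTATION, len(TOKENS)))
print(opinion_expressions(TOKENS, FIGURE1_SEGMENTATION))
```

Under this tokenization the two segments labelled I recover exactly the two phrases discussed for Figure 1.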


Note that the probability is normalized over all possible opinion segmentations. To reduce the training complexity, we adopted the method described in Yang and Cardie (2012), which only normalizes over segment candidates that are plausible according to the parsing structure of the sentence. Figure 2 shows some candidate segmentations generated for an example sentence. Such a technique results in a large reduction in training time and was shown to be effective for identifying opinion expressions.

[Figure 2: Examples of Segmentation Candidates, shown for the sentence “We hope to eradicate the eternal scourge of corruption.”]

The standard training objective of a semi-CRF is to minimize the log loss

$$L(\theta) = \arg\min_{\theta} - \sum_{i=1}^{N} \log P(y_s^{(i)} \mid x^{(i)}) \qquad (2)$$

It penalizes any predicted opinion expression whose boundaries do not exactly align with the boundaries of the correct opinion expressions using 0-1 loss. Unfortunately, exact boundary matching is often not used as an evaluation metric for opinion expression extraction since it is hard for human annotators to agree on the exact boundaries of opinion expressions (see Footnote 1). Most previous work used proportional matching (Johansson and Moschitti, 2013) as it takes into account the overlapping proportion of the predicted and the correct opinion expressions to compute precision and recall. To incorporate this evaluation metric into training, we use softmax-margin (Gimpel and Smith, 2010), which replaces $P(y_s^{(i)} \mid x^{(i)})$ in (2) with $P_{cost}(y_s^{(i)} \mid x^{(i)})$, which equals

$$\frac{\exp\{\sum_{i=1}^{|y_s|} \theta \cdot f(y_{s_i}, y_{s_{i-1}}, x)\}}{\sum_{y'_s \in Y} \exp\{\sum_{i=1}^{|y'_s|} \theta \cdot f(y'_{s_i}, y'_{s_{i-1}}, x) + l(y'_s, y_s)\}}$$

and we define the loss function $l(y'_s, y_s)$ as

$$\sum_{i=1}^{|y'_s|} \sum_{j=1}^{|y_s|} \left( \mathbb{1}\{b'_i \neq b_j \wedge b'_i \neq O\} \frac{|s_j \cap s'_i|}{|s'_i|} + \mathbb{1}\{b'_i \neq b_j \wedge b_j \neq O\} \frac{|s_j \cap s'_i|}{|s_j|} \right)$$

which is the sum of the precision and recall errors of segment labeling using proportional matching. The loss-augmented probability is only computed during training. The more the proposed labeled segmentation overlaps with the true labeled segmentation for $x$, the less it will be penalized.

Footnote 1: The inter-annotator agreement on boundaries of opinion expressions is not stressed in MPQA (Wiebe et al., 2005).

During inference, we can obtain the best labeled segmentation by solving

$$\arg\max_{y_s} P(y_s \mid x) = \arg\max_{y_s} \sum_{i=1}^{|y_s|} \theta \cdot f(y_{s_i}, y_{s_{i-1}}, x)$$

This can be done efficiently via dynamic programming:

$$V(t) = \arg\max_{s=(u,t) \in s_{:t},\; y=(s,b),\; y'} G(y, y') + V(u-1) \qquad (3)$$

where $s_{:t}$ denotes all candidate segments ending at position $t$ and $G(y, y') = \theta \cdot f(y, y', x)$. The optimal $y_s^*$ can be obtained by computing $V(n)$, where $n$ is the length of the sentence.

3.2 Opinion Attribute Classification

We consider two types of opinion attributes: polarity and intensity. For each attribute, we model the multinomial distribution of an attribute class given a text segment. Denoting the class variable for each attribute as $a^j$, we have

$$P(a^j \mid x_s) = \frac{\exp\{\phi^j \cdot g^j(a^j, x_s)\}}{\sum_{a' \in A^j} \exp\{\phi^j \cdot g^j(a', x_s)\}} \qquad (4)$$

where $x_s$ denotes a text segment, $\phi^j$ is a parameter vector and $g^j$ denotes feature functions for attribute $a^j$. The label space for polarity classification is {positive, negative, neutral, ∅} and the label space for intensity classification is {high, medium, low, ∅}. We include an empty value ∅ to denote assigning no attribute value to those text segments that are not opinion expressions.

In the following description of our joint models, we omit the superscript on the attribute variable and derive our models with one single opinion attribute for simplicity. The derivations can be carried through with more than one opinion attribute by assuming the independence of different attributes.
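Because the joint models that follow reuse the dynamic program of Equation (3) from Section 3.1, a small sketch of that recursion may help. This is an illustration under stated assumptions rather than the authors' implementation: the chart is indexed by the label of the last segment so that the transition term can be scored, all segments up to a hypothetical `max_len` are treated as candidates (the paper restricts candidates using the parse), and `toy_score` is a stand-in for the learned score θ·f(y, y', x).

```python
from typing import Callable, Dict, List, Optional, Tuple

LABELS = ("I", "O")                       # opinion / non-opinion
Segment = Tuple[int, int, str]            # (start, end, label), end inclusive
ScoreFn = Callable[[int, int, str, Optional[str]], float]


def semi_markov_viterbi(n: int, score: ScoreFn, max_len: int) -> Tuple[float, List[Segment]]:
    """Best labelled segmentation of tokens 0..n-1 under an additive segment score."""
    neg_inf = float("-inf")
    # best[t][b]: best score for the first t tokens when the last segment has label b.
    # back[t][b]: (start of that last segment, label of the segment before it).
    best: List[Dict[str, float]] = [{b: neg_inf for b in LABELS} for _ in range(n + 1)]
    back: List[Dict[str, Optional[Tuple[int, Optional[str]]]]] = [
        {b: None for b in LABELS} for _ in range(n + 1)
    ]
    for t in range(1, n + 1):
        for u in range(max(0, t - max_len), t):       # candidate segment = tokens u..t-1
            for b in LABELS:
                # A segment starting at token 0 has no predecessor label.
                prev_labels: Tuple[Optional[str], ...] = (None,) if u == 0 else LABELS
                for pb in prev_labels:
                    prefix = 0.0 if pb is None else best[u][pb]
                    if prefix == neg_inf:
                        continue
                    cand = prefix + score(u, t - 1, b, pb)
                    if cand > best[t][b]:
                        best[t][b] = cand
                        back[t][b] = (u, pb)
    # Follow the back-pointers from the end of the sentence to recover the arg max.
    label: Optional[str] = max(LABELS, key=lambda b: best[n][b])
    total = best[n][label]
    segments: List[Segment] = []
    t = n
    while t > 0:
        u, prev = back[t][label]
        segments.append((u, t - 1, label))
        t, label = u, prev
    return total, segments[::-1]


# Toy score (an assumption): reward exactly the labelled segments of Figure 1.
GOLD: List[Segment] = [(0, 0, "O"), (1, 1, "O"), (2, 6, "I"), (7, 8, "O"),
                       (9, 9, "O"), (10, 12, "I"), (13, 13, "O")]


def toy_score(start: int, end: int, label: str, prev_label: Optional[str]) -> float:
    return 1.0 if (start, end, label) in GOLD else 0.0


total, segments = semi_markov_viterbi(14, toy_score, max_len=5)
print(total, segments)
```

With this toy score the only segmentation that collects all seven rewards is the Figure 1 segmentation, so the decoder returns it. The running time is proportional to n × max_len × |labels|², which is why bounding or pruning the candidate segments matters in practice.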


3.3 The Joint Models

We propose two types of joint models for opinion segmentation and attribute classification: (1) joint learning models, which train a single sequence labeling model that maximizes a joint probability distribution over segmentation and attribute labeling, and infer the most probable labeled segmentations according to the joint probability; and (2) joint inference models, which train a sequence labeling model for opinion segmentation and separately train classification models for attribute labeling, and combine the segmentation and classification models during inference to make global decisions. In the following, we first present the joint learning models and then introduce the joint inference models.

3.3.1 Joint Sequence Labeling

We can formulate joint opinion segmentation and classification as a sequence labeling problem on the label space $Y = \{y \mid y = \langle (s_0, \tilde{b}_0), \ldots, (s_k, \tilde{b}_k) \rangle\}$ where $\tilde{b}_i = (b_i, a_i) \in \{I, O\} \times A$, where $b_i$ is a binary variable as described before and $a_i$ is an attribute class variable associated with segment $s_i$. Since only opinion expressions should be assigned opinion attributes, we consider the following labeling constraint: $a_i = \emptyset$ if and only if $b_i = O$.

We can apply the same training and inference procedure described in Section 3.1 by replacing the label space $y_s$ with the joint label space $y$. Note that the feature functions are shared over the joint label space. For the loss function in the loss-augmented objective, the opinion segment label $b$ is also replaced with the augmented label $\tilde{b}$.

3.3.2 Hierarchical Joint Sequence Labeling

The above joint sequence labeling model does not explicitly model the dependencies between opinion segmentation and attribute labeling. The two subtasks share the same set of features and parameters. In the following, we introduce an alternative approach that explicitly models the conditional dependency between opinion segmentation and attribute labeling, and allows segmentation- and attribute-specific parameters to be jointly learned in one single model.

Note that the joint label space naturally forms a hierarchical structure: the probability of choosing a sequence label $y$ can be interpreted as the probability of first choosing an opinion segmentation $y_s = \langle (s_0, b_0), \ldots, (s_k, b_k) \rangle$ given the input $x$, and then choosing a sequence of attribute labels $y_a = \langle a_0, \ldots, a_k \rangle$ given the chosen segment sequence. Following this intuition, the joint probability can be decomposed as

$$P(y \mid x) = P(y_s \mid x)\, P(y_a \mid y_s, x)$$

where $P(y_s \mid x)$ is modeled as Equation (1) and

$$P(y_a \mid y_s, x) = \prod_{i=1}^{|y_s|} P(a_i \mid y_{s_i}, x) \propto \exp\{\sum_{i=1}^{|y_s|} \phi \cdot g(a_i, y_{s_i}, x)\}$$

where $g$ denotes a feature function that encodes attribute-specific information for discriminating different attribute classes for each segment.

For training, we can also apply a softmax-margin by adding a loss function $l(y', y)$ to the denominator of $P(y \mid x)$ (as in the basic joint sequence labeling model described in Section 3.3.1).

With the estimated parameters, we can infer the optimal opinion segmentation and attribute labeling by solving

$$\arg\max_{y_s, y_a} P(y_s \mid x)\, P(y_a \mid y_s, x)$$

We can apply a similar dynamic programming procedure by replacing $y$ in Equation (3) with $y = (s, b, a)$ and $G(y, y')$ with $\theta \cdot f(y, y', x) + \phi \cdot g(y, x)$.

Our decomposition of labels and features is similar to the hierarchical construction of CRF features in Choi and Cardie (2010). The difference is that our model is based on semi-CRFs and the decomposition is based on a joint probability. We will show in our experiments that this results in better performance than the methods in Choi and Cardie (2010).

3.3.3 Joint Inference

Modeling the joint probability of opinion segmentation and attribute labeling is arguably elegant. However, training can be expensive as the computation involves normalizing over all possible segmentations and all possible attribute labelings for each segment. Thus, we also investigate joint inference approaches which combine the separately-trained models during inference without computing the normalization term.

For opinion segmentation, we train a semi-CRF-based model using the approach described in Section 3.1. For attribute classification, we train a MaxEnt model by maximizing $P(a^j \mid x_s)$ in Equation (4). As we only need to estimate the probability of an attribute label given individual text segments, the training data can be constructed by collecting a list of text segments labeled with correct attribute labels. The text segments do not need to form all possible sentence segmentations. To construct such training examples, we collected from each sentence all opinion expressions labeled with their corresponding attributes and used the remaining text segments as examples for the empty attribute value. The training of the MaxEnt model is much more efficient than the training of the segmentation model.

Joint Inference with Probability-based Estimates. To combine the separately-trained models at inference time, a natural inference objective is to jointly maximize the probability of opinion segmentation and the probability of attribute labeling given the chosen segmentation

$$\arg\max_{y_s, y_a} P(y_s \mid x)\, P'(y_a \mid y_s, x) \qquad (5)$$

We approximate the conditional probability as

$$P'(y_a \mid y_s, x) = \prod_{i=1}^{|y_s|} P(a_i \mid x_{s_i})^{\alpha} \qquad (6)$$

where $\alpha \in (0, 1]$. We found that $\alpha < 1$ provides better performance than $\alpha = 1$ empirically. This is an approximation since the distribution of attribute labeling is estimated independently from the opinion segmentation during training.

Joint Inference with Loss-based Estimates. Instead of directly using the output probabilities of the attribute classifiers, we explore an alternative that estimates $P'(y_a \mid y_s, x)$ based on the prediction uncertainty:

$$P'(y_a \mid y_s, x) \propto \exp\left(-\alpha \sum_{i=1}^{|y_s|} U(a_i \mid x_{s_i})\right) \qquad (7)$$

where $U(a_i \mid x_{s_i})$ is an uncertainty function that measures the classification model’s uncertainty in its assignment of attribute class $a_i$ to segment $x_{s_i}$. Intuitively, we want to penalize attribute assignments that are uncertain or favor attribute assignments with low uncertainty. The prediction uncertainty is measured using the expected loss. The expected loss for a predicted label $a'$ can be written as

$$E_{a \mid x_{s_i}}[l(a, a')] = \sum_{a} P(a \mid x_{s_i})\, l(a, a')$$

where $l(a, a')$ is a loss function over $a'$ and the true label $a$. We used the standard 0-1 loss function in our experiments (see Footnote 2) and set $U(a_i \mid x_{s_i}) = \log(E_{a \mid x_{s_i}}[l(a, a_i)])$.

Both joint inference objectives can be solved efficiently via dynamic programming.

Footnote 2: The loss function can be tuned to better trade off precision and recall according to the applications at hand. We did not explore this option in this paper.

4 Features

We consider a set of basic features as well as task-specific features for opinion segmentation and attribute labeling, respectively.

4.1 Basic Features

Unigrams: word unigrams and POS tag unigrams for all tokens in the segment candidate.

Bigrams: word bigrams and POS bigrams within the segment candidate.

Phrase embeddings: for each segment candidate, we associate with it a 300-dimensional phrase embedding as a dense feature representation for the segment. We make use of the recently published word embeddings trained on Google News (Mikolov et al., 2013). For each segment, we compute the average of the word embedding vectors that comprise the phrase. We omit words that are not found in the vocabulary. If no words are found in the text segment, we assign a feature vector of zeros.

Opinion lexicon: for each word in the segment candidate, we include its polarity and intensity as indicated in an existing Subjectivity Lexicon (Wilson et al., 2005).
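The phrase embedding feature described above (average the word vectors, skip out-of-vocabulary words, fall back to zeros) can be sketched as follows. The 3-dimensional toy vectors and the function name `phrase_embedding` are assumptions made for illustration; the paper uses 300-dimensional Google News embeddings.

```python
from typing import Dict, List, Sequence


def phrase_embedding(tokens: Sequence[str],
                     vectors: Dict[str, List[float]],
                     dim: int) -> List[float]:
    """Average the embeddings of in-vocabulary tokens; all zeros if none are known."""
    known = [vectors[tok] for tok in tokens if tok in vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


# Hypothetical toy vocabulary; "being" is deliberately out of vocabulary.
TOY_VECTORS = {
    "severely": [1.0, 0.0, 2.0],
    "criticized": [3.0, 2.0, 0.0],
}

print(phrase_embedding(["being", "severely", "criticized"], TOY_VECTORS, dim=3))
print(phrase_embedding(["being"], TOY_VECTORS, dim=3))
```

Averaging keeps the feature dimension fixed regardless of segment length, which is what allows the same dense representation to be attached to every segment candidate.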
f b y g u e s t t o n 0 9 S e p e m b e r 2 0 2 3 511 4.2Segmentation-specificFeaturesBoundarywordsandPOStags:word-levelfea-tures(words,POS,lexicon)beforeandaftertheseg-mentcandidate.Phrasestructure:thesyntacticcategoriesofthedeepestconstituentsthatcoverthesegmentintheparsetree,e.g.NP,VP,TOVB.VPpatterns:VP-relatedsyntacticpatternsde-scribedinYangandCardie(2012),e.g.VPsubj,VParg,whichhavebeenshownusefulforopinionexpressionextraction.4.3Polarity-specificFeaturesPolaritycount:countsofpositive,negativeandneutralwordswithinthesegmentcandidateaccord-ingtotheopinionlexicon.Negation:indicatorfornegatorswithinthesegmentcandidate.4.4Intensity-specificFeaturesIntensitycount:countsofwordswithstrongandweakintensitywithinthesegmentcandidateaccord-ingtotheopinionlexicon.Intensitydictionary:AssuggestedinChoiandCardie(2010),weincludefeaturesindicat-ingwhetherthesegmentcontainsanintensifier(e.g.highly,really),adiminisher(e.g.little,less),astrongmodalverb(e.g.must,will),andaweakmodalverb(e.g.may,could).5ExperimentsAllourexperimentswereconductedontheMPQAcorpus(Wiebeetal.,2005),awidelyusedcorpusforfine-grainedopinionanalysis.WeusedthesameevaluationsettingasinChoiandCardie(2010),where135documentswereusedfordevelopmentand10-foldcross-validationwasperformedonadif-ferentsetof400documents.Eachtrainingfoldcon-sistsofsentenceslabeledwithopinionexpressionboundariesandeachexpressionislabeledwithpo-larityandintensity.Table1showssomestatisticsoftheevaluationdata.Weusedprecision,recallandF1asevaluationmetricsforopinionextractionandcomputedthemusingbothproportionalmatchingandbinarymatch-ingcriteria.Proportionalmatchingconsiderstheoverlappingproportionofapredictedexpressionsandagoldstandardexpressions∗,andcomputesprecisionasPs∈SPs∗∈S∗|s∩s∗||s|/|S|andrecallasPs∈SPs∗∈S∗|s∩s∗||s∗|/|S∗|,whereSandS∗denotethesetofpredictedopinionexpressionsandthesetofcorrectopinionexpressions,respectively.Binarymatchingisamorerelaxedmetricthatconsidersapredictedopinionexpressiontobecorrectifitover-lapswithacorrectopinionex
pression.Weexperimentedwiththefollowingmodels:(1)PIPELINE:firstextractsthespansofopinionexpressionsusingthesemi-CRFmodelinSection3.1,andthenassignspolarityandintensitytotheex-tractedopinionexpressionsusingMaxEntmodelsinSection3.2.NotethatthelabelspaceoftheMaxEntmodelsdoesnotinclude∅sincetheyassumethatalltheopinionexpressionsextractedbythepreviousstagearecorrect.(2)JSL:thejointsequencelabelingmethodde-scribedinSection3.3.1.(3)HJSL:thehierarchicaljointsequencelabelingmethoddescribedinSection3.3.2.(4)JI-PROB:thejointinferencemethodusingprobability-basedestimates(Equation6).(5)JI-LOSS:thejointinferencemethodusingloss-basedestimates(Equation7).Wealsocompareourresultswithpreviouslypub-lishedresultsfromChoiandCardie(2010)onthesametask.Allourmodelsareloglinearmodels.WeuseL-BFGSwithL2regularizationfortrainingandsettheregularizationparameterto1.0.WesetthescalingparameterαinJI-PROBandJI-LOSSviagridsearchovervaluesbetween0.1and1withincrementsof0.1usingthedevelopmentset.WeconsiderthesamesetoffeaturesdescribedinSection4inallthemodels.Forthepipelineandjointinferencemodelswheretheopinionsegmen-tatorandattributeclassifiersareseparatelytrained,weemploybasicfeaturesplussegmentation-specificfeaturesintheopinionsegmentator;andemployba-sicfeaturesplusattribute-specificfeaturesintheat-tributeclassifiers.5.1ResultsWewouldliketofirstinvestigatehowmuchwecangainfromusingtheloss-augmentedtrainingcom-paredtousingthestandardtrainingobjective.Loss- l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 9 9 1 5 6 6 9 3 3 / / t l a c _ a _ 0 0 1 9 9 p d . 
f b y g u e s t t o n 0 9 S e p e m b e r 2 0 2 3 512 NumberofOpinionExpressionsPositiveNegativeNeutral217048636368HighMediumLow280557214875NumberofDocuments400NumberofSentences8241AverageLengthofOpinionExpressions2.86wordsTable1:Statisticsoftheevaluationcorpusaugmentedtrainingcanbeappliedtothetrainingoftheopinionsegmentationmodelusedinthepipelinemethodandthejointinferencemethods,orbeap-pliedtothetrainingofthejointsequencelabelingapproaches,JSLandHJSL(thelossfunctiontakesintoaccountboththespanoverlapandthematch-ingofattributevalues).Weevaluatetwoversionsofeachmethod:oneusesloss-augmentedtrainingandoneusesstandardlog-losstraining.Table2showstheresultsofopinionexpressiondetectionwithoutevaluatingtheirattributes.Similartrendscanbeob-servedintheresultsofopinionexpressiondetectionwithrespecttoeachattribute.Wecanseethatin-corporatingtheevaluation-metric-basedlossfunc-tionduringtrainingconsistentlyimprovestheper-formanceforallmodelsintermsofF1measure.Thisconfirmstheeffectivenessofloss-augmentedtrainingofoursequencemodelsforopinionextrac-tion.Asaresult,allfollowingresultsarebasedontheloss-augmentedversionofourmodels.ComparingtheresultsofdifferentmodelsinTa-ble2,wecanseethatPIPELINEprovidesastrongbaseline.Incomparison,JSLandHJSLsignifi-cantlyimproveprecisionbutfailinrecall,whichindicatesthatjointsequencelabelingismorecon-servativeandprecision-biasedforextractingopinionexpressions.HJSLsignificantlyoutperformsJSL,andthisconfirmsthebenefitofmodelingthecon-ditionaldependencybetweenopinionsegmentationandattributeclassification.Inaddition,weseethatcombiningopinionsegmentationandattributeclas-sificationwithoutjointtraining(JI-PROBandJI-LOSS)hurtprecisionbutimprovesrecall(vs.JSLandHJSL).JI-LOSSpresentsthebestF1perfor-manceandsignificantlyoutperformsthePIPELINEbaselineinallevaluationmetrics.ThissuggeststhatJI-LOSSprovidesaneffectivejointinferenceobjec-tiveandisabletoprovidemorebalancedprecisionandrecallthanotherjointapproaches.Table3showstheperformanceonopinionextrac-tionwithrespectt
opolarityandintensityattributes.Similarly,wecanseethatJI-LOSSoutperformsallotherbaselinesinF1;HJSLoutperformsJSLbutisslightlyworsethanPIPELINEinF1;JI-PROBisrecall-orientedandlesseffectivethanJI-LOSS.Wehypothesizethattheworseperformanceofjointsequencelabelingisduetoitsstrongassump-tiononthedependenciesbetweenopinionsegmen-tationandattributelabelinginthetrainingdata.Forexample,theexpression“fundamentallyunfairandunjust”asawholeislabeledasanopinionex-pressionwithnegativepolarity.However,thesub-expression“unjust”canbealsoviewedasanega-tiveexpressionbutitisnotannotatedasanopinionexpressioninthisexample(asMPQAdoesnotcon-sidernestedopinionexpressions).Asaresult,themodelwouldwronglypreferanemptyattributetotheexpression“unjust”.However,inourjointin-ferenceapproaches,theattributeclassificationmod-elsaretrainedindependentlyfromthesegmentationmodel,andthetrainingexamplesfortheclassifiersonlyconsistofcorrectlylabeledexpressions(“un-just”asanestedopinionexpressioninthisexamplewouldnotbeconsideredinthetrainingdatafortheattributeclassifier).Therefore,thejointinferenceapproachesdonotsufferfromthisissue.Althoughjointinferencedoesnotaccountfortaskdependen-ciesduringtraining,thepromisingperformanceofJI-LOSSdemonstratesthatmodelinglabeldepen-denciesduringinferencecanbemoreeffectivethanthePIPELINEbaseline.InTable3,wecanseethattheimprovementofJI-LOSSislesssignificantinthepositiveclassandthehighclass.Thisisduetothelackoftrainingdataintheseclasses.Theimprovementinthemediumclassisalsolesssignificant.Thismaybebecauseitisin-herentlyhardertodisambiguatemediumfromlow.Ingeneral,weobservethatextractingopinionex-pressionswithcorrectintensityisahardertaskthanextractingopinionexpressionswithcorrectpolarity.Table4presentstheF1scores(duetospacelimitonlyF1scoresarereported)forallsubtasksusingthebinarymatchingmetric.Weincludetheprevi-ouslypublishedresultsofChoiandCardie(2010)forthesametaskusingthesamefoldsplitandeval- l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . 
e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 9 9 1 5 6 6 9 3 3 / / t l a c _ a _ 0 0 1 9 9 p d . f b y g u e s t t o n 0 9 S e p e m b e r 2 0 2 3 513 Loss-augmentedTrainingStandardTrainingPRF1PRF1PIPELINE60.9663.2962.1060.0560.5960.32JSL64.98†54.6059.2967.09†50.5657.62HJSL66.16∗56.7761.0567.98†50.8158.11JI-PROB50.9577.44∗61.3250.0676.98∗60.54JI-LOSS63.77†64.51†64.04∗64.97†61.55†63.12∗Table2:OpinionExpressionExtraction(ProportionalMatching).Inalltables,weuseboldtoindicatethehighestscoreamongallthemethods;use∗toindicatestatisticallysignificantimprovements(p<0.05)overalltheothermethodsunderthepaired-ttest;use†todenotestatisticallysignificance(p<0.05)overthepipelinebaseline.PositiveNegativeNeutralPRF1PRF1PRF1PIPELINE45.2643.0744.0450.5947.9149.1140.9849.3044.57JSL50.58†32.3439.3750.2244.0146.8146.83†39.8142.85HJSL50.34†37.0642.5953.29†43.9848.0747.29†43.2745.03JI-PROB36.4747.81∗41.2440.8354.40∗46.5133.5959.22∗42.66JI-LOSS46.44†44.58†45.40∗54.88∗48.5051.40∗43.42†52.02†47.09∗HighMediumLowPRF1PRF1PRF1PIPELINE40.9828.1033.2535.4444.7239.3631.1934.4632.63JSL37.9130.83†33.8839.07†37.3138.0540.95†26.7132.24HJSL41.0528.8033.6339.06†39.7139.1740.01†29.8834.12JI-PROB34.8230.94†32.5429.1650.89∗36.8925.0642.99∗31.53JI-LOSS46.11∗26.3633.3937.58†43.5840.15∗33.85†40.92†36.93∗Table3:OpinionExtractionwithCorrectAttributes(ProportionalMatching)uationmetric.CRF-JSLandCRF-HJSLarebothjointsequencelabelingmethodsbasedonCRFs.DifferentfromJSLandHJSL,theyperformse-quencelabelingatthetokenlevelinsteadoftheseg-mentlevel,andinHJSL,thedecompositionofla-belsarenotbasedonthedecompositionofthejointprobabilityofopinionsegmentationandattributela-beling.Wecanseethatboththepipelineandjointmethodsclearlyoutperformpreviousresultsinallevaluationcriteria.3WecanalsoseethatJI-LOSSprovidesthebestperformanceamongallbaselines.5.1.1ErrorAnalysisJointvs.PipelineWefoundthatmanyerrorsmadebythepipelinesystemareduetoerrorprop-agation.Table5liststhreeexamples,representingthreetyp
esofthepropagatederrors:(1)theattributeclassifiersmissthepredictionsincetheopinionex-3SignificancetestwasnotconductedovertheresultsinChoiandCardie(2010)aswedonothavetheir10foldresults.pressionextractorfailstoidentifytheopinionex-pression;(2)theattributeclassifiersassignattributestoanon-opinionatedexpressionsinceitwasmistak-enlyextracted;(3)theattributeclassifiersmisclas-sifytheattributessincetheboundariesofopinionex-pressionsarenotcorrectlydeterminedbytheopin-ionexpressionextractor.Ourjointmodelsareabletocorrectmanyoftheseerrors,suchastheexamplesinTable5,duetothemodelingofthedependencybetweenopinionexpressionextractionandattributeclassification.JointLearningvs.JointInferenceNotethatJSLandHJSLbothemployjointlearningwhileJI-PROBandJI-LOSSemployjointinference.Toin-vestigatethedifferencebetweenthesetwotypesofjointmodels,welookintotheerrorsmadebyHJSLandJI-LOSS.Ingeneral,weobservedthatHJSLex-tractsmanyfeweropinionexpressionscomparedtoJI-LOSS,andasaresult,itpresentshighprecisionbutlowrecall.ThefirsttwoexamplesinTable6 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 9 9 1 5 6 6 9 3 3 / / t l a c _ a _ 0 0 1 9 9 p d . 
are cases where HJSL gains in precision and loses in recall, respectively. The last example in Table 6 shows an error made by HJSL but corrected by JI-LOSS.

Theoretically, joint learning is more powerful than joint inference as it models the task dependencies during training. However, we only observe improvements in precision and see drops in recall. As discussed before, we hypothesize that this is due to the mismatch of dependency assumptions between the model and the jointly annotated data. We found that joint inference can be superior to both pipeline and joint learning, and it is also much more efficient in training. In our experiments on an Amazon EC2 instance with a 64-bit processor, 4 CPUs and 15GB memory, training for the joint learning approaches took one hour for each training fold, but only 5 minutes for the joint inference approaches.

| Method | Extraction | Positive | Negative | Neutral | High | Medium | Low |
|---|---|---|---|---|---|---|---|
| PIPELINE | 73.30 | 51.50 | 58.45 | 52.45 | 39.34 | 47.08 | 39.05 |
| JSL | 69.76 | 45.24 | 57.11 | 50.25 | 41.48† | 45.88 | 36.49 |
| HJSL | 71.43 | 49.08 | 58.38 | 52.25 | 41.06† | 46.82 | 38.45 |
| JI-PROB | 74.37† | 50.93 | 58.20 | 54.03† | 39.80 | 46.65 | 40.73† |
| JI-LOSS | 75.11∗ | 53.02∗ | 62.01∗ | 54.33† | 41.79† | 47.38 | 42.53∗ |
| Previous work (Choi and Cardie (2010)) | | | | | | | |
| CRF-JSL | 60.5 | 41.9 | 50.3 | 41.2 | 38.4 | 37.6 | 28.0 |
| CRF-HJSL | 62.0 | 43.1 | 52.8 | 43.1 | 36.3 | 40.9 | 30.7 |

Table 4: Opinion Extraction Results (Binary Matching)

| Example Sentences | Pipeline | Joint Models |
|---|---|---|
| It is the victim of an explosive situation[high] at the economic, ... | No opinions × | ✓ |
| A white farmer who was shot dead Monday was the 10th to be killed. | the 10th to be killed[medium] × | ✓ |
| They would “fall below minimum standards[medium] for humane[medium] treatment”. | minimum standards for humane treatment[medium] × | ✓ |

Table 5: Examples of mistakes made by the pipeline baseline that are corrected by the joint models

| Example Sentences | Joint Learn | Joint Infer |
|---|---|---|
| The expression is undoubtedly strong and well thought out[high]. | ✓ | well thought out[medium] × |
| But the Sadc Ministerial Task Force said the election was free and fair[medium]. | No opinions × | ✓ |
| The president branded[high] as the “axis of evil”[high] in his statement... | of evil[high] × | ✓ |

Table 6: Examples of mistakes that are made by the joint learning model but are corrected by the joint inference model and vice versa. We use the same colored box notation as before, and use yellow color to denote neutral sentiment.

5.2 Additional Experiments

5.2.1 Evaluation with Reranking

Previous work (Johansson and Moschitti, 2011) showed that reranking is effective in improving the pipeline of opinion expression extraction and polarity classification. We extended their approach to handle both polarity and intensity and investigated the effect of reranking on both the pipeline and joint models. For the pipeline model, we generated 64-best (distinct) output with 4-best labeling at each pipeline stage; for the joint models, we generated 50-best (distinct) output using Viterbi-like dynamic programming. We trained the reranker using the online Passive Aggressive algorithm (Crammer et al., 2006) as in Johansson and Moschitti (2013) with 100 iterations and a regularization constant C = 0.01. For features, we included the probability output by the base models, the polarity and intensity of each pair of extracted opinion expressions, and the word sequence and the POS sequence between the adjacent pairs of extracted opinion expressions.

Table 7 shows the reranking performance (F1) for all subtasks. We can see that after reranking, JI-LOSS still provides the best performance and HJSL achieves comparable performance to PIPELINE. We also found that reranking leads to less performance gain for the joint inference approaches than for the joint learning approaches. This is because the k-best output of JI-PROB and JI-LOSS presents less diversity than that of JSL and HJSL. A similar issue for reranking has also been discussed in Finkel et al. (2006).

| Method | Extraction | Positive | Negative | Neutral | High | Medium | Low |
|---|---|---|---|---|---|---|---|
| PIPELINE + reranking | 73.72 | 51.45 | 60.51 | 53.24 | 40.07 | 47.65 | 40.47 |
| JSL + reranking | 72.02 | 47.52 | 59.81 | 52.84 | 41.04† | 46.58 | 39.40 |
| HJSL + reranking | 72.60 | 50.78 | 60.85 | 53.45 | 41.04† | 47.75 | 40.08 |
| JI-PROB + reranking | 74.81† | 51.45 | 59.59 | 53.98 | 40.66 | 46.87 | 40.80 |
| JI-LOSS + reranking | 75.59† | 53.29∗ | 62.50∗ | 54.94∗ | 41.79∗ | 47.67 | 42.66∗ |

Table 7: Opinion Extraction with Reranking (Binary Matching)

5.2.2 Evaluation on Sentence-level Tasks

As an additional experiment, we consider a supervised sentence-level sentiment classification task using features derived from the prediction output of different opinion extraction models. As a standard baseline, we train a MaxEnt classifier using unigrams, bigrams and opinion lexicon features extracted from the sentence. Using the prediction output of an opinion extraction model, we construct features by using only words from the extracted opinion expressions, and include the predicted opinion attributes as additional features. We hypothesize that the more informative the extracted opinion expressions are, the more they can contribute to sentence-level sentiment classification as features.

Table 8 shows the results in terms of classification accuracy and F1 score in each sentiment category. BOW is the standard MaxEnt baseline. We can see that using features constructed from the opinion expressions always improved the performance. This confirms the informativeness of the extracted opinion expressions. In particular, using the opinion expressions extracted by JI-LOSS gives the best performance among all the baselines in all evaluation criteria. This is consistent with its superior performance in our previous experiments.

| Features | Acc | Positive | Negative | Neutral |
|---|---|---|---|---|
| BOW | 65.26 | 51.90 | 77.47 | 36.41 |
| PIPELINE-OP | 67.41 | 55.49 | 79.42 | 39.48 |
| JSL-OP | 65.86 | 55.97 | 77.68 | 36.46 |
| HJSL-OP | 66.79 | 55.12 | 79.29 | 37.56 |
| JI-PROB-OP | 67.13 | 56.49 | 79.30 | 38.49 |
| JI-LOSS-OP | 68.23∗ | 57.32∗ | 80.12∗ | 40.45∗ |

Table 8: Sentence-level Sentiment Classification

6 Conclusion

We address the problem of opinion expression extraction and opinion attribute classification by presenting two types of joint models: joint learning, which optimizes the parameters of different subtasks in a joint probabilistic framework; and joint inference, which optimizes the separately-trained models jointly during inference time. We show that our models achieve substantially better performance than the previously published results, and demonstrate that joint inference with an appropriate objective can be more effective and efficient than joint learning for the task. We also demonstrate the usefulness of the output of our systems for sentence-level sentiment analysis tasks. For future work, we plan to improve joint modeling for the task by capturing semantic relations among different opinion expressions.

Acknowledgement

This work was supported in part by DARPA-BAA-12-47 DEFT grant #12475008 and NSF grant BCS-0904822. We thank the anonymous reviewers, Igor Labutov and the Cornell NLP Group for helpful suggestions.
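The reranker training described in Section 5.2.1 can be pictured with a small sketch. The Python below is a minimal illustration of a passive-aggressive (PA-I style) update for k-best reranking, and it is not the authors' implementation. The cost-augmented choice of the violating candidate, the per-candidate cost (for example 1 - F1 against the gold annotation), the dense feature vectors, and the names `train_pa_reranker` and `rerank` are all assumptions made for illustration. Only the 100 iterations and C = 0.01 are taken from the text.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length dense vectors."""
    return sum(a * b for a, b in zip(u, v))


def rerank(weights: Vector, candidates: List[Vector]) -> int:
    """Return the index of the highest-scoring candidate in a k-best list."""
    return max(range(len(candidates)), key=lambda j: dot(weights, candidates[j]))


def train_pa_reranker(
    kbest_lists: List[List[Vector]],
    costs: List[List[float]],
    n_features: int,
    iterations: int = 100,   # value reported in the text
    C: float = 0.01,         # regularization constant reported in the text
) -> Vector:
    """Train reranker weights with PA-I style updates (illustrative sketch).

    kbest_lists[i][j] is the (hypothetical) feature vector of candidate j
    for sentence i; costs[i][j] is its cost, where lower is better.
    """
    w = [0.0] * n_features
    for _ in range(iterations):
        for cands, cost in zip(kbest_lists, costs):
            # Oracle: the lowest-cost candidate in the k-best list.
            oracle = min(range(len(cands)), key=lambda j: cost[j])
            # Cost-augmented prediction: the candidate violating the margin most.
            pred = max(range(len(cands)), key=lambda j: dot(w, cands[j]) + cost[j])
            # Hinge loss with a margin equal to the cost difference.
            loss = (dot(w, cands[pred]) + cost[pred]) - (
                dot(w, cands[oracle]) + cost[oracle]
            )
            if loss <= 0.0:
                continue  # margin already satisfied: passive step
            diff = [a - b for a, b in zip(cands[oracle], cands[pred])]
            sq_norm = dot(diff, diff)
            if sq_norm == 0.0:
                continue  # identical feature vectors: no useful update
            # Aggressive step, with the step size capped by C (PA-I).
            tau = min(C, loss / sq_norm)
            w = [wi + tau * di for wi, di in zip(w, diff)]
    return w
```

Given k-best feature vectors and costs, `train_pa_reranker(kbest, costs, n_features)` returns a weight vector, and `rerank(w, candidates)` then selects one output per sentence. Capping the step size at C keeps each update small, which is the role a small regularization constant plays in this family of algorithms.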
References

E. Breck, Y. Choi, and C. Cardie. 2007. Identifying expressions of opinion in context. In Proceedings of the International Joint Conference on Artificial Intelligence.

Yejin Choi and Claire Cardie. 2008. Learning with compositional semantics as structural inference for subsentential sentiment analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Yejin Choi and Claire Cardie. 2010. Hierarchical sequential learning for extracting opinions and their attributes. In Proceedings of the Annual Meeting of the Association for Computational Linguistics - Short Papers.

Yejin Choi, Eric Breck, and Claire Cardie. 2006. Joint extraction of entities and relations for opinion recognition. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Koby Crammer, Ofer Dekel, Joseph Keshet, Shai Shalev-Shwartz, and Yoram Singer. 2006. Online passive-aggressive algorithms. The Journal of Machine Learning Research, 7:551–585.

Jenny Rose Finkel and Christopher D Manning. 2010. Hierarchical joint learning: Improving joint parsing and named entity recognition with non-jointly labeled data. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Jenny Rose Finkel, Christopher D Manning, and Andrew Y Ng. 2006. Solving the problem of cascading errors: Approximate Bayesian inference for linguistic annotation pipelines. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Kevin Gimpel and Noah A Smith. 2010. Softmax-margin CRFs: Training log-linear models with cost functions. In Human Language Technologies: Conference of the North American Chapter of the Association for Computational Linguistics.

Richard Johansson and Alessandro Moschitti. 2011. Extracting opinion expressions and their polarities: exploration of pipelines and joint models. In Proceedings of the Association for Computational Linguistics: Human Language Technologies: Short Papers.

Richard Johansson and Alessandro Moschitti. 2013. Relational features in fine-grained opinion analysis. Computational Linguistics, 39(3):473–509.

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems.

V. Punyakanok, D. Roth, W. Yih, and D. Zimak. 2004. Semantic role labeling via integer linear programming inference. In Proceedings of the International Conference on Computational Linguistics.

D. Roth and W. Yih. 2004. A linear programming formulation for global inference in natural language tasks.

Alexander M Rush, David Sontag, Michael Collins, and Tommi Jaakkola. 2010. On dual decomposition and linear programming relaxations for natural language processing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Sunita Sarawagi and William W Cohen. 2004. Semi-Markov conditional random fields for information extraction. In Advances in Neural Information Processing Systems.

Richard Socher, Alex Perelygin, Jean Y Wu, Jason Chuang, Christopher D Manning, Andrew Y Ng, and Christopher Potts. 2013. Recursive deep models for semantic compositionality over a sentiment treebank. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

J. Wiebe, T. Wilson, and C. Cardie. 2005. Annotating expressions of opinions and emotions in language. Language Resources and Evaluation, 39(2):165–210.

Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2005. Recognizing contextual polarity in phrase-level sentiment analysis. In Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing.

Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. 2009. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis. Computational Linguistics, 35(3):399–433.

Bishan Yang and Claire Cardie. 2012. Extracting opinion expressions with semi-Markov conditional random fields. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning.

Bishan Yang and Claire Cardie. 2013. Joint inference for fine-grained opinion extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics.

Ainur Yessenalina and Claire Cardie. 2011. Compositional matrix-space models for sentiment analysis. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.

Jun Zhao, Kang Liu, and Gen Wang. 2008. Adding redundant features for CRFs-based sentence sentiment classification. In Proceedings of the Conference on Empirical Methods in Natural Language Processing.