Transactions of the Association for Computational Linguistics, 2 (2014) 219–230. Action Editor: Alexander Koller.
Submitted 11/2013; Revised 1/2014; Published 5/2014. c
(cid:13)
2014 Association for Computational Linguistics.
BacktoBasicsforMonolingualAlignment:ExploitingWordSimilarityandContextualEvidenceMdArafatSultan†,StevenBethard‡andTamaraSumner††InstituteofCognitiveScienceandDepartmentofComputerScienceUniversityofColoradoBoulder‡DepartmentofComputerandInformationSciencesUniversityofAlabamaatBirminghamarafat.sultan@colorado.edu,bethard@cis.uab.edu,sumner@colorado.eduAbstractWepresentasimple,easy-to-replicatemonolin-gualalignerthatdemonstratesstate-of-the-artperformancewhilerelyingonalmostnosu-pervisionandaverysmallnumberofexternalresources.Basedonthehypothesisthatwordswithsimilarmeaningsrepresentpotentialpairsforalignmentiflocatedinsimilarcontexts,weproposeasystemthatoperatesbyfindingsuchpairs.Intwointrinsicevaluationsonalignmenttestdata,oursystemachievesF1scoresof88–92%,demonstrating1–3%absoluteimprove-mentoverthepreviousbestsystem.Moreover,intwoextrinsicevaluationsouralignerout-performsexistingaligners,andevenanaiveapplicationofthealignerapproachesstate-of-the-artperformanceineachextrinsictask.1IntroductionMonolingualalignmentisthetaskofdiscoveringandaligningsimilarsemanticunitsinapairofsentencesexpressedinanaturallanguage.Suchalignmentspro-videvaluableinformationregardinghowandtowhatextentthetwosentencesarerelated.Consequently,alignmentisacentralcomponentofanumberofimportanttasksinvolvingtextcomparison:textualentailmentrecognition,textualsimilarityidentifica-tion,paraphrasedetection,questionansweringandtextsummarization,tonameafew.Thehighutilityofmonolingualalignmenthasspawnedsignificantresearchonthetopicinthere-centpast.Majoreffortsthathavetreatedalignmentasastandaloneproblem(MacCartneyetal.,2008;ThadaniandMcKeown,2011;Yaoetal.,2013a)areprimarilysupervised,thankstothemanuallyalignedcorpuswithtrainingandtestsetsfromMicrosoftRe-search(Brockett,2007).Primaryconcernsofsuchworkincludebothqualityandspeed,duetothefactthatalignmentisfrequentlyacomponentoflargerNLPtasks.Drivenbysimilarmotivations,weseektodevisealightweight,easy-to-constructalignerthatproduceshigh-qualityoutputandisapplicabletovariousendtasks.Amidavarietyofproblemformulationsandingeniousapproachestoalignment,wetakeastepbackandexaminecloselytheeffectivenessoftwofrequentlymadeassumptions:1)Relatedsemanticunitsintwosentencesmustbesimilarorrelatedintheirmeaning,and2)Commonalitiesintheirse-manticcontextsintherespectivesentencesprovideadditionalevidenceoftheirrelatedness(MacCartneyetal.,2008;ThadaniandMcKeown,2011;Yaoetal.,2013a;Yaoetal.,2013b).Alignment,basedsolelyonthesetwoassumptions,reducestofindingthebestcombinationofpairsofsimilarsemanticunitsinsim-ilarcontexts.Exploitingexistingresourcestoidentifysimilarityofsemanticunits,wesearchforrobusttechniquestoidentifycontextualcommonalities.Dependencytreesareacommonlyusedstructureforthispurpose.Whiletheyremainacentralpartofouraligner,weexpandthehorizonsofdependency-basedalignmentbeyondexactmatchingbysystematicallyexploitingthenotionof“typeequivalence”withasmallhand-craftedsetofequivalentdependencytypes.Inaddi-tion,weaugmentdependency-basedalignmentwithsurface-leveltextanalysis.Whilephrasalalignmentsareimportantandhave
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
220
beeninvestigatedinmultiplestudies,wefocuspri-marilyonwordalignments(whichhavebeenshowntoformthevastmajorityofalignments(≥95%)inmultiplehuman-annotatedcorpora(Yaoetal.,2013b)),keepingtheframeworkflexibleenoughtoallowincorporationofphrasalalignmentsinfuture.Evaluationofouraligneronthebenchmarkdatasetreportedin(Brockett,2007)showsanF1scoreof91.7%:a3.1%absoluteimprovementovertheprevi-ousbestsystem(Yaoetal.,2013a),correspondingtoa27.2%errorreduction.Itshowssuperiorperfor-mancealsoonthedatasetreportedin(Thadanietal.,2012).Additionally,wepresentresultsoftwoextrinsicevaluations,namelytextualsimilarityiden-tificationandparaphrasedetection.Ouralignernotonlyoutperformsexistingalignersineachtask,butalsoapproachestopsystemsfortheextrinsictasks.2BackgroundMonolingualalignmenthasbeenappliedtovariousNLPtasksincludingtextualentailmentrecognition(Hickletal.,2006;HicklandBensley,2007),para-phraseidentification(DasandSmith,2009;Madnanietal.,2012),andtextualsimilarityassessment(B¨aretal.,2012;Hanetal.,2013)–insomecasesex-plicitly,i.e.,asaseparatemodule.Butmanysuchsystemsresorttosimplisticand/orad-hocstrategiesforalignmentandinmostsuchwork,thealignmentmoduleswerenotseparatelyevaluatedonalignmentbenchmarks,makingtheirdirectassessmentdifficult.WiththeintroductionoftheMSRalignmentcor-pus(Brockett,2007)developedfromthesecondRecognizingTextualEntailmentchallengedata(Bar-Haimetal.,2006),directevaluationandcomparisonofalignersbecamepossible.ThefirstalignertrainedandevaluatedonthecorpuswasaphrasalalignercalledMANLI(MacCartneyetal.,2008).Itrepre-sentsalignmentsassetsofdifferenteditoperations(whereasequenceofeditsturnsoneinputsentenceintotheother)andfindsanoptimalsetofeditsviaasimulatedannealingsearch.WeightsofdifferenteditfeaturesarelearnedfromthetrainingsetoftheMSRalignmentcorpususingaperceptronlearningalgorithm.MANLIincorporatesonlyshallowfea-turescharacterizingcontextualsimilarity:relativepositionsofthetwophrasesbeingaligned(ornot)inthetwosentencesandbooleanfeaturesrepresentingwhetherornottheprecedingandfollowingtokensofthetwophrasesaresimilar.ThadaniandMcKeown(2011)substitutedMANLI’ssimulatedannealing-baseddecodingwithintegerlinearprogramming,andachievedaconsider-ablespeed-up.Moreimportantlyforourdiscussion,theyfoundcontextualevidenceintheformofsyn-tacticconstraintsusefulinbetteraligningstopwords.Thadanietal.(2012)furtherextendedthemodelbyaddingfeaturescharacterizingdependencyarcedits,effectivelybringingstrongerinfluenceofcontextualsimilarityintoalignmentdecisions.Againtheperfor-manceimprovedconsequently.Themostsuccessfulalignertodatebothintermsofaccuracyandspeed,calledJacanaAlign,wasde-velopedbyYaoetal.(2013a).Incontrasttotheearliersystems,JacanaAlignisawordalignerthatformulatesalignmentasasequencelabelingprob-lem.Eachwordinthesourcesentenceislabeledwiththecorrespondingtargetwordindexifanalign-mentisfound.ItemploysaconditionalrandomfieldtoassignthelabelsandusesafeaturesetsimilartoMANLI’sintermsoftheinformationtheyencode(withsomeextensions).Contextualfeaturesincludeonlysemanticmatchoftheleftandtherightneigh-borsofthetwowordsandtheirPOStags.EventhoughJacanaAlignoutperformedtheMANLIen-hancementsdespitehavinglesscontextualfeatures,itisdifficulttocomparetheroleofcontextinthetwomodelsbecauseofthelargeparadigmaticdispar-ity.AnextensionofJacanaAlignwasproposedforphrasalalignmentsin(Yaoetal.,2013b),butthecontextualfeaturesremainedlargelyunchanged.Noticeableinalltheabovesystemsistheuseofcontextualevidenceasafeatureforalignment,butinouropinion,nottoanextentsufficienttoharnessitsfullpotential.Eventhoughdeeperdependency-basedmodelingofcontextualcommonalitiescanbefoundinsomeotherstudies(KouylekovandMagnini,2005;Chambersetal.,2007;Changetal.,2010;Yaoetal.,2013c),webelievethereisfurtherscopeforsystematicexploitationofcontextualevidenceforalignment,whichweaimtodointhiswork.Onthecontrary,wordsemanticsimilarityhasbeenacentralcomponentofmostaligners;variousmea-suresofwordsimilarityhavebeenutilized,includingstringsimilarity,resource-basedsimilarity(derivedfromoneormorelexicalresourceslikeWordNet)
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
221
AlignidenticalwordsequencesAlignnamedentitiesAligncontentwordsAlignstopwordsFigure1:Systemoverviewanddistributionalsimilarity(computedfromwordco-occurrencestatisticsinlargecorpora).Animpor-tanttrade-offbetweenprecision,coverageandspeedexistshereandalignerscommonlyrelyononlyasubsetofthesemeasures(ThadaniandMcKeown,2011;Yaoetal.,2013a).WeusetheParaphraseDatabase(PPDB)(Ganitkevitchetal.,2013),whichisalargeresourceoflexicalandphrasalparaphrasesconstructedusingbilingualpivoting(BannardandCallison-Burch,2005)overlargeparallelcorpora.3SystemOursystemoperatesasapipelineofalignmentmod-ulesthatdifferinthetypesofwordpairstheyalign.Figure1isasimplisticrepresentationofthepipeline.Eachmodulemakesuseofcontextualevidencetomakealignmentdecisions.Inaddition,thelasttwomodulesareinformedbyawordsemanticsimilarityalgorithm.Becauseoftheirphrasalnature,wetreatnamedentitiesseparatelyfromothercontentwords.Therationalebehindtheorderinwhichthemodulesarearrangedisdiscussedlaterinthissection(3.3.5).Beforediscussingeachalignmentmoduleinde-tail,wedescribethesystemcomponentsthatidentifywordandcontextualsimilarity.3.1WordSimilarityTheabilitytocorrectlyidentifysemanticsimilaritybetweenwordsiscrucialtoouraligner,sincecon-textualevidenceisimportantonlyforsimilarwords.Insteadoftreatingwordsimilarityasacontinuousvariable,wedefinethreelevelsofsimilarity.Thefirstlevelisanexactwordorlemmamatchwhichisrepresentedbyasimilarityscoreof1.Thesecondlevelrepresentssemanticsimilaritybetweentwotermswhicharenotidentical.Toidentifysuchwordpairs,weemploytheParaphraseDatabase(PPDB)1.Weusethelargest(XXXL)ofthePPDB’slexicalparaphrasepackagesandtreatallpairsiden-ticallybyignoringtheaccompanyingstatistics.We1http://paraphrase.orgcustomizetheresourcebyremovingpairsofidenti-calwordsorlemmasandaddinglemmatizedformsoftheremainingpairs.Fornow,weusethetermppdbSimtorefertothesimilarityofeachwordpairinthismodifiedversionofPPDB(whichisavaluein(0,1))andlaterexplainhowwedetermineit(Section3.3.5).Finally,anypairofdifferentwordswhichisabsentinPPDBisassignedazerosimilarityscore.3.2ExtractingContextualEvidenceOuralignmentmodulescollectcontextualevidencefromtwocomplementarysources:syntacticdepen-denciesandwordsoccurringwithinasmalltextualvicinityofthetwowordstobealigned.Theapplica-tionofeachkindassumesacommonprincipleofmin-imalevidence.Formally,giventwoinputsentencesSandT,weconsidertwowordss∈Sandt∈Ttoformacandidatepairforalignmentif∃rs∈Sand∃rt∈Tsuchthat:1.(s,t)∈
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
224
textuallysimilarwordintheothersentenceforeachwordinthesequence.Weobservetheresultsofaligningidenticalwordsinsuchsequencesoflengthncontainingatleastonecontentword.Thissimpleheuristicdemonstratesahighprecision(≈97%)ontheMSRalignmentdevsetforn≥2,andthereforeweconsidermembershipinsuchsequencesasthesimplestformofcontextualevidenceinoursystemandalignallidenticalwordsequencepairsinSandTcontainingatleastonecontentword.Fromthispointon,wewillrefertothismoduleaswsAlign.3.3.2NamedEntitiesWealignnamedentitiesseparatelytoenablethealignmentoffullandpartialmentions(andacronyms)ofthesameentity.WeusetheStanfordNamedEntityRecognizer(Finkeletal.,2005)toidentifynamedentitiesinSandT.Afteraligningtheexacttermmatches,anyunmatchedtermofapartialmentionisalignedtoalltermsinthefullmention.Themod-ulerecognizesonlyfirst-letteracronymsandalignsanacronymtoalltermsinthefullmentionofthecorrespondingname.Sincenamedentitiesareinstancesofnouns,namedentityalignmentisalsoinformedbycontextualev-idence(whichwediscussinthenextsection),buthappensbeforealignmentofothergenericcontentwords.Parents(orchildren)ofanamedentityaresimplytheparents(orchildren)ofitsheadword.WewillrefertothismoduleasamethodnamedneAlignfromthispointon.3.3.3ContentWordsExtractionofcontextualevidenceforpromisingcontentwordpairshasalreadybeendiscussedinSection3.2,coveringbothdependency-basedcontextandtextualcontext.Algorithm3(cwDepAlign)describesthedependency-basedalignmentprocess.Foreachpotentiallyalignablepair(si,tj),thedependency-basedcontextisextractedasdescribedinAlgorithm1,andcontextsimilarityiscalculatedasthesumofthewordsimilaritiesofthe(sk,tl)contextwordpairs(lines2-7).(ThewordSimmethodreturnsasimilarityscorein{0,ppdbSim,1}.)Thealignmentscoreofthe(si,tj)pairisthenaweightedsumofwordandcontextualsimilarity(lines8-12).(WediscusshowtheweightsaresetinSectionAlgorithm3:cwDepAlign(S,T,EQ,AE,w,STOP)Input:1.S,T:Sentencestobealigned2.EQ:Dependencytypeequivalences(Table1)3.AE:Alreadyalignedwordpairindexes4.w:Weightofwordsimilarityrelativetocontex-tualsimilarity5.STOP:AsetofstopwordsOutput:A={(i,j)}:wordindexpairsofalignedwords{(si,tj)}wheresi∈Sandtj∈TΨ←∅;ΛΨ←∅;Φ←∅1forsi∈S,tj∈Tdo2ifsi6∈STOP∧¬∃tl:(i,l)∈AE3∧tj6∈STOP∧¬∃sk:(k,j)∈AE4∧wordSim(si,tj)>0then5context←depContext(S,T,i,j,EQ)6contextSim←X(k,l)∈contextwordSim(sk,tl)7ifcontextSim>0then8Ψ←Ψ∪{(i,j)}9ΛΨ(i,j)←context10Φ(i,j)←w∗wordSim(si,tj)11+(1−w)∗contextSim12LinearizeandsortΨindecreasingorderofΦ(i,j)13A←∅14for(i,j)∈Ψdo15if¬∃l:(i,l)∈A16∧¬∃k:(k,j)∈Athen17A←A∪{(i,j)}18for(k,l)∈ΛΨ(i,j)do19if¬∃q:(k,q)∈A∪AE20∧¬∃p:(p,l)∈A∪AEthen21A←A∪{(k,l)}223.3.5.)Themodulethenaligns(si,tj)pairswithnon-zeroevidenceindecreasingorderofthisscore(lines13-18).Inaddition,italignsallthepairsthatcontributedcontextualevidenceforthe(si,tj)alignment(lines19-22).Notethatweimplementaone-to-onealignmentwherebyawordgetsalignedatmostoncewithinthemodule.Algorithm4(cwTextAlign)presentsalignmentbasedonsimilaritiesinthetextualneighborhood.Foreachpotentiallyalignablepair(si,tj),Algorithm2isusedtoextractthecontext,whichisasetofneigh-boringcontentwordpairs(lines2-7).Thecontextualsimilarityisthesumofthesimilaritiesofthesepairs
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
225
Algorithm4:cwTextAlign(S,T,AE,w,STOP)Input:1.S,T:Sentencestobealigned2.AE:Existingalignmentsbywordindexes3.w:Weightofwordsimilarityrelativetocontex-tualsimilarity4.STOP:AsetofstopwordsOutput:A={(i,j)}:wordindexpairsofalignedwords{(si,tj)}wheresi∈Sandtj∈TΨ←∅;Φ←∅1forsi∈S,tj∈Tdo2ifsi6∈STOP∧¬∃tl:(i,l)∈AE3∧tj6∈STOP∧¬∃sk:(k,j)∈AE4∧wordSim(si,tj)>0then5Ψ←Ψ∪{(i,j)}6context←textContext(S,T,i,j,STOP)7contextSim←X(k,l)∈contextwordSim(sk,tl)8Φ(i,j)←w∗wordSim(si,tj)9+(1−w)∗contextSim10LinearizeandsortΨindecreasingorderofΦ(i,j)11A←∅12for(i,j)∈Ψdo13if¬∃l:(i,l)∈A14∧¬∃k:(k,j)∈Athen15A←A∪{(i,j)}16(line8),andthealignmentscoreisaweightedsumofwordsimilarityandcontextualsimilarity(lines9,10).Thealignmentscoreisthenusedtomakeone-to-onewordalignmentdecisions(lines11-16).Consideringtextualneighborsasweakersourcesofevidence,wedonotaligntheneighbors.NotethatincwTextAlignwealsoalignsemanti-callysimilarcontentwordpairs(si,tj)withnocon-textualsimilaritiesifnopairs(sk,tj)or(si,tl)existwithahigheralignmentscore.ThisisaconsequenceofourobservationoftheMSRalignmentdevdata,wherewefindthatmoreoftenthannotcontentwordsareinherentlysufficientlymeaningfultobealignedevenintheabsenceofcontextualevidencewhentherearenocompetingpairs.ThecontentwordalignmentmoduleisthusitselfapipelineofcwDepAlignandcwTextAlign.3.3.4StopWordsWefollowthecontextualevidence-basedapproachtoalignstopwordsaswell,someofwhichgetalignedAlgorithm5:align(S,T,EQ,w,STOP)Input:1.S,T:Sentencestobealigned2.EQ:Dependencytypeequivalences(Table1)3.w:Weightofwordsimilarityrelativetocontex-tualsimilarity4.STOP:AsetofstopwordsOutput:A={(i,j)}:wordindexpairsofalignedwords{(si,tj)}wheresi∈Sandtj∈TA←wsAlign(S,T)1A←A∪neAlign(S,T,EQ,A,w)2A←A∪cwDepAlign(S,T,EQ,A,w,STOP)3A←A∪cwTextAlign(S,T,A,w,STOP)4A←A∪swDepAlign(S,T,A,w,STOP)5A←A∪swTextAlign(S,T,A,w,STOP)6aspartofwordsequencealignmentasdiscussedinSection3.3.1,andneighboralignmentasdiscussedinSection3.3.3.Fortherestweutilizedependen-ciesandtextualneighborhoodsasbefore,withthreeadjustments.Firstly,sincestopwordalignmentisthelastmod-uleinourpipeline,ratherthanconsideringallse-manticallysimilarwordpairsforcontextualsimilar-ity,weconsideronlyalignedpairs.Secondly,sincemanystopwords(e.g.determiners,modals)typi-callydemonstratelittlevariationinthedependenciestheyengagein,weignoretypeequivalencesforstopwordsandimplementonlyexactmatchingofdepen-dencies.(Stopwordsingeneralcanparticipateinequivalentdependencies,butweleaveconstructingacorrespondingmappingforfuturework.)Finally,fortextualneighborhood,weseparatelycheckalign-mentsoftheleftandtherightneighborsandaggre-gatetheevidencestodeterminealignment–againduetotheprimarilysyntacticnatureofinteractionofstopwordswiththeirneighbors.Thusstopwordsarealsoalignedinasequenceofdependencyandtextualneighborhood-basedalign-ments.WeassumetwocorrespondingmodulesnamedswDepAlignandswTextAlign,respectively.3.3.5TheAlgorithmOurfullalignmentpipelineisshownasthemethodaligninAlgorithm5.Notethatthestrictorderofthealignmentmoduleslimitsthescopeofdownstreammodulessinceeachsuchmodulediscardsanywordthathasalreadybeenalignedbyanearliermodule
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
226
(thisisaccomplishedviathevariableA;thecorre-spondingparameterinAlgorithms3and4isAE).Therationalesbehindthespecificorderofthemod-ulescannowbeexplained:(1)wsAlignisamodulewithrelativelyhigherprecision,(2)itisconvenienttoalignnamedentitiesbeforeothercontentwordstoen-ablealignmentofentitymentionsofdifferentlengths,(3)dependency-basedevidencewasobservedtobemorereliable(i.e.ofhigherprecision)thantextualevidenceintheMSRalignmentdevset,and(4)stopwordalignmentsaredependentonexistingcontentwordalignments.Thealignerassumestwofreeparameters:ppdbSimandw(inAlgorithms3and4).Todeterminetheirvalues,weexhaustivelysearchthroughthetwo-dimensionalspace(ppdbSim,w)forppdbSim,w∈{0.1,…,0.9,1}andthecombina-tion(0.9,0.9)yieldsthebestF1scorefortheMSRalignmentdevset.Weusethesevaluesforthealignerinallofitssubsequentapplications.4EvaluationWeevaluatetheperformanceofouralignerbothin-trinsicallyandextrinsicallyonmultiplecorpora.4.1IntrinsicEvaluationTheMSRalignmentdataset2(Brockett,2007)wasdesignedtotrainanddirectlyevaluateautomatedaligners.Threeannotatorsindividuallyalignedwordsandphrasesin1600pairsofpremiseandhypothe-sissentencesfromtheRTE2challengedata(dividedintodevandtestsets,eachconsistingof800sen-tences).Thedatasethassubsequentlybeenusedtoassessseveraltopperformingaligners(MacCartneyetal.,2008;ThadaniandMcKeown,2011;Yaoetal.,2013a;Yaoetal.,2013b).Weusethetestsetforevaluationinthesamemannerasthesestudies:(a)weapplymajorityruletoselectfromthethreesetsofannotationsforeachsentenceanddiscardthree-waydisagreements,(b)weevaluateonlyonthesurelinks(wordpairsthatannotatorsmentionedshouldcertainlybealigned,asopposedtopossiblelinks).Wetestthegeneralizabilityofthealignerbyeval-uatingit,unchanged(i.e.withidenticalparametervalues),onasecondalignmentcorpus:theEdin-2http://www.cs.biu.ac.il/nlp/files/RTE2006Aligned.zipSystemP%R%F1%E%MSRMacCartneyetal.(2008)85.485.385.321.3Thadani&McKeown(2011)89.586.287.833.0Yaoetal.(2013a)93.784.088.635.3Yaoetal.(2013b)92.182.886.829.1ThisWork93.789.891.743.8EDB++Yaoetal.(2013a)91.382.086.415.0Yaoetal.(2013b)90.481.985.913.7ThisWork93.582.587.618.3Table2:Resultsofintrinsicevaluationontwodatasetsburgh++3(Thadanietal.,2012)corpus.Thetestsetconsistsof306pairs;eachpairisalignedbyatmosttwoannotatorsandweadopttherandomselectionpolicydescribedin(Thadanietal.,2012)toresolvedisagreements.Table2showstheresults.Foreachcorpus,itshowsprecision(%alignmentsthatmatchedwithgoldannotations),recall(%goldalignmentsdiscov-eredbythealigner),F1scoreandthepercentageofsentencesthatreceivedtheexactgoldalignments(denotedbyE)fromthealigner.OntheMSRtestset,ouralignershowsa3.1%improvementinF1scoreoverthepreviousbestsys-tem(Yaoetal.,2013a)witha27.2%errorreduction.Importantly,itdemonstratesaconsiderableincreaseinrecallwithoutalossofprecision.TheEscorealsoincreasesasaconsequence.OntheEdinburgh++testset,oursystemachievesa1.2%increaseinF1score(anerrorreductionof8.8%)overthepreviousbestsystem(Yaoetal.,2013a),withimprovementsinbothprecisionandrecall.Thisisaremarkableresultthatdemonstratesthegeneralapplicabilityofthealigner,asnoparametertuningtookplace.4.1.1AblationTestWeperformasetofablationteststoassesstheimportanceofthealigner’sindividualcomponents.EachrowofTable3beginningwith(-)showsafea-tureexcludedfromthealignerandtwoassociatedsetsofmetrics,showingtheperformanceofthere-sultingalgorithmonthetwoalignmentcorpora.Withoutawordsimilaritymodule,recalldropsaswouldbeexpected.Withoutcontextualevidence(wordsequences,dependenciesandtextualneigh-bors)precisiondropsconsiderablyandrecallalsofalls.Withoutdependencies,thealignerstillgives3http://www.ling.ohio-state.edu/∼scott/#edinburgh-plusplus
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
227
MSREDB++FeatureP%R%F1%P%R%F1%Original93.789.891.793.582.587.6(-)WordSimilarity95.286.390.595.177.385.3(-)ContextualEvidence81.386.083.686.480.683.4(-)Dependencies94.288.891.493.881.387.1(-)TextNeighborhood85.590.487.990.484.387.2(-)StopWords94.288.391.292.280.085.7Table3:Ablationtestresultsstate-of-the-artresults,whichpointstothepossibilityofaveryfastyethigh-performancealigner.Withoutevidencefromtextualneighbors,however,thepreci-sionofthealignersuffersbadly.Textualneighborsfindalignmentsacrossdifferentlexicalcategories,atypeofalignmentthatiscurrentlynotsupportedbyourdependencyequivalences.Extendingthesetofdependencyequivalencesmightalleviatethisis-sue.Finally,withoutstopwords(i.e.whilealigningcontentwordsonly)recalldropsjustalittleforeachcorpus,whichisexpectedascontentwordsformthevastmajorityofnon-identicalwordalignments.4.2ExtrinsicEvaluationWeextrinsicallyevaluateoursystemontextualsimi-larityidentificationandparaphrasedetection.Herewediscusseachtaskandtheresultsofevaluation.4.2.1SemanticTextualSimilarityGiventwoshortinputtextfragments(commonlysentences),thegoalofthistaskistoidentifythedegreetowhichthetwofragmentsaresemanticallysimilar.The*SEM2013STStask(Agirreetal.,2013)assessedanumberofSTSsystemsonfourtestdatasetsbycomparingtheiroutputscorestohumanannotations.Pearsoncorrelationcoefficientwithhu-manannotationswascomputedindividuallyforeachtestsetandaweightedsumofthecorrelationswasusedasthefinalevaluationmetric(theweightforeachdatasetwasproportionaltoitssize).Weapplyouralignertothetaskbyaligningeachsentencepairandtakingtheproportionofcontentwordsalignedinthetwosentences(bynormalizingwiththeharmonicmeanoftheirnumberofcontentwords)asaproxyoftheirsemanticsimilarity.OnlythreeofthefourSTS2013datasetsarefreelyavail-able4(headlines,OnWN,andFNWN),whichweuseforourexperiments(leavingouttheSMTdataset).4http://ixa2.si.ehu.es/sts/SystemCorrel.%RankHanetal.(2013)73.71(original)JacanaAlign46.266ThisWork67.27Table4:ExtrinsicevaluationonSTS2013dataThesethreesetscontain1500annotatedsentencepairsintotal.Table4showstheresults.Thefirstrowshowstheperformanceofthetopsysteminthetask.Withadirectapplicationofouraligner(noparametertun-ing),ourSTSalgorithmacheivesa67.15%weightedcorrelation,whichwouldearnitthe7thrankamong90participatingsystems.ConsideringthefactthatalignmentisoneofmanycomponentsofSTS,thisresultistrulypromising.Forcomparison,wealsoevaluatethepreviousbestalignernamedJacanaAlign(Yaoetal.,2013a)onSTS2013data(theJacanaAlignpublicrelease5isused,whichisaversionoftheoriginalalignerwithextralexicalresources).Weapplythreedifferentval-uesderivedfromitsoutputasproxiesofsemanticsimilarity:a)alignedcontentwordproportion,b)theViterbidecodingscore,andc)thenormalizeddecod-ingscore.Ofthethree,(b)givesthebestresults,whichweshowinrow2ofTable4.OuraligneroutperformsJacanaAlignbyalargemargin.4.2.2ParaphraseIdentificationThegoalofparaphraseidentificationistodecideiftwosentenceshavethesamemeaning.Theoutputisayes/nodecisioninsteadofareal-valuedsimilarityscoreasinSTS.WeusetheMSRparaphrasecor-pus6(4076devpairs,1725testpairs)(Dolanetal.,2004)toevaluateouralignerandcomparewithotheraligners.Followingearlierwork(MacCartneyetal.,2008;Yaoetal.,2013b),weuseanormalizedalign-mentscoreofthetwosentencestomakeadecisionbasedonathresholdwhichwesetusingthedevset.Alignmentswithahigher-than-thresholdscorearetakentobeparaphrasesandtherestnon-paraphrases.Again,thisisanoversimplifiedapplicationofthealigner,evenmoresothaninSTS,sinceasmallchangeinlinguisticpropertiesoftwosentences(e.g.polarityormodality)canturnthemintonon-5https://code.google.com/p/jacana/6http://research.microsoft.com/en-us/downloads/607d14d9-20cd-47e3-85bc-a2f65cd28042/
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
228
SystemAcc.%P%R%F1%Madnanietal.(2012)77.479.089.984.1Yaoetal.(2013a)70.072.688.179.6Yaoetal.(2013b)68.168.695.879.9ThisWork73.476.686.481.2Table5:ExtrinsicevaluationonMSRparaphrasedataparaphrasesdespitehavingahighdegreeofalign-ment.Sothealignerwasnotexpectedtodemonstratestate-of-the-artperformance,butstillitgetscloseasshowninTable5.Thefirstcolumnshowstheaccu-racyofeachsysteminclassifyingtheinputsentencesintooneoftwoclasses:true(paraphrases)andfalse(non-paraphrases).Therestofthecolumnsshowtheperformanceofthesystemforthetrueclassintermsofprecision,recall,andF1score.Italicizednumbersrepresentscoresthatwerenotreportedbytheauthorsofthecorrespondingpapers,buthavebeenrecon-structedfromthereporteddata(andhencearelikelytohavesmallprecisionerrors).Thefirstrowshowsthebestperformancebyanysystemonthetestsettothebestofourknowledge.Thenexttworowsshowtheperformancesoftwostate-of-the-artaligners(performancesofbothsys-temswerereportedin(Yaoetal.,2013b)).Thelastrowshowstheperformanceofouraligner.Al-thoughitdoesworsethanthebestparaphrasesystem,itoutperformstheotheraligners.5DiscussionOurexperimentsrevealthatawordalignerbasedonsimplemeasuresoflexicalandcontextualsimilar-itycandemonstratestate-of-the-artaccuracy.How-ever,asalignmentisfrequentlyacomponentoflargertasks,highaccuracyaloneisnotalwayssufficient.Otherdimensionsofanaligner’susabilityincludespeed,consumptionofcomputingresources,replica-bility,andgeneralizabilitytodifferentapplications.Ourdesigngoalsincludeachievingabalanceamongsuchmultifariousandconflictinggoals.Aspeedadvantageofouralignerstemsfromfor-mulatingtheproblemasone-to-onewordalignmentandthusavoidinganexpensivedecodingphase.Thepresenceofmultiplephasesisoffsetbydiscardingalreadyalignedwordsinsubsequentphases.TheuseofPPDBastheonly(hashable)wordsimilarityresourcehelpsinreducinglatencyaswellasspacerequirements.AsshowninSection4.1.1,furtherspeedupcouldbeachievedwithonlyasmallperfor-mancedegradationbyconsideringonlythetextualneighborhoodassourceofcontextualevidence.However,thetwomajorgoalsthatwebelievethealignerachievestothegreatestextentarereplicabil-ityandgeneralizability.Theeasyreplicabilityofthealignerstemsfromitsuseofonlybasicandfre-quentlyusedNLPmodules(alemmatizer,aPOStagger,anNERmodule,andadependencyparser:allavailableaspartoftheStanfordCoreNLPsuite7;weuseaPythonwrapper8)andasinglewordsimilarityresource(PPDB).Weexperimentallyshowthatthealignercanbesuccessfullyappliedtodifferentalignmentdatasetsaswellasmultipleendtasks.WebelieveadesigncharacteristicthatenhancesthegeneralizabilityofthealignerisitsminimaldependenceontheMSRalignmenttrainingdata,whichoriginatesfromatex-tualentailmentcorpushavinguniquepropertiessuchasdisparitiesinthelengthsoftheinputsentencesandadirectionalnatureoftheirrelationship(i.e.,thepremiseimplyingthehypothesis,butnotviceversa).Arelatedpotentialreasonisthesymmetryofthealigner’soutput(causedbyitsassumptionofnodirectionality)–thefactthatitoutputsthesamesetofalignmentsregardlessoftheorderoftheinputsentences,incontrasttomostexistingaligners.Majorlimitationsofthealignerincludetheinabil-itytoalignphrases,includingmultiwordexpressions.Itisincapableofcapturingandexploitinglongdis-tancedependenciesamongwords(e.g.coreferences).NowordsimilarityresourceisperfectandPPDBisnoexception,thereforecertainwordalignmentsalsoremainundetected.6ConclusionsWeshowhowcontextualevidencecanbeusedtoconstructamonolingualwordalignerwithcertainde-siredproperties,includingstate-of-the-artaccuracy,easyreplicability,andhighgeneralizability.Somepotentialavenuesforfutureworkinclude:allow-ingphrase-levelalignmentviaphrasalsimilarityre-sources(e.g.thephrasalparaphrasesofPPDB),in-cludingothersourcesofsimilaritysuchasvectorspacemodelsorWordNetrelations,expandingtheset7http://nlp.stanford.edu/downloads/corenlp.shtml8https://github.com/dasmith/stanford-corenlp-python
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
229
ofdependencyequivalencesand/orusingsemanticroleequivalences,andformulatingouralignmental-gorithmasobjectiveoptimizationratherthangreedysearch.Thealignerisavailablefordownloadathttps://github.com/ma-sultan/monolingual-word-aligner.AcknowledgmentsThismaterialisbasedinpartuponworksupportedbytheNationalScienceFoundationunderGrantNum-bersEHR/0835393andEHR/0835381.Anyopin-ions,findings,andconclusionsorrecommendationsexpressedinthismaterialarethoseoftheauthor(s)anddonotnecessarilyreflecttheviewsoftheNa-tionalScienceFoundation.ReferencesEnekoAgirre,DanielCer,MonaDiab,AitorGonzalez-Agirre,andWeiweiGuo.2013.*SEM2013sharedtask:SemanticTextualSimilarity.InProceedingsoftheSecondJointConferenceonLexicalandCompu-tationalSemantics.AssociationforComputationalLinguistics,32-43.ColinBannardandChrisCallison-Burch.2005.Para-phrasingwithBilingualParallelCorpora.InProceed-ingsofthe43rdAnnualMeetingonAssociationforComputationalLinguistics.AssociationforComputa-tionalLinguistics,597-604.DanielB¨ar,ChrisBiemann,IrynaGurevych,andTorstenZesch.2012.UKP:computingsemantictextualsim-ilaritybycombiningmultiplecontentsimilaritymea-sures.InProceedingsoftheFirstJointConferenceonLexicalandComputationalSemantics.AssociationforComputationalLinguistics,435-440.RoyBar-Haim,IdoDagan,BillDolan,LisaFerro,DaniloGiampiccolo,BernardoMagnini,andIdanSzpektor.2006.TheSecondPASCALRecognisingTextualEn-tailmentChallenge.InProceedingsofTheSecondPASCALRecognisingTextualEntailmentChallenge.ChrisBrockett.2007.AligningtheRTE2006Cor-pus.TechnicalReportMSR-TR-2007-77,MicrosoftResearch.NathanaelChambers,DanielCer,TrondGrenager,DavidHall,ChloeKiddon,BillMacCartney,Marie-CatherinedeMarneffe,DanielRamage,EricYeh,andChristopherDManning.2007.Learningalignmentsandleverag-ingnaturallogic.InProceedingsoftheACL-PASCALWorkshoponTextualEntailmentandParaphrasingAs-sociationforComputationalLinguistics,165-170.Ming-WeiChang,DanGoldwasser,DanRoth,andVivekSrikumar.2010.DiscriminativeLearningoverCon-strainedLatentRepresentations.InProceedingsofthe2010AnnualConferenceoftheNorthAmericanChap-teroftheAssociationforComputationalLinguisticsAssociationforComputationalLinguistics,429-437.DipanjanDasandNoahA.Smith.2009.ParaphraseIden-ticationasProbabilisticQuasi-SynchronousRecogni-tion.InProceedingsoftheJointConferenceofthe47thAnnualMeetingoftheACLandthe4thInternationalJointConferenceonNaturalLanguageProcessingoftheAFNLP.AssociationforComputationalLinguistics,468-476.Marie-CatherinedeMarneffe,BillMacCartney,andChristopherD.Manning.2006.GeneratingTypedDependencyParsesfromPhraseStructureParses.InProceedingsoftheInternationalConferenceonLan-guageResourcesandEvaluation.449-454.Marie-CatherinedeMarneffeandChristopherD.Man-ning.2008.Stanfordtypeddependenciesmanual.TechnicalReport,StanfordUniversity.BillDolan,ChrisQuirk,andChrisBrockett.2004.Un-supervisedConstructionofLargeParaphraseCorpora:ExploitingMassivelyParallelNewsSources.InPro-ceedingsoftheInternationalConferenceonCompu-tationalLinguistics.AssociationforComputationalLinguistics,350-356.JennyRoseFinkel,TrondGrenager,andChristopherMan-ning.2005.IncorporatingNon-localInformationintoInformationExtractionSystemsbyGibbsSampling.InProceedingsofthe43rdAnnualMeetingoftheAssoci-ationforComputationalLinguistics.AssociationforComputationalLinguistics,363-370.JuriGanitkevitch,BenjaminVanDurme,andChrisCallison-Burch.2013.PPDB:TheParaphraseDatabase.InProceedingsofthe2013ConferenceoftheNorthAmericanChapteroftheAssociationforCom-putationalLinguistics.AssociationforComputationalLinguistics,758-764.LushanHan,AbhayKashyap,TimFinin,JamesMayeld,andJonathanWeese.2013.UMBCEBIQUITY-CORE:SemanticTextualSimilaritySystems.InProceedingsoftheSecondJointConferenceonLexicalandCompu-tationalSemantics,Volume1.AssociationforCompu-tationalLinguistics,44-52.AndrewHicklandJeremyBensley.2007.ADiscourseCommitment-BasedFrameworkforRecognizingTex-tualEntailment.InProceedingsoftheACL-PASCALWorkshoponTextualEntailmentandParaphrasing.As-sociationforComputationalLinguistics,171-176.AndrewHickl,JeremyBensley,JohnWilliams,KirkRoberts,BryanRink,andYingShi.2006.Recog-nizingTextualEntailmentwithLCCsGROUNDHOG
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
e
d
u
/
t
a
c
l
/
l
a
r
t
i
c
e
–
p
d
f
/
d
o
i
/
.
1
0
1
1
6
2
/
t
l
a
c
_
a
_
0
0
1
7
8
1
5
6
6
8
3
6
/
/
t
l
a
c
_
a
_
0
0
1
7
8
p
d
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
230
System.InProceedingsoftheSecondPASCALChal-lengesWorkshoponRecognizingTextualEntailment.MilenKouylekovandBernardoMagnini.2005.Rec-ognizingtextualentailmentwithtreeeditdistanceal-gorithms.InProceedingsofthePASCALChallengesWorkshop:RecognisingTextualEntailmentChallenge17-20.BillMacCartney,MichelGalley,andChristopherD.Man-ning.2008.APhrase-BasedAlignmentModelforNat-uralLanguageInference.InProceedingsofthe2008ConferenceonEmpiricalMethodsinNaturalLanguageProcessing.AssociationforComputationalLinguistics,802-811.NitinMadnani,JoelTetreault,andMartinChodorow.2012.Re-examiningMachineTranslationMetricsforParaphraseIdentification.InProceedingsof2012Con-ferenceoftheNorthAmericanChapteroftheAssoci-ationforComputationalLinguistics.AssociationforComputationalLinguistics,182-190.KapilThadaniandKathleenMcKeown.2011.OptimalandSyntactically-InformedDecodingforMonolingualPhrase-BasedAlignment.InProceedingsofthe49thAnnualMeetingoftheAssociationforComputationalLinguistics:HumanLanguageTechnologies.Associa-tionforComputationalLinguistics,254-259.KapilThadani,ScottMartin,andMichaelWhite.2012.AJointPhrasalandDependencyModelforParaphraseAlignment.InProceedingsofCOLING2012:Posters.1229-1238.KristinaToutanova,DanKlein,ChristopherD.Manning,andYoramSinger.2003Feature-richPart-of-speechTaggingwithaCyclicDependencyNetworkInPro-ceedingsofthe2003HumanLanguageTechnologyConferenceoftheNorthAmericanChapteroftheAsso-ciationforComputationalLinguistics.AssociationforComputationalLinguistics,173-180.XuchenYao,BenjaminVanDurme,ChrisCallison-Burch,andPeterClark.2013a.ALightweightandHighPer-formanceMonolingualWordAligner.InProceedingsofthe51stAnnualMeetingoftheAssociationforCom-putationalLinguistics.AssociationforComputationalLinguistics,702-707.XuchenYao,BenjaminVanDurme,ChrisCallison-Burch,andPeterClark.2013b.Semi-MarkovPhrase-basedMonolingualAlignment.InProceedingsofthe2013ConferenceonEmpiricalMethodsinNaturalLanguageProcessing.AssociationforComputationalLinguistics,590-600.XuchenYao,BenjaminVanDurme,ChrisCallison-Burch,andPeterClark.2013c.AnswerExtractionasSe-quenceTaggingwithTreeEditDistance.InProceed-ingsofthe2013ConferenceoftheNorthAmericanChapteroftheAssociationforComputationalLinguis-tics.AssociationforComputationalLinguistics,858-867.
Download pdf