Transactions of the Association for Computational Linguistics, 1 (2013) 179–192. Action Editor: Johan Bos.

Submitted 1/2013; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.

Combined Distributional and Logical Semantics

Mike Lewis
School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
mike.lewis@ed.ac.uk

Mark Steedman
School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
steedman@inf.ed.ac.uk

Abstract

We introduce a new approach to semantics which combines the benefits of distributional and formal logical semantics. Distributional models have been successful in modelling the meanings of content words, but logical semantics is necessary to adequately represent many function words. We follow formal semantics in mapping language to logical representations, but differ in that the relational constants used are induced by offline distributional clustering at the level of predicate-argument structure. Our clustering algorithm is highly scalable, allowing us to run on corpora the size of Gigaword. Different senses of a word are disambiguated based on their induced types. We outperform a variety of existing approaches on a wide-coverage question answering task, and demonstrate the ability to make complex multi-sentence inferences involving quantifiers on the FraCaS suite.

1 Introduction

Mapping natural language to meaning representations is a central challenge of NLP. There has been much recent progress in unsupervised distributional semantics, in which the meaning of a word is induced based on its usage in large corpora. This approach is useful for a range of key applications including question answering and relation extraction (Lin and Pantel, 2001; Poon and Domingos, 2009; Yao et al., 2011). Because such a semantics can be automatically induced, it escapes the limitation of depending on relations from hand-built training data, knowledge bases or ontologies, which have proved of limited use in capturing the huge variety of meanings that can be expressed in language.

However, distributional semantics has largely developed in isolation from the formal semantics literature. Whilst distributional semantics has been effective in modelling the meanings of content words such as nouns and verbs, it is less clear that it can be applied to the meanings of function words. Semantic operators, such as determiners, negation, conjunctions, modals, tense, mood, aspect, and plurals are ubiquitous in natural language, and are crucial for high performance on many practical applications, but current distributional models struggle to capture even simple examples. Conversely, computational models of formal semantics have shown low recall on practical applications, stemming from their reliance on ontologies such as WordNet (Miller, 1995) to model the meanings of content words (Bobrow et al., 2007; Bos and Markert, 2005).

For example, consider what is needed to answer a question like Did Google buy YouTube? from the following sentences:

1. Google purchased YouTube
2. Google's acquisition of YouTube
3. Google acquired every company
4. YouTube may be sold to Google
5. Google will buy YouTube or Microsoft
6. Google didn't take over YouTube

All of these require knowledge of lexical semantics (e.g. that buy and purchase are synonyms), but some also need interpretation of quantifiers, negatives, modals and disjunction. It seems unlikely that
distributional or formal approaches can accomplish the task alone.

We propose a method for mapping natural language to first-order logic representations capable of capturing the meanings of function words such as every, not and or, but which also uses distributional statistics to model the meaning of content words. Our approach differs from standard formal semantics in that the non-logical symbols used in the logical form are cluster identifiers. Where standard semantic formalisms would map the verb write to a write′ symbol, we map it to a cluster identifier such as relation37, which the noun author may also map to. This mapping is learnt by offline clustering.

Unlike previous distributional approaches, we perform clustering at the level of predicate-argument structure, rather than syntactic dependency structure. This means that we abstract away from many syntactic differences that are not present in the semantics, such as conjunctions, passives, relative clauses, and long-range dependencies. This significantly reduces sparsity, so we have fewer predicates to cluster and more observations for each.

Of course, many practical inferences rely heavily on background knowledge about the world; such knowledge falls outside the scope of this work.

2 Background

Our approach is based on Combinatory Categorial Grammar (CCG; Steedman, 2000), a strongly lexicalised theory of language in which lexical entries for words contain all language-specific information. The lexical entry for each word contains a syntactic category, which determines which other categories the word may combine with, and a semantic interpretation, which defines the compositional semantics. For example, the lexicon may contain the entry:

write ⊢ (S\NP)/NP : λyλx.write′(x, y)

Crucially, there is a transparent interface between the syntactic category and the semantics. For example, the transitive verb entry above defines the verb syntactically as a function mapping two noun-phrases to a sentence, and semantically as a binary relation between its two argument entities. This means that it is relatively straightforward to deterministically map parser output to a logical form, as in the Boxer system (Bos, 2008). This form of semantics captures the underlying predicate-argument structure, but fails to license many important inferences, as, for example, write and author do not map to the same predicate.

In addition to the lexicon, there is a small set of binary combinators and unary rules, which have a syntactic and semantic interpretation. Figure 1 gives an example CCG derivation.

Every                      dog           barks
NP↑/N                      N             S\NP
λpλq.∀x[p(x) ⟹ q(x)]      λx.dog′(x)    λx.bark′(x)
--------------------------------------- >
NP↑ : λq.∀x[dog′(x) ⟹ q(x)]
------------------------------------------------------ >
S : ∀x[dog′(x) ⟹ bark′(x)]

Figure 1: A standard logical form derivation using CCG. The NP↑ notation means that the subject is type-raised and takes the verb phrase as an argument, so it is an abbreviation of S/(S\NP). This is necessary in part to support a correct semantics for quantifiers.

Input Sentence
    Shakespeare wrote Macbeth
⇓ Initial semantic analysis
    write_{arg0,arg1}(shakespeare, macbeth)
⇓ Entity Typing
    write_{arg0:PER,arg1:BOOK}(shakespeare:PER, macbeth:BOOK)
⇓ Distributional semantic analysis
    relation37(shakespeare:PER, macbeth:BOOK)

Figure 2: Layers used in our model.

3 Overview of Approach

We attempt to learn a CCG lexicon which maps equivalent words onto the same logical form, for example learning entries such as:

author ⊢ N/PP[of] : λxλy.relation37(x, y)
write ⊢ (S\NP)/NP : λxλy.relation37(x, y)

The only change to the standard CCG derivation is that the symbols used in the logical form are arbitrary relation identifiers. We learn these by first mapping to a deterministic logical form (using predicates
such as author′ and write′), using a process similar to Boxer, and then clustering predicates based on their arguments. This lexicon can then be used to parse new sentences, and integrates seamlessly with CCG theories of formal semantics.

Typing predicates (for example, determining that writing is a relation between people and books) has become standard in relation clustering (Schoenmackers et al., 2010; Berant et al., 2011; Yao et al., 2012). We demonstrate how to build a typing model into the CCG derivation, by subcategorizing all terms representing entities in the logical form with a more detailed type. These types are also induced from text, as explained in Section 5, but for convenience we describe them with human-readable labels, such as PER, LOC and BOOK.

A key advantage of typing is that it allows us to model ambiguous predicates. Following Berant et al. (2011), we assume that different type signatures of the same predicate have different meanings, but given a type signature a predicate is unambiguous. For example, a different lexical entry for the verb born is used in the contexts Obama was born in Hawaii and Obama was born in 1961, reflecting a distinction in the semantics that is not obvious in the syntax.¹ Typing also greatly improves the efficiency of clustering, as we only need to compare predicates with the same type during clustering (for example, we do not have to consider clustering a predicate between people and places with predicates between people and dates).

¹ Whilst this assumption is very useful, it does not always hold; for example, the genitive in Shakespeare's book is ambiguous between ownership and authorship relations even given the types of the arguments.

In this work, we focus on inducing binary relations. Many existing approaches have shown how to produce good clusterings of (non-event) nouns (Brown et al., 1992), any of which could be simply integrated into our semantics, but relation clustering remains an open problem (see Section 9).

N-ary relations are binarized, by creating a binary relation between each pair of arguments. For example, for the sentence Russia sold Alaska to the United States, the system creates three binary relations, corresponding to sellToSomeone(Russia, Alaska), buyFromSomeone(US, Alaska), sellSomethingTo(Russia, US). This transformation does not exactly preserve meaning, but still captures the most important relations. Note that this allows us to compare semantic relations across different syntactic types; for example, both transitive verbs and argument-taking nouns can be seen as expressing binary semantic relations between entities.

Figure 2 shows the layers used in our model.

4 Initial Semantic Analysis

The initial semantic analysis maps parser output onto a logical form, in a similar way to Boxer. The semantic formalism is based on Steedman (2012).

The first step is syntactic parsing. We use the C&C parser (Clark and Curran, 2004), trained on CCGBank (Hockenmaier and Steedman, 2007), using the refined version of Honnibal et al. (2010) which brings the syntax closer to the predicate-argument structure. An automatic post-processing step makes a number of minor changes to the parser output, which converts the grammar into one more suitable for our semantics. PP (prepositional phrase) and PR (phrasal verb complement) categories are sub-categorised with the relevant preposition. Noun compounds with the same MUC named-entity type (Chinchor and Robinson, 1997) are merged into a single non-compositional node² (we otherwise ignore named-entity types). All argument NPs and PPs are type-raised, allowing us to represent quantifiers. All prepositional phrases are treated as core arguments (i.e. given the category PP, not adjunct categories like (N\N)/NP or ((S\NP)\(S\NP))/NP), as it is difficult for the parser to distinguish arguments and adjuncts.

² For example, this allows us to give Barack Obama the semantics λx.barackobama(x) instead of λx.barack(x) ∧ obama(x), which is more convenient for collecting distributional statistics.

Initial semantic lexical entries for almost all words can be generated automatically from the syntactic category and POS tag (obtained from the parser), as the syntactic category captures the underlying predicate-argument structure. We use a Davidsonian-style representation of arguments (Davidson, 1967), which we binarize by creating a separate predicate for each pair of arguments of a word. These predicates are labelled with the lemma of the head word and a Propbank-style argument key (Kingsbury and Palmer, 2002), e.g. arg0, argIn. We distinguish noun and verb predicates based on POS
tag; so, for example, we have different predicates for effect as a noun or verb.

This algorithm can be overridden with manual lexical entries for specific closed-class function words. Whilst it may be possible to learn these from data, our approach is pragmatic as there are relatively few such words, and the complex logical forms required would be difficult to induce from distributional statistics. We add a small number of lexical entries for words such as negatives (no, not etc.), and quantifiers (numbers, each, every, all, etc.). Some example initial lexical entries are shown in Figure 3.

            Word     Category         Semantics
Automatic   author   N/PP[of]         λxλy.author_{arg0,argOf}(y, x)
            write    (S\NP)/NP        λxλy.write_{arg0,arg1}(y, x)
Manual      every    NP↑/N            λpλq.∀x[p(x) → q(x)]
            not      (S\NP)/(S\NP)    λpλx.¬p(x)

Figure 3: Example initial lexical entries

5 Entity Typing Model

Our entity-typing model assigns types to nouns, which is useful for disambiguating polysemous predicates. Our approach is similar to O'Seaghdha (2010) in that we aim to cluster entities based on the noun and unary predicates applied to them (it is simple to convert from the binary predicates to unary predicates). For example, we want the pair (born_{argIn}, 1961) to map to a DAT type, and (born_{argIn}, Hawaii) to map to a LOC type. This is non-trivial, as both the predicates and arguments can be ambiguous between multiple types, but topic models offer a good solution (described below).

5.1 Topic Model

We assume that the type of each argument of a predicate depends only on the predicate and argument, although Ritter et al. (2010) demonstrate an advantage of modelling the joint probability of the types of multiple arguments of the same predicate. We use the standard Latent Dirichlet Allocation model (Blei et al., 2003), which performs comparably to more complex models proposed in O'Seaghdha (2010).

In topic-modelling terminology, we construct a document for each unary predicate (e.g. born_{argIn}), based on all of its argument entities (words). We assume that these arguments are drawn from a small number of types (topics), such as PER, DAT or LOC.³ Each type j has a multinomial distribution φ_j over arguments (for example, a LOC type is more likely to generate Hawaii than 1961). Each unary predicate i has a multinomial distribution θ_i over topics, so the born_{argIn} predicate will normally generate a DAT or LOC type. Sparse Dirichlet priors α and β on the multinomials bias the distributions to be peaky. The parameters are estimated by Gibbs sampling, using the Mallet implementation (McCallum, 2002).

The generative story to create the data is:

For every type k:
    Draw the p(arg|k) distribution φ_k from Dir(β)
For every unary predicate i:
    Draw the p(type|i) distribution θ_i from Dir(α)
    For every argument j:
        Draw a type z_{ij} from Mult(θ_i)
        Draw an argument w_{ij} from Mult(φ_{z_{ij}})

5.2 Typing in Logical Form

In the logical form, all constants and variables representing entities x can be assigned a distribution over types p_x(t) using the type model. An initial type distribution is applied in the lexicon, using the φ distributions for the types of nouns, and the θ_i distributions for the type of arguments of binary predicates (inverted using Bayes' rule). Then at each β-reduction in the derivation, we update probabilities of the types to be the product of the type distributions of the terms being reduced. If two terms x and

³ Types are induced from the text, but we give human-readable labels here for convenience.
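
The generative story of Section 5.1 can be made concrete with a small forward-sampling sketch. This is only an illustration under stated assumptions, not the authors' implementation: the paper estimates parameters by Gibbs sampling with Mallet, whereas the code below merely samples data from the model. The function names (`sample_dirichlet`, `sample_index`, `generate`), the symmetric priors, the prior values of 0.5 and the toy vocabulary are all hypothetical choices made for this example.

```python
import random


def sample_dirichlet(concentration, size, rng):
    """Draw one sample from a symmetric Dirichlet using normalised Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(size)]
    total = sum(draws)
    if total == 0.0:
        # Numerical underflow for very small concentrations: fall back to uniform.
        return [1.0 / size] * size
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw one index from a multinomial given as a list of probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate(num_types, vocab, predicates, args_per_predicate,
             alpha=0.5, beta=0.5, seed=0):
    """Run the generative story once; returns {predicate: [(type, argument), ...]}."""
    rng = random.Random(seed)
    # For every type k: draw phi_k = p(arg | k) from Dir(beta).
    phi = [sample_dirichlet(beta, len(vocab), rng) for _ in range(num_types)]
    data = {}
    for predicate in predicates:
        # For every unary predicate i: draw theta_i = p(type | i) from Dir(alpha).
        theta = sample_dirichlet(alpha, num_types, rng)
        observations = []
        for _ in range(args_per_predicate):
            z = sample_index(theta, rng)   # type z_ij drawn from Mult(theta_i)
            w = sample_index(phi[z], rng)  # argument w_ij drawn from the chosen type
            observations.append((z, vocab[w]))
        data[predicate] = observations
    return data


# Toy run: predicate and entity names are taken from the paper's examples,
# but the number of types and arguments is an arbitrary assumption.
sample = generate(
    num_types=3,
    vocab=["Hawaii", "1961", "Obama", "Macbeth", "Shakespeare"],
    predicates=["born_argIn", "write_arg1"],
    args_per_predicate=4,
)
```

Concentration values below 1 correspond to the sparse priors mentioned in the text: each predicate then tends to concentrate on a few types, and each type on a few arguments.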

file      (S\NP)/NP    λy:{DOC=0.5, LEGAL=0.4, CLOTHES=0.01, …} λx:{PER=0.7, ORG=0.2, …}. file_{arg0,arg1}(x, y)
a suit    NP↑          λp.∃y:{CLOTHES=0.6, LEGAL=0.3, DOC=0.001, …}[suit′(y) ∧ p(y)]
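
The update described in Section 5.2, taking the product of the type distributions of the terms being reduced, can be sketched on the values shown above for the object of file and for suit. The function name `combine_types` is hypothetical. Renormalising the product so that it sums to one, and treating a type missing from either distribution as having zero probability, are assumptions of this sketch; the text here states only that the product is taken. Only the entries listed in the figure are used, and the elided ones are left out.

```python
def combine_types(p_x, p_y):
    """Pointwise product of two type distributions, renormalised to sum to 1."""
    shared = sorted(set(p_x) & set(p_y))
    product = {t: p_x[t] * p_y[t] for t in shared}
    total = sum(product.values())
    if total == 0:
        raise ValueError("the two terms share no type with non-zero probability")
    return {t: value / total for t, value in product.items()}


# Listed entries from the figure; the elided entries are not reconstructed.
file_object = {"DOC": 0.5, "LEGAL": 0.4, "CLOTHES": 0.01}
suit = {"CLOTHES": 0.6, "LEGAL": 0.3, "DOC": 0.001}

combined = combine_types(file_object, suit)
best_type = max(combined, key=combined.get)
```

Although CLOTHES is the most likely type for suit in isolation and DOC for the object of file, the LEGAL type dominates after the two distributions are multiplied, which is how the combination disambiguates file a suit.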