Transactions of the Association for Computational Linguistics, vol. 3, pp. 257–270, 2015. Action Editor: Katrin Erk.
Submission batch: 12/2014; Revision batch 3/2015; Published 5/2015.
2015 Association for Computational Linguistics. Distributed under a CC-BY-NC-SA 4.0 Licence.
c
(cid:13)
LearningaCompositionalSemanticsforFreebasewithanOpenPredicateVocabularyJayantKrishnamurthyCarnegieMellonUniversity5000ForbesAvenuePittsburgh,PA15213jayantk@cs.cmu.eduTomM.MitchellCarnegieMellonUniversity5000ForbesAvenuePittsburgh,PA15213tom.mitchell@cmu.eduAbstractWepresentanapproachtolearningamodel-theoreticsemanticsfornaturallanguagetiedtoFreebase.Crucially,ourapproachusesanopenpredicatevocabulary,enablingittoproducedenotationsforphrasessuchas“Re-publicanfront-runnerfromTexas”whosese-manticscannotberepresentedusingtheFree-baseschema.Ourapproachdirectlyconvertsasentence’ssyntacticCCGparseintoalog-icalformcontainingpredicatesderivedfromthewordsinthesentence,assigningeachwordaconsistentsemanticsacrosssentences.Thislogicalformisevaluatedagainstalearnedprobabilisticdatabasethatdefinesadistribu-tionoverdenotationsforeachtextualpred-icate.Atrainingphaseproducesthisprob-abilisticdatabaseusingacorpusofentity-linkedtextandprobabilisticmatrixfactoriza-tionwithanovelrankingobjectivefunction.Weevaluateourapproachonacompositionalquestionansweringtaskwhereitoutperformsseveralcompetitivebaselines.Wealsocom-pareourapproachagainstmanuallyannotatedFreebasequeries,findingthatouropenpred-icatevocabularyenablesustoanswermanyquestionsthatFreebasecannot.1IntroductionTraditionalknowledgerepresentationassumesthatworldknowledgecanbeencodedusingaclosedvocabularyofformalpredicates.Inrecentyears,semanticparsinghasenabledustobuildcompo-sitionalmodelsofnaturallanguagesemanticsus-ingsuchaclosedpredicatevocabulary(ZelleandMooney,1996;ZettlemoyerandCollins,2005).Thesesemanticparsersmapnaturallanguagestate-mentstodatabasequeries,enablingapplicationssuchasansweringquestionsusingalargeknowl-edgebase(Yahyaetal.,2012;KrishnamurthyandMitchell,2012;CaiandYates,2013;Kwiatkowskietal.,2013;Berantetal.,2013;BerantandLiang,2014;Reddyetal.,2014).En outre,themodel-theoreticsemanticsprovidedbysuchparsershavethepotentialtoimproveperformanceonothertasks,suchasinformationextractionandcoreferenceres-olution.However,aclosedpredicatevocabularyhasinher-entlimitations.First,itscoveragewillbelimited,assuchvocabulariesaretypicallymanuallycon-structed.Second,itmayabstractawaypotentiallyrelevantsemanticdifferences.Forexample,these-manticsof“Republicanfront-runner”cannotbead-equatelyencodedintheFreebaseschemabecauseitlackstheconceptofa“front-runner.”Wecouldchoosetoencodethisconceptas“politician”atthecostofabstractingawaythedistinctionbetweenthetwo.Asthisexampleillustrates,thesetwoproblemsareprevalentineventhelargestknowledgebases.Analternativeparadigmisanopenpredicatevocabulary,whereeachnaturallanguagewordorphraseisgivenitsownformalpredicate.Thisparadigmisembodiedinbothopeninformationex-traction(Bankoetal.,2007)anduniversalschema(Riedeletal.,2013).Openpredicatevocabularieshavethepotentialtocapturesubtlesemanticdistinc-tionsandachievehighcoverage.However,wehaveyettodevelopcompellingapproachestocomposi-tionalsemanticswithinthisparadigm.Thispapertakesasteptowardcompositionalse-
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
258
InputTextRepublicanfront-runnerfromTexas→LogicalFormλx.∃y,z.FRONT-RUNNER(X)∧y=/EN/REPUBLICAN∧NN(oui,X)∧z=/EN/TEXAS∧FROM(X,z)→EntityProb./EN/GEORGEBUSH0.57/EN/RICKPERRY0.45…φθTEXASREPUB.G.BUSH…F.-RUN.POL.STATE…Entity/PredicateEmbeddingsφθ(G.BUSH,TEXAS)(G.BUSH,REPUB.)(REPUB.,G.BUSH)…FROMLIVESINNN…→TEXASREPUB.G.BUSH…F.-RUN.POL.STATE….1.1.9.1.1.9.8.1.1ProbabilisticDatabase(G.BUSH,TEXAS)(G.BUSH,REPUB.)(REPUB.,G.BUSH)…FROMLIVESINNN….9.1.1.9.1.1.1.1.7Figure1:Overviewofourapproach.Topleft:thetextisconvertedtologicalformbyCCGsyntacticparsingandacollectionofmanually-definedrules.Bottom:low-dimensionalembeddingsofeachentity(entitypair)andcategory(relation)arelearnedfromanentity-linkedwebcorpus.Theseembeddingsareusedtoconstructaprobabilisticdatabase.Thelabelsofthesematricesareshortenedforspacereasons.Topright:evaluatingthelogicalformontheprobabilisticdatabasecomputesthemarginalprobabilitythateachentityisanelementofthetext’sdenotation.manticswithanopenpredicatevocabulary.Ourap-proachdefinesadistributionoverdenotations(setsofFreebaseentities)givenaninputtext.Themodelhastwocomponents,showninFigure1.Thefirstcomponentisarule-basedsemanticparserthatusesasyntacticCCGparserandmanually-definedrulestomapentity-linkedtextstologicalformscontain-ingpredicatesderivedfromthewordsinthetext.Thesecondcomponentisaprobabilisticdatabasewithapossibleworldssemanticsthatdefinesadis-tributionoverdenotationsforeachtextually-derivedpredicate.Thisdatabaseassignsindependentprob-abilitiestoindividualpredicateinstances,suchasP(FRONT-RUNNER(/EN/GEORGEBUSH))=0.9.Together,thesecomponentsdefineanexponentially-largedistributionoverdenotationsforaninputtext;tosimplifythisoutput,wecomputethemarginalprobability,overallpossibleworlds,thateachentityisanelementofthetext’sdenotation.Thelearningprobleminourapproachistotraintheprobabilisticdatabasetopredictadenotationforeachpredicate.Weposethisproblemasprobabilis-ticmatrixfactorizationwithanovelquery/answerrankingobjective.Thisfactorizationlearnsalow-dimensionalembeddingofeachentity(entitypair)andcategory(relation)suchthatthedenotationofapredicateislikelytocontainentitiesorentitypairswithnearbyvectors.Totrainthedatabase,wefirstcollecttrainingdatabyanalyzingentity-linkedsen-tencesinalargewebcorpuswiththerule-basedsemanticparser.Thisprocessgeneratesacollec-tionoflogicalformquerieswithobservedentityan-swers.Thequery/answerrankingobjective,whenoptimized,trainsthedatabasetoranktheobservedanswersforeachqueryaboveunobservedanswers.Weevaluateourapproachonaquestionanswer-ingtask,findingthatourapproachoutperformssev-eralbaselinesandthatournewtrainingobjectiveimprovesperformanceoverapreviously-proposedobjective.Wealsoevaluatethetrade-offsbetweenopenandclosedpredicatevocabulariesbycompar-ingourapproachtoamanually-annotatedFreebasequeryforeachquestion.Thiscomparisonrevealsthat,whenFreebasecontainspredicatesthatcoverthequestion,itachieveshigherprecisionandrecallthanourapproach.However,ourapproachcancor-rectlyanswermanyquestionsnotcoveredbyFree-base.2SystemOverviewThepurposeofoursystemistopredictadenota-tionγforagivennaturallanguagetexts.Thede-notationγisthesetofFreebaseentitiesthatsrefersto;forexample,ifs=“presidentoftheUS,”thenγ={/EN/OBAMA,/EN/BUSH,…}.1Oursystem1Thispaperusesasimplemodel-theoreticsemanticswherethedenotationofanounphraseisasetofentitiesandthede-notationofasentenceiseithertrueorfalse.However,forno-tationalconvenience,denotationsγwillbetreatedassetsofentitiesthroughout.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
259
representsthispredictionproblemusingthefollow-ingprobabilisticmodel:P.(c|s)=XwX‘P(c|‘,w)P.(w)P.(‘|s)Thefirstterminthisfactorization,P.(‘|s),isadis-tributionoverlogicalforms‘giventhetexts.Thistermcorrespondstotherule-basedsemanticparser(Section3).Thissemanticparserisdeterministic,sothistermassignsprobability1toasinglelogicalformforeachtext.Thesecondterm,P.(w),repre-sentsadistributionoverpossibleworlds,whereeachworldisanassignmentoftruthvaluestoallpossiblepredicateinstances.Thedistributionoverworldsisrepresentedbyaprobabilisticdatabase(Section4).Thefinalterm,P.(c|‘,w),deterministicallyevalu-atesthelogicalform‘ontheworldwtoproduceadenotationγ.Thistermrepresentsqueryevaluationagainstafixeddatabase,asinotherworkonseman-ticparsing.Section5describesinferenceinourmodel.Toproducearankedlistofentities(Figure1,topright)fromP(c|s),oursystemcomputesthemarginalprobabilitythateachentityisanelementofthede-notationγ.Thisproblemcorrespondstoqueryeval-uationinaprobabilisticdatabase,whichisknowntobetractableinmanycases(Suciuetal.,2011).Section6describestraining,whichestimatespa-rametersfortheprobabilisticdatabaseP(w).Thisstepfirstautomaticallygeneratestrainingdatausingtherule-basedsemanticparser.Thisdataisusedtoformulateamatrixfactorizationproblemthatisop-timizedtoestimatethedatabaseparameters.3Rule-BasedSemanticParserThefirstpartofourcompositionalsemanticssystemisarule-basedsystemthatdeterministicallycom-putesalogicalform‘foratexts.Thiscomponentisusedduringinferencetoanalyzethelogicalstruc-tureoftext,andduringtrainingtogeneratetrainingdata(seeSection6.1).Severalinput/outputpairsforthissystemareshowninFigure2.Theconversiontologicalformhas3phases:1.CCGsyntacticparsingparsesthetextandap-pliesseveraldeterministicsyntactictransfor-mationstofacilitatesemanticanalysis.2.EntitylinkingmarksknownFreebaseentitiesinthetext.3.Semanticanalysisassignsalogicalformtoeachword,thencomposesthemtoproducealogicalformforthecompletetext.3.1SyntacticParsingThefirststepinouranalysisistosyntacticallyparsethetext.WeusetheASP-SYNparser(Kr-ishnamurthyandMitchell,2014)trainedonCCG-Bank(HockenmaierandSteedman,2002).Wethenautomaticallytransformtheresultingsyntacticparsetomakethesyntacticstructuremoreamenabletosemanticanalysis.ThisstepmarksNPsinconjunctionsbyreplacingtheirsyntacticcategorywithNP[conj].Thistransformationallowsseman-ticanalysistodistinguishbetweenappositivesandcomma-separatedlists.Italsotransformsallverbar-gumentstocorearguments,i.e.,usingthecategoryPP/NPasopposedto((S\NP)\(S\NP))/NP.Thisstepsimplifiesthesemanticanalysisofverbswithprepositionalphrasearguments.ThefinaltransformationaddsawordfeaturetoeachPPcat-egory,e.g.,mappingPPtoPP[par].Thesefeaturesareusedtogenerateverb-prepositionrelationpredi-cates,suchasDIRECTEDBY.3.2EntityLinkingThesecondstepistoidentifymentionsofFreebaseentitiesinthetext.Thisstepcouldbeperformedbyanoff-the-shelfentitylinkingsystem(Ratinovetal.,2011;MilneandWitten,2008)orstringmatching.However,ourtrainingandtestdataisderivedfromClueweb2009,sowerelyontheentitylinkingforthiscorpusprovidedbyGabrilovichet.al(2013).Oursystemincorporatestheprovidedentitylinksintothesyntacticparseprovidedthattheyarecon-sistentwiththeparsestructure.Specifically,were-quirethateachmentioniseither(1)aconstituentintheparsetreewithsyntacticcategoryNorNPor(2)acollectionofN/NorNP/NPmodifierswithasingleheadword.Thefirstcasecoversnounandnounphrasementions,whilethesecondcasecov-ersnouncompounds.Inbothcases,wesubstituteasinglemulti-wordterminalintotheparsetreespan-ningthementionandinvokespecialsemanticrulesformentionsdescribedinthenextsection.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
260
DanHesse,CEOofSprintλx.∃y.x=/EN/DANHESSE∧CEO(X)∧OF(X,oui)∧y=/EN/SPRINTYankeespitcherλx.∃y.PITCHER(X)∧NN(oui,X)∧y=/EN/YANKEESTomCruiseplaysMaverickinthemovieTopGun.∃x,oui,z.x=/EN/TOMCRUISE∧PLAYS(X,oui)∧y=/EN/MAVERICK(CHARACTER)∧PLAYSIN(X,z)∧z=/EN/TOPGUNFigure2:Exampleinput/outputpairsforourseman-ticanalysissystem.MentionsofFreebaseentitiesinthetextareindicatedbyunderlines.3.3SemanticanalysisThefinalstepusesthesyntacticparseandentitylinkstoproducealogicalformforthetext.Thesys-teminducesalogicalformforeverywordinthetextbasedonitssyntacticCCGcategory.Composingtheselogicalformsaccordingtothesyntacticparseproducesalogicalformfortheentiretext.Oursemanticanalysesarebasedonarelativelyna¨ıvemodel-theoreticsemantics.Wefocusonlan-guagewhosesemanticscanberepresentedwithexistentially-quantifiedconjunctionsofunaryandbinarypredicates,ignoring,forexample,tempo-ralscopeandsuperlatives.Generally,oursys-temmodelsnounsandadjectivesasunarypredi-cates,andverbsandprepositionsasbinarypredi-cates.Specialmulti-wordpredicatesaregeneratedforverb-prepositioncombinations.Entitymentionsaremappedtothementionedentityinthelogicalform.Wealsocreatedspecialrulesforanalyzingconjunctions,appositives,andrelativizingconjunc-tions.Thecompletelistofrulesusedtoproducetheselogicalformsisavailableonline.2Wemadeseveralnotablechoicesindesigningthiscomponent.First,multi-argumentverbsareana-lyzedusingpairwiserelations,asinthethirdexam-pleinFigure2.Thisanalysisallowsustoavoidrea-soningaboutentitytriples(quadruples,etc.),whicharechallengingforthematrixfactorizationduetosparsity.Second,noun-prepositioncombinationsareanalyzedasacategoryandrelation,asinthefirstexampleinFigure2.Weempiricallyfoundthatcombiningthenounandprepositioninsuch2http://rtw.ml.cmu.edu/tacl2015_csfinstancesresultedinworseperformance,asitdra-maticallyincreasedthesparsityoftraininginstancesforthecombinedrelations.Third,entitymentionswiththeN/Ncategoryareanalyzedusingaspe-cialnoun-nounrelation,asinthesecondexampleinFigure2.Ourintuitionisthatthisrelationsharesinstanceswithotherrelations(e.g.,“cityinTexas”implies“Texancity”).Enfin,welowercasedeachwordtocreateitspredicatename,butperformednolemmatizationorothernormalization.3.4DiscussionThescopeofoursemanticanalysissystemissome-whatlimitedrelativetoothersimilarsystems(Bos,2008;LewisandSteedman,2013)asitonlyout-putsexistentially-quantifiedconjunctionsofpredi-cates.Ourgoalinbuildingthissystemwastoan-alyzenounphrasesandsimplesentences,forwhichthisrepresentationgenerallysuffices.Thereasonforthisfocusistwofold.First,thissubsetoflanguageissufficienttocapturemuchofthelanguagesurround-ingFreebaseentities.Second,forvarioustechni-calreasons,thisrestrictedsemanticrepresentationiseasiertouse(andmoreinformative)fortrainingtheprobabilisticdatabase(seeSection6.3).Notethatthissystemcanbestraightforwardlyex-tendedtomodeladditionallinguisticphenomena,suchasadditionallogicaloperatorsandgeneralizedquantifiers,bywritingadditionalrules.Theseman-ticsoflogicalformsincludingtheseoperationsarewell-definedinourmodel,andthesystemdoesnotevenneedtobere-trainedtoincorporatetheseaddi-tions.4ProbabilisticDatabaseThesecondpartofourcompositionalsemanticssys-temisaprobabilisticdatabase.Thisdatabaserep-resentsadistributionoverpossibleworlds,whereeachworldisanassignmentoftruthvaluestoev-erypredicateinstance.Equivalently,theprobabilis-ticdatabasecanbeviewedasadistributionoverdatabasesorknowledgebases.Formally,aprobabilisticdatabaseisacollectionofrandomvariables,eachofwhichrepresentsthetruthvalueofasinglepredicateinstance.Givenen-titiese∈E,categoriesc∈C,andrelationsr∈Rtheprobabilisticdatabasecontainsbooleanrandom
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
261
variablesc(e)andr(e1,e2)foreachcategoryandrelationinstance,respectively.Alloftheseran-domvariablesareassumedtobeindependent.Letaworldwrepresentanassignmentoftruthvaluestoalloftheserandomvariables,wherec(e)=wc,eandr(e1,e2)=wr,e1,e2.Byindependence,theprobabil-ityofaworldcanbewrittenas:P.(w)=Ye∈EYc∈CP(c(e)=wc,e)×Ye1∈EYe2∈EYr∈RP(r(e1,e2)=wr,e1,e2)Thenextsectiondiscusseshowprobabilisticma-trixfactorizationisusedtomodeltheprobabilitiesofthesepredicateinstances.4.1MatrixFactorizationModelTheprobabilisticmatrixfactorizationmodeltreatsthetruthofeachpredicateinstanceasanindepen-dentbooleanrandomvariablethatistruewithprob-ability:P.(c(e)=TRUE)=σ(θTcφe)P.(r(e1,e2)=TRUE)=σ(θTrφ(e1,e2))Above,p(X)=ex1+existhelogisticfunction.Intheseequations,θcandθrrepresentk-dimensionalvectorsofper-predicateparameters,whileφeandφ(e1,e2)representk-dimensionalvectorembeddingsofeachentityandentitypair.Thismodelcontainsalow-dimensionalembeddingofeachpredicateandentitysuchthateachpredicate’sdenotationhasahighprobabilityofcontainingentitieswithnearbyvectors.Theprobabilitythateachvariableisfalseissimply1minustheprobabilitythatitistrue.Thismodelcanbeviewedasmatrixfactorization,asdepictedinFigure1.Thecategoryandrelationinstanceprobabilitiescanbearrangedinapairofmatricesofdimension|E|×|C|et|E|2×|R.|.Eachrowofthesematricesrepresentsanentityorentitypair,eachcolumnrepresentsacategoryorre-lation,andeachvalueisbetween0and1andrep-resentsatruthprobability(Figure1,bottomright).Thesetwomatricesarefactoredintomatricesofsize|E|×kandk×|C|,et|E|2×kandk×|R.|,re-spectively,containingk-dimensionalembeddingsofeachentity,catégorie,entitypairandrelation(Figure1,bottomleft).Theselow-dimensionalembeddingsarerepresentedbytheparametersφandθ.5Inference:ComputingMarginalProbabilitiesInferencecomputesthemarginalprobability,overallpossibleworlds,thateachentityisanelementofatext’sdenotation.Inmanycases–dependingonthetext–thesemarginalprobabilitiescanbecom-putedexactlyinpolynomialtime.TheinferenceproblemistocalculateP(e∈γ|s)foreachentitye.BecauseboththesemanticparserP(‘|s)andqueryevaluationP(c|‘,w)aredeter-ministic,thisproblemcanberewrittenas:P.(e∈γ|s)=Xγ1(e∈γ)P.(c|s)=Xw1(e∈J‘Kw)P.(w)Above,‘representsthelogicalformforthetextsproducedbytherule-basedsemanticparser,and1representstheindicatorfunction.ThenotationJ‘Kwrepresentsdenotationproducedby(determin-istically)evaluatingthelogicalform‘onworldw.Thisinferenceproblemcorrespondstoqueryeval-uationinaprobabilisticdatabase,whichis#P-hardingeneral.Intuitively,thisproblemcanbedifficultbecauseP(c|s)isajointdistributionoversetsofen-titiesthatcanbeexponentiallylargeinthenumberofentities.However,alargesubsetofprobabilisticdatabasequeries,knownassafequeries,permitpolynomialtimeevaluation(DalviandSuciu,2007).Safequeriescanbeevaluatedextensionallyusingaprob-abilisticnotionofadenotationthattreatseachentityasindependent.LetJ‘KPdenoteaprobabilisticde-notation,whichisafunctionfromentities(orentitypairs)toprobabilities,i.e.,J‘KP(e)∈[0,1].Thedenotationofalogicalformisthencomputedre-cursively,inthesamemannerasanon-probabilisticdenotation,usingprobabilisticextensionsofthetyp-icalrules,suchas:JcKP(e)=XwP(w)1(wc,e)JrKP(e1,e2)=XwP(w)1(wr,e1,e2)Jc1(X)∧c2(X)KP(e)=Jc1KP(e)×Jc2KP(e)J∃y.r(X,oui)KP(e)=1−Yy∈E(1−JrKP(e,oui))
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
262
Thefirsttworulesarebasecasesthatsimplyre-trievepredicateprobabilitiesfromtheprobabilisticdatabase.Theremainingrulescomputetheprob-abilisticdenotationofalogicalformfromthede-notationsofitsparts.3Theformulafortheprob-abilisticcomputationontherightofeachoftheserulesisastraightforwardconsequenceofthe(as-sumed)independenceofentities.Forexample,thelastrulecomputestheprobabilityofanORofasetofindependentrandomvariables(indexedbyy)us-ingtheidentityA∨B=¬(¬A∧¬B).Forsafequeries,J‘KP(e)=P(e∈γ|s),thatis,theproba-bilisticdenotationcomputedaccordingtotheaboverulesisequaltothemarginalprobabilitydistribu-tion.Inpractice,allofthequeriesintheexperi-mentsaresafe,becausetheycontainonlyonequeryvariableanddonotcontainrepeatedpredicates.Formoreinformationonqueryevaluationinprobabilis-ticdatabases,wereferthereadertoSuciuetal.(2011).Notethatinferencedoesnotcomputethemostprobabledenotation,maxγP(c|s).Insomesense,themostprobabledenotationisthecorrectoutputforamodel-theoreticsemantics.However,itishighlysensitivetotheprobabilitiesinthedatabase,andinmanycasesitisempty(becauseaconjunctionofindependentbooleanrandomvariablesisunlikelytobetrue).Producingarankedlistofentitiesisalsousefulforevaluationpurposes.6TrainingThetrainingprobleminourapproachistolearnpa-rametersθandφfortheprobabilisticdatabase.Weconsidertwodifferentobjectivefunctionsforlearn-ingtheseparametersthatuseslightlydifferentformsoftrainingdata.Inbothcases,traininghastwophases.First,wegeneratetrainingdata,intheformofobservedassertionsorquery-answerpairs,byap-plyingtherule-basedsemanticparsertoacorpusofentity-linkedwebtext.Second,weoptimizetheparametersoftheprobabilisticdatabasetorankob-servedassertionsoranswersaboveunobservedas-sertionsoranswers.3Thislistingofrulesispartialasitdoesnotinclude,e.g.,negationorjoinsbetweenone-argumentandtwo-argumentlog-icalforms.However,thecorrespondingrulesareeasytoderive.6.1TrainingDataTrainingdataisgeneratedbyapplyingtheprocessillustratedinFigure3toeachsentenceinanentity-linkedwebcorpus.First,weapplyourrule-basedsemanticparsertothesentencetoproducealogi-calform.Next,weextractportionsofthislogicalformwhereeveryvariableisboundtoaparticu-larFreebaseentity,resultinginasimplifiedlogicalform.Becausethelogicalformsareexistentially-quantifiedconjunctionsofpredicates,thisstepsim-plydiscardsanyconjunctsinthelogicalformcon-tainingavariablethatisnotboundtoaFreebaseentity.Fromthissimplifiedlogicalform,wegen-eratetwotypesoftrainingdata:(1)predicatein-stances,et(2)querieswithknownanswers(seeFigure3).Inbothcases,thecorpusconsistsentirelyofassumed-to-be-truestatements,makingobtainingnegativeexamplesamajorchallengefortraining.46.2PredicateRankingObjectiveRiedeletal.(2013)introducedarankingobjectivetoworkaroundthelackofnegativeexamplesinasim-ilarmatrixfactorizationproblem.TheirobjectiveisamodifiedversionofBayesianPersonalizedRank-ing(Rendleetal.,2009)thataimstorankobservedpredicateinstancesaboveunobservedinstances.Thisobjectivefunctionusesobservedpredicateinstances(Figure3,bottomleft)astrainingdata.Thisdataconsistsoftwocollections,{(ci,ei)}ni=1and{(rj,tj)}mj=1,ofobservedcategoryandrelationinstances.Weusetjtodenoteatupleofentities,tj=(ej,1,ej,2),tosimplifynotation.Thepredicaterankingobjectiveis:OP(je,φ)=nXi=1logσ(θTci(φei−φe0i))+mXj=1logσ(θTrj(φtj−φt0j))wheree0iisarandomlysampledentitysuchthat(ci,e0i)doesnotoccurinthetrainingdata.Simi-larly,t0jisarandomentitytuplesuchthat(rj,t0j)doesnotoccur.Maximizingthisfunctionattempts4Aseeminglysimplesolutiontothisproblemistorandomlygeneratenegativeexamples;cependant,weempiricallyfoundthatthisapproachperformsconsiderablyworsethanbothofthepro-posedrankingobjectives.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
263
OriginalsentenceandlogicalformGeneralPowell,appearingSundayonCNN’sLateEdition,said…∃w,X,oui,z.w=/EN/POWELL∧GENERAL(w)∧APPEARING(w,X)∧SUNDAY(X)∧APPEARINGON(w,oui)∧y=/EN/LATE∧’S(z,oui)∧z=/EN/CNN∧SAID(w,…)Simplifiedlogicalform∃w,oui,z.w=/EN/POWELL∧GENERAL(w)∧APPEARINGON(w,oui)∧y=/EN/LATE∧’S(z,oui)∧z=/EN/CNNInstancesQueriesAnswersGENERAL(/EN/POWELL)λw.GENERAL(w)∧APPEARINGON(w,/EN/LATE)/EN/POWELLAPPEARINGON(/EN/POWELL,/EN/LATE)λy.APPEARINGON(/EN/POWELL,oui)∧’S(/EN/CNN,oui)/EN/LATE’S(/EN/CNN,/EN/LATE)λz.’S(z,/EN/LATE)/EN/CNNFigure3:Illustrationoftrainingdatagenerationappliedtoasinglesentence.Wegeneratetwotypesoftrain-ingdata,predicateinstancesandquerieswithobservedanswers,bysemanticallyparsingthesentenceandextractingportionsofthegeneratedlogicalformwithobservedentityarguments.Thepredicateinstancesareextractedfromtheconjunctsinthesimplifiedlogicalform,andthequeriesarecreatedbyremovingasingleentityfromthesimplifiedlogicalform.tofindθci,φeiandφe0isuchthatP(ci(ei))islargerthanP(ci(e0i))(andsimilarlyforrelations).Duringtraining,e0iandt0jareresampledoneachpassoverthedatasetaccordingtoeachentityortuple’sem-piricalfrequency.6.3QueryRankingObjectiveThepreviousobjectiveaimstoranktheentitieswithineachpredicatewell.However,suchwithin-predicaterankingsareinsufficienttoproducecorrectanswersforqueriescontainingmultiplepredicates–thescoresforeachpredicatemustfurtherbecali-bratedtoworkwellwitheachothergiventheinde-pendenceassumptionsoftheprobabilisticdatabase.Weintroduceanewtrainingobjectivethatencour-agesgoodrankingsforentirequeriesinsteadofsin-glepredicates.Thedataforthisobjectiveconsistsoftuples,{(‘i,ei)}ni=1,ofaquery‘iwithanob-servedanswerei(Figure3,bottomright).Each‘iisafunctionwithexactlyoneentityargument,and‘i(e)isaconjunctionofpredicateinstances.Forex-ample,thelastqueryinFigure3isafunctionofoneargumentz,and‘(e)isasinglepredicateinstance,’S(e,/EN/LATE).Thenewobjectiveaimstoranktheobservedentityansweraboveunobservedenti-tiesforeachquery:OQ(je,φ)=nXi=1logPrank(‘i,ei,e0i)Prankgeneralizestheapproximaterankingprob-abilitydefinedbythepredicaterankingobjec-tivetomoregeneralqueries.Theexpressionσ(θTc(φe−φe0))inthepredicaterankingobjectivecanbeviewedasanapproximationoftheprob-abilitythateisrankedabovee0incategoryc.Prankusesthisapproximationforeachindividualpredicateinthequery.Forexample,giventhequery‘=λx.c(X)∧r(X,oui)andentities(e,e0),Prank(‘,e,e0)=σ(θc(φe−φe0))×σ(θr(φ(e,oui)−φ(e0,y))).Forthisobjective,wesamplee0isuchthat(‘i,e0i)doesnotoccurinthetrainingdata.When‘’sbodyconsistsofaconjunctionofpred-icates,thequeryrankingobjectivesimplifiescon-siderably.Inthiscase,‘canbedescribedasthreesetsofone-argumentfunctions:categoriesC(‘)={λx.c(X)},leftargumentsofrelationsRL(‘)={λx.r(X,oui)},andrightargumentsofre-lationsRR(‘)={λx.r(oui,X)}.En outre,Prankisaproductsowecandistributethelog:OQ(je,φ)=nXi=1Xλx.c(X)∈C(‘i)logσ(θc(φei−φe0i))+Xλx.r(X,oui)∈RL(‘i)logσ(θr(φ(ei,oui)−φ(e0i,oui)))+Xλx.r(oui,X)∈RR(‘i)logσ(θr(φ(oui,ei)−φ(oui,e0i)))Thissimplificationrevealsthatthemaindiffer-encebetweenOQandOPisthesamplingoftheunobservedentitiese0andtuplest0.OPsamplestheminanunconstrainedfashionfromtheirempir-icaldistributionsforeverypredicate.OQconsiders
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
264
thelargercontextinwhicheachpredicateoccurs,withtwomajoreffects.First,morenegativeexam-plesaregeneratedforcategoriesbecausethelogicalforms‘aremorespecific.Forexample,both“pres-identofSprint”and“presidentoftheUS”generateinstancesofthePRESIDENTpredicate;OQwilluseentitiesthatonlyoccurwithoneoftheseasnegativeexamplesfortheother.Second,therelationparam-etersaretrainedtoranktupleswithasharedargu-ment,asopposedtotuplesingeneral.Notethat,althoughPrankgeneralizestomorecomplexlogicalformsthanexistentially-quantifiedconjunctions,trainingwiththeselogicalformsismoredifficultbecausePrankisnolongeraproduct.Inthesecases,itbecomesnecessarytoperformin-ferencewithinthegradientcomputation,whichcanbeexpensive.Therestrictiontoconjunctionsmakesinferencetrivial,enablingthefactorizationabove.7EvaluationWeevaluateourapproachtocompositionalseman-ticsonaquestionansweringtask.Eachtestexam-pleisa(compositional)naturallanguagequestionwhoseanswerisasetofFreebaseentities.Wecom-pareouropendomainapproachtoseveralbaselinesbasedonpriorwork,aswellasahuman-annotatedFreebasequeryforeachexample.7.1DataWeusedClueweb09webcorpus5withthecorre-spondingGoogleFACCentitylinking(Gabrilovichetal.,2013)tocreatethetrainingandtestdataforourexperiments.Thetrainingdataisderivedfrom3millionwebpages,andcontains2.1mpredicatein-stances,1.1mqueries,172kentitiesand181kentitypairs.Predicatesthatappearedfewerthan6timesinthetrainingdatawerereplacedwiththepredicateUNK,resultingin25kcategoriesand2.2krelations.Ourtestdataconsistsoffill-in-the-blanknatu-rallanguagequestionssuchas“Incanemperor”or“CunninghamdirectedAuchtre’ssecondmusicvideo.”Thesequestionswerecreatedbyapply-ingthetrainingdatagenerationprocess(Section6.1)toacollectionofheld-outwebpages.Eachnaturallanguagequestionhasacorrespondinglogicalform5http://www.lemurproject.org/clueweb09.php#ofquestions220Avg.#ofpredicates/query2.77Avg.#ofcategories/query1.75Avg.#ofrelations/query1.02Avg.#ofanswers/query1.92#ofquestionswith≥1answer116(foundbyatleastonesystem)Table1:Statisticsofthetestdataset.querycontainingatleastonecategoryandrelation.Wechosenottouseexistingdatasetsforseman-ticparsingintoFreebaseasourgoalistomodelthesemanticsoflanguagethatcannotnecessarilybemodelledusingtheFreebaseschema.Existingdatasets,suchasFree917(CaiandYates,2013)andWe-bQuestions(Berantetal.,2013),wouldnotallowustoevaluateperformanceonthissubsetoflanguage.Consequently,weevaluateoursystemonanewdatasetwithunconstrainedlanguage.However,wedocompareourapproachagainstmanually-annotatedFreebasequeriesonournewdataset(Section7.5).Allofthedataforourexperimentsisavailableathttp://rtw.ml.cmu.edu/tacl2015_csf.7.2MethodologyOurevaluationmethodologyisinspiredbyinfor-mationretrievalevaluations(Manningetal.,2008).Eachsystempredictsarankedlistof100answersforeachtestquestion.Wethenpoolthetop30an-swersofeachsystemandmanuallyjudgetheircor-rectness.Thecorrectanswersfromthepoolarethenusedtoevaluatetheprecisionandrecallofeachsys-tem.Inparticular,wecomputeaverageprecision(AP)foreachquestionandreportthemeanaverageprecision(MAP)acrossallquestions.Wealsore-portaweightedversionofMAP,whereeachques-tion’sAPisweightedbyitsnumberofannotatedcorrectanswers.Averageprecisioniscomputedas1mPmk=1Prec(k)×Correct(k),wherePrec(k)istheprecisionatrankk,Correct(k)isanindicatorfunctionforwhetherthekthansweriscorrect,andmisthenumberofreturnedanswers(atmost100).StatisticsoftheannotatedtestsetareshowninTable1.Aconsequenceofourunconstraineddatagenerationapproachisthatsometestquestionsaredifficulttoanswer:ofthe220queries,atleastonesystemwasabletoproduceacorrectanswerfor116.Theremainingquestionsaremostlyunanswerable
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
265
MAPWeightedMAPCLUSTERING0.2240.266CORPUSLOOKUP0.2460.296FACTORIZATION(OP)0.2990.473FACTORIZATION(OQ)0.3090.492ENSEMBLE(OP)0.3910.614ENSEMBLE(OQ)0.4060.645Upperbound0.5271.0Table2:Meanaverageprecisionforourquestionansweringtask.ThedifferenceinMAPbetweeneachpairofadjacentmodelsisstatisticallysignifi-cant(p<.05)viathesigntest.becausetheyreferencerareentitiesunseeninthetrainingdata.7.3ModelsandBaselinesWeimplementedtwobaselinemodelsbasedonex-istingtechniques.TheCORPUSLOOKUPbaselineanswerstestquestionsbydirectlyusingthepredi-cateinstancesinthetrainingdataasitsknowledgebase.Forexample,giventhequeryλx.CEO(x)∧OF(x,/EN/SPRINT),thismodelwillreturnthesetofentitiesesuchthatCEO(e)andOF(e,/EN/SPRINT)bothappearinthetrainingdata.Allanswersfoundinthisfashionareassignedprobability1.TheCLUSTERINGbaselinefirstclustersthepred-icatesinthetrainingcorpus,thenanswersques-tionsusingtheclusteredpredicates.Theclusteringaggregatespredicateswithsimilardenotations,ide-allyidentifyingsynonymstosmoothoversparsityinthetrainingdata.OurapproachiscloselybasedonLewisandSteedman(2013),thoughisalsocon-ceptuallyrelatedtoapproachessuchasDIRT(LinandPantel,2001)andUSP(PoonandDomingos,2009).WeusetheChineseWhispersclusteringal-gorithm(Biemann,2006)andcalculatethesimilar-itybetweenpredicatesasthecosinesimilarityoftheirTF-IDFweightedentitycountvectors.Thede-notationofeachclusteristheunionofthedenota-tionsoftheclusteredpredicates,andeachentityinthedenotationisassignedprobability1.Wealsotrainedtwoprobabilisticdatabasemod-els,FACTORIZATION(OP)andFACTORIZATION(OQ),usingthetwoobjectivefunctionsdescribedinSections6.2and6.3,respectively.Weoptimizedbothobjectivesbyperforming100passesovertheRecallPrecisionENS.(OQ)ENS.(OP)FACT.(OQ)FACT.(OP)C.LOOKUPCLUSTERING00.20.40.60.81.00.30.40.50.60.70.80.91.0bbbbbbbbbbb+++++++++++rrrrrrrrrrruuuuuuuuuuubbbbbbbbbbb+++++++++++Figure4:Averaged11-pointprecision/recallcurvesforthe116answerabletestquestions.trainingdatawithAdaGrad(Duchietal.,2011)us-inganL2regularizationparameterofλ=10−4.Thepredicateandentityembeddingshave300di-mensions.Theseparameterswereselectedonthebasisofpreliminaryexperimentswithasmallvali-dationset.Finally,weobservedthatCORPUSLOOKUPhashighprecisionbutlowrecall,whilebothmatrixfac-torizationmodelshavehighrecallwithsomewhatlowerprecision.ThisobservationsuggestedthatanensembleofCORPUSLOOKUPandFACTORIZA-TIONcouldoutperformeithermodelindividually.Wecreatedtwoensembles,ENSEMBLE(OP)andENSEMBLE(OQ),bycalculatingtheprobabilityofeachpredicateasa50/50mixtureofeachmodel’spredictedprobability.7.4ResultsTable2showstheresultsofourMAPevaluation,andFigure4showsaprecision/recallcurveforeachmodel.TheMAPnumbersaresomewhatlowbe-causealmosthalfofthetestquestionshavenocor-rectanswersandallmodelsgetanaveragepreci-sionof0onthesequestions.TheupperboundonMAPisthefractionofquestionswithatleast1correctanswer.Notethatthemodelsperformwellontheanswerablequestions,asreflectedbythera-tiooftheachievedMAPtotheupperbound.TheweightedMAPmetricalsocorrectsfortheseunan-swerablequestions,astheyareassigned0weightintheweightedaverage.Theseresultsdemonstrateseveralfindings.First,wefindthatbothFACTORIZATIONmodelsoutper-formthebaselinesinbothMAPandweightedMAP. l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 3 7 1 5 6 6 7 6 2 / / t l a c _ a _ 0 0 1 3 7 p d . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 266 #ofquestionsw/anannotatedMQLquery142queryreturns>1answer95queryreturnsnoanswers47#ofquestionsw/oanMQLquery78Table3:StatisticsoftheFreebaseMQLqueriesan-notatedforthetestdataset.Theperformanceimprovementseemstobemostsignificantinthehighrecallregime(rightsideofFigure4).Deuxième,wefindthatthequeryrankingobjectiveOQimprovesperformanceoverthepredi-caterankingobjectiveOPby2-4%ontheanswer-ablequeries.Theprecision/recallcurvesshowthatthisimprovementisconcentratedinthelowrecallregime.Finally,theensemblemodelsareconsider-ablybetterthantheircomponentmodels;cependant,evenintheensembledmodels,wefindthatOQout-performsOPbyafewpercent.7.5ComparisontoSemanticParsingtoFreebaseAnaturalquestioniswhetherouropenvocabu-laryapproachoutperformsaclosedapproachforthesameproblem,suchassemanticparsingtoFreebase(e.g.,Reddyetal.(2014)).Inordertoanswerthisquestion,wecomparedourbestperformingmodeltoamanually-annotatedFreebasequeryforeachtestquestion.Thiscomparisonallowsustounderstandtherelativeadvantagesofopenandclosedpredicatevocabularies.ThefirstauthormanuallyannotatedaFreebaseMQLqueryforeachnaturallanguagequestioninthetestdataset.Thisannotationissomewhatsub-jective,asmanyofthequestionscanonlybeinex-actlymappedontotheFreebaseschema.Weusedthefollowingguidelinesinperformingthemap-ping:(1)allrelationsinthetextmustbemappedtooneormoreFreebaserelations,(2)allenti-tiesmentionedinthetextmustbeincludedinthequery,(3)adjectivemodifierscanbeignoredand(4)entitiesnotmentionedinthetextmaybein-cludedinthequery.Thefourthconditionisnec-essarybecausemanyone-placepredicates,suchasMAYOR(X),arerepresentedinFreebaseusingabinaryrelationtoaparticularentity,suchasGOVERNMENTOFFICE/TITLE(X,/EN/MAYOR).StatisticsoftheannotatedqueriesareshowninMAPENSEMBLE(OQ)0.263Freebase0.385Table4:Meanaverageprecisionofourbestper-formingmodelcomparedtoamanuallyannotatedFreebasequeryforeachtestquestion.Table3.Coverageisreasonablyhigh:wewereabletoannotateaFreebasequeryfor142questions(65%ofthetestset).Theremainingunannotatableques-tionsareduetomissingpredicatesinFreebase,suchasarelationdefiningtheemperoroftheIncanem-pire.Ofthe142annotatedFreebasequeries,95ofthemreturnatleastoneentityanswer.Thequerieswithnoanswerstypicallyreferenceuncommonen-titieswhichhavefewornoknownrelationinstancesinFreebase.Theannotatedqueriescontainanaver-ageof2.62Freebasepredicates.Wecomparedourbestperformingmodel,EN-SEMBLE(OQ),tothemanuallyannotatedFreebasequeriesusingthesamepooledevaluationmethodol-ogy.ThesetofcorrectanswerscontainsthecorrectpredictionsofENSEMBLE(OQ)fromthepreviousevaluationalongwithallanswersfromFreebase.ResultsfromthisevaluationareshowninTable4.6IntermsofoverallMAP,Freebaseoutperformsourapproachbyafairmargin.However,thisini-tialimpressionbeliesamorecomplexreality,whichisshowninTable5.Thistablecomparesbothap-proachesbytheirrelativeperformanceoneachtestquestion.Onapproximatelyone-thirdoftheques-tions,FreebasehasahigherAPthanourapproach.Onanotherthird,ourapproachhasahigherAPthanFreebase.Onthefinalthird,bothapproachesper-formequallywell–thesearetypicallyquestionswhereneitherapproachreturnsanycorrectanswers(67ofthe75).Freebaseoutperformsintheover-allMAPevaluationbecauseittendstoreturnmorecorrectanswerstoeachquestion.NotethattheannotatedFreebasequerieshaveseveraladvantagesinthisevaluation.First,Freebasecontainssignificantlymorepredicateinstancesthanourtrainingdata,whichallowsittoproducemorecompleteanswers.Second,theFreebasequeries6Thenumbersinthistablearenotcomparabletothenum-bersinTable2asthecorrectanswersforeachquestionaredif-ferent.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
267
#ofqueriesFreebasehigherAP75(34%)equalAP75(34%)ENSEMBLE(OQ)higherAP70(31%)Table5:Question-by-questioncomparisonofmodelperformance.Eachtestquestionisplacedintooneofthethreebucketsabove,dependingonwhetherFreebaseorENSEMBLE(OQ)achievesabetterav-erageprecision(AP)forthequestion.correspondtotheperformanceofaperfectsemanticparser,whilecurrentsemanticparsersachieveaccu-raciesaround68%(BerantandLiang,2014).Theresultsfromthisexperimentsuggestthatclosedandopenpredicatevocabulariesarecomple-mentary.Freebaseproduceshighqualityanswerswhenitcoversaquestion.However,manyofthere-mainingquestionscanbeansweredcorrectlyusinganopenvocabularyapproachlikeours.Thisevalu-ationalsosuggeststhatrecallisalimitingfactorofourapproach;inthefuture,recallcanbeimprovedbyusingalargercorpusorincludingFreebasein-stancesduringtraining.8RelatedWorkOpenPredicateVocabulariesTherehasbeenconsiderableworkongeneratingsemanticrepresentationswithanopenpredicatevo-cabulary.Muchoftheworkisnon-compositional,focusingonidentifyingsimilarpredicatesandenti-ties.DIRT(LinandPantel,2001),Resolver(YatesandEtzioni,2007)andothersystems(Yaoetal.,2012)clustersynonymousexpressionsinacorpusofrelationtriples.Matrixfactorizationisanalter-nativeapproachtoclusteringthathasbeenusedforrelationextraction(Riedeletal.,2013;Yaoetal.,2013)andfindinganalogies(Turney,2008;Speeretal.,2008).Allofthisworkiscloselyrelatedtodis-tributionalsemantics,whichusesdistributionalin-formationtoidentifysemanticallysimilarwordsandphrases(TurneyandPantel,2010;Griffithsetal.,2007).Someworkhasconsideredtheproblemofcom-positionalsemanticswithanopenpredicatevocab-ulary.Unsupervisedsemanticparsing(PoonandDomingos,2009;TitovandKlementiev,2011)isaclustering-basedapproachthatincorporatescom-positionusingagenerativemodelforeachsentencethatfactorsaccordingtoitsparsetree.LewisandSteedman(2013)alsopresentaclustering-basedap-proachthatusesCCGtoperformsemanticcompo-sition.Thisapproachissimilartoours,exceptthatweusematrixfactorizationandFreebaseentities.Finally,someworkhasfocusedontheproblemoftextualinferencewithinthisparadigm.Faderetal.(2013)presentaquestionansweringsystemthatlearnstoparaphraseaquestionsothatitcanbean-sweredusingacorpusofOpenIEtriples(Faderetal.,2011).Distributionalsimilarityhasalsobeenusedtolearnweightedlogicalinferencerulesthatcanbeusedforrecognizingtextualentailmentoridentifyingsemanticallysimilartext(Garretteetal.,2011;Garretteetal.,2013;Beltagyetal.,2013).Thislineofworkfocusesonperforminginferencebetweentexts,whereasourworkcomputesatext’sdenotation.AsignificantdifferencebetweenourworkandmostoftherelatedworkaboveisthatourworkcomputesdenotationscontainingFreebaseentities.Usingtheseentitieshastwoadvantages:(1)iten-ablesustouseentitylinkingtodisambiguatetextualmentions,et(2)itfacilitatesacomparisonagainstalternativeapproachesthatrelyonaclosedpredi-catevocabulary.Disambiguatingtextualmentionsisamajorchallengeforpreviousapproaches,soanentity-linkedcorpusisamuchcleanersourceofdata.However,ourapproachcouldalsoworkwithautomaticallyconstructedentities,forexample,cre-atedbyclusteringmentionsinanunsupervisedfash-ion(Singhetal.,2011).SemanticParsingSeveralsemanticparsershavebeendevelopedforFreebase(CaiandYates,2013;Kwiatkowskietal.,2013;Berantetal.,2013;BerantandLiang,2014).OurapproachismostsimilartothatofReddyetal.(2014),whichusesfixedsyntacticparsesofunla-beledtexttotrainaFreebasesemanticparser.Likeourapproach,thissystemautomatically-generatesquery/answerpairsfortraining.However,thissys-tem,likeallFreebasesemanticparsers,usesaclosedpredicatevocabularyconsistingofonlyFreebasepredicates.Incontrast,ourapproachusesanopenpredicatevocabularyandcanlearndenotationsforwordswhosesemanticscannotberepresentedusing
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
268
Freebasepredicates.Consequently,ourapproachcananswermanyquestionsthattheseFreebasese-manticparserscannot(seeSection7.5).Therule-basedsemanticparserusedinthispa-perisverysimilartoseveralotherrule-basedsys-temsthatproducelogicalformsfromsyntacticCCGparses(Bos,2008;LewisandSteedman,2013).Wedevelopedourownsysteminordertohavecontrolovertheparticularsoftheanalysis;cependant,ourap-proachiscompatiblewiththesesystemsaswell.ProbabilisticDatabasesOursystemassignsamodel-theoreticsemanticstostatementsinnaturallanguage(Dowtyetal.,1981)usingalearneddistributionoverpossibleworlds.Thisdistributionisconciselyrepresentedinaprobabilisticdatabase,whichcanbeviewedasasimpleMarkovLogicNetwork(RichardsonandDomingos,2006)wherealloftherandomvari-ablesareindependent.Thisindependencesimplifiesqueryevaluation:probabilisticdatabasespermitef-ficientexactinferenceforsafequeries(Suciuetal.,2011),andapproximateinferencefortheremain-der(Gatterbaueretal.,2010;GatterbauerandSuciu,2015).9DiscussionThispaperpresentsanapproachforcompositionalsemanticswithanopenpredicatevocabulary.Ourapproachdefinesaprobabilisticmodeloverdeno-tations(setsofFreebaseentities)conditionedonaninputtext.Themodelhastwocomponents:arule-basedsemanticparserthatproducesalogicalformforthetext,andaprobabilisticdatabasethatdefinesadistributionoverdenotationsforeachpredicate.Atrainingphaselearnstheprobabilisticdatabasebyapplyingprobabilisticmatrixfactorizationwithaquery/answerrankingobjectivetologicalformsde-rivedfromalarge,entity-linkedwebcorpus.Anex-perimentalanalysisdemonstratesthatthisapproachoutperformsseveralbaselinesandcananswermanyquestionsthatcannotbeansweredbysemanticpars-ingintoFreebase.Ourapproachlearnsamodel-theoreticsemanticsfornaturallanguagetexttiedtoFreebase,asdosomesemanticparsers,exceptwithanopenpredi-catevocabulary.Thisdifferenceinfluencesseveralotheraspectsofthesystem’sdesign.First,becausenoknowledgebasewiththenecessaryknowledgeexists,thesystemisforcedtolearnitsknowledgebase(intheformofaprobabilisticdatabase).Sec-ond,thesystemcandirectlymapsyntacticCCGparsestologicalforms,asitisnolongerneces-sarytomapwordstoaclosedvocabularyofknowl-edgebasepredicates.Insomesense,ourapproachistheexactoppositeofthetypicalsemanticpars-ingapproach:usually,thesemanticparserislearnedandtheknowledgebaseisfixed;ici,theknowl-edgebaseislearnedandthesemanticparserisfixed.Fromamachinelearningperspective,train-ingaprobabilisticdatabaseviamatrixfactorizationiseasierthantrainingasemanticparser,astherearenodifficultinferenceproblems.However,itremainstobeseenwhetheralearnedknowledgebasecanachievesimilarrecallasafixedknowledgebaseonthesubsetoflanguageitcovers.Therearetwolimitationsofthiswork.Themostobviouslimitationistherestrictiontoexistentiallyquantifiedconjunctionsofpredicates.Thislimita-tionisnotinherenttotheapproach,cependant,andcanberemovedinfutureworkbyusingasystemlikeBoxer(Bos,2008)forsemanticparsing.Amoreseriouslimitationistherestrictiontoone-andtwo-argumentpredicates,whichpreventsoursystemfromrepresentingeventsandn-aryrelations.Con-ceptually,asimilarmatrixfactorizationapproachcouldbeusedtolearnembeddingsforn-aryentitytuples;cependant,inpractice,thesparsityofthesetu-plesmakeslearningchallenging.Developingmeth-odsforlearningn-aryrelationsisanimportantprob-lemforfuturework.Adirectionforfutureworkisscalingupthesizeofthetrainingcorpustoimproverecall.Lowre-callisthemainlimitationofourcurrentsystemasdemonstratedbytheexperimentalanalysis.Bothstagesoftraining,thedatagenerationandmatrixfactorization,canbeparallelizedusingacluster.AlloftherelationinstancesinFreebasecanalsobeaddedtothetrainingcorpus.Itshouldbefeasibletoincreasethequantityoftrainingdatabyafactorof10-100,forexample,totrainonallofClueWeb.Scalingupthetrainingdatamayallowasemanticparserwithanopenpredicatevocabularytooutper-formcomparableclosedvocabularysystems.
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
269
AcknowledgmentsThisresearchwassupportedinpartbyDARPAun-dercontractnumberFA8750-13-2-0005,andbyagenerousgrantfromGoogle.WeadditionallythankMattGardner,NdapaNakashole,AmosAzariaandtheanonymousreviewersfortheirhelpfulcom-ments.ReferencesMicheleBanko,MichaelJ.Cafarella,StephenSoderland,MattBroadhead,andOrenEtzioni.2007.Openinfor-mationextractionfromtheweb.InProceedingsofthe20thInternationalJointConferenceonArtificalIntel-ligence.IslamBeltagy,CuongChau,GemmaBoleda,DanGar-rette,KatrinErk,andRaymondMooney.2013.Mon-taguemeetsmarkov:Deepsemanticswithprobabilis-ticlogicalform.InSecondJointConferenceonLexi-calandComputationalSemantics(*SEM),Volume1:ProceedingsoftheMainConferenceandtheSharedTask:SemanticTextualSimilarity.JonathanBerantandPercyLiang.2014.Semanticpars-ingviaparaphrasing.InProceedingsofthe52ndAn-nualMeetingoftheAssociationforComputationalLinguistics.JonathanBerant,AndrewChou,RoyFrostig,andPercyLiang.2013.SemanticparsingonFreebasefromquestion-answerpairs.InProceedingsofthe2013ConferenceonEmpiricalMethodsinNaturalLan-guageProcessing.ChrisBiemann.2006.Chinesewhispers:Anefficientgraphclusteringalgorithmanditsapplicationtonatu-rallanguageprocessingproblems.InProceedingsoftheFirstWorkshoponGraphBasedMethodsforNat-uralLanguageProcessing.JohanBos.2008.Wide-coveragesemanticanalysiswithboxer.InProceedingsofthe2008ConferenceonSe-manticsinTextProcessing.QingqingCaiandAlexanderYates.2013.Large-scaleSemanticParsingviaSchemaMatchingandLexiconExtension.InProceedingsoftheAnnualMeetingoftheAssociationforComputationalLinguistics(ACL).NileshDalviandDanSuciu.2007.Efficientqueryeval-uationonprobabilisticdatabases.TheVLDBJournal,16(4),October.DavidR.Dowty,RobertE.Wall,andStanleyPeters.1981.IntroductiontoMontagueSemantics.JohnDuchi,EladHazan,andYoramSinger.2011.Adaptivesubgradientmethodsforonlinelearningandstochasticoptimization.JournalofMachineLearningResearch,12:2121–2159,July.AnthonyFader,StephenSoderland,andOrenEtzioni.2011.Identifyingrelationsforopeninformationex-traction.InProceedingsoftheConferenceonEmpiri-calMethodsinNaturalLanguageProcessing.AnthonyFader,LukeZettlemoyer,andOrenEtzioni.2013.Paraphrase-drivenlearningforopenquestionanswering.InProceedingsofthe51stAnnualMeetingoftheAssociationforComputationalLinguistics.EvgeniyGabrilovich,MichaelRinggaard,andAmar-nagSubramanya.2013.FACC1:Freebaseanno-tationofClueWebcorpora,Version1(Releasedate2013-06-26,Formatversion1,Correctionlevel0).http://lemurproject.org/clueweb09/.DanGarrette,KatrinErk,andRaymondMooney.2011.Integratinglogicalrepresentationswithprobabilisticinformationusingmarkovlogic.InProceedingsoftheInternationalConferenceonComputationalSeman-tics.DanGarrette,KatrinErk,andRaymondJ.Mooney.2013.Aformalapproachtolinkinglogicalformandvector-spacelexicalsemantics.InHarryBunt,JohanBos,andStephenPulman,editors,ComputingMean-ing,volume4,pages27–48.WolfgangGatterbauerandDanSuciu.2015.Approx-imateliftedinferencewithprobabilisticdatabases.ProceedingsoftheVLDBEndowment,8(5),January.WolfgangGatterbauer,AbhayKumarJha,andDanSu-ciu.2010.Dissociationandpropagationforefficientqueryevaluationoverprobabilisticdatabases.InPro-ceedingsoftheFourthInternationalVLDBworkshoponManagementofUncertainData(MUD2010)inconjunctionwithVLDB2010,Singapore,September13,2010.ThomasL.Griffiths,JoshuaB.Tenenbaum,andMarkSteyvers.2007.Topicsinsemanticrepresentation.PsychologicalReview114.JuliaHockenmaierandMarkSteedman.2002.Acquir-ingcompactlexicalizedgrammarsfromacleanertree-bank.InProceedingsofThirdInternationalConfer-enceonLanguageResourcesandEvaluation.JayantKrishnamurthyandTomM.Mitchell.2012.Weaklysupervisedtrainingofsemanticparsers.InProceedingsofthe2012JointConferenceonEmpir-icalMethodsinNaturalLanguageProcessingandComputationalNaturalLanguageLearning.JayantKrishnamurthyandTomM.Mitchell.2014.Jointsyntacticandsemanticparsingwithcombinatorycat-egorialgrammar.InProceedingsofthe52ndAnnualMeetingoftheAssociationforComputationalLinguis-tics.TomKwiatkowski,EunsolChoi,YoavArtzi,andLukeZettlemoyer.2013.Scalingsemanticparserswithon-the-flyontologymatching.InProceedingsofthe
je
D
o
w
n
o
un
d
e
d
F
r
o
m
h
t
t
p
:
/
/
d
je
r
e
c
t
.
m
je
t
.
e
d
toi
/
t
un
c
je
/
je
un
r
t
je
c
e
–
p
d
F
/
d
o
je
/
.
1
0
1
1
6
2
/
t
je
un
c
_
un
_
0
0
1
3
7
1
5
6
6
7
6
2
/
/
t
je
un
c
_
un
_
0
0
1
3
7
p
d
.
F
b
oui
g
toi
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
270
2013ConferenceonEmpiricalMethodsinNaturalLanguageProcessing.MikeLewisandMarkSteedman.2013.Combineddistributionalandlogicalsemantics.TransactionsoftheAssociationforComputationalLinguistics,1:179–192.DekangLinandPatrickPantel.2001.DIRT—discov-eryofinferencerulesfromtext.InProceedingsoftheSeventhACMSIGKDDInternationalConferenceonKnowledgeDiscoveryandDataMining.ChristopherD.Manning,PrabhakarRaghavan,andHin-richSch¨utze.2008.IntroductiontoInformationRe-trieval.CambridgeUniversityPress,NewYork,New York,USA.DavidMilneandIanH.Witten.2008.Learningtolinkwithwikipedia.InProceedingsofthe17thACMCon-ferenceonInformationandKnowledgeManagement.HoifungPoonandPedroDomingos.2009.Unsuper-visedsemanticparsing.InProceedingsofthe2009ConferenceonEmpiricalMethodsinNaturalLan-guageProcessing.LevRatinov,DanRoth,DougDowney,andMikeAn-derson.2011.Localandglobalalgorithmsfordis-ambiguationtowikipedia.InProceedingsofthe49thAnnualMeetingoftheAssociationforComputationalLinguistics:HumanLanguageTechnologies.SivaReddy,MirellaLapata,andMarkSteedman.2014.Large-scalesemanticparsingwithoutquestion-answerpairs.TransactionsoftheAssociationofComputa-tionalLinguistics–Volume2,Issue1.SteffenRendle,ChristophFreudenthaler,ZenoGantner,andLarsSchmidt-Thieme.2009.BPR:Bayesianper-sonalizedrankingfromimplicitfeedback.InProceed-ingsoftheTwenty-FifthConferenceonUncertaintyinArtificialIntelligence.MatthewRichardsonandPedroDomingos.2006.Markovlogicnetworks.MachineLearning,62(1-2):107–136,February.SebastianRiedel,LiminYao,AndrewMcCallum,andBenjaminM.Marlin.2013.Relationextractionwithmatrixfactorizationanduniversalschemas.InJointHumanLanguageTechnologyConference/AnnualMeetingoftheNorthAmericanChapteroftheAsso-ciationforComputationalLinguistics.SameerSingh,AmarnagSubramanya,FernandoPereira,andAndrewMcCallum.2011.Large-scalecross-documentcoreferenceusingdistributedinferenceandhierarchicalmodels.InAssociationforComputa-tionalLinguistics:HumanLanguageTechnologies(ACLHLT).RobertSpeer,CatherineHavasi,andHenryLieberman.2008.AnalogySpace:Reducingthedimensionalityofcommonsenseknowledge.InAAAI.DanSuciu,DanOlteanu,ChristopherR´e,andChristophKoch.2011.Probabilisticdatabases.SynthesisLec-turesonDataManagement,3(2):1–180.IvanTitovandAlexandreKlementiev.2011.Abayesianmodelforunsupervisedsemanticparsing.InProceed-ingsofthe49thAnnualMeetingoftheAssociationforComputationalLinguistics.PeterD.TurneyandPatrickPantel.2010.Fromfre-quencytomeaning:vectorspacemodelsofsemantics.JournalofArtificialIntelligenceResearch,37(1),Jan-uary.PeterD.Turney.2008.Thelatentrelationmappingen-gine:Algorithmandexperiments.JournalofArtificialIntelligenceResearch,33(1):615–655,December.MohamedYahya,KlausBerberich,ShadyElbassuoni,MayaRamanath,VolkerTresp,andGerhardWeikum.2012.Naturallanguagequestionsforthewebofdata.InProceedingsofthe2012JointConferenceonEm-piricalMethodsinNaturalLanguageProcessingandComputationalNaturalLanguageLearning.LiminYao,SebastianRiedel,andAndrewMcCallum.2012.Unsupervisedrelationdiscoverywithsensedis-ambiguation.InProceedingsofthe50thAnnualMeet-ingoftheAssociationforComputationalLinguistics:LongPapers-Volume1.LiminYao,SebastianRiedel,andAndrewMcCallum.2013.Universalschemaforentitytypeprediction.InProceedingsofthe2013WorkshoponAutomatedKnowledgeBaseConstruction.AlexanderYatesandOrenEtzioni.2007.Unsupervisedresolutionofobjectsandrelationsontheweb.InPro-ceedingsofthe2007AnnualConferenceoftheNorthAmericanChapteroftheAssociationforComputa-tionalLinguistics.JohnM.ZelleandRaymondJ.Mooney.1996.Learningtoparsedatabasequeriesusinginductivelogicpro-gramming.InProceedingsofthethirteenthnationalconferenceonArtificialIntelligence.LukeS.ZettlemoyerandMichaelCollins.2005.Learn-ingtomapsentencestologicalform:structuredclas-sificationwithprobabilisticcategorialgrammars.InUAI’05,Proceedingsofthe21stConferenceinUn-certaintyinArtificialIntelligence.
Télécharger le PDF