Transactions of the Association for Computational Linguistics, vol. 6, pp. 197–210, 2018. Action Editor: Hinrich Schütze.

Submission batch: 6/2017; Revision batch: 9/2017; Published 4/2018. © 2018 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Knowledge Completion for Generics using Guided Tensor Factorization

Hanie Sedghi∗
Google Brain
Mountain View, CA, U.S.A.
hsedghi@google.com

Ashish Sabharwal
Allen Institute for Artificial Intelligence (AI2)
Seattle, WA, U.S.A.
AshishS@allenai.org

∗ This work was done while the author was affiliated with the Allen Institute for Artificial Intelligence.

Abstract

Given a knowledge base or KB containing (noisy) facts about common nouns or generics, such as “all trees produce oxygen” or “some animals live in forests”, we consider the problem of inferring additional such facts at a precision similar to that of the starting KB. Such KBs capture general knowledge about the world, and are crucial for various applications such as question answering. Different from commonly studied named entity KBs such as Freebase, generics KBs involve quantification, have more complex underlying regularities, tend to be more incomplete, and violate the commonly used locally closed world assumption (LCWA). We show that existing KB completion methods struggle with this new task, and present the first approach that is successful. Our results demonstrate that external information, such as relation schemas and entity taxonomies, if used appropriately, can be a surprisingly powerful tool in this setting. First, our simple yet effective knowledge guided tensor factorization approach achieves state-of-the-art results on two generics KBs (80% precise) for science, doubling their size at 74%-86% precision. Second, our novel taxonomy guided, submodular, active learning method for collecting annotations about rare entities (e.g., oriole, a bird) is 6x more effective at inferring further new facts about them than multiple active learning baselines.

1 Introduction

We consider the problem of completing a partial knowledge base (KB) containing facts about generics or common nouns, represented as a third-order tensor of (source, relation, target) triples, such as (butterfly, pollinate, flower) and (thermometer, measure, temperature). Such facts capture common knowledge that humans have about the world. They are arguably essential for intelligent agents with human-like conversational abilities as well as for specific applications such as question answering. We demonstrate that state-of-the-art KB completion methods perform poorly when faced with generics, while our strategies for incorporating external knowledge as well as obtaining additional annotations for rare entities provide the first successful solution to this challenging new task.

Since generics represent classes of similar individuals, the truth value y_i of a generics triple x_i = (s, r, t) depends on the quantification semantics one associates with s and t. Indeed, the semantics of generics statements can be ambiguous, even self-contradictory, due to cultural norms. As Leslie (2008) points out, ‘ducks lay eggs’ is generally considered true while ‘ducks are female’, which is true for a broader set of ducks than the former statement, is generally considered false.

To avoid deep philosophical issues, we fix a particular mathematical semantics that is especially relevant for noisy facts derived automatically from text: associate s with a categorical quantification from {all, some, none} and associate t (implicitly) with some. For instance, “all butterflies pollinate (some) flower” and “some animals live in (some) forest”. When presenting such triples to humans, they are phrased as: is it true that all butterflies pollinate some flower? As a notational shortcut, we treat the quantification of s as the categorical label y_i for the triple x_i. For example, (butterfly, pollinate, flower)