计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay.

计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay.
提交批次: 7/2015; 修改批次: 10/2015; 已发表 12/2015.

2015 计算语言学协会. 根据 CC-BY 分发 4.0 执照.

C
(西德:13)

ParsingAlgebraicWordProblemsintoEquationsRikKoncel-Kedziorski,HannanehHajishirzi,AshishSabharwal†,OrenEtzioni†,andSienaDumasAngUniversityofWashington,†AllenInstituteforAI{kedzior,hannaneh,sienaang}@uw.edu,{ashishs,orene}@allenai.orgAbstractThispaperformalizestheproblemofsolv-ingmulti-sentencealgebraicwordproblemsasthatofgeneratingandscoringequationtrees.Weuseintegerlinearprogrammingtogener-ateequationtreesandscoretheirlikelihoodbylearninglocalandglobaldiscriminativemod-els.Thesemodelsaretrainedonasmallsetofwordproblemsandtheiranswers,withoutanymanualannotation,inordertochoosetheequationthatbestmatchestheproblemtext.WerefertotheoverallsystemasALGES.WecompareALGESwithpreviousworkandshowthatitcoversthefullgamutofarithmeticoperationswhereasHosseinietal.(2014)onlyhandleadditionandsubtraction.Inaddition,ALGESovercomesthebrittlenessoftheKush-manetal.(2014)approachonsingle-equationproblems,yieldinga15%to50%reductioninerror.1IntroductionGrade-schoolalgebrawordproblemsarebriefnar-ratives(seeFigure1).Atypicalproblemfirstde-scribesapartialworldstateconsistingofcharacters,实体,andquantities.Nextitupdatestheconditionofanentityorexplicatestherelationshipbetweenentities.Finally,itposesaquestionaboutaquantityinthenarrative.Anordinarychildhastolearntherequiredalge-bra,butwilleasilygraspthenarrativeutilizingex-tensiveworldknowledge,largevocabulary,word-sensedisambiguation,coreferenceresolution,mas-teryofsyntax,andtheabilitytocombineindividualOceansideBikeRentalShopcharges17dollarsplus7dollarsanhourforrentingabike.Tompaid80dollarstorentabike.Howmanyhoursdidhepaytohavethebikecheckedout?=+$17$∗$7$xh80$solution:917+(7∗x)=80Figure1:Exampleproblemandsolutionsentencesintoacoherentmentalmodel.Incontrast,thechallengeforanNLPsystemisto“makesense”ofthenarrative,whichmayrefertoarbitraryactiv-itieslikerentingbikes,collectingcoins,oreatingcookies.Previousworkcopedwiththeopen-domainas-pectofalgebraicwordproblemsbyrelyingondeter-ministicstatetransitionsbasedonverbcategoriza-tion(Hosseinietal.,2014)orbylearningtemplatesthatcoverequationsofparticularforms(Kushmanetal.,2014).Wehavediscovered,然而,thatbothapproachesarebrittle,particularlyastrainingdataisscarceinthisdomain,andthespaceofequationsgrowsexponentiallywiththenumberofquantitiesmentionedinthemathproblem.WeintroduceALGES,1whichmapsanunseenmulti-sentencealgebraicwordproblemintoasetofpossibleequationtrees.Figure1showsanequationtreealongsidethewordproblemitrepresents.ALGESgeneratesthespaceoftreesviaIntegerLinearProgramming(ILP),whichallowsittocon-1Thecodeanddataispubliclyavailableathttps://gitlab.cs.washington.edu/ALGES/TACL2015. l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 – df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 586 strainthespaceoftreestorepresenttype-consistentalgebraicequationssatisfyingasmanydesirablepropertiesaspossible.ALGESlearnstomapspansoftexttoarithmeticoperators,tocombinethemgiventheglobalcontextoftheproblem,andtochoosethe“best”treecorrespondingtotheproblem.ThetrainingsetforALGESconsistsofunannotatedalgebraicwordproblemsandtheirsolution.Solv-ingtheequationrepresentedbysuchatreeistrivial.ALGESisdescribedindetailinSection4.ALGESisabletosolvewordproblemswithsingle-variableequationsliketheonesinFigure1.IncontrasttoHosseinietal.(2014),ALGEScovers+,−,,and/.TheworkofKushmanetal.(2014)hasbroaderscopebutweshowthatitreliesheav-ilyonoverlapbetweentrainingandtestdata.Whenthatoverlapisreduced,ALGESis15%to50%moreaccuratethanthissystem.Ourcontributionsareasfollows:(1)Weformal-izetheproblemofsolvingmulti-sentencealgebraicwordproblemsasthatofgeneratingandrankingequationtrees;(2)Weshowhowtoscorethelike-lihoodofequationtreesbylearningdiscriminativemodelstrainedfromasmallnumberofwordprob-lemsandtheirsolutions–withoutanymanualan-notation;和(3)WedemonstrateempiricallythatALGEShasbroaderscopethanthesystemofHos-seinietal.(2014),andovercomesthebrittlenessofthemethodofKushmanetal.(2014).2PreviousWorkOurworkisrelatedtosituatedsemanticinterpre-tation,whichaimstomapnaturallanguagesen-tencestoformalmeaningrepresentations(ZelleandMooney,1996;ZettlemoyerandCollins,2005;GeandMooney,2006;Kwiatkowskietal.,2010).Morecloselyrelatedisworkonlanguageground-ing,whosegoalistheinterpretationofasentenceinthecontextofaworldrepresentation(Branavanetal.,2009;Liangetal.,2009;Chenetal.,2010;Bordesetal.,2010;FengandLapata,2010;Ha-jishirzietal.,2011;Matuszeketal.,2012;Hajishirzietal.,2012;ArtziandZettlemoyer,2013;Koncel-Kedziorskietal.,2014;Yatskaretal.,2014;Seoetal.,2014;Hixonetal.,2015).然而,whilemostpreviousworkconsideredindividualsentencesinisolation,solvingwordproblemsoftenrequiresreasoningacrossthemulti-sentencediscourseoftheproblemtext.Recenteffortsinthemathdo-mainhavestudiednumberwordproblems(Shietal.,2015),logicpuzzleproblems(MitraandBaral,2015),arithmeticwordproblems(Hosseinietal.,2014;RoyandRoth,2015),algebrawordprob-lems(Kushmanetal.,2014;Zhouetal.,2015),andgeometrywordproblems(Seoetal.,2015).Wediscussinmoredetailbelowtwopioneeringworkscloselyrelatedtoourown.Hosseinietal.(2014)solveelementaryadditionandsubtractionproblemsbylearningverbcate-gories.Theygroundtheproblemtexttoaseman-ticsofentitiesandcontainers,anddecideifquanti-tiesareincreasingordecreasinginacontainerbaseduponthelearnedverbcategories.Whilerelyingonlyonverbcategoriesworkswellfor+and−,model-ing∗or/requiresgoingbeyondverbs.Forinstance,“Tinahas2cats.Johnhas3morecatsthanTina.Howmanycatsdotheyhavetogether?”and“Tinahas2cats.Johnhas3timesasmanycatsasTina.Howmanycatsdotheyhavetogether?”haveidenti-calverbs,buttheindicatedoperation(+and*resp.)isdifferent.ALGESmakesuseofaricherseman-ticrepresentationwhichfacilitatesdeeperlearningandawiderscopeofapplication,solvingproblemsinvolvingthe+,−,/,and∗operators(seeTable6).Kushmanetal.(2014)introduceageneralmethodforsolvingalgebraproblems.Thisworkcanalignawordproblemtoasystemofequationswithoneortwounknowns.Theylearnamappingfromwordproblemstoequationtemplatesusingglobalandlo-calfeaturesfromtheproblemtext.However,thelargespaceofequationtemplatesmakesitchalleng-ingforthismodeltolearntofindthebestequationdirectly,asasufficientlysimilartemplatemaynothavebeenobservedduringtraining.Instead,ourmethodmapswordproblemstoequationtrees,tak-ingadvantageofaricherrepresentationofquanti-fiednounsandtheirproperties,aswellastherecur-sivenatureofequationtrees.TheseallowALGEStouseabottom-upapproachtolearnthecorrespon-dencebetweenspansoftextsandarithmeticoper-ators(correspondingtointermediatenodesinthetree).ALGESthenscoresequationsusingglobalstructureoftheproblemtoproducethefinalresult.OurworkisalsorelatedtoresearchinusingILPtoenforceglobalconstraintsinNLPappli- l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 – df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 587 阳离子(RothandYih,2004).Mostpreviouswork(SrikumarandRoth,2011;GoldwasserandRoth,2011;Berantetal.,2014;Liuetal.,2015)uti-lizesILPasaninferenceproceduretofindthebestglobalpredictionoverinitiallytrainedlocalclassi-fiers.Similarly,weuseILPtoenforceglobalanddomainspecificconstraints.We,然而,useILPtoformcandidateequationswhicharethenusedtogeneratetrainingdataforourclassifiers.Ourworkisalsorelatedtoparserre-ranking(柯林斯,2005;GeandMooney,2005),whereare-rankermodelat-temptstoimprovetheoutputofanexistingproba-bilisticparser.Similarly,theglobalequationmodeldesignedinALGESattemptstore-rankequationsbasedonglobalproblemstructure.3SetupandProblemDefinitionGivennumericquantitiesVandanunknownxwhosevalueistheanswerweseek,anequationoverVandxisanyvalidmathematicalexpressionformedbycombiningelementsofV∪{X}usingbi-naryoperatorsfromO={+,-,,/,=}suchthatxappearsexactlyonce.WheneachelementofVappearsatmostonceintheequation,itmaynatu-rallyberepresentedasanequationtreewhereeachoperatorisanodewithedgestoitstwooperands.2TdenotesthesetofallequationtreesoverVandx.ProblemFormulation.Weaddresstheproblemofsolvinggrade-schoolalgebrawordproblemsthatmaptosingleequations.Solvingsuchawordprob-lemwamountstoselectinganequationtreetrepre-sentingthemathematicalcomputationimplicitinw.Figure1showsanexampleofwwithquantitiesun-derlined,andthecorrespondingtreet.Formally,weuseajointprobabilitydistributionp(t,w)thatde-fineshow“well”anequationtreet∈Tcapturesthemathematicalcomputationexpressedinw.Givenawordproblemwasinput,ourgoalistocompute˜t=argmaxt∈Tp(t|w).AnexhaustiveenumerationoverTquicklybe-comesimpracticalasproblemcomplexityincreasesandn=|V∪{X}|grows.Specifically,|时间|>h(n)=n!(n−1)!(n−1)2n−4,h(4)=432,h(6)>1.7M,H(8)>22B,etc.Thisvastsearchspacemakesitchallengingforadiscriminativemodelto2Problemsinvolvingsimultaneousequationsrequirecom-biningmultipleequationtrees,oneperequation.375−(7*X)=4375=(7*X)+4375=(x*7)+4
3.
Train
当地的
模型
(秒(在
7.1)

Monday,
375
学生


A
trip


zoo.
全部
7
buses

filled

4
学生


travel

cars.

如何
许多
学生


each
公共汽车
?
Qnt: 375
Ent: Student
Qnt: 7
Ent: Bus
Qnt: 4
Ent: Student
Qnt: X
Ent: Student
Ctr: Bus
1.
Ground
文本
w
进入
根据
Qsets
(秒(在
5)
:
subset

时间(w)
yielding
正确的
solu(在
375s
*s
-s
4s
7乙
xs
=
375s
=
+s
*s
4s
7乙
xs
375s
+s
-s
4s
=
7乙
xs
(7乙,xs)(375s,combine(7乙,xs))(7乙,xs)(combine(7乙,xs),4s)2.
Use
ILP

产生
中号
平等(在

时间(w)
(秒(在
6)



4.
Train
全球的
模型
(秒(在
7.2)



















:
problem-­‐tree

375+(7*X)=4375=(7/X)+4375-(x+7)=4TrlocalTrglobalTl(w):
操作员
节点

Tl(w)时间(w)\Tl(w)Training
例子
Label
*−*+Posi>ve
examples
(从













)
Nega>ve
examples
(从


























)
Tl(w)时间(w)\Tl(w)Figure2:AnoverviewoftheprocessoflearningforawordproblemanditsQsets.learntofind˜tdirectly,asasufficientlysimilartreemaynothavebeenobservedduringtraining.In-stead,ourmethodfirstgeneratessyntacticallyvalidequationtrees,andthenusesabottom-upapproachtoscoreequationswithalocalmodeltrainedtomapspansoftexttomathoperators,andaglobalmodeltrainedforcoherenceoftheentireequationw.r.t.globalproblemtextstructure.4OverviewoftheApproachFigure2givesanoverviewofourmethod,alsodetailedinFigure3.Inordertobuildequationtrees,weuseacompactrepresentationforeachnodecalledaQuantifiedSetorQsettomodelnaturallan-guagetextquantitiesandtheirproperties(e.g.,‘375students’in‘7buses’).Qsetsareusedfortrackingandcombiningquantitieswhenlearningthecorre-spondencebetweenequationtreesandtext.Definition1.Givenamathwordproblemw,letSbethesetofallpossiblespansoftextinw,φdenotetheemptyspan,andSφ=S∪{φ}.AQsetforwiseitherabaseQsetoracompoundQset.AbaseQsetisatuple(耳鼻喉科,qnt,adj,loc,vrb,syn,ctr)和:•ent∈S:entityorquantitynoun(e.g.,‘student’);•qnt∈R∪{X}:numberorquantity(e.g.,4orx); l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 – df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 588 学习(wordproblemsW,correspondingsolutionsL):1.Foreveryproblem-solutionpair(wi,‘i)withwi∈W,‘i∈L(A)S←BaseQsetsobtainedbyGroundingtextwiandReorderingtheresultingQsets(Section5)(乙)Ti←TopMtype-consistentequationtreecandidatesgeneratedbyILP(wi)(Section6)(C)T‘i←SubsetofTithatyieldsthecorrectnumericalsolution‘i(d)AddtoTrlocalfeatureshs1,s2iwithlabelopforeachoperatoropcombiningQsetss1,s2intreesinT‘i(e)AddtoTrglobalfeatureshw,tilabeledpositiveforeacht∈T‘iandlabelednegativeforeacht∈T\T‘i2.Llocal←TrainalocalQsetrelationshipmodelonTrlocal(Section7.1)3.Gglobal←TrainaglobalequationmodelonTrglobal(Section7.2)4.Outputlocalandglobalmodels(Llocal,Gglobal)推理(wordproblemw,localsetrelationmodelLlocal,globalequationmodelGglobal):1.S←BaseQsetsobtainedbyGroundingtextwiandReorderingtheresultingQsets(Section5)2.T←TopMtype-consistentequationtreecandidatesgeneratedbyILP(w)(Section6)3.t∗←argmaxti∈T(西德:16)Qtj∈tLlocal(tj|w)(西德:17)×Gglobal(t|w),scoringeachtreeti∈TbasedonEquation14.‘←Numericsolutiontowobtainedbysolvingequationtreet∗fortheunknown5.Output(t∗,')Figure3:Overviewofourmethodforsolvingalgebraicwordproblems.•adj⊆Sφ:adjectivesforentinw;•loc∈Sφ:locationofent(e.g.,‘inthedrawer’);•vrb∈Sφ:governingverbforent(e.g.,‘fill’);•syn:syntacticandpositionalinformationforent(e.g.,‘buses’isinsubjectposition);•ctr⊆Sφ:containersofent(e.g.,‘Bus’isacon-tainerforthe‘students’Qset).Propertiesbeingφindicatestheseoptionalproper-tiesareunspecified.AcompoundQsetisformedbycombiningtwoQsetswithanon-equalitybinaryop-eratorasdiscussedinsection5.Qsetscanbefurthercombinedwiththeequalityoperatortoyieldasemanticallyaugmentedequationtree.3TheexampleinFigure2hasfourbaseQsetsextractedfromproblemtext.EachpossibleequationtreecorrespondstoadifferentrecursivecombinationofthesefourQsets.Givenw,ALGESfirstextractsalistofnbaseQsetsS={s1,…,sn}(Section5).ItthenusesanILP-basedoptimizationmethodtocombineex-tractedQsetsintoalistoftype-consistentcandidateequationtrees(Section6).最后,ALGESusesdis-criminativemodelstoscoreeachcandidateequation,usingbothlocalandglobalfeatures(Section7).Specifically,therecursivenatureofourrepresen-tationallowsustodecomposethelikelihoodfunc-tionp(t,w)intolocalscoringfunctionsforeachin-3InspiredbySemanticallyAugmentedParseTrees(GeandMooney,2005)adaptedtoequationallogic.ternalnodeoftfollowedbyscoringtherootnode:p(t|w)∝Ytj∈tLlocal(tj|w)×Gglobal(t|w)(1)wherethelocalfunctionLlocal(tj|w)scoresthelike-lihoodofthesubtreetj,modelingpairwiseQsetre-lationships,whiletheglobalfunctionGglobal(t|w)scoresthelikelihoodoftherootoft,modelingtheequationinitsentirety.Learning.ALGESlearnsinaweaklysupervisedfashion,usingwordproblemswiandonlytheircor-rectanswer‘i(notthecorrespondingequationtree)astrainingdata{(wi,‘i)}i∈{1,…,氮}.WegroundeachwiintoorderedQsetsandgeneratealistoftype-consistentcandidatetrainingequationsT‘ithatyieldthecorrectanswer‘i.WebuildalocaldiscriminativemodelLlocaltoscorethelikelihoodthatamathoperatorop∈OcancorrectlycombinetwoQsetss1ands2basedontheirsemanticsandintertextualrelationships.Forexample,inFigure2thismodellearnsthat∗hasahighlikelihoodscorefor‘7buses’and‘xstudents’.Thetrainingdataconsistsoffeaturevectorshs1,s2ilabeledwithop,derivedfromtheequationtreesthatyieldthecorrectsolution.Wealsobuildaglobaldiscriminativemodelthatscoresequationtreesbasedontheglobalproblemstructure:Gglobal=ψ|fglobal(w,t)wherefglobal l D o w n o a d e d f r o m h t t p : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 – df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 589 representsglobalfeaturesofwandt,andφarepa-rameterstobelearned.Thetrainingdataconsistsoffeaturevectorshw,tiforequationtreesthatyieldthecorrectsolutionaspositiveexamples,andtherestasnegatives(Figure2).Thedetailsoflearningandin-ferencestepsaredescribedinSection7.5GroundingandCombiningQsetsWediscusshowwordproblemtextisgroundedintoanorderedlistofQsets.AQsetisacompactrep-resentationofthepropertiesofaquantityasde-scribedinasinglesentence.TheuseofQsetsfacil-itatesthebuildingofsemanticallyaugmentedequa-tiontrees.Additionally,bytrackingcertainproper-tiesoftextquantities,ALGEScanresolvepronomi-nalreferencesorelidednounstopropertiesofprevi-ousQsets.Itcanalsocombineinformationaboutquantitiesreferencedindifferentsentencesintoasinglesemanticstructureforfurtheruse.Grounding.ALGEStranslatesthetextoftheprob-lemwintointerrelatedbaseQsets{s1,…,sn},eachassociatedwithaquantityintheproblemtextw.ThepropertiesofeachQset(Definition1)areex-tractedfromthedependencyparserelationspresentinthesentencewherethequantityisreferredtoac-cordingtotherulesdescribedinTable1.Additionally,ALGESassignsasingletargetQsetsxcorrespondingtothequestionsentence.ThepropertiesofthetargetQsetarealsoextractedac-cordingtotherulesoftheTable1.Inparticular,theqntpropertyissettounknown,theentissettothenounappearingafterthewordswhat,manyormuchinthetargetsentence,andtheotherpropertiesareextractedaslistedinTable1.Reordering.Inordertoreducethespaceofpossibleequationtrees,ALGESreordersQsets{s1,…,sn}accordingtosemanticandtextualinformationandenforcesaconstraintthatQsetscanonlycombinewithadjacentQsetsintheequationtree.InFig-ure2,thetargetQsetcorrespondingtotheunknown(x‘students’)ismovedfromitstextuallocationattheendoftheproblemandplacedadjacenttotheQsetwithentity‘buses’.Thismoveistriggeredbytherelationshipbetweenthetargetentity‘student’anditscontainer‘bus’thatisquantifiedbyeachinthelastsentence.Inadditiontothecontainermatchrule,weemploythreeotherrulestomovethetargetForeachquantitymentionedinthetext,特性(qnt,耳鼻喉科,ctr,adj,vrb,loc)ofthecorrespondingQsetareextractedasfollows:1.qnt(quantity)isanumericalvalueordeterminerfoundintheproblemtext,oravariable.2.ent(实体)isanounrelatedtotheqntinthedepen-dencyparsetree.Ifqntisanumericalvalue,entisthenounrelatedbythenum,数字,orprepofrela-tions.Ifqntisadeterminer,entisthenounrelatedviathedetrelation.Whensuchanoundoesnotexistduetoparsefailureorpragmaticrecoverability,entisthenounthatisclosesttoqntinthesamesentenceortheentassociatedwiththemostrecentQset.3.ctr(container)isthesubjectoftheverbgovern-ingent,exceptintwocases:whenthissubjectisapronominalreference,thectrissettothectroftheclosestpreviousQset;ifentisrelatedtoanotherQsetwhoseqntisoneofeach,every,A,一个,每,orone,ctrissettotheentofthatQset.4.adj(形容词)isalistofadjectivesrelatedtoentbytheamodrelation.5.vrb(动词)isagoverningverb,eitherrelatedtoentbynsubjordobj6.loc(地点)isanounrelatedtoentbyprepon,prepin,orprepatrelations.Table1:TheprocessofformingasingleQset.QsetasdescribedinTable2.4Combining.TwoQsetsandanarithmeticoperatorcanbecombinedviathecombinefunctiontoformathirdQset,alternatelyreferredtoasacompound.Becauseofthis,wecanrepresentintermediatenodesintheequationtreeasQsetsthemselves.Therecur-sivecombinationofQsetsallowsustoeffectivelydecomposeequationtreesintoacollectionoflocaloperationsoveridenticalabstractions.ThisenableslearningfeaturesofQsetsandtextthatindicatepar-ticularoperationsfrombothleafandintermediatenodes.Themechanicsofc←combine(A,乙,op)aredetailedbelow.Forop=+,thepropertiesofeitherQsetaorbsufficetodefinec.ALGESalwaysformscusingthepropertiesofbinthesesituations.Forop=−,thepropertiesoftheleftoperandadefinetheresul-tantset,asevidencedbythesubtractionoperationspresentinthefirstprobleminTable9.Todetermine4Thesereorderingrulesareintentionallyminimal,butdoprovidesomegainoverbothpreservingthetextorderingofquantitiesorsettingorderingasasoftconstraint.SeeTable7. l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 – df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 590 1.MoveQsetsitoimmediatelyafterQsetsjifthecon-tainerofsiistheentityofsjandisquantifiedbyeach.2.MovetargetQsettothefrontofthelistiftheques-tionstatementincludeskeywordsstartorbegin.3.MovetargetQsettotheendofthelistifthequestionstatementincludeskeywordsleft,remain,andfinish.4.MovetargetQsettothetextuallocationofaninter-mediatereferencewiththesameentifitsnumprop-ertyisthedeterminersome.Table2:RulesforreorderingQsets.thestickersinLuke’spossession,weneedtotrackstickersrelatedtotheleftQsetwiththeverb‘got’.Forop=∗,theQsetrelationshipiscapturedbythecontainerandentityproperties:theonewhosepropertiespreserveaftermultiplicationhastheother’sentityasitscontainer.InFigure2,the‘bus’Qsetisthecontainerof‘students’.Whenthesearecombinedwiththe∗operator,theresultisofen-titytype‘student’.Forop=/,weusetheprop-ertiesoftheleftoperandtoencourageadistinctionbetweendivisionandmultiplication.6GeneratingEquationTreeswithILPWeuseanILPoptimizationmodeltogenerateequa-tiontreesinvolvingnbaseQsets.Theseequationtreesareusedforbothlearningandinferencesteps.ALGESgeneratesanorderedlistofMofthemostdesirablecandidateequationsforagivenwordprob-lemwusinganILP,whichmodelsglobalconsider-ationssuchastypeconsistencyandappropriatelowexpressioncomplexity.Tofacilitategenerationofequationtrees,werepresenttheminparenthesis-freepostfixorreversePolishnotation,whereabinaryop-eratorimmediatelyfollowsthetwooperandsitop-erateson(e.g.,abc+∗x=).GivenawordproblemwwithnbaseQsets(cf.Table3fornotation),webuildanoptimizationmodelILP(w)overthespaceofpostfixequationsE=e1e2eLoflengthLinvolvingknumericconstants,k0=n−kunknowns,rpossiblebinaryoperators,andq“types”ofQsets,wheretypecor-respondstotheentitypropertyofQsetsanddeter-mineswhichbinaryrelationshipsarepermittedbe-tweentwogivenQsets.ForsinglevariableequationsoverbinaryoperatorsO,k0=1,r=|氧|=5,andL=2n−1.Forbrevity,definem=n+randlet[j]denote{1,…,j}.ExpressionEcanbeevalu-atedbyconsideringe1,e2,…,eLinorder,pushingnon-operatorsymbolsontoastackσ,和,forop-eratorsymbols,poppingthetoptwoelementsofσ,applyingtheoperatortothem,andpushingtheresultbackontoσ.Thestackdepthoftheeiisthestacksizeaftereihasbeenprocessedthisway.INPUTwinputmathwordproblemnnumberofbaseQsetsknumberofnumericconstantsk0numberofunknowns(1forsingle-var.eqns.)rnumberofbinaryoperators(r=|氧|=5)mnumberofpossiblesymbols(n+r)typejtypeofj-thbaseQsetMdesirednumberofcandidateequationtreesLdesiredlengthofpostfixequations(2n−1)OUTPUTEpostfixequationtobegeneratedeii-thelementofE;i∈[L]VARIABLESfori∈[L]ximainILPvariablefori-thsymbolofEciindicatorvariable:eiisanumericconstantuiindicatorvariable:eiisanunknownoiindicatorvariable:eiisanoperatordipostfixstackdepthofei;di∈[L]titypeofei(correspondstoQsetentity);ti∈[q]Table3:ILPnotationforcandidateequationsmodelVariables.Integervariablesx1,x2,…,xLencodewhichsymboleacheirefersto.Theirdomain,[米],representstheknumericconstantsinthesameorderastheirrespectiveQsets,followedbythek0unknowns,andfinallyoperatorsintheorder+,−,,/,=.Binaryvariablesci,ui,andoiindicatewhethereiisanumericconstant,未知,oroper-ator,resp.Variablesdiwithdomain[L]equalthepostfixstackdepthofei.Finally,variablestiwithdomain[q]indicatethetypeofei.Forj∈[n],i.e.,forthekconstantsandk0unknowns,typej∈[q]denotestherespectiveQsetentity.Uncertaintyinobjecttypesmaybeincorporatedeasilybytreatingtypejasa(potentiallyweighted)subsetof[q].ConstraintsandObjectiveFunction.ConstraintsinILP(w)includesyntacticvalidity,typeconsis-tency,anddomainspecificsimplicityconsidera-tions.Wedescribethembrieflyhere,leavingdetailstotheAppendix.Theobjectivefunctionminimizesthesumoftheweightsofviolatedsoftconstraints. l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 – df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 591 以下,(H)denoteshardconstraints,(瓦)weightedsoftconstraints,和(磷)post-processingsteps.DefinitionalConstraints(H):Constraintsoverin-dicatorvariablesci,ui,andoiensuretheyrepre-senttheirintendedmeaning,includingtheinvariantci+ui+oi=1.Forstackdepthvariables,weaddd1=1anddi=di−1−2oi+1fori>1.SyntacticValidity(H):Validityofthepostfixex-pressionisenforcedeasilythroughconstraintso1=0anddL=1.Inaddition,weaddxL=mandxis/,thenTo=T16=T2.DomainConsiderations(H,瓦):Weaddafewdo-mainspecificconstraintsbasedonpatternsobservedinasmallsubsetofthequestions.Theseincludeanupperboundonthestackdepth,whichhelpsavoidoverlycomplexexpressionsunsuitableforgrade-schoolalgebra,andreducingredundancyby,e.g.,disallowingthenumericconstant0tobeanoperandof+or−orthesecondoperandof/.SymmetryBreaking(H,瓦):Ifacommutativeop-eratorisprecededbytwonumericconstants(e.g.,ab+),werequiretheconstantstorespecttheirQsetordering.Everyotherpairofconstantsthatdisre-spectsitsQsetorderingincursasmallpenalty.NegativeandFractionalAnswers(磷):Ratherthanimposingnon-negativityasacomplexconstraintinILP(w),wefilteroutcandidateexpressionsyieldinganegativeanswerasapost-processingstep.Sim-ilarly,whenallnumericconstantsareintegers,wefilteroutexpressionsyieldingafractionalanswer,againbasedontypicalquestionsinourdatasets.7LearningOurgoalistolearnascoringfunctionthatidentifiesthebestequationtreet∗correspondingtoanunseenwordproblemw.Sinceourdatasetconsistsonlyofproblem-solutionpairs{(wi,‘i)}i=1,...,N,train-ingourscoringmodelsrequiresproducingequa-tiontreesmatching‘i.Foreverytrainingin-stance(wi,‘i),weuseILP(wi)togenerateMtype-consistentequationtreecandidatesTi.Totrainourlocalmodel(section7.1),wefilterouttreesfromTithatdonotevaluateto‘i,extractall(s1,s2,op)triplesfromtheremainingtrees,andusefeaturevec-torscapturing(s1,s2)andlabeledwithopastrain-ingdata(seeFigure2).Fortheglobalmodel,weusefortrainingdataasubsetofTiwithanequalnumberofcorrectandincorrectequationtrees(section7.2).Oncetrained,weuseEquation1tocombinethesemodelstocomputeascoreforeachcandidateequa-tiontreegeneratedforanunseenwordproblematinferencetime(seeFigure3).7.1LocalQsetRelationshipModelWetrainalocalmodelofaprobabilitydistribu-tionoverthemathoperatorsthatmaybeusedtocombineapairofQsets.Theideaistolearnthecorrespondencebetweenspansoftextsandmathop-eratorsbyexaminingsuchtextsandtheQsetsoftheinvolvedoperands.GivenQsetss1ands2,thelo-calscoringfunctionscorestheprobabilityofeachop∈{+,-,,/},i.e.,Llocal=θ|flocal(s1,s2)whereflocalisafeaturevectorfors1ands2.NotethateitherQsetmaybeacompound(theresultofacombineprocedure).Thegoalistolearnparametersθbymaximizingthelikelihoodoftheoperatorsbe-tweeneverytwoQsetsthatweobserveinthetrain-ingdata.Wemodelthisasamulti-classSVMwithanRBFkernel.Features.Giventherichnessofthetextualpossi-bilitiesforindicatingamathoperation,thefeaturesaredesignedoversemanticandintertextualrelation-shipsbetweenQsets,aswellasdomain-specificlex-icalfeatures.Thefeaturevectorincludesthreemainfeaturecategories(Table4).第一的,singlesetfeaturesincludesyntacticandpo-sitionalfeaturesofindividualQsets.Forexample,theyincludeindicatorfeaturesforwhetherelementsofashortlexiconofmath-specifictermssuchas l D o w n o a d e d f r o m h t t p : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 - df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 592 1.SingleQsetFeatures(repeatedforB)•whatargumentofitsgoverningverbisA?•isAasubsetofanotherset?•isAacompound?•mathkeywordsfoundincontextofA•verbLindistancefromknownverbcategories(Bonly)2.RelationalfeaturesbetweenQsetsAandB•entitymatch•adjectiveoverlap•locationmatch•distanceintext•LinsimilaritybetweenverbsgoverningAandB•isoneasubtypeoftheother?•doesonecontaintheother?3.TargetQuantityfeatures•A/BistargetQset•A/Bentitymatchestargetentity•mathkeywordsintargetcontext4.Rootnodefeatures•#ofILPconstraintsviolatedbyequation•ScoresofleftandrightsubtreesofrootFigure4:Featuresusedforlocalandglobalmodels,forleftQsetAandrightQsetB‘add’and‘times’appearinthevicinityofthesetreferenceinthetext.Also,followingHosseinietal.(2014),weincludeavectorthatcapturesthedis-tancebetweentheverbsassociatedwitheachQsetandasmallcollectionofverbsfoundtobeusefulincategorizingarithmeticoperationsinthatwork,basedupontheirLinSimilarity(林,1998).第二,relationshipsbetweenQsetsarede-scribedw.r.t.variousQsetpropertiesdescribedinsection4.TheseincludebinaryfeatureslikewhetheroneQset’scontainerpropertymatchestheotherQset’sentity(astrongindicatorofmultiplication),orthedistancebetweentheverbsassociatedwitheachsetbasedupontheirLinSimilarity.Third,targetquantityfeaturescheckthematchingbetweenthetargetQsetandthecurrentQsetaswellasmathkeywordsinthetargetsentence.7.2GlobalEquationModelWealsotrainaglobalmodelthatscoresequationtreesbasedontheglobalstructureofthetreeandtheproblemtext.Theglobalmodelscoresthecom-patibilityofthetreewiththesoftconstraintsintro-ducedinSection6aswellasitscorrespondencewiththeproblemtext.Weuseadiscriminativemodel:Gglobal=ψ|fglobal(w,t)wherefglobalarethefea-turescapturingtreesandtheircorrespondenceswiththeproblemtext.Wetrainaglobalclassifiertorelatethesefeaturesthroughparametersψ.FeaturesfglobalareexplainedinTable4.TheyincludethenumberofviolatedsoftconstraintsintheILP,theprobabilitiesoftheleftandrightsubtreesoftherootasprovidedbythelocalmodel,andgloballexicalfeatures.Additionally,thethreelocalfeaturesetsareappliedtotheleftandrightQsets.7.3InferenceForanunseenproblemw,wefirstextractbaseQsetsfromw.Thegoalistofindthemostlikelyequationtreewithminimumviolationofhardandsoftcon-straints.UsingILP(w)overtheseQsets,wegener-ateMcandidateequationtreesorderedbythesumoftheweightsoftheconstraintstheyviolate.WecomputethelikelihoodscoregivenbyEqn.(1)foreachcandidateequationtreet,usethisasanesti-mateofthelikelihoodp(t|w),andreturnthecan-didatetreet∗withthehighestscore.InEqn.(1),thescoreoftistheproductofthelikelihoodscoresgivenbythelocalclassifierforeachoperandintandtheQsetsoverwhichitoperates,multipliedbythelikelihoodscoregivenbytheglobalclassifierforthecorrectnessoft.Iftheresultingequationprovidesthecorrectanswerforw,weconsiderinferencesuc-cessful.8ExperimentsThissectionreportsonthreeexperiments:acom-parisonofALGESwithKushmanetal.(2014)’stemplate-basedmethod,acomparisonofALGESwithHosseinietal.(2014)’sverb-categorizationmethods,andablationstudies.TheexperimentsarecomplicatedbythefactthatALGESislimitedtosin-gleequations,andtheverbcategorizationmethodcanonlyhandlesingle-equationswithoutmultipli-cationordivision.Ourmainexperimentalresultistoshowanimprovementoverthetemplate-basedmethodonsingle-equationalgebrawordproblems.Wefurthershowthatthetemplate-basedmethodde-pendsonlexicalandtemplateoverlapbetweenitstrainingandtestsets.Whentheseoverlapsarere-duced,themethod’saccuracydropssharply.Incon-trast,ALGESisquiterobusttochangesinlexicalandtemplateoverlap(seeTables4and5). l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 - df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 593 ExperimentalSetup.WeusetheStanfordDe-pendencyParserinCoreNLP3.4(DeMarneffeetal.,2006)toobtainsyntacticinformationusedforgroundingandfeaturecomputation.FortheILPmodel,weuseCPLEX12.6.1(IBMILOG,2014)togeneratethetopM=100equationtreeswithamaximumstackdepthof10,abortingexplorationuponhitting10Kfeasiblesolutionsor30seconds.5WeusePython’sSymPypackageforsolvingequa-tionsfortheunknown.Forthelocalandglobalmod-els,weusetheLIBSVMpackagetotrainSVMclas-sifiers(ChangandLin,2011)withRBFkernelsthatreturnlikelihoodestimatesasthescore.Dataset.Thisworkdealswithgrade-schoolalge-brawordproblemsthatmaptosingleequationswithvaryinglength.Everyequationmayinvolvemulti-plemathoperationsincludingmultiplication,divi-sion,subtraction,andadditionovernon-negativera-tionalnumbersandonevariable.Thedataisgath-eredfromhttp://math-aids.com,http://k5learning.com,andhttp://ixl.comwebsitesandasubsetofthedatafromKushmanetal.(2014)thatmapswordproblemstosingleequa-tions.WerefertothisdatasetasSINGLEEQ(seeTable9forexampleproblems).TheSINGLEEQdatasetconsistsof508problems,1,117句子,and15,292words.Baselines.Wecompareourmethodwiththetemplate-basedmethod(Kushmanetal.,2014)andtheverb-categorizationmethod(Hosseinietal.,2014).Forthetemplate-basedmethod,weusethefullysupervisedsetting,providingequationsforeachtrainingexample.8.1ComparisonwithTemplate-basedMethodWefirstcompareALGESwiththetemplate-basedmethodoverSINGLEEQ.Weevaluatebothsystemsonthenumberofcorrectanswersprovidedandre-porttheaverageofa5-foldcrossvalidation.ALGESachieves72%accuracywhereasthetemplate-basedmethodachieves67%accuracy,a15%relativere-ductioninerrors(firstcolumnsinTables4and5).Thisresultisstatisticallysignificantwithap-valueof0.018underapairedt-test.5Thesehyper-parameterswerechosenbasedonexperimen-tationwithasmallsubsetofthequestions.Amoresystematicchoicemayimproveoverallperformance.TemplateOverlap10.47.76.32.1ALGES0.720.660.660.63Template-based0.670.600.460.26Errorreduction15%15%33%50%Table4:Decreasingtemplateoverlap:AccuracyofALGESversusthetemplate-basedmethodonsingle-equationalgebrawordproblems.Thefirstcolumncorre-spondstotheSINGLEEQdataset,andtheothercolumnsareforsubsetswithdecreasingtemplateoverlap.LexicalOverlap4.33.32.62.5ALGES0.720.660.660.63Template-based0.670.600.460.26Errorreduction15%15%33%50%Table5:Decreasinglexicaloverlap:AccuracyofALGESversusthetemplate-basedmethodonsingle-equational-gebrawordproblems.ThefirstcolumncorrespondstotheSINGLEEQdataset,andtheothercolumnsareforsub-setswithdecreasinglexicaloverlap.LexicalOverlap.ByfurtheranalyzingSINGLEEQ,wenotedthatthereissubstantialoverlapbetweenthecontentwords(commonnoun,形容词,adverb,andverblemmas)indifferentproblems.Forex-ample,manyproblemsaskforthetotalnumberofseashellscollectedbytwopeopleonabeach,withonlythenamesofthepeopleandthenumberofseashellsthateachfoundchanged.Toanalyzetheeffectofthisrepetitiononthelearningmethodseval-uated,wedefinealexicaloverlapparameterasthetotalnumberofcontentwordsinadatasetdividedbythenumberofuniquecontentwords.Thetwo“seashellproblems”haveahighlexicaloverlap.TemplateOverlap.Wealsonotedthatmanyprob-lemsinSINGLEEQcanbesolvedusingthesametemplate,orequationtreestructureabovetheleafnodes.Forexample,aproblemwhichcorrespondstotheequation(9∗3)+7andadifferentproblemthatmapsto(4∗5)+2sharethesametemplate.Weintroduceatemplateoverlapparameterdefinedastheaveragenumberofproblemswiththesametemplateinadataset.Results.Inourdata,templateoverlapandlexi-caloverlapco-vary.Todemonstratethebrittlenessofthetemplate-basedmethodsimply,wepickedthreesubsetsofSINGLEEQwherebothparame- l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 - df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 594 tersweresubstantiallylowerthaninSINGLEEQandrecordedtherelativeperformanceofthetemplate-basedmethodandofALGESinTables4and5.Thedatausedinbothtablesisthesame,buttheta-blesareseparatedforreadability.ThefirstcolumnreportsresultsfortheSINGLEEQdataset,andtheothercolumnsreportresultsforthesubsetswithde-creasingtemplateandlexicaloverlaps.Thesubsetsconsistof254,127,and63questionsrespectively.Weseethatasthelexicaloverlapdropsfrom4.3to2.5andasthetemplateoverlapdropsfrom10.4to2.1,therelativeadvantageofALGESoverthetem-platemethodsgoesupfrom15%to50%.Whilethetemplate-basedmethodisabletosolveawiderrangeofproblemsthanALGES,itsaccu-racyfallsoffsignificantlywhenfacedwithfewerre-peatedtemplatesorlessspuriouslexicaloverlapbe-tweenproblems(from0.67to0.26).TheaccuracyofALGESalsodeclinesfrom0.72to0.63acrossthetable,whichneedstobeinvestigatedfurther.Infuturework,wealsoneedtoinvestigateadditionalsettingsforthetwoparametersandtoattemptto“break”theirco-variance.Nevertheless,wehaveuncoveredanimportantbrittlenessinthetemplate-basedmethodandhaveshownthatALGESissub-stantiallymorerobust.8.2ComparisonwithVerb-CategorizationTheverb-categorizationmethodlearnstosolvead-ditionandsubtractionproblems,whileALGESisca-pableofsolvingmultiplicationanddivisionprob-lemsaswell.Wecompareagainsttheirmethodoverourdatasetaswellasthedatasetprovidedbythatwork,herereferredtoasADDSUB.ADDSUBconsistsofadditionandsubtractionwordproblemswiththepossibilityofirrelevantdistractorquanti-tiesintheproblemtext.Theverbcategorizationmethodusesrulesforhandlingirrelevantinforma-tion.AnexampleruleistoremoveaQsetwhosead-jectiveisnotconsistentwiththeadjectiveofthetar-getQset.WeaugmentALGESwithrulesintroducedinthismethodforhandlingirrelevantinformationinADDSUB.Results,reportedinTable6,showcomparableaccuracybetweenbothmethodsonHosseinietal.(2014)data.Ourmethodshowsasignificantim-provementversustheirsontheSINGLEEQdatasetduetothepresenceofmultiplicationanddivisionoperators,as40%oftheproblemsinourdatasetin-cludetheseoperators.MethodADDSUBSINGLEEQALGES0.770.72Verb-categorization0.780.48Errorreduction-53%Table6:AccuracyofALGEScomparedtoverbcatego-rizationmethod.8.3AblationStudyInordertodeterminetheeffectofvariouscompo-nentsofoursystemonitsoverallperformance,weperformthefollowingablations:NoLocalModel:这里,wetestourmethodabsentthelocalinformation(Section7.1).Thatis,wegen-erateequationsusingallILPconstraints,andscoretreessolelyoninformationprovidedbytheglobalmodel:p(t|w)∝Gglobal(w,t).NoGlobalModel:这里,wetestourmethodwith-outtheglobalinformation(Section7.2).Thatis,wegenerateequationsusingonlythehardconstraintsofILPandscoretreessolelyoninformationprovidedbythelocalmodel:p(t|w)∝Qti∈tLlocal(w,的).NoQsetReordering:WetestourmethodwithoutthedeterministicQsetreorderingrulesoutlinedinSection5.Instead,weallowtheILPtochoosethetopMequationsregardlessoforder.ResultsinTable7showthateachcomponentofALGEScontributestoitsoverallperformanceontheSINGLEEQcorpus.WefindthatboththeGlobalandLocalmodelscontributesignificantlytotheoverallsystem,demonstratingthesignificanceofabottom-upapproachtobuildingequationtrees.ImportanceofFeatures.Wealsoevaluatetheac-curacyofthelocalQsetrelationshipmodel(Sec-tion7.1)onthetaskofpredictingthecorrectop-eratorforapairofQsetshs1,s2iovertheSIN-GLEEQdatasetusinga5-foldcrossvalidation.Ta-ble8showsthevalueofeachfeaturegroupusedinthelocalclassifier,andthustheimportanceofdetailsoftheQsetrepresentation. l 从http下载 : / / 直接的 . 米特 . 呃呃 / t a c l / 拉蒂斯 - df / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 压力 . 来宾来访 0 8 九月 2 0 2 3 595 MethodAccuracyALGES0.72NoLocalModel0.50NoGlobalModel0.49NoQsetReordering0.68Table7:AblationstudyofeachcomponentofALGES.MethodAccuracyLocalclassifier:FullFeatureset0.84NoSingleSetFeatures0.81NoSetRelationFeatures0.75NoTargetFeatures0.79Table8:Accuracyoflocalclassifierinpredictingthecor-rectoperatorbetweentwoQsetsandablatingfeaturesets.8.4QualitativeExamplesandErrorAnalysis.Table9showssomeexamplesofproblemssolvedbyourmethod.Weanalyzed72errorsmadebyALGESontheSINGLEEQdataset.Table10summarizesfivemajorcategoriesoferrors.ProblemsandequationsLukehad20stickers.Hebought12stickersfromastoreinthemallandgot20stickersforhisbirthday.ThenLukegave5ofthestickerstohissisterandused8todecorateagreetingcard.HowmanystickersdoesLukehaveleft?((20+((12+20)−8))−5)=xMaggiebought4packsofredbouncyballs,8packsofyellowbouncyballs,and4packsofgreenbouncyballs.Therewere10bouncyballsineachpackage.HowmanybouncyballsdidMaggiebuyinall?x=(((4+8)+4)∗10)Samhad79dollarstospendon9books.Afterbuyingthemhehad16dollars.Howmuchdideachbookcost?79=((9∗x)+16)Fredlovestradingcards.Hebought2packsoffootballcardsfor$2.73each,apackofPokemoncardsfor$4.01,andadeckofbaseballcardsfor$8.95.HowmuchdidFredspendoncards?((2∗2.73)+(4.01+8.95))=xTable9:ExamplesofproblemssolvedbyALGESto-getherwiththereturnedequation.Parsingerrorscauseawronggroundingintothedesignedrepresentation.Forexample,theparsertreats‘vanilla’asanounmodifiedbythenumber‘19’,leadingoursystemtotreat‘vanilla’astheen-tityofaQsetratherthan‘cupcake’.DespitetheimprovementsthatcomefromALGES,aportionoferrorsareattributedtogroundingandorderingis-sues.Forinstance,thesystemfailstocorrectlydis-ErrortypeExampleParsingIssues(12%)Randyneeds53cupcakesforabirthdayparty.Healreadyhas7chocolatecupcakesand19vanillacupcakes.Howmanymorecup-cakesshouldRandybuy?Grounding&Ordering(19%)Thereare24bicyclesand14tricyclesinthestorageatDanny’sapartmentbuilding.Eachbicyclehas2wheelsandeachtricyclehas3wheels.Howmanywheelsarethereinall?SemanticLimitation(19%)Thesumofthreeconsecutiveevennumbersis162.Whatisthesmallestofthesenumbers?LackofKnowledge(32%)Arestaurantsold63hamburgerslastweek.Howmanyhamburgersonaverageweresoldeachday?Inferringquantities(18%)Sara,基思,Benny,andAlyssaeachhave96baseballcards.Howmanydozenbaseballcardsdotheyhaveinall?Table10:Examplesofdifferenterrorcategoriesandrel-ativefrequencies.Sourcesoferrorsareunderlined.tinguishbetweenthesetsofwheels,andsodoesnotgetthemovement-triggeringcontainerrelationshipsright.Semanticlimitationsareanothersourceofer-rors.Forexample,ALGESdoesnotmodelthese-manticsof‘threeconsecutivenumbers’.Thefourthcategoryreferstoerrorscausedduetolackofworldknowledge(e.g.,‘week’correspondsto‘7days’).最后,ALGESisnotabletoinferquantitieswhentheyarenotexplicitlymentionedinthetext.Forex-ample,thenumberofpeopleshouldbeinferredbycountingthepropernamesintheproblem.9ConclusionInthisworkwehaveoutlinedamethodforsolv-inggradeschoolalgebrawordproblems.Wehaveempiricallydemonstratedthevalueofourapproachversusstate-of-the-artwordproblemsolvingtech-niques.Ourmethodgroundsquantityreferences,utilizestype-consistencyconstraintstoprunethesearchspace,learnswhichalgebraicoperatorsareindicatedbytext,andranksequationsaccordingtoaglobalobjectivefunction.ALGESisahybridofpre-vioustemplate-basedandverbcategorizationstate-basedmethodsforsolvingsuchproblems.Bylearn-ingcorrespondencesbetweentextandmathematicaloperators,weextendthemethodofstateupdatesbasedonverbcategories.Bylearningtore-rankequationtreesusingagloballikelihoodmodel,weextendthemethodofmappingwordproblemsto l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 596 equationtemplates.DifferentcomponentsofALGEScanbeadaptedtootherdomainsoflanguagegroundingthatre-quirecross-sentencereasoning.Futureworkin-volvesextendingALGEStosolvehighergrademathwordproblemsincludingsimultaneousequations.Thiscanbeaccomplishedbyextendingthevari-ablegroundingsteptoallowmultiplevariables,andtrainingtheglobalequationmodeltorecog-nizewhichquantitiesbelongtowhichequation.ThecodeanddataforALGESarepubliclyavailable.Acknowledgments:ThisresearchwassupportedbytheAllenInstituteforAI(66-9175),AllenDistinguishedInvestigatorAward,andNSF(IIS-1352249).WethankReginaBarzilay,LukeZettle-moyer,AriaHaghighi,MarkHopkins,AliFarhadi,andtheanonymousreviewersfortheirhelpfulcom-ments.ReferencesYoavArtziandLukeZettlemoyer.2013.Weaklysu-pervisedlearningofsemanticparsersformappingin-structionstoactions.TACL,1(1):49–62.JonathanBerant,VivekSrikumar,Pei-ChunChen,AbbyVanderLinden,BrittanyHarding,BradHuang,PeterClark,andChristopherD.Manning.2014.Modelingbiologicalprocessesforreadingcomprehen-sion.InEMNLP.AntoineBordes,NicolasUsunier,andJasonWeston.2010.Labelrankingunderambiguoussupervisionforlearningsemanticcorrespondences.InICML,pages103–110.S.R.K.Branavan,HarrChen,LukeS.Zettlemoyer,andReginaBarzilay.2009.Reinforcementlearningformappinginstructionstoactions.InACL/AFNLP,pages82–90.Chih-ChungChangandChih-JenLin.2011.LIBSVM:Alibraryforsupportvectormachines.ACMTransac-tionsonIntelligentSystemsandTechnology,2:27:1–27:27.DavidL.Chen,JoohyunKim,andRaymondJ.Mooney.2010.Trainingamultilingualsportscaster:Usingper-ceptualcontexttolearnlanguage.JAIR,37:397–435.MichaelCollins.2005.Discriminativererankingfornaturallanguageparsing.ComputationalLinguistics,31(1):25–70.Marie-CatherineDeMarneffe,BillMacCartney,Christo-pherD.Manning,etal.2006.Generatingtypedde-pendencyparsesfromphrasestructureparses.InPro-ceedingsofLREC,volume6,pages449–454.YansongFengandMirellaLapata.2010.Howmanywordsisapictureworth?automaticcaptiongenerationfornewsimages.InACL,pages1239–1249.RuifangGeandRaymondJ.Mooney.2005.Astatisti-calsemanticparserthatintegratessyntaxandseman-tics.InConferenceonComputationalNaturalLan-guageLearning,pages9–16.RuifangGeandRaymondJ.Mooney.2006.Discrimina-tivererankingforsemanticparsing.InACL.D.GoldwasserandD.Roth.2011.Learningfromnaturalinstructions.InIJCAI.HannanehHajishirzi,JuliaHockenmaier,ErikT.Mueller,andEyalAmir.2011.Reasoningaboutrobocupsoccernarratives.InUAI,pages291–300.HannanehHajishirzi,MohammadRastegari,AliFarhadi,andJessicaK.Hodgins.2012.Semanticunderstand-ingofprofessionalsoccercommentaries.InUAI.BenHixon,PeterClark,andHannanehHajishirzi.2015.Learningknowledgegraphsforquestionansweringthroughconversationaldialog.InNAACL.MohammadJavadHosseini,HannanehHajishirzi,OrenEtzioni,andNateKushman.2014.Learningtosolvearithmeticwordproblemswithverbcategorization.InEMNLP,pages523–533.IBMILOG.2014.IBMILOGCPLEXOptimizationStudio12.6.RikKoncel-Kedziorski,HannanehHajishirzi,andAliFarhadi.2014.Multi-resolutionlanguagegroundingwithweaksupervision.InEMNLP.NateKushman,YoavArtzi,LukeZettlemoyer,andReginaBarzilay.2014.Learningtoautomaticallysolvealgebrawordproblems.InACL,pages271–281.TomKwiatkowski,LukeZettlemoyer,SharonGoldwater,andMarkSteedman.2010.Inducingprobabilisticccggrammarsfromlogicalformwithhigher-orderunifica-tion.InEMNLP,pages1223–1233.PercyLiang,MichaelI.Jordan,andDanKlein.2009.Learningsemanticcorrespondenceswithlesssupervi-sion.InACL/AFNLP,pages91–99.DekangLin.1998.Aninformation-theoreticdefinitionofsimilarity.InICML,volume98,pages296–304.FeiLiu,JeffreyFlanigan,SamThomson,NormanSadeh,andNoahA.Smith.2015.Towardabstractivesum-marizationusingsemanticrepresentations.InNAACL.CynthiaMatuszek,EvanHerbst,LukeZettlemoyer,andDieterFox.2012.Learningtoparsenaturallan-guagecommandstoarobotcontrolsystem.InProc.ofthe13thInternationalSymposiumonExperimentalRobotics(ISER),June.ArindamMitraandChittaBaral.2015.Learningtoau-tomaticallysolvelogicgridpuzzles.InEMNLP. l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 1 6 0 1 5 6 6 8 1 4 / / t l a c _ a _ 0 0 1 6 0 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 597 DanRothandWen-tauYih.2004.Alinearprogram-mingformulationforglobalinferenceinnaturallan-guagetasks.InHweeTouNgandEllenRiloff,ed-itors,CoNLL,pages1–8.AssociationforComputa-tionalLinguistics.SubhroRoyandDanRoth.2015.Solvinggeneralarith-meticwordproblems.InEMNLP.MinJoonSeo,HannanehHajishirzi,AliFarhadi,andOrenEtzioni.2014.Diagramunderstandinginge-ometryquestions.InAAAI.MinjoonSeo,HannanehHajishirzi,AliFarhadi,OrenEt-zioni,andClintMalcolm.2015.Solvinggeometryproblems:Combiningtextanddiagraminterpretation.InEMNLP.ShumingShi,YuehuiWang,Chin-YewLin,XiaojiangLiu,andYongRui.2015.Automaticallysolvingnum-berwordproblemsbysemanticparsingandreasoning.InEMNLP.VivekSrikumarandDanRoth.2011.Ajointmodelforextendedsemanticrolelabeling.InEMNLP,Edin-burgh,Scotland.MarkYatskar,LucyVanderwende,andLukeZettle-moyer.2014.Seenoevil,saynoevil:Descriptiongenerationfromdenselylabeledimages.LexicalandComputationalSemantics(*SEM2014),page110.JohnM.ZelleandRaymondJ.Mooney.1996.Learn-ingtoparsedatabasequeriesusinginductivelogicpro-gramming.InAAAI,pages1050–1055.LukeS.ZettlemoyerandMichaelCollins.2005.Learn-ingtomapsentencestologicalform:Structuredclas-sificationwithprobabilisticcategorialgrammars.InUAI,pages658–666.LipuZhou,ShuaixiangDai,andLiweiChen.2015.Learntosolvealgebrawordproblemsusingquadraticprogramming.InEMNLP.Appendix:ILPModelDetailsFigure5summarizesvariousconstraintsofourILPmodelforgeneratingcandidateequations.op1idxiisanauxilaryvariablewhosevalue,whenxiisanop-erator,istheindexinthepostfixexpressionofthefirstoperandofxi.Ifop1idxi=j,auxiliaryvari-ablesop1xi,op1ti,op1oi,andop1uimirrorxj,tj,oj,anduj,respectively.sedenotesthecorrespondingconstantoroperatorsymbole(e.g.,‘+’,‘=’,‘0’,ETC。)inthepostfixexpressionbeingconstructed.HandW,asbefore,representhardandweightedsoftconstraints.DefinitionalConstraints(H):ci=I(xi≤k),i∈[L]oi=I(xi>n),i∈[L]ci+ui+oi=1,i∈[L]d1=1;di=di−1−2oi+1,2≤i≤Lop1idxi=maxj≤i−2{j|dj=di},3≤i≤LI(op1idxi=j)⇒I(op1xi=xj),我(op1ti=tj),我(op1oi=oj),我(op1ui=uj),我,j∈[L]我(xi=j)⇒I(ti=typej),i∈[L],j∈[q]o1=0;dL=1,Postfixvalidity(H)xL=m;希s/)⇒I(ti=op1ti),i∈[L]我(xi∈{s∗,s/})⇒I(ti−16=op1ti),i∈[L]Non-redundancy(H),Symmetrybreaking(H):我(xi∈{s+,s−})⇒I(xi−16=s0,op1xi6=s0),i∈[L]我(xi=s/⇒I(xi−16∈{s0,s1}),i∈[L]我(xi∈{s+,s−},ci−1=ci−2=1)⇒I(xi−2
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像
计算语言学协会会刊, 卷. 3, PP. 585–597, 2015. 动作编辑器: Regina Barzilay. 图像

下载pdf