Documents


Transactions of the Association for Computational Linguistics, vol. 4, pp. 47–60, 2016. Action Editor: David Chiang. Submission batch: 11/2015; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Detecting Cross-Cultural Differences Using a Multilingual Topic Model
E. D. Gutiérrez (University of California, San Diego), Ekaterina Shutova (Computer Laboratory, University of Cambridge), Patricia Lichtenstein (University of California, Merced), Gerard de Melo (IIIS, Tsinghua University), Luca Gilardi (ICSI, Berkeley)
edg@icsi.berkeley.edu, es407@cam.ac.uk, tricia1@uchicago.edu, gdm@demelo.org, lucag@icsi.berkeley.edu

Abstract

Understanding cross-cultural differences has important implications for world affairs and many aspects of the life of society. Yet, the majority of text-mining methods to date focus on the analysis of monolingual texts. In contrast, we present a statistical model that simultaneously learns a set of common topics from multilingual, non-parallel data and automatically discovers the differences in perspectives on these topics across linguistic communities. We perform a behavioural evaluation of a subset of the differences identified by our model in English and Spanish to investigate their psychological validity.

1 Introduction

Recent years have seen a growing interest in text-mining applications aimed at uncovering public opinions and social trends (Fader et al., 2007; Monroe et al., 2008; Gerrish and Blei, 2011; Pennacchiotti and Popescu, 2011). They rest on the assumption that the language we use is indicative of our underlying worldviews. Research in cognitive and sociolinguistics suggests that linguistic variation across communities systematically reflects differences in their cultural and moral models and goes beyond lexicon and grammar (Kövecses, 2004; Lakoff and Wehling, 2012). Cross-cultural differences manifest themselves in text in a multitude of ways, most prominently through the use of explicit opinion vocabulary with respect to a certain topic (e.g. "policies that benefit the poor"), idiomatic and metaphorical language (e.g. "the company is spinning its wheels") and other types of figurative language, such as irony or sarcasm.

The connection between language, culture and reasoning remains one of the central research questions in psychology. Thibodeau and Boroditsky (2011) investigated how metaphors affect our decision-making. They presented two groups of human subjects with two different texts about crime. In the first text, crime was metaphorically portrayed as a virus and in the second as a beast. The two groups were then asked a set of questions on how to tackle crime in the city. As a result, while the first group tended to opt for preventive measures (e.g. stronger social policies), the second group converged on punishment- or restraint-oriented measures. According to Thibodeau and Boroditsky, their results demonstrate that metaphors have profound influence on how we conceptualize and act with respect to societal issues. This suggests that in order to gain a full understanding of social trends across populations, one needs to identify subtle but systematic linguistic differences that stem from the groups' cultural backgrounds, expressed both literally and figuratively. Performing such an analysis by hand is labor-intensive and often impractical, particularly in a multilingual setting where expertise in all of the languages of interest may be rare.

With the rise of blogging and social media, NLP techniques have been successfully used for a number of tasks in political science, including automatically estimating the influence of particular politicians in the US senate (Fader et al., 2007), identifying lexical features that differentiate political rhetoric of opposing parties (Monroe et al., 2008), predicting voting patterns of politicians based on their use of language (Gerrish and Blei, 2011), and predicting political affiliation of Twitter users (Pennacchiotti and Popescu, 2011). Fang et al. (2012) addressed …


Transactions of the Association for Computational Linguistics, vol. 4, pp. 31–45, 2016. Action Editor: Tim Baldwin. Submission batch: 12/2015; Revision batch: 2/2016; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

A Bayesian Model of Diachronic Meaning Change
Lea Frermann and Mirella Lapata, Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB
l.frermann@ed.ac.uk, mlap@inf.ed.ac.uk

Abstract

Word meanings change over time and an automated procedure for extracting this information from text would be useful for historical exploratory studies, information retrieval or question answering. We present a dynamic Bayesian model of diachronic meaning change, which infers temporal word representations as a set of senses and their prevalence. Unlike previous work, we explicitly model language change as a smooth, gradual process. We experimentally show that this modeling decision is beneficial: our model performs competitively on meaning change detection tasks whilst inducing discernible word senses and their development over time. Application of our model to the SemEval-2015 temporal classification benchmark datasets further reveals that it performs on par with highly optimized task-specific systems.

1 Introduction

Language is a dynamic system, constantly evolving and adapting to the needs of its users and their environment (Aitchison, 2001). Words in all languages naturally exhibit a range of senses whose distribution or prevalence varies according to the genre and register of the discourse as well as its historical context. As an example, consider the word cute which according to the Oxford English Dictionary (OED, Stevenson 2010) first appeared in the early 18th century and originally meant clever or keen-witted.¹ By the late 19th century cute was used in the same sense as cunning. Today it mostly refers to objects or people perceived as attractive, pretty or sweet. Another example is the word mouse which initially was only used in the rodent sense. The OED dates the computer pointing device sense of mouse to 1965. The latter sense has become particularly dominant in recent decades due to the ever-increasing use of computer technology.

The arrival of large-scale collections of historic texts (Davies, 2010) and online libraries such as the Internet Archive and Google Books have greatly facilitated computational investigations of language change. The ability to automatically detect how the meaning of words evolves over time is potentially of significant value to lexicographic and linguistic research but also to real world applications. Time-specific knowledge would presumably render word meaning representations more accurate, and benefit several downstream tasks where semantic information is crucial. Examples include information retrieval and question answering, where time-related information could increase the precision of query disambiguation and document retrieval (e.g., by returning documents with newly created senses or filtering out documents with obsolete senses).

In this paper we present a dynamic Bayesian model of diachronic meaning change. Word meaning is modeled as a set of senses, which are tracked over a sequence of contiguous time intervals. We infer temporal meaning representations, consisting of a word's senses (as a probability distribution over words) and their relative prevalence. Our model is thus able to detect that mouse had one sense until the mid-20th century (characterized by words such as {cheese, tail, rat}) and subsequently acquired a …

¹ Throughout this paper we denote words in true type, their senses in italics, and sense-specific context words as {list}.
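The notion of sense prevalence per time interval can be illustrated with a small sketch. This is not the authors' Bayesian model; the cue words, snippets and the two "intervals" below are invented for illustration only:

```python
from collections import Counter

# Hypothetical cue words for two senses of "mouse" (illustrative only).
SENSE_CUES = {
    "rodent": {"cheese", "tail", "rat"},
    "device": {"computer", "click", "screen"},
}

def sense_prevalence(contexts):
    """Relative prevalence of each sense in a bag of context snippets."""
    counts = Counter({s: 0 for s in SENSE_CUES})
    for ctx in contexts:
        tokens = set(ctx.lower().split())
        for sense, cues in SENSE_CUES.items():
            counts[sense] += len(cues & tokens)
    total = sum(counts.values()) or 1
    return {sense: counts[sense] / total for sense in SENSE_CUES}

# Two toy time intervals: early texts vs. recent texts.
early = ["the mouse ate the cheese", "a rat and a mouse", "its tail twitched"]
recent = ["click the mouse", "a computer mouse", "move the mouse on the screen"]
```

Running this on the two toy intervals shows the prevalence mass shifting from the rodent sense to the device sense, which is the kind of temporal signal the paper's model infers jointly with the senses themselves.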


Transactions of the Association for Computational Linguistics, vol. 4, pp. 17–30, 2016. Action Editor: Chris Callison-Burch. Submission batch: 9/2015; Revised 12/2015; Revised 1/2016; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Learning to Understand Phrases by Embedding the Dictionary
Felix Hill (Computer Laboratory, University of Cambridge), Kyunghyun Cho* (Courant Institute of Mathematical Sciences and Centre for Data Science, New York University), Anna Korhonen (Department of Theoretical and Applied Linguistics, University of Cambridge), Yoshua Bengio (CIFAR Senior Fellow, Université de Montréal)
felix.hill@cl.cam.ac.uk, kyunghyun.cho@nyu.edu, alk23@cam.ac.uk, yoshua.bengio@umontreal.ca

Abstract

Distributional models that learn rich semantic word representations are a success story of recent NLP research. However, developing models that learn useful representations of phrases and sentences has proved far harder. We propose using the definitions found in everyday dictionaries as a means of bridging this gap between lexical and phrasal semantics. Neural language embedding models can be effectively trained to map dictionary definitions (phrases) to (lexical) representations of the words defined by those definitions. We present two applications of these architectures: reverse dictionaries that return the name of a concept given a definition or description and general-knowledge crossword question answerers. On both tasks, neural language embedding models trained on definitions from a handful of freely-available lexical resources perform as well or better than existing commercial systems that rely on significant task-specific engineering. The results highlight the effectiveness of both neural embedding architectures and definition-based training for developing models that understand phrases and sentences.

1 Introduction

Much recent research in computational semantics has focussed on learning representations of arbitrary-length phrases and sentences. This task is challenging partly because there is no obvious gold standard of phrasal representation that could be used in training and evaluation. Consequently, it is difficult to design approaches that could learn from such a gold standard, and also hard to evaluate or compare different models.

In this work, we use dictionary definitions to address this issue. The composed meaning of the words in a dictionary definition (a tall, long-necked, spotted ruminant of Africa) should correspond to the meaning of the word they define (giraffe). This bridge between lexical and phrasal semantics is useful because high quality vector representations of single words can be used as a target when learning to combine the words into a coherent phrasal representation.

This approach still requires a model capable of learning to map between arbitrary-length phrases and fixed-length continuous-valued word vectors. For this purpose we experiment with two broad classes of neural language models (NLMs): Recurrent Neural Networks (RNNs), which naturally encode the order of input words, and simpler (feed-forward) bag-of-words (BOW) embedding models. Prior to training these NLMs, we learn target lexical representations by training the Word2Vec software (Mikolov et al., 2013) on billions of words of raw text.

We demonstrate the usefulness of our approach by building and releasing two applications. The first is a reverse dictionary or concept finder: a system that returns words based on user descriptions or definitions (Zock and Bilac, 2004). Reverse dictionaries are used by copywriters, novelists, translators and other professional writers to find words for notions or ideas that might be on the tip of their tongue. …

* Work mainly done at the University of Montreal.
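The BOW variant of this idea can be sketched in a few lines: embed a definition as the average of its word vectors and return the candidate word whose vector is nearest. The tiny hand-made vectors below are assumptions for illustration, not trained Word2Vec embeddings:

```python
import math

# Toy word vectors (invented; a real system would use trained embeddings).
VECS = {
    "giraffe": [0.9, 0.1, 0.0],
    "mouse":   [0.1, 0.9, 0.1],
    "tall":    [0.8, 0.2, 0.1],
    "animal":  [0.7, 0.4, 0.2],
    "small":   [0.1, 0.8, 0.3],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def embed_definition(words):
    """Bag-of-words embedding: average the vectors of the definition's words."""
    vs = [VECS[w] for w in words if w in VECS]
    return [sum(dim) / len(vs) for dim in zip(*vs)]

def reverse_dictionary(definition, candidates=("giraffe", "mouse")):
    """Return the candidate whose word vector is closest to the definition."""
    d = embed_definition(definition.split())
    return max(candidates, key=lambda w: cosine(VECS[w], d))
```

The RNN variant described in the paper replaces the averaging step with a learned, order-sensitive encoder, but the training target (the defined word's vector) is the same.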


Transactions of the Association for Computational Linguistics, vol. 5, pp. 529–542, 2017. Action Editor: Diana McCarthy. Submission batch: 7/2017; Published 12/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge
Ryan J. Gallagher¹², Kyle Reing¹, David Kale¹, and Greg Ver Steeg¹
¹ Information Sciences Institute, University of Southern California; ² Vermont Complex Systems Center, Computational Story Lab, University of Vermont
ryan.gallagher@uvm.edu, {reing,kale,gregv}@isi.edu

Abstract

While generative models such as Latent Dirichlet Allocation (LDA) have proven fruitful in topic modeling, they often require detailed assumptions and careful specification of hyperparameters. Such model complexity issues only compound when trying to generalize generative models to incorporate human input. We introduce Correlation Explanation (CorEx), an alternative approach to topic modeling that does not assume an underlying generative model, and instead learns maximally informative topics through an information-theoretic framework. This framework naturally generalizes to hierarchical and semi-supervised extensions with no additional modeling assumptions. In particular, word-level domain knowledge can be flexibly incorporated within CorEx through anchor words, allowing topic separability and representation to be promoted with minimal human intervention. Across a variety of datasets, metrics, and experiments, we demonstrate that CorEx produces topics that are comparable in quality to those produced by unsupervised and semi-supervised variants of LDA.

1 Introduction

The majority of topic modeling approaches utilize probabilistic generative models, models which specify mechanisms for how documents are written in order to infer latent topics. These mechanisms may be explicitly stated, as in Latent Dirichlet Allocation (LDA) (Blei et al., 2003), or implicitly stated, as with matrix factorization techniques (Hofmann, 1999; Ding et al., 2008; Buntine and Jakulin, 2006). The core generative mechanisms of LDA, in particular, have inspired numerous generalizations that account for additional information, such as the authorship (Rosen-Zvi et al., 2004), document labels (McAuliffe and Blei, 2008), or hierarchical structure (Griffiths et al., 2004). However, these generalizations come at the cost of increasingly elaborate and unwieldy generative assumptions. While these assumptions allow topic inference to be tractable in the face of additional metadata, they progressively constrain topics to a narrower view of what a topic can be. Such assumptions are undesirable in contexts where one wishes to minimize model complexity and learn topics without preexisting notions of how those topics originated.

For these reasons, we propose topic modeling by way of Correlation Explanation (CorEx),¹ an information-theoretic approach to learning latent topics over documents. Unlike LDA, CorEx does not assume a particular data generating model, and instead searches for topics that are "maximally informative" about a set of documents. By learning informative topics rather than generated topics, we avoid specifying the structure and nature of topics ahead of time. In addition, the lightweight framework underlying CorEx is versatile and naturally extends to hierarchical and semi-supervised variants with no additional modeling assumptions. More specifically, we …

¹ Open source, documented code for the CorEx topic model is available at https://github.com/gregversteeg/corex_topic.
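CorEx's information-theoretic framework is built on total correlation, TC(X) = Σᵢ H(Xᵢ) − H(X₁,…,X_d), which is zero exactly when the variables are independent. A minimal empirical estimate of the quantity (a sketch of the objective being maximized, not of the CorEx search procedure itself):

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Empirical Shannon entropy of a list of hashable observations."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())

def total_correlation(rows):
    """Total correlation of the columns of a list of tuples: the sum of
    marginal entropies minus the joint entropy; zero iff the columns
    are (empirically) independent."""
    columns = list(zip(*rows))
    return sum(entropy(list(col)) for col in columns) - entropy(rows)
```

On perfectly correlated binary columns such as [(0,0),(1,1),(0,0),(1,1)] the estimate is 1 bit, while on the fully independent [(0,0),(0,1),(1,0),(1,1)] it is 0; CorEx searches for latent topic variables that explain away as much of this dependence among words as possible.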


Transactions of the Association for Computational Linguistics, vol. 5, pp. 487–500, 2017. Action Editor: Chris Quirk. Submission batch: 3/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Phrase Table Induction Using In-Domain Monolingual Data for Domain Adaptation in Statistical Machine Translation
Benjamin Marie and Atsushi Fujita, National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan
{bmarie,atsushi.fujita}@nict.go.jp

Abstract

We present a new framework to induce an in-domain phrase table from in-domain monolingual data that can be used to adapt a general-domain statistical machine translation system to the targeted domain. Our method first compiles sets of phrases in source and target languages separately and generates candidate phrase pairs by taking the Cartesian product of the two phrase sets. It then computes inexpensive features for each candidate phrase pair and filters them using a supervised classifier in order to induce an in-domain phrase table. We experimented on the language pair English–French, both translation directions, in two domains and obtained consistently better results than a strong baseline system that uses an in-domain bilingual lexicon. We also conducted an error analysis that showed the induced phrase tables proposed useful translations, especially for words and phrases unseen in the parallel data used to train the general-domain baseline system.

1 Introduction

In phrase-based statistical machine translation (SMT), translation models are estimated over a large amount of parallel data. In general, using more data leads to a better translation model. When no specific domain is targeted, general-domain¹ parallel data from various domains may be used to train a general-purpose SMT system. However, it is well-known that, in training a system to translate texts from a specific domain, using in-domain parallel data can lead to a significantly better translation quality (Carpuat et al., 2012). Indeed, when only general-domain parallel data are used, it is unlikely that the translation model can learn expressions and their translations specific to the targeted domain. Such expressions will then remain untranslated in the in-domain texts to translate.

So far, in-domain parallel data have been harnessed to cover domain-specific expressions and their translations in the translation model. However, even if we can assume the availability of a large quantity of general-domain parallel data, at least for resource-rich language pairs, finding in-domain parallel data specific to a particular domain remains challenging. In-domain parallel data may not exist for the targeted language pairs or may not be available at hand to train a good translation model.

In order to circumvent the lack of in-domain parallel data, this paper presents a new method to adapt an existing SMT system to a specific domain by inducing an in-domain phrase table, i.e., a set of phrase pairs associated with features for decoding, from in-domain monolingual data. As we review in Section 2, most of the existing methods for inducing phrase tables are not designed, and may not perform as expected, to induce a phrase table for a specific domain for which only limited resources are available. Instead of relying on large quantity of parallel data or highly comparable corpora, our method induces an in-domain phrase table from unaligned in-domain monolingual data through a three-step pro…

¹ As in Axelrod et al. (2011), in this paper, we use the term general-domain instead of the commonly used out-of-domain because we assume that the parallel data may contain some in-domain sentence pairs.
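The candidate-generation-then-filtering pipeline can be sketched as follows. The scoring "classifier" here is a hypothetical stand-in (a seed-lexicon coverage score with a length-ratio penalty), not the supervised classifier or features used in the paper; the phrase sets and lexicon are invented:

```python
from itertools import product

# Toy in-domain phrase sets compiled separately per language (illustrative).
src_phrases = ["maladie rare", "grand chien"]
tgt_phrases = ["rare disease", "big dog", "disease"]

# Hypothetical seed lexicon of (source word, target word) translations.
SEED_LEXICON = {("maladie", "disease"), ("rare", "rare"),
                ("grand", "big"), ("chien", "dog")}

def score(src, tgt):
    """Toy score: fraction of source words with a lexicon translation
    in the target phrase, scaled by a length-ratio penalty."""
    s, t = src.split(), tgt.split()
    hits = sum(1 for w in s if any((w, v) in SEED_LEXICON for v in t))
    ratio = min(len(s), len(t)) / max(len(s), len(t))
    return (hits / len(s)) * ratio

def induce_phrase_table(threshold=0.9):
    """Cartesian product of the two phrase sets, filtered by the score."""
    candidates = product(src_phrases, tgt_phrases)
    return [(s, t) for s, t in candidates if score(s, t) >= threshold]
```

The Cartesian product makes the candidate space quadratic in the phrase set sizes, which is why the paper stresses inexpensive features for the filtering step.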


Transactions of the Association for Computational Linguistics, vol. 5, pp. 441–454, 2017. Action Editor: Marco Kuhlmann. Submission batch: 4/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Parsing with Traces: An O(n⁴) Algorithm and a Structural Representation
Jonathan K. Kummerfeld and Dan Klein, Computer Science Division, University of California, Berkeley, Berkeley, CA 94720, USA
{jkk,klein}@cs.berkeley.edu

Abstract

General treebank analyses are graph structured, but parsers are typically restricted to tree structures for efficiency and modeling reasons. We propose a new representation and algorithm for a class of graph structures that is flexible enough to cover almost all treebank structures, while still admitting efficient learning and inference. In particular, we consider directed, acyclic, one-endpoint-crossing graph structures, which cover most long-distance dislocation, shared argumentation, and similar tree-violating linguistic phenomena. We describe how to convert phrase structure parses, including traces, to our new representation in a reversible manner. Our dynamic program uniquely decomposes structures, is sound and complete, and covers 97.3% of the Penn English Treebank. We also implement a proof-of-concept parser that recovers a range of null elements and trace types.

1 Introduction

Many syntactic representations use graphs and/or discontinuous structures, such as traces in Government and Binding theory and f-structure in Lexical Functional Grammar (Chomsky 1981; Kaplan and Bresnan 1982). Sentences in the Penn Treebank (PTB, Marcus et al. 1993) have a core projective tree structure and trace edges that represent control structures, wh-movement and more. However, most parsers and the standard evaluation metric ignore these edges and all null elements. By leaving out parts of the structure, they fail to provide key relations to downstream tasks such as question answering. While there has been work on capturing some parts of this extra structure, it has generally either been through post-processing on trees (Johnson 2002; Jijkoun 2003; Campbell 2004; Levy and Manning 2004; Gabbard et al. 2006) or has only captured a limited set of phenomena via grammar augmentation (Collins 1997; Dienes and Dubey 2003; Schmid 2006; Cai et al. 2011).

We propose a new general-purpose parsing algorithm that can efficiently search over a wide range of syntactic phenomena. Our algorithm extends a non-projective tree parsing algorithm (Pitler et al. 2013; Pitler 2014) to graph structures, with improvements to avoid derivational ambiguity while maintaining an O(n⁴) runtime. Our algorithm also includes an optional extension to ensure parses contain a directed projective tree of non-trace edges.

Our algorithm cannot apply directly to constituency parses – it requires lexicalized structures similar to dependency parses. We extend and improve previous work on lexicalized constituent representations (Shen et al. 2007; Carreras et al. 2008; Hayashi and Nagata 2016) to handle traces. In this form, traces can create problematic structures such as directed cycles, but we show how careful choice of head rules can minimize such issues.

We implement a proof-of-concept parser, scoring 88.1 on trees in section 23 and 70.6 on traces. Together, our representation and algorithm cover 97.3% of sentences, far above the coverage of projective tree parsers (43.9%).

2 Background

This work builds on two areas: non-projective tree parsing, and parsing with null elements. Non-projectivity is important in syntax for rep…
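The notion of crossing edges underlying the one-endpoint-crossing restriction can be sketched over spans of word indices. This is a generic crossing check on toy edges, not the paper's dynamic program:

```python
def crossing_pairs(edges):
    """Return pairs of edges (as index spans) that cross:
    spans (i, j) and (k, l) cross iff i < k < j < l."""
    spans = [tuple(sorted(e)) for e in edges]
    out = []
    for a in range(len(spans)):
        for b in range(a + 1, len(spans)):
            (i, j), (k, l) = sorted([spans[a], spans[b]])
            if i < k < j < l:
                out.append((spans[a], spans[b]))
    return out
```

Nested edges such as (0, 3) and (1, 2) do not cross, while interleaved edges such as (0, 2) and (1, 3) do; the one-endpoint-crossing class permits a crossing only when all edges crossing a given edge share an endpoint.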


Transactions of the Association for Computational Linguistics, vol. 5, pp. 413–424, 2017. Action Editor: Brian Roark. Submission batch: 6/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

In-Order Transition-based Constituent Parsing
Jiangming Liu and Yue Zhang, Singapore University of Technology and Design, 8 Somapah Road, Singapore, 487372
jmliunlp@gmail.com, yuezhang@sutd.edu.sg

Abstract

Both bottom-up and top-down strategies have been used for neural transition-based constituent parsing. The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree, where bottom-up strategies and top-down strategies take post-order and pre-order traversal over trees, respectively. Bottom-up parsers benefit from rich features from readily built partial parses, but lack lookahead guidance in the parsing process; top-down parsers benefit from non-local guidance for local decisions, but rely on a strong encoder over the input to predict a constituent hierarchy before its construction. To mitigate both issues, we propose a novel parsing system based on in-order traversal over syntactic trees, designing a set of transition actions to find a compromise between bottom-up constituent information and top-down lookahead information. Based on stack-LSTM, our psycholinguistically motivated constituent parsing system achieves 91.8 F1 on the WSJ benchmark. Furthermore, the system achieves 93.6 F1 with supervised reranking and 94.2 F1 with semi-supervised reranking, which are the best results on the WSJ benchmark.

1 Introduction

Transition-based constituent parsing employs sequences of local transition actions to construct constituent trees over sentences. There are two popular transition-based constituent parsing systems, namely bottom-up parsing (Sagae and Lavie, 2005; Zhang and Clark, 2009; Zhu et al., 2013; Watanabe and Sumita, 2015) and top-down parsing (Dyer et al., 2016; Kuncoro et al., 2017). The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree.

The process of bottom-up parsing can be regarded as post-order traversal over a constituent tree. For example, given the sentence in Figure 1, a bottom-up shift-reduce parser takes the action sequence in Table 2(a)¹ to build the output, where the word sequence "The little boy" is first read, and then an NP recognized for the word sequence. After the system reads the verb "likes" and its subsequent NP, a VP is recognized. The full order of recognition for the tree nodes is 3 → 4 → 5 → 2 → 7 → 9 → 10 → 8 → 6 → 11 → 1. When making local decisions, rich information is available from readily built partial trees (Zhu et al., 2013; Watanabe and Sumita, 2015; Cross and Huang, 2016), which contributes to local disambiguation. However, there is a lack of top-down guidance from lookahead information, which can be useful (Johnson, 1998; Roark and Johnson, 1999; Charniak, 2000; Liu and Zhang, 2017). In addition, binarization must be applied to trees, as shown in Figure 1(b), to ensure a constant number of actions (Sagae and Lavie, 2005), and to take advantage of lexical head information (Collins, 2003). However, such binarization requires a set of language-specific rules, which hampers adaptation of parsing to other languages.

On the other hand, the process of top-down parsing can be regarded as pre-order traversal over a tree. Given the sentence in Figure 1, a top-down …

¹ The action sequence is taken on unbinarized trees.
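The three traversal orders can be contrasted on a small constituent-style tree. Here "in-order" follows the strategy described above for n-ary trees: a node is recognized after its first child and before its remaining children. The tree itself is a made-up miniature, not the paper's Figure 1:

```python
class Node:
    def __init__(self, label, children=()):
        self.label, self.children = label, list(children)

def post_order(n):   # bottom-up recognition order
    return [x for c in n.children for x in post_order(c)] + [n.label]

def pre_order(n):    # top-down recognition order
    return [n.label] + [x for c in n.children for x in pre_order(c)]

def in_order(n):     # recognized after the first child, before the rest
    if not n.children:
        return [n.label]
    first, rest = n.children[0], n.children[1:]
    return in_order(first) + [n.label] + [x for c in rest for x in in_order(c)]

# Toy tree:  S -> NP VP ;  NP -> "boy" ;  VP -> "likes" NP2 ;  NP2 -> "tomatoes"
tree = Node("S", [Node("NP", [Node("boy")]),
                  Node("VP", [Node("likes"), Node("NP2", [Node("tomatoes")])])])
```

Note how in-order recognizes S after its first constituent (NP) is complete but before the VP is built, giving the parser both a built partial parse and top-down guidance for the rest.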


Transactions of the Association for Computational Linguistics, vol. 5, pp. 379–395, 2017. Action Editor: Mark Steedman. Submission batch: 12/2016; Revision batch: 3/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Ordinal Common-sense Inference
Sheng Zhang, Rachel Rudinger, Kevin Duh, and Benjamin Van Durme (Johns Hopkins University)
zsheng2@jhu.edu, rudinger@jhu.edu, kevinduh@cs.jhu.edu, vandurme@cs.jhu.edu

Abstract

Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural sequence-to-sequence model on this dataset, which we use to score and generate possible inferences. Further, we annotate subsets of previously established datasets via our ordinal annotation protocol in order to then analyze the distinctions between these and what we have constructed.

1 Introduction

We use words to talk about the world. Therefore, to understand what words mean, we must have a prior explication of how we view the world. – Hobbs (1987)

Researchers in Artificial Intelligence and (Computational) Linguistics have long-cited the requirement of common-sense knowledge in language understanding.¹ This knowledge is viewed as a key component in filling in the gaps between the telegraphic style of natural language statements. We are able to convey considerable information in a relatively sparse channel, presumably owing to a partially shared model at the start of any discourse.² Common-sense inference – inferences based on common-sense knowledge – is possibilistic: things everyone more or less would expect to hold in a given context, but without the necessary strength of logical entailment.³ Because natural language corpora exhibit human reporting bias (Gordon and Van Durme, 2013), systems that derive knowledge exclusively from such corpora may be more accurately considered models of language, rather than of the …

Figure 1: Examples of common-sense inference ranging from very likely, likely, plausible, technically possible, to impossible:
Sam bought a new clock; The clock runs
Dave found an axe in his garage; A car is parked in the garage
Tom was accidentally shot by his teammate in the army; The teammate dies
Two friends were in a heated game of checkers; A person shoots the checkers
My friends and I decided to go swimming in the ocean; The ocean is carbonated

¹ Schank (1975): It has been apparent … within … natural language understanding that the eventual limit to our solution would be our ability to characterize world knowledge.
² McCarthy (1959): a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows.
³ Many of the bridging inferences of Clark (1975) make use of common-sense knowledge, such as the following example of "Probable part": I walked into the room. The windows looked out to the bay. To resolve the definite reference the windows, one needs to know that rooms have windows is probable.
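A scoring model for this task ultimately has to relate a scalar likelihood estimate to the five ordinal labels of Figure 1. One minimal way to sketch that relation is a threshold mapping; the thresholds below are invented for illustration (in the paper the labels come from human annotators, not from fixed cutoffs):

```python
ORDINAL_LABELS = ["impossible", "technically possible",
                  "plausible", "likely", "very likely"]

def to_ordinal(score, thresholds=(0.05, 0.25, 0.5, 0.8)):
    """Map a likelihood score in [0, 1] to one of five ordinal labels
    by counting how many (hypothetical) thresholds it exceeds."""
    label = 0
    for t in thresholds:
        if score >= t:
            label += 1
    return ORDINAL_LABELS[label]
```

Under this toy mapping a near-certain inference ("The clock runs") lands at "very likely" and a contradiction of world knowledge ("The ocean is carbonated") at "impossible".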


Transactions of the Association for Computational Linguistics, vol. 5, pp. 365–378, 2017. Action Editor: Adam Lopez. Submission batch: 11/2016; Revision batch: 2/2017; Published 10/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Fully Character-Level Neural Machine Translation without Explicit Segmentation
Jason Lee* (ETH Zürich), Kyunghyun Cho (New York University), Thomas Hofmann (ETH Zürich)
jasonlee@inf.ethz.ch, kyunghyun.cho@nyu.edu, thomas.hofmann@inf.ethz.ch

Abstract

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT'15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of the BLEU score and human judgment.

1 Introduction

Nearly all previous work in machine translation has been at the level of words. Aside from our intuitive understanding of word as a basic unit of meaning (Jackendoff, 1992), one reason behind this is that sequences are significantly longer when represented in characters, compounding the problem of data sparsity and modeling long-range dependencies. This has driven NMT research to be almost exclusively word-level (Bahdanau et al., 2015; Sutskever et al., 2014).

Despite their remarkable success, word-level NMT models suffer from several major weaknesses. For one, they are unable to model rare, out-of-vocabulary words, making them limited in translating languages with rich morphology such as Czech, Finnish and Turkish. If one uses a large vocabulary to combat this (Jean et al., 2015), the complexity of training and decoding grows linearly with respect to the target vocabulary size, leading to a vicious cycle.

To address this, we present a fully character-level NMT model that maps a character sequence in a source language to a character sequence in a target language. We show that our model outperforms a baseline with a subword-level encoder on DE-EN and CS-EN, and achieves a comparable result on FI-EN and RU-EN. A purely character-level NMT model with a basic encoder was proposed as a baseline by Luong and Manning (2016), but training it was prohibitively slow. We were able to train our model at a reasonable speed by drastically reducing the length of source sentence representation using a stack of convolutional, pooling and highway layers.

One advantage of character-level models is that they are better suited for multilingual translation than their word-level counterparts which require a separate word vocabulary for each language. We …

* The majority of this work was completed while the author was visiting New York University.
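The key efficiency device in the encoder is strided max-pooling, which shrinks the source length before the recurrent layers see it. A sketch of the length arithmetic and of 1-d pooling over toy per-character feature values (the stride of 5 is an assumption for illustration, not necessarily the paper's configuration):

```python
from math import ceil

def pooled_length(n_chars, stride):
    """Sequence length after strided max-pooling (non-overlapping windows)."""
    return ceil(n_chars / stride)

def max_pool_1d(xs, stride):
    """Strided 1-d max-pooling over a list of feature values."""
    return [max(xs[i:i + stride]) for i in range(0, len(xs), stride)]
```

With a stride of 5, a 300-character source sentence yields a 60-step representation, which is in the same length range a subword segmentation would produce, explaining the comparable training speed.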


Transactions of the Association for Computational Linguistics, vol. 5, pp. 353–364, 2017. Action Editor: Eric Fosler-Lussier. Submission batch: 10/2016; Revision batch: 12/2016; Published 10/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Unsupervised Learning of Morphological Forests
Jiaming Luo, Karthik Narasimhan, and Regina Barzilay (CSAIL, MIT)
j_luo@mit.edu, karthikn@mit.edu, regina@csail.mit.edu

Abstract

This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edge-wise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of tight morphological families. The resulting objective is solved using Integer Linear Programming (ILP) paired with contrastive estimation. We train the model by alternating between optimizing the local log-linear model and the global ILP objective. We evaluate our system on three tasks: root detection, clustering of morphological families, and segmentation. Our experiments demonstrate that our model yields consistent gains in all three tasks compared with the best published results.¹

1 Introduction

The morphological study of a language inherently draws upon the existence of families of related words. All words within a family can be derived from a common root via a series of transformations, whether inflectional or derivational. Figure 1 depicts one such family, originating from the word faith. This representation can benefit a range of applications, including segmentation, root detection and clustering of morphological families.

[Figure 1: An illustration of a single tree in a morphological forest. pre and suf represent prefixation and suffixation. Each edge has an associated probability for the morphological change.]

Using graph terminology, a full morphological assignment of the words in a language can be represented as a forest.² Valid forests of morphological families exhibit a number of well-known regularities. At the global level, the number of roots is limited, and only constitutes a small fraction of the vocabulary. A similar constraint applies to the number of possible affixes, shared across families. At the local edge level, we prefer derivations that follow regular orthographic patterns and preserve semantic relatedness. We hypothesize that enforcing these constraints as part of the forest induction pro…

¹ Code is available at https://github.com/j-luo93/MorphForest.
² The correct mathematical term for the structure in Figure 1 is a directed 1-forest or functional graph. For simplicity, we shall use the terms forest and tree to refer to a directed 1-forest or a directed 1-tree because of the cycle at the root.
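The forest structure can be sketched as parent pointers with derivation labels; following the pointers recovers each word's root. The family below mirrors the faith example of Figure 1, with made-up edges for illustration (the real model attaches a learned probability to each edge):

```python
# word -> (parent, single-step derivation); a root points to itself.
FOREST = {
    "faith":      ("faith", "root"),
    "faithful":   ("faith", "suf:-ful"),
    "unfaithful": ("faithful", "pre:un-"),
    "faithfully": ("faithful", "suf:-ly"),
}

def root_of(word):
    """Follow single-step derivations up to the family's root."""
    while FOREST[word][0] != word:
        word = FOREST[word][0]
    return word

def family(root):
    """All words whose derivation chain ends at the given root."""
    return sorted(w for w in FOREST if root_of(w) == root)
```

The self-loop at the root is exactly the "cycle at the root" that makes the structure a directed 1-forest in the paper's footnote.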


计算语言学协会会刊, 卷. 5, PP. 309–324, 2017. 动作编辑器: Sebastian Pad´o.

Transactions of the Association for Computational Linguistics, vol. 5, pp. 309–324, 2017. Action Editor: Sebastian Padó. Submission batch: 4/2017; Revision batch: 7/2017; Published 9/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints
Nikola Mrkšić 1,2, Ivan Vulić 1, Diarmuid Ó Séaghdha 2, Ira Leviant 3, Roi Reichart 3, Milica Gašić 1, Anna Korhonen 1, Steve Young 1,2
1 University of Cambridge; 2 Apple Inc.; 3 Technion, IIT

Abstract: We present ATTRACT-REPEL, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. ATTRACT-REPEL facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialized cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that ATTRACT-REPEL-specialized vectors boost performance in the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.

1 Introduction: Word representation learning has become a research area of central importance in modern natural language processing. The common techniques for inducing distributed word representations are grounded in the distributional hypothesis, relying on co-occurrence information in large textual corpora to learn meaningful word representations (Mikolov et al., 2013b; Pennington et al., 2014; Ó Séaghdha and Korhonen, 2014; Levy and Goldberg, 2014). Recently, methods that go beyond stand-alone unsupervised learning have gained increased popularity. These models typically build on distributional ones by using human- or automatically-constructed knowledge bases to enrich the semantic content of existing word vector collections. Often this is done as a post-processing step, where the distributional word vectors are refined to satisfy constraints extracted from a lexical resource such as WordNet (Faruqui et al., 2015; Wieting et al., 2015; Mrkšić et al., 2016). We term this approach semantic specialization.

In this paper we advance the semantic specialization paradigm in a number of ways. We introduce a new algorithm, ATTRACT-REPEL, that uses synonymy and antonymy constraints drawn from lexical resources to tune word vector spaces using linguistic information that is difficult to capture with conventional distributional training. Our evaluation shows that ATTRACT-REPEL outperforms previous methods which make use of similar lexical resources, achieving state-of-the-art results on two word similarity datasets: SimLex-999 (Hill et al., 2015) and SimVerb-3500 (Gerz et al., 2016).

We then deploy the ATTRACT-REPEL algorithm in a multilingual setting, using semantic relations extracted from BabelNet (Navigli and Ponzetto, 2012; Ehrmann et al., 2014), a cross-lingual lexical resource, to inject constraints between words of different languages into the word representations. This allows us to embed vector spaces of multiple languages into a single vector space, exploiting information from high-resource languages to improve the word representations of lower-resource ones. Table 1 illustrates the effects of cross-lingual ATTRACT-REPEL specialization by showing the nearest neighbors for three English words across three cross-lingual spaces.
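The core idea of pulling synonym pairs together and pushing antonym pairs apart can be illustrated with a minimal sketch. This is not the published algorithm (which works on mini-batches with sampled negative examples and a regularization term pulling vectors toward their distributional originals); it is a toy single-pass update with invented margin and learning-rate values:

```python
import numpy as np

def attract_repel_step(vectors, synonyms, antonyms, lr=0.1,
                       attract_margin=1.0, repel_margin=0.0):
    """One toy ATTRACT-REPEL-style update (illustrative only).
    vectors: dict word -> unit-length np.array
    synonyms/antonyms: lists of (word, word) constraint pairs."""
    for a, b in synonyms:
        va, vb = vectors[a].copy(), vectors[b].copy()
        if va @ vb < attract_margin:          # not similar enough: attract
            vectors[a] = va + lr * (vb - va)
            vectors[b] = vb + lr * (va - vb)
    for a, b in antonyms:
        va, vb = vectors[a].copy(), vectors[b].copy()
        if va @ vb > repel_margin:            # too similar: repel
            vectors[a] = va - lr * vb
            vectors[b] = vb - lr * va
    for w in vectors:                         # project back to the unit sphere
        vectors[w] = vectors[w] / np.linalg.norm(vectors[w])
    return vectors
```

Repeating such updates drives constrained pairs toward (respectively away from) each other while the final normalization keeps cosine similarity well defined; cross-lingual specialization simply adds constraint pairs whose two words come from different languages.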



Transactions of the Association for Computational Linguistics, vol. 5, pp. 295–307, 2017. Action Editor: Christopher Potts. Submission batch: 10/2016; Revision batch: 12/2016; Published 8/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Overcoming Language Variation in Sentiment Analysis with Social Attention
Yi Yang and Jacob Eisenstein
School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA 30308
{yiyang+jacobe}@gatech.edu

Abstract: Variation in language is ubiquitous, particularly in newer forms of writing such as social media. Fortunately, variation is not random; it is often linked to social properties of the author. In this paper, we show how to exploit social networks to make sentiment analysis more robust to social language variation. The key idea is linguistic homophily: the tendency of socially linked individuals to use language in similar ways. We formalize this idea in a novel attention-based neural network architecture, in which attention is divided among several basis models, depending on the author's position in the social network. This has the effect of smoothing the classification function across the social network, and makes it possible to induce personalized classifiers even for authors for whom there is no labeled data or demographic metadata. This model significantly improves the accuracies of sentiment analysis on Twitter and on review data.

1 Introduction: Words can mean different things to different people. Fortunately, these differences are rarely idiosyncratic, but are often linked to social factors, such as age (Rosenthal and McKeown, 2011), gender (Eckert and McConnell-Ginet, 2003), race (Green, 2002), geography (Trudgill, 1974), and more ineffable characteristics such as political and cultural attitudes (Fischer, 1958; Labov, 1963). In natural language processing (NLP), social media data has brought variation to the fore, spurring the development of new computational techniques for characterizing variation in the lexicon (Eisenstein et al., 2010), orthography (Eisenstein, 2015), and syntax (Blodgett et al., 2016). However, aside from the focused task of spelling normalization (Sproat et al., 2001; Aw et al., 2006), there have been few attempts to make NLP systems more robust to language variation across speakers or writers. One exception is the work of Hovy (2015), who shows that the accuracies of sentiment analysis and topic classification can be improved by the inclusion of coarse-grained author demographics such as age and gender. However, such demographic information is not directly available in most datasets, and it is not yet clear whether predicted age and gender offer any improvements. On the other end of the spectrum are attempts to create personalized language technologies, as are often employed in information retrieval (Shen et al., 2005), recommender systems (Basilico and Hofmann, 2004), and language modeling (Federico, 1996). But personalization requires annotated data for each individual user, something that may be possible in interactive settings such as information retrieval, but is not typically feasible in natural language processing. We propose a middle ground between group-level demographic characteristics and personalization, by exploiting social network structure. The sociological theory of homophily asserts that individuals are usually similar to their friends (McPherson et al., 2001). This property has been demonstrated for language (Bryden et al., 2013) as well as for the demographic properties targeted by Hovy (2015), which are more likely to be shared by friends than by random pairs of individuals (Thelwall, 2009). Social …
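The architecture described above, attention over several basis classifiers driven by the author's network position, can be sketched in a few lines. All names and shapes here are illustrative (the paper learns node embeddings and trains the basis models jointly; this sketch just shows how author-dependent attention mixes K linear sentiment models):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def social_attention_score(author_emb, attn_proj, basis_w, x):
    """Sentiment probability from K basis models mixed by author-
    dependent attention (a simplified sketch, not the authors' code).
    author_emb: (d,)  social-network embedding of the author
    attn_proj:  (K,d) maps the author to K attention logits
    basis_w:    (K,f) one linear sentiment model per basis
    x:          (f,)  text features for the message"""
    att = softmax(attn_proj @ author_emb)   # (K,) attention over basis models
    logits = basis_w @ x                    # (K,) each basis model's score
    score = att @ logits                    # attention-weighted mixture
    return 1.0 / (1.0 + np.exp(-score))    # probability of positive sentiment
```

Because the attention weights depend only on the author embedding, socially close authors get similar mixtures, which is exactly the smoothing-across-the-network effect the abstract describes.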



Transactions of the Association for Computational Linguistics, vol. 5, pp. 279–293, 2017. Action Editor: Yuji Matsumoto. Submission batch: 5/2016; Revision batch: 10/2016; 2/2017; Published 8/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Cross-Lingual Syntactic Transfer with Limited Resources
Mohammad Sadegh Rasooli and Michael Collins*
Department of Computer Science, Columbia University, New York, NY 10027, USA
{rasooli,mcollins}@cs.columbia.edu
* On leave at Google Inc., New York.

Abstract: We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available. This method makes use of three steps: 1) a method for deriving cross-lingual word clusters, which can then be used in a multilingual parser; 2) a method for transferring lexical information from a target language to source language treebanks; 3) a method for integrating these steps with the density-driven annotation projection method of Rasooli and Collins (2015). Experiments show improvements over the state-of-the-art in several languages used in previous work, in a setting where the only source of translation data is the Bible, a considerably smaller corpus than the Europarl corpus used in previous work. Results using the Europarl corpus as a source of translation data show additional improvements over the results of Rasooli and Collins (2015). We conclude with results on 38 datasets from the Universal Dependencies corpora.

1 Introduction: Creating manually-annotated syntactic treebanks is an expensive and time consuming task. Recently there has been a great deal of interest in cross-lingual syntactic transfer, where a parsing model is trained for some language of interest, using only treebanks in other languages. There is a clear motivation for this in building parsing models for languages for which treebank data is unavailable. Methods for syntactic transfer include annotation projection methods (Hwa et al., 2005; Ganchev et al., 2009; McDonald et al., 2011; Ma and Xia, 2014; Rasooli and Collins, 2015; Lacroix et al., 2016; Agić et al., 2016), learning of delexicalized models on universal treebanks (Zeman and Resnik, 2008; McDonald et al., 2011; Täckström et al., 2013; Rosa and Zabokrtsky, 2015), treebank translation (Tiedemann et al., 2014; Tiedemann, 2015; Tiedemann and Agić, 2016), and methods that leverage cross-lingual representations of word clusters, embeddings or dictionaries (Täckström et al., 2012; Durrett et al., 2012; Duong et al., 2015a; Zhang and Barzilay, 2015; Xiao and Guo, 2015; Guo et al., 2015; Guo et al., 2016; Ammar et al., 2016a).

This paper considers the problem of cross-lingual syntactic transfer with limited resources of monolingual and translation data. Specifically, we use the Bible corpus of Christodouloupoulos and Steedman (2014) as a source of translation data, and Wikipedia as a source of monolingual data. We deliberately limit ourselves to the use of Bible translation data because it is available for a very broad set of languages: the data from Christodouloupoulos and Steedman (2014) includes data from 100 languages. The Bible data contains a much smaller set of sentences (around 24,000) than other translation corpora, for example Europarl (Koehn, 2005), which has around 2 million sentences per language pair. This makes it a considerably more challenging corpus to work with. Similarly, our choice of Wikipedia as the source of monolingual data is motivated by the availability of Wikipedia data in a very broad set of languages.
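The annotation-projection ingredient referenced above (step 3) can be illustrated with a toy version: copy dependency arcs from a parsed source sentence onto its translation through a word alignment. This is deliberately much simpler than the paper's density-driven method, which additionally filters projections by alignment density and confidence; the 1-to-1 alignment assumption here is an illustration only:

```python
def project_dependencies(src_heads, alignment):
    """Project source dependency arcs onto a target sentence through a
    1-to-1 word alignment (toy annotation projection).
    src_heads: list where src_heads[i] is the head index of source
               token i, with -1 marking the root.
    alignment: dict mapping source token index -> target token index.
    Returns a dict of projected target heads (partial if some tokens
    or heads are unaligned)."""
    tgt_heads = {}
    for dep_s, head_s in enumerate(src_heads):
        if dep_s not in alignment:
            continue                       # dependent not aligned: skip arc
        dep_t = alignment[dep_s]
        if head_s == -1:
            tgt_heads[dep_t] = -1          # root projects to root
        elif head_s in alignment:
            tgt_heads[dep_t] = alignment[head_s]
    return tgt_heads
```

Partial projections like these become (noisy) training trees for the target-language parser; the quality-filtering steps decide which of them are dense enough to train on.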



Transactions of the Association for Computational Linguistics, vol. 5, pp. 247–261, 2017. Action Editor: Hinrich Schütze. Submission batch: 12/2015; Revision batch: 5/2016; 11/2016; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Gábor Berend
Department of Informatics, University of Szeged, 2 Árpád tér, 6720 Szeged, Hungary
berendg@inf.u-szeged.hu

Abstract: In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained on 1.2% of the total available training data, i.e. 150 sentences per language.

1 Introduction: Determining the linguistic structure of natural language texts based on rich hand-crafted features has a long-going history in natural language processing. The focus of traditional approaches has mostly been on building linguistic analyzers for a particular kind of analysis, which often leads to the incorporation of extensive linguistic and/or domain knowledge for defining the feature space. Consequently, traditional models easily become language and/or task specific, resulting in improper generalization properties. A new research direction has emerged recently, that aims at building more general models that require far less feature engineering or none at all. These advancements in natural language processing, pioneered by Bengio et al. (2003), followed by Collobert and Weston (2008), Collobert et al. (2011), Mikolov et al. (2013a) among others, employ a different philosophy. The objective of these works is to find representations for linguistic phenomena in an unsupervised manner by relying on large amounts of text. Natural language phenomena are extremely sparse by their nature, whereas continuous word embeddings employ dense representations of words. In our paper we empirically verify via rigorous experiments that turning these dense representations into a much sparser (yet denser than one-hot encoding) form can keep the most salient parts of word representations that are highly suitable for sequence models. Furthermore, our experiments reveal that our proposed model performs substantially better than traditional feature-rich models in the absence of abundant training data. Our proposed model also has the advantage of performing well on multiple sequence labeling tasks without any modification in the applied word representations, thanks to the sparse features derived from continuous word representations. Our work aims at introducing a novel sequence labeling model solely utilizing features derived from the sparse coding of continuous word embeddings. Even though sparse coding had previously been utilized in NLP prior to us (Faruqui et al., 2015; Chen et al., 2016), to the best of our knowledge, we are the first to propose a sequence labeling framework incorporating it, with the following contributions:
• We show that the proposed sparse representation is general, as sequence labeling models trained on them achieve (near) state-of-the-art performances for both POS tagging and NER.
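The step from a dense embedding to sparse indicator features can be sketched as follows. Assuming a dictionary of basis vectors is already given (the paper learns it with dictionary learning over the whole embedding matrix), one can sparse-code each word vector with a lasso solver such as ISTA and then keep only the identities of the nonzero coordinates as features; `lam`, `lr`, and `steps` are illustrative values:

```python
import numpy as np

def sparse_code(x, D, lam=0.1, steps=200, lr=0.1):
    """Sparse-code dense vector x against dictionary D (k basis rows of
    dimension d) with ISTA (iterative soft thresholding) for the lasso
    objective 0.5*||x - D^T a||^2 + lam*||a||_1. Toy stand-in for the
    dictionary-learning setup; D is assumed precomputed."""
    a = np.zeros(D.shape[0])
    for _ in range(steps):
        grad = D @ (D.T @ a - x)                      # gradient of the fit term
        a = a - lr * grad
        a = np.sign(a) * np.maximum(np.abs(a) - lr * lam, 0.0)  # soft threshold
    return a

def indicator_features(a, eps=1e-8):
    """The sequence labeler uses only the identities of the nonzero
    coordinates as sparse indicator features."""
    return [i for i, v in enumerate(a) if abs(v) > eps]
```

With a few thousand basis vectors, each word contributes only a handful of active indices, which is what makes the resulting feature set small yet expressive.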



Transactions of the Association for Computational Linguistics, vol. 5, pp. 233–246, 2017. Action Editor: Patrick Pantel. Submission batch: 11/2016; Revision batch: 2/2017; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Domain-Targeted, High Precision Knowledge Extraction
Bhavana Dalvi Mishra, Niket Tandon, Peter Clark
Allen Institute for Artificial Intelligence, 2157 N Northlake Way Suite 110, Seattle, WA 98103
{bhavanad,nikett,peterc}@allenai.org

Abstract: Our goal is to construct a domain-targeted, high precision knowledge base (KB), containing general (subject, predicate, object) statements about the world, in support of a downstream question-answering (QA) application. Despite recent advances in information extraction (IE) techniques, no suitable resource for our task already exists; existing resources are either too noisy, too named-entity centric, or too incomplete, and typically have not been constructed with a clear scope or purpose. To address these, we have created a domain-targeted, high precision knowledge extraction pipeline, leveraging OpenIE, crowdsourcing, and a novel canonical schema learning algorithm (called CASI), that produces high precision knowledge targeted to a particular domain (in our case, elementary science). To measure the KB's coverage of the target domain's knowledge (its "comprehensiveness" with respect to science) we measure recall with respect to an independent corpus of domain text, and show that our pipeline produces output with over 80% precision and 23% recall with respect to that target, a substantially higher coverage of tuple-expressible science knowledge than other comparable resources. We have made the KB publicly available.¹
¹ This KB, named "Aristo TupleKB", is available for download at http://data.allenai.org/tuple-kb

1 Introduction: While there have been substantial advances in knowledge extraction techniques, the availability of high precision, general knowledge about the world remains elusive. Specifically, our goal is a large, high precision body of (subject, predicate, object) statements relevant to elementary science, to support a downstream QA application task. Although there are several impressive, existing resources that can contribute to our endeavor, e.g., NELL (Carlson et al., 2010), ConceptNet (Speer and Havasi, 2013), WordNet (Fellbaum, 1998), WebChild (Tandon et al., 2014), Yago (Suchanek et al., 2007), FreeBase (Bollacker et al., 2008), and ReVerb-15M (Fader et al., 2011), their applicability is limited by both
• limited coverage of general knowledge (e.g., FreeBase and NELL primarily contain knowledge about Named Entities; WordNet uses only a few […] 80% precision over that corpus (its "comprehensiveness" with respect to science). This measure is similar to recall at the point P=80% on the PR curve, except measured against a domain-specific sample of data that reflects the distribution of the target domain knowledge. Comprehensiveness thus gives us an approximate notion of the completeness of the KB for (tuple-expressible) facts in our target domain, something that has been lacking in earlier KB construction research. We show that our KB has comprehensiveness (recall of domain facts at >80% precision) of 23% with respect to science, a substantially higher coverage² of tuple-expressible science knowledge than other comparable resources. We are making the KB publicly available.
² Aristo TupleKB is available for download at http://allenai.org/data/aristo-tuple-kb

Outline: We discuss the related work in Section 2. In Section 3, we describe the domain-targeted pipeline, including how the domain is characterized to the algorithm and the sequence of filters and predictors used. In Section 4, we describe how the relationships between predicates in the domain are identified and the more general predicates further populated. Finally in Section 5, we evaluate our approach, including evaluating its comprehensiveness (high-precision coverage of science knowledge).

2 Related Work: There has been substantial, recent progress in knowledge bases that (mostly) encode knowledge about Named Entities, including Freebase (Bollacker et al., 2008), Knowledge Vault (Dong et al., 2014), DBPedia (Auer et al., 2007), and others that hierarchically organize nouns and named entities, e.g., Yago (Suchanek et al., 2007). While these KBs are rich in facts about named entities, they are sparse in general knowledge about common nouns (e.g., that bears have fur). KBs covering general knowledge have received less attention, although there are some notable exceptions constructed using manual methods, e.g., WordNet (Fellbaum, 1998), crowdsourcing, e.g., ConceptNet (Speer and Havasi, 2013), and, more recently, using automated methods, e.g., WebChild (Tandon et al., 2014). While useful, these resources have been constructed to target only a small set of relations, providing only limited coverage for a domain of interest. To overcome relation sparseness, the paradigm of OpenIE (Banko et al., 2007; Soderland et al., 2013) extracts knowledge from text using an open set of relationships, and has been used to successfully build large-scale (arg1, relation, arg2) resources such as ReVerb-15M (containing 15 million general triples) (Fader et al., 2011). Although broad coverage, however, OpenIE techniques typically produce noisy output. Our extraction pipeline can be viewed as an extension of the OpenIE paradigm: we start with targeted OpenIE output, and then apply a sequence of filters to substantially improve the …
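The "OpenIE output plus a sequence of filters" design can be sketched generically. The filters below are invented placeholders (the real pipeline uses crowdsourcing signals and the CASI schema-learning step, which this sketch does not attempt); the point is only the shape of the pipeline, each stage trading recall for precision over candidate (subject, predicate, object) tuples:

```python
def run_pipeline(candidates, filters):
    """Apply a sequence of precision-oriented filters to candidate
    (subject, predicate, object) tuples. Each filter keeps a tuple
    iff it returns True."""
    for f in filters:
        candidates = [t for t in candidates if f(t)]
    return candidates

# Hypothetical stand-ins for the paper's domain-targeting filters:
DOMAIN_VOCAB = {"bear", "fur", "leaf", "plant"}

def in_domain(t):
    """Domain-vocabulary filter: the subject must be a domain term."""
    return t[0] in DOMAIN_VOCAB

def well_formed(t):
    """Well-formedness filter: three non-empty fields."""
    return len(t) == 3 and all(t)
```

A pipeline like `run_pipeline(openie_tuples, [well_formed, in_domain, ...])` makes the precision/recall trade-off explicit: recall against an independent domain corpus (the paper's "comprehensiveness") is measured on whatever survives the filter cascade.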



Transactions of the Association for Computational Linguistics, vol. 5, pp. 205–218, 2017. Action Editor: Stefan Riezler. Submission batch: 12/2016; Revision batch: 2/2017; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Pushing the Limits of Translation Quality Estimation
André F. T. Martins (Unbabel; Instituto de Telecomunicações, Lisbon, Portugal), Marcin Junczys-Dowmunt (Adam Mickiewicz University in Poznań, Poland), Fabio N. Kepler (Unbabel; L2F/INESC-ID, Lisbon, Portugal; University of Pampa, Alegrete, Brazil), Ramón Astudillo (Unbabel; L2F/INESC-ID, Lisbon, Portugal), Chris Hokamp (Dublin City University, Dublin, Ireland), Roman Grundkiewicz (Adam Mickiewicz University in Poznań, Poland)

Abstract: Translation quality estimation is a task of growing importance in NLP, due to its potential to reduce post-editing human effort in disruptive ways. However, this potential is currently limited by the relatively low accuracy of existing systems. In this paper, we achieve remarkable improvements by exploiting synergies between the related tasks of word-level quality estimation and automatic post-editing. First, we stack a new, carefully engineered, neural model into a rich feature-based word-level quality estimation system. Then, we use the output of an automatic post-editing system as an extra feature, obtaining striking results on WMT16: a word-level F1^MULT score of 57.47% (an absolute gain of +7.95% over the current state of the art), and a Pearson correlation score of 65.56% for sentence-level HTER prediction (an absolute gain of +13.36%).

1 Introduction: The goal of quality estimation (QE) is to evaluate a translation system's quality without access to reference translations (Blatz et al., 2004; Specia et al., 2013). This has many potential usages: informing an end user about the reliability of translated content; deciding if a translation is ready for publishing or if it requires human post-editing; highlighting the words that need to be changed. QE systems are particularly appealing for crowd-sourced and professional translation services, due to their potential to dramatically reduce post-editing times and to save labor costs (Specia, 2011). The increasing interest in this problem from an industrial angle comes as no surprise (Turchi et al., 2014; de Souza et al., 2015; Martins et al., 2016; Kozlova et al., 2016). In this paper, we tackle word-level QE, whose goal is to assign a label of OK or BAD to each word in the translation (Figure 1). Past approaches to this problem include linear classifiers with handcrafted features (Ueffing and Ney, 2007; Biçici, 2013; Shah et al., 2013; Luong et al., 2014), often combined with feature selection (Avramidis, 2012; Beck et al., 2013), recurrent neural networks (de Souza et al., 2014; Kim and Lee, 2016), and systems that combine linear and neural models (Kreutzer et al., 2015; Martins et al., 2016). We start by proposing a "pure" QE system (§3) consisting of a new, carefully engineered neural model (NEURALQE), stacked into a linear feature-rich classifier (LINEARQE). Along the way, we provide a rigorous empirical analysis to better understand the contribution of the several groups of features and to justify the architecture of the neural system. A second contribution of this paper is bringing in the related task of automatic post-editing (APE; Simard et al. (2007)), which aims to au…
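The stacking idea, feeding one system's output to the next as features, can be shown with a toy feature builder for the word-level task. The feature names and the naive position-based comparison with the APE output are invented for illustration (the paper's APE-based features are computed more carefully, e.g., via alignment between the MT output and the post-edited hypothesis):

```python
def stacked_qe_features(mt_tokens, ape_tokens, neural_ok_probs):
    """Per-token features for a stacked word-level QE classifier:
    the neural model's OK-probability plus an indicator of whether
    the APE system kept the token (simplified, illustrative sketch).
    mt_tokens:       tokens of the machine translation
    ape_tokens:      tokens of the automatic post-edit
    neural_ok_probs: neural model's P(OK) per MT token"""
    feats = []
    for i, tok in enumerate(mt_tokens):
        # Naive positional comparison; a real system would align first.
        kept = 1.0 if i < len(ape_tokens) and ape_tokens[i] == tok else 0.0
        feats.append({"neural_ok": neural_ok_probs[i],
                      "ape_kept": kept,
                      "bias": 1.0})
    return feats
```

A linear classifier over such dictionaries is then trained to emit OK/BAD per token; the `ape_kept` signal is the kind of cross-task evidence the paper reports as the source of its largest gains.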



Transactions of the Association for Computational Linguistics, vol. 5, pp. 135–146, 2017. Action Editor: Hinrich Schütze. Submission batch: 9/2016; Revision batch: 12/2016; Published 6/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Enriching Word Vectors with Subword Information
Piotr Bojanowski*, Edouard Grave*, Armand Joulin, Tomas Mikolov
Facebook AI Research
{bojanowski,egrave,ajoulin,tmikolov}@fb.com
* The two first authors contributed equally.

Abstract: Continuous word representations, trained on large unlabeled corpora, are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. A vector representation is associated to each character n-gram; words are represented as the sum of these representations. Our method is fast, allowing us to train models on large corpora quickly, and allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, both on word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.

1 Introduction: Learning continuous representations of words has a long history in natural language processing (Rumelhart et al., 1988). These representations are typically derived from large unlabeled corpora using co-occurrence statistics (Deerwester et al., 1990; Schütze, 1992; Lund and Burgess, 1996). A large body of work, known as distributional semantics, has studied the properties of these methods (Turney et al., 2010; Baroni and Lenci, 2010). In the neural network community, Collobert and Weston (2008) proposed to learn word embeddings using a feed-forward neural network, by predicting a word based on the two words on the left and two words on the right. More recently, Mikolov et al. (2013b) proposed simple log-bilinear models to learn continuous representations of words on very large corpora efficiently. Most of these techniques represent each word of the vocabulary by a distinct vector, without parameter sharing. In particular, they ignore the internal structure of words, which is an important limitation for morphologically rich languages, such as Turkish or Finnish. For example, in French or Spanish, most verbs have more than forty different inflected forms, while the Finnish language has fifteen cases for nouns. These languages contain many word forms that occur rarely (or not at all) in the training corpus, making it difficult to learn good word representations. Because many word formations follow rules, it is possible to improve vector representations for morphologically rich languages by using character level information. In this paper, we propose to learn representations for character n-grams, and to represent words as the sum of the n-gram vectors. Our main contribution is to introduce an extension of the continuous skip-gram model (Mikolov et al., 2013b), which takes into account subword information. We evaluate this model on nine languages exhibiting different morphologies, showing the benefit of our approach.
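The subword representation described in the abstract is easy to state concretely: pad the word with boundary markers, enumerate its character n-grams, and sum their vectors. The sketch below shows that composition (the n-gram vector table itself would be learned with the skipgram objective, and real implementations hash n-grams into buckets; `ngram_vecs` here is a plain dict for illustration):

```python
import numpy as np

def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word with boundary markers '<' and '>',
    e.g. 'where' with n=3 yields <wh, whe, her, ere, re>."""
    w = "<" + word + ">"
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(w[i:i + n] for i in range(len(w) - n + 1))
    return grams

def word_vector(word, ngram_vecs, dim=2):
    """A word's vector is the sum of its n-gram vectors. N-grams
    absent from the table are skipped, which is how out-of-vocabulary
    words still receive a (possibly partial) representation."""
    vecs = [ngram_vecs[g] for g in char_ngrams(word) if g in ngram_vecs]
    return np.sum(vecs, axis=0) if vecs else np.zeros(dim)
```

Because unseen words share n-grams with seen ones (e.g. an unseen inflected form shares its stem's n-grams), this composition is exactly what lets the model produce representations for words outside the training vocabulary.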



Transactions of the Association for Computational Linguistics, vol. 5, pp. 101–115, 2017. Action Editor: Mark Johnson. Submission batch: 10/2016; Revision batch: 4/2017; Published 4/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Nanyun Peng 1*, Hoifung Poon 2, Chris Quirk 2, Kristina Toutanova 3*, Wen-tau Yih 2
1 Center for Language and Speech Processing, Computer Science Department, Johns Hopkins University, Baltimore, MD, USA; 2 Microsoft Research, Redmond, WA, USA; 3 Google Research, Seattle, WA, USA
npeng1@jhu.edu, kristout@google.com, {hoifung,chrisq,scottyih}@microsoft.com
* This research was conducted when the authors were at Microsoft Research.

Abstract: Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential dependencies, such as sequential, syntactic, and discourse relations. A robust contextual representation is learned for the entities, which serves as input to the relation classifier. This simplifies handling of relations with arbitrary arity, and enables multi-task learning with related relations. We evaluate this framework in two important precision medicine settings, demonstrating its effectiveness with both conventional supervised learning and distant supervision. Cross-sentence extraction produced larger knowledge bases, and multi-task learning significantly improved extraction accuracy. A thorough analysis of various LSTM approaches yielded useful insight into the impact of linguistic analysis on extraction accuracy.

1 Introduction: Relation extraction has made great strides in newswire and Web domains. Recently, there has been increasing interest in applying relation extraction to high-value domains such as biomedicine. The advent of the $1000 human genome¹ heralds the dawn of precision medicine, but progress in personalized cancer treatment has been hindered by the arduous task of interpreting genomic data using prior knowledge. For example, given a tumor sequence, a molecular tumor board needs to determine which genes and mutations are important, and what drugs are available to treat them. Already the research literature has a wealth of relevant knowledge, and it is growing at an astonishing rate. PubMed², the online repository of biomedical articles, adds two new papers per minute, or one million each year. It is thus imperative to advance relation extraction for machine reading. In the vast literature on relation extraction, past work focused primarily on binary relations in single sentences, limiting the available information. Consider the following example: "The deletion mutation on exon-19 of EGFR gene was present in 16 patients, while the L858E point mutation on exon-21 was noted in 10. All patients were treated with gefitinib and showed a partial response." Collectively, the two sentences convey the fact that there is a ternary interaction between the three entities in bold, which is not expressed in either sentence alone. Namely, tumors with L858E mutation in EGFR gene can be treated with gefitinib. Extracting such knowledge clearly requires moving beyond binary relations and single sentences. N-ary relations and cross-sentence extraction have received relatively little attention in the past. Prior …
¹ http://www.illumina.com/systems/hiseq-x-sequencing-system.html
² https://www.ncbi.nlm.nih.gov/pubmed
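The two structural ideas in the abstract, contextualizing tokens over a document graph and concatenating the entity states to handle arbitrary arity, can be illustrated with a crude stand-in for the gated graph-LSTM updates: plain neighbor averaging over a graph whose edges mix sequential and dependency links. This is not the paper's model, only the shape of it:

```python
import numpy as np

def graph_states(token_embs, edges, steps=2):
    """Toy propagation over a document graph: each token's state is
    repeatedly averaged with its neighbors' states (a simplistic
    stand-in for gated graph-LSTM updates).
    token_embs: (n, d) initial token embeddings
    edges: undirected (i, j) pairs, e.g. adjacent-token and
           dependency/discourse links, possibly crossing sentences."""
    h = np.array(token_embs, dtype=float)
    nbrs = {i: [] for i in range(len(h))}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)
    for _ in range(steps):
        new_h = h.copy()
        for i, js in nbrs.items():
            if js:
                new_h[i] = 0.5 * h[i] + 0.5 * np.mean(h[js], axis=0)
        h = new_h
    return h

def relation_features(h, entity_positions):
    """Concatenate the contextualized states of the n entity mentions;
    the same classifier input works for any arity n."""
    return np.concatenate([h[p] for p in entity_positions])
```

Because `relation_features` just concatenates one state per entity, a ternary drug-gene-mutation instance and a binary instance feed the same kind of classifier, which is the arity-independence the paper highlights.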
