Documentation

What topic do you need documentation on?

Transactions of the Association for Computational Linguistics, vol. 4, pp. 47–60, 2016. Action Editor: David Chiang.

Submission batch: 11/2015; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Detecting Cross-Cultural Differences Using a Multilingual Topic Model
E. D. Gutiérrez (University of California, San Diego), Ekaterina Shutova (Computer Laboratory, University of Cambridge), Patricia Lichtenstein (University of California, Merced), Gerard de Melo (IIIS, Tsinghua University), Luca Gilardi (ICSI, Berkeley)
edg@icsi.berkeley.edu, es407@cam.ac.uk, tricia1@uchicago.edu, gdm@demelo.org, lucag@icsi.berkeley.edu

Abstract: Understanding cross-cultural differences has important implications for world affairs and many aspects of the life of society. Yet, the majority of text-mining methods to date focus on the analysis of monolingual texts. In contrast, we present a statistical model that simultaneously learns a set of common topics from multilingual, non-parallel data and automatically discovers the differences in perspectives on these topics across linguistic communities. We perform a behavioural evaluation of a subset of the differences identified by our model in English and Spanish to investigate their psychological validity.

1 Introduction

Recent years have seen a growing interest in text-mining applications aimed at uncovering public opinions and social trends (Fader et al., 2007; Monroe et al., 2008; Gerrish and Blei, 2011; Pennacchiotti and Popescu, 2011). They rest on the assumption that the language we use is indicative of our underlying worldviews. Research in cognitive and sociolinguistics suggests that linguistic variation across communities systematically reflects differences in their cultural and moral models and goes beyond lexicon and grammar (Kövecses, 2004; Lakoff and Wehling, 2012). Cross-cultural differences manifest themselves in text in a multitude of ways, most prominently through the use of explicit opinion vocabulary with respect to a certain topic (e.g. "policies that benefit the poor"), idiomatic and metaphorical language (e.g. "the company is spinning its wheels") and other types of figurative language, such as irony or sarcasm.

The connection between language, culture and reasoning remains one of the central research questions in psychology. Thibodeau and Boroditsky (2011) investigated how metaphors affect our decision-making. They presented two groups of human subjects with two different texts about crime. In the first text, crime was metaphorically portrayed as a virus and in the second as a beast. The two groups were then asked a set of questions on how to tackle crime in the city. As a result, while the first group tended to opt for preventive measures (e.g. stronger social policies), the second group converged on punishment- or restraint-oriented measures. According to Thibodeau and Boroditsky, their results demonstrate that metaphors have profound influence on how we conceptualize and act with respect to societal issues. This suggests that in order to gain a full understanding of social trends across populations, one needs to identify subtle but systematic linguistic differences that stem from the groups' cultural backgrounds, expressed both literally and figuratively. Performing such an analysis by hand is labor-intensive and often impractical, particularly in a multilingual setting where expertise in all of the languages of interest may be rare.

With the rise of blogging and social media, NLP techniques have been successfully used for a number of tasks in political science, including automatically estimating the influence of particular politicians in the US senate (Fader et al., 2007), identifying lexical features that differentiate political rhetoric of opposing parties (Monroe et al., 2008), predicting voting patterns of politicians based on their use of language (Gerrish and Blei, 2011), and predicting political affiliation of Twitter users (Pennacchiotti and Popescu, 2011). Fang et al. (2012) addressed …
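A minimal sketch of the general idea, not the paper's joint model: here two toy "communities" are mapped onto a shared vocabulary via a hypothetical bilingual lexicon, a single LDA model (gensim) is fit on the pooled documents, and per-community topic mixtures are compared; the paper instead learns shared topics directly from non-parallel data.

    from collections import Counter, defaultdict
    from gensim import corpora, models

    # Hypothetical toy corpora and lexicon.
    toy_lexicon = {"crimen": "crime", "castigo": "punishment"}
    docs_en = [["crime", "virus", "prevention"], ["crime", "punishment", "police"]]
    docs_es = [[toy_lexicon.get(w, w) for w in d]
               for d in [["crimen", "castigo"], ["crimen", "virus"]]]

    texts = docs_en + docs_es
    labels = ["en"] * len(docs_en) + ["es"] * len(docs_es)
    dictionary = corpora.Dictionary(texts)
    bows = [dictionary.doc2bow(t) for t in texts]
    lda = models.LdaModel(bows, num_topics=2, id2word=dictionary,
                          passes=20, random_state=0)

    # Average topic mixture per community; differences hint at diverging
    # perspectives on the shared topics.
    mix, counts = defaultdict(lambda: [0.0, 0.0]), Counter(labels)
    for bow, lab in zip(bows, labels):
        for k, p in lda.get_document_topics(bow, minimum_probability=0.0):
            mix[lab][k] += p / counts[lab]
    print(dict(mix))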

Read more »

Transactions of the Association for Computational Linguistics, vol. 4, pp. 31–45, 2016. Action Editor: Tim Baldwin.

Submission batch: 12/2015; Revision batch: 2/2016; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

A Bayesian Model of Diachronic Meaning Change
Lea Frermann and Mirella Lapata (Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB)
l.frermann@ed.ac.uk, mlap@inf.ed.ac.uk

Abstract: Word meanings change over time and an automated procedure for extracting this information from text would be useful for historical exploratory studies, information retrieval or question answering. We present a dynamic Bayesian model of diachronic meaning change, which infers temporal word representations as a set of senses and their prevalence. Unlike previous work, we explicitly model language change as a smooth, gradual process. We experimentally show that this modeling decision is beneficial: our model performs competitively on meaning change detection tasks whilst inducing discernible word senses and their development over time. Application of our model to the SemEval-2015 temporal classification benchmark datasets further reveals that it performs on par with highly optimized task-specific systems.

1 Introduction

Language is a dynamic system, constantly evolving and adapting to the needs of its users and their environment (Aitchison, 2001). Words in all languages naturally exhibit a range of senses whose distribution or prevalence varies according to the genre and register of the discourse as well as its historical context. As an example, consider the word cute which according to the Oxford English Dictionary (OED, Stevenson 2010) first appeared in the early 18th century and originally meant clever or keen-witted.1 By the late 19th century cute was used in the same sense as cunning. Today it mostly refers to objects or people perceived as attractive, pretty or sweet. Another example is the word mouse which initially was only used in the rodent sense. The OED dates the computer pointing device sense of mouse to 1965. The latter sense has become particularly dominant in recent decades due to the ever-increasing use of computer technology.

The arrival of large-scale collections of historic texts (Davies, 2010) and online libraries such as the Internet Archive and Google Books have greatly facilitated computational investigations of language change. The ability to automatically detect how the meaning of words evolves over time is potentially of significant value to lexicographic and linguistic research but also to real world applications. Time-specific knowledge would presumably render word meaning representations more accurate, and benefit several downstream tasks where semantic information is crucial. Examples include information retrieval and question answering, where time-related information could increase the precision of query disambiguation and document retrieval (e.g., by returning documents with newly created senses or filtering out documents with obsolete senses).

In this paper we present a dynamic Bayesian model of diachronic meaning change. Word meaning is modeled as a set of senses, which are tracked over a sequence of contiguous time intervals. We infer temporal meaning representations, consisting of a word's senses (as a probability distribution over words) and their relative prevalence. Our model is thus able to detect that mouse had one sense until the mid-20th century (characterized by words such as {cheese, tail, rat}) and subsequently acquired a …

1 Throughout this paper we denote words in true type, their senses in italics, and sense-specific context words as {lists}.
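A toy illustration of the smoothness assumption, not the paper's Bayesian inference: per-interval counts of two senses of "mouse" (hypothetical numbers) are normalized into prevalences, then smoothed across neighboring decades with a simple moving average, echoing the idea that meaning change is gradual.

    import numpy as np

    decades = ["1940s", "1950s", "1960s", "1970s", "1980s", "1990s"]
    counts = np.array([[30, 0], [28, 0], [25, 2], [20, 8], [15, 30], [10, 60]], float)

    raw = counts / counts.sum(axis=1, keepdims=True)

    # A moving-average kernel stands in for the model's smooth temporal prior.
    kernel = np.array([0.25, 0.5, 0.25])
    smooth = np.vstack([np.convolve(raw[:, k], kernel, mode="same")
                        for k in range(2)]).T
    smooth /= smooth.sum(axis=1, keepdims=True)   # renormalize per interval
    for d, (p0, p1) in zip(decades, smooth):
        print(f"{d}: rodent={p0:.2f} device={p1:.2f}")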

Read more »

Transactions of the Association for Computational Linguistics, vol. 4, pp. 17–30, 2016. Action Editor: Chris Callison-Burch.

Submission batch: 9/2015; revised 12/2015; revised 1/2016; Published 2/2016. © 2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Learning to Understand Phrases by Embedding the Dictionary
Felix Hill (Computer Laboratory, University of Cambridge, felix.hill@cl.cam.ac.uk), Kyunghyun Cho* (Courant Institute of Mathematical Sciences and Centre for Data Science, New York University, kyunghyun.cho@nyu.edu), Anna Korhonen (Department of Theoretical and Applied Linguistics, University of Cambridge, alk23@cam.ac.uk), Yoshua Bengio (CIFAR Senior Fellow, Université de Montréal, yoshua.bengio@umontreal.ca)

Abstract: Distributional models that learn rich semantic word representations are a success story of recent NLP research. However, developing models that learn useful representations of phrases and sentences has proved far harder. We propose using the definitions found in everyday dictionaries as a means of bridging this gap between lexical and phrasal semantics. Neural language embedding models can be effectively trained to map dictionary definitions (phrases) to (lexical) representations of the words defined by those definitions. We present two applications of these architectures: reverse dictionaries that return the name of a concept given a definition or description and general-knowledge crossword question answerers. On both tasks, neural language embedding models trained on definitions from a handful of freely-available lexical resources perform as well or better than existing commercial systems that rely on significant task-specific engineering. The results highlight the effectiveness of both neural embedding architectures and definition-based training for developing models that understand phrases and sentences.

1 Introduction

Much recent research in computational semantics has focussed on learning representations of arbitrary-length phrases and sentences. This task is challenging partly because there is no obvious gold standard of phrasal representation that could be used in training and evaluation. Consequently, it is difficult to design approaches that could learn from such a gold standard, and also hard to evaluate or compare different models.

In this work, we use dictionary definitions to address this issue. The composed meaning of the words in a dictionary definition (a tall, long-necked, spotted ruminant of Africa) should correspond to the meaning of the word they define (giraffe). This bridge between lexical and phrasal semantics is useful because high quality vector representations of single words can be used as a target when learning to combine the words into a coherent phrasal representation.

This approach still requires a model capable of learning to map between arbitrary-length phrases and fixed-length continuous-valued word vectors. For this purpose we experiment with two broad classes of neural language models (NLMs): Recurrent Neural Networks (RNNs), which naturally encode the order of input words, and simpler (feed-forward) bag-of-words (BOW) embedding models. Prior to training these NLMs, we learn target lexical representations by training the Word2Vec software (Mikolov et al., 2013) on billions of words of raw text.

We demonstrate the usefulness of our approach by building and releasing two applications. The first is a reverse dictionary or concept finder: a system that returns words based on user descriptions or definitions (Zock and Bilac, 2004). Reverse dictionaries are used by copywriters, novelists, translators and other professional writers to find words for notions or ideas that might be on the tip of their tongue.

* Work mainly done at the University of Montreal.
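A minimal reverse-dictionary sketch using the paper's simpler BOW route: the definition is embedded as the average of its word vectors and matched against word vectors by cosine similarity. The embedding table here is random stand-ins (with "giraffe" nudged toward its defining words so the toy demo resolves sensibly), not trained Word2Vec vectors.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["giraffe", "zebra", "clock", "ruminant", "africa", "spotted", "tall"]
    E = {w: rng.normal(size=50) for w in vocab}
    E["giraffe"] = E["tall"] + E["spotted"] + E["ruminant"] + E["africa"]

    def embed_definition(words):
        # BOW composition: average the vectors of the definition's words.
        return np.mean([E[w] for w in words if w in E], axis=0)

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    def reverse_dictionary(definition, k=3):
        q = embed_definition(definition.lower().split())
        return sorted(vocab, key=lambda w: -cos(q, E[w]))[:k]

    print(reverse_dictionary("tall spotted ruminant africa"))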

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 529–542, 2017. Action Editor: Diana McCarthy.

Submission batch: 7/2017; Published 12/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge
Ryan J. Gallagher (Information Sciences Institute, University of Southern California; Vermont Complex Systems Center, Computational Story Lab, University of Vermont), Kyle Reing, David Kale, and Greg Ver Steeg (Information Sciences Institute, University of Southern California)
ryan.gallagher@uvm.edu, {reing, kale, gregv}@isi.edu

Abstract: While generative models such as Latent Dirichlet Allocation (LDA) have proven fruitful in topic modeling, they often require detailed assumptions and careful specification of hyperparameters. Such model complexity issues only compound when trying to generalize generative models to incorporate human input. We introduce Correlation Explanation (CorEx), an alternative approach to topic modeling that does not assume an underlying generative model, and instead learns maximally informative topics through an information-theoretic framework. This framework naturally generalizes to hierarchical and semi-supervised extensions with no additional modeling assumptions. In particular, word-level domain knowledge can be flexibly incorporated within CorEx through anchor words, allowing topic separability and representation to be promoted with minimal human intervention. Across a variety of datasets, metrics, and experiments, we demonstrate that CorEx produces topics that are comparable in quality to those produced by unsupervised and semi-supervised variants of LDA.

1 Introduction

The majority of topic modeling approaches utilize probabilistic generative models, models which specify mechanisms for how documents are written in order to infer latent topics. These mechanisms may be explicitly stated, as in Latent Dirichlet Allocation (LDA) (Blei et al., 2003), or implicitly stated, as with matrix factorization techniques (Hofmann, 1999; Ding et al., 2008; Buntine and Jakulin, 2006). The core generative mechanisms of LDA, in particular, have inspired numerous generalizations that account for additional information, such as the authorship (Rosen-Zvi et al., 2004), document labels (McAuliffe and Blei, 2008), or hierarchical structure (Griffiths et al., 2004).

However, these generalizations come at the cost of increasingly elaborate and unwieldy generative assumptions. While these assumptions allow topic inference to be tractable in the face of additional metadata, they progressively constrain topics to a narrower view of what a topic can be. Such assumptions are undesirable in contexts where one wishes to minimize model complexity and learn topics without preexisting notions of how those topics originated.

For these reasons, we propose topic modeling by way of Correlation Explanation (CorEx),1 an information-theoretic approach to learning latent topics over documents. Unlike LDA, CorEx does not assume a particular data generating model, and instead searches for topics that are "maximally informative" about a set of documents. By learning informative topics rather than generated topics, we avoid specifying the structure and nature of topics ahead of time.

In addition, the lightweight framework underlying CorEx is versatile and naturally extends to hierarchical and semi-supervised variants with no additional modeling assumptions. More specifically, we …

1 Open source, documented code for the CorEx topic model available at https://github.com/gregversteeg/corex_topic.
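A usage sketch of anchored CorEx, assuming the interface documented in the repository linked in the footnote (pip install corextopic); the corpus and anchor words are toy examples, and the exact return shape of get_topics() may vary by version.

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from corextopic import corextopic as ct

    docs = ["the striker scored a goal", "the election swung the senate vote",
            "the goalkeeper saved the penalty", "senators debated the vote"]
    vec = CountVectorizer(binary=True)
    X = vec.fit_transform(docs)
    words = list(vec.get_feature_names_out())

    model = ct.Corex(n_hidden=2, seed=0)
    # Anchoring nudges each topic toward the given seed words.
    model.fit(X, words=words, anchors=[["goal"], ["vote"]], anchor_strength=3)
    for k, topic in enumerate(model.get_topics()):
        print(k, [w for w, *_ in topic])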

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 487–500, 2017. Action Editor: Chris Quirk.

Submission batch: 3/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Phrase Table Induction Using In-Domain Monolingual Data for Domain Adaptation in Statistical Machine Translation
Benjamin Marie and Atsushi Fujita (National Institute of Information and Communications Technology, 3-5 Hikaridai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan)
{bmarie, atsushi.fujita}@nict.go.jp

Abstract: We present a new framework to induce an in-domain phrase table from in-domain monolingual data that can be used to adapt a general-domain statistical machine translation system to the targeted domain. Our method first compiles sets of phrases in source and target languages separately and generates candidate phrase pairs by taking the Cartesian product of the two phrase sets. It then computes inexpensive features for each candidate phrase pair and filters them using a supervised classifier in order to induce an in-domain phrase table. We experimented on the language pair English–French, both translation directions, in two domains and obtained consistently better results than a strong baseline system that uses an in-domain bilingual lexicon. We also conducted an error analysis that showed the induced phrase tables proposed useful translations, especially for words and phrases unseen in the parallel data used to train the general-domain baseline system.

1 Introduction

In phrase-based statistical machine translation (SMT), translation models are estimated over a large amount of parallel data. In general, using more data leads to a better translation model. When no specific domain is targeted, general-domain1 parallel data from various domains may be used to train a general-purpose SMT system. However, it is well-known that, in training a system to translate texts from a specific domain, using in-domain parallel data can lead to a significantly better translation quality (Carpuat et al., 2012). Indeed, when only general-domain parallel data are used, it is unlikely that the translation model can learn expressions and their translations specific to the targeted domain. Such expressions will then remain untranslated in the in-domain texts to translate.

So far, in-domain parallel data have been harnessed to cover domain-specific expressions and their translations in the translation model. However, even if we can assume the availability of a large quantity of general-domain parallel data, at least for resource-rich language pairs, finding in-domain parallel data specific to a particular domain remains challenging. In-domain parallel data may not exist for the targeted language pairs or may not be available at hand to train a good translation model.

In order to circumvent the lack of in-domain parallel data, this paper presents a new method to adapt an existing SMT system to a specific domain by inducing an in-domain phrase table, i.e., a set of phrase pairs associated with features for decoding, from in-domain monolingual data. As we review in Section 2, most of the existing methods for inducing phrase tables are not designed, and may not perform as expected, to induce a phrase table for a specific domain for which only limited resources are available. Instead of relying on large quantity of parallel data or highly comparable corpora, our method induces an in-domain phrase table from unaligned in-domain monolingual data through a three-step pro…

1 As in Axelrod et al. (2011), in this paper, we use the term general-domain instead of the commonly used out-of-domain because we assume that the parallel data may contain some in-domain sentence pairs.
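A toy sketch of the generate-then-filter idea, not the paper's actual features or classifier: candidate pairs come from the Cartesian product of source and target phrase sets, a seed lexicon stands in for the inexpensive features, and a small logistic-regression classifier (trained on hypothetical labeled pairs) decides which pairs enter the induced table.

    from itertools import product
    from sklearn.linear_model import LogisticRegression

    src_phrases = ["maladie rare", "essai clinique"]
    tgt_phrases = ["rare disease", "clinical trial", "rare bird"]
    lexicon = {("maladie", "disease"), ("rare", "rare"),
               ("essai", "trial"), ("clinique", "clinical")}

    def features(src, tgt):
        s, t = src.split(), tgt.split()
        cov = sum((a, b) in lexicon for a in s for b in t)
        return [cov / max(len(s), len(t)), abs(len(s) - len(t))]

    # Tiny labeled set: 1 = good phrase pair, 0 = bad (hypothetical labels).
    train = [("maladie rare", "rare disease", 1), ("maladie rare", "clinical trial", 0),
             ("maladie rare", "rare bird", 0), ("essai clinique", "clinical trial", 1),
             ("essai clinique", "rare bird", 0)]
    clf = LogisticRegression().fit([features(s, t) for s, t, _ in train],
                                   [y for *_, y in train])

    table = [(s, t) for s, t in product(src_phrases, tgt_phrases)
             if clf.predict([features(s, t)])[0] == 1]
    print(table)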

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 441–454, 2017. Action Editor: Marco Kuhlmann.

Submission batch: 4/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Parsing with Traces: An O(n^4) Algorithm and a Structural Representation
Jonathan K. Kummerfeld and Dan Klein (Computer Science Division, University of California, Berkeley, Berkeley, CA 94720, USA)
{jkk, klein}@cs.berkeley.edu

Abstract: General treebank analyses are graph structured, but parsers are typically restricted to tree structures for efficiency and modeling reasons. We propose a new representation and algorithm for a class of graph structures that is flexible enough to cover almost all treebank structures, while still admitting efficient learning and inference. In particular, we consider directed, acyclic, one-endpoint-crossing graph structures, which cover most long-distance dislocation, shared argumentation, and similar tree-violating linguistic phenomena. We describe how to convert phrase structure parses, including traces, to our new representation in a reversible manner. Our dynamic program uniquely decomposes structures, is sound and complete, and covers 97.3% of the Penn English Treebank. We also implement a proof-of-concept parser that recovers a range of null elements and trace types.

1 Introduction

Many syntactic representations use graphs and/or discontinuous structures, such as traces in Government and Binding theory and f-structure in Lexical Functional Grammar (Chomsky 1981; Kaplan and Bresnan 1982). Sentences in the Penn Treebank (PTB, Marcus et al. 1993) have a core projective tree structure and trace edges that represent control structures, wh-movement and more. However, most parsers and the standard evaluation metric ignore these edges and all null elements. By leaving out parts of the structure, they fail to provide key relations to downstream tasks such as question answering. While there has been work on capturing some parts of this extra structure, it has generally either been through post-processing on trees (Johnson 2002; Jijkoun 2003; Campbell 2004; Levy and Manning 2004; Gabbard et al. 2006) or has only captured a limited set of phenomena via grammar augmentation (Collins 1997; Dienes and Dubey 2003; Schmid 2006; Cai et al. 2011).

We propose a new general-purpose parsing algorithm that can efficiently search over a wide range of syntactic phenomena. Our algorithm extends a non-projective tree parsing algorithm (Pitler et al. 2013; Pitler 2014) to graph structures, with improvements to avoid derivational ambiguity while maintaining an O(n^4) runtime. Our algorithm also includes an optional extension to ensure parses contain a directed projective tree of non-trace edges.

Our algorithm cannot apply directly to constituency parses – it requires lexicalized structures similar to dependency parses. We extend and improve previous work on lexicalized constituent representations (Shen et al. 2007; Carreras et al. 2008; Hayashi and Nagata 2016) to handle traces. In this form, traces can create problematic structures such as directed cycles, but we show how careful choice of head rules can minimize such issues.

We implement a proof-of-concept parser, scoring 88.1 on trees in section 23 and 70.6 on traces. Together, our representation and algorithm cover 97.3% of sentences, far above the coverage of projective tree parsers (43.9%).

2 Background

This work builds on two areas: non-projective tree parsing, and parsing with null elements. Non-projectivity is important in syntax for rep…
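A small checker for the structural class the paper's dynamic program exploits: a graph is one-endpoint-crossing when, for every edge, all edges crossing it share a single common endpoint. Edges are (i, j) word-index pairs; the example graphs are hypothetical, not from the paper.

    def crosses(e, f):
        (a, b), (c, d) = sorted(e), sorted(f)
        return a < c < b < d or c < a < d < b

    def one_endpoint_crossing(edges):
        for e in edges:
            crossing = [f for f in edges if crosses(e, f)]
            if not crossing:
                continue
            # Some single vertex must lie on every edge crossing e.
            common = set(crossing[0])
            for f in crossing[1:]:
                common &= set(f)
            if not common:
                return False
        return True

    print(one_endpoint_crossing([(0, 3), (2, 5), (2, 4)]))   # True: 2 is shared
    print(one_endpoint_crossing([(0, 3), (2, 5), (1, 4)]))   # False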

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 413–424, 2017. Action Editor: Brian Roark.

Submission batch: 6/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

In-Order Transition-based Constituent Parsing
Jiangming Liu and Yue Zhang (Singapore University of Technology and Design, 8 Somapah Road, Singapore, 487372)
jmliunlp@gmail.com, yuezhang@sutd.edu.sg

Abstract: Both bottom-up and top-down strategies have been used for neural transition-based constituent parsing. The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree, where bottom-up strategies and top-down strategies take post-order and pre-order traversal over trees, respectively. Bottom-up parsers benefit from rich features from readily built partial parses, but lack lookahead guidance in the parsing process; top-down parsers benefit from non-local guidance for local decisions, but rely on a strong encoder over the input to predict a constituent hierarchy before its construction. To mitigate both issues, we propose a novel parsing system based on in-order traversal over syntactic trees, designing a set of transition actions to find a compromise between bottom-up constituent information and top-down lookahead information. Based on stack-LSTM, our psycholinguistically motivated constituent parsing system achieves 91.8 F1 on the WSJ benchmark. Furthermore, the system achieves 93.6 F1 with supervised reranking and 94.2 F1 with semi-supervised reranking, which are the best results on the WSJ benchmark.

1 Introduction

Transition-based constituent parsing employs sequences of local transition actions to construct constituent trees over sentences. There are two popular transition-based constituent parsing systems, namely bottom-up parsing (Sagae and Lavie, 2005; Zhang and Clark, 2009; Zhu et al., 2013; Watanabe and Sumita, 2015) and top-down parsing (Dyer et al., 2016; Kuncoro et al., 2017). The parsing strategies differ in terms of the order in which they recognize productions in the derivation tree.

The process of bottom-up parsing can be regarded as post-order traversal over a constituent tree. For example, given the sentence in Figure 1, a bottom-up shift-reduce parser takes the action sequence in Table 2(a)1 to build the output, where the word sequence "The little boy" is first read, and then an NP recognized for the word sequence. After the system reads the verb "likes" and its subsequent NP, a VP is recognized. The full order of recognition for the tree nodes is 3→4→5→2→7→9→10→8→6→11→1. When making local decisions, rich information is available from readily built partial trees (Zhu et al., 2013; Watanabe and Sumita, 2015; Cross and Huang, 2016), which contributes to local disambiguation. However, there is a lack of top-down guidance from lookahead information, which can be useful (Johnson, 1998; Roark and Johnson, 1999; Charniak, 2000; Liu and Zhang, 2017). In addition, binarization must be applied to trees, as shown in Figure 1(b), to ensure a constant number of actions (Sagae and Lavie, 2005), and to take advantage of lexical head information (Collins, 2003). However, such binarization requires a set of language-specific rules, which hampers adaptation of parsing to other languages.

On the other hand, the process of top-down parsing can be regarded as pre-order traversal over a tree. Given the sentence in Figure 1, a top-down …

1 The action sequence is taken on unbinarized trees.
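A sketch of an in-order oracle in the spirit of the paper's transition system: visit the first child (bottom-up evidence), then project the parent label (top-down guidance), then the remaining children, then reduce. Trees are (label, children...) tuples; the action names follow the paper's shift / PJ(X) / reduce scheme, and the example tree is illustrative.

    def in_order_actions(node, actions=None):
        actions = [] if actions is None else actions
        if isinstance(node, str):          # a word: read it from the buffer
            actions.append("SHIFT")
            return actions
        label, first, *rest = node
        in_order_actions(first, actions)   # first child before the projection
        actions.append(f"PJ({label})")     # then project the constituent label
        for child in rest:
            in_order_actions(child, actions)
        actions.append("REDUCE")
        return actions

    tree = ("S", ("NP", "The", "little", "boy"),
                 ("VP", "likes", ("NP", "red", "tomatoes")), ".")
    print(in_order_actions(tree))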

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 379–395, 2017. Action Editor: Mark Steedman.

Submission batch: 12/2016; Revision batch: 3/2017; Published 11/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Ordinal Common-sense Inference
Sheng Zhang (Johns Hopkins University, zsheng2@jhu.edu), Rachel Rudinger (Johns Hopkins University, rudinger@jhu.edu), Kevin Duh (Johns Hopkins University, kevinduh@cs.jhu.edu), Benjamin Van Durme (Johns Hopkins University, vandurme@cs.jhu.edu)

Abstract: Humans have the capacity to draw common-sense inferences from natural language: various things that are likely but not certain to hold based on established discourse, and are rarely stated explicitly. We propose an evaluation of automated common-sense inference based on an extension of recognizing textual entailment: predicting ordinal human responses on the subjective likelihood of an inference holding in a given context. We describe a framework for extracting common-sense knowledge from corpora, which is then used to construct a dataset for this ordinal entailment task. We train a neural sequence-to-sequence model on this dataset, which we use to score and generate possible inferences. Further, we annotate subsets of previously established datasets via our ordinal annotation protocol in order to then analyze the distinctions between these and what we have constructed.

1 Introduction

"We use words to talk about the world. Therefore, to understand what words mean, we must have a prior explication of how we view the world." – Hobbs (1987)

Researchers in Artificial Intelligence and (Computational) Linguistics have long-cited the requirement of common-sense knowledge in language understanding.1 This knowledge is viewed as a key component in filling in the gaps between the telegraphic style of natural language statements. We are able to convey considerable information in a relatively sparse channel, presumably owing to a partially shared model at the start of any discourse.2 Common-sense inference – inferences based on common-sense knowledge – is possibilistic: things everyone more or less would expect to hold in a given context, but without the necessary strength of logical entailment.3 Because natural language corpora exhibit human reporting bias (Gordon and Van Durme, 2013), systems that derive knowledge exclusively from such corpora may be more accurately considered models of language, rather than of the …

Figure 1 (examples of common-sense inference ranging from very likely, likely, plausible, technically possible, to impossible): Sam bought a new clock; The clock runs. / Dave found an axe in his garage; A car is parked in the garage. / Tom was accidentally shot by his teammate in the army; The teammate dies. / Two friends were in a heated game of checkers; A person shoots the checkers. / My friends and I decided to go swimming in the ocean; The ocean is carbonated.

1 Schank (1975): It has been apparent within natural language understanding that the eventual limit to our solution would be our ability to characterize world knowledge.
2 McCarthy (1959): a program has common sense if it automatically deduces for itself a sufficiently wide class of immediate consequences of anything it is told and what it already knows.
3 Many of the bridging inferences of Clark (1975) make use of common-sense knowledge, such as the following example of "Probable part": I walked into the room. The windows looked out to the bay. To resolve the definite reference the windows, one needs to know that rooms have windows is probable.
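A toy illustration of scoring against the 5-point ordinal scale from Figure 1: labels map to integers, and predictions are judged by how far they land from the human label rather than by exact match only. The gold/predicted labels below are hypothetical.

    SCALE = ["impossible", "technically possible", "plausible", "likely", "very likely"]
    RANK = {label: i for i, label in enumerate(SCALE)}

    def ordinal_scores(gold, pred):
        exact = sum(g == p for g, p in zip(gold, pred)) / len(gold)
        mae = sum(abs(RANK[g] - RANK[p]) for g, p in zip(gold, pred)) / len(gold)
        return exact, mae

    gold = ["very likely", "plausible", "impossible"]
    pred = ["likely", "plausible", "technically possible"]
    print(ordinal_scores(gold, pred))   # (0.33..., 0.66...): near misses count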

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 365–378, 2017. Action Editor: Adam Lopez.

Submission batch: 11/2016; Revision batch: 2/2017; Published 10/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Fully Character-Level Neural Machine Translation without Explicit Segmentation
Jason Lee* (ETH Zürich, jasonlee@inf.ethz.ch), Kyunghyun Cho (New York University, kyunghyun.cho@nyu.edu), Thomas Hofmann (ETH Zürich, thomas.hofmann@inf.ethz.ch)

Abstract: Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline with a subword-level encoder on WMT'15 DE-EN and CS-EN, and gives comparable performance on FI-EN and RU-EN. We then demonstrate that it is possible to share a single character-level encoder across multiple languages by training a model on a many-to-one translation task. In this multilingual setting, the character-level encoder significantly outperforms the subword-level encoder on all the language pairs. We observe that on CS-EN, FI-EN and RU-EN, the quality of the multilingual character-level translation even surpasses the models specifically trained on that language pair alone, both in terms of the BLEU score and human judgment.

1 Introduction

Nearly all previous work in machine translation has been at the level of words. Aside from our intuitive understanding of word as a basic unit of meaning (Jackendoff, 1992), one reason behind this is that sequences are significantly longer when represented in characters, compounding the problem of data sparsity and modeling long-range dependencies. This has driven NMT research to be almost exclusively word-level (Bahdanau et al., 2015; Sutskever et al., 2014).

Despite their remarkable success, word-level NMT models suffer from several major weaknesses. For one, they are unable to model rare, out-of-vocabulary words, making them limited in translating languages with rich morphology such as Czech, Finnish and Turkish. If one uses a large vocabulary to combat this (Jean et al., 2015), the complexity of training and decoding grows linearly with respect to the target vocabulary size, leading to a vicious cycle.

To address this, we present a fully character-level NMT model that maps a character sequence in a source language to a character sequence in a target language. We show that our model outperforms a baseline with a subword-level encoder on DE-EN and CS-EN, and achieves a comparable result on FI-EN and RU-EN. A purely character-level NMT model with a basic encoder was proposed as a baseline by Luong and Manning (2016), but training it was prohibitively slow. We were able to train our model at a reasonable speed by drastically reducing the length of source sentence representation using a stack of convolutional, pooling and highway layers.

One advantage of character-level models is that they are better suited for multilingual translation than their word-level counterparts which require a separate word vocabulary for each language. We …

* The majority of this work was completed while the author was visiting New York University.
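A minimal PyTorch sketch of the encoder's length-reduction idea: character embeddings, a 1-D convolution, max-pooling over time to shrink the sequence, and a highway layer. All sizes are illustrative, not the paper's settings.

    import torch
    import torch.nn as nn

    class CharEncoder(nn.Module):
        def __init__(self, n_chars=100, emb=32, channels=64, pool=5):
            super().__init__()
            self.embed = nn.Embedding(n_chars, emb)
            self.conv = nn.Conv1d(emb, channels, kernel_size=3, padding=1)
            self.pool = nn.MaxPool1d(pool)            # shrinks the sequence 5x
            self.gate = nn.Linear(channels, channels)
            self.transform = nn.Linear(channels, channels)

        def forward(self, char_ids):                  # (batch, seq_len)
            x = self.embed(char_ids).transpose(1, 2)  # (batch, emb, seq_len)
            x = self.pool(torch.relu(self.conv(x))).transpose(1, 2)
            g = torch.sigmoid(self.gate(x))           # highway: gated mix
            return g * torch.relu(self.transform(x)) + (1 - g) * x

    enc = CharEncoder()
    out = enc(torch.randint(0, 100, (2, 50)))         # 50 chars -> 10 positions
    print(out.shape)                                  # torch.Size([2, 10, 64])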

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 353–364, 2017. Action Editor: Eric Fosler-Lussier.

Submission batch: 10/2016; Revision batch: 12/2016; Published 10/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Unsupervised Learning of Morphological Forests
Jiaming Luo (CSAIL, MIT, j_luo@mit.edu), Karthik Narasimhan (CSAIL, MIT, karthikn@mit.edu), Regina Barzilay (CSAIL, MIT, regina@csail.mit.edu)

Abstract: This paper focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edge-wise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of tight morphological families. The resulting objective is solved using Integer Linear Programming (ILP) paired with contrastive estimation. We train the model by alternating between optimizing the local log-linear model and the global ILP objective. We evaluate our system on three tasks: root detection, clustering of morphological families, and segmentation. Our experiments demonstrate that our model yields consistent gains in all three tasks compared with the best published results.1

1 Introduction

The morphological study of a language inherently draws upon the existence of families of related words. All words within a family can be derived from a common root via a series of transformations, whether inflectional or derivational. Figure 1 depicts one such family, originating from the word faith. This representation can benefit a range of applications, including segmentation, root detection and clustering of morphological families.

[Figure 1: An illustration of a single tree in a morphological forest. pre and suf represent prefixation and suffixation. Each edge has an associated probability for the morphological change.]

Using graph terminology, a full morphological assignment of the words in a language can be represented as a forest.2 Valid forests of morphological families exhibit a number of well-known regularities. At the global level, the number of roots is limited, and only constitute a small fraction of the vocabulary. A similar constraint applies to the number of possible affixes, shared across families. At the local edge level, we prefer derivations that follow regular orthographic patterns and preserve semantic relatedness. We hypothesize that enforcing these constraints as part of the forest induction pro…

1 Code is available at https://github.com/j-luo93/MorphForest.
2 The correct mathematical term for the structure in Figure 1 is a directed 1-forest or functional graph. For simplicity, we shall use the terms forest and tree to refer to a directed 1-forest or a directed 1-tree because of the cycle at the root.
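A toy sketch of the edge-candidate step only: propose single-step parents for each word by stripping affixes from a small hypothetical affix inventory, keeping candidates that exist in the vocabulary. The paper scores such edges with a log-linear model and selects a forest with ILP; here we just enumerate them.

    PREFIXES = ["un", "re"]
    SUFFIXES = ["ful", "less", "ness", "s"]

    def candidate_parents(word, vocab):
        cands = []
        for p in PREFIXES:
            if word.startswith(p) and word[len(p):] in vocab:
                cands.append((word[len(p):], f"pre:{p}"))
        for s in SUFFIXES:
            if word.endswith(s) and word[:-len(s)] in vocab:
                cands.append((word[:-len(s)], f"suf:{s}"))
        return cands

    vocab = {"faith", "faithful", "unfaithful", "faithfulness", "faiths"}
    for w in sorted(vocab):
        print(w, "->", candidate_parents(w, vocab))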

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 309–324, 2017. Action Editor: Sebastian Pad´o.

Submission batch: 4/2017; Revision batch: 7/2017; Published 9/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints
Nikola Mrkšić (University of Cambridge; Apple Inc.), Ivan Vulić (University of Cambridge), Diarmuid Ó Séaghdha (Apple Inc.), Ira Leviant (Technion, IIT), Roi Reichart (Technion, IIT), Milica Gašić (University of Cambridge), Anna Korhonen (University of Cambridge), Steve Young (University of Cambridge; Apple Inc.)

Abstract: We present ATTRACT-REPEL, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources. ATTRACT-REPEL facilitates the use of constraints from mono- and cross-lingual resources, yielding semantically specialized cross-lingual vector spaces. Our evaluation shows that the method can make use of existing cross-lingual lexicons to construct high-quality vector spaces for a plethora of different languages, facilitating semantic transfer from high- to lower-resource ones. The effectiveness of our approach is demonstrated with state-of-the-art results on semantic similarity datasets in six languages. We next show that ATTRACT-REPEL-specialized vectors boost performance in the downstream task of dialogue state tracking (DST) across multiple languages. Finally, we show that cross-lingual vector spaces produced by our algorithm facilitate the training of multilingual DST models, which brings further performance improvements.

1 Introduction

Word representation learning has become a research area of central importance in modern natural language processing. The common techniques for inducing distributed word representations are grounded in the distributional hypothesis, relying on co-occurrence information in large textual corpora to learn meaningful word representations (Mikolov et al., 2013b; Pennington et al., 2014; Ó Séaghdha and Korhonen, 2014; Levy and Goldberg, 2014). Recently, methods that go beyond stand-alone unsupervised learning have gained increased popularity. These models typically build on distributional ones by using human- or automatically-constructed knowledge bases to enrich the semantic content of existing word vector collections. Often this is done as a post-processing step, where the distributional word vectors are refined to satisfy constraints extracted from a lexical resource such as WordNet (Faruqui et al., 2015; Wieting et al., 2015; Mrkšić et al., 2016). We term this approach semantic specialization.

In this paper we advance the semantic specialization paradigm in a number of ways. We introduce a new algorithm, ATTRACT-REPEL, that uses synonymy and antonymy constraints drawn from lexical resources to tune word vector spaces using linguistic information that is difficult to capture with conventional distributional training. Our evaluation shows that ATTRACT-REPEL outperforms previous methods which make use of similar lexical resources, achieving state-of-the-art results on two word similarity datasets: SimLex-999 (Hill et al., 2015) and SimVerb-3500 (Gerz et al., 2016).

We then deploy the ATTRACT-REPEL algorithm in a multilingual setting, using semantic relations extracted from BabelNet (Navigli and Ponzetto, 2012; Ehrmann et al., 2014), a cross-lingual lexical resource, to inject constraints between words of different languages into the word representations. This allows us to embed vector spaces of multiple languages into a single vector space, exploiting information from high-resource languages to improve the word representations of lower-resource ones. Table 1 illustrates the effects of cross-lingual ATTRACT-REPEL specialization by showing the nearest neighbors for three English words across three cross-lingual spaces.
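A simplified sketch of the attract/repel intuition, not the authors' margin-based mini-batch procedure: synonym pairs are pulled together, antonym pairs pushed apart, with a pull back toward the original vectors to preserve distributional content. Vectors and constraint pairs are toy examples.

    import numpy as np

    rng = np.random.default_rng(1)
    words = ["cheap", "inexpensive", "expensive"]
    V = {w: rng.normal(size=10) for w in words}
    V0 = {w: v.copy() for w, v in V.items()}

    synonyms = [("cheap", "inexpensive")]
    antonyms = [("cheap", "expensive")]

    lr, reg = 0.1, 0.05
    for _ in range(50):
        for a, b in synonyms:            # attract: reduce the distance
            d = V[a] - V[b]
            V[a] -= lr * d
            V[b] += lr * d
        for a, b in antonyms:            # repel: increase the distance
            d = V[a] - V[b]
            V[a] += lr * 0.5 * d
            V[b] -= lr * 0.5 * d
        for w in words:                  # stay close to the original space
            V[w] -= lr * reg * (V[w] - V0[w])

    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    print(cos(V["cheap"], V["inexpensive"]), cos(V["cheap"], V["expensive"]))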

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 295–307, 2017. Action Editor: Christopher Potts.

Submission batch: 10/2016; Revision batch: 12/2016; Published 8/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Overcoming Language Variation in Sentiment Analysis with Social Attention
Yi Yang and Jacob Eisenstein (School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA 30308)
{yiyang+jacobe}@gatech.edu

Abstract: Variation in language is ubiquitous, particularly in newer forms of writing such as social media. Fortunately, variation is not random; it is often linked to social properties of the author. In this paper, we show how to exploit social networks to make sentiment analysis more robust to social language variation. The key idea is linguistic homophily: the tendency of socially linked individuals to use language in similar ways. We formalize this idea in a novel attention-based neural network architecture, in which attention is divided among several basis models, depending on the author's position in the social network. This has the effect of smoothing the classification function across the social network, and makes it possible to induce personalized classifiers even for authors for whom there is no labeled data or demographic metadata. This model significantly improves the accuracies of sentiment analysis on Twitter and on review data.

1 Introduction

Words can mean different things to different people. Fortunately, these differences are rarely idiosyncratic, but are often linked to social factors, such as age (Rosenthal and McKeown, 2011), gender (Eckert and McConnell-Ginet, 2003), race (Green, 2002), geography (Trudgill, 1974), and more ineffable characteristics such as political and cultural attitudes (Fischer, 1958; Labov, 1963). In natural language processing (NLP), social media data has brought variation to the fore, spurring the development of new computational techniques for characterizing variation in the lexicon (Eisenstein et al., 2010), orthography (Eisenstein, 2015), and syntax (Blodgett et al., 2016). However, aside from the focused task of spelling normalization (Sproat et al., 2001; Aw et al., 2006), there have been few attempts to make NLP systems more robust to language variation across speakers or writers.

One exception is the work of Hovy (2015), who shows that the accuracies of sentiment analysis and topic classification can be improved by the inclusion of coarse-grained author demographics such as age and gender. However, such demographic information is not directly available in most datasets, and it is not yet clear whether predicted age and gender offer any improvements. On the other end of the spectrum are attempts to create personalized language technologies, as are often employed in information retrieval (Shen et al., 2005), recommender systems (Basilico and Hofmann, 2004), and language modeling (Federico, 1996). But personalization requires annotated data for each individual user – something that may be possible in interactive settings such as information retrieval, but is not typically feasible in natural language processing.

We propose a middle ground between group-level demographic characteristics and personalization, by exploiting social network structure. The sociological theory of homophily asserts that individuals are usually similar to their friends (McPherson et al., 2001). This property has been demonstrated for language (Bryden et al., 2013) as well as for the demographic properties targeted by Hovy (2015), which are more likely to be shared by friends than by random pairs of individuals (Thelwall, 2009). Social …
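A small numpy sketch of the social-attention idea: an author embedding (in the paper, derived from the social network; random here) produces attention weights over several basis sentiment models, and the prediction is the attention-weighted mixture. All sizes and numbers are toy values.

    import numpy as np

    rng = np.random.default_rng(0)
    n_basis, n_feats, dim = 3, 5, 4

    W_basis = rng.normal(size=(n_basis, n_feats))   # one linear model per basis
    keys = rng.normal(size=(n_basis, dim))          # one key per basis model
    author = rng.normal(size=dim)                   # author's network embedding

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    x = rng.normal(size=n_feats)                    # document features
    attn = softmax(keys @ author)                   # author-specific weights
    score = attn @ (W_basis @ x)                    # smoothed sentiment score
    print(attn, float(score))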

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 279–293, 2017. Action Editor: Yuji Matsumoto.

Submission batch: 5/2016; Revision batch: 10/2016; 2/2017; Published 8/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Cross-Lingual Syntactic Transfer with Limited Resources
Mohammad Sadegh Rasooli and Michael Collins* (Department of Computer Science, Columbia University, New York, NY 10027, USA)
{rasooli, mcollins}@cs.columbia.edu

Abstract: We describe a simple but effective method for cross-lingual syntactic transfer of dependency parsers, in the scenario where a large amount of translation data is not available. This method makes use of three steps: 1) a method for deriving cross-lingual word clusters, which can then be used in a multilingual parser; 2) a method for transferring lexical information from a target language to source language treebanks; 3) a method for integrating these steps with the density-driven annotation projection method of Rasooli and Collins (2015). Experiments show improvements over the state-of-the-art in several languages used in previous work, in a setting where the only source of translation data is the Bible, a considerably smaller corpus than the Europarl corpus used in previous work. Results using the Europarl corpus as a source of translation data show additional improvements over the results of Rasooli and Collins (2015). We conclude with results on 38 datasets from the Universal Dependencies corpora.

1 Introduction

Creating manually-annotated syntactic treebanks is an expensive and time consuming task. Recently there has been a great deal of interest in cross-lingual syntactic transfer, where a parsing model is trained for some language of interest, using only treebanks in other languages. There is a clear motivation for this in building parsing models for languages for which treebank data is unavailable. Methods for syntactic transfer include annotation projection methods (Hwa et al., 2005; Ganchev et al., 2009; McDonald et al., 2011; Ma and Xia, 2014; Rasooli and Collins, 2015; Lacroix et al., 2016; Agić et al., 2016), learning of delexicalized models on universal treebanks (Zeman and Resnik, 2008; McDonald et al., 2011; Täckström et al., 2013; Rosa and Zabokrtsky, 2015), treebank translation (Tiedemann et al., 2014; Tiedemann, 2015; Tiedemann and Agić, 2016) and methods that leverage cross-lingual representations of word clusters, embeddings or dictionaries (Täckström et al., 2012; Durrett et al., 2012; Duong et al., 2015a; Zhang and Barzilay, 2015; Xiao and Guo, 2015; Guo et al., 2015; Guo et al., 2016; Ammar et al., 2016a).

This paper considers the problem of cross-lingual syntactic transfer with limited resources of monolingual and translation data. Specifically, we use the Bible corpus of Christodouloupoulos and Steedman (2014) as a source of translation data, and Wikipedia as a source of monolingual data. We deliberately limit ourselves to the use of Bible translation data because it is available for a very broad set of languages: the data from Christodouloupoulos and Steedman (2014) includes data from 100 languages. The Bible data contains a much smaller set of sentences (around 24,000) than other translation corpora, for example Europarl (Koehn, 2005), which has around 2 million sentences per language pair. This makes it a considerably more challenging corpus to work with. Similarly, our choice of Wikipedia as the source of monolingual data is motivated by the availability of Wikipedia data in a very broad set of languages.

* On leave at Google Inc. New York.
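A toy sketch in the spirit of step 1 (cross-lingual word clusters), not the paper's actual procedure: source-language cluster IDs are projected onto target words through word alignments, with each target word taking the cluster its aligned source words vote for. The clusters and alignment pairs below are hypothetical.

    from collections import Counter, defaultdict

    src_cluster = {"dog": 3, "cat": 3, "runs": 7, "sleeps": 7}
    # (source_word, target_word) pairs from automatically aligned verses
    alignments = [("dog", "hund"), ("cat", "katze"), ("runs", "läuft"),
                  ("dog", "hund"), ("sleeps", "läuft")]

    votes = defaultdict(Counter)
    for s, t in alignments:
        if s in src_cluster:
            votes[t][src_cluster[s]] += 1

    tgt_cluster = {t: c.most_common(1)[0][0] for t, c in votes.items()}
    print(tgt_cluster)   # e.g. {'hund': 3, 'katze': 3, 'läuft': 7}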

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 247–261, 2017. Action Editor: Hinrich Sch¨utze.

Submission batch: 12/2015; Revision batch: 5/2016; 11/2016; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Gábor Berend (Department of Informatics, University of Szeged, 2 Árpád tér, 6720 Szeged, Hungary)
berendg@inf.u-szeged.hu

Abstract: In this paper we propose and carefully evaluate a sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition for a variety of languages. Our model relies only on a few thousand sparse coding-derived features, without applying any modification of the word representations employed for the different tasks. The proposed model has favorable generalization properties as it retains over 89.8% of its average POS tagging accuracy when trained at 1.2% of the total available training data, i.e. 150 sentences per language.

1 Introduction

Determining the linguistic structure of natural language texts based on rich hand-crafted features has a long-going history in natural language processing. The focus of traditional approaches has mostly been on building linguistic analyzers for a particular kind of analysis, which often leads to the incorporation of extensive linguistic and/or domain knowledge for defining the feature space. Consequently, traditional models easily become language and/or task specific resulting in improper generalization properties.

A new research direction has emerged recently, that aims at building more general models that require far less feature engineering or none at all. These advancements in natural language processing, pioneered by Bengio et al. (2003), followed by Collobert and Weston (2008), Collobert et al. (2011), Mikolov et al. (2013a) among others, employ a different philosophy. The objective of these works is to find representations for linguistic phenomena in an unsupervised manner by relying on large amounts of text. Natural language phenomena are extremely sparse by their nature, whereas continuous word embeddings employ dense representations of words. In our paper we empirically verify via rigorous experiments that turning these dense representations into a much sparser (yet denser than one-hot encoding) form can keep the most salient parts of word representations that are highly suitable for sequence models.

Furthermore, our experiments reveal that our proposed model performs substantially better than traditional feature-rich models in the absence of abundant training data. Our proposed model also has the advantage of performing well on multiple sequence labeling tasks without any modification in the applied word representations thanks to the sparse features derived from continuous word representations.

Our work aims at introducing a novel sequence labeling model solely utilizing features derived from the sparse coding of continuous word embeddings. Even though sparse coding had previously been utilized in NLP prior to us (Faruqui et al., 2015; Chen et al., 2016), to the best of our knowledge, we are the first to propose a sequence labeling framework incorporating it with the following contributions:

• We show that the proposed sparse representation is general as sequence labeling models trained on them achieve (near) state-of-the-art performances for both POS tagging and NER.
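A minimal sketch of the feature derivation, assuming scikit-learn's dictionary-learning implementation rather than the author's exact setup: learn a dictionary over dense word vectors, sparse-code each word, and use the indices of nonzero coefficients as indicator features. The vectors are random stand-ins for real embeddings.

    import numpy as np
    from sklearn.decomposition import DictionaryLearning

    rng = np.random.default_rng(0)
    E = rng.normal(size=(200, 50))            # 200 word vectors, 50 dims

    dl = DictionaryLearning(n_components=32, transform_algorithm="lasso_lars",
                            transform_alpha=0.5, random_state=0)
    codes = dl.fit(E).transform(E)            # sparse coefficient matrix

    def indicator_features(word_idx):
        # e.g. ["F7", "F19"] for a word whose code is nonzero at 7 and 19
        return [f"F{j}" for j in np.flatnonzero(codes[word_idx])]

    print(indicator_features(0), f"({np.mean(codes != 0):.1%} nonzero overall)")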

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 233–246, 2017. Action Editor: Patrick Pantel.

Submission batch: 11/2016; Revision batch: 2/2017; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Domain-Targeted, High Precision Knowledge Extraction
Bhavana Dalvi Mishra, Niket Tandon, Peter Clark (Allen Institute for Artificial Intelligence, 2157 N Northlake Way Suite 110, Seattle, WA 98103)
{bhavanad, nikett, peterc}@allenai.org

Abstract: Our goal is to construct a domain-targeted, high precision knowledge base (KB), containing general (subject, predicate, object) statements about the world, in support of a downstream question-answering (QA) application. Despite recent advances in information extraction (IE) techniques, no suitable resource for our task already exists; existing resources are either too noisy, too named-entity centric, or too incomplete, and typically have not been constructed with a clear scope or purpose. To address these, we have created a domain-targeted, high precision knowledge extraction pipeline, leveraging Open IE, crowdsourcing, and a novel canonical schema learning algorithm (called CASI), that produces high precision knowledge targeted to a particular domain – in our case, elementary science. To measure the KB's coverage of the target domain's knowledge (its "comprehensiveness" with respect to science) we measure recall with respect to an independent corpus of domain text, and show that our pipeline produces output with over 80% precision and 23% recall with respect to that target, a substantially higher coverage of tuple-expressible science knowledge than other comparable resources. We have made the KB publicly available.1

1 Introduction

While there have been substantial advances in knowledge extraction techniques, the availability of high precision, general knowledge about the world remains elusive. Specifically, our goal is a large, high precision body of (subject, predicate, object) statements relevant to elementary science, to support a downstream QA application task. Although there are several impressive, existing resources that can contribute to our endeavor, e.g., NELL (Carlson et al., 2010), ConceptNet (Speer and Havasi, 2013), WordNet (Fellbaum, 1998), WebChild (Tandon et al., 2014), Yago (Suchanek et al., 2007), FreeBase (Bollacker et al., 2008), and ReVerb-15M (Fader et al., 2011), their applicability is limited by both limited coverage of general knowledge (e.g., FreeBase and NELL primarily contain knowledge about Named Entities; WordNet uses only a few […] 80% precision over that corpus (its "comprehensiveness" with respect to science). This measure is similar to recall at the point P=80% on the PR curve, except measured against a domain-specific sample of data that reflects the distribution of the target domain knowledge. Comprehensiveness thus gives us an approximate notion of the completeness of the KB for (tuple-expressible) facts in our target domain, something that has been lacking in earlier KB construction research. We show that our KB has comprehensiveness (recall of domain facts at >80% precision) of 23% with respect to science, a substantially higher coverage of tuple-expressible science knowledge than other comparable resources. We are making the KB publicly available.2

Outline

We discuss the related work in Section 2. In Section 3, we describe the domain-targeted pipeline, including how the domain is characterized to the algorithm and the sequence of filters and predictors used. In Section 4, we describe how the relationships between predicates in the domain are identified and the more general predicates further populated. Finally in Section 5, we evaluate our approach, including evaluating its comprehensiveness (high-precision coverage of science knowledge).

2 Related Work

There has been substantial, recent progress in knowledge bases that (primarily) encode knowledge about Named Entities, including Freebase (Bollacker et al., 2008), Knowledge Vault (Dong et al., 2014), DBPedia (Auer et al., 2007), and others that hierarchically organize nouns and named entities, e.g., Yago (Suchanek et al., 2007). While these KBs are rich in facts about named entities, they are sparse in general knowledge about common nouns (e.g., that bears have fur). KBs covering general knowledge have received less attention, although there are some notable exceptions constructed using manual methods, e.g., WordNet (Fellbaum, 1998), crowdsourcing, e.g., ConceptNet (Speer and Havasi, 2013), and, more recently, using automated methods, e.g., WebChild (Tandon et al., 2014). While useful, these resources have been constructed to target only a small set of relations, providing only limited coverage for a domain of interest.

To overcome relation sparseness, the paradigm of Open IE (Banko et al., 2007; Soderland et al., 2013) extracts knowledge from text using an open set of relationships, and has been used to successfully build large-scale (arg1, relation, arg2) resources such as ReVerb-15M (containing 15 million general triples) (Fader et al., 2011). Although broad coverage, however, Open IE techniques typically produce noisy output. Our extraction pipeline can be viewed as an extension of the Open IE paradigm: we start with targeted Open IE output, and then apply a sequence of filters to substantially improve the …

1 This KB named as "AristoTupleKB" is available for download at http://data.allenai.org/tuple-kb
2 AristoTupleKB is available for download at http://allenai.org/data/aristo-tuple-kb
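A skeletal sketch of the filter-chain idea: (subject, predicate, object) candidates from Open IE pass through a sequence of precision-oriented filters. The filters below are illustrative placeholders, not the paper's actual classifiers or crowdsourcing steps; the domain terms, candidates, and scores are all hypothetical.

    DOMAIN_TERMS = {"bear", "fur", "plant", "sunlight", "energy"}

    def in_domain(t):
        return t[0] in DOMAIN_TERMS or t[2] in DOMAIN_TERMS

    def well_formed(t):
        return all(tok and " " not in tok for tok in t)   # single-word toy test

    def confident(t, scores):
        return scores.get(t, 0.0) >= 0.8                  # stand-in for a classifier

    candidates = [("bear", "has", "fur"), ("plant", "absorb", "sunlight"),
                  ("obama", "born in", "hawaii"), ("bear", "has", "")]
    scores = {("bear", "has", "fur"): 0.95, ("plant", "absorb", "sunlight"): 0.9}

    kb = [t for t in candidates
          if well_formed(t) and in_domain(t) and confident(t, scores)]
    print(kb)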

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 205–218, 2017. Action Editor: Stefan Riezler.

Submission batch: 12/2016; Revision batch: 2/2017; Published 7/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Pushing the Limits of Translation Quality Estimation
André F. T. Martins (Unbabel; Instituto de Telecomunicações, Lisbon, Portugal, andre.martins@unbabel.com), Marcin Junczys-Dowmunt (Adam Mickiewicz University in Poznań, Poland, junczys@amu.edu.pl), Fabio N. Kepler (Unbabel; L2F/INESC-ID, Lisbon, Portugal; University of Pampa, Alegrete, Brazil, kepler@unbabel.com), Ramón Astudillo (Unbabel; L2F/INESC-ID, Lisbon, Portugal, ramon@unbabel.com), Chris Hokamp (Dublin City University, Dublin, Ireland, chokamp@computing.dcu.ie), Roman Grundkiewicz (Adam Mickiewicz University in Poznań, Poland, romang@amu.edu.pl)

Abstract: Translation quality estimation is a task of growing importance in NLP, due to its potential to reduce post-editing human effort in disruptive ways. However, this potential is currently limited by the relatively low accuracy of existing systems. In this paper, we achieve remarkable improvements by exploiting synergies between the related tasks of word-level quality estimation and automatic post-editing. First, we stack a new, carefully engineered, neural model into a rich feature-based word-level quality estimation system. Then, we use the output of an automatic post-editing system as an extra feature, obtaining striking results on WMT16: a word-level F1-MULT score of 57.47% (an absolute gain of +7.95% over the current state of the art), and a Pearson correlation score of 65.56% for sentence-level HTER prediction (an absolute gain of +13.36%).

1 Introduction

The goal of quality estimation (QE) is to evaluate a translation system's quality without access to reference translations (Blatz et al., 2004; Specia et al., 2013). This has many potential usages: informing an end user about the reliability of translated content; deciding if a translation is ready for publishing or if it requires human post-editing; highlighting the words that need to be changed. QE systems are particularly appealing for crowd-sourced and professional translation services, due to their potential to dramatically reduce post-editing times and to save labor costs (Specia, 2011). The increasing interest in this problem from an industrial angle comes as no surprise (Turchi et al., 2014; de Souza et al., 2015; Martins et al., 2016; Kozlova et al., 2016).

In this paper, we tackle word-level QE, whose goal is to assign a label of OK or BAD to each word in the translation (Figure 1). Past approaches to this problem include linear classifiers with handcrafted features (Ueffing and Ney, 2007; Biçici, 2013; Shah et al., 2013; Luong et al., 2014), often combined with feature selection (Avramidis, 2012; Beck et al., 2013), recurrent neural networks (de Souza et al., 2014; Kim and Lee, 2016), and systems that combine linear and neural models (Kreutzer et al., 2015; Martins et al., 2016). We start by proposing a "pure" QE system (§3) consisting of a new, carefully engineered neural model (NEURALQE), stacked into a linear feature-rich classifier (LINEARQE). Along the way, we provide a rigorous empirical analysis to better understand the contribution of the several groups of features and to justify the architecture of the neural system.

A second contribution of this paper is bringing in the related task of automatic post-editing (APE; Simard et al. (2007)), which aims to au…
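A compact sketch of the stacking idea, not the paper's actual systems: per-word outputs of a neural QE model and an agreement signal from an APE system (both random stand-ins here) become extra features of a linear word-level classifier, alongside simple handcrafted features.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n_words = 400
    handcrafted = rng.normal(size=(n_words, 6))         # e.g. POS/alignment features
    neural_prob = rng.uniform(size=(n_words, 1))        # P(BAD) from a neural model
    ape_agrees = rng.integers(0, 2, size=(n_words, 1))  # APE output matches MT word?

    # Toy labels correlated with the stacked features.
    y = ((0.6 * neural_prob[:, 0] + 0.4 * (1 - ape_agrees[:, 0])) > 0.5).astype(int)

    X = np.hstack([handcrafted, neural_prob, ape_agrees])
    clf = LogisticRegression(max_iter=1000).fit(X, y)
    print("train accuracy:", clf.score(X, y))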

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 135–146, 2017. Action Editor: Hinrich Sch¨utze.

Submission batch: 9/2016; Revision batch: 12/2016; Published 6/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Enriching Word Vectors with Subword Information
Piotr Bojanowski*, Edouard Grave*, Armand Joulin, and Tomas Mikolov (Facebook AI Research)
{bojanowski, egrave, ajoulin, tmikolov}@fb.com

Abstract: Continuous word representations, trained on large unlabeled corpora are useful for many natural language processing tasks. Popular models that learn such representations ignore the morphology of words, by assigning a distinct vector to each word. This is a limitation, especially for languages with large vocabularies and many rare words. In this paper, we propose a new approach based on the skipgram model, where each word is represented as a bag of character n-grams. A vector representation is associated to each character n-gram; words being represented as the sum of these representations. Our method is fast, allowing to train models on large corpora quickly and allows us to compute word representations for words that did not appear in the training data. We evaluate our word representations on nine different languages, both on word similarity and analogy tasks. By comparing to recently proposed morphological word representations, we show that our vectors achieve state-of-the-art performance on these tasks.

1 Introduction

Learning continuous representations of words has a long history in natural language processing (Rumelhart et al., 1988). These representations are typically derived from large unlabeled corpora using co-occurrence statistics (Deerwester et al., 1990; Schütze, 1992; Lund and Burgess, 1996). A large body of work, known as distributional semantics, has studied the properties of these methods (Turney et al., 2010; Baroni and Lenci, 2010). In the neural network community, Collobert and Weston (2008) proposed to learn word embeddings using a feed-forward neural network, by predicting a word based on the two words on the left and two words on the right. More recently, Mikolov et al. (2013b) proposed simple log-bilinear models to learn continuous representations of words on very large corpora efficiently.

Most of these techniques represent each word of the vocabulary by a distinct vector, without parameter sharing. In particular, they ignore the internal structure of words, which is an important limitation for morphologically rich languages, such as Turkish or Finnish. For example, in French or Spanish, most verbs have more than forty different inflected forms, while the Finnish language has fifteen cases for nouns. These languages contain many word forms that occur rarely (or not at all) in the training corpus, making it difficult to learn good word representations. Because many word formations follow rules, it is possible to improve vector representations for morphologically rich languages by using character level information.

In this paper, we propose to learn representations for character n-grams, and to represent words as the sum of the n-gram vectors. Our main contribution is to introduce an extension of the continuous skip-gram model (Mikolov et al., 2013b), which takes into account subword information. We evaluate this model on nine languages exhibiting different morphologies, showing the benefit of our approach.

* The two first authors contributed equally.
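A sketch of the word representation itself: extract character n-grams with boundary markers (< and >) plus the whole marked word, look each up in an n-gram vector table, and combine. The toy table assigns random vectors on demand; real models learn them with the skipgram objective, and averaging rather than summing here is a simplification.

    import numpy as np

    def char_ngrams(word, n_min=3, n_max=6):
        marked = f"<{word}>"
        grams = {marked}                     # the word itself is also a unit
        for n in range(n_min, n_max + 1):
            grams.update(marked[i:i + n] for i in range(len(marked) - n + 1))
        return grams

    class NGramTable:
        def __init__(self, dim=50, seed=0):
            self.dim, self.vecs, self.rng = dim, {}, np.random.default_rng(seed)
        def __getitem__(self, g):
            return self.vecs.setdefault(g, self.rng.normal(size=self.dim))

    table = NGramTable()
    def word_vector(word):
        grams = char_ngrams(word)
        return sum(table[g] for g in grams) / len(grams)

    v = word_vector("unfaithfulness")        # works even for unseen (OOV) words
    print(sorted(char_ngrams("where"))[:5], v.shape)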

Read more »

Transactions of the Association for Computational Linguistics, vol. 5, pp. 101–115, 2017. Action Editor: Mark Johnson.

Submission batch: 10/2016; Revision batch: 4/2017; Published 4/2017. © 2017 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.

Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Nanyun Peng* (Center for Language and Speech Processing, Computer Science Department, Johns Hopkins University, Baltimore, MD, USA), Hoifung Poon (Microsoft Research, Redmond, WA, USA), Chris Quirk (Microsoft Research, Redmond, WA, USA), Kristina Toutanova* (Google Research, Seattle, WA, USA), Wen-tau Yih (Microsoft Research, Redmond, WA, USA)
npeng1@jhu.edu, kristout@google.com, {hoifung, chrisq, scottyih}@microsoft.com

Abstract: Past work in relation extraction has focused on binary relations in single sentences. Recent NLP inroads in high-value domains have sparked interest in the more general setting of extracting n-ary relations that span multiple sentences. In this paper, we explore a general relation extraction framework based on graph long short-term memory networks (graph LSTMs) that can be easily extended to cross-sentence n-ary relation extraction. The graph formulation provides a unified way of exploring different LSTM approaches and incorporating various intra-sentential and inter-sentential dependencies, such as sequential, syntactic, and discourse relations. A robust contextual representation is learned for the entities, which serves as input to the relation classifier. This simplifies handling of relations with arbitrary arity, and enables multi-task learning with related relations. We evaluate this framework in two important precision medicine settings, demonstrating its effectiveness with both conventional supervised learning and distant supervision. Cross-sentence extraction produced larger knowledge bases, and multi-task learning significantly improved extraction accuracy. A thorough analysis of various LSTM approaches yielded useful insight into the impact of linguistic analysis on extraction accuracy.

1 Introduction

Relation extraction has made great strides in newswire and Web domains. Recently, there has been increasing interest in applying relation extraction to high-value domains such as biomedicine. The advent of the $1000 human genome1 heralds the dawn of precision medicine, but progress in personalized cancer treatment has been hindered by the arduous task of interpreting genomic data using prior knowledge. For example, given a tumor sequence, a molecular tumor board needs to determine which genes and mutations are important, and what drugs are available to treat them. Already the research literature has a wealth of relevant knowledge, and it is growing at an astonishing rate. PubMed2, the online repository of biomedical articles, adds two new papers per minute, or one million each year. It is thus imperative to advance relation extraction for machine reading.

In the vast literature on relation extraction, past work focused primarily on binary relations in single sentences, limiting the available information. Consider the following example: "The deletion mutation on exon-19 of EGFR gene was present in 16 patients, while the L858E point mutation on exon-21 was noted in 10. All patients were treated with gefitinib and showed a partial response.". Collectively, the two sentences convey the fact that there is a ternary interaction between the three entities in bold, which is not expressed in either sentence alone. Namely, tumors with L858E mutation in EGFR gene can be treated with gefitinib. Extracting such knowledge clearly requires moving beyond binary relations and single sentences.

N-ary relations and cross-sentence extraction have received relatively little attention in the past. Prior …

* This research was conducted when the authors were at Microsoft Research.
1 http://www.illumina.com/systems/hiseq-x-sequencing-system.html
2 https://www.ncbi.nlm.nih.gov/pubmed
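A small sketch of the document graph the paper's graph LSTM runs over: nodes are tokens across sentences, with sequential (adjacent-word) edges plus extra links for dependencies and cross-sentence relations. The dependency and coreference edges below are hand-picked toy examples, not the output of a real parser.

    from collections import defaultdict

    sentences = [["The", "mutation", "was", "present", "in", "patients"],
                 ["All", "patients", "were", "treated", "with", "gefitinib"]]
    tokens = [w for s in sentences for w in s]

    graph = defaultdict(set)
    def link(i, j, label):
        graph[i].add((j, label))
        graph[j].add((i, label))

    for i in range(len(tokens) - 1):     # sequential edges within/between sentences
        link(i, i + 1, "next")
    link(1, 3, "nsubj")                  # toy intra-sentential dependency edges
    link(7, 9, "nsubjpass")
    link(5, 7, "coref")                  # cross-sentence edge: "patients" ~ "patients"
    print(sorted(graph[7]))              # neighbors of the second "patients"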

Read more »