Transactions of the Association for Computational Linguistics, 2 (2014) 193–206. Action Editor: Joakim Nivre.

Submitted 12/2013; Revised 1/2014; Published 4/2014. © 2014 Association for Computational Linguistics.

Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut

Nathan Schneider, Emily Danchik, Chris Dyer, Noah A. Smith
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
{nschneid,emilydan,cdyer,nasmith}@cs.cmu.edu

Abstract

We present a novel representation, evaluation measure, and supervised models for the task of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical semantic segmentation. Our approach generalizes a standard chunking representation to encode MWEs containing gaps, thereby enabling efficient sequence tagging algorithms for feature-rich discriminative models. Experiments on a new dataset of English web text offer the first linguistically-driven evaluation of MWE identification with truly heterogeneous expression types. Our statistical sequence model greatly outperforms a lookup-based segmentation procedure, achieving nearly 60% F1 for MWE identification.

1 Introduction

Language has a knack for defying expectations when put under the microscope. For example, there is the notion—sometimes referred to as compositionality—that words will behave in predictable ways, with individual meanings that combine to form complex meanings according to general grammatical principles. Yet language is awash with examples to the contrary: in particular, idiomatic expressions such as awash with NP, have a knack for VP-ing, to the contrary, and defy expectations. Thanks to processes like metaphor and grammaticalization, these are (to various degrees) semantically opaque, structurally fossilized, and/or statistically idiosyncratic. In other words, idiomatic expressions may be exceptional in form, function, or distribution. They are so diverse, so unruly, so difficult to circumscribe, that entire theories of syntax are predicated on the notion that constructions with idiosyncratic form-meaning mappings (Fillmore et al., 1988; Goldberg, 1995) or statistical properties (Goldberg, 2006) offer crucial evidence about the grammatical organization of language.

1. MW named entities: Prime Minister Tony Blair
2. MW compounds: hot air balloon, skinny dip
3. conventionally SW compounds: somewhere
4. verb-particle: pick up, dry out, take over, cut short
5. verb-preposition: refer to, depend on, look for
6. verb-noun(-preposition): pay attention (to)
7. support verb: make decisions, take pictures
8. other phrasal verb: put up with, get rid of
9. PP modifier: above board, at all, from time to time
10. coordinated phrase: cut and dry, more or less
11. connective: as well as, let alone, in spite of
12. semi-fixed VP: pick up where left off
13. fixed phrase: scared to death, leave of absence
14. phatic: You're welcome. Me neither!
15. proverb: Beggars can't be choosers.

Figure 1: Some of the classes of idioms in English. The examples included here contain multiple lexicalized words—with the exception of those in (3), if the conventional single-word (SW) spelling is used.

Here we focus on multiword expressions (MWEs): lexicalized combinations of two or more words that are exceptional enough to be considered as single units in the lexicon. As figure 1 illustrates, MWEs occupy diverse syntactic and semantic functions. Within MWEs, we distinguish (a) proper names and (b) lexical idioms. The latter have proved themselves a "pain in the neck for NLP" (Sag et al., 2002). Automatic and efficient detection of MWEs, though far from solved, would have diverse applications including machine translation (Carpuat and Diab, 2010), information retrieval (Newman et al., 2012), opinion mining (Berend, 2011), and second language learning (Ellis et al., 2008).

It is difficult to establish any comprehensive taxonomy of multiword idioms, let alone develop linguistic criteria and corpus resources that cut across these types. Consequently, the voluminous literature on MWEs in computational linguistics—see §7, Baldwin and Kim (2010), and Ramisch (2012) for surveys—has been fragmented, looking (for example) at subclasses of phrasal verbs or nominal compounds in isolation. To the extent that MWEs have been annotated in existing corpora, it has usually been as a secondary aspect of some other scheme. Traditionally, such resources have prioritized certain kinds of MWEs to the exclusion of others, so they are not appropriate for evaluating general-purpose identification systems.

In this article, we briefly review a shallow form of analysis for MWEs that is neutral to expression type, and that facilitates free text annotation without requiring a prespecified MWE lexicon (§2). The scheme applies to gappy (discontinuous) as well as contiguous expressions, and allows for a qualitative distinction of association strengths. In Schneider et al. (2014) we have applied this scheme to fully annotate a 55,000-word corpus of English web reviews (Bies et al., 2012a), a conversational genre in which colloquial idioms are highly salient. This article's main contribution is to show that the representation—constrained according to linguistically motivated assumptions (§3)—can be transformed into a sequence tagging scheme that resembles standard approaches in named entity recognition and other text chunking tasks (§4). Along these lines, we develop a discriminative, structured model of MWEs in context (§5) and train, evaluate, and examine it on the annotated corpus (§6). Finally, in §7 and §8 we comment on related work and future directions.

2 Annotated Corpus

To build and evaluate a multiword expression analyzer, we use the MWE-annotated corpus of Schneider et al. (2014). It consists of informal English web text that has been specifically and completely annotated for MWEs, without reference to any particular lexicon. To the best of our knowledge, this corpus is the first to be freely annotated
for many kinds of MWEs (without reference to a lexicon), and is also the first dataset of social media text with MWE annotations beyond named entities. This section gives a synopsis of the annotation conventions used to develop that resource, as they are important to understanding our models and evaluation.

Rationale. The multiword expressions community has lacked a canonical corpus resource comparable to benchmark datasets used for problems such as NER and parsing. Consequently, the MWE literature has been driven by lexicography: typically, the goal is to acquire an MWE lexicon with little or no supervision, or to apply such a lexicon to corpus data. Studies of MWEs in context have focused on various subclasses of constructions in isolation, necessitating special-purpose datasets and evaluation schemes. By contrast, Schneider et al.'s (2014) corpus creates an opportunity to tackle general-purpose MWE identification, such as would be desirable for use by high-coverage downstream NLP systems. It is used to train and evaluate our models below. The corpus is publicly available as a benchmark for further research.1

Data. The documents in the corpus are online user reviews of restaurants, medical providers, retailers, automotive services, pet care services, etc. Marked by conversational and opinionated language, this genre is fertile ground for colloquial idioms (Nunberg et al., 1994; Moon, 1998). The 723 reviews (55,000 words, 3,800 sentences) in the English Web Treebank (WTB; Bies et al., 2012b) were collected by Google, tokenized, and annotated with phrase structure trees in the style of the Penn Treebank (Marcus et al., 1993). MWE annotators used the sentence and word tokenizations supplied by the treebank.2

Annotation scheme. The annotation scheme itself was designed to be as simple as possible. It consists of grouping together the tokens in each sentence that belong to the same MWE instance. While annotation guidelines provide examples of MWE groupings in a wide range of constructions, the annotator is not

1 http://www.ark.cs.cmu.edu/LexSem/
2 Because we use treebank data, syntactic parses are available to assist in post hoc analysis. Syntactic information was not shown to annotators.


          # of constituent tokens            # of gaps
            2     3    ≥4   total         0     1    2
strong   2257   595   172    3024      2626   394    4
weak      269   121    69     459       322   135    2
total    2526   716   241    3483      2948   529    6

Table 1: Counts in the MWE corpus.

tied to any particular taxonomy or syntactic structure. This simplifies the number of decisions that have to be made for each sentence, even if some are difficult. Further instructions to annotators included:

• Groups should include only the lexically fixed parts of an expression (modulo inflectional morphology); this generally excludes determiners and pronouns: made the mistake, pride themselves on.
• Multiword proper names count as MWEs.
• Misspelled or unconventionally spelled tokens are interpreted according to the intended word if clear.
• Overtokenized words (spelled as two tokens, but conventionally one word) are joined as multiwords. Clitics separated by the tokenization in the corpus—negative n't, possessive 's, etc.—are joined if functioning as a fixed part of a multiword (e.g., T's Cafe), but not if used productively.

Gaps. There are, broadly speaking, three reasons to group together tokens that are not fully contiguous. Most commonly, gaps contain internal modifiers, such as good in make good decisions. Syntactic constructions such as the passive can result in gaps that might not otherwise be present: in good decisions were made, there is instead a gap filled by the passive auxiliary. Finally, some MWEs may take internal arguments: they gave me a break. Figure 1 has additional examples. Multiple gaps can occur even within the same expression, though it is rare: they agreed to give Bob a well-deserved break.

Strength. The annotation scheme has two "strength" levels for MWEs. Clearly idiomatic expressions are marked as strong MWEs, while mostly compositional but especially frequent collocations/phrases (e.g., abundantly clear and patently obvious) are marked as weak MWEs. Weak multiword groups are allowed to include strong MWEs as constituents (but not vice versa). Strong groups are required to cohere when used inside weak groups: that is, a weak group cannot include only part of a strong group. For purposes of annotation, there were no constraints hinging on the ordering of tokens in the sentence.

Process. MWE annotation proceeded one sentence at a time. The 6 annotators referred to and improved the guidelines document on an ongoing basis. Every sentence was seen independently by at least 2 annotators, and differences of opinion were discussed and resolved (often by marking a weak MWE as a compromise). See Schneider et al. (2014) for details.

Statistics. The annotated corpus consists of 723 documents (3,812 sentences). MWEs are frequent in this domain: 57% of sentences (72% of sentences over 10 words long) and 88% of documents contain at least one MWE. 8,060/55,579 = 15% of tokens belong to an MWE; in total, there are 3,483 MWE instances. 544 (16%) are strong MWEs containing a gold-tagged proper noun—most are proper names. A breakdown appears in table 1.

3 Representation and Task Definition

We define a lexical segmentation of a sentence as a partitioning of its tokens into segments such that each segment represents a single unit of lexical meaning. A multiword lexical expression may contain gaps, i.e. interruptions by other segments. We impose two restrictions on gaps that appear to be well-motivated linguistically:

• Projectivity: Every expression filling a gap must be completely contained within that gap; gappy expressions may not interleave.
• No nested gaps: A gap in an expression may be filled by other single- or multiword expressions, so long as those do not themselves contain gaps.

Formal grammar. Our scheme corresponds to the following extended CFG (Thatcher, 1967), where S is the full sentence and terminals w are word tokens:

    S → X+
    X → w+ (Y+ w+)*
    Y → w+

Each expression X or Y is lexicalized by the words in one or more underlined variables on the right-hand side. An X constituent may optionally contain one or more gaps filled by Y constituents, which must not contain gaps themselves.3

3 MWEs with multiple gaps are rare but attested in data: e.g., putting me at my ease. We encountered one violation of the gap nesting constraint in the reviews data: I have_{2,1} nothing_{2,1} but_{2,1} fantastic things_2 to_{2,1} say_{2,1}. Additionally, the interrupted phrase
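The projectivity and no-nested-gaps restrictions can be checked mechanically. The sketch below is our illustration, not part of the paper's released code: each multiword expression is represented as a sorted tuple of token indices, and every pair of expressions is tested against both restrictions.

```python
def gaps(group):
    """Return (start, end) index ranges of the gaps in a token-index tuple.
    E.g., (0, 3, 5) has a gap covering tokens 1-2 and a gap covering token 4."""
    return [(group[k] + 1, group[k + 1])
            for k in range(len(group) - 1)
            if group[k + 1] > group[k] + 1]

def satisfies_gap_constraints(groups):
    """Check the two restrictions of section 3 for a set of multiword
    expressions (each a sorted tuple of token indices): projectivity
    (anything overlapping a gap must lie entirely within it) and no
    nested gaps (a gap-filling expression may not itself be gappy)."""
    for g in groups:
        for h in groups:
            if g is h:
                continue
            for lo, hi in gaps(g):
                inside = [t for t in h if lo <= t < hi]
                if inside and inside != list(h):
                    return False  # h straddles a gap of g: projectivity violated
                if inside and gaps(h):
                    return False  # the gap-filler h has a gap of its own: nesting violated
    return True
```

For instance, a gappy expression whose gap is filled by a contiguous multiword passes, while interleaved expressions or a gappy gap-filler fail.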


Denoting multiword groupings with subscripts, My wife had taken_1 her '07_2 Ford_2 Fusion_2 in_1 for a routine oil_3 change_3 contains 3 multiword groups—{taken, in}, {'07, Ford, Fusion}, {oil, change}—and 7 single-word groups. The first MWE is gappy (accentuated by the box); a single word and a contiguous multiword group fall within the gap. The projectivity constraint forbids an analysis like taken_1 her '07_2 Ford_1 Fusion_2, while the gap nesting constraint forbids taken_1 her_2 '07 Ford_2 Fusion_2 in_1.

3.1 Two-level Scheme: Strong vs. Weak MWEs

Our annotated data distinguish two strengths of MWEs, as discussed in §2. Augmenting the grammar of the previous section, we therefore designate nonterminals as strong (X, Y) or weak (X̃, Ỹ):

    S → X̃+
    X̃ → X+ (Ỹ+ X+)*
    X → w+ (Ỹ+ w+)*
    Ỹ → Y+
    Y → w+

A weak MWE may be lexicalized by single words and/or strong multiwords. Strong multiwords cannot contain weak multiwords except in gaps. Further, the contents of a gap cannot be part of any multiword that extends outside the gap.4

For example, consider the segmentation: he was willing to budge_1 a_2 little_2 on_1 the price which means^4 a^4_3 lot^4_3 to^4 me^4. Subscripts denote strong MW groups and superscripts weak MW groups; unmarked tokens serve as single-word expressions. The MW groups are thus {budge, on}, {a, little}, {a, lot}, and {means, {a, lot}, to, me}. As should be evident from the grammar, the projectivity and gap-nesting constraints apply here just as in the 1-level scheme.

3.2 Evaluation

Matching criteria. Given that most tokens do not belong to an MWE, to evaluate MWE identification we adopt a precision/recall-based measure from the coreference resolution literature. The MUC criterion (Vilain et al., 1995) measures precision and recall of links in terms of the groups (sets) implied by the transitive closure over those links.5 It can be defined as follows: Let a—b denote a link between two elements in the gold standard, and a ˆ— b denote a link in the system prediction. Let the * operator denote the transitive closure over all links, such that ⟦a —* b⟧ is 1 if a and b belong to the same (gold) set, and 0 otherwise. Assuming there are no redundant6 links within any annotation (which in our case is guaranteed by linking consecutive words in each MWE), we can write the MUC precision and recall measures as:

    P = Σ_{a,b: a ˆ— b} ⟦a —* b⟧ / Σ_{a,b: a ˆ— b} 1        R = Σ_{a,b: a — b} ⟦a ˆ—* b⟧ / Σ_{a,b: a — b} 1

This awards partial credit when predicted and gold expressions overlap in part. Requiring full MWEs to match exactly would arguably be too stringent, over-penalizing larger MWEs for minor disagreements. We combine precision and recall using the standard F1 measure of their harmonic mean. This is the link-based evaluation used for most of our experiments. For comparison, we also report some results with a more stringent exact match evaluation, where the span of the predicted MWE must be identical to the span of the gold MWE for it to count as correct.

Strength averaging. Recall that the 2-level scheme (§3.1) distinguishes strong vs. weak links/groups, where the latter category applies to reasonably compositional collocations as well as ambiguous or difficult cases. If, where one annotation uses a weak link, the other has a strong link or no link at all, we want to penalize the disagreement less than if one had a strong link and the other had no link. To accommodate the 2-level scheme, we therefore average F1↑, in which all weak links have been converted to strong links, and F1↓, in which they have been removed: F1 = ½(F1↑ + F1↓).7 If neither annotation contains any weak links, this equals the MUC

3 (continued) great gateways never_1 before_1, so_{2,3} far_{2,3} as_{2,3} Hudson knew_2, seen_1 by Europeans was annotated in another corpus.
4 This was violated 6 times in our annotated data: modifiers within gaps are sometimes collocated with the gappy expression, as in on_{1,2} a_{1,2} tight_1 budget_{1,2} and have_{1,2} little_1 doubt_{1,2}.
5 As a criterion for coreference resolution, the MUC measure has perceived shortcomings which have prompted several other measures (see Recasens and Hovy, 2011 for a review). It is not clear, however, whether any of these criticisms are relevant to MWE identification.
6 A link between a and b is redundant if the other links already imply that a and b belong to the same set. A set of N elements is expressed non-redundantly with exactly N−1 links.
7 Overall precision and recall are likewise computed by averaging "strengthened" and "weakened" measurements.


score because F1 = F1↑ = F1↓. This method applies to both the link-based and exact match evaluation criteria.

no gaps, 1-level:  he_O was_O willing_O to_O budge_O a_B little_I on_O the_O price_O which_O means_B a_I lot_I to_I me_I ._O        (O|BI+)+
no gaps, 2-level:  he_O was_O willing_O to_O budge_O a_B little_Ī on_O the_O price_O which_O means_B a_Ĩ lot_Ī to_Ĩ me_Ĩ ._O        (O|B[ĪĨ]+)+
gappy, 1-level:    he_O was_O willing_O to_O budge_B a_b little_i on_I the_O price_O which_O means_B a_I lot_I to_I me_I ._O        (O|B(o|bi+|I)*I+)+
gappy, 2-level:    he_O was_O willing_O to_O budge_B a_b little_ī on_Ī the_O price_O which_O means_B a_Ĩ lot_Ī to_Ĩ me_Ĩ ._O        (O|B(o|b[īĩ]+|[ĪĨ])*[ĪĨ]+)+

Figure 2: Examples and regular expressions for the 4 tagging schemes. Strong links are depicted with solid arcs, and weak links with dotted arcs. The bottom analysis was provided by an annotator; the ones above are simplifications.

4 Tagging Schemes

Following Ramshaw and Marcus (1995), shallow analysis is often modeled as a sequence-chunking task, with tags containing chunk-positional information. The BIO scheme and variants (e.g., BILOU; Ratinov and Roth, 2009) are standard for tasks like named entity recognition, supersense tagging, and shallow parsing.

The language of derivations licensed by the grammars in §3 allows for a tag-based encoding of MWE analyses with only bigram constraints. We describe 4 tagging schemes for MWE identification, starting with BIO and working up to more expressive variants. They are depicted in figure 2.

No gaps, 1-level (3 tags). This is the standard contiguous chunking representation from Ramshaw and Marcus (1995) using the tags {O B I}. O is for tokens outside any chunk; B marks tokens beginning a chunk; and I marks other tokens inside a chunk. Multiword chunks will thus start with B and then I. B must always be followed by I; I is not allowed at the beginning of the sentence or following O.

No gaps, 2-level (4 tags). We can distinguish strength levels by splitting I into two tags: Ī for strong expressions and Ĩ for weak expressions. To express strong and weak contiguous chunks requires 4 tags: {O B Ī Ĩ}. (Marking B with a strength as well would be redundant because MWEs are never length-one chunks.) The constraints on Ī and Ĩ are the same as the constraints on I in previous schemes. If Ī and Ĩ occur next to each other, the strong attachment will receive higher precedence, resulting in analysis of strong MWEs as nested within weak MWEs.

Gappy, 1-level (6 tags). Because gaps cannot themselves contain gappy expressions (we do not support full recursivity), a finite number of additional tags are sufficient to encode gappy chunks. We therefore add lowercase tag variants representing tokens within a gap: {O o B b I i}. In addition to the constraints stated above, no within-gap tag may occur at the beginning or end of the sentence or immediately following or preceding O. Within a gap, b, i, and o behave like their out-of-gap counterparts.

Gappy, 2-level (8 tags). 8 tags are required to encode the 2-level scheme with gaps: {O o B b Ī ī Ĩ ĩ}. Variants of the inside tag are marked for strength of the incoming link—this applies gap-externally (capitalized tags) and gap-internally (lowercase tags). If Ī or Ĩ immediately follows a gap, its diacritic reflects the strength of the gappy expression, not the gap's contents.

5 Model

With the above representations we model MWE identification as sequence tagging, one of the paradigms that has been used previously for identifying contiguous MWEs (Constant and Sigogne, 2011, see §7).8 Constraints on legal tag bigrams are sufficient to ensure the full tagging is well-formed subject to the regular expressions in figure 2; we enforce these

8 Hierarchical modeling based on our representations is left to future work.


constraints in our experiments.9

In NLP, conditional random fields (Lafferty et al., 2001) and the structured perceptron (Collins, 2002) are popular techniques for discriminative sequence modeling with a convex loss function. We choose the second approach for its speed: learning and inference depend mainly on the runtime of the Viterbi algorithm, whose asymptotic complexity is linear in the length of the input and (with a first-order Markov assumption) quadratic in the number of tags. Below, we review the structured perceptron and discuss our cost function, features, and experimental setup.

5.1 Cost-Augmented Structured Perceptron

The structured perceptron's (Collins, 2002) learning procedure, algorithm 1, generalizes the classic perceptron algorithm (Freund and Schapire, 1999) to incorporate a structured decoding step (for sequences, the Viterbi algorithm) in the inner loop. Thus, training requires only max inference, which is fast with a first-order Markov assumption. In training, features are adjusted where a tagging error is made; the procedure can be viewed as optimizing the structured hinge loss. The output of learning is a weight vector that parametrizes a feature-rich scoring function over candidate labelings of a sequence.

To better align the learning algorithm with our F-score–based MWE evaluation (§3.2), we use a cost-augmented version of the structured perceptron that is sensitive to different kinds of errors during training. When recall is the bigger obstacle, we can adopt the following cost function: given a sentence x, its gold labeling y*, and a candidate labeling y′,

    cost(y*, y′, x) = Σ_{j=1}^{|y*|} c(y*_j, y′_j)
    where c(y*, y′) = ⟦y* ≠ y′⟧ + ρ⟦y* ∈ {B, b} ∧ y′ ∈ {O, o}⟧

A single nonnegative hyperparameter, ρ, controls the tradeoff between recall and accuracy; higher ρ biases the model in favor of recall (possibly hurting accuracy and precision). This is a slight variant of the recall-oriented cost function of Mohit et al. (2012). The difference is that we only penalize beginning-of-expression recall errors. Preliminary experiments showed that a cost function penalizing all recall errors—i.e., with ρ⟦y* ≠ O ∧ y′ = O⟧ as the second term, as in Mohit et al.—tended to append additional tokens to high-confidence MWEs (such as proper names) rather than encourage new MWEs, which would require positing at least two new non-outside tags.

    Input: data ⟨⟨x(n), y(n)⟩⟩_{n=1}^{N}; number of iterations M
    w ← 0; w̄ ← 0; t ← 1
    for m = 1 to M do
        for n = 1 to N do
            ⟨x, y⟩ ← ⟨x(n), y(n)⟩
            ŷ ← argmax_{y′} (w⊤g(x, y′) + cost(y, y′, x))
            if ŷ ≠ y then
                w ← w + g(x, y) − g(x, ŷ)
                w̄ ← w̄ + t·g(x, y) − t·g(x, ŷ)
            t ← t + 1
    Output: w − (w̄/t)

    Algorithm 1: Training with the averaged perceptron. (Adapted from Daumé, 2006, p. 19.)

5.2 Features

Basic features. These are largely based on those of Constant et al. (2012): they look at word unigrams and bigrams, character prefixes and suffixes, and POS tags, as well as lexicon entries that match lemmas10 of multiple words in the sentence. Appendix A lists the basic features in detail.

Some of the basic features make use of lexicons. We use or construct 10 lists of English MWEs: all multiword entries in WordNet (Fellbaum, 1998); all multiword chunks in SemCor (Miller et al., 1993); all multiword entries in English Wiktionary;11 the WikiMwe dataset mined from English Wikipedia (Hartmann et al., 2012); the SAID database of phrasal lexical idioms (Kuiper et al., 2003); the named entities and other MWEs in the WSJ corpus on the English side of the CEDT (Hajič et al., 2012);

9 The 8-tag scheme licenses 42 tag bigrams: sequences such as BO and oī are prohibited. There are also constraints on the allowed tags at the beginning and end of the sequence.
10 The WordNet API in NLTK (Bird et al., 2009) was used for lemmatization.
11 http://en.wiktionary.org; data obtained from https://toolserver.org/~enwikt/definitions/enwikt-defs-20130814-en.tsv.gz


LOOKUPSUPERVISEDMODELpreexisinglexiconsentriesmaxgaplengthPRF1σPRF1σnone074.3944.4355.572.19WordNet+SemCor71k046.1528.4135.102.4474.5145.7956.641.906lexicons420k035.0546.7640.002.8876.0852.3961.951.6710lexicons437k033.9847.2939.482.8875.9551.3961.172.30bestconfigurationwithin-domainlexicon146.6647.9047.182.3176.6451.9161.841.652lexicons+MWtypes(train)≥16lexicons+MWtypes(train)≥2Table2:Useoflexiconsforlookup-basedvs.statisticalsegmentation.Supervisedlearningusedonlybasicfeaturesandthestructuredperceptron,withthe8-tagscheme.Resultsarewiththelink-basedmatchingcriterionforevaluation.Top:Comparisonofpreexistinglexicons.“6lexicons”referstoWordNetandSemCorplusSAID,WikiMwe,Phrases.net,andEnglishWiktionary;“10lexicons”addsMWEsfromCEDT,VNC,LVC,andOyz.(Intheselookup-basedconfigurations,allowinggappyMWEsneverhelpsperformance.)Bottom:CombiningpreexistinglexiconswithalexiconderivedfromMWEsannotatedinthetrainingportionofeachcross-validationfoldatleastonce(lookup)ortwice(model).Allprecision,recall,andF1percentagesareaveragedacross8foldsofcross-validationontrain;standarddeviationsareshownfortheF1score.Ineachcolumn,thehighestvalueusingonlypreexistinglexiconsisunderlined,andthehighestoverallvalueisbolded.Theboxedrowindicatestheconfigurationusedasthebasisforsubsequentexperiments.theverb-particleconstructions(VPCs)datasetof(Baldwin,2008);alistoflightverbconstructions(LVCs)providedbyClaireBonial;andtwoidiomswebsites.12Afterpreprocessing,eachlexicalentryconsistsofanorderedsequenceofwordlemmas,someofwhichmaybevariableslike.Givenasentenceandoneormoreofthelexicons,lookupproceedsasfollows:weenumerateentrieswhoselemmasequencesmatchasequenceoflemma-tizedtokens,andbuildalatticeofpossibleanalysesoverthesentence.Wefindtheshortestpath(i.e.,usingasfewexpressionsaspossible)withdynamicprogramming,allowinggapsofuptolength2.13Unsupervisedwordclusters.Distributionalclus-teringonlarge(unlabeled)corporacanproducelexi-calgeneralizationsthatareusefulforsyntacticandsemanticanalysistasks(par 
exemple.:Milleretal.,2004;Kooetal.,2008;Turianetal.,2010;Owoputietal.,2013;Graveetal.,2013).WewereinterestedtoseewhetherasimilarpatternwouldholdforMWEidentification,giventhatMWEsareconcernedwithwhatislexi-callyidiosyncratic—i.e.,backingofffromspecificlexemestowordclassesmaylosetheMWE-relevantinformation.Brownclustering14(Brownetal.,1992)12http://www.phrases.net/andhttp://home.postech.ac.kr/~oyz/doc/idiom.html13Eachtop-levellexicalexpression(single-ormultiword)incursacostof1;eachexpressionwithinagaphascost1.25.14WithLiang’s(2005)implementation:https://github.com/percyliang/brown-cluster.Weobtain1,000clustersonthe21-million-wordYelpAcademicDataset15(whichissimilaringenretotheannotatedwebre-viewsdata)givesusahardclusteringofwordtypes.Toourtagger,weaddfeaturesmappingtheprevi-ous,current,andnexttokentoBrownclusterIDs.ThefeatureforthecurrenttokenconjoinsthewordlemmawiththeclusterID.Part-of-speechtags.WecomparedthreePTB-stylePOStaggersonthefullREVIEWSsubcor-pus(train+test).TheStanfordCoreNLPtagger16(Toutanovaetal.,2003)yieldsanaccuracyof90.4%.TheARKTweetNLPtaggerv.0.3.2(Owoputietal.,2013)achieves90.1%withthemodel17trainedontheTwittercorpusofRitteretal.(2011),and94.9%whentrainedontheANSWERS,EMAIL,NEWSGROUP,andWEBLOGsubcorporaofWTB.Weusethisthirdcon-figurationtoproduceautomaticPOStagsfortrainingandtestingourMWEtagger.(Acomparisoncondi-tionin§6.3usesoraclePOStags.)5.3ExperimentalSetupThecorpusofwebreviewsdescribedin§2isusedfortrainingandevaluation.101arbitrarilychosendocuments(500phrases,7,171words)wereheldfromwordsappearingatleast25times.15https://www.yelp.com/academic_dataset16v.3.2.0,withenglish-bidirectional-distsim17http://www.ark.cs.cmu.edu/TweetNLP/model.ritter_ptb_alldata_fixed.20130723


LINK-BASEDEXACTMATCHconfigurationMρ∣w∣PRF1PRF1basemodel5—1,765k69.2750.4958.3560.9948.2753.85+recallcost41501,765k61.0957.9459.4153.0955.3854.17+clusters31002,146k63.9855.5159.3956.3453.2454.70+oraclePOS41002,145k66.1959.3562.5358.5157.0057.71Table3:Comparisonofsupervisedmodelsontest(usingthe8-tagscheme).Thebasemodelcorrespondstotheboxedresultintabletable2,buthereevaluatedontest.Foreachconfiguration,thenumberoftrainingiterationsMand(exceptforthebasemodel)therecall-orientedhyperparameterρweretunedbycross-validationontrain.outasafinaltestset.Thisleft3,312sentences/48,408wordsfortraining/development(train).Fea-tureengineeringandhyperparametertuningwereconductedwith8-foldcross-validationontrain.The8-tagschemeisusedexceptwhereotherwisenoted.Inlearningwiththestructuredperceptron(algo-rithm1),weemploytwowell-knowntechniquesthatcanbothbeviewedasregularization.First,weusetheaverageofparametersoveralltimestepsoflearn-ing.Second,withineachcross-validationfold,wede-terminethenumberoftrainingiterations(epochs)Mbyearlystopping—thatis,aftereachiteration,weusethemodeltodecodetheheld-outdata,andwhenthataccuracyceasestoimprove,usethepreviousmodel.Thetwohyperparametersarethenumberofiterationsandthevalueoftherecallcosthyperparameter(r).Botharetunedviacross-validationontrain;weusethemultipleof50thatmaximizesaveragelink-basedF1.Thechosenvaluesareshownintable3.Experi-mentsweremanagedwiththeducttapetool.186ResultsWeexperimentallyaddressthefollowingquestionstoprobeandjustifyourmodelingapproach.6.1Issupervisedlearningnecessary?PreviousMWEidentificationstudieshavefoundbenefittostatisticallearningoverheuristiclexiconlookup(ConstantandSigogne,2011;Greenetal.,2012).OurfirstexperimenttestswhetherthisholdsforcomprehensiveMWEidentification:itcomparesoursupervisedtaggingapproachwithbaselinesofheuristiclookuponpreexistinglexicons.Thebase-linesconstructalatticeforeachsentenceusingthesamemethodaslexicon-basedmodelfeatures(§5.2).Ifmultiplelexiconsareused,theunionoftheiren-18https://github.com/jhclark/du
cttape/triesisusedtoconstructthelattice.Theresultingsegmentation—whichdoesnotencodeastrengthdistinction—isevaluatedagainstthegoldstandard.Table2showstheresults.Evenwithjustthela-beledtrainingsetasinput,thesupervisedapproachbeatsthestrongestheuristicbaseline(thatincorpo-ratesin-domainlexiconentriesextractedfromthetrainingdata)by30precisionpoints,whileachievingcomparablerecall.Forexample,thebaseline(butnotthestatisticalmodel)incorrectlypredictsanMWEinplacestoeatinBaltimore(becauseeatin,meaning‘eatathome,’islistedinWordNet).ThesupervisedapproachhaslearnednottotrustWordNettoomuchduetothissortofambiguity.Downstreamapplica-tionsthatcurrentlyuselexiconmatchingforMWEidentification(e.g.,GhoneimandDiab,2013)likelystandtobenefitfromourstatisticalapproach.6.2HowbesttoexploitMWElexicons(type-levelinformation)?Forstatisticaltagging(rightportionoftable2),usingmorepreexisting(out-of-domain)lexiconsgenerallyimprovesrecall;precisionalsoimprovesabit.AlexiconofMWEsoccurringinthenon-held-outtrainingdataatleasttwice19(table2,bottomright)ismarginallyworse(betterprecision/worserecall)thanthebestresultusingonlypreexistinglexicons.6.3VariationsonthebasemodelWeexperimentwithsomeofthemodelingalterna-tivesdiscussedin§5.Resultsappearintable3underboththelink-basedandexactmatchevaluationcri-teria.Wenotethattheexactmatchscoresare(asexpected)severalpointslower.19IfwetrainwithaccesstothefulllexiconoftrainingsetMWEs,thelearnercredulouslyoverfitstorelyingonthatlexicon—afterall,ithasperfectcoverageofthetrainingdata!—whichprovesfatalforthemodelattesttime.


Recall-orientedcost.Therecall-orientedcostaddsabout1link-basedF1point,sacrificingprecisioninfavorofrecall.Unsupervisedwordclusters.Whencombinedwiththerecall-orientedcost,theseproduceaslightimprovementtoprecision/degradationtorecall,im-provingexactmatchF1butnotaffectinglink-basedF1.Onlyafewclustersreceivehighpositiveweight;oneoftheseconsistsofmatter,joke,biggie,pun,avail,clue,corkage,frills,worries,etc.Thesewordsarediversesemantically,butalloccurincollocationswithno,whichiswhatmakestheclustercoherentandusefultotheMWEmodel.Oraclepart-of-speechtags.Usinghuman-annotatedratherthanautomaticPOStagsimprovesMWEidentificationbyabout3F1pointsontest(similardifferenceswereobservedindevelopment).6.4Whatarethehighest-weightedfeatures?Anadvantageofthelinearmodelingframeworkisthatwecanexaminelearnedfeatureweightstogainsomeinsightintothemodel’sbehavior.Ingeneral,thehighest-weightedfeaturesarethelexiconmatchingfeaturesandfeaturesindicativeofpropernames(POStagofpropernoun,capitalizedwordnotatthebeginningofthesentence,etc.).Despitetheoccasionalclustercapturingcolloca-tionaloridiomaticgroupings,asdescribedintheprevioussection,theclustersappeartobemostlyusefulforidentifyingwordsthattendtobelong(ornot)topropernames.Forexample,theclusterwithstreet,road,freeway,highway,airport,etc.,aswellaswordsoutsideoftheclustervocabulary,weighinfavorofanMWE.Aclusterwitheverydaydesti-nations(neighborhood,doctor,hotel,bank,dentist)prefersnon-MWEs,presumablybecausethesewordsarenottypicallypartofpropernamesinthiscorpus.Thiswasfromthebestmodelusingnon-oraclePOStags,sotheclustersareperhapsusefulincorrect-ingforpropernounsthatweremistakenlytaggedascommonnouns.Onecaveat,though,isthatitishardtodiscerntheimpactofthesespecificfeatureswhereothersmaybecapturingessentiallythesameinformation.6.5HowheterogeneousarelearnedMWEs?Ontest,thefinalmodel(withautomaticPOStags)predicts365MWEinstances(31aregappy;23arePOSpattern#examples(lowercasedlemmas)NOUNNOUN53customerservice,oilchangeVERBPREP36workwith,dealwith,yellatPROPNPROPN
29eagletransmission,comfortzoneADJNOUN21majoraward,topnotch,mentalhealthVERBPART20moveout,endup,pickup,passupVERBADV17comeback,comein,comeby,stayawayPREPNOUN12ontime,infact,incash,forinstanceVERBNOUN10takecare,makemoney,givecrapVERBPRON10thankyou,getitPREPPREP8outof,dueto,outta,inbetweenADVADV6nomatter,upfront,atall,earlyonDETNOUN6alot,alittle,abit,adealVERBDETNOUN6answerthephone,takeachanceNOUNPREP5kindof,carefor,tipon,answertoTable4:ToppredictedPOSpatternsandfrequencies.weak).Thereare298uniqueMWEtypes.OrganizingthepredictedMWEsbytheircoarsePOSsequencerevealsthatthemodelisnottoopreju-dicedinthekindsofexpressionsitrecognizes:the298typesfallunder89uniquePOS+strengthpatterns.Table4showsthe14POSsequencespredicted5ormoretimesasstrongMWEs.Someoftheexamples(majoraward,adeal,tipon)arefalsepositives,butmostarecorrect.SingletonpatternsincludePROPNVERB(godforbid),PREPDET(atthat),ADJPRON(worthit),andPREPVERBPREP(todiefor).TruepositiveMWEsmostlyconsistof(un)namedentities,et(b)lexicalidiomsseenintrainingand/orlistedinoneofthelexicons.Occasionallythesys-temcorrectlyguessesanunseenandOOVidiombasedonfeaturessuchashyphenation(walk-in)andcapitalization/OOVwords(ChiliRelleno,BIGMIS-TAKE).Ontest,244goldMWEtypeswereunseenintraining;thesystemfound93truepositives(wherethetypewaspredictedatleastonce),109falseposi-tives,and151falsenegatives—anunseentyperecallrateof38%.Removingtypesthatoccurredinlexi-consleaves35truepositives,61falsepositives,and111falsenegatives—aunseenandOOVtyperecallrateof24%.6.6Whatkindsofmismatchesoccur?Inspectionoftheoutputturnsupfalsepositivesduetoambiguity(e.g.,Spongyandsweetbread);falsenegatives(toptobottom);andoverlap(gethighqual-ityservice,goldgethighqualityservice;liveupto,goldliveupto).Anumberofthemismatchesturn
out to be problems with the gold standard, like having our water shut off (gold having our water shut off). This suggests that even noisy automatic taggers might help identify annotation inconsistencies and errors for manual correction.

6.7 Are gappiness and the strength distinction learned in practice?

Three quarters of MWEs are strong and contain no gaps. To see whether our model is actually sensitive to the phenomena of gappiness and strength, we train on data simplified to remove one or both distinctions, as in the first 3 labelings in figure 2, and evaluate against the full 8-tag scheme. For the model with the recall cost, clusters, and oracle POS tags, we evaluate each of these simplifications of the training data in table 5. The gold standard for evaluation remains the same across all conditions.

scheme           | ∣Y∣ | ρ   | M   | ∣w∣    | P     | R     | F1
no gaps, 1-level | 3   | 100 | 2.1 | 733k   | 73.33 | 55.72 | 63.20
no gaps, 2-level | 4   | 150 | 3.3 | 977k   | 72.60 | 59.11 | 65.09
gappy, 1-level   | 6   | 200 | 1.6 | 1,466k | 66.48 | 61.26 | 63.65
gappy, 2-level   | 8   | 100 | 3.5 | 1,954k | 73.27 | 60.44 | 66.15

Table 5: Training with different tagging schemes. Results are cross-validation averages on train. All schemes are evaluated against the full gold standard (8 tags).

If the model was unable to recover gappy expressions or the strong/weak distinction, we would expect it to do no better when trained with the full tagset than with the simplified tagset. However, there is some loss in performance as the tagset for learning is simplified, which suggests that gappiness and strength are being learned to an extent.

7 Related Work

Our annotated corpus (Schneider et al., 2014) joins several resources that indicate certain varieties of MWEs: lexicons such as WordNet (Fellbaum, 1998), SAID (Kuiper et al., 2003), and WikiMwe (Hartmann et al., 2012); targeted lists (Baldwin, 2005, 2008; Cook et al., 2008; Tu and Roth, 2011, 2012); websites like Wiktionary and Phrases.net; and large-scale corpora such as SemCor (Miller et al., 1993), the French Treebank (Abeillé et al., 2003), the SzegedParalellFX corpus (Vincze, 2012), and the Prague Czech-English Dependency Treebank (Čmejrek et al., 2005). The difference is that Schneider et al. (2014) pursued a comprehensive annotation approach rather than targeting specific varieties of MWEs or relying on a preexisting lexical resource. The annotations are shallow, not relying explicitly on syntax (though in principle they could be mapped onto the parses in the Web Treebank).

In terms of modeling, the use of machine learning classification (Hashimoto and Kawahara, 2008; Shigeto et al., 2013) and specifically BIO sequence tagging (Diab and Bhutada, 2009; Constant and Sigogne, 2011; Constant et al., 2012; Vincze et al., 2013) for contextual recognition of MWEs is not new. Lexical semantic classification tasks like named entity recognition (e.g., Ratinov and Roth, 2009), supersense tagging (Ciaramita and Altun, 2006; Paaß and Reichartz, 2009), and index term identification (Newman et al., 2012) also involve chunking of certain MWEs. But our discriminative models, facilitated by the new corpus, broaden the scope of the MWE identification task to include many varieties of MWEs at once, including explicit marking of gaps and a strength distinction. By contrast, the aforementioned identification systems, as well as some MWE-enhanced syntactic parsers (e.g., Green et al., 2012), have been restricted to contiguous MWEs. However, Green et al. (2011) allow gaps to be described as constituents in a syntax tree. Gimpel and Smith's (2011) shallow, gappy language model allows arbitrary token groupings within a sentence, whereas our model imposes projectivity and nesting constraints (§3). Blunsom and Baldwin (2006) present a sequence model for HPSG supertagging, and evaluate performance on discontinuous MWEs, though the sequence model treats the non-adjacent component supertags like other labels: it cannot enforce that they mutually require one another, as we do via the gappy tagging scheme (§3.1). The lexicon lookup procedures of Bejček et al. (2013) can match gappy MWEs, but are nonstatistical and extremely error-prone when tuned for high oracle recall.

Another major thread of research has pursued unsupervised discovery of multiword types from raw corpora, such as with statistical association measures (Church et al., 1991; Pecina, 2010; Ramisch et al., 2012, inter alia), parallel corpora (Melamed, 1997; Moirón and Tiedemann, 2006; Tsvetkov and Wintner, 2010), or a combination thereof (Tsvetkov and
Wintner, 2011); this may be followed by a lookup-and-classify approach to contextual identification (Ramisch et al., 2010). Though preliminary experiments with our models did not show benefit to incorporating such automatically constructed lexicons, we hope these two perspectives can be brought together in future work.

8 Conclusion

This article has presented the first supervised model for identifying heterogeneous multiword expressions in English text. Our feature-rich discriminative sequence tagger performs shallow chunking with a novel scheme that allows for MWEs containing gaps, and includes a strength distinction to separate highly idiomatic expressions from collocations. It is trained and evaluated on a corpus of English web reviews that is comprehensively annotated for multiword expressions. Beyond the training data, its features incorporate evidence from external resources: several lexicons as well as unsupervised word clusters. We show experimentally that this statistical approach is far superior to identifying MWEs by heuristic lexicon lookup alone. Future extensions might integrate additional features (e.g., exploiting statistical association measures computed over large corpora), enhance the lexical representation (e.g., by adding semantic tags), improve the expressiveness of the model (e.g., with higher-order features and inference), or integrate the model with other tasks (such as parsing and translation). Our data and open source software are released at http://www.ark.cs.cmu.edu/LexSem/.

Acknowledgments

This research was supported in part by NSF CAREER grant IIS-1054319, Google through the Reading is Believing project at CMU, and DARPA grant FA8750-12-2-0342 funded under the DEFT program. We are grateful to Kevin Knight, Martha Palmer, Claire Bonial, Lori Levin, Ed Hovy, Tim Baldwin, Omri Abend, members of JHU CLSP, the NLP group at Berkeley, and the Noah's ARK group at CMU, and anonymous reviewers for valuable feedback.

A Basic Features

All are conjoined with the current label, y_i.

Label Features
1. previous label (the only first-order feature)

Token Features
Original token
2. i ∈ {1, 2}
3. i = ∣w∣ − {0, 1}
4. capitalized ∧ ⟦i = 0⟧
5. word shape
Lowercased token
6. prefix: the first k characters of w_i, for k = 1 to 4
7. suffix: the last k characters of w_i, for k = 1 to 4
8. has digit
9. has non-alphanumeric character
10. context word: w_j, for j = i−2 to i+2
11. context word bigram: w_j w_{j+1}, for j = i−2 to i+1

Lemma Features
12. lemma + context lemma if one of them is a verb and the other is a noun, verb, adjective, adverb, preposition, or particle: λ_i ∧ λ_j, for j = i−2 to i+2

Part-of-speech Features
13. context POS: pos_j, for j = i−2 to i+2
14. context POS bigram: pos_j pos_{j+1}, for j = i−2 to i+1
15. word + context POS: w_i ∧ pos_{i±1}
16. context word + POS: w_{i±1} ∧ pos_i

Lexicon Features (unlexicalized)
WordNet only
17. OOV: λ_i is not in WordNet as a unigram lemma ∧ pos_i
18. compound: non-punctuation lemma λ_i and the {previous, next} lemma in the sentence (if it is non-punctuation; an intervening hyphen is allowed) form an entry in WordNet, possibly separated by a hyphen or space
19. compound-hyphen: pos_i = HYPH ∧ previous and next tokens form an entry in WordNet, possibly separated by a hyphen or space
20. ambiguity class: if content word unigram λ_i is in WordNet, the set of POS categories it can belong to; else pos_i if not a content POS ∧ the POS of the longest MW match to which λ_i belongs (if any) ∧ the position in that match (B or I)
For each multiword lexicon
21. lexicon name ∧ status of token i in the shortest path segmentation (O, B, or I) ∧ subcategory of lexical entry whose match includes token i, if matched ∧ whether the match is gappy
22. the above ∧ POS tags of the first and last matched tokens in the expression
Over all multiword lexicons
23. at least k lexicons contain a match that includes this token (if n ≥ 1 matches, n active features)
24. at least k lexicons contain a match that includes this token, starts with a given POS, and ends with a given POS
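To make the token-level features above concrete, here is a minimal sketch of a few of them (items 2, 5, 6, 7, and 10): the helper names and exact feature-string formats are illustrative assumptions, not the released implementation.

```python
# Sketch of a few Appendix A token features; names and feature-string
# formats are assumptions for illustration, not the released code.

def word_shape(token):
    """Feature 5: collapse characters into X/x/d classes, merging repeats."""
    shape = []
    for ch in token:
        c = "X" if ch.isupper() else "x" if ch.islower() else "d" if ch.isdigit() else ch
        if not shape or shape[-1] != c:
            shape.append(c)
    return "".join(shape)

def token_features(words, i):
    """A handful of the token features for position i in the sentence."""
    w = words[i]
    feats = [f"shape={word_shape(w)}"]
    if i in (0, 1):                                # feature 2: near sentence start
        feats.append(f"position=first{i + 1}")
    lower = w.lower()
    for k in range(1, 5):                          # features 6-7: affixes up to 4 chars
        feats.append(f"prefix{k}={lower[:k]}")
        feats.append(f"suffix{k}={lower[-k:]}")
    for j in range(max(0, i - 2), min(len(words), i + 3)):  # feature 10: 5-word window
        feats.append(f"ctx[{j - i}]={words[j].lower()}")
    return feats
```

In the full model each such string would be conjoined with the current label y_i and assigned a learned weight.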
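The tagset simplifications compared in table 5 can be sketched as a collapse of the full 8-tag scheme. The ASCII tag names below are assumptions (the paper renders gap-internal tags in lowercase and marks strength with a diacritic), and the mapping is an illustration of the tagset sizes rather than the exact training-data transformation:

```python
# Sketch of the tagset simplifications in table 5 (section 6.7).
# ASCII stand-ins are an assumption: O/B/I tags are outside any gap,
# o/b/i occur inside a gap, and _ / ~ mark strong vs. weak continuation.
FULL = {"O", "o", "B", "b", "I_", "I~", "i_", "i~"}

def simplify(tag, gaps=True, levels=2):
    if levels == 1:            # collapse strong/weak: I_ and I~ both become I
        tag = tag[0]
    if not gaps:               # no-gaps schemes drop the gap-internal distinction
        tag = tag.upper() if tag[0] in "bi" else tag
        tag = "O" if tag == "o" else tag
    return tag

# The resulting tagset sizes match the |Y| column of table 5: 3, 4, 6, 8.
for gaps, levels, expected in [(False, 1, 3), (False, 2, 4), (True, 1, 6), (True, 2, 8)]:
    assert len({simplify(t, gaps, levels) for t in FULL}) == expected
```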
References

Anne Abeillé, Lionel Clément, and François Toussenel. 2003. Building a treebank for French. In Anne Abeillé and Nancy Ide, editors, Treebanks, volume 20 of Text, Speech and Language Technology, pages 165–187. Kluwer Academic Publishers, Dordrecht, The Netherlands.

Timothy Baldwin. 2005. Looking for prepositional verbs in corpus data. In Proc. of the Second ACL-SIGSEM Workshop on the Linguistic Dimensions of Prepositions and their Use in Computational Linguistics Formalisms and Applications, pages 115–126. Colchester, UK.

Timothy Baldwin. 2008. A resource for evaluating the deep lexical acquisition of English verb-particle constructions. In Proc. of MWE, pages 1–2. Marrakech, Morocco.

Timothy Baldwin and Su Nam Kim. 2010. Multiword expressions. In Nitin Indurkhya and Fred J. Damerau, editors, Handbook of Natural Language Processing, Second Edition. CRC Press, Taylor and Francis Group, Boca Raton, Florida, USA.

Eduard Bejček, Pavel Straňák, and Pavel Pecina. 2013. Syntactic identification of occurrences of multiword expressions in text using a lexicon with dependency structures. In Proc. of the 9th Workshop on Multiword Expressions, pages 106–115. Atlanta, Georgia, USA.

Gábor Berend. 2011. Opinion expression mining by exploiting keyphrase extraction. In Proc. of 5th International Joint Conference on Natural Language Processing, pages 1162–1170. Chiang Mai, Thailand.

Ann Bies, Justin Mott, Colin Warner, and Seth Kulick. 2012. English Web Treebank. Technical Report LDC2012T13, Linguistic Data Consortium, Philadelphia, Pennsylvania, USA.

Steven Bird, Ewan Klein, and Edward Loper. 2009. Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O'Reilly Media, Inc., Sebastopol, California, USA.

Phil Blunsom and Timothy Baldwin. 2006. Multilingual deep lexical acquisition for HPSGs via supertagging. In Proc. of EMNLP, pages 164–171. Sydney, Australia.

Peter F. Brown, Peter V. deSouza, Robert L. Mercer, Vincent J. Della Pietra, and Jenifer C. Lai. 1992. Class-based n-gram models of natural language. Computational Linguistics, 18(4):467–479.

Marine Carpuat and Mona Diab. 2010. Task-based evaluation of multiword expressions: a pilot study in statistical machine translation. In Proc. of NAACL-HLT, pages 242–245. Los Angeles, California, USA.

Kenneth Church, William Gale, Patrick Hanks, and Donald Hindle. 1991. Using statistics in lexical analysis. In Uri Zernik, editor, Lexical acquisition: exploiting on-line resources to build a lexicon, pages 115–164. Lawrence Erlbaum Associates, Hillsdale, New Jersey, USA.

Massimiliano Ciaramita and Yasemin Altun. 2006. Broad-coverage sense disambiguation and information extraction with a supersense sequence tagger. In Proc. of EMNLP, pages 594–602. Sydney, Australia.

Michael Collins. 2002. Discriminative training methods for Hidden Markov Models: theory and experiments with perceptron algorithms. In Proc. of EMNLP, pages 1–8. Philadelphia, Pennsylvania, USA.

Matthieu Constant and Anthony Sigogne. 2011. MWU-aware part-of-speech tagging with a CRF model and lexical resources. In Proc. of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World, pages 49–56. Portland, Oregon, USA.

Matthieu Constant, Anthony Sigogne, and Patrick Watrin. 2012. Discriminative strategies to integrate multiword expression recognition and parsing. In Proc. of ACL, pages 204–212. Jeju Island, Korea.

Paul Cook, Afsaneh Fazly, and Suzanne Stevenson. 2008. The VNC-Tokens dataset. In Proc. of MWE, pages 19–22. Marrakech, Morocco.

Hal Daumé, III. 2006. Practical structured learning techniques for natural language processing. Ph.D. dissertation, University of Southern California, Los Angeles, California, USA. URL http://hal3.name/docs/daume06thesis.pdf.

Mona Diab and Pravin Bhutada. 2009. Verb noun construction MWE token classification. In Proc. of MWE, pages 17–22. Suntec, Singapore.

Nick C. Ellis, Rita Simpson-Vlach, and Carson Maynard. 2008. Formulaic language in native and second language speakers: psycholinguistics, corpus linguistics, and TESOL. TESOL Quarterly, 42(3):375–396.

Christiane Fellbaum, editor. 1998. WordNet: an electronic lexical database. MIT Press, Cambridge, Massachusetts, USA.

Charles J. Fillmore, Paul Kay, and Mary Catherine O'Connor. 1988. Regularity and idiomaticity in grammatical constructions: the case of 'let alone'. Language, 64(3):501–538.

Yoav Freund and Robert E. Schapire. 1999. Large margin classification using the perceptron algorithm. Machine Learning, 37(3):277–296.

Mahmoud Ghoneim and Mona Diab. 2013. Multiword expressions in the context of statistical machine trans-
lation. In Proc. of IJCNLP, pages 1181–1187. Nagoya, Japan.

Kevin Gimpel and Noah A. Smith. 2011. Generative models of monolingual and bilingual gappy patterns. In Proc. of WMT, pages 512–522. Edinburgh, Scotland, UK.

Adele E. Goldberg. 1995. Constructions: a construction grammar approach to argument structure. University of Chicago Press, Chicago, Illinois, USA.

Adele E. Goldberg. 2006. Constructions at work: the nature of generalization in language. Oxford University Press, Oxford, UK.

Edouard Grave, Guillaume Obozinski, and Francis Bach. 2013. Hidden Markov tree models for semantic class induction. In Proc. of CoNLL, pages 94–103. Sofia, Bulgaria.

Spence Green, Marie-Catherine de Marneffe, John Bauer, and Christopher D. Manning. 2011. Multiword expression identification with tree substitution grammars: a parsing tour de force with French. In Proc. of EMNLP, pages 725–735. Edinburgh, Scotland, UK.

Spence Green, Marie-Catherine de Marneffe, and Christopher D. Manning. 2012. Parsing models for identifying multiword expressions. Computational Linguistics, 39(1):195–227.

Jan Hajič, Eva Hajičová, Jarmila Panevová, Petr Sgall, Silvie Cinková, Eva Fučíková, Marie Mikulová, Petr Pajas, Jan Popelka, Jiří Semecký, Jana Šindlerová, Jan Štěpánek, Josef Toman, Zdeňka Urešová, and Zdeněk Žabokrtský. 2012. Prague Czech-English Dependency Treebank 2.0. Technical Report LDC2012T08, Linguistic Data Consortium, Philadelphia, Pennsylvania, USA. URL http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2012T08.

Silvana Hartmann, György Szarvas, and Iryna Gurevych. 2012. Mining multiword terms from Wikipedia. In Maria Teresa Pazienza and Armando Stellato, editors, Semi-Automatic Ontology Development. IGI Global, Hershey, Pennsylvania, USA.

Chikara Hashimoto and Daisuke Kawahara. 2008. Construction of an idiom corpus and its application to idiom identification based on WSD incorporating idiom-specific features. In Proc. of EMNLP, pages 992–1001. Honolulu, Hawaii, USA.

Terry Koo, Xavier Carreras, and Michael Collins. 2008. Simple semi-supervised dependency parsing. In Proc. of ACL-08: HLT, pages 595–603. Columbus, Ohio, USA.

Koenraad Kuiper, Heather McCann, Heidi Quinn, Therese Aitchison, and Kees van der Veer. 2003. SAID. Technical Report LDC2003T10, Linguistic Data Consortium, Philadelphia, Pennsylvania, USA. URL http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2003T10.

John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira. 2001. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In Proc. of ICML, pages 282–289.

Percy Liang. 2005. Semi-supervised learning for natural language. Master's thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. URL http://people.csail.mit.edu/pliang/papers/meng-thesis.pdf.

Mitchell P. Marcus, Beatrice Santorini, and Mary Ann Marcinkiewicz. 1993. Building a large annotated corpus of English: the Penn Treebank. Computational Linguistics, 19(2):313–330.

I. Dan Melamed. 1997. Automatic discovery of non-compositional compounds in parallel data. In Proc. of EMNLP, pages 97–108. Providence, Rhode Island, USA.

George A. Miller, Claudia Leacock, Randee Tengi, and Ross T. Bunker. 1993. A semantic concordance. In Proc. of HLT, pages 303–308. Plainsboro, New Jersey, USA.

Scott Miller, Jethran Guinness, and Alex Zamanian. 2004. Name tagging with word clusters and discriminative training. In Proc. of HLT-NAACL, pages 337–342. Boston, Massachusetts, USA.

Behrang Mohit, Nathan Schneider, Rishav Bhowmick, Kemal Oflazer, and Noah A. Smith. 2012. Recall-oriented learning of named entities in Arabic Wikipedia. In Proc. of EACL, pages 162–173. Avignon, France.

Begona Villada Moirón and Jörg Tiedemann. 2006. Identifying idiomatic expressions using automatic word-alignment. In Proc. of the EACL 2006 Workshop on Multi-word Expressions in a Multilingual Context, pages 33–40. Trento, Italy.

Rosamund Moon. 1998. Fixed expressions and idioms in English: a corpus-based approach. Oxford Studies in Lexicography and Lexicology. Clarendon Press, Oxford, UK.

David Newman, Nagendra Koilada, Jey Han Lau, and Timothy Baldwin. 2012. Bayesian text segmentation for index term identification and keyphrase extraction. In Proc. of COLING 2012, pages 2077–2092. Mumbai, India.

Geoffrey Nunberg, Ivan A. Sag, and Thomas Wasow. 1994. Idioms. Language, 70(3):491–538.

Olutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, and Noah A. Smith. 2013. Improved part-of-speech tagging for online conversational text with word clusters. In Proc. of NAACL-HLT, pages 380–390. Atlanta, Georgia, USA.

Gerhard Paaß and Frank Reichartz. 2009. Exploiting
semantic constraints for estimating supersenses with CRFs. In Proc. of the Ninth SIAM International Conference on Data Mining, pages 485–496. Sparks, Nevada, USA.

Pavel Pecina. 2010. Lexical association measures and collocation extraction. Language Resources and Evaluation, 44(1):137–158.

Carlos Ramisch. 2012. A generic and open framework for multiword expressions treatment: from acquisition to applications. Ph.D. dissertation, University of Grenoble and Federal University of Rio Grande do Sul, Grenoble, France. URL http://www.inf.ufrgs.br/~ceramisch/download_files/thesis-getalp.pdf.

Carlos Ramisch, Vitor De Araujo, and Aline Villavicencio. 2012. A broad evaluation of techniques for automatic acquisition of multiword expressions. In Proc. of ACL 2012 Student Research Workshop, pages 1–6. Jeju Island, Korea.

Carlos Ramisch, Aline Villavicencio, and Christian Boitet. 2010. mwetoolkit: a framework for multiword expression identification. In Proc. of LREC, pages 662–669. Valletta, Malta.

Lance A. Ramshaw and Mitchell P. Marcus. 1995. Text chunking using transformation-based learning. In Proc. of the Third ACL Workshop on Very Large Corpora, pages 82–94. Cambridge, Massachusetts, USA.

Lev Ratinov and Dan Roth. 2009. Design challenges and misconceptions in named entity recognition. In Proc. of CoNLL, pages 147–155. Boulder, Colorado, USA.

Marta Recasens and Eduard Hovy. 2011. BLANC: Implementing the Rand index for coreference evaluation. Natural Language Engineering, 17(04):485–510.

Alan Ritter, Sam Clark, Mausam, and Oren Etzioni. 2011. Named entity recognition in tweets: an experimental study. In Proc. of EMNLP, pages 1524–1534. Edinburgh, Scotland, UK.

Ivan Sag, Timothy Baldwin, Francis Bond, Ann Copestake, and Dan Flickinger. 2002. Multiword expressions: a pain in the neck for NLP. In Alexander Gelbukh, editor, Computational Linguistics and Intelligent Text Processing, volume 2276 of Lecture Notes in Computer Science, pages 189–206. Springer, Berlin, Germany.

Nathan Schneider, Spencer Onuffer, Nora Kazour, Emily Danchik, Michael T. Mordowanec, Henrietta Conrad, and Noah A. Smith. 2014. Comprehensive annotation of multiword expressions in a social web corpus. In Proc. of LREC. Reykjavík, Iceland.

Yutaro Shigeto, Ai Azuma, Sorami Hisamoto, Shuhei Kondo, Tomoya Kouse, Keisuke Sakaguchi, Akifumi Yoshimoto, Frances Yung, and Yuji Matsumoto. 2013. Construction of English MWE dictionary and its application to POS tagging. In Proc. of the 9th Workshop on Multiword Expressions, pages 139–144. Atlanta, Georgia, USA.

James W. Thatcher. 1967. Characterizing derivation trees of context-free grammars through a generalization of finite automata theory. Journal of Computer and System Sciences, 1(4):317–322.

Kristina Toutanova, Dan Klein, Christopher D. Manning, and Yoram Singer. 2003. Feature-rich part-of-speech tagging with a cyclic dependency network. In Proc. of HLT-NAACL, pages 173–180. Edmonton, Alberta, Canada.

Yulia Tsvetkov and Shuly Wintner. 2010. Extraction of multi-word expressions from small parallel corpora. In Coling 2010: Posters, pages 1256–1264. Beijing, China.

Yulia Tsvetkov and Shuly Wintner. 2011. Identification of multi-word expressions by combining multiple linguistic information sources. In Proc. of EMNLP, pages 836–845. Edinburgh, Scotland, UK.

Yuancheng Tu and Dan Roth. 2011. Learning English light verb constructions: contextual or statistical. In Proc. of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World, pages 31–39. Portland, Oregon, USA.

Yuancheng Tu and Dan Roth. 2012. Sorting out the most confusing English phrasal verbs. In Proc. of *SEM, pages 65–69. Montréal, Québec, Canada.

Joseph Turian, Lev-Arie Ratinov, and Yoshua Bengio. 2010. Word representations: a simple and general method for semi-supervised learning. In Proc. of ACL, pages 384–394. Uppsala, Sweden.

Martin Čmejrek, Jan Cuřín, Jan Hajič, and Jiří Havelka. 2005. Prague Czech-English Dependency Treebank: resource for structure-based MT. In Proc. of EAMT, pages 73–78. Budapest, Hungary.

Marc Vilain, John Burger, John Aberdeen, Dennis Connolly, and Lynette Hirschman. 1995. A model-theoretic coreference scoring scheme. In Proc. of MUC-6, pages 45–52. Columbia, Maryland, USA.

Veronika Vincze. 2012. Light verb constructions in the SzegedParalellFX English-Hungarian parallel corpus. In Proc. of LREC. Istanbul, Turkey.

Veronika Vincze, István Nagy T., and János Zsibrita. 2013. Learning to detect English and Hungarian light verb constructions. ACM Transactions on Speech and Language Processing, 10(2):6:1–6:25.