Transactions of the Association for Computational Linguistics, 1 (2013) 125–138. Action Editor: Sharon Goldwater.
Submitted 10/2012; Revised 3/2013; Published 5/2013. © 2013 Association for Computational Linguistics.
Modeling Child Divergences from Adult Grammar

Sam Sahakian
University of Wisconsin-Madison
sahakian@cs.wisc.edu

Benjamin Snyder
University of Wisconsin-Madison
bsnyder@cs.wisc.edu

Abstract

During the course of first language acquisition, children produce linguistic forms that do not conform to adult grammar. In this paper, we introduce a data set and approach for systematically modeling this child-adult grammar divergence. Our corpus consists of child sentences with corrected adult forms. We bridge the gap between these forms with a discriminatively reranked noisy channel model that translates child sentences into equivalent adult utterances. Our method outperforms MT and ESL baselines, reducing child error by 20%. Our model allows us to chart specific aspects of grammar development in longitudinal studies of children, and investigate the hypothesis that children share a common developmental path in language acquisition.

1 Introduction

Since the publication of the Brown Study (1973), the existence of standard stages of development has been an underlying assumption in the study of first language learning. As a child moves towards language mastery, their language use grows predictably to include more complex syntactic structures, eventually converging to full adult usage. In the course of this process, children may produce linguistic forms that do not conform to the grammatical standard. From the adult point of view these are language errors, a label which implies a faulty production. Considering the work-in-progress nature of a child language learner, these divergences could also be described as expressions of the structural differences between child and adult grammar. The predictability of these divergences has been observed by psychologists, linguists and parents (Owens, 2008).[1]

Our work leverages the differences between child and adult language to make two contributions towards the study of language acquisition. First, we provide a corpus of errorful child sentences annotated with adult-like rephrasings. This data will allow researchers to test hypotheses and build models relating the development of child language to adult forms. Our second contribution is a probabilistic model trained on our corpus that predicts a grammatical rephrasing given an errorful child sentence.

The generative assumption of our model is that sentences begin in underlying adult forms, and are then stochastically transformed into observed child utterances. Given an observed child utterance $s$, we calculate the probability of the corrected adult translation $t$ as

$$P(t \mid s) \propto P(s \mid t)\, P(t),$$

where $P(t)$ is an adult language model and $P(s \mid t)$ is a noise model crafted to capture child grammar errors like omission of certain function words and corruptions of tense or declension. The parameters of this noise model are estimated using our corpus of child and adult-form utterances, using EM to handle unobserved word alignments. We use this generative model to produce n-best lists of candidate corrections which are then reranked using long range sentence features in a discriminative framework (Collins and Roark, 2004).

[1] For the remainder of this paper we use “error” and “divergence” interchangeably.
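The noisy channel factorization above can be illustrated with a small sketch. This is only a toy under stated assumptions: the function names and all probabilities below are hypothetical and are not taken from the paper's system, which uses FSTs and learned distributions. The sketch only shows how the noise-model term and the language-model term combine to rank candidate adult sentences.

```python
import math

def noisy_channel_log_score(p_noise, p_lm):
    """Return log P(s|t) + log P(t), which equals log P(t|s) up to a constant."""
    return math.log(p_noise) + math.log(p_lm)

def best_correction(candidates):
    """Pick the adult sentence t with the highest noisy-channel score.

    candidates maps each adult sentence t to a pair (P(s|t), P(t)).
    """
    return max(candidates, key=lambda t: noisy_channel_log_score(*candidates[t]))

# Hypothetical probabilities for the child utterance s = "that not how".
toy_candidates = {
    "that is not how": (0.2, 0.01),   # a "be" deletion explains s; fluent adult form
    "that not how": (0.7, 0.0001),    # no change needed, but an unlikely adult sentence
}
print(best_correction(toy_candidates))  # that is not how
```

Working in log space avoids underflow when many per-word probabilities are multiplied, which is why the two terms are added rather than multiplied.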
Downloaded from http://direct.mit.edu/tacl/article-pdf/doi/10.1162/tacl_a_00215/1566647/tacl_a_00215.pdf by guest on 09 September 2023
One could argue that our noisy channel model mirrors the cognitive process of child language production by appealing to the hypothesis that children rapidly learn adult-like grammar but produce errors due to performance factors (Bloom, 1990; Hamburger and Crain, 1984). That being said, our primary goal in this paper is not cognitive plausibility, but rather the creation of a practical tool to aid in the empirical study of language acquisition. By automatically inferring adult-like forms of child sentences, our model can highlight and compare developmental trends of children over time using large quantities of data, while minimizing the need for human annotation.

Besides this, our model’s predictive success itself has theoretical implications. By aggregating training and testing data across children, our model instantiates the Brown hypothesis of a shared developmental path. Even when adequate per-child training data exists, using data only from other children leads to no degradation in performance, suggesting that the learned parameters capture general child language phenomena and not just individual habits. Besides aggregating across children, our model coarsely lumps together all stages of development, providing a frozen snapshot of child grammar. This establishes a baseline for more cognitively plausible and temporally dynamic models.

We compare our correction system against two baselines, a phrase-based Machine Translation (MT) system, and a model designed for English Second Language (ESL) error correction. Relative to the best performing baseline, our approach achieves a 30% decrease in word error-rate and a four point increase in BLEU score. We analyze the performance of our system on various child error categories, highlighting our model’s strengths (correcting be drops and morphological overgeneralizations) as well as its weaknesses (correcting pronoun and auxiliary drops). We also assess the learning rate of our model, showing that very little annotation is needed to achieve high performance. Finally, to showcase a potential application, we use our model to chart one aspect of four children’s grammar acquisition over time. While generally vindicating the Brown thesis of a common developmental path, the results point to subtleties in variation across individuals that merit further investigation.

2 Background and Related Work

While child error correction is a novel task, computational methods are frequently used to study first language acquisition. The computational study of speech is facilitated by TalkBank (MacWhinney, 2007), a large database of transcribed dialogues including CHILDES (MacWhinney, 2000), a subsection composed entirely of child conversation data. Computational tools have been developed specifically for the large-scale analysis of CHILDES. These tools enable further computational study such as the automatic calculation of the language development metrics IPSYN (Sagae et al., 2005) and D-Level (Lu, 2009), or the automatic formulation of novel language development metrics themselves (Sahakian and Snyder, 2012).

The availability of child language is also key to the design of computational models of language learning (Alishahi, 2010), which can support the plausibility of proposed human strategies for tasks like semantic role labeling (Connor et al., 2008) or word learning (Regier, 2005). To our knowledge this paper is the first work on error correction in the first language learning domain. Previous work has employed a classifier-based approach to identify speech errors indicative of language disorders in children (Morley and Prud’hommeaux, 2012).

Automatic correction of second language (L2) writing is a common objective in computer assisted language learning (CALL). These tasks generally target high-frequency error categories including article, word-form, and preposition choice. Previous work in CALL error correction includes identifying word choice errors in TOEFL essays based on context (Chodorow and Leacock, 2000), correcting errors with a generative lattice and PCFG reranking (Lee and Seneff, 2006), and identifying a broad range of errors in ESL essays by examining linguistic features of words in sequence (Gamon, 2011). In a 2011 shared ESL correction task (Dale and Kilgarriff, 2011), the best performing system (Rozovskaya et al., 2011) corrected preposition, article, punctuation and spelling errors by building classifiers for each category. This line of work is grounded in the practical application of automatic error correction as a learning tool for ESL students.

Statistical Machine Translation (SMT) has been
applied in diverse contexts including grammar correction as well as paraphrasing (Quirk et al., 2004), question answering (Echihabi and Marcu, 2003) and prediction of twitter responses (Ritter et al., 2011). In the realm of error correction, SMT has been applied to identify and correct spelling errors in internet search queries (Sun et al., 2010). Within CALL, Park and Levy (2011) took an unsupervised SMT approach to ESL error correction using Weighted Finite State Transducers (FSTs). The work described in this paper is inspired by that of Park and Levy, and in Section 6 we detail differences between our approaches. We also include their model as a baseline.

3 Data

To train and evaluate our translation system, we first collected a corpus of 1,000 errorful child-language utterances from the American English portion of the CHILDES database. To encourage diversity in the grammatical divergences captured by our corpus, our data is drawn from a large pool of studies (see bibliography for the full list of citations).

In the annotation process, candidate child sentences were randomly selected from the pool and classified by hand as either grammatically correct, divergent or unclassifiable (when it was not possible to tell what a child is trying to say). We continued this process until 1,000 divergent sentences were found. Along the way we also encountered 5,197 grammatically correct utterances and 909 that were unclassifiable.[2] Because CHILDES includes speech samples from children of diverse age, background and language ability, our corpus does not capture any specific stage of language development. Instead, the corpus represents a general snapshot of a learner who has not yet mastered English as their first language.

To provide the grammatically correct counterpart to child data, our errorful sentences were corrected by workers on Amazon’s Mechanical Turk web service. Given a child utterance and its surrounding conversational context, annotators were instructed to translate the child utterance into adult-like English. We limited eligible workers to native English speakers residing in the US. We also required annotators to follow a brief tutorial in which they practice correcting sample utterances according to our guidelines. These guidelines instructed workers to minimally alter sentences to be grammatically consistent with a conversation or written letter, without altering underlying meaning. Annotators were evaluated on a worker-by-worker basis and rejected in the rare case that they ignored our guidelines. Accepted workers were paid 7 cents for correcting each set of 5 sentences. To achieve a consistent judgment, we posted each set of sentences for correction by 7 different annotators.

Once multiple reference translations were obtained we selected a single best correction by plurality, arbitrating ties as necessary. There were several cases in which corrections obtained by plurality decision did not perfectly follow instructions. These were manually corrected. Both the raw translations provided by individual annotators as well as the curated final adult forms are provided online as part of our data set.[3] Resulting pairs of errorful child sentences and their adult-like corrections were split into 73% training, 7% development and 20% test data, which we use to build, tune and evaluate our grammar correction system. In the final test phase, development data is included in the training set.

[2] These hand-classified sentences are available online along with our set of errorful sentences.
[3] Data is available at http://pages.cs.wisc.edu/~bsnyder

Error Type           Child Utterance
Insertion            I did locked it.
Inflection           More cookie?
Deletion             That not how.
Lemma Choice         I got grain.
Overgeneralization   I drawed it.

Table 1: Examples of error types captured by our model.

4 Model

According to our generative model, adult-like utterances are formed and then transformed by a noisy channel to become child sentences. The structure of our noise model is tailored to match our observations of common child errors. These include: function word insertions, function word deletions, swaps of function words, and inflectional changes to content words. Examples of each error type are given
in Table 1. Our model does not allow reorderings, and can thus be described in terms of word-by-word stochastic transformations to the adult sentence.

We use 10 word classes to parameterize our model: pronouns, negators, wh-words, conjunctions, prepositions, determiners, modal verbs, “be” verbs, other auxiliary verbs, and lexical content words. The list of words in each class is provided as part of our data set. For each input adult word $w$, the model generates output word $w'$ as a hierarchical series of draws from multinomial distributions, conditioned on the original word $w$ and its class $c$. All distributions receive an asymmetric Dirichlet prior which favors retention of the adult word. With the sole exception of word insertions, the distributions are parameterized and learned during training. Our model consists of 217 multinomial distributions, with 6,718 free parameters.

The precise form and parameterization of our model were handcrafted for performance on the development data, using trial and error. We also considered more fine-grained model forms (i.e. one parameter for each non-lexical input-output word pair), as well as coarser parameterizations (i.e. a single shared parameter denoting any inflection change). The model we describe here seemed to achieve the best balance of specificity and generalization.

We now present pseudocode describing the noise model’s operation upon processing each word, along with a brief description of each step.

     1: insdel ← 0
     2: for word w with class c, inflection f, lemma ℓ do
     3:   if insdel = 2 then
     4:     a ← swap
     5:   else
     6:     a ∼ {insert, delete, swap} | c
     7:   end if
     8:   if a = delete then
     9:     insdel++
    10:     c′ ← ε
    11:     w′ ← ε
    12:   else if a = insert then
    13:     insdel++
    14:     c′ ∼ classes | cPREV, insert
    15:     w′ ∼ words in c′ | insert
    16:   else
    17:     insdel ← 0
    18:     c′ ← c
    19:     if c ∈ uninflected-classes then
    20:       w′ ∼ words in c | w, swap
    21:     else if c = aux then
    22:       ℓ′ ∼ aux-lemmas | ℓ, swap
    23:       f′ ∼ inflections | f, swap
    24:       w′ ← COMBINE(ℓ′, f′)
    25:     else
    26:       f′ ∼ inflections | f, swap
    27:       w′ ← COMBINE(ℓ, f′)
    28:     end if
    29:   end if
    30:   if w′ ∈ irregular then
    31:     w′ ∼ OVERGEN(w′) ∪ {w′}
    32:   end if
    33:   if a = insert then
    34:     goto line 3
    35:   end if
    36: end for

Action selection (lines 3-7): On reading an input word, an action category $a$ is selected from a probability distribution conditioned on the input word’s class. Our model allows up to two function word insertions or deletions in a row before a swap is required. Lexical content words may not be deleted or inserted, only swapped.

Insert and Delete (lines 8-15): The deletion case requires no decision after action selection. In the insertion case, the class of the inserted word, $c'$, is selected conditioned on $c_{PREV}$, the class of the previous adult word. The precise identity of the inserted word is then drawn from a uniform distribution over words in class $c'$. It is important to note that in the insertion case, the input word at a given iteration will be re-processed at the next iteration (lines 33-35).

Swap (lines 16-29): In the swap case, a word of given class is substituted for another word in the same class. Depending on the source word’s class, swaps are handled in slightly different ways. If the word is a modal, conjunction, determiner, preposition, “wh-” word or negative, it is considered “uninflected.” In these cases, a new word $w'$ is selected from all words in class $c$, conditioned on the source word $w$.

If $w$ is an auxiliary verb, the swap procedure consists of two parallel steps. A lemma is selected from possible auxiliary lemmas, conditioned on the lemma of the source word.[4] In the second step, an output inflection type is selected from a distribution conditioned on the source word’s inflection. The precise output word is fully specified by the choice of lemma and conjugation.

If $w$ is not in either of the above two categories, it is a lexical word, and our model only allows changes in conjugation or declension. If the source word is a noun it may swap to singular or plural form conditioned on the source form. If the word is a verb, it may swap to any conjugated or non-finite form, again conditioned on the source form. Lexical words that are not marked by CELEX (Baayen et al., 1996) as nouns or verbs may only swap to the exact same word.

Overgeneralization (lines 30-32): Finally, the noisy channel considers the possibility of producing overgeneralized word forms (like “maked” and “childs”) in place of their correct irregular forms. The OVERGEN function produces the incorrect overgeneralized form. We draw from a distribution which chooses between this form and the correct original word. Our model maintains separate distributions for nouns (overgeneralized plurals) and verbs (overgeneralized past tense).

[4] Auxiliary lemmas include have, do, go, will, and get.

5 Implementation

In this section, we describe steps necessary to build, train and test our error correction model. Weighted Finite State Transducers (FSTs) used in our model are constructed with OpenFst (Allauzen et al., 2007).

5.1 Sentence FSTs

These FSTs provide the basis for our translation process. We represent sentences by building a simple linear chain FST, progressing from node to node with each arc accepting and yielding one word in the sentence. All arcs are weighted with probability one.

5.2 Noise FST

The noise model provides a conditional probability over child sentences given an adult sentence. We encode this model as a FST with several states, allowing us to track the number of consecutive insertions or deletions. We allow only two of these operations in a row, thereby constraining the length of the output sentence. This constraint results in three states (insdel = 0, insdel = 1, insdel = 2), along with an end state. In our training data, only 2 sentence pairs cannot be described by the noise model due to this constraint.

Each arc in the FST has an ε or adult-language word as input symbol, and a possibly errorful child-language word or ε as output symbol. Each arc weight is the probability of transducing the input word to the output word, determined according to the parameterized distributions described in Section 4. Arcs corresponding to insertions or deletions lead to a new state (insdel++) and are not allowed from state insdel = 2. Substitution arcs all lead back to state insdel = 0. Word class information is given by a set of word lists for each non-lexical class.[5] Inflectional information is derived from CELEX.

5.3 Language Model FST

The language model provides a prior distribution over adult form sentences. We build a trigram language model FST with Kneser-Ney smoothing using OpenGRM (Roark et al., 2012). The language model is trained on all parent speech in the CHILDES studies from which our errorful sentences are drawn.

In the language model FST, the input and output words of each arc are identical. Arcs are weighted with the probability of the n-gram beginning with some prefix associated with the source node, and ending with the arc’s input/output word. In this setup, the probability of a string is the total weight of the path accepting and emitting that string.

5.4 Training

As detailed in Section 4, our noise model consists of a series of multinomial distributions which govern

[5] Word lists are included for reference with our data set.
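The per-word noise process of Section 4, including the insdel counter that the noise FST tracks as states, can be sketched in Python as follows. This is a minimal illustration, not the authors' FST implementation: the dictionary layout of `model`, the `"START"` class, the single `p_overgen` probability, and every entry of `TOY_MODEL` are assumptions made for the example. As a further simplification, any class that is neither uninflected nor auxiliary goes through the lemma-plus-inflection branch.

```python
import random

# Classes whose swaps pick a whole new word (no lemma/inflection split).
UNINFLECTED = {"modal", "conjunction", "determiner", "preposition", "wh", "negator"}

def draw(rng, dist):
    """Sample one outcome from a {outcome: probability} multinomial."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]

def corrupt(adult, model, rng):
    """Turn an adult sentence into a child sentence, one word at a time.

    adult is a list of (word, word_class, lemma, inflection) tuples.
    """
    child, insdel, prev_cls = [], 0, "START"
    for word, cls, lemma, infl in adult:
        while True:
            # Action selection (pseudocode lines 3-7).
            action = "swap" if insdel == 2 else draw(rng, model["action"][cls])
            if action == "delete":                          # lines 8-11
                insdel += 1
                out = None
            elif action == "insert":                        # lines 12-15
                insdel += 1
                new_cls = draw(rng, model["insert_class"][prev_cls])
                out = rng.choice(model["words"][new_cls])   # uniform draw
            else:                                           # swap, lines 16-29
                insdel = 0
                if cls in UNINFLECTED:
                    out = draw(rng, model["swap_word"][word])
                elif cls == "aux":
                    new_lemma = draw(rng, model["aux_lemma"][lemma])
                    new_infl = draw(rng, model["inflection"][infl])
                    out = model["combine"][(new_lemma, new_infl)]
                else:
                    new_infl = draw(rng, model["inflection"][infl])
                    out = model["combine"][(lemma, new_infl)]
            # Overgeneralization (lines 30-32).
            if out in model["overgen"] and rng.random() < model["p_overgen"]:
                out = model["overgen"][out]
            if out is not None:
                child.append(out)
            if action != "insert":   # lines 33-35: after an insert, re-process the word
                break
        prev_cls = cls
    return child

# Hypothetical, fully deterministic toy parameters.
TOY_MODEL = {
    "action": {"determiner": {"swap": 1.0}, "be": {"delete": 1.0},
               "negator": {"swap": 1.0}, "wh": {"swap": 1.0},
               "pronoun": {"swap": 1.0}, "lexical": {"swap": 1.0}},
    "insert_class": {}, "words": {}, "aux_lemma": {},
    "swap_word": {"that": {"that": 1.0}, "not": {"not": 1.0}, "how": {"how": 1.0}},
    "inflection": {None: {None: 1.0}, "past": {"past": 1.0}},
    "combine": {("i", None): "i", ("it", None): "it", ("make", "past"): "made"},
    "overgen": {"made": "maked"},
    "p_overgen": 1.0,
}

rng = random.Random(0)
print(corrupt([("that", "determiner", "that", None), ("is", "be", "be", "3sg"),
               ("not", "negator", "not", None), ("how", "wh", "how", None)],
              TOY_MODEL, rng))   # ['that', 'not', 'how']
print(corrupt([("i", "pronoun", "i", None), ("made", "lexical", "make", "past"),
               ("it", "pronoun", "it", None)], TOY_MODEL, rng))   # ['i', 'maked', 'it']
```

Because the toy distributions put all mass on one outcome, the two calls reproduce a “be” deletion and a past-tense overgeneralization deterministically. With learned, non-degenerate distributions the same loop would sample varied child forms.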
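[Figure 1: a simplified FST_decode example. Only fragments of the diagram survive: states 0, 1, 2 and arc labels “that:that” and “is:”.]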
place of the adult sentence, the language model FST is used, yielding:

$$FST_{decode} = FST_{lm} \circ FST_{noise} \circ FST_{s}$$

Each path through $FST_{decode}$ corresponds to an adult translation and derivation $(t, d)$, with path weight $P(s, d \mid t, \theta)\, P(t)$. Thus, the highest-weight path corresponds to the most likely translation and derivation pair:

$$\arg\max_{t,d} P(t, d \mid s, \theta)$$

We use a dynamic program to find the $n$ highest-weight paths with distinct adult sentences $t$. This can be viewed as finding the $n$ most likely adult translations, using a Viterbi approximation $P(t \mid s, \theta) \approx \max_{d} P(t, d \mid s, \theta)$. In our experiments we set $n = 50$. A simplified $FST_{decode}$ example is shown in Figure 1.

5.6 Discriminative Reranking

To more flexibly capture long range syntactic features, we embed our noisy channel model in a discriminative reranking procedure. For each child sentence $s$, we take the n-best candidate translations $t_1, \ldots, t_n$ from the underlying generative model, as described in the previous section. We then map each candidate translation $t_i$ to a $d$-dimensional feature vector $f(s, t_i)$. The reranking model then uses a $d$-dimensional weight vector $\lambda$ to predict the candidate translation with highest linear score:

$$t^{*} = \arg\max_{t_i} \lambda \cdot f(s, t_i)$$

To simulate test conditions, we train the weight vector on n-best lists from 8-fold cross-validation over training data, using the averaged perceptron reranking algorithm (Collins and Roark, 2004). Since the n-best list might not include the exact gold-standard correction, a target correction which maximizes our evaluation metric is chosen from the list. The n-best lists are not linearly separable, so perceptron training iterates for 1000 rounds, at which point it is terminated without converging.

Our feature function $f(s, t_i)$ yields nine boolean and real-valued features derived from (i) the FST that generates child sentence $s$ from candidate adult-form $t_i$, and (ii) the POS sequence and dependency parse of candidate $t_i$ obtained with the Stanford Parser (de Marneffe et al., 2006). Features were selected based on their performance in reranking held-out development data from the training set. Reranking features are given below:

Generative Model Probabilities: We first include the joint probability of the child sentence $s$ and candidate translation $t_i$, given by the generative model: $P_{lm}(t_i)\, P_{noise}(s \mid t_i)$. We also isolate the candidate translation’s language model and noise model probabilities as features. Since both of these probabilities naturally favor shorter sentences, we scale them to sentence length, yielding $\sqrt[n]{P_{lm}(t_i)}$ and $\sqrt[n]{P_{noise}(s \mid t_i)}$ respectively. By not scaling the joint probability, we allow the reranker to learn its own bias towards longer or shorter corrected sentences.

Contains Noun Subject, Accusative Noun Subject: The first boolean feature indicates whether the dependency parse of candidate translation $t_i$ contains a “nsubj” relation. The second indicates if a “nsubj” relation exists where the dependent is an accusative pronoun (e.g. “Him ate the cookie”). These features and the one following have previously been used in classifier based error detection (Morley and Prud’hommeaux, 2012).

Contains Finite Verb: This boolean feature is true if the POS tags of $t_i$ include a finite verb. This feature differentiates structures like “I am going” from “I going.”

Question Template Features: We define templates for wh- and yes-no questions. A sentence fits the wh-question template if it begins with a wh-word, followed by an auxiliary or copula verb (e.g. “Who did…”). A sentence fits the yes-no template when it begins with an auxiliary or copula verb, then a noun subject followed by a verb or adjective (e.g. “Are you going…”). We include one boolean feature for each of these templates indicating when a template match is inappropriate, that is, when the original child utterance terminates in a period instead of a question mark. In addition to the two features for inappropriate template matches, we have a single feature that signals appropriate matches of either question template, that is, when the original child utterance terminates in a question mark.
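The linear reranking score and the length scaling of the probability features can be sketched as below. This is an illustrative toy, not the paper's reranker: the two-feature vectors, the candidate dictionaries, and the function names are assumptions, and the update shown is a single plain perceptron step, whereas the paper uses the averaged perceptron of Collins and Roark (2004), which additionally averages the weights over all updates.

```python
import math

def length_normalized(log_prob, n_words):
    """n-th root of a probability, computed in log space as exp(log p / n)."""
    return math.exp(log_prob / n_words)

def score(weights, features):
    """Linear score: dot product of the weight and feature vectors."""
    return sum(w * f for w, f in zip(weights, features))

def rerank(candidates, weights):
    """Return the candidate whose feature vector has the highest linear score."""
    return max(candidates, key=lambda c: score(weights, c["features"]))

def perceptron_update(weights, target, predicted):
    """One (unaveraged) perceptron step toward the target candidate."""
    if predicted is target:
        return list(weights)
    return [w + ft - fp for w, ft, fp
            in zip(weights, target["features"], predicted["features"])]

# Hypothetical two-feature candidates: [normalized LM probability, has finite verb].
good = {"text": "I am going", "features": [0.5, 1.0]}
bad = {"text": "I going", "features": [0.6, 0.0]}
weights = [1.0, 0.0]
guess = rerank([good, bad], weights)            # initially prefers "I going"
weights = perceptron_update(weights, good, guess)
print(rerank([good, bad], weights)["text"])     # I am going
```

After one update the finite-verb feature receives positive weight, so the candidate with a finite verb outranks the one without, mirroring the role of the Contains Finite Verb feature described above.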
Child Utterance              Human Correction                   Machine Correction
I am not put in my mouth.    I am not putting it in my mouth.   I am not going to put it in my mouth.
This one have water?         Does this one have water?          This one has water?
Want to read the book.       I want to read the book.           You want to read the book.
Why you going to get two?    Why are you going to get two?      Why are you going to have two?
You very sticky.             You are very sticky.               You are very sticky.
He no like.                  He does not like it.               He does not like that.
Yeah it looks a lady.        Yeah it looks like a lady          Yeah it looks like a lady.
Eleanor come too.            Eleanor came too.                  Eleanor come too.
Desk in here.                The desk is in here                Desk is in here.
Why he’s doc?                Why is he called doc?              He’s up doc?

Table 2: Randomly selected test output generated by our complete error correction model, along with corresponding child utterances and human corrections.

6 Experiments and Analysis

Baselines: We compare our system’s performance with two pre-existing baselines. The first is a standard phrase-based machine translation system using MOSES (Koehn et al., 2007) with GIZA++ (Och and Ney, 2003) word alignments. We hold out 9% of the training data for tuning using the MERT algorithm with BLEU objective (Och, 2003).

The second baseline is our implementation of the ESL error correction system described by Park and Levy (2011). Like our system, this baseline trains FST noise models using EM in the V-expectation semiring. Our noise model is crafted specifically for the child language domain, and so differs from Park and Levy’s in several ways: First, we capture a wider range of word-swaps, with richer parameterization allowing many more translation options. As a result, our model has 6,718 parameters, many more than the ESL model’s 187. These parameters correspond to learned probability distributions, whereas in the ESL model many of the distributions are fixed as uniform. We also capture a larger class of errors, including deletions, change of auxiliary lemma, and inflectional overgeneralizations. Finally, we use a discriminative reranking step to model long-range syntactic dependencies. Although the ESL model is originally geared towards fully unsupervised training, we train this baseline in the same supervised framework as our model.

Evaluation and Performance: We train all models on 80% of our child-adult sentence pairs and test on the remaining 20%. For illustration, selected output from our model is shown in Table 2.

Predictions are evaluated with BLEU score (Papineni et al., 2002) and Word Error Rate (WER), defined as the minimum string edit distance (in words) between reference and predicted translations, divided by length of the reference. As a control, we compare all results against scores for the uncorrected child sentences themselves. As reported in Table 3, our model achieves the best scores for both metrics. BLEU score increases from 50 for child sentences to 62, while WER is reduced from .271 to .224. Interestingly, MOSES achieves a BLEU score of 58, still four points below our model, but actually increases WER to .449. For both metrics, the ESL system increases error. This is not surprising given that its intended application is in an entirely different domain.

                   BLEU    WER
WER reranking      62.12   .224
BLEU reranking     60.86   .231
No reranking       60.37   .233
Moses              58.29   .449
ESL                40.76   .318
Child Sentences    49.55   .271

Table 3: WER and BLEU scores. Our system’s performance using various reranking schemes (BLEU objective, WER objective and none) is contrasted with Moses MT and ESL error correction baselines, as well as uncorrected test sentences. Best performance under each metric is shown in bold.

Error Analysis: We measured the performance of our model over the six most common categories of child divergence, including deletions of various function words and overgeneralizations of past tense forms (e.g. “maked” for “made”). We first identified model parameters associated with each category, and then counted the number of correct and incorrect parameter firings on the test sentences. As Table 4 indicates, our model performs reasonably well on “be” verb deletions, preposition deletions, and overgeneralizations, but has difficulty correcting pronoun and auxiliary deletions.

In general, hypothesizing dropped words burdens the noise model by adding additional draws from multinomial distributions to the derivation. To predict a deletion, either the language model or the reranker must strongly prefer including the omitted word. A syntax-based noise model may achieve better performance in detecting and correcting child word drops.

While our model parameterization and performance rely on the largely constrained nature of child language errors, we observe some instances in which it is overly restrictive. For 10% of utterances in our corpus, it is impossible to recover the exact gold-standard adult sentence. These sentences feature errors like reordering or lexical lemma swaps, for example “I talk Mexican” for “I speak Spanish.” While our model may correct other errors in these sentences, a perfect correction is unattainable.

Sometimes, our model produces appropriate forms which by happenstance do not conform to the annotators’ decision. For example, in the second row of Table 2, the model corrects “This one have water?” to “This one has water?”, instead of the more verbose correction chosen by the annotators (“Does this one have water?”). Similarly, our model sometimes produces corrections which seem appropriate in isolation, but do not preserve the meaning implied by the larger conversational context. For example, in row three of Table 2, the sentence “Want to read the book.” is recognized both by our human annotators and the system as requiring a pronoun subject. Unlike the annotators, however, the model has no knowledge of conversational context, so it chooses the highest probability pronoun (in this case “you”) instead of the contextually correct “I.”

Error Type          Count   F1    P     R
Be Deletions        63      .84   .84   .84
Pronoun Deletions   30      .15   .38   .1
Aux. Deletions      30      .21   .44   .13
Prep. Deletions     26      .65   .82   .54
Det. Deletions      22      .48   .73   .36
Overgen. Past       7       .92   1.0   .86

Table 4: Frequency of the six most common error types in test data, along with our model’s corresponding F-measure, precision and recall. All counts are ±.12 at p = .05 under a binomial normal approximation interval.

[Figure 2 plot: x-axis “% Train Data” (0% to 100%); y-axes BLEU (40 to 65) and WER (.20 to .32).]

Figure 2: Performance with limited training data. WER is drawn as the dashed line, and BLEU as the solid line.

Learning Curves: In Figure 2, we see that the learning curves for our model initially rise sharply, then remain relatively flat. Using only 10% of our training data (80 sentences), we increase BLEU from 44 (using just the language model) to almost 61. We only reach our reported BLEU score of 62 when adding the final 20% of training data. This result emphasizes the specificity of our parameterization. Because our model is so tailored to the child-language scenario, only a few examples of each error type are needed to find good parameter values. We suspect that more annotated data would lead to a continued but slow increase in performance.

Training and Testing across Children: We use our system to investigate the hypothesis that language acquisition follows a similar path across children (Brown, 1973). To test this hypothesis, we train our model on all children excluding Adam, who alone is responsible for 21% of our sentences. We then test the learned model on the separated Adam
data. These results are contrasted with performance of 8-fold cross validation training and testing solely on Adam’s utterances. Performance statistics are given in Table 5.

Trained on:    BLEU    WER
Adam           72.58   .226
All Others     69.83   .186
Uncorrected    45.54   .278

Table 5: Performance on Adam’s sentences training on other children, versus training on himself. Best performance under each metric is shown in bold.

We first note that models trained in both scenarios lead to large error reductions over the child sentences. This provides evidence that our model captures general, and not child-specific, error patterns. Although training exclusively on Adam does lead to increased BLEU score (72.58 vs 69.83), WER is minimized when using the larger volume of training data from other children (.186 vs .226). Taken as a whole, these results suggest that training and testing on separate children does not degrade performance. This finding supports the general hypothesis of shared developmental paths.

Plotting Child Language Errors over Time: After training on annotated data, we predict divergences in all available data from the children in Roger Brown’s 1973 study (Adam, Eve and Sarah) as well as Abe (Kuczaj, 1977), a child from a separate study over a similar age-range. We plot each child’s per-utterance frequency of preposition omissions in Figure 3. Since we evaluate over 65,000 utterances and reranking has no impact on preposition drop prediction, we skip the reranking step to save computation.

In Figure 3, we see that Adam and Sarah’s preposition drops spike early, and then gradually decrease in frequency as their preposition use moves towards that of an adult. Although Eve’s data covers an earlier time period, we see that her pattern of preposition drops shows a similar spike and gradual decrease. This is consistent with Eve’s general language precocity. Brown’s conclusion, that the language development of these three children advanced in similar stages at different times, is consistent with our predictions. However, when we examine Abe we do not observe the same pattern.[7] This points to a degree of variance across children, and suggests the use of our model as a tool for further empirical refinement of language development hypotheses.

[Figure 3 plot: x-axis “Age (Months)” (18 to 58); y-axis “Per-Utterance Frequency” (0 to 0.14); series Adam, Eve, Sarah, Abe.]

Figure 3: Automatically detected preposition omissions in un-annotated utterances from four children over time. Assuming perfect model predictions, frequencies are ±.002 at p = .05 under a binomial normal approximation interval. Prediction error is given in Table 4.

[7] Though it is of course possible that a similar spike and drop-off occurred earlier in Abe’s development.

Discussion: Our error correction system is designed to be more constrained than a full-scale MT system, focusing parameter learning on errors that are known to be common to child language learners. Reorderings are prohibited, lexical word swaps are limited to inflectional changes, and deletions are restricted to function word categories. By highly restricting our hypothesis space, we provide an inductive bias for our model that matches the child language domain. This is particularly important since the size of our training set is much smaller than that usually used in MT. Indeed, as Figure 2 shows, very little data is needed to achieve good performance.

In contrast, the ESL baseline suffers because its generative model is too restricted for the domain of transcribed child language. As shown above in Table 4, child deletions of function words are the most frequent error types in our data. Since the ESL model does not capture word deletions, and has a more restricted notion of word swaps, 88% of child sentences in our training corpus cannot be translated to their reference adult versions. The result is that the ESL model tends to rely too heavily on the language model. For example, on the sentence “I coming to you,” the ESL model improves n-gram probability by producing “I came to you” instead of the correct “I am coming to you”. This increases error over the child sentence itself.

In addition to the domain-specific generative model, our approach has the advantage of long-range syntactic information encoded by reranking features. Although the perceptron algorithm places high weight on the generative model probability, it alters the predictions in 17 out of 201 test sentences, in all cases an improvement. Three of these reranking changes add a noun subject, five enforce question structure, and nine add a main verb.

7 Conclusion and Future Work

In this paper we introduce a corpus of divergent child sentences with corresponding adult forms, enabling the systematic computational modeling of child language by relating it to adult grammar. We propose a child-to-adult translation task as a means to investigate child language development, and provide an initial model for this task.

Our model is based on a noisy-channel assumption, allowing for the deletion and corruption of individual words, and is trained using FST techniques. Despite the debatable cognitive plausibility of our setup, our results demonstrate that our model captures many standard divergences and reduces the average error of child sentences by approximately 20%, with high performance on specific frequently occurring error types.

The model allows us to chart aspects of language development over time, without the need for additional human annotation. Our experiments show that children share common developmental stages in language learning, while pointing to child-specific subtleties in preposition use.

In future work, we intend to dynamically model child language ability as it grows and shifts in response to internal processes and external stimuli. We also plan to develop and train models specializing in the detection of specific error categories. By explicitly shifting our model’s objective from child-adult translation to the detection of some particular error, we hope to improve our analysis of child divergences over time.

Acknowledgments

The authors thank the reviewers and acknowledge support by the NSF (grant IIS-1116676) and a research gift from Google. Any opinions, findings, or conclusions are those of the authors, and do not necessar
ilyreflecttheviewsoftheNSF.ReferencesA.Alishahi.2010.Computationalmodelingofhumanlanguageacquisition.SynthesisLecturesonHumanLanguageTechnologies,3(1):1–107.C.Allauzen,M.Riley,J.Schalkwyk,W.Skut,andM.Mohri.2007.OpenFst:Ageneralandefficientweightedfinite-statetransducerlibrary.Implementa-tionandApplicationofAutomata,pages11–23.R.H.Baayen,R.Piepenbrock,andL.Gulikers.1996.CELEX2(CD-ROM).LinguisticDataConsortium.E.Bates,I.Bretherton,andL.Snyder.1988.Fromfirstwordstogrammar:Individualdifferencesanddisso-ciablemechanisms.CambridgeUniversityPress.D.C.BellingerandJ.B.Gleason.1982.Sexdifferencesinparentaldirectivestoyoungchildren.SexRoles,8(11):1123–1139.L.Bliss.1988.Thedevelopmentofmodals.JournalofAppliedDevelopmentalPsychology,9:253–261.L.Bloom,L.Hood,andP.Lightbown.1974.Imitationinlanguagedevelopment:Si,cuando,andwhy.CognitivePsychology,6(3):380–420.L.Bloom,P.Lightbown,L.Hood,M.Bowerman,M.Maratsos,andM.P.Maratsos.1975.Structureandvariationinchildlanguage.MonographsoftheSoci-etyforResearchinChildDevelopment,pages1–97.L.Bloom.1973.Onewordatatime:Theuseofsinglewordutterancesbeforesyntax.Mouton.P.Bloom.1990.Subjectlesssentencesinchildlanguage.LinguisticInquiry,21(4):491–504.J.N.BohannonIIIandA.L.Marquis.1977.Chil-dren’scontrolofadultspeech.ChildDevelopment,48(3):1002–1008.R.Brown.1973.Afirstlanguage:Theearlystages.HarvardUniversityPress.V.Carlson-Luden.1979.Causalunderstandinginthe10-month-old.Ph.D.thesis,UniversityofColoradoatBoulder.E.C.CarteretteandM.H.Jones.1974.Informalspeech:Alphabetic&phonemictextswithstatisticalanalysesandtables.UniversityofCaliforniaPress.M.ChodorowandC.Leacock.2000.Anunsupervisedmethodfordetectinggrammaticalerrors.InProceed-ingsoftheNorthAmericanChapteroftheAssociationforComputationalLinguistics,pages140–147.
M. Collins and B. Roark. 2004. Incremental parsing with the perceptron algorithm. In Proceedings of the Association for Computational Linguistics, pages 111–118, Barcelona, Spain, July.
M. Connor, Y. Gertner, C. Fisher, and D. Roth. 2008. Baby SRL: Modeling early language acquisition. In Proceedings of the Conference on Computational Natural Language Learning, pages 81–88.
R. Dale and A. Kilgarriff. 2011. Helping our own: The HOO 2011 pilot shared task. In Proceedings of the European Workshop on Natural Language Generation, pages 242–249.
M.C. de Marneffe, B. MacCartney, and C.D. Manning. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of The International Conference on Language Resources and Evaluation, volume 6, pages 449–454.
M.J. Demetras, K.N. Post, and C.E. Snow. 1986. Feedback to first language learners: The role of repetitions and clarification questions. Journal of Child Language, 13(2):275–292.
M.J. Demetras. 1989. Working parents’ conversational responses to their two-year-old sons.
M. Dreyer, J.R. Smith, and J. Eisner. 2008. Latent-variable modeling of string transductions with finite-state methods. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1080–1089.
A. Echihabi and D. Marcu. 2003. A noisy-channel approach to question answering. In Proceedings of the Association for Computational Linguistics, pages 16–23.
J. Eisner. 2001. Expectation semirings: Flexible EM for learning finite-state transducers. In Proceedings of the ESSLLI workshop on finite-state methods in NLP.
M. Gamon. 2011. High-order sequence modeling for language learner error detection. In Proceedings of the Workshop on Innovative Use of NLP for Building Educational Applications, pages 180–189.
L.C.G. Haggerty. 1930. What a two-and-one-half-year-old child said in one day. The Pedagogical Seminary and Journal of Genetic Psychology, 37(1):75–101.
W.S. Hall, W.C. Tirre, A.L. Brown, J.C. Campoine, P.F. Nardulli, H.O. Abdulrahman, M.A. Sozen, W.C. Schnobrich, H. Cecen, J.G. Barnitz, et al. 1979. The communicative environment of young children: Social class, ethnic, and situational differences. Bulletin of the Center for Children’s Books, 32:08.
W.S. Hall, W.E. Nagy, and R.L. Linn. 1980. Spoken words: Effects of situation and social group on oral word usage and frequency. University of Illinois at Urbana-Champaign, Center for the Study of Reading.
W.S. Hall, W.E. Nagy, and G. Nottenburg. 1981. Situational variation in the use of internal state words. Technical report, University of Illinois at Urbana-Champaign, Center for the Study of Reading.
H. Hamburger and S. Crain. 1984. Acquisition of cognitive compiling. Cognition, 17(2):85–136.
R.P. Higginson. 1987. Fixing: Assimilation in language acquisition. University Microfilms International.
M.H. Jones and E.C. Carterette. 1963. Redundancy in children’s free-reading choices. Journal of Verbal Learning and Verbal Behavior, 2(5-6):489–493.
P. Koehn, H. Hoang, A. Birch, C. Callison-Burch, M. Federico, N. Bertoldi, B. Cowan, W. Shen, C. Moran, R. Zens, et al. 2007. Moses: Open source toolkit for statistical machine translation. In Proceedings of the Association for Computational Linguistics (Interactive Poster and Demonstration Sessions), pages 177–180.
S.A. Kuczaj. 1977. The acquisition of regular and irregular past tense forms. Journal of Verbal Learning and Verbal Behavior, 16(5):589–600.
J. Lee and S. Seneff. 2006. Automatic grammar correction for second-language learners. In Proceedings of the International Conference on Spoken Language Processing.
X. Lu. 2009. Automatic measurement of syntactic complexity in child language acquisition. International Journal of Corpus Linguistics, 14(1):3–28.
B. MacWhinney. 2000. The CHILDES project: Tools for analyzing talk, volume 2. Psychology Press.
B. MacWhinney. 2007. The TalkBank project. Creating and digitizing language corpora: Synchronic Databases, 1:163–180.
M. Mohri. 2008. System and method of epsilon removal of weighted automata and transducers, June 3. US Patent 7,383,185.
E. Morley and E. Prud’hommeaux. 2012. Using constituency and dependency parse features to identify errorful words in disordered language. In Proceedings of the Workshop on Child, Computer and Interaction.
A. Ninio, C.E. Snow, B.A. Pan, and P.R. Rollins. 1994. Classifying communicative acts in children’s interactions. Journal of Communication Disorders, 27(2):157–187.
F.J. Och and H. Ney. 2003. A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1):19–51.
F.J. Och. 2003. Minimum error rate training in statistical machine translation. In Proceedings of the Association for Computational Linguistics, pages 160–167.
R.E. Owens. 2008. Language development: An introduction. Pearson Education, Inc.
K. Papineni, S. Roukos, T. Ward, and W.J. Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the Association for Computational Linguistics, pages 311–318.
Y.A. Park and R. Levy. 2011. Automated whole sentence grammar correction using a noisy channel model. Proceedings of the Association for Computational Linguistics, pages 934–944.
A.M. Peters. 1987. The role of imitation in the developing syntax of a blind child in perspectives on repetition. Text, 7(3):289–311.
K. Post. 1992. The language learning environment of laterborns in a rural Florida community. Ph.D. thesis, Harvard University.
C. Quirk, C. Brockett, and W. Dolan. 2004. Monolingual machine translation for paraphrase generation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 142–149.
T. Regier. 2005. The emergence of words: Attentional learning in form and meaning. Cognitive Science, 29(6):819–865.
A. Ritter, C. Cherry, and W.B. Dolan. 2011. Data-driven response generation in social media. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 583–593.
B. Roark, R. Sproat, C. Allauzen, M. Riley, J. Sorensen, and T. Tai. 2012. The OpenGrm open-source finite-state grammar software libraries. In Proceedings of the Association for Computational Linguistics (System Demonstrations), pages 61–66.
A. Rozovskaya, M. Sammons, J. Gioja, and D. Roth. 2011. University of Illinois system in HOO text correction shared task. In Proceedings of the European Workshop on Natural Language Generation, pages 263–266.
J. Sachs. 1983. Talking about the there and then: The emergence of displaced reference in parent-child discourse. Children’s Language, 4.
K. Sagae, A. Lavie, and B. MacWhinney. 2005. Automatic measurement of syntactic development in child language. In Proceedings of the Association for Computational Linguistics, pages 197–204.
S. Sahakian and B. Snyder. 2012. Automatically learning measures of child language development. Proceedings of the Association for Computational Linguistics (Volume 2: Short Papers), pages 95–99.
C.E. Snow, F. Shonkoff, K. Lee, and H. Levin. 1986. Learning to play doctor: Effects of sex, age, and experience in hospital. Discourse Processes, 9(4):461–473.
E.L. Stine and J.N. Bohannon. 1983. Imitations, interactions, and language acquisition. Journal of Child Language, 10(03):589–603.
X. Sun, J. Gao, D. Micol, and C. Quirk. 2010. Learning phrase-based spelling error models from clickthrough data. In Proceedings of the Association for Computational Linguistics, pages 266–274.
P. Suppes. 1974. The semantics of children’s language. American Psychologist, 29(2):103.
T.Z. Tardif. 1994. Adult-to-child speech and language acquisition in Mandarin Chinese. Ph.D. thesis, Yale University.
V. Valian. 1991. Syntactic subjects in the early speech of American and Italian children. Cognition, 40(1-2):21–81.
L. Van Houten. 1986. Role of maternal input in the acquisition process: The communicative strategies of adolescent and older mothers with their language learning children. In Boston University Conference on Language Development.
A. Warren-Leubecker and J.N. Bohannon III. 1984. Intonation patterns in child-directed speech: Mother-father differences. Child Development, 55(4):1379–1385.
A. Warren. 1982. Sex differences in speech to children. Ph.D. thesis, Georgia Institute of Technology.
B. Wilson and A.M. Peters. 1988. What are you cookin’ on a hot?: A three-year-old blind child’s ‘violation’ of universal constraints on constituent movement. Language, 64:249–273.