Transactions of the Association for Computational Linguistics, vol. 4, pp. 87–98, 2016. Action Editor: Alexander Clark.

Transactions of the Association for Computational Linguistics, vol. 4, pp. 87–98, 2016. Action Editor: Alexander Clark.
Submission batch: 7/2015; Revision batch: 11/2015; Published 4/2016.

2016 Association for Computational Linguistics. Distributed under a CC-BY 4.0 Licence.

c
(cid:13)

LearningTier-basedStrictly2-LocalLanguagesAdamJardineandJeffreyHeinzUniversityofDelaware{ajardine,heinz}@udel.eduAbstractTheTier-basedStrictly2-Local(TSL2)lan-guagesareaclassofformallanguageswhichhavebeenshowntomodellong-distancephonotacticgeneralizationsinnaturallan-guage(Heinzetal.,2011).Thispaperin-troducestheTier-basedStrictly2-LocalIn-ferenceAlgorithm(2TSLIA),thefirstnon-enumerativelearnerfortheTSL2languages.Weprovethe2TSLIAisguaranteedtocon-vergeinpolynomialtimeonadatasamplewhosesizeisboundedbyaconstant.1IntroductionThisworkpresentstheTier-basedStrictly2-LocalInferenceAlgorithm(2TSLIA),anefficientlearn-ingalgorithmforaclassofTier-basedStrictlyLo-cal(TSL)formallanguages(Heinzetal.,2011).ATSLclassisdeterminedbytwoparameters:thetier,orsubsetofthealphabet,andthepermissibletierk-factors,whicharethelegalsequencesoflengthkallowedinthestring,onceallnon-tiersymbolshavebeenremoved.TheTier-basedStrictly2-Local(TSL2)languagesarethoseinwhichk=2.Aswillbediscussedbelow,theTSLlanguagesareofinteresttophonologybecausetheycanmodelawidevarietyoflong-distancephonotacticpatternsfoundinnaturallanguage(Heinzetal.,2011;Mc-MullinandHansson,forthcoming).OneexampleisderivedfromLatinliquiddissimilation,inwhichtwolscannotappearinawordunlessthereisanrintervening,regardlessofdistance.Forexam-ple,floralis‘floral’iswell-formedbutnot*mili-talis(cf.militaris‘military’).Asexplainedinsec-tions2and4,thiscanbemodeledwithpermissible2-factorsoveratierconsistingoftheliquids{je,r}.Forlong-distancephonotactics,kcanbefixedto2,butitisdoesnotappearthatthetiercanbefixedsincelanguagesemployavarietyofdifferenttiers.Thispresentsaninterestinglearningproblem:Givenafixedk,howcananalgorithminducebothatierandasetofpermissibletierk-factorsfrompositivedata?Thereissomerelatedworkwhichaddressesthisquestion.GoldsmithandRiggle(2012),buildingonworkbyGoldsmithandXanthos(2009),presentamethodbasedonmutualinformationforlearn-ingtiersandsubsequentlylearningharmonypat-terns.Thispaperdiffersinthatitsmethodsarerootedfirmlyingrammaticalinferenceandformallanguagetheory(delaHiguera,2010).Forinstance,incontrasttotheresultspresentedthere,weprovethekindsofpatterns2TSLIAsucceedsonandthekindofdatasufficientforittodoso.Nonetheless,thereisrelevantworkincomputa-tionallearningtheory:Gold(1967)provedthatanyfiniteclassoflanguagesisidentifiableinthelimitviaanenumerationmethod.Givenafixedalphabetandafixedk,thenumberofpossibletiersandper-missibletierk-factorsisfinite,andthuslearnableinthisway.However,suchlearnersaregrosslyinef-ficient.Noprovably-correct,non-enumerative,ef-ficientlearnerforboththetierandpermissibletierk-factorparametershaspreviouslybeenproposed.Thisworkfillsthisgapwithanalgorithmwhichlearnstheseparameterswhenk=2frompositivedataintimepolynomialinthesizeofthedata.Finally,Jardine(2016)presentsasimplifiedver-

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
0
8
5
1
5
6
7
3
7
6

/

/
t

je

un
c
_
un
_
0
0
0
8
5
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

88

sionof2TSLIAandreportstheresultsofsomesim-ulations.Unlikethatpaper,thepresentworkpro-videsthefullmathematicaldetailsandproofs.Thesimulationsarediscussedinthediscussionsection.Thispaperisstructuredasfollows.§2moti-vatestheworkwithexamplesfromnaturallanguagephonology.§3outlinesthebasicconceptsanddef-initionstobeusedthroughoutthepaper.§4definestheTSLlanguagesanddiscussestheirproperties.§5detailsthe2TSLIAandprovesitlearnstheTSL2classinpolynomialtimeanddata.§6discussesfu-turework,and§7concludes.2LinguisticmotivationTheprimarymotivationforstudyingtheTSLlan-guagesandtheirlearnabilitycomesfromtheirrel-evancetophonotactic(wordwell-formedness)pat-ternsinnaturallanguagephonology.Manyphono-tacticpatternsbelongtoeithertheStrictlyLocal(SL)languages(McNaughtonandPapert,1971)ortheStrictlyPiecewise(SP)languages(Rogersetal.,2010).1AnexampleofphonotacticknowledgewhichisSLisChomskyandHalle’s(ChomskyandHalle,1965)observationthatEnglishspeakerswillclassifyblickasapossiblewordofEnglishwhilerejectingbnickasimpossible.ThisisSLbecauseitcanbedescribedas*bnbeingunacceptableasaword-initialsequence.SPlanguagescandescribelong-distancedependenciesbasedonprecedencere-lationshipsbetweensounds(suchasconsonanthar-monyinSarcee,inwhichsmaynotfollowaSany-whereinaword,butmayprecedeone(Cook,1984))(Heinz,2010un).Aussi,theSLandSPlanguagesareefficientlylearnable(Garc´ıaetal.,1990;Heinz,2010un;Heinz,2010b;HeinzandRogers,2013).Cependant,therearesomelong-distancepatternswhichcannotbedescribedpurelywithprecedencerelationships.OneexampleisapatternfromLatin,inwhichincertaincasesanlcannotfollowanotherlunlessanrintervenes,nomatterthedistancebe-tweenthem(Jensen,1974;Odden,1994;Heinz,2010un).Thiscanbeseeninthe-alisadjectivalsuf-fix(Example1),whichappearsas-arisiftheworditattachestoalreadycontainsanl((d)through(F)1Therelationshipbetweentheseformallanguageclassesandhumancognition(linguisticandotherwise)isdiscussedinmoredetailinRogersandPullum(2011),RogersandHauser(2010),Rogersetal.(2013),andLai(2013;2015).inExample1),exceptincaseswherethereisanin-terveningr,inwhichitappearsagainas-alis((g)through(je)inExample1).Intheexamples,thesoundsinquestionareunderlinedforemphasis(datafromHeinz(2010un)),andfor(d)through(F),illicitformsaregiven,markedwithastar(*).Example1.a.navalis‘naval’b.episcopalis‘episcopal’c.infinitalis‘negative’d.solaris‘solar’(*solalis)e.lunaris‘lunar’(*lunalis)f.militaris‘military’(*militalis)g.floralis‘floral’h.sepulkralis‘funereal’i.litoralis‘oftheshore’Thisnon-localalternatingpatternisnotSPbe-causeSPlanguagescannotexpressblockingeffects(Heinz,2010un).Cependant,itcanbedescribedwithaTSLgrammarinwhichthetieris{r,je}andthepermissibletier2-factorsdonotinclude*lland*rr.Thisyieldsexactlythesetofstringsinwhichanlisalwaysimmediately(disregardingallsoundsbesides{r,je})followedbyanr,andviceversa.FormallearningalgorithmsforSLandSPlan-guagescanprovideamodelforhumanlearningofSLandSPsoundpatterns(Heinz,2010un).TSLlan-guagesarealsosimilarlylearnable,giventhestip-ulationthatboththetierandkarefixed.Fornat-urallanguage,thevalueforkneverseemstogoabove2(Heinzetal.,2011).Cependant,tiersvaryinhumanlanguage—TSLpatternsoccurbothwithdifferentkindsofvowels(Nevins,2010)andcon-sonants(Suzuki,1998;Bennett,2013).Forexam-ple,inTurkishthetieristheentirevowelinven-tory(ClementsandSezer,1982),whileinFinnishitisvowelsexcept/i,e/(Ringen,1975).InSamalaconsonantharmony,thetierissibilants(RoseandWalker,2004),whereasinKoorete,thetierissibi-lantsand{b,r,g,d}(McMullinandHansson,forth-coming).Ainsi,itisofinteresttounderstandhowboththetierandpermissibletier2-factorsforTSL2grammarsmightbelearnedefficiently.3PreliminariesBasicknowledgeofsettheoryisassumed.LetS1−S2denotethesettheoreticdifferenceofsetsS1and

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
0
8
5
1
5
6
7
3
7
6

/

/
t

je

un
c
_
un
_
0
0
0
8
5
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

89

S2.LetP(S)denotethepowersetofSandPfin(S)bethesetofallfinitesubsetsofS.LetΣdenoteasetofsymbols,referredtoasthealphabet,andletastringoverΣbeafinitese-quenceofsymbolsfromthatalphabet.Thelengthofastringwwillbedenoted|w|.Letλdenotetheemptystring;|λ|=0.LetΣ∗(Kleenestar)repre-sentallstringsoverthisalphabet,andΣkrepresentallstringsoflengthk.Concatenationoftwostringswandv(orsymbolsσ1andσ2)willbewrittenwv(σ1σ2).Specialbeginning(o)andend(n)symbols(o,n6∈Σ)willoftenbeusedtomarkthebegin-ningandendofwords;thealphabetΣaugmentedwiththese,Σ∪{o,n},willbedenotedΣon.Astringu∈Σ∗issaidtobeafactororsub-stringofanotherstringwiftherearetwootherstringsv,v0∈Σ∗suchthatw=vuv0.Wecalluak-factorofwifitisafactorofwand|toi|=k.Letfack:Σ∗→P(Σ≤k)beafunctionmappingstringstotheirk-factors,wherefack(w)equals{toi|uisak-factorofw}si|w|>kandequals{w}otherwise.Forexample,fac2(aab)={aa,ab}andfac8(aab)={aab}.Weextendthek-factorfunc-tiontolanguages;fack(L)=Sw∈Lfack(w).3.1Grammars,languages,andlearningAlanguage(orstringset)LisasubsetofΣ∗.IfLisfinite,let|L|denotethecardinalityofL,andlet||L||denotethesizeofL,whichisdefinedtobePw∈L|w|.LetL1·L2denotetheconcatenationofthelanguagesL1andL2,i.e.,thepairwiseconcate-nationofeachwordinL1toeachwordinL2.Fornotationalsimplicity,theconcatenationofasingle-tonlanguage{w}toanotherlanguageL2(orL1to{w})willbewrittenwL2(L1w).Agrammarisafiniterepresentationofapos-siblyinfinitelanguage.AclassLoflanguagesisrepresentedbyaclassRofrepresentationsifeveryr∈RisoffinitesizeandthereisanamingfunctionL:R→Lwhichisbothtotalandsurjective.Thelearningparadigmusedinthispaperisidenti-ficationinthelimitlearningparadigm(Gold,1967),withpolynomialboundsontimeanddata(delaHiguera,1997).Thisparadigmhastwocomplemen-taryaspects.First,itrequiresthattheinformationwhichdistinguishesalearningtargetfromotherpo-tentialtargetsbepresentintheinputforalgorithmstosuccessfullylearn.Second,itrequiressuccessfulalgorithmstoreturnahypothesisintimepolynomialofthesizeofthesample,andthatthesizeofthesam-pleitselfmustbepolynomialinthesizeofgrammar.Thedefinitionofthelearningparadigm(Defini-tion3)dependsonsomepreliminarynotions.Definition1.LetLbeaclassoflanguagesrepre-sentedbysomeclassRofrepresentations.1.AninputsampleIforalanguageL∈LisafinitesetofdataconsistentwithL,thatistosayI⊆L.2.A(L,R.)-learningalgorithmAisaprogramthattakesasinputasampleforalanguageL∈LandoutputsarepresentationfromR.Thenotionofcharacteristicsampleisintegral.Definition2(Characteristicsample).Fora(L,R.)-learningalgorithmA,asampleCSisacharacteris-ticsampleofalanguageL∈LifforallsamplesIforLsuchthatCS⊆I,AreturnsarepresentationrsuchthatL(r)=L.Nowthelearningparadigmcanbedefined.Definition3(Identificationinpolynomialtimeanddata).AclassLoflanguagesisidentifiableinpolynomialtimeanddataifthereexistsa(L,R.)-learningalgorithmAandtwopolynomialsp()andq()suchthat:1.ForanysampleIofsizemforL∈L,Are-turnsahypothesisr∈RinO(p(m))time.2.Foreachrepresentationr∈Rofsizen,thereexistsacharacteristicsampleofrforAofsizeatmostO(q(n)).4Tier-basedStrictlyLocalLanguagesThissectionintroducestheTier-basedStrictlyLocal(TSL)languages(Heinzetal.,2011).TheTSLlan-guagesgeneralizetheSLlanguages(McNaughtonandPapert,1971;Garc´ıaetal.,1990;Caron,2000),andassuchthesewillbrieflybediscussedfirst.4.1TheStrictlyLocalLanguagesTheSLlanguagescanbedefinedasfollows(Heinzetal.,2011):Definition4(SLlanguages).AlanguageLisStrictlyk-Local(SLk)iffthereexistsafi-nitesetS⊆fack(oΣ∗n)suchthatL={w∈Σ∗|fack(propre)⊆S}

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
0
8
5
1
5
6
7
3
7
6

/

/
t

je

un
c
_
un
_
0
0
0
8
5
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

90

SuchasetSissometimesreferredtoastheper-missiblek-factorsorpermissiblesubstringsofL.Forexample,letL={λ,ab,abab,ababab,…}.ThisLcanbedescribedwithasetofpermissible2-factorsS={sur,oa,ab,ba,bn}becauseevery2-factorofeverywordinLisinS;thus,LisStrictly2-Local(abbreviatedSL2).AsasetSofpermissiblek-factorsisfiniteitcanalsobeviewedasaSLgrammarwhereL(S)={w∈Σ∗|fack(propre)⊆S}.AnelementsofanSLgrammarSforalanguageLisuseful(resp.use-less)iffs∈fack(L)(s6∈fack(L)).AcanonicalSLgrammarcontainsnouselesselements.Intheexampleabove,aa6∈Sandaa6∈fac2(w)foranyw∈L.Suchastringisreferredtoasaforbiddenk-factororarestrictiononL.Thesetofforbiddenk-factorsRisfack(oΣ∗n)−S.Think-ingaboutthegrammarintermsofSorintermsofRisequivalent,butinsomecasesitissimplertorefertooneratherthantheother,soweshalluseboth.AnySLkclassoflanguagesislearnablewithpolynomialboundsontimeanddataifkisknowninadvance(Garc´ıaetal.,1990;Heinz,2010b).TheclassofSLklanguages(foreachk)belongstoacollectionoflanguageclassescalledstringex-tensionlanguageclasses(Heinz,2010b).Thedis-cussionabovepresentsSLklanguagesfromthisper-spective.Theselanguageclasseshavemanydesir-ablelearningproperties,duetotheirunderlyinglat-ticestructure(Heinzetal.,2012).4.2TheTSLlanguagesTheTSLlanguagescanbethoughtofasafurtherpa-rameterizationonthek-factorfunctionwhereacer-tainsubsetofthealphabettakespartinthegrammarandallothersymbolsinthealphabetareignored.ThisspecialsubsetisreferredtoasatierT⊆Σ.Symbolsnotonthetierareremovedfromconsider-ationofthegrammarbyanerasingfunctionET:Σ∗→T∗definedasET(σ0σ1σn)=u0u1unwhereui=σiifσi∈T;elseui=λ.Forexample,ifΣ={un,b,c}andT={un,c}thenET(bbabbcbba)=aca.WecanthendefineatierversionfacT-koffackasfacT-k(w)=fack(oET(w)n).Ici,oandnarebuiltintothefunctionastheyarealwaystreatedaspartofthetier.Continu-ingtheexamplefromabove,facT-2(bbabbcbba)={oa,ac,ca,un}.facT-kcanbeextendedtolan-guagesaswithfackabove.TheTSLlanguagescannowbedefinedparalleltotheSLlanguages(thefollowingdefinitionisequiv-alenttotheoneinHeinzetal.(2011)):Definition5(TSLlanguages).AlanguageLisTier-basedStrictlyk-LocaliffthereexistsasubsetT⊆ΣofthealphabetandafinitesetS⊆fack(oT∗n)suchthatL={w∈Σ∗|facT-k(w)⊆S}ParalleltoSLgrammarsabove,hT,SicanbethoughtofasaTSLgrammarofL.Likewise,theforbiddentiersubstrings(ortierrestrictions)RissimplythesetfacT-k(Σ∗)−S.Finally,hT,SiiscanonicalifScontainsnouselesselements(i.e.,s∈S⇔s∈facT-k(L(hT,Si)))andRisnonempty(thissecondrestrictionisexplainedbelow).Forexample,letΣ={un,b,c}andT={un,c}asaboveandletS={sur,oa,oc,ac,un,ca,cc,cn}.PluggingtheseintoDefinition5,weobtainalanguageLwhichonlycontainsthosestringswithouttheforbidden2-factoraaontierT.Thesearewordswhichmaycontainbsin-terspersedwithasandcsprovidedthatnoapre-cedesanotherwithoutaninterveningc.Forexam-ple,bbabbcbba∈Lbutbbabbbbabb6∈L,becauseET(bbabbbbabb)=aaandaa∈fac2(oaan)butaa6∈S.LiketheclassofSLklanguages,theclassofTSLlanguages(forfixedTandk)isastringextensionlanguageclass.TherelationshipoftheTSLlan-guagestoothersub-Regularlanguageclasses(Mc-NaughtonandPapert,1971;Rogersetal.,2013)isstudiedinHeinzetal.(2011).GivenafixedkandT,SiseasilylearnableinthesamewayastheSLlanguages(Heinzetal.,2011).Cependant,asdiscussedabove,inthecaseofnaturallanguagephonology,itisnotclearthatinformationaboutTisavailableapriori.LearningbothTandSsimultaneouslyisthusaninterestingproblem.Thisproblemadmitsatechnical,butunsatisfying,solution.ThenumberofsubsetsTsuchthatT⊆Σisfinite,sothenumberofTSLklanguagesgivenafixedkisfinite.Itisalreadyknownthatanyfiniteclassoflanguageswhichcanbeenumeratedcanbeidentifiedinthelimitbyanalgorithmwhichchecksthroughtheenumeration(Gold,1967).Cependant,giventhecognitiverelevanceofTSLlanguages,it

je

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

p

:
/
/

d
je
r
e
c
t
.

m

je
t
.

e
d
toi

/
t

un
c
je
/

je

un
r
t
je
c
e

p
d

F
/

d
o

je
/

.

1
0
1
1
6
2

/
t

je

un
c
_
un
_
0
0
0
8
5
1
5
6
7
3
7
6

/

/
t

je

un
c
_
un
_
0
0
0
8
5
p
d

.

F

b
oui
g
toi
e
s
t

t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

91

isofinteresttopursueasmarter,computationallyefficientmethodforlearningthem.WhataretheconsequencesofvaryingT?WhenT=Σ,theresultisanSLklanguage,becausetheerasingfunctionoperatesvacuously(Heinzetal.,2011).Inversement,whenT=∅,byDefinition5,Siseither∅or{sur}.TheformerobtainstheemptylanguagewhilethelatterobtainsΣ∗.ByDefinition5,Σ∗canbedescribedwithanyT⊆ΣaslongasS=fac2(oT∗n).Insuchagrammar,noneofthemembersofTserveanypurpose;thuswestipulatethatforacanonicalgrammarRisnonempty.Importantly,amemberofTmayfailtobelongtoastringinRandstillserveapurpose.Forex-ample,letΣ={un,b,c},T={un,c},andS=fac2(oT∗n)−{aa}.Becausecappearsinnofor-biddentiersubstringsinR,itisfreelydistributedinL=L(hT,Si).Cependant,itmakesadifferenceinthelanguage,becauseaca∈Lbutaba6∈L.Ifc(andtherelevanttiersubstringsinS)weremiss-ingfromthetier,neitherabanoracawouldbeinL.Thiscanbethoughtofa‘blocking’functionofc,be-causeitallowssequencesaaeventhoughaa∈R.WemaynowreturntotheLatinexamplefrom§2inalittlemoredetail.TheLatinpattern,inwhichrsandlsmustalternate,regardlessofotherinterveningsounds.ThiscanbecapturedbyaTSLgrammarinwhichT={je,r}andS={ol,ou,lr,rl,rr,rn,ln}.Thiscapturesthegen-eralizationthat,ignoringallsoundsbesideslandr,landlareneverallowedtobeadjacent.There-mainderofthepaperdiscusseshowsuchagrammar,includingT,maybeinducedfrompositivedata.5AlgorithmThissectionintroducestheTier-basedStrictly2-LocalInferenceAlgorithm(2TSLIA),whichin-ducesbothatierandasetofpermissibletier2-factorsfrompositivedata.First,in§5.1,thecon-ceptofapath,whichiscrucialtothelearner,isdefined.§5.3introducesanddescribesthealgo-rithm,and§5.4definesadistinguishingexampleandprovesthatitisacharacteristicsampleforwhichthealgorithmisguaranteedtoconverge.Timeanddatacomplexityforthealgorithmarediscussedthere.5.1PathsFirst,wedefinetheconceptof2-paths.Howthisconceptmightbegeneralizedtokisdiscussedin§6.Pathsdenoteprecedencerelationsbetweensymbolsinastring,buttheyarefurtheraugmentedwithsetsofinterveningsymbols.Formally,a2-pathisa3-tuplehx,Z,yiwherexandyareasymbolinΣonandZisasubsetofΣ.The2-pathsofastringw=σ0σ1σn,denotedpaths2(w),arepaths2(w)=(cid:8)hσi,Z,σji(cid:12)(cid:12)jel D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 92 tierelementsofG.Foranyelementsσi,σj∈T,theirtieradjacencywithrespecttoasetLofstringsreferstowhetherornottheymayappearadjacentonthetier;formally,σi,σjaretieradjacentinLiffσiσj∈facT-2(w)forsomew∈L.Anexclusiveblockerisaσ∈TwhichdoesnotappearinR.Thatis,∀w∈R,σ6∈fac1(w).Anexclusiveblockeristhusnotrestrictedinitsdistri-butionbutmayintervenebetweenotherelementsonthetier.ThefreeelementsofagrammarGwillrefertotheunionofthesetofthenontierelementsofGandexclusiveblockersgivenG.Givenanorderσ0,σ1,...,σnonΣ,letΣi={σh|hl D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 93 Données:AnalphabetΣ={σ0,σ1,...,σn};finitesetIofinputstringsinΣ∗Result:ATSL2grammarhT,Sifunctiongettier(Ti,P.,je):InitializePTi={hτ1,X,τ2i∈P|τ1,τ2∈Ti∪{o,n}};InitializeHi=Σ−Ti;fori≤ndoLetσ=σi;ifa.)∃X,X0⊆Hisuchthatho,X,σi,,X0,ni∈PTiand∀σ0∈Ti,∃Y,Y0⊆Hisuchthathσ,Oui,σ0i,hσ0,Y0,σi∈PTiandb.)foreachhτ1,Z,τ2i∈PTiwithτ1,τ2∈((Ti∪{o,n})−{p}),σ∈Z,Z−{p}⊆Hi,∃Z0⊆Hisuchthathτ1,Z0,τ2i∈PTithenReturngettier(Ti−{p},PTi,i+1);elsei=i+1endendReturnhTi,PTii;functionmain(je,Σ):InitializeP=paths2(je);InitializeS=∅;SetT,PT=gettier(Σ,P.,0);forp=hτ1,X,τ2i∈PTdoifX⊆Σ−TthenAddτ1τ2toS;endendReturnhT,Si;Algorithm1:TheTSL2LearningAlgorithm(2TSLIA)Example3.LetΣ={un,b,c,d}fromExample2,withthatorder,andletI={aaa,bab,cac,dad}.Intheinitialconditionofthealgorithm,i=0,soσi=a,Ti=Σ,Hi=∅,andPTi=paths2(je).BecauseHi=∅,asatisifescondition(un)ifho,,ai∈PTi,ha,,ni∈PTi,andforeachσ∈Ti={un,b,c,d},,,ai∈PTiandha,,σi∈PTi.Thisistruegiventhispartic-ularI,becauseho,,ai,ha,,ni,ha,,ai,∈paths2(aaa),hb,,ai,ha,,bi∈paths2(bab),hc,,ai,ha,,ci∈paths2(cac),andhd,,ai,ha,,di∈paths2(dad).Condition(b).Thisconditioncheckswhetherσiisanexclusiveblocker,andthusshouldnotbere-movedfromTi.Itdoesthisbylookingforapairτ1,τ2ofmembersofTidistinctfromσiwhosetier-adjacencyisdependentonσi.Itsearchesthroughpathsoftheformhτ1,Z,τ2i,whereZincludesσiandsomesubsetofHi.Foreachsuchhτ1,Z,τ2i,itsearchesforahτ1,Z0,τ2iwhereZ0isonlyasubsetofHi.Ifsuchahτ1,Z0,τ2iexists,thenτ1andτ2aretier-adjacentinthedataregardlessofσi.However,ifnosuchpathexists,thenitmaybethatτ1andτ2canonlyappearwithσiinbetweenthem,andthusgettierinfersthatσiisanexclusiveblocker.Ifanysuchpairisfound,thencondition(b)fails,andσimustremaininTi.Example4.ContinuingwithExample3,awouldnotsatisfycondition(b)givenI.Giventhati=0,inorderforatosatisfycondition(b),forallτ1,τ2∈{o,b,c,d,n},ifhτ1,{un},τ2iisinPTi,thenhτ1,∅,τ2imustalsobeinPTi.ForthisI,thisisfalseforτ1=τ2=b,becausehb,{un},bi∈paths2(bab),buthb,,bi6∈paths2(je).Thusgettierwouldinferthataisanexclusiveblocker.However,thereadercanconfirmthatawouldsatisfycondition(b)foraninputI0=I∪{λ,bb,cc,dd}.Ifbothconditions(un)et(b)aremet,thenthealgorithmdeterminesthatσishouldnotappearonthefinalhypothesisTforthetier,andsogettierisrecursivelycalledwithTi−{σi}andi+1asarguments.Becausethehypothesisforthetierhaschanged,PTiisalsopassed,soonthenextcallwhengettiercalculatesPTi+1,itonlyhastosearchthroughPTiinsteadofthefullsetofpathsP.Ifneitherconditionismet,nochangeismadetoTi,iisincreased,andthenextσiischeckedbycontinuingthroughtheforloop.Thisprocessisre-peatedforeachmemberofthealphabet,afterwhichTiisreturnedasthefinalanswerforthetier.ItstierpathsPTiarealsoreturned.ThefunctionmaincallsgettierandtakesitsresultasTandPT.Usingthis,itfindseachτ1τ2∈S.ItdoesthisbysearchingthroughPTandfindingpathshτ1,X,τ2iwhoseinterveningsetXisasub- l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 94 setofthenon-tierelementsΣ−T.Suchτ1τ2pairsarethustier-adjacentinI.TheresultinggrammarhT,Siisthenreturned.5.4IdentificationinpolynomialtimeanddataHereweestablishthemainresultofthepaperthat2TSLIAidentifiestheTSL2classinpolynomialtimeanddata.Asistypical,theproofreliesonaestablishingacharacteristicsampleforthe2TSLIA.Lemma1establishes2TSLIArunsefficiently.Lemma1.GivenaninputsampleIofsizen,2TSLIAoutputsagrammarintimepolynomialinthesizeofn.Proof.Thepathsarecalculatedforeachwordonceatthebeginningofmain.Thistakestimequadraticinthesizeofthesample(Remark1),soO(n2).CallthissetofpathsP(notePisalsoboundedinsizebyn2).En plus,theloopingettieriscalledexactly|Σ|times.Checkingcondition(un)ingettierrequiresasinglepassthroughthepaths.Onotherhand,condition(b)requiresonesearchthroughPforeverypathelementp∈Pintheworstcase,whichisO(n4).ThusthetimecomplexityforgettierisO(|Σ|(n2+n4))=O(n4)depuis|Σ|isaconstant.Lastly,findingthepermissiblesubstringsonthetieralsorequiresasinglepassthroughP,whichalsotakesO(n2).Altogetherthenanupperboundonthetimecomplexityof2TSLIAisgivenbyO(n2+n4+n2)=O(n4),whichispolynomial.Nextwedefineadistinguishingsampleforatar-getlanguageforthe2TSLIA.Wefirstshowitispolynomialinthesizeofthetargetgrammar.Wethenshowthatitisacharacteristicsample.Definition6(Distinguishingsample).Givenanal-phabetΣ,withsomeorderσ1,σ2,...,σnonΣ,thetargetlanguageL∈TSL2,andthecanonicalgram-marG=hT,SiforL,adistinguishingsampleDforGisthesetmeetingthefollowingconditions.Re-callthatH=Σ−Tarethenon-tierelementsandHireferstothenon-tierelementslessthanσi.1.Thenon-tierelementcondition.Forallnon-tierelementsσi∈H,i.∃w1,w2∈Ds.t.ho,X,σi∈paths2(w1)andhσ,X0,ni∈paths2(w2),X,X0⊆Hi∀σ0∈Ti,∃v1,v2∈Ds.t.hσ,Oui,σ0i∈paths2(v1)andhσ0,Y0,σi∈paths2(v2),Oui,Y0⊆Hi.ii.∀τ1τ2∈(facTi-2((Σ−{σi}))−R),w∈Ds.t.hτ1,X,τ2i∈paths2(w),X⊆Hi.2.Theexclusiveblockercondition.Foreachσi∈Twhichisanexclusiveblocker,w∈Ds.t.hτ1,X,τ2i∈paths2(w),whereτ1τ2∈R,τ16=σi,τ26=σi,σi∈X,andX−σi⊆Hi3.Theallowedtiersubstringcondition.∀τ1τ2∈S,somews.t.hτ1,X,τ2i∈paths2(w)whereX⊆HEssentially,item(1)ensuresthatallsymbolsσinotonthetargetTwillmeetconditions(un)et(b)intheforloopandberemovedfromthealgorithm’shypothesisforT.Item(2)ensuresthat,inthesitua-tionanexclusiveblockerσi∈Tmeetscondition(un)forremovalfromthetierhypothesis,itwillnotmeetcondition(b).Item(3)ensuresthatthesamplecon-tainseveryτ1τ2inS.Thesepointswillbediscussedmoredetailintheproofthatthedistinguishingex-ampleisacharacteristicsample.Lemma2.GivenagrammarG=hT,Siwhosesizeis|T|+||S||,thesizeofadistinguishingsampleDforGispolynomialinthesizeofG.Proof.RecallthatT⊆ΣandS⊆Σ2on,thatH=Σ−TandR=Σ2−S.Thenon-tierel-ementconditionrequiresthatforeachσ∈Handσ0∈T,thesampleminimallycontainsthewordsσ,σ0σ,andσ0σ,whosetotallengthis|H|+2|H||T|.Theexclusiveblockerconditionrequiresforeachexclusiveblockerσ∈Tandeachτ1τ2∈Rthatminimallyτ1στ2iscontainedinthesample.LettingBdenotethesetofexclusiveblockers,wehavethetotallengthofwordsinthecharacteristicsampleis||B||×||R.||.Enfin,theallowedtiersubstringcon-ditionisrequiresforeachτ1τ2∈Sthatminimallyτ1τ2iscontainedinthesample.Hence,thelengthofthesewordsequals|S|.AltogetherthismeansthereisacharacteristicsampleDsuchthat||D||=|H|+2|H||T|+||B||×||R.||+|S|.SinceH,T,B⊆ΣandR,S⊆ l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 95 Σ2,||D|||Σ|+2|Σ||Σ|+|Σ||Σ|2+|Σ|2=|Σ|+3|Σ|2+|Σ|3.Ainsi,||D||isboundedbyO(|Σ|3)lequel,depuis|Σ|isaconstant,iseffectivelyO(1).NextweproveLemma3,whichshowsthatthetierconjecturedbythe2TSLIAatstepiiscorrectforallsymbolsuptoσi.Thus,onceallsymbolsaretreatedbythe2TSLIA,itsconjectureforthetieriscorrect.LetT0icorrespondtothealgorithm’scurrenttierhypothesiswhenσiisbeingcheckedintheforloopofthegettierfunction,letH0i=Σ−T0ibethealgorithm’shypothesisforthesetofnon-tierelementslessthanσi,andletPT0ibethesetofpathsunderconsideration(i.e.,thesetofpathsfromtheinitializationofPTibeforetheforloop).Asabove,τ0τ1...τmindexpositionsinastringorpath.Lemma3.LetΣ={σ0,...,σn}andconsideranyG=hT,Si.GivenanyfiniteinputsampleIwhichcontainsadistinguishingsampleDforG,itisthecasethatforalli(0≤i≤n),T0i=Ti.Proof.Theproofisbyrecursiononi.Thebasecaseiswheni=0.Bydefinition,T0=Σ.Thealgo-rithmstartswithT00=Σ,soT00=T0.Nextweassumetherecursivehypothesis(RH):forsomei∈NthatT0i=Ti.WeprovethatT0i+1=Ti+1.Specifically,weshowthatifRHistruefori,alors(Case1)σi∈Hi+1impliesσi∈H0i+1and(Case2)σi6∈Hi+1impliesσi6∈H0i+1.Case1.Thisisthecaseinwhichσiisanon-tierelement.Thenon-tierelementconditioninDefini-tion6forDensuresthatthedatainIwillmeetbothconditions(un)et(b)intheforloopingettier()forremovingσifromthetierhypothesis.Condition(un)requiresthatho,X,σii∈PT0iandhσi,X0,ni∈PT0iforsomeX,X0⊆H0i,and∀σ0∈Ti,hσi,Oui,σ0i∈PT0iandhσ0,Y0,σii∈PT0iforsomeY,Y0⊆H0i.Part(je)ofthenon-tierele-mentconditioninDefintion(6)ensuresthatforσi,therearewordsw1,w2∈Isuchthatho,X,σii∈paths2(w1),hσi,X0,ni∈paths2(w2),et,forallσ0∈Ti,v1,v2∈Is.t.hσi,Oui,σ0i∈paths2(v1)andhσ0,Y0,σii∈paths2(v2),wheretheinterven-ingsetsX,X0,Y,Y0areallsubsetsofHi.BecausebyRHH0i=Hi,condition(un)forremovingσifromthetierhypothesisissatisfied.Forσitosatisfycondition(b),foranypathhτ1,X,τ2i∈PT0isuchthatσi∈XandX−{σi}⊆H0i,theremustbeanotherpathhτ1,X0,τ2i∈PT0iwhereX0⊆H0i.Ifτ1τ2∈R,thensuchahτ1,X,τ2iisguaranteednottoexistinPT0i,becauseτ1andτ2will,bythedefinitionofR,notappearinthedatawithonlynon-tierelementsbetweenthem.Forτ1τ26∈R,part(ii)ofthenontierelementcondi-tioninDefinition6ensuresthatsomehτ1,Y,τ2iwhereY⊆HiexistsinPT0i,asitrequiresthatforeachsuchτ1,τ2thereissomew∈Isuchthathτ1,X,τ2i∈paths2(w)whereX⊆Hi.Byhy-pothesisH0i=Hi,andsothereisalwaysguaran-teedtobesomehτ1,X0,τ2i∈PT0iwhereX0⊆H0i.Thus,condition(b)willalwaysbesatisfiedforσi.Thus,assumingtheRH,σi∈Hi+1isguaranteedtosatisfyconditions(un)et(b)andberemovedfromthealgorithm’shypothesisforthetier,soσi∈H0i+1.Case2.Thisisthecaseinwhichσi∈T.Therearetwomutuallyexclusivepossibilities.Thefirstpossibilityisthatσiisnotafreeelement.Here,σiisguaranteedtonotbetakenoffofthetierhy-pothesis,becausecondition(un)forremovingasym-bolfromthetierrequiresthatthereexistssomepathhσj,X,σiiandhσi,X0,σjiwhereX,X0⊆H0i.FromthedefinitionofaTSLgrammar,therewillex-istnohσj,Oui,σiiandhσi,Y0,σji,whereY,Y0⊆H,inthepathsofI.BecauseHi⊆Handbyhypoth-esisH0i=Hi,H0i⊆H,sothealgorithmwillcor-rectlyregisterthatσjσi∈Rorσiσj∈Randσiwillremainonthetierhypothesis.Thusσi6∈H0i+1.Theotherpossibilityisthatσiisanexclusiveblocker.Ifso,σimaysatisfycondition(un)fortierremoval(asdiscussedaboveforthecaseinwhichσi∈H).Cependant,theexclusiveblockercondi-tioninDefinition6forDguaranteesthatσiwillnotmeetcondition(b).Asdiscussedabove,con-dition(b)willfailifthereisahτ1,X,τ2i∈PT0isuchthatXincludesσiandzeroormoreelementsofH0iandnootherpathhτ1,X0,τ2i∈PT0iwhereX0onlyincludeselementsofH0i.Theexclusiveblockerconditionrequiresthattherewillbesomeτ1τ2∈Rsuchthatthereisawordw∈Ds.t.hτ1,Y,τ2i∈paths2(w)whereσi∈YandY−{σi}=Hi.FromthedefinitionofaTSLgrammar,nowsuchthathτ1,Y0,τ2i∈paths2(w)whereYi⊆HiwillappearinI,becauseHi⊆H.Becausebyhy-pothesisH0i=Hi,thealgorithmwillcorrectlyfind l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 96 hτ1,Y,τ2iandalsofindthatthereisnohτ1,Y0,τ2i.Thuswhenσiisanexclusiveblocker,itwillnotberemovedfromthetierhypothesis,andσi6∈H0i.WenowknowthatbothCases(1)et(2)aretrue;thus,assumingtheRH,Hi+1=H0i+1.Thus,byinduction,(∀i)[Hi=H0i].BecauseHiandH0iarethecomplementsofTiandT0i,respectivement,(∀i)[Ti=T0i].Lemma4(Distinguishingsample=characteristicsample).AdistinguishingsampleforaTSL2lan-guageasdefinedinDefinition6isacharacteristicsampleforthatlanguageforthe2TSLIA.Proof.Asabove,letG0=(T0,S0)betheoutputofthealgorithm.FromLemma3,weknowthatforanylanguageL(G),G=hT,Si,andacharacteristicsampleDforG,givenanysampleIofLsuchthatD⊆I,(∀i)[Ti=T0i].ThatT0=Timmediatelyfollowsfromthisfact.ThatS0=SfollowsdirectlyfromT0=TandtheallowedtiersubstringconditionofDefinition6forD.Theallowedtiersubstringconditionstatesthatforallτ1τ2∈S,thedistinguishingsamplewillcontainsomews.t.hτ1,X,τ2i∈paths2(w)whereX⊆H.BecauseT=T0,theforloopofmainwillcorrectlyfindallsuchτ1τ2.Thus,S=S0,andG=G0.Theorem1(Identificationinthelimitinpolynomialtimeanddata).The2TSLIAidentifiestheTSLklan-guagesinpolynomialtimeanddata.Proof.ImmediatefromLemmas1,2,and4.Wenotefurtherthatintheworstcasethetimecomplexityispolynomial(degreefourLemma1)andthedatacomplexityisconstant(Lemma2).6DiscussionThisalgorithmopensupmultiplepathsforfuturework.Themostimmediatetheoreticalquestioniswhetherthealgorithmherecanbe(efficiently)gen-eralizedtoanyTSLclassaslongasthelearnerknowskapriori.Webelievethatitcan.Theno-tionof2-pathscanbestraightforwardlyextendedtoksuchthatak-pathisa2k−1tupleoftheformhσ0,X0,σ1,X1,...,Xk−1,σki,whereeachsetXirepresentsthesymbolsbetweenσiandσi+1.Thealgorithmpresentedherecanthenbemodifiedtocheckasetofsuchk-paths.Webelievesuchanal-gorithmcouldbeshowntobeprovablycorrectusingaproofofsimilarstructuretotheonehere,althoughtimeanddatacomplexitywilllikelyincrease.However,intermsofapplyingthesealgorithmstonaturallanguagephonology,itislikelythatakvalueof2issufficient.Heinzetal.(2011)arguethatTSL2candescribebothlong-distancedissimilationandassimilationpatterns.OnepotentialexceptiontothisclaimcomesfromSundanese,wherewhetherliquidsmustagreeordisagreepartlydependsonthesyllablestructure(Bennett,2015).Additionalissuesarisewithnaturallanguagedata.Oneisthatnaturallanguagecorporaoftenincludesomeexceptionstophonotacticgeneralizations.Al-gorithmswhichtakeasinputsuchnoisydataandoutputgrammarsthatareguaranteedtoberepresen-tationsoflanguages‘close’tothetargetlanguagehavebeenobtainedandstudiedinthePAClearningparadigm(AngluinandLaird,1988).Itwouldbein-terestingtoapplysuchtechniquesandothersimilaronestoadapt2TSLIAintoanalgorithmthatremainseffectivedespitenoisyinputdata.Anotherareaoffutureresearchishowtogen-eralizeovermultipletiers.Jardine(2016),inrun-ningversionsofthe2TSLIAonnaturallanguagecorpora,showsthatitfailsbecauselocaldependen-cies(whichagaincanbemodeledwhereT=Σ)preventcrucialinformationintheCSfromappear-inginthedata.Furthermore,naturallanguagescanhaveseparatelong-distancephonotacticswhichholdoverdistincttiers.Forexample,KiYakahasbothavowelharmonypattern(Hyman,1998)andaliquid-nasalharmonypatternoverthetier{je,m,n,N}(Hy-man,1995).Ainsi,wordsinKiYakaexhibitapatterncorrespondingtotheintersectionoftwoTSL2gram-mars,onewithavoweltierandonewithanasal-liquidtier.Theproblemoflearningafiniteinter-sectionofTSL2languagesisthusanotherrelevantlearningproblem.Onefinalwaythisresultcanbeextendedistostudythenatureoflong-distanceprocessesinphonology.Chandlee(2014)extendsthenotionofSLlanguagestotheInput-andOutput-StrictlyLocalstringfunctions,whicharesufficienttomodellocalphonologicalprocesses.Subsequentwork(Chan-dleeetal.,2014;Chandleeetal.,2015;Jardineetal.,2014)hasshownhowtheseclassesoffunctions l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 97 canbeefficientlylearned,buildingonideasonthelearningofSLfunctions.Anopenquestion,alors,ishowtheseideascanbeusedtodevelopafunctionalversionoftheTSLlanguagestomodellong-distanceprocesses.Thecentralresultinthispapermaythenhelptounderstandhowthetiersoverwhichtheseprocessesapplycanbelearned.7ConclusionThispaperhaspresentedanalgorithmwhichcanlearnagrammarforanyTSL2languageintimepolynomialinthesizeoftheinputsample,whosesizeisboundedbyaconstant.AstheTSL2lan-guagescanmodellong-distancephonotacticsinnat-urallanguage,thisrepresentsasteptowardsunder-standinghowhumansinternalizesuchpatterns.AcknowledgmentsWethankR´emiEyraud,KevinMcMullin,andtwoanonymousreviewersforusefulcomments.Thisre-searchwassupportedbyNSF#1035577.ReferencesDanaAngluinandPhilipLaird.1988.Learningfromnoisyexamples.MachineLearning,2:343–370.WilliamG.Bennett.2013.Dissimilation,ConsonantHarmony,andSurfaceCorrespondence.Ph.D.thesis,Rutgers,theStateUniversityofNewJersey.WilliamG.Bennett.2015.Assimilation,dissimilation,andsurfacecorrespondenceinsundanese.NaturalLanguageandLinguisticTheory,33(2):371–415.PascalCaron.2000.Familiesoflocallytestablelan-guages.TheoreticalComputerScience,242:361–376.JaneChandlee,R´emiEyraud,andJeffreyHeinz.2014.LearningStrictlyLocalSubsequentialFunctions.TransactionsoftheAssociationforComputationalLinguistics,2:491–503.JaneChandlee,R´emiEyraud,andJeffreyHeinz.2015.OutputStrictlyLocalfunctions.InProceedingsofthe14thMeetingontheMathematicsofLanguage(MoL14),Chicago,IL,July.JaneChandlee.2014.StrictlyLocalPhonologicalPro-cesses.Ph.D.thesis,UniversityofDelaware.NoamChomskyandMorrisHalle.1965.Somecontro-versialquestionsinphonologicaltheory.JournalofLinguistics,1(2):pp.97–138.GeorgeN.ClementsandEnginSezer.1982.VowelandconsonantdisharmonyinTurkish.InHarryvanderHulstandNorvalSmith,editors,TheStructureofPhonologicalRepresentations(PartII).Foris,Dor-drecht.Eung-DoCook.1984.ASarceeGrammar.Vancouver:UniversityofBritishColumbiaPress.ColindelaHiguera.1997.Characteristicsetsforpoly-nomialgrammaticalinference.MachineLearning,27(2):125–138.ColindelaHiguera.2010.GrammaticalInference:LearningAutomataGrammars.CambridgeUniversityPress.PedroGarc´ıa,EnriqueVidal,andJos´eOncina.1990.Learninglocallytestablelanguagesinthestrictsense.InProceedingsoftheWorkshoponAlgorithmicLearn-ingTheory,pages325–338.MarkE.Gold.1967.Languageidentificationinthelimit.InformationandControl,10:447–474.JohnGoldsmithandJasonRiggle.2012.Informationtheoreticapproachestophonologicalstructure:ThecaseofFinnishvowelharmony.NaturalLanguage&LinguisticTheory,30(3):859–896.JohnGoldsmithandArisXanthos.2009.Learningphonologicalcategories.Language,85(1):4–38.JeffreyHeinzandJamesRogers.2013.Learningsub-regularclassesoflanguageswithfactoreddeterminis-ticautomata.InAndrasKornaiandMarcoKuhlmann,editors,Proceedingsofthe13thMeetingontheMath-ematicsofLanguage(MoL13),pages64–71,Sofia,Bulgaria,August.AssociationforComputationalLin-guistics.JeffreyHeinz,ChetanRawal,andHerbertG.Tanner.2011.Tier-basedstrictlylocalconstraintsforphonol-ogy.InProceedingsofthe49thAnnualMeetingoftheAssociationforComputationalLinguistics,pages58–64,Portland,Oregon,Etats-Unis,June.AssociationforComputationalLinguistics.JeffreyHeinz,AnnaKasprzik,andTimoK¨otzing.2012.Learningwithlattice-structuredhypothesisspaces.TheoreticalComputerScience,457:111–127,October.JeffreyHeinz.2010a.Learninglong-distancephonotac-tics.LinguisticInquiry,41:623–661.JeffreyHeinz.2010b.Stringextensionlearning.InPro-ceedingsofthe48thAnnualMeetingoftheAssociationforComputationalLinguistics,pages897–906.Asso-ciationforComputationalLinguistics,July.LarryHyman.1995.Nasalconsonantharmonyatadis-tance:ThecaseofYaka.StudiesinAfricanLinguis-tics,24:5–30.LarryHyman.1998.Positionalprominenceandthe‘prosodictrough’inYaka.Phonology,15:14–75.AdamJardine,JaneChandlee,R´emiEyraud,andJeffreyHeinz.2014.Veryefficientlearningofstructuredclassesofsubsequentialfunctionsfrompositivedata.InProceedingsofthe12thInternationalConference l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u / t a c l / l a r t i c e - p d f / d o i / . 1 0 1 1 6 2 / t l a c _ a _ 0 0 0 8 5 1 5 6 7 3 7 6 / / t l a c _ a _ 0 0 0 8 5 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 98 onGrammaticalInference(ICGI2014),JMLRWork-shopProceedings,pages94–108.AdamJardine.2016.Learningtiersforlong-distancephonotactics.InLaurelPerkins,RachelDudley,Ju-lianaGerard,andKasiaHitczenko,editors,Proceed-ingsofthe6thConferenceonGenerativeApproachestoLanguageAcquisitionNorthAmerica(GALANA2015).JohnJensen.1974.Variablesinphonology.Language,50:675–686.RegineLai.2013.DomainSpecificityinLearningPhonology.Ph.D.thesis,UniversityofDelaware.RegineLai.2015.Learnableversusunlearnablehar-monypatterns.LinguisticInquiry,46(3):425–451.KevinMcMullinandGunnar´OlafurHansson.forth-coming.Long-distancephonotacticsasTier-BasedStrictly2-Locallanguages.InProceedingsoftheAn-nualMeetingonPhonology2015.RobertMcNaughtonandSeymourPapert.1971.Counter-FreeAutomata.MITPress.AndrewNevins.2010.LocalityinVowelHarmony.Number55inLinguisticInquiryMonographs.MITPress.DavidOdden.1994.Adjacencyparametersinphonol-ogy.Language,70(2):289–330.CatherineRingen.1975.VowelHarmony:TheoreticalImplications.Ph.D.thesis,IndianaUniversity.JamesRogersandMarcD.Hauser.2010.Theuseofformallanguagetheoryinstudiesofartificiallanguagelearning:Aproposalfordistinguishingthedifferencesbetweenhumanandnonhumananimallearners.InHarryvanderHulst,editor,RecursionandHumanLanguage(StudiesinGenerativeGrammar[SGG])104.deGruyter.JamesRogersandGeoffreyPullum.2011.Auralpatternrecognitionexperimentsandthesubregularhierarchy.JournalofLogic,LanguageandInformation,20:329–342.JamesRogers,JeffreyHeinz,GilBailey,MattEdlefsen,MollyVisscher,DavidWellcome,andSeanWibel.2010.Onlanguagespiecewisetestableinthestrictsense.InChristianEbert,GerhardJ¨ager,andJensMichaelis,editors,TheMathematicsofLanguage,vol-ume6149ofLectureNotesinArtificalIntelligence,pages255–265.Springer.JamesRogers,JeffreyHeinz,MargaretFero,JeremyHurst,DakotahLambert,andSeanWibel.2013.Cog-nitiveandsub-regularcomplexity.InFormalGram-mar,volume8036ofLectureNotesinComputerSci-ence,pages90–108.Springer.SharonRoseandRachelWalker.2004.Atypologyofconsonantagreementascorrespondence.Language,80:475–531.KeiichiroSuzuki.1998.Atypologicalinvestigationofdissimilation.Ph.D.thesis,UniversityofArizona,Tucson.
Télécharger le PDF