Automatic Inference of Sound
Correspondence Patterns across
Multiple Languages
Johann-Mattis List
Department of Linguistic and Cultural
Evolution, Max Planck Institute for the
Science of Human History, Jena
Sound correspondence patterns play a crucial role for linguistic reconstruction. Linguists use
them to prove language relationship, to reconstruct proto-forms, and for classical phylogenetic
reconstruction based on shared innovations. Cognate words that fail to conform with expected
patterns can further point to various kinds of exceptions in sound change, such as analogy or
assimilation of frequent words. Here I present an automatic method for the inference of sound
correspondence patterns across multiple languages based on a network approach. The core idea
is to represent all columns in aligned cognate sets as nodes in a network with edges representing
the degree of compatibility between the nodes. The task of inferring all compatible correspondence
sets can then be handled as the well-known minimum clique cover problem in graph theory,
which essentially seeks to split the graph into the smallest number of cliques in which each
node is represented by exactly one clique. The resulting partitions represent all correspondence
patterns that can be inferred for a given data set. By excluding those patterns that occur in only
a few cognate sets, the core of regularly recurring sound correspondences can be inferred. Based
on this idea, the article presents a method for automatic correspondence pattern recognition,
which is implemented as part of a Python library which supplements the article. To illustrate the
usefulness of the method, I present how the inferred patterns can be used to predict words that
have not been observed before.
1. Introduction
By comparing the languages of the world, we gain invaluable insights into human
prehistory, predating the appearance of written records by thousands of years. The clas-
sical methods for historical language comparison, a collection of different techniques
summarized under the term comparative method (Meillet 1954; Weiss 2015), date back
to the early 19th century and have since then been constantly refined and improved (see
Ross and Durie 1996 for details on the practical workflow). Thanks to the comparative
method, linguists have made groundbreaking insights into language change in general
and into the history of many specific language families (Campbell and Poser 2008)
and external evidence has often confirmed the validity of the findings (McMahon and
Submission received: 4 April 2018; revised version received: 3 October 2018; accepted for publication:
21 November 2018.
doi:10.1162/COLI_a_00344
© 2019 Association for Computational Linguistics
Published under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International
(CC BY-NC-ND 4.0) license
Computational Linguistics
Volume 45, Number 1
McMahon 2005, pages 10–14). With increasing amounts of data, however, the methods,
which are largely manually applied, reach their practical limits. As a result, scholars
are now increasingly trying to automate different aspects of the classical comparative
methods (Kondrak 2000; Prokić, Wieling, and Nerbonne 2009; List 2014).
One of the fundamental insights of early historical linguistic research was that—
as a result of systemic changes in the sound system of languages—genetically related
languages exhibit structural similarities in those parts of their lexicon that were com-
monly inherited from their ancestral languages. These similarities surface in the form of
correspondence relations between sounds from different languages in cognate words.
English th [θ], for example, is usually reflected as d in German, as we can see from
cognate pairs like English think versus German denken, or English thorn and German
Dorn. English t, on the other hand, is usually reflected as z [ts] in German, as we can see
from pairs like English toe versus German Zeh, or English ten versus German zehn. The
identification of these regular sound correspondences plays a crucial role in historical
language comparison, serving not only as the basis for the proof of genetic relationship
(Dybo and Starostin 2008; Campbell and Poser 2008) or the reconstruction of
proto-forms (Hoenigswald 1960, pages 72–85; Anttila 1972, pages 229–263), but (indirectly)
also for classical subgrouping based on shared innovations (which would not be possi-
ble without identified correspondence patterns).
With the beginning of this millennium, historical linguistics has witnessed an
increased number of attempts to quantify specific tasks of the traditional comparative
method. Since then, scholars have repeatedly attempted either to infer regular sound
correspondences across genetically related languages directly (Kay 1964;
Brown, Holman, and Wichmann 2013; Kondrak 2003, 2009) or to integrate the inference
into workflows for automatic cognate detection (Guy 1994; List 2012, 2014; List,
Greenhill, and Gray 2017). What is interesting in this context, however, is that almost all
approaches dealing with regular sound correspondences, be it early formal—but clas-
sically grounded—accounts (Grimes and Agard 1959; Hoenigswald 1960) or computer-
based methods (Kondrak 2002, 2003; List 2014) only consider sound correspondences
between pairs of languages.
A rare exception can be found in the work of Anttila (1972, pages 229–263), who
presents the search for regular sound correspondences across multiple languages as
the basic technique underlying the comparative method for historical language com-
parison. Anttila’s description starts from a set of cognate word forms (or morphemes)
across the languages under investigation. These words are then arranged in such a way
that corresponding sounds in all words are placed into the same column of a matrix.
The extraction of regularly recurring sound correspondences in the languages under
investigation is then based on the identification of similar patterns recurring across
different columns within the cognate sets. The procedure is illustrated in Figure 1, where
four cognate sets in Sanskrit, Ancient Greek, Latin, and Gothic are shown, two taken
from Anttila (1972, page 246) and two added by me.
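The arrangement of cognate words in a matrix can be sketched in a few lines of Python. This is only an illustration, not the implementation that accompanies the article: the dictionary layout, the function name `alignment_sites`, and the segmentation of the stems (bracketed endings left out, Greek "th" treated as one segment) are assumptions made for the example.

```python
# Assumed layout: language -> aligned segments; a language without a reflex is
# simply absent from the dictionary. "-" is the gap symbol, "Ø" marks missing data.
LANGUAGES = ["Sanskrit", "Greek", "Latin", "Gothic"]
MISSING = "Ø"


def alignment_sites(cognate_set, languages=LANGUAGES):
    """Return the columns of one aligned cognate set as tuples of reflexes."""
    rows = [cognate_set.get(language) for language in languages]
    widths = {len(row) for row in rows if row is not None}
    if len(widths) != 1:
        raise ValueError("attested words must share one alignment length")
    width = widths.pop()
    return [
        tuple(MISSING if row is None else row[i] for row in rows)
        for i in range(width)
    ]


# Stems from Figure 1 (hypothetical segmentation, endings in brackets omitted).
yoke = {
    "Sanskrit": ["y", "u", "g", "a", "m"],
    "Greek": ["z", "u", "g", "o", "n"],
    "Latin": ["i", "u", "g", "u", "m"],
    "Gothic": ["j", "u", "k", "-", "-"],
}
daughter = {
    "Sanskrit": ["d", "u", "h", "i"],
    "Greek": ["th", "u", "g", "a"],
    "Gothic": ["d", "au", "h", "-"],
}

site_a = alignment_sites(yoke)[1]      # second column of 'yoke'
site_c = alignment_sites(daughter)[1]  # second column of 'daughter'
```

Each returned tuple lists the reflexes in the fixed language order, so columns from different cognate sets can be compared position by position.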
Two points are remarkable about Anttila’s approach. First, it builds heavily on the
phonetic alignment of sound sequences,1 by which the sound sequences of words are
arranged in a matrix in such a way that all corresponding sounds are placed in the
same column (List 2014). Second, it reflects a concrete technique by which regular sound
1 This concept was only recently adapted in linguistics (Covington 1996; Kondrak 2000; List 2014), building
heavily on approaches in bioinformatics and computer science (Needleman and Wunsch 1970; Wagner
and Fischer 1974), although it was implicitly always an integral part of the methodology of historical
language comparison (compare Dixon and Kroeber 1919; Fox 1995, 67f).
List
Correspondence Pattern Inference
Figure 1
Regular sound correspondences across four Indo-European languages, illustrated with the help of
alignments along the lines of Anttila (1972, page 246). In contrast to the original illustration, lost
sounds are displayed with the help of the dash “-” as a gap symbol, while missing words (where no
reflex in Gothic or Latin could be found) are represented by the “∅” symbol.
correspondences for multiple languages can be detected and employed as a starting
point for linguistic reconstruction. If we look at the framed columns in the four examples
in Figure 1, which are further labeled alphabetically, we can easily see that the
patterns A, E, and F are remarkably similar. The only difference is that we miss data for
Gothic in the patterns E and F, and, as a result, we don’t have reflex sounds (sounds in
a given alignment column as reflected in a cognate word) for the full sound correspondence
patterns in the respective columns. The same holds, however, for columns C, E,
and F. Since A and C differ regarding the reflex sound of Gothic (u vs. au), they cannot
be assigned to the same correspondence set at this stage, and if we want to solve the
problem of finding the regular sound correspondences for the words in the figure, we
need to decide which columns in the alignments we assign to the same correspondence
set, thereby “imputing” missing sounds where we miss a reflex. Assuming that the
“regular” pattern in our case is reflected by the group of C, E, and F, we can make
predictions about the sounds missing in Gothic in E and F, concluding that, if ever we
find the missing reflex in so far unrecognized sources of Gothic in the future, we would
expect a -au- in the words for “daughter-in-law” and “red”.2
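The imputation step described above can be illustrated with a small Python sketch. It assumes that the grouping of columns has already been decided (here C, E, and F, as in the text); the function name `merge_sites` and the tuple representation are hypothetical choices for this example, not part of the article's software.

```python
MISSING = "Ø"


def merge_sites(sites, missing=MISSING):
    """Combine sites assumed to belong together into one pattern.

    For each language, the single attested sound is kept; a language stays
    missing only if no site in the group attests a reflex.
    """
    pattern = []
    for reflexes in zip(*sites):
        attested = {sound for sound in reflexes if sound != missing}
        if len(attested) > 1:
            raise ValueError(f"conflicting reflexes: {sorted(attested)}")
        pattern.append(attested.pop() if attested else missing)
    return tuple(pattern)


# Language order: Sanskrit, Greek, Latin, Gothic (values as in Figure 1).
site_c = ("u", "u", "Ø", "au")
site_e = ("u", "u", "u", "Ø")
site_f = ("u", "u", "u", "Ø")

pattern = merge_sites([site_c, site_e, site_f])
predicted_gothic = pattern[3]  # reflex imputed for Gothic in E and F
```

Grouping E and F with A instead would yield a different prediction for Gothic, which is exactly the ambiguity the text describes.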
We can easily see how patterns of sound correspondences across multiple languages
can serve as the basis for multiple tasks in historical linguistics. First, we could use
them to guess how a word that is missing in a given alignment would sound in that
language, if it could be found. Since the task of identifying cognate words across multiple
languages is very complex, and words may have drastically shifted their meanings,
we could use the predictions to search for missing cognate forms in those areas of the
lexicon that we have not considered before.3 Second, if two alignment columns are
identical, they must reflect the same proto-sound, if alternative processes like borrowing
can be excluded. Thus, similarly to the prediction of missing words in our cognate
sets, we could use correspondence patterns to infer proto-forms, provided that parts
of the data are already annotated.4 Third, we could use them to check linguistic claims
2 As pointed out by the anonymous reviewer, Gothic ráuþs is a reflex of ‘red’ (Wright 1910, page 340), but
as mentioned by Eugen Hill (personal communication), the Gothic form reflects a derivationally different
formation and was therefore correctly not listed in Anttila’s examples.
3 Consider cases of shifted meanings like English hound vs. German Hund ‘dog,’ or English -thorp as a
suffix in place names compared to German Dorf ‘village.’
4 But even if correspondence patterns are not identical, they could be assigned to the same proto-sound,
provided that one can show that the differences are conditioned by phonetic context. This is the case for
Gothic au [ɔ] in pattern C, which has been shown to go back to u when preceding h (Meier-Brügger 2002,
page 210f). As a result, scholars usually reconstruct Proto-Indo-European *u for A, C, E, and F.
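[Figure 1 alignments; the framed columns are labeled A to F in the original figure]
           ‘yoke’    ‘daughter’     ‘daughter-in-law’   ‘red’
Sanskrit   yugam     duhi(tar)      snuṣ(ā)             -rudh(iras)
Greek      zugon     thuga(ter-)    -nu-(os)            eruth(rós)
Latin      iugum     ØØØØ(Ø)        -nur(us)            -rub(es)
Gothic     juk--     dauh-(tar)     ØØØØ(Ø)             ØØØØ(Ø)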
about cognate words themselves: If it turns out that the aligned cognate sets proposed
by linguists do not pattern into recurring correspondences across the languages under
consideration, we can directly criticize both individual claims regarding word relations
and general claims about the genetic relation of languages.
While it seems trivial to identify sound correspondences across multiple languages
from the few examples provided in Figure 1, the problem can become quite complicated
if we add more cognate sets and languages to the comparative sample. Especially the
handling of missing reflexes for a given cognate set becomes a problem here, as missing
data makes it difficult for linguists to decide which alignment columns to group with
each other. This can already be seen from the examples given in Figure 1, where we have
two possibilities to group the patterns A, C, E, and F, if we base our judgments only on
these four patterns: E and F could be grouped with either A or C, and it may even
be possible that one should be grouped with A and one with C. The “true” solution
here depends on the history of the languages, but if the data that would allow us to
reconstruct this history is lost, we can never infer the historically correct grouping with
full confidence.
The goal of this article is to illustrate how a manual analysis in the spirit of Anttila
can be automated and fruitfully applied—not only in purely computational approaches
to historical linguistics, but also in computer-assisted frameworks that help linguists to
explore their data before they start carrying out painstaking qualitative comparisons
(List 2016). In order to illustrate how this problem can be solved computationally, the
article will first discuss some important general aspects of sound correspondences and
sound correspondence patterns in Section 2, introducing specific terminology that will
be needed in the remainder. In Section 3, we will see that the problem of finding sound
correspondences across multiple languages can be modeled as the well-known clique-cover
problem in an undirected network (Bhasker and Samad 1991). While this problem
is hard to solve in an exact way computationally,5 fast approximate solutions exist
(Welsh and Powell 1967) and can be easily applied. Based on these findings, the article
will introduce a fully automated method for the recognition of sound correspondence
patterns across multiple languages (Section 4). This method is implemented in the form
of a Python library and can be readily applied to multilingual wordlist data as it is
also required by software packages such as LingPy (List, Greenhill, and Forkel 2017)
or software tools such as EDICTOR (List 2017). Section 5 will then illustrate how the
method can be applied by testing how it performs in the task of predicting missing
cognate words and missing proto-forms.
2. Preliminaries on Sound Correspondence Patterns
In the introduction, it was emphasized that the traditional comparative method is
itself concerned less with regular sound correspondences attested for language pairs
than with those attested for all languages under consideration. In the following, this claim will be further
substantiated, while at the same time introducing some major methodological considerations
and ideas that are important for the development of the new method for sound
correspondence pattern recognition.
5 Both the clique-cover problem and its inverse problem, the graph coloring problem, have been shown to
be NP-complete (Bhasker and Samad 1991).
Table 1
Comparing correspondence patterns for Proto-Germanic reflexes of *d-, *þ-, and *t- in German,
English, and Dutch (Germanic proto-forms follow Kroonen [2013]).
2.1 From Sound Correspondences to Sound Correspondence Patterns
Sound correspondences are most easily defined for pairs of languages. Thus, it is
straightforward to state that German [d] regularly corresponds to English [θ] (or [ð]),
that German [ts] regularly corresponds to English [t], and that German [t] corresponds
to English [d]. We can likewise expand this view to multiple languages by adding
another Germanic language, such as, for example, Dutch, to our comparison, which has
[d] in the case of German [d] and English [θ], [t] in the case of German [ts] and English
[t], and [d] in the case of German [t] and English [d].
The more languages and examples we add to the sample, however, the more complex
the picture becomes, and while we can state three (basic) patterns for the case of
English, German, and Dutch, given in our example, we may easily get more patterns,
due to secondary sound changes in the different languages, although we would still
reconstruct only three sounds in the proto-language ([θ, t, d]). This is illustrated in
Table 1, where Proto-Germanic forms containing *þ [θ], *t, and *d in different phonetic
environments are contrasted with their descendant forms in German, English,
and Dutch. The example shows that there is a one-to-n relationship between what
we interpret as a proto-sound of the proto-language, and the regular correspondence
patterns that we may find in our data. While we will reserve the term sound correspondence
for pairwise language comparison, we will use the term sound correspondence
pattern (or simply correspondence pattern) for the abstract notion of regular sound
correspondences across a set of languages that we can find in the data.
2.2 Correspondence Patterns in the Classical Literature
Scholars like Meillet (1908, page 23) have stated that the core of historical linguistics
is not linguistic reconstruction, but the inference of correspondence patterns, empha-
sizing that “reconstructions are nothing else but the signs by which one points to the
Table 2
Sound correspondence patterns for Indo-European stops, following Clackson (2007, page 37).
PIE     Hittite        Sanskrit    Greek                Latin          Gothic     …
*p      p              p           p                    p              f b        …
*b      b p            b           b                    b              p          …
*bʰ     b p            [?]         pʰ/ph                f b            b          …
*t      t              t           t                    t              θ/þ d      …
*d      d t            d           d                    d              t          …
*dʰ     d t            dʰ/dh h     tʰ/th                f d b          d          …
…       …              …           …                    …              …          …
*kʷ     kʷ/ku          k c         k p t                kʷ/qu          hʷ/hw g    …
*gʷ     kʷ/u           g j         g b d                gʷ/gu u        q          …
*gʷʰ    kʷ/ku gʷ/gu    [?]         pʰ/ph tʰ/th kʰ/kh    f gʷ/gu u      g b        …
correspondences in short form”.6 However, given the one-to-n relation between proto-sounds
and correspondence patterns, it is clear that this is not quite correct. Having
inferred regular correspondence patterns in our data, our reconstructions will add a
different level of analysis by further clustering these patterns into groups that we believe
to reflect one single sound in the ancestral language.
That there is usually more than one correspondence pattern for a reconstructed
proto-sound is nothing new to most practitioners of linguistic reconstruction.
Unfortunately, however, linguists rarely list all possible correspondence patterns exhaustively
when presenting their reconstructions, but instead select the most frequent
ones, leaving the explanation of weird or unexpected patterns to comments written in
prose. A first and important step of making a linguistic reconstruction system transparent,
however, should start from an exhaustive listing of all correspondence patterns,
including irregular patterns that occur very infrequently but would still be accepted by
the scholars as reflecting true cognate words.
What scholars do instead is provide tables that summarize the correspondence
patterns in a rough form, for example, by showing the reflexes of a given proto-sound
in the descendant languages in a table, where multiple reflexes for one and the
same language are put in the same cell. An example, taken with modifications7 from
Clackson (2007, page 37), is given in Table 2. In this table, the major reflexes of
Proto-Indo-European stops in 11 languages representing the oldest attestations and major
branches of Indo-European are listed. This table is a very typical example for the way
in which scholars discuss, propose, and present correspondence patterns in linguistic
reconstruction (Beekes 1995; Brown et al. 2011; Holton et al. 2012; Jacques 2017). The
shortcomings of this representation become immediately transparent. Neither are we
told about the frequency by which a given reflex is attested to occur in the descendant
languages, nor are we told about the specific phonetic conditions that have been
proposed to trigger the change where we have two reflexes for the same proto-sound.
6 My translation, original text: ‘Les «restitutions» ne sont rien autre chose que les signes par lesquels on
exprime en abrégé les correspondances.’
7 We added phonetic transcriptions, preceding the original sound given by the author, separated by a slash.
Figure 2
Alignment analyses of the six cognate sets from Table 1. Brackets around subsequences indicate
that the alignments cannot be fully resolved due to secondary morphological changes.
While scholars of Indo-European tend to know these conditions by heart, it is perfectly
understandable why they would not list them. However, when the results are presented
in this form to outsiders to their field, it is quite difficult for them to correctly
evaluate the findings. A sound correspondence table may look impressive, but it is of
no use to people who have not studied the data themselves.
A further problem in the field of linguistic reconstruction is that scholars barely
discuss workflows or procedures by which sound correspondence patterns can be
inferred. For well-investigated language families like Indo-European or Austronesian,
which have been thoroughly studied for more than one hundred years (Blust 1990),
it is clear that there is no direct need to propose a heuristic procedure, given that the
major patterns have been identified long ago and the research has reached a stage where
scholarly discussions circle around individual etymologies or higher levels of linguistic
reconstruction, such as semantics, morphology, and syntax.8 For languages whose history
is less well known and where historical language reconstruction has not even reached
a stage of reconstruction where a majority of scholars agree, however, a procedure that
helps to identify the major correspondence patterns underlying a given data set would
surely be incredibly valuable.
2.3 Correspondence Patterns and Alignments
In order to infer correspondence patterns, the data must be available in aligned form
(see Section 1), that is, we must know which of the sound segments that we compare
across cognate sets are assumed to go back to the same ancestral segment. This is
illustrated in Figure 2 where the cognate sets from Table 1 are presented in aligned
form, with zero-matches (gaps) being represented as a dash (“-”), and with brackets
indicating unalignable parts in the sequences, that is, parts that cannot be aligned,
since the differences are not due to regular sound change.9 Although alignments are
never explicitly mentioned in Clackson (2007), they are implied by the provided
8 For examples, compare the very detailed etymological discussions by Meier-Brügger (2002,
pages 173–187).
9 Scholars at times object to this claim, but it should be evident, also from reading the account by Anttila
(1972) mentioned above, that without alignment analyses, albeit implicit ones that are never provided in
concreto, no correspondence patterns could be proposed.
[Figure 2 alignments]
                 ‘dead’      ‘thick’     ‘tongue’     ‘deed’      ‘thorn’      ‘tooth’
Proto-Germanic   daud(az)    θek(uz)     tuŋ(goː)     deːd(iz)    θurn(uz)     tanθ(s)
German           toːt(--)    dɪk(--)     tsʊŋ(-ə)     taːt(--)    dɔrn(--)     tsaːn-(-)
English          dɛd(--)     θɪk(--)     tʌŋ(--)      diːd(--)    θɔː-n(--)    tuː-θ(-)
Dutch            doːt(--)    dɪk(--)     tɔŋ(--)      daːt(--)    doːrn(--)    tɑnt(-)
Figure 3
Alignment sites and correspondence patterns: While alignment sites are concrete representations
of the presumed relations among cognate words, correspondence patterns are a further stage of
abstraction.
correspondence patterns, which are presumably derived from the alignment of reflexes
in each of the daughter languages. These assumed alignments are given in Table 2.
Following evolutionary biology, a given column of an alignment is called an alignment
site (or simply a site). An alignment site may reflect the same values as we
find in a correspondence pattern, and correspondence patterns are usually derived
from alignment sites, but in contrast to a correspondence pattern, an alignment site
may reflect a correspondence pattern only incompletely, due to missing data in one
or more of the languages under investigation. For example, when comparing German
Dorf “village” with Dutch dorp, it is immediately clear that the initial
sounds of both words represent the same correspondence pattern as we find for the
cognate sets for “thick” and “thorn” given in Figure 2, although no reflex of their
Proto-Germanic ancestor form þurpa- (originally meaning “crowd,” see Kroonen [2013, 553])
has survived in Modern English.10 Thanks to the correspondence patterns in Table 1,
however, we know that, if we project the word back to Proto-Germanic, we must
reconstruct the initial with *þ- [θ], since the match of German d- and Dutch d- occurs
(if we ignore recent borrowings) only in correspondence patterns in which English
has th-.
These “gaps” due to missing reflexes of a given cognate set are not the same as the
gaps inside an alignment, since the latter are due to the (regular) loss or gain of a sound
segment in a given alignment site, while gaps due to missing reflexes may either reflect
processes of lexical replacement (List 2014, page 37f), or a preliminary stage of research
resulting from insufficient data collections or insufficient search for potential reflexes.
While we use the dash as a symbol for gaps in alignment sites, we will use the character
Ø (denoting the empty set) to represent missing data in correspondence patterns and
alignment sites. The relation between correspondence patterns in the sense developed
here and alignment sites is illustrated in Figure 3, where the initial alignment sites of
three alignments corresponding to Proto-Germanic *þ [θ] are assembled to form one
correspondence pattern.
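The distinction between the gap symbol and missing data can be made concrete with a short Python sketch. The helper names `reflects` and `coverage` are hypothetical; the sketch assumes that sites and patterns are tuples in the language order Proto-Germanic, German, English, Dutch, with "Ø" for a missing reflex and "-" for a gap that counts as an ordinary value.

```python
MISSING = "Ø"  # no reflex known; the gap symbol "-" is treated as a real value


def reflects(site, pattern, missing=MISSING):
    """True if every attested sound of `site` agrees with `pattern`."""
    return len(site) == len(pattern) and all(
        sound == missing or sound == expected
        for sound, expected in zip(site, pattern)
    )


def coverage(site, missing=MISSING):
    """Share of languages for which the site attests a reflex."""
    return sum(sound != missing for sound in site) / len(site)


# Initial sites of 'thorn'/'thick' and of 'thorp' (English reflex missing).
pattern = ("θ", "d", "θ", "d")
thorp_initial = ("θ", "d", "Ø", "d")

incomplete_but_consistent = reflects(thorp_initial, pattern)
share_attested = coverage(thorp_initial)
```

A site with a gap in English, such as `("θ", "d", "-", "d")`, would not pass this check, because a lost sound is an observation, whereas "Ø" only records that no observation is available.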
10 Old English still has the word þorp, but in Modern English, we only find thorp in names.
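[Figure 3 content: the initial alignment sites of three alignments are assembled into one sound correspondence pattern]
                 ‘thorn’      ‘thick’     ‘thorp’
Proto-Germanic   θurn(uz)     θek(uz)     θurp(a)
German           dɔrn(--)     dɪk(--)     dɔrf(-)
English          θɔː-n(--)    θɪk(--)     Ø
Dutch            doːrn(--)    dɪk(--)     dɔrp(-)
Sound correspondence pattern (Proto-Germanic, German, English, Dutch): θ d θ d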
Figure 4
Assessing the compatibility of the four alignment sites from Figure 1.
3. Preliminary Thoughts on Correspondence Pattern Recognition
If we recall the problem we had in grouping the alignment sites E and F from Figure 1
with either A or C, we can see that the general problem of grouping alignment sites
to correspondence patterns is their compatibility. If we had reflexes for all languages
under investigation in all cognate sets, the compatibility would not be a problem,
since we could simply group all identical sites with each other, and the task could be
considered as solved. However, since it is rather the exception than the norm to have
all reflexes for all cognate sets in all languages, we will always find possible alternative
groupings for the alignment sites.
In the following, we will assume that two alignment sites are compatible if they
(a) share at least one sound that is not a gap symbol, and (b) do not have any
conflicting sounds. This is illustrated in Figure 4 for our four alignment sites A, C, E,
and F from Figure 1. As we can see from the figure, only two sites are incompatible,
namely A and C, as they show different sounds for the reflexes in Gothic. Given that the
reflex for Latin is missing in site C, we can further see that C shares only two sounds
with E and F.
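The two conditions can be written down directly. The following Python sketch is one possible reading of the definition, assuming that sites are tuples in a fixed language order, that "Ø" marks missing data, and that "-" is the gap symbol; the function name `compatible` is chosen for this illustration only.

```python
MISSING = "Ø"
GAP = "-"


def compatible(site_a, site_b, missing=MISSING, gap=GAP):
    """Check conditions (a) and (b) for two alignment sites."""
    shared = 0
    for a, b in zip(site_a, site_b):
        if a == missing or b == missing:
            continue          # missing data neither supports nor contradicts
        if a != b:
            return False      # (b) violated: conflicting sounds
        if a != gap:
            shared += 1       # counts towards (a): a shared non-gap sound
    return shared >= 1


# Language order: Sanskrit, Greek, Latin, Gothic (values as in Figure 1).
A = ("u", "u", "u", "u")
C = ("u", "u", "Ø", "au")
E = ("u", "u", "u", "Ø")
F = ("u", "u", "u", "Ø")

a_c = compatible(A, C)  # conflict in Gothic (u vs. au)
c_e = compatible(C, E)  # two shared sounds, no conflict
```

Because missing positions are skipped, compatibility in this sense is not transitive: E is compatible with both A and C, although A and C conflict.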
Having established the notion of alignment site compatibility, it is straightforward
to go a step further and model alignment sites in the form of a network. Here, all sites in
the data represent nodes (or vertices), and edges are only drawn between those nodes
that are compatible, following the criterion of compatibility outlined in the previous
section.11
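A minimal sketch of such a network, using plain Python dictionaries rather than any particular graph library, might look as follows. The adjacency representation (node name mapped to a set of neighbors) and the names `site_network` and `compatible` are assumptions for this example; edge weights are left out.

```python
from itertools import combinations

MISSING = "Ø"
GAP = "-"


def compatible(site_a, site_b):
    """Sites are compatible if they share a non-gap sound and never conflict."""
    pairs = [(a, b) for a, b in zip(site_a, site_b) if MISSING not in (a, b)]
    no_conflict = all(a == b for a, b in pairs)
    shares_sound = any(a == b and a != GAP for a, b in pairs)
    return no_conflict and shares_sound


def site_network(sites):
    """Build an undirected network: one node per site, edges for compatibility."""
    graph = {name: set() for name in sites}
    for left, right in combinations(sites, 2):
        if compatible(sites[left], sites[right]):
            graph[left].add(right)
            graph[right].add(left)
    return graph


# Language order: Sanskrit, Greek, Latin, Gothic (values as in Figure 1).
sites = {
    "A": ("u", "u", "u", "u"),
    "C": ("u", "u", "Ø", "au"),
    "E": ("u", "u", "u", "Ø"),
    "F": ("u", "u", "u", "Ø"),
}
graph = site_network(sites)
```

For the four example sites, every pair is connected except A and C.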
Having shown how the data can be modeled in the form of a network, we can
rephrase the task of identifying correspondence patterns as a network partitioning task
with the goal of splitting the network into non-overlapping sets of nodes. Given that our
main criterion for a valid correspondence pattern is full compatibility among all alignment
sites of a given partition, we can further specify the task as a clique partitioning
task. A clique in a network is “a maximal subset of the vertices [nodes] in an undirected
network such that every member of the set is connected by an edge to every other”
(Newman 2010, page 193). Demanding that sound correspondence patterns should
form a clique of compatible nodes in the network of alignment sites directly reflects
the basic practice of historical language comparison as outlined in Anttila (1972). Any
further grouping would require us to identify complementary phonetic environments
for the incompatible alignment sites.
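To make the partitioning task tangible, here is a deliberately simple greedy heuristic in Python. It is not the algorithm developed in the article and gives no guarantee of a minimum cover; it only shows what a clique partition of the example network looks like. The function name `greedy_clique_cover` and the ordering by number of neighbors are assumptions of this sketch.

```python
def greedy_clique_cover(graph):
    """Partition the nodes of `graph` (node -> set of neighbors) into cliques.

    Nodes are visited from most to fewest neighbors (ties broken by name) and
    placed in the first clique whose members are all neighbors of the node.
    """
    order = sorted(graph, key=lambda node: (-len(graph[node]), node))
    cliques = []
    for node in order:
        for clique in cliques:
            if all(member in graph[node] for member in clique):
                clique.append(node)
                break
        else:
            cliques.append([node])  # no fitting clique: start a new one
    return cliques


# Compatibility network of the sites A, C, E, and F discussed above.
graph = {
    "A": {"E", "F"},
    "C": {"E", "F"},
    "E": {"A", "C", "F"},
    "F": {"A", "C", "E"},
}
cover = greedy_clique_cover(graph)
```

With this visiting order, E and F end up together with A, and C forms a clique of its own; a different order could just as well group E and F with C, which mirrors the ambiguity discussed for Figure 1.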
11 We can further weight the edges in the alignment site network, for example, by using the number of
matching sounds (where no missing data is encountered) to represent the strength of the connection (but
we will disregard weighting in the approach presented here).
[Figure 4 content: pairwise comparison of the alignment sites A, C, E, and F]
           A/E      A/F      E/F      A/C      C/E      C/F
Sanskrit   u<=>u    u<=>u    u<=>u    u<=>u    u<=>u    u<=>u
Greek      u<=>u    u<=>u    u<=>u    u<=>u    u<=>u    u<=>u
Latin      u<=>u    u<=>u    u<=>u    u?Ø      Ø?u      Ø?u
Gothic     u?Ø      u?Ø      Ø?Ø      u>= …
Table 8
Examples for the word prediction experiment for the Chinese data. The column Frequency lists
the size of the inferred patterns for each position of the predicted word form. The score is
calculated by dividing the number of correctly predicted sounds by the total number of sounds.
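The score described in the caption can be expressed as a small Python function. This is a sketch under the assumption that predicted and attested forms are already segmented and have the same length; the function name `prediction_score` and the example forms are hypothetical and are not taken from the table.

```python
def prediction_score(predicted, attested):
    """Number of correctly predicted sounds divided by the total number of sounds."""
    if len(predicted) != len(attested):
        raise ValueError("predicted and attested forms must have the same length")
    if not attested:
        raise ValueError("cannot score an empty form")
    correct = sum(p == a for p, a in zip(predicted, attested))
    return correct / len(attested)


# Hypothetical forms for illustration: two of three sounds are predicted correctly.
score = prediction_score(["t", "a", "n"], ["tʰ", "a", "n"])
```

A score of 1.0 then means that every sound of the withheld word form was predicted correctly.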
(# 719, # 654), Táiběi (# 1096), and Nánchāng (# 484), respectively, we can assign # 719
and # 1096 to # 718, and # 484 and # 654 to # 73. In patterns # 718 and # 747, only Fúzhōu
shows a different reflex. Since we have forms that are homophones in Middle Chinese
in both correspondence patterns ([…] in # 718 and […] in # 747 were both pronounced as
*dam in Middle Chinese), we cannot find a conditioning context that would explain this
difference from the perspective of Middle Chinese alone. We know, however, that the
Mǐn dialects (to which Fúzhōu belongs) reflect features that are more archaic than Middle
Chinese. In this case, the difference between the patterns regularly reflects the
difference between plain voiced and breathy voiced initials in the ancestor of the Mǐn
dialects, with the latter going back to complex onsets in Old Chinese, the predecessor of
all Chinese dialects (Baxter and Sagart 2014, page 171f). Furthermore, if we compare
patterns # 747 and # 73 directly, we can see that, although only Fúzhōu has a direct reflex
of the original voiced sound in Middle Chinese, we can still find its traces in the different
correspondence patterns, since Běijīng and Guǎngzhōu have contrastive outcomes in
both patterns ([tʰ] versus [t]). When inspecting the tones that are reconstructed for the
different words in Middle Chinese, we can easily find a conditioning context where the
reflexes differ. The píng (flat) tone category in Middle Chinese correlates with aspiration,
while the other tone categories correlate with devoicing in the three dialects.25 If we had
no knowledge of Middle Chinese, it would be harder to understand that both patterns
correspond to the same proto-sound, but once assembled in such a way, it would still be
much easier for scholars to search for a conditioning context that allows them to assign
the same proto-sound to the two patterns in question.
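The search for a conditioning context described above can be illustrated with a small sketch. It assumes that each cognate set has been annotated with the identifier of its correspondence pattern and with a candidate context, here the Middle Chinese tone category (P, S, Q, R). The pattern labels "A" and "B", the function names, and the observations are invented for illustration and do not reproduce the data of this study.

```python
from collections import defaultdict


def contexts_by_pattern(observations):
    """Collect the set of contexts attested for each pattern.

    observations: iterable of (pattern_id, context) pairs.
    """
    contexts = defaultdict(set)
    for pattern_id, context in observations:
        contexts[pattern_id].add(context)
    return dict(contexts)


def complementary(contexts, pattern_a, pattern_b):
    """True if both patterns are attested and never share a context."""
    a = contexts.get(pattern_a, set())
    b = contexts.get(pattern_b, set())
    return bool(a) and bool(b) and a.isdisjoint(b)


# Invented toy annotation: pattern "A" only occurs with tone P,
# pattern "B" only with the remaining tone categories.
observations = [("A", "P"), ("A", "P"), ("B", "S"), ("B", "Q"), ("B", "R")]
contexts = contexts_by_pattern(observations)
print(complementary(contexts, "A", "B"))
```

If two patterns turn out to be complementary with respect to some context, they are candidates for being assigned to the same proto-sound; whether the context is linguistically plausible remains a question for the analyst.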
The example shows that, as far as the Middle Chinese dental stops are concerned,
we do not find explicit exceptions in our data, but can rather see that multiple
correspondence patterns for the same proto-sound may easily evolve. We can also see that a
careful alignment and cognate annotation is crucial for the success of the method, but
25 This phenomenon most likely goes back to an earlier phonation contrast between the first (píng) tone in
Middle Chinese and the other tones.
List
Correspondence Pattern Inference
Table 9
Contrasting inferred correspondence patterns with Middle Chinese reconstructions (MC) and
tone patterns (MC Tones: P: píng (flat), S: shǎng (rising), Q: qù (falling), R: rù (stop coda)) for
representative dialects of the major groups (Běijīng, Sūzhōu, Chángshā, Nánchāng, Méixiàn,
Táoyuán, Guǎngzhōu, Fúzhōu, Táiběi).
even if the cognate judgments are fine, but the data are sparse, the method may propose
erroneous groupings.
In contrast to manual work on linguistic reconstruction, where correspondence pat-
terns are never regarded in the detail in which they are presented here, the method has
the potential to drastically increase both the transparency and the quality of linguistic
data sets, especially in combination with tools for cognate annotation, like EDICTOR, to
which we added a convenient way to inspect inferred correspondence patterns interactively
(see the example in Appendix A). Because linguists can run the new method on
their data and then directly inspect the consequences by browsing all correspondence
patterns conveniently in the EDICTOR, the method makes it a lot easier for linguists to
come up with first reconstructions or to identify problems in the data.
6. Conclusion and Outlook
This study has presented a new method for the inference of sound correspondence
patterns in multilingual wordlists. Thanks to its integration with popular software
packages, the method can be easily applied within both automated and computer-assisted
workflows. The usefulness of the method was illustrated by showing how it
can be used to predict missing words in linguistic data sets. The method, however, has
much additional potential. Since the method can impute words not attested in existing
languages, it could likewise be used for the automatic reconstruction of proto-forms, the
identification of cognates, or the assessment of the general regularity of a given data set.
In addition to revealing potential correspondence structures underlying a given data
set, the method can also help to assess how well a given data set has been
analyzed before. By helping to improve the quality and transparency of existing and
future data sets in historical linguistics in this way, we hope that the method will in the
long run also contribute to new and important findings about the past of our world’s
languages.
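How inferred patterns can be used to impute an unattested sound may be sketched as follows. The sketch assumes that a pattern is a mapping from language to reflex with a frequency (the number of alignment sites it covers), that missing data is marked with "Ø", and that the most frequent compatible pattern is preferred. The function name `predict_sound`, the language labels L1 to L3, and the patterns are invented; this is a simplified illustration and not the implementation in the accompanying library.

```python
MISSING = "Ø"  # assumed marker for missing data


def predict_sound(site, patterns, target):
    """Predict the reflex of `target` for a partially attested site.

    site: dict mapping language -> sound (may contain MISSING).
    patterns: list of (dict language -> sound, frequency) pairs.
    Returns the reflex of the most frequent pattern that agrees with
    all attested sounds of the site, or None if no pattern fits.
    """
    best_sound, best_freq = None, 0
    for pattern, frequency in patterns:
        candidate = pattern.get(target, MISSING)
        if candidate == MISSING:
            continue  # pattern says nothing about the target language
        # Pairs of (attested sound, pattern reflex) in shared languages.
        shared = [
            (sound, pattern[lang])
            for lang, sound in site.items()
            if lang != target
            and sound != MISSING
            and pattern.get(lang, MISSING) != MISSING
        ]
        if shared and all(a == b for a, b in shared) and frequency > best_freq:
            best_sound, best_freq = candidate, frequency
    return best_sound


# Invented toy patterns over three languages.
patterns = [
    ({"L1": "tʰ", "L2": "tʰ", "L3": "t"}, 12),
    ({"L1": "t", "L2": "t", "L3": "d"}, 30),
]
print(predict_sound({"L1": "t", "L2": "Ø"}, patterns, "L3"))
```

Repeating this step for every alignment site of a cognate set yields a predicted word form for the language in which the word is not attested.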
Supplementary Material
The supplementary material accompanying this article contains the code and all in-
structions needed to repeat the experiments described in this article. The original
package for correspondence pattern detection is publicly available from GitHub under
https://github.com/lingpy/lingrex (Version 0.1.0). The package providing the supplementary
material with results and instructions for running the code is also available
via GitHub under https://github.com/lingpy/correspondence-pattern-paper
(Version 1.1.1) and has been archived with Zenodo at https://doi.org/10.5281/zenodo.1544949.
Appendix A: Inspecting Correspondence Patterns in EDICTOR
The following screenshots show how the modified version of the EDICTOR allows for
an enhanced inspection of sound correspondence patterns inferred by the method.
Acknowledgments
This research was funded by the DFG
research fellowship grant 261553824
“Vertical and lateral aspects of Chinese
dialect history” (2015–2016), and by
the ERC Starting Grant 715618 “Computer-
Assisted Language Comparison”
(http://calc.digling.org, 2017–2018).
Originally, the approach presented here was
inspired by a novel (so far still unpublished)
biological technique presented to me by Eric
Bapteste and Philippe Lopez, which later
turned out to be completely different from
the one presented here, as I misunderstood
the original intention of the draft. This
misunderstanding, which helped me to
address a problem that had been following
me for a long time, reflects how inspiring my
collaboration with Eric and Philippe was.
I am particularly indebted to Nathan W. Hill
for supporting this project from the
beginning, by discussing the findings, the
methods, and their potential improvement.
I am also extremely thankful to Taraka Rama
for commenting on many previous versions
of this draft and the code, discussing details
and recommending enhancements, as well as
to Simon J. Greenhill for his support after I
received the first reviews, and Mary
Walworth for helping with data. Timotheus
Bodt also deserves special thanks for being
an early tester of the methods. Furthermore,
many people provided helpful comments on
earlier versions of this article, including
Adam Powell, David A. S. Moslehi, Eugen
Hill, Juho Pystynen, Martin Kümmel, Rémy
Viredaz, Tiago Tresoldi, and Yoram Meroz, to
whom I would also like to express my gratitude.
References
Anttila, Raimo. 1972. An Introduction to Historical and Comparative Linguistics, Macmillan, New York.
Arnaud, Adam S., David Beck, and Grzegorz Kondrak. 2017. Identifying cognate sets across dictionaries of related languages. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2509–2518, Association for Computational Linguistics.
Baxter, William H. 1992. A Handbook of Old Chinese Phonology. de Gruyter, Berlin.
Baxter, William H., and Laurent Sagart. 2014. Old Chinese. A New Reconstruction. Oxford University Press, Oxford.
Beekes, Robert S. P. 1995. Comparative Indo-European Linguistics. An Introduction. John Benjamins, Amsterdam and Philadelphia.
Bhasker, J., and Tariq Samad. 1991. The clique-partitioning problem. Computers & Mathematics with Applications, 22(6):1–11.
Blust, Robert. 1990. Patterns of sound change in the Austronesian languages. In Philip Baldi, editor, Linguistic Change and Reconstruction Methodology. Mouton de Gruyter, Berlin; New York, pages 231–270.
Brown, Cecil H., David Beck, Grzegorz Kondrak, James K. Watters, and Søren Wichmann. 2011. Totozoquean. International Journal of American Linguistics, 77(3):323–372.
Brown, Cecil H., Eric W. Holman, and Søren Wichmann. 2013. Sound correspondences in the world’s languages. Language, 89(1):4–29.
Campbell, Lyle, and William John Poser. 2008. Language Classification: History and Method. Cambridge University Press, Cambridge.
Clackson, James. 2007. Indo-European Linguistics. Cambridge University Press, Cambridge.
Covington, Michael A. 1996. An algorithm to align words for historical comparison. Computational Linguistics, 22(4):481–496.
Dixon, R. B., and A. L. Kroeber. 1919. Linguistic Families of California. University of California Press, Berkeley.
Dybo, Anna, and George S. Starostin. 2008. In defense of the comparative method, or the end of the Vovin controversy. In I. S. Smirnov, editor, Aspekty komparativistiki, Volume 3. RGGU, Moscow, pages 119–258.
Fox, Anthony. 1995. Linguistic Reconstruction. Oxford University Press, Oxford.
Greenhill, Simon J., Robert Blust, and Russell D. Gray. 2008. The Austronesian Basic Vocabulary Database: From bioinformatics to lexomics. Evolutionary Bioinformatics, 4:271–283.
Grimes, Joseph E., and Frederick B. Agard. 1959. Linguistic divergence in Romance. Language, 35(4):598–604.
Guy, Jacques B. M. 1994. An algorithm for identifying cognates in bilingual wordlists and its applicability to machine translation. Journal of Quantitative Linguistics, 1(1):35–42.
Hattori, Shirō. 1973. Japanese dialects. In Henry M. Hoenigswald and Robert H. Langacre, editors. Diachronic, Areal and Typological Linguistics, Number 11 in Current Trends in Linguistics. Mouton, The Hague and Paris, pages 368–400.
Hetland, Magnus Lie. 2010. Python Algorithms. Mastering Basic Algorithms in the Python Language. Apress, New York.
Hill, Nathan W., and Johann-Mattis List. 2017. Challenges of annotation and analysis in computer-assisted language comparison: A case study on Burmish languages. Yearbook of the Poznań Linguistic Meeting, 3(1):47–76.
Hoenigswald, Henry Max. 1960. Language Change and Linguistic Reconstruction, 4. aufl. 1966 edition. The University of Chicago Press, Chicago.
Holton, Gary, Marian Klamer, František Kratochvíl, Laura C. Robinson, and Antoinette Schapper. 2012. The historical relations of the Papuan languages of Alor and Pantar. Oceanic Linguistics, 51(1):86–122.
Hóu, Jīngī. 2004. Xiàndài Hànyǔ fāngyán yīnkù [Phonological Database of Chinese Dialects], Shànghǎi Jiàoyù, Shànghǎi.
Huáng, Bùfán. 1992. Zàngmiǎn yǔzú yǔyán cíhuì, Zhōngyāng Mínzú Dàxué [Central Institute of Minorities], Beijīng.
Jacques, Guillaume. 2017. A reconstruction of Proto-Kiranti verb roots. Folia Linguistica Historica, 38(1):177–215.
Jäger, Gerhard, Johann-Mattis List, and Pavel Sofroniev. 2017. Using support vector machines and state-of-the-art algorithms for phonetic alignment to identify cognates in multi-lingual wordlists. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Long Papers. pages 1204–1215.
Kay, Martin. 1964. The Logic of Cognate Recognition in Historical Linguistics. The RAND Corporation, Santa Monica.
Kondrak, Grzegorz. 2000. A new algorithm for the alignment of phonetic sequences. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, pages 288–295.
Kondrak, Grzegorz. 2002. Determining recurrent sound correspondences by inducing translation models. In Nineteenth International Conference on Computational Linguistics, pages 488–494, Taipei.
Kondrak, Grzegorz. 2003. Identifying complex sound correspondences in bilingual wordlists. In Alexander Gelbukh, editor. Computational Linguistics and Intelligent Text Processing. Springer, Berlin, pages 432–443.
Kondrak, Grzegorz. 2009. Identification of cognates and recurrent sound correspondences in word lists. Traitement Automatique des Langues, 50(2):201–235.
Kroonen, Guus. 2013. Etymological Dictionary of Proto-Germanic. Number 11 in Leiden Indo-European Etymological Dictionary Series. Brill, Leiden and Boston.
List, Johann-Mattis. 2012. LexStat. Automatic detection of cognates in multilingual wordlists. In Proceedings of the EACL 2012 Joint Workshop of Visualization of Linguistic Patterns and Uncovering Language History from Multilingual Resources, pages 117–125, Stroudsburg.
List, Johann-Mattis. 2014. Sequence Comparison in Historical Linguistics. Düsseldorf University Press, Düsseldorf.
List, Johann-Mattis. 2016. Computer-assisted language comparison: Reconciling computational and classical approaches in historical linguistics. Technical report, Max Planck Institute for the Science of Human History, Jena.
List, Johann-Mattis. 2017. A web-based interactive tool for creating, inspecting, editing, and publishing etymological datasets. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. System Demonstrations. pages 9–12, Association for Computational Linguistics, Valencia.
List, Johann-Mattis, Simon Greenhill, and Robert Forkel. 2017. LingPy. A Python Library for Quantitative Tasks in Historical Linguistics. Max Planck Institute for the Science of Human History, Jena.
List, Johann-Mattis, Simon J. Greenhill, and Russell D. Gray. 2017. The potential of automatic word comparison for historical linguistics. PLOS ONE, 12(1):1–18.
McMahon, April, and Robert McMahon. 2005. Language Classification by Numbers. Oxford University Press, Oxford.
Meier-Brügger, Michael. 2002. Indogermanische Sprachwissenschaft, 8th edition. de Gruyter, Berlin and New York.
Meillet, Antoine. 1908. Les dialectes Indo-Européens, Librairie Ancienne Honoré Champion, Paris.
Meillet, Antoine. 1954. La méthode comparative en linguistique historique, reprint edition. Honoré Champion, Paris.
Needleman, Saul B., and Christian D. Wunsch. 1970. A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 48:443–453.
Newman, M. E. J. 2010. Networks. An Introduction. Oxford University Press, Oxford.
Prokić, Jelena, Martijn Wieling, and John Nerbonne. 2009. Multiple sequence alignments in linguistics. In Proceedings of the EACL 2009 Workshop on Language Technology and Resources for Cultural
Heritage, Social Sciences, Humanities, and Education, pages 18–25.
Ross, Malcolm, and Mark Durie. 1996. Introduction. In Durie, Mark, editor. The Comparative Method Reviewed. Regularity and Irregularity in Language Change. Oxford University Press, New York, pages 3–38.
Wagner, Robert A., and Michael J. Fischer. 1974. The string-to-string correction problem. Journal of the Association for Computing Machinery, 21(1):168–173.
Walworth, Mary. 2018. Polynesian segmented data. Version 1. Zenodo. http://doi.org/10.5281/zenodo.1689909.
Weiss, Michael. 2015. The comparative method. In Claire Bowern and Nicholas Evans, editors. The Routledge Handbook of Historical Linguistics, 1st edition, Routledge Handbooks in Linguistics. Routledge, New York, pages 127–145.
Welsh, D. J. A., and M. B. Powell. 1967. An upper bound for the chromatic number of a graph and its application to timetabling problems. The Computer Journal, 10(1):85–86.
Wright, Joseph. 1910. Grammar of the Gothic Language, 2 edition. Clarendon Press, Oxford.