On the Relationships Between the Grammatical Genders of Inanimate
Nouns and Their Co-Occurring Adjectives and Verbs
Adina Williams∗1 Ryan Cotterell∗,2,3 Lawrence Wolf-Sonkin4
Damián Blasi5 Hanna Wallach6
1Facebook AI Research
2ETH Zürich
3University of Cambridge
4Johns Hopkins University
5Universität Zürich
6Microsoft Research
adinawilliams@fb.com
ryan.cotterell@inf.ethz.ch
lawrencews@jhu.edu
damian.blasi@uzh.ch
wallach@microsoft.com
Abstract
We use large-scale corpora in six different
gendered languages, along with tools from
NLP and information theory, to test whether
there is a relationship between the grammatical
genders of inanimate nouns and the adjectives
used to describe those nouns. For all six
languages, we find that there is a statistically
significant relationship. We also find that
there are statistically significant relationships
between the grammatical genders of inanimate
nouns and the verbs that take those nouns
as direct objects, as indirect objects, and as
subjects. We defer deeper investigation of
these relationships for future work.
1
Introduction
In many languages, nouns possess grammati-
cal genders. When a noun refers to an animate
object, its grammatical gender typically reflects
the biological sex or gender identity of that
object (Zubin and Köpcke, 1986; Corbett, 1991;
Kramer, 2014). For example, in German, the word
for a boss is grammatically feminine when it
refers to a woman, but grammatically masculine
when it refers to a man (Chefin and Chef, respectively).
But inanimate nouns (i.e., nouns that
refer to inanimate objects) also possess grammatical
genders. Any German speaker will tell you
that the word for a bridge, Brücke, is grammatically
feminine, even though bridges have neither
biological sexes nor gender identities. Histori-
cally, the grammatical genders of inanimate nouns
have been considered more idiosyncratic and
∗Equal contribution.
less meaningful than the grammatical genders
of animate nouns (Brugmann, 1889; Bloomfield,
1933; Fox, 1990; Aikhenvald, 2000). However,
some cognitive scientists have reopened this dis-
cussion by using laboratory experiments to test
whether speakers of gendered languages reveal
gender stereotypes (Sera et al., 1994)—for exam-
ple, and most famously, when choosing adjectives
to describe inanimate nouns (Boroditsky et al.,
2003).
Although laboratory experiments are highly
informative, they typically involve small sample
sizes. In this paper, we therefore use large-scale
corpora and tools from NLP and information
theory to test whether there is a relationship
between the grammatical genders of inanimate
nouns and the adjectives used to describe those
nouns. Specifically, we calculate the mutual information
(MI), a measure of the mutual statistical
dependence between two random variables,
between the grammatical genders of inanimate
nouns and the adjectives that describe them (i.e.,
share a dependency arc labeled AMOD) using large-
scale corpora in six different gendered languages
(specifically, German, Italian, Polish, Portuguese,
Russian, and Spanish). For all six languages, we
find that the MI is statistically significant, meaning
that there is a relationship.
We also test whether there are relationships
between the grammatical genders of inanimate
nouns and the verbs that take those nouns as direct
objects, as indirect objects, and as subjects. For all
six languages, we find that there are statistically
significant relationships for the verbs that take
those nouns as direct objects and as subjects. For
five of the six languages, we also find that there
is a statistically significant relationship for the
verbs that take those nouns as indirect objects, but
Transactions of the Association for Computational Linguistics, vol. 9, pp. 139–159, 2021. https://doi.org/10.1162/tacl_a_00355
Action Editor: Sebastian Padó. Submission batch: 3/2020; Revision batch: 7/2020; Published 3/2021.
© 2021 Association for Computational Linguistics. Distributed under a CC-BY 4.0 license.
because of the small number of noun–verb pairs
involved, we caution against reading too much
into this finding.
To contextualize our findings, we test whether
there are statistically significant
relationships
between the grammatical genders of inanimate
nouns and the cases and numbers of these nouns.
A priori, we do not expect to find statistically
significant relationships, so these tests can be
viewed as a baseline of sorts. As expected, for
each of the six languages, there are no statistically
significant relationships.
To provide further context, we also repeat all
tests for animate nouns (a ‘‘skyline’’ of sorts),
finding that for all six languages there
is a statistically significant relationship between
the grammatical genders of animate nouns and
the adjectives used to describe those nouns. We
also find that there are statistically significant
relationships between the grammatical genders
of animate nouns and the verbs that take those
nouns as direct objects, as indirect objects, and
as subjects. All of these relationships have effect
sizes (operationalized as normalized MI values)
that are larger than the effect sizes for inanimate
nouns.
We emphasize that the practical significance
and implications of our findings require deeper
investigation. Most importantly, we do not
investigate the characteristics of the relationships
that we find. This means that we do not know
whether these relationships are characterized by
gender stereotypes, as argued by some cognitive
scientists. We also do not engage with the ways
that historical and sociopolitical factors affect the
grammatical genders possessed by either animate
or inanimate nouns (Fodor, 1959; Ibrahim, 2014).
2 Background
2.1 Grammatical Gender
Languages lie along a continuum with respect
to whether nouns possess grammatical genders.
Languages with no grammatical genders, like
Turkish, lie on one end of this continuum, while
languages with tens of gender-like classes, like
Swahili (Corbett, 1991), lie on the other. In this
paper, we focus on six different gendered languages
for which large-scale corpora are readily
available: German, Italian, Polish, Portuguese,
Russian, and Spanish, all languages of Indo-European
descent. Three of these languages
(Italian, Portuguese, and Spanish) have two
grammatical genders (masculine and feminine),
while the other three have three grammatical
genders (masculine, feminine, and neuter).
All six languages exhibit gender agreement,
meaning that words are marked with morpholog-
ical suffixes that reflect the grammatical genders
of their surrounding nouns (Corbett, 2006). For
example, consider the following translations of
the sentence, ‘‘The delicate fork is on the cold
ground.’’
(1) Die zierliche Gabel steht auf dem kalten Boden.
the.F.SG.NOM delicate.F.SG.NOM fork.F.SG.NOM stands on the.M.SG.DAT cold.M.SG.DAT ground.M.SG.DAT
The delicate fork is on the cold ground.
(2) El tenedor delicado está en el suelo frío.
the.M.SG fork.M.SG delicate.M.SG is on the.M.SG ground.M.SG cold.M.SG
The delicate fork is on the cold ground.
Because the German word for a fork, Gabel, is
grammatically feminine, the German translation
uses the feminine determiner, die. Had Gabel been
masculine, the German translation would have
used the masculine determiner, der. Similarly,
because the Spanish word for a fork, tenedor, is
grammatically masculine, the Spanish translation
uses the masculine determiner, el, instead of
the feminine determiner, la. As we explain in
Section 3, we lemmatize each corpus to ensure
that our tests do not simply reflect the presence of
gender agreement.
2.2 Grammatical Gender & Meaning
Although some scholars have described the
grammatical genders possessed by inanimate
nouns as ‘‘creative’’ and meaningful (Grimm,
1890; Wheeler, 1899), many scholars have
considered them to be idiosyncratic (Brugmann,
1889; Bloomfield, 1933) or arbitrary (Maratsos,
1979, p. 317). In an overview of this work,
Dye et al. (2017) wrote, ‘‘As often as not, the
languages of the world assign [inanimate] objects
into seemingly arbitrary [classes]… William of
Ockham considered gender to be a meaningless,
unnecessary aspect of language.’’ Bloomfield
(1933) shared this viewpoint, stating that ‘‘[t]here
seems to be no practical criterion by which the
gender of a noun in German, French, or Latin
[can] be determined.’’ Indeed, adult language
learners often have particular difficulty mastering
the grammatical genders of inanimate nouns
(Franceschina, 2005, Ch. 4; DeKeyser, 2005;
Montrul et al., 2008), which suggests that their
meanings are not straightforward.
Even if the grammatical genders possessed by
inanimate nouns are meaningless, ample evidence
suggests that gender-related information may af-
fect cognitive processes (Sera et al., 1994; Cubelli
et al., 2005, 2011; Kurinski and Sera, 2011;
Boutonnet et al., 2012; Saalbach et al., 2012).
Typologists and formal linguists have argued that
grammatical genders are an important feature for
morphosyntactic processes (Corbett, 1991, 2006;
Harbour et al., 2008; Harbour, 2011; Kramer,
2014, 2015), while some cognitive scientists
have shown that grammatical genders can be a
perceptual cue—for example, human brain res-
ponses exhibit sensitivity to gender mismatches
in several different
languages (Osterhout and
Mobley, 1995; Hagoort and Brown, 1999;
Vigliocco et al., 2002; Wicha et al., 2003, 2004;
Barber et al., 2004; Barber and Carreiras, 2005;
Bañón et al., 2012; Caffarra et al., 2015), and
the grammatical genders of determiners and
adjectives can prime nouns (Bates et al., 1996;
Akhutina et al., 1999; Friederici and Jacobsen,
1999). However, the precise nature of the
relationship between grammatical gender and
meaning remains an open research question.
In particular, the grammatical genders possessed
by inanimate nouns might affect the ways
that speakers of gendered languages conceptualize
the objects referred to by those nouns (Jakobson,
1959; Clarke et al., 1981; Ervin-Tripp, 1962;
Konishi, 1993; Sera et al., 1994, 2002; Vigliocco
et al., 2005; Bassetti, 2007), although we note
that this viewpoint is somewhat contentious
(Hofstätter, 1963; Bender et al., 2011; McWhorter,
2014). Neo-Whorfian cognitive scientists hold
a particularly strong variant of this viewpoint,
arguing that the grammatical genders possessed
by inanimate nouns prompt speakers of gendered
languages to rely on gender stereotypes when
choosing adjectives to describe those nouns
(Boroditsky and Schmidt, 2000; Boroditsky et al.,
2002; Phillips and Boroditsky, 2003; Boroditsky,
2003; Boroditsky et al., 2003; Semenuks et al.,
2017). Most famously, Boroditsky et al. (2003)
claim to have conducted a laboratory experiment
showing that speakers of German choose
stereotypically feminine adjectives to describe,
for example, bridges, while speakers of Spanish
choose stereotypically masculine adjectives,
reflecting the fact that in German, the word
for a bridge, Brücke, is grammatically feminine,
while in Spanish, the word for a bridge, puente,
is grammatically masculine. Boroditsky et al.
(2003) took these findings to be a relatively strong
confirmation of the existence of a stereotype
effect—that is, that speakers of gendered lan-
guages reveal gender stereotypes when choosing
adjectives to describe inanimate nouns. That said,
the experiment has not gone unchallenged. Indeed,
Mickan et al. (2014) reported two unsuccessful
replication attempts.
2.3 Laboratory Experiments vs. Corpora
Traditionally, studies of grammatical gender and
meaning have relied on laboratory experiments.
This is for two reasons: 1) laboratory experiments
can be tightly controlled, and 2) they enable
scholars to measure speakers’ immediate, real-time
speech production. However, they also
typically involve small sample sizes and, in many
cases, somewhat artificial settings. In contrast,
large-scale corpora of written text enable scholars
to measure even relatively weak correlations via
writers’ text production in natural, albeit
less
tightly controlled, settings. They also facilitate
the discovery of correlations that hold across
languages with disparate histories, cultural contexts,
and even gender systems. Consequently, large-scale
corpora have proven useful for studying a
wide variety of language-related phenomena (e.g.,
Featherston and Sternefeld, 2007; Kennedy, 2014;
Blasi et al., 2019).
In this paper, we assume that a writer’s choice
of words in written text is as informative as a
speaker’s choice of words in a laboratory expe-
riment, despite the obvious differences between
these settings. Consequently, we use large-scale
corpora and tools from NLP and information
theory, enabling us to test for the presence
of even relatively weak relationships involving
the grammatical genders of inanimate nouns
across multiple different gendered languages. We
therefore argue that our findings complement,
rather than supersede, laboratory experiments.
Figure 1: Dependency tree for the sentence, ‘‘Yo quiero cruzar un puente robusto.’’
2.4 Related Work
Our paper is not the first to use large-scale corpora
and tools from NLP to investigate gender and
language. Many scholars have studied the ways
that societal norms and stereotypes, including
gender norms and stereotypes, can be reflected in
representations of distributional semantics derived
from large-scale corpora, such as word embeddings
(Bolukbasi et al., 2016; Caliskan et al.,
2017; Garg et al., 2018; Zhao et al., 2018).
More recently, Williams et al. (2019) found
that the grammatical genders of inanimate nouns
in eighteen different languages were correlated
with their lexical semantics. Dye et al. (2017)
used tools from information theory to reject
the idea that the grammatical genders of nouns
separate those nouns into coherent categories,
arguing instead that grammatical genders are only
meaningful in that they systematically facilitate
communication efficiency by reducing nominal
entropy. Also relevant to our paper is the work
of Kann (2019), who proposed a computational
approach to testing whether there is a relationship
between the grammatical genders of inanimate
nouns and the words that co-occur with those
nouns, operationalized via word embeddings.
However, in contrast to our findings, they found no
evidence for the presence of such a relationship.
Finally, many scholars have proposed a variety
of computational techniques for mitigating gender
norms and stereotypes in a wide range of language-based
applications (Dev and Phillips, 2019; Dinan
et al., 2019; Ethayarajh et al., 2019; Hall Maudslay
et al., 2019; Stanovsky et al., 2019; Tan and Celis,
2019; Zhou et al., 2019; Zmigrod et al., 2019).
3 Data Preparation
We use the May 2018 dump of Wikipedia to
create a corpus for each of the six different
gendered languages (i.e., German, Italian, Polish,
Portuguese, Russian, and Spanish). Although
Wikipedia is not the most representative data
source, this choice yields language-specific
corpora that are roughly parallel; that is, they
refer to the same objects, but are not direct
translations of each other (which could lead to
artificial word choices). We use UDPipe to tokenize
each corpus (Straka et al., 2016).
We dependency parse the corpus for each
language using a language-specific dependency
parser (Andor et al., 2016; Alberti et al., 2017),
trained using Universal Dependencies treebanks
(Nivre et al., 2017). An example dependency
tree is shown in Figure 1. We then extract all
noun–adjective pairs (dependency arcs labeled
AMOD) and noun–verb pairs from each of the
six corpora; for verbs, we extract three types of
pairs, reflecting the fact that nouns can be direct
objects (dependency arcs labeled DOBJ), indirect
objects (dependency arcs labeled IOBJ), or subjects
(dependency arcs labeled NSUBJ) of verbs. We
discard all pairs that contain a noun that is not
present in WordNet (Princeton University, 2010).
We label the remaining nouns as ‘‘animate’’ or
‘‘inanimate’’ according to WordNet.
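The extraction step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Token` record, its field names, the lowercase relation labels, and the hand-written parse of the Figure 1 sentence are all assumptions made for illustration, and the WordNet filtering and animacy labeling are omitted.

```python
from collections import namedtuple

# Hypothetical token record: 1-based index, surface form, coarse POS tag,
# index of the syntactic head (0 = root), and dependency label.
Token = namedtuple("Token", "idx form upos head deprel")

# Assumed parse of "Yo quiero cruzar un puente robusto." (illustrative only).
SENTENCE = [
    Token(1, "Yo", "PRON", 2, "nsubj"),
    Token(2, "quiero", "VERB", 0, "root"),
    Token(3, "cruzar", "VERB", 2, "xcomp"),
    Token(4, "un", "DET", 5, "det"),
    Token(5, "puente", "NOUN", 3, "dobj"),
    Token(6, "robusto", "ADJ", 5, "amod"),
]

def extract_pairs(sentence):
    """Collect (noun, adjective) and (noun, verb) pairs by dependency label."""
    by_idx = {tok.idx: tok for tok in sentence}
    pairs = {"amod": [], "dobj": [], "iobj": [], "nsubj": []}
    for tok in sentence:
        head = by_idx.get(tok.head)
        if head is None:  # the root token has no head
            continue
        if tok.deprel == "amod" and tok.upos == "ADJ" and head.upos == "NOUN":
            # The adjective depends on the noun it modifies.
            pairs["amod"].append((head.form, tok.form))
        elif (tok.deprel in ("dobj", "iobj", "nsubj")
              and tok.upos == "NOUN" and head.upos == "VERB"):
            # The noun depends on the verb it is an argument of.
            pairs[tok.deprel].append((tok.form, head.form))
    return pairs

PAIRS = extract_pairs(SENTENCE)
```

Under the assumed parse, the pronoun subject is skipped because only nouns are paired, so the sentence contributes one noun–adjective pair and one direct-object pair.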
Next, we lemmatize all words (i.e., nouns,
adjectives, and verbs). Each word is factored into
a set of lexical features consisting of a lemma,
or canonical morphological form, and a bundle
of three morphological features corresponding
to the grammatical gender, number, and case of
that word. For example, the German word for a
fork, Gabel, is grammatically feminine, singular,
and genitive. For nouns, we discard the lemmas
themselves and retain only the morphological
features; for adjectives and verbs, we retain the
lemmas and discard the morphological features.
For adjectives and verbs,
lemmatizing is
especially important because it ensures that our
tests do not simply reflect
the presence of
gender agreement, as we describe in Section 2.1.
However, this means that if the lemmatizer fails,
then our tests may simply reflect gender agreement
despite our best efforts. To guard against this, we
use a state-of-the-art lemmatizer (Müller et al.,
2015), trained for each language using Universal
Dependencies treebanks (Nivre et al., 2017). We
expect that when the lemmatizer fails, the resulting
lemmata will be low frequency. We try to exclude
lemmatization failures from our calculations by
discarding low-frequency lemmata. For each
language, we rank the adjective lemmata by
their token counts and retain only the highest-
ranked lemmata (in rank order) that account for
90% of the adjective tokens; we then discard
all noun–adjective pairs that do not contain one
of these lemmata. We repeat the same process
for verbs.
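The frequency filter just described can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, and the sketch assumes that the lemma whose count crosses the 90% threshold is itself retained, a detail the text does not specify.

```python
from collections import Counter

def retain_top_lemmata(lemma_tokens, coverage=0.9):
    """Return the highest-ranked lemmata that together cover `coverage` of tokens."""
    counts = Counter(lemma_tokens)
    total = sum(counts.values())
    kept, covered = set(), 0
    for lemma, count in counts.most_common():  # descending token count
        if covered >= coverage * total:
            break
        kept.add(lemma)  # assumption: the lemma crossing the threshold is kept
        covered += count
    return kept

def filter_pairs(pairs, coverage=0.9):
    """Drop (noun feature, lemma) pairs whose lemma falls outside the retained set."""
    kept = retain_top_lemmata([lemma for _, lemma in pairs], coverage)
    return [(feature, lemma) for feature, lemma in pairs if lemma in kept]
```

For instance, with token counts of 6, 3, 2, and 1 for four lemmata, the first three cover 11 of 12 tokens and are retained, while the rarest lemma is discarded.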
Finally, to ensure that our tests reflect the
most salient relationships, we also discard
low-frequency inanimate nouns and, separately,
low-frequency animate nouns using the same
process. We provide counts of the remaining
noun–adjective and noun–verb pairs in Table 3
(for inanimate nouns) and Table 4 (for animate
nouns).
4 Methodology
For each language $\ell \in \{\text{de}, \text{it}, \text{pl}, \text{pt}, \text{ru}, \text{es}\}$, we
define $\mathcal{V}^{\ell}_{\text{ADJ}}$ to be the set of adjective lemmata
represented in the noun–adjective pairs retained
for that language as defined above. We similarly
define $\mathcal{V}^{\ell}_{\text{VERB}}$ to be the set of verb lemmata
represented in the noun–verb pairs retained for
that language, as described above. We then
define $\mathcal{V}^{\ell}_{\text{VERB-DOBJ}} \subset \mathcal{V}^{\ell}_{\text{VERB}}$,
$\mathcal{V}^{\ell}_{\text{VERB-IOBJ}} \subset \mathcal{V}^{\ell}_{\text{VERB}}$, and
$\mathcal{V}^{\ell}_{\text{VERB-SUBJ}} \subset \mathcal{V}^{\ell}_{\text{VERB}}$ to be the sets of verbs that take the
nouns as direct objects, as indirect objects, and as
subjects, respectively. We also define $\mathcal{G}^{\ell}$ to be the
set of grammatical genders for that language (e.g.,
$\mathcal{G}^{\text{es}} = \{\text{MSC}, \text{FEM}\}$), $\mathcal{C}^{\ell}$ to be the set of cases (e.g.,
$\mathcal{C}^{\text{de}} = \{\text{NOM}, \text{ACC}, \text{GEN}, \text{DAT}\}$), and $\mathcal{N}^{\ell}$ to be the set
of numbers (e.g., $\mathcal{N}^{\text{pt}} = \{\text{PL}, \text{SG}\}$). Finally, we
define fourteen random variables: $A^{\ell}_i$ and $A^{\ell}_a$ are
$\mathcal{V}^{\ell}_{\text{ADJ}}$-valued random variables, $D^{\ell}_i$ and $D^{\ell}_a$ are
$\mathcal{V}^{\ell}_{\text{VERB-DOBJ}}$-valued random variables, $I^{\ell}_i$ and $I^{\ell}_a$
are $\mathcal{V}^{\ell}_{\text{VERB-IOBJ}}$-valued random variables, $S^{\ell}_i$ and
$S^{\ell}_a$ are $\mathcal{V}^{\ell}_{\text{VERB-SUBJ}}$-valued random variables, $G^{\ell}_i$ and
$G^{\ell}_a$ are $\mathcal{G}^{\ell}$-valued random variables, $C^{\ell}_i$ and $C^{\ell}_a$ are
$\mathcal{C}^{\ell}$-valued random variables, and $N^{\ell}_i$ and $N^{\ell}_a$ are
$\mathcal{N}^{\ell}$-valued random variables. The subscripts ‘‘i’’
and ‘‘a’’ denote inanimate and animate nouns,
respectively.
To test whether there is a relationship between
the grammatical genders of inanimate nouns and
the adjectives used to describe those nouns for
language $\ell$, we calculate the MI (mutual
information), a measure of the mutual statistical
dependence between two random variables,
between $G^{\ell}_i$ and $A^{\ell}_i$:
$$\mathrm{MI}(G^{\ell}_i; A^{\ell}_i) = \sum_{g \in \mathcal{G}^{\ell}} \sum_{a \in \mathcal{V}^{\ell}_{\text{ADJ}}} P_i(g, a) \log_2 \frac{P_i(g, a)}{P_i(g)\, P_i(a)}, \tag{1}$$
where all probabilities are calculated with respect
to inanimate nouns only. If $G^{\ell}_i$ and $A^{\ell}_i$ are
independent (i.e., there is no relationship between
them) then $\mathrm{MI}(G^{\ell}_i; A^{\ell}_i) = 0$; if $G^{\ell}_i$ and $A^{\ell}_i$
are maximally dependent then
$\mathrm{MI}(G^{\ell}_i; A^{\ell}_i) = \min\{H(G^{\ell}_i), H(A^{\ell}_i)\}$, where $H(G^{\ell}_i)$ is the entropy
of $G^{\ell}_i$ and $H(A^{\ell}_i)$ is the entropy of $A^{\ell}_i$. For
simplicity, we use plug-in estimates for all
probabilities (i.e., empirical probabilities), deferring
the use of more sophisticated estimators for
future work. We note that $\mathrm{MI}(G^{\ell}_i; A^{\ell}_i)$ can be
calculated in $O(|\mathcal{G}^{\ell}| \cdot |\mathcal{V}^{\ell}_{\text{ADJ}}|)$ time; however, $|\mathcal{G}^{\ell}|$
is negligible (i.e., two or three) so the main cost
is $|\mathcal{V}^{\ell}_{\text{ADJ}}|$.
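As a minimal sketch of the plug-in estimate in Eq. (1), the function below computes MI in bits from a list of (gender, adjective lemma) tokens. The function name and the input format are assumptions made for illustration; only the formula itself comes from the text.

```python
import math
from collections import Counter

def mutual_information(pairs):
    """Plug-in MI (in bits) between the two coordinates of `pairs`."""
    n = len(pairs)
    joint = Counter(pairs)                  # counts of (g, a)
    left = Counter(g for g, _ in pairs)     # counts of g
    right = Counter(a for _, a in pairs)    # counts of a
    mi = 0.0
    for (g, a), c in joint.items():
        # P(g, a) / (P(g) P(a)) simplifies to c * n / (count(g) * count(a)).
        mi += (c / n) * math.log2(c * n / (left[g] * right[a]))
    return mi

# Independent toy data: every gender co-occurs equally with every adjective.
INDEPENDENT = [("F", "x"), ("F", "y"), ("M", "x"), ("M", "y")]
# Maximally dependent toy data: each adjective occurs with one gender only.
DEPENDENT = [("F", "x"), ("M", "y")] * 5
```

The two toy inputs correspond to the limiting cases discussed above: the first yields an MI of zero, and the second yields the entropy of a balanced two-gender variable, one bit. Only observed pairs are visited, so the loop never exceeds the $|\mathcal{G}^{\ell}| \cdot |\mathcal{V}^{\ell}_{\text{ADJ}}|$ bound.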
To test for statistical significance, we perform a
permutation test. Specifically, we permute the
grammatical genders of the inanimate nouns
10,000 times and, for each permutation, recalculate
the MI between $G^{\ell}_i$ and $A^{\ell}_i$ using the
permuted genders. We obtain a p-value by
calculating the percentage of permutations that
have a higher MI than the MI obtained using the
non-permuted genders; if the p-value is less than
0.05, then we treat the relationship between $G^{\ell}_i$
and $A^{\ell}_i$ as statistically significant.
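The permutation test can be sketched as below. This is an illustrative reading rather than the authors' implementation: it assumes that genders are shuffled across noun types (so every token of a noun receives the same permuted gender), and the function names, input format, and seeding are hypothetical.

```python
import math
import random
from collections import Counter

def mutual_information(pairs):
    """Plug-in MI (in bits) between the two coordinates of `pairs`."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2(c * n / (left[x] * right[y]))
               for (x, y), c in joint.items())

def permutation_test(noun_adj_pairs, gender_of, n_perm=10000, seed=0):
    """Return (observed MI, p-value) for gender versus adjective."""
    rng = random.Random(seed)
    nouns = sorted(gender_of)
    genders = [gender_of[noun] for noun in nouns]
    observed = mutual_information(
        [(gender_of[noun], adj) for noun, adj in noun_adj_pairs])
    higher = 0
    for _ in range(n_perm):
        shuffled = genders[:]
        rng.shuffle(shuffled)  # reassign genders across noun types
        permuted = dict(zip(nouns, shuffled))
        mi = mutual_information(
            [(permuted[noun], adj) for noun, adj in noun_adj_pairs])
        if mi > observed:  # count permutations with strictly higher MI
            higher += 1
    return observed, higher / n_perm
```

Shuffling the list of genders keeps the marginal distribution over genders fixed for noun types, so only the pairing between genders and nouns changes from one permutation to the next.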
Because the maximum possible MI between
any pair of random variables depends on the
entropies of those variables, MI values are not
comparable across pairs of random variables. We
therefore also calculate the normalized MI (NMI)
                                       de        it        pl        pt        ru        es
$\mathrm{MI}(G^{\ell}_i; A^{\ell}_i)$  0.0310    0.0520    0.0225    0.0400    0.0664    0.0500
$\mathrm{MI}(G^{\ell}_i; D^{\ell}_i)$  0.0290    0.0440    0.0109    0.0129    0.0090    0.0232
$\mathrm{MI}(G^{\ell}_i; I^{\ell}_i)$  0.0743    0.0640    0.0514    0.0230    0.0184    0.6973
$\mathrm{MI}(G^{\ell}_i; S^{\ell}_i)$  0.0276    0.0270    0.0226    0.0090    0.0090    0.0274
$\mathrm{MI}(G^{\ell}_i; C^{\ell}_i)$  < 0.001   < 0.001   < 0.001   N/A       N/A       N/A
$\mathrm{MI}(G^{\ell}_i; N^{\ell}_i)$  < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001
Table 1: The mutual information (MI) between the grammatical genders of
inanimate nouns and a) the adjectives used to describe those nouns (top row),
b) the verbs that take those nouns as direct objects, as indirect objects, and as
subjects (rows 2–4, respectively), and c) the cases and numbers of those nouns
(rows 5 and 6, respectively) for six different gendered languages. Statistical
significance (i.e., a p-value less than 0.05) is indicated using bold. MI values
are not comparable across pairs of random variables.
between $G^{\ell}_i$ and $A^{\ell}_i$ by normalizing $\mathrm{MI}(G^{\ell}_i; A^{\ell}_i)$
to lie between zero and one. The most obvious
choice of normalizer is the maximum possible
MI, that is, $\min\{H(G^{\ell}_i), H(A^{\ell}_i)\}$; however, various
other normalizers have been proposed, each
of which has different advantages and disadvantages
(Gates et al., 2019). We therefore calculate
six different variants of $\mathrm{NMI}(G^{\ell}_i; A^{\ell}_i)$ using the
following normalizers:
$$\min\{H(G^{\ell}_i), H(A^{\ell}_i)\} \tag{2}$$
$$\sqrt{H(G^{\ell}_i)\, H(A^{\ell}_i)} \tag{3}$$
$$\frac{H(G^{\ell}_i) + H(A^{\ell}_i)}{2} \tag{4}$$
$$\max\{H(G^{\ell}_i), H(A^{\ell}_i)\} \tag{5}$$
$$\max\{\log |\mathcal{G}^{\ell}|, \log |\mathcal{V}^{\ell}_{\text{ADJ}}|\} \tag{6}$$
$$\log M^{\ell}_i, \tag{7}$$
where $M^{\ell}_i$ is the number of non-unique (inanimate)
noun–adjective pairs retained for that
language.
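The six normalizers in Eq. (2) through Eq. (7) can be sketched as follows. This is an illustration under stated assumptions: logarithms are taken base 2 to match Eq. (1) (the text does not give the base for Eq. (6) and Eq. (7)), set sizes are taken from the observed data, both variables are assumed to take at least two values, and all function and key names are hypothetical.

```python
import math
from collections import Counter

def entropy(values):
    """Plug-in entropy (in bits) of a list of observations."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def mutual_information(pairs):
    """Plug-in MI (in bits) between the two coordinates of `pairs`."""
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(x for x, _ in pairs)
    right = Counter(y for _, y in pairs)
    return sum((c / n) * math.log2(c * n / (left[x] * right[y]))
               for (x, y), c in joint.items())

def normalized_mi(pairs):
    """Return the six NMI variants, keyed by the equation of their normalizer."""
    gs = [g for g, _ in pairs]
    xs = [x for _, x in pairs]
    h_g, h_x = entropy(gs), entropy(xs)
    normalizers = {
        "eq2": min(h_g, h_x),
        "eq3": math.sqrt(h_g * h_x),
        "eq4": (h_g + h_x) / 2,
        "eq5": max(h_g, h_x),
        # Assumption: set sizes are the numbers of distinct observed values.
        "eq6": max(math.log2(len(set(gs))), math.log2(len(set(xs)))),
        # M is the number of non-unique pairs, i.e., the token count.
        "eq7": math.log2(len(pairs)),
    }
    mi = mutual_information(pairs)
    return {name: mi / z for name, z in normalizers.items()}
```

On perfectly dependent data with two balanced genders and two adjectives, the first five variants all equal one, while the Eq. (7) variant is smaller because its normalizer grows with the number of pairs.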
To test whether there are relationships between
the grammatical genders of inanimate nouns and
the verbs that take those nouns as direct objects,
as indirect objects, and as subjects, we calculate
$\mathrm{MI}(G^{\ell}_i; D^{\ell}_i)$, $\mathrm{MI}(G^{\ell}_i; I^{\ell}_i)$, and $\mathrm{MI}(G^{\ell}_i; S^{\ell}_i)$. Again,
all probabilities are calculated with respect to
inanimate nouns only, and we perform
permutation tests to test for statistical significance.
We also calculate six NMI variants for each of the
three pairs of random variables, using normalizers
that are analogous to those in Eq. (2) through
Eq. (7).
As a baseline, we test whether there are relationships
between the grammatical genders of
inanimate nouns and the cases and numbers of
those nouns; that is, we calculate $\mathrm{MI}(G^{\ell}_i; C^{\ell}_i)$
and $\mathrm{MI}(G^{\ell}_i; N^{\ell}_i)$ using probabilities that are
calculated with respect to inanimate nouns only.
Again, we perform permutation tests (but we
do not expect that there will be statistically
significant relationships), and we calculate six
NMI variants for each pair of random variables
using normalizers that are analogous to those in
Eq. (2) through Eq. (7).
Finally, we calculate $\mathrm{MI}(G^{\ell}_a; A^{\ell}_a)$,
$\mathrm{MI}(G^{\ell}_a; D^{\ell}_a)$, $\mathrm{MI}(G^{\ell}_a; I^{\ell}_a)$, $\mathrm{MI}(G^{\ell}_a; S^{\ell}_a)$,
$\mathrm{MI}(G^{\ell}_a; C^{\ell}_a)$, and
$\mathrm{MI}(G^{\ell}_a; N^{\ell}_a)$ using probabilities calculated with
respect to animate nouns only. The first four of
these are intended to serve as a ‘‘skyline,’’ while
the last two are intended to serve as a sanity check
(i.e., we expect them to be close to zero, as with
inanimate nouns). Again, we perform permutation
tests to test for statistical significance, and we
calculate six NMI variants for each pair of random
variables.
5 Results
In the first row of Table 1, we provide the
MI between $G^{\ell}_i$ and $A^{\ell}_i$ for each language
$\ell \in \{\text{de}, \text{it}, \text{pl}, \text{pt}, \text{ru}, \text{es}\}$. For all
six languages,
$\mathrm{MI}(G^{\ell}_i; A^{\ell}_i)$ is statistically significant (i.e., $p < 0.05$),
meaning that there is a relationship between
the grammatical genders of inanimate nouns and
Figure 2: The normalized mutual information (NMI) between the grammatical genders of inanimate nouns and
a) the adjectives used to describe those nouns and b) the verbs that take those nouns as direct objects and as
subjects for six different gendered languages. Each subplot contains six variants of $\mathrm{NMI}(G^{\ell}_i; A^{\ell}_i)$,
$\mathrm{NMI}(G^{\ell}_i; D^{\ell}_i)$, and $\mathrm{NMI}(G^{\ell}_i; S^{\ell}_i)$ (one per normalizer) for a single language
$\ell \in \{\text{de}, \text{it}, \text{pl}, \text{pt}, \text{ru}, \text{es}\}$.
the adjectives used to describe those nouns. Rows
2–4 of Table 1 contain $\mathrm{MI}(G^{\ell}_i; D^{\ell}_i)$, $\mathrm{MI}(G^{\ell}_i; I^{\ell}_i)$,
and $\mathrm{MI}(G^{\ell}_i; S^{\ell}_i)$ for each language. For all
six languages, $\mathrm{MI}(G^{\ell}_i; D^{\ell}_i)$ and $\mathrm{MI}(G^{\ell}_i; S^{\ell}_i)$ are
statistically significant (i.e., $p < 0.05$). For five
of the six languages, $\mathrm{MI}(G^{\ell}_i; I^{\ell}_i)$ is statistically
significant, but because of the small number of
noun–verb pairs involved, we caution against
reading too much into this finding. We note that
direct objects are closest to verbs in analyses
of constituent structures, followed by subjects
and then indirect objects (Chomsky, 1957; Adger,
2003). Finally, the last two rows of Table 1 contain
$\mathrm{MI}(G^{\ell}_i; C^{\ell}_i)$ and $\mathrm{MI}(G^{\ell}_i; N^{\ell}_i)$, respectively, for
each language. We do not find any statistically significant
relationships for either case or number.
To facilitate comparisons, each subplot in
Figure 2 contains six variants of $\mathrm{NMI}(G^{\ell}_i; A^{\ell}_i)$,
$\mathrm{NMI}(G^{\ell}_i; D^{\ell}_i)$, and $\mathrm{NMI}(G^{\ell}_i; S^{\ell}_i)$, calculated
using normalizers that are analogous to those
in Eq. (2) through Eq. (7), for a single
language $\ell \in \{\text{de}, \text{it}, \text{pl}, \text{pt}, \text{ru}, \text{es}\}$.
(We omit $\mathrm{NMI}(G^{\ell}_i; I^{\ell}_i)$ from each plot because of the
small number of noun–verb pairs involved.) For
$\ell \in \{\text{it}, \text{pl}, \text{pt}, \text{es}\}$, $\mathrm{NMI}(G^{\ell}_i; A^{\ell}_i)$ is larger than
$\mathrm{NMI}(G^{\ell}_i; D^{\ell}_i)$ and $\mathrm{NMI}(G^{\ell}_i; S^{\ell}_i)$, regardless of
the normalizer. For $\ell \in \{\text{it}, \text{pl}\}$, $\mathrm{NMI}(G^{\ell}_i; S^{\ell}_i)$
is larger than $\mathrm{NMI}(G^{\ell}_i; D^{\ell}_i)$;
$\mathrm{NMI}(G^{\text{pt}}_i; D^{\text{pt}}_i)$ is
larger than $\mathrm{NMI}(G^{\text{pt}}_i; S^{\text{pt}}_i)$; and
$\mathrm{NMI}(G^{\text{es}}_i; D^{\text{es}}_i)$ and
$\mathrm{NMI}(G^{\text{es}}_i; S^{\text{es}}_i)$ are roughly comparable, again
regardless of the normalizer. Meanwhile,
$\mathrm{NMI}(G^{\text{de}}_i; A^{\text{de}}_i)$ is larger
than $\mathrm{NMI}(G^{\text{de}}_i; D^{\text{de}}_i)$
                                       de        it        pl        pt        ru        es
$\mathrm{MI}(G^{\ell}_a; A^{\ell}_a)$  0.0928    0.0845    0.0621    0.1111    0.0933    0.1316
$\mathrm{MI}(G^{\ell}_a; D^{\ell}_a)$  0.0410    0.0664    0.0273    0.0091    0.0320    0.0543
$\mathrm{MI}(G^{\ell}_a; I^{\ell}_a)$  0.0737    0.0600    0.0439    0.0358    0.0687    0.0543
$\mathrm{MI}(G^{\ell}_a; S^{\ell}_a)$  0.0343    0.0303    0.0258    0.0192    0.0252    0.0543
$\mathrm{MI}(G^{\ell}_a; C^{\ell}_a)$  < 0.001   < 0.001   < 0.001   N/A       N/A       N/A
$\mathrm{MI}(G^{\ell}_a; N^{\ell}_a)$  < 0.001   < 0.001   < 0.001   < 0.001   < 0.001   < 0.001
Table 2: The mutual information (MI) between the grammatical genders of
animate nouns and a) the adjectives used to describe those nouns (top row),
b) the verbs that take those nouns as direct objects, as indirect objects, and as
subjects (rows 2–4, respectively), and c) the cases and numbers of those nouns
(rows 5 and 6, respectively) for six different gendered languages. Statistical
significance (i.e., a p-value less than 0.05) is indicated using bold. MI values are
not comparable across pairs of random variables.
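and $\mathrm{NMI}(G^{\text{de}}_i; S^{\text{de}}_i)$ for the normalizer in
Eq. (2), while $\mathrm{NMI}(G^{\text{de}}_i; A^{\text{de}}_i)$,
$\mathrm{NMI}(G^{\text{de}}_i; D^{\text{de}}_i)$,
and $\mathrm{NMI}(G^{\text{de}}_i; S^{\text{de}}_i)$ are all roughly comparable for
the other five normalizers. Finally, $\mathrm{NMI}(G^{\text{ru}}_i; A^{\text{ru}}_i)$
and $\mathrm{NMI}(G^{\text{ru}}_i; D^{\text{ru}}_i)$ are roughly comparable and
larger than $\mathrm{NMI}(G^{\text{ru}}_i; S^{\text{ru}}_i)$,
regardless of the normalizer.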
In other words, the relationship between the
grammatical genders of inanimate nouns and
the adjectives used to describe those nouns is
generally stronger than, but sometimes roughly
comparable to,
the relationships between the
grammatical genders of inanimate nouns and the
verbs that take those nouns as direct objects and
as subjects. However, the relative strengths of the
relationships between the grammatical genders of
inanimate nouns and the verbs that take those
nouns as direct objects and as subjects vary
depending on the language.
In Table 2, we provide $\mathrm{MI}(G^{\ell}_a; A^{\ell}_a)$,
$\mathrm{MI}(G^{\ell}_a; D^{\ell}_a)$, $\mathrm{MI}(G^{\ell}_a; I^{\ell}_a)$,
$\mathrm{MI}(G^{\ell}_a; S^{\ell}_a)$, $\mathrm{MI}(G^{\ell}_a; C^{\ell}_a)$, and
$\mathrm{MI}(G^{\ell}_a; N^{\ell}_a)$ for each language $\ell \in \{\text{de}, \text{it}, \text{pl},
\text{pt}, \text{ru}, \text{es}\}$. As with inanimate nouns, we find
that there is a statistically significant relationship
between the grammatical genders of animate
nouns and the adjectives used to describe those
nouns. We also find that there are statistically
significant relationships between the grammatical
genders of animate nouns and the verbs that
take those nouns as direct objects, as indirect
objects, and as subjects. Again, the relationship
for the verbs that take those nouns as indirect
objects involves a small number of noun–verb
146
pairs. As expected, we do not find any statisti-
cally significant relationships for either case or
number.
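To make the notion of a statistically significant MI value concrete, the following is a minimal sketch of one way such a p-value could be estimated: a plug-in MI estimate combined with a permutation test that reassigns gender labels across noun types. This is an illustration under stated assumptions, not necessarily the procedure used for Table 2. The function names `mutual_information` and `permutation_test`, the input layout (a list of noun–word pairs plus a noun-to-gender mapping), and the add-one correction are all hypothetical choices.

```python
import math
import random
from collections import Counter


def mutual_information(gender_word_pairs):
    """Plug-in MI estimate (in nats) from a list of (gender, word) pairs."""
    n = len(gender_word_pairs)
    joint = Counter(gender_word_pairs)
    gender_counts = Counter(g for g, _ in gender_word_pairs)
    word_counts = Counter(w for _, w in gender_word_pairs)
    # Sum p(g, w) * log(p(g, w) / (p(g) * p(w))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (gender_counts[g] * word_counts[w]))
        for (g, w), c in joint.items()
    )


def permutation_test(noun_word_pairs, noun_gender, num_permutations=1000, seed=0):
    """Return (observed MI, p-value) under random reassignment of genders to nouns.

    Assumption: gender is a property of the noun type, so labels are
    permuted across noun types rather than across individual tokens.
    """
    rng = random.Random(seed)
    nouns = sorted(noun_gender)
    labels = [noun_gender[noun] for noun in nouns]

    def mi_under(assignment):
        return mutual_information([(assignment[n], w) for n, w in noun_word_pairs])

    observed = mi_under(noun_gender)
    hits = 0
    for _ in range(num_permutations):
        rng.shuffle(labels)  # permute gender labels across noun types
        if mi_under(dict(zip(nouns, labels))) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (num_permutations + 1)
```

Under this sketch, a relationship would be flagged as significant when the returned p-value falls below 0.05, which matches the threshold stated in the caption of Table 2.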
Figure 3 is analogous to Figure 2, in that each subplot contains six variants of $\mathrm{NMI}(G_a^{\ell}, A_a^{\ell})$, $\mathrm{NMI}(G_a^{\ell}, D_a^{\ell})$, and $\mathrm{NMI}(G_a^{\ell}, S_a^{\ell})$, calculated using normalizers that are analogous to those in Eq. (2) through Eq. (7), for a single language $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{ru}, \mathrm{es}\}$. (As with inanimate nouns, we omit $\mathrm{NMI}(G_a^{\ell}, I_a^{\ell})$ from each plot because of the small number of noun–verb pairs involved.) For $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{es}\}$, $\mathrm{NMI}(G_i^{\ell}, A_i^{\ell})$ is larger than $\mathrm{NMI}(G_i^{\ell}, D_i^{\ell})$ and $\mathrm{NMI}(G_i^{\ell}, S_i^{\ell})$, regardless of the normalizer. For $\ell \in \{\mathrm{it}, \mathrm{pl}\}$, $\mathrm{NMI}(G_i^{\ell}, D_i^{\ell})$ is larger than $\mathrm{NMI}(G_i^{\ell}, S_i^{\ell})$; for $\ell \in \{\mathrm{de}, \mathrm{pt}\}$, $\mathrm{NMI}(G_i^{\ell}, S_i^{\ell})$ is larger than $\mathrm{NMI}(G_i^{\ell}, D_i^{\ell})$; and $\mathrm{NMI}(G_i^{\mathrm{es}}, D_i^{\mathrm{es}})$ and $\mathrm{NMI}(G_i^{\mathrm{es}}, S_i^{\mathrm{es}})$ are roughly comparable, again regardless of the normalizer. Meanwhile, $\mathrm{NMI}(G_i^{\mathrm{ru}}, A_i^{\mathrm{ru}})$ is larger than $\mathrm{NMI}(G_i^{\mathrm{ru}}, D_i^{\mathrm{ru}})$, which is larger than $\mathrm{NMI}(G_i^{\mathrm{ru}}, S_i^{\mathrm{ru}})$ for the normalizers in Eq. (2) and Eq. (3), while $\mathrm{NMI}(G_i^{\mathrm{ru}}, A_i^{\mathrm{ru}})$ and $\mathrm{NMI}(G_i^{\mathrm{ru}}, D_i^{\mathrm{ru}})$ are roughly comparable and larger than $\mathrm{NMI}(G_i^{\mathrm{ru}}, S_i^{\mathrm{ru}})$ for the other five normalizers.

Finally, each subplot in Figure 4 contains $\mathrm{NMI}(G_i^{\ell}, A_i^{\ell})$ and $\mathrm{NMI}(G_a^{\ell}, A_a^{\ell})$, calculated using a single normalizer, for each language $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{ru}, \mathrm{es}\}$. Each subplot in Figure 5 analogously contains $\mathrm{NMI}(G_i^{\ell}, D_i^{\ell})$ and $\mathrm{NMI}(G_a^{\ell}, D_a^{\ell})$, while each subplot in Figure 6 contains $\mathrm{NMI}(G_i^{\ell}, S_i^{\ell})$ and $\mathrm{NMI}(G_a^{\ell}, S_a^{\ell})$. The NMI values for animate nouns are generally larger than the NMI values for inanimate nouns. The only exception is Polish, where $\mathrm{NMI}(G_i^{\mathrm{pl}}, A_i^{\mathrm{pl}})$ is larger than $\mathrm{NMI}(G_a^{\mathrm{pl}}, A_a^{\mathrm{pl}})$, regardless of the normalizer.

Figure 3: The normalized mutual information (NMI) between the grammatical genders of animate nouns and a) the adjectives used to describe those nouns and b) the verbs that take those nouns as direct objects and as subjects for six different gendered languages. Each subplot contains six variants of $\mathrm{NMI}(G_a^{\ell}, A_a^{\ell})$, $\mathrm{NMI}(G_a^{\ell}, D_a^{\ell})$, and $\mathrm{NMI}(G_a^{\ell}, S_a^{\ell})$ (one per normalizer) for a single language $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{ru}, \mathrm{es}\}$.

Figure 4: The normalized mutual information (NMI) between the grammatical genders of a) inanimate and b) animate nouns and the adjectives used to describe those nouns. Each subplot contains $\mathrm{NMI}(G_i^{\ell}, A_i^{\ell})$ and $\mathrm{NMI}(G_a^{\ell}, A_a^{\ell})$, calculated using a single normalizer, for each language $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{ru}, \mathrm{es}\}$.

Figure 5: The normalized mutual information (NMI) between the grammatical genders of a) inanimate and b) animate nouns and the verbs that take those nouns as direct objects. Each subplot contains $\mathrm{NMI}(G_i^{\ell}, D_i^{\ell})$ and $\mathrm{NMI}(G_a^{\ell}, D_a^{\ell})$, calculated using a single normalizer, for each language $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{ru}, \mathrm{es}\}$.

Figure 6: The normalized mutual information (NMI) between the grammatical genders of a) inanimate and b) animate nouns and the verbs that take those nouns as subjects. Each subplot contains $\mathrm{NMI}(G_i^{\ell}, S_i^{\ell})$ and $\mathrm{NMI}(G_a^{\ell}, S_a^{\ell})$, calculated using a single normalizer, for each language $\ell \in \{\mathrm{de}, \mathrm{it}, \mathrm{pl}, \mathrm{pt}, \mathrm{ru}, \mathrm{es}\}$.
6 Discussion
We find evidence for the presence of a statistically
significant relationship between the grammatical
genders of inanimate nouns and the adjectives
used to describe those nouns for six different
gendered languages (specifically, German, Italian,
Polish, Portuguese, Russian, and Spanish). We
also find evidence for the presence of statistically
significant relationships between the grammatical
genders of inanimate nouns and the verbs that take
those nouns as direct objects, as indirect objects,
and as subjects. However, we caution against
reading too much into the relationship for the verbs
that take those nouns as indirect objects because
of the small number of noun–verb pairs involved.
The effect sizes (operationalized as NMI values)
for all of these relationships are smaller than the
effect sizes for animate nouns. As expected, we do
not find any statistically significant relationships
for either case or number.
We emphasize that our findings complement, rather than supersede, laboratory experiments, such as that of Boroditsky et al. (2003). We use large-scale corpora and tools from NLP and information theory to test for the presence of even relatively weak relationships across multiple gendered languages, and, indeed, the relationships that we find have effect sizes (operationalized as NMI values) that are small. In contrast, laboratory experiments typically focus on much stronger relationships by tightly controlling experimental conditions and measuring speakers’ immediate, real-time speech production. Moreover, although we find statistically significant relationships, we do not investigate the characteristics of these relationships. This means that we do not know whether they are characterized by gender stereotypes, as argued by some cognitive scientists, including Boroditsky et al. (2003). We also do not know whether the relationships that we find are causal in nature. Because MI is symmetric, our findings say nothing about whether the grammatical genders of inanimate nouns cause writers to choose particular adjectives or verbs. We defer deeper investigation of this to future work.
We note that each of our tests can be viewed as a comparison of the similarity of two clusterings of a set of items: specifically, a “clustering” of nouns into grammatical genders and a “clustering” of the same nouns into, for example, adjective lemmata. Although (normalized) MI is a standard measure for comparing clusterings, it is not without limitations (see, e.g., Newman et al. [2020] for an overview). For future work, we therefore recommend replicating our tests using other information-theoretic measures for comparing clusterings.
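The clustering view can be illustrated with a short sketch that treats two label sequences over the same items as two clusterings and computes their MI from entropies, together with several normalized variants. The normalizers shown (minimum, maximum, arithmetic mean, and geometric mean of the marginal entropies, and the joint entropy) are common choices in the clustering-comparison literature and are assumptions here; they are not claimed to coincide with Eq. (2) through Eq. (7). The names `entropy` and `normalized_mi` are likewise hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Plug-in entropy (in nats) of a sequence of hashable labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mi(xs, ys):
    """Return (MI, dict of NMI variants) for two clusterings of the same items."""
    h_x, h_y = entropy(xs), entropy(ys)
    h_xy = entropy(list(zip(xs, ys)))
    mi = h_x + h_y - h_xy  # MI(X; Y) = H(X) + H(Y) - H(X, Y)
    normalizers = {
        "min": min(h_x, h_y),
        "max": max(h_x, h_y),
        "arithmetic_mean": (h_x + h_y) / 2,
        "geometric_mean": math.sqrt(h_x * h_y),
        "joint": h_xy,
    }
    # A zero normalizer means one clustering is constant; NMI is undefined there.
    nmi = {
        name: (mi / z if z > 0 else float("nan"))
        for name, z in normalizers.items()
    }
    return mi, nmi
```

For example, `normalized_mi(["f", "f", "m", "m"], ["old", "old", "big", "big"])` compares a gender labeling against an adjective labeling of four items; because the two labelings induce the same partition, every NMI variant equals 1 up to floating-point error.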
Acknowledgments
We
thank Lera Boroditsky, Hagen Blix,
Eleanor Chodroff, Andrei Cimpian, Zach Davis,
Jason Eisner, Richard Futrell, Todd Gureckis,
Katharina Kann, Peter Klecha, Zhiwei Li, Ethan
Ludwin-Peery, Alec Marantz, Arya McCarthy,
John McWhorter, Sabrina J. Mielke, Elizabeth
Salesky, Arturs Semenuks, and Colin Wilson
for discussions at various points related to the
ideas in this paper. Katharina Kann approves this
acknowledgment.
A Appendix A: Counts
Counts of the noun–adjective and noun–verb pairs
for all six gendered languages are in Table 3
(for inanimate nouns) and Table 4 (for animate
nouns).
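The distinction between token counts and type counts in these tables can be sketched as follows. The helper `pair_counts` and the toy Spanish lemmata are hypothetical and serve only to illustrate the terminology: tokens count every occurrence of a pair, while types count distinct pairs, distinct nouns, or distinct co-occurring words.

```python
from collections import Counter


def pair_counts(pairs):
    """Summarize a list of (noun, word) pair occurrences as token and type counts."""
    pair_freq = Counter(pairs)
    return {
        "pair_tokens": sum(pair_freq.values()),  # every occurrence counts
        "pair_types": len(pair_freq),  # distinct (noun, word) pairs
        "noun_types": len({noun for noun, _ in pair_freq}),
        "word_types": len({word for _, word in pair_freq}),
    }


# Toy example (hypothetical data): four pair tokens, three pair types.
toy_pairs = [
    ("puente", "viejo"),
    ("puente", "viejo"),
    ("puente", "largo"),
    ("llave", "viejo"),
]
toy_summary = pair_counts(toy_pairs)
```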
| Count | de | it | pl | pt | ru | es |
|---|---|---|---|---|---|---|
| # noun–adj. tokens | 6443907 | 6246856 | 11631913 | 640558 | 32900200 | 3605439 |
| # noun–adj. types | 770952 | 666656 | 640107 | 638774 | 1633963 | 368795 |
| # noun types | 10712 | 6410 | 5533 | 5672 | 9327 | 6157 |
| # adj. types | 4129 | 3607 | 4080 | 3431 | 11028 | 1907 |
| # noun–verb (subj.) tokens | 3191030 | 1432354 | 2179396 | 1871941 | 6007063 | 1534211 |
| # noun–verb (subj.) types | 445536 | 292949 | 297996 | 337262 | 864480 | 376888 |
| # noun (subj.) types | 10741 | 6318 | 5522 | 5780 | 9129 | 7470 |
| # verb types | 707 | 702 | 874 | 758 | 1803 | 875 |
| # noun–verb (dobj.) tokens | 3440922 | 2855037 | 3964828 | 4850012 | 6738606 | 2859135 |
| # noun–verb (dobj.) types | 427441 | 393246 | 236849 | 541347 | 713703 | 576835 |
| # noun (dobj.) types | 10504 | 6407 | 4359 | 5896 | 8998 | 11567 |
| # verb types | 805 | 806 | 708 | 738 | 1539 | 9746 |
| # noun–verb (iobj.) tokens | 163935 | 71 | 54138 | 95009 | 1570273 | 56038 |
| # noun–verb (iobj.) types | 50133 | 53 | 18214 | 39738 | 300703 | 24830 |
| # noun (iobj.) types | 5520 | 59 | 2258 | 3757 | 8150 | 3574 |
| # verb types | 386 | 68 | 417 | 357 | 1816 | 464 |
| # noun–case tokens | 14681293 | N/A | 15300621 | N/A | 51641929 | N/A |
| # noun–case types | 2252632 | N/A | 1465314 | N/A | 5028075 | N/A |
| # noun types | 11989 | N/A | 5839 | N/A | 9692 | N/A |
| # case types | 4 | 0 | 7 | 0 | 6 | 0 |
| # noun–number tokens | 14681293 | 11588448 | 15300621 | 14631732 | 51641929 | 5672790 |
| # noun–number types | 2252632 | 1748927 | 1465314 | 2042626 | 5028075 | 1034307 |
| # noun types | 11989 | 7014 | 5839 | 6256 | 9692 | 1593 |
| # number types | 2 | 2 | 2 | 2 | 2 | 2 |

Table 3: Counts of the inanimate noun–adjective and noun–verb pairs for all six gendered languages.
| Count | de | it | pl | pt | ru | es |
|---|---|---|---|---|---|---|
| # noun–adj. tokens | 662760 | 818300 | 1137209 | 712101 | 3225932 | 387025 |
| # noun–adj. types | 99332 | 92424 | 97847 | 90865 | 264117 | 50173 |
| # noun types | 1998 | 1078 | 954 | 1006 | 2098 | 1320 |
| # adj. types | 3587 | 3507 | 3836 | 3176 | 9833 | 1828 |
| # noun–verb (subj.) tokens | 637801 | 399747 | 526894 | 456349 | 1516740 | 310569 |
| # noun–verb (subj.) types | 113308 | 77551 | 89819 | 89959 | 253150 | 93586 |
| # noun (subj.) types | 2056 | 1066 | 969 | 1013 | 2020 | 1477 |
| # verb types | 707 | 702 | 874 | 758 | 1799 | 874 |
| # noun–verb (dobj.) tokens | 321400 | 388187 | 456824 | 527259 | 494534 | 850234 |
| # noun–verb (dobj.) types | 60760 | 55574 | 76348 | 92220 | 118818 | 85235 |
| # noun (dobj.) types | 1901 | 1025 | 867 | 1028 | 1912 | 1023 |
| # verb types | 804 | 805 | 724 | 737 | 1535 | 745 |
| # noun–verb (iobj.) tokens | 51359 | 7 | 43187 | 23139 | 518540 | 23955 |
| # noun–verb (iobj.) types | 17804 | 6 | 8440 | 110185 | 11353 | 9586 |
| # noun (iobj.) types | 1149 | 6 | 628 | 773 | 1858 | 947 |
| # verb types | 378 | 6 | 411 | 340 | 1769 | 456 |
| # noun–case tokens | 1926614 | N/A | 1907688 | N/A | 6357089 | N/A |
| # noun–case types | 390672 | N/A | 299511 | N/A | 987420 | N/A |
| # noun types | 2292 | N/A | 1024 | N/A | 2194 | N/A |
| # case types | 4 | 0 | 7 | 0 | 6 | 0 |
| # noun–number tokens | 1926614 | 1801285 | 1907688 | 1931315 | 6357089 | 786177 |
| # noun–number types | 390672 | 306968 | 299511 | 356352 | 987420 | 200785 |
| # noun types | 2292 | 1135 | 1024 | 1072 | 2194 | 1593 |
| # number types | 2 | 2 | 2 | 2 | 2 | 2 |

Table 4: Counts of the animate noun–adjective and noun–verb pairs for all six gendered languages.
References
David Adger. 2003. Core Syntax: A Minimalist
Approach, 33. Oxford University Press Oxford.
Alexandra Y. Aikhenvald. 2000. Classifiers: A Typology of Noun Categorization Devices. Oxford University Press.
Tatiana Akhutina, Andrei Kurgansky, Maria
Polinsky, and Elizabeth Bates. 1999. Process-
ing of grammatical gender in a three-gender
system: Experimental evidence from Russian.
Journal of Psycholinguistic Research, 28(6):
695–713. DOI: https://doi.org/10
.1023/A:1023225129058, PMID:
10510865
Chris Alberti, Daniel Andor,
Ivan Bogatyy,
Michael Collins, Dan Gillick, Lingpeng Kong,
Terry Koo, Ji Ma, Mark Omernick, Slav Petrov,
Chayut Thanapirom, Zora Tung, and David
Weiss. 2017. SyntaxNet models for the CoNLL
2017 shared task. CoRR abs/1703.04929 arXiv
preprint arXiv:1703.04929.
Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, and Michael Collins. 2016. Globally normalized transition-based neural networks. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2442–2452. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/P16-1231
José Alemán Bañón, Robert Fiorentino, and Alison Gabriele. 2012. The processing of number and gender agreement in Spanish: An event-related potential investigation of the effects of structural distance. Brain Research, 1456:49–63. DOI: https://doi.org/10.1016/j.brainres.2012.03.057, PMID: 22520436
Horacio Barber and Manuel Carreiras. 2005.
Grammatical gender and number agreement
in Spanish: An ERP comparison. Journal
of Cognitive Neuroscience, 17(1):137–153.
DOI: https://doi.org/10.1162
/0898929052880101, PMID: 15701245
Horacio Barber, Elena Salillas, and Manuel
Carreiras. 2004. Gender or genders agreement.
On-line Study of Sentence Comprehension,
pages 309–328.
Benedetta Bassetti. 2007. Bilingualism and
thought: Grammatical gender and concepts
of objects in Italian–German bilingual children.
International Journal of Bilingualism, 11(3):
251–273. DOI: https://doi.org/10
.1177/13670069070110030101
Elizabeth Bates, Antonella Devescovi, Arturo
Hernandez, and Luigi Pizzamiglio. 1996.
Gender priming in Italian. Perception & Psy-
chophysics, 58(7):992–1004. DOI: https://
doi.org/10.3758/BF03206827,
PMID: 8920836
Andrea Bender, Sieghard Beller, and Karl
Christoph Klauer. 2011. Grammatical gender
in german: A case for linguistic relativity?
The Quarterly Journal of Experimental Psy-
chology, 64(9):1821–1835. DOI: https://
doi.org/10.1080/17470218.2011
.582128, PMID: 21740112
Damian Blasi, Ryan Cotterell, Lawrence Wolf-Sonkin, Sabine Stoll, Balthasar Bickel, and Marco Baroni. 2019. On the distribution of deep clausal embeddings: A large cross-linguistic study. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3938–3943. DOI: https://doi.org/10.18653/v1/P19-1384
Leonard Bloomfield. 1933. Language, London:
Allen & Unwin.
Tolga Bolukbasi, Kai-Wei Chang, James Y. Zou,
Venkatesh Saligrama, and Adam T. Kalai. 2016.
Man is to computer programmer as woman is
to homemaker? Debiasing word embeddings.
In Advances in Neural Information Processing
Systems, pages 4349–4357.
Lera Boroditsky. 2003. Linguistic Relativity.
Encyclopedia of Cognitive Science.
Lera Boroditsky and Lauren A. Schmidt. 2000.
Sex, Syntax, and Semantics. In Proceedings of
the Annual Meeting of the Cognitive Science
Society.
Lera Boroditsky, Lauren A. Schmidt, and Webb
Phillips. 2002. Can quirks of grammar affect
the way you think? Spanish and German
speakers’ ideas about the genders of objects.
https://escholarship.org/uc/item
/31t455gf.
Lera Boroditsky, Lauren A. Schmidt, and Webb
Phillips. 2003. Sex, Syntax, and Semantics.
Language in Mind: Advances in the Study of
Language and Thought, pages 61–79.
Greville G. Corbett. 2006. Agreement, Cambridge
University Press.
Roberto Cubelli, Lorella Lotto, Daniela Paolieri, Massimo Girelli, and Remo Job. 2005. Grammatical gender is selected in bare noun production: Evidence from the picture–word interference paradigm. Journal of Memory and Language, 53(1):42–59. DOI: https://doi.org/10.1016/j.jml.2005.02.007
Bastien Boutonnet,
Panos Athanasopoulos,
and Guillaume Thierry. 2012. Unconscious
effects of grammatical gender during object
categorisation. Brain Research, 1479:72–79.
DOI: https://doi.org/10.1016/j
.brainres.2012.08.044, PMID:
22960201
Roberto Cubelli, Daniela Paolieri, Lorella
Lotto, and Remo Job. 2011. The effect of
grammatical gender on object categorization.
Journal of Experimental Psychology: Learning,
Memory, and Cognition, 37(2):449. DOI:
https://doi.org/10.1037/a0021965,
PMID: 21261427
Karl Brugmann. 1889. Das nominalgeschlecht in den indogermanischen sprachen. Internationale Zeitschrift für allgemeine Sprachwissenschaft, 4.
Robert M. DeKeyser. 2005. What makes learning second-language grammar difficult? A review of issues. Language Learning, 55(S1):1–25. DOI: https://doi.org/10.1111/j.0023-8333.2005.00294.x
Sendy Caffarra, Anna Siyanova-Chanturia, Francesca Pesciarelli, Francesco Vespignani, and Cristina Cacciari. 2015. Is the noun ending a cue to grammatical gender processing? An ERP study on sentences in Italian. Psychophysiology, 52(8):1019–1030. DOI: https://doi.org/10.1111/psyp.12429, PMID: 25817315
Aylin Caliskan, Joanna J. Bryson, and Arvind Narayanan. 2017. Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334):183–186. DOI: https://doi.org/10.1126/science.aal4230, PMID: 28408601
Noam Chomsky. 1957. Syntactic Structures, The Hague/Paris: Mouton.
Mark A. Clarke, Ann Losoff, Margaret
Dickenson McCracken, and JoAnn Still. 1981.
Gender perception in Arabic and English.
Language Learning, 31(1):159–169. DOI:
https://doi.org/10.1111/j.1467
-1770.1981.tb01377.x
Greville G. Corbett. 1991. Gender, Cambridge
University Press. DOI: https://doi.org
/10.1017/CBO9781139166119
Sunipa Dev and Jeff Phillips. 2019. Attenuating bias in word vectors. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 879–887.
Emily Dinan, Angela Fan, Adina Williams,
Jack Urbanek, Douwe Kiela, and Jason
Weston. 2019. Queens are powerful
too:
Mitigating gender bias in dialogue genera-
tion. arXiv preprint arXiv:1911.03842. DOI:
https://doi.org/10.18653/v1/2020
.emnlp-main.656
Melody Dye, Petar Milin, Richard Futrell, and Michael Ramscar. 2017. A functional theory of gender paradigms. In Perspectives on Morphological Organization, pages 212–239. Brill. DOI: https://doi.org/10.1163/9789004342934_011
Susan Ervin-Tripp. 1962. The connotations of gender. Word, 18:249–261.
Kawin Ethayarajh, David Duvenaud, and Graeme Hirst. 2019. Understanding undesirable word embedding associations. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1696–1705, Florence, Italy. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/P19-1166
Sam Featherston and Wolfgang Sternefeld.
2007. Roots: Linguistics in Search of
its
Evidential Base, volume 96. Walter de Gruyter.
DOI: https://doi.org/10.1515
/9783110198621
Istvan Fodor. 1959. The origin of grammatical
gender. Lingua, 8:186–214. DOI: https://
doi.org/10.1016/0024-3841(59)90020-8
Anthony Fox. 1990. The structure of German,
Oxford University Press.
Florencia Franceschina. 2005. Fossilized Second Language Grammars: The Acquisition of Grammatical Gender, volume 38. John Benjamins Publishing. DOI: https://doi.org/10.1075/lald.38
Angela D. Friederici and Thomas Jacobsen. 1999. Processing grammatical gender during language comprehension. Journal of Psycholinguistic Research, 28(5):467–484. DOI: https://doi.org/10.1023/A:1023243708702, https://doi.org/10.1023/A:1023264209610
Nikhil Garg, Londa Schiebinger, Dan Jurafsky,
and James Zou. 2018. Word embeddings
quantify 100 years of gender and ethnic stereo-
types. Proceedings of the National Academy
of Sciences, 115(16):E3635–E3644. DOI:
https://doi.org/10.1073/pnas
.1720347115, PMID: 29615513, PMCID:
PMC5910851
Alexander J. Gates, Ian B. Wood, William P. Hetrick, and Yong-Yeol Ahn. 2019. Element-centric clustering comparison unifies overlaps and hierarchy. Scientific Reports, 9(8574). DOI: https://doi.org/10.1038/s41598-019-44892-y, PMID: 31189888, PMCID: PMC6561975
Jacob Grimm. 1890. Deutsche Grammatik, C.
Bertelsmann.
Peter Hagoort and Colin M. Brown. 1999.
Gender electrified: ERP evidence on the
syntactic nature of gender processing. Journal
of Psycholinguistic Research, 28(6):715–728.
DOI: https://doi.org/10.1023/A
:1023277213129, PMID: 10510866
Rowan Hall Maudslay, Hila Gonen, Ryan
Cotterell, and Simone Teufel. 2019. It’s all in
the name: Mitigating gender bias with name-
based counterfactual data substitution. In Pro-
ceedings of the 2019 Conference on Empirical
Methods in Natural Language Processing and
the 9th International Joint Conference on Nat-
ural Language Processing (EMNLP-IJCNLP),
pages 5267–5275, Hong Kong, China. Asso-
ciation for Computational Linguistics. DOI:
https://doi.org/10.18653/v1/D19
-1530
Daniel Harbour. 2011. Valence and atomic number. Linguistic Inquiry, 42(4):561–594. DOI: https://doi.org/10.1162/LING_a_00061
Daniel Harbour, David Adger, and Susana Béjar. 2008. Phi theory: Phi-features Across Modules and Interfaces, 16, Oxford University Press.
Peter R. Hofstätter. 1963. Über sprachliche bestimmungsleistungen: Das problem des grammatikalischen geschlechts von sonne und mond. Zeitschrift für experimentelle und angewandte Psychologie.
Muhammad Hasan Ibrahim. 2014. Grammatical
Gender: Its Origin and Development, 166,
Walter de Gruyter.
Roman Jakobson. 1959. On linguistic aspects of
translation. On Translation, 3:30–39. DOI:
https://doi.org/10.4159/harvard
.9780674731615.c18
Katharina Kann. 2019. Grammatical gender,
neo-Whorfianism, and word embeddings: A
Data-Driven Approach to Linguistic Relativity.
arXiv preprint arXiv:1910.09729.
Graeme Kennedy. 2014. An Introduction to Corpus Linguistics, Routledge. DOI: https://doi.org/10.4324/9781315843674
Toshi Konishi. 1993. The semantics of grammatical gender: A cross-cultural study. Journal of Psycholinguistic Research, 22(5):519–534. DOI: https://doi.org/10.1007/BF01068252, PMID: 8246207
Ruth Kramer. 2014. Gender
in Amharic: a
morphosyntactic approach to natural and
grammatical gender. Language Sciences, 43:
102–115. DOI: https://doi.org/10
.1016/j.langsci.2013.10.004
Ruth Kramer. 2015. The Morphosyntax of
Gender, 58, Oxford University Press. DOI:
https://doi.org/10.1093/acprof
:oso/9780199679935.001.0001
Elena Kurinski and Maria D. Sera. 2011. Does
learning Spanish grammatical gender change
English-speaking adults’
categorization of
inanimate objects? Bilingualism: Language and
Cognition, 14(2):203–220. DOI: https://
doi.org/10.1017/S1366728910000179
Michael Maratsos. 1979. How to get from words
to sentences, Doris Aaronson and Rober W.
Reiber, editors, Psycholinguistic Research:
Implications and Applications, Psychology
Press, Taylor & Francis Group, London and
New York.
John H. McWhorter. 2014. The Language Hoax:
Why the world looks the same in any language,
Oxford University Press.
Anne Mickan, Maren Schiefke, and Anatol Stefanowitsch. 2014. Key is a llave is a Schlüssel: A failure to replicate an experiment from Boroditsky et al. 2003. Yearbook of the German Cognitive Linguistics Association, 2(1):39. DOI: https://doi.org/10.1515/gcla-2014-0004
Silvina Montrul, Rebecca Foote, and Silvia Perpiñán. 2008. Gender agreement in adult second language learners and Spanish heritage speakers: The effects of age and context of acquisition. Language Learning, 58(3):503–553. DOI: https://doi.org/10.1111/j.1467-9922.2008.00449.x
Thomas Müller, Ryan Cotterell, Alexander Fraser, and Hinrich Schütze. 2015. Joint lemmatization and morphological tagging with lemming. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 2268–2274. DOI: https://doi.org/10.18653/v1/D15-1272, PMID: 25768671
Mark E. J. Newman, George T. Cantwell, and Jean-Gabriel Young. 2020. Improved mutual information measure for classification and community detection. Physical Review E. DOI: https://doi.org/10.1103/PhysRevE.101.042304, PMID: 32422767
Joakim Nivre, Željko Agić, Lars Ahrenberg, Maria Jesus Aranzabe, Masayuki Asahara, Aitziber Atutxa, Miguel Ballesteros, John Bauer, Kepa Bengoetxea, Riyaz Ahmad Bhat, Eckhard Bick, Cristina Bosco, Gosse Bouma, Sam Bowman, Marie Candito, Gülşen Cebiroğlu Eryiğit, Giuseppe G. A. Celano, Fabricio Chalub, Jinho Choi, Çağrı Çöltekin, Miriam Connor, Elizabeth Davidson, Marie-Catherine de Marneffe, Valeria de Paiva, Arantza Diaz de Ilarraza, and Kaja Dobrovoljc. 2017. Universal dependencies 2.0.
Lee Osterhout and Linda A. Mobley. 1995.
Event-related brain potentials elicited by failure
to agree. Journal of Memory and Language,
34(6):739–773. DOI: https://doi.org
/10.1006/jmla.1995.1033
Webb Phillips and Lera Boroditsky. 2003. Can quirks of grammar affect the way you think? Grammatical gender and object concepts. In Proceedings of the Annual Meeting of the Cognitive Science Society, volume 25.
Princeton University. 2010. About WordNet.
https://wordnet.princeton.edu/
Henrik Saalbach, Mutsumi Imai, and Lennart Schalk. 2012. Grammatical gender and inferences about biological properties in German-speaking children. Cognitive Science, 36(7):1251–1267. DOI: https://doi.org/10.1111/j.1551-6709.2012.01251.x, PMID: 22578067
Arturs Semenuks, Webb Phillips, Ioana Dalca,
Cora Kim, and Lera Boroditsky. 2017. Effects
of grammatical gender on object description. In
Proceedings of the 39th Annual Meeting of the
Cognitive Science Society (CogSci 2017).
Maria D. Sera, Christian A. H. Berge, and Javier del Castillo Pintado. 1994. Grammatical and conceptual forces in the attribution of gender by English and Spanish speakers. Cognitive Development, 9(3):261–292. DOI: https://doi.org/10.1016/0885-2014(94)90007-8
Maria D. Sera, Chryle Elieff, James Forbes, Melissa Clark Burch, Wanda Rodríguez, and Diane Poulin Dubois. 2002. When language affects cognition and when it does not: An analysis of grammatical gender and classification. Journal of Experimental Psychology: General, 131(3):377. DOI: https://doi.org/10.1037/0096-3445.131.3.377
Gabriel Stanovsky, Noah A. Smith, and Luke Zettlemoyer. 2019. Evaluating gender bias in machine translation. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1679–1684, Florence, Italy. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/P19-1164
Milan Straka, Jan Hajič, and Jana Straková. 2016. UDPipe: Trainable pipeline for processing CoNLL-u files performing tokenization, morphological analysis, POS tagging and parsing. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), pages 4290–4297, Portorož, Slovenia. European Language Resources Association (ELRA). http://ufal.mff.cuni.cz/udpipe
Yi Chern Tan and L. Elisa Celis. 2019. Assessing social and intersectional biases in contextualized word representations. In Advances in Neural Information Processing Systems, pages 13209–13220.
Gabriella Vigliocco, Marcus Lauer, Markus F. Damian, and Willem J. M. Levelt. 2002. Semantic and syntactic forces in noun phrase production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 28(1):46. DOI: https://doi.org/10.1037/0278-7393.28.1.46
Gabriella Vigliocco, David P. Vinson, Federica Paganelli, and Katharina Dworzynski. 2005. Grammatical gender effects on cognition: Implications for language learning and language use. Journal of Experimental Psychology: General, 134(4):501. DOI: https://doi.org/10.1037/0096-3445.134.4.501, PMID: 16316288
Benj Ide Wheeler. 1899. The origin of grammatical gender. The Journal of Germanic Philology, 2(4):528–545.
Nicole Y. Y. Wicha, Elizabeth A. Bates, Eva M. Moreno, and Marta Kutas. 2003. Potato not Pope: Human brain potentials to gender expectation and agreement in Spanish spoken sentences. Neuroscience Letters, 346(3):165–168. DOI: https://doi.org/10.1016/S0304-3940(03)00599-8
Nicole Y. Y. Wicha, Eva M. Moreno, and Marta Kutas. 2004. Anticipating words and their gender: An event-related brain potential study of semantic integration, gender expectancy, and gender agreement in Spanish sentence reading. Journal of Cognitive Neuroscience, 16(7):1272–1288. DOI: https://doi.org/10.1162/0898929041920487, PMID: 15453979, PMCID: PMC3380438
Adina Williams, Damian Blasi, Lawrence Wolf-Sonkin, Hanna Wallach, and Ryan Cotterell. 2019. Quantifying the semantic core of gender systems. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5734–5739, Hong Kong, China. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/D19-1577
Jieyu Zhao, Yichao Zhou, Zeyu Li, Wei Wang, and Kai-Wei Chang. 2018. Learning gender-neutral word embeddings. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4847–4853, Brussels, Belgium. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/D18-1521
Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, and Kai-Wei Chang. 2019. Examining gender bias in languages with grammatical gender. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5276–5284, Hong Kong, China. Association for Computational Linguistics. DOI: https://doi.org/10.18653/v1/D19-1531, PMID: 31191883, PMCID: PMC6540912
Ran Zmigrod, Sebastian J. Mielke, Hanna
Wallach, and Ryan Cotterell. 2019. Counter-
factual data augmentation for mitigating gender
stereotypes in languages with rich morphology.
In Proceedings of the 57th Annual Meeting of
the Association for Computational Linguistics,
pages 1651–1661. Florence,
Italy. Associ-
ation for Computational Linguistics. DOI:
https://doi.org/10.18653/v1/P19
-1161
David Zubin and Klaus-Michael Köpcke. 1986. Gender and folk taxonomy: The indexical relation between grammatical and lexical categorization. Noun Classes and Categorization, pages 139–180. DOI: https://doi.org/10.1075/tsl.7.12zub