TWIN BIRTH AND MATERNAL CONDITION

Sonia Bhalotra and Damian Clarke*

Abstract: Twin births are often construed as a natural experiment in the social and natural sciences on the premise that the occurrence of twins is quasi-random. We present population-level evidence that challenges this premise. Using individual data for 17 million births in 72 countries, we demonstrate that indicators of mother’s health, health-related behaviors, and the prenatal environment are systematically positively associated with twin birth. The associations are sizable, evident in richer and poorer countries, evident even among women who do not use in vitro fertilization, and hold for numerous different measures of health. We discuss potential mechanisms, showing evidence that favors selective miscarriage.

I. Introduction

TWINS have intrigued humankind for more than a century (Thorndike, 1905). In behavioral genetics, demography, and psychology, monozygotic twins are studied to assess the importance of nurture relative to nature (Polderman et al., 2015). In the social sciences, twin births are also used to denote an unexpected increase in family size, which assists causal identification of the impact of fertility on investments in children and on women’s labor supply (Rosenzweig & Wolpin, 2000, 1980a; Bronars & Grogger, 1994; Black, Devereux, & Salvanes, 2005). A premise of studies that use twin differences or the twin instrument is that twin births are quasi-random and have no direct impact (except through fertility) on the outcome under study. We present new population-level evidence that challenges this premise. Using 16,962,165 births in 72 countries, of which 462,246 (2.73%) are twins, we show that the likelihood of a twin birth varies systematically with maternal condition. In particular, our estimates establish that mothers of twins are selectively healthy.1

We document that the association of twin births and maternal condition is meaningfully large and widespread. We show that it is evident in richer and poorer countries and that it holds for sixteen different markers of maternal condition, including health stocks and health conditions prior to pregnancy (height, obesity, diabetes, hypertension, asthma, kidney disease, smoking), exposure to unexpected stress in pregnancy,

Received for publication June 1, 2017. Revision accepted for publication July 30, 2018. Editor: Rohini Pande.

∗Bhalotra: University of Essex; Clarke: Universidad de Santiago de Chile.
We are grateful to Paul Devereux, James Fenske, Judith Hall, Christian
Hansen, Martin Karlsson, Toru Kitagawa, Magne Mogstad, Cheti Nico-
letti, Carol Propper, Adam Rosen, Paul Schulz, Margaret Stevens, Atheen
Venkataramani, Marcos Vera-Hernandez, Frank Windmeijer, Emilia Del
Bono, Climent Quintana-Domeque, Pedro Ródenas, Libertad González,
Hanna Mühlrad, Anna Aevarsdottir, Martin Foureaux Koppensteiner, Ryan
Brown, Pietro Biroli, Rohini Pande, and three anonymous referees, along
with various seminar audiences and discussants for helpful comments
and/or sharing data. Any remaining errors are our own. An earlier ver-
sion of this paper was circulated as Part 1 of “The Twin Instrument,” IZA
discussion paper 10405.

A supplemental appendix is available online at http://www.mitpressjournals.org/doi/suppl/10.1162/rest_a_00789.

1Twins are not as rare as we may think: one in eighty live births and hence one in forty newborns is a twin. In general and, for example, in the United States, there is a positive trend in twin births.

and measures of the availability of medical professionals and
prenatal care.2 The effects are sizable, with a 1 standard de-
viation improvement in the indicator tending to increase the
likelihood of twinning by 6% to 12%.

Previous research has documented that twins have different endowments from singletons; for example, twins are more likely to have low birthweight and congenital anomalies (Hall, 2003; Rosenzweig & Zhang, 2009). We focus not on differences between twins and singletons but rather on differences between mothers of twins and singletons, which indicate whether occurrence of twin births is quasi-random. It is known that twin births are not strictly random, occurring more frequently among older mothers, at higher parity, and in certain races and ethnicities (Hall, 2003; Bulmer, 1970), but as these variables are typically observable, they can be adjusted for (as in Rosenzweig & Wolpin, 1980a).3 Similarly, it is well documented that women using artificial reproductive technologies (ART) are more likely to give birth to twins (Vitthala et al., 2009), but ART use is recorded in many birth registries, so it can be controlled for and a conditional randomness assumption upheld (Cáceres-Delpiano, 2006; Angrist, Lavy, & Schlosser, 2010). The reason that our finding is potentially a major challenge is that maternal condition is multidimensional and almost impossible to fully measure and adjust for. To take a few examples, fetal health is potentially a function of whether pregnant women skip breakfast (Mazumder & Seeskin, 2015), suffer bereavement in pregnancy (Black et al., 2016), or are exposed to air pollution (Chay & Greenstone, 2003).

Our underlying hypothesis is that twins are more demand-
ing of maternal resources than singletons, and as a result,
conditions that challenge maternal health are more likely to
result in miscarriage of twins than of singletons. We dis-
cuss the role of alternative mechanisms including nonran-
dom conception and maternal survival selection. We provide
evidence in favor of the selective miscarriage mechanism using U.S. Vital Statistics data for 14 to 16 million births. Selective miscarriage is similarly the mechanism behind the
stylized fact that weaker maternal condition is associated
with a lower probability of male birth (Trivers & Willard,
1973; Almond & Edlund, 2007). We confirm this in our data,

2We also show a positive association of the chances of having twins
with health-related behaviors in pregnancy (healthy diet, smoking, alcohol,
drug consumption), although we do not rely on this because behaviors in
pregnancy may reflect a response to the mother’s knowledge that she is
carrying twins.

3Other correlates identified in the medical literature but not reflected in social science research include high concentrations of follicle-stimulating hormone in women, season and seasonal light, height, urbanization, and starvation (Hall, 2003), with mixed results (based on small samples) when considering social class (Campbell, Campbell, & MacGillivary, 1974; Campbell, 1998). These results have not been documented in the economics or social science literature. In our discussion of mechanisms, we discuss the difference between monozygotic and dizygotic twins.

The Review of Economics and Statistics, December 2019, 101(5): 853–864
© 2018 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology.
Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
https://doi.org/10.1162/rest_a_00789

TABLE 1. THE QUANTITY–QUALITY AND FERTILITY–FLFP TRADE-OFFS: ESTIMATES USING THE TWIN INSTRUMENT

| Authors | Data/Context | Twin Use | Maternal Controls |
|---|---|---|---|
| A. Quantity–Quality | | | |
| Rosenzweig and Wolpin (1980a) | India, rural survey; outcome is standardized schooling | Twin ratio in OLS^a | None |
| Black et al. (2005) | Norway, administrative data | IV | Age and education |
| Cáceres-Delpiano (2006) | United States, Census 5% file; outcome is behind educational cohort | IV | Age, education, and race |
| Li et al. (2008) | China, census 1% file; outcome is educational enrollment | IV | Age and education |
| Rosenzweig and Zhang (2009) | China, twin survey | RF^b,f | Age |
| Angrist et al. (2010) | Israel, Census 20% file | IV | Age, place of birth, race |
| Black et al. (2010) | Norway, administrative data; outcome is IQ | IV | Age and education |
| Åslund and Grönqvist (2010) | Sweden, administrative data | IV | Age and education |
| Ponczek and Souza (2012) | Brazil, Census 10% file | IV (girl); IV (boy) | Age and education |
| Marteleto and de Souza (2012) | Brazil, household survey | IV | Age, education, and family income |
| Mogstad and Wiswall (2016) | Norway, administrative data | IV^c | Age and education |
| B. Fertility and female labor force participation | | | |
| Rosenzweig and Wolpin (1980b) | United States, pooled demographic surveys | RF^d | None |
| Bronars and Grogger (1994) | United States 1970 and 1980 5% Census | RF^d | Age at first birth |
| Angrist and Evans (1998) | United States 1980 5% Census | IV | Age, age at first birth |
| Jacobsen et al. (1999) | United States 1970 and 1980 5% Census | IV^e | Age at first birth cubic |
| Cáceres-Delpiano (2012) | Pooled demographic surveys, developing countries | IV | Age, education, literacy status, country dummies |

Estimates (OLS and IV columns):

−2.483 (0.740)

−0.060 (0.003)
−0.076 (0.004)
−0.059 (0.006)
0.011 (0.000)
0.017 (0.001)

−0.031 (0.001)
−0.038 (0.002)
−0.307 (0.160)
−0.225 (0.172)
−0.145 (0.005)
−0.143 (0.005)

−0.113 (0.004)
−0.132 (0.006)
−0.100 (0.009)
−0.277 (0.015)
−0.283 (0.015)
−0.233 (0.010)
−0.230 (0.010)
−0.248 (0.003)
−0.240 (0.003)

−0.038 (0.047)
−0.016 (0.044)
−0.024 (0.059)
0.002 (0.003)
0.010 (0.006)

0.002 (0.009)
−0.024 (0.014)
No birthweight control
Birthweight control
0.174 (0.166)
0.167 (0.117)
−0.149 (0.052)
−0.170 (0.052)
−0.115 (0.080)
0.022 (0.048)
−0.043 (0.048)
−0.042 (0.083)
−0.372 (0.198)
−0.634 (0.194)
−0.137 (0.146)
−0.060 (0.164)
0.064 (0.076)
0.131 (0.055)
0.053 (0.050)
−0.051 (0.053)
−0.107 (0.059)

−0.371 (0.212)
0.142 (0.102)
−0.036 (0.036)
−0.035 (0.017)
−0.176 (0.002)

−0.014 (0.001)
−0.010 (0.001)
−0.009 (0.001)

Short-term estimate
Long-term estimate
1970 Census
1980 Census
−0.057 (0.011)
−0.021 (0.014)
−0.025 (0.008)
−0.029 (0.012)
−0.016 (0.012)
−0.022 (0.012)

Estimates and standard errors reported in columns OLS and IV refer to main estimates from each paper. Estimates are included from published articles using large samples of microdata. A comprehensive review is
provided in Clarke (2018). Where multiple estimates are reported, unless otherwise indicated, the first line refers to the impact of twins at birth 2, the second line the impact of twins at birth 3, and the third line the
impact of twins at birth 4 (if available). In panel A, the estimates refer to the outcome variable “years of education” unless specified in column 2. In panel B, all outcomes are the mother’s labor market participation.

a Twin ratio is the number of twin births divided by the number of pregnancies.
b Coefficients reported are the impact of second-birth twins on nontwin first births.
c Nonlinear estimates are reported in the paper. Here linear estimates are presented for comparison with other results.
d Reduced form (RF) uses twins at first birth as the independent variable.
e First line reports estimates from the 1970 census; second line reports the 1980 census.
f Standard errors are calculated from reported t-statistics.

showing that twin births are more likely to be girls. Our find-
ings add a novel twist to recent literature documenting that a
mother’s health and her environmental exposure to nutritional
or other stresses during pregnancy influence birth outcomes,
with many studies documenting lower birthweight (Currie
& Moretti, 2007; Bernstein et al., 2005; Quintana-Domeque
& Ródenas-Serrano, 2017). If birthweight is the intensive
margin, we may think of miscarriage as an extensive margin
response or the limiting case of low birthweight.

Our findings have implications for research that has exploited the assumed randomness of twin births. Studies using twins to isolate exogenous variation in fertility will tend to underestimate the impact of fertility on parental investments in children and on women’s labor supply if selectively healthy mothers invest more in children after birth and are more likely to participate in the labor market (as discussed in Bloom, Kuhn, & Prettner, 2015). In table 1 we summarize studies using twin births to instrument fertility, documenting the mother-level controls in each study. In some cases, the validity of the conditional randomness assumption is directly probed, for instance, with respect to mother’s education (Black et al., 2005; Li, Zhang, & Zhu, 2008; Rosenzweig & Zhang, 2009). However, as is acknowledged in each case, any such tests are at best partial evidence in support of instrumental validity. Importantly, no previous study has attempted to control for maternal health conditions or behaviors. This is pertinent, as it could resolve the ambiguity of the available evidence on the impacts of fertility. In particular, recent studies using the twin instrument challenge a long-standing theoretical prior of Becker and Lewis (1973) in rejecting the presence of a quantity–quality (QQ) fertility trade-off in developed countries (Black et al., 2005; Angrist et al., 2010), but our estimates suggest that this rejection could in principle arise from ignoring the positive selection of women into twin birth. Similarly, research using the twin instrument tends to find that additional children have relatively little influence on female labor force participation (FLFP; see Lundborg, Plug, & Rasmussen, 2017). But, again, these estimates are likely to be downward biased. The results of studies in economics, psychology, education, and biology that instead exploit the genetic similarity of twins will not be biased but will tend to have more restricted external validity than previously assumed.4
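The direction of the bias discussed in this paragraph can be illustrated with a small simulation. This is a minimal sketch and not the authors' analysis: the data-generating process, the parameter values (`beta`, `delta`, `selection`), and the function names are all hypothetical, chosen only so that healthier mothers are both more likely to have twins and have better child outcomes. It computes the Wald/IV ratio cov(z, y)/cov(z, x), with twin birth z as the instrument for fertility x.

```python
import random


def cov(a, b):
    """Population covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def simulate_iv(beta=-0.1, delta=0.5, selection=0.01, n=100_000, seed=0):
    """Return the IV (Wald) estimate of beta using twin birth as instrument.

    All parameters are illustrative assumptions:
      beta      - true effect of fertility on the child outcome
      delta     - effect of unobserved maternal health on the outcome
      selection - how strongly maternal health raises twin probability
    """
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        h = rng.gauss(0.0, 1.0)  # unobserved maternal health
        twin = 1.0 if rng.random() < 0.03 + selection * h else 0.0
        kids = 2.0 + twin + rng.gauss(0.0, 0.5)  # fertility rises with a twin birth
        outcome = beta * kids + delta * h + rng.gauss(0.0, 1.0)
        z.append(twin)
        x.append(kids)
        y.append(outcome)
    return cov(z, y) / cov(z, x)


iv_random_twins = simulate_iv(selection=0.0, seed=1)
iv_selected_twins = simulate_iv(selection=0.01, seed=0)
print(f"IV estimate, twins random:           {iv_random_twins:.3f}")
print(f"IV estimate, twins to healthy mothers: {iv_selected_twins:.3f}")
```

With `selection = 0` the ratio should recover `beta` up to sampling noise. With positive selection it converges to `beta + delta * cov(z, h) / cov(z, x)`, which lies above a negative `beta`, so a true trade-off would be understated or could even appear with the opposite sign.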

II. Methodology

In this section, we discuss two distinct approaches to testing our hypothesis that twins are selectively born to healthier
mothers. We identify variation in the mother’s health before
she gives birth to twins and before she knows she will give
birth to twins. In the first approach, we use information on her
health condition (morbidities, height, weight), health-related
behaviors, access to health care, and environmental health
stressors. In our second approach, we use as a marker of
maternal health the fetal or infant survival rate of her births
prior to the birth at which she has twins (with parity-matched
counterfactuals). We discuss below the methods used to in-
vestigate potential mechanisms.

We conduct three robustness checks. First, we restrict the sample to non-ART births. It is important to demonstrate that our hypothesis holds independent of ART use because there is a positive association of ART with the likelihood of twin births (Vitthala et al., 2009), and ART users are typically more educated and wealthy (Lundborg et al., 2017). Another potential concern is that we are capturing genetic traits that, for instance, are associated with the woman’s height or weight

4The twin instrument has been criticized for other reasons. A recent
critique of the use of twins to identify the QQ trade-off has argued that
parental behaviors may respond to the endowment of twins and not only to
the fact that twin births represent a fertility shock. Rosenzweig & Zhang
(2009) highlight that twins have lower birth endowments. They argue that
if parents reinforce endowments, then they may reallocate resources to-
ward the better-endowed children born before the twins, obscuring any
underlying QQ trade-off; this is examined in Angrist et al. (2010) and
Fitzsimons & Malde (2014). We remain agnostic on this. Our critique
is in principle orthogonal to this critique, providing a different reason
that an underlying QQ trade-off may be obscured, relating to endow-
ments and behaviors of mothers. This critique has not been previously
considered.

and also correlated with her predisposition toward twin birth. This would appear to be a second-order concern since we do not rely only on woman-specific measures of health but also show a positive association of twinning with environmental stressors, health facilities, and health-related behaviors. We nevertheless investigate this concern in two different ways. First, we test whether we can identify a positive association of the probability that a birth is a twin with woman-specific time-varying health indicators conditional on woman fixed effects that sweep out genetic influences. Second, we leverage biomedical research showing that monozygotic (MZ) twins are randomly allocated across mothers, although genetic predispositions may influence the chances of having dizygotic (DZ) twins (Meulemans et al., 1996). Ideally, we would restrict the sample to MZ twins, but MZ versus DZ are not identified in the data. Instead, on the premise that MZ twins are necessarily same sex and about half of all DZ twins are same sex, we investigate our hypothesis restricting the sample to include only same-sex twins. If our results were driven by genetic predispositions, then we should find weaker associations in the same-sex sample. The methods and data used to conduct the robustness checks are discussed with the results. The rest of this section elaborates the specification used in the two main approaches to testing for twin randomness.
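The logic of the same-sex restriction can be made explicit with a short calculation. The following is an illustrative sketch rather than a derivation from the paper: it assumes a hypothetical MZ share m among all twin pairs and takes the premise stated above that all MZ pairs and about half of DZ pairs are same sex.

```latex
\Pr(\text{MZ} \mid \text{same sex})
  = \frac{m}{m + \tfrac{1}{2}(1 - m)}
  = \frac{2m}{1 + m}
  > m \qquad \text{for } 0 < m < 1 .
```

For an illustrative value of m = 1/3, the MZ share rises from one-third among all twins to one-half among same-sex twins. Under these assumptions, restricting to same-sex twins raises the weight on MZ twins, so an association driven by a genetic predisposition toward DZ twinning should be diluted in that sample.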

A. Across Mothers

To test the null that twin births are “as good as random,” we estimate conditional regressions of the form

$$\text{twin}_{bjy} = \gamma_0 + \gamma_1 \text{Health}_{bjy} + \mu_b + \lambda_y + \varepsilon_{bjy}. \tag{1}$$

Here, twin is an indicator of whether a birth of order b born to woman j at age y is a twin. We control for fixed effects for mother’s age (λ_y) and parity (μ_b), as these are known to influence the probability of twin birth. Where births are observed over multiple years, races, or geographic areas, we include the relevant fixed effects. Under the null, the coefficients on maternal health variables Health_{bjy} should not be statistically distinguishable from 0. This is equivalent to a test of (conditional) balance of characteristics of treated (with twins) and control (without twins) mothers. Standard errors are clustered at the level of the mother.
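The test in equation (1) can be sketched on simulated data. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the simulated sample, the parameter values, and the helper names (`demean`, `balance_test`, `simulate`) are hypothetical. It sweeps out the parity and age fixed effects by repeated group demeaning, estimates γ1 from the demeaned variables (which by the Frisch–Waugh–Lovell theorem matches the dummy-variable regression), and computes a mother-clustered standard error without small-sample corrections.

```python
import random
from collections import defaultdict


def demean(values, groupings, tol=1e-10, max_iter=500):
    """Sweep out additive fixed effects by alternating group-mean removal."""
    v = list(values)
    for _ in range(max_iter):
        largest = 0.0
        for groups in groupings:
            total, count = defaultdict(float), defaultdict(int)
            for x, g in zip(v, groups):
                total[g] += x
                count[g] += 1
            mean = {g: total[g] / count[g] for g in total}
            v = [x - mean[g] for x, g in zip(v, groups)]
            largest = max(largest, max(abs(m) for m in mean.values()))
        if largest < tol:  # all group means are numerically zero
            break
    return v


def balance_test(twin, health, parity, age, mother):
    """Return (gamma1, clustered SE) for twin ~ health + parity FE + age FE."""
    y = demean(twin, [parity, age])
    x = demean(health, [parity, age])
    sxx = sum(xi * xi for xi in x)
    gamma1 = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # Cluster-robust sandwich: sum the scores x * residual within each mother.
    score = defaultdict(float)
    for xi, yi, m in zip(x, y, mother):
        score[m] += xi * (yi - gamma1 * xi)
    se = sum(s * s for s in score.values()) ** 0.5 / sxx
    return gamma1, se


def simulate(n_mothers=10_000, effect=0.01, seed=0):
    """Hypothetical births: twin probability rises by `effect` per SD of health."""
    rng = random.Random(seed)
    twin, health, parity, age, mother = [], [], [], [], []
    for j in range(n_mothers):
        h = rng.gauss(0.0, 1.0)            # prepregnancy health index (z-score)
        first_age = rng.randint(18, 35)
        for b in range(1, rng.randint(1, 3) + 1):
            twin.append(1.0 if rng.random() < 0.03 + effect * h else 0.0)
            health.append(h)
            parity.append(b)
            age.append(first_age + 2 * (b - 1))
            mother.append(j)
    return twin, health, parity, age, mother


gamma1, se = balance_test(*simulate())
print(f"gamma1 = {gamma1:.4f}, clustered SE = {se:.4f}")
```

Under the null of random twinning (`effect=0.0` in this sketch), γ1 should be statistically indistinguishable from 0; a positive and significant γ1 indicates that twin mothers are selectively healthy.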

For ease of exposition, we maintain the subscript y for the woman’s age at birth, but most of the health indicators are measured before pregnancy to avoid the potential concern of reverse causality: that twin births cause greater depletion of the mother’s health than singleton births or encourage women to adopt different behaviors. These include prepregnancy measures of smoking, diabetes, hypertension, obesity, height, kidney disease, and asthma. Measures of prenatal or medical care are constructed as community-level measures of availability. In a specific case we discuss below, we use an exogenous measure of environmental stress in pregnancy. We also show results for some variables measured in pregnancy (smoking, alcohol, drugs, diet) and for one measure (BMI in developing country data) measured after birth. We flag these variables so that their coefficients can be interpreted with this caveat in mind.5 Importantly, if we dropped all of the flagged variables, we would still have a fairly compelling breadth of evidence. We add controls for education and, where available, wealth to allow for the fact that education may motivate and wealth may facilitate health-seeking behaviors (Kenkel, 1991; Lleras-Muney & Cutler, 2010). This will confirm that the indicators in Health are not simply proxying for socioeconomic status. As discussed above, we will present additional specifications including woman fixed effects in the model and restricting to same-sex twins.

B. Pre-Twin Balance

We perform an alternative test that exploits predetermined
birth outcomes within mothers. This essentially involves test-
ing whether women who produce twins had, on average,
healthier births before the twin birth, as this would be a mea-
sure of predetermined maternal health. For each n = {2, 3, 4},
we estimate

PriorDeathb