TWIN BIRTH AND MATERNAL CONDITION
Sonia Bhalotra and Damian Clarke*
Abstract—Twin births are often construed as a natural experiment in the so-
cial and natural sciences on the premise that the occurrence of twins is quasi-
random. We present population-level evidence that challenges this premise.
Using individual data for 17 million births in 72 countries, we demonstrate
that indicators of mother’s health, health-related behaviors, and the prenatal
environment are systematically positively associated with twin birth. The
associations are sizable, evident in richer and poorer countries—evident
even among women who do not use in vitro fertilization—and hold for
numerous different measures of health. We discuss potential mechanisms,
showing evidence that favors selective miscarriage.
I. Introduction
TWINS have intrigued humankind for more than a century
(Thorndike, 1905). In behavioral genetics, demography,
and psychology, monozygotic twins are studied to assess the
importance of nurture relative to nature (Polderman et al.,
2015). In the social sciences, twin births are also used to
denote an unexpected increase in family size, which assists
causal identification of the impact of fertility on investments
in children and on women’s labor supply (Rosenzweig &
Wolpin, 2000, 1980a; Bronars & Grogger, 1994; Black, De-
vereux, & Salvanes, 2005). A premise of studies that use twin
differences or the twin instrument is that twin births are quasi-
random and have no direct impact (except through fertility)
on the outcome under study. We present new population-
level evidence that challenges this premise. Using 16,962,165
births in 72 countries, of which 462,246 (2.73%) are twins,
we show that the likelihood of a twin birth varies system-
atically with maternal condition. In particular, our estimates
establish that mothers of twins are selectively healthy.1
We document that the association of twin births and mater-
nal condition is meaningfully large and widespread. We show
that it is evident in richer and poorer countries and that it holds
for sixteen different markers of maternal condition, includ-
ing health stocks and health conditions prior to pregnancy
(height, obesity, diabetes, hypertension, asthma, kidney dis-
ease, smoking), exposure to unexpected stress in pregnancy,
Received for publication June 1, 2017. Revision accepted for publication
July 30, 2018. Editor: Rohini Pande.
∗Bhalotra: University of Essex; Clarke: Universidad de Santiago de Chile.
We are grateful to Paul Devereux, James Fenske, Judith Hall, Christian
Hansen, Martin Karlsson, Toru Kitagawa, Magne Mogstad, Cheti Nico-
letti, Carol Propper, Adam Rosen, Paul Schulz, Margaret Stevens, Atheen
Venkataramani, Marcos Vera-Hernandez, Frank Windmeijer, Emilia Del
Bono, Climent Quintana-Domeque, Pedro Ródenas, Libertad González,
Hanna Mühlrad, Anna Aevarsdottir, Martin Foureaux Koppensteiner, Ryan
Brown, Pietro Biroli, Rohini Pande, and three anonymous referees, along
with various seminar audiences and discussants for helpful comments
and/or sharing data. Any remaining errors are our own. An earlier ver-
sion of this paper was circulated as Part 1 of “The Twin Instrument,” IZA
discussion paper 10405.
A supplemental appendix is available online at
http://www.mitpressjournals.org/doi/suppl/10.1162/rest_a_00789.
1Twins are not as rare as we may think: one in eighty live births and hence
one in forty newborns is a twin. In general and, for instance, in the United
States, there is a positive trend in twin births.
and measures of the availability of medical professionals and
prenatal care.2 The effects are sizable, with a 1 standard de-
viation improvement in the indicator tending to increase the
likelihood of twinning by 6% to 12%.
Previous research has documented that twins have dif-
ferent endowments from singletons; for example, twins are
more likely to have low birthweight and congenital anoma-
lies (Hall, 2003; Rosenzweig & Zhang, 2009). We focus not
on differences between twins and singletons but rather on
differences between mothers of twins and singletons, which
indicate whether occurrence of twin births is quasi-random.
It is known that twin births are not strictly random, occurring
more frequently among older mothers, at higher parity, and
in certain races and ethnicities (Hall, 2003; Bulmer, 1970),
but as these variables are typically observable, they can be
adjusted for (as in Rosenzweig & Wolpin, 1980a).3 Similarly,
it is well documented that women using artificial reproduc-
tive technologies (ART) are more likely to give birth to twins
(Vitthala et al., 2009), but ART use is recorded in many birth
registries, so it can be controlled for and a conditional ran-
domness assumption upheld (Cáceres-Delpiano, 2006; An-
grist, Lavy, & Schlosser, 2010). The reason that our finding
is potentially a major challenge is that maternal condition
is multidimensional and almost impossible to fully measure
and adjust for. To take a few examples, fetal health is potentially
a function of whether pregnant women skip breakfast
(Mazumder & Seeskin, 2015), suffer bereavement in
pregnancy (Black et al., 2016), or are exposed to air
pollution (Chay & Greenstone, 2003).
Our underlying hypothesis is that twins are more demand-
ing of maternal resources than singletons, and as a result,
conditions that challenge maternal health are more likely to
result in miscarriage of twins than of singletons. We dis-
cuss the role of alternative mechanisms including nonran-
dom conception and maternal survival selection. We provide
evidence in favor of the selective miscarriage mechanism us-
ing U.S. Vital Statistics data for 14 to 16 million births. Se-
lective miscarriage is similarly the mechanism behind the
stylized fact that weaker maternal condition is associated
with a lower probability of male birth (Trivers & Willard,
1973; Almond & Edlund, 2007). We confirm this in our data,
2We also show a positive association of the chances of having twins
with health-related behaviors in pregnancy (healthy diet, smoking, alcohol,
drug consumption), although we do not rely on this because behaviors in
pregnancy may reflect a response to the mother’s knowledge that she is
carrying twins.
3Other correlates identified in the medical literature but not reflected in
social science research include high concentrations of follicle-stimulating
hormone in women, season and seasonal light, height, urbanization, and
starvation (Hall, 2003), with mixed results (based on small samples) when
considering social class (Campbell, Campbell, & MacGillivray, 1974;
Campbell, 1998). These results have not been documented in the economics
or social science literature. In our discussion of mechanisms, we discuss
the difference between monozygotic and dizygotic twins.
The Review of Economics and Statistics, December 2019, 101(5): 853–864
© 2018 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology.
Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
https://doi.org/10.1162/rest_a_00789
TABLE 1.—THE QUANTITY–QUALITY AND FERTILITY–FLFP TRADE-OFFS: ESTIMATES USING THE TWIN INSTRUMENT
Estimates
Authors
Data/Context
Twin Use
OLS
IV
Maternal Controls
A. Quantity–Quality
Rosenzweig and Wolpin
(1980a)
Black et al. (2005)
India, rural survey; outcome is
standardized schooling
Norway, administrative data
Twin ratio
in OLSa
IV
Cáceres-Delpiano (2006)
United States, Census 5% file;
IV
Li et al. (2008)
outcome is behind
educational cohort
China, census 1% file; outcome
is educational enrollment
Rosenzweig and Zhang (2009)
China, twin survey
Angrist et al. (2010)
Israel, Census 20% file
Black et al. (2010)
Norway, administrative data;
outcome is IQ
IV
RFb,f
IV
IV
Åslund and Grönqvist (2010)
Sweden, administrative data
IV
Ponczek and Souza (2012)
Brazil, Census 10% file
IV (girl)
Marteleto and de Souza (2012)
Brazil, household survey
Mogstad and Wiswall (2016)
Norway, administrative data
B. Fertility and female labor force participation
Rosenzweig and Wolpin
(1980b)
United States, pooled
demographic surveys
Bronars and Grogger (1994)
United States 1970 and 1980
5% Census
Angrist and Evans (1998)
Jacobsen et al. (1999)
United States 1980 5% Census
United States 1970 and 1980
5% Census
IV (boy)
IV
IVc
RFd
RFd
IV
IVe
Cáceres-Delpiano (2012)
Pooled demographic surveys,
IV
developing countries
−2.483 (0.740)
−0.060 (0.003)
−0.076 (0.004)
−0.059 (0.006)
0.011 (0.000)
0.017 (0.001)
−0.031 (0.001)
−0.038 (0.002)
−0.307 (0.160)
−0.225 (0.172)
−0.145 (0.005)
−0.143 (0.005)
−0.113 (0.004)
−0.132 (0.006)
−0.100 (0.009)
−0.277 (0.015)
−0.283 (0.015)
−0.233 (0.010)
−0.230 (0.010)
−0.248 (0.003)
−0.240 (0.003)
None
Age and education
Age, education, and race
Age and education
Age
Age, place of birth, race
Age and education
Age and education
Age and education
Age, education, and
family income
Age and education
−0.038 (0.047)
−0.016 (0.044)
−0.024 (0.059)
0.002 (0.003)
0.010 (0.006)
0.002 (0.009)
−0.024 (0.014)
No birthweight control
Birthweight control
0.174 (0.166)
0.167 (0.117)
−0.149 (0.052)
−0.170 (0.052)
−0.115 (0.080)
0.022 (0.048)
−0.043 (0.048)
−0.042 (0.083)
−0.372 (0.198)
−0.634 (0.194)
−0.137 (0.146)
−0.060 (0.164)
0.064 (0.076)
0.131 (0.055)
0.053 (0.050)
−0.051 (0.053)
−0.107 (0.059)
−0.371 (0.212)
0.142 (0.102)
−0.036 (0.036)
−0.035 (0.017)
−0.176 (0.002)
−0.014 (0.001)
−0.010 (0.001)
−0.009 (0.001)
Short-term estimate
Long-term estimate
1970 Census
1980 Census
−0.057 (0.011)
−0.021 (0.014)
−0.025 (0.008)
−0.029 (0.012)
−0.016 (0.012)
−0.022 (0.012)
None
Age at first birth
Age, age at first birth
Age at first birth cubic
Age, education, literacy
status, country
dummies
Estimates and standard errors reported in columns OLS and IV refer to main estimates from each paper. Estimates are included from published articles using large samples of microdata. A comprehensive review is
provided in Clarke (2018). Where multiple estimates are reported, unless otherwise indicated, the first line refers to the impact of twins at birth 2, the second line the impact of twins at birth 3, and the third line the
impact of twins at birth 4 (if available). In panel A, the estimates refer to the outcome variable “years of education” unless specified in column 2. In panel B, all outcomes are the mother’s labor market participation.
a Twin ratio is the number of twin births divided by the number of pregnancies.
bCoefficients reported are impact of second birth twins on nontwin first births.
cNonlinear estimates are reported in paper. Here linear estimates are presented for comparison with other results.
dReduced form (RF) uses twins at first birth as independent variable.
eFirst line reports estimates from 1970 census, second line reports 1980 census.
f Standard errors are calculated from reported t-statistics.
showing that twin births are more likely to be girls. Our find-
ings add a novel twist to recent literature documenting that a
mother’s health and her environmental exposure to nutritional
or other stresses during pregnancy influence birth outcomes,
with many studies documenting lower birthweight (Currie
& Moretti, 2007; Bernstein et al., 2005; Quintana-Domeque
& Ródenas-Serrano, 2017). If birthweight is the intensive
margin, we may think of miscarriage as an extensive margin
response or the limiting case of low birthweight.
Our findings have implications for research that has ex-
ploited the assumed randomness of twin births. Studies us-
ing twins to isolate exogenous variation in fertility will tend
to underestimate the impact of fertility on parental invest-
ments in children and on women’s labor supply if selectively
healthy mothers invest more in children after birth and are
more likely to participate in the labor market (as discussed
in Bloom, Kuhn, & Prettner, 2015). In table 1 we summarize
studies using twin births to instrument fertility, document-
ing the mother-level controls in each study. In some cases,
the validity of the conditional randomness assumption is di-
rectly probed—for instance, with respect to mother’s educa-
tion (Black et al., 2005; Li, Zhang, & Zhu, 2008; Rosenzweig
& Zhang, 2009). However, as is acknowledged in each case,
any such tests are at best partial evidence in support of instru-
mental validity. Importantly, no previous study has attempted
to control for maternal health conditions or behaviors. This
is pertinent, as it could resolve the ambiguity of the avail-
able evidence on the impacts of fertility. In particular, recent
studies using the twin instrument challenge a long-standing
theoretical prior of Becker and Lewis (1973) in rejecting the
presence of a quantity–quality (QQ) fertility trade-off in de-
veloped countries (Black et al., 2005; Angrist et al., 2010),
but our estimates suggest that this rejection could in principle
arise from ignoring the positive selection of women into twin
birth. Similarly, research using the twin instrument tends to
find that additional children have relatively little influence
on female labor force participation (FLFP; see Lundborg,
Plug, & Rasmussen, 2017). But, again, these estimates are
likely to be downward biased. The results of studies in eco-
nomics, psychology, education, and biology that instead ex-
ploit the genetic similarity of twins will not be biased but will
tend to have more restricted external validity than previously
assumed.4
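The direction of bias described above can be made explicit with a stylized single-instrument example. This is a sketch under simplifying assumptions (one instrument, no covariates), not a derivation taken from the paper: the symbols q_i (the child or labor-market outcome), n_i (fertility), z_i (the twin-birth indicator), and u_i (unobserved determinants of the outcome, including maternal condition) are notation assumed here for illustration.

```latex
% Assumed structural equation: outcome q_i depends on fertility n_i.
q_i = \alpha + \beta\, n_i + u_i

% IV (Wald) estimand when the twin indicator z_i instruments n_i:
\operatorname{plim}\hat{\beta}_{IV}
  = \frac{\operatorname{Cov}(z_i, q_i)}{\operatorname{Cov}(z_i, n_i)}
  = \beta + \frac{\operatorname{Cov}(z_i, u_i)}{\operatorname{Cov}(z_i, n_i)}
```

Twin births raise fertility, so Cov(z_i, n_i) > 0. If mothers of twins are selectively healthy and healthier mothers invest more in children or are more likely to work, then Cov(z_i, u_i) > 0 and the second term is positive. With a true β < 0, the IV estimate is pushed toward 0, which is the sense in which the trade-off would be underestimated.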
II. Methodology
In this section, we discuss two distinct approaches to test-
ing our hypothesis that twins are selectively born to healthier
mothers. We identify variation in the mother’s health before
she gives birth to twins and before she knows she will give
birth to twins. In the first approach, we use information on her
health condition (morbidities, height, weight), health-related
behaviors, access to health care, and environmental health
stressors. In our second approach, we use as a marker of
maternal health the fetal or infant survival rate of her births
prior to the birth at which she has twins (with parity-matched
counterfactuals). We discuss below the methods used to in-
vestigate potential mechanisms.
We conduct three robustness checks. First, we restrict the
sample to non-ART births. It is important to demonstrate that
our hypothesis holds independent of ART use because there
is a positive association of ART with the likelihood of twin
births (Vitthala et al., 2009), and ART users are typically more
educated and wealthy (Lundborg et al., 2017). Another po-
tential concern is that we are capturing genetic traits that, for
instance, are associated with the woman’s height or weight
4The twin instrument has been criticized for other reasons. A recent
critique of the use of twins to identify the QQ trade-off has argued that
parental behaviors may respond to the endowment of twins and not only to
the fact that twin births represent a fertility shock. Rosenzweig & Zhang
(2009) highlight that twins have lower birth endowments. They argue that
if parents reinforce endowments, then they may reallocate resources to-
ward the better-endowed children born before the twins, obscuring any
underlying QQ trade-off; this is examined in Angrist et al. (2010) and
Fitzsimons & Malde (2014). We remain agnostic on this. Our critique
is in principle orthogonal to this critique, providing a different reason
that an underlying QQ trade-off may be obscured, relating to endow-
ments and behaviors of mothers. This critique has not been previously
considered.
and also correlated with her predisposition toward twin birth.
This would appear to be a second-order concern since we rely
not only on woman-specific measures of health but also
show a positive association of twinning with environmental
stressors, health facilities, and health-related behaviors. We
nevertheless investigate this concern in two different ways.
First, we test whether we can identify a positive association
of the probability that a birth is a twin with woman-specific
time-varying health indicators conditional on woman fixed
effects that sweep out genetic influences. Second, we leverage
biomedical research showing that monozygotic (MZ) twins
are randomly allocated across mothers, although genetic pre-
dispositions may influence the chances of having dizygotic
(DZ) twins (Meulemans et al., 1996). Ideally, we would re-
strict the sample to MZ twins, but MZ versus DZ are not
identified in the data. Instead, on the premise that MZ twins
are necessarily same sex and about half of all DZ twins are
same sex, we investigate our hypothesis restricting the sam-
ple to include only same-sex twins. If our results were driven
by genetic predispositions, then we should find weaker asso-
ciations in the same-sex sample. The methods and data used
to conduct the robustness checks are discussed with the re-
sults. The rest of this section elaborates the specification used
in the two main approaches to testing for twin randomness.
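The logic of the same-sex restriction can be illustrated with a back-of-the-envelope calculation. The sketch below uses only the premise stated above (all MZ pairs are same sex and about half of DZ pairs are same sex), which is commonly known as Weinberg's differential rule; the paper does not name this rule, and the function name and the pair counts are hypothetical.

```python
def mz_shares(same_sex_pairs, opposite_sex_pairs):
    """Estimate MZ shares from counts of same-sex and opposite-sex twin pairs.

    Assumes every MZ pair is same sex and exactly half of DZ pairs are
    same sex, so the number of same-sex DZ pairs equals the number of
    opposite-sex pairs.
    """
    if same_sex_pairs <= 0:
        raise ValueError("need at least one same-sex pair")
    mz_pairs = same_sex_pairs - opposite_sex_pairs  # implied MZ pairs
    if mz_pairs < 0:
        raise ValueError("more opposite-sex than same-sex pairs; premise violated")
    total = same_sex_pairs + opposite_sex_pairs
    share_all = mz_pairs / total                # MZ share among all twin pairs
    share_same_sex = mz_pairs / same_sex_pairs  # MZ share among same-sex pairs
    return share_all, share_same_sex


# Hypothetical counts, chosen only for illustration.
share_all, share_same_sex = mz_shares(same_sex_pairs=700, opposite_sex_pairs=300)
print(f"MZ share, all pairs: {share_all:.3f}")
print(f"MZ share, same-sex pairs: {share_same_sex:.3f}")
```

Under the stated premise, the MZ share is mechanically higher in the same-sex sample than in the full sample, which is why associations driven by a genetic predisposition toward DZ twinning would be expected to weaken there.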
A. Across Mothers
To test the null that twin births are “as good as random,”
we estimate conditional regressions of the form
$$\mathrm{twin}_{bjy} = \gamma_0 + \gamma_1 \mathrm{Health}_{bjy} + \mu_b + \lambda_y + \varepsilon_{bjy}. \tag{1}$$
Here, twin is an indicator of whether a birth of order b born
to woman j at age y is a twin. We control for fixed effects
for mother’s age and parity, as these are known to influence
the probability of twin birth. Where births are observed over
multiple years, races, or geographic areas, we include the
relevant fixed effects. Under the null, the coefficients on maternal
health variables $\mathrm{Health}_{bjy}$ should not be statistically
distinguishable from 0. This is equivalent to a test of (condi-
tional) balance of characteristics of treated (with twins) and
control (without twins) mothers. Standard errors are clustered
at the level of the mother.
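A minimal sketch of how a regression like equation (1) could be estimated is given below. It is not the authors' code and rests on several assumptions: there is a single health regressor; parity and age fixed effects are absorbed by demeaning within parity-by-age cells (more saturated than the additive fixed effects in the text); the clustered standard error omits small-sample corrections; and all function names and data are hypothetical.

```python
from collections import defaultdict


def demean_within(values, cells):
    """Subtract each cell's mean from its members (absorbs cell fixed effects)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, c in zip(values, cells):
        sums[c] += v
        counts[c] += 1
    return [v - sums[c] / counts[c] for v, c in zip(values, cells)]


def twin_health_slope(twin, health, cells, mother_ids):
    """Return (gamma1, clustered_se) from a within-cell linear probability model.

    twin: 0/1 indicator per birth; health: one health measure per birth;
    cells: fixed-effect cell per birth, e.g. (parity, mother's age);
    mother_ids: cluster identifier per birth.
    """
    y = demean_within(twin, cells)
    x = demean_within(health, cells)
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("health has no variation within cells")
    gamma1 = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    # Sandwich variance: sum the scores x * residual within each mother
    # before squaring, so births of the same mother may be correlated.
    scores = defaultdict(float)
    for xi, yi, m in zip(x, y, mother_ids):
        scores[m] += xi * (yi - gamma1 * xi)
    variance = sum(s * s for s in scores.values()) / (sxx * sxx)
    return gamma1, variance ** 0.5


# Toy data (hypothetical): six births to three mothers in two parity-age cells.
twin = [0, 1, 0, 0, 1, 0]
health = [0.1, 1.2, -0.4, 0.3, 0.9, -0.8]
cells = [(2, 25), (2, 25), (2, 25), (3, 30), (3, 30), (3, 30)]
mothers = ["m1", "m2", "m3", "m1", "m2", "m3"]
gamma1, se = twin_health_slope(twin, health, cells, mothers)
print(f"gamma1 = {gamma1:.3f}, clustered SE = {se:.3f}")
```

Under the null of conditional randomness, the ratio of gamma1 to its clustered standard error should be small in absolute value; a systematically positive ratio across health measures is the pattern the paper reports.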
For ease of exposition, we maintain the subscript y for the
woman’s age at birth, but most of the health indicators are
measured before pregnancy to avoid the potential concern
of reverse causality—that twin births cause greater deple-
tion of the mother’s health than singleton births or encourage
women to adopt different behaviors. These include prepreg-
nancy measures of smoking, diabetes, hypertension, obesity,
height, kidney disease, and asthma. Measures of prenatal
or medical care are constructed as community-level mea-
sures of availability. In a specific case we discuss below, we
use an exogenous measure of environmental stress in preg-
nancy. We also show results for some variables measured in
pregnancy—smoking, alcohol, drugs, diet—and for one mea-
sure (BMI in developing country data) measured after birth.
We flag these variables so that their coefficients can be inter-
preted with this caveat in mind.5 Importantly, if we dropped
all of the flagged variables, we would still have a fairly com-
pelling breadth of evidence. We add controls for education
and, where available, wealth to allow for the fact that educa-
tion may motivate and wealth may facilitate health-seeking
behaviors (Kenkel, 1991; Lleras-Muney & Cutler, 2010).
This will confirm that the indicators in Health are not simply
proxying for socioeconomic status. As discussed above, we
will present additional specifications including woman fixed
effects in the model and restricting to same-sex twins.
B. Pre-Twin Balance
We perform an alternative test that exploits predetermined
birth outcomes within mothers. This essentially involves test-
ing whether women who produce twins had, on average,
healthier births before the twin birth, as this would be a measure
of predetermined maternal health. For each n ∈ {2, 3, 4},
we estimate
PriorDeathb