THE LOG OF GRAVITY
J. M. C. Santos Silva and Silvana Tenreyro*
Abstract: Although economists have long been aware of Jensen’s inequality, many econometric applications have neglected an important implication of it: under heteroskedasticity, the parameters of log-linearized models estimated by OLS lead to biased estimates of the true elasticities. We explain why this problem arises and propose an appropriate estimator. Our criticism of conventional practices and the proposed solution extend to a broad range of applications where log-linearized equations are estimated. We develop the argument using one particular illustration, the gravity equation for trade. We find significant differences between estimates obtained with the proposed estimator and those obtained with the traditional method.
I. Introduction
ECONOMISTS have long been aware that Jensen’s inequality implies that $E(\ln y) \neq \ln E(y)$, that is, the
expected value of the logarithm of a random variable is
different from the logarithm of its expected value. This
basic fact, however, has been neglected in many economet-
ric applications. Indeed, one important implication of Jen-
sen’s inequality is that the standard practice of interpreting
the parameters of log-linearized models estimated by ordi-
nary least squares (OLS) as elasticities can be highly mis-
leading in the presence of heteroskedasticity.
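The inequality can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a hypothetical two-point distribution in which $y$ equals 1 or 3 with equal probability; the variable names are illustrative and are not part of the original analysis.

```python
import math

# Hypothetical two-point distribution: y is 1 or 3, each with probability 0.5.
outcomes = [1.0, 3.0]
probs = [0.5, 0.5]

mean_y = sum(p * y for p, y in zip(probs, outcomes))                # E(y) = 2
mean_log_y = sum(p * math.log(y) for p, y in zip(probs, outcomes))  # E(ln y)
log_mean_y = math.log(mean_y)                                       # ln E(y)

print(f"E(ln y) = {mean_log_y:.4f}")   # 0.5 * ln 3, about 0.5493
print(f"ln E(y) = {log_mean_y:.4f}")   # ln 2, about 0.6931
```

Because the logarithm is concave, the gap between the two quantities widens as the dispersion of $y$ around its mean grows, which is why heteroskedasticity matters for log-linearized models.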
Although many authors have addressed the problem of
obtaining consistent estimates of the conditional mean of
the dependent variable when the model is estimated in the
log linear form (see, for example, Goldberger, 1968; Man-
ning & Mullahy, 2001), we were unable to find any refer-
ence in the literature to the potential bias of the elasticities
estimated using the log linear model.
In this paper we use the gravity equation for trade as a
particular illustration of how the bias arises and propose an
appropriate estimator. We argue that the gravity equation,
and, more generally, constant-elasticity models, should be
estimated in their multiplicative form and propose a simple
pseudo-maximum-likelihood (PML) estimation technique.
Besides being consistent in the presence of heteroskedas-
ticity, this method also provides a natural way to deal with
zero values of the dependent variable.
Using Monte Carlo simulations, we compare the performance of our estimator with that of OLS (in the log linear specification). The results are striking. In the presence of heteroskedasticity, estimates obtained using log-linearized models are severely biased, distorting the interpretation of the model. These biases might be critical for the comparative assessment of competing economic theories, as well as for the evaluation of the effects of different policies. In contrast, our method is robust to the different patterns of heteroskedasticity considered in the simulations.

Received for publication March 29, 2004. Revision accepted for publication September 13, 2005.
* ISEG/Universidade Técnica de Lisboa and CEMAPRE; and London School of Economics, CEP, and CEPR, respectively.
We are grateful to two anonymous referees for their constructive comments and suggestions. We also thank Francesco Caselli, Kevin Denny, Juan Carlos Hallak, Daniel Mota, John Mullahy, Paulo Parente, Manuela Simarro, and Kim Underhill for helpful advice on previous versions of this paper. The usual disclaimer applies. Jiaying Huang provided excellent research assistance. Santos Silva gratefully acknowledges the partial financial support from Fundação para a Ciência e Tecnologia, program POCTI, partially funded by FEDER. A previous version of this paper circulated as “Gravity-Defying Trade.”

We next use the proposed method to provide new esti-
mates of the gravity equation in cross-sectional data. Using
standard tests, we show that heteroskedasticity is indeed a
severe problem, both in the traditional gravity equation
introduced by Tinbergen (1962), and in a gravity equation
that takes into account multilateral resistance terms or fixed
effects, as suggested by Anderson and van Wincoop (2003).
We then compare the estimates obtained with the proposed
PML estimator with those generated by OLS in the log
linear specification, using both the traditional and the fixed-
effects gravity equations.
Our estimation method paints a very different picture of
the determinants of international trade. In the traditional
gravity equation, the coefficients on GDP are not, as gen-
erally estimated, close to 1. Instead, they are significantly
smaller, which might help reconcile the gravity equation
with the observation that the trade-to-GDP ratio decreases
with increasing total GDP (or, in other words, that smaller
countries tend to be more open to international trade). In
addition, OLS greatly exaggerates the roles of colonial ties
and geographical proximity.
Using the Anderson–van Wincoop (2003) gravity equa-
tion, we find that OLS yields significantly larger effects for
geographical distance. The estimated elasticity obtained
from the log-linearized equation is almost twice as large as
that predicted by PML. OLS also predicts a large role for
common colonial ties, implying that sharing a common
colonial history practically doubles bilateral trade. In con-
trast, the proposed PML estimator leads to a statistically and
economically insignificant effect.
The general message is that, even controlling for fixed
effects,
the presence of heteroskedasticity can generate
strikingly different estimates when the gravity equation is
log-linearized, rather than estimated in levels. In other
words, Jensen’s inequality is quantitatively and qualitatively
important in the estimation of gravity equations. This sug-
gests that inferences drawn on log-linearized regressions
can produce misleading conclusions.
Despite the focus on the gravity equation, our criticism of
the conventional practice and the solution we propose ex-
tend to a broad range of economic applications where the
equations under study are log-linearized, or, more generally,
transformed by a nonlinear function. A short list of exam-
ples includes the estimation of Mincerian equations for
wages, production functions, and Euler equations, which are
typically estimated in logarithms.
The Review of Economics and Statistics, November 2006, 88(4): 641–658
© 2006 by the President and Fellows of Harvard College and the Massachusetts Institute of Technology
The remainder of the paper is organized as follows.
Section II studies the econometric problems raised by the
estimation of gravity equations. Section III considers
constant-elasticity models in general; it introduces the PML
estimator and specification tests to check the adequacy of
the proposed estimator. Section IV presents the Monte Carlo
simulations. Section V provides new estimates of both the
traditional and the Anderson–van Wincoop gravity equa-
tion. The results are compared with those generated by
OLS, nonlinear least squares, and tobit estimations. Section
VI contains concluding remarks.
II. The Econometrics of the Gravity Equation
A. The Traditional Gravity Equation
The pioneering work of Jan Tinbergen (1962) initiated a
vast theoretical and empirical literature on the gravity equa-
tion for trade. Theories based on different foundations for
trade, including endowment and technological differences,
increasing returns to scale, and Armington demands, all
predict a gravity relationship for trade flows analogous to
Newton’s law of universal gravitation.1 In its simplest form,
the gravity equation for trade states that the trade flow from
country $i$ to country $j$, denoted by $T_{ij}$, is proportional to the
product of the two countries’ GDPs, denoted by $Y_i$ and $Y_j$,
and inversely proportional to their distance, $D_{ij}$, broadly
construed to include all factors that might create trade
resistance. More generally,
$$T_{ij} = \alpha_0 Y_i^{\alpha_1} Y_j^{\alpha_2} D_{ij}^{\alpha_3}, \tag{1}$$
where $\alpha_0$, $\alpha_1$, $\alpha_2$, and $\alpha_3$ are unknown parameters.
The analogy between trade and the physical force of
gravity, however, clashes with the observation that there is
no set of parameters for which equation (1) will hold exactly
for an arbitrary set of observations. To account for devia-
tions from the theory, stochastic versions of the equation are
used in empirical studies. Typically, the stochastic version
of the gravity equation has the form
$$T_{ij} = \alpha_0 Y_i^{\alpha_1} Y_j^{\alpha_2} D_{ij}^{\alpha_3} \eta_{ij}, \tag{2}$$
where $\eta_{ij}$ is an error factor with $E(\eta_{ij} \mid Y_i, Y_j, D_{ij}) = 1$,
assumed to be statistically independent of the regressors,
leading to
$$E(T_{ij} \mid Y_i, Y_j, D_{ij}) = \alpha_0 Y_i^{\alpha_1} Y_j^{\alpha_2} D_{ij}^{\alpha_3}.$$
There is a long tradition in the trade literature of log-
linearizing equation (2) and estimating the parameters of
interest by least squares, using the equation
$$\ln T_{ij} = \ln \alpha_0 + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j + \alpha_3 \ln D_{ij} + \ln \eta_{ij}. \tag{3}$$
The validity of this procedure depends critically on the assumption that $\eta_{ij}$, and therefore $\ln \eta_{ij}$, are statistically independent of the regressors. To see why this is so, notice that the expected value of the logarithm of a random variable depends both on its mean and on the higher-order moments of the distribution. Hence, for example, if the variance of the error factor $\eta_{ij}$ in equation (2) depends on $Y_i$, $Y_j$, or $D_{ij}$, the expected value of $\ln \eta_{ij}$ will also depend on the regressors, violating the condition for consistency of OLS.2
In the cases studied in section V we find overwhelming evidence that the error terms in the usual log linear specification of the gravity equation are heteroskedastic, which violates the assumption that $\ln \eta_{ij}$ is statistically independent of the regressors and suggests that this estimation method leads to inconsistent estimates of the elasticities of interest.
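The link between the variance of the error factor and the mean of its logarithm can be checked numerically. The sketch below assumes, for illustration only, a log-normal error factor with unit mean and variance $\sigma^2$, for which $E[\ln \eta] = -\tfrac{1}{2}\ln(1 + \sigma^2)$; the function names and the values of $\sigma^2$ are hypothetical choices.

```python
import math
import random

def analytic_mean_log_eta(sigma2):
    """E[ln(eta)] for log-normal eta with E[eta] = 1 and V[eta] = sigma2."""
    return -0.5 * math.log(1.0 + sigma2)

def simulate_eta_moments(sigma2, n=200_000, seed=0):
    """Return the sample means of eta and ln(eta) from n log-normal draws."""
    rng = random.Random(seed)
    s2 = math.log(1.0 + sigma2)   # variance of ln(eta)
    mu = -0.5 * s2                # mean of ln(eta) that makes E[eta] = 1
    s = math.sqrt(s2)
    total_eta = total_log = 0.0
    for _ in range(n):
        log_eta = rng.gauss(mu, s)
        total_log += log_eta
        total_eta += math.exp(log_eta)
    return total_eta / n, total_log / n

for sigma2 in (0.1, 1.0, 5.0):
    mean_eta, mean_log = simulate_eta_moments(sigma2)
    print(f"sigma2={sigma2}: mean(eta)={mean_eta:.3f}, "
          f"mean(ln eta)={mean_log:.3f}, analytic={analytic_mean_log_eta(sigma2):.3f}")
```

The sample mean of the error factor stays near 1 in every case, while the mean of its logarithm falls as the variance rises. If the variance is a function of the regressors, so is the mean of the log error.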
A related problem with the analogy between Newtonian
gravity and trade is that gravitational force can be very
small, but never zero, whereas trade between several pairs
of countries is literally zero. In many cases, these zeros
occur simply because some pairs of countries did not trade
in a given period. For example, it would not be surprising to
find that Tajikistan and Togo did not trade in a certain year.3
These zero observations pose no problem at all for the
estimation of gravity equations in their multiplicative form.
In contrast, the existence of observations for which the dependent variable is zero creates an additional problem for the use of the log linear form of the gravity equation. Several methods have been developed to deal with this problem [see Frankel (1997) for a description of the various procedures]. The approach followed by the large majority of empirical studies is simply to drop the pairs with zero trade from the data set and estimate the log linear form by OLS. Rather than throwing away the observations with $T_{ij} = 0$, some authors estimate the model using $T_{ij} + 1$ as the dependent variable or use a tobit estimator. However, these procedures will generally lead to inconsistent estimators of the parameters of interest. The severity of these inconsistencies will depend on the particular characteristics of the sample and model used, but there is no reason to believe that they will be negligible.

1 See, for example, Anderson (1979), Helpman and Krugman (1985), Bergstrand (1985), Davis (1995), Deardorff (1998), and Anderson and van Wincoop (2003). A feature common to these models is that they all assume complete specialization: each good is produced in only one country. However, Haveman and Hummels (2001), Feenstra, Markusen, and Rose (2000), and Eaton and Kortum (2001) derive the gravity equation without relying on complete specialization. Examples of empirical studies framed on the gravity equation include the evaluation of trade protection (for example, Harrigan, 1993), regional trade agreements (for example, Frankel, Stein, & Wei, 1998; Frankel, 1997), exchange rate variability (for example, Frankel & Wei, 1993; Eichengreen & Irwin, 1995), and currency unions (for example, Rose, 2000; Frankel & Rose, 2002; and Tenreyro & Barro, 2002). See also the various studies on border effects influencing the patterns of intranational and international trade, including McCallum (1995), and Anderson and van Wincoop (2003), among others.
2 As an illustration, consider the case in which $\eta_{ij}$ follows a log normal distribution, with $E(\eta_{ij} \mid Y_i, Y_j, D_{ij}) = 1$ and variance $\sigma_{ij}^2 = f(Y_i, Y_j, D_{ij})$. The error term in the log-linearized representation will then follow a normal distribution, with $E[\ln \eta_{ij} \mid Y_i, Y_j, D_{ij}] = -\tfrac{1}{2} \ln(1 + \sigma_{ij}^2)$, which is also a function of the covariates.
3 The absence of trade between small and distant countries might be explained, among other factors, by large variable costs (for example, bricks are too costly to transport) or large fixed costs (for example, information on foreign markets). At the aggregate level, these costs can be best proxied by the various measures of distance and size entering the gravity equation. The existence of zero trade between many pairs of countries is directly addressed by Hallak (2006) and Helpman, Melitz, and Rubinstein (2004). These authors propose a promising avenue of research using a two-part estimation procedure, with a fixed-cost equation determining the cutoff point above which a country exports, and a standard gravity equation. Their results, however, rely heavily on both normality and homoskedasticity assumptions, the latter being the particular concern of this paper. A natural topic for further research is to develop and implement an estimator of the two-part model that, like the PML estimator proposed here, is robust to distributional assumptions.

Zeros may also be the result of rounding errors.4 If trade
is measured in thousands of dollars, it is possible that for
pairs of countries for which bilateral trade did not reach a
minimum value, say $500, the value of trade is registered as
0. If these rounded-down observations were partially com-
pensated by rounded-up ones, the overall effect of these
errors would be relatively minor. However, the rounding
down is more likely to occur for small or distant countries,
and therefore the probability of rounding down will depend
on the value of the covariates, leading to the inconsistency
of the estimators. Finally, the zeros can just be missing
observations that are wrongly recorded as 0. This problem is
more likely to occur when small countries are considered,
and again the measurement error will depend on the covari-
ates, leading to inconsistency.
B. The Anderson–van Wincoop Gravity Equation
Anderson and van Wincoop (2003) argue that the tradi-
tional gravity equation is not correctly specified, as it does
not take into account multilateral resistance terms. One of
the solutions for this problem that is suggested by those
authors is to augment the traditional gravity equation with
exporter and importer fixed effects, leading to
$$T_{ij} = \alpha_0 Y_i^{\alpha_1} Y_j^{\alpha_2} D_{ij}^{\alpha_3} e^{\theta_i d_i + \theta_j d_j}, \tag{4}$$
where $\alpha_0$, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\theta_i$, and $\theta_j$ are the parameters to be estimated and $d_i$ and $d_j$ are dummies identifying the exporter and importer.5
Their model also yields the prediction that $\alpha_1 = \alpha_2 = 1$, which leads to the unit-income-elasticity model
$$T_{ij} = \alpha_0 Y_i Y_j D_{ij}^{\alpha_3} e^{\theta_i d_i + \theta_j d_j},$$
whose stochastic version has the form
$$E(T_{ij} \mid Y_i, Y_j, D_{ij}, d_i, d_j) = \alpha_0 Y_i Y_j D_{ij}^{\alpha_3} e^{\theta_i d_i + \theta_j d_j}. \tag{5}$$
As before, log-linearization of equation (5) raises the problem of how to treat zero-value observations. Moreover, given that equation (5) is a multiplicative model, it is also subject to the biases caused by log-linearization in the presence of heteroskedasticity. Naturally, the presence of the individual effects may reduce the severity of this problem, but whether or not that happens is an empirical issue.

4 Trade data can suffer from many other forms of errors, as described in Feenstra, Lipsey, and Bowen (1997).

In our empirical analysis we provide estimates for both
the traditional and the Anderson–van Wincoop gravity equa-
tions, using alternative estimation methods. We show that,
in practice, heteroskedasticity is quantitatively and qualita-
tively important in the gravity equation, even when control-
ling for fixed effects. Hence, we recommend estimating the
augmented gravity equation in levels, using the proposed
PML estimator, which also adequately deals with the zero-
value observations.
III. Constant-Elasticity Models
Despite their immense popularity, empirical studies in-
volving gravity equations still have important econometric
flaws. These flaws are not exclusive to this literature, but
extend to many areas where constant-elasticity models are
used. This section examines how the deterministic multipli-
cative models suggested by economic theory can be used in
empirical studies.
In their nonstochastic form, the relationship between the multiplicative constant-elasticity model and its log linear additive formulation is trivial. The problem, of course, is that economic relations do not hold with the accuracy of physical laws. All that can be expected is that they hold on average. Indeed, here we interpret economic models like the gravity equation as yielding the expected value of the variable of interest, $y \geq 0$, for a given value of the explanatory variables, $x$ (see Goldberger, 1991, p. 5). That is, if economic theory suggests that $y$ and $x$ are linked by a constant-elasticity model of the form $y_i = \exp(x_i\beta)$, the function $\exp(x_i\beta)$ is interpreted as the conditional expectation of $y_i$ given $x$, denoted $E[y_i \mid x]$.6 For example, using the notation in the previous section, the multiplicative gravity relationship can be written as the exponential function $\exp[\ln \alpha_0 + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j + \alpha_3 \ln D_{ij}]$, which is interpreted as the conditional expectation $E(T_{ij} \mid Y_i, Y_j, D_{ij})$.
Because the relation $y_i = \exp(x_i\beta)$ holds on average but not for each $i$, an error term is associated with each observation, which is defined as $\varepsilon_i = y_i - E[y_i \mid x]$.7 Therefore, the stochastic model can be formulated as
$$y_i = \exp(x_i\beta) + \varepsilon_i, \tag{6}$$
with $y_i \geq 0$ and $E[\varepsilon_i \mid x] = 0$.

5 Note that, throughout the paper, $T_{ij}$ denotes exports from $i$ to $j$.
6 Notice that if $\exp(x_i\beta)$ is interpreted as describing the conditional median of $y_i$ (or some other conditional quantile) rather than the conditional expectation, estimates of the elasticities of interest can be obtained estimating the log linear model using the appropriate quantile regression estimator (Koenker & Bassett, 1978). However, interpreting $\exp(x_i\beta)$ as a conditional median is problematic when $y_i$ has a large mass of zero observations, as in trade data. Indeed, in this case the conditional median of $y_i$ will be a discontinuous function of the regressors, which is generally not compatible with standard economic theory.
7 Whether the error enters additively or multiplicatively is irrelevant for our purposes, as explained below.

As we mentioned before, the standard practice of log-linearizing equation (6) and estimating $\beta$ by OLS is inappropriate for a number of reasons. First of all, $y_i$ can be 0, in which case log-linearization is infeasible. Second, even if all observations of $y_i$ are strictly positive, the expected value of the log-linearized error will in general depend on the covariates, and hence OLS will be inconsistent. To see the point more clearly, notice that equation (6) can be expressed as
$$y_i = \exp(x_i\beta)\,\eta_i,$$
with $\eta_i = 1 + \varepsilon_i/\exp(x_i\beta)$ and $E[\eta_i \mid x] = 1$. Assuming for the moment that $y_i$ is positive, the model can be made linear in the parameters by taking logarithms of both sides of the equation, leading to
$$\ln y_i = x_i\beta + \ln \eta_i. \tag{7}$$
To obtain a consistent estimator of the slope parameters in equation (6) by estimating equation (7) by OLS, it is necessary that $E[\ln \eta_i \mid x]$ does not depend on $x_i$.8 Because $\eta_i = 1 + \varepsilon_i/\exp(x_i\beta)$, this condition is met only if $\varepsilon_i$ can be written as $\varepsilon_i = \exp(x_i\beta)\,v_i$, where $v_i$ is a random variable statistically independent of $x_i$. In this case, $\eta_i = 1 + v_i$ and therefore is statistically independent of $x_i$, implying that $E[\ln \eta_i \mid x]$ is constant. Thus, only under very specific conditions on the error term is the log linear representation of the constant-elasticity model useful as a device to estimate the parameters of interest.
When $\eta_i$ is statistically independent of $x_i$, the conditional variance of $y_i$ (and $\varepsilon_i$) is proportional to $\exp(2x_i\beta)$. Although economic theory generally does not provide any information on the variance of $\varepsilon_i$, we can infer some of its properties from the characteristics of the data. Because $y_i$ is nonnegative, when $E[y_i \mid x]$ approaches 0, the probability of $y_i$ being positive must also approach 0. This implies that $V[y_i \mid x]$, the conditional variance of $y_i$, tends to vanish as $E[y_i \mid x]$ approaches 0.9 On the other hand, when the expected value of $y$ is far away from its lower bound, it is possible to observe large deviations from the conditional mean in either direction, leading to greater dispersion. Thus, in practice, $\varepsilon_i$ will generally be heteroskedastic and its variance will depend on $\exp(x_i\beta)$, but there is no reason to assume that $V[y_i \mid x]$ is proportional to $\exp(2x_i\beta)$. Therefore, in general, regressing $\ln y_i$ on $x_i$ by OLS will lead to inconsistent estimates of $\beta$.
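A small simulation illustrates the point. It is a minimal sketch under assumptions chosen for illustration, not a reproduction of the Monte Carlo design of section IV: a single standard normal regressor, $\beta_0 = 0$, $\beta_1 = 1$, and a log-normal error factor with unit conditional mean. The helper names (`ols_slope`, `log_linear_slope`) are hypothetical.

```python
import math
import random

def ols_slope(x, z):
    """Slope from an OLS regression of z on x with an intercept."""
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxz = sum((a - mx) * (b - mz) for a, b in zip(x, z))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxz / sxx

def log_linear_slope(variance_of_eta, beta0=0.0, beta1=1.0, n=20_000, seed=1):
    """Simulate y = exp(beta0 + beta1*x) * eta and regress ln(y) on x.

    eta is log-normal with E[eta | x] = 1 and V[eta | x] = variance_of_eta(mu),
    where mu = exp(beta0 + beta1*x) is the conditional mean of y.
    """
    rng = random.Random(seed)
    xs, log_ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        mu = math.exp(beta0 + beta1 * x)
        s2 = math.log(1.0 + variance_of_eta(mu))       # variance of ln(eta)
        log_eta = rng.gauss(-0.5 * s2, math.sqrt(s2))  # keeps E[eta | x] = 1
        xs.append(x)
        log_ys.append(beta0 + beta1 * x + log_eta)
    return ols_slope(xs, log_ys)

# Case 1: V[eta | x] is constant, so V[y | x] is proportional to exp(2*x*beta).
slope_independent = log_linear_slope(lambda mu: 1.0)
# Case 2: V[eta | x] = 1/mu, so V[y | x] is proportional to the conditional mean.
slope_dependent = log_linear_slope(lambda mu: 1.0 / mu)

print(f"error factor independent of x: OLS slope = {slope_independent:.3f}")
print(f"variance proportional to mean: OLS slope = {slope_dependent:.3f}")
```

In the first case the OLS slope should be close to the true value of 1. In the second case the conditional mean of the log error rises with the regressor, so the OLS slope should settle visibly above 1 even though the conditional mean of $y_i$ is correctly specified.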
8 Consistent estimation of the intercept would also require $E[\ln \eta_i \mid x] = 0$.
9 In the case of trade data, when $E[y_i \mid x]$ is close to its lower bound (that is, for pairs of small and distant countries), it is unlikely that large values of trade are observed, for they cannot be offset by equally large deviations in the opposite direction, simply because trade cannot be negative. Therefore, for these observations, dispersion around the mean tends to be small.
It may be surprising that the pattern of heteroskedasticity
and, indeed, the form of all higher-order moments of the
conditional distribution of the error term can affect the
consistency of an estimator, rather than just its efficiency.
The reason is that the nonlinear transformation of the dependent variable in equation (7) changes the properties of the error term in a nontrivial way because the conditional expectation of $\ln \eta_i$ depends on the shape of the conditional distribution of $\eta_i$. Hence, unless very strong restrictions are imposed on the form of this distribution, it is not possible to recover information about the conditional expectation of $y_i$ from the conditional mean of $\ln y_i$, simply because $\ln \eta_i$ is correlated with the regressors. Nevertheless, estimating equation (7) by OLS will produce consistent estimates of the parameters of $E[\ln y_i \mid x]$ as long as $E[\ln y_i \mid x]$ is a linear function of the regressors.10 The problem is that these parameters may not permit identification of the parameters of $E[y_i \mid x]$.
In short, even assuming that all observations on $y_i$ are
positive, it is not advisable to estimate $\beta$ from the log linear
model. Instead, the nonlinear model has to be estimated.
A. Estimation
Although most empirical studies use the log linear form
of the constant-elasticity model, some authors [see Frankel
and Wei (1993) for an example in the international trade
literature] have estimated multiplicative models using non-
linear least squares (NLS), which is an asymptotically valid
estimator for equation (6). However, the NLS estimator can
be very inefficient in this context, as it ignores the het-
eroskedasticity that, as discussed before, is characteristic of
this type of data.
The NLS estimator of $\beta$ is defined by
$$\hat{\beta} = \arg\min_{b} \sum_{i=1}^{n} [y_i - \exp(x_i b)]^2,$$
which implies the following set of first-order conditions:
$$\sum_{i=1}^{n} [y_i - \exp(x_i\hat{\beta})] \exp(x_i\hat{\beta})\, x_i = 0. \tag{8}$$
These equations give more weight to observations where
$\exp(x_i\hat{\beta})$ is large, because that is where the curvature of the
conditional expectation is more pronounced. However,
these are generally also the observations with larger vari-
ance, which implies that NLS gives more weight to noisier
observations. Thus, this estimator may be very inefficient,
depending heavily on a small number of observations.
If the form of $V[y_i \mid x]$ were known, this problem could be eliminated using a weighted NLS estimator. However, in practice, all we know about $V[y_i \mid x]$ is that, in general, it goes to 0 as $E[y_i \mid x]$ passes to 0. Therefore, an optimal weighted NLS estimator cannot be used without further information on the distribution of the errors. In principle, this problem can be tackled by estimating the multiplicative model using a consistent estimator, and then obtaining the appropriate weights estimating the skedastic function nonparametrically, as suggested by Delgado (1992) and Delgado and Kniesner (1997). However, this nonparametric generalized least squares estimator is rather cumbersome to implement, especially if the model has a large number of regressors. Moreover, the choice of the first-round estimator is an open question, as the NLS estimator may be a poor starting point due to its considerable inefficiency. Therefore, the nonparametric generalized least squares estimator is not appropriate to use as a workhorse for routine estimation of multiplicative models.11 Indeed, what is needed is an estimator that is consistent and reasonably efficient under a wide range of heteroskedasticity patterns and is also simple to implement.

10 When $E[\ln y_i \mid x]$ is not a linear function of the regressors, estimating equation (7) by OLS will produce consistent estimates of the parameters of the best linear approximation to $E[\ln y_i \mid x]$ (see Goldberger, 1991, p. 53).

A possible way of obtaining an estimator that is more efficient than the standard NLS without the need to use nonparametric regression is to follow McCullagh and Nelder (1989) and estimate the parameters of interest using a PML estimator based on some assumption on the functional form of $V[y_i \mid x]$.12 Among the many possible specifications, the hypothesis that the conditional variance is proportional to the conditional mean is particularly appealing. Indeed, under this assumption $E[y_i \mid x] = \exp(x_i\beta) \propto V[y_i \mid x]$, and $\beta$ can be estimated by solving the following set of first-order conditions:
$$\sum_{i=1}^{n} [y_i - \exp(x_i\tilde{\beta})]\, x_i = 0. \tag{9}$$
Comparing equations (8) and (9), it is clear that, unlike the NLS estimator, which is a PML estimator obtained assuming that $V[y_i \mid x]$ is constant, the PML estimator based on equation (9) gives the same weight to all observations, rather than emphasizing those for which $\exp(x_i\beta)$ is large. This is because, under the assumption that $E[y_i \mid x] \propto V[y_i \mid x]$, all observations have the same information on the parameters of interest as the additional information on the curvature of the conditional mean coming from observations with large $\exp(x_i\beta)$ is offset by their larger variance. Of course, this estimator may not be optimal, but without further information on the pattern of heteroskedasticity, it seems natural to give the same weight to all observations.13 Even if $E[y_i \mid x]$ is not proportional to $V[y_i \mid x]$, the PML estimator based on equation (9) is likely to be more efficient than the NLS estimator when the heteroskedasticity increases with the conditional mean.

11 A nonparametric generalized least squares estimator can also be used to estimate linear models in the presence of heteroskedasticity of unknown form (Robinson, 1987). However, despite having been proposed more than 15 years ago, this estimator has never been adopted as a standard tool by researchers doing empirical work, who generally prefer the simplicity of the inefficient OLS, with an appropriate covariance matrix.
12 See also Manning and Mullahy (2001). A related estimator is proposed by Papke and Wooldridge (1996) for the estimation of models for fractional data.

The estimator defined by equation (9) is numerically equal to the Poisson pseudo-maximum-likelihood (PPML) estimator, which is often used for count data.14 The form of equation (9) makes clear that all that is needed for this estimator to be consistent is the correct specification of the conditional mean, that is, $E[y_i \mid x] = \exp(x_i\beta)$. Therefore, the data do not have to be Poisson at all and, what is more important, $y_i$ does not even have to be an integer for the estimator based on the Poisson likelihood function to be consistent. This is the well-known PML result first noted by Gourieroux, Monfort, and Trognon (1984).
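The numerical equivalence can be seen by differentiating the Poisson log-likelihood with mean $\exp(x_i b)$. The following worked step is a sketch in the notation of equations (6) and (9); the symbol $\ell(b)$ for the log-likelihood is introduced here for illustration.

```latex
\begin{align*}
\ell(b) &= \sum_{i=1}^{n} \bigl[ y_i\, x_i b - \exp(x_i b) \bigr]
           + \text{terms not involving } b, \\
\frac{\partial \ell(b)}{\partial b}
        &= \sum_{i=1}^{n} \bigl[ y_i - \exp(x_i b) \bigr] x_i , \\
E\bigl\{ [\, y_i - \exp(x_i \beta) \,]\, x_i \mid x \bigr\}
        &= \bigl\{ E[y_i \mid x] - \exp(x_i \beta) \bigr\} x_i = 0 .
\end{align*}
```

Setting the score in the second line to zero reproduces the first-order conditions in equation (9). The third line shows that the score has zero expectation at the true $\beta$ whenever the conditional mean is correctly specified, whatever the distribution of $y_i$.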
The implementation of the PPML estimator is straightforward: there are standard econometric programs with commands that permit the estimation of Poisson regression, even when the dependent variables are not integers. Because the assumption $V[y_i \mid x] \propto E[y_i \mid x]$ is unlikely to hold, this estimator does not take full account of the heteroskedasticity in the model, and all inference has to be based on an Eicker-White (Eicker, 1963; White, 1980) robust covariance matrix estimator. In particular, within Stata (StataCorp., 2003), the PPML estimation can be executed using the following command:
    poisson export_{i,j} ln(dist_{ij}) ln Y_i ln Y_j (other variables)_{ij}, robust
where export (or import) is measured in levels.
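For readers who want to see what such a command computes, the first-order conditions in equation (9) can be solved directly. The sketch below is an illustration under stated assumptions and is not the routine used by any particular package: it handles an intercept and a single regressor, uses plain Newton iterations without safeguards, and is run on simulated non-integer data with hypothetical parameter values. It does not compute the robust covariance matrix that inference requires.

```python
import math
import random

def ppml_fit(x, y, max_iter=50, tol=1e-10):
    """Solve sum_i [y_i - exp(b0 + b1*x_i)] * (1, x_i) = 0 by Newton's method."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start from a constant-mean model
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = math.exp(b0 + b1 * xi)
            g0 += yi - mu                     # score for the intercept
            g1 += (yi - mu) * xi              # score for the slope
            h00 += mu                         # entries of sum_i mu_i * x_i x_i'
            h01 += mu * xi
            h11 += mu * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det      # Newton step: H^{-1} g
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Simulated data (hypothetical design): E[y | x] = exp(0 + 1*x), V[y | x] = E[y | x],
# with a continuous, non-integer dependent variable.
rng = random.Random(42)
x, y = [], []
for _ in range(5000):
    xi = rng.gauss(0.0, 1.0)
    mu = math.exp(xi)
    s2 = math.log(1.0 + 1.0 / mu)             # log-normal eta with V[eta | x] = 1/mu
    eta = math.exp(rng.gauss(-0.5 * s2, math.sqrt(s2)))
    x.append(xi)
    y.append(mu * eta)

b0_hat, b1_hat = ppml_fit(x, y)
print(f"intercept = {b0_hat:.3f}, slope = {b1_hat:.3f}")
```

The estimates should be close to the values 0 and 1 used to generate the data, even though the dependent variable is not a count.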
Of course, if it were known that $V[y_i \mid x]$ is a function of higher powers of $E[y_i \mid x]$, a more efficient estimator could be obtained by downweighting even more the observations with large conditional mean. An example of such an estimator is the gamma PML estimator studied by Manning and Mullahy (2001), which, like the log-linearized model, assumes that $V[y_i \mid x]$ is proportional to $E[y_i \mid x]^2$. The first-order conditions for the gamma PML estimator are given by
$$\sum_{i=1}^{n} [y_i - \exp(x_i\check{\beta})] \exp(-x_i\check{\beta})\, x_i = 0.$$
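The three sets of first-order conditions differ only in the factor that multiplies $[y_i - \exp(x_i b)]\,x_i$. The sketch below tabulates that factor for a few values of the linear index; the function name `foc_weight` and the chosen index values are hypothetical.

```python
import math

def foc_weight(estimator, xb):
    """Factor multiplying [y_i - exp(x_i b)] x_i in each first-order condition."""
    if estimator == "nls":      # equation (8): weight rises with the mean
        return math.exp(xb)
    if estimator == "ppml":     # equation (9): equal weights
        return 1.0
    if estimator == "gamma":    # gamma PML: weight falls with the mean
        return math.exp(-xb)
    raise ValueError(f"unknown estimator: {estimator}")

for xb in (-2.0, 0.0, 2.0):
    weights = {name: foc_weight(name, xb) for name in ("nls", "ppml", "gamma")}
    print(f"x*b = {xb:+.1f}: " + ", ".join(f"{k}={v:.3f}" for k, v in weights.items()))
```

At a large conditional mean NLS weights an observation most heavily and gamma PML least, with PPML in between. The ordering reverses at a small conditional mean.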
In the case of trade data, however, this estimator may
have an important drawback. Trade data for larger countries
(as gauged by GDP per capita) tend to be of higher quality
(see Frankel & Wei, 1993; Frankel, 1997); hence, models
assuming that $V[y_i \mid x]$ is a function of higher powers of
$E[y_i \mid x]$ might give excessive weight to the observations that
13 The same strategy is implicitly used by Papke and Wooldridge (1996)
in their pseudo-maximum-likelihood estimator for fractional data models.
14 See Cameron and Trivedi (1998) and Winkelmann (2003) for more
details on the Poisson regression and on more general models for count
data.
are more prone to measurement errors.15 Therefore, the
Poisson regression emerges as a reasonable compromise,
giving less weight to the observations with larger variance
than the standard NLS estimator, without giving too much
weight to observations more prone to contamination by
measurement error and less informative about the curvature
of $E[y_i \mid x]$.16
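The sense in which Poisson regression is a compromise can be seen by placing the first-order conditions of the three PML estimators side by side. This is only a summary sketch: the stacked layout is added here for illustration, and $\beta$ stands for whichever estimate solves the corresponding equation.

```latex
\begin{aligned}
\text{NLS:}       &\quad \sum_{i=1}^{n} [y_i - \exp(x_i \beta)]\, \exp(x_i \beta)\, x_i = 0, \\
\text{PPML:}      &\quad \sum_{i=1}^{n} [y_i - \exp(x_i \beta)]\, x_i = 0, \\
\text{Gamma PML:} &\quad \sum_{i=1}^{n} [y_i - \exp(x_i \beta)]\, \exp(-x_i \beta)\, x_i = 0.
\end{aligned}
```

The three sets of equations differ only in the weight attached to each residual: $\exp(x_i \beta)$ for NLS, 1 for PPML, and $\exp(-x_i \beta)$ for the gamma PML. NLS therefore emphasizes observations with a large conditional mean, the gamma PML emphasizes those with a small conditional mean, and PPML weights all observations equally.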
B. Testing
In this subsection we consider tests for the particular
pattern of heteroskedasticity assumed by PML estimators,
focusing on the PPML estimator. Although PML estimators
are consistent even when the variance function is misspeci-
fied, the researcher can use these tests to check if a different
PML estimator would be more appropriate and to decide
whether or not the use of a nonparametric estimator of the
variance is warranted.
Manning and Mullahy (2001) suggested that if
$$V[y_i \mid x] = \lambda_0 E[y_i \mid x]^{\lambda_1}, \tag{10}$$
the choice of the appropriate PML estimator can be based on
a Park-type regression (Park, 1966). Their approach is based
on the idea that if equation (10) holds and an initial consistent
estimate of $E[y_i \mid x]$ is available, then $\lambda_1$ can be consistently
estimated using an appropriate auxiliary regression.
Specifically, following Park (1966), Manning and Mullahy
(2001) suggest that $\lambda_1$ can be estimated using the auxiliary
model
$$\ln (y_i - \breve{y}_i)^2 = \ln \lambda_0 + \lambda_1 \ln \breve{y}_i + v_i, \tag{11}$$
where $\breve{y}_i$ denotes the estimated value of $E[y_i \mid x]$. Unfortunately,
as the discussion in the previous sections should
have made clear, this approach based on the log-linearization
of equation (10) is valid only under very
restrictive conditions on the conditional distribution of $y_i$.
However, it is easy to see that this procedure is valid when
the constant-elasticity model can be consistently estimated
in the log linear form. Therefore, using equation (11) a test
for $H_0: \lambda_1 = 2$ based on a nonrobust covariance estimator
provides a check on the adequacy of the estimator based on
the log linear model.
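The Park-type regression in equation (11) is an ordinary bivariate regression, so it can be sketched in a few lines. The function name `park_regression` is hypothetical, the fitted values are assumed to come from some initial consistent estimator, and the standard error is the usual nonrobust OLS one, as the test of $\lambda_1 = 2$ requires. Observations with a zero residual must be excluded beforehand because their logarithm is undefined.

```python
import math


def park_regression(y, y_hat):
    """OLS of ln (y_i - y_hat_i)^2 on a constant and ln y_hat_i.

    Returns (ln_lambda0, lambda1, se_lambda1), where se_lambda1 is the
    conventional (nonrobust) OLS standard error of the slope.
    """
    z = [math.log((yi - mi) ** 2) for yi, mi in zip(y, y_hat)]
    w = [math.log(mi) for mi in y_hat]
    n = len(z)
    w_bar = sum(w) / n
    z_bar = sum(z) / n
    sww = sum((wi - w_bar) ** 2 for wi in w)
    swz = sum((wi - w_bar) * (zi - z_bar) for wi, zi in zip(w, z))
    slope = swz / sww
    intercept = z_bar - slope * w_bar
    # Residual variance with n - 2 degrees of freedom.
    ssr = sum((zi - intercept - slope * wi) ** 2 for wi, zi in zip(w, z))
    se_slope = math.sqrt(ssr / (n - 2) / sww)
    return intercept, slope, se_slope
```

A t-statistic for $H_0: \lambda_1 = 2$ is then `(slope - 2.0) / se_slope`.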
A more robust alternative, which is mentioned by Manning
and Mullahy (2001) in a footnote, is to estimate $\lambda_1$
from
15 Frankel and Wei (1993) and Frankel (1997) suggest that larger
countries should be given more weight in the estimation of gravity
equations. This would be appropriate if the errors in the model were just
the result of measurement errors in the dependent variable. However, if it
is accepted that the gravity equation does not hold exactly, measurement
errors account for only part of the dispersion of trade data around the
gravity equation.
16 It is worth noting that the PPML estimator can be easily adapted to
deal with endogenous regressors (Windmeijer & Santos Silva, 1997) and
panel data (Wooldridge, 1999). These extensions, however, are not pur-
sued here.
$$(y_i - \breve{y}_i)^2 = \lambda_0 (\breve{y}_i)^{\lambda_1} + \xi_i, \tag{12}$$
using an appropriate PML estimator. The approach based on
equation (12) is asymptotically valid, and inference about $\lambda_1$
can be based on the usual Eicker-White robust covariance
matrix estimator. For example, the hypothesis that $V[y_i \mid x]$ is
proportional to $E[y_i \mid x]$ is accepted if the appropriate confidence
interval for $\lambda_1$ contains 1. However, if the purpose is
to test the adequacy of a particular value of $\lambda_1$, a slightly
simpler method based on the Gauss-Newton regression (see
Davidson & MacKinnon, 1993) is available.
Specifically, to check the adequacy of the PPML for
which $\lambda_1 = 1$ and $\breve{y}_i = \exp(x_i \tilde{\beta})$, equation (12) can be
expanded in a Taylor series around $\lambda_1 = 1$, leading to
$$(y_i - \breve{y}_i)^2 = \lambda_0 \breve{y}_i + \lambda_0 (\lambda_1 - 1)(\ln \breve{y}_i)\, \breve{y}_i + \xi_i.$$
Now, the hypothesis that $V[y_i \mid x] \propto E[y_i \mid x]$ can be tested
against equation (10) simply by checking the significance of
the parameter $\lambda_0(\lambda_1 - 1)$. Because the error term $\xi_i$ is
unlikely to be homoskedastic, the estimation of the Gauss-Newton
regression should be performed using weighted
least squares. Assuming that in equation (12) the variance is
also proportional to the mean, the appropriate weights are
given by $\exp(-x_i \tilde{\beta})$, and therefore the test can be performed
by estimating
$$(y_i - \breve{y}_i)^2 / \sqrt{\breve{y}_i} = \lambda_0 \sqrt{\breve{y}_i} + \lambda_0 (\lambda_1 - 1)(\ln \breve{y}_i) \sqrt{\breve{y}_i} + \xi_i^{*} \tag{13}$$
by OLS and testing the statistical significance of $\lambda_0(\lambda_1 - 1)$
using an Eicker-White robust covariance matrix estimator.17
In the next section, a small simulation is used to study the
Gauss-Newton regression test for the hypothesis that $V[y_i \mid x] \propto E[y_i \mid x]$,
as well as the Park-type test for the hypothesis
that the constant-elasticity model can be consistently esti-
mated in the log linear form.
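The Gauss-Newton regression in equation (13) has two regressors and no intercept, so the test can be sketched without any matrix library. The function name `gnr_test` is hypothetical, the fitted values are assumed to come from a PPML fit, and the covariance matrix is the basic Eicker-White (HC0) form without small-sample corrections.

```python
import math


def gnr_test(y, y_hat):
    """Estimate regression (13) by OLS and return (coef, robust_se, t_stat)
    for the coefficient lambda0 * (lambda1 - 1) on ln(y_hat) * sqrt(y_hat)."""
    z, r1, r2 = [], [], []
    for yi, mi in zip(y, y_hat):
        root = math.sqrt(mi)
        z.append((yi - mi) ** 2 / root)   # weighted squared residual
        r1.append(root)                   # regressor for lambda0
        r2.append(math.log(mi) * root)    # regressor for lambda0*(lambda1-1)
    # Cross products X'X and X'z for the two regressors (no intercept).
    a11 = sum(u * u for u in r1)
    a12 = sum(u * v for u, v in zip(r1, r2))
    a22 = sum(v * v for v in r2)
    b1 = sum(u * t for u, t in zip(r1, z))
    b2 = sum(v * t for v, t in zip(r2, z))
    det = a11 * a22 - a12 * a12
    i11, i12, i22 = a22 / det, -a12 / det, a11 / det  # inverse of X'X
    c1 = i11 * b1 + i12 * b2
    c2 = i12 * b1 + i22 * b2
    # Middle matrix of the sandwich: sum of e_i^2 x_i x_i'.
    m11 = m12 = m22 = 0.0
    for u, v, t in zip(r1, r2, z):
        e2 = (t - c1 * u - c2 * v) ** 2
        m11 += e2 * u * u
        m12 += e2 * u * v
        m22 += e2 * v * v
    # (2,2) element of (X'X)^-1 M (X'X)^-1, guarded against rounding error.
    var22 = i12 * (m11 * i12 + m12 * i22) + i22 * (m12 * i12 + m22 * i22)
    se = math.sqrt(max(var22, 0.0))
    t_stat = c2 / se if se > 0 else float("inf")
    return c2, se, t_stat
```

Under the null hypothesis that the variance is proportional to the mean, the returned coefficient should be statistically indistinguishable from zero.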
IV. A Simulation Study
This section reports the results of a small simulation
study designed to assess the performance of different meth-
ods to estimate constant-elasticity models in the presence of
heteroskedasticity and rounding errors. As a by-product, we
also obtain some evidence on the finite-sample performance
of the specification tests presented above. These experi-
ments are centered around the following multiplicative
model:
17 Notice that to test $V[y_i \mid x] \propto E[y_i \mid x]$ against alternatives of the form
$V[y_i \mid x] = \lambda_0 \exp[x_i(\beta + \lambda)]$, the appropriate auxiliary regression would
be
$$(y_i - \breve{y}_i)^2 / \sqrt{\breve{y}_i} = \lambda_0 \sqrt{\breve{y}_i} + \lambda_0 \lambda x_i \sqrt{\breve{y}_i} + \xi_i^{*},$$
and the test could be performed by checking the joint significance of the
elements of $\lambda_0 \lambda$. If the model includes a constant, one of the regressors in
the auxiliary regression is redundant and should be dropped.
$$E[y_i \mid x] = \mu(x_i \beta) = \exp(\beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}), \quad i = 1, \ldots, 1000. \tag{14}$$
Because, in practice, regression models often include a
mixture of continuous and dummy variables, we replicate
this feature in our experiments: $x_{1i}$ is drawn from a standard
normal, and $x_2$ is a binary dummy variable that equals 1 with
a probability of 0.4.18 The two covariates are independent,
and a new set of observations of all variables is generated in
each replication using $\beta_0 = 0$, $\beta_1 = \beta_2 = 1$. Data on $y$ are
generated as
$$y_i = \mu(x_i \beta)\, \eta_i, \tag{15}$$
where $\eta_i$ is a log normal random variable with mean 1 and
variance $\sigma_i^2$. As noted before, the slope parameters in equation
(14) can be estimated using the log linear form of the
model only when $\sigma_i^2$ is constant, that is, when $V[y_i \mid x]$ is
proportional to $\mu(x_i \beta)^2$.
In these experiments we analyzed PML estimators of the
multiplicative model and different estimators of the log-
linearized model. The consistent PML estimators studied
were: NLS, gamma pseudo-maximum-likelihood (GPML),
and PPML. Besides these estimators, we also considered the
standard OLS estimator of the log linear model (here called
simply OLS); the OLS estimator for the model where the
dependent variable is $y_i + 1$ [OLS($y + 1$)]; a truncated OLS
estimator to be discussed below; and the threshold tobit of
Eaton and Tamura (1994) (ET-tobit).19
To assess the performance of the estimators under different
patterns of heteroskedasticity, we considered the four
following specifications of $\sigma_i^2$:
Case 1: $\sigma_i^2 = \mu(x_i \beta)^{-2}$; $V[y_i \mid x] = 1$.
Case 2: $\sigma_i^2 = \mu(x_i \beta)^{-1}$; $V[y_i \mid x] = \mu(x_i \beta)$.
Case 3: $\sigma_i^2 = 1$; $V[y_i \mid x] = \mu(x_i \beta)^2$.
Case 4: $\sigma_i^2 = \mu(x_i \beta)^{-1} + \exp(x_{2i})$; $V[y_i \mid x] = \mu(x_i \beta) + \exp(x_{2i})\, \mu(x_i \beta)^2$.
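The data-generating process in equations (14) and (15) under the four cases can be sketched as follows. The function names `sigma2` and `draw_sample` are hypothetical, and the log normal draw uses the standard parameterization in which $\ln \eta_i$ is normal with variance $\ln(1 + \sigma_i^2)$ and mean $-\frac{1}{2}\ln(1 + \sigma_i^2)$, which gives $\eta_i$ mean 1 and variance $\sigma_i^2$. This is an illustration of the design, not the code used to produce the reported results.

```python
import math
import random


def sigma2(case, mu, x2):
    """Variance of eta_i under the four heteroskedasticity patterns."""
    if case == 1:
        return mu ** -2
    if case == 2:
        return mu ** -1
    if case == 3:
        return 1.0
    if case == 4:
        return mu ** -1 + math.exp(x2)
    raise ValueError("case must be 1, 2, 3, or 4")


def draw_sample(case, n=1000, rng=None):
    """Draw n observations (x1, x2, y) with beta0 = 0, beta1 = beta2 = 1."""
    if rng is None:
        rng = random.Random()
    sample = []
    for _ in range(n):
        x1 = rng.gauss(0.0, 1.0)                  # continuous regressor
        x2 = 1.0 if rng.random() < 0.4 else 0.0   # dummy, P(x2 = 1) = 0.4
        mu = math.exp(0.0 + 1.0 * x1 + 1.0 * x2)  # conditional mean (14)
        v = math.log(1.0 + sigma2(case, mu, x2))  # variance of ln(eta)
        eta = math.exp(rng.gauss(-0.5 * v, math.sqrt(v)))
        sample.append((x1, x2, mu * eta))         # equation (15)
    return sample
```

Multiplying $\sigma_i^2$ by $\mu(x_i \beta)^2$ reproduces the conditional variances listed for each case; for instance, case 1 gives $\mu(x_i \beta)^{-2} \mu(x_i \beta)^2 = 1$.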
In case 1 the variance of εi is constant, implying that the
NLS estimator is optimal. Although, as argued before, this
case is unrealistic for models of bilateral trade, it is included
in the simulations for completeness. In case 2, the condi-
tional variance of yi equals its conditional mean, as in the
Poisson distribution. The pseudo-maximum-likelihood esti-
mator based on the Poisson distribution is optimal in this
situation. Case 3 is the special case in which OLS estimation
of the log linear model is consistent for the slope parameters
of equation (14). Moreover, in this case the log linear model
18 For example, in gravity equations, continuous variables (which are all
strictly positive) include income and geographical distance. In equation
(14), x1 can be interpreted as (the logarithm of) one of these variables.
Examples of binary variables include dummies for free-trade agreements,
common language, colonial ties, contiguity, and access to water.
19 We also studied the performance of other variants of the tobit model,
finding very poor results.
not only corrects the heteroskedasticity in the data, but,
because $\eta_i$ is log normal, it is also the maximum likelihood
estimator. The GPML is the optimal PML estimator in this
case, but it should be outperformed by the true maximum
likelihood estimator. Finally, case 4 is the only one in which
the conditional variance does not depend exclusively on the
mean. The variance is a quadratic function of the mean, as
in case 3, but it is not proportional to the square of the mean.
We carried out two sets of experiments. The first set was
aimed at studying the performance of the estimators of the
multiplicative and the log linear models under different
patterns of heteroskedasticity. In order to study the effect of
the truncation on the performance of the OLS, and given
that this data-generating mechanism does not produce observations
with $y_i = 0$, the log linear model was also
estimated using only the observations for which $y_i > 0.5$
[OLS($y > 0.5$)]. This reduces the sample size by approximately
25% to 35%, depending on the pattern of heteroskedasticity.
The estimation of the threshold tobit was also
performed using this dependent variable. Notice that, al-
though the dependent variable has to cross a threshold to be
observable, the truncation mechanism used here is not equal
to the one assumed by Eaton and Tamura (1994). Therefore,
in all these experiments the ET-tobit will be slightly mis-
specified and the results presented here should be viewed as
a check of its robustness to this problem.
The second set of experiments studied the estimators’
performance in the presence of rounding errors in the
dependent variable. For that purpose, a new random vari-
able was generated by rounding to the nearest integer the
values of yi obtained in the first set of simulations. This
procedure mimics the rounding errors in official statistics
and generates a large number of zeros, a typical feature of
trade data. Because the model considered here generates a
large proportion of observations close to zero, rounding
down is much more frequent than rounding up. As the
probability of rounding up or down depends on the covari-
ates, this procedure will necessarily bias the estimates, as
discussed before. The purpose of the study is to gauge the
magnitude of these biases. Naturally, the log linear model
cannot be estimated in these conditions, because the depen-
dent variable equals 0 for some observations. Following
what is the usual practice in these circumstances, the trun-
cated OLS estimation of the log-linear model was per-
formed dropping the observations for which the dependent
variable equals 0. Notice that the observations discarded
with this procedure are exactly the same as those discarded
by OLS($y > 0.5$) in the first set of experiments. Therefore,
this estimator is also denoted OLS($y > 0.5$).
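The equivalence between the two truncation rules can be checked directly. The sketch below uses hypothetical function names and rounds halves upward, an arbitrary choice that matters only for a value of exactly 0.5, which occurs with probability zero for a continuous $y_i$.

```python
import math


def round_trade(values):
    """Round each value to the nearest integer (halves rounded up)."""
    return [math.floor(v + 0.5) for v in values]


def truncated_sample(values, threshold=0.5):
    """Observations kept by OLS(y > 0.5) in the first set of experiments."""
    return [v for v in values if v > threshold]


def positive_after_rounding(values):
    """Observations kept when zeros of the rounded variable are dropped."""
    return [v for v in values if math.floor(v + 0.5) > 0]
```

For any sample without ties at 0.5, `truncated_sample` and `positive_after_rounding` select the same observations, which is why the two estimators share the label OLS($y > 0.5$).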
The results of the two sets of experiments are summa-
rized in table 1, which displays the biases and standard
errors of the different estimators of $\beta$ obtained with 10,000
replicas of the simulation procedure described above. Only
results for $\beta_1$ and $\beta_2$ are presented, as these are generally the
parameters of interest.
TABLE 1. SIMULATION RESULTS UNDER DIFFERENT FORMS OF HETEROSKEDASTICITY

                Results Without Rounding Error        Results with Rounding Error
                $\beta_1$          $\beta_2$          $\beta_1$          $\beta_2$
Estimator            Bias    S.E.       Bias    S.E.       Bias    S.E.       Bias    S.E.

Case 1: $V[y_i \mid x] = 1$
PPML             -0.00004   0.016    0.00009   0.027    0.01886   0.017    0.02032   0.029
NLS              -0.00006   0.008   -0.00003   0.017    0.00195   0.008    0.00274   0.018
GPML              0.01276   0.068    0.00754   0.082    0.10946   0.096    0.09338   0.108
OLS               0.39008   0.039    0.35568   0.054       n.a.    n.a.       n.a.    n.a.
ET-tobit         -0.47855   0.030   -0.47786   0.032   -0.49981   0.030   -0.49968   0.032
OLS(y > 0.5)     -0.16402   0.027   -0.15487   0.038   -0.22121   0.026   -0.21339   0.036
OLS(y + 1)       -0.40237   0.014   -0.37683   0.022   -0.37752   0.015   -0.34997   0.024

Case 2: $V[y_i \mid x] = \mu(x_i \beta)$
PPML             -0.00011   0.019    0.00009   0.039    0.02190   0.020    0.02334   0.041
NLS               0.00046   0.033    0.00066   0.057    0.00262   0.033    0.00360   0.057
GPML              0.00376   0.043    0.00211   0.062    0.13243   0.073    0.11331   0.087
OLS               0.21076   0.030    0.19960   0.049       n.a.    n.a.       n.a.    n.a.
ET-tobit         -0.42394   0.028   -0.42316   0.033   -0.45518   0.028   -0.45513   0.033
OLS(y > 0.5)     -0.17868   0.026   -0.17220   0.043   -0.24405   0.026   -0.23889   0.040
OLS(y + 1)       -0.42371   0.015   -0.39931   0.025   -0.39401   0.016   -0.36806   0.028

Case 3: $V[y_i \mid x] = \mu(x_i \beta)^2$
PPML             -0.00526   0.091   -0.00228   0.130    0.02332   0.091    0.02812   0.133
NLS               0.23539   3.066    0.07323   1.521    0.23959   3.082    0.07852   1.521
GPML             -0.00047   0.041   -0.00029   0.083    0.17134   0.068    0.14442   0.104
OLS               0.00015   0.032   -0.00003   0.064       n.a.    n.a.       n.a.    n.a.
ET-tobit         -0.31908   0.044   -0.32161   0.058   -0.36480   0.043   -0.36789   0.056
OLS(y > 0.5)     -0.34480   0.039   -0.34614   0.064   -0.41006   0.037   -0.41200   0.060
OLS(y + 1)       -0.51804   0.021   -0.50000   0.038   -0.48564   0.022   -0.46597   0.040

Case 4: $V[y_i \mid x] = \mu(x_i \beta) + \exp(x_{2i})\, \mu(x_i \beta)^2$
PPML             -0.00696   0.103   -0.00647   0.144    0.02027   0.104    0.01856   0.146
NLS               0.35139   7.516    0.08801   1.827    0.35672   7.521    0.09239   1.829
GPML              0.00322   0.057   -0.00137   0.083    0.12831   0.085    0.10245   0.129
OLS               0.13270   0.039   -0.12542   0.075       n.a.    n.a.       n.a.    n.a.
ET-tobit         -0.29908   0.049   -0.42731   0.063   -0.34351   0.047   -0.46225   0.060
OLS(y > 0.5)     -0.39217   0.042   -0.41391   0.070   -0.45188   0.040   -0.46173   0.066
OLS(y + 1)       -0.51440   0.021   -0.58087   0.041   -0.48627   0.022   -0.56039   0.044
As expected, OLS only performs well in case 3. In all
other cases this estimator is clearly inadequate because,
despite its low dispersion, it is often badly biased. More-
over, the sign and magnitude of the bias vary considerably.
Therefore, even when the dependent variable is strictly
positive, estimation of constant-elasticity models using the
log-linearized model cannot generally be recommended. As
for the modifications of the log-linearized model designed
to deal with the zeros of the dependent variable (ET-tobit,
OLS($y + 1$), and OLS($y > 0.5$)), their performance is also
very disappointing. These results clearly emphasize the
need to use adequate methods to deal with the zeros in the
data and raise serious doubts about the validity of the results
obtained using the traditional estimators based on the log
linear model. Overall, except under very special circum-
stances, estimation based on the log-linear model cannot be
recommended.
One remarkable result of this set of experiments is the
extremely poor performance of the NLS estimator. Indeed,
when the heteroskedasticity is more severe (cases 3 and 4),
this estimator, despite being consistent, leads to very poor
results because of its erratic behavior.20 Therefore, it is clear
that the loss of efficiency caused by some of the forms of
heteroskedasticity considered in these experiments is strong
enough to render this estimator useless in practice.
In the first set of experiments, the results of the gamma
PML estimator are very good. Indeed, when no measure-
ment error is present, the biases and standard errors of the
GPML estimator are always among the lowest. However,
this estimator is very sensitive to the form of measurement
error considered in the second set of experiments, consis-
tently leading to sizable biases. These results, like those of
the NLS, clearly illustrate the danger of using a PML
estimator that gives extra weight to the noisier observations.
As for the performance of the Poisson PML estimator, the
results are very encouraging. In fact, when no rounding
error is present, its performance is reasonably good in all
20 Manning and Mullahy (2001) report similar results.
cases. Moreover, although some loss of efficiency is notice-
able as one moves away from case 2, in which it is an
optimal estimator,
the biases of the PPML are always
small.21 Moreover, the results obtained with rounded data
suggest that the Poisson-based PML estimator is relatively
robust to this form of measurement error of the dependent
variable. Indeed, the bias introduced by the rounding-off
errors in the dependent variable is relatively small, and in
some cases it even compensates for the bias found in the first
set of experiments. Therefore, because it is simple to im-
plement and reliable in a wide variety of situations, the
Poisson PML estimator has the essential characteristics
needed to make it the new workhorse for the estimation of
constant-elasticity models.
Obviously, the sign and magnitude of the bias of the
estimators studied here depend on the particular specifica-
tion considered. Therefore, the results of these experiments
cannot serve as an indicator of what can be expected in
other situations. However, it is clear that, apart from the
Poisson PML method, all estimators will often be very
misleading.
These experiments were also used to study the finite-sample
performance of the Gauss-Newton regression
(GNR) test for the adequacy of the Poisson PML based on
equation (13) and of the Park test advocated by Manning
and Mullahy (2001), which, as explained above, is valid
only to check for the adequacy of the estimator based on the
log linear model.22 Given that the Poisson PML estimator is
the only estimator with a reasonable behavior under all the
cases considered, these tests were performed using residuals
and estimates of $\mu(x_i \beta)$ from the Poisson regression. Table
2 contains the rejection frequencies of the null hypothesis at
the 5% nominal level for both tests in the four cases
considered in the two sets of experiments. In this table the
rejection frequencies under the null hypothesis are given in
bold.
Inasmuch as both tests have adequate behavior under the
null and reveal reasonable power against a wide range of
alternatives, the results suggest that these tests are important
tools to assess the adequacy of the standard OLS estimator
of the log linear model and of the proposed Poisson PML
estimator.
V. The Gravity Equation
In this section, we use the PPML estimator to quantita-
tively assess the determinants of bilateral trade flows, un-
covering significant differences in the roles of various
measures of size and distance from those predicted by the
21 These results are in line with those reported by Manning and Mullahy
(2001).
22 To illustrate the pitfalls of the procedure suggested by Manning and
Mullahy (2001), we note that the means of the estimates of $\lambda_1$ obtained
using equation (11) in cases 1, 2, and 3 (without measurement error) were
0.58955, 1.29821, and 1.98705, whereas the true values of $\lambda_1$ in these
cases are, respectively, 0, 1, and 2.
TABLE 2. REJECTION FREQUENCIES AT THE 5% LEVEL FOR THE TWO SPECIFICATION TESTS

                                   Frequency
Case                         GNR Test    Park Test

Without Measurement Error
1                            0.91980     1.00000
2                            0.05430     1.00000
3                            0.58110     0.06680
4                            0.49100     0.40810

With Measurement Error
1                            0.91740     1.00000
2                            0.14980     1.00000
3                            0.57170     1.00000
4                            0.47580     1.00000

Note: The null hypothesis holds in case 2 for the GNR test and in case 3 for the Park test.
logarithmic tradition. We perform the comparison of the two
techniques using both the traditional and the Anderson–van
Wincoop (2003) specifications of the gravity equation.
For the sake of completeness, we also compare the PPML
estimates with those obtained from alternative ways re-
searchers have used to deal with zero values for trade. In
particular, we present the results obtained with the tobit
estimator used in Eaton and Tamura (1994), the OLS estimator
applied to $\ln(1 + T_{ij})$, and a standard nonlinear least squares
estimator. The results obtained with these estimators are
presented for both the traditional and the Anderson–van
Wincoop specifications.
A. The Data
The analysis covers a cross section of 136 countries in
1990. Hence, our data set consists of 18,360 observations of
bilateral export flows ($136 \times 135$ country pairs). The list of
countries is reported in table A1 in the appendix. Informa-
tion on bilateral exports comes from Feenstra et al. (1997).
Data on real GDP per capita and population come from the
World Bank’s (2002) World Development Indicators. Data
on location and dummies indicating contiguity, common
language (official and second languages), colonial ties (di-
rect and indirect links), and access to water are constructed
from the CIA’s (2002) World Factbook. The data on language
and colonial links are presented in tables A2 and A3
in the appendix.23 Bilateral distance is computed using the
great circle distance algorithm provided by Andrew Gray
(2001). Remoteness—or relative distance—is calculated as
the (log of) GDP-weighted average distance to all other
countries (see Wei, 1996). Finally, information on preferen-
tial trade agreements comes from Frankel (1997), comple-
mented with data from the World Trade Organization. The
list of preferential trade agreements (and stronger forms of
trade agreements) considered in the analysis is displayed in
table A4 in the appendix. Table A5 in the appendix provides
23 Alternative estimates based on Boisso and Ferrantino’s (1997) index
of language similarity are available on request from the authors.
TABLE 3. THE TRADITIONAL GRAVITY EQUATION

Estimator:                       OLS         OLS            Tobit          NLS         PPML        PPML
Dependent Variable:              ln(T_ij)    ln(1 + T_ij)   ln(a + T_ij)   T_ij        T_ij > 0    T_ij

Log exporter’s GDP               0.938**     1.128**        1.058**        0.738**     0.721**     0.733**
                                 (0.012)     (0.011)        (0.012)        (0.038)     (0.027)     (0.027)
Log importer’s GDP               0.798**     0.866**        0.847**        0.862**     0.732**     0.741**
                                 (0.012)     (0.012)        (0.011)        (0.041)     (0.028)     (0.027)
Log exporter’s GDP per capita    0.207**     0.277**        0.227**        0.396**     0.154**     0.157**
                                 (0.017)     (0.018)        (0.015)        (0.116)     (0.053)     (0.053)
Log importer’s GDP per capita    0.106**     0.217**        0.178**        -0.033      0.133**     0.135**
                                 (0.018)     (0.018)        (0.015)        (0.062)     (0.044)     (0.045)
Log distance                     -1.166**    -1.151**       -1.160**       -0.924**    -0.776**    -0.784**
                                 (0.034)     (0.040)        (0.034)        (0.072)     (0.055)     (0.055)
Contiguity dummy                 0.314*      -0.241         -0.225         -0.081      0.202       0.193
                                 (0.127)     (0.201)        (0.152)        (0.100)     (0.105)     (0.104)
Common-language dummy            0.678**     0.742**        0.759**        0.689**     0.752**     0.746**
                                 (0.067)     (0.067)        (0.060)        (0.085)     (0.134)     (0.135)
Colonial-tie dummy               0.397**     0.392**        0.416**        0.036       0.019       0.024
                                 (0.070)     (0.070)        (0.063)        (0.125)     (0.150)     (0.150)
Landlocked-exporter dummy        -0.062      0.106*         -0.038         -1.367**    -0.873**    -0.864**
                                 (0.062)     (0.054)        (0.052)        (0.202)     (0.157)     (0.157)
Landlocked-importer dummy        -0.665**    -0.278**       -0.479**       -0.471**    -0.704**    -0.697**
                                 (0.060)     (0.055)        (0.051)        (0.184)     (0.141)     (0.141)
Exporter’s remoteness            0.467**     0.526**        0.563**        1.188**     0.647**     0.660**
                                 (0.079)     (0.087)        (0.068)        (0.182)     (0.135)     (0.134)
Importer’s remoteness            -0.205*     -0.109         -0.032         1.010**     0.549**     0.561**
                                 (0.085)     (0.091)        (0.073)        (0.154)     (0.120)     (0.118)
Free-trade agreement dummy       0.491**     1.289**        0.729**        0.443**     0.179*      0.181*
                                 (0.097)     (0.124)        (0.103)        (0.109)     (0.090)     (0.088)
Openness                         -0.170**    0.739**        0.310**        0.928**     -0.139      -0.107
                                 (0.053)     (0.050)        (0.045)        (0.191)     (0.133)     (0.131)
Observations                     9613        18360          18360          18360       9613        18360
RESET test p-values              0.000       0.000          0.204          0.000       0.941       0.331
a description of the variables and displays the summary
statistics.
B. Results
The Traditional Gravity Equation: Table 3 presents the
estimation outcomes resulting from the various techniques
for the traditional gravity equation. The first column reports
OLS estimates using the logarithm of exports as the depen-
dent variable; as noted before, this regression leaves out
pairs of countries with zero bilateral trade (only 9,613
country pairs, or 52% of the sample, exhibit positive export
flows).
The second column reports the OLS estimates using
$\ln(1 + T_{ij})$ as dependent variable, as a way of dealing with
zeros. The third column presents tobit estimates based on
Eaton and Tamura (1994). The fourth column shows the
results of standard NLS. The fifth column reports Poisson
estimates using only the subsample of positive-trade pairs.
Finally, the sixth column shows the Poisson results for the
whole sample (including zero-trade pairs).
The first point to notice is that PPML-estimated coeffi-
cients are remarkably similar using the whole sample and
using the positive-trade subsample.24 However, most coef-
ficients differ—oftentimes significantly—from those ob-
tained using OLS. This suggests that in this case, heteroske-
dasticity (rather than truncation) is responsible for the
differences between PPML results and those of OLS using
only the observations with positive exports. Further evi-
dence on the importance of the heteroskedasticity is pro-
vided by the two-degrees-of-freedom special case of
White’s test for heteroskedasticity (see Wooldridge, 2002, p.
127), which leads to a test statistic of 476.6 and to a p-value
of 0. That is, the null hypothesis of homoskedastic errors is
unequivocally rejected.
Poisson estimates reveal that the coefficients on import-
er’s and exporter’s GDPs in the traditional equation are not,
as generally believed, close to 1. The estimated GDP elasticities
are just above 0.7 (s.e. = 0.03). OLS generates
significantly larger estimates, especially on exporter’s GDP
(0.94, s.e. = 0.01). Although all these results are conditional
on the particular specification used,25 it is worth pointing out
that unit income elasticities in the simple gravity framework
are at odds with the observation that the trade-to-GDP ratio
decreases with increasing total GDP, or, in other words, that
smaller countries tend to be more open to international
trade.26
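The link between the income elasticity and openness can be made explicit with a short calculation. This is a sketch under a simplifying assumption: a stripped-down gravity equation $T_{ij} = \alpha_0 Y_i^{\alpha_1} Y_j^{\alpha_2} D_{ij}^{\alpha_3}$ with all other regressors suppressed, where the $\alpha$ symbols are used here only for illustration.

```latex
\frac{\sum_{j} T_{ij}}{Y_i}
  = Y_i^{\alpha_1 - 1} \sum_{j} \alpha_0 Y_j^{\alpha_2} D_{ij}^{\alpha_3}
```

Holding the partner term fixed, the export-to-GDP ratio does not vary with the exporter's own GDP when $\alpha_1 = 1$, whereas an elasticity of about 0.7 implies that the ratio falls with an elasticity of about $-0.3$ as GDP grows.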
24 The reason why truncation has little effect
in this case is that
observations with zero trade correspond to pairs for which the estimated
value of trade is close to zero. Therefore, the corresponding residuals are
also close to zero, and their elimination from the sample has little effect.
25 This result holds when one looks at the subsample of OECD countries.
It is also robust to the exclusion of GDP per capita from the regressions.
26 Note also that PPML predicts almost equal coefficients for the GDPs
of exporters and importers.
TABLE 4. RESULTS OF THE TESTS FOR TYPE OF HETEROSKEDASTICITY (p-VALUES)

Test (Null Hypothesis)                            Exports > 0    Full Sample
GNR ($V[y_i \mid x] \propto \mu(x_i \beta)$)      0.144          0.115
Park (OLS is valid)                               0.000          0.000
The role of geographical distance as trade deterrent is
significantly larger under OLS; the estimated elasticity is
−1.17 (s.e. = 0.03), whereas the Poisson estimate is −0.78
(s.e. = 0.06). This lower estimate suggests a smaller role for
transport costs in the determination of trade patterns. Fur-
thermore, Poisson estimates indicate that, after controlling
for bilateral distance, sharing a border does not influence
trade flows, whereas OLS, instead, generates a substantial
effect: It predicts that trade between two contiguous coun-
tries is 37% larger than trade between countries that do not
share a border.27
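The percentage effects quoted for dummy variables follow from the formula $(e^{b_i} - 1) \times 100\%$. A minimal sketch, with the hypothetical function name `percent_effect`, is:

```python
import math


def percent_effect(coefficient):
    """Percentage change in expected trade implied by a dummy coefficient."""
    return (math.exp(coefficient) - 1.0) * 100.0
```

Applied to the OLS contiguity coefficient in table 3, `percent_effect(0.314)` gives roughly 37%, the figure quoted above.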
We control for remoteness to allow for the hypothesis that
larger distances to all other countries might increase bilateral
trade between two countries.28 Poisson regressions
support this hypothesis, whereas OLS estimates suggest that
only exporter’s remoteness increases bilateral flows be-
tween two given countries. Access to water appears to be
important for trade flows, according to Poisson regressions;
the negative coefficients on the land-locked dummies can be
interpreted as an indication that ocean transportation is
significantly cheaper. In contrast, OLS results suggest that
whether or not the exporter is landlocked does not influence
trade flows, whereas a landlocked importer experiences
lower trade. (These asymmetries in the effects of remote-
ness and access to water for importers and exporters are
hard to interpret.) We also explore the role of colonial
heritage, obtaining, as before, significant discrepancies:
Poisson regressions indicate that colonial ties play no role in
determining trade flows, once a dummy variable for com-
mon language is introduced. OLS regressions, instead, gen-
erate a sizeable effect (countries with a common colonial
past trade almost 45% more than other pairs). Language is
statistically and economically significant under both estima-
tion procedures.
Strikingly, in the traditional gravity equation, preferential
trade agreements play a much smaller (although still substantial)
role according to Poisson regressions. OLS estimates
suggest that preferential trade agreements raise
expected bilateral trade by 63%, whereas Poisson estimates
indicate an average enhancement below 20%.
27 The formula to compute this effect is $(e^{b_i} - 1) \times 100\%$, where $b_i$ is
the estimated coefficient.
28 To illustrate the role of remoteness, consider two pairs of countries, (i,
j) and (k, l), and assume that the distance between the countries in each
pair is the same ($D_{ij} = D_{kl}$), but i and j are closer to other countries. In this
case, the most remote countries, k and l, will tend to trade more between
each other because they do not have alternative trading partners. See
Deardorff (1998).
Preferential trade agreements might also cause trade diversion;
if this is the case, the coefficient on the trade-agreement
dummy will not reflect the net effect of trade
agreements. To account for the possibility of diversion, we
include an additional dummy, openness, similar to that used
by Frankel (1997). This dummy takes the value 1 whenever
one (or both) of the countries in the pair is part of a
preferential trade agreement, and thus it captures the extent
of trade between members and nonmembers of a preferen-
tial trade agreement. The sum of the coefficients on the trade
agreement and the openness dummies gives the net creation
effect of trade agreements. OLS suggests that trade destruc-
tion comes from trade agreements. Still, the net creation
effect is around 40%. In contrast, Poisson regressions pro-
vide no significant evidence of trade diversion, although the
point estimates are of the same order of magnitude under
both methods.
Hence, even when allowing for trade diversion effects, on
average, the Poisson method estimates a smaller effect of
preferential trade agreements on trade, approximately half
of that indicated by OLS. The contrast in estimates suggests
that the biases generated by standard regressions can be
substantial, leading to misleading inferences and, perhaps,
erroneous policy decisions.
We now turn briefly to the results of the other estimation
methods. OLS on ln(1 + Tij) and tobit give very close
estimates for most coefficients. Like OLS, they yield large
estimates for the elasticity of bilateral trade with respect to
distance. Unlike OLS, however, they produce insignificant
coefficients for the contiguity dummy. They both generate
extremely large and statistically significant coefficients for
the trade-agreement dummy. The first method predicts that
trade between two countries that have signed a trade agree-
ment is on average 266% larger than that between countries
without an agreement. The second predicts that trade be-
tween countries in such agreements is on average 100%
larger. NLS tends to generate somewhat different estimates.
The elasticity of trade with respect to the exporter’s GDP is
significantly smaller than with OLS, but the corresponding
elasticity with respect to importer’s GDP is significantly
larger than with OLS. The estimated distance elasticity is
smaller than with OLS and bigger than with Poisson. Like
the other methods, NLS predicts a significant and large
effect for free-trade agreements.
It is noteworthy that all methods, except the PPML, lead
to puzzling asymmetries in the elasticities with respect to
importer and exporter characteristics (especially remoteness
and access to water).
To check the adequacy of the estimated models, we
performed a heteroskedasticity-robust RESET test (Ramsey,
1969). This is essentially a test for the correct specification
of the conditional expectation, which is performed by
checking the significance of an additional regressor
constructed as $(x'b)^2$, where b denotes the vector of estimated
parameters. The corresponding p-values are reported at the
bottom of table 3. In the OLS regression, the test rejects the
hypothesis that the coefficient on the test variable is 0. This
means that the model estimated using the logarithmic specification
is inappropriate. A similar result is found for the OLS estimated
using ln(1 + Tij) as the dependent variable and for the NLS. In
contrast, the models estimated using the Poisson regressions pass
the RESET test, that is, the RESET test provides no evidence of
misspecification of the gravity equations estimated using the PPML.
With this particular specification, the model estimated using tobit
also passes the test for the traditional gravity equation.

TABLE 5.—THE ANDERSON–VAN WINCOOP GRAVITY EQUATION

Estimator:                   OLS        OLS          Tobit        NLS        PPML       PPML
Dependent variable:          ln(Tij)    ln(1 + Tij)  ln(a + Tij)  Tij        Tij > 0    Tij

Log distance                 −1.347**   −1.332**     −1.272**     −0.582**   −0.770**   −0.750**
                             (0.031)    (0.036)      (0.029)      (0.088)    (0.042)    (0.041)
Contiguity dummy             0.174      −0.399*      −0.253       0.458**    0.352**    0.370**
                             (0.130)    (0.189)      (0.135)      (0.121)    (0.090)    (0.091)
Common-language dummy        0.406**    0.550**      0.485**      0.926**    0.418**    0.383**
                             (0.068)    (0.066)      (0.057)      (0.116)    (0.094)    (0.093)
Colonial-tie dummy           0.666**    0.693**      0.650**      −0.736**   0.038      0.079
                             (0.070)    (0.067)      (0.059)      (0.178)    (0.134)    (0.134)
Free-trade agreement dummy   0.310**    0.174        0.137**      1.017**    0.374**    0.376**
                             (0.098)    (0.138)      (0.098)      (0.170)    (0.076)    (0.077)
Fixed effects                Yes        Yes          Yes          Yes        Yes        Yes
Observations                 9613       18360        18360        18360      9613       18360
RESET test p-values          0.000      0.000        0.000        0.000      0.564      0.112

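As a rough illustration of the mechanics of the RESET check described above, the following sketch fits a linear model by OLS, adds the squared fitted value (x'b)^2 as an extra regressor, and computes an Eicker-White robust t statistic for its coefficient. It is not the authors' code: the helper names (`solve`, `ols`, `reset_test`), the simulated data, and the normal approximation used for the p-value are all assumptions made for this example, and the sketch covers only a linear regression, not the PPML fit to which the paper also applies the test.

```python
import math
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(X, y):
    """Return the OLS coefficients and the cross-product matrix X'X."""
    k = len(X[0])
    XtX = [[sum(row[a] * row[c] for row in X) for c in range(k)] for a in range(k)]
    Xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    return solve(XtX, Xty), XtX


def reset_test(X, y):
    """Robust RESET: t statistic and two-sided p-value for the added regressor (x'b)^2."""
    b, _ = ols(X, y)
    fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    Z = [row + [f * f] for row, f in zip(X, fitted)]  # augmented regressor matrix
    g, ZtZ = ols(Z, y)
    resid = [yi - sum(gj * zj for gj, zj in zip(g, row)) for row, yi in zip(Z, y)]
    # Eicker-White variance of the last coefficient: sum_i e_i^2 (a'z_i)^2,
    # where a is the last column of (Z'Z)^{-1}.
    unit = [0.0] * (len(g) - 1) + [1.0]
    a = solve(ZtZ, unit)
    var = sum((e * sum(aj * zj for aj, zj in zip(a, row))) ** 2
              for row, e in zip(Z, resid))
    t = g[-1] / math.sqrt(var)
    return t, math.erfc(abs(t) / math.sqrt(2.0))  # normal approximation


# Simulated example (hypothetical data): a linear model fitted to a mean of exp(x).
rng = random.Random(0)
xs = [rng.uniform(0.0, 2.0) for _ in range(300)]
X = [[1.0, x] for x in xs]
y = [math.exp(x) + rng.gauss(0.0, 0.1) for x in xs]
t_stat, p_value = reset_test(X, y)
print(f"RESET t = {t_stat:.2f}, p-value = {p_value:.4f}")
```

In this constructed example the linear specification is wrong by design, so the added regressor should be strongly significant; a small p-value is read, as in the text, as evidence against the specification of the conditional expectation.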
Finally, we also check whether the particular pattern of
heteroskedasticity assumed by the models is appropriate. As
explained in section III B, the adequacy of the log linear
model was checked using the Park-type test, whereas the
hypothesis $V[y_i \mid x] \propto \mu(x_i\beta)$ was tested by evaluating the
significance of the coefficient of $(\ln \breve{y}_i)\sqrt{\breve{y}_i}$ in the
Gauss-Newton regression indicated in equation (13). The p-values
of the tests are reported in table 4. Again, the log linear
specification is unequivocally rejected. On the other hand,
these results indicate that the estimated coefficient on
$(\ln \breve{y}_i)\sqrt{\breve{y}_i}$ is insignificantly different from 0 at the usual 5% level.
This implies that the Poisson PML assumption $V[y_i \mid x] = \lambda_0 E[y_i \mid x]$
cannot be rejected at this significance level.
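The logic of this test can be sketched as follows. The display below is a reconstruction from the regressor named in the text, under the assumption that the variance function takes the form $\lambda_0 E[y_i \mid x]^{\lambda_1}$ (so that the Poisson PML case corresponds to $\lambda_1 = 1$); it should be checked against equation (13) in section III B rather than read as a quotation of it.

```latex
% Assumed variance function nesting the Poisson PML case (\lambda_1 = 1):
\[
  V[y_i \mid x] = \lambda_0 \, E[y_i \mid x]^{\lambda_1}
\]
% First-order expansion around \lambda_1 = 1, with \breve{y}_i the fitted value,
% using \mu^{\lambda_1} = \mu \exp\{(\lambda_1 - 1)\ln\mu\} \approx \mu\,[1 + (\lambda_1 - 1)\ln\mu]:
\[
  (y_i - \breve{y}_i)^2 \approx \lambda_0 \breve{y}_i
    + \lambda_0 (\lambda_1 - 1) (\ln \breve{y}_i)\, \breve{y}_i
\]
% Dividing through by \sqrt{\breve{y}_i} gives the weighted test regression:
\[
  \frac{(y_i - \breve{y}_i)^2}{\sqrt{\breve{y}_i}}
    = \lambda_0 \sqrt{\breve{y}_i}
    + \lambda_0 (\lambda_1 - 1) (\ln \breve{y}_i) \sqrt{\breve{y}_i} + \xi_i
\]
```

Under this reading, a coefficient on $(\ln \breve{y}_i)\sqrt{\breve{y}_i}$ that is insignificantly different from 0 is consistent with $\lambda_1 = 1$, that is, with a conditional variance proportional to the conditional mean.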
The Anderson–van Wincoop Gravity Equation: Table 5
presents the estimated coefficients for the Anderson–van
Wincoop (2003) gravity equation, which controls more
properly for multilateral resistance terms by introducing
exporter- and importer-specific effects. As before, the
columns show, respectively, the estimated coefficients obtained
using OLS on the log of exports, OLS on ln(1 + Tij), tobit,
NLS, PPML on the positive-trade sample, and PPML. Note
that, because this exercise uses cross-sectional data, we can
only identify bilateral variables.29
29 Anderson and van Wincoop (2003) impose unit income elasticities by
using as the dependent variable the log of exports divided by the product
of the countries’ GDPs. Because we are working with cross-sectional data
and the model specification includes importer and exporter fixed effects,
income elasticities cannot be identified, and there is no need to impose
restrictions on them. Still, the estimation of the PML models could be
performed using as the dependent variable the ratio of exports to the
product of the GDPs. This would downweight the observations with large
values of the product of the GDPs, implicitly assuming that the variance
of the error term is proportional to the square of this product. This is
contrary to what is advocated by Frankel and Wei (1993) and Frankel
(1997), who suggest that larger countries should be given more weight in
the estimation of gravity equations because they generally have better
data. In any case, whether this should be done or not is an empirical
question, and the right course of action depends on the pattern of
heteroskedasticity. With our data, using the ratio of exports to the product
of GDPs as the dependent variable leads to models that are rejected by the
specification tests. Therefore, the implied assumptions about the pattern of
heteroskedasticity are not supported by our data. Hence, we use exports as
the dependent variable of the gravity equation, and not the ratio of exports
to the product of GDPs.
As with the standard specification of the gravity equation,
we find that, using the Anderson–van Wincoop (2003)
specification, estimates obtained with the Poisson method
vary little when only the positive-trade subsample is used.
Moreover, we find again strong evidence that the errors of
the log linear model estimated using the sample with positive
trade are heteroskedastic. With this specification, the
two-degree-of-freedom special case of White’s test for
heteroskedasticity leads to a test statistic of 469.2 and a p-value
of 0.
Because we are now conditioning on a much larger set of
controls, it is not surprising to find that most coefficients are
sensitive to the introduction of fixed effects. For example, in
the Poisson method, although the distance elasticity remains
about the same and the coefficient on common colonial ties
is still insignificant, the effect on common language is now
smaller and the coefficient on free-trade agreements is
larger. The results of the other estimation methods are
generally much more sensitive to the inclusion of the fixed
effects.
Comparing the results of PPML and OLS for the positive-trade
subsample, the following observations are in order.
The distance elasticity is substantially larger under OLS
(−1.35 versus −0.75). Sharing a border has a positive effect
on trade under Poisson, but no significant effect under OLS.
Sharing a common language has similar effects under the
two techniques. Common colonial ties have strong effects
under OLS (with an average enhancement effect of 100%),
whereas Poisson predicts no significant effect. Finally, the
two techniques produce reasonably similar estimates for the
coefficient on the trade-agreement dummy, implying a
trade-enhancement effect of the order of 40%.

TABLE 6.—RESULTS OF THE TESTS FOR TYPE OF HETEROSKEDASTICITY (p-VALUES)

Test (Null Hypothesis)                                Exports > 0    Full Sample
GNR ($V[y_i \mid x] \propto \mu(x_i\beta)$)           0.100          0.070
Park (OLS is valid)                                   0.000          0.000

As before, the other estimation methods lead to some
puzzling results. For example, OLS on ln(1 + Tij) now
yields a significantly negative effect of contiguity, and under
NLS, the coefficient on common colonial ties becomes
significantly negative.
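The percentage effects quoted for the dummy variables follow from the formula given in footnote 27, (e^b − 1) × 100%. A minimal sketch of that arithmetic, using coefficients copied from table 5, is shown here; the function name `percent_effect` and the dictionary layout are illustrative choices, not part of the paper.

```python
import math


def percent_effect(coef: float) -> float:
    """Percentage change in trade implied by a dummy coefficient: (e^b - 1) * 100."""
    return (math.exp(coef) - 1.0) * 100.0


# Coefficients copied from table 5 (Anderson-van Wincoop specification).
table5 = {
    ("OLS, ln(Tij)", "colonial tie"): 0.666,
    ("OLS, ln(Tij)", "free-trade agreement"): 0.310,
    ("PPML, Tij > 0", "free-trade agreement"): 0.374,
}

for (estimator, dummy), b in table5.items():
    print(f"{estimator:14s} {dummy:22s} b = {b:.3f} -> {percent_effect(b):5.1f}%")
```

On these inputs the OLS colonial-tie coefficient implies an effect of roughly 95%, which the text rounds to 100%, and the two free-trade-agreement coefficients imply effects of roughly 36% and 45%, which bracket the "order of 40%" reported above.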
To complete the study, we performed the same set of
specification tests used before. The p-values of the het-
eroskedasticity-robust RESET test at the bottom of table 5
suggest that with the Anderson–van Wincoop (2003) spec-
ification of the gravity equation, only the models estimated
by the PPML method are adequate. The p-values of the tests
to check whether the particular pattern of heteroskedasticity
assumed by the models is appropriate are reported in table
6. As in the traditional gravity equation, the log linear
specification is unequivocally rejected. On the other hand,
these results indicate that the estimated coefficient on
$(\ln \breve{y}_i)\sqrt{\breve{y}_i}$ is insignificantly different from 0 at the usual 5% level.
This implies that the Poisson PML assumption $V[y_i \mid x] = \lambda_0 E[y_i \mid x]$
cannot be rejected at this significance level.
To sum up, whether or not fixed effects are used in the
specification of the model, we find strong evidence that
estimation methods based on the log-linearization of the
gravity equation suffer from severe misspecification, which
hinders the interpretation of the results. NLS is also gener-
ally unreliable. In contrast, the models estimated by PPML
show no signs of misspecification and, in general, do not
produce the puzzling results generated by the other meth-
ods.30
30 It is worth noting that the large differences in estimates among the
various methods persist when we look at a smaller subsample of countries
that account for most of world trade and, quite likely, have better data.
More specifically, we run similar regressions for the subsample of 63
countries included in Frankel’s (1997) study. These countries accounted
for almost 90% of the world trade reported to the United Nations in 1992.
One advantage of this subsample is that the number of zeros is
significantly reduced. Heteroskedasticity, however, is still a problem: The null
hypothesis of homoskedasticity is rejected in both the traditional and the
fixed-effects gravity equations. As with the full sample, PPML generates
a smaller role for distance and common language than OLS, and, unlike
OLS, PPML predicts no role for colonial ties. In line with the findings
documented in Frankel (1997), the OLS estimated coefficient on the
free-trade-agreement dummy is negative in both specifications of the
gravity equation, whereas PPML predicts a positive and significant effect
(slightly bigger than that found for the whole sample). These results are
available on request from the authors.
VI. Conclusions
In this paper, we argue that the standard empirical methods
used to estimate gravity equations are inappropriate.
The basic problem is that log-linearization (or, indeed, any
nonlinear transformation) of the empirical model in the
presence of heteroskedasticity leads to inconsistent esti-
mates. This is because the expected value of the logarithm
of a random variable depends on higher-order moments of
its distribution. Therefore, if the errors are heteroskedastic,
the transformed errors will be generally correlated with the
covariates. An additional problem of log-linearization is that
it is incompatible with the existence of zeros in trade data,
which led to several unsatisfactory solutions, including
truncation of the sample (that is, elimination of zero-trade
pairs) and further nonlinear transformations of the dependent
variable.
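A worked special case may help to make the mechanism concrete. The display below assumes a multiplicative error $\eta_i$ with unit conditional mean and, purely for illustration, a log-normal distribution with conditional variance $\sigma_i^2$; the log-normal assumption is introduced here only to obtain a closed form and is not required for the general argument.

```latex
% Constant-elasticity model with a multiplicative error of unit conditional mean:
\[
  y_i = \exp(x_i\beta)\,\eta_i, \qquad E[\eta_i \mid x_i] = 1
  \quad\Longrightarrow\quad
  \ln y_i = x_i\beta + \ln \eta_i .
\]
% Illustrative assumption: \eta_i is log-normal with V[\eta_i \mid x_i] = \sigma_i^2, so that
\[
  E[\ln \eta_i \mid x_i] = -\tfrac{1}{2}\ln\!\left(1 + \sigma_i^2\right),
  \qquad
  E[\ln y_i \mid x_i] = x_i\beta - \tfrac{1}{2}\ln\!\left(1 + \sigma_i^2\right).
\]
```

If $\sigma_i^2$ varies with $x_i$, the second term is itself a function of the regressors, so the error of the log-linear regression is correlated with the covariates and OLS does not recover $\beta$; only when $\sigma_i^2$ is constant is the distortion confined to the intercept.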
We argue that the biases are present both in the traditional
specification of the gravity equation and in the Anderson–
van Wincoop (2003) specification, which includes country-
specific fixed effects.
To address the various estimation problems, we propose
a simple Poisson pseudo-maximum-likelihood method and
assess its performance using Monte Carlo simulations. We
find that in the presence of heteroskedasticity the standard
methods can severely bias the estimated coefficients, cast-
ing doubt on previous empirical findings. Our method,
instead, is robust to different patterns of heteroskedasticity
and, in addition, provides a natural way to deal with zeros in
trade data.
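The contrast between the two estimators can be illustrated with a small simulation. The sketch below is not the paper's Monte Carlo design: the data-generating process (a single regressor, true slope 1, a log-normal error whose log-variance is assumed to be 0.5x), the function names, and the plain Newton iterations used to solve the Poisson first-order conditions are all assumptions made for this example.

```python
import math
import random


def simulate(n, seed=0):
    """Draw (x, y) with y = exp(x) * eta, E[eta | x] = 1, and Var[ln eta | x] = 0.5 x."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        s2 = 0.5 * x                                  # assumed heteroskedasticity pattern
        ln_eta = rng.gauss(-0.5 * s2, math.sqrt(s2))  # log-normal error with mean one
        data.append((x, math.exp(x + ln_eta)))        # true intercept 0, true slope 1
    return data


def log_ols_slope(data):
    """Slope from an OLS regression of ln(y) on x."""
    xs = [x for x, _ in data]
    ls = [math.log(y) for _, y in data]
    mx, ml = sum(xs) / len(xs), sum(ls) / len(ls)
    sxy = sum((x - mx) * (l - ml) for x, l in zip(xs, ls))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def ppml(data, iters=50, tol=1e-10):
    """Newton iterations on sum_i (y_i - exp(a + b x_i)) (1, x_i)' = 0."""
    a = math.log(sum(y for _, y in data) / len(data))
    b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in data:
            mu = math.exp(a + b * x)
            g0 += y - mu
            g1 += (y - mu) * x
            h00 += mu
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


data = simulate(20000)
print(f"log-linear OLS slope: {log_ols_slope(data):.3f}")
print(f"PPML slope:           {ppml(data)[1]:.3f}")
```

Under these assumptions E[ln y | x] = x − 0.25x, so the log-linear OLS slope converges to 0.75 rather than to the true elasticity of 1, whereas the Poisson first-order conditions remain valid because they rely only on the correct specification of the conditional mean.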
We use our method to reestimate the gravity equation and
document significant differences from the results obtained
using the log linear method. For example, income elastici-
ties in the traditional gravity equation are systematically
smaller than those obtained with log-linearized OLS regres-
sions. In addition, in both the traditional and Anderson–van
Wincoop specifications of the gravity equation, OLS esti-
mation exaggerates the role of geographical proximity and
colonial ties. RESET tests systematically favor the Poisson
PML technique. The results suggest that heteroskedasticity
(rather than truncation of the data) is responsible for the
main differences.
Log-linearized equations estimated by OLS are of course
used in many other areas of empirical economics and econo-
metrics. Our Monte Carlo simulations and the regression out-
comes indicate that in the presence of heteroskedasticity this
practice can lead to significant biases. These results suggest
that, at least when there is evidence of heteroskedasticity, the
Poisson pseudo-maximum-likelihood estimator should be used
as a substitute for the standard log linear model.
REFERENCES
Anderson, J., “A Theoretical Foundation for the Gravity Equation,”
American Economic Review 69 (1979), 106–116.
Anderson, J., and E. van Wincoop, “Gravity with Gravitas: A Solution to
the Border Puzzle,” American Economic Review 93 (2003), 170–
192.
Bergstrand, J., “The Gravity Equation in International Trade: Some
Microeconomic Foundations and Empirical Evidence,” this RE-
VIEW, 69 (1985), 474–481.
Boisso, D., and M. Ferrantino, “Economic and Cultural Distance in
International Trade: Empirical Puzzles,” Journal of Economic
Integration 12 (1997), 456–484.
Cameron, A. C., and P. K. Trivedi, Regression Analysis of Count Data
(Cambridge: Cambridge University Press, 1998).
Central Intelligence Agency, World Factbook, http://www.cia.gov/cia/
publications/factbook/ (2002).
Davidson, R., and J. G. MacKinnon, Estimation and Inference in Econo-
metrics (Oxford: Oxford University Press, 1993).
Davis, D., “Intra-industry Trade: A Heckscher-Ohlin-Ricardo Approach,”
Journal of International Economics 39 (1995), 201–226.
Deardorff, A., “Determinants of Bilateral Trade: Does Gravity Work in a
Neoclassical World?” in Jeffrey Frankel (Ed.), The Regionalization
of the World Economy (Chicago: University of Chicago Press,
1998).
Delgado, M., “Semiparametric Generalized Least Squares Estimation in
the Multivariate Nonlinear Regression Model,” Econometric The-
ory 8 (1992), 203–222.
Delgado, M., and T. J. Kniesner, “Count Data Models with Variance of
Unknown Form: An Application to a Hedonic Model of Worker
Absenteeism,” this REVIEW, 79 (1997), 41–49.
Eaton, J., and S. Kortum, “Technology, Geography and Trade,” NBER
working paper no. 6253 (2001).
Eaton, J., and A. Tamura, “Bilateralism and Regionalism in Japanese and
US Trade and Direct Foreign Investment Patterns,” Journal of the
Japanese and International Economies 8 (1994), 478–510.
Eichengreen, B., and D. Irwin, “Trade Blocs, Currency Blocs, and the
Reorientation of World Trade in the 1930’s,” Journal of International
Economics 38 (1995), 1–24.
Eicker, F., “Asymptotic Normality and Consistency of the Least Squares
Estimators for Families of Linear Regressions,” The Annals of
Mathematical Statistics 34 (1963), 447–456.
Feenstra, R. C., R. E. Lipsey, and H. P. Bowen, “World Trade Flows,
1970–1992, with Production and Tariff Data,” NBER working
paper no. 5910 (1997).
Feenstra, R., J. Markusen, and A. Rose, “Using the Gravity Equation to
Differentiate among Alternative Theories of Trade,” Canadian
Journal of Economics 34 (2001), 430–447.
Frankel, J., Regional Trading Blocs in the World Economic System
(Washington, DC: Institute for International Economics, 1997).
Frankel, J., and A. Rose, “An Estimate of the Effect of Common Curren-
cies on Trade and Income,” Quarterly Journal of Economics 117
(2002), 409–466.
Frankel, J., E. Stein, and S. Wei, “Continental Trading Blocs: Are They
Natural, or Super-Natural?” in J. Frankel (Ed.), The Regionaliza-
tion of the World Economy (Chicago: University of Chicago Press,
1998).
Frankel, J., and S. Wei, “Trade Blocs and Currency Blocs,” NBER
working paper no. 4335 (1993).
Goldberger, A., “The Interpretation and Estimation of Cobb-Douglas
Functions,” Econometrica 36 (1968), 464–472.
Goldberger, A., A Course in Econometrics (Cambridge, MA: Harvard
University Press, 1991).
Gourieroux, C., A. Monfort, and A. Trognon, “Pseudo Maximum Likeli-
hood Methods: Applications to Poisson Models,” Econometrica 52
(1984), 701–720.
Gray, A., http://argray.fateback.com/dist/formula.html (2001).
Hallak, J. C., “Product Quality and the Direction of Trade,” Journal of
International Economics 68 (2006), 238–265.
Harrigan, J., “OECD Imports and Trade Barriers in 1983,” Journal of
International Economics 35 (1993), 95–111.
Haveman, J., and D. Hummels, “Alternative Hypotheses and the Volume
of Trade: The Gravity Equation and the Extent of Specialization,”
Purdue University mimeograph (2001).
Helpman, E., and P. Krugman, Market Structure and Foreign Trade
(Cambridge, MA: MIT Press, 1985).
Helpman, E., M. Melitz, and Y. Rubinstein, “Trading Patterns and Trading
Volumes,” Harvard University mimeograph (2004).
Koenker, R., and G. S. Bassett, Jr., “Regression Quantiles,” Econometrica
46 (1978), 33–50.
Manning, W. G., and J. Mullahy, “Estimating Log Models: To Transform
or Not to Transform?” Journal of Health Economics 20 (2001),
461–494.
McCallum, J., “National Borders Matter: Canada-US Regional Trade
Patterns,” American Economic Review 85 (1995), 615–623.
McCullagh, P., and J. A. Nelder, Generalized Linear Models, 2nd ed.
(London: Chapman and Hall, 1989).
Papke, L. E., and J. M. Wooldridge, “Econometric Methods for Fractional
Response Variables with an Application to 401(k) Plan Participa-
tion Rates,” Journal of Applied Econometrics 11 (1996), 619–632.
Park, R., “Estimation with Heteroskedastic Error Terms,” Econometrica
34 (1966), 888.
Ramsey, J. B., “Tests for Specification Errors in Classical Linear Least
Squares Regression Analysis,” Journal of the Royal Statistical
Society B 31 (1969), 350–371.
Robinson, P. M., “Asymptotically Efficient Estimation in the Presence of
Heteroskedasticity of Unknown Form,” Econometrica 55 (1987),
875–891.
Rose, A., “One Money One Market: Estimating the Effect of Common
Currencies on Trade,” Economic Policy 15 (2000), 7–46.
StataCorp., Stata Statistical Software: Release 8 (College Station, TX:
StataCorp LP, 2003).
Tenreyro, S., and R. Barro, “Economic Effects of Currency Unions,” FRB
Boston series working paper no. 02–4 (2002).
Tinbergen, J., The World Economy. Suggestions for an International
Economic Policy (New York: Twentieth Century Fund, 1962).
Wei, S., “Intra-national versus International Trade: How Stubborn Are
Nation States in Globalization?” NBER working paper no. 5331
(1996).
White, H., “A Heteroskedasticity-Consistent Covariance Matrix Estimator
and a Direct Test for Heteroskedasticity,” Econometrica 48 (1980),
817–838.
Winkelmann, R., Econometric Analysis of Count Data, 4th ed. (Berlin:
Springer-Verlag, 2003).
Windmeijer, F., and J. M. C. Santos Silva, “Endogeneity in Count Data
Models: An Application to Demand for Health Care,” Journal of
Applied Econometrics 12 (1997), 281–294.
Wooldridge, J. M., “Distribution-Free Estimation of Some Nonlinear
Panel Data Models,” Journal of Econometrics 90 (1999), 77–97.
Wooldridge, J. M., Econometric Analysis of Cross Section and Panel Data
(Cambridge, MA: MIT Press, 2002).
World Bank, World Development Indicators CD-ROM (The World Bank,
2002).
APPENDIX
TABLE A1.—LIST OF COUNTRIES
Albania
Algeria
Angola
Argentina
Australia
Austria
Bahamas
Bahrain
Bangladesh
Barbados
Belgium-Lux.
Belize
Benin
Bhutan
Bolivia
Brazil
Brunei
Bulgaria
Burkina Faso
Burundi
Cambodia
Cameroon
Canada
Central African Rep.
Chad
Chile
China
Colombia
Comoros
Congo Dem. Rep.
Congo Rep.
Costa Rica
Côte d’Ivoire
Cyprus
Denmark
Djibouti
Dominican Rep.
Ecuador
Egypt
El Salvador
Eq. Guinea
Ethiopia
Fiji
Finland
France
Gabon
Gambia
Germany
Ghana
Greece
Guatemala
Guinea
Guinea-Bissau
Guyana
Haiti
Honduras
Hong Kong
Hungary
Iceland
India
Indonesia
Iran
Ireland
Israel
Italy
Jamaica
Japan
Jordan
Kenya
Kiribati
Korea, Rep.
Laos P. Dem. Rep.
Lebanon
Madagascar
Malawi
Malaysia
Maldives
Mali
Malta
Mauritania
Mauritius
Mexico
Mongolia
Morocco
Mozambique
Nepal
Netherlands
New Caledonia
New Zealand
Nicaragua
Niger
Nigeria
Norway
Oman
Pakistan
Panama
Papua New Guinea
Paraguay
Peru
Philippines
Poland
Portugal
Romania
Russian Federation
Rwanda
Saudi Arabia
Senegal
Seychelles
Sierra Leone
Singapore
Solomon Islands
South Africa
Spain
Sri Lanka
St. Kitts and Nevis
Sudan
Suriname
Sweden
Switzerland
Syrian Arab Rep.
Tanzania
Thailand
Togo
Trinidad and Tobago
Tunisia
Turkey
Uganda
United Arab Em.
United Kingdom
United States
Uruguay
Venezuela
Vietnam
Yemen
Zambia
Zimbabwe
TABLE A2.—COMMON OFFICIAL AND SECOND LANGUAGES

English: Australia, Bahamas, Barbados, Belize, Brunei, Cameroon, Canada, Fiji, Gambia, Ghana, Guyana, Hong Kong, India, Indonesia, Ireland, Israel, Jamaica, Jordan, Kenya, Kiribati, Malawi, Malaysia, Maldives, Malta, Mauritius, New Zealand, Nigeria, Oman, Pakistan, Panama, Papua New Guinea, Philippines, Rwanda, Seychelles, Sierra Leone, Singapore, South Africa, Sri Lanka, St. Helena, St. Kitts and Nevis, Suriname, Tanzania, Trinidad and Tobago, Uganda, United Kingdom, United States, Zambia, Zimbabwe

French: Belgium-Lux., Benin, Burkina Faso, Burundi, Cameroon, Canada, Central African Rep., Chad, Comoros, Congo Dem. Rep., Congo Rep., Côte d’Ivoire, Djibouti, Eq. Guinea, France, Gabon, Guinea, Haiti, Lebanon, Madagascar, Mali, Mauritania, Mauritius, Morocco, New Caledonia, Niger, Rwanda, Senegal, Seychelles, Switzerland, Togo, Tunisia

Malay: Brunei, Indonesia, Malaysia, Singapore

Portuguese: Angola, Brazil, Guinea-Bissau, Mozambique, Portugal

Spanish: Argentina, Belize, Bolivia, Chile, Colombia, Costa Rica, Dominican Rep., Ecuador, El Salvador, Eq. Guinea, Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Spain, Uruguay, Venezuela

Arabic: Algeria, Bahrain, Chad, Comoros, Djibouti, Egypt, Israel, Jordan, Lebanon, Mauritania, Morocco, Oman, Saudi Arabia, Sudan, Syria, Tanzania, Tunisia, United Arab Em., Yemen

Turkish: Cyprus, Turkey

Dutch: Belgium-Lux., Netherlands, Suriname

German: Austria, Germany, Switzerland

Greek: Cyprus, Greece

Hungarian: Hungary, Romania

Italian: Italy, Switzerland

Lingala: Congo Dem. Rep., Congo Rep.

Russian: Mongolia, Russian Federation

Swahili: Kenya, Tanzania

Chinese: China, Hong Kong, Malaysia, Singapore
TABLE A3.—COLONIAL TIES

United Kingdom: Australia, Bahamas, Bahrain, Barbados, Belize, Cameroon, Canada, Cyprus, Egypt, Fiji, Gambia, Ghana, Guyana, India, Ireland, Israel, Jamaica, Jordan, Kenya, Kuwait, Malawi, Malaysia, Maldives, Malta, Mauritius, New Zealand, Nigeria, Pakistan, Saint Kitts and Nevis, Seychelles, Sierra Leone, South Africa, Sri Lanka, Sudan, Tanzania, Trinidad and Tobago, Uganda, United States, Zambia, Zimbabwe

France: Algeria, Benin, Burkina Faso, Cambodia, Cameroon, Central African Rep., Chad, Comoros, Congo, Djibouti, Gabon, Guinea, Haiti, Laos, Lebanon, Madagascar, Mali, Mauritania, Morocco, Niger, Senegal, Syria, Togo, Tunisia, Vietnam

Spain: Argentina, Bolivia, Chile, Colombia, Costa Rica, Cuba, Ecuador, El Salvador, Eq. Guinea, Guatemala, Honduras, Mexico, Netherlands, Nicaragua, Panama, Paraguay, Peru, Venezuela

Portugal: Angola, Brazil, Guinea-Bissau, Mozambique, Oman

TABLE A4.—PREFERENTIAL TRADE AGREEMENTS IN 1990

EEC/EC: Belgium, Denmark, France, Germany, Greece, Ireland, Italy, Luxembourg, Netherlands, Portugal, Spain, United Kingdom

EFTA: Iceland, Norway, Switzerland, Liechtenstein

CER: Australia, New Zealand

CARICOM: Bahamas, Barbados, Belize, Dominican Rep., Guyana, Haiti, Jamaica, Trinidad and Tobago, St. Kitts and Nevis, Suriname

SPARTECA: Australia, New Zealand, Fiji, Kiribati, Papua New Guinea, Solomon Islands

PATCRA: Australia, Papua New Guinea

CACM: Costa Rica, El Salvador, Guatemala, Honduras, Nicaragua

Bilateral Agreements: EC-Cyprus, EC-Malta, EC-Egypt, EC-Syria, EC-Algeria, EC-Norway, EC-Iceland, EC-Switzerland, Canada–United States, Israel–United States
TABLE A5.—SUMMARY STATISTICS

                                      Full Sample              Exports > 0
Variable                              Mean        Std. Dev.    Mean        Std. Dev.
Trade                                 172132.2    1828720      328757.7    2517139
Log of trade                          —           —            8.43383     3.26819
Log of exporter’s GDP                 23.24975    2.39727      24.42503    2.29748
Log of importer’s GDP                 23.24975    2.39727      24.13243    2.43148
Log of exporter’s per capita GDP      7.50538     1.63986      8.09600     1.65986
Log of importer’s per capita GDP      7.50538     1.63986      7.98602     1.68649
Log of distance                       8.78551     0.74168      8.69497     0.77283
Contiguity dummy                      0.01961     0.13865      0.02361     0.15185
Common-language dummy                 0.20970     0.40710      0.21284     0.40933
Colonial-tie dummy                    0.17048     0.37606      0.16894     0.37472
Landlocked exporter dummy             0.15441     0.36135      0.10767     0.30998
Landlocked importer dummy             0.15441     0.36135      0.11401     0.31784
Exporter’s remoteness                 8.94654     0.26389      8.90383     0.29313
Importer’s remoteness                 8.94654     0.26389      8.90787     0.28412
Preferential–trade-agreement dummy    0.02505     0.15629      0.04452     0.20626
Openness dummy                        0.56373     0.49594      0.65796     0.47442