STRATEGIC CITATION: A REASSESSMENT
Jeffrey Kuhn, Kenneth Younge, and Alan Marco*
Abstract—The United States patent system is unique in that it requires appli-
cants to cite documents they know to be relevant to the examination of their
patent applications. Lampe (2012) presents evidence that applicants strate-
gically withhold 21%–33% of relevant citations from patent examiners,
suggesting that many patents are fraudulently obtained. We challenge this
view. We first show that Lampe’s empirical design is inconsistent with both
legal standards and standard operating procedures, including how courts
identify strategic withholding. We then compile comprehensive data to re-
assess the empirical basis for Lampe’s main claim. We find no evidence
that applicants withhold citations.
I. Introduction
The United States is unique in requiring a patent applicant to disclose to the U.S. Patent and Trademark Office
(USPTO) any document the applicant believes to be rele-
vant to the examination of a patent application. In theory,
this “duty to disclose” improves patent examination quality
by identifying relevant prior art. However, Lampe (2012) ar-
gues that firms strategically withhold 21%–33% of relevant
prior art citations. If accurate, Lampe’s findings would be
alarming, for they suggest that patent applicants (in effect)
commit widespread fraud, and presumably thereby obtain
more patent protection than is due.
We present evidence that challenges the view that appli-
cants systematically underdisclose; instead, we find no evi-
dence of strategic withholding. We first discuss institutional
reasons for believing that Lampe’s (2012) methodology bi-
ases the results in favor of finding strategic withholding. We
then replicate Lampe’s (2012) analysis using a larger sam-
ple and more accurate and comprehensive data. Finally, we
provide empirical evidence suggesting that the methodology
is subject to bias based on selection effects, time trends, and
firm size.
This article makes several contributions. First, it presents
new evidence of interest to policymakers (Mammen, 2009;
Kuhn, 2010; Johnson, 2017), for it directly conflicts with
earlier results (Lampe, 2012) and challenges the view that
the duty of disclosure is ineffective (Taylor, 2012). Second,
it contributes to the literature that investigates patent ex-
amination as an important aspect of innovation economics
(Cockburn, Kortum, & Stern, 2003; Lemley & Sampat, 2012;
Frakes & Wasserman, 2015). Third, it contributes to a liter-
ature showing how refinements in patent data can support
a more nuanced and accurate empirical assessment of innovation (Hall, Jaffe, & Trajtenberg, 2005; Alcacer & Gittelman, 2006; Jaffe & De Rassenfosse, 2017; Kuhn, Younge, & Marco, 2020).
Received for publication November 26, 2019. Revision accepted for publication March 10, 2021. Editor: Shachar Kariv.
*Kuhn: Kenan-Flagler Business School, University of North Carolina at Chapel Hill; Younge: College of Management of Technology, École Polytechnique Fédérale de Lausanne; Marco: School of Public Policy, Georgia Institute of Technology.
A supplemental appendix is available online at https://doi.org/10.1162/rest_a_01051.
II. Institutional Background
A. The Patent Examination Process
A patent examiner determines whether an application’s
claims constitute a new and nonobvious advance over the
prior art. To identify prior art, the examiner is required to
perform a search and to review applicant submitted refer-
ences. An applicant is not obligated to search for prior art
but may nevertheless know of relevant documents. Since dis-
closure may not be in applicants’ interests, U.S. law imposes
a duty to disclose all information known to be material to
patentability (37 C.F.R. §1.56). The duty extends to inven-
tors, attorneys, and any other agents of the firm involved in
the patent application. Violation can lead to severe penalties,
such as unenforceability for the patent and disbarment for
complicit attorneys.
When an examiner identifies prior art that justifies rejecting
an application’s claims, she issues an “Office Action” identifying particular locations in specific prior art references where
the features of the claim are described. Rejections cabin the
scope of patent claims, since most patents are initially re-
jected, but applicants typically overcome rejections by nar-
rowing the claims (Marco, Sarnoff, & deGrazia, 2019; Kuhn
& Thompson, 2019). Examiners rely upon both examiner and
applicant references to justify rejections, but the majority of
citations do not form the basis of rejections. Such citations
nevertheless circumscribe the technology, clarify the inventive step, and document both the applicant’s disclosure efforts and the examiner’s search and examination. We emphasize that the examiner must confirm that she actively reviewed
these nonrejection citations and nevertheless considered the
claimed invention to represent a novel and nonobvious im-
provement over them.
The duty of disclosure is intended to prohibit strategic
withholding, which may lead to an applicant receiving patent
rights that the examiner would not have granted had the exam-
iner known of the withheld information. Lampe (2012) does
not conceptually define “strategic withholding,” but at a min-
imum the term implies an active and knowledgeable choice.
Thus the term implies that (1) the information in question was
relevant to the claims of the focal patent, (2) the applicant
knew that the information was relevant to the examination of
the focal patent, (3) the applicant did not submit the infor-
mation for consideration, and (4) the withholding was inten-
tional rather than the result of oversight. These criteria not
only are consistent with the term’s ordinary meaning but also
accurately reflect the extensive body of law through which
courts evaluate claims of strategic withholding (Cotropia,
2009).
The Review of Economics and Statistics, March 2023, 105(2): 458–466
© 2021 The President and Fellows of Harvard College and the Massachusetts Institute of Technology. Published under a Creative Commons Attribution 4.0
International (CC BY 4.0) license.
https://doi.org/10.1162/rest_a_01051
B. Empirically Identifying Strategic Withholding
Large sample analysis typically precludes the type of judi-
cious evaluation performed by courts, so an empiricist must
identify strategic withholding from observational data. Such
an approach will only produce reliable results if the empiri-
cal definition of strategic withholding (1) is consistent with
the theoretical construct and (2) yields a relatively unbiased
measure. We argue that nearly all of the citations identified
by Lampe’s methodology as strategically withheld probably
do not meet one or more of the conceptual criteria for strate-
gic withholding, and that both the sample selected (i.e., “rel-
evant” citations) and the dependent variable (i.e., strategic
withholding) are likely to be overinclusive and unrepresen-
tative. We therefore contend that Lampe’s methodology is
flawed in both overall approach and specific selection crite-
ria, and thus likely to lead to biased and unreliable estimates.
Lampe (2012) identifies strategic withholding based on
patterns of citations. Lampe defines a citation by patent A
of patent B as “relevant” if (1) patent C also cites patent B,
(2) patents A and C were assigned to the same firm, and
(3) patent C was granted in a calendar year before patent A
was filed. Thus a relevant citation (patent B) is one which
the firm was aware of (as evidenced by patent C) when it
filed the new application (patent A). Lampe defines a relevant
citation as “strategically withheld” if it was submitted by the
examiner rather than the applicant. That is, Lampe assumes
that any examiner citation is strategically withheld if it was
previously cited anywhere in the applicant’s patent portfolio.
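To make the rule concrete, a minimal sketch follows. It assumes a toy record layout (the `Patent` and `Citation` classes and their field names are hypothetical, not the schema of any dataset discussed in this article) and applies the two definitions as stated above: a citation is flagged "relevant" when the same firm cited the same reference in a patent granted in a calendar year before the focal filing year, and "withheld" when such a citation is designated as examiner-added.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patent:
    number: str
    firm: str
    filing_year: int
    grant_year: int


@dataclass(frozen=True)
class Citation:
    citing: str        # patent A (or an earlier patent C)
    cited: str         # patent B
    by_examiner: bool  # True if designated as examiner-added


def classify(patents, citations):
    """Return {(citing, cited): (relevant, withheld)} under the rule in the text."""
    by_number = {p.number: p for p in patents}

    # Earliest grant year in which each firm cited each reference.
    first_cited = {}
    for c in citations:
        p = by_number[c.citing]
        key = (p.firm, c.cited)
        first_cited[key] = min(first_cited.get(key, p.grant_year), p.grant_year)

    result = {}
    for c in citations:
        a = by_number[c.citing]
        # Relevant: the firm cited B in a patent granted in a year before A was filed.
        relevant = first_cited[(a.firm, c.cited)] < a.filing_year
        # Withheld (in the sense defined above): relevant and examiner-added.
        result[(c.citing, c.cited)] = (relevant, relevant and c.by_examiner)
    return result
```

Note that the rule uses no information about the claims, the content of the reference, or who at the firm handled either application; it depends only on the citation pattern.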
We now examine Lampe’s empirical definition of strategic
withholding in light of the construct defined in section II.A.
The first element of the construct is that the information in
question was relevant to the examination of the focal patent.
We contend that Lampe’s assumption that all citations are
relevant is invalid, for both applicant and examiner citations.
Kuhn et al. (2020) show that the technological proximity
between citing and cited patents has declined substantially
over time for applicant citations. The decline is likely an
unintended consequence of the duty of disclosure itself—
applicants reduce both the compliance cost and the risk of in-
advertent noncompliance by simply citing everything, copy-
ing hundreds or thousands of citations from patent to related
patent without manual review. The vast majority of these ci-
tations are ignored by examiners and are not, in fact, relevant
to the claims of the citing patent—only about 5% of all ci-
tations form the basis of a claim rejection. Lampe’s defining
a reference as “relevant” merely by virtue of the applicant
having cited it is therefore inconsistent with the practicalities
of patent examination.
Examiners often cite references not as evidence that the
claimed invention is unpatentable, but rather as evidence that
the examiner performed an adequate search, or as background
technical material. For precisely this reason, courts deter-
mine strategic withholding on the basis of whether the with-
held information would have been used to support a rejec-
tion, and not merely whether the examiner did cite or would
have cited the withheld information. This rule is not legal-
istic, but instead reflects the practicality that applicants can-
not be expected to accurately anticipate which of potentially
thousands of related background references an examiner may
subjectively deem informative. For the same reason, Lampe’s
defining a citation as “relevant” on the basis of examiner ci-
tation alone is also inconsistent with the realities of patent
examination.
The second element of the construct—knowledge of
relevance—is equally troubling. Companies and inventors
with large patent portfolios will have cited many references
in the past, and may simply not make the logical connec-
tion from a previously cited reference to a newly filed patent
application. As noted above, the majority of citations submit-
ted in recent years are likely generated by attorneys copying
them across related applications, and nearly half of citations
are submitted long after the citing patent is filed (Kuhn et al.,
2020). Firms typically employ attorneys to handle patent ex-
amination and rarely involve the inventors in any significant
way. Indeed, we know of no reason that an inventor would
be aware of citations made by attorneys or examiners in pre-
vious patents by the same firm or even the same inventor.
We therefore question Lampe’s assumption that an examiner
citation to a reference previously cited in a different patent
by the same firm or inventor typically indicates an intentional
decision by the applicant to withhold information.
The third element of the construct seems trivial: the iden-
tification of a citation as examiner-submitted would seem to
constitute evidence that the applicant did not, in fact, submit
the information for consideration. However, the USPTO’s
attribution of the citation’s source can be misleading—the
USPTO designation (MPEP 1302.12) only indicates whether
the reference is ever submitted by the examiner, so an
applicant-added citation that is re-added by the examiner is
nevertheless designated as “cited by examiner.” In practice,
examiners often ignore applicants’ citations in favor of their
own search results (Cotropia, Lemley, & Sampat, 2013) and
add citations that are highly similar or even identical to those
already submitted by the applicant, a particularly common
occurrence for rejection citations (Kuhn et al., 2020). The
applicant cannot be said to have withheld information in
such situations, despite the presence of an examiner citation.
Accordingly, even Lampe’s assumption that an examiner-
submitted citation identifies information not already submit-
ted by the applicant is demonstrably false for some citations.
All of these concerns have the same practical effect.
Namely, the empirical definition of “strategically withheld”
citations employed by Lampe (2012) is overly broad, and
likely encompasses many citations that were, in fact, not
actually withheld. At the same time, most of the citations
identified as “relevant” are likely irrelevant citations that
are mechanically copied from patent-to-patent without man-
ual review. Accordingly, variation in both sample selec-
tion (i.e., “relevant”) and dependent variable (i.e., “strategi-
cally withheld”) may largely reflect differences in automated
compliance strategies rather than intentional document-level
decisions. These automated compliance strategies are likely
to vary with firm size, location (i.e., United States ver-
sus foreign), and technology, among other characteristics—
precisely the factors identified by Lampe as dimensions along
which the rate of strategic withholding varies.
III. Data and Replication
A. Sources
We obtain bibliographic information on patents, including
patent citations, from the PatentsView dataset, which was un-
available at the time of Lampe’s analysis. PatentsView pro-
vides several advantages over the NBER Patent Data files
employed by Lampe (2012), such as inventor disambigua-
tion and improved firm disambiguation. However, the indica-
tor for whether a citation was examiner-added is unavailable
prior to 2002, so our sample includes patents granted 2002–
2014, whereas Lampe’s sample includes patents granted
2001–2002.
The Google Patents Public Datasets provide patent prior-
ity data and a correspondence between granted patent number
and the patent’s pre-grant publication (PGPub) number. Fol-
lowing Kuhn et al. (2020), citations in the sample include
those made to PGPubs that were later granted as patents.
We employ patent-to-patent textual similarity data devel-
oped by Younge and Kuhn (2016), who compute the patent-
to-patent cosine similarity of the full text of every pair of
patents under a vector space, term-frequency inverse docu-
ment frequency model. Kuhn et al. (2020) use these data to
evaluate the technological relatedness of different groups of
citations.
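For readers unfamiliar with the measure, the following is a minimal sketch of cosine similarity under a term-frequency inverse-document-frequency weighting. It is a generic textbook variant written with the standard library only; the tokenization, the exact weighting, and the function names `tfidf_vectors` and `cosine` are assumptions for illustration and are not taken from Younge and Kuhn (2016).

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each tokenized document to a sparse TF-IDF vector (term -> weight)."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy "patents": two share most of their vocabulary, the third shares none.
docs = [
    "rotor blade pitch control".split(),
    "rotor blade pitch sensor".split(),
    "antibody binding assay".split(),
]
v0, v1, v2 = tfidf_vectors(docs)
similar = cosine(v0, v1)    # positive: shared, moderately rare terms
unrelated = cosine(v0, v2)  # zero: no terms in common
```

The inverse-document-frequency weight downweights terms that appear in many documents, so similarity is driven by shared terms that are comparatively rare in the corpus.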
The PatentsView data set identifies whether an examiner
submitted a citation reference, but as discussed in section II.B,
this designation should not be interpreted to indicate that the
reference was first added by the examiner. Kuhn et al. (2020)
provide a correction to citation source attribution based on
new data from internal USPTO citation submission forms,
which allows for more accurate identification of a citation’s
original submitter for the period from 2005 through 2014.
We identify patent citations used to support claim rejec-
tions from the Office Action Research Dataset for Patents
described by Lu, Myers, and Beliveau (2017) for the period
2008–2014 and from the bulk data files published by the
USPTO for the period 2005–2008.
All datasets are publicly available or available upon
request.
B. Sample
To reassess Lampe’s conclusions, we construct a larger
sample to replicate the analysis over a longer time period.
We select all utility patents issued from 2002 to 2014, inclu-
sive. Following Lampe (2012), we exclude continuing patents
(i.e., continuation, continuation-in-part, divisional, and reis-
sue patents), patents not assigned to firms, and patents as-
signed to more than one firm. Our final sample of patents
includes data for 1,746,730 patents.
Next, we select all citations made by these patents that
meet Lampe’s criteria for relevant citations. The cited patent
must have been cited by a different patent that is: (1) issued
to the same firm as the citing patent, and (2) issued in a year
prior to the year in which the focal citing patent was filed. We
exclude citations made to the applicant’s own patents (i.e., a
self-citation). The final sample includes data for 2,480,248
patent citations.
Because larger firms file many patents and cite many references, some of these previously cited references may be
unfamiliar to inventors and attorneys involved with a later
patent application. Accordingly, we follow Lampe (2012) by
constructing a subsample of citations that were previously
cited in a patent having at least one inventor in common with
the focal patent. The common inventor subsample includes
data for 784,355 patent citations.
C. Variable Definitions and Summary Statistics
Table 1 provides summary statistics for the variables used
in this study for the full sample and the common inventor
subsample. Columns 1, 2, 7, and 8 copy the corresponding
values from table 1 of Lampe (2012). Columns 3–6 and 9–12
include statistics for our replication. We would not expect our
summary statistics to be identical to Lampe’s, because we employ different data sources and because our sample includes
an overlapping but not identical time span. Nonetheless, the
values in the 2002 Replication columns are broadly consistent
with the values reported by Lampe (2012).
Following Lampe (2012), Common inventors counts the
number of times that any inventor of the focal patent was also
an inventor on a prior patent that cited the same reference. We
calculate this variable using equation (2) in Lampe (2012).
Inventors are identified in raw patent data by name and not by
a unique identifier. We therefore employ the disambiguated
inventor identifier provided by PatentsView to identify pre-
vious patents by the same identifier. The common inventor
subsample restricts the citations to those for which Common
inventors ≥ 1.
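One plausible reading of this count can be sketched as follows. Equation (2) of Lampe (2012) is not reproduced in this article, so the function below is an assumption based only on the verbal description above; the function name and argument layout are hypothetical.

```python
def common_inventors(focal_inventors, prior_citing_patents):
    """Count inventor overlaps between the focal patent and prior citing patents.

    focal_inventors: set of disambiguated inventor ids on the focal patent.
    prior_citing_patents: iterable of inventor-id sets, one per prior patent
        of the firm that cited the same reference.
    """
    focal = set(focal_inventors)
    return sum(len(focal & set(inventors)) for inventors in prior_citing_patents)


# Two prior patents share inventors with the focal patent; the third does not.
count = common_inventors({"i1", "i2"}, [{"i1"}, {"i1", "i2", "i3"}, {"i9"}])
in_common_inventor_subsample = count >= 1
```

Because the comparison is on identifiers, its accuracy depends on the quality of the inventor disambiguation.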
The variable Applicant-added identifies whether the cita-
tion was added by the applicant. We find that 71% (81%) of
citations in the 2002 replication of the full sample (common
inventor subsample) are applicant-added, an increase of 4 (2)
percentage points over the value reported by Lampe (2012).
This modest difference is likely due to differences between
the data sources (e.g., firm disambiguation) leading to differ-
ences in sample selection.
To better understand the number of patents “at hazard”
of being strategically withheld, we construct counts of the
number of unique patents previously cited by the firm and in-
ventor. We construct previously cited patents (by firm) by first
identifying all patents granted to the same firm as the citing
patent in the years before the citing patent was filed (i.e., the
firm’s prior patents). We then count all unique patents cited
TABLE 1.—DESCRIPTIVE STATISTICS
Full sample (columns 1–6)
                            Lampe 2001–2002   Replication 2002   Replication 2002–2014
Variable                       Mean    S.D.       Mean    S.D.            Mean    S.D.
Citing application year    1,999.18    1.16   1,999.64    1.08        2,006.44    3.61
Cited application year     1,987.40    5.85   1,987.91    7.57        1,993.29    7.26
Citing grant year          2,001.52    0.50   2,002.00    0.00        2,009.98    3.54
Cited grant year           1,989.29    5.75   1,989.81    5.61        1,995.58    7.42
Replication variables
  Applicant-added              0.67    0.47       0.71    0.46            0.81    0.39
  Common inventors             1.12    3.89       0.97    2.98            1.78    9.73
Control variables
  Attorney or agent            0.94    0.24       0.94    0.24            0.89    0.32
  Non-U.S. firm                0.28    0.45       0.24    0.43            0.19    0.39
  Examination time             2.34    1.06       2.42    1.01            3.57    1.60
  Citing claims               21.84   18.70      23.32   23.25           22.10   15.13
  Cited claims                15.73   14.09      15.76   14.18           19.58   17.27
Previously cited patents
  By firm                                       10,359  16,283          15,866  33,707
  By common inventor                                99     154             265     726
Additional variables
  Rejection (102 or 103)                                                  0.04    0.20
  Applicant-added (cor.)                                                  0.83    0.37
Observations                126,340             75,371               2,480,248

Common inventor subsample (columns 7–12)
                            Lampe 2001–2002   Replication 2002   Replication 2002–2014
Variable                       Mean    S.D.       Mean    S.D.            Mean    S.D.
Citing application year    1,999.33    1.13   1,999.83    0.98        2,006.77    3.61
Cited application year     1,987.03    6.03   1,987.68    5.83        1,992.95    7.40
Citing grant year          2,001.52    0.50   2,002.00    0.00        2,010.12    3.52
Cited grant year           1,989.89    5.92   1,989.54    5.70        1,995.16    7.61
Replication variables
  Applicant-added              0.79    0.41       0.81    0.39            0.90    0.30
  Common inventors             3.52    6.26       3.12    4.68            5.62   16.66
Control variables
  Attorney or agent            0.95    0.22       0.95    0.21            0.88    0.33
  Non-U.S. firm                0.23    0.42       0.19    0.39            0.15    0.35
  Examination time             2.19    1.01       2.27    0.90            3.39    1.54
  Citing claims               24.47   24.97      26.19   33.10           23.52   17.08
  Cited claims                15.89   13.86      15.81   13.18           19.34   17.03
Previously cited patents
  By firm                                        6,273  12,023           8,353  21,933
  By common inventor                               158     193             423     907
Additional variables
  Rejection (102 or 103)                                                  0.02    0.15
  Applicant-added (cor.)                                                  0.91    0.28
Observations                 40,085             23,382                 784,355
in any of the firm’s prior patents. Previously cited patents (by
common inventor) repeats this analysis for each of the inven-
tors of the citing patent, and represents a count of the union
of all citations previously made by those inventors.
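The firm-level count can be sketched as below. The input layout (a list of grant years paired with sets of cited patent numbers) and the function name are hypothetical; the inventor-level count would apply the same function to the pooled prior patents of all inventors on the citing patent.

```python
def previously_cited_count(focal_filing_year, prior_patents):
    """Count unique patents cited in patents granted before the focal filing year.

    prior_patents: iterable of (grant_year, cited_patent_numbers) pairs.
    """
    cited = set()
    for grant_year, references in prior_patents:
        # Only patents granted in a calendar year before the focal filing year count.
        if grant_year < focal_filing_year:
            cited.update(references)
    return len(cited)


firm_patents = [
    (1997, {"B1", "B2"}),
    (1999, {"B2", "B3"}),  # B2 is counted once
    (2000, {"B4"}),        # granted in the filing year, so excluded
    (2003, {"B5"}),        # granted later, so excluded
]
n_at_hazard = previously_cited_count(2000, firm_patents)
```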
For the period from 2008 to 2014, we identify citations
used to support claim rejections directly from the Office Ac-
tion Dataset. For the period from 2005 to 2008, we follow
Cotropia et al. (2013) and analyze the raw text of communi-
cations (known as “office actions”) sent from USPTO patent
examiners to applicants to identify rejection citations. We use
optical character recognition to convert more than 50 million
pages of documents from images to text, and then use natural language processing techniques and regular expressions
to identify patent numbers used to support rejections. We find
that about 4.2% of citations were used to support a rejection.
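The regular-expression step can be illustrated with a deliberately simplified sketch. The patterns, the paragraph-level matching, and the sample text below are assumptions for illustration only and do not reproduce the pipeline used to build the data; real office actions vary far more in wording and in OCR quality.

```python
import re

# Hypothetical pattern for a rejection under 35 U.S.C. 102 or 103.
REJECTION = re.compile(
    r"rejected under 35 U\.?S\.?C\.? ?§? ?(?:102|103)", re.IGNORECASE
)
# Hypothetical pattern for a U.S. patent number, with or without commas.
PATENT_NO = re.compile(r"\b(?:US\s*)?(\d{1,2},\d{3},\d{3}|\d{7,8})\b")


def rejection_citations(office_action_text):
    """Return patent numbers appearing in paragraphs that state a 102/103 rejection."""
    found = set()
    for paragraph in office_action_text.split("\n\n"):
        if REJECTION.search(paragraph):
            for match in PATENT_NO.finditer(paragraph):
                found.add(match.group(1).replace(",", ""))
    return found


sample = (
    "Claims 1-5 are rejected under 35 U.S.C. 103 as being unpatentable over "
    "Smith (US 6,123,456) in view of Jones (7,654,321).\n\n"
    "The prior art made of record and not relied upon includes US 5,000,001."
)
cited_in_rejection = rejection_citations(sample)
```

Only references named in a rejection paragraph are returned; the reference listed as "not relied upon" is ignored, which mirrors the distinction between rejection citations and other citations drawn in section II.A.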
Finally, we identify a citation as being applicant-added
(corrected) when it meets any of three criteria: (1) the cita-
tion was not identified in the USPTO data as examiner-added,
(2) the citation was first submitted by the applicant accord-
ing to internal USPTO records, or (3) the applicant submitted
a different reference that is more than 80% textually simi-
lar to the focal reference. Although the difference between
applicant-added and applicant-added (corrected) in table 1
may seem small (0.81 versus 0.83), it represents a decrease in
examiner citations by about 10% across the sample, and ex-
aminer rejection citations are even more likely to be corrected
(Kuhn et al., 2020).
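The three criteria combine as a logical OR, and the "about 10%" figure follows from the shares in table 1. The sketch below is illustrative: the function and argument names are hypothetical, and reading "more than 80% textually similar" as a similarity score above 0.80 is an assumption.

```python
def applicant_added_corrected(flagged_examiner_added,
                              first_submitted_by_applicant,
                              max_similarity_to_applicant_refs,
                              threshold=0.80):
    """Return True if the citation meets any of the three criteria in the text."""
    return (
        not flagged_examiner_added                       # criterion (1)
        or first_submitted_by_applicant                  # criterion (2)
        or max_similarity_to_applicant_refs > threshold  # criterion (3)
    )


# Relative decline in the examiner share implied by table 1 (full sample, 2002-2014).
examiner_share_raw = 1 - 0.81        # applicant-added
examiner_share_corrected = 1 - 0.83  # applicant-added (corrected)
relative_decline = (examiner_share_raw - examiner_share_corrected) / examiner_share_raw
```

A two-point change in the applicant share is thus a change of roughly one-tenth in the examiner share, because the examiner share is the smaller base.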
D. Replication
Lampe’s (2012) main result is that applicants withhold be-
tween 33% (full) and 21% (common inventor) of relevant
citations for a sample of patents granted in 2001 and 2002.
For clarity, we note that this result follows directly from
table 1 of Lampe (2012). As discussed in section II.B, Lampe
(2012) assumes that any citation to a reference that was pre-
viously cited by the same firm is strategically withheld if it is
submitted by the examiner. Because both the full sample and
the common inventor subsample described in Lampe (2012)
include only relevant citations, the percentage reported as
withheld is simply the percentage submitted by the applicant,
subtracted from one hundred. The remainder of the results in
Lampe (2012), such as tables explaining variation in which
citations are withheld, rely on the validity of this main re-
sult. Applying the same criteria and assumptions as Lampe
(2012), our 2002 replication sample shows that between 29%
(full) and 19% (common inventor) of relevant citations meet
Lampe’s criteria for strategic withholding.
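The arithmetic behind these percentages is a simple complement of the applicant-added means reported in table 1, as the short check below shows (the dictionary keys are labels chosen here for readability).

```python
# Mean of Applicant-added from table 1, by source and sample.
applicant_added_share = {
    ("Lampe (2012)", "full"): 0.67,
    ("Lampe (2012)", "common inventor"): 0.79,
    ("2002 replication", "full"): 0.71,
    ("2002 replication", "common inventor"): 0.81,
}

# Under Lampe's definition, every examiner-added citation in the sample is "withheld".
withheld_share = {key: round(1.0 - share, 2) for key, share in applicant_added_share.items()}
```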
IV. Sample Evaluation
In section IV.A, we present evidence that the full sample as
defined by Lampe (2012) does not lead to reliable estimates
of strategic withholding because the methodology results in
estimates with an upward bias that increases with firm size.
Section IV.B shows that moving to the common inventor sub-
sample does not entirely correct the problem, and does noth-
ing to address several other problems that we identified in
section II.B. We investigate and reject various possible cor-
rections in section IV.C, and conclude that Lampe’s general
methodology is unlikely to lead to reliable estimates of strate-
gic withholding regardless of the sample selection criteria and
variable definitions.
FIGURE 1.—DENSITY OF PREVIOUS CITATIONS
A. Full Sample
In this section, we argue that the full sample as defined
by Lampe does not provide a credible basis for investigating
when and under what conditions the applicant has strategi-
cally withheld citations from the patent office. We begin by
noting that strategic withholding implies that a reference cited
by a patent examiner was not only known to the patent appli-
cant, but also that the applicant knew that the reference was
relevant.
A firm with a large patent portfolio will have previously
cited many references. When an examiner cites one of those
references in a later-filed application, it is possible that no one
at the firm drew a logical connection between the previous
citation and the subsequently filed patent. Assuming that any
examiner citation to a reference previously cited in a patent
by the same firm is evidence of strategic withholding biases
the results in favor of finding strategic withholding.
Figure 1 illustrates this problem graphically by plotting
the number of relevant references for patents in both the full
sample and the common inventors subsample. On average, a
patent in the full sample is assigned to a firm that previously
cited more than 10,000 references. Indeed, some patents
are assigned to firms that previously cited over 300,000
references.
If the mere presence of so many previous citations leads
to citations being erroneously identified as strategically with-
held, then we should expect that for purely mechanical rea-
sons the incidence of strategic withholding increases with
the number of previous citations. To test this argument,
table 2 includes results from linear probability models es-
timating the probability that a citation in the sample is added
by the examiner. In column 1, a 100% increase in the num-
ber of previously-cited references corresponds to a 0.039 in-
crease (p < 0.001) in the probability that a focal citation is
examiner-added and hence appears as strategically withheld.
This result is robust to the inclusion of a variety of control
variables in column 2, and is economically significant given
a mean probability of withholding of 0.17.
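For reference, the specification implied by table 2 and its note can be written as follows. This is a reconstruction from the table rather than an equation stated in the text, and the notation ($y_c$, $N_c$, $X_c$) is introduced here only for illustration.

```latex
y_{c} = \alpha + \beta \log N_{c} + X_{c}'\gamma + \varepsilon_{c},
\qquad
\mathbb{E}\left[\, y_{c} \mid N_{c}, X_{c} \,\right] = \Pr\left( y_{c} = 1 \mid N_{c}, X_{c} \right)
```

Here $y_c$ equals one if citation $c$ is examiner-added, $N_c$ is the number of previously cited patents, and $X_c$ collects the control variables included in the even-numbered columns. Because the regressor enters in logs, $\beta$ is the change in the fitted probability per one-unit increase in the logged count; how that maps to a percentage increase in the raw count depends on the base of the logarithm, which the table note does not state.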
One interpretation of these results is that larger firms, such
as General Electric, IBM, and Microsoft, are more likely to
commit fraud before the patent office than smaller firms,
which we do not believe to be a credible conclusion. For
instance, we find that under Lampe’s definition IBM strate-
gically withholds up to 40% of relevant citations from the
patent office. A more reasonable interpretation is that when
examiners identify references for patents filed by such firms,
those examiner-added references are simply more likely to
have been previously cited by the same firm, as a matter of
chance.
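The chance mechanism can be illustrated with a small simulation. It is a stylized sketch under strong assumptions that are not drawn from the data: the examiner picks references uniformly at random from a fixed pool, independently of the firm, and all names and parameter values are hypothetical.

```python
import random


def chance_overlap_rate(pool_size, n_previously_cited, picks_per_patent,
                        n_patents, seed=0):
    """Share of random examiner picks that fall in the firm's previously cited set."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_patents):
        picks = rng.sample(range(pool_size), picks_per_patent)
        # Without loss of generality, label the firm's previously cited
        # references 0 .. n_previously_cited - 1.
        hits += sum(1 for ref in picks if ref < n_previously_cited)
    return hits / (n_patents * picks_per_patent)


# Same pool and same examiner behavior; only the size of the firm's prior set differs.
small_firm = chance_overlap_rate(10_000, 100, 5, 4_000)
large_firm = chance_overlap_rate(10_000, 2_000, 5, 4_000)
```

In expectation the overlap rate equals the ratio of previously cited references to the pool size, so it rises with the firm's citation history even though no one in the simulation withholds anything. The sketch covers only the chance-overlap step; it does not model applicant citations or the full withholding ratio.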
Measurement error in the dependent variable that is uncor-
related with predictors does not bias estimates. In this context,
however, the number of previous citations made by a firm is
of course highly correlated with firm size, age, the presence
of an attorney, and a variety of other predictors. Moreover,
the measurement error is not only located in the dependent
variable (i.e., which citations are “withheld”), but also in the
sample selection itself (i.e., which citations are “relevant”).
For these reasons, we conclude that any estimates based on
the full sample approach described in Lampe are likely to be
biased, and in particular are likely to substantially overstate
the incidence of strategic withholding.
B. Common Inventor Subsample
In this section, we argue that the common inventor subsam-
ple is also unlikely to produce unbiased estimates of strategic
withholding. Lampe’s common inventor subsample restricts
the full sample to those citations previously made in a patent
by a common inventor within the firm. In theory, this new re-
striction ensures that at least one of the inventors of the subse-
quent patent knew of the citation. In practice, as we discussed
in section II.B, the inventor in the subsequent patent likely
did not know of the earlier citation, because it was likely submitted by an attorney or the examiner, and the inventor was unlikely to
logically connect the cited reference to the newly filed patent
application.
We note that a patent in the common inventors subsample is
filed by inventors whose previous patents jointly cite over 420
references, on average. Indeed, as shown in figure 1, some
patents are filed by inventors who jointly cite over 10,000
previous references. As the number of previously cited ref-
erences increases, the likelihood of unintentionally over-
looking a technological relationship between a newly filed
patent application and a reference previously cited by the firm
increases.
TABLE 2.—LINEAR PROBABILITY MODEL REGRESSIONS OF EXAMINER CITATION
                                         Full Sample Common Inventor Subsample      Restricted Subsample
                                    (1)          (2)          (3)          (4)          (5)          (6)
Previously-cited patents       0.039***     0.035***     0.007***     0.007***     0.032***     0.031***
                               (0.0001)     (0.0001)     (0.0002)     (0.0002)      (0.001)      (0.001)
Attorney or agent                           0.062***                  0.041***                    0.026*
                                             (0.001)                   (0.001)                   (0.013)
Examination time                           −0.004***                 −0.007***                    −0.004
                                            (0.0001)                  (0.0002)                   (0.002)
Citing claims                              −0.002***                 −0.001***                 −0.002***
                                           (0.00002)                 (0.00002)                  (0.0003)
Cited claims                              −0.0003***                −0.0003***                 −0.001***
                                           (0.00001)                 (0.00002)                  (0.0002)
Non-U.S. firm                               0.270***                  0.206***                  0.175***
                                             (0.001)                   (0.001)                   (0.007)
Constant                       0.186***     0.149***     0.108***     0.103***     0.603***     0.585***
                               (0.0002)      (0.001)     (0.0004)      (0.001)      (0.004)      (0.016)
Observations                  2,480,246    2,480,246      784,355      784,355       18,812       18,812
R2                                0.049        0.142        0.003        0.077        0.025        0.062
Previously-cited patents is a logged count of the number of patents cited by the firm in previous calendar years. Examination time is measured in years. Standard errors in parentheses. Two-tailed tests: * p < 0.05, ** p < 0.01, and *** p < 0.001.
FIGURE 2.—VENN DIAGRAMS FOR ALTERNATIVE SUBSAMPLES SHOWING
CITATIONS SELECTED AS RELEVANT, AND PERCENTAGE SUBMITTED BY EXAMINER
Columns 3 and 4 of table 2 repeat the analyses in columns 1
and 2, but for the common inventors subsample. In column 3,
a 100% increase in the number of previously-cited references
corresponds to a 0.007 increase (p < 0.001) in the probabil-
ity that a focal citation is examiner-added, relative to a mean
probability of 0.10. This result is also robust to the inclusion
of a variety of control variables in column 4. While the coef-
ficients in columns 3 and 4 are lower than in columns 1 and
2, they remain positive and statistically significant. This sug-
gests either that firms with larger patent portfolios are more
likely to commit fraud, or that Lampe’s methodology over-
estimates the incidence of strategic withholding for larger
firms, even for the common inventor subsample. The com-
mon inventor subsample therefore suffers from precisely the
same problem as the full sample. Both the sample selection
criteria and the dependent variable therefore seem likely to be
biased in a way that is correlated with the predictors, leading
to biased estimates.
Finally, as further evidence of unreliability, we observe
that the rate of strategic withholding in both the full sample
and the common inventor subsample declined by more than
50% from 2002 to 2014 (see figure 3a). Because we can
identify no reason to believe that the rate of fraud has declined
so substantially and smoothly over that 13-year period, we
conclude that neither sample provides a reliable estimate of
the rate of strategic withholding.
C. Possible Corrections
One problem with even the common inventor subsample
is that previous patents by prolific inventors may have cited
thousands of references. We could therefore impose an additional
selection criterion restricting both relevant and withheld
citations to those situations in which the inventors had
previously cited fewer than some threshold number of references
(e.g., 100 references). Such a restriction, however,
would still fail to address the fact that many citations describe
background information that is not particularly relevant to the
examination of the citing patent. Accordingly, we could alter-
natively restrict both relevant and withheld citations to those
used to support a rejection of the claims, which are certainly
relevant. In this section, we discuss why imposing additional
selection criteria such as these is unlikely to remedy
the problems with Lampe’s methodology and yield credible
estimates of strategic withholding.
First, different combinations of selection criteria lead to
very different samples and results. Figure 2 shows a Venn
THE REVIEW OF ECONOMICS AND STATISTICS
FIGURE 3.—INCIDENCE OF CITATIONS AND PATENTS THAT MEET DEFINITION OF WITHHOLDING
diagram of the number of citations included in the full sam-
ple, the common inventor subsample, and two other subsam-
ples. The rejection subsample restricts the analysis to cita-
tions used to support rejections. The inventor citation pool
<100 subsample excludes citations, whether or not in the
common inventor subsample, when the inventors have pre-
viously cited 100 or more references. For all samples, we
employ the applicant-added (corrected) variable to identify
strategic withholding. Estimates of strategic withholding vary
from 5.2% to 79.5%, depending on the combination of selec-
tion criteria employed. Indeed, these are not the only selection
criteria one might use; alternatively one might restrict to the
set of citations that share a common attorney, or restrict to
citations made by firms below a certain size, or restrict to ci-
tations that are textually similar to the citing patent. Because
one could reasonably argue for or against each of these se-
lection criteria, the reported outcome is essentially a matter
of choice.
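The sensitivity to selection criteria can be illustrated with a minimal Python sketch. It is not the authors' data or code: the ten citation records, the field names (`examiner_added`, `common_inventor`, `supports_rejection`, `inventor_pool`), and the helper `withheld_share` are all hypothetical, chosen only to show how the measured share of examiner-added citations moves as filters are combined.

```python
from dataclasses import dataclass

@dataclass
class Citation:
    examiner_added: bool      # True if the examiner, not the applicant, added it
    common_inventor: bool     # shares an inventor with the earlier citing patent
    supports_rejection: bool  # used by the examiner to support a rejection
    inventor_pool: int        # references previously cited by the inventors

# Hypothetical toy records; none of these values come from the paper.
CITATIONS = [
    Citation(True, False, False, 5000),
    Citation(False, False, False, 5000),
    Citation(False, True, True, 50),
    Citation(True, True, True, 50),
    Citation(False, True, False, 2000),
    Citation(True, False, True, 20),
    Citation(False, True, False, 80),
    Citation(False, False, False, 300),
    Citation(True, True, True, 1500),
    Citation(False, True, False, 40),
]

# Each filter mirrors one selection criterion discussed in the text.
FILTERS = {
    "common_inventor": lambda c: c.common_inventor,
    "rejection": lambda c: c.supports_rejection,
    "pool_lt_100": lambda c: c.inventor_pool < 100,
}

def withheld_share(citations, criteria):
    """Share of examiner-added citations among those passing every criterion."""
    kept = [c for c in citations if all(FILTERS[name](c) for name in criteria)]
    if not kept:
        raise ValueError("no citations satisfy the selected criteria")
    return sum(c.examiner_added for c in kept) / len(kept)

if __name__ == "__main__":
    for criteria in ([], ["common_inventor"], ["rejection"], ["pool_lt_100"],
                     ["common_inventor", "rejection", "pool_lt_100"]):
        label = " + ".join(criteria) or "full sample"
        print(f"{label}: {withheld_share(CITATIONS, criteria):.3f}")
```

Even in this toy setting, the same underlying records yield different "withholding" shares under each combination of filters, which is the sense in which the reported figure depends on the researcher's choices.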
Second, even employing all four selection criteria (the re-
stricted sample), we still observe a firm-size effect. As shown
in models 5 and 6 of table 2, a 100% increase in the number
of previously-cited patents corresponds to a 0.031 increase
(p < 0.001) in the probability of strategic withholding, rel-
ative to a mean probability of 0.061. Accordingly, even the
most restrictive selection criteria employed in figure 2 fail to
address the problems evident in the full sample and common
inventor subsample.
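A firm-size effect of this kind can arise mechanically, with no strategic behavior at all. The Python sketch below is a stylized simulation under assumptions of our own, not the specification estimated in table 2: each hypothetical applicant re-cites three references from its own pool regardless of pool size, and the examiner adds ten references drawn blindly from a universe of one million patents, each of which coincides with the firm's pool with probability equal to pool size divided by universe size. The function names and all parameter values are illustrative.

```python
import math
import random

def simulate_citations(n_patents=5000, universe=1_000_000, seed=0):
    """Citation-level rows (log_pool, examiner_added) with no withholding by design."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_patents):
        # Hypothetical pool of previously-cited patents, log-uniform on [10, 300000].
        pool = math.exp(rng.uniform(math.log(10), math.log(300_000)))
        log_pool = math.log(pool)
        # Applicant re-cites three pool references, whatever the pool size.
        rows.extend((log_pool, 0) for _ in range(3))
        # Examiner adds ten references blindly; some fall in the pool by chance.
        hits = sum(rng.random() < pool / universe for _ in range(10))
        rows.extend((log_pool, 1) for _ in range(hits))
    return rows

def ols_slope(rows):
    """Slope from a linear probability model of examiner_added on log_pool."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    return cov / var

if __name__ == "__main__":
    slope = ols_slope(simulate_citations())
    print(f"slope on log pool size: {slope:.4f}")
```

Because a larger pool makes it more likely that an examiner's citation happens to match something the firm cited before, the estimated slope is positive even though no simulated applicant withholds anything. This is consistent with reading the coefficient on previously-cited patents as an artifact of the measure rather than as evidence of fraud.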
Third, the tradeoff between accuracy and external validity
in this context is likely severe. Figure 3b plots the percentage
of all patents that have at least one citation that meets Lampe’s
definition of strategic withholding, under different sampling
approaches. In the most Restricted Subsample, only 7,260
citations over 13 years meet the definition of “relevant,” and
only 362 patents per year (0.18% of patents in the sample)
have even a single withheld citation. At the extreme, we could
select a very small sample of litigated patents and determine
with some accuracy the rate of strategic withholding for those
patents. However, the citations we identified as strategically
withheld in that highly selected subsample will not be repre-
sentative of withheld citations more generally.
Fourth, all subsamples are both overinclusive and under-
inclusive. They are overinclusive for the reasons discussed
in section II.B. However, they are also underinclusive in the
sense that many instances of strategic withholding involve the
withholding of information such as patents that a firm has not
previously cited, nonpatent literature, or foreign patents, none
of which are included in either sample. Further, both errors
are correlated with predictor variables such as firm size, and
additional sample restrictions do not resolve these problems.
Fifth, if interpreted as evidence of strategic withholding,
the results presented in table 2 are inconsistent with the
institutions related to patent examination. For example, a citation
made in a patent by a non-U.S. firm is between 0.175 and
0.270 more likely to be examiner-added. The location in
which a firm is incorporated seems unlikely to have such
a substantial effect on whether the firm strategically with-
holds information from the USPTO. A more reasonable in-
terpretation is that non-U.S. firms file more of their patents in
non-U.S. jurisdictions, which would mean that the count vari-
able previously-cited patents is a less reliable control for the
size of such firms. Further, a citation made in a patent in which
the firm is represented by an attorney or agent is between
0.026 (p < 0.05) and 0.062 (p < 0.001) more likely to be
examiner-added. Attorneys and agents owe an independent
duty of disclosure and candor to the patent office, and would
risk disbarment by intentionally withholding
information. We therefore expect that the positive coefficient
indicates that the presence of an attorney or agent is another
indication of firm size, rather than evidence that attorneys
and agents are more likely to withhold information from the
patent office.
In sum, the methodology employed in Lampe (2012) forces
a severe trade-off. With few selection criteria, the samples
are overinclusive in ways that may yield severely and unpre-
dictably biased estimates. With more selection criteria, the
sample size diminishes substantially without convincingly
addressing several of the problems underlying the more general
approach. We are skeptical that any selection criteria
based on publicly available data are likely to lead to a sample
suitable for generating relatively unbiased estimates of
strategic withholding.
V. Conclusion
An accurate assessment of applicant citation behavior is
necessary for evaluating the costs and benefits of the duty
of disclosure, particularly since the U.S. is the only ma-
jor jurisdiction that imposes this obligation. Lampe’s claim
that applicants withhold between 21% and 33% of rele-
vant citations provides a powerful argument against the effi-
cacy of disclosure. However, that research relies on two key
assumptions.
First, Lampe (2012) assumes that all cited references were
indeed relevant to the examination of a patent simply by virtue
of having been cited. However, the large majority of cited
references do not affect the patent examination process, and
indeed most citations are copied from patent to patent with
little-to-no manual review. For good reason, courts do not
expect applicants to anticipate which of the many different
background references an examiner may choose to cite. This
first assumption is therefore contrary to the practical realities
of patent examination, and suggests that fewer than 5% of the
citations identified by Lampe’s methodology as “relevant” are
entitled to that description.
Second, Lampe (2012) assumes that any examiner citation
to a reference previously cited in a different patent granted
to the same firm or inventor is evidence of strategic with-
holding. We show that the average firm has cited many ref-
erences in the past and that the rate of examiner citation in-
creases with the number of previously-cited references, an
effect which persists through various sample selection crite-
ria. While a small firm may easily review the citations made
in its previously-granted patents, IBM (or even a prolific in-
ventor) cannot be expected to accurately anticipate which
of its more than 300,000 previous citations an examiner may
choose to cite in a subsequent patent application. Merely con-
trolling for the number of previous citations is insufficient to
address the problem because the bias is embedded in both the
definition of the dependent variable (i.e., strategic withhold-
ing) and the sample selection criteria (i.e., large firms cite
more references) in ways that are correlated with predictors
such as the presence of an attorney, a firm’s status as U.S. or
foreign, and a patent’s examination time.
Based on this evidence, we conclude that the large majority
of citations identified by Lampe’s methodology as “relevant”
were in fact not relevant, and that the large majority of ci-
tations identified by Lampe’s methodology as “strategically
withheld” were in fact not strategically withheld, as those
terms are typically construed. Moreover, various alternative
but reasonable selection criteria lead to very different results,
suggesting that under Lampe’s methodology the main results
are largely driven by the researcher’s choices and assumptions
rather than the phenomenon of interest. Given that our
analysis calls into question assumptions integral to Lampe’s
results, we are forced to conclude that Lampe’s claim that
applicants withhold between 21% and 33% of relevant
citations is simply not supported by the evidence. The remainder
of Lampe’s results rely on the same samples and
dependent variable to investigate the determinants of strategic
withholding and therefore lack reliability and validity for
the same reasons.
REFERENCES
Alcacer, Juan, and Michelle Gittelman, “Patent Citations as a Measure of
Knowledge Flows: The Influence of Examiner Citations,” this RE-
VIEW 88:4 (2006), 774–779.
Cockburn, Iain M., Samuel Kortum, and Scott Stern, “Are All Patent
Examiners Equal? The Impact of Examiner Characteristics,” in
Patents in the Knowledge-Based Economy (Washington, DC: Na-
tional Academies Press, 2003).
Cotropia, Christopher A., “Modernizing Patent Law’s Inequitable Conduct
Doctrine,” Berkeley Tech. LJ 24 (2009), 723.
Cotropia, Christopher A., Mark A. Lemley, and Bhaven Sampat, “Do Ap-
plicant Patent Citations Matter?” Research Policy 42:4 (2013), 844–
854. 10.1016/j.respol.2013.01.003
Frakes, Michael D., and Melissa F. Wasserman, “Does the U.S. Patent and
Trademark Office Grant too Many Bad Patents: Evidence from a
Quasi-Experiment,” Stan. L. Rev. 67 (2015), 613.
Hall, Bronwyn H., Adam Jaffe, and Manuel Trajtenberg, “Market Value and
Patent Citations,” The RAND Journal of Economics 36:1 (2005), 16–
38.
Jaffe, Adam B., and Gaétan De Rassenfosse, “Patent Citation Data in So-
cial Science Research: Overview and Best Practices,” Journal of the
Association for Information Science and Technology 68:6 (2017),
1360–1374. 10.1002/asi.23731
Johnson, Eric E., “The Case for Eliminating Patent Law’s Inequitable Con-
duct Defense,” Colum. L. Rev. Online 117 (2017), 1.
Kuhn, Jeffrey M., “Information Overload at the U.S. Patent and Trademark
Office: Reframing the Duty of Disclosure in Patent Law as a Search
and Filter Problem,” Yale JL & Tech. 13 (2010), 89.
Kuhn, Jeffrey M., and Neil C. Thompson, “How to Measure and Draw
Causal Inferences with Patent Scope,” International Journal of
the Economics of Business 26:1 (2019), 5–38. 10.1080/13571516
.2018.1553284
Kuhn, Jeffrey M., Kenneth A. Younge, and Alan C. Marco, “Patent Citations
Reexamined,” RAND Journal of Economics 51 (2020), 109–132.
10.1111/1756-2171.12307
Lampe, Ryan, “Strategic Citation,” this REVIEW 94:1 (2012), 320–333.
Lemley, Mark A., and Bhaven Sampat, “Examiner Characteristics and
Patent Office Outcomes,” this REVIEW 94:3 (2012), 817–827.
Lu, Qiang, Amanda Myers, and Scott Beliveau, “USPTO Patent Prosecution
Research Data: Unlocking Office Action Traits,” USPTO Economic
working paper (2017).
Mammen, Christian E., “Controlling the Plague: Reforming the Doctrine
of Inequitable Conduct,” Berkeley Tech. LJ 24 (2009), 1329.
Marco, Alan C., Joshua D. Sarnoff, and Charles A. W. deGrazia, “Patent
Claims and Patent Scope,” Research Policy 48 (2019), 103790.
10.1016/j.respol.2019.04.014
Taylor, Priscilla G., “Bringing Equity Back to the Inequitable Conduct Doc-
trine?” Berkeley Technology Law Journal 27 (2012), 349–379.
Younge, Kenneth A., and Jeffrey M. Kuhn, “Patent-to-Patent Similarity: A
Vector Space Model,” Available at SSRN 2709238 (2016).