RESEARCH ARTICLE
A typology of scientific breakthroughs
Mignon Wuestman, Jarno Hoekman, and Koen Frenken
Innovation Studies Group, Copernicus Institute of Sustainable Development,
Princetonlaan 8a, 3584 CB, Utrecht University, The Netherlands
an open access journal
Keywords: breakthroughs, citation impact, scientific discovery, science of science, scientometrics,
societal impact
Citation: Wuestman, M., Hoekman, J., & Frenken, K. (2020). A typology of
scientific breakthroughs. Quantitative Science Studies, 1(3), 1203–1222.
https://doi.org/10.1162/qss_a_00079
DOI: https://doi.org/10.1162/qss_a_00079
Received: 8 April 2020
Accepted: 2 June 2020
Corresponding Author:
Jarno Hoekman
j.hoekman@uu.nl
Handling Editor:
Ludo Waltman
ABSTRACT
Scientific breakthroughs are commonly understood as discoveries that transform the knowledge
frontier and have a major impact on science, technology, and society. Prior literature studying
breakthroughs generally treats them as a homogeneous group in attempts to identify supportive
conditions for their occurrence. In this paper, we argue that there are different types of scientific
breakthroughs, which differ in their disciplinary occurrence and are associated with different
considerations of use and citation impact patterns. We develop a typology of scientific
breakthroughs based on three binary dimensions of scientific discoveries and use this typology
to analyze qualitatively the content of 335 scientific articles that report on breakthroughs. For
each dimension, we test associations with scientific disciplines, reported use considerations,
and scientific impact. We find that most scientific breakthroughs are driven by a question and in
line with literature, and that paradigm-shifting discoveries are rare. Regarding the scientific
impact of breakthroughs as measured by citations, we find that articles that answer an
unanswered question receive more citations than articles that were not motivated by
an unanswered question. We conclude that earlier research in which breakthroughs were
operationalized as highly cited scientific articles may thus be biased against the latter.
1. INTRODUCTION
The evolution of scientific knowledge is commonly understood as an alternating process between
short periods of breakthrough discoveries and long periods in which these breakthroughs are
further refined and elaborated (Kuhn, 1962; Toulmin, 1967). Periods of breakthrough discoveries
transform the knowledge frontier, while periods of refinement and elaboration allow for the major
scientific, technological, and societal contributions of those breakthroughs to materialize (Evans,
2016; Hilgard & Jamieson, 2017; Winnink & Tijssen, 2014). However, while the periods of
refinement and elaboration are well understood (e.g., Boyd & Richerson, 1985; Cavalli-Sforza &
Feldman, 1981; Nelson & Winter, 1982), the characteristics of discoveries that transform the
knowledge frontier remain unclear (Marx & Bornmann, 2013).
In recent years, several authors have suggested supportive conditions for scientific
breakthroughs, such as cognitively diverse teams (Hage & Mote, 2010; Hinrichs, Seager, et al., 2017;
Wu, Wang, & Evans, 2019), combinations of highly conventional and highly novel knowledge
(Mukherjee, Romero, et al., 2017; Schilling & Green, 2011), and psychological characteristics of
scientists, such as stubbornness and tenacity (Grumet, 2008). In general, these studies understand
scientific breakthroughs as being codified in highly cited publications, typically operationalized as
the top 5%, 1%, or 0.1% highly cited articles in a field (Ponomarev, Williams, et al., 2014; Uzzi,
Mukherjee, et al., 2013; Zeng et al., 2017). These studies thus treat scientific breakthroughs as a
group characterized by high impact and implicitly assume that all scientific breakthroughs are
highly cited articles, and conversely that all highly cited articles are scientific breakthroughs. A
systematic effort to identify different types of breakthroughs has so far not been made. Such an
effort would be useful, however, as breakthrough types may occur under different circumstances
and differ in their citation impact.

Copyright: © 2020 Mignon Wuestman, Jarno Hoekman, and Koen Frenken.
Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
The MIT Press
In the following, we develop a typology of scientific breakthroughs and examine differences between
kinds of breakthroughs in terms of their disciplinary occurrence, considerations of use, and citation
impact. We make use of the Charge-Chance-Challenge (Cha-Cha-Cha) theory of scientific discovery
as described by Koshland (2007) to develop this typology. Rather than understanding scientific
breakthroughs as either Charge, Chance, or Challenge type, as Koshland does, we propose three
discovery dimensions that underlie those three types to provide a better understanding of the
varieties of scientific breakthroughs. Using the three dimensions, we test to what extent
configurations of these dimensions are observable in the scientific literature by
qualitatively coding the full text of 335 articles that, according to experts, report on scientific
breakthroughs. We then use these coded articles to explore how the different characteristics
are distributed over scientific disciplines and vary in their considerations of use and citation
impact. We are particularly interested in those aspects because Koshland, in his paper, provides
examples of the different types of discoveries that come primarily from physical sciences and life
sciences. Moreover, as it is known that scientific breakthroughs have transformational potential
both within and beyond science (Winnink & Tijssen, 2014), we explore how the different
typological dimensions relate to furthering fundamental understanding and considerations of use
(Stokes, 1997). Finally, we model the effect of breakthrough characteristics on cumulative citation
impact over 10 years by means of a set of regression models. We find that breakthroughs vary
widely in their citation impact, and that there are telling differences in impact between different
types of breakthroughs.
2. THE CHA-CHA-CHA THEORY OF SCIENTIFIC DISCOVERIES
Koshland’s Cha-Cha-Cha theory of scientific discoveries was developed to aid in understanding
the heterogeneous nature of major scientific advances, and to improve our understanding of the
conditions under which scientific breakthroughs occur (Koshland, 2007). The theory is developed
from the perspective that different field conditions lead up to different types of scientific
discoveries. These field conditions relate to the perceived state of knowledge in a scientific field,
which offers opportunities for scientists to make relevant scientific contributions (discoveries).
For example, a scientific discovery may provide an answer to a long-standing question in a field.
Alternatively, a breakthrough may be a serendipitous encounter with an important new piece
of evidence, which may fit or question the existing theory or observations in a field. Scientists
may also recognize a set of inconsistencies in the state-of-the-art literature in a field, which they
aim to resolve. Koshland’s Cha’s summarize these different kinds of scientific discoveries in three
types: Charge, Chance, and Challenge.
2.1. Charge, Chance and Challenge Type Discoveries
Koshland defines Charge type discoveries as discoveries that “solve problems that are quite
obvious … but in which the way to solve the problem is not so clear” (p. 761). In other words, Charge
type discoveries resolve “known unknowns” (Logan, 2009). Koshland uses Isaac Newton’s
discovery of gravity as an example of a Charge type discovery, because “the movement of stars in the sky
and the fall of an apple from a tree were apparent to everyone, but Isaac Newton came up with the
concept of gravity to explain it all” (p. 761). A recent example of a Charge type scientific
breakthrough is “cloaking technology” (Leonhardt, 2006), an invisibility device that has been a
long-standing dream of many scientists. While it had been proven that perfect invisibility is impossible
due to the wave nature of light, there was reason to believe that “perfect invisibility within the
accuracy of geometrical optics” was achievable (p. 1777). Leonhardt (2006) reports the
formulation of a “general recipe” for the design of media that can achieve such invisibility with possibilities
for practical demonstrations. This breakthrough has thus, at least in theory, solved a well-known
problem in a way that had not been thought of before.
Koshland defines Chance type discoveries as “instances of a chance event that the ready
mind recognizes as important and then explains to other scientists” (p. 761). For a Chance type
discovery, the original contribution lies in recognizing the importance of an unexpected
encounter or explaining the importance to other members of the scientific community.
These encounters typically involve some kind of serendipity (Copeland, 2019; Koshland,
2007; Yaqub, 2018). Encounters may take the shape of accidental discoveries of fossils,
ancient remains, and other natural phenomena, but also of unexpected outcomes of planned
experiments, such as Alexander Fleming’s discovery of penicillin (Koshland, 2007). A more
recent example of a Chance type discovery is reported in an article by Palmer, Barthelmy,
et al. (2005), who report on neutron star SGR 1806-20 emitting a giant gamma-ray flare on
27 December 2004. Recognizing the importance of this event was crucial: The authors note
that this flare was about a hundred times higher than the two giant flares observed from this
neutron star earlier, whereas the energy of giant flares is usually a thousand times higher than
that of a typical burst. Because of that difference, the authors further note that under different
circumstances, this burst could have been interpreted as another type of burst. Instead, they
suggest that the observed flare is of a newly discovered subclass.
As a third type of scientific breakthrough, Koshland defines Challenge type discoveries as “a
response to an accumulation of facts or concepts that are unexplained by or incongruous with
scientific theories of the time” (p. 761). Koshland provides Einstein’s theory of special relativity
as an example of a Challenge type discovery, as it provided a theory that explained anomalies
with contemporary theories. Another example presented itself with the report on a draft
sequence of the Neandertal genome (Green, Krause, et al., 2010). The authors emphasize in
the introduction of the article that “substantial controversy surrounds the question of whether
Neandertals interbred with anatomically modern humans” (p. 710). The challenged model
was based on the idea that modern humans, after leaving Africa, completely replaced
Neandertals without interbreeding. This theory was supported by evidence on morphological
features and DNA of modern humans, although the evidence was considered to be
inconclusive. The draft sequence of the Neandertal genome suggests that Europeans and Asians, but not
Africans, have inherited genes from Neandertals, a finding that does not fit with the model.
Instead, the authors put forward an alternative theory: Neandertals interbred with modern
humans after they left Africa, but before they spread into Europe and Asia.
While Koshland’s categorization is intuitive, it has thus far not been used to systematically
map scientific discoveries. One obstacle to using Koshland’s theory to classify scientific
breakthroughs is that Koshland does not specify whether we should understand the discovery
types as mutually exclusive (i.e., Chance, Charge, or Challenge alone), or as combinations
of types. For example, a discovery can fit the description of both Chance and Challenge.
Consider, as an example, an article published in 2000, reporting on the discovery of two early
hominid skulls and tools at a site in the Republic of Georgia, which the authors interpret as
evidence that “the initial hominid dispersal from Africa was driven not by technological
innovation but more likely by biological and ecological parameters” (Gabunia, Vekua, et al., 2000,
p. 1025). This discovery fits the definition of a Chance type discovery because it involves a
chance encounter that scientists recognized as important, but it also fits the definition of a
Challenge type discovery because the authors interpret evidence that is incongruent with
scientific theories of the time, and propose an alternative theory. Moreover, Koshland also does
not specify whether the three types are meant to be exhaustive. It might be that some
discoveries do not fit with the definition of any of Chance, Charge, or Challenge.
2.2. From Discovery Types to Discovery Dimensions
As we aim to characterize and compare scientific breakthroughs, we allow for the possibility
that Koshland’s discovery types are neither exhaustive nor mutually exclusive. Rather, we
assume that breakthroughs can be characterized on three binary discovery dimensions. For each
of Koshland’s discovery types, the state of two of the three binary dimensions is fixed, while
the state of the third dimension may vary (see also Table 1). For both states of each dimension,
we provide examples of relevant scientific articles in Table A1. We summarize the dimensions
as follows:
1. The discovery is driven by a question, or by a research object
First, we distinguish between discoveries that are question driven and discoveries that
are research object driven. Whereas in the case of question-driven discoveries the area
of ignorance and the line of enquiry in the field is well established and widely shared
(“we know what we do not know”), discoveries driven by a research object are inverse
question driven: The discovery precedes the formulation of the question (“we do not
know what we do not know”) (Meyers, 2011). For example, archaeologists might
discover ancient hominid remains in an unexpected location, which then raises
questions about the distribution and social relation of hominids (Brunet, Guy, et al., 2002).
The discovery of ancient remains thus drives the formulation of a question that was not
asked before. Note that it may be the case that discoveries driven by a research object
do actually provide answers to questions, but these questions were not the driver of the
discovery.
Koshland’s Charge type can be characterized as question driven, referring to discoveries
that solve long-standing problems. In our earlier example, the discovery team was able
to design a theoretical cloaking device in response to the ambition of engineering
invisibility. Chance and Challenge type discoveries are both research object driven: Chance
type discoveries start from an encounter with a research object that awaits interpretation,
and Challenge type discoveries start from the recognition of a research object that does
not fit with existing theories. In our example of a Chance type discovery, this was the
observation of a giant gamma-ray burst, and in our Challenge example, this was genetic
evidence against the model that claimed that modern humans replaced Neandertals
without interbreeding.

Table 1. Koshland’s types as configurations of discovery dimensions

| Koshland’s types | Dimension A: question-driven or research-object-driven | Dimension B: new or known question/research object | Dimension C: question/research object is against or in line with literature |
| --- | --- | --- | --- |
| Charge | question | either | in line with literature |
| Chance | research object | new | either |
| Challenge | research object | either | against literature |
2. The discovery introduces a new question/research object, or contributes to a known
question/research object
As a second dimension, we distinguish between new and known questions and research
objects. We understand known questions and research objects as those that are
documented in the scientific literature, and of which the scientists
who made the discovery were aware. Conversely, new questions and research objects are
those that are introduced by the scientists who made the discovery and are, therefore,
themselves part of the discovery. With regard to the scientific impact of a discovery, this
is a relevant distinction, as it indicates whether the discovery team can be credited for
introducing the new question or research object, or only for resolving or contextualizing
it. Koshland described this in terms of “uncoverers,” or scientific teams for whom
uncovering the question or research object is (part of) their original contribution, and
“discoverers,” or scientific teams that contribute to a question or research object uncovered by
others (p. 761).
Koshland’s Chance type can clearly be characterized along this dimension. For Chance
type discoveries, it is the “uncovering” that is critical, along with recognizing and
interpreting the relevance of the uncovered research object. Without observing the giant
gamma-ray burst, its discovery team would not have been able to report any discovery.
In the case of Charge or Challenge discoveries, the question or research object can be
either new or known. For Charge type discoveries, which answer “obvious” questions,
the question may be a long-standing one that many others have tried and failed to solve,
such as the ambition of invisibility or the puzzle of gravity, but it may also be a question
that the discoverers raised themselves as an extension of existing literature. For example, the authors
of an article that reports on the derivation of germ cells from stem cells argued that
“because embryoid bodies sustain blood development, we reasoned that they might also
support primordial germ cell formation,” thus raising the question of whether germ cells
can indeed be made from such embryoid bodies (Geijsen, Horoschak, et al., 2004,
p. 148). Similarly, Challenge type discoveries can be a response to an accumulation of facts
that the discovery team uncovers themselves, or that were already known in the literature
before. Our example of a Challenge type discovery based on the draft sequence of the
Neandertal genome includes both: It reports original evidence that counters the existing
model, and describes pieces of evidence that were uncovered by others. Here, we are in
agreement with Koshland (2007), who also argued that Challenge type discoveries can
be accompanied by uncovery or not. Following his wording, it is the discovery of a new
explanation of facts that is critical for Challenge type discoveries, not the uncovering of
the facts as such.
3. The question or research object is against or in line with state-of-the-art literature
Third, we distinguish between discoveries that go against state-of-the-art literature and
discoveries that fit with or follow logically from existing literature. In other words, the discovery
may have the potential to cause a paradigm shift, or it may fit within the current paradigm
(Koshland, 2007; Kuhn, 1962).
Koshland’s Challenge type discoveries are driven by research objects that are incongruent
with the current paradigm, and their interpretation thus calls for a paradigm shift. Challenge
type discoveries can thus be characterized as “against state-of-the-art literature.” The
article on the Neandertal genome, for example, reported existing evidence incongruent
with the current paradigm, uncovered additional evidence, and offered an alternative
model. Charge type discoveries, on the other hand, answer questions that have been part
of the existing literature or follow logically from it and therefore cannot go against
state-of-the-art literature. The discovery of a theoretical cloaking device was, indeed, in line with
earlier ideas on the feasibility of such a device. Chance type discoveries may or may not be
in line with state-of-the-art literature, depending on the interpretation of their discoverers.
The article by Palmer et al. (2005) on an observed gamma-ray flare is an example of the
former: the flare was interpreted as an additional category of flares. This interpretation
offered an extension of the current model and did not require a paradigm shift. The
discovery of hominid skulls and tools in the Republic of Georgia (Gabunia et al., 2000), by
contrast, is an example of both a Chance type discovery and a Challenge type discovery,
as the evidence is seen as incongruent with scientific theories of the time.
In sum, we can define Koshland’s types as configurations of three binary dimensions, as
summarized in Table 1. Following the table, Charge type discoveries are driven by a question,
be it a new or known question, and are in line with existing literature. Chance type discoveries
are driven by a new research object and may be in line with or against existing literature.
Challenge type discoveries are driven by a new or existing research object, and go against existing
literature.
It follows from Table 1 that Koshland’s three discovery types are not exhaustive of the
possible types that are analytically conceivable. Indeed, there is no reason to assume that
discoveries will only meet the particular configurations of the dimensions that are consistent with
Koshland’s three types. Using the framework, we are able to characterize scientific
breakthroughs in three binary dimensions, so that each scientific breakthrough is classified as one
out of eight (2³) possible types, rather than Koshland’s three. The question, then, of which of
the eight possible discovery types is most prevalent, is an empirical one.
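
The relation between the eight analytically possible configurations and Koshland’s three types can be made concrete with a short enumeration. The following Python sketch is illustrative only: the state labels and the helper name `matching_types` are our own shorthand, and the fixed and free dimensions follow Table 1.

```python
from itertools import product

# States of the three binary dimensions (labels are illustrative shorthand).
DIM_A = ("question", "research object")
DIM_B = ("new", "known")
DIM_C = ("against literature", "in line with literature")

# Koshland's types as configurations, following Table 1.
# None marks a dimension that may take either state.
KOSHLAND = {
    "Charge": ("question", None, "in line with literature"),
    "Chance": ("research object", "new", None),
    "Challenge": ("research object", None, "against literature"),
}


def matching_types(config):
    """Return the Koshland types whose fixed dimensions agree with config."""
    return [
        name
        for name, pattern in KOSHLAND.items()
        if all(fixed is None or fixed == state
               for fixed, state in zip(pattern, config))
    ]


# All 2 * 2 * 2 = 8 analytically possible configurations.
configurations = list(product(DIM_A, DIM_B, DIM_C))

for config in configurations:
    types = matching_types(config)
    print(config, "->", ", ".join(types) if types else "no Koshland type")
```

Under these assumptions, some configurations match no Koshland type and one matches two, which mirrors the argument that the three types are neither exhaustive nor mutually exclusive.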
3. RESEARCH DESIGN AND METHODS
3.1. Data Collection
To characterize different types of scientific breakthroughs, we make use of Science’s annual
announcement of the Breakthrough of the Year (BotY) (AAAS, 2018) between 1999 and 2012. Each
year the magazine’s scientific editors select “the most significant scientific discovery of the year”
(AAAS, 2018) and its nine runners-up¹. The selected breakthroughs are described by the journal’s
reporters in the final issue of the year. These descriptions may refer to a single scientific
breakthrough or to a multitude of breakthroughs that center on a common theme², and include a list
of references to the original research described and other supportive material.
For this paper, we use the reference list of each BotY description to select research articles that
report on the scientific breakthrough. We will refer to these articles as breakthrough articles. We
use the following requirements in our selection of breakthrough articles from the BotY (and
runners-up) reference lists: (a) articles should be written in English³; (b) articles should be
published in the same year as the year in which they were announced BotY or runner-up, with
the exception of articles published in December the year before (as these were published after the
BotY announcement of the previous year); (c) articles should be published in peer-reviewed
academic journals⁴; (d) articles should report original results described in the BotY description:
Review articles or articles that were included as further reading are omitted; (e) articles should
have a DOI and be available on Web of Science (WoS); and (f) articles should not have been
retracted afterwards⁵.

¹ In 2015, a People’s Choice was introduced along with the editors’ selection. However, as data from 2013
onwards are not collected, these People’s Choice breakthroughs are not in the data. Before 2015, only
Science’s editors contributed to the selection process.
² For example, in the runner-up “Water, water everywhere” from 2000, one breakthrough is the discovery of
an ocean on one of Jupiter’s satellites, and another is proof that there has been water on Mars at some point
in time.
³ No research articles written in another language were found.
Although the announcement of BotYs began in 1996 (replacing the annual announcement
of Molecule of the Year), data are collected from 1999 onwards, as no runners-up were
announced in 1998. BotYs are collected until 2012, to allow the articles at least 6 years to
receive citations after publication. This resulted in 335 scientific breakthrough articles, derived
from 140 BotYs (14 years: one breakthrough and nine runners-up per year). Table 2 shows a
summary of this selection process.
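
The attrition implied by the selection requirements can be worked out from the counts reported in Table 2. The short Python sketch below is only an arithmetic illustration: the counts are those of Table 2, while the variable and function names are hypothetical.

```python
# Number of articles remaining after each requirement, as reported in Table 2.
funnel = [
    ("Total listed references", 895),
    ("Written in English", 895),
    ("Published in announcement year", 746),
    ("Published in peer-reviewed journal", 691),
    ("Reports original results", 367),
    ("Have DOI and in Web of Science", 340),
    ("Have not been retracted", 335),
]


def attrition(steps):
    """Return (requirement, number of articles dropped by it) pairs."""
    return [
        (name, previous_count - count)
        for (_, previous_count), (name, count) in zip(steps, steps[1:])
    ]


dropped = attrition(funnel)
for name, n_dropped in dropped:
    print(f"{name}: {n_dropped} articles dropped")

total_dropped = sum(n for _, n in dropped)
print("Remaining:", funnel[0][1] - total_dropped)
```

Read this way, the requirement that articles report original results removes the largest number of listed references.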
We used the DOI of each article to collect data from WoS: (a) publication date; (b)
publication source; (c) citation report consisting of the number of citations per year for 10 years, or
as many years as possible for articles published after 2008; (d) number of authors; and (e) a
PDF of the article’s full text (including abstract). These data were extracted from WoS in April
2018. The articles’ full texts were used to code the breakthrough article in terms of the three
discovery dimensions and its reported considerations of use.
We also collect data on the scientific discipline of each of our breakthrough articles based
on the indexation of Nature⁶ (Springer Nature, n.d.). We distinguish between disciplines as
listed by Nature: biological sciences⁷, business and commerce, environmental sciences,
health sciences⁸, humanities, physical sciences, scientific community and society, and social
sciences⁹. Because many of the examples of Chance type discoveries supplied by Koshland
are specifically from paleontological sciences and astronomical sciences, whereas examples
of Charge and Challenge type discoveries are not (Koshland, 2007), we will further
distinguish between paleontological sciences and other biological sciences and between
astronomical sciences and other physical sciences.
3.2. Coding Discovery Dimensions and Reported Considerations of Use
We use directed content analysis (Hsieh & Shannon, 2005; Saldaña, 2015) to code each
breakthrough article on each of the three binary discovery dimensions and on its reported
considerations of use. The results of this process are used to assess, descriptively,
statistically, and visually, differences in discipline, citation impact, and reported considerations of
use between breakthrough articles by dimension (see also Section 2.2).
To code articles on the three discovery dimensions, we use the text of the articles. For each
article we search for key phrases indicative of each dimension. Examples of key phrases used can
be found in Table A1, and were developed in three steps. First, two coders, K. S. and M. L. W.,
coded 14 breakthrough articles from 2014, which were not part of the data set of this paper, by
highlighting phrases that signal the state of the three dimensions as defined in Section 2.2. During
this stage it was found that while relevant phrases could be found throughout the whole text of the
article, the articles’ abstracts, first sentence, introductions, and conclusions are the most
informative with regard to the state of the discovery dimensions. Coding thus focused on these sections, or
on the whole text if abstract, introduction, and conclusion were inconclusive. Second, coding
differences between K. S. and M. L. W. were discussed until a consensus was reached on common
coding practices. Third, the highlighted phrases of these 14 breakthrough articles were
summarized into stylized phrases. Note that some key phrases serve as signals for more than one
dimension. For example, the phrase “On [date] we have observed […]” signals that the reported
scientific breakthrough is research object driven rather than question driven, but also that this
research object is new rather than known, because uncovering this research object is part of the
breakthrough. Following these three steps, the 335 breakthrough articles in our data set were
independently coded by both coders. For our analyses, the dimension states question-driven, new, and
against literature are coded as 1, and research object-driven, known, and in line with literature are
coded as 0.

Table 2. Article selection process

| Requirement met | # articles |
| --- | --- |
| Total listed references | 895 |
| Written in English | 895 |
| Published in announcement year | 746 |
| Published in peer-reviewed journal | 691 |
| Reports original results | 367 |
| Have DOI and in Web of Science | 340 |
| Have not been retracted | 335 |

⁴ Note that, although BotYs are announced by Science, reference lists include articles published in other
peer-reviewed journals.
⁵ As retraction of articles is essentially right-censored, because any article may be retracted in the future, there
may be a bias against older articles. However, there are few retractions in the data.
⁶ For breakthrough articles that were not published by Nature, we identify the most relevant scientific
discipline by determining the discipline of referenced articles published in Science.
⁷ Including anatomy, physiology, cell biology, biochemistry, biophysics, and paleontology (Springer
Nature, n.d.).
⁸ Including aspects of health, disease and healthcare aiming to develop knowledge, interventions, and
technology for use in healthcare (Springer Nature, n.d.).
⁹ However, in our data set, we only find articles from “biological sciences,” “physical sciences,”
“environmental sciences,” and “health sciences.”
For the identification of reported considerations of use we follow the four quadrants
proposed by Stokes (1997), which result from cross-tabulating two questions: (a) Does the article report
applied considerations of use of the scientific breakthrough, or not?; and (b) Does the article
report that the scientific breakthrough is part of a quest for fundamental understanding, or
not? Articles that do not report applied considerations of use but do report contributions to
fundamental understanding are considered basic research. Articles that only report applied
considerations of use without contributing to fundamental understanding are applied research.
Articles that report both applied and fundamental considerations of use are considered
use-inspired basic research, also known as “Pasteur’s quadrant” (Stokes, 1997). Finally, articles
may report neither applied nor fundamental considerations of use. For the development of
key phrases that signal considerations of use, we followed the same procedure as for the
development of key phrases for discovery dimensions, as described above. Such phrases were
found to be typically reported in the final paragraph(s) of the breakthrough articles. Key
phrases, as well as examples of reported considerations of use, can be found in Table A2.
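
The cross-tabulation of the two coding questions into Stokes’s quadrants can be written as a small lookup. The following Python sketch is a minimal illustration; the function name `stokes_quadrant` and the returned labels are our own shorthand for the four categories described above.

```python
def stokes_quadrant(applied: bool, fundamental: bool) -> str:
    """Map the two coded questions onto the four categories of Stokes (1997).

    applied:     the article reports applied considerations of use
    fundamental: the article reports a quest for fundamental understanding
    """
    if applied and fundamental:
        return "use-inspired basic research"  # "Pasteur's quadrant"
    if fundamental:
        return "basic research"
    if applied:
        return "applied research"
    return "neither"


# Cross-tabulate all four combinations of the two binary codes.
for applied in (False, True):
    for fundamental in (False, True):
        print(applied, fundamental, "->", stokes_quadrant(applied, fundamental))
```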
We present intercoder reliability for each of our coded dimensions in Table 3, where we
report Cohen’s kappa, which takes intercoder agreement by chance into account. We find that
kappa values are sufficiently high (Cohen, 1960). In cases of disagreement, final codes are
based on consensus between the two coders. Consensus was found for all articles in the data
set, which implies that all 335 publications originally selected serve as empirical observations.
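For readers unfamiliar with the statistic, Cohen's kappa compares the observed share of agreement p_o with the agreement p_e expected by chance from each coder's marginal code frequencies: kappa = (p_o - p_e) / (1 - p_e). The sketch below implements this standard formula; the two code lists are invented for illustration and are not the coders' actual data.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders who each assign one code per item."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of items.")
    n = len(codes_a)
    # Observed agreement: share of items on which the coders agree.
    observed = sum(x == y for x, y in zip(codes_a, codes_b)) / n
    # Chance agreement: product of the coders' marginal shares, summed over codes.
    freq_a, freq_b = Counter(codes_a), Counter(codes_b)
    expected = sum(freq_a[code] * freq_b[code] for code in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical binary codes (1 = question-driven, 0 = research object-driven).
    coder_1 = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
    coder_2 = [1, 1, 0, 0, 0, 1, 0, 1, 1, 1]
    print(round(cohens_kappa(coder_1, coder_2), 3))
```

In the hypothetical example the coders agree on 8 of 10 items (p_o = 0.8) while chance agreement is 0.52, so kappa is noticeably lower than raw agreement.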
Table 3. Intercoder reliability

| Dimension | Cohen’s kappa |
| --- | --- |
| A: Question or research object | 0.756*** |
| B: New or known | 0.768*** |
| C: Against or in line with literature | 0.700*** |
| Applied | 0.760*** |
| Basic | 0.508*** |

Note: *p < .05, **p < .005, ***p < .001.

3.3. Analysis
We run chi-square tests and Tukey’s HSD post hoc tests (Tukey, 1949) to test whether discovery dimension states are associated with scientific disciplines and reported considerations of use. We also present bar charts to assess differences visually.
To test whether discovery dimensions affect cumulative citation patterns, we run a set of regressions with the number of cumulative citations as the dependent variable, and three dummies that represent the three binary discovery dimensions as the main independent variables. Ten regression models estimate cumulative citations from 1 to 10 years after publication. For this, negative binomial regression is appropriate, as our dependent variables reflect overdispersed count data (Cameron & Trivedi, 1998): If we were to use a Poisson regression rather than a negative binomial regression to model cumulative citations 10 years after publication using all our independent variables, the residual variance of 184.073 would exceed the 220 degrees of freedom. As each BotY description can contain references to multiple breakthrough articles, we cluster standard errors at the level of the BotY description. As using cumulative citations per year makes it difficult to identify differences in the number of citations per year, we rerun our models using number of citations per year for 1 to 10 years after publication rather than cumulative citations per year for 1 to 10 years after publication as dependent variables, presented in Figure A1 and Table A3.
Individual authors may each boost the cumulative citations to their own articles by bringing their work under the attention of others (Aksnes, 2003). Therefore, we include number of authors as a control variable. As this variable is heavily skewed, we use a log transformation for number of authors.
In a second set of models, we further include dummy variables for discipline, as it is known that citation rates vary between disciplines, and we find that configurations of discovery dimensions are not randomly distributed over disciplines.
4. CHARACTERIZING BREAKTHROUGHS
4.1. Configurations of Discovery Dimensions and Associated Disciplines and Considerations of Use

Table 4. Configuration of discovery dimensions

| Configuration | A | B | C | N | % | Koshland’s type |
| --- | --- | --- | --- | --- | --- | --- |
| A0B0C0 | research object | known | in line with literature | 8 | 2.4 | - |
| A0B0C1 | research object | known | against literature | 2 | 0.6 | Challenge |
| A0B1C0 | research object | new | in line with literature | 31 | 9.3 | Chance |
| A0B1C1 | research object | new | against literature | 5 | 1.5 | Challenge + Chance |
| A1B0C0 | question | known | in line with literature | 166 | 49.6 | Charge |
| A1B0C1 | question | known | against literature | 16 | 4.8 | - |
| A1B1C0 | question | new | in line with literature | 93 | 27.8 | Charge |
| A1B1C1 | question | new | against literature | 14 | 4.2 | - |

In Table 4, we present the distribution of articles in our data set over the eight different configurations of discovery dimensions. We also compare our typology to Koshland’s. We see that some combinations of characteristics are more common than others. Notably, the majority of our articles (77%) are what Koshland would describe as Charge type discoveries: driven by a known question that is in line with theory, irrespective of whether the research object is new or known. We also find that most articles can indeed be classified according to Koshland’s typology: Only 43 articles (13%) do not fit with that typology, either because they have properties of more than one type (11%), or because they have properties of none (2%).
In this sense, the original typology of Koshland can be regarded as useful.
Figure 1 presents bar charts of the discipline and reported considerations of use for each configuration of discovery dimensions, discussed in detail below. Most articles in our data set report considerations of use for fundamental understanding only (74%). Just a few are only applied (7%), while another 17% are classified as both fundamental and applied; 2% report neither. A large share (47%) of the articles report on research on biological sciences (excluding paleontology), while 26% report on research on physical sciences (excluding astronomy). Research in the field of health sciences and environmental sciences is less common.

Figure 1. Bar chart of disciplines (left) and considerations of use (right) per configuration of discovery dimensions.

4.1.1. Dimension A
The majority of articles in this study (86%) are question driven. We find that question driven discoveries are not randomly distributed across disciplines (χ² = 100, df = 5, p < .001). Based on our post hoc test, we find that being question driven is associated with health sciences, more than with other disciplines (p < .05). Indeed, almost all (96%) health sciences articles in our data set are question-driven. Conversely, being research object driven is associated with astronomy and paleontology more than with other disciplines (p < .05). This may be because disciplines such as paleontology and astronomy more often encounter unexpected physical research objects, for example from fossil records and satellite observations, respectively.
Question-driven articles are also not randomly distributed across the four Stokes quadrants (χ² = 20, df = 3, p < .01). Our analysis suggests that being question driven is associated with reporting only applied considerations of use and with reporting both applied and fundamental considerations of use, while research object driven breakthroughs are associated with reporting on neither (p < .05).
4.1.2. Dimension B
Articles with a new question or research object are slightly less common than articles with a known question or research object (43% versus 57% of articles). Most of these are articles with a new question rather than a new research object (74%). Of our total set of articles, only 10% report on a new research object that is in line with theory. Our results indicate that this dimension and discipline are not independent (χ² = 20, df = 5, p < .001). Specifically, we find that having a new question or research object is associated more with biological sciences (but not paleontology) and astronomy, while having a known question or research object is associated with physical sciences (but not astronomy) (p < .05). It is further worth noting that breakthroughs that are specifically driven by a new research object are primarily found in astronomy (see also Figure 1). We do not find a strong association between this discovery dimension and Stokes’ four quadrants regarding considerations of use (χ² = 7, df = 3, p > .05).
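The tests of association reported in this section rest on Pearson's chi-square statistic for a contingency table. The sketch below computes the statistic and its degrees of freedom from scratch. The counts are invented for illustration (two dimension states by six discipline columns, which yields df = 5); they are not the article-level data, and converting the statistic to a p-value would additionally require the chi-square distribution from a statistics library.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a 2-D table.

    table: list of rows, each a list of observed counts.
    Assumes every row and column total is positive, so that all
    expected counts are non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


if __name__ == "__main__":
    # Hypothetical counts: rows = question-driven vs. research object-driven,
    # columns = six discipline groups.
    hypothetical_counts = [
        [30, 20, 25, 10, 5, 2],
        [3, 4, 1, 2, 9, 8],
    ]
    stat, df = chi_square_statistic(hypothetical_counts)
    print(f"chi-square = {stat:.2f}, df = {df}")
```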
4.1.3. Dimension C
Breakthroughs that go against the state-of-the-art literature are uncommon (11%) and among
them the large majority are question driven. We do not find a strong association between this
discovery dimension and discipline (χ² = 9, df = 5, p > .05). Our post hoc tests suggest that
breakthroughs going against the literature are somewhat common in paleontological articles,
while being in line with the literature is associated more with health sciences (p < .1) and
physical sciences except astronomy (p < .05). In terms of reported considerations of use,
we do not find significant evidence that being against state-of-the-art literature is associated
with reported considerations of use (χ² = 5, df = 3, p > .1).
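The shares quoted per dimension in Sections 4.1.1 to 4.1.3 can be recovered from the configuration counts in Table 4. The following sketch, with hypothetical names, tallies those counts per dimension state; the counts themselves are transcribed from Table 4.

```python
# Article counts per configuration, transcribed from Table 4.
TABLE_4_COUNTS = {
    "A0B0C0": 8, "A0B0C1": 2, "A0B1C0": 31, "A0B1C1": 5,
    "A1B0C0": 166, "A1B0C1": 16, "A1B1C0": 93, "A1B1C1": 14,
}


def share(counts, dimension, state):
    """Percentage of articles in a given dimension state, e.g. ('A', 1)."""
    total = sum(counts.values())
    key = f"{dimension}{state}"
    matching = sum(n for label, n in counts.items() if key in label)
    return 100.0 * matching / total


if __name__ == "__main__":
    print("question-driven (A1):", round(share(TABLE_4_COUNTS, "A", 1)), "%")
    print("new (B1):", round(share(TABLE_4_COUNTS, "B", 1)), "%")
    print("against literature (C1):", round(share(TABLE_4_COUNTS, "C", 1)), "%")
```

The rounded results correspond to the 86%, 43%, and 11% reported in the text.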
4.2. Citation Impact and Discovery Dimensions
Table 5 shows descriptives of the cumulative number of citations within 10 years per discovery
dimension state. On average, the articles in our data set collect 799 citations within the first 10
years after publication. However, with a median of 489 and an interquartile range of 625, this
varies broadly: While the lowest decile of the articles in our data set have fewer than 108
citations, the highest decile has more than 1,571. Interestingly, one article did not receive
any citation within 10 years (see footnote 10).
Footnote 10. This article reports the first results from the Sudbury Neutrino Observatory (Helmer & SNO
Collaboration, 2002). It may be that this article did not receive any citations because there were two other
articles that report results from the same observatory (also included here). While all three articles were
originally submitted in April 2002, the Helmer et al. paper was published in November, while the other
two were published in June.
Table 5. Summary statistics of cumulative citations within 10 years per discovery dimension state

| State | Dimension | N | Mean | St. dev. | Min | 1st quartile | Median | 3rd quartile | Max |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A1 | question-driven | 288 | 867.1 | 1,203.9 | 27 | 268 | 526 | 917 | 9,356 |
| A0 | research-object-driven | 47 | 412.3 | 433.2 | 0 | 150 | 302 | 418 | 2,011 |
| B1 | new | 144 | 740.8 | 1,094.8 | 0 | 240 | 444 | 738 | 9,356 |
| B0 | known | 191 | 846.6 | 1,166.6 | 27 | 247 | 505 | 917 | 8,780 |
| C1 | against literature | 37 | 769.8 | 962.0 | 27 | 219 | 519 | 898 | 4,117 |
| C0 | in line with literature | 298 | 803.2 | 1,157.3 | 0 | 246 | 484 | 868 | 9,356 |
| | Total | 335 | 799.2 | 1,134 | 0 | 245 | 489 | 870 | 9,356 |
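The overdispersion that motivates the choice of negative binomial over Poisson regression in Section 3.3 can be illustrated with the summary statistics in Table 5. Under a Poisson distribution the variance equals the mean, so a variance-to-mean ratio far above 1 signals overdispersion. The sketch below is only a rough, unconditional check based on the reported means and standard deviations; it is not the residual-based comparison described in the Analysis section, and the function name is hypothetical.

```python
def variance_to_mean_ratio(mean: float, std_dev: float) -> float:
    """Ratio of variance to mean; equals 1 for a Poisson-distributed count."""
    if mean <= 0:
        raise ValueError("Mean must be positive.")
    return std_dev ** 2 / mean


# Mean and standard deviation of cumulative citations, taken from Table 5.
TABLE_5_STATS = {
    "question-driven": (867.1, 1203.9),
    "research-object-driven": (412.3, 433.2),
    "total": (799.2, 1134.0),
}

if __name__ == "__main__":
    for state, (mean, sd) in TABLE_5_STATS.items():
        print(f"{state}: variance/mean = {variance_to_mean_ratio(mean, sd):.0f}")
```

For every row the variance exceeds the mean by several orders of magnitude, which is consistent with treating the citation counts as overdispersed.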
To test if the discovery dimension can explain some of the variation in citation counts, we
present the incidence rate ratios of negative binomial regression models including control variables
in Figure 2, with one regression for each of the 10 years. Incidence rate ratios for the effect of being
question-driven, driven by a new question or research object, or being against literature are
presented relative to being research object-driven, driven by a known question or research object,
and being in line with literature, respectively.
Table 6 presents the regression coefficients of our models, where cumulative citations 1, 2, 5,
and 10 years after publication are used as dependent variables. Models based on cumulative
citations after 3, 4, 6, 7, 8, and 9 years were omitted from this table for readability reasons.
Dummies for the three binary dimensions (with question-driven, new, and against literature coded
as 1, and research object-driven, known, and in line with literature coded as 0) and control variables
for # authors (log) and discipline, with biological sciences as reference category, are included.
Figure 2. Incidence rate ratios based on negative binomial regression on cumulative citations after N years, after controlling for the log of the number of authors and discipline.

Table 6. Coefficients for negative binomial regression models

| | (1) cumu_1y | (2) cumu_2y | (3) cumu_5y | (4) cumu_10y | (5) cumu_1y | (6) cumu_2y | (7) cumu_5y | (8) cumu_10y |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| # authors (log) | 0.360*** (0.120) | 0.337*** (0.096) | 0.300*** (0.086) | 0.195 (0.143) | 0.344*** (0.119) | 0.319*** (0.100) | 0.263*** (0.091) | 0.176 (0.144) |
| Dimension A | 0.570*** (0.303) | 0.655*** (0.310) | 0.827*** (0.422) | 0.724* (0.586) | 0.382* (0.254) | 0.528** (0.278) | 0.749*** (0.409) | 0.699* (0.602) |
| Dimension B | 0.174 (0.113) | 0.135 (0.099) | 0.110 (0.108) | 0.009 (0.138) | 0.110 (0.107) | 0.092 (0.096) | 0.073 (0.107) | −0.030 (0.148) |
| Dimension C | 0.090 (0.171) | 0.109 (0.177) | 0.086 (0.203) | 0.148 (0.309) | 0.038 (0.165) | 0.082 (0.196) | 0.069 (0.232) | 0.160 (0.361) |
| Constant | 3.036*** (0.202) | 3.749*** (0.195) | 4.677*** (0.225) | 5.576*** (0.304) | 3.424*** (0.230) | 4.036*** (0.224) | 4.875*** (0.264) | 5.752*** (0.355) |
| Discipline | excluded | excluded | excluded | excluded | included | included | included | included |
| Observations | 335 | 335 | 335 | 244 | 335 | 335 | 335 | 244 |
| LL | −1,840 | −2,084 | −2,393 | −1,866 | −1,758 | −1,993 | −2,290 | −1,755 |
| AIC | 3,691 | 4,177 | 4,796 | 3,743 | 3,533 | 4,001 | 4,596 | 3,527 |

Note: *** p < .001, ** p < .01, * p < .05.

4.2.1. Dimension A
We find that being question driven has a positive effect on cumulative citations of scientific breakthrough articles. After 10 years, articles that are question driven are estimated to receive twice as many citations as articles that are research object driven (Model (8): IR = e^{0.699} = 2.011). There is a smaller positive effect immediately after publication (Model (5): IR = e^{0.382} = 1.465), which increases over time and seems to stabilize after 4–5 years and decrease slightly after 8 years. Controlling for discipline in Models 5–8 slightly reduces the effect of being question driven, suggesting that part of the effect seen in Models 1–4 is, in fact, due to high citation rates of disciplines that are associated with being question driven. However, this does not alter our conclusion that being question driven has a positive effect on cumulative citations of scientific breakthrough articles.
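Because a negative binomial regression models the logarithm of the expected count, a coefficient is converted into an incidence rate ratio by exponentiation. The short sketch below applies this to the two Dimension A coefficients quoted above; with the rounded coefficients it reproduces the reported ratios up to rounding in the third decimal. The function names are hypothetical.

```python
import math


def incidence_rate_ratio(coefficient: float) -> float:
    """Convert a log-link regression coefficient into an incidence rate ratio."""
    return math.exp(coefficient)


def percent_change(coefficient: float) -> float:
    """Expected percentage change in the count for a dummy switching from 0 to 1."""
    return (incidence_rate_ratio(coefficient) - 1.0) * 100.0


if __name__ == "__main__":
    # Dimension A coefficients as quoted in the text for Models (5) and (8).
    for model, beta in (("Model (5), 1 year", 0.382), ("Model (8), 10 years", 0.699)):
        print(f"{model}: IR = {incidence_rate_ratio(beta):.2f}, "
              f"change = {percent_change(beta):.0f}%")
```

An incidence rate ratio of about 2 corresponds to roughly a doubling of expected cumulative citations, which is the reading given in the text.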
4.2.2. Dimension B
We do not observe a significant association between this dimension and cumulative citations.
Our coefficients suggest that there may be a small positive effect of being driven by a new
question or research object on cumulative citations shortly after publication, which decreases
in later years. This may be caused by the unexpectedness and novelty of the new question or
new physical evidence introduced in the breakthrough article, and the sudden interest that this
may spark. However, this is not a significant finding.
4.2.3. Dimension C
We do not find a significant association between articles going against the literature and
cumulative citations. Upon visual inspection, there is some indication that breakthrough articles
driven by a question or research object that is against state-of-the-art literature receive more
citations in later years (years 9 and 10). The trend observed is consistent with the idea that
paradigm-shifting discoveries require more time to have an impact before they can be integrated
in future knowledge development. However, this trend is not statistically significant.
The results of Models 9–16, where we use citations per year rather than cumulative cita-
tions per year as dependent variable, are presented in Figure A1 and Table A3. These results
are in line with our earlier results. Again, we find that only the question-driven dimension
significantly affects the number of citations received. We find that the difference in the number
of citations per year between question-driven and research object-driven articles is biggest
after 4–5 years.
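The relation between the two dependent variables is simple differencing: citations received in a given year equal the cumulative count at the end of that year minus the cumulative count at the end of the previous year. The sketch below shows this conversion under the assumption that cumulative counts are available per year; the example series is invented.

```python
def yearly_from_cumulative(cumulative):
    """Convert cumulative citation counts per year into citations per year."""
    yearly = []
    previous = 0
    for value in cumulative:
        if value < previous:
            raise ValueError("Cumulative counts must be non-decreasing.")
        yearly.append(value - previous)
        previous = value
    return yearly


if __name__ == "__main__":
    # Hypothetical cumulative citations 1 to 4 years after publication.
    print(yearly_from_cumulative([12, 40, 95, 160]))
```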
5. DISCUSSION
In this paper, we have developed a typology of scientific breakthroughs and applied this ty-
pology to characterize a set of articles reporting on scientific breakthroughs. Using Koshland’s
Charge-Chance-Challenge theory of scientific discovery as a starting point, we propose that
scientific breakthroughs can be characterized along three dimensions: (a) whether the discov-
ery is question driven or research object driven; (b) whether the discovery contributes to a
known question or research object or introduces a new one; and (c) whether the discovery
is in line with, or against, state-of-the-art literature. We subsequently used the typology to
characterize 335 breakthrough articles along the three dimensions and analyzed how
breakthrough characteristics relate to scientific disciplines, citation impact, and considerations of
use for fundamental understanding and application.
One of our main findings holds that the large majority of breakthrough discoveries can be
classified as one of Koshland’s discovery types within his Cha-Cha-Cha framework. However,
we also observed that a small proportion of breakthroughs could not be characterized as any
of Koshland’s types, and some other articles fell into multiple Koshland types. Based on this
finding we conclude that, rather than distinguishing between Charge, Chance, and Challenge
types, breakthroughs can better be understood as being question driven or research object
driven, introducing a new question/research object or a known question/research object, and
having a contribution that is against or in line with state-of-the-art literature. We believe that
our framework marks an improvement over the original Cha-Cha-Cha theory, as we have made
the underlying dimensions explicit and orthogonal to one another, expanding the typology from
3 to 2^3 = 8 types. Our framework, then, can be used in future research to further probe the
antecedents and effects of scientific breakthroughs. It can equally be used to analyze differences
between characteristics of breakthrough and nonbreakthrough discoveries. A logical extension
of this paper is also to study whether the configurations of discovery dimensions discussed here
are distributed differently over breakthroughs than over nonbreakthroughs, and to test whether
the citation patterns we found are also observed for nonbreakthroughs.
Our main empirical finding holds that most scientific breakthroughs are driven by an already
existing question and in line with the state-of-the-art literature. This finding broadens our view of
science in that it questions the popular view of scientific breakthroughs as radical, paradigm-
shifting discoveries (e.g., Evans, 2016; Ventegodt & Merrick, 2004). Rather, it suggests that the
majority of scientific discoveries that are recognized as breakthroughs are better described as
“normal science” (Kuhn, 1962).
Our analysis also shows that articles reporting on scientific breakthroughs vary considerably in
their citation impact. In particular, breakthrough articles that were driven by a research object
rather than a question receive far fewer citations. This finding has implications for the interpreta-
tion of earlier research on scientific breakthroughs. Previous research has mainly analyzed scien-
tific breakthroughs based on citation impact, thereby considering breakthroughs as a
homogeneous group of discoveries (Ponomarev et al., 2014; Uzzi et al., 2013; Zeng et al.,
2017). In contrast, our findings suggest that earlier research aimed at identifying supportive con-
ditions for scientific breakthroughs did not recognize the variety of breakthroughs and may have
been biased against a minority of breakthroughs driven by research objects. Therefore, their find-
ings may not be generalizable to all scientific breakthroughs. For literature on scientific break-
throughs, a next step is to identify how conditions such as team composition and sponsoring
affect the occurrence of a variety of scientific breakthroughs, and in particular those breakthroughs
that are research object-driven, as we have shown that these have been underrepresented in the
literature thus far.
In this research, we relied on discoveries that were marked as scientific breakthroughs by
Science. This operationalization of scientific breakthroughs has several implications for
the generalizability of our findings. In the first place, our research only includes discoveries
that are recognized as scientific breakthroughs within a year after publication. Discoveries that
are recognized as such in a later phase may not have the same characteristics. For example,
their citation impact may differ over time. An interesting avenue for future research would be
to distinguish between discoveries that are received as breakthroughs shortly after publication
and those that are recognized as breakthroughs later on. A relevant question is then
whether the relative prominence of the three dimensions introduced here differs between early
and delayed recognition. In the second place, it is not unlikely that the nomination for BotY in
itself affects the way a discovery is received. The increased visibility of the discovery may
inspire others to refine the discovery in other research projects, and can lead to an increase
in citations or even an increase in the likelihood of receiving a significant prize, such as a
Nobel Prize or a Fields Medal. For future research, we encourage alternative approaches to
identifying scientific breakthroughs that are more sensitive to delayed recognition and are not
based on external assessments. One such approach has been developed by Small, Tseng, and
Patek (2017), who identify and characterize biomedical discoveries based on automated text
analysis of citing sentences and cocitation analysis.
Our analysis of breakthrough discoveries is further limited by what has been reported in the
scientific articles. As such, we must limit ourselves to an analysis of the reported drive of the scien-
tific discoveries observed, which may not be the same as the actual drive of the discovery. Indeed,
authors may present the process of discovery as more linear and rational than it actually has been
(Myers, 1985). Similarly, the authors’ motivation to write and publish the article may be different
from their motivation to start the reported research project. For example, their original line of enquiry
may have resulted in a serendipitous finding that solves an unexpected problem in another line of
enquiry (Yaqub, 2018), which might lead the authors to change their narrative as well.
Furthermore, our analysis is constrained by the small number of articles considered. With more
observations, we could test differences in citation patterns of combinations of dimensions, rather
than for single dimensional states. This may help us understand whether breakthrough articles
that go against existing theory are accepted by the scientific community faster if they provide an
answer to a long-standing question, for example. This is likely, as the question-driven approach
of such articles may provide more legitimacy to the anomalous finding than if it were driven by
new evidence. We therefore encourage others to extend our analysis to a larger set of break-
through articles, potentially also including a broader range of scientific disciplines.
ACKNOWLEDGMENTS
The authors thank Kyle Siler for help in coding articles and Iris Wanzenböck for useful comments
on drafts.
AUTHOR CONTRIBUTIONS
M. Wuestman: Conceptualization, Formal Analysis, Investigation, Methodology, Visualization,
Writing—original draft. J. Hoekman: Conceptualization, Methodology, Supervision, Validation,
Writing—review & editing. K. Frenken: Funding acquisition, Conceptualization, Supervision,
Methodology, Validation, Writing—review & editing.
COMPETING INTERESTS
The authors have no competing interests.
FUNDING INFORMATION
M. Wuestman and K. Frenken are financed by the Vici grant (453-14-014), awarded by the
Netherlands Organisation for Scientific Research (NWO). J. Hoekman is financed by a Veni grant
(451-15-037), also awarded by NWO.
DATA AVAILABILITY
The main data set for the article is available at https://doi.org/10.6084/m9.figshare.12530042.
The citation data cannot be made publicly available due to the licensing contract terms of
WoS.
REFERENCES
AAAS. (2018). 2018 Breakthrough of the Year.
Aksnes, D. W. (2003). Characteristics of highly cited papers. Research
Evaluation, 12(3), 159–170.
Boyd, R., & Richerson, P. J. (1985). Culture and the evolutionary
process. Chicago: University of Chicago Press.
Brunet, M., Guy, F., Pilbeam, D., Mackaye, H. T., Likius, A.,
Ahounta, D., … Zollikofer, C. (2002). A new hominid from the
Upper Miocene of Chad, Central Africa. Nature, 418(6899), 801.
https://doi.org/10.1038/nature01005
Cameron, C., & Trivedi, P. K. (1998). Regression analysis of count
data. New York: Cambridge University Press.
Cavalli-Sforza, L., & Feldman, M. (1981). Cultural transmission
and evolution: A quantitative approach. Princeton: Princeton
University Press.
Cohen, J. (1960). A coefficient for agreement for nominal scales.
Educational and Psychological Measurement, 20(1). https://doi.
org/10.1177/001316446002000104
Copeland, S. (2019). On serendipity in science: Discovery at the
intersection of chance and wisdom. Synthese, 196(6), 2385–2406.
https://doi.org/10.1007/s11229-017-1544-3
Evans, J. P. (2016). (Mis)understanding science: The problem with
scientific breakthroughs. Hastings Center Report, 46(5), 11–13.
https://doi.org/10.1002/hast.611
Gabunia, L., Vekua, A., Lordkipanidze, D., Swisher, C. C., Ferring, R.,
Justus, A., … Mouskhelishvili, A. (2000). Earliest Pleistocene
hominid cranial remains from Dmanisi, Republic of Georgia:
Taxonomy, geological setting, and age. Science, 288(5468),
1019–1025. https://doi.org/10.1126/science.288.5468.1019
Geijsen, N., Horoschak, M., Kim, K., Gribnau, J., Eggan, K., &
Daley, G. Q. (2004). Derivation of embryonic germ cells and
male gametes from embryonic stem cells. Nature, 427(6970),
148–154. https://doi.org/10.1038/nature02247
Green, R. E., Krause, J., Briggs, A. W., Maricic, T., Stenzel, U.,
Kircher, M., … Pääbo, S. (2010). A draft sequence of the
Neandertal genome. Science, 328(5979), 710–722. https://doi.
org/10.1126/science.1188021
Grumet, G. W. (2008). Insubordination and genius: Galileo,
Darwin, Pasteur, Einstein, and Pauling. Psychological Reports,
102(3), 819–847. https://doi.org/10.2466/PR0.102.3.819-847
Hage, J., & Mote, J. (2010). Transformational organizations and
a burst of scientific breakthroughs: The Institut Pasteur and bio-
medicine, 1889–1919. Social Science History, 34(1), 13–46.
https://doi.org/10.1017/S0145553200014061
Helmer, R. L., & SNO Collaboration. (2002). First results from the
Sudbury Neutrino Observatory. Nuclear Physics B – Proceedings
Supplements, 111(1–3), 122–127.
Hilgard, J., & Jamieson, K. H. (2017). Does a scientific break-
through increase confidence in science? News of a Zika vaccine
and trust in science. Science Communication, 39(4), 548–560.
https://doi.org/10.1177/1075547017719075
Hinrichs, M. M., Seager, T. P., Tracy, S. J., & Hannah, M. A. (2017).
Innovation in the Knowledge Age: Implications for collaborative
science. Environment Systems and Decisions, 37(2), 144–155.
https://doi.org/10.1007/s10669-016-9610-9
Hsieh, H. F., & Shannon, S. E. (2005). Three approaches to qualitative
content analysis. Qualitative Health Research, 15(9), 1277–1288.
https://doi.org/10.1177/1049732305276687
Koshland, D. E. (2007). The Cha-Cha-Cha theory of scientific dis-
covery. Science, 317(5839), 761–762. https://doi.org/10.1126/
science.1147166
Kuhn, T. S. (1962). The structure of scientific revolutions. Structure
(Vol. 2). https://doi.org/10.1046/j.1440-1614.2002.t01-5-01102a.x
Leonhardt, U. (2006). Optical conformal mapping. Science, 312(June),
1777–1781. https://doi.org/10.1126/science.1126493
Logan, D. C. (2009). Known knowns, known unknowns, unknown
unknowns and the propagation of scientific enquiry. Journal of
Experimental Botany, 60(3), 712–714.
Marx, W., & Bornmann, L. (2013). The emergence of plate tectonics
and the Kuhnian model of paradigm shift: A bibliometric case study
based on the Anna Karenina principle. Scientometrics, 94(2),
595–614. https://doi.org/10.1007/s11192-012-0741-6
Meyers, M. A. (2011). Happy accidents: Serendipity in major
medical breakthroughs in the twentieth century. New York:
Arcade Publishing.
Mukherjee, S., Romero, D. M., Jones, B., & Uzzi, B. (2017). The
nearly universal link between the age of past knowledge and
tomorrow’s breakthroughs in science and technology: The hot-
spot. Science Advances, 3(4), e1601315. https://doi.org/10.1126/
sciadv.1601315
Myers, G. (1985). Texts as knowledge claims: the social construction
of two biology articles. Social Studies of Science, 15(4), 593–630.
Nelson, R. R., & Winter, S. G. (1982). An evolutionary theory of
economic change (Vol. 93). Cambridge, MA: Belknap. https://
doi.org/10.2307/2232409
Palmer, D. M., Barthelmy, S., Gehrels, N., Kippen, R. M., & Cayton, T.
(2005). A giant gamma-ray flare from the magnetar SGR 1806-20.
Nature, 434, 1107–1109. https://doi.org/10.1038/nature03525
Ponomarev, I. V., Williams, D. E., Hackett, C. J., Schnell, J. D., &
Haak, L. L. (2014). Predicting highly cited papers: A method for
early detection of candidate breakthroughs. Technological
Forecasting and Social Change, 81(1), 49–55. https://doi.org/
10.1016/j.techfore.2012.09.017
Saldaña, J. (2015). The coding manual for qualitative researchers.
Thousand Oaks, CA: SAGE Publications Ltd.
Schilling, M. A., & Green, E. (2011). Recombinant search and
breakthrough idea generation: An analysis of high impact papers
in the social sciences. Research Policy, 40(10), 1321–1331.
https://doi.org/10.1016/j.respol.2011.06.009
Small, H., Tseng, H., & Patek, M. (2017). Discovering discoveries:
Identifying biomedical discoveries using citation contexts.
Journal of Informetrics, 11(1), 46–62. https://doi.org/10.1016/j.
joi.2016.11.001
Springer Nature. (n.d.). Latest research and news by subject. Retrieved
30 January 2020, from https://www.nature.com/subjects
Stokes, D. E. (1997). Pasteur’s quadrant: Basic science and techno-
logical innovation. Washington, DC: Brookings Institution Press.
Toulmin, S. E. (1967). The evolutionary development of natural sci-
ence. American Scientist, 55, 456–471.
Tukey, J. W. (1949). Comparing individual means in the analysis of
variance. Biometrics, 5(2), 99–114.
Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013). Atypical
combinations and scientific impact. Science, 342(6157), 468–472.
https://doi.org/10.1126/science.1240474
Ventegodt, S., & Merrick, J. (2004). Philosophy of science: How to
identify the potential research for the day after tomorrow? The
Scientific World Journal, 4, 483–489. https://doi.org/10.1100/
tsw.2004.103
Winnink, J. J., & Tijssen, R. J. W. (2014). R&D dynamics and scien-
tific breakthroughs in HIV/AIDS drugs development: The case of
integrase inhibitors. Scientometrics, 101(1), 1–16. https://doi.org/
10.1007/s11192-014-1330-7
Wu, L., Wang, D., & Evans, J. A. (2019). Large teams have devel-
oped science and technology; small teams have disrupted it.
Nature, 566, 378–382. https://doi.org/10.2139/ssrn.3034125
Yaqub, O. (2018). Serendipity: Towards a taxonomy and a theory.
Research Policy, 47(1), 169–179. https://doi.org/10.1016/j.respol.2017.10.007
Zeng, C. J., Qi, E. P., Li, S. L., Stanley, H. E., & Ye, F. Y. (2017).
Statistical characteristics of breakthrough discoveries in science
using the metaphor of black and white swans. Physica A:
Statistical Mechanics and Its Applications, 487, 40–46.
https://doi.org/10.1016/j.physa.2017.05.041
APPENDIX
Table A1. Examples and key phrases per dimension state

| Dimension | State | Example | Key phrases |
| --- | --- | --- | --- |
| A: The discovery is driven by a question, or by a research object | A1: Driven by question. 2006 Cloaking Technology, 10.1126/science.1126493 | “This study develops a general recipe for the design of media that create perfect invisibility within the accuracy of geometrical optics” | “It has been a long-standing question whether […]”; “[…] remains largely unknown”; “We hypothesize that […]”; “We do not understand the working of […]” |
| | A0: Driven by research object. 2008 Seeing Exoplanets, 10.1126/science.1166585 | “High-contrast observations with the Keck and Gemini telescopes have revealed three planets orbiting the star HR 8799 […].” | “We report the discovery of […]”; “The discovery of […] sheds new light upon […]” |
| B: The discovery introduces a new question/research object, or contributes to a known question/research object | B1: New. New question: 2009 Live Long and Prosper, 10.1038/nature08221 | “Inhibition of the TOR signalling pathway by genetic or pharmacological intervention extends lifespan in invertebrates, […]. However, whether inhibition of mTOR signalling can extend life in a mammalian species was unknown.” | “We raise the question whether […]”; “Since we know […] and […], it follows logically to ask […]”; “we hypothesize that […]” |
| | B1: New. New research object: 2002 The Tournai Fossil, 10.1038/nature.00879 | “Here we report the discovery of six hominid specimens from Chad, central Africa.” | “We report the discovery of […]”; “On [date] we have observed…” |
| | B0: Known. Known question: 2001 Carbon Consensus, 10.1126/science.1057320 | “Despite widespread consensus about the existence of a terrestrial carbon sink […], the size, spatial distribution, and cause of the sink remain uncertain (refs).” | “Earlier research has raised the question whether […]”; “It has been a long-standing question whether […]” |
| | B0: Known. Known research object: 2006 Tiktaalik Fossil Fish, 10.1038/nature04637 | “Here we describe the pectoral appendage of a member of the sister group of tetrapods, Tiktaalik roseae (reported elsewhere, ed.), which is morphologically and functionally transitional between a fin and a limb.” | “The discovery of […], by [ref], provides us with new insights into […]” |
| C: The question/research object is against, or in line with, state-of-the-art literature | C1: Against literature. 2000 New Cells for Old, 10.1126/science.288.5471.1660 | “The differentiation potential of stem cells in tissues of the adult has been thought to be limited to cell lineages present in the organ from which they were derived. […] We show here that neural stem cells from the adult mouse brain can […] give rise to cells of all germ layers.” | “[…] which is against the theory of […]”; “There are two competing theories, and we present evidence against one in support of the other” |
| | C0: In line with literature. 2006 Shrinking Ice, 10.1029/2006GL026369 | “We estimate mass trends over Antarctica using gravity variations […], similar to a recent estimate of ice mass loss from satellite altimetry and remote sensing data.” | “[…] which confirms the theory of […]”; “[…] so that we can extend the model introduced by […]” |
Table A2. Examples and key phrases for reported considerations of use

| Considerations of use | Examples | Key phrases |
| --- | --- | --- |
| Applied. 2009 Live Long and Prosper, 10.1038/nature08221 | “These findings have implications for further development of interventions targeting mTOR for the treatment and prevention of age-related diseases.” | “These findings will help us develop […]”; “Our findings can be used to design […]”; “Our findings can provide a platform to cure […]”; “Potential uses of our findings are […]” |
| Fundamental. 2006 Biodiversity and Speciation, 10.1126/science.1126121 | “This work has specific implications for understanding the evolutionary mechanisms responsible for adaptive phenotypic change.” | “These findings will help us understand […]”; “These findings provide insight into […]” |
Table A3. Coefficients for negative binomial regression models, with noncumulative citation counts

| | cit_1y (9) | cit_2y (10) | cit_5y (11) | cit_10y (12) | cit_1y (13) | cit_2y (14) | cit_5y (15) | cit_10y (16) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| # authors (log) | 0.343*** | 0.314*** | 0.202*** | 0.082 | 0.324*** | 0.293*** | 0.182*** | 0.065 |
| | (0.043) | (0.045) | (0.061) | (0.100) | (0.044) | (0.047) | (0.064) | (0.105) |
| Dimension A | 0.570*** | 0.746*** | 1.031*** | 0.715** | 0.399** | 0.662*** | 0.978*** | 0.692** |
| | (0.157) | (0.162) | (0.222) | (0.299) | (0.162) | (0.172) | (0.239) | (0.316) |
| Dimension B | 0.135 | 0.099 | 0.090 | −0.027 | 0.077 | 0.069 | 0.053 | −0.106 |
| | (0.109) | (0.112) | (0.154) | (0.212) | (0.110) | (0.117) | (0.162) | (0.225) |
| Dimension C | 0.0004 | 0.126 | 0.053 | 0.241 | −0.046 | 0.114 | 0.041 | 0.259 |
| | (0.166) | (0.172) | (0.234) | (0.314) | (0.164) | (0.173) | (0.240) | (0.319) |
| Constant | 2.915*** | 3.062*** | 3.076*** | 3.404*** | 3.276*** | 3.270*** | 3.233*** | 3.602*** |
| | (0.197) | (0.204) | (0.279) | (0.388) | (0.223) | (0.237) | (0.329) | (0.457) |
| Discipline | excluded | excluded | excluded | excluded | included | included | included | included |
| Observations | 335 | 335 | 335 | 244 | 335 | 335 | 335 | 244 |
| Log Likelihood | −1,760 | −1,837 | −1,829 | −1,265 | −1,681 | −1,757 | −1,752 | −1,192 |
| Akaike Inf. Crit. | 3,531 | 3,685 | 3,668 | 2,539 | 3,377 | 3,531 | 3,519 | 2,400 |

Note: *** p < .001, ** p < .01, * p < .05.
Figure A1. Incidence rate for noncumulative citations after N years, after controlling for the log of the number of authors and discipline.
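
To make the link between regression coefficients and incidence rates concrete, the following minimal sketch exponentiates the Dimension A coefficients of models (13) to (16) in Table A3. It assumes the usual log link of negative binomial regression, under which exp(coefficient) is an incidence rate ratio. The names `DIM_A` and `incidence_rate_ratio` and the normal-approximation interval (z = 1.96) are illustrative assumptions and are not taken from the article, so the output need not match Figure A1 exactly.

```python
import math

# Dimension A coefficients and standard errors copied from Table A3,
# models (13)-(16), i.e. the specifications that include discipline.
DIM_A = {
    "cit_1y": (0.399, 0.162),
    "cit_2y": (0.662, 0.172),
    "cit_5y": (0.978, 0.239),
    "cit_10y": (0.692, 0.316),
}


def incidence_rate_ratio(coef, se, z=1.96):
    """Return (IRR, lower, upper) for a coefficient from a log-link count model.

    The bounds use a normal approximation on the log scale (an assumption
    made here for illustration, not a procedure reported in the article).
    """
    return (
        math.exp(coef),
        math.exp(coef - z * se),
        math.exp(coef + z * se),
    )


irr_table = {
    window: incidence_rate_ratio(coef, se)
    for window, (coef, se) in DIM_A.items()
}

for window, (irr, lower, upper) in irr_table.items():
    print(f"{window}: IRR = {irr:.2f} (approx. 95% interval {lower:.2f} to {upper:.2f})")
```

Read this way, the coefficient of 0.399 in model (13) corresponds to an incidence rate ratio of roughly 1.49: holding the number of authors and discipline constant, question-driven breakthrough articles receive about 49% more citations in the first year than object-driven ones.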