OPINION

Which aspects of the Open Science agenda are
most relevant to scientometric research and
publishing? An opinion paper

Lutz Bornmann1, Raf Guns2, Michael Thelwall3, and Dietmar Wolfram4

1Science Policy and Strategy Department, Administrative Headquarters of the Max Planck Society,
Hofgartenstr. 8, 80539 Munich, Germany
2Centre for R&D Monitoring, Faculty of Social Sciences, University of Antwerp,
Middelheimlaan 1, 2020 Antwerpen, Belgium
3Statistical Cybermetrics Research Group, School of Mathematics and Computer Science,
University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1LY, UK
4School of Information Studies, University of Wisconsin—Milwaukee, P.O. Box 413, Milwaukee, WI 53201, USA

Keywords: bibliometrics, open access, open data, open peer review, Open Science, scientometrics

ABSTRACT

Open Science is an umbrella term that encompasses many recommendations for possible changes
in research practices, management, and publishing with the objective to increase transparency
and accessibility. This has become an important science policy issue that all disciplines should
consider. Many Open Science recommendations may be valuable for the further development of
research and publishing, but not all are relevant to all fields. This opinion paper considers the
aspects of Open Science that are most relevant for scientometricians, discussing how they can be
usefully applied.

1. INTRODUCTION

Although modern science has elements of openness at its core, such as the expectation that
results are shared in some form rather than kept secret, there is currently a call for increased
openness. Spellman, Gilbert, and Corker (2018) characterize Open Science as a collection of
actions that contribute to enhancing the replicability, robustness, and accessibility of research.
Increased accessibility may also support diversity in research. Sociologist Robert K. Merton
formulated the ethos of science as follows: communalism, universalism, disinterestedness,
and organized skepticism (CUDOS: Merton, 1942). Nevertheless, the PLACE counter-norms—
proprietary, local, authority, commissioned, expert—prevail in the laboratory context of dis-
covery (Latour, 1987; Latour & Woolgar, 1979; Ziman, 2000). Intellectual property rights
have increasingly penetrated the core process of knowledge production and control in the
emerging knowledge-based economies (Whitley, 1984). Thus, openness is often not achieved
in practice.

Open Science is an important topic in science policy debates, with many recommendations
to improve transparency in research and publishing (Fecher & Friesike, 2014). As the Open
Research Glossary (https://figshare.com/articles/Open_Research_Glossary/1482094) suggests,
the term Open Science relates to multiple phases in the research and publication process: from
starting a study (e.g., by preregistering it) to assessing the value of the published outcomes


Citation: Bornmann, L., Guns, R.,
Thelwall, M., & Wolfram, D. (2021).
Which aspects of the Open Science
agenda are most relevant to
scientometric research and
publishing? An opinion paper.
Quantitative Science Studies, 2(2),
438–453. https://doi.org/10.1162
/qss_e_00121

DOI:
https://doi.org/10.1162/qss_e_00121

Received: 15 March 2020
Accepted: 21 January 2021

Corresponding Author:
Lutz Bornmann
bornmann@gv.mpg.de

Handling Editor:
Staša Milojević

Copyright: © Lutz Bornmann, Raf
Guns, Michael Thelwall, and Dietmar
Wolfram. Published under a Creative
Commons Attribution 4.0 International
(CC BY 4.0) license.

The MIT Press

(e.g., by using metrics other than traditional bibliometrics, such as data citations). Open Science
encourages multiple citable units for a given piece of research, including preregistration docu-
ments for studies, open notes, multiple manuscript versions made available through Open
Access (OA), associated data sets and software, author responses to reviewer comments, and
the accompanying reviews that are public in the case of open peer-review sources.

Open Science is partly driven by a science policy agenda (García-Peñalvo Francisco, García
de Figuerola, & Merlo José, 2010). The emergence of this term and the related movements is
intrinsically connected to new forms of knowledge production (postacademic science and
Triple Helix) in the knowledge-based economy, and the opportunities that web-based technol-
ogies offer. There would be no possibility for any of the “open the black box” steps (e.g., prereg-
istration of research, disentangling of the whole research process, open data) without the
internet. At the same time, Open Science is hailed as the way to give a push to interdisciplinary
research and to the reuse of data (perceived by science policy as one of the motors behind
innovations). However, there are also pushes towards Open Science from within the academic
community. For example, the Open Journal Systems (OJS) software, developed by the Public
Knowledge Project, has been embraced by research communities in countries that cannot afford
scientific publishers (Willinsky, 2005).

Open Science shares many commonalities with the Free and Open Source Software (FOSS)
mouvement (Tennant, Agarwal et al., 2020). FOSS is itself an umbrella term used to refer to both
the Free Software movement, which started in the 1980s, and the Open Source movement,
which started around 1998. Although Free Software focuses on social issues, embodied in four
“fundamental freedoms,” Open Source is a more pragmatic reformulation of Free Software
ideals, by focusing mainly on the practical advantages of source code availability (Tozzi, 2017).

The term Open Science has many facets, and it is not clear which tools, initiatives, ideas, and
recommendations are relevant and meaningful for good scientific practices in a given research
field. For example, recommendations for Open Science collaborations (Gold, Ali-Khan et al.,
2019) do not seem relevant for scientometrics due to the typically small-scale nature of research
in this field (in terms of the number of coauthors associated with a paper). In addition, there is
no general agreement on which facets are part of Open Science. It has even been argued that the
Open Science movement is not necessary for the (current) science system, as science is (still)
open to some extent (Fecher & Friesike, 2014). Watson (2015) argues that Open Science is “just
(good) science.” Independent of the roots of the ideas that are discussed in the Open Science
context, it might be worthwhile for every discipline to engage with Open Science recommen-
dations and identify relevant ideas.

Every Open Science recommendation has the potential to improve research as well as an
inherent cost, even if it is only the time taken to implement it. For example, data sharing can
foster collaboration and more rapid development of a research topic, but can also require sub-
stantial effort to arrange data in a common format and create an effective set of metadata.
Additional requirements that add to the burden of research publishing potentially disadvantage
researchers (e.g., in less wealthy nations) who have little time, equipment, and support to deal with
them easily.

Scientometric research often analyzes collections of journal articles, and these have been
affected by Open Science, such as with the introduction of OA general megajournals.
The possibilities of distributed digital publishing may change the nature or importance of journal
articles and lead to a greater focus on outputs of all types (Guédon, Kramer et al., 2019). Journal
articles have largely changed from static paper objects to electronic documents with embedded
links to other documents (in the references) as well as data and visualizations, all of which

increase the importance of open infrastructures. Bourne, Clark et al. (2012) call for further
innovation in the forms and technologies of scholarly communication as well as the underlying
business and access models. Such developments may have a substantial impact on future
scientometric research, for example, if data sets and software become recognized as first-class
research objects in their own right.

Scientometrics research is affected by this new Open Science movement in two ways: as a
research field itself, and as a research field monitoring the science or academic system. In this
paper, we mainly focus on the former, occasionally touching also on the latter. This opinion
paper identifies recommendations proposed in the Open Science context that may be relevant
and meaningful for scientometric research and publishing, although not necessarily in all or
most circumstances. We discuss the most interesting aspects from the Open Science literature
for scientometric research. We avoid being normative: Researchers in scientometrics should
be informed about the various aspects, but should decide themselves whether they are of interest
for their own research or not. The discussion of the Open Science issues in the following sections
is ordered by the phases in typical research and publication processes (starting a study and
publishing its outcome).

2. RELEVANCE OF OPEN SCIENCE TO SCIENTOMETRICS

Scientometric research is characterized by heterogeneity. In general, there is a difference between
the sociological ambition of studying the sciences and visualizing and mapping them (see, e.g., the
contributions by Robert K. Merton) and the connected, but different, objective of research evalu-
ation. In the latter field, researchers and their institutes are often units of analysis, whereas the struc-
turalist approaches share with information and library sciences a focus on document sets.
Scientometrics can be considered as a Mode-2 science: In addition to its theoretical core mission
of studying the sciences, there are applications in science studies considering scientometrics as
offering tools and methods for quantification, research evaluation on the applied side, and the
development of expertise in practice. The call for “opening the black box” is common for research
evaluation practice.

Within this broader framework of distinguishing two broad directions of scientometric research,
scientometric studies are very different in terms of their topics, methods, and indicators used, as
well as their scopes. Some studies are policy-oriented (e.g., Rushforth & de Rijcke, 2015), whereas
others are mathematically oriented (e.g., Egghe & Rousseau, 2006). Some studies are based on
huge data sets (e.g., Visser, Van Eck, & Waltman, 2020), whereas others focus on small data sets
or report the results of one case study (e.g., Marx, 2014). As visualizations of scientometric data
have become very popular in recent years, some papers focus on the explanation of tools or soft-
ware (e.g., Bornmann, Stefaner et al., 2014). Research on the h-index revealed, for example, that
papers dealing with the same topic—this indicator—can be very heterogeneous: Some studies
mathematically investigated the indicator (e.g., Egghe & Rousseau, 2006), whereas others
addressed its convergent validity (e.g., Bornmann & Daniel, 2005), tried to develop new variants
of the index (Bornmann, Mutz et al., 2011), or explained tools for its calculation on the internet
(e.g., Thor & Bornmann, 2011).

Against the backdrop of the intermingling of heterogeneous objectives and styles in sciento-
metrics, Open Science recommendations may be relevant to different degrees for these studies.
For example, different recommendations would be relevant for mathematically oriented studies
than for studies from the policy context. In this paper, therefore, we focus on Open Science
recommendations that might be made relevant for studies from the scientometric area using
(more extensive) data from literature databases such as Web of Science (WoS) (Clarivate
Analytics) or Scopus (Elsevier) for empirical investigations in the science of science area
(Fortunato, Bergstrom et al., 2018). These studies may focus, for example, on certain structures
in science (e.g., the existence of the Matthew effect) or the effectiveness of certain funding
instruments for the promotion of science. In other words, we consider the study of data as the
common denominator of the scientometric enterprise.

Authors of this core type of science of science studies may be interested in the Open Science
issues raised in this opinion paper. Following the recommendations in scientometrics research
might lead to research of higher quality, accessibility, and transparency. Scientometricians can
consider the Open Science recommendations addressed in Sections 3 and 4 of this paper. If so
wished, this type of study can be preregistered; the underlying data, code used, and applied
software can be made available; the contributions of the coauthors explained; and the paper
can be published OA, including reviews and other documents originating from the peer review
process. We selected the relevant recommendations from the Open Science literature based on
our longstanding experiences in the field of scientometrics. These are mostly directly relevant to
the robustness/replication Open Science goal by making the research process more transparent,
with the tools and data openly available. They also support diversity, as openly shared papers,
data, or software may be used to produce new research.

3. STARTING A STUDY

3.1. Open Preregistration of Studies and Planned Analyses

(Publicly funded) Research is less valuable for research itself and society when researchers
cannot share their findings with others, and publishing is one way in which results can be shared.
Other ways include giving presentations, giving interviews, creating podcasts, including
research outcomes in teaching materials, and participating in meetings with policy makers.
As various studies have shown that results that validate a previously formulated hypothesis
are far more likely to get published (when compared with a refutation), researchers might be
interested in producing such publishable results (Marewski & Bornmann, 2018).

To reach the goal of publishing while using the strict model of hypothesis testing (Hempel &
Oppenheim, 1948), there is a danger that hypotheses are formulated based on the data at hand
and from the perspective of hindsight (e.g., to achieve results that are statistically significant).
Thus, the statistical analysis is not used to test certain previously formulated hypotheses, but
to fish results from the data that might be more publishable than other results (Cohen, 1994).
This is sometimes called HARKing (hypothesizing after the results are known) (Hand, 2014; see
also Cumming & Calin-Jageman, 2016). Another term is historicism for the rationalization of
research results ex post (Popper, 1967). However, analytical research questions or hypotheses
are more common as a stated objective in some fields of science than others, with vague terms,
such as aim, purpose, or goal being common alternatives (Thelwall & Mas-Bleda, 2020). These
alternatives suggest a more exploratory research approach that may also be relevant for many
types of scientometric study.

To make the formulation of hypotheses from the perspective of hindsight more difficult and to
demonstrate that hypotheses were formulated before analyzing the data, it is possible to register a
study at an early stage based on a detailed research plan that can be later checked by reviewers,
editors, and readers (Kupferschmidt, 2018; Nosek, Ebersole et al., 2018; see also Cumming &
Calin-Jageman, 2016). Preregistration can be done, for example, at the Center for Open Science
(https://cos.io/prereg). The PLOS journals PLOS Biology and PLOS ONE currently offer the possi-
bility to peer review and publish preregistered research (https://www.plos.org/preregistration).

Published analysis plans for upcoming studies can be an interesting source for researchers
working on similar research questions. Young researchers may profit from detailed plans pub-
lished by experienced researchers of how the study of certain phenomena can be tackled.

The journal Psychological Science uses a badge created by the Center for Open Science for
acknowledging papers being preregistered (https://cos.io/our-services/open-science-badges).
Currently, no scientometric journal works with this badge to highlight this open practice.
Nor does any scientometric journal offer the option to peer review a preregis-
tered study. When designing (and potentially preregistering) a scientometric study, researchers
must specify many parameters: Which data will be used to analyze the research questions, such
as funding data, peer review data, data on the scientific workforce, survey data, or publication
data? Which database (e.g., WoS or Scopus) will be used? How many papers will be analyzed?
How will citation impact be measured? Will the median or geometric mean be calculated instead
of the arithmetic mean? Should percentages or the original scores be preferred? To make these
decisions wisely, pilot testing may be needed (Cumming & Calin-Jageman, 2016). However,
the data used in pilot testing should not be fished for “publishable” results, because doing so
would undermine the purpose of preregistration.
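As a minimal illustration of one of these analysis choices, the following Python sketch (with hypothetical citation counts) contrasts the arithmetic mean, the median, and a geometric mean computed on log-transformed counts; which of these summary statistics will be reported is exactly the kind of decision that could be fixed in a preregistration.

```python
import statistics
from math import exp, log

# Hypothetical citation counts for a set of papers (not real data).
citations = [0, 1, 1, 2, 3, 5, 8, 12, 40, 120]

arithmetic_mean = statistics.mean(citations)
median = statistics.median(citations)

# Geometric mean computed on log(c + 1) to cope with zero counts; whether to
# use this transformation is itself a choice that could be preregistered.
geometric_mean = exp(statistics.mean(log(c + 1) for c in citations)) - 1

print(f"arithmetic mean: {arithmetic_mean:.2f}")  # pulled upward by the skewed tail
print(f"median:          {median:.2f}")
print(f"geometric mean:  {geometric_mean:.2f}")
```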

It is an advantage of scientometric research that data are often available in large-scale data-
bases so that pilot testing can be a practical step in many studies. In other social sciences, the
generation of data can be effortful. However, the availability of high-quality data can also be a
disadvantage. Before preregistering a study, the whole study could be completed, so that the
outcomes are already known. This is a practical possibility for scientometric research because
data collection is rarely a substantial public event, unlike a clinical trial or survey.

Working with pilot testing can be especially important in the process of formulating a hypothesis.
Although the hypothesis is logically prior, its formulation does not have to be prior in time. On the
contrary, the formulation of expectations may be circular and repetitive. In pilot testing the
hypothesis, it can be improved, and the research can be made more precise and sophisticated.
The hypothesis can be improved, for example, by being theoretically informed and rooted in the
previous literature. A hypothesis may be a prediction, but this is not necessarily the case. A
hypothesis is not like a weather forecast; “prediction” has the meaning of a theoretically informed
expectation that can be tested against the data as observations. Based on theoretically informed
hypotheses, pilot testing can be used to operationalize, for example, by specifying expectations
that can perhaps be tested.

Basically, in this phase of a planned study, the “logic of discovery” differs from the “logic of
justification.” In the “logic of discovery,” the relevant theorizing, hypotheses, and operationalization
can be changed. The empirical results can be discussed with colleagues in the “logic of justification.”
This process may lead to further changes until a more formal research plan can be finalized. Le
difference between the two logics is analytical; in research practices, both momenta are important.

Suppose a scientometric researcher is interested in the growth of science based on sciento-
metric data. Two obvious data options are numbers of papers and researchers. Previous studies
investigating the growth of science detail the advantages and disadvantages of these options
(e.g., Bornmann & Mutz, 2015a; Tabah, 1999). Suppose the scientometrician has decided to
use numbers of publications: He or she then has to make further choices, such as the question
of which database is used for this study. Several databases are available, each with respective
advantages and possible disadvantages (e.g., Dimensions from Digital Science, WoS, Scopus, or
Microsoft Academic). Decisions are also to be made on the range of publication years and
document types included. Fractional or whole-number counting may make a difference if the
study includes the country perspective.

Based on experiences from the literature and information from the database providers, the
data can be selected, and expectations specified about the growth curves: In which years, for
example, is an increase or decrease of publication numbers expected and for which reasons? In
pilot testing, a decision could be made about whether the growth of science is studied for certain
disciplines (in addition to the whole of science) and—if so—which field-categorization scheme
can be used. A sample of publication numbers can be used to test the different statistical methods
(e.g., regression models or effect-size measures) to analyze the data. The research plan of the
study contains the necessary steps to analyze the data.
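A minimal sketch of such a pilot analysis is given below, assuming hypothetical yearly publication counts; it fits a simple log-linear (exponential growth) model, one of the regression approaches that could be specified in the research plan.

```python
import numpy as np

# Hypothetical yearly publication counts; the database, publication years, and
# document types covered would be fixed in the (preregistered) research plan.
years = np.arange(2010, 2020)
papers = np.array([1200, 1320, 1490, 1610, 1800, 2020, 2230, 2510, 2790, 3100])

# Log-linear regression log(papers) = a + b * year implies exponential growth;
# exp(b) - 1 is then the estimated annual growth rate.
b, a = np.polyfit(years, np.log(papers), deg=1)
growth_rate = np.exp(b) - 1
doubling_time = np.log(2) / b

print(f"estimated annual growth rate: {growth_rate:.1%}")
print(f"estimated doubling time: {doubling_time:.1f} years")
```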

In scientometric evaluations, tests are also used to determine whether differences are statis-
tically significant (e.g., the difference between two mean or median citation rates). However, the
results of statistical significance testing are dependent on factors such as the sample size: The
larger the sample, the more likely it is to obtain statistically significant results. Thus, the sample
size that is needed to detect a certain effect in a study should be considered before the study is
conducted. This practice may prevent the increase of the sample size by the researchers in
hindsight (i.e., after they have inspected the results) to obtain statistically significant results.
The danger of increasing the sample size in hindsight is present in scientometrics, as the data
are available in large literature databases such as WoS and can be downloaded without any
major problems.
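The sketch below shows what such an a priori sample-size calculation could look like, using the power module of the Python statsmodels package; the effect size, significance level, and power are assumptions that would have to be justified and fixed before the data are collected (and a t-test on raw citation counts is, of course, a simplification given their skewness).

```python
from statsmodels.stats.power import TTestIndPower

# A priori power analysis: papers per group needed to detect a small
# standardized mean difference (Cohen's d = 0.2) between two groups of
# citation scores with 80% power at a 5% significance level.
analysis = TTestIndPower()
n_per_group = analysis.solve_power(
    effect_size=0.2,   # assumed smallest effect of interest
    alpha=0.05,
    power=0.8,
    alternative="two-sided",
)
print(f"required sample size per group: {n_per_group:.0f}")
```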

When considering the above in the planning of a scientometric study, it is important to
consider that research evaluation—an important subset of questions in our field—can deal with
incomplete data and with goals (e.g., research quality or impact assessment) that usually do not
match the data (e.g., citation counts) well. In addition, the large number of variables potentially
influencing the data and continual changes over time often make strong hypotheses impossible.
Rather, unless using simulation or pure mathematical modeling, empirical studies must make
multiple simplifying assumptions. In this situation, research questions, when used, are likely to
be accompanied by strong caveats and may be primarily devices to frame the analysis in a paper.

Nevertheless, statistical tests and other uses of algorithms, such as for modeling or machine
learning, should not be misused for harking because this would undermine the validity of the
résultats. If a study is not preregistered, researchers might consider openly publishing (par exemple., comme
online supplementary material) details of prior failed tests, or visualizations of the preparation
stages that were associated with the project.

3.2. Open Data

There is widespread support for data sharing in academia (e.g., Tenopir, Dalton et al., 2015),
including for the FAIR (Findable, Accessible, Interoperable, Reusable) principles (Wilkinson,
Dumontier et al., 2016). Data might be “open” if it is made available under a license that allows
free use, modification, and redistribution, for example. The best-known example of such
licenses is the family of Creative Commons licenses (see http://opendefinition.org/licenses for
a broader overview). In practice, data sets are often made publicly available—and hence
considered open data in a broader sense—without an explicit license. From a scientometric
perspective, open data is relevant to disciplinary practices (c'est à dire., should we share our data openly)
as well as providing an issue for study: How can the impact of shared data be evaluated? Here we
focus primarily on the former.

Data sets from scientometric studies that can be made publicly available can be shared
through repositories like FigShare (https://figshare.com) or Zenodo (https://zenodo.org). The
Open Materials and Open Data badges provided by the Center for Open Science can be used to
indicate that a paper provides data and other materials for the use by other researchers
(Cumming & Calin-Jageman, 2016) and data sharing is increasingly mandated by funders and
journals. In general, it is very helpful to have access to the data of studies for their replication,
checking for possible errors, and conducting meta-analyses (Glass, 1976).

Open data sets may also help in avoiding duplication in data collection (Fecher & Friesike,
2014). The data can be used for other research questions than those addressed by the producers
of the data sets. Researchers undoubtedly welcome broader access to data sets for scientometric
research. For example, the aggregated data used for the various releases of the Leiden Ranking have
been published for many years (https://www.leidenranking.com/downloads). Because of their
transparency, they have already been used as data by several papers (e.g., Bornmann & Mutz,
2015b; Frenken, Heimeriks, & Hoekman, 2017; Leydesdorff, Bornmann, & Mingers, 2019).
However, in research evaluations, the sharing of data may be sensitive due to strategic or privacy issues.

Data sharing has some common obstacles in scientometrics. Although data sharing and reuse
are goals of Open Science, much bibliometric research is based on data obtained from proprie-
tary sources (e.g., WoS or Scopus). This limits what researchers can share, even if required by a
journal. Recently, Digital Science has made the Dimensions database available for scientometric
research (https://www.digital-science.com/products/dimensions). Elsevier’s International Center
for the Study of Research (ICSR) also provides free scientometrics data. Other initiatives of open
(or free to access) literature databases are Microsoft Academic Search (https://academic
.microsoft.com) and the Initiative for Open Citations (https://i4oc.org). Thus, scientometricians
can use a range of bibliometric data sets for scientometric research without paying fees (and
without restrictions on data sharing).

Based on these developments and requirements, it seems that the primary data-sharing goal
in scientometrics may no longer be to publish the data, but to publish the procedures to extract
and analyze the data. In other words, it may be less useful to have access to the indicator values
for publications than to have access to the published procedure for calculating the indicator.
Access to certain data sets might become superfluous with the large data set sharing initiative
of Digital Science, although access in this case is mediated by the company and, as such, can in
principle be revoked at any time. This may not be possible with openly licensed data. Moreover,
many scientometric studies test new sources of data (e.g., patents, altmetrics, webometrics) and
these will not be served by shared common data sets.

Although some disciplines have their own data repositories, supporting specific file formats,
metadata types, and legal requirements, scientometrics does not. Perhaps the closest is the
Initiative for Open Citations, which may lead to almost all citations being available from one
source in the same format. Similarly, some scientometricians and other stakeholders have advo-
cated for making other (meta)data openly available through the Crossref infrastructure and
coupled to DOIs (e.g., the recent Initiative for Open Abstracts: https://i4oa.org/). Thus,
Crossref is increasingly a source of raw scientometric data, which can be further refined and
enhanced by other commercial and noncommercial players. It is not clear that there is a system-
atic need for a separate database for any other kind of scientometric data for several reasons.
Many studies use Scopus, WoS, or Dimensions data, which is already either freely available
to researchers or copyright-protected and not sharable. Some studies have small data sets of
particular interest, such as studies of a specialist topic within a given country, that may be of
limited value to others. Other studies link scientometric data to other types of data (such as
university recruitment or economic indicators) that would rarely be useful to other scientome-
tricians. The data of such studies can be made available on an ad hoc basis through generic data
repositories, such as Zenodo. Thus, there does not seem to be a universal pressing need for a
common disciplinary data type or data repository.

From the perspective of the FAIR principles, large free data sets or data set access mechanisms
presumably adequately satisfy the Findable requirement through sparsity and because the
current main providers (Clarivate Analytics, Elsevier, Dimensions, and Microsoft Academic)
are well known. Accessibility is a problem for the subscription services (Clarivate Analytics
and Elsevier). Interoperability and reusability are relatively minor problems for these data sets
because researchers typically work with a single data source. In contrast, FAIR seems to be a
substantial problem for the many ad hoc small-scale data sets generated in scientometric
research, which seem to be rarely shared openly, and which are presumably poorly documented
and in a wide variety of formats. If these cannot be indirectly shared by publishing the proce-
dures by which the data was extracted from a well-known bibliometric source, then FAIR data
sharing may be difficult. Such data sets might include, for example, altmetrics, publication lists
from departments or researcher CVs, selections from national current research information
systems, publication lists from national research evaluation exercises, and national journal
collections. It seems reasonable for scientometrics journals to encourage researchers to share
their data sets, when possible, or to publish detailed instructions about how to replicate the study
otherwise, such as the relevant queries.

4. PUBLISHING THE OUTCOME OF A STUDY (E.G., PAPERS OR SOFTWARE)

4.1. Open Code or Shared Software

The code or compiled software for analyzing data can be shared. Spellman et al. (2018) point out
that direct replication without involvement of the original researchers is very difficult, as the
methods are typically not described in sufficient detail. The same problems were found in a
small-scale exploration of reproducibility issues in scientometrics (Velden, Hinze et al.,
2018). If the data have been analyzed using code that is made available, it becomes far easier
to reproduce the analysis and/or spot errors in it that may otherwise go unnoticed. It seems
unlikely, however, that “computational reproducibility” (Baker, 2016) can help to prevent
outright fraud.

Several technical factors may make reproducing results more difficult, such as random
seeding of nondeterministic processes such as sampling, ongoing updates of the hardware
and software, and the use of parameters that may not be fully communicated in the publication.
Although all of these can be overcome in principle, they provide major hurdles in practice.
Large-scale models (e.g., topic models) are often not reproducible because the updating of
the systems (for example, for safety reasons) is not under the control of the researchers
(Hecking & Leydesdorff, 2019). Nevertheless, the routines can be published in an open source
repository such as Github (https://github.com).
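A minimal sketch of two such measures, assuming a hypothetical sampling step, is shown below: fixing and documenting the random seed, and recording the software environment alongside the shared routines.

```python
import platform
import random

import numpy as np

# Fixing the seeds makes nondeterministic steps such as sampling repeatable.
SEED = 2021  # an arbitrary but documented choice
random.seed(SEED)
np.random.seed(SEED)

# Hypothetical sampling step: draw 100 papers from a corpus of 10,000 records.
sample_indices = np.random.choice(10_000, size=100, replace=False)

# Recording the environment helps others rerun the analysis under comparable
# conditions (e.g., in a README or metadata file shared with the code).
environment = {
    "python": platform.python_version(),
    "numpy": np.__version__,
    "seed": SEED,
}
print(sorted(sample_indices)[:5])
print(environment)
```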

The code for integrating the results in the paper or even writing the complete paper can also be
made available. The Stata corporation has recently developed new commands that can be used
for producing a Word document (or PDF) based on the data (https://www.stata.com/features
/overview/create-word-documents). In other words, the commands cover not only the production
of tables and figures but also the complete paper. If both commands and data are
made publicly available, every researcher can exactly reproduce the paper. Other statistics
software provides similar functions: R Markdown supports reproducible papers by interspersing
R code and text in markdown format (Bauer, 2018).

Papers in scientometrics are often based on data from WoS or Scopus. These data might have
been downloaded from the web interfaces of these data providers. If the search terms for com-
piling the publication sets, search limiters used, and the export date of the data are made publicly
available, not only can the paper’s production, beginning with the data and ending with the text,
be reproduced, but so can the process of generating the data set. Nevertheless, both bibliometric
and historical data can change over time, undermining replications. Changes in the media and
updates of hardware and software may make data and procedures irreproducible. The major
databases also backtrack newly admitted data into their history. In general, the databases are
dynamic, and cannot easily be reproduced. Internet data, however, might be archived by the
Wayback Machine (Wouters, Hellsten, & Leydesdorff, 2004).
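As a minimal sketch of what such documentation could look like, the following Python snippet writes a small provenance record to a JSON file; the field names and values (query string, limiters, export date) are illustrative examples rather than a standard schema.

```python
import json

# Illustrative provenance record for a publication set compiled via a database
# web interface; all values are hypothetical examples.
provenance = {
    "database": "Web of Science (web interface)",
    "query": 'TS=("open science" AND scientometric*)',
    "limiters": {
        "document_types": ["Article", "Review"],
        "publication_years": "2010-2019",
    },
    "export_date": "2020-03-15",
    "records_retrieved": 1234,
}

# Sharing this file (e.g., as supplementary material or in a repository such as
# Zenodo) documents how the data set was generated, even when the raw records
# themselves cannot be redistributed.
with open("search_provenance.json", "w", encoding="utf-8") as f:
    json.dump(provenance, f, indent=2)
```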

Another source of bibliometric data is in-house databases (e.g., at the Centre for Science and
Technology Studies or the Max Planck Society), which are based on data from WoS, Scopus,
Dimensions, etc. In this case, the SQL queries used to compile the data could be made available
(including text explaining what the individual commands mean). However, there
may be property rights involved when using these data.

There are several caveats for code sharing. Scientometric research may have involved exten-
sive data cleaning by the users, creating significant added value to a standard data set (e.g., WoS)
in a proprietary clean version. With this, any software applied to the standard version of the
database would give different results. The solution would be to make the cleaned version open
(at least for research purposes) instead of proprietary, which is not always legally or practically
feasible. In addition, some applications may be complex, generating a substantial amount of
work to put the code in a shareable format. Thus, the likelihood of reuse is important when
deciding whether and how to share code.

4.2. Contributions of Authors

Authors contribute differently to research projects and resulting publications. The American
Psychological Association (APA) published a checklist which can be used by contributors to a
research project to declare their contributions (https://www.apa.org/science/leadership/students
/authorship-determination-scorecard.pdf). The use of such lists might lead to a more standard-
ized consideration of authors on papers in the long run and might be a possible action to avoid
bad practices such as ghost authorship (substantial contribution made without being listed as
a coauthor) and honorary authorship (being listed as a coauthor despite having contributed
little to nothing). In the scientometrics field, authors have to state authors’ contributions in
some journals (e.g., Journal of Informetrics and Quantitative Science Studies).

The contributions of authors to research projects and publications can be made publicly
available in the process of publishing a paper. CASRAI has published CRediT (Contributor
Roles Taxonomy) for common authorship roles (https://www.casrai.org/credit.html). Dealing
with author contributions in research projects is important for knowing and codifying who
did what in the project. In many fields, but not in all, researchers have traditionally used author
order when assessing authorship credit. We can imagine that an initiative such as CRediT provides
a more objective way to assess author credit. Practices may also vary among countries.

4.3. Open Access (OA)

OA, where authors or publishers make research articles or reports freely available to readers in
traditional or OA journals, represents the most visible aspect of Open Science. The number of
OA journals has grown quickly since the 1990s. The Directory of Open Access Journals (https://
doaj.org/) included more than 14,300 journals by early 2020. An increasing number of pub-
lishers also support forms of OA in their traditional journals by providing authors the option
to make their submissions freely available, for a fee, once accepted. These articles appear
alongside the closed access articles that require a subscription to access.

There is a cost associated with making publications freely available (van Noorden, 2013).
This cost is borne by the publishers, authors, and/or third parties. OA models are characterized
by who is responsible for the associated costs and the user rights to the content. Two common
OA models are Gold OA and Green OA. Gold OA publication venues have an associated Article
Processing Charge (APC) that makes articles freely available to readers once accepted. OA
journals that do not charge authors are sometimes also referred to as Gold OA, although other
terminology (Diamond/Platinum OA) is also used. Green OA venues allow authors to self-
archive prepublication (pre- and sometimes postpeer review) versions of their manuscripts in
public repositories such as arxiv.org. The rationale for Green OA is that the results of publicly
funded research should be made publicly available without cost to the reader.

Several journals that publish scientometric research support different forms of OA. For
example, the PLOS journals, Frontiers in Research Metrics and Analytics, and Quantitative Science
Studies support Gold OA. More traditional journals such as the Journal of Informetrics, Journal
of the Association for Information Science and Technology, and Scientometrics support hybrid
OA, where authors may pay an APC to make their articles freely available online. These three
journals also permit preprint archiving of manuscript submissions.

Recently, the OA topic has had a specific relevance for the scientometric field, as the chief
editor and editorial board of an important scientometric journal (the Journal of Informetrics)
decided to change the publisher (from Elsevier to MIT Press; see Waltman, 2019). The new
journal at the new publisher is Quantitative Science Studies (https://www.mitpressjournals
.org/qss). One reason for this change was that the outcome of scientometric research could be
made available for other researchers without any restrictions.

In 2018, a group of European funding agencies formed cOAlition S, which developed a strategy
for mandated OA called Plan S. “Plan S requires that recipients of research funding from cOAlition
S organisations make the resulting publications available immediately (without embargoes) and
under open licenses, either in quality Open Access platforms or journals or through immediate
deposit in open repositories that fulfil the necessary conditions” (https://www.scienceeurope
.org/our-priorities/open-access). More than 1,700 members of the scientific community have
signed an open letter expressing concerns that Plan S takes OA too far and is too risky.
Although Plan S has since been revised to address some of the letter writers’ objections, there
are still concerns that the plan favors APC OA models and does not address the researchers’
comments about the quality of peer review and international collaborations (https://sites.google
.com/view/plansopenletter/home).

4.4. Open Data and Software Citation

When the data used for a study have been made available on the internet (e.g., at FigShare), the
data set can be cited in principle (https://datacite.org). Thus, the work that has been invested in
producing an interesting, complex, or effortful data set might result in receiving credit in the form of
data citations. In some fields, sharing data has been shown to be associated with more citations for
a paper (Colavizza, Hrynaszkiewicz et al., 2020; Piwowar & Vision, 2013). The measurement of
data citation impact can be supported by assigning DOIs to the data set (as is done by FigShare, et
others), to combinations of data sets (as done by the Global Biodiversity Information Facility, GBIF;
see Khan, 2019), or by suggesting the format to be used in a reference list. The German National
Library of Science and Technology (TIB) developed DOIs for data sets (Fecher & Friesike, 2014).

With DataCite (https://datacite.org), an institution exists with the goal of providing persistent identi-
fiers for research.

At present, data citation is not widely practiced by authors overall (Robinson-García, Jiménez-
Contreras, & Torres-Salinas, 2016) and in scientometrics. Unlike bibliographic sources that are
cited and appear in reference sections of scholarly works, the data sources used in conducting
research (including software) may not be granted the same level of acknowledgement. The prev-
alence of data citation also varies from one discipline to another (Zhao, Yan, & Li, 2018). When
authors do acknowledge shared data they have reused, they do not necessarily cite the data
sources in a manner that allows a data citation indexing service such as Clarivate Analytics’
Data Citation Index or DataCite to record instances of data reuse, thereby denying authors of
the data sets formal credit for their contributions (Park, You, & Wolfram, 2018). Both DCI and
DataCite are currently limited in their data repository coverage. Scientometricians who rely on
citation data collected from these repositories are advised to be cautious about the conclusions
they draw.

Scientometric studies rarely seem to generate data sets with general use that are separate from
specific academic outputs. For example, all the data sets within the first 100 matches on Figshare for
the query “Scientometrics” seem to be associated with a paper (they almost always state this directly
and associated papers can be found via Google for the exceptions), despite Figshare being a free
OA repository supporting data sharing. In addition, many citation analyses use commercial data
from WoS or Scopus. Nevertheless, some open data sets have been used in scientometrics, such as
the European Tertiary Education Register data set (ETER; www.eter-project.com) and there are
some data sets related to scientometrics on FigShare that are not associated with journal articles
(e.g., https://figshare.com/articles/UK_university_Web_sites_June_July_2005/785775/1).

Thus, it seems that scientometric authors reusing data may prefer to cite the paper associated
with the data rather than the data itself, which would explain the low rate of data citation
(Robinson-García et al., 2016). A search for data-related records (i.e., data sets, data studies,
software) in Clarivate Analytics’ Data Citation Index using the topic search “scientometr* OR
bibliometr* OR informetr* OR altmetr*” on October 31, 2020 resulted in 2,044 records across
all disciplines for the period 2013 to 2020, and only 174 citations (two records with two cita-
tions, 170 records with one citation) to these records. Even if a large percentage of the citations
came from scientometric authors, this still indicates very limited data citation activity for metrics-
related data. This contrasts with fields such as biodiversity, where organism prevalence data can
be a primary research output, so data citations might be an important way to recognize the
usefulness of a nonpublishing scientist’s work. In fields such as genomics, however, data can
be extremely time-consuming to collect and valuable, but may not be commonly shared
(Thelwall, Munafo et al., 2020), which is needed to encourage reuse and ultimately data citation.

There are many valuable uses of research data that do not lead to citations (Thelwall &
Kousha, 2017), such as for verification and training. Thus, data citations reflect a possibly small
proportion of the uses of shared research data. Also, as noted above, researchers in some fields
are not regularly citing data sources in a way that allows these sources to be captured by data
citation indexing services. Data citation is a relatively new development, so that the tradition or
expectation of citing data sets in a manner comparable to bibliographic sources is not yet com-
monly practiced (Park et al., 2018).

4.5. Open Review Comments

Reviews, along with author rebuttals and editor comments, can be made publicly available (in
anonymized or signed form; see Schmidt, Ross-Hellauer et al., 2018). Signed open comments
can be cited in principle. Thus, reviewers can receive citation impact for their contributions.
Published reviews (in anonymized or signed form) might be an instrument of the journal to
demonstrate that it cannot be categorized as a predatory journal. Predatory journals are OA
journals that publish manuscripts for money (paid by the authors) without quality control
(adequate peer review) for what is published. Another benefit of open reviews is that the review
process is no longer a black box. Readers can gain insight into how a publication came to be in
its final form. Open reviews can also serve as an important learning tool for new scholars by
providing exemplars of the review process. Currently, however, most scientists seem to believe
that double-blind review is the most effective model for quality control (Moylan, Harold et al.,
2014; Mulligan, Hall, & Raphael, 2013; Rodriguez-Bravo et al., 2017).

No scientometric journal has published its peer review process (at the time of writing this
paper). The website of the journal Quantitative Science Studies (QSS) indicates that reviewers
may choose to identify themselves. QSS is currently running a transparent peer review pilot.
When a manuscript participating in the pilot is accepted for publication in QSS, the reports of
the reviewers, the responses of the authors, and the decision letters of the editor are made openly
available in Publons (https://publons.com). Participation in the pilot is voluntary (https://www
.mitpressjournals.org/journals/qss/peer_review). However, scientometricians can choose to
publish in general journals or platforms that offer (full) open peer review, if they believe
that this is valuable.

The availability of review reports and reviewer identities may be optional or required by the
journal policy. The journal eLife is publishing some “meta-research” adopting (a kind of) open
peer review process. Another example is the journal Atmospheric Chemistry and Physics (ACP),
which was launched in 2001 and is freely accessible at https://www.atmospheric-chemistry
-and-physics.net (publisher: Copernicus Publications). ACP has a two-stage publication
process, with a peer review process that differs from those used by traditional journals.
The process is explained at https://www.atmospheric-chemistry-and-physics.net/peer_review
/interactive_review_process.html.

The innovative peer review process at ACP has been evaluated by Bornmann, Marx et al.
(2010) and Bornmann, Schier et al. (2011). The results of the study show that the process can
reliably and validly select manuscripts for publication. A more recent example is the PLOS
family of journals, for which open peer review became optional for authors in May 2019
(PLOS, 2019). Scientometric articles are also occasionally published in journals that reveal
reviewer identities, but not the reviews, offering a different type of transparency (e.g., Journal of
Medical Internet Research (Eysenbach, 2011) or Frontiers in Research Metrics and Analytics).
Similarly, the journal PeerJ, which also occasionally publishes scientometric research, provides
both authors the option to make their reviews available and reviewers the option to identify
themselves. This journal started in 2013.

OA journals that provide open reports and/or reviewer identities make it possible for re-
searchers to download or harvest data for scientometric and textual analysis. Because there is
currently no standardized way in which these journals provide access to open peer review
data—which may be available in HTML, XML, or PDF format—crawling and scraping routines
need to be customized for each journal of interest, or at a minimum for each publisher. Current
challenges include reviewer comment discovery and identification of review components.
Reviewer comment discovery can be challenging if there is no standardized location for the
reviews. Individual web pages must be crawled for review-related text.
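The sketch below illustrates the kind of customization involved, assuming a hypothetical review page URL and a hypothetical CSS class for the reviewer reports; both would have to be replaced with journal-specific values discovered by inspecting the pages.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL; each journal (or publisher) exposes reviews differently.
REVIEW_PAGE = "https://example.org/articles/12345/reviews"

response = requests.get(REVIEW_PAGE, timeout=30)
response.raise_for_status()
soup = BeautifulSoup(response.text, "html.parser")

# Assume reviewer reports sit in elements with class "review-body"; finding
# such markers is part of the per-journal customization effort noted above.
reviews = [div.get_text(" ", strip=True) for div in soup.select("div.review-body")]

for i, text in enumerate(reviews, start=1):
    print(f"Review {i}: {len(text.split())} words")
```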

As efforts to standardize access to open peer review data are ongoing, for instance by
Publons and Crossref, such data will become increasingly available.

5. DISCUSSION AND CONCLUSIONS

A key element in the Open Science program is the demand for transparency (e.g., by preregistering
a study or publishing the underlying data). This can help both replication/robustness and acces-
sibility if the final product is openly available. Transparency is needed in research evaluation both
at the level of peer reviews and in terms of scientometric metaevaluations. A number of studies
have shown that bureaucracies are not necessarily able to identify the best performing researchers
(Irvine & Martin, 1984; Bornmann, Leydesdorff, & van den Besselaar, 2010), although these
studies have equated performance with bibliometric indicators rather than societal value or other
types of impact. The process of priority programming and funding is continuously in need of
legitimation.

In this opinion paper, we presented an overview of the Open Science program by discussing
aspects from this movement that are most relevant for the scientometric field. Many aspects of
the Open Science movement have been triggered by specific scientific disciplines, for specific
reasons. In this paper, we discuss several aspects with regard to their applicability to scientometrics.
The outcomes of Open Science adoption might be, for example, better reproducibility, better
access, and more diversity. Although the Open Science program includes many interesting
proposals that seem worth considering in scientometric research, there are also potentially
problematic issues.

The developments that might result from the Open Science framework need to be scrutinized
for both potentially positive and negative effects on science. For example, publications in Gold
OA journals enhance the accessibility of research to a wider audience and to practitioners and
scholars in less affluent institutions. At the same time, the costs associated with publishing in OA
journals arising from Article Processing Charges may be a barrier for researchers with limited
financial resources and cannot always be resolved through APC waivers (e.g., in the case of
retired scientists). Possible solutions include publishing in Diamond (no fee) OA journals and/
or making a preprint version of the manuscript available through preprint servers and institu-
tional repositories, if permitted by the publisher’s OA policies.

Authors may not be willing or feel obliged by funding agencies to expend the time and effort
needed for research study preregistration and the sharing of data and software in formats that
make them usable by other researchers (Nosek, Alter et al., 2015; Nosek et al., 2018). The
current reward system in science does not foster these activities explicitly. The “steering of
science,” however, is a policy process that can be analyzed by means of policy analysis. The
track record of science-policy interventions, however, is poor (van den Daele & Weingart,
1975). Institutional interests are always important in the background, and inclusiveness and
accessibility (e.g., for minorities) may be more important than transparency.

At the end of this opinion paper, there are some take-home messages resulting from the issues
discussed in Sections 3 and 4. The target group for these messages is scientometricians con-
ducting their own (empirical) studies.

6. TAKE-HOME MESSAGES

• A scientometric study can be registered early based on a detailed research plan that can be later checked by reviewers, editors, and readers in order to make the formulation of hypotheses in hindsight more difficult and to demonstrate that hypotheses have been formulated before analyzing the data.

• Data used for a scientometric study can be made “open” if the data are available under a license that allows free use, modification, and redistribution.


• The code or compiled software for analyzing scientometric data can be shared under an
open source license.

• The contributions of authors to research projects and publications can be made publicly
available in the process of publishing a scientometric paper.

• Authors can make scientometric papers or reports freely available to readers in traditional
or OA journals. Where permitted by journal policies, preprints of articles can be self-
archived or made available in public repositories to increase their availability.

• Users of "open" scientometric data sets can give credit to the developers of the data sets in
the form of data citations.

• Reviews of scientometric manuscripts, along with author rebuttals and editor comments,
can be made publicly available, making the process of refining the research
reporting more transparent.
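
To make the open data and data citation messages above more concrete, the following minimal sketch (not taken from any of the studies discussed here, and not a prescribed workflow) shows one way a scientometric data set could be deposited: the data file itself, a machine-readable metadata record with an explicit open license, and a ready-made data citation for reusers. All file names, metadata fields, and the DOI are hypothetical placeholders; a real deposit would receive its DOI from the hosting repository.

    # Minimal sketch: packaging a scientometric data set so that it is "open"
    # in the sense of the take-home messages above. All names are placeholders.
    import csv
    import json
    from pathlib import Path

    def deposit_open_dataset(records, out_dir="citation_counts_dataset"):
        """Write the data, a metadata record with an explicit open license,
        and a suggested data citation for reusers."""
        out = Path(out_dir)
        out.mkdir(exist_ok=True)

        # 1. The data themselves (here: per-paper citation counts).
        with open(out / "data.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=["doi", "pub_year", "citations"])
            writer.writeheader()
            writer.writerows(records)

        # 2. Machine-readable metadata, including an SPDX license identifier
        #    that permits free use, modification, and redistribution.
        metadata = {
            "title": "Citation counts for journal articles, 2010-2019 (example)",
            "creators": ["Example Author"],
            "license": "CC-BY-4.0",          # or CC0-1.0 for a public-domain dedication
            "doi": "10.9999/example.12345",  # placeholder: assigned by the repository
            "publication_year": 2021,
        }
        (out / "metadata.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")

        # 3. A plain-text data citation that reusers can copy into reference lists.
        citation = (f"{'; '.join(metadata['creators'])} ({metadata['publication_year']}). "
                    f"{metadata['title']} [Data set]. https://doi.org/{metadata['doi']}")
        (out / "CITATION.txt").write_text(citation, encoding="utf-8")
        return citation

    if __name__ == "__main__":
        sample = [{"doi": "10.1000/xyz1", "pub_year": 2015, "citations": 42}]
        print(deposit_open_dataset(sample))

The only substantive choice in this sketch is the license identifier, which is what makes the deposit "open" in the sense used above; everything else is repository housekeeping that can be adapted to whatever archive is used.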

ACKNOWLEDGMENTS

We thank Ludo Waltman and Loet Leydesdorff for the discussions of previous versions of the
manuscript and detailed suggestions for improvements.

COMPETING INTERESTS

The authors have no competing interests.

FUNDING INFORMATION

L.B., M.T., and D.W. have received no funding for their research. The work of R.G. was supported
by the Flemish Government through its funding of the Flemish Centre for R&D Monitoring
(ECOOM).

REFERENCES

Boulanger, M.. (2016). Why scientists must share their research code.
Nature, Septembre 13. https://www.nature.com/news/why-scientists
-must-share-their-research-code-1.20504 (accessed February 18,
2020). EST CE QUE JE: https://doi.org/10.1038/nature.2016.20504

Bauer, P.. C. (2018). Writing a reproducible paper in R markdown
(SSRN scholarly paper no. ID 3175518). https://papers.ssrn.com
/abstract=3175518 (accessed February 18, 2020). EST CE QUE JE: https://
doi.org/10.2139/ssrn.3175518

Bornmann, L., & Daniel, H.-D. (2005). Does the h-index for ranking
of scientists really work? Scientometrics, 65(3), 391–392. EST CE QUE JE:
https://doi.org/10.1007/s11192-005-0281-4

Bornmann, L., Leydesdorff, L., & van den Besselaar, P.. (2010). A meta-
evaluation of scientific research proposals: Different ways of
comparing rejected to awarded applications. Journal of Informetrics,
4(3), 211–220. EST CE QUE JE: https://doi.org/10.1016/j.joi.2009.10.004

Bornmann, L., Marx, W., Schier, H., Thor, UN., & Daniel, H.-D. (2010).
From black box to white box at open access journals: Predictive
validity of manuscript reviewing and editorial decisions at
Atmospheric Chemistry and Physics. Research Evaluation, 19(2),
105–118. EST CE QUE JE: https://doi.org/10.3152/095820210X510089
Bornmann, L., & Mutz, R.. (2015un). Growth rates of modern science:
A bibliometric analysis based on the number of publications and
cited references. Journal of the Association for Information Science
and Technology, 66(11), 2215–2222. EST CE QUE JE: https://est ce que je.org/10
.1002/asi.23329

Bornmann, L., & Mutz, R.. (2015b). How well does a university per-
form in comparison with its peers? The use of odds, and odds

ratios, for the comparison of institutional citation impact using
the Leiden Rankings. Journal of the Association for Information
Science and Technology, 66(12), 2711–2713. EST CE QUE JE: https://est ce que je
.org/10.1002/asi.23451

Bornmann, L., Mutz, R., Hug, S., & Daniel, H. (2011). A multilevel
meta-analysis of studies reporting correlations between the h index
and 37 different h index variants. Journal of Informetrics, 5(3),
346–359. EST CE QUE JE: https://doi.org/10.1016/j.joi.2011.01.006

Bornmann, L., Schier, H., Marx, W., & Daniel, H.-D. (2011). Is interac-
tive open access publishing able to identify high-impact submissions?
A study on the predictive validity of Atmospheric Chemistry and
Physics by using percentile rank classes. Journal of the Association
for Information Science and Technology, 62, 61–71. EST CE QUE JE: https://
doi.org/10.1002/asi.21418

Bornmann, L., Stefaner, M., de Moya Anegon, F., & Mutz, R.. (2014).
What is the effect of country-specific characteristics on the research
performance of scientific institutions? Using multi-level statistical
models to rank and map universities and research-focused insti-
tutions worldwide. Journal of Informetrics, 8(3), 581–593. EST CE QUE JE:
https://doi.org/10.1016/j.joi.2014.04.008

Bourne, P.. E., Clark, T. W., Dale, R., De Waard, UN., Herman, JE.,
Hovy, E. H., & Shotton, D. (2012). Improving the future of research
communications and e-scholarship (Dagstuhl Perspectives
Workshop 11331). EST CE QUE JE: https://doi.org/10.4230/DAGMAN.1.1.41
Cohen, J.. (1994). The earth is round ( p < .05). American Psychologist, 49(12), 997–1003. DOI: https://doi.org/10.1037/0003-066X .49.12.997 Quantitative Science Studies 451 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . / e d u q s s / a r t i c e - p d l f / / / / 2 2 4 3 8 1 9 3 0 7 5 8 q s s _ e _ 0 0 1 2 1 p d / . f b y g u e s t t o n 0 7 S e p e m b e r 2 0 2 3 Which aspects of the Open Science agenda are most relevant to scientometric research and publishing? Colavizza, G., Hrynaszkiewicz, I., Staden, I., Whitaker, K., & McGillivray, B. (2020). The citation advantage of linking publica- tions to research data. PLOS ONE, 15(4), e0230416. DOI: https:// doi.org/10.1371/journal.pone.0230416, PMID: 32320428, PMCID: PMC7176083 Cumming, G., & Calin-Jageman, R. (2016). Introduction to the new statistics: Estimation, open science, and beyond. Milton Park, UK: Taylor & Francis. DOI: https://doi.org/10.4324/9781315708607 Egghe, L., & Rousseau, R. (2006). An informetric model for the Hirsch-index. Scientometrics, 69(1), 121–129. DOI: https://doi .org/10.1007/s11192-006-0143-8 Eysenbach, G. (2011). Can tweets predict citations? Metrics of social impact based on Twitter and correlation with traditional metrics of scientific impact. Journal of Medical Internet Research, 13(4), e123. DOI: https://doi.org/10.2196/jmir.2012, PMID: 22173204, PMCID: PMC3278109 Fecher, B., & Friesike, S. (2014). Open science: One term, five schools of thought. In S. Bartling & S. Friesike (Eds.), Opening science: The evolving guide on how the internet is changing research, collaboration and scholarly publishing (pp. 17–47). Cham, Switzerland: Springer. DOI: https://doi.org/10.1007/978 -3-319-00026-8_2 Frenken, K., Heimeriks, G. J., & Hoekman, J. (2017). What drives university research performance? An analysis using the CWTS Leiden Ranking data. Journal of Informetrics, 11(3), 859–872. DOI: https://doi.org/10.1016/j.joi.2017.06.006 Fortunato, S., Bergstrom, C. T., Börner, K., Evans, J. A., Helbing, D., … Barabási, A.-L. (2018). Science of science. Science, 359(6379). DOI: https://doi.org/10.1126/science.aao0185, PMID: 29496846, PMCID: PMC5949209 García-Peñalvo Francisco, J., García de Figuerola, C., & Merlo José, A. (2010). Open knowledge: Challenges and facts. Online Information Review, 34(4), 520–539. DOI: https://doi.org/10 .1108/14684521011072963 Glass, G. V. (1976). Primary, secondary, and meta-analysis. Educational Researcher, 5, 3–8. DOI: https://doi.org/10.3102 /0013189X005010003 Gold, E. R., Ali-Khan, S. E., Allen, L., Ballell, L., Barral-Netto, M., Carr, D., & Cook-Deegan, R. (2019). An open toolkit for tracking open science partnership implementation and impact. Gates Open Research, 3, 1442. DOI: https://doi.org/10.12688/gatesopenres .12958.2, PMID: 31850398, PMCID: PMC6904887 Guédon, J. C., Kramer, B., Laakso, M., Schmidt, B., Šimukovic(cid:1), E., … Patterson, M. (2019). Future of scholarly publishing and scholarly communication: Report of the expert group to the European Commission. https://digitalcommons.unl.edu/cgi/viewcontent .cgi?article=1098&context=scholcom (accessed July 20, 2020). Hand, D. J. (2014). The improbability principle: Why coincidences, miracles, and rare events happen every day. New York, NY: Farrar, Straus and Giroux. Hecking, T., & Leydesdorff, L. (2019). Can topic models be used in research evaluations? Reproducibility, validity, and reliability when compared with semantic maps. Research Evaluation, 28(3), 263–272. 
DOI: https://doi.org/10.1093/reseval/rvz015 Hempel, C. G., & Oppenheim, P. (1948). Studies in the logic of explanation. Philosophy of Science, 15, 135–175. DOI: https:// doi.org/10.1086/286983 Irvine, J., & Martin, B. R. (1984). Foresight in science: Picking the winners. London, UK: Frances Pinter. Khan, N. (2019). Does data sharing influence data reuse in biodi- versity? A citation analysis. http://www.diva-portal.org/smash /get/diva2:1267704/FULLTEXT01.pdf (accessed February 18, 2020). Kupferschmidt, K. (2018). More and more scientists are preregistering their studies. Should you? https://www.sciencemag.org/news /2018/09/more-and-more-scientists-are-preregistering-their-studies -should-you (accessed October 16, 2019). DOI: https://doi.org /10.1126/science.aav4786 Latour, B. (1987). Science in action. Milton Keynes, UK: Open University Press. Latour, B., & Woolgar, S. (1979). Laboratory life: The social con- struction of scientific facts. Thousand Oaks, CA: Sage. Leydesdorff, L., Bornmann, L., & Mingers, J. (2019). Statistical signifi- cance and effect sizes of differences among research universities at the level of nations and worldwide based on the Leiden rankings. Journal of the Association for Information Science and Technology, 70(5), 509–525. DOI: https://doi.org/10.1002/asi.24130 Marewski, J. N., & Bornmann, L. (2018). Opium in science and society: Numbers. https://arxiv.org/abs/1804.11210 (accessed February 18, 2020). Marx, W. (2014). The Shockley-Queisser paper—A notable example of a scientific sleeping beauty. Annalen der Physik, 526(5–6), A41–A45. DOI: https://doi.org/10.1002/andp.201400806 Merton, R. K. (1942). Science and technology in a democratic order. Journal of Legal and Political Sociology, 1, 115–126. Moylan, E. C., Harold, S., O’Neill, C., & Kowalczuk, M. K. (2014). Open, single-blind, double-blind: Which peer review process do you prefer? BMC Pharmacology & Toxicology, 15. DOI: https:// doi.org/10.1186/2050-6511-15-55, PMID: 25266119, PMCID: PMC4191873 Mulligan, A., Hall, L., & Raphael, E. (2013). Peer review in a changing world: An international study measuring the attitudes of researchers. Journal of the American Society for Information Science and Technology, 64(1), 132–161. DOI: https://doi.org /10.1002/asi.22798 Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. Proceedings of the National Academy of Sciences of the United States of America, 115(11), 2600–2606. DOI: https://doi.org/10.1073/pnas .1708274114, PMID: 29531091, PMCID: PMC5856500 Nosek, B. A., Alter, G., Banks, G. C., Borsboom, D., Bowman, S. D., … Yarkoni, T. (2015). Promoting an open research culture. Science, 348(6242), 1422–1425. DOI: https://doi.org/10.1126 /science.aab2374, PMID: 26113702, PMCID: PMC4550299 Park, H., You, S., & Wolfram, D. (2018). Informal data citation for data sharing and reuse is more common than formal data citation in biomedical fields. Journal of the Association for Information Science and Technology, 69(11), 1346–1354. DOI: https://doi .org/10.1002/asi.24049 Piwowar, H. A., & Vision, T. J. (2013). Data reuse and the open data citation advantage. PeerJ, 1, e175. DOI: https://doi.org/10 .7717/peerj.175, PMID: 24109559, PMCID: PMC3792178 PLOS. (2019). PLOS journals now open for published peer review. (https://blogs.plos.org/plos/2019/05/plos-journals-now-open-for -published-peer-review). Popper, K. R. (1967). The poverty of historicism. London, UK: Routledge and Kegan Paul. 
Robinson-García, N., Jiménez-Contreras, E., & Torres-Salinas, D. (2016). Analyzing data citation practices using the data citation index. Journal of the Association for Information Science and Technology, 67(12), 2964–2975. DOI: https://doi.org/10.1002/asi.23529 Rodriguez-Bravo, B., Nicholas, D., Herman, E., Boukacem-Zeghmouri, C., Watkinson, A., Xu, J., Abrizah, A., & Swigon, M. (2017). Peer review: The experience and views of early career researchers. Learned Publishing, 30, 269–277. DOI: https://doi.org/10.1002 /leap.1111. ://WOS:000412116600003




Rushforth, UN., & de Rijcke, S. (2015). Accounting for impact? The journal
impact factor and the making of biomedical research in the
Netherlands. Minerva, 53(2), 117–139. EST CE QUE JE: https://doi.org/10.1007
/s11024-015-9274-5, PMID: 26097258, PMCID: PMC4469321

Schmidt, B., Ross-Hellauer, T., van Edig, X., & Moylan, E. (2018). Ten
considerations for open peer review [version 1; peer review: 2
approved]. F1000Research, 7(969). EST CE QUE JE: https://doi.org/10.12688
/f1000research.15334.1, PMID: 30135731, PMCID: PMC6073088
Spellman, B. UN., Gilbert, E. UN., & Corker, K. S. (2018). Open science.
In J. T. Wixted (Ed.), Stevens’ handbook of experimental psy-
chology and cognitive neuroscience. Wiley Online Library. EST CE QUE JE:
https://doi.org/10.1002/9781119170174.epcn519

Tabah, UN. N. (1999). Literature dynamics: Studies on growth, diffu-
sion, and epidemics. Annual Review of Information Science and
Technologie, 34, 249–286.

Tennant, J., Agarwal, R., Baždarić, K., Brassard, D., Crick, T., …
Yarkoni, T. (2020). A tale of two “opens”: Intersections between
free and open source software and open scholarship. EST CE QUE JE:
https://doi.org/10.31235/osf.io/2kxq8 (accessed July 30, 2020).
Tenopir, C., Dalton, E. D., Allard, S., Frame, M., Pjesivac, JE.,
Dorsett, K. (2015). Changes in data sharing and data reuse prac-
tices and perceptions among scientists worldwide. PLOS ONE,
10(8), e0134826. EST CE QUE JE: https://doi.org/10.1371/journal
.pone.0134826, PMID: 26308551, PMCID: PMC4550246

Thelwall, M., & Kousha, K. (2017). Do journal data sharing man-
dates work? Life sciences evidence from Dryad. Aslib Journal of
Information Management, 69(1), 36–45. EST CE QUE JE: https://doi.org
/10.1108/AJIM-09-2016-0159

Thelwall, M.. & Mas-Bleda, UN. (2020). How common are explicit
research questions in journal articles? Études scientifiques quantitatives,
1(2), 730–748. EST CE QUE JE: https://doi.org/10.1162/qss_a_00041

Thelwall, M., Munafo, M., Mas-Bleda, UN., Stuart, E., Makita, M.,
Kousha, K. (2020). Is useful research data usually shared? Un
investigation of genome-wide association study summary statistics.
PLOS ONE, 15(2), e0229578. EST CE QUE JE: https://doi.org/10.1371/journal
.pone.0229578, PMID: 32084240, PMCID: PMC7034915

Thor, UN., & Bornmann, L. (2011). The calculation of the single publica-
tion h index and related performance measures: A web application
based on Google Scholar data. Online Information Review, 35(2),
291–300. EST CE QUE JE: https://doi.org/10.1108/14684521111128050

Tozzi, C. (2017). For fun and profit: A history of the free and open
source software revolution. Cambridge, MA: MIT Press. DOI:
https://doi.org/10.7551/mitpress/10803.001.0001

van den Daele, W., & Weingart, P. (1975). Resistenz und
Rezeptivität der Wissenschaft zu den Entstehungsbedingungen
neuer Disziplinen durch wissenschaftliche und politische
Steuerung. Zeitschrift für Soziologie, 4(2), 146–164. DOI:
https://doi.org/10.1515/zfsoz-1975-0204

Van Noorden, R.. (2013). Open access: The true cost of science
édition. Nature, 495(7442), 426–429. EST CE QUE JE: https://est ce que je.org/10
.1038/495426un, PMID: 23538808

Velden, T., Hinze, S., Scharnhorst, UN., Schneider, J.. W., & Waltman, L.
(2018). Exploration of reproducibility issues in scientometric
recherche. In P. Wouters (Ed.), Proceedings of the 23rd International
Conference on Science and Technology Indicators (pp. 612–624).
Leiden, the Netherlands: Leiden University. https://openaccess
.leidenuniv.nl/handle/1887/65315

Visser, M., Van Eck, N. J., & Waltman, L. (2020). Large-scale com-
parison of bibliographic data sources: Scopus, Web of Science,
Dimensions, Crossref, and Microsoft Academic. https://arxiv.org
/abs/2005.10732 (accessed May 22, 2020).

Waltman, L. (2019). From Journal of Informetrics to Quantitative
Science Studies. http://www.issi-society.org/blog/posts/2019
/april/from-journal-of-informetrics-to-quantitative-science-studies
(accessed March 9, 2020).

Watson, M.. (2015). When will “open science” become simply
“science”? Genome Biology, 16(1), 101. EST CE QUE JE: https://doi.org
/10.1186/s13059-015-0669-2, PMID: 25986601, PMCID:
PMC4436110

Whitley, R.. D. (1984). The intellectual and social organization of

the sciences. Oxford, ROYAUME-UNI: Presse universitaire d'Oxford.

Wilkinson, M.. D., Dumontier, M., Aalbersberg, je. J., Appleton, G.,
Axton, M., … Mons, B. (2016). Comment: The FAIR guiding prin-
ciples for scientific data management and stewardship. Scientific
Données, 3. EST CE QUE JE: https://doi.org/10.1038/sdata.2016.18, PMID:
26978244, PMCID: PMC4792175

Willinsky, J.. (2005). Open journal systems: An example of open source
software for journal management and publishing. Library Hi Tech,
23(4), 504–519. EST CE QUE JE: https://doi.org/10.1108/07378830510636300
Wouters, P., Hellsten, JE., & Leydesdorff, L. (2004). Internet time and
the reliability of search engines. First Monday, 9(10). http://www
.firstmonday.org/issues/issue9_10/wouters/index.html. EST CE QUE JE:
https://doi.org/10.5210/fm.v9i10.1177

Zhao, M.. N., Yan, E. J., & Li, K. (2018). Data set mentions and
citations: A content analysis of full-text publications. Journal de
the Association for Information Science and Technology, 69(1),
32–46. EST CE QUE JE: https://doi.org/10.1002/asi.23919

Ziman, J.. (2000). Real science: What it is and what it means.
Cambridge, ROYAUME-UNI: la presse de l'Universite de Cambridge. EST CE QUE JE: https://est ce que je
.org/10.1017/CBO9780511541391

