RESEARCH ARTICLE

How reliable are unsupervised author
disambiguation algorithms in the assessment
of research organization performance?

Giovanni Abramo1

and Ciriaco Andrea D’Angelo1,2

1Laboratory for Studies in Research Evaluation, Institute for System Analysis and Computer Science (IASI-CNR),
National Research Council of Italy, Rome, Italy
2Department of Engineering and Management, University of Rome “Tor Vergata,” Rome, Italy

Keywords: author name disambiguation, evaluative scientometrics, FSS, Italy, research assessment,
universities

ABSTRACT

Assessing the performance of universities by output-to-input indicators requires knowledge
of the individual researchers working within them. Although in Italy the Ministry of University
and Research updates a database of university professors, in all those countries where such
databases are not available, measuring research performance is a formidable task. One
possibility is to trace the research personnel of institutions indirectly through their publications,
using bibliographic repertories together with author name disambiguation algorithms. This
work evaluates the goodness-of-fit of the Caron and van Eck (CvE) unsupervised algorithm by
comparing the research performance of Italian universities resulting from its application for the
derivation of the universities’ research staff, with that resulting from the supervised algorithm of
D’Angelo, Giuffrida, and Abramo (2011), which avails of input data. Results show that the CvE
algorithm overestimates the size of the research staff of organizations by 56%. Nonetheless,
the performance scores and ranks recorded in the two compared modes show a significant and
high correlation. Still, nine out of 69 universities show rank deviations of two quartiles.
Measuring the extent of distortions inherent in any evaluation exercise using unsupervised
algorithms can inform policymakers’ decisions on building national research staff databases,
instead of settling for the unsupervised approaches.

1. INTRODUCTION

The tools of performance assessment play a fundamental role in the strategic planning and
analysis of national and regional research systems, member organizations and individuals.
At the level of research organizations, assessment serves in identifying fields of strength and
weakness, which in turn inform competitive strategies, organizational restructuring, resource
allocation, and individual incentive systems. For regions and countries, knowledge of strengths
and weaknesses relative to others, and also the comparative performances of one’s own
research institutions, enables formulation of informed research policies and selective
allocation of public funding across fields and institutions. By assessing performance before versus
after, institutions and governments can evaluate the impact of their strategic actions and
implementation of policy (Karlsson, 2017; Gläser & Laudel, 2016). Performance-based research
funding and rewards stimulate improvement of performance. Such assessments also serve in
reducing information asymmetries between the suppliers (researchers, institutions, territories)

an open access

journal

Citation: Abramo, G., & D’Angelo, C. A.
(2023). How reliable are unsupervised
author disambiguation algorithms in the
assessment of research organization
performance? Quantitative Science
Studies, 4(1), 144–166. https://doi.org/10.1162/qss_a_00236

DOI:
https://doi.org/10.1162/qss_a_00236

Peer Review:
https://www.webofscience.com/api/gateway/wos/peer-review/10.1162/qss_a_00236

Received: 19 July 2022
Accepted: 18 December 2022

Corresponding Author:
Giovanni Abramo
giovanni.abramo@iasi.cnr.it

Handling Editor:
Ludo Waltman

Copyright: © 2023 Giovanni Abramo
and Ciriaco Andrea D’Angelo.
Published under a Creative Commons
Attribution 4.0 International (CC BY 4.0)
license.

The MIT Press


Unsupervised author disambiguation algorithms

and the end users of research (companies, students, investors). At the macroeconomic level,
this yields two benefits, resulting in a virtuous circle: a) in selecting research
suppliers, users can make more effective choices; and b) suppliers, aiming to attract more
users, will be stimulated to improve their research production. The reduction of asymmetric
information is also beneficial within the scientific communities themselves, particularly in the
face of the increasing challenges of complex interdisciplinary research, by lowering obstacles
among prospective partners as they seek to identify others suited for inclusion in team-building.

Over recent years, the stakeholders of research systems have demanded more timely
assessment, capable of informing in an ever more precise, reliable, and robust manner (Zacharewicz,
Lepori et al., 2019). Bibliometrics, and in particular evaluative bibliometrics, has the great
advantage of enabling large-scale research evaluations with levels of accuracy, costs, and
timescales far more advantageous than traditional peer-review (Abramo, D’Angelo, & Reale, 2019),
as well as possibilities for informing small-scale peer-review evaluations. For years, in view of
the needs expressed by policymakers, research managers, and stakeholders in general, scholars
have continuously improved the indicators and methods of evaluative bibliometrics. In our
opinion, however, the key factor holding bibliometricians back from a great leap forward is
the lack of input data, which in almost all nations have been very difficult to assemble.

In all production systems, the comparative performance of any unit is always given by the
ratio of outputs to inputs. In the case of research systems, the inputs or production factors consist
basically of labor (the researchers) and capital (all resources other than labor, e.g., equipment,
facilities, databases, etc.). For any research unit, therefore, comparison to another demands that
we are informed of the component researchers, and the resources they draw on for conducting
their research. Moreover, bias in results would occur unless we are also informed of the prevailing
research discipline of each researcher, because output (new knowledge encoded in
publications and the like), all inputs equal, is in part a function of discipline (Sorzano, Vargas et al.,
2014; Piro, Aksnes, & Rørstad, 2013; Lillquist & Green, 2010; Sandström & Sandström,
2009; Iglesias & Pecharromán, 2007; Zitt, Ramanana-Rahary, & Bassecoulard, 2005): scholars
of blood diseases, for example, publish an average of about five times as much as scholars of
legal medicine (D’Angelo & Abramo, 2015). Finally, the measure of the researcher’s
contribution to each scientific output should also take into account the number of coauthors, and in
some cases their position in the author list (Waltman & van Eck, 2015; Abramo, D’Angelo, &
Rosati, 2013; Aksnes, Schneider, & Gunnarsson, 2012; Huang, Lin, & Chen, 2011; Gauffriau &
Larsen, 2005; van Hooydonk, 1997; Rinia, De Lange, & Moed, 1993).

Yet for many years, regardless of all the above requirements, organizations have regularly
published research institution performance rankings that are coauthor, size, and field
dependent, among which the most renowned would be the Academic Ranking of World
Universities (ARWU),1 issued by Shanghai Jiao Tong University, and the Times Higher Education
World University Rankings.2 Despite the strong distortions in these rankings (Butler, 2010;
Dehon, McCathie, & Verardi, 2010; Turner, 2005; van Raan, 2005), many decision-makers
persist in giving them serious credit. One of the most recent gestures of the sort came in
May 2022, when the British government, intending well for those seeking immigration but
without a job offer, offered early-career “High Potential Individuals” the possibility of a visa,
subject to graduation within the past five years from an eligible university, meaning any
university placing near the top of the above (highly distorted) rankings.3

1 https://www.shanghairanking.com (last accessed 15/12/2022).
2 https://www.timeshighereducation.com/world-university-rankings (last accessed 15/12/2022).
3 https://www.gov.uk/government/publications/high-potential-individual-visa-global-universities-list.

Quantitative Science Studies


To get around the obstacle of missing input data, some bibliometricians have seen a
solution in the so-called size-independent indicators of research performance, among these the
world-famous mean normalized citation score, or MNCS (Waltman, van Eck et al., 2011;
Moed, 2010). However, these indicators result in performance scores and ranks that are
different from those obtained using other indicators, such as FSS (fractional scientific strength),
which do account for inputs, albeit with certain unavoidable assumptions.4 But the FSS
indicator has thus far been applied in only two countries, both with the advantage of government
records on inputs: extensively, in Italy, for the evaluation of performance at the level of
individuals (Abramo & D’Angelo, 2011) and then aggregated at the levels of research field and
university (Abramo, D’Angelo, & Di Costa, 2011), and to a lesser extent in Norway, with
additional assumptions (Abramo, Aksnes, & D’Angelo, 2020).

For policymakers and administrators, but also all interested others, the question then
becomes: “in demanding and/or using large-scale assessments of the positioning of research
institution performance, what margin of error is acceptable in the measure of their scores
and ranks?” To give an idea of the potential margins of error, a comparison of
research-performance scores and ranks of Italian universities by MNCS and FSS revealed that 48.4%
of universities shifted quartiles under these two indicators, and that 31.3% of universities in the
top quartile by FSS fell into lower quartiles by MNCS (Abramo & D’Angelo, 2016c).

Italy is an almost unique case in the provision of data on the research staff of
universities, as needed for institutional performance evaluation. Here, at the close of each
year, the Ministry of University and Research (MUR) updates a database of all university
faculty members, listing the first and last names of each researcher, their gender, institutional
affiliation, field classification and academic rank.5 The Norwegian Research Personnel
Register also offers a useful database of statistics,6 including notation of the capital cost of research
per person-year aggregated at area level, based on regular reports from the institutions to the
Nordic Institute for Studies in Innovation, Research and Education (NIFU).7

The challenge that practitioners face is then how to apply output-input indicators of
research performance aligned with the microeconomic theory of production (like FSS) in all those
countries where databases of personnel are not maintained. One possibility is to trace the research
personnel of the institutions indirectly, through their publications, using bibliographic repertories
such as Scopus or Web of Science (WoS) and, referring exclusively to bibliometric metadata,
applying algorithms for disambiguation of authors’ names and reconciliation of the institutions’ names.

Computer scientists and bibliometricians have developed several unsupervised algorithms
for disambiguation, at national and international levels (Rose & Kitchin, 2019; Backes, 2018a;
Hussain & Asghar, 2018; Zhu, Wu et al., 2017; Liu, Doğan et al., 2014; Caron & van Eck, 2014;
Schulz, Mazloumian et al., 2014; Wu, Le et al., 2014; Wu & Ding, 2013; Cota, Gonçalves, &
Laender, 2007). The term unsupervised signifies that the algorithms operate without manually
labelled data, instead approaching the author-name disambiguation problem as a clustering
task, where each cluster would contain all the publications written by a specific author. Tekles
and Bornmann (2020), using a large validation set containing more than one million author
mentions, each annotated with a Researcher ID (an identifier maintained by the researchers),
compared a set of such unsupervised disambiguation approaches. The best performing

4 A thorough explanation of the theory and assumptions underlying FSS can be found in Abramo et al. (2020).
5 https://cercauniversita.cineca.it/php5/docenti/cerca.php, last accessed 15/12/2022.
6 https://www.nifu.no/en/statistics-indicators/4897-2/, last accessed 15/12/2022.
7 https://www.foustatistikkbanken.no/nifu/?language=en, last accessed 15/12/2022.


algorithm turned out to be the one by Caron and van Eck (2014), hereinafter “CvE.” As discussed
above, however, the conduct of performance comparisons at organizational level requires more
than just precision in unambiguously attributing publications to each author. At that point we
also need precise identification of the research staff of each organization,8 the fields of research,
etc. And so if the aim is to apply bibliometrics for the comparative evaluation of organizational
research performance, the goodness of the algorithms should be assessed on the basis of the
precision with which they actually enable measurement of such performance.

To assess it, we compare measurements of the research performance of universities in
the Italian academic system, which arise from the application of the aforementioned
CvE unsupervised algorithm, with those arising from the use of the supervised algorithm by
D’Angelo et al. (2011), hereinafter “DGA.” Over more than a decade, this algorithm has been
applied by the authors for feeding and continuous updating of the Public Research
Observatory of Italy (ORP), a database derived under license from Clarivate Analytics’ WoS Core
Collection. It indexes the scientific production of Italian academics at individual level,
achieving a 97% harmonic mean of precision and recall (F-measure),9 thanks to the operation
of the DGA algorithm, which avails of a series of “certain” metadata available in the MUR
database on university personnel, including their institutional affiliation, academic rank, years
of tenure, field of research, and gender (for details see D’Angelo et al., 2011).
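As a reading aid, the F-measure referred to above can be sketched in a few lines of Python. This is a generic illustration of the standard definitions (precision as the share of retrieved assignments that are correct, recall as the share of correct assignments that are retrieved), not the validation code behind the ORP; the function name and the counts in the example are hypothetical.

```python
def precision_recall_f(true_pos: int, false_pos: int, false_neg: int) -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) from counts of publication-author assignments."""
    retrieved = true_pos + false_pos   # assignments made by the algorithm
    relevant = true_pos + false_neg    # assignments that should have been made
    precision = true_pos / retrieved if retrieved else 0.0
    recall = true_pos / relevant if relevant else 0.0
    denom = precision + recall
    # The F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
p, r, f = precision_recall_f(true_pos=90, false_pos=10, false_neg=30)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Because it is a harmonic mean, the F-measure is pulled toward the lower of the two values, so a high score requires both few wrong assignments and few missed ones.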

Given the maturity of the ORP, developed and refined year by year through the manual
correction of the rare false cases, it can be considered a reliable benchmark, or a gold
standard, against which to measure the deviations referable to any evaluation conducted using
unsupervised methods. The deviations, as we shall see in some detail, are attributable to
causes beyond simply the lesser ability of such methods in correctly disambiguating authorship. The
aim of our work, however, is not to criticize CvE (which is the best alternative when data
on staff are not available) but to give bibliometricians, practitioners, and especially decision
makers, an idea of the extent of distortions in the research performance ranks of research
institutions at overall and area level when forced to use unsupervised algorithms of this kind, rather
than supervised ones based on research staff databases, such as DGA.10


In a nutshell, a) we evaluate the ability of the rule-based CvE algorithm to disambiguate
authorships and the relevant affiliations; then b) we extend the CvE evaluation to its
application in research assessment exercises. The paper is organized as follows: Section 2 presents the
methodology and describes the data and methods used. In Section 3 we show the results of the
analysis. Section 4 concludes, summarizing and also commenting on the results, particularly for
practitioners and scholars who may wish to replicate the exercise in other geographical and
institutional frameworks.


2. DATA AND METHODS

2.1. Identification of Research Staff

The assessment of the comparative research performance of an organization cannot proceed
without survey of the scientific activity of its individual researchers, because evaluations that

8 The affiliation in the byline, in some cases multiple, is not always reliable to unequivocally identify the
organization to which the author belongs.
9 The most frequently used indicators to measure the reliability of bibliometric data sets are precision and
recall, which originate from the field of information retrieval (Hjørland, 2010). Precision is the fraction of
retrieved instances that are relevant and recall is the fraction of relevant instances that are retrieved.
10 This is a conservative measure of distortions, as it refers to the application of the best performing
unsupervised algorithm according to Tekles and Bornmann (2020).


operate directly at an aggregate level, without accounting for the sectoral distribution of input,
produce results with unacceptable error (Abramo & D’Angelo, 2011). Analyses at micro level,
however, presuppose precise knowledge of the research staff of the organization, as well as of
all “competitor” organizations eligible for comparative evaluation. Adopting an unsupervised
approach to solve this task implies using information embedded in bibliometric repositories.
For example, let us consider this publication:

Abramo, G., & D’Angelo, C. A. (2022). Drivers of academic engagement in public–private
research collaboration: An empirical study. Journal of Technology Transfer, 47(6),
1861–1884.

WoS supplies, among others, an “address list” field containing, for each author, the relevant
affiliation:

• [Abramo, Giovanni] Natl Res Council Italy, Lab Studies Res Evaluat, Inst Syst Anal &
  Comp Sci IASI CNR, Via Taurini 19, I-00185 Rome, Italy;

• [D’Angelo, Ciriaco Andrea] Univ Rome Tor Vergata Italy, Lab Studies Res Evaluat IASI
  CNR, Dipartimento Ingn Impresa, Via Politecn 1, I-00133 Rome, Italy.


From this field we can infer that the first author (Giovanni Abramo) is part of the research
staff of the National Research Council of Italy and the second one (Ciriaco Andrea D’Angelo),
of University of Rome “Tor Vergata.”

Aiming at inferring the research staff of all research organizations of a country, one can
analyze the set of publications in a given time window showing at least one affiliation of that
country. However, it must be taken into account that

• each organization may appear in many different ways (for Rome “Tor Vergata” there
  are dozens of variants in 2015–2019 WoS publications);

• the same author may appear under different names in different publications (the second
  author of the above publication also appears in WoS as Andrea D’Angelo and Andrea
  Ciriaco D’Angelo);

• many publications list authors with last name and first name initial (D’Angelo, A.;
  D’Angelo, C.A.; D’Angelo A.C.), and in a very large data set, the cases of homonymy
  can be numerically very relevant (at University of Rome “Tor Vergata” there are two
  “D’Angelo, A.” professors, and probably just as many nonacademic staff).

All this makes it very complex to disambiguate authors’ identity and, as a consequence, to know
“who” works “where.”
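The difficulty can be made concrete with a small Python sketch. It groups the name variants quoted above by a naive key (normalized last name plus first initial), a common first step in name matching that is used here purely for illustration; it is neither the CvE nor the DGA procedure, and the function name `blocking_key` is hypothetical.

```python
import unicodedata
from collections import defaultdict


def blocking_key(last_name: str, first_name: str) -> tuple[str, str]:
    """Naive key: letters of the last name (lowercased) plus the first initial."""
    def letters_only(text: str) -> str:
        # Decompose accented characters, then keep alphabetic characters only.
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(ch for ch in decomposed.lower() if ch.isalpha())

    return letters_only(last_name), letters_only(first_name)[:1]


# Name variants taken from the examples in the text.
variants = [
    ("D’Angelo", "Ciriaco Andrea"),
    ("D’Angelo", "Andrea"),
    ("D’Angelo", "Andrea Ciriaco"),
    ("D’Angelo", "C.A."),
    ("D’Angelo", "A.C."),
    ("D’Angelo", "A."),
]

blocks = defaultdict(list)
for last, first in variants:
    blocks[blocking_key(last, first)].append(f"{last}, {first}")

for key, names in sorted(blocks.items()):
    print(key, names)
```

Under this key the variants of a single author fall into two different groups, while the group with initial “a” would also absorb any homonymous “D’Angelo, A.”, which is why additional metadata (affiliation, email, coauthors, field) are needed to resolve identities.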

As explained above, the current study aims to compare the outcomes of the evaluation of
research performance by Italian universities, based on two different bibliometric data sets,
obtained from the application of two disambiguation methods:

• The first data set, hereinafter “CWTS,” relies on the CvE unsupervised approach, a
  rule-based scoring and oeuvre identification method for disambiguation of authors used for
  the WoS in-house database of the Centre for Science and Technology Studies (CWTS) at
  Leiden University.

• The second one, hereinafter “ORP,” relies on the DGA supervised heuristic approach,
  which “links” the Italian National Citation Report (indexing all WoS articles by those
  authors who indicated Italy as an affiliation country) with data retrieved from the
  database maintained by the MUR, indexing the full name, academic rank, research field and
  institutional affiliation of all researchers at Italian universities, at the close of each year.

Much fuller descriptions of the DGA and CvE approaches can be found in D’Angelo and
van Eck (2020).

In ORP

• a priori, the availability of MUR data allows precise knowledge of the members of the
  research staff of national universities;

• the census of their scientific production is then carried out by applying the DGA
  algorithm to the Italian WoS publications.

The CWTS database does not rely on any national research personnel databases to derive
the research staff of an organization. It derives it by means of the CvE algorithm. Specifically, to
identify the research staff of Italian universities, we apply the CvE algorithm, which associates
clusters of publications with a cluster label (a kind of “protoindividual”) on the basis of the
similarity of the publications’ metadata. We name the resulting data set of Italian professors
the CWTS data set, not to be confused with the CWTS database. In Table 1 we give as an
example the information that the algorithm associates with the cluster referring to the second
author of the present work (Ciriaco Andrea D’Angelo).

In particular, each protoindividual, uniquely identified by means of a “cluster_id”
(56122902 in the example in Table 1), is associated with an “organization” (univ roma tor

Table 1. Description of the output of the CvE approach for one of the authors of this paper

Field             Value
cluster_id        56122902
n_pubs            129
first_year        1996
last_year         2020
“Academic” age    24
full_name         d’angelo, ca
last_name         d’angelo
first_name        ciriaco andrea
email             dangelo@dii.uniroma2.it
organization      univ roma tor vergata
city              rome
country           italy
orcid             0000-0002-6977-6611
researcherid      J-8162-2012


vergata) on the basis of the most recurrent and recent affiliation in the publications assigned to
it, as well as an email (dangelo@dii.uniroma2.it) on the basis of the same criterion. For the
purposes of our work, we will consider the CvE algorithm as a black box and simply use the
output of its application to the Leiden in-house WoS database.

Therefore, the identification of the research staff of each Italian university is based on the
“organization” and “email” fields of all clusters identified by CvE and, specifically, on all possible
variants of the prevailing “organization” and “email” of the protoindividuals indexed in
CWTS. As for the “organization,” the task has been accomplished by manually labelling all
variants in the output of the CvE algorithm, imposing country “Italy.” As for the “email,” the
national academic system provides for a standardized web domain (e.g., “uniroma2.it” for
Rome “Tor Vergata,” “unimi.it” for University of Milano, “unipa.it” for University of Palermo).

We can therefore extract the relevant information by putting together the two criteria,
that is, by concatenating two distinct queries for each university to be evaluated. In this respect,
the box below shows the query related to the extraction of clusters for University of Rome
“Tor Vergata.”11

([organization] = (“state univ rome tor vergata” OR “tor vergata univ”
OR “tor vergata univ rome” OR “univ roma tor vergata”
OR “univ roma tor vergata 2” OR “univ tor vergata”)

OR [email] like ‘%uniroma2.it’)

AND [country]=‘italy’
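The logic of the query in the box can be paraphrased in Python as follows. This is only a sketch under stated assumptions: clusters are represented as plain dictionaries with the field names of Table 1, the function name `matches_university` is hypothetical, and the actual extraction was run as a database query, not with this code. The second and third example records are illustrative.

```python
# "organization" variants and email domain listed in the box for Rome "Tor Vergata".
ORG_VARIANTS = {
    "state univ rome tor vergata",
    "tor vergata univ",
    "tor vergata univ rome",
    "univ roma tor vergata",
    "univ roma tor vergata 2",
    "univ tor vergata",
}
EMAIL_SUFFIX = "uniroma2.it"


def matches_university(cluster: dict) -> bool:
    """(organization in variants OR email ends with the domain) AND country is Italy."""
    organization = (cluster.get("organization") or "").lower()
    email = (cluster.get("email") or "").lower()
    country = (cluster.get("country") or "").lower()
    org_ok = organization in ORG_VARIANTS
    email_ok = email.endswith(EMAIL_SUFFIX)  # mirrors: like '%uniroma2.it'
    return (org_ok or email_ok) and country == "italy"


clusters = [
    # Record from Table 1.
    {"cluster_id": 56122902, "organization": "univ roma tor vergata",
     "email": "dangelo@dii.uniroma2.it", "country": "italy"},
    # Email matches but organization does not: retrieved, to be checked later.
    {"cluster_id": 1, "organization": "natl res council italy",
     "email": "giovanni.abramo@uniroma2.it", "country": "italy"},
    # Hypothetical record belonging to another university: not retrieved.
    {"cluster_id": 2, "organization": "univ milano",
     "email": "someone@unimi.it", "country": "italy"},
]

selected = [c["cluster_id"] for c in clusters if matches_university(c)]
print(selected)
```

Using OR between the two conditions favors recall: a cluster is retrieved when either field points to the university, at the price of retrieving some clusters that must later be discarded.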

Note that our framework refers to national research assessment exercises, whereby the
universities are evaluated on the basis of the performance of the researchers working within them
at the time of the launch of the exercise. The underlying rationale is that performance-based
research funding looks forward, concentrating on the potential of “current” research staff.
Because in Italy professors’ mobility is limited (Abramo, D’Angelo, & Di Costa, 2022), we
assume that the prevailing “organization” identified by the CvE refers to the current affiliation
of the evaluated scholar.

For reasons of robustness, after the first extraction based on the above query, we eliminated
universities for which the procedure had identified fewer than 30 clusters (typically telematic
or predominantly humanistic universities). For the remaining universities (65 in all), we
performed subsequent quality control and tuning operations. The combination of the two
conditions ([organization] OR [email]) in fact makes it possible to maximize the recall of
the procedure, but also inevitably generates false positives, that is, retrieval of subjects that
do not actually belong to the institution in question. Such “incoherent” clusters include those

• where the “organization” remains ambiguous, or does not refer to a recognized
  university;

11 It can be a formidable task, looking at all bibliometric addresses, to identify the variants of “organization”
attributable to a single institution (Backes, 2018b). In the present case, we are dealing with 90 universities
and manually scanning the 4787 total name variants “associated” with their official emails. Practitioners
facing larger numbers or constraints on resources could decide to manually check only the first “n” in terms
of frequency. The six “organization” variants in the case shown in the box, for example, account for 95% of
all clusters linked with a “%uniroma2.it” email.


• with email not referring to a university;
• with email and organization referring to different universities.

These control operations, for example, lead to the exclusion of the first author of the current
article (Giovanni Abramo) from the data set, for whom the initial extraction gives an assignment
to the University of Rome “Tor Vergata” because of the email (giovanni.abramo@uniroma2.it),
even though his most recurrent organization is “natl res council italy,” that is, a nonacademic
research body.

Moreover, to exclude “occasional” and terminated researchers, we impose

• an “academic age” of at least 4 years (given by the difference between the years of the
  most recent and the first publications);

• the most recent publication of the cluster no earlier than 2020.
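The two conditions above amount to a simple filter on the “first_year” and “last_year” fields of each cluster. The following Python sketch shows one way to express it; the function name and the dictionary representation are assumptions made for illustration, while the thresholds are those stated in the text.

```python
def is_active_researcher(cluster: dict, min_academic_age: int = 4, min_last_year: int = 2020) -> bool:
    """Keep clusters with academic age >= 4 years and a latest publication in 2020 or later."""
    # Academic age: years between the first and the most recent publication.
    academic_age = cluster["last_year"] - cluster["first_year"]
    return academic_age >= min_academic_age and cluster["last_year"] >= min_last_year


# Values for the cluster shown in Table 1: academic age 2020 - 1996 = 24.
table1_cluster = {"cluster_id": 56122902, "first_year": 1996, "last_year": 2020}
print(is_active_researcher(table1_cluster))
```

A cluster whose publications span only 2018–2020 would be dropped as “occasional,” and one whose last publication dates from 2017 would be dropped as presumably terminated.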

Finally, conflicts involving distinct clusters that share the same ORCID identifier
or email are resolved manually.12

2.2. Research Performance Measurement

To assess the yearly average performance of each researcher over a period of time, we use
the Fractional Scientific Strength ($FSS_R$) indicator of research productivity, defined as

\[
FSS_R = \frac{1}{t} \sum_{i=1}^{N} \frac{c_i}{\bar{c}} f_i \qquad (1)
\]

where

$t$ = number of years of work of the researcher in the period under observation;13
$N$ = number of publications14 of the researcher in the period under observation;
$c_i$ = citations received by publication $i$;
$\bar{c}$ = average of the distribution of citations received by all publications in the same year and
WoS subject category (SC) of publication $i$;
$f_i$ = fractional contribution of the researcher to publication $i$, given by the inverse of the
number of coauthors in the byline.
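Equation 1 translates directly into code. The Python sketch below is a minimal illustration, assuming that for each publication the citation count, the year-and-SC citation average, and the number of coauthors are already available; the class and function names are hypothetical and the numbers in the example are invented for the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    citations: float    # c_i: citations received by the publication
    field_mean: float   # c-bar: average citations for the same year and SC (assumed > 0)
    n_coauthors: int    # byline size; the fractional contribution is f_i = 1 / n_coauthors


def fss_r(publications: list[Publication], years: float) -> float:
    """Equation 1: (1/t) * sum over publications of (c_i / c-bar) * f_i."""
    if years <= 0:
        raise ValueError("years must be positive")
    total = sum((p.citations / p.field_mean) * (1.0 / p.n_coauthors) for p in publications)
    return total / years


# Invented example: two publications over a five-year observation period.
pubs = [
    Publication(citations=10, field_mean=5.0, n_coauthors=2),  # (10 / 5) * (1 / 2) = 1.0
    Publication(citations=3, field_mean=6.0, n_coauthors=1),   # (3 / 6) * 1 = 0.5
]
print(fss_r(pubs, years=5))  # (1.0 + 0.5) / 5
```

A researcher with no publications in the period simply scores zero, since the sum is empty.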

The indicator is calculated over the period 2015–2019, with the citation count at week 13
in 2021. Performance measured at the individual level is then aggregated to obtain the
performance of a university ($FSS_U$) at the SC, area15 or overall level. In formulae


\[
FSS_U = \frac{1}{RS_U} \sum_{j=1}^{RS_U} \frac{FSS_{R_j}}{\overline{FSS}_R} \qquad (2)
\]

12 Only 13.3% of the clusters identified through CvE and with affiliation country “Italy” have an ORCID.
Therefore, despite the great potential of the ORCID in author disambiguation tasks, its use here is only
aimed at solving such conflicts and increasing the level of accuracy of the data set.
13 For researchers in the CWTS database we assume t = 5 for all.
14 We consider all publications indexed in the WoS core collection (excluding ESCI) with document types:
articles, reviews, letters, proceedings.
15 SCs are classified and grouped into areas according to a system previously published on the webpage of the
ISI Journal Citation Reports. This page is no longer available at the current Clarivate site. It should be noted
that all SCs are assigned to only one area.


where

$RS_U$ = research staff of the university unit, in the observed period;
$FSS_{R_j}$ = productivity of researcher $j$ in the unit;
$\overline{FSS}_R$ = national average productivity of all productive researchers in the same SC of
researcher $j$.
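The aggregation in Equation 2 can be sketched in the same way. The Python snippet below assumes that each staff member is represented by a pair (own productivity, national average productivity of productive researchers in the same SC); the function name and the example values are hypothetical.

```python
def fss_u(staff: list[tuple[float, float]]) -> float:
    """Equation 2: mean over the research staff of FSS_R_j divided by the national SC average."""
    if not staff:
        raise ValueError("research staff is empty")
    # Each productivity is rescaled by its field benchmark before averaging,
    # so researchers in high- and low-output fields become comparable.
    scaled = [own / national_mean for own, national_mean in staff]
    return sum(scaled) / len(staff)


# Invented example: three researchers, one of them unproductive in the period.
staff = [(0.3, 0.2), (0.0, 0.4), (0.5, 0.5)]
print(fss_u(staff))  # (1.5 + 0.0 + 1.0) / 3
```

A value above 1 indicates that, on average, the unit's staff is more productive than the national benchmark of their respective fields; unproductive staff members lower the score because they still count in the denominator.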

Prior to any aggregation of data by university unit, it is absolutely necessary that individual
performance be scaled against the expected value of the reference SC, but this requires an “SC
classification” of each researcher. For this purpose, we used the WoS classification scheme,
assigning the prevailing SC as follows:

• for the CWTS-based evaluation, with reference to the researcher’s entire scientific
  production in WoS; in uncertain cases (researcher with multiple prevailing SCs), randomly
  among those with the highest frequency;

• for the ORP-based evaluation, with reference to the researcher’s scientific production in
  2001–2019; in uncertain cases (researcher without publications or with multiple
  prevalent SCs), the one with the highest incidence relative to the SDS16-SC pair.
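The rule used for the CWTS-based evaluation (most frequent SC, with random choice among ties) can be sketched as follows. This is an illustrative Python snippet, not the authors' code: the function name is hypothetical, the input is assumed to be a flat list with one SC label per publication, and the ORP tie-breaking rule based on the SDS-SC pair is not reproduced because it requires data not described here.

```python
import random
from collections import Counter
from typing import Optional


def prevailing_sc(publication_scs: list[str], rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the most frequent SC; ties are broken at random among the top SCs."""
    if not publication_scs:
        return None  # no publications: the SC cannot be derived from bibliometric data
    counts = Counter(publication_scs)
    top_frequency = max(counts.values())
    candidates = sorted(sc for sc, n in counts.items() if n == top_frequency)
    if len(candidates) == 1:
        return candidates[0]
    rng = rng or random.Random()
    return rng.choice(candidates)


# Invented example with SC labels used only as placeholders.
print(prevailing_sc(["Information Science", "Information Science", "Management"]))
```

Passing a seeded `random.Random` instance makes the tie-breaking reproducible across runs, which may be desirable when the exercise has to be replicated.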


We note that the performance indicator is calculated in the same way for both the ORP and
CWTS data sets.17 Due to its derivation, however, the CWTS data set has two limitations in
comparison to the ORP. First, it does not contain those researchers who have never published in
the period under observation, as they cannot be identified from a bibliographic repertory,
although they must contribute to the evaluation as they represent a research cost. Second,
the years on staff for each researcher in the period under observation remain unknown, so
if a researcher has publications in CWTS only over 2017–2019, for example, it is unknown
whether the lack of production in 2015–2016 was because they were not on staff.
In contrast, this information is known in ORP.

For reasons of significance, the analysis excludes researchers in SCs belonging to “Art and
Humanities” and “Law, political and social sciences,” where the coverage of bibliographic
repertories is scarce (Hicks, 1999; Larivière, Archambault et al., 2006; Aksnes & Sivertsen,
2019). The analysis is then restricted to the researchers of SCs pertaining to the STEM areas and
Economics, also excluding subjects classified in “Multidisciplinary Sciences” and, again
for the sake of significance, SCs with fewer than 10 observations in both data sets. Table 2
shows the resulting breakdown by area for the two data sets.

Overall, the CWTS-based evaluation concerns 49,908 subjects, 56% more than the 31,989
in the MUR official database, and therefore in ORP. As the ORP data set is the benchmark, we
can reasonably consider that, in the CWTS data set, the number of false positives (researchers
assigned to a university although not officially part of the research staff) is significantly higher
than the number of false negatives (researchers not assigned to a university although part of the
research staff).
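The overrepresentation figure is a plain percentage deviation of the CWTS count from the ORP benchmark. The short sketch below recomputes it from the counts reported in Table 2; the function and variable names are illustrative choices, not part of the original study.

```python
# Number of researchers by area as reported in Table 2: (ORP, CWTS).
TABLE_2 = {
    "Biology": (4928, 7987),
    "Biomedical research": (2891, 7217),
    "Chemistry": (1606, 2735),
    "Clinical medicine": (6205, 13444),
    "Earth and space sciences": (1937, 3039),
    "Economics": (3784, 1540),
    "Engineering": (5554, 7873),
    "Mathematics": (2055, 1805),
    "Physics": (2617, 3793),
    "Psychology": (412, 475),
}


def pct_deviation(orp_n, cwts_n):
    """Percentage deviation of the CWTS count relative to the ORP benchmark."""
    return 100.0 * (cwts_n - orp_n) / orp_n


def deviations(table):
    """Percentage deviation for every area, rounded to one decimal place."""
    return {area: round(pct_deviation(o, c), 1) for area, (o, c) in table.items()}


orp_total = sum(o for o, _ in TABLE_2.values())
cwts_total = sum(c for _, c in TABLE_2.values())

for area, delta in deviations(TABLE_2).items():
    print(f"{area:26s}{delta:+7.1f}%")
print(f"{'Total':26s}{pct_deviation(orp_total, cwts_total):+7.1f}%")
```

A positive value indicates more subjects in CWTS than in ORP; a negative value (as in Economics and Mathematics) indicates the opposite.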

The data in Table 2 indicate, however, that the overrepresentation of the CWTS data set compared to ORP is not identical across areas. Biomedical research and Clinical medicine have the largest deviations (+149.6% and +116.7%, respectively). In Economics, on the contrary, we have about 2.5 times as many observations in the ORP data set as in CWTS (3784 vs. 1540). A possible reason for such differences is that they might reflect the intensity of academic mobility. Areas characterized by intense mobility are likely to show more false positives, because visiting scholars tend to sign their papers with the hosting university affiliation. However, the level of mobility in Italy is very low and rather homogeneous across fields. On the other hand, the anomaly found for Biomedical research and Clinical medicine could be due to the presence of university hospitals with a large number of nonacademic staff, including physicians and clinical specialists hired under the “national health service” framework and not affiliated to the hosting university.

16 In the Italian university system all professors are classified in one and only one field (named the scientific disciplinary sector, SDS, 370 in all).

17 This does not mean that the value is necessarily identical for the subjects in both data sets, because a) the CvE and DGA algorithms are not free from error in attributing to a given author the publications they have actually produced; and b) in ORP, the value of “t” may differ from 5.

Table 2. Number of researchers in the two data sets for analysis, by area*

| Area | No. of obs data set ORP | No. of obs data set CWTS | Delta |
|---|---|---|---|
| Biology | 4928 | 7987 | +62.1% |
| Biomedical research | 2891 | 7217 | +149.6% |
| Chemistry | 1606 | 2735 | +70.3% |
| Clinical medicine | 6205 | 13444 | +116.7% |
| Earth and space sciences | 1937 | 3039 | +56.9% |
| Economics | 3784 | 1540 | −59.3% |
| Engineering | 5554 | 7873 | +41.8% |
| Mathematics | 2055 | 1805 | −12.2% |
| Physics | 2617 | 3793 | +44.9% |
| Psychology | 412 | 475 | +15.3% |
| Total | 31989 | 49908 | +56.0% |

* The counts exclude SCs with fewer than 10 observations in both data sets.

3. RESULTS

3.1. The Distributions of Performance at the Individual Level

First, we analyze the differences between the two data sets in the distributions of performance at individual level (FSSR). We report the results of three SCs, exemplary of different types of cases. Table 3 shows the descriptive performance statistics for the authors of “Engineering, manufacturing,” “Dermatology,” and “Statistics & probability”; Figure 1 shows the box plots of the distributions for the last two SCs.

For “Engineering, manufacturing,” the number of observations in the two data sets is virtually identical (145 vs. 143), and the distributions seem almost superimposable, with practically identical mean/median and dispersion values. Moreover, the maximum values coincide, referring to the same subject, Professor Francesco Lambiase of the University of L’Aquila. Overall, such agreement occurs in no fewer than 39 of the 160 SCs. It can also be noted that in ORP, the minimum value of the distribution is equal to 0, indicating that in the evaluation of the SC using this data set, we find at least one unproductive researcher (FSS = 0), which is not the case in CWTS (where the minimum recorded is equal to 0.018). Overall, this circumstance is detectable in only two SCs besides “Engineering, manufacturing.” In contrast, there are 31 SCs where CWTS registers at least one unproductive researcher (FSS = 0), versus ORP finding none.

Table 3. Descriptive statistics of the distribution of research performance (FSSR) for researchers in three subject categories, comparing ORP and CWTS data sets

| Statistic | Engineering, manufacturing (ORP) | Engineering, manufacturing (CWTS) | Dermatology (ORP) | Dermatology (CWTS) | Statistics & probability (ORP) | Statistics & probability (CWTS) |
|---|---|---|---|---|---|---|
| Obs | 145 | 143 | 108 | 258 | 442 | 277 |
| Mean | 0.944 | 1.034 | 0.942 | 0.453 | 0.306 | 0.353 |
| Std Dev. | 1.102 | 1.119 | 1.161 | 0.818 | 0.494 | 0.455 |
| Variance | 1.214 | 1.253 | 1.348 | 0.669 | 0.244 | 0.207 |
| Skewness | 3.267 | 2.958 | 2.190 | 3.471 | 4.887 | 5.248 |
| Kurtosis | 19.350 | 16.873 | 8.791 | 18.923 | 36.813 | 50.275 |
| Percentile 1% | 0 | 0.018 | 0 | 0 | 0 | 0 |
| Percentile 5% | 0.023 | 0.056 | 0.021 | 0 | 0 | 0.006 |
| Percentile 10% | 0.056 | 0.134 | 0.053 | 0.004 | 0 | 0.028 |
| Percentile 25% | 0.234 | 0.357 | 0.151 | 0.030 | 0.049 | 0.095 |
| Percentile 50% | 0.683 | 0.656 | 0.601 | 0.139 | 0.151 | 0.212 |
| Percentile 75% | 1.317 | 1.357 | 1.203 | 0.443 | 0.387 | 0.454 |
| Percentile 90% | 2.012 | 2.342 | 2.670 | 1.372 | 0.749 | 0.856 |
| Percentile 95% | 2.733 | 3.227 | 3.041 | 2.162 | 0.996 | 1.048 |
| Percentile 99% | 5.164 | 4.422 | 5.499 | 3.455 | 2.403 | 1.666 |
| Max | 8.572 | 8.572 | 6.488 | 6.639 | 5.034 | 5.216 |

In “Dermatology,” the number of observations in the two data sets is very different, with
258 in CWTS compared to 108 in ORP. In this case the distribution of CWTS performance
seems decidedly more shifted to the left than its counterpart in ORP, with lower values both
in terms of mean (0.453 vs 0.942) and median (0.139 vs 0.601). In terms of percentiles, the
CWTS distribution also shows systematically lower values than the ORP distribution, with the
sole exception of the maximum (6.639 vs 6.488).

The situation is diametrically opposed in “Statistics & probability,” an SC where the CWTS
observations (277) are well below the ORP observations (442). In this case, the CWTS
performance distribution appears decisively shifted to the right compared to that for ORP, with
higher values of mean (0.353 vs 0.306), median (0.212 vs 0.151), and almost all percentiles
(the 99th percentile being the exception, at 1.666 vs 2.403).

Figures 2 and 3 show the comparison of the mean and median values of the distributions for
all 160 SCs, in the two data sets. At a glance, one can observe a greater number of cases in
which the “central” values (mean and median) calculated in CWTS are lower than their
counterparts recorded in ORP. Indeed, in 107 SCs out of 160 (i.e., 67%), the mean value of the FSSR
is higher in the ORP relative to the CWTS data set, and for the medians, this occurs in 114 SCs
out of 160 (71%).


Figure 1. Boxplots of research performance (FSSR) distribution for researchers in Dermatology (GA) and Statistics & probability (XY), comparing ORP and CWTS data sets.

Thus, in general, it would appear that the distribution of performance recorded in the ORP data set is shifted further to the right than in the CWTS data set. This could be explained by the different composition of the two data sets and, in particular, the overrepresentation of CWTS compared to ORP. In fact, the Pearson correlation (ρ) between the deviations in the number of observations and the deviations between the mean FSSR values for the 160 SCs is −0.515 (−0.454 when considering the medians). Basically, in the SCs where the overrepresentation of the CWTS data set compared to ORP is greater, the mean performance values in CWTS are significantly lower than those found in ORP, and vice versa. We can hypothesize that the so-called false positives have lower average performance values than the “true positives.” In other words, “nonfaculty” personnel with a bibliometrically prevalent university affiliation have a lower average FSSR than the research staff truly on faculty at the universities in the same field of observation. This has an important implication concerning the use of CWTS data sets for comparative performance assessment, in that it would evidently “penalize” those organizations (and areas within organizations) with a higher concentration of nonfaculty personnel in their research staff.

Figure 2. Distribution of the mean values of FSSR detected in the two data sets, for the 160 subject categories considered.

Figure 3. Distribution of the median FSSR values detected in the two data sets, for the 160 subject categories considered.
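The correlation described above relates, for each SC, the percentage deviation in the number of observations to the percentage deviation in mean FSSR. The sketch below shows the calculation on the three SCs reported in Table 3 only, so its result is merely illustrative and is not the −0.515 obtained on all 160 SCs; the helper names are hypothetical.

```python
from math import sqrt

# (ORP obs, CWTS obs, ORP mean FSSR, CWTS mean FSSR), taken from Table 3.
TABLE_3 = {
    "Engineering, manufacturing": (145, 143, 0.944, 1.034),
    "Dermatology": (108, 258, 0.942, 0.453),
    "Statistics & probability": (442, 277, 0.306, 0.353),
}


def pct_dev(benchmark, value):
    """Percentage deviation of value from benchmark."""
    return 100.0 * (value - benchmark) / benchmark


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


obs_dev = [pct_dev(o, c) for o, c, _, _ in TABLE_3.values()]
mean_dev = [pct_dev(mo, mc) for _, _, mo, mc in TABLE_3.values()]
r = pearson(obs_dev, mean_dev)
print(f"Pearson r on the three example SCs: {r:.3f}")
```

Even on this tiny sample the sign is negative: the SC with the largest overrepresentation in CWTS (Dermatology) is also the one with the largest drop in mean FSSR.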

3.2. Evaluation of the Universities’ Performance

We now move on to analyze the deviations between the two data sets in terms of the score and rank of the performance of the universities, through the FSSU indicator in formula [2]. Figure 4 shows the values measured at the overall level for the two data sets of the 65 universities with at least 30 observations of researchers. The dispersion of the values for the ORP data set is greater than that for the CWTS (standard deviations 0.316 vs 0.175). Figure 5 instead shows the scatterplot of the values of the two indicators and evidences a strong correlation. The detailed values are shown in Table 4: the University Vita-Salute San Raffaele and the Scuola Superiore S. Anna are at the top in both assessments. The top 11 universities between the two rankings vary by at most six positions (the case for the University of Padua and Politecnico di Milano). Thus, the evaluation by the CWTS data set returns a top part of the ranking substantially similar to that resulting in ORP. In contrast, the situation in the middle and lower part of the ranking is different, where LUISS stands out, shifting 36 positions between the two rankings, followed by the University “Campus Bio-medico,” which shifts 29. On the other hand, there are also five universities (University of Naples “Parthenope”; University of Enna; University of the Mediterranean Studies of Reggio Calabria; University of Sannio; University of Teramo) which in the CWTS-based ranking gain between 31 and 34 positions compared to ORP. The gain in performance score by the bottom-ranked universities in the CWTS-based ranking is noticeable. A possible interpretation is that the ORP performance of true positives is so low that false positives can only raise the overall CWTS-based performance.

Figure 4. Distribution of FSSU for universities in the CWTS and ORP data sets, at the overall level.

Figure 5. Scatterplot of FSSU values for universities in the CWTS and ORP data sets, at the overall level.

The magnitude and direction of these variations could be related to the differing numbers of
researchers evaluated in the two modes. In fact, the percentage deviations of FSSU between the
two data sets are significantly and negatively correlated with the percentage deviations of
numerosity (Pearson ρ = −0.526). The same is true for the ranking jumps and the percentage
deviations in numerosity between the two data sets (Pearson ρ = −0.361).
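The ranking jumps discussed here are differences between the position of a university in the ORP ranking and its position in the CWTS ranking. The sketch below illustrates the computation on five universities whose FSSU values are taken from Table 4. Because the ranks are recomputed within this subset only, the resulting shifts differ from the Δ Rank values of Table 4, which are based on all 65 universities; the sign convention (ORP rank minus CWTS rank) is inferred from that table, and the function names are hypothetical.

```python
# FSSU values from Table 4 for five universities (subset for illustration).
CWTS_FSSU = {
    "Vita-Salute San Raffaele": 1.762,
    "Scuola Superiore S.Anna": 1.306,
    "SISSA": 1.210,
    "Libera Università di Bolzano": 1.236,
    "Commerciale Luigi Bocconi": 1.299,
}
ORP_FSSU = {
    "Vita-Salute San Raffaele": 2.485,
    "Scuola Superiore S.Anna": 1.935,
    "SISSA": 1.868,
    "Libera Università di Bolzano": 1.499,
    "Commerciale Luigi Bocconi": 1.259,
}


def ranks_desc(scores):
    """Map each unit to its rank, 1 being the highest score (assumes no ties)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: pos for pos, name in enumerate(ordered, start=1)}


def rank_shifts(cwts_scores, orp_scores):
    """ORP rank minus CWTS rank: positive means a better position in CWTS."""
    cwts_rank = ranks_desc(cwts_scores)
    orp_rank = ranks_desc(orp_scores)
    return {name: orp_rank[name] - cwts_rank[name] for name in orp_scores}


for name, shift in rank_shifts(CWTS_FSSU, ORP_FSSU).items():
    print(f"{name:32s}{shift:+d}")
```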

To summarize, compared to the ORP benchmark, the value of the performance recorded in
CWTS (both absolute and relative) decreases as the concentration of “nonfaculty” personnel in
the research staff increases. Thus, the hypothesis is confirmed that, in an evaluation conducted
via CWTS, universities (and areas within universities) with a higher proportion of nonfaculty
research staff are actually penalized. This does not prevent the Vita-Salute San Raffaele University
from coming out on top in both rankings, despite the fact that in the CWTS data set there are 364
researchers associated with it compared to the 103 actual researchers recorded in ORP.

Table 4. Performance score and rank of Italian universities, comparing CWTS and ORP data sets

| University | CWTS Obs | CWTS FSSU | CWTS Rank | CWTS Perc. | ORP Obs | ORP FSSU | ORP Rank | ORP Perc. | Δ Rank |
|---|---|---|---|---|---|---|---|---|---|
| Vita-Salute San Raffaele | 364 | 1.762 | 1 | 100 | 103 | 2.485 | 1 | 100 | 0 |
| Scuola Superiore S.Anna | 172 | 1.306 | 2 | 98 | 82 | 1.935 | 2 | 98 | 0 |
| SISSA | 128 | 1.210 | 6 | 92 | 67 | 1.868 | 3 | 97 | −3 |
| Libera Università di Bolzano | 103 | 1.236 | 4 | 95 | 91 | 1.499 | 4 | 95 | 0 |
| Commerciale Luigi Bocconi | 112 | 1.299 | 3 | 97 | 191 | 1.259 | 5 | 94 | +2 |
| Politecnico di Bari | 264 | 1.163 | 9 | 88 | 212 | 1.248 | 6 | 92 | −3 |
| Trento | 487 | 1.223 | 5 | 94 | 332 | 1.228 | 7 | 91 | +2 |
| Padova | 2845 | 1.086 | 14 | 80 | 1394 | 1.193 | 8 | 89 | −6 |
| Salerno | 696 | 1.170 | 8 | 89 | 523 | 1.148 | 9 | 88 | +1 |
| Politecnico di Milano | 1339 | 1.058 | 16 | 77 | 993 | 1.139 | 10 | 86 | −6 |
| Napoli “Federico II” | 2405 | 1.117 | 12 | 83 | 1659 | 1.079 | 11 | 84 | −1 |
| Milano | 2724 | 0.998 | 23 | 66 | 1317 | 1.068 | 12 | 83 | −11 |
| Pisa | 1622 | 1.018 | 20 | 70 | 908 | 1.027 | 13 | 81 | −7 |
| Verona | 838 | 0.981 | 29 | 56 | 421 | 1.025 | 14 | 80 | −15 |
| Firenze | 1959 | 0.979 | 31 | 53 | 983 | 1.020 | 15 | 78 | −16 |
| “Campus Bio-medico” | 257 | 0.897 | 45 | 31 | 107 | 1.004 | 16 | 77 | −29 |
| Istituto Univ. di Scienze Motorie | 50 | 1.171 | 7 | 91 | 42 | 0.985 | 17 | 75 | +10 |
| Perugia | 968 | 1.002 | 22 | 67 | 654 | 0.983 | 18 | 73 | −4 |
| Catania | 976 | 1.091 | 13 | 81 | 708 | 0.968 | 19 | 72 | +6 |
| Torino | 2166 | 0.904 | 44 | 33 | 1121 | 0.966 | 20 | 70 | −24 |
| Politecnico di Torino | 988 | 0.984 | 27 | 59 | 658 | 0.966 | 21 | 69 | −6 |
| Ferrara | 700 | 0.931 | 38 | 42 | 389 | 0.958 | 22 | 67 | −16 |
| Magna Grecia di Catanzaro | 290 | 1.026 | 18 | 73 | 162 | 0.956 | 23 | 66 | +5 |
| Pavia | 957 | 0.973 | 32 | 52 | 536 | 0.948 | 24 | 64 | −8 |
| Bologna | 2889 | 0.982 | 28 | 58 | 1640 | 0.944 | 25 | 63 | −3 |
| Politecnica delle Marche | 707 | 0.924 | 39 | 41 | 436 | 0.936 | 26 | 61 | −13 |
| Tuscia | 254 | 1.003 | 21 | 69 | 177 | 0.932 | 27 | 59 | +6 |
| Bergamo | 132 | 0.895 | 46 | 30 | 145 | 0.926 | 28 | 58 | −18 |
| LUISS | 32 | 0.711 | 65 | 0 | 51 | 0.924 | 29 | 56 | −36 |
| Calabria | 627 | 1.019 | 19 | 72 | 473 | 0.918 | 30 | 55 | +11 |
| Cattolica del Sacro Cuore | 906 | 0.881 | 48 | 27 | 704 | 0.900 | 31 | 53 | −17 |
| Milano Bicocca | 1020 | 0.980 | 30 | 55 | 567 | 0.897 | 32 | 52 | +2 |
| Urbino “Carlo Bo” | 211 | 0.813 | 54 | 17 | 159 | 0.893 | 33 | 50 | −21 |
| del Salento | 382 | 0.876 | 49 | 25 | 298 | 0.889 | 34 | 48 | −15 |
| Insubria | 360 | 0.923 | 40 | 39 | 241 | 0.872 | 35 | 47 | −5 |
| Messina | 810 | 1.052 | 17 | 75 | 642 | 0.859 | 36 | 45 | +19 |
| Roma Tre | 397 | 0.990 | 25 | 63 | 349 | 0.848 | 37 | 44 | +12 |
| Foggia | 275 | 0.965 | 33 | 50 | 211 | 0.847 | 38 | 42 | +5 |
| Genova | 1257 | 0.839 | 51 | 22 | 745 | 0.828 | 39 | 41 | −12 |
| dell’Aquila | 495 | 0.952 | 35 | 47 | 383 | 0.827 | 40 | 39 | +5 |
| Napoli “Parthenope” | 213 | 1.160 | 10 | 86 | 223 | 0.820 | 41 | 38 | +31 |
| Roma “La Sapienza” | 3414 | 0.852 | 50 | 23 | 2048 | 0.813 | 42 | 36 | −8 |
| Brescia | 702 | 0.827 | 52 | 20 | 410 | 0.802 | 43 | 34 | −9 |
| Roma “Tor Vergata” | 1158 | 0.934 | 37 | 44 | 849 | 0.785 | 44 | 33 | +7 |
| Enna | 38 | 1.141 | 11 | 84 | 56 | 0.776 | 45 | 31 | +34 |
| Cagliari | 798 | 0.750 | 61 | 6 | 560 | 0.774 | 46 | 30 | −15 |
| Mediterranea di Reggio Calabria | 181 | 1.059 | 15 | 78 | 159 | 0.765 | 47 | 28 | +32 |
| Gabriele D’Annunzio | 587 | 0.918 | 41 | 38 | 394 | 0.762 | 48 | 27 | +7 |
| Palermo | 1181 | 0.912 | 42 | 36 | 885 | 0.757 | 49 | 25 | +7 |
| Bari | 1193 | 0.888 | 47 | 28 | 852 | 0.755 | 50 | 23 | +3 |
| Parma | 895 | 0.780 | 59 | 9 | 610 | 0.749 | 51 | 22 | −8 |
| Trieste | 536 | 0.818 | 53 | 19 | 378 | 0.746 | 52 | 20 | −1 |
| Camerino | 288 | 0.765 | 60 | 8 | 200 | 0.741 | 53 | 19 | −7 |
| Modena e Reggio Emilia | 888 | 0.783 | 58 | 11 | 537 | 0.726 | 54 | 17 | −4 |
| Siena | 752 | 0.786 | 57 | 13 | 392 | 0.722 | 55 | 16 | −2 |
| Ca’ Foscari Venezia | 198 | 0.738 | 63 | 3 | 184 | 0.721 | 56 | 14 | −7 |
| Sannio | 154 | 0.996 | 24 | 64 | 142 | 0.719 | 57 | 13 | +33 |
| Teramo | 115 | 0.989 | 26 | 61 | 104 | 0.674 | 58 | 11 | +32 |
| Udine | 526 | 0.789 | 56 | 14 | 402 | 0.668 | 59 | 9 | +3 |
| Piemonte Orientale A. Avogadro | 178 | 0.718 | 64 | 2 | 234 | 0.666 | 60 | 8 | −4 |
| Sassari | 474 | 0.790 | 55 | 16 | 342 | 0.643 | 61 | 6 | +6 |
| Molise | 166 | 0.905 | 43 | 34 | 133 | 0.632 | 62 | 5 | +19 |
| Seconda Napoli | 574 | 0.958 | 34 | 48 | 563 | 0.623 | 63 | 3 | +29 |
| Cassino | 146 | 0.946 | 36 | 45 | 152 | 0.608 | 64 | 2 | +28 |
| Basilicata | 257 | 0.743 | 62 | 5 | 222 | 0.585 | 65 | 0 | +3 |

Depending on the particular application one has in mind (e.g., for a policymaker, or a program administrator), the sensitivity of any measuring instrument will be more or less critical. In the case of the data in Table 4, a variation at the second decimal place in the values of FSSU results in some major jumps in ranking. This suggests that a less precise ranking would be reasonable, for example by performance classes such as quartiles, which are in fact used in some national research evaluation exercises. In Table 5 we report the ranking of the universities’ performance in the two data sets, by quartiles. The data show that 36 out of 65 universities are ranked in the same quartile in the two modes (on the main diagonal). The remaining 29 show a jump of at least one quartile. Of these:

• 13 have a better ranking based on CWTS than ORP (above the main diagonal);
• 16 have a worse ranking based on CWTS than ORP (below the main diagonal).

The nine universities shown in Table 6 experience a two-quartile jump between the two rankings. No university experiences a three-quartile jump, that is, top to bottom or vice versa.
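The quartile comparison can be sketched as follows. The mapping from rank to quartile is an assumption: it uses the percentile 100 × (N − rank) / (N − 1), which reproduces the “Perc.” column of Table 4, with cut-offs at 75, 50, and 25. The example uses the CWTS and ORP ranks that Table 4 reports for the nine universities of Table 6; the function names are hypothetical.

```python
N_UNIVERSITIES = 65

# (CWTS rank, ORP rank) from Table 4 for the nine universities of Table 6.
TWO_QUARTILE_CASES = {
    "Messina": (17, 36),
    "Napoli “Parthenope”": (10, 41),
    "Enna": (11, 45),
    "Mediterranea di Reggio Calabria": (15, 47),
    "Sannio": (24, 57),
    "Teramo": (26, 58),
    "“Campus Bio-medico”": (45, 16),
    "LUISS": (65, 29),
    "Urbino “Carlo Bo”": (54, 33),
}


def quartile(rank, n):
    """Quartile (1 = top) of a rank among n units.

    Uses the percentile 100 * (n - rank) / (n - 1); the comparisons are done
    on integers to avoid floating-point issues at the cut-offs.
    """
    if n < 2 or not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and n, with n >= 2")
    num, den = 4 * (n - rank), n - 1
    if num >= 3 * den:  # percentile >= 75
        return 1
    if num >= 2 * den:  # percentile >= 50
        return 2
    if num >= den:      # percentile >= 25
        return 3
    return 4


def crosstab(rank_pairs, n):
    """4x4 matrix of counts: rows are CWTS quartiles, columns ORP quartiles."""
    table = [[0] * 4 for _ in range(4)]
    for cwts_rank, orp_rank in rank_pairs:
        table[quartile(cwts_rank, n) - 1][quartile(orp_rank, n) - 1] += 1
    return table


for row in crosstab(TWO_QUARTILE_CASES.values(), N_UNIVERSITIES):
    print(row)
```

Cells above the main diagonal count universities placed in a better quartile by CWTS than by ORP; cells below it count the opposite case.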

Table 5. Performance quartiles of the universities evaluated using CWTS and ORP data sets

| CWTS quartile | ORP I | ORP II | ORP III | ORP IV |
|---|---|---|---|---|
| I | 12 | 1 | 4 | 0 |
| II | 4 | 8 | 2 | 2 |
| III | 1 | 5 | 6 | 4 |
| IV | 0 | 2 | 4 | 10 |

Table 6. Universities showing two-quartile ranking jumps between the two data sets

| University | Quartile CWTS | Quartile ORP |
|---|---|---|
| Messina | 1 | 3 |
| Napoli “Parthenope” | 1 | 3 |
| Enna | 1 | 3 |
| Mediterranea di Reggio Calabria | 1 | 3 |
| Sannio | 2 | 4 |
| Teramo | 2 | 4 |
| “Campus Bio-medico” | 3 | 1 |
| LUISS | 4 | 2 |
| Urbino “Carlo Bo” | 4 | 2 |

What was just seen in Section 3.2 at the overall level is repeated at the area level. Figure 6 shows the scatterplot of FSSU calculated in the two modes, at this level of aggregation. Table 7 presents the correlation between score and ranking of the 65 universities evaluated by area. At the score level, the Pearson coefficient recorded in the two assessments at the overall level is 0.813. The figure for the individual areas is never less than 0.5 with the sole exception of Psychology, which is an area with a very low number of assessable universities (only nine). The correlation coefficients between the ranks are slightly lower, that is, 0.686 at the overall level. This confirms the exception of Psychology, an area in which the two assessment modes lead to results that are completely noncorrelated. Chemistry is confirmed as the area with the most closely aligned ranks between the two modes (Spearman 0.824), followed by Earth and Space Sciences (0.747) and Economics (0.712).

Figure 6. Distributions of performance evaluated in the two modes, by area.

Table 7. Correlation between score and ranking of the 65 Italian universities, evaluated in the two modes, by area

| Discipline | No. of obs (evaluated universities) | Pearson ρ | Spearman ρ |
|---|---|---|---|
| Biology | 54 | 0.602 | 0.576 |
| Biomedical Research | 42 | 0.789 | 0.537 |
| Chemistry | 39 | 0.851 | 0.824 |
| Clinical Medicine | 47 | 0.681 | 0.580 |
| Earth and Space Sciences | 50 | 0.775 | 0.747 |
| Economics | 48 | 0.808 | 0.712 |
| Engineering | 52 | 0.657 | 0.673 |
| Mathematics | 48 | 0.504 | 0.607 |
| Physics | 44 | 0.760 | 0.702 |
| Psychology | 9 | 0.072 | 0.233 |
| Overall | 65 | 0.813 | 0.686 |
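The rank correlations reported here are Spearman coefficients, that is, Pearson correlations computed on ranks. The sketch below shows the calculation, with average ranks for ties, on the FSSU values of the first seven universities of Table 4. Since it uses a small subset, the result is illustrative only and does not reproduce the 0.686 obtained on all 65 universities; the function names are hypothetical.

```python
from math import sqrt

# FSSU of the first seven universities listed in Table 4 (same order in both lists).
CWTS = [1.762, 1.306, 1.210, 1.236, 1.299, 1.163, 1.223]
ORP = [2.485, 1.935, 1.868, 1.499, 1.259, 1.248, 1.228]


def average_ranks(values):
    """Ascending ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


print(f"Pearson:  {pearson(CWTS, ORP):.3f}")
print(f"Spearman: {spearman(CWTS, ORP):.3f}")
```

Comparing the two coefficients on the same data shows why they can differ: Pearson is driven by the size of the score gaps, Spearman only by the ordering.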

4. DISCUSSION AND CONCLUSIONS

Evaluative bibliometricians have been engaged for years in the continuous improvement of indicators and methods for evaluating the scientific activities of individuals, organizations, and national and territorial research systems. The major obstacle hindering a leap ahead in evaluation techniques is the lack of input data. Yet a basic principle is that the proper evaluation of the performance of any subject at any level, including research performance, requires input data in addition to output data. In particular, assessing the performance of universities requires knowledge of the individual researchers working within them (Abramo & D’Angelo, 2016a, 2016b). In large-scale research assessments, the evaluators have at least three different options to acquire input (and output) data:

• They can ask the institutions being evaluated to be directly involved in declaring and submitting their research staff (as well as research products);

• They may draw up a list of unique identifiers for institutions and/or authors (and then use these to query a bibliometric database for their research products);

• They can extract publications from a bibliometric repertory and then disambiguate the true identity of the relevant authors and their institutions.


These approaches present significant trade-offs: the first one can guarantee a high level of
precision and recall but is particularly “costly” because of the opportunity cost of the surveyed
subjects for collecting and selecting inputs and outputs for the evaluation.

The introduction of unique identifiers for researchers and organizations (ROR, ORCID, etc.)
is important and necessary for improving the quality of research information systems (Enserink,
2009), but at the moment the coverage is limited and not uniform in terms of country and/or
field (Youtie, Carley et al., 2017).

The third option implies setting up a large-scale bibliometric database in “desk mode” and
offers rapid and economical implementation. However, the task is challenging because of
homonyms in author names and variations in the way authors indicate their name and
affiliation. Such challenges have driven the development and continuous improvement
of disambiguation methods.

In this work, we have focused on the issue of the reliability of author disambiguation algorithms
in identifying the true publications of each observed subject in combination with their ability to
identify the research staff of the home institutions, including their placement in disciplinary fields.

In particular, we evaluated the goodness-of-fit of the CvE unsupervised author-name
disambiguation algorithm in measuring the performance scores and ranks of Italian universities,
operating through direct processing of bibliometric data for deduction of the research staff
(and thus the input data) of each university. The validation was carried out through
comparison with the DGA algorithm, based on the a priori knowledge of the research staff officially
in post in each national university.

The results of the comparison showed that the application of the unsupervised approach
leads to an overestimation of the research staff of an organization. Overall, for the field of
observation adopted in this study, this meant 56% more subjects in the CWTS data set than in the
ORP data set, which draws on guaranteed data from the MUR. One of the reasons for this would
be that the CvE approach, which underlies the CWTS data set, attributes researchers to an
organization whenever they have predominantly indicated that affiliation in signing their
scientific publications, regardless of their actual position within the organization. In this
way, doctoral students, postdoctoral fellows, visiting scholars, collaborators,
and a range of other individuals who would not be eligible for evaluation in an official national
evaluation exercise also end up on a university’s list of researchers.18

It should also be considered that the CvE algorithm19 tends to favour precision over recall; in
particular, the publication oeuvre of an author can be split over multiple “clusters” if not enough
proof is found for joining publications together. This means that the actual
overrepresentation of an organization’s research staff in the CWTS data set is lower than the figure
measured by direct comparison with the ORP data. At an overall level, therefore, the +56%
represents an upper bound of the actual incidence of nonfaculty personnel in the CWTS data set.
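The upper-bound argument can be made concrete with a small calculation. The totals (49,908 CWTS clusters and 31,989 ORP researchers) come from Table 2, whereas the average number of clusters per real person used below (1.05) is purely hypothetical, since the actual splitting rate is not reported; the function names are illustrative as well.

```python
CWTS_CLUSTERS = 49908  # author clusters in the CWTS data set (Table 2)
ORP_STAFF = 31989      # researchers in the ORP benchmark (Table 2)


def overrepresentation(n_subjects, n_staff):
    """Percentage excess of the counted subjects over the benchmark staff."""
    return 100.0 * (n_subjects - n_staff) / n_staff


def persons_behind_clusters(n_clusters, clusters_per_person):
    """Estimated number of distinct persons when oeuvres are split.

    clusters_per_person: average number of clusters generated by one real
    author; it is 1.0 when no oeuvre is split and larger otherwise.
    """
    if clusters_per_person < 1.0:
        raise ValueError("clusters_per_person cannot be below 1")
    return n_clusters / clusters_per_person


# Without splitting, every cluster is a distinct person: the measured +56%.
measured = overrepresentation(CWTS_CLUSTERS, ORP_STAFF)

# With a hypothetical 5% excess of clusters over persons, the excess shrinks.
adjusted = overrepresentation(
    persons_behind_clusters(CWTS_CLUSTERS, 1.05), ORP_STAFF
)

print(f"measured: {measured:+.1f}%  adjusted (hypothetical): {adjusted:+.1f}%")
```

Whatever the true splitting rate, the adjusted figure cannot exceed the measured one, which is why the measured value acts as an upper bound.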

Having said this, the scores and ranks recorded in the two compared modes show a significant and rather high correlation: At the overall level, Pearson and Spearman coefficients are 0.813 and 0.686, respectively. At the area level, the values are never below 0.5, with peaks in Chemistry (0.851; 0.824). The only critical area is Psychology; however, this area is present in only a small number of assessed universities (nine). Still, the overall correlation covers significant jumps at local level: comparing the CWTS-based assessment to the benchmark, in the ranking of all 69 assessed universities, nine show deviations of two quartiles, better or worse.

18 Although including doctoral and postdoctoral students or postdoctoral fellows may be problematic from the viewpoint of the Italian research assessment system, it may not necessarily be so in other research assessment systems.

19 For convenience, we refer here to the CvE algorithm, but our conclusions refer to any unsupervised approaches to author disambiguation.

Among the drivers of these deviations, certainly the greatest weight goes to the different
number of observations between the two data sets. Empirically, it emerges that percentage
deviations in FSS for universities are significantly and negatively correlated with percentage
deviations in numerosity between the two data sets.

If we assume that the overrepresentation of the CWTS data set with respect to ORP
depends largely on the CWTS inclusion of “nonfaculty” personnel, we can deduce that, on
average, these personnel perform less well. More importantly, it can also be concluded that
an evaluation conducted by means of the CvE approach, although there could be exceptions,
would generally penalize universities (and areas) with a higher proportion of nonfaculty
research-active personnel. A comparison of the research performance of “nonfaculty”
personnel vs. “faculty” personnel could be the object of future research.

Obviously this effect, which is certainly significant, must be discounted against the intrinsic
limits of CvE. Notably, as mentioned above, such an algorithm can in some cases attribute a
researcher’s scientific production to two (or more) distinct “clusters,” especially in the
presence of a scientific production characterized by heterogeneous and highly differentiated
bibliometric metadata. The impact of such “splitting” on the outcomes of the comparative
evaluation of universities should, however, be very limited, as the splitting cases should be
evenly distributed and not concentrated on researchers from one organization over those from others.
And, it is worth remembering, the literature in any case indicates CvE as the best performing of
such unsupervised algorithms (Tekles & Bornmann, 2020).

While waiting for policymakers to take action towards national and international systems
for collecting input data, which would enable bibliometricians to carry out what the same
policymakers are ever more insistently demanding, practitioners may consider using the
CWTS data as in the current paper. In particular, the methodology described here makes it
possible for others to replicate the comparative analyses in the frameworks of their interest
(national or international), simply by processing the output of the CvE algorithm in an
appropriate manner, in particular by considering the relevant institutions’ official URL
domains. A notable side benefit is that practitioners now have a precise measure
of the extent of distortions inherent in any evaluation exercise using unsupervised algorithms.
And, for policymakers, knowing the extent of performance measure distortions proves useful
in deciding whether to invest in the development of databases of national research personnel, or
to settle for less precise assessments.
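
One way the domain-based processing mentioned above might look is sketched below in Python. It assumes that each author cluster carries an e-mail address and that each institution is identified by a single official domain; the function names, the example address, and the domain table are hypothetical, and the actual matching rules used in the study may differ.

```python
def email_domain(email):
    """Lower-cased domain part of an e-mail address."""
    return email.rsplit("@", 1)[-1].lower()


def assign_institution(email, institution_domains):
    """Return the institution whose official domain matches the address.

    A match is either the official domain itself or any of its subdomains.
    Returns None when no institution matches.
    """
    domain = email_domain(email)
    for institution, official in institution_domains.items():
        if domain == official or domain.endswith("." + official):
            return institution
    return None


# Hypothetical lookup table: institution -> official domain.
institution_domains = {
    "University of Rome Tor Vergata": "uniroma2.it",
}

# Hypothetical cluster e-mail; subdomains of the official domain also match.
print(assign_institution("m.rossi@ing.uniroma2.it", institution_domains))
```

Matching on the full domain suffix, rather than on a substring, avoids assigning a cluster to an institution whose domain merely appears inside a longer, unrelated domain name.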

ACKNOWLEDGMENTS

We are indebted to the Centre for Science and Technology Studies (CWTS) at Leiden
University for providing us with access to the in-house WoS database from which we extracted
the data underlying our analyses.

AUTHOR CONTRIBUTIONS

Giovanni Abramo: Conceptualization; Investigation; Methodology; Supervision; Writing –
original draft; Writing – review & editing. Ciriaco Andrea D’Angelo: Conceptualization; Data
curation; Investigation; Methodology; Visualization; Writing – original draft; Writing – review
& editing.


COMPETING INTERESTS

The authors have no known competing financial interests or personal relationships that could
have appeared to influence the work reported in this paper.

FUNDING INFORMATION

The research project received no funding from third parties.

DATA AVAILABILITY

The bibliometric data set used in this study has been extracted from the CWTS in-house WoS
database, made available under license by Clarivate Analytics. The authors are not allowed to
redistribute WoS data.

REFERENCES

Abramo, G., & D’Angelo, C. A. (2011). National-scale research
performance assessment at the individual level. Scientometrics,
86(2), 347–364. https://doi.org/10.1007/s11192-010-0297-2

Abramo, G., D’Angelo, C. A., & Di Costa, F. (2011). A national-scale
cross-time analysis of university research performance.
Scientometrics, 87(2), 399–413.
https://doi.org/10.1007/s11192-010-0319-0

Abramo, G., & D’Angelo, C. A. (2016a). A farewell to the MNCS and
like size-independent indicators. Journal of Informetrics, 10(2),
646–651. https://doi.org/10.1016/j.joi.2016.04.006

Abramo, G., & D’Angelo, C. A. (2016b). A farewell to the MNCS
and like size-independent indicators: Rejoinder. Journal of
Informetrics, 10(2), 679–683.
https://doi.org/10.1016/j.joi.2016.01.011

Abramo, G., & D’Angelo, C. A. (2016c). A comparison of university
performance scores and ranks by MNCS and FSS. Journal of
Informetrics, 10(4), 889–901.
https://doi.org/10.1016/j.joi.2016.07.004

Abramo, G., Aksnes, D. W., & D’Angelo, C. A. (2020). Comparison
of research productivity of Italian and Norwegian professors and
universities. Journal of Informetrics, 14(2), 101023.
https://doi.org/10.1016/j.joi.2020.101023

Abramo, G., D’Angelo, C. A., & Di Costa, F. (2022). The effect of
academic mobility on research performance: The case of Italy.
Quantitative Science Studies, 3(2), 345–362.
https://doi.org/10.1162/qss_a_00192

Abramo, G., D’Angelo, C. A., & Reale, E. (2019). Peer review vs
bibliometrics: Which method better predicts the scholarly impact
of publications? Scientometrics, 121(1), 537–554.
https://doi.org/10.1007/s11192-019-03184-y

Abramo, G., D’Angelo, C. A., & Rosati, F. (2013). Measuring
institutional research productivity for the life sciences: The
importance of accounting for the order of authors in the byline.
Scientometrics, 97(3), 779–795.
https://doi.org/10.1007/s11192-013-1013-9

Aksnes, D. W., Schneider, J. W., & Gunnarsson, M. (2012). Ranking
national research systems by citation indicators. A comparative
analysis using whole and fractionalised counting methods.
Journal of Informetrics, 6(1), 36–43.
https://doi.org/10.1016/j.joi.2011.08.002

Aksnes, D. W., & Sivertsen, G. (2019). A criteria-based assessment
of the coverage of Scopus and Web of Science. Journal of Data
and Information Science, 4(1), 1–21.
https://doi.org/10.2478/jdis-2019-0001

Backes, T. (2018a). Effective unsupervised author disambiguation
with relative frequencies. In J. Chen, M. A. Gonçalves, & J. M.
Allen (Eds.), Proceedings of the 18th ACM/IEEE on Joint Conference
on Digital Libraries (pp. 203–212). Fort Worth, TX: Association
for Computing Machinery. https://doi.org/10.1145/3197026.3197036

Backes, T. (2018b). The impact of name-matching and blocking
on author disambiguation. In Proceedings of the 27th ACM
International Conference on Information and Knowledge Management
(pp. 803–812). Turin, Italy: Association for Computing
Machinery. https://doi.org/10.1145/3269206.3271699

Butler, D. (2010). University rankings smarten up. Nature,
464(7285), 16–17. https://doi.org/10.1038/464016a, PubMed:
20203575

Caron, E., & van Eck, N. J. (2014). Large scale author name
disambiguation using rule-based scoring and clustering. In E. Noyons
(Ed.), Proceedings of the Science and Technology Indicators
Conference 2014 Leiden (pp. 79–86). Leiden: Universiteit
Leiden – CWTS.

Cota, R. G., Gonçalves, M. A., & Laender, A. H. F. (2007). A
heuristic-based hierarchical clustering method for author name
disambiguation in digital libraries. Paper presented at the XXII
Simpósio Brasileiro de Banco de Dados, João Pessoa.

D’Angelo, C. A., & Abramo, G. (2015). Publication rates in 192
research fields. In A. Salah, Y. Tonta, A. A. A. Salah, & C. Sugimoto
(Eds.), Proceedings of the 15th International Society of Scientometrics
and Informetrics Conference (ISSI 2015) (pp. 909–919). Istanbul:
Bogazici University Printhouse.

D’Angelo, C. A., & van Eck, N. J. (2020). Collecting large-scale
publication data at the level of individual researchers: A practical
proposal for author name disambiguation. Scientometrics,
123(2), 883–907. https://doi.org/10.1007/s11192-020-03410-y

D’Angelo, C. A., Giuffrida, C., & Abramo, G. (2011). A heuristic
approach to author name disambiguation in bibliometrics
databases for large-scale research assessments. Journal of the
American Society for Information Science and Technology,
62(2), 257–269. https://doi.org/10.1002/asi.21460

Dehon, C., McCathie, A., & Verardi, V. (2010). Uncovering
excellence in academic rankings: A closer look at the Shanghai
ranking. Scientometrics, 83(2), 515–524.
https://doi.org/10.1007/s11192-009-0076-0

Enserink, M. (2009). Are you ready to become a number? Science,
323(5922), 1662–1664.
https://doi.org/10.1126/science.323.5922.1662, PubMed: 19325094


Gauffriau, M., & Larsen, P. O. (2005). Counting methods are
decisive for rankings based on publication and citation studies.
Scientometrics, 64(1), 85–93.
https://doi.org/10.1007/s11192-005-0239-6

Gläser, J., & Laudel, G. (2016). Governing science: How science
policy shapes research content. Archives Europeennes De
Sociologie, 57(1), 117–168.
https://doi.org/10.1017/S0003975616000047

Hicks, D. (1999). The difficulty of achieving full coverage of
international social science literature and the bibliometric
consequences. Scientometrics, 44(2), 193–215.
https://doi.org/10.1007/BF02457380

Hjørland, B. (2010). The foundation of the concept of relevance.
Journal of the American Society for Information Science and
Technology, 61(2), 217–237. https://doi.org/10.1002/asi.21261

Huang, M. H., Lin, C. S., & Chen, D. Z. (2011). Counting methods,
country rank changes, and counting inflation in the assessment
of national research productivity and impact. Journal of the
American Society for Information Science and Technology,
62(12), 2427–2436. https://doi.org/10.1002/asi.21625

Hussain, I., & Asghar, S. (2018). DISC: Disambiguating homonyms
using graph structural clustering. Journal of Information Science,
44(6), 830–847. https://doi.org/10.1177/0165551518761011

Iglesias, J. E., & Pecharromán, C. (2007). Scaling the h-index for
different scientific ISI fields. Scientometrics, 73(3), 303–320.
https://doi.org/10.1007/s11192-007-1805-x

Karlsson, S. (2017). Evaluation as a travelling idea: Assessing
the consequences of research assessment exercises. Research
Evaluation, 26(2), 55–65. https://doi.org/10.1093/reseval/rvx001

Larivière, V., Archambault, É., Gingras, Y., & Vignola-Gagné, É.
(2006). The place of serials in referencing practices: Comparing
natural sciences and engineering with social sciences and
humanities. Journal of the American Society for Information
Science and Technology, 57(8), 997–1004.
https://doi.org/10.1002/asi.20349

Liu, W., Doğan, R. I., Kim, S., Comeau, D. C., Kim, W., Yeganova,
L., … Wilbur, W. J. (2014). Author name disambiguation for
PubMed. Journal of the Association for Information Science and
Technology, 65(4), 765–781. https://doi.org/10.1002/asi.23063,
PubMed: 28758138

Lillquist, E., & Green, S. (2010). The discipline dependence of
citation statistics. Scientometrics, 84(3), 749–762.
https://doi.org/10.1007/s11192-010-0162-3

Moed, H. F. (2010). CWTS crown indicator measures citation
impact of a research group’s publication oeuvre. Journal of
Informetrics, 4(3), 436–438.
https://doi.org/10.1016/j.joi.2010.03.009

Piro, F. N., Aksnes, D. W., & Rørstad, K. (2013). A macro analysis of
productivity differences across fields: Challenges in the
measurement of scientific publishing. Journal of the American Society
for Information Science and Technology, 64(2), 307–320.
https://doi.org/10.1002/asi.22746

Rinia, E. J., De Lange, C., & Moed, H. F. (1993). Measuring national
output in physics: Delimitation problems. Scientometrics, 28(1),
89–110. https://doi.org/10.1007/BF02016287

Rose, M. E., & Kitchin, J. R. (2019). pybliometrics: Scriptable
bibliometrics using a Python interface to Scopus. SoftwareX, 10,
100263. https://doi.org/10.1016/j.softx.2019.100263

Sandström, U., & Sandström, E. (2009). Meeting the micro-level
challenges: Bibliometrics at the individual level. In 12th
International Conference on Scientometrics and Informetrics
(pp. 845–856). Rio de Janeiro, Brazil.

Schulz, C., Mazloumian, A., Petersen, A. M., Penner, O., &
Helbing, D. (2014). Exploiting citation networks for large-scale
author name disambiguation. EPJ Data Science, 3, 11.
https://doi.org/10.1140/epjds/s13688-014-0011-3

Sorzano, C. O. S., Vargas, J., Caffarena-Fernández, G., & Iriarte, A.
(2014). Comparing scientific performance among equals.
Scientometrics, 101(3), 1731–1745.
https://doi.org/10.1007/s11192-014-1368-6

Tekles, A., & Bornmann, L. (2020). Author name disambiguation of
bibliometric data: A comparison of several unsupervised
approaches. Quantitative Science Studies, 1(4), 1510–1528.
https://doi.org/10.1162/qss_a_00081

Turner, D. (2005). Benchmarking in universities: League tables
revisited. Oxford Review of Education, 31(3), 353–371.
https://doi.org/10.1080/03054980500221975

van Hooydonk, G. (1997). Fractional counting of multi-authored
publications: Consequences for the impact of authors. Journal
of the American Society for Information Science, 48(10),
944–945. https://doi.org/10.1002/(SICI)1097-4571(199710)48:10<944::AID-ASI8>3.0.CO;2-1

van Raan, A. F. J. (2005). Fatal attraction: Conceptual and
methodological problems in the ranking of universities by bibliometric
methods. Scientometrics, 62(1), 133–143.
https://doi.org/10.1007/s11192-005-0008-6

Waltman, L., & van Eck, N. J. (2015). Field-normalized citation
impact indicators and the choice of an appropriate counting
method. Journal of Informetrics, 9(4), 872–894.
https://doi.org/10.1016/j.joi.2015.08.001

Waltman, L., van Eck, N. J., van Leeuwen, T. N., Visser, M. S., &
van Raan, A. F. J. (2011). Towards a new crown indicator: Some
theoretical considerations. Journal of Informetrics, 5(1), 37–47.
https://doi.org/10.1016/j.joi.2010.08.001

Wu, H., Li, B., Pei, Y., & He, J. (2014). Unsupervised author
disambiguation using Dempster–Shafer theory. Scientometrics, 101(3),
1955–1972. https://doi.org/10.1007/s11192-014-1283-x

Wu, J., & Ding, X.-H. (2013). Author name disambiguation in
scientific collaboration and mobility cases. Scientometrics, 96(3),
683–697. https://doi.org/10.1007/s11192-013-0978-8

Youtie, J., Carley, S., Porter, A. L., & Shapira, P. (2017). Tracking
researchers and their outputs: New insights from ORCIDs.
Scientometrics, 113(1), 437–453.
https://doi.org/10.1007/s11192-017-2473-0

Zacharewicz, T., Lepori, B., Reale, E., & Jonkers, K. (2019).
Performance-based research funding in EU member states – A comparative
assessment. Science and Public Policy, 46(1), 105–115.
https://doi.org/10.1093/scipol/scy041

Zhu, J., Wu, X., Lin, X., Huang, C., Fung, G. P. C., & Tang, Y. (2017).
A novel multiple layers name disambiguation framework for
digital libraries using dynamic clustering. Scientometrics, 114(3),
781–794. https://doi.org/10.1007/s11192-017-2611-8

Zitt, M., Ramanana-Rahary, S., & Bassecoulard, E. (2005). Relativity
of citation performance and excellence measures: From cross-field
to cross-scale effects of field-normalisation. Scientometrics, 63(2),
373–401. https://doi.org/10.1007/s11192-005-0218-y
