RESEARCH ARTICLE
The prevalence and impact of university affiliation
discrepancies between four bibliographic
databases—Scopus, Web of Science,
Dimensions, and Microsoft Academic
Philip J. Purnell1,2
1Centre for Science and Technology Studies, Leiden University, Leiden, The Netherlands
2United Arab Emirates University, Al Ain, UAE
Citation: Purnell, P. J. (2022). The prevalence and impact of university affiliation discrepancies between four bibliographic databases—Scopus, Web of Science, Dimensions, and Microsoft Academic. Quantitative Science Studies, 3(1), 99–121. https://doi.org/10.1162/qss_a_00175
DOI:
https://doi.org/10.1162/qss_a_00175
Peer Review:
https://publons.com/publon/10.1162
/qss_a_00175
Supporting Information:
https://doi.org/10.1162/qss_a_00175
Received: 20 June 2021
Accepted: 17 November 2021
Corresponding Author:
Philip J. Purnell
p.j.purnell@cwts.leidenuniv.nl
Handling Editor:
Vincent Larivière
Copyright: © 2022 Philip J. Purnell. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
The MIT Press
Keywords: affiliation, benchmarking, bibliometric database, disambiguation, unification, university
ABSTRACT
Research managers benchmarking universities against international peers face the problem of
affiliation disambiguation. Different databases have taken separate approaches to this problem
and discrepancies exist between them. Bibliometric data sources typically conduct a
disambiguation process that unifies an institution's variant names and those of its subunits so that
researchers can then search all records from that institution using a single unified name.
This study examined affiliation discrepancies between Scopus, Web of Science (WoS),
Dimensions, and Microsoft Academic for 18 Arab universities over a 5-year period. We
confirmed that digital object identifiers (DOIs) are suitable for extracting comparable scholarly
material across databases and quantified the affiliation discrepancies between them. A
substantial share of records assigned to the selected universities in any one database were not
assigned to the same university in another. The share of discrepancy was higher in the larger
databases (Dimensions and Microsoft Academic). The smaller, more selective databases
(Scopus and especially WoS) tended to agree to a greater degree with affiliations in the other
databases. Manual examination of affiliation discrepancies showed that they were caused by
a mixture of missing affiliations, unification differences, and assignation of records to the
wrong institution.
1. INTRODUCTION
1.1. The Problem of Affiliations
The research community understands to varying degrees the importance of getting its univer-
sity affiliation names right. Individual researchers are now routinely assessed at least in part on
their ability to produce published articles, and their institutions are at least partially ranked on
those very same papers. The single factor tying the paper to its author’s employer is the affil-
iation name given by the author when they submit the manuscript to a journal. There are many
ways of acknowledging a university and one can easily confuse a rudimentary database by
simply swapping My City University with University of My City. Other common variations
involve acronyms, MCU, UMC, or partial acronyms, MC University, Univ MC. The list easily
extends to dozens of variants when authors introduce their faculty or department name, some-
times at the expense of the university name. Add to that the common practice of incorporating
one institution into another or splitting part of a university away from its main organization,
along with larger mergers and the creation of international branch campuses, and we have a com-
plex problem for those assessing the university’s research output.
Indeed, nowadays journal and author names are relatively constant, while it is not uncom-
mon for university names to change. Although there have been several initiatives to address
the problem by using unique identifiers for research institutions, none have been universally
adopted to the same extent as for journals (ISSN), individuals (ORCID), or documents (DOI).
These efforts, summarized in Table 1, have mainly been made by the major citation indexes,
such as Scopus (Affiliation Identifier or AFID), Web of Science (WoS; Organization Enhanced),
and Dimensions (Global Research Identifier Database or GRID). Furthermore, a new
community-led collaboration of multiple organizations has been launched and is known as
the Research Organization Registry (ROR). This holds promise because it is closely linked
to GRID and is to be incorporated into Crossref metadata (Lammey, 2020).
Databases used in bibliometric assessments have made strides into resolving the problem of
disambiguation using different solutions, including manual submission of affiliation variant
lists by universities to database owners or automated unification systems. The degree of accu-
racy is still unquantified and policy-makers who rely on bibliometric analysis often overlook
an inherent level of error when using these data sources.
1.2. The Importance of Affiliations
Initiatives to identify and list the world’s most influential academics based on citations to their
work rely on affiliation disambiguation techniques. Clarivate’s Highly Cited Researchers list rec-
ognizes approximately 6,000 scientists who have published papers cited in the top 1% of their
field in the preceding decade. The list is published once a year and is searchable by academic
institution. The composition and validity of the list is therefore dependent on the affiliation
disambiguation performed by Clarivate on its underlying WoS database. Similarly, the recently
updated “Stanford” author database of standardized citation metrics (Ioannidis, Boyack, & Baas,
2020) relies on Scopus affiliation disambiguation. The list of the top 2% authors can be down-
loaded and is sortable by academic institution. As universities seek to recruit scientists who
appear on these prestigious lists, an academic’s value is increased by virtue of their presence.
The ability of the databases to accurately link authors to their affiliations is therefore increasingly
valued by recruitment professionals as well as research managers.
Table 1. Disambiguation processes used by bibliographic databases

Database | Disambiguation | Abbr. | Process
Scopus | Affiliation identifier | AFID | Institutions assigned an 8-digit AFID and variants linked to the main AFID
Web of Science | Organization enhanced | OE | Unifies the most frequently occurring address variants to preferred names
Dimensions | Global Research Identifier Database | GRID | Freely accessible database of research organizations that are assigned a unique and persistent identifier linked to its variants
Microsoft Academic | Global Research Identifier Database | GRID | Freely accessible database of research organizations that are assigned a unique and persistent identifier linked to its variants
Community-led registry | Research Organization Registry | ROR | Community-led initiative to supersede GRID and be incorporated into Crossref metadata
Many universities driven by increased external competition (Brankovic, Ringel, & Werron,
2018; Espeland & Sauder, 2016) seek to maximize their position in the various international
ranking tables. The ranking organizations in turn typically assess institutions’ performance against
a set of criteria that usually include the quantity and impact of research publications (Centre for
Science & Technology Studies Leiden University, 2020; QS Intelligence Unit, 2019; Shanghai
Ranking Consultancy, 2019; Times Higher Education, 2019; U.S. News & World Report LP,
2019). These ranking systems use either Elsevier’s Scopus or Clarivate’s WoS to compute the bib-
liometric component of their tables. Nature Index, a database of author affiliations and institu-
tional relationships also recently used data from Dimensions in an experiment to identify research
in the field of Artificial Intelligence (Armitage & Kaindl, 2020).
The level of accuracy of those databases and their ability to assign papers to the correct affili-
ations consequently becomes one of the limiting factors in an institution’s performance (Orduna-
Malea, Aytac, & Tran, 2019). Any “missing” papers can cost a university valuable places in the
ranking table and authors are routinely encouraged to use the official institution name when
publishing. Nevertheless, each year universities complain to the ranking systems of missing
papers, but the rankers are constrained by the limitations of the citation indexes. Disgruntled uni-
versities are usually referred to the database owners to resolve their affiliation-related complaints.
Times Higher Education routinely uses the Scopus affiliation as delivered by Elsevier,
although it has occasionally worked directly with institutions to ensure the mapping used in
the ranking coincides with their organizational structure. QS receives data from Scopus and then
groups distinct Scopus AFIDs including medical schools, business schools, hospitals, and tech-
nical research institutes into single university entities that match those in its rankings database. In
this process, QS relies on input from the institutions to define such relationships (QSIU, 2019).
The Leiden Ranking uses its own (CWTS) version of WoS and conducts additional rounds of
affiliation disambiguation. Specifically, the CWTS system unifies all address variants that occur
at least five times in the WoS database, identifies missing university affiliations from depart-
ments and city names, and attributes papers from hospitals and medical centers to their affili-
ated universities based on author publication rules (Centre for Science & Technology Studies
Leiden University, 2020; Van Raan, 2005; Waltman, Calero-Medina et al., 2012).
Comparison of database coverage has become easier as most now aim to link their records to
the Digital Object Identifier (DOI). The DOI has become established as a persistent, reliable iden-
tifier together with a system using that identifier to locate digital services associated with that con-
tent (e.g., Gasparyan, Yessirkepov et al., 2021; Zahedi, Costas, & Wouters, 2017). In bibliometric
studies, the DOI can therefore be used as a common identifier to determine overlapping coverage
of records in different databases. The existence of a DOI for a research article depends on the pub-
lisher generating the DOI and linking it to the article in the Crossref database. Most publishers now
aim to do this routinely but there are plenty of records that do not have a DOI and that limits our
ability to use it as a common identifier. Factors associated with lower prevalence of DOIs include
nonacademic records, arts and humanities fields, and document types from books and conference
proceedings. This study used DOIs to retrieve papers from selected universities and it was impor-
tant as a first step to understand what proportion of the actual output we were looking at.
1.3. Research Design
This paper addresses affiliation disambiguation by attempting to answer the following questions:
1. To what extent do discrepancies in author affiliations exist between the major biblio-
graphic databases? Which databases have most discrepancies?
2. What types of discrepancies can be identified?
3. Are different types of discrepancy more prevalent in different databases?
The answers to these questions will be useful in our understanding of the extent to which
research outputs are accurately assigned to universities by the different databases. As many deci-
sions are based on the outcome of bibliometric studies, comparisons, and university rankings,
policy-makers will be better informed about the limitations of bibliometrics studies and compar-
isons. The ranking bodies may take these limitations into account when they publish their
league tables. Database owners may incorporate these findings into their development plans
and algorithms to improve the accuracy of their products and make them more competitive.
We selected 18 universities for the study, all from the Arab region. Local or regional languages
have been found to contribute to mistakes in author affiliations (Bador & Lafouge, 2005; Falahati
Qadimi Fumani, Goltaji, & Parto, 2013; Konur, 2013) and to our knowledge, no such study of
university affiliations in the Arab region has been published. We selected universities from Gulf
countries, the Levant, and North-East Africa because of our familiarity with the region, language,
and institutions.
We used a recent 5-year time window (2014–2018) and extracted records from four major
databases: Scopus, WoS, Dimensions, and Microsoft Academic.
The first part of the study was to determine the proportion of publications from our selected
universities that had DOIs. Records were extracted using DOIs and compared with the total
number of records for the selected publication years. This was to confirm we could use the
DOI to identify a sufficient quantity of scholarly material.
Each database has taken a different approach to affiliation disambiguation, and it is impor-
tant to know how they compare. Therefore, in the second part of the study we quantified the
affiliation discrepancies between databases. We paired the databases and specifically looked
at the nonoverlapping records (i.e., the surplus of records that were retrieved from one data-
base in the pair but not the other). Records could be in the surplus because of discrepancies in
affiliation or publication year, or differences in database coverage. We quantified the surplus in
each database with respect to the other three for all the universities. We then calculated the
proportion of the surplus that was caused by each of the three reasons (affiliation discrepancy,
publication year discrepancy, or coverage differences). Discrepancies due to publication year
were negligible and grouped together with coverage differences.
The third part of the study concentrated exclusively on those records that were caused by
discrepancies in the university affiliation. We manually examined two dozen records from
each database pair surplus and attempted to explain the discrepancies in the context of affil-
iation indexing. This has major implications for any university benchmarking study and par-
ticularly the international university rankings. The final section of the study requires close
knowledge of university names and we therefore chose to make this a regional study.
2. LITERATURE REVIEW
Work on author affiliations began in the mid-1980s. For example, the LISBON Institute, the
predecessor of CWTS, already used affiliation data from the Science Citation Index (SCI) to
report the changes in academic collaboration in the Arab region following the geopolitical
developments in the 1980s (DeBruin, Braam, & Moed, 1991). The problems of author affilia-
tions became more important as bibliometric reports gained popularity and started being
offered as a service (Calero-Medina, Noyons et al., 2020). As in-house citation indexes were
developed in the 1990s, care was taken to construct them in a manner that facilitated affilia-
tion disambiguation. The problem of missing author affiliations has been largely addressed, Ma
In 2015 the WoS indexes still contained sizable quantities of publications without any affilia-
tions whatsoever (SCIE: 7.6%, SSCI: 6%, and A&HCI: 35%) (Liu, Hu, & Tang, 2018).
The owner of WoS tried to overcome the affiliation disambiguation problem by introducing
its organization enhanced feature. The disambiguation process involves creating normalized
address segments for each record and unifying the most frequently occurring address variants
to a list of currently over 14,000 preferred names (Clarivate, 2020a). Smaller or less frequently
occurring organizations might not have a preferred name in the system or some of its legiti-
mate address variants might not have been unified. Full details of the process are not avail-
able, but one imagines it a significant task to perform and then keep up to date in the light of
organization mergers and divestments. Organizations may request unification or corrections
to their address variant lists via an online form (Clarivate, 2020b).
The organization enhanced feature has been selectively used in bibliometric studies to
improve accuracy (Baudoin, Akiki et al., 2018). A recent study (Donner, Rimmert, & van
Eck, 2020) showed widely varying recall and precision across institutions between the
WoS organization enhanced feature, Scopus AFID, and a German institution affiliation dis-
ambiguation system described as “near-complete” for German public research organizations.
Taking the German system as ground truth, neither WoS organization enhanced nor Scopus
AFID provided high recall rates, and both showed widely varying precision across institutions
impacting bibliometric indicators. The authors concluded that the resulting inconsistencies in
publication and citation indicators using the commercial vendor systems should be taken into
consideration by policy-makers.
A large-scale study comparing database coverage across a multi-institution data set
between Scopus, WoS, and Microsoft Academic examined the publication overlap of 15 uni-
versities with DOIs serving as the common attribute (Huang, Neylon et al., 2020). The authors
created a Venn diagram for each university showing the proportion of DOIs indexed by all
three data sources. The diagram also revealed the extent to which DOIs were covered by only
one of the three databases, which we term a surplus. For instance, DOIs found in WoS but not
in Scopus or Microsoft Academic count as WoS surplus. The authors found that Microsoft
Academic had the broadest DOI coverage but the least complete affiliation metadata. The
authors concluded that assessment of any institutional performance will produce different
results depending on the source database. They went on to demonstrate that representation
in databases varied widely between institutions, which compounds the likelihood of inaccu-
racies when comparing universities with each other. This introduces potential bias in any bib-
liometric assessment, such as university ranking, that relies on a single source. The present
study builds on the work by Huang et al. (2020) by further analyzing the affiliation assignation
of DOIs across the same three databases plus Dimensions for 18 universities in a specific
geographical region.
Another large-scale study (Visser, van Eck, & Waltman, 2021) compared coverage and the
completeness and accuracy of citation links between five databases, namely Scopus, WoS,
Dimensions, Microsoft Academic, and Crossref. The authors used a thorough and complex
matching technique that compared pairs of records against seven bibliographic attributes
and computed a matching score. They found Microsoft Academic had by far the broadest cov-
erage of the scientific literature. In pairwise analysis, Scopus contained large quantities of
exclusive publications not indexed in the other databases. Meanwhile, Dimensions/Crossref
and Microsoft Academic indexed substantial bodies of records not found in Scopus. The
WoS surplus compared with Scopus was smaller and limited mainly to meeting abstracts and
book reviews, which are not indexed in Scopus. Another important finding was that Dimen-
sions and Crossref had almost overlapping coverage, which can be attributed to Dimensions’
reliance on Crossref as its core content.
A comparative analysis at institutional level found more institutions had greater coverage in
Scopus than in Dimensions, and that up to half the documents indexed in Dimensions have no
institutional affiliation (Guerrero-Bote, Chinchilla-Rodríguez et al., 2021). In the view of the
authors, this invalidates Dimensions as a suitable data source for assessing university impact.
2.1. Limitations of DOI Accuracy
All studies using DOIs are susceptible to a number of limitations. Primo, DOI assignation is not
always accurate. Indeed, one group found errors in 38% of the DOIs in the cited references of
their sample from WoS (Xu, Hao et al., 2019). Most (92%) of the errors in the DOI were in the
prefix and often included a surplus “DOI” or a duplication of the DOI. The authors went on to
propose an algorithm for cleaning the DOIs in the cited reference database. Another study
found 8,841 “illegal” DOIs (defined as those that do not begin with “10”) in Scopus (Huang
& Liu, 2019) and referred to Elsevier’s efforts to clean the Scopus data.
Zhu, Hu, and Liu (2019) created a search string to identify DOIs with errors in WoS. They
wrote a search strategy aimed at identifying cases in which a numeric digit such as “0” had
been replaced with the letter “o.” Similar errors occurred where the number “6” had been
confused with the letter “b” or the number “1” with the letter “l” among many other examples
identified. In some of these cases, searching both the correct version of the DOI (with the
numeric “0”) or the erroneous version (with the letter “o”) returned the WoS record.
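To make these error patterns concrete, the following minimal Python sketch flags some of the DOI problems described above (surplus "doi:" prefixes, strings that do not begin with "10", and letter-for-digit confusions in the prefix). It is illustrative only; the function name and rules are assumptions for the example and are not the detection or cleaning procedures used by the cited studies.

```python
import re

def doi_quality_flags(doi):
    """Return heuristic warnings for a DOI string (illustrative rules only)."""
    flags = []
    d = doi.strip().lower()
    if d.startswith("doi:"):
        flags.append("surplus 'doi:' prefix")  # e.g. "doi:10.1162/qss_a_00175"
        d = d[4:]
    if not re.match(r"^10\.\d{4,9}/\S+$", d):
        flags.append("malformed: does not match 10.<registrant>/<suffix>")
    prefix = d.split("/", 1)[0]
    if re.search(r"[ol]", prefix):
        flags.append("possible letter/digit confusion ('o' for '0', 'l' for '1') in prefix")
    return flags

# Both versions of the same DOI: the second replaces the digit "0" with the letter "o".
for candidate in ["10.1162/qss_a_00175", "1o.1162/qss_a_00175", "doi:10.1162/qss_a_00175"]:
    print(candidate, "->", doi_quality_flags(candidate) or "no flags")
```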
Another problem is duplicate DOIs: This is when a single DOI is linked to multiple papers,
as reported by Franceschini, Maisano, and Mastrogiacomo (2015), or to multiple versions of
the same publication (Valderrama-Zurián, Aguilar-Moya et al., 2015). Databases have taken
different approaches to this problem and Elsevier has recently invested in improving the
Scopus data completeness and accuracy (Baas, Schotten et al., 2020).
The DOI was introduced in 2000, although there have been a number of other unique iden-
tifiers for published research papers. With the advent of electronic publishing, the Uniform
Resource Locator (URL) was an early candidate. A study of more than 10,000 MEDLINE
abstracts in 2008 looked at the decay rate of URLs between 1994 E 2006 (Ducut, Liu, &
Fontelo, 2008). The results showed that most (81%) of the URLs were available but that only
78% of the available URLs contained the actual information mentioned in the MEDLINE
record, and one in six (16%) of the total were “dead” URLs. A study comparing multiple
identifier systems found the DOI to be among the best following evaluation against seven
criteria including identifier features, digital coverage, and comprehensiveness of scope
(Khedmatgozar & Alipour Hafezi, 2015).
Over time, publishers incorporated DOIs into their online metadata. An examination of
WoS and Scopus records revealed that by 2014, most (90%) of “citable items” (defined as
journal articles, recensioni, and conference proceedings papers) were being assigned DOIs in
the sciences and social sciences (Gorraiz, Melero-Fuentes et al., 2016). The figures were lower
for all document types and much lower in the arts and humanities: 50% for journal citable
items and just 20% for books and book chapters. Articles published in regional publications
may also be less associated with DOIs. A sample of scholars from Brazil showed DOIs among
less than half of their journal papers and less than a tenth of their conference papers (Rubim &
Braganholo, 2017). Meanwhile, Mugnaini, Fraumann et al. (2021) found that the presence of
international coauthors increased the proportion of DOIs.
3. DATA AND METHODS
This study examines the overlap of indexed DOIs for 18 universities (Table 2) of all document
types in four international, multidisciplinary bibliometric databases often used in bibliometric
studies, namely Scopus, WoS, Dimensions, and Microsoft Academic.
From the WoS, we used five citation indexes: the Science Citation Index Expanded, Social
Sciences Citation Index, Arts & Humanities Citation Index, the Conference Proceedings Cita-
tion Index-Science, and the Conference Proceedings Citation Index-Social Sciences &
Humanities. We used neither the Book Citation Index, nor the Emerging Sources Citation
Index because we do not have access to them. From Dimensions, we extracted only publi-
cations because these are comparable with the documents in the other databases, but not
grants, patents, datasets, clinical trials, or policy documents. All data were retrieved from
the Centre for Science and Technology Studies (CWTS) database system at Leiden University.
The data were received in March 2020 (WoS), April 2020 (Dimensions), July 2020 (Microsoft
Academic), and April 2021 (Scopus).
Table 2. The universities used along with their abbreviations

Abbreviated name | Full institutional name | Country
Ain Shams | Ain Shams University | Egypt
Alexandria | Alexandria University | Egypt
Assiut | Assiut University | Egypt
AUB | American University of Beirut | Lebanon
Babylon | University of Babylon | Iraq
Baghdad | University of Baghdad | Iraq
Bahrain | University of Bahrain | Bahrain
Carthage | University of Carthage | Tunisia
Jordan | University of Jordan | Jordan
Khalifa | Khalifa University | United Arab Emirates
King Abdulaziz | King Abdulaziz University | Saudi Arabia
King Saud | King Saud University | Saudi Arabia
Kuwait | Kuwait University | Kuwait
Qatar | Qatar University | Qatar
Lebanese | Lebanese University | Lebanon
Sfax | University of Sfax | Tunisia
Sultan Qaboos | Sultan Qaboos University | Oman
UAEU | United Arab Emirates University | United Arab Emirates
We extracted records based on the highest level of affiliation disambiguation available for
the selected universities in each database. In Scopus, we used the Affiliation identifier (AFID),
which is a unique identifier for the institution to which records are tagged. In many cases,
Scopus includes one AFID for the overall organization and multiple additional AFIDs for part-
ner institutions and component units to reflect the organizational structure of the university.
This enables the user to search for records from the entire organization or for its subunits indi-
vidually. This process appears to have been conducted more rigorously for some universities
than others. The result is that for some universities there is only one AFID, while for others
there are many, and we therefore used only the main AFID for each university. From WoS,
we used the organization enhanced feature, which is a preferred institutional name searchable
in the database and to which records from that organization and its component parts are uni-
fied. The organization enhanced unification process is performed by the database owner with
voluntary input from the institutions themselves and suffers from the same inconsistencies as
the Scopus AFIDs. Therefore, we used only the high-level organization enhanced name and
no additional variants. Dimensions and Microsoft Academic each use GRID, which was devel-
oped by Digital Science to describe both parent-child relationships between institutions and
external related organizations. GRID disambiguates affiliation names for approximately
100,000 organizations and we therefore used the GRID record linked to the generally
accepted name for each of the 18 university names.
In the first part of the study, we determined the proportion of records with a DOI in each
database. We extracted the unique DOIs present in each database between the years 2014 E
2018 inclusive and calculated the proportion of total records in the database with a DOI. IL
overall share is reported, as well as a share for each of the studied universities.
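As an illustration of this first step, the short Python sketch below (using pandas) computes the share of records with a DOI per database and university for the 2014–2018 window. The column names and the tiny example data frame are hypothetical and do not reflect the CWTS database schema.

```python
import pandas as pd

# Hypothetical extract: one row per indexed record, with the DOI column
# left empty (None) when the database holds no DOI for that record.
records = pd.DataFrame({
    "database":   ["Scopus", "Scopus", "WoS", "Dimensions"],
    "university": ["Ain Shams", "Ain Shams", "Ain Shams", "Ain Shams"],
    "pub_year":   [2015, 2017, 2016, 2018],
    "doi":        ["10.1234/abc", None, "10.1234/abc", "10.5678/def"],
})

# Restrict to the 5-year publication window and compute the DOI share.
in_window = records[records["pub_year"].between(2014, 2018)]
doi_share = (
    in_window.assign(has_doi=in_window["doi"].notna())
             .groupby(["database", "university"])["has_doi"]
             .mean()            # proportion of records with a DOI
             .mul(100).round(1)
)
print(doi_share)
```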
The second section of the study quantifies the share of the surplus caused by affiliation dis-
crepancy and, separately, the share of the surplus caused by database coverage. For this we
used only those records with DOIs in the same 5-year period and for the 18 selected universi-
ties. Any records whose publication year differed between databases such that they were
included in the 5-year window in one database but not another were counted under coverage
discrepancy. Preparatory work for the study demonstrated that the number of records in this
category was negligible. Because of the complexities of database comparison, we compared
coverage between the databases in pairs. That is to say, we extracted DOIs using university
affiliation names and the publication time window in one of the databases (“Primary”). We then
searched for those same DOIs in the second database (“Comparator”), restricting the search to
the same affiliation and the same publication time window. Those DOIs found in both data-
bases constitute overlapping coverage and were not studied further. Those DOIs not found
in the comparator database are termed the surplus and the reasons for their absence are inves-
tigated. To establish these reasons, we systematically repeated the comparison, removing ele-
ments of the search in the comparator database for affiliation and publication time period.
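The pairwise logic can be summarized in a few lines of Python. This is a simplified sketch under the assumption that each database has already been reduced to sets of DOIs; the function and variable names are illustrative and the toy DOIs are invented.

```python
def compare_pair(primary_affil, comparator_affil, comparator_all):
    """Split the primary database's DOIs for one university into the three groups
    described above: overlap, affiliation discrepancy, and coverage surplus."""
    overlap = primary_affil & comparator_affil                 # same university in both
    affil_discrepancy = (primary_affil & comparator_all) - comparator_affil
    coverage_surplus = primary_affil - comparator_all          # DOI absent from comparator
    return overlap, affil_discrepancy, coverage_surplus

# Toy example for one university and one database pair (invented DOIs).
scopus_univ = {"10.1/a", "10.1/b", "10.1/c"}   # DOIs assigned to the university in the primary
wos_univ    = {"10.1/a"}                        # DOIs assigned to it in the comparator
wos_all     = {"10.1/a", "10.1/b"}              # all DOIs indexed by the comparator, 2014-2018
print(compare_pair(scopus_univ, wos_univ, wos_all))
# -> ({'10.1/a'}, {'10.1/b'}, {'10.1/c'})
```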
In the final part of the study, we manually analyzed a sample of 24 affiliation discrepancies for
each of the 12 database pairs. Examples were selected at random with a maximum three pub-
lications in each comparison from the same publisher. These examples served to illustrate the
presence and types of affiliation discrepancies between databases. For each database pair sur-
plus, we selected the university with the highest proportion of affiliation discrepancies. To have
as much diversity as possible in the universities studied, each university was selected at most
once. (If the university with the highest proportion of affiliation discrepancies had already been
selected for another database pair, we moved on to the university with the next highest propor-
tion.) Examination was performed by manually searching for the records on the web interface
versions of each database and checking the published PDF documents as the ground truth.
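A minimal sketch of this sampling rule follows, assuming the affiliation-discrepancy records for a database pair are available as a list of dicts with "doi" and "publisher" keys; the names and structure are hypothetical.

```python
import random
from collections import Counter

def sample_discrepancies(records, k=24, max_per_publisher=3, seed=0):
    """Draw k records at random while capping how many may come from one publisher."""
    rng = random.Random(seed)
    pool = records[:]              # copy so the caller's list is untouched
    rng.shuffle(pool)
    sample, per_publisher = [], Counter()
    for rec in pool:
        if per_publisher[rec["publisher"]] < max_per_publisher:
            sample.append(rec)
            per_publisher[rec["publisher"]] += 1
        if len(sample) == k:
            break
    return sample
```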
4. RESULTS
4.1. Number and Proportion of DOIs
The total number of records along with the number of unique DOIs retrieved for each of the
four databases is shown in Table 3.
Scopus and WoS each had DOIs for more than three quarters of the publications. Both these
databases incorporate a selective expert assessment of journals, books, and conference series
before they are admitted to the database. This process ensures that most indexed articles go
through the established scholarly publishing route and are therefore increasingly likely to have
a DOI. The proportion of records with DOIs for the selected universities in this study (89.9%
Scopus and 82.3% WoS records) seems somewhat higher than the proportions reported by
Gorraiz et al. (2016). However, that study used earlier data up to 2014 and demonstrated an
upward trajectory for documents with a DOI that would roughly coincide with the figures
presented here. Records with a DOI comprise 96% of the Dimensions database, which is the
highest share among the four. As Dimensions uses Crossref as the key pillar of its content along
with PubMed, the high proportion is not especially surprising. All Crossref records have a DOI
E 90% of PubMed articles already had DOIs by 2015 (Boudry & Chartron, 2017).
There is some variance in DOI prevalence between the universities. The variance might be
due to the overrepresentation of certain document types that are less frequently assigned DOIs
(e.g., books, book chapters, and conference proceedings). Equally, universities that publish
more papers in subject fields such as the arts and humanities could also contribute to a lower
proportion of DOIs among their publications (Gorraiz et al., 2016).
For studies concerning detailed comparison of database coverage, additional metadata such
as article title and author names should be used to maximize identification of overlapping records.
The focus of the current study, however, was not coverage, but the prevalence of affiliation
discrepancies between databases. Therefore, we could exclude records that do not have DOIs
without negatively impacting the study results. It was more important to identify records with a
common identifier so that we could compare their affiliations in the different databases. This
approach was largely inspired by the work of Huang et al. (2020), who took the same approach.
4.2. Prevalence of Affiliation Discrepancies
In the next part of the study, we examined the prevalence of affiliation discrepancies in four
citation indexes (Scopus, WoS, Dimensions, and Microsoft Academic) for 18 selected Arab
universities.
In Figure 1 we present a stacked bar chart for each database pair, with the primary data-
base on the left and the comparator database on the right. The first portion of the bar, read-
ing from the left, represents the surplus records only found in the primary database and not
present in the comparator. For instance, the top bar shows 30,060 Scopus DOIs that are not
included in WoS. The second portion includes those DOIs that are present in both data-
bases but are not assigned to the relevant university affiliation in the comparator database.
In the first bar that means 1,831 Scopus DOIs are not found to have the same affiliation in
WoS, even though they are present in WoS. The central portion of the bar shows the number
of DOIs found in both databases with the same affiliation. The penultimate portion of the
bar shows that 5,924 WoS DOIs are also found in Scopus but not with the relevant affilia-
zione. In the final portion of the bar, there are 3,114 WoS DOIs that are not included in the
Scopus database.
Table 3. Number and proportion of records with DOI by university

Institution | Scopus DOI / Total (% DOI) | Web of Science DOI / Total (% DOI) | Dimensions DOI / Total (% DOI) | Microsoft Academic DOI / Total (% DOI)
Ain Shams | 7,785 / 8,567 (90.9) | 6,621 / 8,139 (81.3) | 9,741 / 9,809 (99.3) | 10,834 / 11,230 (96.5)
Alexandria | 6,875 / 7,524 (91.4) | 5,277 / 6,584 (80.1) | 7,218 / 7,280 (99.1) | 7,198 / 7,528 (95.6)
Assiut | 4,608 / 5,027 (91.7) | 3,930 / 4,762 (82.5) | 4,439 / 4,511 (98.4) | 4,538 / 4,829 (94.0)
AUB | 4,364 / 4,733 (92.2) | 4,420 / 5,569 (79.4) | 4,080 / 4,115 (99.1) | 5,620 / 5,961 (94.3)
Babylon | 1,269 / 1,978 (64.2) | 389 / 472 (82.4) | 618 / 621 (99.5) | 701 / 878 (79.8)
Baghdad | 2,687 / 3,827 (70.2) | 1,380 / 1,629 (84.7) | 1,930 / 1,959 (98.5) | 2,214 / 3,003 (73.7)
Bahrain | 811 / 952 (85.2) | 506 / 566 (89.4) | 722 / 723 (99.9) | 722 / 778 (92.8)
Carthage | 4,595 / 4,957 (92.7) | 3,985 / 5,263 (75.7) | 4,103 / 4,113 (99.8) | 3,257 / 3,433 (94.9)
Jordan | 3,725 / 4,334 (85.9) | 2,599 / 3,132 (83.0) | 3,277 / 3,306 (99.1) | 3,758 / 4,441 (84.6)
Khalifa | 5,781 / 6,252 (92.5) | 3,936 / 5,176 (76.0) | 4,991 / 4,991 (100.0) | 3,746 / 4,031 (92.9)
King Abdulaziz | 22,038 / 24,045 (91.7) | 19,287 / 21,502 (89.7) | 19,080 / 19,221 (99.3) | 18,966 / 19,889 (95.4)
King Saud | 21,620 / 23,870 (90.6) | 18,246 / 21,457 (85.0) | 19,470 / 19,788 (98.4) | 19,165 / 20,305 (94.4)
Kuwait | 3,204 / 3,617 (88.6) | 2,406 / 2,976 (80.8) | 3,002 / 3,021 (99.4) | 3,040 / 3,280 (92.7)
Lebanese | 2,880 / 3,107 (92.7) | 1,590 / 2,198 (72.3) | 2,933 / 2,967 (98.9) | 2,884 / 3,148 (91.6)
Qatar | 6,080 / 6,686 (90.9) | 4,182 / 5,393 (77.5) | 5,888 / 5,895 (99.9) | 5,980 / 6,373 (93.8)
Sfax | 7,339 / 7,863 (93.3) | 6,296 / 8,155 (77.2) | 6,793 / 6,803 (99.9) | 6,055 / 6,340 (95.5)
Sultan Qaboos | 3,727 / 4,229 (88.1) | 2,650 / 3,241 (81.8) | 3,485 / 3,539 (98.5) | 4,471 / 4,898 (91.3)
UAEU | 3,699 / 4,090 (90.4) | 2,540 / 3,172 (80.1) | 3,393 / 3,419 (99.2) | 3,327 / 3,626 (91.8)
Total 18 universities | 107,578 / 119,720 (89.9) | 85,069 / 103,363 (82.3) | 100,511 / 101,390 (99.1) | 102,145 / 109,503 (93.3)
Whole database | 13,046,237 / 15,495,969 (84.2) | 9,833,766 / 12,975,857 (75.8) | 21,691,146 / 22,589,839 (96.0) | 21,042,708 / 55,168,811 (38.1)
Figura 1. Affiliation discrepancy for the 18 universities by database pair.
Those DOIs in the second and fourth portions of the bar therefore represent DOIs where
there is a discrepancy between the affiliation assigned in the two databases. It is not possible at
this point to say which, if either, is wrong. To make a decision about the accuracy of the
assigned affiliation, we need to look at the individual cases and usually check with the pub-
lished PDF document. We report this in Section 4.4. Therefore, we do not refer to affiliation
errors, but prefer the term discrepancies. There will also be cases in which the affiliation is
missing or has failed to be assigned to the university in both databases, and those records
would not appear at all in the results. We are therefore not presenting a comprehensive list
of errors, but rather an indication of the proportion of discrepancies between database pairs.
We can now analyze the relative affiliation discrepancies between each of the database
pairs presented in Figure 1. For ease of discussion, we have organized the database pairs into
four sections, with the primary database as the heading. In each comparison, we only used
those records associated with a DOI.

Figure 2. Scopus–Web of Science differences in coverage and affiliation by university.
4.2.1. Scopus
IL 1,831 Scopus records from the 18 selected universities that are present in WoS but not
linked to the same affiliation represent about 2% of the actual overlap between the two data-
bases. The next two bars show far more sizable proportions of Scopus records not linked to the
same affiliation in Dimensions and Microsoft Academic. We can interpret these results by sug-
gesting that the Scopus affiliation is more likely to agree with that assigned in WoS than it is
with those assigned in either Dimensions or Microsoft Academic.
On the other hand, a small share (4%–7%) of publications has been assigned to the 18
universities in Scopus, but not in WoS (5,924), Dimensions (5,068), or Microsoft Academic
(4,697).
4.2.2. Web of Science
Records linked to the selected universities in WoS but not assigned to the same affiliations in
each of the other three databases appeared to constitute about 7% of the overlapping Scopus
records, and about a fifth of the overlapping DOIs in Dimensions (17,492) and Microsoft
Academic (20,384). This means that the WoS affiliation does not concur with the other three
databases in a substantial proportion of cases.
Comparing in the other direction, the penultimate section of the WoS bars (1,831 records
for Scopus, 2,640 for Dimensions, E 1,829 for Microsoft Academic) is relatively small in
each case. That means there are comparatively few records in the three comparator databases
in which the affiliations fail to concur with their corresponding record in WoS.
4.2.3. Dimensions
Dimensions records were clearly more likely to coincide with the affiliations assigned in WoS
and Scopus than they were for affiliations assigned by Microsoft Academic. Dimensions affil-
iation discrepancies with those assigned in WoS (2,640 records) and Scopus (5,068 records)
accounted for only 2% and 4% of the overlapping coverage respectively. Meanwhile, 15% of
Dimensions records with overlapping coverage in Microsoft Academic showed affiliation dis-
crepancies with their corresponding records in Microsoft Academic.
The opposite comparison shows that a substantial number of publications have not been
assigned to the 18 universities in Dimensions, while they have been assigned to these univer-
sities in the other three databases. This suggests that Dimensions may incorrectly not have
assigned these publications to the 18 universities.
4.2.4. Microsoft Academic
Similarly, there was a relatively small number of Microsoft Academic records that did not agree
with affiliations assigned in WoS (1,829) and Scopus (4,697). These discrepancies represented
2% E 4% of the overlapping records respectively. A more sizable proportion of around 15%
Microsoft Academic records failed to match the affiliations assigned in Dimensions.
From the other direction, a substantial number of publications have not been assigned to
IL 18 universities in Microsoft Academic, while they have been assigned to these universities
in the other three databases. This suggests that Microsoft Academic may incorrectly not have
assigned these publications to the 18 universities.
Figura 3. Scopus–Dimensions differences in coverage and affiliation by university.
4.2.5. Overall summary of the results
When WoS or Scopus assign an affiliation to a publication, the same affiliation is usually
assigned by the other databases. The same is not the case the other way round. The largest
share of discrepancies occurred with affiliations assigned by Microsoft Academic when com-
pared with the other three databases. Dimensions-assigned affiliations also found sizable
shares of discrepancies when their records were compared in the other databases.
The database coverage played a role, with many records from the larger databases (Dimen-
sions and Microsoft Academic) not found indexed in the smaller, more selective databases
(Scopus and especially WoS).
4.3. Database Surplus by University
Figures 2–7 show the extent of differences in coverage and affiliation between databases for
each of the 18 selected universities. The additional level of data can help us interpret the dif-
ferences between the way the databases approach affiliation disambiguation.
There were some patterns that emerged from this analysis. For example, there was a con-
sistently higher proportion of records assigned to the Lebanese University than for the other
universities found in Scopus, Dimensions, and Microsoft Academic that were not assigned to
this university in WoS. We should recall that for this study, we defined WoS records as belong-
ing to the universities only if they were retrieved using the organization enhanced unification
tool. If records contained the correct address but was not unified to its affiliation in the orga-
nization enhanced tool, then they would not have been retrieved.
Similarly, records assigned to Khalifa University in Scopus, WoS, and Dimensions were
frequently not found assigned to that university in Microsoft Academic. Khalifa University is
the result of a merger between three institutions that took place in 2017. Some of these
records will be examined in Section 4.4 to determine how different databases approached
the resulting affiliations.
Figura 4. Scopus–Microsoft Academic differences in coverage and affiliation by university.
In the case of AUB, many publications have been assigned to the university in WoS and
Microsoft Academic, while they have not been assigned to the university in Dimensions. As
we will show in the next section, this was likely due to differing treatment of records from the
American University of Beirut Medical Center.
Babylon had a higher proportion of Scopus records than other universities that were not
assigned to that university in Dimensions and Microsoft Academic. It should be noted that
Scopus uses one affiliation identifier (AFID) for the main institution and assigns other AFIDs
for the subunits of the university based on the organization hierarchy.

Figure 5. Web of Science–Dimensions differences in coverage and affiliation by university.

Figure 6. Web of Science–Microsoft Academic differences in coverage and affiliation by university.

However, this process has been conducted more vigorously for some institutions than it has for others, and that dif-
ference might influence the level of affiliation discrepancy found. For example, Babylon has
only one Scopus AFID and the lowest or nearly lowest share of affiliation discrepancies in all
the database pairings where Scopus is the comparator. Conversely, Carthage has one Scopus
AFID for the main institution and 31 AFIDs for the subunits, which add 78% more records to
the university total when they are all included in the search. As discussed in Section 3, we
used only the main AFID to identify the publications of a university in Scopus. Hence, in the
case of Carthage, the large number of AFIDs for subunits of the university probably explains
why the university has the highest or second highest share of affiliation discrepancies in all
comparisons where Scopus was the comparator.
4.4. Types of Affiliation Discrepancy
The discrepancies in affiliations are the main focus of this study and vary widely in their prev-
alence; some interesting examples are described in this section. These highlight the challenges
faced by database providers and the various ways they have responded to them. We found we
could organize the affiliation discrepancies into four main groups, as shown in Table 4.
We manually examined two dozen sample records at random from each of the database
pairs using the web interface for each database. For each of these examples, we attempted to
discover the reason for the affiliation discrepancy for one of the universities between the data-
bases. The main reasons are summarized in Table 5. These sample analyses serve to illustrate
that discrepancies exist and to shed light on the possible reasons behind them. However, a
larger study would be needed to provide a more robust comparison.
Table 4. Types of affiliation discrepancy between databases

Affiliation discrepancy type | Definition | Example | Database
Missing affiliation | Author's affiliation is missing | 10.4018/jdm.2016100102 | Scopus
Missing second affiliation | Author's first affiliation is present, but their second affiliation is missing | 10.1166/asl.2017.7424 | Dimensions
Unification | Affiliation mentioned in some form but not linked to unified record | 10.2174/1386207319666161214111822 | Web of Science
Assigned to wrong institution | Author address linked to a different institution than that intended | 10.1016/j.compfluid.2014.07.013 | Microsoft Academic

4.4.1. Missing affiliation
The author affiliation has not been captured by the database and therefore a search for the
affiliation name did not retrieve the record. We found almost all Qatar University records in
the Microsoft Academic–Dimensions surplus were caused by missing affiliations in Dimensions.
In most of these cases all author affiliations were missing, and the papers were
conference proceedings or book chapters. Similarly, we found that where WoS has missed
author affiliations, the majority were meeting abstracts.
4.4.2. Missing second affiliation
The author’s first affiliation has been listed but not the second. This is similar to the above
category but worth separating as it appears that distinct groups of papers are indexed in some
databases for which the additional affiliation is missed while the first is captured. As an exam-
ple, on 10.1166/asl.2017.7424 the PDF shows University of Babylon as second affiliation for
one author that is omitted from the record in Dimensions and Microsoft Academic but
included in Scopus and WoS. Sometimes we can speculate on the reason for this. In cases
such as 10.1016/j.asoc.2016.06.019, the author’s second affiliation is listed separately from
the first on the PDF under categories such as “Author’s current address” or in this case,
“Correspondence address.” Scopus and WoS included this as second affiliation, while Dimensions
and Microsoft Academic did not.

Table 5. Affiliation errors: the university sampled for each database pair surplus. Each sample of 24 records was classified as missing affiliation, missing second affiliation, assigned to wrong institution, unification, or inconclusive.

Database pair | Institution
Scopus–Web of Science | Lebanese
Scopus–Dimensions | Sfax
Scopus–Microsoft Academic | Khalifa
Web of Science–Scopus | Saud
Web of Science–Dimensions | AUB
Web of Science–Microsoft Academic | Assiut
Dimensions–Scopus | Carthage
Dimensions–Web of Science | Bahrain
Dimensions–Microsoft Academic | Babylon
Microsoft Academic–Scopus | Kuwait
Microsoft Academic–Web of Science | UAEU
Microsoft Academic–Dimensions | Qatar
4.4.3. Assigned to the wrong institution
We found records from the College of Information Technology, University of Babylon, and the
College of Information Technology, UAEU that had been erroneously assigned to the College
of Information Technology, an institution registered in Pakistan in Microsoft Academic (e.g.,
10.1007/s00500-018-3414-4). In other databases, these records had been correctly assigned
to their respective universities. The term “College of Information Technology” is a common
university subunit, and it appears that these words have triggered unification in Microsoft Aca-
demic to the standalone institution with the same name. Using these data in a bibliometric
study would therefore produce a lower than expected count of papers from the affected insti-
tutions and an artificially high result for the real College of Information Technology in Pakistan.
The same phenomenon occurs for Information Technology University in Pakistan, and the
University College of Engineering in India, each of which are assigned additional papers in
Microsoft Academic.
Other examples showed that records assigned to UAEU in WoS organization enhanced
were in fact published by authors at a Moroccan institution called Université Abdelmalek
Essaadi, locally abbreviated to “UAE University” and mistakenly unified to the wrong institution.
Similarly, authors from LaSTRe Laboratory in Tripoli, Northern Lebanon, which is affiliated to the
Lebanese University, had been erroneously affiliated to the University of Benghazi in Tripoli,
Libya in the WoS organization enhanced. This might have occurred because of the appearance
of the city name, Tripoli, which is the capital of Libya but also a city in Lebanon. Two further
records from the same university were assigned to the United States of America due to confusion
over the town of Lebanon in Grafton County, New Hampshire. It appears therefore that the
presence of a city name or country name might sometimes trigger unification to the wrong
organization enhanced name in the WoS.
Figura 7. Dimensions–Microsoft Academic differences in coverage and affiliation by university.
4.4.4. Unification
An affiliation is listed but it has not been unified to the main or correct university. For example,
most of the Scopus records that we did not find in WoS under the Lebanese University affil-
iation did actually mention the university in the address field but were not unified to that
university in the organization enhanced field. This is a plausible explanation for the large pro-
portion of Lebanese university affiliation discrepancies with WoS described in Section 4.3.
Similarly, we discovered several examples of records attributed to either the Masdar Institute
or Petroleum Institute in Abu Dhabi, which have both been part of Khalifa University since a
merger in 2017. Microsoft Academic has kept the original affiliation while the other databases
have unified records to Khalifa University even from before the merger. This explains the high
proportion of records unified to Khalifa in Scopus, WoS, and Dimensions but not found with
that affiliation in Microsoft Academic. It also highlights the difficulties faced by database owners
with treating records from organizations following their mergers or separations.
Many papers attributed to authors from the American University of Beirut Medical Center
have been unified to AUB in WoS but not in Dimensions, which treats it as a separate insti-
tution. That explains the notable proportion of affiliation discrepancies found with Dimensions
in Figures 5 E 7. Allo stesso modo, many authors in Tunisia have acknowledged their institution as
Faculty of Sciences at Sfax. We found that Scopus unified these papers to the University of Sfax
while Dimensions treated it as a separate organization. In another similar case also in Tunisia,
we found several records affiliated to the National Institute of the Applied Sciences and Tech-
nology and several other research centers that WoS and Dimensions unified to the University
of Carthage while Scopus did not. In most of these cases, the university was not mentioned in
the PDF document, but Scopus has made the link through its disambiguation process. The
Scopus interface offers users the option of searching “whole institutions,” which includes all
affiliated institutes, or “affiliation only,” which is the main identifier for the university.
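As an illustration, a "whole institution" style query can be approximated by combining the main identifier with its subunit identifiers. The sketch below assumes that AF-ID() is the Scopus advanced-search field for affiliation identifiers; the numeric values are placeholders, not any university's real AFIDs:

```python
# Sketch: build a "whole institution" style advanced-search string from a main AFID
# plus its subunit AFIDs (placeholder identifiers, assumed AF-ID() field syntax).
def scopus_whole_institution_query(main_afid: str, subunit_afids: list[str]) -> str:
    """Combine the main identifier and all listed subunit identifiers into one query."""
    return " OR ".join(f"AF-ID({afid})" for afid in [main_afid, *subunit_afids])

print(scopus_whole_institution_query("60000001", ["60000002", "60000003"]))
# AF-ID(60000001) OR AF-ID(60000002) OR AF-ID(60000003)
```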
4.4.5. Inconclusive
We classified cases as inconclusive where there was no obvious reason for the DOI not being
retrieved, where a human indexing decision was involved, or where access to the PDF proved
impossible. An exam-
ple of a human indexing decision is a book preface with a DOI (10.1016/B978-0-12-800887-
4.00034-1) but no authors or affiliations. While Scopus had assigned the book editors as
authors, Dimensions had not. Another interesting case was a letter to the editor published
by three authors with a long list of additional signatories at the end. Scopus had counted all
the signatories as authors of the letter while Dimensions limited authorship to the three at the
top of the paper. Neither of these cases is clear-cut. If one accepts that the book editors in the
first case and letter signatories in the second should be named as authors then their affiliations
are missing in Dimensions. If they should not be named, then they are phantom affiliations
in Scopus.
5. DISCUSSION AND CONCLUSION
Our results showed that the proportion of Scopus and WoS records with DOIs was in line
with previous studies and that Dimensions showed near universal DOI coverage. Microsoft
Academic included a large proportion of content including patents and other nonacademic
document types that do not have DOIs. There was some variance in DOI coverage between
universities, probably due to the prevalence of certain document types or subject fields
which vary in their DOI assignation. Overall, bibliometric studies such as the one presented
in this paper can use the presence of DOIs to limit their data sets to comparable scholarly
material.
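As a minimal sketch of this step (not the actual pipeline used in this study), records can be restricted to those carrying a syntactically plausible, normalized DOI before any cross-database matching; the field names below are hypothetical:

```python
import re

# Minimal sketch: keep only DOI-bearing records, normalized so the same DOI
# matches across databases regardless of resolver prefixes or letter case.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def normalize_doi(raw: str | None) -> str | None:
    """Lowercase a DOI, strip common resolver prefixes, and reject strings that do not look like DOIs."""
    if not raw:
        return None
    doi = raw.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi if DOI_PATTERN.match(doi) else None

def index_by_doi(records: list[dict]) -> dict[str, dict]:
    """Index records by normalized DOI, dropping records without a usable DOI."""
    indexed = {}
    for record in records:
        doi = normalize_doi(record.get("doi"))
        if doi:
            indexed[doi] = record
    return indexed
```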
We analyzed overlapping coverage between databases in pairs and organized the results
based on whether or not the author affiliations matched. Some records were assigned to the
same university in both databases in the paired comparison. A substantial share of records
were assigned to a university in one database but not in the other. This study concentrated
on the affiliation discrepancies between the databases for 18 selected Arab universities.
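The pairwise logic can be sketched as follows, assuming each database has already been reduced to a mapping from normalized DOI to the set of unified university names. This is an illustration of the comparison described above, not the code used in the study:

```python
# Sketch of the pairwise comparison: classify db_a records assigned to one university
# against their counterparts in db_b (inputs are {normalized DOI -> set of university names}).
def compare_pair(db_a: dict[str, set[str]],
                 db_b: dict[str, set[str]],
                 university: str) -> dict[str, int]:
    counts = {"same_affiliation": 0, "affiliation_discrepancy": 0, "doi_not_in_other_db": 0}
    for doi, universities in db_a.items():
        if university not in universities:
            continue  # record not assigned to this university in db_a
        if doi not in db_b:
            counts["doi_not_in_other_db"] += 1
        elif university in db_b[doi]:
            counts["same_affiliation"] += 1
        else:
            counts["affiliation_discrepancy"] += 1
    return counts
```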
We found evidence that up to one in five publications can have discrepancies in author
affiliations between the major bibliographic databases. We found the discrepancies more fre-
quently in the larger databases: Dimensions and Microsoft Academic. The highest incidences
of discrepancy were WoS records not found to have the same affiliation in Microsoft Academic
and then Dimensions. The next highest were Scopus records not found to have the same
affiliation in Microsoft Academic and then Dimensions. Meanwhile, when publications
were assigned affiliations in any of the databases, the same affiliation was usually assigned
in Scopus, and especially in WoS. These two databases are smaller, more selective, E,
crucially, frequently engage with institutions to improve the unification of affiliation vari-
ants. Our WoS results might be more favorable than the real situation because our version
excludes book chapters and many geographically diverse journals, often from university
presses that might less rigorously assign author affiliations.
Our results revealed different reasons behind the discrepancies. These included problems
of unifying address variants to the main institution, publications with missing author affilia-
tions, and cases of records clearly being assigned to the wrong institution. One common
source of difficulty was the naming of institutions such as the College of Information Technol-
ogy, which is also the name of a subunit of many universities around the world. Another
arises from cities of the same name in different countries, such as Tripoli, which exists in both
Libya and Lebanon.
In our sample, we found that discrepancies in WoS were most frequently due to problems
with unifying variants and in some cases, confusion clearly led to assigning records to the
wrong institution. Discrepancies among Dimensions records were more often due to missing
affiliations, but there were also some issues with unification. In Scopus and Microsoft
Academic there was no clear pattern and causes of discrepancy were mixed.
The Scopus affiliation identifier (AFID) and the WoS organization enhanced feature each
depend on engagement with the institutions to link affiliation variants to the main organi-
zation. Where Scopus has assigned multiple AFIDs to subunits of universities, such as the
31 AFIDs assigned to subunits of the University of Carthage, there is more chance of dis-
crepancy with other databases. When comparing institutions, bibliometricians should resist
the assumption that the disambiguation process has been performed to the same level for all
institutions in the analysis. Our results show that this is not always the case and that some
comparisons will produce misleading results.
Manual examination of individual records revealed examples of publication records where
databases have made different choices about how to unify author affiliations. The correct
answer is often not clear and using one or another database will incorporate the impact of
the choices of database owners into the results of any bibliometric analysis or benchmarking
exercise based upon them.
University rankings providers usually rely on the disambiguation used in the source data-
bases. The major exception is the Leiden Ranking, which disambiguates all affiliations from its
proprietary database (Calero-Medina et al., 2020), while both QS (QSIU, 2019) and Times
Higher Education have begun supplementary work on Scopus unification in special cases. This
practice is welcomed and providers of products that derive from bibliometric data sources
should be encouraged to increasingly participate in the analytical process and assume a share
of responsibility for the accuracy of the resulting publication.
These results support the conclusions of Huang et al. (2020), who encourage university
ranking publishers to employ multiple bibliographic data sources. While Visser et al. (2021)
discuss the merits of more selective databases for university rankings, the results in this paper
show that the current state of author affiliation disambiguation in Scopus and WoS still poses a
significant limitation to accuracy.
A number of limiting factors should be considered when discussing the results presented
here. This study only used publications assigned a DOI in the comparisons between databases.
As shown in Table 3, that excludes up to a fifth of the records, depending on the database
used, which might also contain affiliation discrepancies and influence the results. Inoltre,
as discussed in Section 2.1, there are potential errors in assigning DOIs, including duplicate
DOIs for the same publication, multiple publications or versions of one publication sharing the
same DOI, and errors within the DOI itself that prevent it from linking to its assigned
publication. The DOI is the most appropriate identifier we found for use in this study, but we
acknowledge its limitations.
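The following sketch illustrates two of these checks, DOIs shared by several records and DOI strings too malformed to resolve; it is illustrative only and uses a basic syntactic test rather than resolution against the DOI registry:

```python
import re
from collections import Counter

# Illustrative checks, not the study's code: flag DOIs attached to more than one record
# and DOI strings that fail a basic syntax test.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

def doi_quality_report(records: list[dict]) -> dict[str, list[str]]:
    """Report duplicate DOIs and syntactically malformed DOI strings in a record set."""
    cleaned = [(record.get("doi") or "").strip().lower() for record in records]
    usable = [doi for doi in cleaned if DOI_PATTERN.match(doi)]
    duplicates = [doi for doi, count in Counter(usable).items() if count > 1]
    malformed = [raw for raw in cleaned if raw and not DOI_PATTERN.match(raw)]
    return {"duplicate_dois": duplicates, "malformed_dois": malformed}
```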
The universities selected for use in the study were all from a specific geographical region
and a broader comparison including institutions from other regions of the world would show
whether our findings generalize. Some databases have engaged
with the institutions in the study to varying degrees, as evidenced by the range in the number
of affiliation variants listed among the Scopus AFIDs and WoS organization enhanced names.
Engagement with universities clearly improves a database's ability to identify and assign
publications accurately, and this uneven engagement therefore introduces a variable into our results.
The examination of types of affiliation discrepancy summarized in Table 5 relied on manual
work and introduced an element of human judgment when comparing the published PDF doc-
ument with its corresponding record in the bibliographic databases. Questo, combined with the
limited sample size for each comparison, requires the reader to interpret the results as illustra-
tions of the type of discrepancy and as an indication of the size of the problem. However, we do
not interpret these data as statistically representative of the full, global databases.
Another considerable limitation to the present and most previous studies on affiliation dis-
ambiguation is the fact that university names change over time. Changes result from a number
of factors, including mergers and splits, but also government naming conventions, changes of
country leaders, city names, and other factors. Once an organizational subunit is unified to an
affiliation, all its prior papers will be found when searching the new unified affiliation. Studies
are therefore a snapshot of the current unification and do not take
account of unification dynamics over time.
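A unification scheme that keeps this history explicit could, for example, attach validity periods to each subunit–parent relationship. The toy sketch below uses the 2017 merger into Khalifa University discussed in Section 4.4.4; the earlier start date is an illustrative placeholder:

```python
from dataclasses import dataclass
from datetime import date

# Toy sketch of unification that preserves organizational history: a subunit is credited
# to different parents depending on the publication date.
@dataclass(frozen=True)
class ParentSpell:
    parent: str
    start: date
    end: date | None  # None means the relationship is still current

HISTORY: dict[str, list[ParentSpell]] = {
    "Masdar Institute": [
        ParentSpell("Masdar Institute", date(2007, 1, 1), date(2017, 1, 1)),   # illustrative start date
        ParentSpell("Khalifa University", date(2017, 1, 1), None),             # merger discussed above
    ],
}

def unify(subunit: str, published: date) -> str:
    """Return the institution a subunit should be credited to for a given publication date."""
    for spell in HISTORY.get(subunit, []):
        if spell.start <= published and (spell.end is None or published < spell.end):
            return spell.parent
    return subunit  # no recorded history: credit the name as given

print(unify("Masdar Institute", date(2015, 6, 1)))  # Masdar Institute
print(unify("Masdar Institute", date(2019, 6, 1)))  # Khalifa University
```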
There is a need for a universal unique identifier for academic institutions that should reflect
the current and historical organizational relationship tree. The ideal identifier will be supported
by input from the institutions themselves in the same way that researchers maintain their own
ORCID records. That way, the accuracy, maintenance, and historical record will be maxi-
mized. There will be scope for nonmaintenance or misuse, especially where an institution
can benefit from a certain interpretation of its organization, but these will be outweighed by
the benefits. Furthermore, in the case of an open infrastructure, any misuse will be publicly
visible, which will act as a disincentive. Universities and their stakeholders should still decide
their own names and they are still the most appropriate managers of the public record of their
relationships with subunits and external entities.
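By way of illustration, the sketch below queries the Research Organization Registry (ROR), one existing open institution identifier system, for candidate matches to an institution name. It assumes the search endpoint and response fields (an "items" list with "id" and "name") available at the time of writing and should be checked against the current ROR API documentation before use:

```python
import json
import urllib.parse
import urllib.request

# Hedged sketch of the kind of lookup an open institution identifier enables; endpoint and
# response shape are assumptions to verify against the current ROR documentation.
def ror_candidates(institution_name: str) -> list[tuple[str, str]]:
    """Return (ROR id, primary name) pairs matching an institution name."""
    url = "https://api.ror.org/organizations?query=" + urllib.parse.quote(institution_name)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return [(item["id"], item["name"]) for item in payload.get("items", [])]
```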
This study has demonstrated the scope for improvement in four bibliographic databases and
highlighted many problems faced by those attempting the task of disambiguation. Many of
those cases can potentially be resolved by incorporating a global, universally accepted iden-
tifier for use by the worldwide research community and supported with input from universities.
ACKNOWLEDGMENTS
The author thanks Ton van Raan and Ludo Waltman for expert guidance throughout the study,
Martijn Visser for helping improve the data accuracy, Christian Herzog (Digital Science) E
Martin Szomszor (Clarivate) for their useful comments on earlier versions of the article, E
two anonymous reviewers whose suggestions helped improve the manuscript.
COMPETING INTERESTS
The author was previously affiliated with Thomson Reuters, the former owner of WoS, which is
now owned by Clarivate.
FUNDING INFORMATION
No funding was sought or received for this study.
DATA AVAILABILITY
The Scopus and Dimensions data used in this paper have been made freely available to
CWTS for research purposes. The WoS data have been made available to CWTS under a paid
licenza. For Microsoft Academic, we made use of an openly available data dump. We are
not allowed to redistribute the Scopus, WoS, and Dimensions data used in this paper. IL
statistics presented in the figures in this paper are made available in the accompanying
supplementary material.
REFERENCES
Armitage, C., & Kaindl, M. (2020). Getting with the program.
Nature Index, Dicembre 9. Retrieved from https://www.nature
.com/articles/d41586-020-03416-9
Baas, J., Schotten, M., Plume, A., Côté, G., & Karimi, R. (2020).
Scopus as a curated, high-quality bibliometric data source for
academic research in quantitative science studies. Quantitative
Science Studies, 1(1), 377–386. https://doi.org/10.1162/qss_a
_00019
Bador, P., & Lafouge, T. (2005). Rédaction des adresses sur les pub-
lications: Un manque de rigueur défavorable aux universités fran-
çaises dans les classements internationaux. La Presse Médicale,
34(9), 633–636. https://doi.org/10.1016/S0755-4982(05)84000-X,
PubMed: 15988335
Baudoin, L., Akiki, V., Magnan, A., & Devos, P. (2018). Production
scientifique des CHU-CHR en 2006–2015: Évolutions et
positionnement national. La Presse Médicale, 47(11, Part 1),
e175–e186. https://doi.org/10.1016/j.lpm.2018.06.016,
PubMed: 30389213
Boudry, C., & Chartron, G. (2017). Availability of digital object
identifiers in publications archived by PubMed. Scientomet-
rics, 110(3), 1453–1469. https://doi.org/10.1007/s11192-016
-2225-6
Brankovic, J., Ringel, L., & Werron, T. (2018). How rankings pro-
duce competition: The case of global university rankings. Zeits-
chrift Fur Soziologie, 47(4), 270–288. https://doi.org/10.1515
/zfsoz-2018-0118
Calero-Medina, C., Noyons, E., Visser, M., & De Bruin, R.
(2020). Delineating organizations at CWTS—A story of many
pathways. In C. Daraio & W. Glänzel (Eds.), Evaluative informetrics:
The art of metrics-based research assessment. Festschrift in honour
of Henk F. Moed (pp. 163–177). https://doi.org/10.1007/978-3-030
-47665-6_7
Centre for Science & Technology Studies Leiden University. (2020).
Indicators. Retrieved August 11, 2020, from https://www
.leidenranking.com/information/indicators
Clarivate. (2020a). Data change FAQs. Retrieved November 10, 2020, from https://support.clarivate.com/ScientificandAcademicResearch/s/datachanges?language=en_US
Clarivate. (2020B). Web of Science journal evaluation process and
selection criteria—Web of Science Group. Retrieved September
5, 2020, from https://clarivate.com/webofsciencegroup/journal
-evaluation-process-and-selection-criteria/
DeBruin, R. E., Braam, R. R., & Moed, H. F. (1991). Bibliometric lines
in the sand. Nature, 349(6310), 559–562. https://doi.org/10.1038
/349559a0
Donner, P., Rimmert, C., & van Eck, N. J. (2020). Comparing
institutional-level bibliometric research performance indicator
values based on different affiliation disambiguation systems.
Quantitative Science Studies, 1(1), 150–170. https://doi.org/10
.1162/qss_a_00013
Ducut, E., Liu, F., & Fontelo, P. (2008). An update on Uniform
Resource Locator (URL) decay in MEDLINE abstracts and mea-
sures for its mitigation. BMC Medical Informatics and Decision
Making, 8. https://doi.org/10.1186/1472-6947-8-23, PubMed:
18547428
Espeland, W. N., & Sauder, M. (2016). Engines of anxiety: Aca-
demic rankings, reputation, and accountability. Retrieved from
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85011865409&partnerID=40&md5=f6d9c54cf733b6b09f082abcdfe745e1
Falahati Qadimi Fumani, M. R., Goltaji, M., & Parto, P. (2013).
Inconsistent transliteration of Iranian university names: A hazard
to Iran’s ranking in ISI Web of Science. Scientometrics, 95(1),
371–384. https://doi.org/10.1007/s11192-012-0818-2
Franceschini, F., Maisano, D., & Mastrogiacomo, L. (2015). Errors
in DOI indexing by bibliometric databases. Scientometrics,
102(3), 2181–2186. https://doi.org/10.1007/s11192-014-1503-4
Gasparyan, A. Y., Yessirkepov, M., Voronov, A. A., Maksaev, A. A.,
& Kitas, G. D. (2021). Article-level metrics. Journal of Korean
Medical Science, 36(11), e74. https://doi.org/10.3346/jkms
.2021.36.e74, PubMed: 33754507
Gorraiz, J., Melero-Fuentes, D., Gumpenberger, C., & Valderrama-
Zurián, J. C. (2016). Availability of digital object identifiers (DOIs)
in Web of Science and Scopus. Journal of Informetrics, 10(1),
98–109. https://doi.org/10.1016/j.joi.2015.11.008
Guerrero-Bote, V. P., Chinchilla-Rodríguez, Z., Mendoza, A., &
de Moya-Anegón, F. (2021). Comparative analysis of the biblio-
graphic data sources dimensions and Scopus: An approach at the
country and institutional levels. Frontiers in Research Metrics and
Analytics, 5. https://doi.org/10.3389/frma.2020.593494,
PubMed: 33870055
Huang, C.-K., Neylon, C., Brookes-Kenworthy, C., Hosking, R.,
Montgomery, L., … Ozaygen, A. (2020). Comparison of biblio-
graphic data sources: Implications for the robustness of university
rankings. Quantitative Science Studies, 1(2), 445–478. https://doi
.org/10.1162/qss_a_00031
Huang, M., & Liu, W. (2019). Substantial numbers of easily identifiable illegal DOIs still exist in Scopus. Journal of Informetrics, 13(3), 901–903. https://doi.org/10.1016/j.joi.2019.03.019
Ioannidis, J. P. A., Boyack, K. W., & Baas, J. (2020). Updated
science-wide author databases of standardized citation
indicators. PLOS Biology, 18(10), 1–3. https://doi.org/10.1371
/journal.pbio.3000918, PubMed: 33064726
Khedmatgozar, H., & Alipour Hafezi, M. (2015). A basic compara-
tive framework for evaluation of digital identifier systems. Journal
of Digital Information Management, 13, 190–197.
Konur, O. (2013). The scientometric evaluation of the institutional
research: The Inner Anatolian Universities—Part 3. Energy Edu-
cation Science and Technology Part B: Social and Educational
Studies, 5(2), 251–266.
Lammey, R. (2020). Solutions for identification problems: A look at
the Research Organization Registry. Science Editing, 7(1), 65–69.
https://doi.org/10.6087/kcse.192
Liu, W., Hu, G., & Tang, L. (2018). Missing author address information in Web of Science—An explorative study. Journal of Informetrics, 12(3), 985–997. https://doi.org/10.1016/j.joi.2018.07.008
Mugnaini, R., Fraumann, G., Tuesta, E. F., & Packer, A. L. (2021).
Openness trends in Brazilian citation data: Factors related to the
use of DOIs. Scientometrics, 126(3), 2523–2556. https://doi.org
/10.1007/s11192-020-03663-7
Orduna-Malea, E., Aytac, S., & Tran, C. Y. (2019). Universities
through the eyes of bibliographic databases: A retroactive
growth comparison of Google Scholar, Scopus and Web of Sci-
ence. Scientometrics, 121(1), 433–450. https://doi.org/10.1007
/s11192-019-03208-7
QS Intelligence Unit. (2019). QS World University Rankings.
Retrieved August 11, 2020, from https://www.iu.qs.com
/university-rankings/world-university-rankings
QSIU. (2019). Papers & citations. Retrieved December 10, 2020,
from QS Intelligence Unit website: https://www.iu.qs.com
/university-rankings/indicator-papers-citations/
Rubim, I. C., & Braganholo, V. (2017). Detecting referential incon-
sistencies in electronic CV data sets. Journal of the Brazilian
Computer Society, 23, 3. https://doi.org/10.1186/s13173-017
-0052-0
Shanghai Ranking Consultancy.
(2019). Academic ranking of
world universities methodology. Retrieved August 11, 2020,
from https://www.shanghairanking.com/ARWU-Methodology
-2019.html
Times Higher Education. (2019). THE World University Rankings
2020: Methodology. Retrieved January 14, 2021 from https://
www.timeshighereducation.com/world-university-rankings
/world-university-rankings-2020-methodology
U.S. News & World Report LP. (2019). How U.S. News calculated
the best global universities rankings. Retrieved August 11, 2020,
from https://www.usnews.com/education/best-global-universities
/articles/methodology
Valderrama-Zurián, J.-C., Aguilar-Moya, R., Melero-Fuentes, D., &
Aleixandre-Benavent, R. (2015). A systematic analysis of dupli-
cate records in Scopus. Journal of Informetrics, 9(3), 570–576.
https://doi.org/10.1016/j.joi.2015.05.002
Van Raan, A. F. J. (2005). Fatal attraction: Conceptual and methodological problems in the ranking of universities by bibliometric methods. Scientometrics, 62(1), 133–143. https://doi.org/10
.1007/s11192-005-0008-6
Visser, M., van Eck, N. J., & Waltman, L. (2021). Large-scale comparison of bibliographic data sources: Scopus, Web of Science, Dimensions, Crossref, and Microsoft Academic. Quantitative Science Studies, 2(1), 20–41. https://doi.org/10.1162/qss_a_00112
Waltman, L., Calero-Medina, C., Kosten, J., Noyons, E. C. M., Tijssen, R. J. W., … Wouters, P. (2012). The Leiden ranking 2011/2012: Data collection, indicators, and interpretation.
Journal of the American Society for Information Science and
Technology, 63(12), 2419–2432. https://doi.org/10.1002/asi
.22708
Xu, S., Hao, L., An, X., Zhai, D., & Pang, H. (2019). Types of DOI
errors of cited references in Web of Science with a cleaning
method. Scientometrics, 120(3), 1427–1437. https://doi.org/10
.1007/s11192-019-03162-4
Zahedi, Z., Costas, R., & Wouters, P. (2017). Mendeley readership
as a filtering tool to identify highly cited publications. Journal of
the Association for Information Science and Technology, 68(10),
2511–2521. https://doi.org/10.1002/asi.23883
Zhu, J., Hu, G., & Liu, W. (2019). DOI errors and possible solutions
for Web of Science. Scientometrics, 118(2), 709–718. https://doi
.org/10.1007/s11192-018-2980-7