RESEARCH PAPER
An Analysis of Crosswalks from Research Data
Schemas to Schema.org
Mingfang Wu1†, Stephen M. Richard2, Chantelle Verhey3, Leyla Jael Castro4,
Baptiste Cecconi5, Nick Juty6
1Australian Research Data Commons, Melbourne, Victoria 3145, Australia
2US Geoscience Information Network, Newark DE 19716-7501, USA
3International Science Council, World Data System, Victoria BC V8N 1V8, Canada
4ZB MED – Information Centre for Life Sciences, Cologne 50931, Germany
5Observatoire de Paris-PSL, Paris Astronomical Data Center, Paris 75001, France
6Department of Computer Science, The University of Manchester, Oxford Road, Manchester M13 9PL, UK
Keywords: metadata schema; Schema.org; metadata interoperability; FAIR (meta)data; metadata schemas
crosswalk; research data schemas
Citation: Wu, M.F., Richard, S.M., Verhey, C., et al.: An analysis of crosswalks from research data schemas to Schema.org. Data
Intelligence 5(1), 100-121 (2023). doi:10.1162/dint_a_00186
Received: November 12, 2021; Revised: April 7, 2022; Accepted: May 7, 2022
ABSTRACT
The growing number of data repositories has greatly increased the availability of open data. To enable
broad discovery of and access to research datasets, some data repositories have begun leveraging the web
architecture by embedding structured metadata markup in dataset landing pages using vocabularies
from Schema.org and its extensions. This paper examines metadata interoperability in support of global
data discovery. Specifically, the paper reports a survey on which metadata schemas have been adopted by
participating data repositories, and presents an analysis of crosswalks from fourteen research data schemas
to Schema.org. The analysis indicates that most descriptive metadata are interoperable among the schemas, the
most inconsistent mapping is the rights metadata, and a large gap exists in the structural metadata and
controlled vocabularies used to specify various property values. The analysis and collated crosswalks can serve as
a reference for data repositories when they develop crosswalks from their own schemas to Schema.org, and
provide the research data community with a benchmark of structured metadata implementation.
† Corresponding author: Mingfang Wu (Email: Mingfang.Wu@ardc.edu.au; ORCID: 0000-0003-1206-3431).
© 2022 Chinese Academy of Sciences. Published under a Creative Commons Attribution 4.0 International (CC BY 4.0)
license.
1. INTRODUCTION
In recent years, it has become increasingly common to share research data together with its
corresponding description through metadata, thanks to initiatives such as Open Science and the FAIR
(Findable, Accessible, Interoperable and Reusable) data principles [26]. To make data publicly accessible,
researchers and data collectors deposit their datasets into a data repository and provide metadata that
conforms to the repository’s metadata schema①; data repositories or metadata aggregators provide data
discovery capabilities to make datasets discoverable through indexed metadata. As the number of datasets
managed in data repositories grows, challenges arise, including exchanging metadata, discovering relevant
datasets, and supporting (semi)automatic metadata processing [29].
Data repositories typically host metadata, embed the metadata in a web page and publish the web page on
the Web to make the dataset discoverable; such a web page, as shown in Figure 1a, is referred to as a
metadata landing page. Like any other web page, a landing page is encoded with HTML tags and optimised
for human readability. Before the recent explosion in commercial web index and search technology,
repositories also offered access to structured, machine-readable metadata for their holdings using various
metadata content and serialization schemes such as Dublin Core XML, Ecological Metadata Language (EML),
the U.S. Content Standard for Digital Geospatial Metadata (CSDGM), ISO 19115/19139, etc. This
metadata was accessed through a standard API such as the Open Archives Initiative Protocol for Metadata Harvesting
(OAI-PMH) or the Open Geospatial Consortium Catalogue Service for the Web (OGC-CSW).
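To make the API-based access path concrete, the following minimal sketch builds an OAI-PMH `ListRecords` request URL that asks for Dublin Core (`oai_dc`) records. The endpoint `https://repository.example.org/oai` and the helper name `list_records_url` are assumptions made for illustration only; a real repository publishes its own base URL, and no request is actually sent here.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL for the given metadata format."""
    # 'verb' and 'metadataPrefix' are standard OAI-PMH request arguments;
    # 'oai_dc' is the Dublin Core format every OAI-PMH repository must support.
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


url = list_records_url(BASE_URL)
print(url)
```

A harvester would fetch this URL and parse the returned XML; repositories that also expose other formats (for example ISO 19139) would be queried with a different `metadataPrefix`.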
Around 2004, developers started introducing semantic markup in HTML documents to add information
about the web page subject and content to improve the display of search results, making it easier for people
to find the right web pages. In 2011, a consortium of search engines including Bing, Google, Yahoo! and
Yandex began developing a vocabulary of entities and properties that could be used in this semantic
markup to make it interoperable across browser systems [11]. The Schema.org vocabulary is the outcome
of this effort, with version 1 released in 2013. This initial release included an entity for describing datasets
(https://www.w3.org/wiki/WebSchemas/Datasets), which was significantly revised in 2016
(https://github.com/schemaorg/schemaorg/pull/1247).
This approach of publishing machine-readable metadata, i.e., structured metadata as shown in Figure 1b,
brings new opportunities for making research data FAIRer. For example, the use of these common vocabularies
makes it easier for commercial web search engines like Google Dataset Search②, or any metadata aggregator,
to crawl and index metadata across data repositories globally in a more useful, consistent and robust way.
The interoperability of metadata sharing the same schema allows metadata from different sources to be
harvested and indexed without any intermediate mapping between schemas. In addition, it makes it easier
to create federated queries across resources from different sources relevant to a research need. Metadata
aggregators are exploring new methods for metadata syndication via the web architecture. The NSF
① We use the term ‘schema’ instead of ‘scheme’ throughout the paper, as this study focuses on the semantic meaning of data
properties.
② https://datasetsearch.research.google.com/
Data Intelligence
EarthCube GeoCODES platform③ is indexing Schema.org metadata in landing pages from 12 US NSF data
facilities. DataCite has already offered to crawl metadata through its embedded web page [10], and DataONE④
and ARDC’s catalogue service Research Data Australia⑤ are planning to offer a similar service.
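As a rough illustration of what "structured metadata embedded in a landing page" means in practice, the sketch below serialises a small Schema.org `Dataset` description as JSON-LD and wraps it in the `<script type="application/ld+json">` element that crawlers look for. The record values, the URLs and the helper names `dataset_jsonld` and `embed_in_landing_page` are invented for this example; they are not taken from any repository discussed in the paper.

```python
import json


def dataset_jsonld(name: str, description: str, url: str, identifier: str) -> dict:
    """Describe a dataset with a handful of core Schema.org properties."""
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
    }


def embed_in_landing_page(metadata: dict) -> str:
    """Wrap the JSON-LD in the script element embedded in the page source."""
    # Note: a production implementation should also guard against the
    # sequence "</script>" appearing inside metadata values.
    payload = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Illustrative record only; not a real dataset or DOI.
record = dataset_jsonld(
    name="Example soil moisture observations",
    description="Illustrative record only.",
    url="https://repository.example.org/dataset/123",
    identifier="https://doi.org/10.0000/example",
)
snippet = embed_in_landing_page(record)
print(snippet)
```

The resulting snippet would sit in the HTML source of the landing page alongside the human-readable view, which is the arrangement contrasted in Figure 1.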
However, these opportunities also come with new challenges. Schema.org provides a domain-agnostic
vocabulary to describe common data entities. By design, Schema.org expects and has enabled domains of
practice to extend this core vocabulary [11]. Like other domains of practice, research data communities
have their own needs for extending the Schema.org core to describe research data and its relationships to other
resources. These extensions include, for example, specific data types and their corresponding properties
pertaining to a particular domain, as well as support for persistent identifiers to meet the needs of a specific
community: for example, bioschemas.org [12] for life sciences, science-on-schema.org for earth and
environmental sciences [14] and CodeMeta⑥ for research software.
Figure 1. a) Left: an example of a metadata landing page, where metadata is published and embedded in a web page for
human users to read; b) Right: some of the metadata shown in the left HTML page is marked up and embedded in the
source HTML for machines to access and parse.
To investigate the interoperability and usability of Schema.org for describing research data, we collected 14
crosswalks from research data schemas to Schema.org [28]; developing such a crosswalk is a crucial step for repositories
to publish structured metadata [27]. A schema crosswalk is commonly expressed as a table showing
③ https://geocodes.earthcube.org/
④ https://www.dataone.org/
⑤ https://researchdata.edu.au/
⑥ https://codemeta.github.io/
equivalent terms across one or more data schemas. To source research data schemas, we used a survey
asking participating data repositories to share any crosswalk they had, as well as gaps and challenges that
they identified while creating the crosswalk. For schema providers, we used openly published crosswalks
available on the Internet; in particular, we found crosswalks from DCAT, Dublin Core and
ISO19115 to Schema.org⑦. This collection of crosswalks helps us to identify and bridge the gaps that research
data communities encountered when they mapped their metadata schemas to Schema.org.
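A crosswalk table of this kind can be held in code as a list of (source term, target property) rows. The sketch below is a toy example under stated assumptions: the four rows are illustrative pairings in the spirit of a DCAT to Schema.org alignment, not an excerpt from the crosswalks collected for this paper, and the helper names are hypothetical. It also shows how a gap (a source term with no mapped target) can be detected.

```python
# Each row pairs a source-schema term with a Schema.org property.
# Illustrative rows only; not the collated crosswalk from the survey.
CROSSWALK_ROWS = [
    ("dct:title", "schema:name"),
    ("dct:description", "schema:description"),
    ("dcat:keyword", "schema:keywords"),
    ("dct:publisher", "schema:publisher"),
]


def to_lookup(rows):
    """Turn the crosswalk table into a source-term -> target-property lookup."""
    return {source: target for source, target in rows}


def unmapped_terms(source_terms, lookup):
    """Return the source terms that have no equivalent in the target schema."""
    return [term for term in source_terms if term not in lookup]


lookup = to_lookup(CROSSWALK_ROWS)
gaps = unmapped_terms(["dct:title", "dct:accrualPeriodicity"], lookup)
print(lookup["dct:title"], gaps)
```

Keeping the crosswalk as plain rows mirrors the tabular form in which repositories typically share it, while the derived lookup is what a conversion script would actually use.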
This paper reports on the survey and presents an analysis of the crosswalks. The remaining sections are
organised as follows: we review the types of metadata schemas for research data in Section 2, present the
analysis of the survey and crosswalks in Section 3 and conclude the paper with a discussion of findings in
Section 4.
2. METADATA SCHEMAS FOR RESEARCH DATA
2.1 General and Discipline-specific Metadata
There are many metadata standards for documenting research datasets; Wallis et al. [25] analysed 9
metadata schemas for describing scientific data and synthesised 22 metadata-related goals. In general, a
metadata schema should address the seven requirements for metadata schemas of all resources (abstraction,
extensibility, flexibility, modularity, comprehensiveness, sufficiency, and simplicity) and four requirements
for any schema to support data interchange, retrieval, archiving and publication.
The metadata directory implemented by the RDA Metadata Standard Directory Working Group includes
about 65 standards⑧, ranging from general to extremely discipline specific [1]. General metadata schemas,
for example, Data Catalogue Vocabulary (DCAT) and Dublin Core, include data properties that are common
to almost all types of dataset. This general metadata can be widely adopted and easily used by metadata
providers, and supports broad data discovery use cases from data seekers, regardless of their research areas.
Discipline specific metadata, for example, the Data Documentation Initiative (DDI for Social and
Behavioral Science data) and the Space Physics Archive Search and Extract (SPASE for heliophysics data),
usually include properties from general metadata standards, and provide additional properties and richer
vocabularies to allow detailed and more granular contextual information. This enriched information
increases data discovery efficiency and effectiveness for those with domain knowledge, and assists the
assessment of data reusability.
It is common practice for data repositories to publish metadata for their holdings, allowing it to be
harvested by aggregating metadata catalogs that offer indexing and user interfaces to support data search.
Such aggregation typically involves a mapping or crosswalk between metadata schemes or profiles used
by the various contributing repositories if there isn’t a schema agreed by all repositories for exchanging
⑦ ISO19115 - DCAT - Schema.org mapping: https://www.w3.org/2015/spatial/wiki/ISO_19115_-_DCAT_-_Schema.org_mapping
⑧ http://rd-alliance.github.io/metadata-directory/standards/
metadata. This landscape may change due to major search engines starting to harvest structured metadata
using the standardized, schema.org vocabulary embedded in metadata landing pages that can be parsed
and interpreted by machine, to provide more accurate results and richer presentation of results [4].
2.2 Schema.org Vocabulary and Structured Metadata
Schema.org is among the most visible metadata vocabularies on the open Web, according to NISO [19].
The driving factor in the design of Schema.org was to make it easy for webmasters to publish information
with a single schema for a wide range of topics that included people, places, events, products and so
on [11]. Schema.org is a general schema or a set of vocabularies; the current version (V13.0, 2021-07-07)
consists of about 792 types (as RDF classes) and 1447 properties. The W3C Schema.org Community Group,
which is governed by a steering group⑨, is the main forum for collaboration on and development of the schema.
New types and properties can be added if there is a community need and a supporting use case; for example,
the new type ‘LearningResource’ was added as a subtype of ‘CreativeWork’ in the July 2020 release (9.0)⑩ ⑪.
As another example, Bioschemas⑫, focusing on the life sciences, has successfully incorporated many biomedical
terms into the Schema.org vocabulary. The CodeMeta project⑬ has developed the CodeMeta vocabulary for
the description of software; 58 out of 68 CodeMeta properties are from the existing Schema.org vocabulary, and the 10
proposed new properties are based on the analysis of crosswalks from 23 software metadata schemas, vocabularies
and ontologies to Schema.org. There are also a steering group and communities that support developing
conventions for usage of the data model and guidelines for consistently implementing the data model. For
example, the Schema.org Cluster of the Earth Science Information Partners (ESIP) works to develop best
practices and to provide education and outreach to the Earth science community for web accessible
structured data⑭ [14], and the Ocean InfoHub Project⑮ provides an architecture solution for providing a
Schema.org based interoperability layer and supporting technology to allow existing and emerging ocean data and
information systems to interoperate with one another.
In order to make data widely discoverable, many research data repositories have started to implement
structured metadata markup in their metadata landing pages. As of March 26, 2020, Google Dataset Search
had indexed 31M datasets from 4,600 domains, where the top 10 domains include data.gov, figshare.com and
datacite.org. Geosciences and social sciences together accounted for 45% of the datasets, followed by
biology (~15%) and other research topics [21]. Search results include those from NASA, NOAA, and many
research repositories such as Harvard’s Dataverse repository [20]. This approach allows for broader
dissemination of metadata throughout the community to promote discoverability of datasets.
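From the consuming side, an aggregator or crawler has to pull the structured markup back out of the landing page. The following is a minimal sketch of that step using only the Python standard library; the sample page in `PAGE` and the class name `JsonLdExtractor` are made up for illustration, and real crawlers handle many more cases (Microdata, RDFa, multiple or malformed blocks).

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect every JSON-LD block found in script elements of a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


# A made-up landing page: human-readable HTML plus embedded JSON-LD.
PAGE = """<html><head><title>Example dataset</title>
<script type="application/ld+json">
{"@context": "https://schema.org/", "@type": "Dataset", "name": "Example dataset"}
</script></head><body><h1>Example dataset</h1></body></html>"""

parser = JsonLdExtractor()
parser.feed(PAGE)
datasets = [block for block in parser.blocks if block.get("@type") == "Dataset"]
print(datasets)
```

Because every repository uses the same vocabulary, the same extraction code works across landing pages from different domains, which is the property that makes web-scale dataset indexing feasible.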
⑨ https://schema.org/docs/about.html
⑩ Schema.org Releases: https://schema.org/docs/releases.html
⑪ Learning Resource Metadata is go for Schema: https://blogs.pjjk.net/phil/lrmi-in-schema/
⑫ https://bioschemas.org
⑬ https://codemeta.github.io/
⑭ https://github.com/ESIPFed/science-on-schema.org/blob/master/guides/Dataset.md
⑮ The Ocean InfoHub Project: https://book.oceaninfohub.org/index.html
2.3 Metadata Interoperability
The ‘I’ in ‘FAIR’ represents “interoperable” and is one of the four FAIR data principles [26], which apply
to both data and metadata. According to this principle, metadata should use community agreed standards
and vocabularies, and contain links to related information using persistent identifiers. Because there exist
a number of community agreed metadata schemas for meeting specific community needs, mapping between
schemas is necessary to make it possible for repositories to exchange and share metadata records [24].
There are different types of metadata interoperability; for example, Nilsson et al. [18] proposed four
interoperability levels for Dublin Core metadata. For a data repository implementing interoperable
metadata, we adopt the three levels of metadata interoperability proposed by Chan and Zeng [6]:
• Schema level: efforts are focused on the elements of the schemas; common results may include
crosswalks, application profiles, derived element sets, etc.;
• Record level: efforts are intended to integrate the metadata records through the crosswalk of elements;
common results include converted records and new records resulting from combined values of existing
records; and
• Repository level: efforts are focused on mapping values associated with particular elements; the results
enable cross-collection searching.
We focus our analysis of crosswalks on the schema level: the elements of the schemas, independent
of any application. In particular, we use crosswalks to analyse the interoperability among the studied
schemas. A crosswalk (or a mapping) is a chart or table (visual or virtual) that represents the semantic or
technical mappings of data elements from one schema (source schema) to data elements in another schema
(target schema) that has a similar function or meaning. The crosswalks guide record level interoperability,
which enables repository level interoperability so that heterogeneous repositories can be searched
simultaneously with a single query as if there were a single repository [2].
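The chain from schema level to record level to repository level can be sketched in a few lines. In the example below, two hypothetical source schemas (`schema_a`, `schema_b`) each have a crosswalk to a shared set of target properties; records are converted through the crosswalks and then searched together with one query. All schema names, field names and records are invented for illustration, and the keyword match is a deliberately naive stand-in for a real search index.

```python
# Schema level: one crosswalk per (hypothetical) source schema.
CROSSWALKS = {
    "schema_a": {"title": "name", "abstract": "description"},
    "schema_b": {"dc:title": "name", "dc:description": "description"},
}


def convert_record(record: dict, crosswalk: dict) -> dict:
    """Record level: rename source elements to their target properties."""
    return {crosswalk[key]: value for key, value in record.items() if key in crosswalk}


def search(records: list, keyword: str) -> list:
    """Repository level: one query over records from all repositories."""
    keyword = keyword.lower()
    return [r for r in records if any(keyword in str(v).lower() for v in r.values())]


repo_a = [{"title": "Sea ice extent 2019", "abstract": "Arctic observations"}]
repo_b = [
    {"dc:title": "Household survey", "dc:description": "Social science data"},
    {"dc:title": "Arctic permafrost cores", "dc:description": "Borehole data"},
]

combined = [convert_record(r, CROSSWALKS["schema_a"]) for r in repo_a] + [
    convert_record(r, CROSSWALKS["schema_b"]) for r in repo_b
]
hits = search(combined, "arctic")
print([h["name"] for h in hits])
```

The point of the sketch is the dependency order: without the schema-level crosswalks the converted records would not share property names, and the single query over `combined` would not be possible.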
3. ANALYSIS OF MAPPINGS FROM RESEARCH DATA SCHEMAS TO SCHEMA.ORG
As discussed above, a crosswalk attempts to map equivalent or comparable metadata elements from two
schemas. We acknowledge that a crosswalk developed by a specific repository or a schema development
community would better reflect a proper and realistic mapping, as those repositories and communities can
provide a better interpretation of their implemented metadata terms. For this reason, we launched the survey
“Current practices in using schemas to describe research datasets”⑯ on 27 February 2019 to gather information
on how Schema.org is applied by data repositories to describe research data and related resources. We
envisaged that the gathered information would help repositories and the proposed Research Metadata Schema
WG understand current practices and identify commonalities, gaps and barriers in using schemas for describing
and discovering research datasets.
⑯ Survey on current practices in using schemas describing datasets: https://docs.google.com/spreadsheets/d/19cuspUioXp1QgxGFph6tjjvNB6JHSOzIFxh8UCc3aVM
In Section 3.1, we highlight relevant parts of the survey and indicate which schemas are adopted or
implemented by participating respondents, followed by our analysis of crosswalks from the available
mappings to Schema.org.
3.1 Survey on Repository’s Metadata Schema and the Implementation of Schema.org
Twenty-two organisations/data repository representatives participated in the survey. One respondent
failed to answer the survey questions, so that submission has been excluded from this summary. As shown
in Table 1, six of the 21 responses are from general repositories covering all domains: four of them are either
based on or directly adopt the DataCite schema; one uses an application profile of DCAT (DCAT-AP),
while the other follows the Registry Interchange Format - Collections and Services (RIF-CS) schema, which
is a profile of ISO 2146, originally developed for library registry services and now used as a data interchange
format.
Among the 13 disciplinary repositories or projects, five are from the domain of Geoscience and Arctic
Research and have adopted the ISO19115 schema or an ISO19115 compatible schema (EML). ISO19115 is
an internationally adopted schema for describing geographic information and services. ISO19115 provides
information about the identification, the extent, the quality, the spatial and temporal schema, spatial
reference, and distribution of digital geographic data⑰. One Social and Behavioural Science repository
adopted the international standard ‘Data Documentation Initiative’ (DDI) for describing the data produced
by surveys and other observational methods in the social, behavioural, economic, and health sciences⑱.
The remaining nine disciplinary repositories and the two “other” repositories adopted community developed
profiles or schemas. Most of them are compatible or interoperable with international standards. For example,
the cultural heritage datasets in the ‘Other’ category define a metadata profile based on Schema.org, DCAT
and VoID⑲, while the European Clinical Research Infrastructure Network (ECRIN) schema is an extension
of DataCite [5], and GigaDB from the Life Sciences and Biomedical domain can export metadata in general
purpose formats such as DataCite and Schema.org.
We observe the following two trends from the survey responses:
1) Newer schemas tend to adopt existing commonly used elements. For example, the Data Catalogue
Vocabulary (DCAT) makes extensive use of elements from Dublin Core: 20 out of 29 terms for
describing a dataset are from Dublin Core⑳. The Bioschemas profiles adopt 5 mandatory properties
and 8 recommended properties from Schema.org㉑.
⑰ https://www.dcc.ac.uk/resources/metadata-standards/iso-19115
⑱ Document, Discover and Interoperate (DDI): https://ddialliance.org/
⑲ http://data.europeana.eu
⑳ Data Catalog Vocabulary (DCAT), Version 3: https://www.w3.org/TR/vocab-dcat-3/
㉑ Bioschemas: https://bioschemas.org/
2) Repositories, whether general or discipline specific, tend to use a general-purpose schema but
also support domain specific standards or vocabularies. For example, the Dataverse project㉒ supports
general citation metadata compatible with the DataCite metadata schema [8] and DCMI metadata
terms, but also a suite of domain specific metadata for Geoscience, Social Science and Humanities.
RIF-CS supports subject vocabularies from a range of disciplines for satisfying a range of data
discovery needs. This observation also applies to discipline specific repositories; for example, the
DAta Tag Suite (DATS), a data description model adopted by DataMed㉓, has both core elements and
additional elements: the core elements are generic and applicable to any type of dataset, while the
additional elements are specific to the life, environmental and biomedical science domains [22].
The observed trend is that general repositories adopt general purpose standards that support high-level data discovery
use cases for data searches across domains. Domain repositories adopt schemas
that are compatible with general metadata profiles for metadata interoperability, but add elements to
support a range of more granular disciplinary queries for more precise data discovery within a domain.
3.2 Analysis of the Mappings
We collected the 14 crosswalks from the following schemas to Schema.org through the survey and other
publicly available crosswalks: B2FIND, DCAT-AP, DCAT, RIF-CS, Core DATS, Dataverse, DDI Codebook
2.5, DC & DCTerms, BioSchema, SPASE, DataCite, ISO-19115-1:2014, EOSC/EDMI, ECRIN Clinical
Research Metadata Schema. We aligned the crosswalks with the mapped Schema.org properties. In total,
there were 232 terms from the 14 crosswalks being mapped to 34 Schema.org properties.
Since the survey results were collected, some crosswalks may have been updated (e.g., the DCAT to Schema.org
alignment) and some schemas (including Schema.org) may have been revised with additional properties.
In October 2021, the first author cross-checked all crosswalks and also consulted publicly available
crosswalks. These included, for example, ISO-19115 (from this W3C group㉔ and Habermann [13]), the DCAT
alignment with Schema.org㉕, the DataCite Schema to Dublin Core mapping㉖, and the CodeMeta crosswalks㉗.
During the writing of this paper, the second author also added a mapping from ISO19115-1 to Schema.org.
For the purposes of this analysis, we used this subsequent mapping as it covers more elements than the
original ISO-19115-1:2014 to Schema.org mapping we collected from this website㉘. This resulted in 385
properties from the 14 crosswalks being mapped to 40 Schema.org properties.
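The alignment step described above (grouping source terms from several crosswalks under the Schema.org property they map to, then counting) can be sketched as follows. The three toy crosswalks are simplified, hypothetical stand-ins with made-up term lists; they do not reproduce the collected crosswalks or the counts reported in this section.

```python
from collections import defaultdict

# Toy crosswalks: source term -> Schema.org property (hypothetical rows).
CROSSWALKS = {
    "DataCite": {"titles": "name", "descriptions": "description", "rightsList": "license"},
    "DCAT": {
        "dct:title": "name",
        "dct:description": "description",
        "dct:license": "license",
        "dct:rights": "license",
    },
    "Dublin Core": {"title": "name", "rights": "license"},
}


def align_by_property(crosswalks: dict) -> dict:
    """Group source terms by target property, keeping track of the schema."""
    aligned = defaultdict(dict)
    for schema, mapping in crosswalks.items():
        for term, prop in mapping.items():
            aligned[prop].setdefault(schema, []).append(term)
    return dict(aligned)


aligned = align_by_property(CROSSWALKS)
total_terms = sum(len(mapping) for mapping in CROSSWALKS.values())

# Schemas with no term mapped to a given property point to a possible gap.
missing = {
    prop: [s for s in CROSSWALKS if s not in by_schema]
    for prop, by_schema in aligned.items()
}
print(total_terms, len(aligned), missing)
```

Laying the crosswalks side by side in this way makes two things visible at once: properties where a schema maps several terms to one target (as with the two rights-related terms in the toy DCAT rows) and properties for which a schema offers no mapping at all.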
㉒ https://guides.dataverse.org/en/latest/user/appendix.html
㉓ https://datamed.org/
㉔ ISO19115 - DCAT - Schema.org mapping: https://www.w3.org/2015/spatial/wiki/ISO_19115_-_DCAT_-_Schema.org_mapping
㉕ https://www.w3.org/TR/vocab-dcat-3/#dcat-sdo
㉖ https://schema.datacite.org/meta/kernel-4.4/doc/DataCite_DublinCore_Mapping.pdf
㉗ https://github.com/codemeta/codemeta/blob/master/crosswalk.csv
㉘ https://www.w3.org/2015/spatial/wiki/ISO_19115_-_DCAT_-_Schema.org_mapping
[Table 1. Supported schemas and crosswalks to Schema.org. Columns: Domain (the number of responses); Schema(s) supported by repository (direct survey responses); Reference links to the crosswalk(s) or to any documentation about the crosswalk(s), where provided. The table is rotated in the source and its body is not recoverable from the extracted text.]
108
Data Intelligence
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
d
norte
/
i
t
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
5
1
1
0
0
2
0
7
4
2
4
9
d
norte
_
a
_
0
0
1
8
6
pag
d
/
t
.
i
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
An Analysis of Crosswalks from Research Data Schemas to Schema.org
I
j
2
X
3
pag
C
d
mi
PAG
V
mi
tu
d
Ud.
k
V
norte
h
6
1
/
d
/
s
t
mi
mi
h
s
d
a
mi
r
pag
s
/
4
B
yo
–
F
A
k
I
t
r
W.
q
pag
yo
r
z
4
j
R
0
I
X
h
t
i
a
t
a
d
mi
h
t
r
oh
F
a
metro
mi
h
C
s
C
fi
i
C
mi
pag
s
a
mi
s
r
oh
d
norte
mi
t
'
norte
s
mi
oh
d
R
A
D
mi
C
,
yo
a
r
mi
norte
mi
gramo
norte
I
.
yo
mi
d
oh
metro
–
mi
t
a
yo
pag
metro
mi
t
–
r
a
d
mi
C
d
mi
tu
norte
i
t
norte
oh
C
.
1
mi
yo
b
a
t
d
mi
d
i
v
oh
r
pag
mi
r
mi
h
w
y
norte
a
oh
t
r
oh
)
s
(
k
yo
a
w
s
s
oh
r
C
mi
h
t
oh
t
s
k
norte
i
yo
mi
C
norte
mi
r
mi
F
mi
R
,
)
s
(
k
yo
a
w
s
s
oh
r
C
mi
h
t
t
tu
oh
b
a
norte
oh
i
t
a
t
norte
mi
metro
tu
C
oh
d
)
s
mi
s
norte
oh
pag
s
mi
r
y
mi
v
r
tu
s
t
C
mi
r
i
d
(
y
r
oh
t
i
s
oh
pag
mi
r
y
b
d
mi
t
r
oh
pag
pag
tu
s
)
s
(
a
metro
mi
h
C
S
r
mi
b
metro
tu
norte
mi
h
t
(
norte
i
a
metro
oh
D
)
s
mi
s
norte
oh
pag
s
mi
r
F
oh
D
oh
D
S
oh
t
9
3
1
9
1
oh
S
I
/
oh
D
S
w
l
METRO
t
h
–
oh
t
–
9
3
1
9
1
mi
h
t
metro
oh
r
F
s
mi
gramo
a
pag
gramo
norte
i
d
norte
a
yo
a
t
a
d
a
t
mi
metro
norte
i
norte
oh
i
s
tu
yo
C
norte
i
r
oh
F
gramo
r
oh
.
a
metro
mi
h
C
s
/
/
norte
i
gramo
s
tu
metro
oh
C
.
b
tu
h
t
i
gramo
/
/
:
s
pag
t
t
h
:
metro
r
oh
F
s
norte
a
r
t
t
l
S
X
h
t
r
a
mi
y
t
r
mi
h
oh
D
t
norte
oh
metro
a
l
t
a
mi
C
norte
a
i
yo
yo
A
a
t
a
D
h
t
r
a
mi
y
r
a
norte
i
yo
pag
i
C
s
i
d
r
mi
t
norte
I
mi
h
t
–
oh
s
i
/
r
mi
t
s
a
metro
b
oh
/
yo
b
/
s
metro
r
oh
F
s
norte
a
r
t
a
t
a
d
a
t
mi
metro
oh
t
a
t
a
d
a
t
mi
metro
1
–
5
1
1
9
1
oh
S
I
pag
a
metro
oh
t
metro
r
oh
F
s
norte
a
r
t
a
d
mi
pag
oh
yo
mi
v
mi
d
y
r
oh
t
a
v
r
mi
s
b
oh
t
yo
s
X
.
0
.
1
mi
norte
oh
yo
a
d
norte
a
t
S
t
mi
s
a
t
a
S
Ud.
d
norte
a
,
)
S
D
GRAMO
METRO
(
metro
mi
t
s
y
S
a
t
a
D
mi
C
norte
mi
i
C
s
oh
mi
GRAMO
mi
norte
i
r
a
METRO
,
y
r
a
r
b
i
l
metro
mi
h
C
h
t
r
a
mi
l
METRO
mi
r
tu
oh
F
oh
y
norte
a
metro
d
mi
pag
pag
a
metro
mi
v
a
h
mi
W.
h
t
i
w
mi
yo
b
i
t
a
pag
metro
oh
C
y
yo
mi
s
oh
yo
C
s
i
h
C
i
h
w
,
)
l
METRO
mi
(
mi
gramo
a
tu
gramo
norte
a
l
a
t
a
d
a
t
mi
METRO
yo
a
C
i
gramo
oh
yo
oh
C
mi
r
mi
h
t
oh
s
a
yo
yo
mi
w
s
a
,
s
d
yo
mi
fi
a
t
a
d
a
t
mi
metro
F
i
d
mi
norte
i
a
r
t
s
norte
oh
C
y
yo
mi
s
oh
oh
yo
y
yo
norte
oh
y
yo
t
norte
mi
r
r
tu
C
mi
r
a
s
t
norte
mi
t
norte
oh
C
t
norte
mi
metro
mi
yo
mi
.
5
1
1
9
1
oh
S
I
oh
t
,
s
a
metro
mi
h
C
s
a
t
a
d
a
t
mi
metro
yo
a
t
norte
mi
metro
norte
oh
r
i
v
norte
mi
mi
r
a
s
d
yo
mi
fi
k
norte
a
yo
b
,
s
r
mi
t
mi
metro
a
r
a
pag
/
s
mi
yo
b
a
i
r
a
v
gramo
norte
i
b
i
r
C
s
mi
d
norte
mi
h
w
.
gramo
.
mi
(
yo
yo
a
t
a
.
gramo
r
oh
.
a
metro
mi
h
C
S
oh
i
.
a
t
a
d
C
i
t
C
r
a
/
/
:
s
pag
t
t
h
,
gramo
r
oh
.
mi
norte
oh
a
t
a
d
.
h
C
r
a
mi
s
/
/
:
s
pag
t
t
h
)
.
C
t
mi
,
norte
oh
i
t
i
norte
fi
mi
d
,
yo
mi
b
a
yo
,
mi
metro
a
norte
r
oh
F
d
mi
t
norte
mi
s
mi
r
pag
d
mi
d
d
mi
b
metro
mi
s
i
gramo
norte
i
pag
pag
a
METRO
mi
h
t
.
y
r
oh
t
i
s
oh
pag
mi
r
a
t
a
d
)
PAG
A
S
Ud.
(
metro
a
r
gramo
oh
r
PAG
C
i
t
C
r
a
t
norte
A
.
metro
r
oh
F
s
norte
a
r
t
t
l
S
X
norte
a
norte
i
/
metro
oh
C
.
mi
yo
gramo
oh
oh
gramo
.
s
C
oh
d
/
/
:
s
pag
t
t
h
:
s
k
yo
a
w
s
s
oh
r
C
/
h
C
a
mi
r
t
tu
oh
/
gramo
norte
i
norte
i
a
r
t
–
s
yo
oh
oh
t
/
gramo
r
oh
.
r
mi
t
norte
mi
C
a
t
a
d
a
t
mi
metro
/
/
:
s
pag
t
t
h
t
a
d
mi
b
i
r
C
s
mi
d
s
i
t
a
h
t
gramo
r
oh
.
a
metro
mi
h
C
S
oh
t
S
l
C
h
d
norte
a
S
t
A
D
mi
r
oh
C
.
s
mi
t
a
yo
pag
metro
mi
t
a
t
a
d
a
t
mi
metro
s
t
i
yo
yo
a
r
oh
F
yo
mi
d
oh
metro
mi
s
oh
pag
r
tu
pag
–
yo
a
r
mi
norte
mi
gramo
a
s
a
h
R
A
D
mi
C
)
2
(
mi
norte
i
C
i
d
mi
METRO
i
/
s
a
metro
mi
h
C
S
oh
B
metro
oh
C
.
b
tu
h
t
i
gramo
/
/
:
s
pag
t
t
h
/
8
3
0
1
.
0
1
/
gramo
r
oh
.
i
oh
d
/
/
:
s
pag
t
t
h
t
a
s
yo
mi
d
oh
metro
a
t
a
d
a
t
mi
metro
r
mi
h
t
oh
h
t
i
w
)
s
k
yo
a
w
s
s
oh
r
C
(
t
norte
mi
metro
norte
gramo
i
yo
a
d
norte
a
,
yo
a
norte
oh
i
t
a
r
mi
h
t
,
t
norte
mi
metro
pag
oh
yo
mi
v
mi
d
s
t
i
F
oh
norte
oh
i
t
pag
i
r
C
s
mi
d
yo
yo
tu
F
9
5
.
7
1
0
2
.
a
t
a
d
s
mi
t
i
tu
s
gramo
a
t
a
t
a
d
/
metro
oh
C
.
b
tu
h
t
i
gramo
/
/
:
s
pag
t
t
h
:
y
r
oh
t
i
s
oh
pag
mi
r
b
tu
h
t
i
GRAMO
s
i
h
t
metro
oh
r
F
mi
yo
b
a
yo
i
a
v
a
y
yo
mi
mi
r
F
mi
r
a
s
mi
yo
pag
metro
a
X
mi
d
norte
a
s
norte
oh
i
t
a
z
i
yo
a
i
r
mi
s
D
l
–
norte
oh
S
j
,
s
norte
oh
i
t
a
C
fi
i
C
mi
pag
s
yo
yo
A
s
mi
yo
fi
oh
r
pag
s
a
metro
mi
h
C
s
oh
i
B
d
norte
a
gramo
r
oh
.
a
metro
mi
h
C
s
t
pag
oh
d
a
oh
t
s
norte
a
yo
pag
/
gramo
r
oh
.
s
a
metro
mi
h
C
s
oh
i
b
/
/
:
pag
t
t
h
gramo
norte
i
r
a
h
s
=
pag
s
tu
?
t
i
d
mi
r
oh
/
d
norte
a
mi
s
oh
oh
h
C
oh
t
metro
oh
d
mi
mi
r
F
mi
h
t
s
r
mi
s
tu
r
tu
oh
mi
v
i
gramo
mi
w
d
a
mi
t
s
norte
i
t
tu
b
t
C
mi
yo
yo
oh
C
norte
a
C
d
mi
d
i
v
oh
r
pag
t
oh
norte
a
t
a
d
a
t
mi
METRO
:
)
I
norte
R
C
mi
(
k
r
oh
w
t
mi
norte
mi
r
tu
t
C
tu
r
t
s
a
r
F
norte
I
h
C
r
a
mi
s
mi
R
yo
a
C
i
norte
i
yo
C
norte
a
mi
pag
oh
r
tu
mi
s
r
mi
s
tu
r
tu
oh
r
oh
F
s
metro
r
oh
F
t
yo
i
tu
b
–
mi
r
pag
mi
metro
oh
s
gramo
norte
i
d
i
v
oh
r
pag
h
t
i
w
gramo
norte
i
t
norte
mi
metro
i
r
mi
pag
X
mi
w
oh
norte
mi
r
a
mi
w
,
r
mi
v
mi
w
oh
h
.
metro
r
oh
F
a
gramo
norte
i
d
yo
i
tu
b
h
gramo
tu
oh
r
h
t
a
metro
mi
h
C
s
norte
w
oh
r
i
mi
h
t
mi
t
a
mi
r
C
.
gramo
r
oh
.
a
metro
mi
h
C
s
oh
t
t
C
mi
pag
s
mi
r
h
t
i
w
gramo
norte
i
t
s
mi
t
mi
r
a
mi
w
d
norte
a
norte
a
s
i
a
metro
mi
h
C
S
.
s
s
mi
C
C
a
d
mi
gramo
a
norte
a
metro
r
mi
d
norte
tu
mi
b
yo
yo
i
w
h
C
i
h
w
F
oh
y
norte
a
metro
—
h
C
r
a
mi
s
mi
r
s
s
mi
C
C
a
)
a
mi
b
i
r
C
s
mi
d
oh
t
s
t
norte
i
oh
pag
a
t
a
d
yo
a
norte
oh
i
t
i
d
d
a
h
t
i
w
,
mi
t
i
C
a
t
a
D
F
oh
norte
oh
i
s
norte
mi
t
X
mi
yo
a
C
i
norte
i
yo
C
metro
oh
r
F
s
t
C
mi
j
b
oh
a
t
a
d
F
oh
mi
tu
gramo
oh
yo
a
t
a
C
a
r
oh
F
d
mi
t
a
mi
r
C
norte
mi
mi
b
s
a
h
.
y
d
tu
t
s
mi
C
r
tu
oh
s
r
i
mi
h
t
F
oh
s
C
i
t
s
i
r
mi
t
C
a
r
a
h
C
C
i
s
a
b
)
C
d
norte
a
,
norte
oh
i
t
a
s
i
metro
y
norte
oh
d
tu
mi
s
pag
r
mi
h
t
r
tu
F
t
tu
b
F
,
C
tu
S
7
j
oh
d
W.
h
X
.
#
9
3
5
2
1
3
1
/
d
r
oh
C
mi
r
/
gramo
r
oh
.
oh
d
oh
norte
mi
z
/
/
:
s
pag
t
t
h
t
a
s
i
a
metro
mi
h
C
s
d
mi
s
oh
pag
oh
r
pag
mi
h
t
F
oh
norte
oh
i
s
r
mi
v
d
mi
h
s
i
yo
b
tu
pag
t
norte
mi
C
mi
r
t
s
oh
METRO
,
norte
oh
i
t
a
C
fi
i
t
norte
mi
d
i
–
mi
d
/
t
norte
mi
s
norte
oh
C
d
mi
t
a
i
C
oh
s
s
a
)
b
,
s
t
norte
mi
metro
mi
gramo
norte
a
r
r
a
.
s
d
mi
mi
C
oh
r
pag
t
C
mi
j
oh
r
pag
mi
h
t
s
a
y
yo
mi
k
i
yo
mi
r
a
s
mi
gramo
norte
a
h
C
d
mi
d
i
v
oh
r
pag
t
oh
norte
.
yo
mi
d
oh
metro
a
t
a
d
a
t
mi
metro
S
t
A
D
mi
h
t
d
norte
i
h
mi
b
pag
tu
oh
r
gramo
mi
h
t
mi
r
a
mi
W.
)
2
(
y
gramo
oh
oh
B
yo
i
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
d
norte
/
i
t
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
5
1
1
0
0
2
0
7
4
2
4
9
d
norte
_
a
_
0
0
1
8
6
pag
d
/
t
.
i
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
Data Intelligence
109
An Analysis of Crosswalks from Research Data Schemas to Schema.org
/
/
I
h
s
a
t
s
/
t
mi
norte
t
C
tu
mi
.
a
pag
oh
r
tu
mi
.
C
mi
.
mi
t
a
gramo
b
mi
w
/
/
:
s
pag
t
t
h
d
mi
tu
norte
i
t
norte
oh
C
.
1
mi
yo
b
a
t
d
metro
.
a
norte
a
mi
pag
oh
r
tu
mi
r
oh
F
t
mi
s
a
t
a
D
d
oh
l
gramo
norte
i
y
F
i
C
mi
pag
S
/
r
mi
t
s
a
metro
b
oh
/
yo
b
/
k
r
oh
w
mi
metro
a
r
F
–
norte
oh
i
t
i
s
i
tu
q
C
A
/
norte
oh
i
t
a
t
norte
mi
metro
tu
C
oh
d
–
F
a
pag
oh
gramo
r
oh
.
a
metro
mi
h
C
S
d
norte
a
t
A
C
D
.
D
I
oh
V
d
norte
a
t
A
C
D
,
gramo
r
oh
.
a
metro
mi
h
C
S
h
t
i
w
d
mi
b
i
r
C
s
mi
d
tu
mi
.
a
norte
a
mi
pag
oh
r
tu
mi
.
a
t
a
d
/
/
:
pag
t
t
h
d
mi
d
i
v
oh
r
pag
mi
r
mi
h
w
y
norte
a
oh
t
r
oh
)
s
(
k
yo
a
w
s
s
oh
r
C
mi
h
t
oh
t
s
k
norte
i
yo
mi
C
norte
mi
r
mi
F
mi
R
,
)
s
(
k
yo
a
w
s
s
oh
r
C
mi
h
t
t
tu
oh
b
a
norte
oh
i
t
a
t
norte
mi
metro
tu
C
oh
d
)
s
mi
s
norte
oh
pag
s
mi
r
y
mi
v
r
tu
s
t
C
mi
r
i
d
(
y
r
oh
t
i
s
oh
pag
mi
r
y
b
d
mi
t
r
oh
pag
pag
tu
s
)
s
(
a
metro
mi
h
C
S
r
mi
b
metro
tu
norte
mi
h
t
(
norte
i
a
metro
oh
D
)
s
mi
s
norte
oh
pag
s
mi
r
F
oh
gramo
r
oh
.
a
metro
mi
h
C
s
F
oh
mi
yo
pag
metro
a
X
mi
F
oh
norte
oh
i
t
a
t
norte
mi
metro
tu
C
oh
D
s
d
r
a
d
norte
a
t
s
mi
s
oh
pag
r
tu
pag
yo
a
r
mi
norte
mi
gramo
norte
i
a
t
a
d
a
t
mi
metro
t
mi
s
a
t
a
d
s
t
r
oh
pag
X
mi
B
D
a
gramo
i
GRAMO
d
norte
a
s
mi
C
norte
mi
i
C
S
mi
F
i
l
a
t
a
d
a
t
mi
metro
y
norte
a
norte
oh
norte
mi
mi
s
mi
b
norte
a
C
pag
tu
k
r
a
metro
norte
i
a
metro
oh
d
mi
h
t
norte
i
t
r
oh
pag
X
mi
oh
t
mi
yo
b
i
s
s
oh
pag
oh
s
yo
a
s
i
t
i
,
)
mi
t
i
C
a
t
a
D
d
norte
a
gramo
r
oh
.
a
metro
mi
h
C
S
(
)
1
(
yo
a
C
i
d
mi
metro
oh
B
i
2
5
5
0
0
1
/
4
2
5
5
.
0
1
/
gramo
r
oh
.
i
oh
d
.
X
d
/
/
:
pag
t
t
h
.
s
d
r
a
d
norte
a
t
s
mi
h
t
F
oh
y
norte
a
norte
a
h
t
mi
v
i
s
norte
mi
t
X
mi
mi
r
oh
metro
r
a
F
s
i
h
C
i
h
w
.
gramo
.
mi
,
mi
gramo
a
pag
gramo
norte
i
d
norte
a
yo
a
t
a
d
a
t
mi
metro
l
METRO
X
mi
t
mi
yo
pag
metro
oh
C
norte
w
oh
r
tu
oh
s
a
r
oh
,
b
a
t
–
A
S
I
d
r
a
d
norte
a
t
s
C
fi
i
C
mi
pag
s
–
a
t
a
D
–
norte
mi
pag
oh
/
mi
r
i
mi
r
F
norte
/
metro
oh
C
.
b
tu
h
t
i
gramo
/
/
:
s
pag
t
t
h
mi
b
oh
t
s
t
mi
s
a
t
a
d
mi
gramo
a
t
i
r
mi
h
yo
a
r
tu
t
yo
tu
C
gramo
norte
i
w
oh
yo
yo
a
mi
yo
fi
oh
r
pag
a
t
a
d
a
t
mi
metro
a
s
mi
norte
fi
mi
D
)
2
(
s
r
mi
h
oh
t
3
v
/
7
0
0
0
.
9
1
0
2
/
gramo
r
oh
–
mi
0
8
7
–
8
C
b
0
0
8
1
mi
/
s
mi
i
t
i
norte
tu
metro
metro
oh
C
/
i
pag
a
/
tu
mi
.
t
a
d
tu
mi
.
mi
r
a
h
s
2
b
/
/
:
s
pag
t
t
h
:
a
metro
mi
h
C
s
norte
i
D
l
–
norte
oh
S
j
d
mi
d
d
mi
b
metro
mi
.
gramo
.
mi
mi
mi
S
mi
h
t
h
t
i
w
norte
gramo
i
yo
a
oh
t
norte
a
yo
pag
mi
w
t
a
h
t
a
metro
mi
h
C
s
norte
oh
s
j
yo
a
norte
r
mi
t
norte
i
norte
a
+
gramo
r
oh
.
a
metro
mi
h
C
s
d
norte
a
s
mi
C
norte
mi
i
C
S
yo
a
i
r
mi
t
a
METRO
.
d
tu
oh
yo
C
s
yo
a
i
r
mi
t
a
metro
.
mi
v
i
h
C
r
a
/
/
:
s
pag
t
t
h
a
h
C
tu
s
F
oh
mi
yo
pag
metro
a
X
mi
.
)
s
norte
oh
i
s
norte
mi
t
X
mi
y
t
i
norte
tu
metro
metro
oh
C
+
(
a
metro
mi
h
C
s
mi
R
A
h
S
2
B
)
1
(
gramo
norte
i
r
mi
mi
norte
i
gramo
norte
mi
d
mi
d
i
v
oh
r
pag
t
oh
norte
/
tu
mi
.
a
d
s
s
mi
C
.
mi
tu
gramo
oh
yo
a
t
a
C
a
t
a
d
/
/
:
s
pag
t
t
h
:
I
D
D
r
tu
oh
i
v
a
h
mi
b
d
norte
a
yo
a
i
C
oh
S
d
mi
d
i
v
oh
r
pag
t
oh
norte
mi
a
r
norte
I
a
t
a
D
,
)
mi
t
a
d
pag
tu
r
a
yo
tu
gramo
mi
r
h
t
i
w
(
mi
s
r
mi
v
a
t
a
D
norte
oh
d
mi
s
a
b
s
i
mi
a
r
norte
I
a
t
a
D
h
;
y
r
t
s
mi
r
oh
F
;
mi
r
tu
t
yo
tu
C
i
r
gramo
A
gramo
r
oh
.
a
metro
mi
h
C
S
,
norte
oh
S
j
,
I
D
D
,
mi
r
oh
C
norte
i
yo
b
tu
D
norte
i
a
t
a
d
a
t
mi
metro
t
mi
s
a
t
a
d
s
t
r
oh
pag
X
mi
y
r
a
norte
i
r
mi
t
mi
V
;
mi
r
tu
t
yo
tu
C
i
t
r
oh
)
1
(
mi
C
norte
mi
i
C
s
a
metro
mi
h
C
s
_
norte
oh
s
j
/
#
0
/
s
a
metro
mi
h
C
s
/
4
C
0
9
1
6
b
C
2
1
3
2
–
6
b
7
a
–
7
1
6
4
/
gramo
r
oh
.
d
tu
oh
yo
C
s
yo
a
i
r
mi
t
a
metro
.
mi
v
i
h
C
r
a
/
/
:
s
pag
t
t
h
:
yo
a
t
r
oh
pag
y
r
oh
t
i
s
oh
pag
mi
R
r
F
.
a
r
norte
i
.
a
t
a
d
/
/
:
s
pag
t
t
h
D
l
–
norte
oh
S
j
)
1
(
mi
norte
i
C
i
d
mi
METRO
–
oh
t
–
pag
a
–
t
a
C
d
/
s
oh
pag
mi
r
/
norte
A
k
C
D
oh
/
s
t
C
mi
j
oh
r
pag
mi
s
w
oh
r
b
/
gramo
r
oh
.
a
metro
mi
h
C
s
d
mi
d
i
v
oh
r
pag
t
oh
norte
/
v
oh
gramo
.
a
s
a
norte
.
a
t
a
d
h
t
r
a
mi
/
/
:
s
pag
t
t
h
;
METRO
METRO
Ud.
(
yo
mi
d
oh
METRO
a
t
a
d
a
t
mi
METRO
d
mi
fi
i
norte
Ud.
s
'
A
S
A
norte
.
gramo
r
oh
.
a
metro
mi
h
C
s
d
norte
a
)
metro
metro
tu
–
yo
mi
d
oh
metro
–
a
t
a
d
a
t
mi
metro
–
d
mi
fi
i
norte
tu
/
y
r
oh
t
i
s
oh
pag
mi
r
–
a
t
a
d
a
t
mi
metro
h
C
r
a
mi
s
r
oh
F
y
r
oh
t
i
s
oh
pag
mi
R
a
t
a
d
a
t
mi
METRO
norte
oh
metro
metro
oh
C
A
S
A
norte
mi
h
t
oh
t
s
mi
oh
gramo
METRO
METRO
Ud.
gramo
norte
i
d
norte
a
yo
mi
h
t
norte
oh
a
t
a
d
gramo
r
oh
.
a
metro
mi
h
C
s
mi
h
t
mi
s
tu
mi
W.
.
s
r
mi
t
norte
mi
C
a
t
a
d
A
S
A
norte
s
s
oh
r
C
a
–
norte
oh
metro
metro
oh
C
/
s
t
norte
mi
norte
oh
pag
metro
oh
C
–
s
i
d
s
oh
mi
/
norte
oh
i
t
pag
i
r
C
s
mi
d
–
metro
mi
t
s
y
s
–
mi
C
norte
mi
i
C
s
/
t
tu
oh
b
a
yo
a
C
i
metro
mi
h
C
oh
mi
gramo
oh
i
b
r
oh
F
r
mi
t
norte
mi
C
mi
v
i
h
C
r
a
mi
v
i
t
C
a
d
mi
t
tu
b
i
r
t
s
i
D
l
norte
R
oh
:
yo
a
t
r
oh
PAG
.
h
C
r
a
mi
s
F
oh
s
metro
r
oh
F
r
mi
h
t
oh
mi
C
norte
a
h
norte
mi
oh
t
s
mi
gramo
a
pag
v
oh
gramo
.
yo
norte
r
oh
.
C
a
a
d
/
/
:
s
pag
t
t
h
;
s
C
i
metro
a
norte
y
d
110
Data Intelligence
yo
D
oh
w
norte
oh
a
d
mi
d
F
r
oh
metro
h
t
t
pag
:
/
/
d
i
r
mi
C
t
.
metro
i
t
.
mi
d
tu
d
norte
/
i
t
/
yo
a
r
t
i
C
mi
–
pag
d
F
/
/
/
/
5
1
1
0
0
2
0
7
4
2
4
9
d
norte
_
a
_
0
0
1
8
6
pag
d
/
.
t
i
F
b
y
gramo
tu
mi
s
t
t
oh
norte
0
7
S
mi
pag
mi
metro
b
mi
r
2
0
2
3
An Analysis of Crosswalks from Research Data Schemas to Schema.org
Categories of mapped terms
We classified the 40 mapped Schema.org properties/terms into 6 categories from the NISO (2004)
metadata classification model. As shown in Table 2, we use three top level categories: descriptive metadata,
administrative metadata and structural metadata; administrative metadata is further classified into technical
metadata, rights metadata and preservation metadata. We summarise the analysis of mapped terms as
follows:
• Descriptive metadata: Most of the mapped terms (17 out of 40) fall into the descriptive metadata
category. The mapped descriptive terms cover six of the seven recommended citation metadata elements from the
DataCite guide (footnote 29):
Creator (PublicationYear): Title. Version. Publisher. (resourceTypeGeneral). Identifier
The citation term “resourceTypeGeneral” (recommended) is the only term not explicitly included in
the mapping, and we infer it to be of the type “dataset”, since we asked for and collected all mappings
from schemas for describing data. All 14 source schemas include the six mapped citation metadata elements,
except for the terms “version” and “publisher”, which each occur in 13 out of the 14 source schemas.
• Administrative metadata:
◦ Technical metadata: ‘encodingFormat’ and ‘contentSize’ are the two technical metadata terms
mapped by the majority of the source schemas. The mappings are consistent: the term ‘format’ is
used by 9 out of 13 source schemas, the exact term ‘encodingFormat’ by one source schema, and
the alternate terms ‘resource file type’, ‘mediaType’ and ‘distributionFormat’ each by one source schema.
◦ Rights metadata: There are three mapped terms in rights metadata. The property “license” has a
mapping from 12 source schemas; however, five of them use the original term “rights”. The term
“rights” is the only one of the 15 Dublin Core terms (http://purl.org/dc/elements/1.1) that does not
have an exact mapping in Schema.org. In Dublin Core, “rights” is defined as “information about
rights held in and over the resource”; “license” is a subproperty of “rights” and is defined as
“a legal document giving official permission to do something with the resource”. According to
this definition, the closest semantically matched term in Schema.org is “copyrightHolder”
(https://schema.org/copyrightHolder): “The party holding the legal copyright to the CreativeWork.”
◦ Preservation metadata: There are 11 mapped preservation metadata terms: five of them are dates
about data creation, modification, availability and copyright; another five are about data access
method or location; and one is about data (observation/process/reprocess) frequency. The mappings
of the dates and the access methods are consistent, except that the term ‘expectedArrivalFrom/
expectedArrivalUntil’ is mapped from four different terms: ‘distribution date’, ‘released date’,
‘available’, and ‘embargo’.
• Structural metadata: The seven mapped terms in the structural metadata category describe the citation
relation between a dataset and its related academic articles/reports (‘citation’), a duplicated dataset
(‘sameAs’), a clear relation between two datasets (‘isPartOf/hasPart’, ‘isBasedOn’) and a general relation
between two datasets (‘isRelatedTo’, ‘mentions’).
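To make the citation-element mapping concrete, the following is a minimal sketch in Python of how a DataCite-style record might be turned into Schema.org JSON-LD. The record values, the `CITATION_CROSSWALK` pairings (for example title to name, PublicationYear to datePublished) and the function name `to_schema_org` are illustrative assumptions, not the crosswalks collected in the survey. The inferred resource type is expressed by typing the output as a `Dataset`.

```python
import json

# Hypothetical DataCite-style record; every value is a placeholder.
datacite_record = {
    "creator": "Doe, Jane",
    "publicationYear": "2021",
    "title": "Example ocean temperature dataset",
    "version": "1.0",
    "publisher": "Example Data Repository",
    "identifier": "https://doi.org/10.1234/example",
}

# Assumed pairing of the six mapped citation elements with Schema.org properties.
CITATION_CROSSWALK = {
    "creator": "creator",
    "publicationYear": "datePublished",
    "title": "name",
    "version": "version",
    "publisher": "publisher",
    "identifier": "identifier",
}


def to_schema_org(record):
    """Build a Schema.org Dataset description from the mapped citation elements."""
    # resourceTypeGeneral is not mapped explicitly; it is inferred as "dataset".
    dataset = {"@context": "https://schema.org/", "@type": "Dataset"}
    for source_term, target_property in CITATION_CROSSWALK.items():
        if source_term in record:
            dataset[target_property] = record[source_term]
    return dataset


jsonld = to_schema_org(datacite_record)
print(json.dumps(jsonld, indent=2))
```

A source element that is missing from a record is simply skipped, so the sketch also works for schemas that lack, say, a version element.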
29 https://schema.datacite.org/meta/kernel-4.4/doc/DataCite-MetadataKernel_v4.4.pdf
We also examine how many mapped terms are recommended by the Google dataset search guide (footnote 30). The
Google guide recommends 23 properties (in italics in Table 2) to be included in structured data. Three
of them are required terms (“name”, “description”, “distribution.contentURL”), while the other 20 are
recommended. The 23 terms are distributed among all six NISO metadata categories, and are mapped by
more than half of the source schemas, especially those falling into the descriptive metadata category. Note that
this analysis is at the schema level, and does not take into account whether a repository has implemented
a property value at the metadata record level. Benjelloun et al. [3] from Google Research analysed the
percentage of datasets in their index with specific properties, showing that the properties “name” and
“description” both have 100% coverage, followed by “provider” (84.59%), “keywords” (80%) and “URL”
(68.08%), while all other recommended properties had less than 50% coverage (e.g. “authors” at 14.12%,
“isAccessibleForFree” at 3.04%). This indicates that even if a property is mappable at the schema level,
a repository may decide not to implement that mapping or not to populate that property with a value. The
reason, most likely, is that the repository does not have sufficient records requiring that property to warrant
its implementation.
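The difference between schema-level mapping and record-level coverage can be illustrated with a short Python sketch. It computes, for a handful of made-up records, the fraction that actually carry a non-empty value for each property. The function name `property_coverage` and the sample records are assumptions for illustration; this is not the method used by Benjelloun et al. [3].

```python
from collections import Counter


def property_coverage(records, properties):
    """Return, per property, the fraction of records with a non-empty value."""
    counts = Counter()
    for record in records:
        for prop in properties:
            value = record.get(prop)
            # Treat missing, empty-string and empty-container values as absent.
            if value not in (None, "", [], {}):
                counts[prop] += 1
    total = len(records)
    return {prop: (counts[prop] / total if total else 0.0) for prop in properties}


# Made-up records: every property is mappable, but not every record populates it.
records = [
    {"name": "A", "description": "first", "keywords": ["ocean"]},
    {"name": "B", "description": "second", "license": ""},
    {"name": "C", "description": "third", "keywords": [],
     "license": "https://creativecommons.org/licenses/by/4.0/"},
    {"name": "D", "description": "fourth"},
]

coverage = property_coverage(records, ["name", "description", "keywords", "license"])
for prop, fraction in coverage.items():
    print(f"{prop}: {fraction:.0%}")
```

In this toy sample "name" and "description" are fully covered while "keywords" and "license" are each populated in only one record out of four, mirroring the pattern described above.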
Table 2. Classification of the mapped Schema.org properties or terms. (The numbers in brackets indicate the number of crosswalks that have a term mapped to the Schema.org property. Properties in italics are those recommended by the Google dataset search guide (footnote 31).)
NISO metadata type | Schema.org properties
Descriptive metadata (for finding or understanding a resource) | identifier (14), name (14), description (14), creator (14), alternateName (9), datePublished (13), version (13), keywords (13), about or subjectOf (8), inLanguage (10), temporalCoverage (11), spatialCoverage (11), variableMeasured (7), publisher (12), contributor (10), funder (10), producer (8), measurementTechnique (6)
Administrative metadata: Technical metadata (for decoding and rendering files) | encodingFormat (13), contentSize (8)
Administrative metadata: Rights metadata (intellectual property rights attached to content) | copyrightHolder (4), isAccessibleForFree (6), license (12)
Administrative metadata: Preservation metadata (long-term management of files) | contentUrl (8), URL (14), distribution (9), contactPoint (10), copyrightYear (5), dateCreated (11), dateModified (11), expectedArrivalFrom/expectedArrivalUntil (4), repeatFrequency (5), includedInDataCatalog (8)
Structural metadata (relationships of parts of resources to one another) | citation (12), sameAs (8), mentions (4), isBasedOn (7), isPartOf (10), hasPart (10), isRelatedTo (8)
30 Google Search Central Documentation: Using structured data: https://developers.google.com/search/docs/advanced/structured-data/dataset
31 https://developers.google.com/search/docs/advanced/structured-data/dataset
Gap analysis
From the survey, there are structural metadata elements that are recommended by source schemas but that
do not have mappings to Schema.org. These include elements that clearly describe:
• Relationships between datasets, for example: hasVersion, isNewVersionOf, isContinuedBy,
isOriginalFormOf, isDerivedFrom (from DataCite);
• Relationships between a dataset and a responsible agent, for example: hasFunder, isFundedBy,
isCompiledBy, isOwnedBy, hasPrincipalInvestigator;
• Relationships between a dataset and the activity by which it was collected, for example: dataset ->
cruise, dataset -> study design; and
• Relationships between a dataset and the instrument/software/other services used to produce the data, for
example: isProducedBy/produces, isPresentedBy/presents, isOperatedOnBy/operatedOn, isAnnotatedBy/
annotate (from RIF-CS).
These structural, relation metadata properties are more granular than the PROV-O Ontology [16]. These
gaps reflect both the difference between the documentation needed to describe scientific datasets for research
and that needed for more commercial data published on the Web (e.g. movies, businesses, product catalogs, etc.),
and the difference between general data schemas and discipline-specific schemas.
From information gathered from the survey and through inspection of the source schemas, we observe
that:
• Controlled vocabularies, thesauri or code lists are used to specify property values for various elements
in the source schemas. Schema.org does not offer any vocabularies for property values, but the
serialization of Schema.org allows it to incorporate external vocabularies. For example, when
populating the property schema:keywords or schema:about, one can specify a text string (either from
a vocabulary or not) that can facilitate discovery but not interoperability, while an optimal way is to
specify a URI reference to a term from a controlled vocabulary. There is a proposal to add a DefinedTerm
element (https://schema.org/DefinedTerm) that could be substituted for plain text values to provide a
URI along with the term, but this has not, as yet, been formally adopted into Schema.org.
• A controlled vocabulary is a set of pre-defined, authorised terms that are used to specify a property
value so that consistency can be achieved within and across repositories. A controlled vocabulary
can be a standard controlled by an authoritative organisation (for example, Library of Congress
Subject Headings, or the Australian and New Zealand Standard Research Classification, ANZSRC), a locally
defined subset of a standard vocabulary, or a locally defined vocabulary [23]. Ideally, terms in the
vocabulary have dereferenceable URIs for unambiguous identification. This requires a controlled
vocabulary to be openly accessible, referenceable and identifiable with a unique and persistent
identifier for the vocabulary and for each term in the vocabulary [7]. Research Vocabularies Australia (footnote 32) is
an example of such a service for finding, accessing and reusing vocabularies.
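The contrast between a plain-text keyword and a URI-backed term can be sketched as follows in Python. The sketch assumes that a consumer accepts `DefinedTerm` values for `keywords`; the helper name `keyword_as_defined_term` and all `example.org` URIs are hypothetical placeholders, not identifiers from any real vocabulary service.

```python
import json


def keyword_as_defined_term(label, term_uri, vocabulary_name, vocabulary_uri):
    """Wrap a controlled-vocabulary term as a Schema.org DefinedTerm node."""
    return {
        "@type": "DefinedTerm",
        "@id": term_uri,  # dereferenceable URI of the term (placeholder here)
        "name": label,
        "inDefinedTermSet": {
            "@type": "DefinedTermSet",
            "@id": vocabulary_uri,  # identifier of the vocabulary itself
            "name": vocabulary_name,
        },
    }


dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example dataset",
    "keywords": [
        # Plain text: helps discovery, but is ambiguous across repositories.
        "oceanography",
        # URI-backed term: unambiguous, so it also supports interoperability.
        keyword_as_defined_term(
            label="Oceanography",
            term_uri="https://example.org/vocab/subjects/oceanography",
            vocabulary_name="Example subject classification",
            vocabulary_uri="https://example.org/vocab/subjects",
        ),
    ],
}

print(json.dumps(dataset, indent=2))
```

Keeping both forms side by side lets a simple harvester use the text label while a linked-data consumer follows the term URI.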
32 ARDC Research Vocabularies Australia: https://vocabs.ardc.edu.au/
• There are semantically equivalent properties which are named differently among the schemas. For
example, schema:variableMeasured, dats:dimensions, spase:parameter and sosa:observedProperty
(DCATv3) all have the same meaning, related to observed or measured data variables. Likewise,
schema:name is equivalent in meaning to the title element of other schemas.
Thus, when developing a crosswalk it is necessary to check how each property is defined in each
schema, and how it is actually used in the implemented examples. For example, schema:isBasedOn
(a resource from which this work is derived or from which it is a modification or adaptation) can be
mapped from datacite:isOriginalFormOf, datacite:isSourceOf, datacite:isDerivedFrom and datacite:isVersionOf.
• It is also inevitable that many terms from one schema are mapped to one term in Schema.org, due
to Schema.org being a general schema with simplicity as one of its design rules. For example, the
granular relations from RIF-CS (collection/relatedInfo/isVersionOf, collection/relatedInfo/isEnrichedBy,
collection/relatedInfo/isDerivedFrom, collection/relatedInfo/hasValueAddedBy) and from DataCite
(isOriginalFormOf, isSourceOf, isDerivedFrom) can all be mapped to schema:isBasedOn (a resource from which
this work is derived or from which it is a modification or adaptation).
• Rich granular information may be lost where a ‘many to one’ mapping occurs. Whether this loss of
information is significant depends on the purpose of a mapping and how this granular information is
utilised by a data discovery system. For example, if a use case is to make a dataset widely findable
from the web, then adding more descriptive metadata is more important than having a detailed
relation; if a use case is to track the history or provenance of a dataset, then this granular relation
information is important to have. These two use cases can complement each other: a general repository
can have descriptive metadata for discovery and include links so that when a user finds a potentially
relevant dataset, they can follow a link to metadata with more granular contextual information to
assess the fitness of the dataset for the intended purpose.
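The ‘many to one’ effect described above can be sketched in a few lines of Python. The prefixes, the shortened RIF-CS relation names, the sample identifiers and the function name `map_relations` are assumptions for illustration only. The sketch also keeps the source relation names in a side table, which is one possible way to retain the granular information that the target property alone no longer carries.

```python
from collections import defaultdict

# Illustrative many-to-one crosswalk built from the relations named in the text.
RELATION_CROSSWALK = {
    "rif-cs:isVersionOf": "schema:isBasedOn",
    "rif-cs:isEnrichedBy": "schema:isBasedOn",
    "rif-cs:isDerivedFrom": "schema:isBasedOn",
    "rif-cs:hasValueAddedBy": "schema:isBasedOn",
    "datacite:isOriginalFormOf": "schema:isBasedOn",
    "datacite:isSourceOf": "schema:isBasedOn",
    "datacite:isDerivedFrom": "schema:isBasedOn",
}


def map_relations(source_relations):
    """Map (relation, target) pairs to Schema.org and record what was collapsed."""
    mapped = defaultdict(list)      # Schema.org property -> related resources
    provenance = defaultdict(list)  # Schema.org property -> original relation names
    for relation, target in source_relations:
        prop = RELATION_CROSSWALK.get(relation)
        if prop is None:
            continue  # no mapping available: this relation is a gap
        mapped[prop].append(target)
        provenance[prop].append(relation)
    return dict(mapped), dict(provenance)


# Hypothetical relations of one dataset record.
source_relations = [
    ("datacite:isDerivedFrom", "https://doi.org/10.1234/raw"),
    ("rif-cs:isEnrichedBy", "https://doi.org/10.1234/annotations"),
    ("datacite:isContinuedBy", "https://doi.org/10.1234/next"),
]

mapped, provenance = map_relations(source_relations)
print(mapped)
print(provenance)
```

After mapping, both mapped targets sit under schema:isBasedOn and are indistinguishable unless the side table (or a link back to the source metadata) is consulted.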
3.3 Visualization Tool for Facilitating Mapping
To make the crosswalks more useful for analysis, and for those who are going to do a crosswalk for their
own schema, the World Data System International Technology Office has developed a tool to visualise
the above 14 crosswalks (and one from the CodeMeta vocabulary to Schema.org) (footnote 33). The tool provides a user-
friendly display of the collected crosswalks. By utilising the visualisation tool, crosswalk developers across
domains can reference existing mappings, repeating the same types of matches between the Schema.org
terms and similar elements found in different metadata schemas, regardless of whether the metadata format
is standard or bespoke. The visualization tool is intended as a prototype service for the research data
management community, in support of metadata managers who are investigating options for including
Schema.org markup into existing well-formed metadata. The visualizations include various tables, a Sankey
diagram, and a Gap Analysis, to support different views for crosswalk inspection. For example, Figure 2
can help to check, given a property from Schema.org, what its corresponding element is in other schemas;
and Figure 3 shows these mappings in a ‘Filter Table’, where a parent type is also shown for properties from
Schema.org.
33 Visualisation of crosswalks: https://rd-alliance.github.io/Research-Metadata-Schemas-WG
Figure 2. This filter Sankey diagram allows a user to choose a Schema.org property and see which crosswalked
term is connected to which metadata standard. From left to right, the labels are Schema.org properties, crosswalked
terms, then metadata standards.
Figure 3. This table is a free text search over both metadata terms and Schema.org properties. Wildcard searches
are not supported but partial searches are. For example, a search for “publish*” will not return any records, but the
search for “publish” will return “datePublished”, “publisher”, and “Dataset Publisher.”
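The search behaviour described in the caption of Figure 3 is consistent with a case-insensitive substring match in which “*” is treated as a literal character. The Python sketch below illustrates that behaviour under this assumption; it is not the tool's actual implementation, and the function name `partial_search` and the term list are hypothetical.

```python
def partial_search(query, terms):
    """Return the terms that contain the query as a case-insensitive substring."""
    needle = query.lower()
    # No wildcard expansion: a "*" in the query must occur literally in the term.
    return [term for term in terms if needle in term.lower()]


terms = ["datePublished", "publisher", "Dataset Publisher", "creator"]

hits = partial_search("publish", terms)
wildcard_hits = partial_search("publish*", terms)
print(hits)
print(wildcard_hits)
```

With these sample terms, the partial query matches the three publish-related entries, while the wildcard query matches nothing because no term contains a literal asterisk.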
4. DISCUSSION AND CONCLUSION
In summary, through the analysis of the 14 crosswalks, we find that most descriptive metadata are
interoperable among the schemas and can be mapped to corresponding Schema.org properties. The most
inconsistent mapping is the ‘Rights’ metadata, which requires clearer and more consistent definitions among the
schemas of the terms Rights, License, Copyright Holders, and Data Use Agreement or Conditions, to name
a few. The largest gap exists in the Structural metadata elements: first, there is a lack of consistency among
the source metadata schemas themselves; and second, there are no rich relation terms in Schema.org. As
Structural metadata is important in the linked-data world, the data community needs to agree on which Structural
metadata from disciplinary schemas could be generalised and applied to all types of data. There also exists
a gap in controlled vocabularies to specify various property values, for example, observational variables [17]
and a subject classification vocabulary (e.g. Library of Congress Subject Headings) for populating Keyword
or Subject elements to describe a dataset.
The gaps are due to the Schema.org design principle that starts simple and increases complexity when
community need arises [11]. This challenge is complicated by the fact that relatively simple, domain
independent vocabularies satisfy the most common web data search needs, but the research community
tends to use more granular and rigorous schemas and controlled vocabularies in describing and cataloging
research datasets. Lagoze [15] argued that attempting to intermix a single descriptive vocabulary for coarse
granularity queries with the complex semantics needed to enable ‘drill-down’ into more granular queries,
leads to metadata sets that are not ideally suited for either purpose; Lagoze advocated for establishing
frameworks for the creation of more complex descriptions that can coexist with similar ones as separate
packages.
Like any other schema or vocabulary, Schema.org is evolving. To address the above gaps, the terms
schema:DefinedTerm and schema:inDefinedTermSet were introduced as pending changes in Schema.org
V12.0, and schema:hasDefinedTerm in Version 13.0 (footnote 34), to enable the markup of external property names and
pre-defined property values from discipline-specific vocabularies. This approach balances the simplicity of
a general schema and the complexity of disciplinary schemas by following some principles that guide the
development of metadata schemas, especially the modularity principle and the extensibility principle [9].
The recent trend, as observed in the survey and from the development of application profiles by domains
(e.g. DCAT Application Profile and Bioschemas profiles), also follows Duval’s metadata development
principles.
In summary, we present the analysis of crosswalks to Schema.org from a cross section of domain-
implemented metadata schemas. The analysis is limited by the survey and by the conceptual mapping, which
focuses on the meaning of the elements or properties when mapping between two schemas. This analysis
could be enhanced to include the analysis of implemented marked-up metadata across repositories to get
a more comprehensive picture of the interoperability of published structured metadata on the Web.
34 https://schema.org/docs/releases.html
ACKNOWLEDGEMENT
This work was developed as part of the Research Data Alliance (RDA) Working Group entitled ‘Research
Metadata Schemas’, and we acknowledge the support provided by the RDA community and structures. We
would like to thank members of the group for their support and their thoughtful discussions through plenary
sessions and regular monthly calls.
Special thanks go to:
• Joel Benn (ARDC, Australia), Kerrin Borschewski (GESIS, Germany), Steve Canham and Christian
Ohmann (University of Dusseldorf, Germany), Baptiste Cecconi (Observatoire de Paris, PSL Research
University, France), Douglas Fils (Ocean Leadership, US), Julian Gautier (Harvard University, US),
Josef Hardi and John Graybeal (Stanford University, US), Leopold Talirz (EPFL, Switzerland), Chris
Hunder (GigaScience Journal), Andrea Perego (European Parliament), Stephen M. Richard (US
Geoscience Information Network), Philippe Rocca-Serra and Susanna-Assunta Sansone (Oxford
University, UK), Adam Shepherd (WHOI, US), Matt Styles (Nottingham University, UK), Heinrich
Widmann (DKRZ, Germany), Bruce Wilson (ORNL, US) and a few anonymous survey participants
for contributing to the survey on “Current practices in using schemas to describe research datasets”
and/or crosswalks;
• Karen Payne, Seiya Terada and Chantelle Verhey (World Data System – International Technology
Office, Canada) for developing a suite of tools for visualising the collected crosswalks;
• ARDC intern Penelope Hagan for initial alignment of 13 crosswalks.
AUTHOR CONTRIBUTION STATEMENT
Mingfang Wu (mingfang.wu@ardc.edu.au) conceptualised and implemented the crosswalk analysis and
wrote the original draft; all authors contributed to further conceptualisation and to the writing and review of
the paper. Stephen Richard (smrTucson@gmail.com) provided the mapping for ISO 19115-1 and contributed
text editing and review.
REFERENCES
[1] Ball, A., Greenberg, J., Jeffery, K., et al.: RDA Metadata Standards Directory Working Group: Final report.
(2016). Retrieved on 15 Sept. 2021 from: https://www.rd-alliance.org/system/files/MSDWG-Final-Report.pdf
[2] Baca, M. (ed.): Introduction to Metadata: Third Edition. ISBN: 978-1-60606-479-5. (2016). Available
online: http://www.getty.edu/research/publications/electronic_publications/intrometadata3′. Retrieved on
Oct. 31, 2021.
[3] Benjelloun, O., Chen, S., Noy, N.: Google Dataset Search by the Numbers. arXiv:2006.06894. (2020).
https://arxiv.org/pdf/2006.06894.pdf
[4] Brickley, D., Burgess, M., Noy, N.: Google Dataset Search: Building a search engine for datasets in an open
Web ecosystem. The World Wide Web Conference, San Francisco, California, USA, May 2019, pp. 1365–1375
(2019)
[5] Canham, S., Ohmann, C.: ECRIN Clinical Research Metadata Schema Version 2 (April 2018) (2.0). Zenodo.
(2018). https://doi.org/10.5281/zenodo.1312539
[6] Chan, L.M., Zeng, M.L.: Metadata Interoperability and Standardization – A Study of Methodology I: Achieving
Interoperability at the Schema Level. D-Lib Magazine 12(6) (2006). ISSN 1082-9873. Available at:
https://dlib.org/dlib/june06/chan/06chan.html
[7] Cox, S.J.D., Gonzalez-Beltran, A.N., Magagna, B., Marinescu, M.-C.: Ten simple rules for making a vocabulary
FAIR. PLoS Comput Biol 17(6), e1009041 (2021). https://doi.org/10.1371/journal.pcbi.1009041
[8] DataCite Metadata Working Group (DataCite): DataCite Metadata Schema Documentation for the
Publication and Citation of Research Data. Version 4.1. DataCite e.V. https://doi.org/10.5438/0014 (2017)
[9] Duval, E., Hodgins, W., Sutton, S., Weibel, S.L.: Metadata principles and practicalities. D-Lib Mag 8(4), 16
(2002). Available from: http://www.dlib.org/dlib/april02/weibel/04weibel.html
[10] Fenner, M.: Using Schema.org for DOI registration. DataCite Blog (Jan. 9, 2017). (2017). Available from:
https://doi.org/10.5438/0000-00cc
[11] Guha, R.V., Brickley, D., Macbeth, S.: Schema.org: Evolution of structured data on the Web: Big data makes
common schemas even more necessary. ACM Queue, November 2015. https://doi.org/10.1145/2857274.2857276
[12] Gray, A.J.G., Goble, C.A., Jimenez, R.: Bioschemas: From Potato Salad to Protein Annotation. In: International
Semantic Web Conference (Posters, Demos & Industry Tracks) (2017)
[13] Habermann, T.: Mapping ISO 19115-1 geographic metadata standards to CodeMeta. PeerJ Computer Science
5, e174 (2019). https://doi.org/10.7717/peerj-cs.174
[14] Jones, M.B., et al.: Science-on-Schema.org v1.2.0 (Version 1.2.0). Zenodo. (2021).
https://doi.org/10.5281/zenodo.4477164
[15] Lagoze, C.: Keeping Dublin Core simple: Cross-domain discovery or resource description? D-Lib Magazine
7(1) (2001)
[16] Lebo, T., et al.: PROV-O: The PROV Ontology. (W3C Recommendation). World Wide Web Consortium.
(2013). http://www.w3.org/TR/2013/REC-prov-o-20130430/
[17] Magagna, B., et al.: The I-ADOPT interoperability framework for FAIRer data descriptions of biodiversity. (2021).
https://doi.org/10.5194/egusphere-egu21-13155
[18] Nilsson, M., Baker, T., Johnston, P.: Interoperability levels for Dublin Core Metadata. (2009). Available at:
https://www.dublincore.org/specifications/dublin-core/interoperability-levels/. Accessed on May 1, 2022
[19] NISO (National Information Standards Organization): Understanding metadata. Bethesda, Maryland: NISO Press.
(2004). Available from: http://www.niso.org/standards/resources/UnderstandingMetadata.pdf
[20] Noy, N.: Making it easier to discover datasets. Published Sept. 5, 2018. Google Blog. (2018). Available from:
https://www.blog.google/products/search/making-it-easier-discover-datasets/
[21] Noy, N.: An analysis of online datasets using dataset search. Google AI Blog. (2020). Available at:
https://ai.googleblog.com/2020/08/an-analysis-of-online-datasets-using.html. Accessed on May 1, 2022
[22] Sansone, S.-A., Gonzalez-Beltran, A., Rocca-Serra, P., et al.: DATS, the data tag suite to enable discoverability
of datasets. Sci Data 4, 170059 (2017). https://doi.org/10.1038/sdata.2017.59
[23] Southwick, S.B., Lampert, C.K., Southwick, R.: Preparing Controlled Vocabularies for Linked Data: Benefits
and Challenges. Journal of Library Metadata 15(3–4), 177–190 (2015)
[24] Tennant, R.: Different paths to interoperability. Library Journal 126(3), 118–119 (2001)
[25] Willis, C., Greenberg, J., White, H.C.: Analysis and Synthesis of Metadata Goals for Scientific Data. Journal
of the American Society for Information Science and Technology 63(8), 1505–1520 (2012).
https://doi.org/10.1002/asi.22683
[26] Wilkinson, M., Dumontier, M., Aalbersberg, I., et al.: The FAIR Guiding Principles for scientific data
management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
[27] Wu, M., et al.: Guidelines for publishing structured metadata on the Web. Research Data Alliance. (2021a).
https://doi.org/10.15497/RDA00066
[28] Wu, M., et al.: A Collection of Crosswalks from Fifteen Research Data Schemas to Schema.org. Research
Data Alliance. (2021b). https://doi.org/10.15497/RDA00069
[29] Wu, M., et al.: Automated metadata annotation: What is and is not possible with machine learning. Data
Intelligence 5(1), 122–138 (2023). https://doi.org/10.1162/dint_a_00162
AUTHOR BIOGRAPHY
Dr. Mingfang Wu is senior research data specialist at the Australian Research
Data Commons (ARDC). She leads ARDC data discovery projects for making
data discoverable by both machine and human users. She has conducted
research in the areas of information retrieval; user search behaviour and
search context through search log analysis, survey and interview; interfaces
supporting exploratory search; and natural language processing.
ORCID: 0000-0003-1206-3431
Dr. Stephen Richard is a geoinformatics consultant based in Tucson, Arizona.
His background is in geologic mapping and geoscience data management
for 24 years at the Arizona Geological Survey. He has participated in
geoscience vocabulary development for state and federal geological surveys
in the US and the IUGS CGI Geoscience Terminology Working Group.
Richard was the editor for the ISO19115-3 XML implementation of ISO
19115 metadata, and has participated in technical development of metadata
catalogs for the US National Geothermal Data System and EarthCube
DataDiscovery Studio, using ISO 19115 metadata. Recent work has focused
on development of schema.org metadata profiles for geoscience datasets for
the EarthCube GeoCODES resource and data catalogs, and development of
metadata schema for cross-domain sample descriptions and astromaterials
analytical data.
ORCID: 0000-0001-6041-5302
Chantelle Verhey is a Research Associate for the World Data System-
International Technology Office hosted at Ocean Networks Canada. She has
a Master of Science in Environmental Management from the University of
Reading in the UK, where she researched forest fire trends in the
Canadian boreal forest. After her research was completed, Chantelle moved
on to work at the University of Waterloo as a Data Manager for the Polar
Data Catalogue. Now, she is combining her research and work experience
to enhance data interoperability within the polar scientific community
through the use of semantic technologies.
ORCID: 0000-0002-0047-7875
Dr. Leyla Jael Castro is currently working as team leader for the Semantic
Retrieval research team, part of the Knowledge Management Group, at
ZB MED – Information Centre for Life Sciences, focusing on topics such as
literature-based information retrieval, recommendation systems, and
ontology-based search and categorization. She participates in community efforts such
as Bioschemas, and networks such as RDA and ELIXIR.
Dr. Baptiste Cecconi is an astronomer working at Observatoire de Paris in
Meudon, France. His background is in radio astronomy, solar and planetary
sciences. He is an active member of his research field’s open science alliances:
the International Virtual Observatory Alliance (IVOA), the International Planetary
Data Alliance (IPDA) and the International Heliophysics Data Environment
Alliance (IHDEA). His recent data-related projects are aiming at building
interfaces between radio astronomy and neighbouring science fields, focussing
on semantic as well as operational interoperability.
ORCID: 0000-0001-7915-5571
Dr. Nick Juty is a Senior Research Technical Manager. He is an experienced
senior scientist with a recent focus on standards adoption across scientific
domains. He has played a leading role in delivering an international and
cross-disciplinary identification system for scientific data (http://identifiers.org).
ORCID: 0000-0002-2036-8350