VISIONARY AND THEORETICAL ARTICLE

Achieving Transparency: A Metadata Perspective

Daniel Gillman*†

US Bureau of Labor Statistics, 2 Massachusetts Ave NE Washington District of Columbia 20212, United States

Keywords: Metadata; Standards; Conformance; Usability; Metadata quality

Citation: Gillman, D.: Achieving transparency: a metadata perspective. Data Intelligence 5(1), 261-274 (2023). doi: 10.1162/dint_a_00188

Received: March 1, 2022; Revised: August 4, 2022; Accepted: October 9, 2022

ABSTRACT

Transparency is vital to realizing the promise of evidence-based policymaking, where “evidence-based”
means including information as to what data mean and why they should be trusted. Transparency, in turn,
requires that enough of this information is provided. Loosely speaking then, transparency is achieved when
sufficient documentation is provided. Sufficiency is situation specific, both for the provider and consumer of
the documentation. These ideas are presented in two recent US commissioned reports: The Promise of
Evidence-Based Policymaking and Transparency in Statistical Information for the National Center for Science
and Engineering Statistics and All Federal Statistical Agencies.

Metadata are a more formalized kind of documentation, and in this paper, we provide and demonstrate
necessary, sufficient, and general conditions for achieving transparency from the metadata perspective:
conforming to a specification, providing quality metadata, and creating a usable interface to the metadata.
These conditions are important for any metadata system, but here the specification is tied to our framework
for metadata quality based on the situation-specific needs for transparency. These ideas are described, and
their interrelationships are explored.

1. INTRODUCTION

The term transparency appears in much recent writing and in many initiatives. For example, Transparency
International① is an effort to reduce corruption in governments around the world by exposing the need for
accountability. The Office of the Director of National Intelligence② in the US promotes transparency of
intelligence for the intelligence community to inform the public. Amazon③ has a transparency initiative to
keep counterfeit products from reaching customers by using security and unique codes to identify those
products.

* The opinions in this article are due to the author and do not necessarily reflect the official policies of the US Bureau of
Labor Statistics.
† Corresponding author: Daniel Gillman (Email: gillman.daniel@bls.gov; ORCID: 0000-0002-1638-3881).
① Transparency International: https://www.transparency.org/

No rights reserved. This work was authored as part of the Contributor’s official duties as an Employee of the United States
Government and is therefore a work of the United States Government. In accordance with 17 U.S.C. 105, no copyright
protection is available for such works under U.S. law.

Data are also the subject of transparency efforts. The focus of the work of the statistical agencies is data.
Producing data is the purpose of these agencies, and much effort is expended towards ensuring all data
have the highest possible quality④. With the publication of the report The Promise of Evidence-Based
Policymaking by the US Commission on Evidence-Based Policymaking in September 2017 (Evidence
Report) [1], a new push towards greater access to data has emerged in the US and elsewhere. Congress
passed the Evidence Act in 2018, and this act codified much of what was included in the Evidence Report.
Included in the Evidence Act were recommendations for greatly improving the way data are produced,
described, disseminated, and managed. However, there remains much work to achieve the recommendations
and adhere to the themes in the Evidence Report. Two of these themes are 1) to increase the emphasis
on producing and maintaining metadata and 2) to make data and the methodologies behind those data more
transparent. These themes are related, and we explore this in some detail.

Plans and efforts to make data transparent are currently discussed widely, and we understand the need
and goal as generally desirable. A report sponsored by the US National Center for Science and Engineering
Statistics (NCSES) and under the auspices of the US National Academies of Sciences, Engineering, and
Medicine called Transparency in Statistical Information for the National Center for Science and Engineering
Statistics and All Federal Statistical Agencies (Transparency Report) [2] was published in November 2021.
This report addresses much of what is needed to make data and methodologies behind creation of the
data transparent. Though the report focuses on the needs of statistical agencies, we feel many of the
recommendations are sufficiently general as to apply to any organization producing data.

This paper reiterates some of what is in the Transparency Report and extends the ideas. The goal is to
propose necessary and sufficient conditions for data or methodologies to be transparent. Metadata play a
central role.

This paper is organized in the following way. First, we say what is meant by transparency. This moves
into a discussion about metadata, schemas, and technical specifications. Conformance (satisfying all
requirements in a technical specification) is a key idea. Also key are metadata quality, which is defined,
and the usability of a system designed to support transparency. Throughout, we advance the argument as
to why these criteria (conformance to a technical specification, metadata quality, and system usability) are
necessary and sufficient for providing transparency.

② Director of National Intelligence: https://www.dni.gov/index.php/how-we-work/transparency
③ Amazon: https://brandservices.amazon.com/transparency
④ US Office of Management and Budget: https://www.whitehouse.gov/omb/information-regulatory-affairs/information-policy/

2. TRANSPARENCY

Dictionaries define transparency to mean “the condition of being easy to perceive or detect.” Therefore,
when it is easy to perceive or detect how to find, access, understand, and use data or methodologies, they
are transparent.

The Transparency Report [2] contains a slightly more useful description:

transparency is the provision of sufficiently detailed documentation of all the processes of producing
official estimates.

This broad definition of transparency raises some important questions. In a specific circumstance, how
is transparency ensured? What documentation is sufficient and how is it delivered? Is it possible to detect
when transparency is achieved through automated means or manual inspection?

Metadata are data in the role of describing other data, methodologies, or resources. Therefore, we know
the answers will lie, in part, with the metadata that are available to a user. The ability to find, understand,
and use data depends on the quality of the metadata available. And here the quality is a measure of how
well the metadata provide all the information necessary to allow the user to complete the tasks set forth.

3. TRANSPARENCY AND METADATA

As defined above, transparency is a general notion, or concept. It is easy to define, but difficult to
characterize generally. It is much easier to characterize when applied to a specific circumstance or kind of
application. Why is this the case? Is it possible to make sense of this reality?

Cognitive psychologists [3] have identified at least two kinds of concepts: entity and relational. Entity
concepts are easier to characterize and more difficult to define. Relational concepts are the opposite.
“Tennis ball” and “variable” are entity concepts. Each is fairly easy to characterize. The concept “guest” is
an example of a relational, or role, concept. One can be a guest at a party, a guest in a hotel, or a guest
user on a secure web site, for example. Each kind is characterizable. However, these specific cases do not
have much in common, so the generic concept of guest is not easily characterized. Role concepts must be
refined, or specialized, to make them characterizable.

Transparency is a role concept. Consider the information needed to make a variable transparent to a user
versus what is needed to make a data transformation process transparent. These are very different situations,
and they require different characterizations.

In general, characterizing a concept provides a means to determine if a specific object or situation meets
the resulting criteria, i.e., if the object or situation corresponds to the concept. Specifically, the characteristics⑤ of
a concept are the categories in which the properties⑥ of objects that correspond to the concept belong.
The properties of an object are descriptive of the object. References [4] and [5] discuss these ideas in detail.

⑤ Characteristics differentiate concepts from each other.

To illustrate, we assume we can describe a variable to support some transparency needs with the
following characteristics: name, defining concept, universe, question, datatype, and set of allowed values.
See Table 1 below for a set of properties corresponding to the characteristics describing a marital status
variable. Short descriptions of the characteristics are included to help those less familiar with describing
variables. Each property corresponds to its characteristic in the following way: the characteristic takes the
role of a question (“What are the allowed values for this variable?”) and the property takes the role of the
answer (“<1, single>, <2, married>, <3, separated>, <4, divorced>, and <5, widowed>”).

Table 1. Describing a Marital Status Variable.

| Characteristics | Description | Properties |
|---|---|---|
| Name | Name of the variable | MS01 |
| Defining concept | Meaning of the variable | Legally defined marital state |
| Universe | Population the variable measures | Adults |
| Question | Question asked to capture data | What is …’s marital status? Choose one of the values. |
| Datatype | Computational aspects [6] | Nominal⑦ |
| Allowed values | Valid values the variable may assign | <1, single>, <2, married>, <3, separated>, <4, divorced>, <5, widowed> |

The properties in Table 1 are metadata, and their corresponding characteristics are elements of a schema
for describing variables for that metadata. In all cases, each characteristic has a set of properties that
correspond to it. These properties form the set of allowed values (or value domain) for the element in a
schema resulting from the characteristic. So, if a particular concept is characterizable, these characteristics
lead directly to a metadata schema. The elements and constraints of the schema turn the schema into a
technical specification.
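
As a minimal sketch of this relationship, the Table 1 description can be held as pairs of schema element and instance value. The key names and the `answer` helper below are illustrative assumptions, not part of any standard or of the paper's own system.

```python
# Hypothetical encoding of Table 1; the key names are illustrative only.
MS01 = {
    "name": "MS01",
    "defining_concept": "Legally defined marital state",
    "universe": "Adults",
    "question": "What is ...'s marital status? Choose one of the values.",
    "datatype": "Nominal",
    "allowed_values": [(1, "single"), (2, "married"), (3, "separated"),
                       (4, "divorced"), (5, "widowed")],
}

# The schema elements are the characteristics, i.e. the keys of the instance.
SCHEMA_ELEMENTS = tuple(MS01)


def answer(instance, characteristic):
    """Return the property that answers the question a characteristic poses."""
    return instance[characteristic]


print(answer(MS01, "datatype"))  # prints the recorded datatype of MS01
```

Each key plays the role of the question and each value the role of the answer, which is the sense in which characteristics lead directly to a schema.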

4. TECHNICAL SPECIFICATIONS

A technical specification is a set of expressions [7], which are of the following kinds:

• Statement: expression that conveys information
• Instruction: expression that conveys an action to be performed
• Recommendation: expression that conveys advice or guidance
• Requirement: expression that conveys criteria to be fulfilled

⑥ Properties differentiate objects from each other.
⑦ The nominal datatype in statistics describes an unordered set of categories. The ISO/IEC 11404 equivalent datatype is state.

In our variable example, we can express the schema in tabular form as seen in Table 2.

Table 2. Schema for Describing Variables.

| Characteristics | Rules |
|---|---|
| Name | Up to 16-character text |
| Defining concept | Term from a glossary entry |
| Universe | Set of possible units for observation |
| Question | Source of data from a questionnaire |
| Datatype | Nominal, Ordinal, Interval, Ratio, Date/Time, Lat/Long, Text |
| Allowed values | Either a list, a numeric range, or a format⑧ |

This schema can also be turned into a set of expressions in natural language. These form the technical
specification in Table 3 below.

Table 3. Technical Specification for Describing Variables.

Natural Language Expressions

A name, which shall be no more than 16 characters in length
A defining concept, which shall be either a link to an entry in an existing glossary or a term with well-known
unambiguous meaning
A universe, a specialization of people, households, establishments, events (e.g., marriages, hiring), or outcomes
(e.g., benefits, jobs) describing all possible units for observation
A question, which comes from the questionnaire used to collect data
A datatype, which shall be one of the following
  o Nominal: unordered set of categories
  o Ordinal: ordered set of categories
  o Interval: numeric values where there is no meaningful zero (nothing)
      ▪ E.g., Fahrenheit temperature scale
      ▪ Expressed as a range
  o Ratio: numeric values with a meaningful zero
      ▪ E.g., Kelvin temperature scale
      ▪ Expressed as a range
  o Date/Time: one of the valid formats for date and time in ISO 8601
  o Lat/Long: valid format for expressing latitude and longitude of a position on Earth
  o Text: character string of no particular maximum length
The set of allowed values shall take one of the following forms:
  o A list of categories and associated codes or terms
  o A numeric range
  o A specific format for date/time, latitude/longitude, or text

⑧ We recognize this list may not be inclusive. The author comes from the social science data community.

As the reader can see, we added more information to the technical specification in Table 3 than what
was in Table 1. Though not complete, the additional formality allows us to determine whether the rules of
the schema are obeyed by some application. By recognizing the kind of expression (statement, instruction,
recommendation, or requirement) each rule exemplifies, it is possible to test (sometimes formally, but
certainly informally) whether all the rules are adhered to.
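
To make the idea of formal testing concrete, here is a minimal sketch that checks only the requirements of Table 3 a program can verify on its own. The function name `violations`, the key names, and the choice to represent a list as a Python list, a range as a two-number tuple, and a format as a string are all assumptions made for illustration. Requirements such as the defining concept coming from a glossary are left to informal inspection.

```python
# Assumed machine-checkable subset of the Table 3 requirements.
DATATYPES = {"Nominal", "Ordinal", "Interval", "Ratio",
             "Date/Time", "Lat/Long", "Text"}
REQUIRED = ("name", "defining_concept", "universe",
            "question", "datatype", "allowed_values")


def violations(instance):
    """Return the Table 3 requirements the instance fails (empty if none)."""
    problems = [f"missing element: {e}" for e in REQUIRED if e not in instance]

    name = instance.get("name")
    if isinstance(name, str) and len(name) > 16:
        problems.append("name exceeds 16 characters")

    datatype = instance.get("datatype")
    if datatype is not None and datatype not in DATATYPES:
        problems.append(f"unknown datatype: {datatype}")

    values = instance.get("allowed_values")
    is_list = isinstance(values, list) and len(values) > 0
    is_range = (isinstance(values, tuple) and len(values) == 2
                and all(isinstance(v, (int, float)) for v in values))
    is_format = isinstance(values, str)
    if values is not None and not (is_list or is_range or is_format):
        problems.append("allowed_values is not a list, numeric range, or format")
    return problems


GOOD = {
    "name": "MS01",
    "defining_concept": "Legally defined marital state",
    "universe": "Adults",
    "question": "What is ...'s marital status?",
    "datatype": "Nominal",
    "allowed_values": [(1, "single"), (2, "married"), (3, "separated"),
                       (4, "divorced"), (5, "widowed")],
}
BAD = dict(GOOD, name="MARITAL_STATUS_01X", datatype="Categorical")

print(violations(GOOD))
print(violations(BAD))
```

Returning a list of failed requirements, rather than a single yes or no, keeps the check useful for the informal inspection the text mentions.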

5. CONFORMANCE

A metadata system (system, hereafter) comprises a database for metadata (or repository), update and
retrieval capabilities, and a user interface. The system conforms to a technical specification if it satisfies all
the requirements in that specification [7]. Sometimes requirements include other expressions that are not
requirements themselves. For example, if there is a requirement that a particular algorithm be used, then
the steps of that algorithm (expressions that are instructions) must be included as well.

Strict conformance means a conforming system satisfies all requirements and nothing more. Sometimes
it is useful to extend a technical specification by adding new requirements, so a system conforming to the
extended specification no longer strictly conforms to the original. Strict conformance is discussed in our
metadata quality framework. If we consider the example of the marital status variable in Table 1 and the
technical specification in Table 3, it is clear by inspection that the description of the marital status variable
in Table 1 strictly conforms to the technical specification in Table 3. If we were to add a characteristic to
Table 1, then it would conform, but no longer strictly conform, to the specification in Table 3.

Another consideration is based on the functionality of a conforming system. If the system can store a
conforming instance of metadata, then that fulfills metadata instance level conformance. If the system can
export, or write, a conforming metadata instance, it conforms as a metadata writer. If the system can read
and store any conforming metadata instance written from another system, then it conforms as a metadata
reader. Finally, if it can read a conforming metadata instance, store it, and write the same instance back
out, the system has metadata repository level conformance. These ideas arise when we address usability.
Many details about how systems conform to technical specifications can be found in [7]. These ideas work
for strict conformance, too.
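
The reader, writer, and repository roles can be sketched as a small round trip. This is a sketch under stated assumptions, not the definition in [7]: the class `MetadataRepository`, the use of JSON as the exchange format, and the reduction of "conforming" to "all six elements present" are all hypothetical simplifications.

```python
import json

# Assumed element names; a fuller check would test every requirement.
REQUIRED = {"name", "defining_concept", "universe",
            "question", "datatype", "allowed_values"}


class MetadataRepository:
    """Toy system that can read, store, and write metadata instances."""

    def __init__(self):
        self._instances = {}

    def read(self, text):
        """Reader role: accept an instance written elsewhere and store it."""
        instance = json.loads(text)
        missing = REQUIRED - set(instance)
        if missing:
            raise ValueError(f"non-conforming instance, missing: {sorted(missing)}")
        self._instances[instance["name"]] = instance
        return instance["name"]

    def write(self, name):
        """Writer role: export a stored instance."""
        return json.dumps(self._instances[name], sort_keys=True)


def round_trips(repository, text):
    """Repository role: the instance written back equals the one read in."""
    name = repository.read(text)
    return json.loads(repository.write(name)) == json.loads(text)
```

Comparing the parsed instances rather than the raw text means differences in key order or whitespace do not count as a failed round trip.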

If we consider the example of the technical specification for describing a variable in Table 3, the rows
are the characteristics of a variable. These correspond to the characteristics column in Tables 1 and 2. We
are assuming here that the technical specification in Table 3 contains the schema elements necessary to
make variables transparent. The properties column in Table 1 describes the marital status variable named
MS01. These are metadata, and they strictly conform at the metadata instance level.

Continuing with the example, if we assume Table 3 contains a complete set of required characteristics
needed to describe a variable, then the description in Table 1 is a complete description of MS01, since it
strictly conforms to Table 3. If a user needs a description of the variable MS01, Table 1 makes that information
transparent to the user through metadata writer strict conformance. Transparency is achieved by a system
strictly conforming to the specification in Table 3 as a metadata writer. This implies strict conformance to
a specification is necessary for transparency. However, is this all that is needed?

There are two questions that naturally arise. The first question is “Are the metadata (the properties for
MS01) correct?” This is the metadata quality issue. The other question is “Is there some system that the user
can employ to find this information, or do they always have to turn to this paper?” This is the usability
concern.

We address these in the following sections.

6. METADATA QUALITY

Metadata are data as well, so it is natural to apply data quality frameworks [8] to metadata to determine
their quality. However, metadata are data in the role of describing some other resources. Therefore, a
metadata quality framework needs to account for this descriptive aspect.

Metadata that are instances of a schema are a shorthand for the textual descriptions contained in
traditional documentation. Consider the schema in Table 1. The schema elements and instance values for
the variable MS01 generate a set of declarative sentences as follows:

• Name of the variable is MS01.
• Defining concept of the variable is legally defined marital state.
• Universe of the variable is adults.
• Question capturing the variable is “What is …’s marital status?”
• Datatype of the variable is nominal.
• Allowed values for the variable are <1, single>, <2, married>, <3, separated>, <4, divorced>,
<5, widowed>.

These simple sentences may be combined, with a few editorial flourishes, into a textual description of
the MS01 variable. The text might look like this:

The variable named MS01 measures the marital status of adults. In this case, marital status is defined
as the legally defined marital state, and the possible values that may be assigned are <1, single>,
<2, married>, <3, separated>, <4, divorced>, <5, widowed>. These are unordered categories, so the
datatype is nominal. The question used to capture the marital status for each adult is “What is …’s
marital status?”

It is evident that this textual description is equivalent to the combination of the sentences, and they in
turn are derived directly from the schema for variables and the marital status variable instance. This means
the schema and instance are equivalent to the textual description of the variable named MS01. This story
approach is consistent with interview studies by Maron [9] of metadata experts, who indicate they consider
the story metadata tell.
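
The step from schema instance to declarative sentences is mechanical, and a short sketch shows it. The templates follow the sentences listed earlier in this section. The names `TEMPLATES`, `sentences`, and `story`, and the key names, are assumptions for illustration. The editorial flourishes of the prose version are not attempted.

```python
# One sentence template per schema element (wording follows the list above).
TEMPLATES = {
    "name": "Name of the variable is {}.",
    "defining_concept": "Defining concept of the variable is {}.",
    "universe": "Universe of the variable is {}.",
    "question": 'Question capturing the variable is "{}"',
    "datatype": "Datatype of the variable is {}.",
    "allowed_values": "Allowed values for the variable are {}.",
}

MS01 = {
    "name": "MS01",
    "defining_concept": "legally defined marital state",
    "universe": "adults",
    "question": "What is ...'s marital status?",
    "datatype": "nominal",
    "allowed_values": [(1, "single"), (2, "married"), (3, "separated"),
                       (4, "divorced"), (5, "widowed")],
}


def sentences(instance):
    """Turn each schema element / instance value pair into one sentence."""
    result = []
    for element, template in TEMPLATES.items():
        value = instance[element]
        if element == "allowed_values":
            value = ", ".join(f"<{code}, {label}>" for code, label in value)
        result.append(template.format(value))
    return result


def story(instance):
    """Concatenate the sentences into a plain textual description."""
    return " ".join(sentences(instance))


print(story(MS01))
```

Because every sentence is derived from exactly one pair, asking whether a sentence is true is the same as asking whether the corresponding instance value is correct.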

The question of the quality of metadata (the instance values of the schema) and the equivalent text
can be broken into the language components syntax, semantics, and pragmatics; and we describe this
further here.

Syntax, semantics, and pragmatics are commonly used criteria in information quality and metadata
quality frameworks. Sundgren [10] and Price and Shanks [11], [12] used the framework to talk about
information quality—the story that some data convey about society. Sundgren only briefly defined each
component as representational, contents-oriented, and purpose-oriented, respectively. Price and Shanks
defined syntax and semantics more carefully. They indicated syntax is about format rules, allowed values
fitting validity criteria, and whether the correct code representing a category is assigned. Semantics refers to whether
the correct meaning has been assigned to elements, so, to use a simple example, if an adult is married their
marital status is recorded as such. Pragmatics was defined in terms of the user and their judgment as to
how relevant some information is.

Myrseth et al [13] used the same framework to talk about metadata quality as the metadata relate to
some underlying data. This is like the information quality approaches above. Syntax refers to validations
with respect to some schema, semantics refers to the match between data and what they are intended to
represent, and pragmatics refers to the perception by the user of the quality of information.

Price and Shanks [11] and [12] took a semiotic approach to recording data and metadata. This is based
on the earlier work of C. S. Peirce [14]. The approach allows a systematic way of tying representations and
meanings, especially in information systems. Myrseth et al [13] adopted the same ideas. Interestingly, for
Price and Shanks this leads to a definition of data from the semiotic perspective. Farance and Gillman [5]
arrived at the same idea independently but extended the definition to include a datatype as described in
ISO/IEC 11404 [6].

We build our metadata quality framework on the same terms (syntax, semantics, and pragmatics) as
above. The definitions we use for syntax and semantics are roughly the same, but we alter the use of the
term pragmatics.

The approach considers both the schema instance and the derived story and asks if they are true. This
addresses metadata quality from the point of view of description.

First, we need to make sure each instance value is valid under the rules of the technical specification.
We refer to this as the syntax aspect of metadata quality. For example, typing errors, formatting mistakes
(e.g., using more than 16 characters in the variable name, see Table 3, or entering the wrong date/time format),
and assigning codes not on the valid list are syntax errors.

The syntax rules are part of the technical specification that defines the schema, and they are among the
requirements provided. Therefore, the syntax component of metadata quality is achieved through strict
conformance to the schema. If an instance of the schema conforms, it must satisfy the syntax rules.

The next level is semantic. Do the instance values convey meaningful information? For example, consider
the datatype in Table 1. Looking at the numeric codes from the list of allowed values, someone might decide
this is an ordered list of categories and select the ordinal datatype. That contradicts the nature of marital
status categories. They are not ordered.

In our example, we list 5 marital statuses, but maybe that was changed from an original list of 4: single,
married, divorced, widowed. Separated was added later. Then, if the choices associated with the question
do not match the allowed values, this is a semantic error also. In another example, the universe might say
monks instead of adults. That could be a valid value, but for a marital status variable it does not make a
lot of sense, because monks take a vow of celibacy and don’t get married.

Referring back to the sentences derived from the pairs of schema elements and instance values, we ask
if each is true. This is the semantic aspect of metadata quality. The truth may be defined in the same way
Tarski did for formal languages [15]. The sentence “The datatype of the variable is nominal” is true iff (if and
only if) the datatype really is nominal.
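
A program cannot decide truth by itself, so any automated semantic check needs an independent source of facts. The sketch below assumes such a source exists and calls it `facts`. That name, the function `false_sentences`, and the key names are hypothetical stand-ins for whatever authority (a questionnaire, a subject-matter expert) establishes what is really the case.

```python
def false_sentences(instance, facts):
    """List the elements whose recorded value differs from the established fact.

    A recorded value is treated as true if and only if it equals the fact,
    mirroring the "true iff" reading given above. Elements with no
    established fact are skipped rather than judged.
    """
    return sorted(element for element, value in instance.items()
                  if element in facts and facts[element] != value)


# Facts assumed to come from an authoritative source outside the metadata.
facts = {
    "datatype": "Nominal",
    "allowed_values": ["single", "married", "separated", "divorced", "widowed"],
}
# A description that passes every syntax rule but is wrong in two places.
recorded = {
    "name": "MS01",
    "datatype": "Ordinal",
    "allowed_values": ["single", "married", "divorced", "widowed"],
}

print(false_sentences(recorded, facts))
```

Both errors in `recorded` are syntactically valid values, which is why conformance checking alone would not catch them.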

The highest level is the pragmatic aspect of metadata quality. Here, we focus on the story rather than
each sentence. We ask the following questions of the story:

• Is the story complete? In other words, does it contain the whole truth?
• Is the story concise? In other words, does it contain nothing but the truth?

The schema element / instance value pairs might be true individually (i.e., correct), but they don’t
necessarily tell a complete and concise story. If part of the story is left out, the story is not complete. If part
of the story is irrelevant or unnecessary, the story is not concise. Using the example in Table 1 and the story
implied (see above), examples of possible missing and irrelevant characteristics are

1. Missing: leaving out the set of allowed values

   a. The description is incomplete.
   b. This corresponds to whether the whole truth about the resource is provided.

2. Irrelevant: including an estimate of the population of the Soviet Union

   a. The description of the variable is not concise because the Soviet Union no longer exists as a
      country and the included information is not relevant to understanding variables.
   b. This corresponds to whether nothing but the truth about the resource is provided.

Pragmatics fails when the story fails in some way. From the point of view of the technical specification
we conform to, we expect that specification to include a set of elements that are relevant and complete.
In this way, the stories based on the schema element / instance value pairs (key/value pairs) are informative.
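
If the specification is assumed to list exactly the relevant elements, the two pragmatic questions reduce to a comparison of element sets. The following is a minimal sketch under that assumption. `SPEC_ELEMENTS`, `story_gaps`, and the key names are illustrative, and `soviet_union_population` is a made-up element echoing the example above.

```python
# Assumed: the specification names exactly the elements a story needs.
SPEC_ELEMENTS = {"name", "defining_concept", "universe",
                 "question", "datatype", "allowed_values"}


def story_gaps(instance):
    """Report what keeps the story from being complete or concise."""
    present = set(instance)
    return {
        "missing": sorted(SPEC_ELEMENTS - present),     # not the whole truth
        "irrelevant": sorted(present - SPEC_ELEMENTS),  # more than the truth
    }


description = {
    "name": "MS01",
    "defining_concept": "Legally defined marital state",
    "universe": "Adults",
    "question": "What is ...'s marital status?",
    "datatype": "Nominal",
    "soviet_union_population": 290_000_000,  # placeholder value, irrelevant here
}

print(story_gaps(description))
```

A description with both lists empty has exactly the specified elements, which is the element-level counterpart of strict conformance.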

Using our example again, we limit transparency for variable MS01 if there are errors in the description.
Conveying erroneous information may be worse than conveying none. An error associated with any of the
quality aspects leads to this conclusion, though it is possible that some errors will be more harmful than
others. The important point is that conformance by itself does not imply quality, and lack of quality impedes
transparency.

We summarize the metadata quality framework in Table 4 below:

Table 4. Metadata Quality Framework.

| Components | Definition | Verification |
|---|---|---|
| Syntax | Instance values follow formatting and validity rules | Conformance of instances to metadata schema |
| Semantics | Each instance value is correct | Truth of each instance value |
| Pragmatics | Story generated by all element / instance pairs is complete and concise | Whole truth and nothing but the truth of generated story |

We note previous authors defined the pragmatic aspect of metadata quality as something seen by the
users of the metadata. Our framework takes a different perspective. It places pragmatics as the aspect of
quality associated with the combination of each schema element / instance value pair into forming a
narrative. It is the accuracy of this narrative with respect to all three quality aspects that characterizes our
framework.

7. SYSTEM USABILITY

There are many articles and books about usability and human-computer interaction. A whimsical and
leisurely explanation of usability is in [16]. [17] and [18] contain a more detailed look at human-computer
interaction. The most important thing to understand is that any effective system must have a usable interface.
Failure to achieve this makes systems hard to use. Hard to use systems don’t deliver information easily or,
often, completely. Transparency is reduced when information is difficult to get.

For us, usability is applied to the interface and functionality of the system. There are two main
considerations:

• The system must provide adequate functionality to support transparency.
• The system user interface must be usable.

The main functionality needed to support transparency is for the system to provide the necessary metadata.
Usability of an interface is closely related to conformance to a metadata specification. It is
through an interface to a system that another system (usually a person) can determine whether the first
system conforms to some specification. This is possible when the interface is usable. The usable interface
allows the person to inspect the system to make sure all requirements of the specification are satisfied. So,
by inspecting the output of the system, it is possible to check whether it conforms at the writer level. If the
system interface includes the ability to input metadata and those metadata are expected to conform to a
metadata specification, then the system must conform at the reader level, too. If the input metadata are
delivered back to a user (the same as or different from the one providing the metadata), the system must
conform at the repository level.
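
As a rough illustration of the three conformance levels just described, the following Python sketch may help. It rests on assumptions that are not part of this paper or of any standard: the specification is reduced to a hypothetical set of required field names (`REQUIRED_FIELDS`), and the system is a toy in-memory class (`MetadataSystem`) with hypothetical `submit` and `retrieve` methods. Under this reading, writer-level conformance is checked by inspecting output, reader-level conformance by validating input, and repository-level conformance by confirming that stored metadata are delivered back unchanged.

```python
# Hypothetical specification: fields every metadata record must carry.
REQUIRED_FIELDS = frozenset({"name", "definition", "datatype"})


def conforms(record: dict) -> bool:
    """True if every required field is present with a non-empty value."""
    return all(record.get(f) not in (None, "") for f in REQUIRED_FIELDS)


class MetadataSystem:
    """Toy system whose interface accepts and delivers metadata records."""

    def __init__(self) -> None:
        self._store: dict[str, dict] = {}

    def submit(self, key: str, record: dict) -> None:
        # Reader level: input that does not conform is rejected.
        if not conforms(record):
            raise ValueError(f"record {key!r} does not conform")
        self._store[key] = dict(record)  # keep a private copy

    def retrieve(self, key: str) -> dict:
        # Writer level: the output is what an inspector examines.
        return dict(self._store[key])


def check_writer_level(system: MetadataSystem, keys: list[str]) -> bool:
    """Inspect the system's output for conformance to the specification."""
    return all(conforms(system.retrieve(k)) for k in keys)


def check_repository_level(system: MetadataSystem, key: str, record: dict) -> bool:
    """Submit a record and confirm the same record is delivered back."""
    system.submit(key, record)
    return system.retrieve(key) == record


if __name__ == "__main__":
    system = MetadataSystem()
    record = {
        "name": "unemployment_rate",
        "definition": "Share of the labor force that is unemployed",
        "datatype": "decimal",
    }
    print(check_repository_level(system, "ur", record))
    print(check_writer_level(system, ["ur"]))
```

The point of the sketch is only that each check is carried out through the interface: if `retrieve` were hard to use or hid fields, the inspector could not establish conformance at all.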

The other main consideration is how easily a user can interact with the system interface. Placement of
text and buttons; use of fonts and colors; naming of labels on buttons, pages, and links; and whether
provided functions behave in the expected way are all areas that usability testing can improve. A usable
system delivers the desired results in a predictable and straightforward way. This is achieved through
affordances, the ability of the user to intuit which buttons and what functions to use [19].

This points to something interesting, though. The usability necessary to determine conformance to the
technical specification implies that an object described by a high-quality metadata instance of the technical
specification is transparent. In this way, all our transparency criteria are interconnected.

8. CONCLUSION

In this paper, we make the case that conforming to a technical specification for metadata, achieving
metadata quality, and ensuring that any system interface is usable are necessary and sufficient to achieve
transparency.

In the course of presenting the argument for our three criteria for transparency, we laid out three components
for assessing metadata quality: syntax, semantics, and pragmatics. We provided criteria for assessing each
of the components: conformance for the syntactic component, the truth of each instance value for the
semantic component, the whole truth for the completeness aspect of the pragmatic component, and nothing
but the truth for the conciseness aspect of the pragmatic component.

We demonstrated how the criteria for transparency are interrelated, and these interrelationships mean that
all the criteria need to be considered when planning and designing systems supporting transparency.

REFERENCES

[1] The Promise of Evidence-Based Policymaking, Commission on Evidence-Based Policymaking, Chaired by
Katharine Abraham, September 2017. Available at: https://bipartisanpolicy.org/download/?file=/wp-content/
uploads/2019/03/Full-Report-The-Promise-of-Evidence-Based-Policymaking-Report-of-the-Comission-on
Evidence-based-Policymaking.pdf. Accessed 19 February 2022

[2] National Academies of Sciences, Engineering, and Medicine; Division of Behavioral and Social Sciences and
Education; Committee on National Statistics; Panel on Transparency and Reproducibility of Federal Statistics
for the National Center for Science and Engineering Statistics. Transparency in Statistical Information for the
National Center for Science and Engineering Statistics and All Federal Statistical Agencies (2021).
Washington, DC: The National Academies Press. https://doi.org/10.17226/26360.

[3] Gentner, D., Kurtz, K.: Learning and using relational categories. In: W.K. Ahn, R.L. Goldstone, B.C. Love, A.B.
Markman & P.W. Wolff (Eds.), Categorization Inside and Outside the Laboratory. Washington, DC: APA
(2005)

[4] ISO 704: 2000. Terminology Work—Principles and Methods. International Organization for Standardization,
Geneva.

[5] Farance, F., Gillman, D.: Foundational Metadata in Cross-Domain Integration: Detailed Model. Ch2,
pp. 9–21. Available at: https://ddialliance.org/Specification/ddi-cdi (2020)

[6] ISO/IEC 11404: 2007. Information technology—General purpose datatypes. International Organization for
Standardization, Geneva.

[7] ISO/IEC 20944-1: 2013. Information technology—Metadata Registries Interoperability and Bindings
(MDRIB)—Part 1: Framework, common vocabulary, and common provisions for conformance. International
Organization for Standardization, Geneva.

[8] Ehling, M., Korner, T. (eds): Handbook on Data Quality Assessment Methods and Tools. European Commission,
Wiesbaden (2007)

[9] Maron, D.: The Pragmatics of Metadata: From Practice to Concept to Practice. IEEE Tech. Comm. Digit. Libr.
(2018)

[10] Sundgren, B.: Guidelines for the Modelling of Statistical Data and Metadata. Published as Guidelines from
the United Nations Statistical Division, New York (1995)

[11] Price, R., Shanks, G.: Empirical Refinement of a Semiotic Information Quality Framework. In: Proceedings
of the 38th Hawaii International Conference on System Sciences, p. 216a (2005)

[12] Price, R., Shanks, G.: A Semiotic Information Quality Framework: Development and Comparative Analysis.
In: Willcocks, L.P., Sauer, C., Lacity, M.C. (eds): Enacting Research Methods in Information Systems. Palgrave
Macmillan, Cham (2016)

[13] Myrseth, P., Stang, J., Dalberg, V.: A data quality framework applied to e-government metadata: A prerequisite
to establish governance of interoperable e-services. In: 2011 Conference on E-Business and E-Government
(ICEE) (2011)

[14] Peirce, C.S.: Collected Papers. Harvard University Press, Cambridge, MA (1931–1935)

[15] Tarski, A.: “Truth, Meaning, and the Indeterminacy of Translation—The Semantic Conception of Truth” in
Philosophy of Language: The Central Topics, Chapter 2.

[16] Krug, S.: Rocket Surgery Made Easy. New Riders (2009)

[17] Cowley, B., Filetti, M., Lukander, K.: The Psychophysiology Primer—A Guide to Methods and a Broad Review
with a Focus on Human-Computer Interaction in Human-Computer Interaction. Now Publishers, Inc. (2016)

[18] Dix, A., et al.: Human-Computer Interaction. 3rd Ed. Pearson (2003)

[19] Norman, D.: The Design of Everyday Things. MIT Press, Cambridge, MA (2013)

AUTHOR BIOGRAPHY

Daniel Gillman works in the Office of Survey Methods Research at the
U.S. Bureau of Labor Statistics (BLS). His research interests include metadata,
standards, terminology, and transparency. At BLS, Gillman led the effort to
build a taxonomy of terms describing all time-series data and was a member
of the team to build a glossary of BLS technical terms. He is a consultant to
several program initiatives at BLS to build metadata repositories in support of
public data releases and internal processing efficiencies, to the BLS output
database redesign effort, and to data-governance modernization efforts at
the U.S. Department of Labor. He is chair of an interagency ad hoc group
to develop guidance on metadata management for the statistical agencies.
Over the years, Gillman has chaired numerous metadata standards groups
under the US federal government, ANSI/INCITS, SDMX, UNECE, and the
DDI-Alliance. He is a member representative to the DDI-Alliance and is a
key developer of the DDI-4 (Cross-Domain Integration) model-driven standard.
In addition, he has participated in numerous efforts across many
standards-development organizations, and he has participated as an expert consultant
to several metadata modeling efforts outside BLS. Prior to working at BLS, he
worked at the U.S. Census Bureau.
ORCID: 0000-0002-1638-3881
