FOCUS FEATURE:
Topological Neuroscience
Feasibility of topological data analysis for
event-related fMRI
Cameron T. Ellis1, Michael Lesnick3, Gregory Henselman-Petrusek2,
Bryn Keller4, and Jonathan D. Cohen2
1Department of Psychology, Yale University, New Haven, CT, USA
2Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
3Department of Mathematics, University at Albany, Albany, NY, USA
4Intel Labs, Hillsboro, OR, USA
a n o p e n a c c e s s
j o u r n a l
Keywords: Topological data analysis, Persistent homology,
design, Representation
fMRI, Simulation, Event-related
ABSTRACT
Recent fMRI research shows that perceptual and cognitive representations are instantiated in
high-dimensional multivoxel patterns in the brain. However, the methods for detecting these
representations are limited. Topological data analysis (TDA) is a new approach, based on the
mathematical field of topology, that can detect unique types of geometric features in patterns
of data. Several recent studies have successfully applied TDA to study various forms of neural
data; however, to our knowledge, TDA has not been successfully applied to data from
event-related fMRI designs. Event-related fMRI is very common but limited in terms of the
number of events that can be run within a practical time frame and the effect size that can be
expected. Here, we investigate whether persistent homology—a popular TDA tool that
identifies topological features in data and quantifies their robustness—can identify known
signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI
data, to assess the plausibility of recovering a simple topological representation under a
variety of conditions. Our results suggest that persistent homology can be used under certain
circumstances to recover topological structure embedded in realistic fMRI data simulations.
AUTHOR SUMMARY
How do we represent the world? In cognitive neuroscience it is typical to think
representations are points in high-dimensional space. In order to study these kinds of spaces
it is necessary to have tools that capture the organization of high-dimensional data.
Topological data analysis (TDA) holds promise for detecting unique types of geometric
features in patterns of data. Although potentially useful, TDA has not been applied to
event-related fMRI data. Here we utilized a popular tool from TDA, persistent homology, to
recover topological signals from event-related fMRI data. We simulated realistic fMRI data
and explored the parameters under which persistent homology can successfully extract
signal. We also provided extensive code and recommendations for how to make the most out
of TDA for fMRI analysis.
INTRODUCTION
A fundamental construct in cognitive psychology and neuroscience is that of a representational
space, within which knowledge is stored. A representational space is a set of dimensions along
Citation: Ellis, C. T., Lesnick, M.,
Henselman-Petrusek, G., Keller, B., &
Cohen, J. D. (2019). Feasibility of
topological data analysis for
event-related fMRI. Network
Neuroscience, 3(3), 695–706. https://
doi.org/10.1162/netn_a_00095
DOI:
https://doi.org/10.1162/netn_a_00095
Supporting Information:
https://doi.org/10.1162/netn_a_00095
https://github.com/CameronTEllis/
event_related_fmri_tda
Received: 30 October 2018
Accepted: 9 May 2019
Competing Interests: The authors have
declared that no competing interests
exist.
Corresponding Author:
Cameron Ellis
cameron.ellis@yale.edu
Handling Editor:
Louis-David Lord
Copyright: © 2019
Massachusetts Institute of Technology
Published under a Creative Commons
Attribution 4.0 International
(CC BY 4.0) license
The MIT Press
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
t
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
.
t
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
Representation:
A point in a high-dimensional space
where each dimension reflects
features of that representation (e.g.,
perceptual, semantic, evaluative
features).
which items of a particular type are described, in which each dimension reflects features of
items of that type (e.g., perceptual, semantic, evaluative). Thus, a central goal of both cognitive
psychology and neuroscience is to characterize the dimensions that define representational
spaces for different types of information (e.g., objects in the world, goals, classes of actions).
Progress towards this goal obviously relies on methods that can identify the structure of such
spaces. Cognitive neuroscience seeks evidence for such structure in patterns of neural activity.
Topological data analysis:
A tool for characterizing the coarse
scale, global, nonlinear geometric
features of data.
fMRI:
Functional magnetic resonance
imaging is a neuroscience method
that indirectly measures brain activity
via the hemodynamic response.
Topological data analysis (TDA) holds promise for this effort, as an unbiased (that is, “ex-
ploratory”) method for identifying the latent structure in high-dimensional data. TDA uses ideas
from the mathematical field of algebraic topology to study the shape (i.e., coarse scale, global,
nonlinear geometric features) of data. For example, TDA seeks to detect clusters, holes, hollow
voids, and tendrils in data (Carlsson, 2009). TDA has recently been applied productively to data
from direct neuronal recordings. For instance, realistic models (Dabaghian, Mémoli, Frank, &
Carlsson, 2012) and real data (Giusti, Pastalkova, Curto, & Itskov, 2015) suggest that the co-
firing of hippocampal place cells contains topological structure. Other studies have shown
that TDA can find structure in fMRI data (Bassett & Sporns, 2017). Examples of TDA being ap-
plied to fMRI data show that when a signal is periodic, TDA can capture it (Knyazeva, Orlov,
Ushakov, Makarenko, & Velichkovsky, 2016), and TDA can describe the topological structure
identified in the state transitions between tasks using other methods (Saggar et al., 2018). How-
ever, to our knowledge, TDA has not yet been demonstrated to directly identify structure in
event-related fMRI data.
Event-related fMRI, in which discrete events are presented one after the other, is an im-
portant method for identifying mental representations. An example of a representation is the
one you have of a landmark near your home (inspired by an experiment from Nielson, Smith,
Sreekumar, Dennis, & Sederberg, 2015). This representation has many dimensions (e.g., its
location relative to your home, the type of place it is, how much you like it, how often you go
there). Your representation of different landmarks around your home will have different coor-
dinates in this representational space. To identify these representations, an event-related fMRI
experiment could be carried out in which you are shown images of these landmarks one after
the other, separated by a pause. Each image presentation is an event and will evoke a neural
response. If these neural responses capture the spatial dimensions of these landmarks, then the
physical distances in these landmarks will relate to the differences in neural representations.
Neural responses might also capture perceptual or evaluative dimensions (e.g., their sensory
features and/or how much you like each), in which case differences along these dimensions
should relate to differences in neural representations. With this logic, it should be possible to
identify patterns of activity in the brain that reflect differences between these landmarks along
each of these dimensions; that is, their topological structure. This is a simple example but it
applies to other types of representational spaces such as faces (Valentine, 1991), temporal re-
lationships (Schapiro, Rogers, Cordova, Turk-Browne, & Botvinick, 2013), and so forth. Some
fMRI studies have generated evidence for patterns of brain activity that reflect the metric prop-
erties inherent in a set of stimuli (Nau, Schröder, Bellmund, & Doeller, 2018; Schapiro et al.,
2013), but these did not use TDA.
An appeal of TDA is that it has the potential to identify the topological structure of repre-
sentational spaces (e.g., contiguity relationships) without requiring that affine properties (i.e.,
exact metric relationships) be preserved; for example, a rubber band preserves its topologi-
cal identity even as it is stretched into various shapes. This may be important for identifying
representational spaces in the brain, if these have topological structure that goes beyond strict
metric relationships—something that standard methods would find harder to detect.
Network Neuroscience
696
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
t
/
/
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
In this work, we focus on the application of persistent homology (Zomorodian & Carlsson,
2005), one of the most widely studied and applied TDA tools, to the analysis of representations
in event-related fMRI. In order to evaluate the potential usefulness of persistent homology
for identifying the structure of representational spaces in the brain, we simulated an event-
related fMRI dataset by embedding a simple topologically structured signal (a loop/ring) within
a realistically noisy model of fMRI data. We tested different design parameters (e.g., number
of event types, and samples per event type), as well as different levels of signal strength, to
evaluate the extent to which persistent homology is able to recover the topological structure
of the signal imposed on the noisy data.
METHODS
Materials and Procedure
◦
Our simulations were based on parameters extracted from raw fMRI data. We obtained the
data from Schapiro et al. (2013). We chose to work with these data because many of their
characteristics were representative of other fMRI studies (including their size: 20 participants,
each with five runs); however, we note that some aspects of the design we simulated are distinct
from the design of these authors (as described below). These data were collected on a 3T
scanner (Siemens Allegra) with a 16-channel head coil and a T2* gradient echo planar imaging
, matrix = 64 × 64, slices = 34, resolution =
sequence (TR = 2 s, TE = 30 ms, flip angle = 90
3 × 3 × 3 mm, gap = 1 mm). We used fmrisim (Ellis, Baldassano, Schapiro, Cai, & Cohen,
2019) to simulate data with equivalent noise properties and embed a known topological
signal into this realistic noise. The code is published at Ellis (2019). An extensive description
of fmrisim is outside of the scope of the current manuscript; we refer interested readers to
Ellis et al. (2019) for a thorough description of how it works and analyzes validating the
accuracy of the simulations. Importantly, this package uses raw fMRI data to estimate noise
parameters that can then be used to make a simulation with noise properties that approximate
as closely as possible those in real data. This means that properties like the spatial structure,
temporal variability, and signal-to-noise ratio are similar between a simulation of a participant
and the participant’s raw data. This can then be used to simulate data with a prespecified (i.e.,
synthesized) signal embedded in it and determine “how much signal” is needed to observe
statistically meaningful effects.
In this study, the noise parameters were estimated from the Schapiro et al. (2013) dataset and
used to generate a noise volume. Figure S1 (see Supporting information) shows plots demon-
strating the extent to which the simulated noise properties approximate real noise properties.
These analyses show that the simulated participants have typical and appropriate noise proper-
ties. With these noise volumes, a prespecified signal (a loop) was linearly added following the
steps demonstrated in Figure S2 (see Supporting information). To explain these steps, the spa-
tial structure of the signal was a set of N equi-spaced points on the circle, where N = 12, 15,
or 18. We embedded these N points into high-dimensional voxel space via a fixed linear,
Euclidean distance-preserving (orthonormal) transformation. Specifically, we embedded the
N points into a region of interest (ROI) of 442 voxels (and only these voxels) spanning the left
superior temporal gyrus, as well as the anterior temporal lobe and inferior frontal gyrus (based
on the significant voxels in an analysis by Schapiro et al., 2013). Note that some of these voxels
in the ROI were outside of the brain because of differences in brain anatomy between partici-
pants, so no signal was inserted into these voxels for that participant/run (mean: 381.0, range:
347–411). This process generated N vectors in a 442-dimensional subspace of voxel space,
arranged in a circle. The embedding into the 442-dimensional space was chosen at random,
Voxel:
A 3-dimensional pixel. In fMRI the
brain is discretized into tens of
thousands of voxels.
Region of interest:
An anatomically circumscribed area
of the brain that is considered in
isolation for an analysis.
Network Neuroscience
697
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
t
/
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
Event-related design:
Experiment design in which events
occur in quick succession (less than
a 15-s pause between them) and
these events are the unit of analysis.
Signal change:
The magnitude of the brain’s
response to an event. This is
measured as a percentage difference
from baseline activity.
Hemodynamic response function:
The expected change in blood
oxygen level that reflects the brain’s
resupply of nutrients after
metabolism.
and the range of values were shifted to all be positive. Figure 1 shows schematically how the
loop structure was embedded into the ROI.
To generate the temporal structure of the signal we took the N points sampled from the
circle and treated them as event types that occur in an event-related design. An event type
can be thought of as a unique stimulus type, like a landmark or face, that lies on the circle in
high-dimensional voxel space. We generated designs in which different event types occurred
in a trial sequence, split in to five runs. Each event type was repeated either 25 or 50 times
per session, distributed equally throughout the five runs. This resulted in simulated session
durations ranging from 60 to 180 min (more than 100 min would be a long fMRI experiment
and might need to be completed across multiple sessions). For reference, 25 repetitions per
event type was equivalent to the usable fMRI data from Schapiro et al. (2013).
These events each lasted for 1 s, and occurred 7, 9, or 11 s apart with equal probability and
in a randomized order. Although this delay is longer than what others like Schapiro et al. (2013)
have used, this longer delay provides a cleaner measurement of the delayed, protracted nature
of the brain’s response on which the fMRI signal relies (Burock, Buckner, Woldorff, Rosen,
& Dale, 1998), and thus gives persistent homology a better chance to identify topological
structure inherent in this signal.
Each event in this design evoked a characteristic pattern of activity across the voxels in the
ROI. The magnitude of each voxel’s response to events was determined by the voxel’s embed-
ding in the representation space. The maximum value in this embedding for each voxel (i.e.,
the coordinate farthest from the origin) was used to scale the response. Specifically, the per-
cent signal change of this coordinate was set to either 0, 0.25, 0.5, 0.75, or 1% and all other
responses were then scaled proportionally relative to this maximum response. Percent signal
change refers to the magnitude of evoked response relative to baseline fluctuations (Gläscher,
2009). In typical experiments, we expect to find between 0.25% and 1% signal change, de-
pending on the cortical area and stimulus (Desmond & Glover, 2002; Rao et al., 1996). The
efficacy of simulating data with a given magnitude was validated in another study (Ellis et al.,
2019).
Note, this simulation may be overestimating the power of these analyses because the par-
ticipants being simulated have unrealistically consistent responses to the task. Although noise
differences exist between simulated participants, the percent signal change that events evoke
is all identical, which is not likely in most samples. Adding variability to the percent signal
change in the simulation would add variability to the test statistics and diminish their signifi-
cance. This was not implemented here since estimating the variability in percent signal change
could not be done systematically. Nonetheless, this is not expected to qualitatively alter the
conclusions that can be drawn from this simulation.
Each voxel’s time course of evoked responses was convolved with a double gamma hemo-
dynamic response function (Friston et al., 1998) to imitate the brain’s response to event onsets.
These evoked responses were then added to the noise and Z-scored across time to complete the
simulation of a run. For each combination of the conditions (number of event types, repetitions
per run, and percent signal change), a new simulation was made for all runs and all participants.
Analysis
To extract the evoked representation, a mass-univariate general linear model was performed
on each run, following the standard approach with real fMRI data. To do this, all of the
Network Neuroscience
698
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
t
/
/
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
Searchlight:
A computation, such as
classification, performed on a local
cluster of voxels that is iterated over
the whole brain.
events for a given type were convolved with a double gamma hemodynamic response function
(Friston et al., 1998) to predict the neural response, creating a coefficient for each event type
at each voxel. These coefficients were then averaged across runs to create a voxelwise re-
sponse to each event type. These averaged responses to each event type were then used in a
BrainIAK (http://brainiak.org) searchlight analysis. A searchlight analysis involves iterating the
same computation over a subset (“searchlight”) of voxels across the brain. The searchlight is a
tensor of 3 × 3 × 3 × N voxels centered on every voxel in the brain, where N is the number of
distinct event types. This searchlight size was chosen to suit convention. Figure S3 (see Sup-
porting information) shows the results with a 7 × 7 × 7 × N searchlight. Compared with the
results of the smaller searchlight, the results for this larger searchlight suggest that size may
influence the false alarm rate.
Importantly, the signal we embedded was distributed arbitrarily across the 442 voxels in the
ROI, but a searchlight can contain only a subset of these voxels. This was especially important
since the shape of the ROI was not smooth or uniform. There were 2,569 searchlights that con-
tained at least one voxel in this ROI, and there were 325 searchlights out of the whole brain that
contained at least 10 voxels from the ROI. Nevertheless, since the signal being embedded into
the high-dimensional space was very low dimension (2-D), this meant that the signal’s repre-
sentation was highly redundant within this ROI. In other words, only a small sample of voxels
was necessary to represent the signal structure. We suggest that this is biologically plausible:
voxels in an ROI may represent different features or dimensions, but any small sample of these
voxels may be sufficient to differentiate representations, especially when the representations
are distributed and thus likely to differ along many dimensions.
For each searchlight, we construct a 27 × N matrix, where each column of the matrix spec-
ifies the voxel pattern in the searchlight for one event type. Each column of voxels is Z-scored
to standardize differences between the events. We then formed an N × N distance matrix by
taking the Euclidean distance between each pair of columns of the 27 × N matrix. Events evok-
ing similar patterns of activity within a given searchlight had a low distance between them,
and vice versa.
A critical step in this analysis is that the columns of voxels are Z-scored; if this is not
performed then the sensitivity to purely multivariate patterns is severely diminished (Davis
& Poldrack, 2013; Walther, Nili, Ejaz, Alink, Kriegeskorte, & Diedrichsen, 2016). In this sim-
ulation, there was no mean difference in the evoked response; however, small, unsystematic
differences in the average response to an event arise by chance. The size of these differences
compared with those of the multivariate pattern that the stimulus evokes is sufficiently large to
generate arbitrary differences in the Euclidean distance between events. Z-scoring the voxel
activity within a condition eliminates these differences and thereby increases the sensitivity
to differences in the multivariate pattern. Accordingly, we used the normalized Euclidean
distance matrices in a persistent homology analysis of searchlights centered on every voxel
in the brain. Note that if mean differences between conditions are expected, for instance be-
cause one condition evokes a stronger response than the other (Davis & Poldrack, 2013), then
Z-scoring may not be appropriate. An alternative approach is to use correlation as the distance
metric, which also has the effect of normalizing mean differences.
We chose to use Euclidean distance here because our simulated signal ought to have no
systematic mean differences between condition, and because it is a typical metric in the TDA commu-
nity; however, in the neuroscience community correlation distance is often used (Kriegeskorte,
Mur, & Bandettini, 2008). Figure S4 (see Supporting information) shows the results are qualitatively
Network Neuroscience
699
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
t
/
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
the same if correlation distance is used. These normalized Euclidean distance matrices were
used as inputs into the persistent homology computation.
A full description of persistent homology is outside of the scope of the present work; inter-
ested readers are pointed to standard introductions to the topic (Carlsson, 2009; Ghrist, 2008).
In brief, persistent homology takes as input a distance matrix and outputs an object called a
barcode, which we explain below. The barcode is constructed from the data (i.e., the Euclidean
distance matrix) by building a growing sequence of geometric objects called a filtration. We
work with a standard filtration construction called the Vietoris-Rips complex (Carlsson, 2009).
The barcode is a list of intervals. We interpret the start of the interval (“the birth”) as
the scale at which some hole in the filtration forms, and the end of the interval (“the death”) as
the scale at which that hole closes up. The length of the interval (i.e., the difference between the
birth and death) is called the persistence of that feature, and is usually interpreted as a measure
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
Persistence:
The length of the interval of a
topological feature. The interval starts
at the scale that a feature forms and
ends at the scale that it disappears.
/
t
/
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
.
t
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Figure 1. Application of persistent homology analysis to fMRI data. Lower left box shows a topo-
logical pattern of activity (N equidistant points lying on a circle in the Euclidean plane). The points
in this set are embedded into an ROI in the brain, convolved with the hemodynamic response, and
combined with noise (blue voxels in upper left image). A distance matrix is then calculated for each
3 × 3 × 3 searchlight of voxels by comparing the event-evoked activity for each event type (matrix at
upper right). This distance matrix is the input to the persistent homology computation to create the
barcode. The birth and deaths of cluster (orange circles) and loop (blue cross) features are plotted
together (purple box at lower right). The persistence diagram is assigned to the central voxel that
contributed to that searchlight.
Network Neuroscience
700
Feasibility of topological data analysis for event-related fMRI
of “importance” or robustness of the feature. Features that have small persistence are consid-
ered topological noise, whereas larger persistence is taken to reflect a meaningful topological
feature (Cohen-Steiner, Edelsbrunner, & Harer, 2007). In fact, in topology we distinguish be-
tween holes of different dimensionality: a zero-dimensional hole corresponds to a cluster, a
one-dimensional hole is a loop, a two-dimensional hole is a hollow sphere, and so on. Each
interval in the barcode is labeled by the dimension of the corresponding feature. One way to
plot the output of persistent homology is as a persistence diagram, where the x-coordinates
are births and the y-coordinates are deaths of features. We use different symbols to distinguish
the dimensions of the features. Figure 1 shows a persistence diagram with a number of cluster
features and a single loop feature.
Our analysis tests for the presence of the single loop feature that was embedded into the data
as part of the simulation. We consider two test statistics of this signal: (a) the number of loops
identified in the barcode (which should be exactly 1); and (b) the persistence of the longest-lived
loop (if any were identified). To do so, we computed a barcode for every searchlight in the brain.
This was done by first constructing the N × N Euclidean distance matrix specified above. This
distance matrix was input in to BrainIAK-extra’s (https://github.com/brainiak/brainiak-extras)
Python wrapper of the PHAT algorithm (Bauer, Kerber, Reininghaus, & Wagner, 2017) to per-
form the persistent homology computation. The barcode from this computation was assigned
to the voxel in the center of the searchlight. This computation was performed on each search-
light in the brain. For computational efficiency, we looked only for cluster and loop features
(i.e., persistent homology in dimensions 0 and 1). If persistent homology is an adequate tool
for identifying the loop structure embedded in realistically noisy fMRI data, then within the
signal ROI the persistent homology plots should have exactly one persistent (long-lived) fea-
ture; this should not be so for an analogous ROI that does not contain signal. For this control
ROI, we used voxels on the symmetrically opposite side of the brain as the signal ROI, and
did not have signal added to the simulation (also 442 voxels). These ROIs then serve as a basis
of comparison to determine where there should and should not be high test statistic values
(either the maximum persistence or the proportion of single loops).
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
t
e
d
u
n
e
n
a
r
t
i
c
e
–
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
To evaluate the reliability of the test statistics, we perform a whole-brain, unbiased statisti-
cal test that corrects for multiple comparison issues. Due to the non-normal properties of the
persistent homology metrics used here (i.e., maximum persistence is a value from 0 to infinity,
or whether the barcode contains a single loop feature is a binary value), we subtracted the
whole-brain mean of each test statistic volume and then performed FSL’s “randomize” func-
tion to compute the nonparametric reliability of test statistics across the sample participants
(Winkler, Ridgway, Webster, Smith, & Nichols, 2014). Voxels surviving threshold-free cluster
enhancement (TFCE) correction at p < 0.05 are plotted for each metric and for each condition.
Finally, we compared the proportion of voxels in the signal ROI that are significant with the
number in the control ROI.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
RESULTS
To visualize the analyses performed here, Figure 2 shows the data for the 12 event types, 25
repetitions condition with different signal change parameters. Figure 2A shows the distance
matrix from a single searchlight in a single subject in the signal ROI. This searchlight was
used as an input into Figure 2B to produce multidimensional scaling plots (Shepard, 1962).
Figure 2C shows the persistence diagram computed for this searchlight. These plots indicate
that as the signal change increases, the fidelity with which the loop structure can be recovered
is greatly increased.
Network Neuroscience
701
Feasibility of topological data analysis for event-related fMRI
Figure 2. Data from example searchlight for different levels of percent signal change for the con-
dition with 12 event types and 25 repetitions. (A) The event-by-event Euclidean distance matrices
after voxels have been normalized within condition in this searchlight. (B) The MDS plots for these
distance matrices. Lines connect events that should be adjacent in representation space. (C) The
persistence diagrams for cluster and loop features, derived from the distance matrices in (A).
To systematically quantify the evidence of a loop, we analyzed the size of the most persistent
(or long-lived) loop feature. Figure 3A shows the mean persistence of voxels in the signal ROI
subtracted from the mean persistence of voxels in the control ROI (i.e., voxels on the exact
opposite side of the brain that did not contain signal). Each line represents a different set of
experimental conditions (25 or 50 repetitions per session; 12, 15, or 18 event types), and each
increment represents a different signal magnitude (0, 0.25, 0.5, 0.75, or 1% signal change).
Figure 3A shows that with low (0.25%) signal, persistence initially dips in the signal ROI relative
to the control ROI. With more signal, especially when there are more repetitions, the signal
ROI shows greater persistence than the control ROI. Figure 3B shows the proportion of voxels
in the signal ROI that are significantly reliable across participants. This shows a similar pattern
as Figure 3A: The number of significant voxels only exceeds the baseline when there is at least
0.5% signal change, especially when there are more repetitions. Figure 3C shows a slice of
the brain test statistic map for the 12 events, 25 repetitions condition for each of the levels of
signal change. Even at moderate signal there is evidence that the loop structure is expressed in
the signal ROI. These findings suggest that it may be possible to recover a single, simple, clear
topological signal with persistent homology in designs with a moderate evoked response and
a large amount of data.
Figures 3D–F show the corresponding analyses for the test statistic evaluating whether there
is only one loop feature (i.e., the associated one-dimensional persistence diagram has exactly
one point). Figure 3D shows that in some conditions there is evidence that the signal ROI
can contain a loop structure even at the lowest signal change (0.25%). In the conditions with
more event types or fewer repetitions, greater signal change was needed, though still in a
reasonable range. Although these mean differences indicate evidence of a loop structure at
low signal change, Figure 3E indicates that more signal is needed for the loop structure to be
significantly reliable—only at 0.5% signal change or higher are single loops detected reliably.
Figure 3F is similar to Figure 3C, and shows that even with moderate signal there is evidence
that the loop structure is expressed in the signal ROI but nowhere else in the brain.
Network Neuroscience
702
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
t
/
e
d
u
n
e
n
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
t
/
e
d
u
n
e
n
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Figure 3. Results of persistent homology analysis of simulated event-related fMRI data. The left
column shows the results using maximum loop persistence (i.e., length of the longest bar in the
1-D barcode) as the metric. (A) The difference in persistence in the signal ROI compared with the
control ROI. Different colors depict numbers of event types (12, 15, or 18), and the different lines
depict numbers of trial repetitions (25 or 50). (B) The proportion of voxels in the signal ROI that
were significant. The black dashed line is the proportion of significant voxels in the control ROI. (C)
In red, the maps of voxels for which the maximum persistence is significantly greater than baseline
at p < 0.05 with TFCE correction for one condition of the experiment. Each brain represents a
different percent signal change. The right column shows the equivalent analyses as (A–C) for the
metric testing the proportion of voxels in the signal ROI exhibiting one loop feature (i.e., the 1-D
barcode with exactly one point). (D) The mean difference between signal and control ROIs. (E) The
proportion of significant voxels in the signal ROI. (F) The significant voxels for each level of signal
change. Error bars for (A) and (D) are the standard error of the mean between participants.
Additional analyses, reported in Figure S5 (see Supporting information), tested whether it
was possible to improve upon the detection of single loops by taking into account “topological
noise.” Topological noise refers to low-persistence features that may emerge briefly in the pres-
ence of much larger and more robust high-persistence features. We can ignore these features
by setting a threshold on the length of persistence that is sufficient to warrant inclusion relative
to the length of the next longest feature. By introducing this threshold, the rates of finding one
loop increase for both the signal and the control ROI. Hence it is necessary to find an optimal
threshold for which the control ROI is kept low and the rate in the signal ROI is high. There are
indications that when using a low signal (0.25%) it is possible to find greater loop evidence
Network Neuroscience
703
Feasibility of topological data analysis for event-related fMRI
in the signal ROI relative to the control ROI when using a ratio threshold between 1.4 and
2.6. However, the optimal threshold differs depending on the number of event types and thus
is difficult to set a priori. Hence, topological thresholding may hold value but determining a
principled way of applying it to neural data requires further investigation.
DISCUSSION
Here we report simulation analyses evaluating the usefulness of persistent homology in recov-
ering topological structure in event-related fMRI data. Specifically, we used fmrisim to simulate
a realistic pattern of noise in fMRI data, and inserted a specific topological feature (a single
loop) into some voxels. We calculated the persistent homology of all voxel-centered search-
lights in the brain and evaluated the extent to which the embedded loop was reliably identified
by persistent homology. We showed that persistent homology can extract structure from a sam-
ple of participants collected with a realistic event-related fMRI design, especially when there
is moderate signal and few event types, with many repetitions of each type.
We used two different metrics to evaluate the topological structure inserted into the data: (a)
maximum persistence of a loop feature and (b) whether there is only one loop feature present
in the barcode. We found generally consistent results between these methods; however, there
were some circumstances in which the evidence conflicted. When the signal strength was
low and there were few repetitions of each event, the measure of maximum persistence was
actually lower in the region of the brain containing signal compared with a control region. At
the same time, the proportion of searchlights that contain a single loop in the signal ROI was
greater than the control ROI, even though this was not significantly reliable across participants.
There are other circumstances where the best metric is less clear. If a larger searchlight is used,
our supplemental analyses suggest that the proportion metric may be more susceptible to noise
than maximum persistence. It is hoped that this simulation protocol could help to investigate
these nuances and other novel designs.
The pattern of results showed that topological signal was best recovered when (a) the signal
strength was high, (b) there were numerous repetitions per event, and (c) there were fewer
events in the circle. The benefits of high signal strength and increased repetitions were ex-
pected; however, it was not anticipated that fewer event types would make it easier to extract
the representation of the loop. This phenomenon may emerge from the greater risk that a sin-
gle noisy event will disrupt the topology of the entire representation. That said, with too few
events, the topological structure cannot be formed.
Although we were successfully able to use persistent homology to recover signal, it is im-
portant to consider a number of limitations that were exposed in this simulation. First, the
distance metric used, as well as the preprocessing done on this metric, is extremely important:
It is possible to entirely miss signal in the brain by doing the wrong type of preprocessing on
the data. We strongly advise using metrics that normalize voxel activity within condition when
the differences between conditions are expected to result from multivariate patterns. Second,
the amount of data needed in these analyses (60 to 180 min) is substantial and may preclude
certain types of experiments. Third, the topological structure that we tested is simple and de-
scribed by relatively few points. Hence for more complicated topological features, such as
community structure (Schapiro et al., 2013), “figures of 8,” interlocking rings, and branching
structures, it is likely that persistence analyses will have even lower power. Nonetheless we
believe the pipeline we set up provides an opportunity to test the viability of extracting different
topological structures from event-related fMRI.
Network Neuroscience
704
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
t
e
d
u
n
e
n
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
It is also possible that other experimental designs may profit from the use of TDA to identify
mental representations. For example, designs using more naturalistic stimuli (e.g., movies or
narratives; Hasson, Nir, Levy, Fuhrmann, & Malach, 2004; Yeshurun et al., 2017) generate large
quantities of data (in which each time point can be treated as a node on a graph) with strong
evoked responses. Such data may be conducive to analysis using TDA. Moreover, persistent
homology is only one type of TDA; it may be that other types of TDA, such as Mapper (Singh,
Mémoli, & Carlsson, 2007), can also recover topological structure in fMRI data.
In sum, we have shown that topological structure embedded in realistic simulations of fMRI
data can be identified and we have characterized conditions under which persistent homology
is highly effective. We hope that the insights and tools introduced here can guide researchers
in discovering the topological structure of neural representations.
ACKNOWLEDGMENTS
Thank you to N. Malek for comments on an earlier draft. We thank the editors and reviewers for
their helpful feedback. We especially thank one of the anonymous reviewers who went above
and beyond what is expected from reviewers and substantively improved the manuscript.
AUTHOR CONTRIBUTIONS
Cameron T. Ellis: Conceptualization; Data curation; Formal analysis; Methodology; Software;
Validation; Visualization; Writing - Original Draft; Writing - Review & Editing. Michael Lesnick:
Conceptualization; Methodology; Writing - Review & Editing. Gregory Henselman-Petrusek:
Methodology; Validation; Writing - Review & Editing. Bryn Keller: Methodology; Soft-
ware; Validation; Visualization; Writing - Review & Editing. Jonathan D. Cohen: Conceptu-
alization; Funding acquisition; Methodology; Supervision; Writing - Review & Editing.
FUNDING INFORMATION
Jonathan D. Cohen, John Templeton Foundation (US). Jonathan D. Cohen: Intel Corporation.
Gregory Henselman-Petrusek: Swartz Center for Theoretical Neuroscience at Princeton
University.
COMPETING INTERESTS
B. K. is an employee of Intel Corporation, which has a current grant with Princeton University.
No other conflicts are declared.
REFERENCES
Bassett, D. S., & Sporns, O. (2017). Network Neuroscience. Nature
Neuroscience, 20(3), 353–364.
Bauer, U., Kerber, M., Reininghaus,
PHAT—Persistent Homology Algorithms Toolbox.
Symbolic Computation, 78, 76–90.
J., & Wagner, H.
(2017).
Journal of
Burock, M. A., Buckner, R. L., Woldorff, M. G., Rosen, B. R., & Dale,
A. M. (1998). Randomized event-related experimental designs al-
low for extremely rapid presentation rates using functional MRI.
NeuroReport, 9(16), 3735–3739.
Carlsson, G. (2009). Topology and data. Bulletin of the American
Mathematical Society, 46(2), 255–308.
Dabaghian, Y., Mémoli, F., Frank, L., & Carlsson, G. (2012). A
topological paradigm for hippocampal spatial map formation
using persistent homology. PLoS Computational Biology, 8(8),
e1002581.
Davis, T., & Poldrack, R. A. (2013). Measuring neural representa-
tions with fMRI: Practices and pitfalls. Annals of the New York
Academy of Sciences, 1296(1), 108–134.
Desmond,
J. E., & Glover, G. H.
(2002). Estimating sample
size in functional MRI
(fMRI) neuroimaging studies: Statisti-
cal power analyses. Journal of Neuroscience Methods, 118(2),
115–128.
Cohen-Steiner, D., Edelsbrunner, H., & Harer, J. (2007). Stability
of persistence diagrams. Discrete and Computational Geometry,
37(1), 103–120.
Ellis, C. T. (2019). Repository for event-related fMRI simulation
for testing topological data analysis, Github, https://github.com/
CameronTEllis/event_related_fmri_tda
Network Neuroscience
705
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
/
/
t
e
d
u
n
e
n
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
t
.
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3
Feasibility of topological data analysis for event-related fMRI
Ellis, C. T., Baldassano, C., Schapiro, A. C., Cai, M. B., & Cohen,
J. (2019). Facilitating open-science with realistic fMRI simu-
lation: Validation and application. bioRxiv. https://doi.org/10.
1101/532424
Friston, K. J., Fletcher, P., Josephs, O., Holmes, A., Rugg, M., &
Turner, R. (1998). Event-related fMRI: Characterizing differential
responses. NeuroImage, 7(1), 30–40.
Ghrist, R. (2008). Barcodes: The persistent topology of data. Bulletin
of the American Mathematical Society, 45(1), 61–75.
Giusti, C., Pastalkova, E., Curto, C., & Itskov, V. (2015). Clique topol-
ogy reveals intrinsic geometric structure in neural correlations. Pro-
ceedings ofthe National AcademyofSciences, 112(44), 13455–13460.
Gläscher, J. (2009). Visualization of group inference data in func-
tional neuroimaging. Neuroinformatics, 7(1), 73–82.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., & Malach, R. (2004).
Intersubject synchronization of cortical activity during natural vi-
sion. Science, 303(5664), 1634–1640.
Knyazeva,
I., Orlov, V., Ushakov, V., Makarenko, N., &
Velichkovsky, B. (2016). On alternative instruments for the fMRI
data analysis: General linear model versus algebraic topology
approach. In Biologically inspired cognitive architectures (BICA)
for young scientists (pp. 107–113). Cham, Switzerland: Springer.
Kriegeskorte, N., Mur, M., & Bandettini, P. (2008). Representational
similarity analysis—Connecting the branches of systems neuro-
science. Frontiers in Systems Neuroscience, 2(4), 1–28.
Nau, M., Schröder, T. N., Bellmund, J. L., & Doeller, C. F. (2018).
Hexadirectional coding of visual space in human entorhinal cor-
tex. Nature Neuroscience, 21(2), 188.
Nielson, D. M., Smith, T. A., Sreekumar, V., Dennis, S., &
Sederberg, P. B. (2015). Human hippocampus represents space
and time during retrieval of real-world memories. Proceedings of
the National Academy of Sciences, 112(35), 11078–11083.
Rao, S. M., Bandettini, P. A., Binder,
J. A.,
Hammeke, T. A., Stein, E. A., & Hyde, J. S. (1996). Relation-
ship between finger movement rate and functional magnetic res-
J. R., Bobholz,
onance signal change in human primary motor cortex. Journal of
Cerebral Blood Flow and Metabolism, 16(6), 1250–1254.
Saggar, M., Sporns, O., Gonzalez-Castillo, J., Bandettini, P. A.,
Carlsson, G., Glover, G., & Reiss, A. L. (2018). Towards a new
approach to reveal dynamical organization of the brain using
topological data analysis. Nature Communications, 9(1), 1399.
Schapiro, A. C., Rogers, T. T., Cordova, N. I., Turk-Browne, N. B.,
& Botvinick, M. M. (2013). Neural representations of events
arise from temporal community structure. Nature Neuroscience,
16(4), 486–492.
Shepard, R. N. (1962). The analysis of proximities: Multidimen-
sional scaling with an unknown distance function. I. Psychome-
trika, 27(2), 125–140.
Singh, G., Mémoli, F., & Carlsson, G. E.
(2007). Topological
methods for the analysis of high dimensional data sets and 3d
object recognition. In M. Botsch & R. Pajarola (Eds.), Eurograph-
icssymposium on pointbased graphics (pp. 91–100). Geneva:
Eurographics Association.
Valentine, T. (1991). A unified account of the effects of distinctive-
ness, inversion, and race in face recognition. Quarterly Journal
of Experimental Psychology Section A, 43(2), 161–204.
Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., &
Diedrichsen, J. (2016). Reliability of dissimilarity measures for
multi-voxel pattern analysis. NeuroImage, 137, 188–200.
Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., &
Nichols, T. E. (2014). Permutation inference for the general lin-
ear model. NeuroImage, 92, 381–397.
Yeshurun, Y., Swanson, S., Simony, E., Chen, J., Lazaridi, C., Honey,
C. J., & Hasson, U. (2017). Same story, different story: The neural
representation of interpretive frameworks. Psychological Science,
28(3), 307–319.
Zomorodian, A., & Carlsson, G. (2005). Computing persistent ho-
mology. Discrete and Computational Geometry, 33(2), 249–274.
Network Neuroscience
706
l
D
o
w
n
o
a
d
e
d
f
r
o
m
h
t
t
p
:
/
/
d
i
r
e
c
t
.
m
i
t
.
t
/
/
e
d
u
n
e
n
a
r
t
i
c
e
-
p
d
l
f
/
/
/
/
/
3
3
6
9
5
1
0
9
2
4
8
7
n
e
n
_
a
_
0
0
0
9
5
p
d
.
t
f
b
y
g
u
e
s
t
t
o
n
0
7
S
e
p
e
m
b
e
r
2
0
2
3