Focus Feature:
Topological Neuroscience

Feasibility of topological data analysis for
event-related fMRI

Cameron T. Ellis1, Michael Lesnick3, Gregory Henselman-Petrusek2,
Bryn Keller4, and Jonathan D. Cohen2

1Department of Psychology, Yale University, New Haven, CT, USA
2Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA
3Department of Mathematics, University at Albany, Albany, NY, USA
4Intel Labs, Hillsboro, OR, USA

An open access journal

Keywords: Topological data analysis, Persistent homology, fMRI, Simulation, Event-related
design, Representation

ABSTRACT

Recent fMRI research shows that perceptual and cognitive representations are instantiated in
high-dimensional multivoxel patterns in the brain. However, the methods for detecting these
representations are limited. Topological data analysis (TDA) is a new approach, based on the
mathematical field of topology, that can detect unique types of geometric features in patterns
of data. Several recent studies have successfully applied TDA to study various forms of neural
data; however, to our knowledge, TDA has not been successfully applied to data from
event-related fMRI designs. Event-related fMRI is very common but limited in terms of the
number of events that can be run within a practical time frame and the effect size that can be
expected. Here, we investigate whether persistent homology, a popular TDA tool that
identifies topological features in data and quantifies their robustness, can identify known
signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI
data, to assess the plausibility of recovering a simple topological representation under a
variety of conditions. Our results suggest that persistent homology can be used under certain
circumstances to recover topological structure embedded in realistic fMRI data simulations.

AUTHOR SUMMARY

How do we represent the world? In cognitive neuroscience it is typical to think
representations are points in high-dimensional space. In order to study these kinds of spaces
it is necessary to have tools that capture the organization of high-dimensional data.
Topological data analysis (TDA) holds promise for detecting unique types of geometric
features in patterns of data. Although potentially useful, TDA has not been applied to
event-related fMRI data. Here we utilized a popular tool from TDA, persistent homology, to
recover topological signals from event-related fMRI data. We simulated realistic fMRI data
and explored the parameters under which persistent homology can successfully extract
signal. We also provided extensive code and recommendations for how to make the most out
of TDA for fMRI analysis.

INTRODUCTION

A fundamental construct in cognitive psychology and neuroscience is that of a representational
space, within which knowledge is stored. A representational space is a set of dimensions along

Citation: Ellis, C. T., Lesnick, M.,
Henselman-Petrusek, G., Keller, B., &
Cohen, J. D. (2019). Feasibility of
topological data analysis for
event-related fMRI. Network
Neuroscience, 3(3), 695–706.
https://doi.org/10.1162/netn_a_00095

DOI:
https://doi.org/10.1162/netn_a_00095

Supporting Information:
https://doi.org/10.1162/netn_a_00095
https://github.com/CameronTEllis/event_related_fmri_tda

Received: 30 October 2018
Accepted: 9 May 2019

Competing Interests: The authors have
declared that no competing interests
exist.

Corresponding Author:
Cameron Ellis
cameron.ellis@yale.edu

Handling Editor:
Louis-David Lord

Copyright: © 2019
Massachusetts Institute of Technology
Published under a Creative Commons
Attribution 4.0 International
(CC BY 4.0) license

The MIT Press


Representation:
A point in a high-dimensional space
where each dimension reflects
features of that representation (e.g.,
perceptual, semantic, evaluative
features).

which items of a particular type are described, in which each dimension reflects features of
items of that type (e.g., perceptual, semantic, evaluative). Thus, a central goal of both cognitive
psychology and neuroscience is to characterize the dimensions that define representational
spaces for different types of information (e.g., objects in the world, goals, classes of actions).
Progress towards this goal relies on methods that can identify the structure of such
spaces. Cognitive neuroscience seeks evidence for such structure in patterns of neural activity.

Topological data analysis:
A tool for characterizing the coarse
scale, global, nonlinear geometric
features of data.

fMRI:
Functional magnetic resonance
imaging is a neuroscience method
that indirectly measures brain activity
via the hemodynamic response.

Topological data analysis (TDA) holds promise for this effort, as an unbiased (that is,
“exploratory”) method for identifying the latent structure in high-dimensional data. TDA uses ideas
from the mathematical field of algebraic topology to study the shape (i.e., coarse scale, global,
nonlinear geometric features) of data. For example, TDA seeks to detect clusters, holes, hollow
voids, and tendrils in data (Carlsson, 2009). TDA has recently been applied productively to data
from direct neuronal recordings. For example, realistic models (Dabaghian, Mémoli, Frank, &
Carlsson, 2012) and real data (Giusti, Pastalkova, Curto, & Itskov, 2015) suggest that the
co-firing of hippocampal place cells contains topological structure. Other studies have shown
that TDA can find structure in fMRI data (Bassett & Sporns, 2017). Examples of TDA being
applied to fMRI data show that when a signal is periodic, TDA can capture it (Knyazeva, Orlov,
Ushakov, Makarenko, & Velichkovsky, 2016), and TDA can describe the topological structure
identified in the state transitions between tasks using other methods (Saggar et al., 2018).
However, to our knowledge, TDA has not yet been demonstrated to directly identify structure in
event-related fMRI data.

Event-related fMRI, in which discrete events are presented one after the other, is an
important method for identifying mental representations. An example of a representation is the
one you have of a landmark near your home (inspired by an experiment from Nielson, Smith,
Sreekumar, Dennis, & Sederberg, 2015). This representation has many dimensions (e.g., its
location relative to your home, the type of place it is, how much you like it, how often you go
there). Your representation of different landmarks around your home will have different
coordinates in this representational space. To identify these representations, an event-related fMRI
experiment could be carried out in which you are shown images of these landmarks one after
another, separated by a pause. Each image presentation is an event and will evoke a neural
response. If these neural responses capture the spatial dimensions of these landmarks, then
physical distances between these landmarks will relate to the differences in neural representations.
Neural responses might also capture perceptual or evaluative dimensions (e.g., their sensory
features and/or how much you like each), in which case differences along these dimensions
should relate to differences in neural representations. With this logic, it should be possible to
identify patterns of activity in the brain that reflect differences between these landmarks along
each of these dimensions; that is, their topological structure. This is a simple example but it
applies to other types of representational spaces such as faces (Valentine, 1991), temporal
relationships (Schapiro, Rogers, Cordova, Turk-Browne, & Botvinick, 2013), and so on. Some
fMRI studies have generated evidence for patterns of brain activity that reflect the metric
properties inherent in a set of stimuli (Nau, Schröder, Bellmund, & Doeller, 2018; Schapiro et al.,
2013), but these did not use TDA.

An appeal of TDA is that it has the potential to identify the topological structure of
representational spaces (e.g., contiguity relationships) without requiring that affine properties (i.e.,
exact metric relationships) be preserved; for example, a rubber band preserves its topological
identity even as it is stretched into various shapes. This may be important for identifying
representational spaces in the brain, if these have topological structure that goes beyond strict
metric relationships, something that standard methods would find harder to detect.


In this work, we focus on the application of persistent homology (Zomorodian & Carlsson,
2005), one of the most widely studied and applied TDA tools, to the analysis of representations
in event-related fMRI. In order to evaluate the potential usefulness of persistent homology
for identifying the structure of representational spaces in the brain, we simulated an
event-related fMRI dataset by embedding a simple topologically structured signal (a loop/ring) within
a realistically noisy model of fMRI data. We tested different design parameters (e.g., number
of event types, and samples per event type), as well as different levels of signal strength, to
evaluate the extent to which persistent homology is able to recover the topological structure
of the signal imposed on the noisy data.

METHODS

Materials and Procedure

Our simulations were based on parameters extracted from raw fMRI data. We obtained the
data from Schapiro et al. (2013). We chose to work with these data because many of their
characteristics were representative of other fMRI studies (including their size: 20 participants,
each with five runs); however, we note that some aspects of the design we simulated are distinct
from the design of these authors (as described below). These data were collected on a 3T
scanner (Siemens Allegra) with a 16-channel head coil and a T2* gradient echo planar imaging
sequence (TR = 2 s, TE = 30 ms, flip angle = 90°, matrix = 64 × 64, slices = 34, resolution =
3 × 3 × 3 mm, gap = 1 mm). We used fmrisim (Ellis, Baldassano, Schapiro, Cai, & Cohen,
2019) to simulate data with equivalent noise properties and embed a known topological
signal into this realistic noise. The code is published at Ellis (2019). An extensive description
of fmrisim is outside of the scope of the current manuscript; we refer interested readers to
Ellis et al. (2019) for a thorough description of how it works and analyses validating the
accuracy of the simulations. Importantly, this package uses raw fMRI data to estimate noise
parameters that can then be used to make a simulation with noise properties that approximate
as closely as possible those in real data. This means that properties like the spatial structure,
temporal variability, and signal-to-noise ratio are similar between a simulation of a participant
and the participant’s raw data. This can then be used to simulate data with a prespecified (i.e.,
synthesized) signal embedded in it and determine “how much signal” is needed to observe
statistically meaningful effects.

In this study, the noise parameters were estimated from the Schapiro et al. (2013) dataset and
used to generate a noise volume. Figure S1 (see Supporting information) shows plots
demonstrating the extent to which the simulated noise properties approximate real noise properties.
These analyses show that the simulated participants have typical and appropriate noise
properties. With these noise volumes, a prespecified signal (a loop) was linearly added following the
steps demonstrated in Figure S2 (see Supporting information). To explain these steps, the
spatial structure of the signal was a set of N equi-spaced points on the circle, where N = 12, 15,
or 18. We embedded these N points into high-dimensional voxel space via a fixed linear,
Euclidean distance-preserving (orthonormal) transformation. Specifically, we embedded the
N points into a region of interest (ROI) of 442 voxels (and only these voxels) spanning the left
superior temporal gyrus, as well as the anterior temporal lobe and inferior frontal gyrus (based
on the significant voxels in an analysis by Schapiro et al., 2013). Note that some of these voxels
in the ROI were outside of the brain because of differences in brain anatomy between
participants, so no signal was inserted into these voxels for that participant/run (mean: 381.0, range:
347–411). This process generated N vectors in a 442-dimensional subspace of voxel space,
arranged in a circle. The embedding into the 442-dimensional space was chosen at random,

Voxel:
A 3-dimensional pixel. In fMRI the
brain is discretized into tens of
thousands of voxels.

Region of interest:
An anatomically circumscribed area
of the brain that is considered in
isolation for an analysis.


Event-related design:
Experiment design in which events
occur in quick succession (less than
a 15-s pause between them) and
these events are the unit of analysis.

Signal change:
The magnitude of the brain’s
response to an event. This is
measured as a percentage difference
from baseline activity.

Hemodynamic response function:
The expected change in blood
oxygen level that reflects the brain’s
resupply of nutrients after
metabolism.

and the range of values was shifted so that all values were positive. Figure 1 shows schematically
how the loop structure was embedded into the ROI.
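The embedding step described above can be sketched in a few lines. This is a minimal illustration rather than the fmrisim implementation: the function names are hypothetical, the two orthonormal directions are drawn with Gram-Schmidt from Gaussian vectors, and the shift to positive values is assumed to be a single constant applied to every voxel (the text does not specify how the shift was done).

```python
import math
import random


def circle_points(n_points):
    """Return n_points equally spaced points on the unit circle."""
    return [
        (math.cos(2 * math.pi * k / n_points), math.sin(2 * math.pi * k / n_points))
        for k in range(n_points)
    ]


def random_orthonormal_pair(n_voxels, rng):
    """Draw two orthonormal vectors in voxel space via Gram-Schmidt."""
    u = [rng.gauss(0, 1) for _ in range(n_voxels)]
    v = [rng.gauss(0, 1) for _ in range(n_voxels)]
    norm_u = math.sqrt(sum(x * x for x in u))
    u = [x / norm_u for x in u]
    proj = sum(a * b for a, b in zip(u, v))
    v = [b - proj * a for a, b in zip(u, v)]
    norm_v = math.sqrt(sum(x * x for x in v))
    v = [x / norm_v for x in v]
    return u, v


def embed_circle(n_points, n_voxels, seed=0):
    """Embed the circle into voxel space, then shift so all values are >= 0.

    Assumption: one global constant is subtracted, which leaves every
    pairwise Euclidean distance unchanged.
    """
    rng = random.Random(seed)
    u, v = random_orthonormal_pair(n_voxels, rng)
    patterns = [
        [x * a + y * b for a, b in zip(u, v)] for x, y in circle_points(n_points)
    ]
    lowest = min(min(p) for p in patterns)
    return [[value - lowest for value in p] for p in patterns]


# One 442-voxel pattern per event type, for N = 12 event types.
patterns = embed_circle(n_points=12, n_voxels=442)
```

Because the two embedding directions are orthonormal and the shift is constant, the distance between any two voxel patterns equals the distance between the corresponding points on the circle.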

To generate the temporal structure of the signal we took the N points sampled from the
circle and treated them as event types that occur in an event-related design. An event type
can be thought of as a unique stimulus type, like a landmark or face, that lies on the circle in
high-dimensional voxel space. We generated designs in which different event types occurred
in a trial sequence, split into five runs. Each event type was repeated either 25 or 50 times
per session, distributed equally throughout the five runs. This resulted in simulated session
durations ranging from 60 to 180 min (more than 100 min would be a long fMRI experiment
and might need to be completed across multiple sessions). For reference, 25 repetitions per
event type was equivalent to the usable fMRI data from Schapiro et al. (2013).

These events each lasted for 1 s, and occurred 7, 9, or 11 s apart with equal probability and
in a randomized order. Although this delay is longer than what others like Schapiro et al. (2013)
have used, this longer delay provides a cleaner measurement of the delayed, protracted nature
of the brain’s response on which the fMRI signal relies (Burock, Buckner, Woldorff, Rosen,
& Dale, 1998), and thus gives persistent homology a better chance to identify topological
structure inherent in this signal.
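A design of this kind could be generated as sketched below. The function name and its arguments are hypothetical, and two details are assumptions that the text leaves open: the 7, 9, or 11 s spacing is read as onset-to-onset, and each run starts at t = 0 with no padding.

```python
import random


def make_event_sequence(n_event_types, reps_per_type, n_runs=5, seed=0):
    """Return one list of (onset_s, event_type) tuples per run.

    Assumptions: spacing is onset-to-onset, runs start at t = 0, and the
    repetitions of each event type divide evenly across runs.
    """
    rng = random.Random(seed)
    reps_per_run = reps_per_type // n_runs
    runs = []
    for _ in range(n_runs):
        # Every event type appears equally often within a run, in random order.
        order = [e for e in range(n_event_types) for _ in range(reps_per_run)]
        rng.shuffle(order)
        onset, events = 0, []
        for event_type in order:
            events.append((onset, event_type))
            onset += rng.choice([7, 9, 11])  # equal-probability jitter
        runs.append(events)
    return runs


runs = make_event_sequence(n_event_types=12, reps_per_type=25)
```

With 12 event types and 25 repetitions, each of the five runs holds 60 events, five per event type.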

Each event in this design evoked a characteristic pattern of activity across the voxels in the
ROI. The magnitude of each voxel’s response to events was determined by the voxel’s
embedding in the representation space. The maximum value in this embedding for each voxel (i.e.,
the coordinate farthest from the origin) was used to scale the response. Specifically, the
percent signal change of this coordinate was set to either 0, 0.25, 0.5, 0.75, or 1% and all other
responses were then scaled proportionally relative to this maximum response. Percent signal
change refers to the magnitude of evoked response relative to baseline fluctuations (Gläscher,
2009). In typical experiments, we expect to find between 0.25% and 1% signal change,
depending on the cortical area and stimulus (Desmond & Glover, 2002; Rao et al., 1996). The
efficacy of simulating data with a given magnitude was validated in another study (Ellis et al.,
2019).
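The proportional scaling can be illustrated as follows. This is only a sketch under one reading of the text, in which the single largest coordinate across all patterns and voxels is set to the target percent signal change; the function name and the toy values are hypothetical.

```python
def scale_to_percent_signal_change(patterns, peak_percent):
    """Rescale all responses so the largest coordinate equals peak_percent.

    Assumption: the maximum is taken over every event type and voxel.
    """
    peak = max(max(p) for p in patterns)
    return [[peak_percent * value / peak for value in p] for p in patterns]


# Toy embedding: two event types, three voxels (non-negative after the shift).
toy_patterns = [[0.0, 0.2, 0.4], [0.1, 0.8, 0.3]]
scaled = scale_to_percent_signal_change(toy_patterns, peak_percent=0.5)
```

All ratios between responses are preserved, so the circular geometry is unchanged up to a uniform scale factor.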

Note that this simulation may overestimate the power of these analyses because the
participants being simulated have unrealistically consistent responses to the task. Although noise
differences exist between simulated participants, the percent signal change that events evoke
is identical across participants, which is not likely in most samples. Adding variability to the
percent signal change in the simulation would add variability to the test statistics and diminish
their significance. This was not implemented here since estimating the variability in percent
signal change could not be done systematically. Nevertheless, this is not expected to
qualitatively alter the conclusions that can be drawn from this simulation.

Each voxel’s time course of evoked responses was convolved with a double gamma hemo-
dynamic response function (Friston et al., 1998) to imitate the brain’s response to event onsets.
These evoked responses were then added to the noise and Z-scored across time to complete the
simulation of a run. For each combination of the conditions (number of event types, repetitions
per run, and percent signal change), a new simulation was made for all runs and all participants.
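The convolution and normalization steps might look like the sketch below. It is not the fmrisim code: the double-gamma shape parameters (6 and 16, with an undershoot ratio of 1/6) are commonly used defaults assumed here, the time course is sampled once per second rather than at the 2-s TR, and the Gaussian noise is a stand-in for the realistic noise volume.

```python
import math
import random


def double_gamma_hrf(duration_s=32, dt=1.0, peak_shape=6.0, under_shape=16.0, ratio=1 / 6):
    """Sample a double-gamma HRF; the shape values are assumed defaults."""
    def gamma_pdf(t, shape):
        return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

    times = [i * dt for i in range(int(duration_s / dt))]
    return [gamma_pdf(t, peak_shape) - ratio * gamma_pdf(t, under_shape) for t in times]


def convolve(signal, kernel):
    """Discrete convolution truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


# Toy voxel: 1-s events at 10 s and 50 s, sampled once per second.
stick = [0.0] * 100
stick[10] = stick[50] = 1.0
hrf = double_gamma_hrf()
evoked = convolve(stick, hrf)

# Add placeholder noise, then Z-score across time.
rng = random.Random(0)
noise = [rng.gauss(0, 0.05) for _ in evoked]
timecourse = zscore([e + n for e, n in zip(evoked, noise)])
```

With these assumed parameters the evoked response peaks about 5 s after each event onset, which is the delayed, protracted response that the event spacing is designed to accommodate.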

Analysis

To extract the evoked representation, a mass-univariate general linear model was fit
to each run, following the standard approach with real fMRI data. To do this, all of the


Searchlight:
A computation, such as
classification, performed on a local
cluster of voxels that is iterated over
the whole brain.

events for a given type were convolved with a double gamma hemodynamic response function
(Friston et al., 1998) to predict the neural response, creating a coefﬁcient for each event type
at each voxel. These coefﬁcients were then averaged across runs to create a voxelwise re-
sponse to each event type. These averaged responses to each event type were then used in a
BrainIAK (http://brainiak.org) searchlight analysis. A searchlight analysis involves iterating the
same computation over a subset (“searchlight”) of voxels across the brain. The searchlight is a
tensor of 3 × 3 × 3 × N voxels centered on every voxel in the brain, where N is the number of
distinct event types. This searchlight size was chosen to suit convention. Figure S3 (see Sup-
porting information) shows the results with a 7 × 7 × 7 × N searchlight. Compared with the
results of the smaller searchlight, the results for this larger searchlight suggest that size may
inﬂuence the false alarm rate.

Importantly, the signal we embedded was distributed arbitrarily across the 442 voxels in the
ROI, but a searchlight can contain only a subset of these voxels. This was especially important
since the shape of the ROI was not smooth or uniform. There were 2,569 searchlights that
contained at least one voxel in this ROI, and there were 325 searchlights out of the whole brain that
contained at least 10 voxels from the ROI. Nevertheless, since the signal being embedded into
the high-dimensional space was very low dimensional (2-D), the signal’s
representation was highly redundant within this ROI. In other words, only a small sample of voxels
was necessary to represent the signal structure. We suggest that this is biologically plausible:
voxels in an ROI may represent different features or dimensions, but any small sample of these
voxels may be sufficient to differentiate representations, especially when the representations
are distributed and thus likely to differ along many dimensions.

For each searchlight, we constructed a 27 × N matrix, where each column of the matrix
specifies the voxel pattern in the searchlight for one event type. Each column of voxels was Z-scored
to standardize differences between the events. We then formed an N × N distance matrix by
taking the Euclidean distance between each pair of columns of the 27 × N matrix. Events
evoking similar patterns of activity within a given searchlight had a low distance between them,
and vice versa.
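A minimal sketch of this step is shown below. The function names are hypothetical, each pattern is stored as a list of voxel responses to one event type (one column of the 27 × N matrix), and the Z-scoring is assumed to use the population standard deviation. The toy searchlight has 4 voxels instead of 27 to keep the example readable.

```python
import math


def zscore(values):
    """Standardize to zero mean and unit (population) standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def distance_matrix(event_patterns):
    """Build the N x N Euclidean distance matrix between Z-scored patterns.

    event_patterns[i] is the list of voxel responses to event type i.
    """
    z = [zscore(p) for p in event_patterns]
    n = len(z)
    return [[math.dist(z[i], z[j]) for j in range(n)] for i in range(n)]


# Toy searchlight with 4 voxels and 3 event types.
patterns = [
    [1.0, 2.0, 3.0, 4.0],
    [11.0, 12.0, 13.0, 14.0],  # same pattern as the first, higher mean
    [4.0, 3.0, 2.0, 1.0],      # reversed pattern
]
dist = distance_matrix(patterns)
```

In this toy case the first two event types end up at distance 0 because they differ only in their mean response, while the reversed pattern is far from both.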

A critical step in this analysis is that the columns of voxels are Z-scored; if this is not
performed then the sensitivity to purely multivariate patterns is severely diminished (Davis
& Poldrack, 2013; Walther, Nili, Ejaz, Alink, Kriegeskorte, & Diedrichsen, 2016). In this
simulation, there was no mean difference in the evoked response; however, small, unsystematic
differences in the average response to an event arise by chance. The size of these differences
compared with those of the multivariate pattern that the stimulus evokes is sufficiently large to
generate arbitrary differences in the Euclidean distance between events. Z-scoring the voxel
activity within a condition eliminates these differences and thereby increases the sensitivity
to differences in the multivariate pattern. Therefore, we used the normalized Euclidean
distance matrices in a persistent homology analysis of searchlights centered on every voxel
in the brain. Note that if mean differences between conditions are expected, for instance
because one condition evokes a stronger response than the other (Davis & Poldrack, 2013), then
Z-scoring may not be appropriate. An alternative approach is to use correlation as the distance
metric, which also has the effect of normalizing mean differences.

We chose to use Euclidean distance here because our simulated signal ought to have no
systematic mean differences between conditions, and because it is a typical metric in the TDA
community; however, in the neuroscience community correlation distance is often used (Kriegeskorte,
Mur, & Bandettini, 2008). Figure S4 (see Supporting information) shows the results are qualitatively


the same if correlation distance is used. These normalized Euclidean distance matrices were
used as inputs into the persistent homology computation.
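A short derivation suggests why the two metrics would be expected to agree. Assume each pattern is Z-scored over the n voxels of a searchlight using the population standard deviation (an assumption; the text does not say which normalization was used), so that each Z-scored pattern x has squared norm n and the dot product of two patterns x and y equals n times their Pearson correlation r_{xy}:

```latex
\[
\lVert x - y \rVert^{2}
  = \lVert x \rVert^{2} + \lVert y \rVert^{2} - 2\, x \cdot y
  = n + n - 2 n\, r_{xy}
  = 2 n \,(1 - r_{xy})
\]
```

For a 27-voxel searchlight this gives a distance of \sqrt{54\,(1 - r_{xy})}, a strictly increasing function of the correlation distance 1 - r_{xy}. Under this assumption the two metrics order all pairs of events identically, so a Vietoris-Rips filtration adds simplices in the same order and only the birth and death values are rescaled.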

A full description of persistent homology is outside of the scope of the present work;
interested readers are pointed to standard introductions to the topic (Carlsson, 2009; Ghrist, 2008).
In brief, persistent homology takes as input a distance matrix and outputs an object called a
barcode, which we explain below. The barcode is constructed from the data (i.e., the Euclidean
distance matrix) by building a growing sequence of geometric objects called a filtration. We
work with a standard filtration construction called the Vietoris-Rips complex (Carlsson, 2009).

The barcode is a list of intervals. We interpret the start of the interval (“the birth”) as
the scale at which some hole in the filtration forms, and the end of the interval (“the death”) as
the scale at which that hole closes up. The length of the interval (i.e., the difference between the
birth and death) is called the persistence of that feature, and is usually interpreted as a measure


Persistence:
The length of the interval of a
topological feature. The interval starts
at the scale that a feature forms and
ends at the scale that it disappears.


Figure 1. Application of persistent homology analysis to fMRI data. Lower left box shows a
topological pattern of activity (N equidistant points lying on a circle in the Euclidean plane). The points
in this set are embedded into an ROI in the brain, convolved with the hemodynamic response, and
combined with noise (blue voxels in upper left image). A distance matrix is then calculated for each
3 × 3 × 3 searchlight of voxels by comparing the event-evoked activity for each event type (matrix at
upper right). This distance matrix is the input to the persistent homology computation to create the
barcode. The births and deaths of cluster (orange circles) and loop (blue cross) features are plotted
together (purple box at lower right). The persistence diagram is assigned to the central voxel that
contributed to that searchlight.


of “importance” or robustness of the feature. Features that have small persistence are
considered topological noise, whereas larger persistence is taken to reflect a meaningful topological
feature (Cohen-Steiner, Edelsbrunner, & Harer, 2007). In fact, in topology we distinguish
between holes of different dimensionality: a zero-dimensional hole corresponds to a cluster, a
one-dimensional hole is a loop, a two-dimensional hole is a hollow sphere, and so on. Each
interval in the barcode is labeled by the dimension of the corresponding feature. One way to
plot the output of persistent homology is as a persistence diagram, where the x-coordinates
are births and the y-coordinates are deaths of features. We use different symbols to distinguish
the dimensions of the features. Figure 1 shows a persistence diagram with a number of cluster
features and a single loop feature.
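To make the barcode concrete, the sketch below computes cluster (dimension 0) and loop (dimension 1) bars from a distance matrix. It is not the PHAT algorithm used in the paper: it is a small, unoptimized boundary-matrix reduction over Z/2 written for illustration, and the function name and the tolerance for discarding zero-length bars are hypothetical choices.

```python
import math
from itertools import combinations


def rips_barcode(dist, tol=1e-9):
    """Return (dimension, birth, death) bars for dimensions 0 and 1.

    Builds the Vietoris-Rips filtration up to triangles and reduces the
    boundary matrix over Z/2. Bars shorter than tol are dropped.
    """
    n = len(dist)
    simplices = [(0.0, (i,)) for i in range(n)]
    for size in (2, 3):
        for verts in combinations(range(n), size):
            # A simplex appears at the scale of its longest edge.
            value = max(dist[a][b] for a, b in combinations(verts, 2))
            simplices.append((value, verts))
    # Faces must precede cofaces: sort by scale, then by dimension.
    simplices.sort(key=lambda s: (s[0], len(s[1]), s[1]))
    index = {verts: i for i, (_, verts) in enumerate(simplices)}

    reduced = {}  # pivot row -> reduced column (set of row indices)
    paired = set()
    bars = []
    for col, (value, verts) in enumerate(simplices):
        boundary = set()
        if len(verts) > 1:
            boundary = {index[face] for face in combinations(verts, len(verts) - 1)}
        # Standard reduction: cancel the pivot until it is unique or empty.
        while boundary and max(boundary) in reduced:
            boundary ^= reduced[max(boundary)]
        if boundary:
            pivot = max(boundary)
            reduced[pivot] = boundary
            paired.update((pivot, col))
            birth, face = simplices[pivot]
            if value - birth > tol:
                bars.append((len(face) - 1, birth, value))
    # Unpaired vertices and edges give bars that never die.
    for col, (value, verts) in enumerate(simplices):
        if col not in paired and len(verts) <= 2:
            bars.append((len(verts) - 1, value, math.inf))
    return bars


# Noise-free check: 12 equally spaced points on the unit circle.
points = [(math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12)) for k in range(12)]
dist = [[math.dist(p, q) for q in points] for p in points]
loops = [bar for bar in rips_barcode(dist) if bar[0] == 1]
```

For this noise-free circle there is a single loop bar. It is born when neighboring points connect (at the distance between adjacent points) and dies when triangles spanning a third of the circle fill it in (at the side length of an inscribed equilateral triangle).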

Our analysis tests for the presence of the single loop feature that was embedded into the data
as part of the simulation. We consider two test statistics of this signal: (a) the number of loops
identified in the barcode (which should be exactly 1); and (b) the persistence of the longest-lived
loop (if any were identified). To do so, we computed a barcode for every searchlight in the brain.
This was done by first constructing the N × N Euclidean distance matrix specified above. The
distance matrix was input into BrainIAK-extras’ (https://github.com/brainiak/brainiak-extras)
Python wrapper of the PHAT algorithm (Bauer, Kerber, Reininghaus, & Wagner, 2017) to
perform the persistent homology computation. The barcode from this computation was assigned
to the voxel in the center of the searchlight. This computation was performed on each
searchlight in the brain. For computational efficiency, we looked only for cluster and loop features
(i.e., persistent homology in dimensions 0 and 1). If persistent homology is an adequate tool
for identifying the loop structure embedded in realistically noisy fMRI data, then within the
signal ROI the persistent homology plots should have exactly one persistent (long-lived)
feature; this should not be so for an analogous ROI that does not contain signal. For this control
ROI, we used voxels on the symmetrically opposite side of the brain from the signal ROI, which
did not have signal added in the simulation (also 442 voxels). These ROIs then serve as a basis
of comparison to determine where there should and should not be high test statistic values
(either the maximum persistence or the proportion of single loops).
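The two test statistics can be read off a barcode as sketched below. The barcode format, a list of (dimension, birth, death) tuples, and the function name are hypothetical; assigning a maximum persistence of 0 when no loop is found is also an assumption, since the text only says the statistic applies "if any were identified".

```python
import math


def loop_statistics(bars, tol=1e-9):
    """Summarize the loop (dimension 1) bars of one searchlight's barcode."""
    persistences = [
        death - birth
        for dim, birth, death in bars
        if dim == 1 and death - birth > tol
    ]
    return {
        "n_loops": len(persistences),
        "single_loop": len(persistences) == 1,
        # Assumption: a searchlight with no loops scores 0.
        "max_persistence": max(persistences, default=0.0),
    }


# Toy barcode: two cluster bars, one long-lived loop, one short-lived loop.
example_bars = [(0, 0.0, 0.5), (0, 0.0, math.inf), (1, 0.52, 1.73), (1, 0.9, 0.95)]
stats = loop_statistics(example_bars)
```

Here the searchlight would not count toward the single-loop proportion because a second, short-lived loop is present, while its maximum persistence is set by the long-lived loop.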


To evaluate the reliability of the test statistics, we performed a whole-brain, unbiased
statistical test that corrects for multiple comparisons. Because of the non-normal properties of the
persistent homology metrics used here (i.e., maximum persistence is a value from 0 to infinity,
and whether the barcode contains a single loop feature is a binary value), we subtracted the
whole-brain mean of each test statistic volume and then ran FSL’s “randomise” function
to compute the nonparametric reliability of test statistics across the sample participants
(Winkler, Ridgway, Webster, Smith, & Nichols, 2014). Voxels surviving threshold-free cluster
enhancement (TFCE) correction at p < 0.05 are plotted for each metric and for each condition.
Finally, we compared the proportion of voxels in the signal ROI that are significant with the
proportion in the control ROI.

RESULTS

To visualize the analyses performed here, Figure 2 shows the data for the 12 event types, 25
repetitions condition with different signal change parameters. Figure 2A shows the distance
matrix from a single searchlight in a single subject in the signal ROI. The distance matrix from
this searchlight was used to produce the multidimensional scaling plots (Shepard, 1962) in
Figure 2B. Figure 2C shows the persistence diagram computed for this searchlight. These plots
indicate that as the signal change increases, the fidelity with which the loop structure can be
recovered is greatly increased.

Figure 2. Data from example searchlight for different levels of percent signal change for the
condition with 12 event types and 25 repetitions. (A) The event-by-event Euclidean distance matrices
after voxels have been normalized within condition in this searchlight. (B) The MDS plots for these
distance matrices. Lines connect events that should be adjacent in representation space. (C) The
persistence diagrams for cluster and loop features, derived from the distance matrices in (A).

To systematically quantify the evidence of a loop, we analyzed the size of the most persistent
(or long-lived) loop feature. Figure 3A shows the mean persistence of voxels in the signal ROI
subtracted from the mean persistence of voxels in the control ROI (i.e., voxels on the exact
opposite side of the brain that did not contain signal). Each line represents a different set of
experimental conditions (25 or 50 repetitions per session; 12, 15, or 18 event types), and each
increment represents a different signal magnitude (0, 0.25, 0.5, 0.75, or 1% signal change).
Figure 3A shows that with low (0.25%) signal, persistence initially dips in the signal ROI
relative to the control ROI. With more signal, especially when there are more repetitions, the
signal ROI shows greater persistence than the control ROI. Figure 3B shows the proportion of
voxels in the signal ROI that are significantly reliable across participants. This shows a similar
pattern as Figure 3A: The number of significant voxels only exceeds the baseline when there is
at least 0.5% signal change, especially when there are more repetitions. Figure 3C shows a slice
of the brain test statistic map for the 12 events, 25 repetitions condition for each of the levels
of signal change. Even at moderate signal there is evidence that the loop structure is expressed
in the signal ROI. These findings suggest that it may be possible to recover a single, simple,
clear topological signal with persistent homology in designs with a moderate evoked response
and a large amount of data.

Figures 3D–F show the corresponding analyses for the test statistic evaluating whether there is
only one loop feature (i.e., the associated one-dimensional persistence diagram has exactly one
point). Figure 3D shows that in some conditions there is evidence that the signal ROI can
contain a loop structure even at the lowest signal change (0.25%). In the conditions with more
event types or fewer repetitions, greater signal change was needed, though still in a reasonable
range. Although these mean differences indicate evidence of a loop structure at low signal
change, Figure 3E indicates that more signal is needed for the loop structure to be significantly
reliable: only at 0.5% signal change or higher are single loops detected reliably. Figure 3F is
similar to Figure 3C, and shows that even with moderate signal there is evidence that the loop
structure is expressed in the signal ROI but nowhere else in the brain.

Figure 3. Results of persistent homology analysis of simulated event-related fMRI data. The left
column shows the results using maximum loop persistence (i.e., length of the longest bar in the
1-D barcode) as the metric. (A) The difference in persistence in the signal ROI compared with the
control ROI. Different colors depict numbers of event types (12, 15, or 18), and the different lines
depict numbers of trial repetitions (25 or 50). (B) The proportion of voxels in the signal ROI that
were significant. The black dashed line is the proportion of significant voxels in the control ROI.
(C) In red, the maps of voxels for which the maximum persistence is significantly greater than
baseline at p < 0.05 with TFCE correction for one condition of the experiment. Each brain
represents a different percent signal change. The right column shows the equivalent analyses as
(A–C) for the metric testing the proportion of voxels in the signal ROI exhibiting one loop feature
(i.e., the 1-D barcode with exactly one point). (D) The mean difference between signal and control
ROIs. (E) The proportion of significant voxels in the signal ROI. (F) The significant voxels for each
level of signal change. Error bars for (A) and (D) are the standard error of the mean between
participants.
Additional analyses, reported in Figure S5 (see Supporting information), tested whether it was possible to improve upon the detection of single loops by taking into account “topological noise.” Topological noise refers to low-persistence features that may emerge briefly in the presence of much larger and more robust high-persistence features. We can ignore these features by setting a threshold on the length of persistence that is sufficient to warrant inclusion relative to the length of the next longest feature. By introducing this threshold, the rates of finding one loop increase for both the signal and the control ROI. Hence it is necessary to find an optimal threshold for which the rate in the control ROI is kept low and the rate in the signal ROI is high. There are indications that when using a low signal (0.25%) it is possible to find greater loop evidence in the signal ROI relative to the control ROI when using a ratio threshold between 1.4 and 2.6. However, the optimal threshold differs depending on the number of event types and is thus difficult to set a priori. Hence, topological thresholding may hold value, but determining a principled way of applying it to neural data requires further investigation.

DISCUSSION

Here we report simulation analyses evaluating the usefulness of persistent homology in recovering topological structure in event-related fMRI data. Specifically, we used fmrisim to simulate a realistic pattern of noise in fMRI data, and inserted a specific topological feature (a single loop) into some voxels. We calculated the persistent homology of all voxel-centered searchlights in the brain and evaluated the extent to which the embedded loop was reliably identified by persistent homology.
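The ratio thresholding of topological noise described for Figure S5 can be sketched as follows. This is one plausible reading, not the authors' implementation: it assumes a barcode counts as a single loop when its longest bar is at least `ratio` times as long as the next longest bar. The function names and toy diagrams are hypothetical.

```python
# Illustrative sketch only: the exact rule used in the supplementary analysis
# is not given in this section, so the ratio rule below is an assumption.

def is_single_loop(diagram, ratio=None):
    """Decide whether a 1-D diagram of (birth, death) pairs shows one loop.

    With ratio=None, exactly one bar is required. With a ratio, shorter bars
    are treated as topological noise when the longest bar is at least
    `ratio` times as long as the next longest bar.
    """
    lengths = sorted((death - birth for birth, death in diagram), reverse=True)
    if not lengths:
        return False
    if len(lengths) == 1:
        return True
    if ratio is None:
        return False
    return lengths[0] >= ratio * lengths[1]


def single_loop_rate(diagrams, ratio=None):
    """Proportion of searchlights (non-empty list of diagrams) counted as single loops."""
    return sum(is_single_loop(d, ratio) for d in diagrams) / len(diagrams)


# Toy diagrams for three searchlights per ROI (made-up values).
signal_roi = [[(0.0, 1.0), (0.25, 0.5)], [(0.0, 1.0), (0.5, 1.0)], [(0.25, 1.0)]]
control_roi = [[(0.0, 0.5), (0.25, 0.625)], [(0.0, 0.75), (0.25, 0.75)], []]

# Scan candidate thresholds and compare detection rates in the two ROIs.
for ratio in (None, 1.4, 2.0, 2.6):
    print(ratio, single_loop_rate(signal_roi, ratio), single_loop_rate(control_roi, ratio))
```

Scanning the ratio in this way makes the trade-off concrete: a lenient ratio raises the single-loop rate in both ROIs, so a useful threshold is one that keeps the control rate low while the signal rate stays high.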
We showed that persistent homology can extract structure from a sample of participants collected with a realistic event-related fMRI design, especially when there is moderate signal and few event types, with many repetitions of each type.

We used two different metrics to evaluate the topological structure inserted into the data: (a) maximum persistence of a loop feature and (b) whether there is only one loop feature present in the barcode. We found generally consistent results between these methods; however, there were some circumstances in which the evidence conflicted. When the signal strength was low and there were few repetitions of each event, the measure of maximum persistence was actually lower in the region of the brain containing signal compared with a control region. At the same time, the proportion of searchlights that contain a single loop in the signal ROI was greater than in the control ROI, even though this was not significantly reliable across participants. There are other circumstances where the best metric is less clear. If a larger searchlight is used, our supplemental analyses suggest that the proportion metric may be more susceptible to noise than maximum persistence. We hope that this simulation protocol can help to investigate these nuances and other novel designs.

The pattern of results showed that topological signal was best recovered when (a) the signal strength was high, (b) there were numerous repetitions per event, and (c) there were fewer events in the circle. The benefits of high signal strength and increased repetitions were expected; however, it was not anticipated that fewer event types would make it easier to extract the representation of the loop. This phenomenon may arise because, with more event types, there is a greater risk that a single noisy event will disrupt the topology of the entire representation. That said, with too few events, the topological structure cannot be formed.
Although we were able to use persistent homology to recover signal, it is important to consider a number of limitations that were exposed in this simulation. First, the distance metric used, as well as the preprocessing done on this metric, is extremely important: It is possible to entirely miss signal in the brain by doing the wrong type of preprocessing on the data. We strongly advise using metrics that normalize voxel activity within condition when the differences between conditions are expected to result from multivariate patterns. Second, the amount of data needed in these analyses (60 to 180 min) is substantial and may preclude certain types of experiments. Third, the topological structure that we tested is simple and described by relatively few points. Hence for more complicated topological features, such as community structure (Schapiro et al., 2013), “figures of 8,” interlocking rings, and branching structures, it is likely that persistence analyses will have even lower power. Nonetheless we believe the pipeline we set up provides an opportunity to test the viability of extracting different topological structures from event-related fMRI.

It is also possible that other experimental designs may profit from the use of TDA to identify mental representations. For example, designs using more naturalistic stimuli (e.g., movies or narratives; Hasson, Nir, Levy, Fuhrmann, & Malach, 2004; Yeshurun et al., 2017) generate large quantities of data (in which each time point can be treated as a node on a graph) with strong evoked responses. Such data may be conducive to analysis using TDA.
Moreover, persistent homology is only one type of TDA; it may be that other types of TDA, such as Mapper (Singh, Mémoli, & Carlsson, 2007), can also recover topological structure in fMRI data.

In sum, we have shown that topological structure embedded in realistic simulations of fMRI data can be identified and we have characterized conditions under which persistent homology is highly effective. We hope that the insights and tools introduced here can guide researchers in discovering the topological structure of neural representations.

ACKNOWLEDGMENTS

We thank N. Malek for comments on an earlier draft. We thank the editors and reviewers for their helpful feedback. We especially thank one of the anonymous reviewers who went above and beyond what is expected from reviewers and substantively improved the manuscript.

AUTHOR CONTRIBUTIONS

Cameron T. Ellis: Conceptualization; Data curation; Formal analysis; Methodology; Software; Validation; Visualization; Writing - Original Draft; Writing - Review & Editing. Michael Lesnick: Conceptualization; Methodology; Writing - Review & Editing. Gregory Henselman-Petrusek: Methodology; Validation; Writing - Review & Editing. Bryn Keller: Methodology; Software; Validation; Visualization; Writing - Review & Editing. Jonathan D. Cohen: Conceptualization; Funding acquisition; Methodology; Supervision; Writing - Review & Editing.

FUNDING INFORMATION

Jonathan D. Cohen: John Templeton Foundation (US). Jonathan D. Cohen: Intel Corporation. Gregory Henselman-Petrusek: Swartz Center for Theoretical Neuroscience at Princeton University.

COMPETING INTERESTS

B. K. is an employee of Intel Corporation, which has a current grant with Princeton University. No other conflicts are declared.

REFERENCES

Bassett, D. S., & Sporns, O. (2017). Network neuroscience. Nature Neuroscience, 20(3), 353–364.
Bauer, U., Kerber, M., Reininghaus, J., & Wagner, H. (2017). PHAT - Persistent Homology Algorithms Toolbox. Journal of Symbolic Computation, 78, 76–90.
Burock, M. A., Buckner, R. L., Woldorff, M. G., Rosen, B. R., & Dale, A. M. (1998). Randomized event-related experimental designs allow for extremely rapid presentation rates using functional MRI. NeuroReport, 9(16), 3735–3739.
Carlsson, G. (2009). Topology and data. Bulletin of the American Mathematical Society, 46(2), 255–308.
Cohen-Steiner, D., Edelsbrunner, H., & Harer, J. (2007). Stability of persistence diagrams. Discrete and Computational Geometry, 37(1), 103–120.
Dabaghian, Y., Mémoli, F., Frank, L., & Carlsson, G. (2012). A topological paradigm for hippocampal spatial map formation using persistent homology. PLoS Computational Biology, 8(8), e1002581.
Davis, T., & Poldrack, R. A. (2013). Measuring neural representations with fMRI: Practices and pitfalls. Annals of the New York Academy of Sciences, 1296(1), 108–134.
Desmond, J. E., & Glover, G. H. (2002). Estimating sample size in functional MRI (fMRI) neuroimaging studies: Statistical power analyses. Journal of Neuroscience Methods, 118(2), 115–128.
Ellis, C. T. (2019). Repository for event-related fMRI simulation for testing topological data analysis, Github, https://github.com/CameronTEllis/event_related_fmri_tda
Ellis, C. T., Baldassano, C., Schapiro, A. C., Cai, M. B., & Cohen, J. (2019). Facilitating open-science with realistic fMRI simulation: Validation and application. bioRxiv. https://doi.org/10.1101/532424
Friston, K. J., Fletcher, P., Josephs, O., Holmes, A., Rugg, M., & Turner, R. (1998). Event-related fMRI: Characterizing differential responses. NeuroImage, 7(1), 30–40.
Ghrist, R. (2008). Barcodes: The persistent topology of data. Bulletin of the American Mathematical Society, 45(1), 61–75.
Giusti, C., Pastalkova, E., Curto, C., & Itskov, V. (2015). Clique topology reveals intrinsic geometric structure in neural correlations. Proceedings of the National Academy of Sciences, 112(44), 13455–13460.
Gläscher, J. (2009). Visualization of group inference data in functional neuroimaging. Neuroinformatics, 7(1), 73–82.
Hasson, U., Nir, Y., Levy, I., Fuhrmann, G., & Malach, R. (2004). Intersubject synchronization of cortical activity during natural vision. Science, 303(5664), 1634–1640.
Knyazeva, I., Orlov, V., Ushakov, V., Makarenko, N., & Velichkovsky, B. (2016). On alternative instruments for the fMRI data analysis: General linear model versus algebraic topology approach. In Biologically inspired cognitive architectures (BICA) for young scientists (pp. 107–113). Cham, Switzerland: Springer.
Kriegeskorte, N., Mur, M., & Bandettini, P. (2008). Representational similarity analysis - Connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2(4), 1–28.
Nau, M., Schröder, T. N., Bellmund, J. L., & Doeller, C. F. (2018). Hexadirectional coding of visual space in human entorhinal cortex. Nature Neuroscience, 21(2), 188.
Nielson, D. M., Smith, T. A., Sreekumar, V., Dennis, S., & Sederberg, P. B. (2015). Human hippocampus represents space and time during retrieval of real-world memories. Proceedings of the National Academy of Sciences, 112(35), 11078–11083.
Rao, S. M., Bandettini, P. A., Binder, J. R., Bobholz, J. A., Hammeke, T. A., Stein, E. A., & Hyde, J. S. (1996). Relationship between finger movement rate and functional magnetic resonance signal change in human primary motor cortex. Journal of Cerebral Blood Flow and Metabolism, 16(6), 1250–1254.
Saggar, M., Sporns, O., Gonzalez-Castillo, J., Bandettini, P. A., Carlsson, G., Glover, G., & Reiss, A. L. (2018). Towards a new approach to reveal dynamical organization of the brain using topological data analysis. Nature Communications, 9(1), 1399.
Schapiro, A. C., Rogers, T. T., Cordova, N. I., Turk-Browne, N. B., & Botvinick, M. M. (2013). Neural representations of events arise from temporal community structure. Nature Neuroscience, 16(4), 486–492.
Shepard, R. N. (1962). The analysis of proximities: Multidimensional scaling with an unknown distance function. I. Psychometrika, 27(2), 125–140.
Singh, G., Mémoli, F., & Carlsson, G. E. (2007). Topological methods for the analysis of high dimensional data sets and 3d object recognition. In M. Botsch & R. Pajarola (Eds.), Eurographics symposium on point-based graphics (pp. 91–100). Geneva: Eurographics Association.
Valentine, T. (1991). A unified account of the effects of distinctiveness, inversion, and race in face recognition. Quarterly Journal of Experimental Psychology Section A, 43(2), 161–204.
Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., & Diedrichsen, J. (2016). Reliability of dissimilarity measures for multi-voxel pattern analysis. NeuroImage, 137, 188–200.
Winkler, A. M., Ridgway, G. R., Webster, M. A., Smith, S. M., & Nichols, T. E. (2014). Permutation inference for the general linear model. NeuroImage, 92, 381–397.
Yeshurun, Y., Swanson, S., Simony, E., Chen, J., Lazaridi, C., Honey, C. J., & Hasson, U. (2017). Same story, different story: The neural representation of interpretive frameworks. Psychological Science, 28(3), 307–319.
Zomorodian, A., & Carlsson, G. (2005). Computing persistent homology. Discrete and Computational Geometry, 33(2), 249–274.
