Early Visual Responses Predict Conscious Face
Perception within and between Subjects
during Binocular Rivalry
Kristian Sandberg1,2, Bahador Bahrami1,2,3, Ryota Kanai2,
Gareth Robert Barnes4, Morten Overgaard1,
and Geraint Rees2,4
Abstract
■ Previous studies indicate that conscious face perception
may be related to neural activity in a large time window around
170–800 msec after stimulus presentation, yet in the majority of
these studies changes in conscious experience are confounded
with changes in physical stimulation. Using multivariate classi-
fication on MEG data recorded when participants reported
changes in conscious perception evoked by binocular rivalry
between a face and a grating, we showed that only MEG signals
in the 120–320 msec time range, peaking at the M170 around
180 msec and the P2m at around 260 msec, reliably predicted
conscious experience. Conscious perception could not only be
decoded significantly better than chance from the sensors that
showed the largest average difference, as previous studies sug-
gest, but also from patterns of activity across groups of occipital
sensors that individually were unable to predict perception
better than chance. Additionally, source space analyses showed
that sources in the early and late visual system predicted con-
scious perception more accurately than frontal and parietal
sites, although conscious perception could also be decoded
there. Finally, the patterns of neural activity associated with
conscious face perception generalized from one participant to
another around the times of maximum prediction accuracy.
Our work thus demonstrates that the neural correlates of par-
ticular conscious contents (here, faces) are highly consistent in
time and space within individuals and that these correlates are
shared to some extent between individuals. ■
INTRODUCTION
There has been much recent interest in characterizing the
neural correlates of conscious face perception, but two
critical issues remain unresolved. The first is the time at
which it becomes possible to determine conscious face
perception from neural signals obtained after a stimulus
is presented. The second is whether patterns of activity
related to conscious face perception generalize mean-
ingfully across participants, thus allowing comparison of
the neural processing related to the conscious experience
of particular stimuli between different individuals. Here, we
addressed these two questions using MEG to study face
perception during binocular rivalry. We also examined sev-
eral more detailed questions, including which MEG sensors
and sources were the most predictive, which frequency
bands were predictive, and how to increase prediction
accuracy based on preprocessing and preselection of trials.
The neural correlates of conscious face perception have
only been studied in the temporal domain in a few recent
EEG studies. The most commonly employed strategy in
1Aarhus University Hospital, 2University College London, 3Aarhus
University, 4Institute of Neurology, London, United Kingdom
those studies was to compare neural signals evoked by
masked stimuli that differ in stimulus-mask onset asyn-
chrony that results in differences in visibility of the masked
stimulus (Harris, Wu, & Woldorff, 2011; Pegna, Darque,
Berrut, & Khateb, 2011; Babiloni et al., 2010; Pegna, Landis,
& Khateb, 2008; Liddell, Williams, Rathjen, Shevrin, &
Gordon, 2004). Jedoch, because all but one of these
studies (Babiloni et al., 2010) compared brief presentations
with long presentations, the stimuli (and corresponding
neural signals) differed not only in terms of whether or
not they were consciously perceived but also in terms of
their duration. Conscious perception of a stimulus was thus
confounded by physical stimulus characteristics (Lumer,
Friston, & Rees, 1998). Moreover, all of these earlier stud-
ies used conventional univariate statistics, comparing, for
example, the magnitude of averaged responses between
different stimulus conditions across participants. Such
approaches are biased toward single strong MEG/EEG
sources and may overlook distributed yet equally predic-
tive information.
It remains controversial whether relatively early or late
ERP/ERF components predict conscious experience. The
relatively early components in question are the N170 found
around 170 msec after stimulus onset and a later response
© 2013 Massachusetts Institute of Technology Published under
a Creative Commons Attribution 3.0 Unported (CC-BY 3.0) license
Journal of Cognitive Neuroscience 25:6, pp. 969–985
doi:10.1162/jocn_a_00353
at around 260 msec (sometimes called P2 or N2, depending
on the analyzed electrodes, and sometimes P300 or
P300-like). The N170 is sometimes found to be larger for
consciously perceived faces than for those that did not
reach awareness (Harris et al., 2011; Pegna et al., 2011;
Babiloni et al., 2010), yet this difference is not always
found (Pegna et al., 2008; Liddell et al., 2004). Similarly,
the P2/N2 correlated positively with conscious experi-
ence in one article (Babiloni et al., 2010) and negatively
in others (Pegna et al., 2011; Liddell et al., 2004). Addi-
tionally, both the N170 (Pegna et al., 2008) and the P2/N2
(Pegna et al., 2011; Liddell et al., 2004) depend on in-
visible stimulus characteristics, suggesting that these com-
ponents reflect unconscious processing (but see Harris
et al., 2011).
Late components are found between 300 and 800 msec
after stimulus presentation. Two studies point to these com-
ponents (300–800 msec) as reflecting conscious experience
of faces (Pegna et al., 2008; Liddell et al., 2004), yet these
late components are only present when stimulus durations
differ between conscious and unconscious stimuli and not
when stimulus duration is kept constant across the en-
tire experiment and stimuli are classified as conscious or
unconscious by the participants (Babiloni et al., 2010).
Here, we therefore sought to identify the time range for
which neural activity was diagnostic of the contents of
conscious experience in a paradigm where conscious ex-
perience changed, but physical stimulation remained con-
stant. We used highly sensitive multivariate pattern analysis
of MEG signals to examine the time when the conscious
experience of the participants viewing intermittent bin-
ocular rivalry (Leopold, Wilke, Maier, & Logothetis, 2002;
Breese, 1899) could be predicted. During intermittent bin-
ocular rivalry, two different stimuli are presented on each
trial—one to each eye. Although two different stimuli are
presented, the participant typically reports perceiving only
one image, and this image varies from trial to trial. In other
words, physical stimuli are kept constant, but conscious
experience varies from trial to trial. This allowed us to
examine whether and when MEG signals predicted con-
scious experience on a per-participant and trial-by-trial
basis. Consistent with previous studies using multivariate
decoding, we collected a large data set from a relatively
small number of individuals (Raizada & Connolly, 2012;
Carlson, Hogendoorn, Kanai, Mesik, & Turret, 2011; Haynes,
Deichmann, & Rees, 2005; Haynes & Rees, 2005), employ-
ing a case-plus-replication approach supplemented with
group analyses where necessary.
Having established the temporal and spatial nature of
the neural activity specific to conscious face perception
by use of multivariate pattern analysis applied to MEG sig-
nals, we further sought to characterize how consistently
this pattern generalized between participants. If the pat-
tern of MEG signals in one participant was sufficient to
provide markers of conscious perception that could be
generalized to other participants, this would provide one
way to compare similarities in neural processing related
to the conscious experience of particular stimuli between
different individuals.
After having examined our two main questions, two
methods for improving multivariate classification accuracy
were also examined: stringent low-pass filtering to smooth
the data and rejection of trials with unclear perception.
Nächste, univariate and multivariate prediction results were
compared to find correlates of conscious face perception
that are not revealed by univariate analyses. This analysis
was performed at the sensor level as well as on activity
reconstructed at various cortical sources. In addition to
these analyses, it was examined whether decoding accu-
racy was improved by taking into account information
distributed across the ERF or by using estimates of power
in various frequency bands.
METHODS
MEG signals were measured from healthy human par-
ticipants while they experienced intermittent binocular
rivalry. Participants viewed binocular rivalry stimuli (images
of a face and a sinusoidal grating) intermittently in a series
of short trials (Figure 1A) and reported their percept using
a button press. This allowed us to label trials by the
reported percept, yet time-lock analyses of the rapidly
changing MEG signal to the specific time of stimulus pre-
sentation instead of relying on the timing of button press
reports, which are both delayed and variable with respect
to the timing of changes in conscious contents. The advan-
tages of this procedure have been described elsewhere
(Kornmeier & Bach, 2004).
Participants
Eight healthy young adults (six women) between 21 and
34 years (mean = 26.0 years, SD = 3.55 years) with normal
or corrected-to-normal vision gave written informed con-
sent to participate in the experiment. The experiments
were approved by the University College London Research
Ethics Committee.
Apparatus and MEG Recording
Stimuli were generated using the MATLAB toolbox Cogent
(www.vislab.ucl.ac.uk/cogent.php). They were projected
onto a 19-in. screen (resolution = 1024 × 768 pixels, refresh
rate = 60 Hz) using a JVC D-ILA, DLA-SX21 projector.
Participants viewed the stimuli through a mirror stereo-
scope positioned at approximately 50 cm from the screen.
MEG data were recorded in a magnetically shielded room
with a 275-channel CTF Omega whole-head gradiometer
system (VSM MedTech, Coquitlam, BC, Canada) with a
600-Hz sampling rate. After participants were comfortably
seated in the MEG, head localizer coils were attached to the
nasion and 1 cm anterior (in the direction of the outer
canthus) of the left and right tragus to monitor head
movement during recording.
Figure 1. Experimental design and results. (A) Experimental design. Rivaling stimuli (face/grating) were presented for trials lasting ∼800 msec
separated by blank periods of ∼900 msec. Stimuli were dichoptically presented to each eye and rotated in opposite directions at a rate of
0.7 rotations per second. Participants reported which of the two images they perceived with a button press as soon as they saw one image clearly.
If perception did not settle, or if the perceived image changed during the trial, the participant reported mixed perception with a third button
press. (B) Classification procedure. SVMs were trained to distinguish neuromagnetic activity related to conscious face and grating perception
for each participant. The SVMs were then used to decode the perception of (1) the same participant on different trials (top) and (2) each of
the other participants (bottom). (C) Left: RT as a function of perceptual report. Right: RT as a function of trial number after a perceptual switch.
(D) RT as a function of time after a perceptual switch by perception. The decrease in RT for nonmixed perception indicates that perception on
average is clearer far from a perceptual switch than immediately after. Trials for which the same percept has been reported at least 10 times
are hereafter referred to as “stable” whereas other trials are referred to as “unstable.”
Stimuli
A red Gabor patch (contrast = 100%, spatial frequency =
3 cycles/degree, standard deviation of the Gaussian
envelope = 10 pixels) was presented to the right eye of
the participants, and a green face was presented to the
left eye (Figure 1A). To avoid piecemeal rivalry where
each image dominates different parts of the visual field
for the majority of the trial, the stimuli rotated at a rate
of 0.7 rotations/sec in opposite directions, and to ensure
that stimuli were perceived in overlapping areas of the vi-
sual field, each stimulus was presented within an annulus
(inner/outer r = 1.3/1.6 degrees of visual angle) consisting
of randomly oriented lines. In the center of the circle was
a small circular fixation dot.
Procedure
During both calibration and experiment, participants re-
ported their perception using three buttons each corre-
sponding to either face, grating, or mixed perception.
Participants swapped the hand used to report between
blocks. This was done to prevent the classification algo-
rithm from associating a perceptual state with neural ac-
tivity related to a specific motor response. To minimize
perceptual bias (Fuhrmann & Cavanagh, 2007), the relative
luminance of the images was adjusted for each participant
until each image was reported equally often (±5%) during
a 1-min-long continuous presentation.
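The balancing procedure described above can be sketched as a simple adjustment loop. This is an illustrative reconstruction, not the authors' code: the function `report_face_proportion` is a hypothetical stand-in for the participant's actual reports during each 1-min presentation, and the 0.02 step size is an assumption.

```python
def calibrate_relative_luminance(report_face_proportion, target=0.5,
                                 tolerance=0.05, step=0.02, max_rounds=100):
    """Adjust the relative luminance of the face image until face and
    grating are reported about equally often (within +/- tolerance).

    report_face_proportion(luminance) maps a relative luminance to the
    proportion of "face" reports -- a stand-in for one calibration
    presentation with the participant.
    """
    luminance = 0.5
    for _ in range(max_rounds):
        proportion = report_face_proportion(luminance)
        if abs(proportion - target) <= tolerance:
            return luminance  # reports balanced within tolerance
        # Face reported too rarely -> brighten the face, and vice versa.
        luminance += step if proportion < target else -step
    raise RuntimeError("calibration did not converge")
```

In practice the adjustment direction would depend on how the individual participant's dominance varies with luminance; the loop above assumes dominance increases monotonically with the face's relative luminance.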
Each participant completed six to nine runs of 12 blocks
of 20 trials, that is, 1440–2160 trials were completed per
participant. On each trial, the stimuli were displayed for
approximately 800 msec. Each trial was separated by a uni-
form gray screen appearing for around 900 msec. Between
blocks, participants were given a short break of 8 sec.
After each run, participants signaled when they were ready
to continue.
Preprocessing
Using SPM8 (www.fil.ion.ucl.ac.uk/spm/), data were down-
sampled to 300 Hz and high-pass filtered at 1 Hz. Be-
havioral reports of perceptual state were used to divide
stimulation intervals into face, grating, or mixed epochs
beginning 600 msec before stimulus onset and ending
1400 msec after. Trials were baseline-corrected based on
the average of the 600 msec prestimulus activity. Artifacts
were rejected at a threshold of 3 pT. On average, 0.24%
(SD = 0.09) of the trials were excluded for each participant
because of artifacts.
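The epoching, baseline-correction, and artifact-rejection steps can be sketched in NumPy as follows. This is an illustrative re-implementation under stated assumptions (the actual pipeline used SPM8); the function name and array layout are our own.

```python
import numpy as np

def epoch_and_reject(data, sfreq, onsets, tmin=-0.6, tmax=1.4, threshold=3e-12):
    """Cut continuous MEG data (channels x samples) into epochs around
    stimulus onsets (given as sample indices), subtract each epoch's
    mean over the prestimulus baseline, and drop epochs whose absolute
    amplitude exceeds the rejection threshold (3 pT, as in the paper).
    """
    n_pre = int(round(-tmin * sfreq))   # samples before onset (600 msec)
    n_post = int(round(tmax * sfreq))   # samples after onset (1400 msec)
    epochs, kept_onsets = [], []
    for onset in onsets:
        start, stop = onset - n_pre, onset + n_post
        if start < 0 or stop > data.shape[1]:
            continue  # epoch extends beyond the recording
        epoch = data[:, start:stop].astype(float).copy()
        # Baseline correction: subtract the 600-msec prestimulus mean.
        epoch -= epoch[:, :n_pre].mean(axis=1, keepdims=True)
        if np.abs(epoch).max() >= threshold:
            continue  # artifact rejection
        epochs.append(epoch)
        kept_onsets.append(onset)
    return np.array(epochs), kept_onsets
```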
ERF Analysis
Traditional, univariate ERF analysis was first performed.
For this analysis, data were filtered at 20 Hz using a fifth-
order Butterworth low-pass filter, and face and grating
perception trials were averaged individually using SPM8.
Source Analysis
Sources were examined using the multiple sparse priors
(MSP; Friston et al., 2008) algorithm. MSP operates by find-
ing the minimum number of patches on a canonical cor-
tical mesh that explain the largest amount of variance
in the MEG data; this tradeoff between complexity and
accuracy is optimized through maximization of model
evidence. The MSP algorithm was first used to identify
the electrical activity underlying the grand-averaged face/
grating contrast maps at a short time window around the
M170 and the P2m (100–400 msec after stimulus onset).
Afterwards, the MSP algorithm was used to make a group-
level source estimation based on template structural MR
scans using all trials (over all conditions) from all eight
participants. The inverse solution restricts the sources to
be the same in all participants but allows for different ac-
tivation levels. This analysis identified 33 sources activated
at stimulus onset (see Table 1). Activity was extracted on a
single trial basis across the 33 sources for each scan of each
participant and thus allowed for analyses to be performed
in source space.
Multivariate Prediction Analysis
Multivariate pattern classification of the evoked responses
was performed using the linear support vector machine
(SVM) of the MATLAB Bioinformatics Toolbox (Math-
Works). The SVM decoded the trial type (face or grating)
independently for each time point along the epoch. Clas-
sification was based on field strength data as well as power
estimates in separate analyses.
Conscious perception was decoded within and between
participants. For within-subject training/testing, 10-fold
cross-validation was used (Figure 1B). For between-subject
training/testing, the SVM was trained on all trials from a sin-
gle participant and tested on all trials of each of the re-
maining participants. The process was repeated until
data from all participants had been used to train the SVM
(Figure 1B).
To decrease classifier training time (for practical rea-
sons), the SVM used only 100 randomly selected trials of
each kind (200 in total). As classification accuracy cannot
be compared between classifiers trained on different num-
bers of trials, participants were excluded from analyses
if they did not report 100 of each kind of analyzed trials.
The number of participants included in each analysis is
reported in the Results section.
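The per-time-point decoding scheme can be sketched as follows, using scikit-learn's linear SVM as a stand-in for the MATLAB Bioinformatics Toolbox classifier actually used in the study.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def decode_over_time(epochs, labels, cv=10):
    """Cross-validated decoding accuracy of a linear SVM trained
    independently at each time point along the epoch.

    epochs: (n_trials, n_sensors, n_times) field-strength data
    labels: (n_trials,) reported percept (0 = grating, 1 = face)
    Returns a (n_times,) array of mean cross-validated accuracies.
    """
    n_times = epochs.shape[2]
    accuracy = np.empty(n_times)
    for t in range(n_times):
        # One classifier per time point, 10-fold cross-validation.
        accuracy[t] = cross_val_score(SVC(kernel="linear"),
                                      epochs[:, :, t], labels, cv=cv).mean()
    return accuracy
```

Between-subject generalization follows the same pattern, except that the classifier is fit on all trials of one participant and scored on the trials of each remaining participant instead of on held-out folds.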
In addition to the evoked response analysis, a mov-
ing window discrete Fourier transform was used to
make a continuous estimate of signal power in selected
frequency bands over time: theta = 3–8 Hz, alpha = 9–
13 Hz, low beta = 14–20 Hz, high beta = 21–30 Hz,
six gamma bands in the range of 31–90 Hz, each con-
sisting of 10 Hz (gamma 1, for example, would thus
be 31–40 Hz) but excluding the 50-Hz band. The dura-
tion of the moving window was set to accommodate
at least three cycles of the lowest frequency within
each band (z.B., for theta [3–8 Hz], the window was
900 msec).
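A moving-window DFT power estimate of this kind can be sketched as follows. This is a minimal illustration of the described approach; the Hann taper is an added assumption (the paper does not specify a window function).

```python
import numpy as np

def band_power_envelope(signal, sfreq, band, min_cycles=3):
    """Continuous power estimate in `band` = (low, high) Hz via a
    moving-window discrete Fourier transform. The window length is set
    to hold at least `min_cycles` cycles of the band's lowest
    frequency, as in the paper (e.g., 1 sec for theta at 3 Hz)."""
    low, high = band
    win = int(round(min_cycles / low * sfreq))
    taper = np.hanning(win)
    freqs = np.fft.rfftfreq(win, d=1.0 / sfreq)
    in_band = (freqs >= low) & (freqs <= high)
    half = win // 2
    power = np.full(signal.size, np.nan)  # undefined near the edges
    for center in range(half, signal.size - win + half):
        segment = signal[center - half:center - half + win]
        spectrum = np.fft.rfft(segment * taper)
        # Mean power over the DFT bins falling inside the band.
        power[center] = np.mean(np.abs(spectrum[in_band]) ** 2)
    return power
```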
Statistical Testing
All statistical tests were two-tailed. Comparisons of classi-
fication accuracies were performed on a within-subject
Table 1. Sources

Source  Area            Name      x     y     z
1       Occipital lobe  lV1      −2   −96     5
2                       rV1      12   −98    −1
3                       lvOCC1  −16   −94   −18
4                       rvOCC1   21   −96   −17
5                       lvOCC2  −14   −80   −13
6                       rvOCC2   15   −80   −12
7                       ldOCC   −18   −81    40
8                       rdOCC    19   −82    40
9       OFA             lOFA    −38   −80   −15
10                      rOFA     39   −80   −15
11      Face-specific   lpSTS1  −54   −63     9
12                      rpSTS1   53   −63    13
13                      lpSTS2  −55   −50    23
14                      rpSTS2   54   −49    18
15                      lpSTS3  −59   −33    10
16                      rpSTS3   55   −34     7
17                      lFFA    −53   −51   −22
18                      rFFA     52   −52   −22
19      Parietal        lSPL1   −40   −37    60
20                      rSPL1    36   −37    60
21                      lSPL2   −33   −65    49
22                      rSPL2    36   −64    46
23                      lSPL3   −41   −35    44
24                      rSPL3    39   −36    44
25      Motor           lPC     −54   −12    15
26                      rPC      54   −11    13
27      Frontal         laMFG1  −40    18    27
28                      raMFG1   38    18    26
29                      laMFG2   38    41    19
30                      lOFC1   −24     7   −18
31                      rOFC1    22     8   −19
32                      lOFC2   −43    31   −16
33                      rOFC2    41    35   −15

The 33 sources judged to be most active across all trials independently of
perception/stabilization across all participants. Sources were localized
using MSP to solve the inverse problem. Source abbreviations: V1 =
striate cortex; OCC = occipital lobe; IT = inferior temporal cortex;
SPL = superior parietal lobule; PC = precentral cortex; MFG = middle
frontal gyrus. Navigational abbreviations: l = left hemisphere; r = right
hemisphere; p = posterior; a = anterior; d = dorsal; v = ventral.
basis using the binomial distributions of correct/incorrect
classifications. To show the reproducibility of the within-
subject significant effects across individuals, we used the
cumulative binomial distribution,

Pr(X ≤ x) = Σ_{i=0}^{x} (n choose i) p^i (1 − p)^{n−i}   (1)

where n is the total number of participants, the within-subject
significance criterion is p (=.05), x is the number of participants
that reach this criterion, and (n choose i) is the binomial
coefficient.
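The reproducibility question is usually answered with the complement of this cumulative distribution: the probability of observing at least x individually significant participants by chance alone. A SciPy sketch of that computation (ours, not the authors' code):

```python
from scipy.stats import binom

def group_level_p(n_participants, n_significant, criterion=0.05):
    """Probability that at least `n_significant` of `n_participants`
    reach the within-subject criterion by chance alone, i.e., the
    survival function of the binomial distribution with p = criterion.
    """
    return binom.sf(n_significant - 1, n_participants, criterion)
```

For example, with eight participants, six reaching individual significance at p = .05 would be extremely unlikely under the null.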
Prediction accuracy for each power envelope was
averaged across a 700-msec time window after stimulus
presentation (211 sampling points) for each participant.
Histogram inspection and Shapiro–Wilk tests showed
that the resulting accuracies were normally distributed.
One-sample t tests (n = 8) were used to compare the
prediction accuracy level of each power band to chance
(0.5). Bonferroni correction for 10 comparisons was used
as 10 power bands were analyzed.
RESULTS
EEG research points to the N170 and the component
sometimes called the P2 as prime candidates for the
correlates of conscious face perception (following con-
vention, we shall call these M170 and P2m hereafter)
but later sustained activity around 300–800 msec may
also be relevant. To search for predictive activity even
earlier than this, activity around the face-specific M100
was also examined. Before analyses, trials with unclear per-
ception were identified and excluded from subsequent
analyses.
Identification of Unclear Perception Based on
Behavioral Data
Analyses were optimized by contrasting only face/grating
trials on which perception was as clear as possible. Partici-
pants generally reported perception to be unclear in two
ways, both of which have been observed previously (see
Blake, 2001). First, participants reported piecemeal rivalry
where both images were mixed in different parts of the
visual field for the majority of the trial. Such trials were
not used in the MEG analyses. Second, participants some-
times experienced brief periods (<200 msec) of fused or
mixed perception at the onset of rivalry. Participants were
not instructed to report this initial unclear perception
if a stable image was perceived after a few hundred milli-
seconds, in order to keep the task simple. To minimize the
impact of this type of trial on the analyses, we
exploited the phenomenon of stabilization that occurs
during intermittent rivalry presentations, which will be
explained below.
>10 sensors from every site was enough to decode
perception significantly above chance (Figure 4A).
Nächste, the ability of the sensors in one area alone
to decode conscious perception at the M170 was exam-
ined (Figure 4B). As expected, low decoding accuracy
was found for most sites where previous analyses showed
no grand-averaged difference (central sensors: 56.7%,
parietal sensors: 60.5%, and frontal sensors: 57.9%)
while decoding accuracy was high for temporal sensors
(75.2%) where previous analyses had shown a large
grand-averaged difference. However, decoding accuracy
was numerically better when using occipital sensors
(78.0%). This finding was surprising as previous analyses
had indicated little or no grand-averaged difference over
occipital sensors.
Therefore, the predictability of single sensor data was
compared with the group-level decoding accuracy. In Fig-
ure 4D, individual sensor performance is plotted for occip-
ital and temporal sensors. The highest single sensor
decoding accuracy was achieved for temporal sensors
showing the greatest grand-averaged difference in the
ERF analysis. In the plots, it can be seen that, for occipital
sensors, the group level classification (black bar) is much
greater than that of the single best sensor, whereas this
is not the case for temporal sensors. In fact, a prediction
accuracy of 74.3% could be achieved using only 10 oc-
cipital sensors with individual chance-level performance
(maximum of 51.3%).
Just as multivariate classification predicted conscious
face perception at sensors that were at chance individu-
ally, it is possible that perception might be decoded
using multiple time points for which individual classifica-
tion accuracy was at chance. It may also be possible that
the information at the P2m was partially independent
from the information at the M170, causing joint classifica-
tion accuracy to increase beyond individual classification.
For these reasons, we examined classification accuracy
when the SVM classifiers were trained on data from multi-
ple time points. The formal analysis is reported in Appen-
dix: Decoding using multiple time points and shows that
including a wide range of time points around each peak
(11 time points, 37 msec of data) does not improve de-
coding accuracy. Neither does inclusion of information at
both time points in a single classifier, and finally, decod-
ing of conscious perception is not improved above
chance using multiple time points individually at chance.
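Training a classifier on several time points jointly amounts to concatenating the sensor patterns at those samples into one feature vector per trial. A minimal sketch of that feature construction (the paper's exact implementation is not shown):

```python
import numpy as np

def stack_time_points(epochs, time_indices):
    """Concatenate the sensor patterns at several time points into a
    single feature vector per trial, so one classifier can exploit
    information distributed across the ERF (e.g., 11 samples around a
    peak, or the M170 and P2m jointly)."""
    # (n_trials, n_sensors, n_selected) -> (n_trials, n_sensors * n_selected)
    return epochs[:, :, time_indices].reshape(epochs.shape[0], -1)
```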
Decoding in Source Space
Our finding that signals from single time points at the
sensors close to visual areas of the brain were the most
predictive does not necessarily mean, Jedoch, that the
activity at these sensors originates from visual areas. To
test this, analyses of sources are necessary. Therefore,
activity was reconstructed at the 33 sources that were
most clearly activated by the stimuli in general (i.e., inde-
pendently of conscious perception), and decoding was
performed on these data. The analysis was performed
on 2–10 Hz filtered data from stable trials using the six
participants who had 100 or more stable trials with
reported face/grating perception.
First, decoding accuracy was examined across time
when classifiers were trained/tested on data from all
sources (Figure 5A). Next, classifiers were trained on
groups of sources based on cortical location (see Table 1).
Comparisons between the accuracies achieved by each
group of sources may only be made cautiously as the
number of activated sources differs between areas, and
the classifiers were thus based on slightly different num-
bers of features. The occipital, the face-specific, the frontal,
and the parietal groups, however, included almost the
same number of sources (8, 8, 7, and 6, respectively). Over-
all, Figure 5 (A, B) shows that for all sources, decoding
accuracy peaked around the M170 and/or the P2m and
that conscious perception could be predicted almost as
accurately from eight occipital or face-specific sources as
from all 33 sources combined. This was not found for any
other area.
Decoding accuracy was also calculated for the individual
sources at the M170 (Figure 5C) and the P2m (Figure 5D)
using the individual peaks of each participant (see Fig-
ure 3). The single most predictive source with an accu-
racy of 64% at the M170 and 59% at the P2m was the
right OFA—a face-sensitive area in the occipital lobe. The
majority of the remaining predictive sources were found in
occipital and face-specific areas with the exception of a
ventral medial prefrontal area and possibly an area in the
superior parietal lobe around the P2m. The peak classi-
fication accuracies for groups of sources (black bars in
Figure 5C, D) were also the highest for occipital and
face-specific sources, yet when combined the sources in
other areas also became predictive above chance. Overall,
it appeared that the most predictive sources were in the
visual cortex, although information in other areas also
predicted conscious perception. Generally, little or no
difference was observed regarding which sources were
predictive at the M170 and at the P2m.
DISCUSSION
Two unresolved major questions were presented in the
Einführung. The first was the question of which temporal
aspects of the MEG signal are predictive of conscious face
perception.
M170 and P2m Predict Conscious Face Perception
Multivariate classification on binocular rivalry data demon-
strated that activity around the face-specific M170 and P2m
components differed on a single trial basis, depending on
whether a face was perceived consciously or not. Percep-
tion was predicted significantly better than chance from
temporal sensors showing large average activity differ-
ences, and around these sensors group-level decoding
accuracy was dependent on the single best sensor used.
Additionally, perception could be decoded as well or
better when using occipital sensors that showed little or
no mean activity differences between conscious percep-
tion of a face or not. At these locations, perception was
predicted as accurately when using sensors that were in-
dividually at chance as when using all temporal sensors,
thus showing a difference that was not revealed by uni-
variate analyses. No predictive components were found
after 300 msec, thus arguing against activity at these times
predicting conscious experience.
Interestingly, the event-related signal related to con-
scious face perception found in the masking study using
identical durations for “seen” and “unseen” trials (Babiloni
et al., 2010) appeared more similar to that found in the
present experiment than to those found in other EEG
masking experiments. This indicates that when physical
stimulation is controlled for, very similar correlates of
conscious face perception are found across paradigms.
In neither experiment were differences found between
late components (in fact, no clear late components are
found).
MEG/EEG Sensor and Source Correlates of
Visual Consciousness
Our findings appear to generalize not only to conscious
face perception across paradigms but also to visual aware-
ness more generally. Zum Beispiel, Koivisto and Revonsuo
(2010) reviewed around 40 EEG studies using different
experimental paradigms and found that visual awareness
correlated with posterior amplitude shifts around 130–
320 msec, also known as visual awareness negativity,
whereas later components did not correlate directly with
awareness. Furthermore, they argued that the earliest and
most consistent ERP correlate of visual awareness is an
amplitude shift around 200 msec, corresponding well with
the findings of this study.
Nevertheless, other studies have argued that compo-
nents in the later part of the visual awareness negativity
around 270 msec (corresponding to the P2m of this study)
correlate more consistently with awareness and that the
fronto-parietal network is involved at this stage and later
(Del Cul, Baillet, & Dehaene, 2007; Sergent, Baillet, &
Dehaene, 2005). In this study, the same frontal and pa-
rietal sources were identified, but little or no difference
was found in the source estimates at the M170 and the
P2m; in fact, the frontoparietal sources were already
identified at the M170. At both the M170 and the P2m,
however, occipital and later face-specific source activity was
more predictive than frontal and parietal activity, and early
Aktivität (around the M170) was much more predictive than
late activity (>300 msec). One reason for the difference in
Erkenntnisse, Jedoch, could be that these studies, Del Cul
et al. and Sergent et al., examined having any experience
versus having none (d.h., seeing vs. not seeing), wohingegen
our study examined one conscious content versus another
(but participants perceived something consciously on all
Versuche).
Gesamt, this study appears to support the conclusion
that the most consistent correlate of the contents of visual
awareness is activity in sensory areas at around 150–
200 msec after stimulus onset. Prediction of conscious
perception was no more accurate when taking information
across multiple time points (and peaks) into account than
when training/testing the classifier on the single best time
Punkt.
Between-subject Classification
The second question of our study was whether the con-
scious experience of an individual could be decoded
using a classifier trained on a different individual. Es ist
important to note that between-subject classifications of
this kind do not reveal neural correlates of consciousness
that generally distinguish a conscious from an unconscious
state or whether a particular, single content is consciously
perceived or not, but they do allow us to make compari-
sons between the neural correlates of particular types of
conscious contents (Hier, faces) across individuals.
The data showed that neural signals associated with
specific contents of consciousness shared sufficient com-
mon features across participants to enable generalization
of performance of the classifier. Mit anderen Worten, we provide
empirical evidence that the neural activity distinguishing
particular conscious content shares important temporal
and spatial features across individuals, which implies that
the crucial differences in processing are located at similar
stages of visual processing across individuals. Trotzdem,
generalization between individuals was not perfect, indi-
cating that there are important interindividual differences.
Inspecting Figure 3, zum Beispiel, it can be seen that the
predictive time points around the M170 varied with up to
40 msec between participants (from ∼170 msec for S3 to
∼210 msec for S2). At present, it is difficult to conclude
whether these differences in the neural correlates indicate
that the same perceptual content can be realized dif-
ferently in different individuals or whether they indicate
subtle differences in the perceptual experiences of the
Teilnehmer.
Methodological Decisions
The results of the present experiment were obtained by
analyzing the MEG signal during binocular rivalry. MEG
signals during binocular rivalry reflect ongoing patterns of
distributed synchronous brain activity that correlate with
spontaneous changes in perceptual dominance during
rivalry (Cosmelli et al., 2004). To detect these signals
associated with perceptual dominance, the vast majority
of previous studies have “tagged” monocular images by
flickering them at a particular frequency that can subse-
quently be detected in the MEG signals (z.B., Kamphuisen,
Bauer, & Van Ee, 2008; Srinivasan, Russell, Edelman, &
Tononi, 1999; Braun & Norcia, 1997; Lansing, 1964).
This method, Jedoch, impacts on rivalry mechanisms
(Sandberg, Bahrami, Lindelov, Overgaard, & Rees, 2011)
and causes a sustained frequency-specific response, daher
removing the temporal information in the ERF com-
ponents associated with normal stimulus processing. Das
not only biases the findings but also makes comparison
between rivalry and other paradigms difficult. To avoid
Das, yet maintain a high signal-to-noise ratio (SNR), Wir
exploited the stabilization of rivalrous perception asso-
ciated with intermittent presentation (Noest et al., 2007;
Leopold et al., 2002; Orbach, ehrlich, & Heath, 1963) Zu
evoke signals associated with a specific (stable) percept
and time locked to stimulus onset. Such signals proved
sufficient to decode spontaneous fluctuations in percep-
tual dominance in near real-time and in advance of behav-
ioral reports. We suggest that this general presentation
method may be used in future ambiguous perception
980
Journal of Cognitive Neuroscience
Volume 25, Number 6
experiments when examining stimulus-related differences
in neural processing.
Potential Confounds
There were two potential confounds in our classification
Analyse: eye movements and motor responses. Diese
Sind, Jedoch, unlikely to have impacted on the results
as source analysis revealed that at the time of maximum
classification, sources related to visual processing were
most important for explaining the differences related to
face and grating perception. Zusätzlich, the fact that
the motor response used to signal a perceptual state
was swapped between hands and fingers every 20 Versuche
makes it unlikely that motor responses were assigned high
weights by the classification algorithm. Trotzdem, unser
findings of prediction accuracy slightly greater than chance
for power in high-frequency bands may conceivably have
been confounded by some types of eye movements.
Although we may conclude that specific evoked activity
(localized and distributed) is related to conscious experi-
enz, this should not be taken as an indication that
induced oscillatory components are not important for
conscious processing. Local field potentials, zum Beispiel,
in a variety of frequency bands are modulated in monkeys
by perception during binocular rivalry (Wilke, Logothetis,
& Leopold, 2006).
Apart from potential confounds in the classification
Analysen, it could be argued that the use of rotating stimuli
alters the stimulus-specific components. The purpose of
rotating the stimuli in opposite directions was to mini-
mize the amount of mixed perception throughout the trial
(Haynes & Rees, 2005). Whether this manipulation affects
the mechanisms of the rivalry process, for example, in terms
of the stabilization of perception, remains a topic for
further inquiry. Inspecting the ERF in Figure 2, it is
nevertheless clear that we observed the same
face-specific components as are typically found in stud-
ies of face perception as reported above. Our M170 was
observed slightly later than typically found (peaking at
187 msec). This has previously been observed for partially
occluded stimuli (Harris & Aguirre, 2008), and the delay
in this study might thus be because of binocular rivalry
in general or rotation of the stimuli. The impact of rotating
the stimuli upon face-specific components thus appears
minimal.
Abschluss
In this study, participants viewed binocular rivalry between
a face and a grating stimulus, and prediction of conscious
face perception was attempted based on the MEG signal.
Perception was decoded accurately in the 120–300 msec
time window, peaking around the M170 and again around
the P2m. Im Gegensatz, little or no above-chance accuracy
was found around the earlier M100 component. The findings thus argue against earlier and later components correlating with conscious face perception.
Zusätzlich, conscious perception could be decoded
from sensors that were individually at chance performance
for decoding, whereas this was not the case when decod-
ing using multiple time points. The most informative sen-
sors were located above the occipital and temporal lobes,
and a follow-up analysis of activity reconstructed at the
source level revealed that the most predictive single
sources were indeed found in these areas both at the
M170 and the P2m. Trotzdem, conscious perception
could be decoded accurately from parietal and frontal
sources alone, although not as accurately as from occipital
and later ventral stream sources. These results show that
conscious perception can be decoded across a wide range
of sources, but the most consistent correlates are found
both at early and late stages of the visual system.
The impact of increasing the number of temporal fea-
tures of the classifier was also examined. Im Kontrast zu
including more spatial features, more temporal features
had little or no impact on classification accuracy. Weiter-
mehr, the predictive strength of power estimation was
examined across a wide range of frequency bands. Gener-
ally, the low frequencies contained in the evoked response
were the most predictive and the peak time points of clas-
sification accuracy coincided with the latencies of the
M170 and the P2m. This indicates that the main MEG
correlates of conscious face perception are the two face-sensitive components, the M170 and the P2m.
Endlich, the results showed that conscious perception
of each participant could be decoded above chance
using classifiers trained on the data of each of the other
Teilnehmer. This indicates that the correlates of con-
scious perception (in diesem Fall, faces) are shared to
some extent between individuals. It should be noted,
obwohl, that generalization was far from perfect, indi-
cating that there are significant differences as well for
further exploration.
APPENDIX
Improving Decoding Accuracy
We hypothesized that decoding accuracy could be in-
creased in two ways: by rejecting trials for which per-
ception was not completely clear and by applying a
more stringent filter to the data. Participants' reports
(see Results) suggested that the probability of clear
perception on a given trial increased the further the trial
was from a perceptual switch. Classifiers were thus trained
and tested on unstable perception (Trials 1–9 after a
switch) and stable perception (Trial 10 or more after a
switch) separately, and decoding accuracies were compared.
Five participants reported the 100 trials of each kind
(stable/unstable faces/gratings) required for training the
classifier, and the analysis was thus based on these
participants. Figure A1a shows that analyzing stable trials as compared
with unstable trials results in a large improvement in
classification accuracy of around 10–15% around the
M170 (∼187 msec), 5–8% around the P2m (∼260 msec),
and similarly 5–8% around the M100 (∼93 msec).
Significant improvements in classification accuracy were
found for at least three of five participants for all
components (cumulative p = .0012, uncorrected).

Figure A1. Improvements to prediction accuracy by filtering and
trial selection. The figure plots the impact of using stable trials
only as well as filtering the data. Dotted gray line represents the
95% binomial confidence interval around chance (uncorrected).
(A) Prediction accuracy for stable and unstable trials, respectively.
The comparison is based on the five participants who reported
enough trials of all conditions (stable/unstable faces/gratings) to
train the classifiers. (B, C) Within-subject (B) and between-subject
(C) prediction accuracy for data that has not been low-pass filtered
compared with data low-pass filtered at 20 and 10 Hz, respectively.
This analysis was based on stable trials, and the data reported are
from the analysis of the six participants reporting enough stable
face and grating trials to train the classifier.

Some components analyzed (M100, M170, and P2m)
had a temporal spread of around 50–130 msec (see
Figure A1a–c), yet the classifiers were trained on single
time points only in the analyses above. This makes
classification accuracy potentially vulnerable to minor
fluctuations at single time points. Such fluctuations could
reflect small differences in latency between trials as well
as artifacts and high-frequency processes that the classifier
cannot exploit, and analyses based on field strength data
may thus be improved if the impact of these high-frequency
components and trial-by-trial variation is minimized.
There are two methods to do this: classification may use
several neighboring time points, or a low-pass filter with
a low cutoff may be applied before analysis to temporally
smooth the data.

Given the temporal extent of the three analyzed
components (50–130 msec), they can be seen as half
cycles of waves with frequencies of 4–10 Hz (i.e., half
cycles of around 100–250 msec). For this reason, we
compared classification accuracies for nonfiltered data,
1–20 Hz filtered data, and 2–10 Hz filtered data. We used
only stable trials. Six participants had 100 stable trials or
more of each kind (face/grating) and were thus included
in the analysis.

Figure A1b shows the differences between the three
filter conditions for within-subject decoding. Improvement
in decoding accuracy was found when comparing no filter
with the filtered data. Comparing unfiltered and 1–20 Hz
filtered data at the M170 and P2m, differences of 5–10%
were found around both peaks, and around the M100 a
difference of around 5% was found. Decoding accuracy
was significantly higher for five of six participants at
187 msec (cumulative probability of p = 1.9 × 10−6,
uncorrected) and for four of six participants at 260 msec
(cumulative probability of p = 8.7 × 10−5, uncorrected),
but only for two of six participants at 90 msec (cumulative
probability of p = .03, uncorrected). The largest
improvement from applying a 20-Hz low-pass filter was
thus seen for the two most predictive components, the
M170 and the P2m. The only impact of applying a 2–10 Hz
filter instead of a 1–20 Hz filter was significantly increased
accuracy for two participants at 187 msec, but decreased
accuracy for one.

As between-subject ERF variation is much larger than
within-subject variation (Sarnthein, Andersson, Zimmermann,
& Zumsteg, 2009), we might expect the most stringent
filter mainly to improve between-subject decoding accuracy.
Figure A1c shows a 2–3% improvement from using a
2–10 Hz compared with a 1–20 Hz filter at the M170 and
the P2m and a <1% improvement at the M100. This
improvement was significant for two participants at 180
and 260 msec (cumulative p = .03, uncorrected, for both)
and for one participant around the M100 at 117 msec
(cumulative p = .27, uncorrected).

Overall, the best decoding accuracies were achieved
using stable trials and filtered data. Numerically better
and slightly more significant results were achieved using
2–10 Hz filtered data compared with 1–20 Hz filtered
data. Importantly, using this more stringent filter did
not alter the time points for which conscious perception
could be decoded—it only improved accuracy around
the peaks.
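The smoothing step itself can be sketched with standard tools. Assuming the 300 Hz sampling rate used in these analyses and SciPy's Butterworth filter (the signal below is synthetic), a zero-phase low-pass analogous to the 20 Hz condition might look like this:

```python
# Sketch: zero-phase low-pass filtering of single-trial data before
# per-time-point classification. Cutoffs of 20 and 10 Hz follow the text;
# the trial itself is a synthetic 4 Hz "evoked" component plus noise.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 300.0                                    # sampling frequency (Hz)

def lowpass(data, cutoff_hz, fs, order=4):
    """Zero-phase low-pass filter along the last (time) axis."""
    b, a = butter(order, cutoff_hz / (fs / 2.0), btype="low")
    return filtfilt(b, a, data, axis=-1)      # forward-backward: no phase shift

t = np.arange(0, 0.5, 1 / fs)
trial = (np.sin(2 * np.pi * 4 * t)
         + 0.5 * np.random.default_rng(2).normal(size=t.size))
smoothed = lowpass(trial, 20.0, fs)           # slow component survives,
                                              # broadband noise is attenuated
```

The forward-backward (`filtfilt`) pass matters here: a causal filter would shift component latencies, whereas zero-phase smoothing leaves the M170/P2m peak times intact, consistent with the observation that filtering improved accuracy without moving the predictive time points.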
Decoding Using Power Estimations
Power in several frequency bands (for all sensors) was
also used to train SVM classifiers. This analysis revealed
that theta band power was the most predictive of
perception, followed by alpha power (Figure A2). Again,
the data were most informative at around 120–320 msec
after stimulus onset. Power estimates in the higher-frequency
bands related to both face and grating perception (40–60 Hz),
and possibly also some related to face perception alone
(60–80 Hz), could be used
to predict perception significantly better than chance
(Duncan et al., 2010; Engell & McCarthy, 2010). In these
bands, the prediction accuracy did not have any clear
peaks (Figure A2).

Figure A2. Prediction accuracy across time for various frequencies
(stable trials). Six participants had enough trials to train the
classifiers on stable trials alone. The figure plots the data from
these participants. The dotted gray line indicates the threshold at
which a binomial distribution over the total number of predicted
trials differs from chance (uncorrected). Average prediction accuracy
is plotted across participants based on estimates of power in
different frequency bands as a function of time. SVMs were trained
to predict reported perception (face vs. grating) for each time point.
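The uncorrected binomial chance threshold referred to in the figure captions can be computed directly. This is a generic sketch; the trial count is an arbitrary example, not a figure from the study.

```python
# Smallest accuracy that is significantly above chance under a
# one-sided binomial test, as used for the dotted threshold lines.
from scipy.stats import binom

def chance_threshold(n_trials, p_chance=0.5, alpha=0.05):
    """Lowest accuracy whose one-sided binomial p-value is below alpha."""
    # ppf returns the smallest count k with CDF >= 1 - alpha;
    # k + 1 correct predictions are then significantly more than chance.
    k = binom.ppf(1 - alpha, n_trials, p_chance)
    return (k + 1) / n_trials

thr = chance_threshold(200)   # with 200 test trials, roughly 56% accuracy
```

Note how the threshold shrinks toward 50% as the number of test trials grows, which is why participants with more reported trials can reach significance at lower raw accuracies.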
Using Bonferroni correction, average prediction accura-
cies across participants across the stimulation period were
above chance in the theta (t(7) = 4.4, p = .033), gamma 2
(40–49 Hz) (t(7) = 4.9, p = .017), and gamma 3 (51–60 Hz)
(t(7) = 4.2, p = .038) bands. Without Bonferroni correc-
tion, alpha (t(7) = 3.2, p = .0151), low beta (t(7) = 3.7,
p = .0072), high beta (t(7) = 3.1, p = .0163), gamma 4
(61–70 Hz) (t(7) = 3.3, p = .0123), and gamma 5 (71–
80 Hz) (t(7) = 2.4, p = .0466) were also above chance.
The classification performance based on the moving win-
dow spectral estimate was always lower than that based on
the field strength. Also, spectral classification was optimal
for temporal frequencies dominating the average evoked
response (inspecting Figure 2B, C, it can be seen, for in-
stance, that for faces, the M170 is half a cycle of a 3–4 Hz
oscillation). Taken together, this suggests that the pre-
dictive information was largely contained in the evoked
(i.e., with consistent phase over trials) portion of the
single trial data.
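A moving-window spectral estimate of this kind can be sketched with a short-time Fourier transform. The band edges, window length, and synthetic data below are illustrative assumptions, not the study's exact parameters.

```python
# Sketch: per-trial band power over time via a short-time Fourier
# transform; the resulting (trials x windows) matrix could feed the
# same per-time-point SVM as the field-strength analysis.
import numpy as np
from scipy.signal import stft

fs = 300.0
rng = np.random.default_rng(3)
n_trials, n_samples = 20, 600
trials = rng.normal(size=(n_trials, n_samples))   # synthetic single trials

def band_power(trials, fs, f_lo, f_hi, nperseg=128):
    """Mean power in [f_lo, f_hi] Hz for each trial and STFT window."""
    freqs, times, Z = stft(trials, fs=fs, nperseg=nperseg, axis=-1)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    power = (np.abs(Z[:, band, :]) ** 2).mean(axis=1)
    return times, power

times, theta = band_power(trials, fs, 4.0, 8.0)   # theta-band power
```

Because the window must span at least one cycle of the lowest band, temporal resolution is inherently coarser than for the raw field strength, one reason spectral classification may lag behind it.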
Figure A3. Prediction based on multiple time points (stable trials).
Six participants had enough trials to train the classifiers on stable
trials alone. The figure plots the data from these participants. Classifiers
were trained/tested on 1 Hz high-pass filtered data from 16 randomly
distributed sensors. (A–C) Prediction accuracy as a function of the
number of neighboring time samples used to train the classifier around
the M170 peak (A), the P2m peak (B), and 50 msec after stimulus onset
(C). No improvement was found at the peaks nor at 50 msec when
classifier baseline accuracy was close to chance. (D) Prediction accuracy
when classifiers were trained on data around both peaks combined
versus each peak individually.
Decoding Using Multiple Time Points
The potential benefit of including multiple time points
when training classifiers was examined. As multiple time
points increase the number of features drastically, the
SVM was trained on a subset of sensors only. For these
analyses, 16 randomly selected sensors giving a perfor-
mance of 72.6% when trained on a single time point were
used (see Figure 4A). As the temporal smoothing of a
low-pass filter would theoretically remove any potential
benefit of using multiple time points for time intervals
shorter than one cycle of activity, these analyses were
performed on 1 Hz high-pass filtered data. Here, the
300 Hz sampling frequency thus sets the upper frequency
limit.
We tested the impact of training on up to 11 time
points (37 msec) around each peak (M170 and P2m)
and around a time point for which overall classification
accuracy was at chance (50 msec). At 50 msec, the signal
should have reached visual cortex, but a 37-msec time
window did not include time points with individual
above-chance decoding accuracy. We also tested the
combined information around the peaks. As seen in Fig-
ure A3, the inclusion of more time points did not in-
crease accuracy, and the use of both peaks did not
increase accuracy beyond that obtained at the M170
alone. This may indicate that the contents of conscious-
ness (in this case, rivalry between face and grating per-
ception) are determined already around 180 msec.
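Feature-wise, adding neighboring time samples amounts to flattening a small time window into the classifier's input vector, as in this minimal sketch (array sizes are illustrative, not the study's):

```python
# Stack neighboring time samples as extra classifier features, as in
# the multi-time-point analysis.
import numpy as np

def stack_timepoints(epochs, center, n_neighbors):
    """Concatenate sensors x (2 * n_neighbors + 1) samples around `center`.

    epochs: array of shape (n_trials, n_sensors, n_times).
    Returns a 2-D (n_trials, n_features) matrix for classifier training.
    """
    lo, hi = center - n_neighbors, center + n_neighbors + 1
    window = epochs[:, :, lo:hi]                  # trials x sensors x window
    return window.reshape(epochs.shape[0], -1)    # flatten to feature vectors

epochs = np.zeros((40, 16, 90))                   # e.g., 16 sensors, 90 samples
X = stack_timepoints(epochs, center=54, n_neighbors=5)   # 11-sample window
# X now has 16 * 11 = 176 features per trial.
```

The feature count grows linearly with the window width, which is why the analysis restricted itself to a 16-sensor subset before adding temporal neighbors.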
Acknowledgments
This work was supported by the Wellcome Trust (G. R. and G. R. B.),
the Japan Society for the Promotion of Science (R. K.), the
European Commission under the Sixth Framework Programme
(B. B., K. S., M. O.), the Danish National Research Foundation
and the Danish Research Council for Culture and Communication
(B. B.), and the European Research Council (K. S. and M. O.).
Support from the MINDLab UNIK initiative at Aarhus University
was funded by the Danish Ministry of Science, Technology, and
Innovation.
Reprint requests should be sent to Dr. Kristian Sandberg, Cog-
nitive Neuroscience Research Unit, Aarhus University Hospital,
Noerrebrogade 44, Building 10G, 8000 Aarhus C, Denmark, or
via e-mail: krissand@rm.dk.
REFERENCES
Babiloni, C., Vecchio, F., Buffo, P., Buttiglione, M., Cibelli, G.,
& Rossini, P. M. (2010). Cortical responses to consciousness
of schematic emotional facial expressions: A high-resolution
EEG study. Human Brain Mapping, 31, 1556–1569.
Blake, R. (2001). A primer on binocular rivalry, including
current controversies. Brain and Mind, 2, 5–38.
Brascamp, J. W., Knapen, T. H. J., Kanai, R., Noest, A. J., Van Ee, R.,
Van den Berg, A. V., et al. (2008). Multi-timescale perceptual
history resolves visual ambiguity. PLoS One, 3, e1497.
Breese, B. B. (1899). On inhibition. Psychological Monographs,
3, 1–65.
Brown, R. J., & Norcia, A. M. (1997). A method for investigating
binocular rivalry in real-time with the steady-state VEP.
Vision Research, 37, 2401–2408.
Carlson, T. A., Hogendoorn, H., Kanai, R., Mesik, J., &
Turret, J. (2011). High temporal resolution decoding of
object position and category. Journal of Vision, 11,
9.1–9.17.
Carter, O., & Cavanagh, P. (2007). Onset rivalry: Brief
presentation isolates an early independent phase of
perceptual competition. PloS One, 2, e343.
Cosmelli, D., David, O., Lachaux, J.-P., Martinerie, J.,
Garnero, L., Renault, B., et al. (2004). Waves of
consciousness: Ongoing cortical patterns during
binocular rivalry. Neuroimage, 23, 128–140.
Del Cul, A., Baillet, S., & Dehaene, S. (2007). Brain
dynamics underlying the nonlinear threshold for
access to consciousness. PLoS Biology, 5, e260.
Duncan, K. K., Hadjipapas, A., Li, S., Kourtzi, Z., Bagshaw, A.,
& Barnes, G. (2010). Identifying spatially overlapping local
cortical networks with MEG. Human Brain Mapping, 31,
1003–1016.
Engell, A. D., & McCarthy, G. (2010). Selective attention
modulates face-specific induced gamma oscillations
recorded from ventral occipitotemporal cortex. The
Journal of Neuroscience: The Official Journal of the
Society for Neuroscience, 30, 8780–8786.
Freeman, A. W. (2005). Multistage model for binocular rivalry.
Journal of Neurophysiology, 94, 4412–4420.
Friston, K. J., Harrison, L., Daunizeau, J., Kiebel, S., Phillips, C.,
Trujillo-Barreto, N., et al. (2008). Multiple sparse priors
for the M/EEG inverse problem. Neuroimage, 39,
1104–1120.
Harris, A. M., & Aguirre, G. K. (2008). The effects of parts,
wholes, and familiarity on face-selective responses in MEG.
Journal of Vision, 8, 4.1–4.12.
Harris, J. A., Wu, C.-T., & Woldorff, M. G. (2011). Sandwich
masking eliminates both visual awareness of faces
and face-specific brain activity through a feedforward
mechanism. Journal of Vision, 11, 3.1–3.12.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000).
The distributed human neural system for face perception.
Trends in Cognitive Sciences, 4, 223–233.
Haynes, J.-D., Deichmann, R., & Rees, G. (2005). Eye-specific
effects of binocular rivalry in the human lateral geniculate
nucleus. Nature, 438, 496–499.
Haynes, J.-D., & Rees, G. (2005). Predicting the stream
of consciousness from activity in human visual cortex.
Current Biology: CB, 15, 1301–1307.
Kamphuisen, A., Bauer, M., & Van Ee, R. (2008). No evidence
for widespread synchronized networks in binocular rivalry:
MEG frequency tagging entrains primarily early visual
cortex. Journal of Vision, 8, 4.1–4.8.
Koivisto, M., & Revonsuo, A. (2010). Event-related brain
potential correlates of visual awareness. Neuroscience
& Biobehavioral Reviews, 34, 922–934.
Kornmeier, J., & Bach, M. (2004). Early neural activity in
Necker-cube reversal: Evidence for low-level processing
of a Gestalt phenomenon. Psychophysiology, 41, 1–8.
Lansing, R. W. (1964). Electroencephalographic correlates
of binocular rivalry in man. Science (New York, N.Y.),
146, 1325–1327.
Leopold, D. A., Wilke, M., Maier, A., & Logothetis, N. K.
(2002). Stable perception of visually ambiguous patterns.
Nature Neuroscience, 5, 605–609.
Liddell, B. J., Williams, L. M., Rathjen, J., Shevrin, H., & Gordon, E.
(2004). A temporal dissociation of subliminal versus supraliminal
fear perception: An event-related potential study. Journal
of Cognitive Neuroscience, 16, 479–486.
Lumer, E. D., Friston, K. J., & Rees, G. (1998). Neural
correlates of perceptual rivalry in the human brain.
Science (New York, N.Y.), 280, 1930–1934.
Noest, A. J., Van Ee, R., Nijs, M. M., & Van Wezel, R. J. (2007).
Percept-choice sequences driven by interrupted ambiguous stimuli:
A low-level neural model. Journal of Vision, 7, 1–14.
Orbach, J., Ehrlich, D., & Heath, H. (1963). Reversibility of
the Necker cube: I. An examination of the concept of
“satiation of orientation”. Perceptual and Motor Skills,
17, 439–458.
Pegna, A. J., Darque, A., Berrut, C., & Khateb, A. (2011).
Early ERP modulation for task-irrelevant subliminal faces.
Frontiers in Psychology, 2, 88.1–88.10.
Pegna, A. J., Landis, T., & Khateb, A. (2008).
Electrophysiological evidence for early non-conscious
processing of fearful facial expressions. International
Journal of Psychophysiology, 70, 127–136.
Raizada, R. D. S., & Connolly, A. C. (2012). What makes different
peopleʼs representations alike: Neural similarity space solves
the problem of across-subject fMRI decoding. Journal of
Cognitive Neuroscience, 24, 868–877.
Sandberg, K., Bahrami, B., Lindelov, J. K., Overgaard, M., &
Rees, G. (2011). The impact of stimulus complexity and
frequency swapping on stabilization of binocular rivalry.
Journal of Vision, 11, 1–10.
Sandberg, K., Barnes, G., Bahrami, B., Kanai, R., Overgaard, M.,
& Rees, G. (submitted). Distinct MEG correlates of conscious
experience, perceptual reversals, and stabilization during
binocular rivalry.
Sarnthein, J., Andersson, M., Zimmermann, M. B., & Zumsteg, D.
(2009). High test–retest reliability of checkerboard reversal
visual evoked potentials (VEP) over 8 months. Clinical
Neurophysiology, 120, 1835–1840.
Sergent, C., Baillet, S., & Dehaene, S. (2005). Timing of the
brain events underlying access to consciousness during the
attentional blink. Nature Neuroscience, 8, 1391–1400.
Srinivasan, R., Russell, D. P., Edelman, G. M., & Tononi, G.
(1999). Increased synchronization of neuromagnetic
responses during conscious perception. The Journal of
Neuroscience: The Official Journal of the Society for
Neuroscience, 19, 5435–5448.
Wilke, M., Logothetis, N. K., & Leopold, D. A. (2006). Local field
potential reflects perceptual suppression in monkey visual
cortex. Proceedings of the National Academy of Sciences,
U.S.A., 103, 17507–17512.
Wilson, H. R. (2003). Computational evidence for a rivalry
hierarchy in vision. Proceedings of the National Academy of
Sciences, U.S.A., 100, 14499–14503.
Wilson, H. R. (2007). Minimal physiological conditions for
binocular rivalry and rivalry memory. Vision Research, 47,
2741–2750.