RESEARCH ARTICLE
Dynamics of Functional Networks for
Syllable and Word-Level Processing
Johanna M. Rimmele1,4, Yue Sun1, Georgios Michalareas1,
Oded Ghitza1,2, and David Poeppel1,3,4,5
an open access journal
Citation: Rimmele, J. M., Sun, Y.,
Michalareas, G., Ghitza, O., & Poeppel,
D. (2023). Dynamics of functional
networks for syllable and word-level
processing. Neurobiology of Language,
4(1), 120–144. https://doi.org/10.1162
/nol_a_00089
DOI:
https://doi.org/10.1162/nol_a_00089
Supporting Information:
https://doi.org/10.1162/nol_a_00089
Received: 18 April 2021
Accepted: 7 November 2022
Competing Interests: The authors have
declared that no competing interests
exist.
Korrespondierender Autor:
Johanna M. Rimmele
johanna.rimmele@ae.mpg.de
Handling Editor:
Jonathan Peelle
Copyright: © 2023
Massachusetts Institute of Technology
Published under a Creative Commons
Attribution 4.0 International
(CC BY 4.0) license
The MIT Press
1Departments of Neuroscience and Cognitive Neuropsychology, Max-Planck-Institute for Empirical Aesthetics,
Frankfurt am Main, Germany
2College of Biomedical Engineering & Hearing Research Center, Boston University, Boston, MA, USA
3Department of Psychology and Center for Neural Science, New York University, New York, New York, USA
4Max Planck NYU Center for Language, Music and Emotion, Frankfurt am Main, Germany; New York, New York, USA
5Ernst Strüngmann Institute for Neuroscience, Frankfurt am Main, Germany
Keywords: speech, word, syllable transitions, frequency tagging, MEG
ABSTRACT
Speech comprehension requires the ability to temporally segment the acoustic input for
higher-level linguistic analysis. Oscillation-based approaches suggest that low-frequency
auditory cortex oscillations track syllable-sized acoustic information and therefore emphasize
the relevance of syllabic-level acoustic processing for speech segmentation. How syllabic
processing interacts with higher levels of speech processing, beyond segmentation, including
the anatomical and neurophysiological characteristics of the networks involved, is debated.
In two MEG experiments, we investigate lexical and sublexical word-level processing and
the interactions with (acoustic) syllable processing using a frequency-tagging paradigm.
Participants listened to disyllabic words presented at a rate of 4 syllables/s. Lexical content
(native language), sublexical syllable-to-syllable transitions (foreign language), or mere syllabic
information (pseudo-words) was presented. Two conjectures were evaluated: (i) syllable-to-
syllable transitions contribute to word-level processing; and (ii) processing of words activates
brain areas that interact with acoustic syllable processing. We show that syllable-to-syllable
transition information, compared to mere syllable information, activated a bilateral superior,
middle temporal and inferior frontal network. Lexical content additionally resulted in
increased neural activity. Evidence for an interaction of word- and acoustic syllable-level
processing was inconclusive. Decreases in syllable tracking (cerebroacoustic coherence) in
auditory cortex and increases in cross-frequency coupling between right superior and middle
temporal and frontal areas were found when lexical content was present compared to all other
conditions; however, not when conditions were compared separately. The data provide
experimental insight into how subtle and sensitive syllable-to-syllable transition information
for word-level processing is.
INTRODUCTION
Oscillation-based approaches to speech comprehension posit that temporally segmenting the
continuous input signal is realized through phase-alignment of low-frequency (<8 Hz; delta–
theta) neuronal oscillations in auditory cortex to the slow fluctuations of the speech signal at
the syllabic scale (Ahissar & Ahissar, 2005; Ghitza, 2011; Ghitza & Greenberg, 2009; Gross
et al., 2013; Haegens & Zion Golumbic, 2018; Lakatos et al., 2019; Meyer, 2020;
[Table 1: average within-word and between-word transitional probabilities for each phonological measurement; the numeric cells were garbled in extraction and are omitted here.]
Note. For each measurement, average transitional probabilities between consecutive syllables within word boundary (within word/pseudo-word) and across
word boundary (between word/pseudo-word), the average difference between those measures, and the p value (Mann–Whitney–Wilcoxon tests) are displayed
for each experiment and the German and Turkish/Non-Turkish conditions. Note that transition probabilities and differences are displayed as averaged over the
three different stimulus sets, and p values are further differentiated in case different results were observed for the sets. P values are displayed corrected for
multiple comparisons using Bonferroni correction. CV: consonant–vowel.
RESULTS
Statistical Analysis
Syllable-to-syllable transition probability analysis
Syllable transition probabilities present in the stimulus sequences between and within words
(and pseudo-words) were computed for all conditions (German, Turkish, Non-Turkish) and
separately for the three stimulus sets (i.e., that were used for different participants; see Table 1).
Average syllable transition probabilities between consecutive syllables within word boundary
(within word) and across word boundary (between word) in German, Turkish, and Non-Turkish
sequences were computed for the following phonological measurements: syllable identity,
syllable CV pattern (syllable CV), syllable onset phoneme (onset), initial phoneme manner of
articulation, rime (corresponding to a sub-syllabic unit that groups the vowel nucleus and the
coda consonant(s) of a syllable), and phonemes across syllable boundary. Syllable transition
probabilities were computed following the classical definition of transitional probabilities (also
termed "conditional probabilities") between two elements (Saffran et al., 1996). Accordingly,
the transitional probability of Syllable 2 (Syl2) given Syllable 1 (Syl1) was computed as follows
(with frequencies computed based on occurrence in the CELEX corpus):

P(Syl2 | Syl1) = Frequency of the pair Syl1–Syl2 / Frequency of Syl1
Mann–Whitney–Wilcoxon tests were conducted separately for each stimulus set, condition,
and experiment in order to test whether the transitional probabilities between syllables are
significantly higher within word than between word. P values were corrected for multiple
comparisons using Bonferroni correction.
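The transitional-probability computation above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: frequencies are estimated here from a toy stimulus stream rather than from the CELEX corpus, and all syllable labels are invented.

```python
from collections import Counter

def transitional_probabilities(stream):
    """Estimate P(Syl2 | Syl1) = freq(Syl1-Syl2 pair) / freq(Syl1)
    from a flat sequence of syllable tokens (Saffran et al., 1996)."""
    pair_counts = Counter(zip(stream, stream[1:]))
    # Every syllable except the last one begins exactly one pair.
    first_counts = Counter(stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Toy stream of two disyllabic "words" (ba-bi, ku-ka) in varying order:
stream = ["ba", "bi", "ku", "ka", "ba", "bi", "ba", "bi", "ku", "ka"]
tp = transitional_probabilities(stream)
# Within-word transitions (ba->bi, ku->ka) are deterministic here (p = 1.0),
# while the between-word transition bi->ku is not (p = 2/3).
```

This asymmetry, higher within-word than between-word transitional probability, is exactly what the Mann–Whitney–Wilcoxon tests assess in the stimulus sequences.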
Neurobiology of Language
127
Networks for syllable and word-level processing
MRI data analysis
For MRI and MEG data analyses, we used the FieldTrip toolbox (https://fieldtrip.fcdonders.nl)
(Oostenveld et al., 2011).
From the individual MRIs of all participants, probabilistic tissue maps (including cerebrospinal
fluid, white, and gray matter) were retrieved. MRI scans were conducted for all participants,
except for some who either did not match the MRI criteria or did not show up to the MRI scan
session (Exp. 1: n = 5; Exp. 2: n = 3). Whenever an individual MRI was missing, the standard
Montreal Neurological Institute (MNI) template brain was used. Next, the physical relation
between sensors and sources was obtained using a single-shell volume conduction model
(Nolte, 2003). The linear warp transformation was computed between the individual T1 MRI
and the MNI template T1. The inverse of that transformation was computed; that is, a template
8 mm grid defined on the MNI template T1 was inversely transformed so that it was warped
onto the individual head space, based on the individual MRI
and the location of the coils during the MEG recording. A leadfield (forward model) was cal-
culated based on the warped MNI grid and the probabilistic tissue map, and used for source
reconstruction. This allowed computing statistics across subjects in the MNI space with the
grids of all subjects being aligned to each other.
MEG data analysis
Preprocessing. For preprocessing, the data were band-pass filtered off-line (1–160 Hz,
Butterworth filter; filter order 4) and line noise was removed using bandstop filters (49.5–
50.5, 99.5–100.5, 149.5–150.5 Hz, two-pass; filter order 4). In a common semiautomatic
artifact detection procedure (i.e., the output of the automatic detection was monitored), the
signal was filtered in a frequency range that typically contains muscular artifacts (band-pass:
110–140 Hz) or jump artifacts (median filter) and z-normalized per time point and sensor. To
accumulate evidence for artifacts that typically occur in more than one sensor, the z-scores were
averaged over sensors. We excluded trials exceeding a predefined z-value (muscular artifacts,
z = 15; jumps, z = 30). Slow artifacts were removed by rejecting trials in which the range (min–
max difference) in any channel exceeded a threshold (threshold = 0.75e−5). The data were
down-sampled to 500 Hz and epoched (−2.1–9.6 s). Trials with head movements that
exceeded a threshold (5 mm) were rejected. Afterwards, the different blocks of recorded
MEG data were concatenated. (Note that for each block, during the recording, the head
position was adjusted to the initial position of the first block.) Sensors with high variance were
rejected.
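The z-score-based trial rejection can be illustrated with a simplified sketch. Assumptions: synthetic data, z-normalization per sensor over all trials and samples rather than the exact FieldTrip routine, and a single threshold.

```python
import numpy as np

def mark_artifact_trials(data, z_thresh):
    """data: (n_trials, n_sensors, n_samples). z-normalize each sensor,
    average z-scores over sensors to accumulate evidence for artifacts
    spanning several sensors, and flag trials whose sensor-averaged
    z-score exceeds z_thresh at any sample."""
    m = data.mean(axis=(0, 2), keepdims=True)
    s = data.std(axis=(0, 2), keepdims=True)
    z = (data - m) / s
    z_avg = z.mean(axis=1)                 # (n_trials, n_samples)
    return z_avg.max(axis=1) > z_thresh    # boolean rejection mask

rng = np.random.default_rng(0)
data = rng.standard_normal((20, 10, 100))    # 20 trials, 10 sensors
data[3, :, 50] += 80.0                       # simulated jump artifact in trial 3
bad = mark_artifact_trials(data, z_thresh=15.0)  # flags only trial 3
```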
Eye-blink, eye-movement, and heartbeat-related artifacts were removed using independent
component analysis (infomax algorithm; Makeig et al., 1996). Components were first reduced
to 64 components using principal component analysis. Components were rejected only in the
case of a conclusive conjunction of component topography, time course, and variance across
trials. For the sensor space analysis, spherical spline interpolation was used to interpolate
the missing sensors (Perrin et al., 1989).
Trials with correct responses were selected and the trial number was matched between the
conditions by randomly selecting trials of the condition with fewer trials (trial number, Exp. 1:
mean = 73.22, SD = 11.02; Exp. 2: mean = 68.68, SD = 10.27).
For display purposes and for the additional control analyses of statistical learning, the
individual "M100 sensors" were computed based on the auditory cortex sound localizer
MEG data (for details see Supporting Information, available at https://doi.org/10.1162/nol_a
_00089).
Power. Neuronal power was analyzed (in sensor and source space) to investigate the brain
areas recruited for the processing of lexical- versus syllable-transition cues of words (2 Hz),
and syllables in these conditions (4 Hz). For the sensor space analysis, the data were interpo-
lated toward a standard gradiometer location based on the headmodel. It was epoched using a
time window of 0.5–9.5 s (0–0.5 s after stimulus onset were excluded to avoid onset-related
contamination) and averaged across all trials of a condition. Evoked power was computed
using single-taper frequency transformation (1–7 Hz) separately for each participant of the
two experiments in each condition (frequency resolution: 0.1111 Hz). At each frequency,
the power was contrasted with the neighboring frequency bins (±2–3 bins). Cluster-based per-
mutation tests using Monte Carlo estimation (Maris & Oostenveld, 2007) were performed to
analyze differences between the conditions within each experiment (German vs. Turkish/Non-
Turkish; dependent-sample T statistics) and across experiments (German vs. German and
Turkish vs. Non-Turkish; independent-sample T statistics) at 2 Hz and 4 Hz, with an iteration
of the condition affiliation (1,000 random permutations). In each permutation the cluster
across sensors with the highest summed t value was identified by keeping only the sensors
for which the difference between randomized conditions was significant at p = 0.05 (cluster
alpha; minimum number of neighborhood sensors = 2). This resulted in a distribution of 1,000
random permutation t values of maximum random clusters. Then, all the identified clusters
from the comparison between the actual conditions were compared to this random permuta-
tion distribution, and all the clusters with t value higher than the 97.5% or lower than the 2.5%
of the permutation distribution were flagged as significant.
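The cluster-based permutation logic can be sketched in simplified form. Assumptions: a 1-D chain of "sensors" stands in for the real sensor-neighborhood structure, condition membership is permuted by sign-flipping paired differences, and only positive clusters are tracked.

```python
import numpy as np

def paired_t(d):
    """Dependent-sample t values; d: (n_subjects, n_sensors) differences."""
    n = d.shape[0]
    return d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(n))

def max_cluster_mass(tvals, t_crit):
    """Largest summed t value over contiguous supra-threshold sensors."""
    best = run = 0.0
    for t in tvals:
        run = run + t if t > t_crit else 0.0
        best = max(best, run)
    return best

def cluster_permutation_p(diff, t_crit=2.0, n_perm=1000, seed=1):
    """Monte Carlo p value of the observed maximum cluster mass."""
    rng = np.random.default_rng(seed)
    observed = max_cluster_mass(paired_t(diff), t_crit)
    null = np.empty(n_perm)
    for i in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null[i] = max_cluster_mass(paired_t(diff * signs), t_crit)
    return float((null >= observed).mean())

rng = np.random.default_rng(0)
diff = 0.5 * rng.standard_normal((15, 30))  # 15 subjects, 30 "sensors"
diff[:, 10:15] += 1.0                       # true effect on 5 adjacent sensors
p = cluster_permutation_p(diff)             # small p: the cluster survives
```

The actual analysis additionally tracks negative clusters (two-sided testing against the 2.5th/97.5th percentiles) and uses the true sensor adjacency.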
In order to analyze the brain areas recruited during the processing of lexical versus syllable-
to-syllable transition cues of words (2 Hz) and syllables (4 Hz), dynamic imaging of coherent
sources (DICS) was used to localize neuronal power (Gross et al., 2001). First, based on
individual leadfields a common source filter (1.333–4.666 Hz) was computed across condi-
tions for each participant (lambda = 10%; 0.8 cm grid; note that we explored different lambda
values; see Figure S1 in the Supporting Information for an analysis with lambda = 100%,
which shows similar, though slightly less conservative, findings). Second, based on the filter
and Fourier-transformed data (multi-taper frequency transformation; 0.1111 Hz resolution),
the power at 2 Hz and 4 Hz was localized and contrasted with the neighboring frequency
bins (±2–3 bins). Differences in source power at 2 Hz and 4 Hz were tested using cluster-
based permutation tests (1,000 iterations; two-sided) to analyze differences between the
conditions within each experiment (German vs. Turkish and German vs. Non-Turkish;
dependent-sample T statistics) and across experiments (German vs. German and Turkish vs.
Non-Turkish; independent-sample T statistics) with an iteration of the condition affiliation.
In each permutation the cluster across voxels with the highest summed t value was identified
by keeping only the voxels for which the difference between randomized conditions was sig-
nificant at p = 0.05 (cluster alpha). This resulted in a distribution of 1,000 random permutation
t values of maximum random clusters. Then, all the identified clusters from the comparison
between the actual conditions were compared to this random permutation distribution, and
all the clusters with t value higher than the 97.5% or lower than the 2.5% of the permutation
distribution were flagged as significant.
Furthermore, in an additional analysis, the Brainnetome atlas (Fan et al., 2016) was used to
define regions of interest (ROIs; left and right superior temporal gyrus (STG), or STG1:
A41_42_L/R and STG2: TE1.0_TE1.2_L/R; MTG: anterior STS; supramarginal gyrus (SMG):
IPL A40rv; IFG: A44v; precentral gyrus (PCG): A6cdl) to further test the condition differences at
2 Hz revealed in the cluster-test analysis. Differences between conditions at each ROI were
tested separately for the hemispheres and the comparisons within each experiment (German
vs. Turkish and German vs. Non-Turkish; Wilcoxon signed-rank tests) and across experiments
(German vs. German and Turkish vs. Non-Turkish; Mann–Whitney–Wilcoxon tests). Bonferroni
correction across ROIs and hemispheres was applied to correct for multiple comparisons.
Cerebroacoustic coherence. In order to analyze the interaction between the 2 Hz word-level
and the 4 Hz syllable-level processing, syllable tracking (cerebroacoustic coherence in the auditory
cortex ROI at 4 Hz) was compared between conditions with or without word-level information
in both experiments. Note that cerebroacoustic coherence is typically computed at the syllabic
level, as this is where most acoustic energy is contained in the speech envelope (see
Figure 1C and D; Peelle et al., 2013). Therefore, first, the speech envelope was computed
separately for each sentence. The acoustic waveforms were filtered in 8 frequency bands that
are equidistant on the cochlear map (between 100 and 8000 Hz; third-order Butterworth filter;
forward and reverse; Smith et al., 2002). The speech envelope was computed by averaging
the magnitude of the Hilbert-transformed signal of the 8 frequency bands separately for each
sentence. The envelope was resampled to 500 Hz to match the MEG data sampling rate. Second,
after the spectral complex coefficients at 4 Hz were computed for the speech envelope of each
trial and the neuronal data (0.1111 Hz resolution), coherence (Rosenberg et al., 1989) between
all sensors and the speech envelope was computed. A common filter (DICS; lambda = 10%;
0.8 cm grid) was multiplied with the coherence, and Fisher z-transformation was applied. The
cerebro-acoustic coherence was averaged across voxels of the auditory cortex ROIs (STG1 and
STG2) separately for the left and right hemisphere. A mixed-model analysis of variance
(ANOVA) was conducted to test the between-subject effect of experiment and the within-
subject effects of condition (German, Turkish/Non-Turkish) and hemisphere (left, right).
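A stripped-down sketch of the coherence computation. Assumptions: a pure 4 Hz cosine stands in for the filterbank-and-Hilbert speech envelope, single simulated "sensors" replace beamformed source data, and trial-wise single-taper DFT coefficients feed the standard magnitude-squared coherence.

```python
import numpy as np

fs, dur, f_syll = 500, 2.0, 4.0                # Hz, s, syllable rate
t = np.arange(int(fs * dur)) / fs

def fourier_coef(x, f):
    """Complex single-taper DFT coefficient of x at frequency f."""
    return np.exp(-2j * np.pi * f * t) @ x / len(x)

def coherence(envelopes, neural, f):
    """Magnitude-squared coherence across trials at one frequency."""
    ex = np.array([fourier_coef(e, f) for e in envelopes])
    nx = np.array([fourier_coef(n, f) for n in neural])
    sxy = np.mean(ex * np.conj(nx))
    return abs(sxy) ** 2 / (np.mean(abs(ex) ** 2) * np.mean(abs(nx) ** 2))

rng = np.random.default_rng(2)
envs, locked, unlocked = [], [], []
for _ in range(30):                            # 30 trials
    phi = rng.uniform(0, 2 * np.pi)
    envs.append(1 + np.cos(2 * np.pi * f_syll * t + phi))
    # phase-locked "neural" signal: fixed lag relative to the envelope
    locked.append(np.cos(2 * np.pi * f_syll * t + phi + 0.6)
                  + rng.standard_normal(len(t)))
    # unlocked control: random phase per trial
    unlocked.append(np.cos(2 * np.pi * f_syll * t + rng.uniform(0, 2 * np.pi))
                    + rng.standard_normal(len(t)))
c_locked = coherence(envs, locked, f_syll)     # high: consistent phase lag
c_unlocked = coherence(envs, unlocked, f_syll) # low: no consistent relation
```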
Cross-frequency coupling. In order to test the connectivity between auditory cortex and other
brain areas, the interactions between word- and syllable-level processing, revealed by the
analysis of cerebro-acoustic coherence, were further investigated by comparing cross-
frequency coupling in conditions with and without lexical content and syllable transition infor-
mation. Additionally, condition contrasts were tested merged across experiments (i.e., the
German conditions were merged, as were the Turkish and Non-Turkish conditions). Cross-
frequency coupling was computed separately between the 4 Hz power envelope in a left or
right auditory cortex ROI and the 2 Hz power envelope measured across the whole cortex.
After trials were downsampled (100 Hz) and filtered (Butterworth, fourth order, bandpass: 1.5–
2.5 Hz and 3.5–4.5 Hz), the Hilbert transform was used to compute the complex spectral coef-
ficients at 2 Hz and at 4 Hz separately for each trial, hemisphere, condition, and participant. A
common filter (across conditions and frequencies: 1.5–4.5 Hz; linearly constrained minimum
variance; lambda = 10%; 0.8 cm grid) was computed and used to project each trial in source
space. Power envelopes were copula-normalized (Ince et al., 2017). Mutual information (MI)
was estimated (Ince et al., 2017) between the 4 Hz power envelopes (at voxels of a left and
right auditory cortex ROI) and 2 Hz power envelopes (measured across the whole cortex). For
this analysis, trials were concatenated separately for each participant and condition, and MI
was computed on the concatenated trials. MI was averaged across the voxels of the left and
right auditory cortex ROIs, respectively. Note that no correction for multiple comparisons
across permutation tests was applied.
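For two 1-D power envelopes, the Gaussian-copula MI estimate (Ince et al., 2017) reduces to rank-normalizing each signal and applying the closed-form Gaussian MI. A sketch with simulated envelopes; the coupling strength and signal lengths are invented.

```python
import numpy as np
from statistics import NormalDist

def copula_normalize(x):
    """Map the ranks of x through the inverse standard-normal CDF
    (the copula-normalization step)."""
    ranks = np.argsort(np.argsort(x))
    nd = NormalDist()
    return np.array([nd.inv_cdf((r + 1) / (len(x) + 1)) for r in ranks])

def gaussian_copula_mi(x, y):
    """MI in bits; for bivariate Gaussians MI = -0.5 * log2(1 - r**2)."""
    r = np.corrcoef(copula_normalize(x), copula_normalize(y))[0, 1]
    return -0.5 * np.log2(1.0 - r ** 2)

rng = np.random.default_rng(3)
env4 = rng.standard_normal(2000)                     # 4 Hz power envelope
env2 = 0.8 * env4 + 0.6 * rng.standard_normal(2000)  # coupled 2 Hz envelope
mi_coupled = gaussian_copula_mi(env4, env2)
mi_independent = gaussian_copula_mi(env4, rng.standard_normal(2000))
```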
Statistical learning analysis
In order to assess the dynamics of power changes across the experiment (i.e., to test statistical
learning across blocks; note that each block had a duration of 2.9 min with 1.45 min per con-
dition), sensor space power was computed trial-wise by using a jack-knifing procedure (i.e.,
the frequency analysis was performed across n-block-trials-leave-one-out) and averaged
across the individual M100 sensors. Otherwise the power analysis was matched to the
other analyses (i.e., computed for trials with correct responses; the neural power was con-
trasted with the neighboring frequency bins). A linear mixed-effects model (LMM; using R Core
Team, 2022, and lme4 package, Bates et al., 2015) analysis was used to test effects of
experimental block order (fixed effect: block order, random effects: participant ID; in an addi-
tional model the random slope effect of a polynomial model of block order was added) on the
neural power observed at 2 Hz in the Turkish condition in Experiment 1. Statistical learning
would be indicated by a linear increase of neural power across blocks (polynomial first order
model). In addition to learning, a fatigue effect might occur at the end of the experiment
(polynomial second order model). Models with/without a random slope effect of block order,
and first and second order polynomial models were compared based on the Bayesian infor-
mation criterion (BIC).
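The BIC model comparison can be sketched with ordinary least squares in place of the full lme4 mixed model (random effects omitted; block counts, effect sizes, and noise level are invented).

```python
import numpy as np

def ols_bic(x, y, degree):
    """Fit a polynomial of the given degree by OLS and return its BIC,
    n*ln(RSS/n) + k*ln(n); lower BIC = better fit/complexity trade-off."""
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    n = len(y)
    k = degree + 2                 # polynomial coefficients + noise variance
    return n * np.log(np.sum(resid ** 2) / n) + k * np.log(n)

rng = np.random.default_rng(4)
blocks = np.repeat(np.arange(1.0, 9.0), 20)    # 8 blocks, 20 trials each
# simulated 2 Hz power: learning (increase) plus end-of-experiment fatigue
power = (0.5 * blocks - 0.06 * blocks ** 2
         + 0.3 * rng.standard_normal(len(blocks)))
bic_linear = ols_bic(blocks, power, 1)      # learning only
bic_quadratic = ols_bic(blocks, power, 2)   # learning + fatigue
# with this simulated fatigue, the quadratic model attains the lower BIC
```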
Behavioral Measures
A mixed ANOVA was used to test the effect of lexical and syllable-to-syllable transition cues
on target discrimination accuracy (Figure 1D) in Experiments 1 and 2 (between-subject factor:
Experiment; within-subject factor: condition; equality of variances; Levene’s test: ps > 0.07;
normality, Shapiro–Wilk test: Bonferroni, pcorr = 0.0125; ps ∼ 0.027, 0.005, 0.586, 0.258).
Accuracy was higher in the German compared to the Turkish/Non-Turkish conditions (F(1, 35) =
100.759; p < 0.001; η2 = 0.365; German: 92% vs. Turkish/Non-Turkish: 83%). There was no
main effect of experiment (F(1, 35) = 1.794; p = 0.189; η2 = 0.025) or interaction (F(1, 35) =
1.303; p = 0.261; η2 = 0.005). The results indicate that the presence of lexical cues (in the
German conditions) facilitated performance.
Syllable-to-Syllable Transition Probability
Mann–Whitney–Wilcoxon tests revealed significantly higher within-word than between-word
transitional probabilities in both the German and Turkish conditions for all experiments, stimu-
lus sets, and measurements (for statistics see Table 1). The presence of phonological patterning
at the word (2 Hz) rate in the Turkish sequences thus allows for temporal grouping of Turkish
syllable pairs by German listeners in the absence of lexical processes. In contrast, the Non-
Turkish condition showed no significant difference (with one exception) in transitional prob-
abilities of syllables within and between pseudo-words, suggesting that no syllable transition
cues for grouping syllables into words were present. In the Non-Turkish condition for all mea-
surements, the transitional probabilities within and between pseudoword syllables differed by
less than 0.01 (1%). In contrast, in the Turkish condition differences in transitional probabilities
ranged from 5% to 67% among different measurements. Nonetheless, in the Non-Turkish
condition a significant effect of syllable identity transitions was observed for one stimulus
set. (Note, however, that this contrast was not significant when outlier transitions, higher than
2.5 SD, were removed.)
Lexical and Syllable-to-Syllable Transition Processing
In the sensor-space MEG analysis, lexical access effects are reflected in Experiment 1 in power
increases in the German compared to the Turkish condition at 2 Hz, at a left frontal and left
temporal cluster (p = 0.002; Figure 2A). In source space, the comparison revealed differences
at 2 Hz at a left lateralized frontal cluster (p = 0.022; strongest in left IFG (pars opercularis),
also including left pars orbitalis, left superior frontal gyrus, right superior frontal gyrus). Non-
parametric comparisons performed at 2 Hz separately for the left and right hemisphere and at
Figure 2. Lexical and syllable-to-syllable transition processing increases sensor space power. A–D.
Power contrasts (with neighboring frequency bins) averaged across the individual M100 sensors
(left); the topography of the power contrast differences between conditions at 2 Hz and 4 Hz (right).
Clusters that showed significant differences are marked with black dots. A. Lexical processing, in
Experiment 1, resulted in increased power at 2 Hz for the German compared to the Turkish condi-
tion at a left frontal cluster. B. Syllable transition processing resulted in increased power in the
Turkish compared to the Non-Turkish condition at 2 Hz at a left frontocentral and temporal, and
a right frontocentral cluster. C. Lexical plus syllable transition processing, in Experiment 2, resulted
in increased power at 2 Hz in the German compared to the Non-Turkish condition at a broadly
distributed cluster. D. No differences were detected for the across-experiment comparison of the German
conditions.
the STG1, STG2, MTG, SMG, IFG, and PCG ROIs revealed no significant condition differences
(left hemisphere: all p values > 0.0347; right hemisphere: all p values > 0.1119; Bonferroni
corrected alpha = 0.0042).
The cross-experiment comparison shows sensor-space syllable-to-syllable transition processing
effects (Turkish vs. Non-Turkish) within a broad left hemispheric cluster (p = 0.002)
and a broad right hemispheric cluster (p = 0.004; Figure 2B). In source space, syllable-to-syllable
transition processing resulted in increased power in the Turkish compared to the Non-Turkish
condition at a bilateral frontal, central, and temporal cluster (p = 0.002; with strongest activa-
tions at the STG, MTG, precentral/postcentral gyrus, and Rolandic operculum; Figure 3B).
Non-parametric comparisons performed at 2 Hz revealed condition differences at the left
STG1, STG2, MTG, and SMG ROIs (0.0001 < ps < 0.0034; Bonferroni corrected alpha =
0.0042). In the right hemisphere condition differences were significant at the STG1, STG2,
MTG, and SMG ROIs (0.0006 < ps < 0.0037; alpha = 0.0042).
Lexical plus sublexical processing in Experiment 2 resulted in sensor power increases in the
German compared to the Non-Turkish condition at a bilateral widespread cluster (p = 0.002;
Figure 2C). In source space, power was increased in the German compared to the Non-
Turkish condition at a bilateral frontal, central, and temporal cluster at 2 Hz (p = 0.0020; with
strongest activations at the STG, MTG, insula, precentral/postcentral gyrus; Figure 3C). Non-
parametric comparisons performed at 2 Hz revealed significant condition differences in the left
hemisphere at the STG1, STG2, MTG, SMG, and PCG ROIs (0.0001 < ps < 0.0015; Bonferroni
corrected alpha = 0.0042). In the right hemisphere condition differences were significant at all
ROIs (0.0001 < ps < 0.0022).
There were no significant differences detected at any cluster for the cross-experiment con-
trol comparison of the German conditions (sensor space: Figure 2D; source space: Figure 3D;
the statistics can be viewed in the t statistic maps in Figure S3) and no condition differences at
any ROI of the two hemispheres (ps > 0.2545). Likewise, there were no effects at 4 Hz at any
comparison in sensor or source space.
Word-Level Processing
In order to investigate whether word-level processing affects syllable tracking in auditory cor-
tex at 4 Hz (note that no neural power differences between conditions were observed at this
frequency), a mixed ANOVA on the cerebro-acoustic coherence in the auditory cortex ROI
was conducted (within-subject: hemisphere, condition; between-subject: experiment; equality
of variances, Levene’s test: ps > 0.2; normality, Shapiro–Wilk test: Bonferroni, pcorr = 0.0063;
ps ∼ 0.0565, 0.982, 0.034, 0.615, 0.052, 0.952, 0.226, 0.865). Cerebro-acoustic coherence
was smaller in the German conditions of both experiments compared to the Turkish/Non-
Turkish conditions (main effect of condition: F(1, 35) = 7.34, p = 0.010, ηp² = 0.173;
Figure 4A). Moreover, there was a main effect of hemisphere (F(1, 35) = 12.59, p = 0.001,
ηp² = 0.265), with overall larger cerebro-acoustic coherence in the right auditory cortex
ROI (Figure 4B). There were no interaction effects (Hemisphere × Experiment: F(1, 35) = 2.43,
p = 0.625, ηp² = 0.007; Hemisphere × Condition × Experiment: F(1, 35) = 0.155, p = 0.696,
ηp² = 0.004). However, there was a trend for larger condition differences in Experiment 1
compared to Experiment 2 and larger hemisphere differences for the Turkish/Non-Turkish
compared to the German conditions (Condition × Experiment: F(1, 35) = 3.188, p = 0.083,
ηp² = 0.083; Hemisphere × Condition: F(1, 35) = 3.568, p = 0.067, ηp² = 0.093). In sum, the
findings suggest that when lexical content was present (i.e., in the German conditions), syllable
tracking in auditory cortex at 4 Hz was decreased.
Figure 3. Lexical and syllable-to-syllable transition processing activates frontal and temporal cortex. A. In Experiment 1, lexical processing
resulted in increased power at 2 Hz in the German compared to the Turkish condition in a cluster with stronger activations particularly in left
inferior frontal brain areas (left). Exploratory comparison of condition differences at several regions of interest (ROIs; Bonferroni corrected;
right). B. Syllable transition processing resulted in a broad left and a broad right hemispheric cluster showing power increases at 2 Hz (left).
Condition differences were significant in several left and right hemispheric ROIs (right). C. Lexical plus syllable transition processing resulted in
a broad bilateral cluster showing power increases (left). Condition differences were significant in several left and right hemispheric ROIs (right).
In A–C the activity is masked by the clusters that showed significant effects. D. No significant differences were revealed in the German con-
ditions across experiments. Note that because of the null findings, no mask was applied in this figure.
Figure 4. Lexical content increases interactions between word- and acoustic syllable-level
processing. A. Syllable tracking in the auditory cortex ROI, measured using cerebro-acoustic
coherence, was significantly reduced in conditions where lexical content was present (German
conditions) compared to conditions where no lexical content was present (Turkish/Non-Turkish)
(main effect of condition, mixed ANOVA; left column). Cerebro-acoustic coherence was significantly
higher in the right compared to the left hemisphere (main effect of hemisphere, mixed ANOVA; right
column). B. The hemispheric differences tended to be larger for the Turkish/Non-Turkish condi-
tions, suggesting reduced speech tracking when lexical content was present particularly in the right
hemisphere, and condition effects tended to be larger in Experiment 1. C. Mutual information
(MI) was computed to estimate cross-frequency coupling between syllable-level processing
(4 Hz) in the auditory cortex ROI and word-level processing (2 Hz), in order to further investigate
the observed interaction. No cluster with significant effects was observed for the left hemispheric
auditory cortex ROI. In contrast, when lexical content was present (German conditions vs.
Turkish/Non-Turkish), MI was increased between the right auditory cortex ROI and a cluster
including inferior frontal, superior and middle temporal, and temporal-parietal brain areas.
To further analyze how syllable processing at 4 Hz is affected by the presence of lexical
content at 2 Hz, cross-frequency coupling analyses were performed using MI. No clusters with
significant effects were found for the contrasts German vs. Turkish (Exp. 1; for the left and right
auditory cortex ROIs: ps > 0.39) or German vs. Non-Turkish (Exp. 2; for the left and right audi-
tory cortex ROIs: ps = 1) or Turkish vs. Non-Turkish (across experiments; for the left and right
auditory cortex ROIs: ps > 0.53). Additionally, the merged condition contrast was tested
(German conditions were merged across experiments, and similarly Turkish/Non-Turkish con-
ditions). For the right auditory cortex ROI, a cluster with significant differences between con-
ditions was observed (p = 0.004). In the German conditions (merged across experiments), MI
was increased between the 4 Hz envelope amplitude in the right auditory cortex ROI and a
right hemispheric frontal, superior, middle temporal, and temporal parietal positive cluster;
activity was most pronounced in the right: STG, MTG, IFG, insula, postcentral/precentral,
and inferior parietal cortex (however, some activity was observed in the left PCG) compared
to the conditions without lexical content (Turkish/Non-Turkish; Figure 4C). No clusters with
significant condition differences were observed for the left auditory cortex ROI (ps > 0.39).
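The cross-frequency coupling measure used above, Gaussian-copula mutual information (Ince et al., 2017), can be sketched in a few lines. This is a minimal illustration on simulated amplitude time series; the variable names and coupling strength are invented, and it is not the authors' source-space pipeline:

```python
import numpy as np
from scipy.stats import rankdata, norm

def copula_normalize(x):
    """Map a 1-D signal onto standard-normal quantiles via its empirical CDF."""
    return norm.ppf(rankdata(x) / (len(x) + 1.0))

def gc_mi(x, y):
    """Gaussian-copula mutual information (in bits) between two 1-D signals."""
    cx, cy = copula_normalize(x), copula_normalize(y)
    rho = np.corrcoef(cx, cy)[0, 1]
    return -0.5 * np.log2(1.0 - rho ** 2)

rng = np.random.default_rng(0)
amp_4hz = rng.standard_normal(5000)                  # stand-in 4 Hz envelope amplitude
amp_2hz = 0.6 * amp_4hz + rng.standard_normal(5000)  # coupled 2 Hz signal
control = rng.standard_normal(5000)                  # independent control signal
print(gc_mi(amp_4hz, amp_2hz), gc_mi(amp_4hz, control))
```

Because the copula step discards the marginal distributions, the estimate is robust to outliers and monotone transforms; coupled signals yield clearly higher MI than independent ones.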
Statistical Learning Analyses
A trial-wise LMM analysis was conducted on the sensor-space power at 2 Hz in the Turkish
condition of Experiment 1 (Figure S4A–C). The second-order polynomial LMM showed a
tendency towards a first-degree effect (beta estimate: −2.82, SE: 1.53, CI: [−5.83, 0.19], t =
−1.84, p = 0.066) and a significant second-degree effect (beta estimate: −4.06, SE: 1.54, CI:
[−7.07, −1.05], t = −2.64, p = 0.008; Table S1). However, when the random slope for block order was added
to the model, neither the first-degree (beta estimate: −3.88, SE: 2.89, CI: [−9.54, 1.78], t = −1.35, p = 0.178) nor the
second-degree effect was significant (beta estimate: −4.39, SE: 3.16, CI: [−10.59, 1.81], t =
−1.39, p = 0.165; Table S2). The polynomial model with the random slope effect included was
selected based on the BIC (BIC = 4,907; model without random slope, BIC = 4,931; first-order
polynomial models without/with slope had larger BIC values, BIC = 4,933 and BIC = 4,920).
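The analysis above was fit in lme4 (Bates et al., 2015); the following Python/statsmodels sketch reproduces only the model-comparison logic on simulated data: polynomial fixed effects of block order, with and without a by-subject random slope, compared by BIC. The column names and effect sizes are invented for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
rows = []
for subj in range(20):
    icpt = rng.normal(0.0, 1.0)     # subject-specific intercept
    slope = rng.normal(0.0, 1.5)    # subject-specific block-order slope
    for b in range(8):
        x = (b - 4) / 8.0           # centered block order
        mu = icpt - 0.3 * x - 0.8 * x ** 2 + slope * x
        for _ in range(12):         # trials per block
            rows.append((subj, x, mu + rng.normal(0.0, 1.0)))
df = pd.DataFrame(rows, columns=["subject", "block", "power_2hz"])

def bic(res):
    # BIC = k * log(n) - 2 * logLik, from the ML (not REML) fit;
    # k counts fixed effects plus (approximately) the variance parameters.
    k = res.params.size + 1
    return k * np.log(res.nobs) - 2.0 * res.llf

m_int = smf.mixedlm("power_2hz ~ block + I(block**2)", df,
                    groups=df["subject"]).fit(reml=False)
m_slp = smf.mixedlm("power_2hz ~ block + I(block**2)", df,
                    groups=df["subject"], re_formula="~block").fit(reml=False)
print(bic(m_int), bic(m_slp))  # the lower BIC identifies the preferred model
```

With a genuine between-subject spread in slopes, as simulated here, the random-slope model attains the lower BIC, mirroring the selection logic reported above.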
DISCUSSION
We show that the frequency-tagging paradigm can be used to distinguish aspects of lexical-
level and syllable-to-syllable-transition information processing by differentiating neuronal net-
works activated at 2 Hz. Our findings indicate that syllable-to-syllable transitions of a foreign
language are rapidly learned and tracked, at least when there is an overlap in sublexical cues
between foreign and native languages. Moreover, we used the frequency-tagging paradigm
to investigate interactions between acoustic syllable-level and word-level processing. Specifi-
cally, we found, first, decreased tracking of syllables (cerebro-acoustic coherence at 4 Hz) in
auditory cortex when lexical word-level content was present compared to all other conditions;
second, for the same contrast, cross-frequency coupling was increased between 4 Hz activity in
right auditory cortex and 2 Hz activity in a cluster that included frontal, middle, and superior
temporal areas. The data might indicate interactions between lexical processing of words (here
at 2 Hz) and acoustic-syllable processing (here at 4 Hz); however, further work is required. Note
that at both the syllable level and the word level we are not committed to any decoding scheme.
At the syllable level, we show that acoustic syllabic information—to be decoded as a whole
unit or as a sequence of phonemes—is obtained within a window duration that is inside the
theta range. The strongest evidence that this window is determined by theta-band oscillations
comes from earlier work on the association of the drop in intelligibility of speeded speech with
the upper frequency range of theta (Doelling et al., 2014; Ghitza & Greenberg, 2009). At the
word level, we do not link our findings on lexical processing to oscillations.
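Cerebro-acoustic coherence, the syllable-tracking measure referred to above, quantifies frequency-resolved consistency between the recorded signal and the speech envelope. A minimal scipy sketch on simulated signals; the sampling rate, SNR, and phase lag are arbitrary, and the actual analysis was done on source-localized MEG:

```python
import numpy as np
from scipy.signal import coherence

fs = 200.0                          # sampling rate in Hz (illustrative)
t = np.arange(0.0, 60.0, 1.0 / fs)  # 60 s of signal
rng = np.random.default_rng(2)

# Simulated 4 Hz speech envelope and a "neural" signal that tracks it
# with a fixed phase lag, buried in noise.
envelope = 1.0 + np.cos(2.0 * np.pi * 4.0 * t)
neural = 0.5 * np.cos(2.0 * np.pi * 4.0 * t + 0.8) + rng.standard_normal(t.size)

f, coh = coherence(envelope, neural, fs=fs, nperseg=int(4 * fs))
print(coh[np.argmin(np.abs(f - 4.0))])  # coherence at the 4 Hz syllable rate
```

Despite the noise, the consistent phase relation yields coherence near 1 at 4 Hz; this per-frequency quantity is what is compared across conditions and hemispheres above.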
Lexical and Syllable-to-Syllable Transition Processing of Words
Lexical processing, compared to sublexical syllable-to-syllable transition processing, showed
increased activity at a cluster of left-lateralized frontal sensors, localized to left frontal brain
areas. Previous (fMRI, MEG, and lesion) research emphasized the role of the posterior middle
temporal lobe in lexical-semantic processing of words, which was often reported to be left
lateralized with some degree of bilateral recruitment (Figure 2A and Figure 3A; Gow, 2012;
Hickok & Poeppel, 2004, 2007; Peelle, 2012; Rice et al., 2018; Thompson-Schill et al., 1997;
Utman et al., 2001). Moreover, some studies have reported a much broader network for
lexical-semantic processing including the (more strongly activated) inferior frontal lobe, for
example in tasks that elicit lexical competition (Kan et al., 2006; Rodd et al., 2015; Thompson-
Schill et al., 1997). However, others suggested a role of the inferior frontal lobe in sublexical
segmentation (Burton et al., 2000) or argued that the recruitment of frontal motor areas reflects
working memory processes rather than lexical-semantic processing per se (Rogalsky et al.,
2022). In light of these previous findings, our findings of increased activity in left lateralized
frontal brain areas when lexical content was present need to be interpreted cautiously. Limi-
tations of contrasting MEG source maps need to be considered, which can result in erroneous
brain maps (Bourguignon et al., 2018). Given such limitations, our findings alternatively might
reflect activity of sources centered in STG with slightly different center configurations in the
German and Turkish conditions. For visual comparison the source maps are displayed sepa-
rately per condition (Figure S3).
“Mere” syllable transition processing compared to acoustic syllable processing, in contrast,
activated fronto-centro-temporal brain areas in both hemispheres (Figure 2B and Figure 3B;
see also Figure 2C and Figure 3C). Previously, a functional subdivision of the temporal cortex
has been proposed, with bilateral STS activations during lower-level acoustic speech process-
ing and a left-lateralized activation of the more ventral temporal-parietal cortex during lexical-
semantic processing (Binder et al., 2000, 2009). In line with this subdivision, our findings
further suggest that, beyond acoustic processing, sublexical syllable-transition processing
occurs bilaterally. In our paradigm, increased neuronal activity in the native language condi-
tion, which contained semantic and syllable transition cues to group syllables into words, com-
pared to a foreign language condition, which contained only syllable transition cues (Table 1),
indicates lexical processing of words. Lexical processing and syllable transition processing,
however, are tightly entangled; thus an alternative possibility is that the observed increase in
neuronal activity partly reflects better-learned syllable transitions in a native compared to a
foreign language condition.
Processing of Syllable-to-Syllable Transition Cues
Behavioral research suggests that sequencing of phonemes—because the distribution of pho-
nemes varies across syllables—can be used to detect syllable-to-syllable transitions and word
boundaries (Brodbeck et al., 2018; McQueen, 1998), as well as the position of syllables within
words (Cutler, 2012; van der Lugt, 2001). Our findings indicate brain areas involved in using
syllable transition information to process disyllabic words (Figure 2B and Figure 3B). Our find-
ings provide evidence that even the syllable transition information present in a foreign lan-
guage can be extracted: that is, sublexical cues that can be used for grouping syllables into
words (including phoneme transition probabilities between words), such as the onset of a
syllable or the consonant-vowel pattern, which were present in both the German and Turkish
conditions but not in the Non-Turkish condition (Table 1). In the present study, the stimuli
were recorded and preprocessed so that acoustic cues at the word level were minimized,
resulting in a prominent power peak only at the syllable rate at 4 Hz, but not at the word rate at
2 Hz (Figure 1B–C). Thus, the increased power peak at 2 Hz in the Turkish compared to the
Non-Turkish condition most likely reflects the processing of syllable-transition features rather
than the processing of acoustic cues. (For caveats because of acoustic cues at the word level in
artificial languages, see C. Luo & Ding, 2020; Pinto et al., 2022.)
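The acoustic control described above can be illustrated on a toy envelope: if the acoustics repeat only at the 4 Hz syllable rate, the stimulus spectrum shows a peak at 4 Hz and none at the 2 Hz word rate. This is a simulated sketch, not the actual stimuli:

```python
import numpy as np

fs = 1000.0
t = np.arange(0.0, 40.0, 1.0 / fs)  # 40 s toy stimulus
# Toy envelope with identical syllable-sized fluctuations at 4 Hz; because
# consecutive syllables are acoustically matched, nothing in the acoustics
# repeats at the 2 Hz disyllabic-word rate.
envelope = 1.0 + np.cos(2.0 * np.pi * 4.0 * t)

spec = np.abs(np.fft.rfft(envelope - envelope.mean())) / t.size
freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
peak_4hz = spec[np.argmin(np.abs(freqs - 4.0))]
peak_2hz = spec[np.argmin(np.abs(freqs - 2.0))]
print(peak_4hz, peak_2hz)  # prominent 4 Hz component, essentially none at 2 Hz
```

Any 2 Hz peak in the neural response therefore cannot be inherited from the stimulus acoustics and must reflect internally generated word-level grouping.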
In the current study, we carefully matched the German and Turkish stimulus material with
regard to the sublexical syllable-to-syllable transition cues. Possibly this enhanced the ability
of participants to quickly learn and extract the sublexical contingencies of a foreign language.
If the ability to extract such contingencies at the word-level depends on the similarity of these
features between languages, the frequency-tagging paradigm could be used as a neurophysi-
ological tool to investigate the phonological similarity between languages, without requiring
explicit feedback from participants. In order to test statistical learning (Ota & Skarabela, 2016,
2018; Pena & Melloni, 2012; Saffran et al., 1996), we analyzed whether the tracking of sub-
lexical syllable transitions (in the Turkish condition) varied across experimental blocks. We
found a tendency toward an increase of neural power at the word level (2 Hz) across the initial
experimental blocks. Moreover, a power decrease across the last blocks was significant,
indicating statistical learning and possibly fatigue-related effects, respectively (Figure S4A–
B). However, when the variance across participants in neural power changes across blocks was
taken into account, these effects were not significant (Figure S4C). Visual inspection of the individual
data (Figure S4C) suggests that statistical learning only occurred in some participants. Our
findings are in line with previous findings that show rapid statistical learning of words and
phrases in an artificial language (Buiatti et al., 2009; Getz et al., 2018; Pinto et al., 2022), with
some variation in the time needed to establish word tracking (Buiatti et al., 2009, ∼9 min;
Pinto et al., 2022, ∼3.22 min) or phrase tracking (Getz et al., 2018, ∼4 min). Notably, a
recent study on statistical learning in an artificial language found no effects of the block order
on word-level tracking, interpreted as rapid learning within the duration of the first block (Pinto
et al., 2022, 3.22 min). In line with our finding, they furthermore pointed out the high variance
in whether neural tracking of words occurred at the single subject level, which was only
observed in 30% of the participants.
Interactions
Previous speech comprehension models have focused on mapping of acoustic-phonemic to
lexical processing (e.g., Marslen-Wilson & Welsh, 1978). Neurophysiological data, however,
provide compelling evidence for the extraction of acoustic information at the syllable level
(Gross et al., 2001; H. Luo & Poeppel, 2007; Panzeri et al., 2010). What does that mean
for our understanding of speech comprehension? In accordance with previous evidence,
our findings show stronger syllable tracking (4 Hz; cerebro-acoustic coherence) in the right
compared to the left auditory cortex (Flinker et al., 2019; Giroud et al., 2020; H. Luo &
Poeppel, 2007). Crucially, syllable tracking decreased when lexical content was present
(i.e., German condition; compared to when no lexical content was present), indicating an
interaction between word-level and acoustic syllable-level processing. Our findings are in line
with several previous findings: In frequency-tagging paradigms, lexical processing of words (in
artificial word learning, or when compared with a foreign language) resulted in reduced power
at the syllabic rate when words were intelligible compared to unintelligible (Buiatti et al.,
2009; Makov et al., 2017; Pinto et al., 2022). In contrast, many studies have found increased
syllable tracking in left auditory cortex during the processing of intelligible compared to unin-
telligible speech (e.g., Park et al., 2015; Peelle et al., 2013; Rimmele et al., 2015). Such con-
troversial findings have been explained in the context of the predictive coding framework
(Sohoglu & Davis, 2016). Increased intelligibility and tracking in STG due to sensory detail in
the stimulus acoustics (original vs. noise-vocoded speech) was related to increased prediction
errors. In contrast, increased intelligibility and reduced speech tracking in STG due to prior
(linguistic) knowledge was related to increased top-down predictions. The latter effect was
observed particularly in the right hemisphere. These findings suggest that the effects of the inter-
action can vary depending on the paradigm, the processes performed, and so on.
More specifically, in our study, acoustic syllable-level processing in right auditory cortex
showed increased interactions with lexical word-level processing in right inferior frontal, supe-
rior, and middle temporal cortex (cross-frequency coupling). In line with proposals of a crucial
role of the MTG as an interface between phonetic and semantic representations (Gow, 2012),
our findings suggest that in addition to the inferior frontal brain areas and the STG, the MTG is
involved in communicating information between syllable- and word-level processing. It is
likely that our findings indicate both feedforward communication from auditory cortex to
higher-level processing areas and feedback from the word level to the syllable level. For
example, the first syllable might provide (temporal and/or semantic) predictions of the second
syllable. Interactions between lexical and phonological processing have been shown to
involve feedback from posterior MTG to posterior STG (Gow & Segawa, 2009; for review
see Gow et al., 2008). Moreover, several electrophysiological studies suggest
interactions/feedback from sentential (Gow & Olson, 2016) or phrasal processing (Keitel
et al., 2018), or possibly both (Park et al., 2015), to syllable processing. However, research that
is particularly designed to investigate the interactions at the word level is rare (Gow & Olson,
2016; Keitel et al., 2018; Mai et al., 2016). One limitation of our findings is that effects sug-
gesting syllable-to-word level interactions were only observed when conditions with lexical
content at the word level were compared to all other conditions (Turkish/Non-Turkish), but
not in separate comparisons. A possibility is that the acoustic syllable to word-level interac-
tions were weak and the effects significant only for the larger data sets. This conjecture is in
line with Pinto et al. (2022), who reported low statistical reliability of the effect of word learning
on syllable tracking.
Conclusions
Our data shed light on the contribution of syllable-to-syllable transition cues to neural process-
ing at the word level. In particular, we find that sublexical syllable-to-syllable transitions are rap-
idly tracked in a foreign language. Moreover, the increased coupling between word- and
syllable-level processing when lexical cues are present suggests that these processes are
interactive.
ACKNOWLEDGMENTS
We thank Marius Schneider for help with the data recording, Ilkay Isik for checking the Turkish
stimulus material, Dr. Florencia Assaneo for discussions, and Dr. Klaus Frieler for statistics
support.
FUNDING INFORMATION
This work was funded by the Max Planck Institute for Empirical Aesthetics.
AUTHOR CONTRIBUTIONS
Johanna M. Rimmele: Conceptualization: Equal; Formal analysis: Lead; Methodology: Lead;
Project administration: Equal; Visualization: Lead; Writing – original draft: Equal; Writing –
review & editing: Equal. Yue Sun: Conceptualization: Equal; Formal analysis: Supporting;
Methodology: Supporting; Writing – review & editing: Equal. Georgios Michalareas: Method-
ology: Supporting; Visualization: Supporting; Writing – review & editing: Equal. Oded Ghitza:
Conceptualization: Equal; Formal analysis: Supporting; Visualization: Supporting; Writing –
review & editing: Equal. David Poeppel: Conceptualization: Equal; Funding acquisition:
Equal; Methodology: Supporting; Writing – review & editing: Equal.
DATA AVAILABILITY STATEMENT
Parts of the data are available on Edmond, the open research repository of the Max Planck
Society.
REFERENCES
Ahissar, E., & Ahissar, M. (2005). Processing of the temporal enve-
lope of speech. In R. König, P. Heil, E. Budinger, & H. Scheich
(Eds.), The auditory cortex: A synthesis of human and animal
research (pp. 295–313). Psychology Press.
Altmann, G. T. M., & Kamide, Y. (1999). Incremental interpretation
at verbs: Restricting the domain of subsequent reference. Cogni-
tion, 73(3), 247–264. https://doi.org/10.1016/S0010-0277(99)
00059-1, PubMed: 10585516
Aslin, R. N., & Newport, E. L. (2012). Statistical learning: Aus
acquiring specific items to forming general rules. Current Direc-
tions in Psychological Science, 21(3), 170–176. https://doi.org/10
.1177/0963721412436806, PubMed: 24000273
Aslin, R. N., & Newport, E. L. (2014). Distributional language learn-
ing: Mechanisms and models of category formation. Language
Learning, 64(s2), 86–105. https://doi.org/10.1111/lang.12074,
PubMed: 26855443
Assaneo, M. F., & Poeppel, D. (2018). The coupling between
auditory and motor cortices is rate-restricted: Evidence for an
intrinsic speech-motor rhythm. Science Advances, 4(2), Article
eaao3842. https://doi.org/10.1126/sciadv.aao3842, PubMed:
29441362
Baayen, R., Piepenbrock, R., & Gulikers, L. (1995). CELEX2
LDC96L14 [Database]. Linguistic Data Consortium. https://doi
.org/10.35111/gs6s-gm48
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear
mixed-effects models using lme4. Journal of Statistical Software,
67(1), 1–48. https://doi.org/10.18637/jss.v067.i01
Batterink, L. J., & Paller, K. A. (2017). Online neural monitoring of
statistical learning. Kortex, 90, 31–45. https://doi.org/10.1016/j
.cortex.2017.02.004, PubMed: 28324696
Binder, J. R., Desai, R. H., Graves, W. W., & Conant, L. L. (2009).
Where is the semantic system? A critical review and
meta-analysis of 120 functional neuroimaging studies. Cerebral
Cortex, 19(12), 2767–2796. https://doi.org/10.1093/cercor
/bhp055, PubMed: 19329570
Binder, J. R., Frost, J. A., Hammeke, T. A., Bellgowan, P. S. F.,
Springer, J. A., Kaufman, J. N., & Possing, E. T. (2000). Human
temporal lobe activation by speech and nonspeech sounds. Cere-
bral Cortex, 10(5), 512–528. https://doi.org/10.1093/cercor/10.5
.512, PubMed: 10847601
Boersma, P. (2001). PRAAT, a system for doing phonetics by com-
puter. Glot International, 5(9/10), 341–347.
Bourguignon, M., Molinaro, N., & Wens, V. (2018). Contrasting
functional imaging parametric maps: The mislocation problem
and alternative solutions. NeuroImage, 169, 200–211. https://doi
.org/10.1016/j.neuroimage.2017.12.033, PubMed: 29247806
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision,
10(4), 433–436. https://doi.org/10.1163/156856897X00357,
PubMed: 9176952
Brodbeck, C., Hong, L. E., & Simon, J. Z. (2018). Rapid transforma-
tion from auditory to linguistic representations of continuous
speech. Current Biology, 28(24), 3976–3983. https://doi.org/10
.1016/j.cub.2018.10.042, PubMed: 30503620
Brysbaert, M., & Diependaele, K. (2012). Dealing with zero word
frequencies: A review of the existing rules of thumb and a sug-
gestion for an evidence-based choice. Behavior Research
Methods, 45, 422–430. https://doi.org/10.3758/s13428-012
-0270-5, PubMed: 23055175
Buiatti, M., Peña, M., & Dehaene-Lambertz, G. (2009). Investigat-
ing the neural correlates of continuous speech computation with
frequency-tagged neuroelectric responses. NeuroImage, 44(2),
509–519. https://doi.org/10.1016/j.neuroimage.2008.09.015,
PubMed: 18929668
Burton, M. W., Small, S. L., & Blumstein, S. E. (2000). The role of
segmentation in phonological processing: An fMRI investiga-
tion. Journal of Cognitive Neuroscience, 12(4), 679–690.
https://doi.org/10.1162/089892900562309, PubMed: 10936919
Chen, Y., Jin, P., & Ding, N. (2020). The influence of linguistic infor-
mation on cortical tracking of words. Neuropsychologia, 148,
Article 107640. https://doi.org/10.1016/j.neuropsychologia
.2020.107640, PubMed: 33011188
Corretge, R. (2022). Praat vocal toolkit (Software plugin). https://
www.praatvocaltoolkit.com
CTF MEG Neuro Innovations. (2021). Omega 2000 (Apparatus).
https://www.ctf.com
Current Designs. (2022). Button box (Apparatus). https://www
.curdes.com
Cutler, A. (2012). Native listening: Language experience and the
recognition of spoken words. MIT Press. https://doi.org/10.7551
/mitpress/9012.001.0001
Daube, C., Ince, R. A. A., & Gross, J. (2019). Simple acoustic features
can explain phoneme-based predictions of cortical responses to
speech. Current Biology, 29(12), 1924–1937. https://doi.org/10
.1016/j.cub.2019.04.067, PubMed: 31130454
Di Liberto, G. M., O’Sullivan, J. A., & Lalor, E. C. (2015). Low-
frequency cortical entrainment to speech reflects phoneme-level
processing. Current Biology, 25(19), 2457–2465. https://doi.org
/10.1016/j.cub.2015.08.030, PubMed: 26412129
Ding, N., Melloni, L., Zhang, H., Tian, X., & Poeppel, D. (2016).
Cortical tracking of hierarchical linguistic structures in connected
speech. Nature Neuroscience, 19(1), 158–164. https://doi.org/10
.1038/nn.4186, PubMed: 26642090
Ding, N., Pan, X., Luo, C., Su, N., Zhang, W., & Zhang, J. (2018).
Attention is required for knowledge-based sequential grouping:
Insights from the integration of syllables into words. Journal of
Neuroscience, 38(5), 1178–1188. https://doi.org/10.1523
/JNEUROSCI.2606-17.2017, PubMed: 29255005
Doelling, K. B., Arnal, L. H., Ghitza, O., & Poeppel, D. (2014).
Acoustic landmarks drive delta–theta oscillations to enable
speech comprehension by facilitating perceptual parsing. Neuro-
Image, 85(2), 761–768. https://doi.org/10.1016/j.neuroimage
.2013.06.035, PubMed: 23791839
Fan, L., Chu, C., Li, H., Chen, L., Xie, S., Zhang, Y., Yang, Z., Jiang, T.,
Laird, A. R., Wang, J., Zhuo, J., Yu, C., Fox, P. T., & Eickhoff, S. B.
(2016). The human Brainnetome Atlas: A new brain atlas based on
connectional architecture. Cerebral Cortex, 26(8), 3508–3526.
https://doi.org/10.1093/cercor/bhw157, PubMed: 27230218
Flinker, A., Doyle, W. K., Mehta, A. D., Devinsky, O., & Poeppel,
D. (2019). Spectrotemporal modulation provides a unifying
framework for auditory cortical asymmetries. Nature Human
Behaviour, 3(4), 395–405. https://doi.org/10.1038/s41562-019
-0548-z, PubMed: 30971792
Getz, H., Ding, N., Newport, E. L., & Poeppel, D. (2018). Cortical
tracking of constituent structure in language acquisition. Cogni-
tion, 181, 135–140. https://doi.org/10.1016/j.cognition.2018.08
.019, PubMed: 30195135
Ghitza, O. (2011). Linking speech perception and neurophysiol-
ogy: Speech decoding guided by cascaded oscillators locked to
the input rhythm. Frontiers in Psychology, 2, 130. https://doi.org
/10.3389/fpsyg.2011.00130, PubMed: 21743809
Ghitza, O., & Greenberg, S. (2009). On the possible role of brain
rhythms in speech perception: Intelligibility of time-compressed
speech with periodic and aperiodic insertions of silence. Phone-
tica, 66(1–2), 113–126. https://doi.org/10.1159/000208934,
PubMed: 19390234
Giroud, J., Trébuchon, A., Schön, D., Marquis, P., Liegeois-Chauvel,
C., Poeppel, D., & Morillon, B. (2020). Asymmetric sampling in
human auditory cortex reveals spectral processing hierarchy.
PLOS Biology, 18(3), Article e3000207. https://doi.org/10.1371
/journal.pbio.3000207, PubMed: 32119667
Gow, D. W. (2012). The cortical organization of lexical knowledge:
A dual lexicon model of spoken language processing. Brain and
Language, 121(3), 273–288. https://doi.org/10.1016/j.bandl
.2012.03.005, PubMed: 22498237
Gow, D. W., & Olson, B. B. (2016). Sentential influences on
acoustic-phonetic processing: A Granger causality analysis of
multimodal imaging data. Language, Cognition and Neurosci-
enz, 31(7), 841–855. https://doi.org/10.1080/23273798.2015
.1029498, PubMed: 27595118
Gow, D. W., & Segawa, J. A. (2009). Articulatory mediation of
speech perception: A causal analysis of multi-modal imaging
Daten. Cognition, 110(2), 222–236. https://doi.org/10.1016/j
.cognition.2008.11.011, PubMed: 19110238
Gow, D. W., Segawa, J. A., Ahlfors, S. P., & Lin, F.-H. (2008).
Lexical influences on speech perception: A Granger causality
analysis of MEG and EEG source estimates. NeuroImage, 43(3),
614–623. https://doi.org/10.1016/j.neuroimage.2008.07.027,
PubMed: 18703146
Gross, J., Hoogenboom, N., Thut, G., Schyns, P., Panzeri, S., Belin,
P., & Garrod, S. (2013). Speech rhythms and multiplexed oscilla-
tory sensory coding in the human brain. PLOS Biology, 11(12),
Article e1001752. https://doi.org/10.1371/journal.pbio
.1001752, PubMed: 24391472
Gross, J., Kujala, J., Hämäläinen, M., Timmermann, L., Schnitzler,
A., & Salmelin, R. (2001). Dynamic imaging of coherent sources:
Studying neural interactions in the human brain. Proceedings of
the National Academy of Sciences, 98(2), 694–699. https://doi
.org/10.1073/pnas.98.2.694, PubMed: 11209067
Haegens, S., & Zion Golumbic, E. (2018). Rhythmic facilitation of
sensory processing: A critical review. Neuroscience & Biobehav-
ioral Reviews, 86, 150–165. https://doi.org/10.1016/j.neubiorev
.2017.12.002, PubMed: 29223770
Henin, S., Turk-Browne, N. B., Friedman, D., Liu, A., Dugan, P.,
Flinker, A., Doyle, W., Devinsky, O., & Melloni, L. (2021). Learn-
ing hierarchical sequence representations across human cortex
and hippocampus. Science Advances, 7(8), Article eabc4530.
https://doi.org/10.1126/sciadv.abc4530, PubMed: 33608265
Hickok, G., & Poeppel, D. (2004). Dorsal and ventral streams: A
framework for understanding aspects of the functional anatomy
of language. Cognition, 92(1–2), 67–99. https://doi.org/10.1016/j
.cognition.2003.10.011, PubMed: 15037127
Hickok, G., & Poeppel, D. (2007). The cortical organization of
speech processing. Nature Reviews Neuroscience, 8(5), 393–402.
https://doi.org/10.1038/nrn2113, PubMed: 17431404
Hilton, C. B., & Goldwater, M. B. (2021). Linguistic syncopation:
Meter-syntax alignment affects sentence comprehension and
sensorimotor synchronization. Cognition, 217, Article 104880.
https://doi.org/10.1016/j.cognition.2021.104880, PubMed:
34419725
Howard, M. F., & Poeppel, D. (2010). Discrimination of speech
stimuli based on neuronal response phase patterns depends on
acoustics but not comprehension. Journal of Neurophysiology,
104(5), 2500–2511. https://doi.org/10.1152/jn.00251.2010,
PubMed: 20484530
Ince, R. A. A., Giordano, B. L., Kayser, C., Rousselet, G. A., Gross, J.,
& Schyns, P. G. (2017). A statistical framework for neuroimaging
data analysis based on mutual information estimated via a
Gaussian copula. Human Brain Mapping, 38(3), 1541–1573.
https://doi.org/10.1002/hbm.23471, PubMed: 27860095
Jadoul, Y., Ravignani, A., Thompson, B., Filippi, P., & de Boer, B.
(2016). Seeking temporal predictability in speech: Comparing
statistical approaches on 18 world languages. Frontiers in Human
Neuroscience, 10, 586. https://doi.org/10.3389/fnhum.2016
.00586, PubMed: 27994544
Jepsen, M. L., Ewert, S. D., & Dau, T. (2008). A computational
model of human auditory signal processing and perception.
The Journal of the Acoustical Society of America, 124(1), 422–
438. https://doi.org/10.1121/1.2924135, PubMed: 18646987
Kan, I. P., Kable, J. W., Van Scoyoc, A., Chatterjee, A., & Thompson-
Schill, S. L. (2006). Fractionating the left frontal response to tools:
Dissociable effects of motor experience and lexical competition.
Journal of Cognitive Neuroscience, 18(2), 267–277. https://doi.org
/10.1162/jocn.2006.18.2.267, PubMed: 16494686
Kaufeld, G., Bosker, H. R., ten Oever, S., Alday, P. M., Meyer, A. S.,
& Martin, A. E. (2020). Linguistic structure and meaning organize
neural oscillations into a content-specific hierarchy. Journal of
Neuroscience, 40(49), 9467–9475. https://doi.org/10.1523
/JNEUROSCI.0302-20.2020, PubMed: 33097640
Keitel, A., & Gross, J. (2016). Individual human brain areas can be
identified from their characteristic spectral activation fingerprints.
PLOS Biology, 14(6), Article e1002498. https://doi.org/10.1371
/journal.pbio.1002498, PubMed: 27355236
Keitel, A., Gross, J., & Kayser, C. (2018). Perceptually relevant
speech tracking in auditory and motor cortex reflects distinct lin-
guistic features. PLOS Biology, 16(3), Article e2004473. https://
doi.org/10.1371/journal.pbio.2004473, PubMed: 29529019
Kösem, A., Basirat, A., Azizi, L., & van Wassenhove, V. (2016). High-
frequency neural activity predicts word parsing in ambiguous
speech streams. Journal of Neurophysiology, 116(6), 2497–2512.
https://doi.org/10.1152/jn.00074.2016, PubMed: 27605528
Kotz, S. A., & Schmidt-Kassow, M. (2015). Basal ganglia contribu-
tion to rule expectancy and temporal predictability in speech.
Cortex, 68, 48–60. https://doi.org/10.1016/j.cortex.2015.02
.021, PubMed: 25863903
Kuntay, A., Lowe, J., Orgun, O., Sprouse, R., & Rhodes, R. (2009).
Turkish Electronic Living Lexicon (TELL) (Version 2.0) [Database].
https://linguistics.berkeley.edu/TELL
Lakatos, P., Gross, J., & Thut, G. (2019). A new unifying account of
the roles of neuronal entrainment. Current Biology, 29(18),
R890–R905. https://doi.org/10.1016/j.cub.2019.07.075,
PubMed: 31550478
Lakatos, P., Shah, A. S., Knuth, K. H., Ulbert, I., Karmos, G., &
Schroeder, C. E. (2005). An oscillatory hierarchy controlling neu-
ronal excitability and stimulus processing in the auditory cortex.
Journal of Neurophysiology, 94(3), 1904–1911. https://doi.org/10
.1152/jn.00263.2005, PubMed: 15901760
Lewis, G., Solomyak, O., & Marantz, A. (2011). The neural basis of
obligatory decomposition of suffixed words. Brain and Language,
118(3), 118–127. https://doi.org/10.1016/j.bandl.2011.04.004,
PubMed: 21620455
Lu, L., Sheng, J., Liu, Z., & Gao, J.-H. (2021). Neural representations
of imagined speech revealed by frequency-tagged magnetoen-
cephalography responses. NeuroImage, 229, 117724. https://
doi.org/10.1016/j.neuroimage.2021.117724, PubMed:
33421593
Lubinus, C., Orpella, J., Keitel, A., Gudi-Mindermann, H., Engel,
A. K., Roeder, B., & Rimmele, J. M. (2021). Data-driven classifi-
cation of spectral profiles reveals brain region-specific plasticity
in blindness. Cerebral Cortex, 31(5), 2505–2522. https://doi.org
/10.1093/cercor/bhaa370, PubMed: 33338212
Luo, C., & Ding, N. (2020). Cortical encoding of acoustic and
linguistic rhythms in spoken narratives. ELife, 9, e60433.
https://doi.org/10.7554/eLife.60433, PubMed: 33345775
Luo, H., & Poeppel, D. (2007). Phase patterns of neuronal
responses reliably discriminate speech in human auditory cortex.
Neuron, 54(6), 1001–1010. https://doi.org/10.1016/j.neuron
.2007.06.004, PubMed: 17582338
Mai, G., Minett, J. W., & Wang, W. S.-Y. (2016). Delta, theta, beta,
and gamma brain oscillations index levels of auditory sentence
processing. NeuroImage, 133, 516–528. https://doi.org/10.1016/j
.neuroimage.2016.02.064, PubMed: 26931813
Makeig, S., Bell, A. J., Jung, T.-P., & Sejnowski, T. J. (1996). Inde-
pendent component analysis of electroencephalographic data. In
D. Touretzky, M. C. Mozer, & M. Hasselmo (Eds.), Advances in
Neural Information Processing Systems 8 (pp. 145–151). NeurIPS.
Makov, S., Sharon, O., Ding, N., Ben-Shachar, M., Nir, Y., & Zion
Golumbic, E. (2017). Sleep disrupts high-level speech parsing
despite significant basic auditory processing. Journal of Neuro-
science, 37(32), 7772–7781. https://doi.org/10.1523/JNEUROSCI
.0168-17.2017, PubMed: 28626013
Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing
of EEG- and MEG-data. Journal of Neuroscience Methods,
164(1), 177–190. https://doi.org/10.1016/j.jneumeth.2007.03
.024, PubMed: 17517438
Marslen-Wilson, W., & Tyler, L. K. (1980). The temporal structure of
spoken language understanding. Cognition, 8(1), 1–71. https://
doi.org/10.1016/0010-0277(80)90015-3, PubMed: 7363578
Marslen-Wilson, W., & Welsh, A. (1978). Processing interactions
and lexical access during word recognition in continuous
speech. Cognitive Psychology, 10(1), 29–63. https://doi.org/10
.1016/0010-0285(78)90018-X
Martin, A. E., & Doumas, L. A. A. (2017). A mechanism for the cor-
tical computation of hierarchical linguistic structure. PLOS Biol-
ogy, 15(3), Article e2000663. https://doi.org/10.1371/journal
.pbio.2000663, PubMed: 28253256
McQueen, J. M. (1998). Segmentation of continuous speech using
phonotactics. Journal of Memory and Language, 39(1), 21–46.
https://doi.org/10.1006/jmla.1998.2568
Mehler, J., Dommergues, J. Y., Frauenfelder, U., & Segui, J. (1981).
The syllable’s role in speech segmentation. Journal of Verbal
Learning and Verbal Behavior, 20(3), 298–305. https://doi.org
/10.1016/S0022-5371(81)90450-3
Mesgarani, N., Cheung, C., Johnson, K., & Chang, E. F. (2014).
Phonetic feature encoding in human superior temporal gyrus.
Science, 343(6174), 1006–1010. https://doi.org/10.1126
/science.1245994, PubMed: 24482117
Meyer, L. (2017). The neural oscillations of speech processing and
language comprehension: State of the art and emerging mecha-
nisms. European Journal of Neuroscience, 48(7), 2609–2621.
https://doi.org/10.1111/ejn.13748, PubMed: 29055058
Meyer, L., Sun, Y., & Martin, A. E. (2020). “Entraining” to speech,
generating language? Language, Cognition and Neuroscience,
35(9), 1138–1148. https://doi.org/10.1080/23273798.2020
.1827155
Moineau, S., Dronkers, N. F., & Bates, E. (2005). Exploring the pro-
cessing continuum of single-word comprehension in aphasia.
Journal of Speech, Language, and Hearing Research, 48(4),
884–896. https://doi.org/10.1044/1092-4388(2005/061),
PubMed: 16378480
Molinaro, N., & Lizarazu, M. (2018). Delta (but not theta)-band cor-
tical entrainment involves speech-specific processing. European
Journal of Neuroscience, 48(7), 2642–2650. https://doi.org/10
.1111/ejn.13811, PubMed: 29283465
Möttönen, R., & Watkins, K. E. (2009). Motor representations of
articulators contribute to categorical perception of speech
sounds. Journal of Neuroscience, 29(31), 9819–9825. https://
doi.org/10.1523/JNEUROSCI.6018-08.2009, PubMed:
19657034
Niesen, M., Vander Ghinst, M., Bourguignon, M., Wens, V., Bertels,
J., Goldman, S., Choufani, G., Hassid, S., & De Tiège, X. (2020).
Tracking the effects of top–down attention on word discrimina-
tion using frequency-tagged neuromagnetic responses. Journal of
Cognitive Neuroscience, 32(5), 877–888. https://doi.org/10.1162
/jocn_a_01522, PubMed: 31933439
Nolte, G. (2003). The magnetic lead field theorem in the quasi-
static approximation and its use for magnetoencephalography
forward calculation in realistic volume conductors. Physics in
Medicine and Biology, 48(22), 3637–3652. https://doi.org/10
.1088/0031-9155/48/22/002, PubMed: 14680264
Okada, K., & Hickok, G. (2006). Identification of lexical-
phonological networks in the superior temporal sulcus using
functional magnetic resonance imaging. Neuroreport, 17(12),
1293–1296. https://doi.org/10.1097/01.wnr.0000233091.82536
.b2, PubMed: 16951572
Oostenveld, R., Fries, P., Maris, E., & Schoffelen, J.-M. (2011). Field-
Trip: Open source software for advanced analysis of MEG, EEG,
and invasive electrophysiological data. Computational Intelli-
gence and Neuroscience, 2011, Article 156869. https://doi.org
/10.1155/2011/156869, PubMed: 21253357
Ota, M., & Skarabela, B. (2016). Reduplicated words are easier to
learn. Language Learning and Development, 12(4), 380–397.
https://doi.org/10.1080/15475441.2016.1165100
Ota, M., & Skarabela, B. (2018). Reduplication facilitates early
word segmentation. Journal of Child Language, 45(1), 204–218.
https://doi.org/10.1017/S0305000916000660, PubMed:
28162111
Panzeri, S., Brunel, N., Logothetis, N. K., & Kayser, C. (2010).
Sensory neural codes using multiplexed temporal scales. Trends
in Neurosciences, 33(3), 111–120. https://doi.org/10.1016/j.tins
.2009.12.001, PubMed: 20045201
Park, H., Ince, R. A. A., Schyns, P. G., Thut, G., & Brutto, J. (2015).
Frontal top-down signals increase coupling of auditory
low-frequency oscillations to continuous speech in human
listeners. Aktuelle Biologie, 25(12), 1649–1653. https://doi.org/10
.1016/j.cub.2015.04.049, PubMed: 26028433
Park, H., Thut, G., & Brutto, J. (2018). Predictive entrainment of
natural speech through two fronto-motor top-down channels.
Language, Cognition and Neuroscience, 35(6), 739–751. https://
doi.org/10.1080/23273798.2018.1506589, PubMed: 32939354
Peelle, J. E. (2012). The hemispheric lateralization of speech pro-
cessing depends on what “speech” is: A hierarchical perspective.
Frontiers in Human Neuroscience, 6, 309. https://doi.org/10
.3389/fnhum.2012.00309, PubMed: 23162455
Peelle, J. E., & Davis, M. H. (2012). Neural oscillations carry speech
rhythm through to comprehension. Frontiers in Psychology, 3,
320. https://doi.org/10.3389/fpsyg.2012.00320, PubMed:
22973251
Peelle, J. E., Gross, J., & Davis, M. H. (2013). Phase-locked
responses to speech in human auditory cortex are enhanced
during comprehension. Cerebral Cortex, 23(6), 1378–1387.
https://doi.org/10.1093/cercor/bhs118, PubMed: 22610394
Peña, M., & Melloni, L. (2012). Brain oscillations during spoken
sentence processing. Journal of Cognitive Neuroscience, 24(5),
1149–1164. https://doi.org/10.1162/jocn_a_00144, PubMed:
21981666
Perrin, F., Pernier, J., Bertrand, O., & Echallier, J. F. (1989). Spheri-
cal splines for scalp potential and current density mapping.
Electroencephalography and Clinical Neurophysiology, 72(2),
184–187. https://doi.org/10.1016/0013-4694(89)90180-6,
PubMed: 2464490
Pinto, D., Prior, A., & Zion Golumbic, E. (2022). Assessing the sen-
sitivity of EEG-based frequency-tagging as a metric for statistical
learning. Neurobiology of Language, 3(2), 214–234. https://doi
.org/10.1162/nol_a_00061
R Core Team. (2022). R: A language and environment for statistical
computing. R Foundation for Statistical Computing. https://www
.R-project.org/
Rice, G. E., Caswell, H., Moore, P., Lambon Ralph, M. A., &
Hoffman, P. (2018). Revealing the dynamic modulations that
underpin a resilient neural network for semantic cognition: An
fMRI investigation in patients with anterior temporal lobe resec-
tion. Cerebral Cortex, 28(8), 3004–3016. https://doi.org/10
.1093/cercor/bhy116, PubMed: 29878076
Rimmele, J. M., Gross, J., Molholm, S., & Keitel, A. (2018). Editorial:
Brain oscillations in human communication. Frontiers in Human
Neuroscience, 12, 39. https://doi.org/10.3389/fnhum.2018
.00039, PubMed: 29467639
Rimmele, J. M., Morillon, B., Poeppel, D., & Arnal, L. H. (2018).
Proactive sensing of periodic and aperiodic auditory patterns.
Trends in Cognitive Sciences, 22(10), 870–882. https://doi.org
/10.1016/j.tics.2018.08.003, PubMed: 30266147
Rimmele, J. M., Poeppel, D., & Ghitza, O. (2021). Acoustically
driven cortical delta oscillations underpin prosodic chunking.
ENeuro, 8(4), ENEURO.0562-20.2021. https://doi.org/10.1523
/ENEURO.0562-20.2021, PubMed: 34083380
Rimmele, J. M., Zion Golumbic, E., Schröger, E., & Poeppel, D.
(2015). The effects of selective attention and speech acoustics
on neural speech-tracking in a multi-talker scene. Cortex, 68,
144–154. https://doi.org/10.1016/j.cortex.2014.12.014,
PubMed: 25650107
Rodd, J. M., Vitello, S., Woollams, A. M., & Adank, P. (2015). Loca-
lising semantic and syntactic processing in spoken and written
language comprehension: An Activation Likelihood Estimation
meta-analysis. Brain and Language, 141, 89–102. https://doi.org
/10.1016/j.bandl.2014.11.012, PubMed: 25576690
Rogalsky, C., Basilakos, A., Rorden, C., Pillay, S., LaCroix, A. N.,
Keator, L., Mickelsen, S., Anderson, S. W., Love, T., Fridriksson,
J., Binder, J., & Hickok, G. (2022). The neuroanatomy of speech
processing: A large-scale lesion study. Journal of Cognitive Neu-
roscience, 34(8), 1355–1375. https://doi.org/10.1162/jocn_a
_01876, PubMed: 35640102
Rosenberg, J. R., Amjad, A. M., Breeze, P., Brillinger, D. R., &
Halliday, D. M. (1989). The Fourier approach to the identifica-
tion of functional coupling between neuronal spike trains.
Progress in Biophysics and Molecular Biology, 53(1), 1–31.
https://doi.org/10.1016/0079-6107(89)90004-7, PubMed:
2682781
Saffran, J. R., Newport, E. L., & Aslin, R. N. (1996). Word seg-
mentation: The role of distributional cues. Journal of Memory and
Language, 35(4), 606–621. https://doi.org/10.1006/jmla.1996
.0032
Scharinger, M., Idsardi, W. J., & Poe, S. (2011). A comprehensive
three-dimensional cortical map of vowel space. Journal of
Cognitive Neuroscience, 23(12), 3972–3982. https://doi.org/10
.1162/jocn_a_00056, PubMed: 21568638
Scontras, G., Badecker, W., Shank, L., Lim, E., & Fedorenko, E.
(2015). Syntactic complexity effects in sentence production.
Cognitive Science, 39(3), 559–583. https://doi.org/10.1111/cogs
.12168, PubMed: 25256303
Siemens Medical Solutions. (2022). 3T Magnetom Trio [Apparatus].
https://www.siemens-healthineers.com
Smith, Z. M., Delgutte, B., & Oxenham, A. J. (2002). Chimaeric
sounds reveal dichotomies in auditory perception. Nature, 416,
87–90. https://doi.org/10.1038/416087a, PubMed: 11882898
Sohoglu, E., & Davis, M. H. (2016). Perceptual learning of
degraded speech by minimizing prediction error. Proceedings
of the National Academy of Sciences, 113(12), E1747–E1756.
https://doi.org/10.1073/pnas.1523266113, PubMed: 26957596
Stolk, A., Todorovic, A., Schoffelen, J.-M., & Oostenveld, R. (2013).
Online and offline tools for head movement compensation in
MEG. NeuroImage, 68, 39–48. https://doi.org/10.1016/j
.neuroimage.2012.11.047, PubMed: 23246857
Ten Oever, S., & Martin, A. E. (2021). An oscillating computational
model can track pseudo-rhythmic speech by using linguistic pre-
dictions. ELife, 10, e68066. https://doi.org/10
.7554/eLife.68066, PubMed: 34338196
Teng, X., Tian, X., Rowland, J., & Poeppel, D. (2017). Concurrent
temporal channels for auditory processing: Oscillatory neural
entrainment reveals segregation of function at different scales.
PLOS Biology, 15(11), e2000812. https://doi.org/10.1371
/journal.pbio.2000812, PubMed: 29095816
Thompson-Schill, S. L., D'Esposito, M., Aguirre, G. K., & Farah,
M. J. (1997). Role of left inferior prefrontal cortex in retrieval of
semantic knowledge: A reevaluation. Proceedings of the National
Academy of Sciences, 94(26), 14792–14797. https://doi.org/10
.1073/pnas.94.26.14792, PubMed: 9405692
Ulrich Keller Medizin-Technik. (n.d.). E-A-RTONE Gold 3A insert
earphones [Apparatus]. https://keller-meditec.de
Utman, J. A., Blumstein, S. E., & Sullivan, K. (2001). Mapping from
sound to meaning: Reduced lexical activation in Broca’s
aphasics. Brain and Language, 79(3), 444–472. https://doi.org/10
.1006/brln.2001.2500, PubMed: 11781053
van der Lugt, A. H. (2001). The use of sequential probabilities in the
segmentation of speech. Perception & Psychophysics, 63(5),
811–823. https://doi.org/10.3758/BF03194440, PubMed: 11521849
Xu, C., Li, H., Gao, J., Li, L., Er, F., Yu, J., Ling, Y., Gao, J., Li, J.,
Melloni, L., Luo, B., & Ding, N. (2022). Statistical learning in
patients in the minimally conscious state. Cerebral Cortex.
Advance online publication. https://doi.org/10.1093/cercor
/bhac222, PubMed: 35670595
Zion Golumbic, E., Cogan, G. B., Schroeder, C. E., & Poeppel, D.
(2013). Visual input enhances selective speech envelope tracking
in auditory cortex at a “cocktail party.” Journal of Neuroscience,
33(4), 1417–1426. https://doi.org/10.1523/JNEUROSCI.3675-12
.2013, PubMed: 23345218
Zion Golumbic, E., Poeppel, D., & Schroeder, C. E. (2012). Tempo-
ral context in speech processing and attentional stream selection:
A behavioral and neural perspective. Brain and Language,
122(3), 151–161. https://doi.org/10.1016/j.bandl.2011.12.010,
PubMed: 22285024