RESEARCH ARTICLE

Hierarchy, Not Lexical Regularity, Modulates
Low-Frequency Neural Synchrony During
Language Comprehension

an open access journal

Chia-Wen Lo1,2, Tzu-Yun Tung2, Alan Hezao Ke2,3, and Jonathan R. Brennan2

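1Research Group Language Cycles, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
2Department of Linguistics, University of Michigan, Ann Arbor, MI, USA
3Department of Linguistics, Languages and Cultures, Michigan State University, East Lansing, MI, USA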

Keywords: neural oscillations, delta rhythms, neural synchronization, language comprehension,
syntax, semantics

ABSTRACT

Neural responses appear to synchronize with sentence structure. However, researchers have
debated whether this response in the delta band (0.5–3 Hz) really reflects hierarchical
information or simply lexical regularities. Computational simulations in which sentences are
represented simply as sequences of high-dimensional numeric vectors that encode lexical
information seem to give rise to power spectra similar to those observed for sentence
synchronization, suggesting that sentence-level cortical tracking findings may reflect
sequential lexical or part-of-speech information, and not necessarily hierarchical syntactic
information. Using electroencephalography (EEG) data and the frequency-tagging paradigm,
we develop a novel experimental condition to tease apart the predictions of the lexical and the
hierarchical accounts of the attested low-frequency synchronization. Under a lexical model,
synchronization should be observed even when words are reversed within their phrases
(e.g., “sheep white grass eat” instead of “white sheep eat grass”), because the same lexical
items are preserved at the same regular intervals. Critically, such stimuli are not syntactically
well-formed; thus a hierarchical model does not predict synchronization of phrase- and
sentence-level structure in the reversed phrase condition. Computational simulations confirm
these diverging predictions. EEG data from N = 31 native speakers of Mandarin show
robust delta synchronization to syntactically well-formed isochronous speech. Importantly,
no such pattern is observed for reversed phrases, consistent with the hierarchical, but not the
lexical, account.

INTRODUCTION

Human language is compositional; language users create unbounded and novel phrases and
sentences from a finite number of words. This compositional ability is highly structured; words
must be combined according to syntactic rules to yield well-formed and interpretable phrases
and sentences. Previous studies have narrowed down the neural timing and localization of
compositional processing (see Hagoort & Indefrey, 2014; Matchin & Hickok, 2020; Pylkkänen
& Brennan, 2019 for reviews). For example, Bemis and Pylkkänen (2011) examined how
humans process two-word combinatorial phrases (e.g., “red boat”) vs. non-combinatorial
phrases (e.g., “xkq boat”) vs. word lists (e.g., “cup boat”) in magnetoencephalography (MEG)

Citation: Lo, C.-W., Tung, T.-Y., Ke, A. H., & Brennan, J. R. (2022). Hierarchy, not lexical
regularity, modulates low-frequency neural synchrony during language comprehension.
Neurobiology of Language, 3(4), 538–555. https://doi.org/10.1162/nol_a_00077

DOI: https://doi.org/10.1162/nol_a_00077

Received: 2 March 2022
Accepted: 20 June 2022

Competing Interests: The authors have declared that no competing interests exist.

Corresponding Author: Chia-Wen Lo, lo@cbs.mpg.de

Handling Editor: Peter Hagoort

Copyright: © 2022 Massachusetts Institute of Technology. Published under a Creative Commons
Attribution 4.0 International (CC BY 4.0) license.

The MIT Press


Delta rhythms for language comprehension

recordings and found that for combinatorial phrases increased activity was elicited at 200–
250 ms after the presentation of the second word at the left anterior temporal lobe, unlike
for non-combinatorial phrases and word lists. Neufeld et al. (2016) found a greater negativity
in the similar time window (184–256 ms) for combinatorial phrases compared to the non-word
condition by using the same experimental paradigm in electroencephalography (EEG)
recordings. The emerging temporal picture complements functional magnetic resonance
imaging (fMRI) studies that narrow down the localization of combinatoric processing. For
example, studies have shown greater activation for sentences compared to word lists in brain
regions such as inferior frontal gyrus (Pallier et al., 2011; Schell et al., 2017; Zaccarella et al.,
2017), posterior superior temporal sulcus (Zaccarella et al., 2017), anterior temporal lobe
(Humphries et al., 2006; Matchin et al., 2017), angular gyrus (Humphries et al., 2006; Matchin
et al., 2017), and temporal parietal junction (Matchin et al., 2017).

Although many studies have provided neural evidence for when and where compositional
processing takes place, how it is actually implemented in neural circuits remains largely
underspecified. A growing body of work seeks to develop formal models to account for
how the computation of hierarchical and compositional processes integrates and modulates neural
activity. For example, Martin (2020) argues that linguistic representations may be realized by
different patterns of synchronized neural activity while levels of representations are connected
by the modulation of neural gain functions. Specifically, a speech envelope segment is recognized
as a syllable or phoneme via gain modulation between neural populations that serves to
inhibit the process of edge detection of the speech envelope and pass information forward to
next stages of lexical and morphosyntactic operations. Repeating this same template at mul-
tiple concurrent processes yields a model for a neural architecture that is tuned to linguistic
composition at multiple timescales, from phonemes up to sentences. Research in this domain
requires examining rhythmic or synchronized neural activity across these different timescales.

Synchronized neural activity, as in the theory developed by Martin (2020), offers one possible
response to the “mapping problem” articulated by Poeppel and Embick (2005) and
Poeppel (2012). Crucially, the core components of linguistic theories, such as the syntactic
operation of Merge, aim to capture representational generalizations, not algorithmic processes;
they cannot be directly mapped to neuronal activation. But it may be feasible to decompose
linguistic operations and map them to cross-frequency patterns, which denote the association
across multiple frequency bands of neural oscillations (cf. Benítez-Burraco & Murphy, 2019).
This leading idea builds on a growing trend that takes synchronized patterns of neuronal
circuits as a computational primitive (e.g., Buzsáki & Draguhn, 2004). Consequently, examining
patterns of neural synchrony offers a promising avenue to test how neural circuits might
work to implement concurrent linguistic processes as continuous speech unfolds.

Consistent with such a model, rhythmic activity at different frequency bands has been
linked to distinct stages of language comprehension and speech processing (Arnal et al.,
2016; Meyer, 2018). Neural activity in the low gamma band (30–50 Hz) appears to be
involved in connecting acoustic fine-structure to discrete phonemic information (Di Liberto
et al., 2015; Giraud & Poeppel, 2012). Slower synchronized activity spanning the delta and
theta bands (1–4 and 4–8 Hz, respectively) has been linked with the analysis of higher-level
syllabic information (Ghitza, 2011; Ghitza & Greenberg, 2009). Rhythmic activity in lower bands
has more recently been associated with the processing of more abstract high-level linguistic
information. Multiple studies conducting time-frequency analysis have shown evidence that
neural activity in the delta band in particular is associated with the processing of syntactic
structure (e.g., Bonhage et al., 2017; Kaufeld et al., 2020; Meyer et al., 2016; Meyer & Gumbert,
2018). To give one example, Kaufeld et al. (2020) evaluated the mutual information between

Neural synchrony:
Brain activity that is synchronized to
endogenous events.

Neurobiology of Language


Frequency tagging:
Presenting stimuli rhythmically such
that different features occurring at
different rates can be used to elicit
distinct signatures of neural
entrainment or synchrony.

Neural entrainment:
Brain activity that is synchronized
to the presentation of exogenous
events.

neural activity in the delta band and the higher level syntactic content of sentence stimuli, com-
pared to stimuli composed of meaningless words or word lists. They found increased mutual
information between EEG signals in the delta band that is specific for sentential stimuli that
contain meaningful syntactic structure.

Complementary evidence comes from studies using isochronous speech. Ding et al. (2016)
used a frequency-tagging paradigm with sentence stimuli composed from four one-syllable
words in Mandarin Chinese. Each monosyllabic word spanned 250 ms, so each sentence
was exactly 1 s long. With this design, syllables and words were presented at 4 Hz, two-word
phrases at 2 Hz, and sentences repeated at 1 Hz. Crucially, the stimuli were constructed
by concatenating individual syllables together, removing prosodic contours at the suprasegmental
level (but cf. Glushko et al., 2020). When native speakers of Mandarin listened to
these stimuli during MEG recording, neuromagnetic spectral peaks at 1, 2, and 4 Hz were
observed. Importantly, for English speakers without Mandarin linguistic knowledge, spectral
peaks were observed only at the 4 Hz syllable rate, not at the phrasal or sentential rates
(2 or 1 Hz).
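The tagged frequencies follow directly from the 250 ms syllable duration. As a small worked example (the constant and function names below are illustrative and do not come from Ding et al., 2016), the three rates can be derived as follows:

```python
# Presentation rates implied by an isochronous four-syllable design.
SYLLABLE_MS = 250            # duration of one monosyllabic word (from the text)
SYLLABLES_PER_PHRASE = 2     # two-word phrases
SYLLABLES_PER_SENTENCE = 4   # four-word sentences


def rate_hz(unit_syllables, syllable_ms=SYLLABLE_MS):
    """Repetition rate (Hz) of a unit that spans `unit_syllables` syllables."""
    return 1000.0 / (unit_syllables * syllable_ms)


rates = {
    "syllable": rate_hz(1),
    "phrase": rate_hz(SYLLABLES_PER_PHRASE),
    "sentence": rate_hz(SYLLABLES_PER_SENTENCE),
}
print(rates)
```

Only the syllable rate is carried by the acoustics; the phrase and sentence rates exist only if the listener groups syllables into larger units.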

Ding et al. (2017) replicated these findings using EEG and further demonstrated that these
peaks were observed in so-called evoked power (phase-synchronous power changes) and also
intertrial phase coherence (consistency of phase-angles across trials), but not in induced power
(non-phase-aligned changes in power). This result was also replicated cross-linguistically:
English stimuli presented in the same paradigm to English-speaking listeners also elicited
entrainment patterns at sentence and phrasal rates.

However, syntactic structure may not be the only explanation for the patterns of delta band
entrainment described above. The stimuli used by Ding et al. (2016) were designed such that
nouns occurred two times per second (2 Hz) while verbs occurred at 1 Hz. Consequently, the
observed signals could reflect neural entrainment to lexical or part-of-speech properties of
these words, rather than to hierarchical structure-building (Frank & Yang, 2018).

Against this background, two computational models have been proposed to interpret the
functional significance of these peaks; these are summarized in Table 1. Martin and Doumas
(2017) proposed a structural account in terms of a time-based binding mechanism. Under this
mechanism, lexical-level representations are bound into phrases and, ultimately, sentences by
modulations of (a)synchrony between firing units at each respective level. This approach
captures the compositional relationship between levels of representation without discarding
information from lower levels. Take the adjective phrase “dry fur,” for example. The
model encodes semantic features for each word at the lowest layer; word information such
as [dry adj] and [fur noun] is encoded in the second layer. Artificial neurons in each layer fire
asynchronously. A third layer encodes phrase information and will be activated after [dry adj]
and [fur noun] encodings fire.

Simulations from this model reveal that grammatical sequences (e.g., “dry fur rubs skin”)
elicited spectral peaks at 1 Hz, 2 Hz, and 4 Hz, consistent with the experimental results
from Ding et al. (2016). Such peaks were also observed in a jabberwocky condition, where
nonsense words were combined to retain syntactic relationships but minimize semantic
content. This follows as the distinct spectral peaks reflect patterns of synchrony and asynchrony
between layers in the model that directly encode structural details. As with the
neural signals, word sequences lacking syntactic structure only elicited 4 Hz oscillations in
the model.

In contrast to the hierarchical oscillations of Martin and Doumas (2017), Frank and Yang
(2018) developed a computational account of these low-frequency spectral peaks by


Table 1. Summary of two accounts and predictions for reversed phrases.

Account                 Major study                       Critical simulation results        Predictions for reversed phrases
Structural account      Martin & Doumas: time-based       Sentence: 1, 2, 4 Hz               4 Hz
                        encoding representations          Phrase: 2, 4 Hz
                                                          Word list: 4 Hz
                                                          Jabberwocky: 1, 2, 4 Hz
Lexical representation  Frank & Yang: lexical semantics   Grammatical sentence: 1, 2, 4 Hz   1, 2, 4 Hz
                        and POS                           Phrase: 2, 4 Hz
                                                          Word list: 4 Hz

Note. Martin and Doumas (2017), Frank and Yang (2018).

Word embedding:
A representation of a word as a high-
dimensional numerical vector.

appealing just to sequential patterns of lexical information. They argue that the observed neural
synchrony may reflect patterns of words and word categories that are repeated across the
stimuli. They tested this hypothesis using a series of simulations in which the stimuli from
Ding et al. (2016) were recast as sequences of high-dimensional numerical vectors based on
word-to-word co-occurrence in a large corpus of text (word embedding; Mikolov et al.,
2013). Such vectors capture semantic information through the reasoning that words that
are judged to have similar meanings will have more similar vectors; they also encode linguistic
regularities like the grammatical category of each word, such that two nouns tend to
have more similar vectors than a noun and a verb. No further syntactic information for combining
phrases and sentences is included in their model. The simulation for both English and Chinese
grammatical sentences elicited increased power at 1 Hz, 2 Hz, and 4 Hz. The simulations
using Chinese VP stimuli showed increased power at 2 Hz and 4 Hz, but not 1 Hz. Randomly
shuffled Chinese monosyllabic words showed increased power at 4 Hz only. These simulation
results revealed power spectra similar to those reported by Ding et al. (2016). Frank and Yang
(2018) suggest that those neural entrainment patterns may follow from the tracking of lexical or
grammatical category sequence information (1 verb/s; 2 nouns/s, etc.).
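To make the similarity claim concrete, the sketch below computes cosine similarity between word vectors. The four-dimensional vectors are hypothetical, hand-made values chosen only for illustration; they are not taken from Mikolov et al. (2013) or from the vectors used by Frank and Yang (2018).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings: the two nouns point in similar directions,
# the verb points elsewhere.
toy = {
    "sheep": (0.9, 0.1, 0.8, 0.0),
    "grass": (0.8, 0.2, 0.6, 0.1),
    "eat": (0.1, 0.9, 0.2, 0.7),
}

noun_noun = cosine(toy["sheep"], toy["grass"])
noun_verb = cosine(toy["sheep"], toy["eat"])
print(round(noun_noun, 3), round(noun_verb, 3))
```

With vectors of this kind, a regular alternation of word categories becomes a regular alternation in vector space, which is all the lexical account needs to produce low-frequency spectral peaks.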

To summarize, whether neural activity found in the delta range reflects hierarchical information
or merely lexical properties remains elusive. Computational models based on either
hierarchical structural information or lexical-sequence information have been proposed to
account for the neural data from Ding et al. (2016) (see Table 1).

Three previous studies have attempted to tease these two theories apart. Burroughs et al.
(2021) recorded EEG while native English speakers listened to isochronous speech that
included grammatical adjective-noun phrases, ungrammatical adjective-verb phrases, grammatical
mixed phrases, and random syllables. A phrase-level peak was found in the grammatical
adjective-noun phrases and mixed phrases, but not in the adjective-verb phrases and
random syllables. The results are inconsistent with the lexical representation model, which
shows a phrasal-level peak in the adjective-verb condition. A similar conclusion is supported by
another recent EEG study using the frequency-tagging approach during a word-monitoring task
and a sequence chunking task. Lu et al. (2022) report a 1 Hz sentence-level peak that was
weaker in the word list than the sentence condition; they interpret this in support of the hierar-
chical account.


In contrast, another study appears to support the lexical-sequence account. Kalenkovich
et al. (2022) recorded MEG data while Russian speakers listened to isochronous speech that
came from one of two different syntactic structures: genitive or dative. The difference was cued
by just a single affixal phoneme; all other words and affixes remained the same. This small
surface difference affects the underlying phrasal organization of these constructions, and
under a direct interpretation of the hierarchical account, these phrasal structures should lead
to different patterns of synchrony in isochronous speech. Neural peaks related to sentence,
two-word, word, and syllable rates were observed in all conditions, but none of these were
modulated by syntactic construction. This is taken to be consistent with the simulated results
from the lexical-sequence model.

The above recent studies further show that the functional interpretation of delta rhythms is
still under debate. The present study uses reversed phrases that preserve semantic information
and the regular pattern of parts-of-speech at the lexical level, yet remove any grammatical
structure. A lexical-sequence model predicts that isochronous presentation of these reversed
stimuli will elicit 1 Hz and 2 Hz peaks because they preserve regular part-of-speech
sequences. That is, each sequence still has one adjective, two nouns, and one verb. Computational
simulations in which sentences are represented simply as sequences of high-dimensional
vectors verify this prediction. In contrast, the structural account predicts no 1 Hz or 2 Hz peaks
for reversed phrases, as the original phrase structures are lost. To preview, our EEG data are in
line with the structural account such that reversed phrases elicit an oscillatory peak at 4 Hz but
not at 1 Hz or 2 Hz; this is inconsistent with the simulated results from the lexical models for
these stimuli.

MATERIALS AND METHODS

This experiment tests whether neural synchronization in the delta band reflects lexical
sequence or hierarchical information. If such neural oscillations are modulated by lexical
information, specifically, a regular sequence of parts-of-speech (e.g., one verb per second,
two nouns per second, etc.), we would expect such synchrony to emerge even when the order
of the word sequence is reversed, thereby preserving sequence regularity but disrupting phrase
structure (Frank & Yang, 2018). If neural synchrony does depend on hierarchical structure,
however, then we would not expect it to emerge for the reversed version of grammatical
sentences.

Participants

Thirty-seven native speakers (22 females, 15 males) of Mandarin Chinese between the ages of
19 and 52 (mean = 27.7) participated in the experiment. They were all right-handed and had
normal hearing. They self-reported that they did not have any neurological disorders. They
gave informed consent and were reimbursed for their time ($15 per hour in U.S. dollars). Data
from six participants were excluded from the analysis due to poor data quality. Thus, data from
31 participants (18 females, 13 males) were included in the final analysis.

Materials

Experimental items were four-syllable Chinese sequences drawn from 50 sets of four experimental
conditions, which are illustrated in Table 2. For Condition 1, four-syllable sentences
(denoted ABCD) were adapted from Ding et al. (2016), with some modifications. The first two
syllables constituted a noun phrase (NP) made up of either Adjective + Noun (e.g., lao + niu
‘old + cow’) or Noun + Noun (e.g., shu + mu ‘tree + wood’). The last two syllables constituted


Table 2. Stimuli design.

Condition 1: Four-syllable sentence (ABCD)
綿羊吃草
mian yang chi cao
Cotton sheep eat grass
‘Sheep eat grass.’

Condition 2: Semantically-mismatched sequence
軍孩奔草
jun hai ben cao
Soldier child run grass

Condition 3: Two-syllable phrase (ABAB)
老牛青草
lao niu qing cao
Old cattle green grass

Condition 4: Reversed phrase (BADC)
羊棉草吃
yang mian cao chi
Sheep cotton grass eat

a verb phrase (VP) (e.g., chi + cao ‘eat + grass’). Six items from Ding et al. (2016)’s study were
replaced or modified for the following two reasons: (1) Items that do not sound natural for
native speakers from either Taiwan or mainland China were replaced with novel sentences;
(2) Stimuli using bound morphemes such as heshang ‘monk’ and hudie ‘butterfly’ cannot be
broken down further into Adjective + Noun or Noun + Noun; these were replaced with sen-
tences with free morphemes.

The second condition was composed of Semantically-mismatched sequences. Following
Ding et al. (2016), we randomly replaced each of the four words in the four-syllable sentence
condition independently with a new word from another sentence while preserving word position.
These replacements were reviewed to ensure that they do not sound meaningful or familiar
to native speakers of Mandarin. (This is important as there are many syllables in Mandarin
that are completely different in meaning but share the same sounds.)

The third condition was composed of Two-syllable phrases of the pattern ABAB. Items in
this condition were constructed by extracting the first two words of the four-syllable sentences
and pairing them together into NP + NP sequences.

The fourth condition was made up of Reversed phrases following the pattern BADC. Here,
we reversed the order of the words within the first two-word phrase and within the last
two-word phrase of each four-syllable sentence. Crucially, this condition allows us to tease
apart lexical from hierarchical synchrony. Similar to four-syllable sentences, this condition
includes regular lexical sequences (i.e., nouns at 2 Hz and verbs at 1 Hz); however, reversed
ordering leads to ungrammatical sentences in Mandarin.
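The BADC manipulation can be illustrated with a short sketch. The function name and the part-of-speech labels are illustrative assumptions; the example item is the English gloss used in the abstract, not an actual stimulus.

```python
from collections import Counter


def reverse_within_phrases(items):
    """ABCD -> BADC: swap the two words inside each two-word phrase."""
    if len(items) != 4:
        raise ValueError("expected a four-word item")
    a, b, c, d = items
    return [b, a, d, c]


words = ["white", "sheep", "eat", "grass"]   # English gloss of an ABCD item
tags = ["ADJ", "NOUN", "VERB", "NOUN"]       # illustrative part-of-speech labels

reversed_words = reverse_within_phrases(words)
reversed_tags = reverse_within_phrases(tags)

print(" ".join(reversed_words))
# The number of words of each category per item is unchanged by the reversal.
print(Counter(tags) == Counter(reversed_tags))
```

The reversal changes only the order of words inside each phrase, so every category still recurs once per item at a fixed position, which is the regularity the lexical account relies on.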

All stimuli were recorded using artificial speech synthesis developed by iFLYTek
(https://www.xfyun.cn/services/online_tts). Each monosyllabic word was recorded separately to
avoid inducing a prosodic contour over the syllable sequences. Each word was compressed
to 240 ms, preserving pitch, using the Praat vocal toolkit (Corretge, 2020) in Praat (Boersma
& Weenink, 2022), and a 10 ms silence gap was added after each word. As each syllable has
a duration of 250 ms, each four-syllable item spans 1 second. Items were further grouped
into sequences of 10 that were all drawn from the same condition; each 10-second
sequence comprised one trial.
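The timing of a trial can be checked with a few lines of arithmetic. This is only a sketch; the variable names are illustrative, and the assumption that a trial starts at 0 ms with no leading silence is ours.

```python
WORD_MS = 240          # compressed word duration
GAP_MS = 10            # silence appended after each word
WORDS_PER_ITEM = 4
ITEMS_PER_TRIAL = 10

slot_ms = WORD_MS + GAP_MS                    # one syllable slot
item_ms = slot_ms * WORDS_PER_ITEM            # one four-syllable item
trial_s = item_ms * ITEMS_PER_TRIAL / 1000    # one trial, in seconds


def word_onsets_ms(n_items=ITEMS_PER_TRIAL):
    """Onset (ms) of every word in one trial, assuming the trial starts at 0 ms."""
    return [i * slot_ms for i in range(n_items * WORDS_PER_ITEM)]


onsets = word_onsets_ms()
print(slot_ms, item_ms, trial_s, len(onsets))
```

Because every slot has the same length, word onsets fall on a strict 250 ms grid and item onsets on a strict 1 s grid.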


Figure 1. Power spectra for the speech envelope of the stimuli from all four conditions. Only a
syllable-level peak at 4 Hz is observed in the speech stimuli.

The power spectrum of the speech stimuli is shown in Figure 1. This was computed using a
fast Fourier transform based on the broadband envelope of the stimulus, defined by the absolute
value of the Hilbert transformation of the stimuli waveforms, and then averaged over all
10-second trials for each condition. As expected, only a syllable-level peak at 4 Hz was
observed in the acoustic envelope.
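The envelope analysis can be sketched on a synthetic signal. Everything in the example is an assumption made for illustration: the toy stimulus (a 20 Hz carrier whose amplitude pulses every 250 ms), the 100 Hz sampling rate, and the use of a naive discrete Fourier transform in place of a fast Fourier transform so that the code needs only the standard library.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (stands in for an FFT in this small demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def envelope(x):
    """Broadband envelope: magnitude of the analytic (Hilbert-transformed) signal."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist for even n), double positive frequencies, zero negative ones.
    for k in range(1, n):
        if k < n / 2:
            spectrum[k] *= 2
        elif k > n / 2:
            spectrum[k] = 0
    # The inverse DFT of the one-sided spectrum is the analytic signal.
    return [abs(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                    for k in range(n)) / n)
            for t in range(n)]


def power_at(x, freq_hz, fs):
    """Power of x at one frequency (squared magnitude of the Fourier coefficient)."""
    return abs(sum(v * cmath.exp(-2j * math.pi * freq_hz * t / fs)
                   for t, v in enumerate(x))) ** 2


FS = 100         # sampling rate in Hz (assumed for the toy signal)
DURATION_S = 2   # toy trial length in seconds
toy = [0.5 * (1 - math.cos(2 * math.pi * 4 * t / FS)) * math.sin(2 * math.pi * 20 * t / FS)
       for t in range(FS * DURATION_S)]

env = envelope(toy)
env_power = {f: power_at(env, f, FS) for f in (1, 2, 4)}
print(env_power)
```

In this toy case the envelope carries power at the 4 Hz pulse rate and essentially none at 1 or 2 Hz, which mirrors the pattern reported for the actual stimuli.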

Trials were organized into eight blocks, each made up of 20 plausible and 20 implausible
trials. Plausible trials were those with grammatical and semantically meaningful phrases,
drawn either from Condition 1 (Four-syllable sentences) or Condition 3 (Two-syllable phrases).
Implausible trials were drawn from either Condition 2 (Semantically-mismatched sequences) or
Condition 4 (Reversed phrases). A given block was made of items from Condition 1 paired with
those from Condition 2, or items from Condition 3 paired with those from Condition 4. Trials from
each condition were intermixed and presented randomly in each block. Thus, 320 trials were
presented to each participant in the whole experiment.
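The block structure can be sketched as follows. The text does not state how many blocks used each pairing, so the alternation below (four blocks per pairing) is an assumption, as are the function name and the random seed.

```python
import random

N_BLOCKS = 8
PLAUSIBLE_PER_BLOCK = 20
IMPLAUSIBLE_PER_BLOCK = 20
PAIRINGS = [(1, 2), (3, 4)]   # (plausible condition, implausible condition)


def build_blocks(seed=0):
    """Return a list of blocks; each block is a shuffled list of condition labels."""
    rng = random.Random(seed)
    blocks = []
    for b in range(N_BLOCKS):
        # Assumption: the two pairings alternate, giving four blocks of each.
        plausible, implausible = PAIRINGS[b % 2]
        trials = ([plausible] * PLAUSIBLE_PER_BLOCK
                  + [implausible] * IMPLAUSIBLE_PER_BLOCK)
        rng.shuffle(trials)   # intermix the two conditions within the block
        blocks.append(trials)
    return blocks


blocks = build_blocks()
total_trials = sum(len(block) for block in blocks)
print(total_trials)
```

Under this assumption each condition contributes 80 trials, and the total matches the 320 trials reported above.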

Procedure

Participants were seated comfortably in front of a computer screen in a quiet room. Before the
main session, participants were fitted with an electrode cap. Electrodes were also affixed
above and below the left eye, and electrolyte gel was applied to reduce impedance to below
25 kΩ. The setup took approximately 30 minutes. Sound loudness was set for each participant
at +45 dB above their hearing threshold (determined using 300 ms 1 kHz tones). Subsequently,
120 1 kHz tones were presented and the auditory-evoked response was analyzed to ensure the
data quality was sufficient to continue with the experiment.

During the main session, participants were instructed to judge whether a trial included
plausible sentences/phrases or not by a button-press. After the button-press, the next trial
was played after a delay randomized between 800–1,400 ms (Ding et al., 2016). Stimuli were
presented with Psychopy2 (v1.84.2; Peirce, 2007, 2009). Participants were also instructed to
avoid frequent blinking and unnecessary body adjustments while the stimuli were presented.
Participants had the opportunity to take breaks between blocks. Participants had 4 practice
trials to become familiar with the procedure of the experiment. The order of blocks was
counterbalanced across participants. The main experiment took about 1.5 hours. After the main


session, participants washed their hair to remove the electrolyte gel and were debriefed about
the goals of the experiment.

EEG Recording and Data Analysis

EEG data were recorded at 500 Hz from 61 active electrodes (actiCHamp, Brain Products
GmbH) in a 0.01–200 Hz band with online reference to an electrode placed on the left mastoid.
Impedances were kept below 25 kΩ. FieldTrip software was used to analyze the data
(Oostenveld et al., 2011). Artifacts related to eye blinks were removed via independent component
analysis (Jung et al., 2000; Makeig et al., 1995), and remaining trials containing artifacts
were removed manually following visual inspection. Following Ding et al. (2017), the
first 1-second sentence from each 10-second trial was excluded to avoid potential EEG
responses to sound onset.

Data were filtered from 0.1–25 Hz and re-referenced offline to a common average. Synchrony
was assessed from 0.5 to 10 Hz at 0.111 Hz intervals; excluding the initial sentence
yields 9 seconds of data per trial and thus a frequency resolution of 1/9 = 0.111 Hz. While
Ding et al. (2016) assessed synchrony via total power recorded from MEG, the current study
follows the analysis from Ding et al. (2017), which separates total power into several components:
evoked power, induced power, and intertrial phase coherence.
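The frequency resolution determines which frequencies fall exactly on a Fourier bin. The sketch below uses exact fractions to check that the three frequencies of interest do; the helper name is illustrative.

```python
from fractions import Fraction

TRIAL_S = 10       # trial length in seconds
DROPPED_S = 1      # first sentence excluded from analysis
analysis_s = TRIAL_S - DROPPED_S
resolution = Fraction(1, analysis_s)   # 1/9 Hz between neighboring bins


def bin_index(freq_hz):
    """Index of the bin centered on freq_hz, or None if it falls between bins."""
    k = Fraction(freq_hz) / resolution
    return int(k) if k.denominator == 1 else None


bins = {f: bin_index(f) for f in (1, 2, 4)}
print(float(resolution), bins)
```

With 9 s of data, 1, 2, and 4 Hz land on bins 9, 18, and 36, so power at the sentence, phrase, and syllable rates is not smeared across neighboring bins.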

Evoked power reflects the power of EEG responses that is synchronized in both phase
and time with speech stimuli. The discrete Fourier transform of the response in trial n is
denoted as Xn(f), where Xn(f) is a complex-valued Fourier coefficient. Thus, evoked power is the
squared magnitude of the sum of the complex-valued Fourier coefficients across trials, divided
by the total number of trials N.

\[
E(f) = \left| \frac{\sum_{n} X_n(f)}{N} \right|^{2} \tag{1}
\]
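Equation 1 can be illustrated on synthetic trials. This is a minimal sketch, not the authors' FieldTrip pipeline: the sampling rate, the number of trials, and the pure 1 Hz cosine "responses" are assumptions chosen so that the outcome is easy to verify by hand.

```python
import cmath
import math


def fourier_coefficient(x, freq_hz, fs):
    """Complex Fourier coefficient X_n(f) of one trial at a single frequency."""
    return sum(v * cmath.exp(-2j * math.pi * freq_hz * t / fs)
               for t, v in enumerate(x))


def evoked_power(trials, freq_hz, fs):
    """Equation 1: squared magnitude of the trial-averaged Fourier coefficient."""
    coeffs = [fourier_coefficient(x, freq_hz, fs) for x in trials]
    return abs(sum(coeffs) / len(coeffs)) ** 2


FS = 50              # Hz, assumed sampling rate for the toy data
N_SAMPLES = 9 * FS   # 9 s of data per trial, as in the text
N_TRIALS = 8


def toy_trial(phase):
    """A 1 Hz cosine with the given starting phase."""
    return [math.cos(2 * math.pi * t / FS + phase) for t in range(N_SAMPLES)]


phase_locked = [toy_trial(0.0) for _ in range(N_TRIALS)]
phase_spread = [toy_trial(2 * math.pi * n / N_TRIALS) for n in range(N_TRIALS)]

locked_power = evoked_power(phase_locked, 1.0, FS)
spread_power = evoked_power(phase_spread, 1.0, FS)
print(locked_power, spread_power)
```

Both sets of trials contain the same amount of 1 Hz activity, but only the phase-locked set survives averaging across trials, which is why evoked power isolates stimulus-locked responses.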

The 1/f trend in the power spectrum was normalized by dividing the value at the target frequency
by the average of neighboring values within ±0.5 Hz via Equation 2, adapted from Ding et al.
(2017), where w represents the neighboring frequencies around the target frequency f. We adopt
this approach to normalization to make our analysis as comparable as possible to that of Ding
et al. (2017). (In response to a reviewer query, we also analyzed evoked power using the normalization
algorithm proposed by Donoghue et al., 2020, as well as non-normalized evoked
power; results are stable regardless of normalization strategy.)

\[
E_n(f) = \frac{E(f)}{\sum_{w} E(w)}, \qquad |w - f| < 0.5\ \text{Hz},\ w \neq f \tag{2}
\]

Intertrial phase coherence (ITPC) reflects similarities in phase across trials (Cohen, 2014).
The summation of cosine and sine values of phase angle θn of each complex-value Fourier
coefficient is computed and then the square root of the summation is averaged over the total
number of trials N. (The original formula in Ding et al., 2017, did not take the square root.)

\[
R(f) = \frac{\sqrt{\left(\sum_{n} \cos\theta_n\right)^{2} + \left(\sum_{n} \sin\theta_n\right)^{2}}}{N} \tag{3}
\]

Induced power reflects the power of EEG responses that is synchronized in time but not
phase with the speech stimuli. Induced power is computed from the difference between the
complex-value Fourier coefficient per trial and the mean over trials (denoted ⟨X(f)⟩) from


each trial n. Then the summation of difference from each trial is averaged over the total
number of trials N.

\[
I(f) = \frac{\sum_{n} \left| X_n(f) - \langle X(f) \rangle \right|^{2}}{N} \tag{4}
\]
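The remaining measures can be sketched together on toy Fourier coefficients. Several choices here are assumptions rather than the authors' code: induced power is taken as the mean squared magnitude of each trial's deviation from the trial mean (so that it is a power, like Equation 1), and the normalization follows the prose description (division by the mean of the neighboring bins), which differs from a plain sum over neighbors only by a constant factor equal to the number of neighbors.

```python
import cmath
import math


def itpc(coeffs):
    """Equation 3: resultant length of the unit phase vectors, divided by N."""
    cos_sum = sum(math.cos(cmath.phase(z)) for z in coeffs)
    sin_sum = sum(math.sin(cmath.phase(z)) for z in coeffs)
    return math.sqrt(cos_sum ** 2 + sin_sum ** 2) / len(coeffs)


def induced_power(coeffs):
    """Equation 4 (assumed squared): mean squared distance from the trial mean."""
    mean = sum(coeffs) / len(coeffs)
    return sum(abs(z - mean) ** 2 for z in coeffs) / len(coeffs)


def normalized_power(spectrum, k, resolution_hz, half_width_hz=0.5):
    """Equation 2 as worded in the prose: bin k over the mean of its neighbors."""
    neighbors = [p for j, p in enumerate(spectrum)
                 if j != k and abs(j - k) * resolution_hz < half_width_hz]
    return spectrum[k] / (sum(neighbors) / len(neighbors))


N_TRIALS = 8
same_phase = [cmath.rect(2.0, 0.3)] * N_TRIALS
spread_phase = [cmath.rect(1.0, 2 * math.pi * n / N_TRIALS) for n in range(N_TRIALS)]

flat_spectrum = [1.0] * 20
flat_spectrum[9] = 5.0   # a peak at bin 9, i.e., 1 Hz when bins are 1/9 Hz apart

print(itpc(same_phase), itpc(spread_phase))
print(induced_power(same_phase), induced_power(spread_phase))
print(normalized_power(flat_spectrum, 9, 1 / 9))
```

Identical phases give an ITPC of 1 and no induced power, whereas evenly spread phases give an ITPC near 0 and leave all power in the induced term. With bins 1/9 Hz apart, four neighbors on each side of the target fall within ±0.5 Hz.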

For statistical analysis, conditions were compared via a one-way repeated measures
analysis of variance (ANOVA) for each measure at each frequency of interest: 1 Hz, 2 Hz,
and 4 Hz. A Greenhouse-Geisser correction was applied for calculating p values when
non-sphericity was indicated by Mauchly’s test.
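The F statistic of a one-way repeated measures ANOVA can be computed by partitioning the total sum of squares into condition, subject, and error terms. The sketch below assumes a complete subjects × conditions table; it does not compute p values, Mauchly's test, or the Greenhouse-Geisser correction, and the data are made-up numbers rather than values from this study.

```python
def rm_anova_f(data):
    """One-way repeated measures ANOVA.

    data[i][j] is the value for subject i in condition j.
    Returns (F, df_condition, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_cond - ss_subj   # subject-by-condition residual

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    return (ss_cond / df_cond) / (ss_error / df_error), df_cond, df_error


# Hypothetical toy table: 3 subjects (rows) by 3 conditions (columns).
toy_data = [
    [1.0, 2.0, 6.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
f_stat, df1, df2 = rm_anova_f(toy_data)
print(f_stat, df1, df2)
```

Removing the between-subject sum of squares from the error term is what distinguishes this from a between-subjects ANOVA; in the toy table the sums of squares are 14 (condition), 2 (subject), and 4 (error).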

Simulations

We conducted a series of simulations to test the predictions of the lexical-sequence account
for four-word sentences and reversed phrases under different methodologies for representing
word meanings as vectors in a high-dimensional semantic space. Twelve simulated subjects
and 50 sentences adapted from Ding et al. (2016) were simulated according to the procedure
and code shared by Frank and Yang (2018). First, each word in a sentence was converted
to an N-dimensional column vector based on the co-occurrence of that word with
others in a large corpus of text; this is a word embedding (e.g., Mikolov et al., 2013). These
vectors were copied across M columns to simulate a word lasting 250 ms, with an onset
time t drawn from the distribution U(40, 50) (simulating ear-brain lag). These word representations
were concatenated into four-word sentences represented as an N × M matrix w.
Gaussian noise with a standard deviation of 0.5 was added to each sentence matrix and the
discrete Fourier transform was applied to each of the N rows. Spectral power was then averaged
row-wise yielding a single time series for each sentence and each subject, as implemented
by Frank and Yang (2018).
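The logic of the lexical-sequence simulation can be illustrated with a deliberately reduced sketch. It is not the code shared by Frank and Yang (2018): the three-dimensional vectors that encode only part of speech, the 20 Hz sampling rate, and the ten-item sequence are assumptions, and onset lag is omitted and noise is switched off so that the result is reproducible. The sketch is meant to show only the 1 Hz and 2 Hz behavior.

```python
import cmath
import math
import random

# Hypothetical "embeddings" that encode nothing but part of speech.
TOY_VECTORS = {
    "ADJ": (1.0, 0.0, 0.0),
    "NOUN": (0.0, 1.0, 0.0),
    "VERB": (0.0, 0.0, 1.0),
}
FS = 20                # samples per second (assumed)
SAMPLES_PER_WORD = 5   # 250 ms at 20 Hz


def simulate(tags, n_items=10, noise_sd=0.0, seed=0):
    """Build an N x M matrix (list of rows), one row per vector dimension."""
    rng = random.Random(seed)
    n_dims = len(TOY_VECTORS["NOUN"])
    rows = [[] for _ in range(n_dims)]
    for _ in range(n_items):
        for tag in tags:
            for _ in range(SAMPLES_PER_WORD):   # hold the vector for one word
                for d in range(n_dims):
                    rows[d].append(TOY_VECTORS[tag][d] + rng.gauss(0.0, noise_sd))
    return rows


def mean_power(rows, freq_hz):
    """Power at freq_hz, computed per row and then averaged over rows."""
    total = 0.0
    for row in rows:
        coeff = sum(v * cmath.exp(-2j * math.pi * freq_hz * t / FS)
                    for t, v in enumerate(row))
        total += abs(coeff) ** 2
    return total / len(rows)


sentence = ["ADJ", "NOUN", "VERB", "NOUN"]           # ABCD
reversed_phrases = ["NOUN", "ADJ", "NOUN", "VERB"]   # BADC

freqs = (1.0, 1.5, 2.0)
spectrum_abcd = {f: mean_power(simulate(sentence), f) for f in freqs}
spectrum_badc = {f: mean_power(simulate(reversed_phrases), f) for f in freqs}
print(spectrum_abcd)
print(spectrum_badc)
```

In this toy setup the reversed sequence yields the same power at 1 Hz and 2 Hz as the original order, with essentially no power at the off-rate frequency of 1.5 Hz. A model that sees only the sequence of word vectors therefore cannot distinguish the two conditions, which is the prediction the EEG experiment tests.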

This procedure was repeated for both the four-syllable sentences and reversed phrases for
each of three different methods for calculating word embeddings: (i) Frank and Yang’s (2018)
word vectors for four-syllable sentences (reversed phrases were derived by simply swapping
columns; no other parameters were changed), (ii) word embeddings from Wikipedia2Vec
(Yamada et al., 2020), and (iii) pre-trained Chinese bidirectional encoder representations
from transformers (BERT; Cui et al., 2021). Wikipedia2Vec was trained from a word-based
skip-gram model, an anchor context model, and the link graph model; thus embeddings
were learned by predicting the neighboring context from the given words and the link
graphs on Wikipedia. Prior literature suggests that Wikipedia2Vec trained in this way offers
high performance, especially on word analogy and text classification tasks (e.g., Yamada
et al., 2016; Yamada & Shindo, 2019). In contrast to both the embeddings from Frank
and Yang (2018) and Wikipedia2Vec (Yamada et al., 2020), BERT is trained with an
unsupervised, bidirectional approach, which means that the word vectors for the same
word may differ depending on the context. Note that the Chinese BERT with whole word
masking takes Chinese word segmentation into consideration before training. Thus, this
model is trained by masking whole words instead of word fragments. This model has
shown higher performance on various tasks across the sentence and document levels (Cui
et al., 2021). We compare word vectors extracted from different models to evaluate the
generalizability of Frank and Yang’s (2018) lexical model across alternative methods for
representing lexical semantics.
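
The column swap used to derive reversed phrases can be illustrated with a short Python sketch. The helper `reverse_within_phrases` and the one-hot toy "embeddings" are hypothetical and serve only to show the manipulation; here words are stored as rows, whereas the text above describes them as columns of the sentence matrix.

```python
import numpy as np


def reverse_within_phrases(word_vectors, phrase_len=2):
    """Reverse word order inside each consecutive phrase.

    word_vectors: array of shape (n_words, N); rows are words.
    """
    n_words, n_dims = word_vectors.shape
    if n_words % phrase_len != 0:
        raise ValueError("n_words must be a multiple of phrase_len")
    phrases = word_vectors.reshape(n_words // phrase_len, phrase_len, n_dims)
    return phrases[:, ::-1, :].reshape(n_words, n_dims)


# Toy sentence with one-hot "embeddings": white sheep eat grass.
labels = np.array(["white", "sheep", "eat", "grass"])
sentence = np.eye(4)
reversed_sentence = reverse_within_phrases(sentence)
reversed_labels = labels[reversed_sentence.argmax(axis=1)].tolist()
```

The reversed sequence contains exactly the same word vectors as the original sentence, one per 250 ms slot; only their order within each two-word phrase changes.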

RESULTS

Model Simulations

Figure 2 shows the simulated power spectra up to 10 Hz for both four-word sentences and
reversed phrases as derived from three separate word embedding representations. As observed
by Frank and Yang (2018), four-word sentences showed spectral peaks at 1 Hz and 2 Hz based
on the lexical properties of the word sequences alone (top row). Those models carry the
prediction that such peaks will also be observed in the novel reversed phrases condition, as
lexical patterns remain unchanged and only hierarchical phrase structure has been disrupted.
The experiment tests precisely whether such peaks are also observed in human EEG signals.

EEG Results

Figure 3 summarizes EEG spectra across all four conditions. Normalized evoked power shows
a 4 Hz “syllable” peak across all conditions. A 2 Hz peak for evoked power was
observed for four-syllable sentences and two-syllable phrases, but not for semantically
mismatched sentences or, critically, for reversed phrases. The first three of these results
serve to replicate Ding et al. (2016, 2017) by demonstrating that linguistic patterns beyond
those explicitly encoded in the acoustic envelope can elicit neural synchrony. The key novel
comparison is the result concerning reversed phrases. No 2 Hz “phrase-level” peak was found
here, in contrast to predictions from the lexical-sequence model (see simulation results in
Figure 2). A similar pattern was also seen for evoked power at 1 Hz: A peak was observed

Figure 2. Simulated power spectra for four-word sentences (top) and reversed phrases (bottom) for
three different approaches to calculating word embeddings (columns). Colored traces indicate
individual simulation trials and black traces indicate the mean spectral pattern. The left-most
column shows power spectra simulated using the four-sentence word vectors proposed by Frank and
Yang (2018) and their reversed counterpart. Arrows indicate clear spectral peaks at the phrasal
(2 Hz) and sentential (1 Hz) levels, likely reflecting repeated lexical-level patterns, such as
part-of-speech information, at these rates. Crucially, these lexical-level patterns are preserved
in the reversed phrases. The same pattern is observed when word vectors are calculated using
Wikipedia2Vec (middle column) and Chinese BERT (right-most column).


Figure 3. Normalized evoked power (log-scale) for four-word sentences (red), semantically
mismatched sentences (blue), two-word phrases (green), and reversed phrases (purple). Colored
traces show individual participant data; black traces indicate the group average per condition.
Sensor topographies are shown at the 4 Hz syllable/word rate, the 2 Hz phrase rate, and the 1 Hz
sentence rate. All conditions show robust entrainment at 4 Hz; phrasal entrainment at 2 Hz is
apparent for four-word sentences, two-word phrases, and, to a lesser extent, mismatched sentences.
Sentential entrainment at 1 Hz is apparent for four-word sentences only. See main text and
Figure 5 for statistical details.

for four-syllable sentences (left-most) but not reversed phrases (right-most). The absence of a
1 Hz peak for semantically mismatched sentences and two-syllable phrases again replicates findings
from Ding et al. (2016). Again, in contrast to predictions of the lexical-sequence model, no 1 Hz
peak was observed for reversed phrases (right-most). Statistical evaluation of these patterns is
reported below.

Figure 4 illustrates results for ITPC and induced power, respectively. ITPC results follow the
same patterns found for evoked power across all four experimental conditions; this result

Figure 4. (A) Intertrial phase coherence (ITPC) for four-word sentences (red), semantically
mismatched sentences (blue), two-word phrases (green), and reversed phrases (purple). Colored traces
show individual participant responses; black traces show the group average per condition. Spectral
peaks show phase alignment at 4 Hz across all conditions, at 2 Hz for four-word sentences and
two-word phrases, and at 1 Hz for four-word sentences only. This pattern matches that seen for
normalized evoked power. (B) Induced power (log-scale) across four conditions; no relevant spectral
patterns are apparent. See main text and Figure 5 for statistical details.

Figure 5. The 1, 2, and 4 Hz spectral activity across four conditions for normalized evoked power
(A), intertrial phase coherence (ITPC) (B), and induced power (C). Error bars indicate ±1 standard
error of the mean. Significance code: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05.

pattern includes the key absence of 1 Hz and 2 Hz peaks for the reversed phrases condition. No
spectral peaks were observed in induced power at any target frequency band (1, 2, or 4 Hz).
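
To make the contrast between the three spectral measures concrete, the following Python sketch computes evoked power, ITPC, and induced power from a set of single-channel epochs. It uses one common formulation of these quantities and is an assumption for illustration, not the analysis pipeline of the present study; the function name `spectral_measures` and the array layout (trials × samples) are hypothetical.

```python
import numpy as np


def spectral_measures(trials, fs):
    """Compute evoked power, ITPC, and induced power per frequency bin.

    trials: array of shape (n_trials, n_samples), single-channel epochs.
    fs: sampling rate in Hz.
    """
    spectra = np.fft.rfft(trials, axis=1)             # complex spectrum per trial
    freqs = np.fft.rfftfreq(trials.shape[1], d=1.0 / fs)
    mean_spectrum = spectra.mean(axis=0)

    # Evoked power: power of the trial-averaged (phase-locked) response.
    evoked = np.abs(mean_spectrum) ** 2

    # ITPC: length of the mean unit phasor across trials (0 = random, 1 = locked).
    magnitude = np.maximum(np.abs(spectra), np.finfo(float).tiny)
    itpc = np.abs((spectra / magnitude).mean(axis=0))

    # Induced power: mean power left after removing the phase-locked response.
    induced = (np.abs(spectra - mean_spectrum) ** 2).mean(axis=0)
    return freqs, evoked, itpc, induced


# Toy data: a phase-locked 2 Hz component plus a 1 Hz component whose phase
# varies randomly across trials, embedded in Gaussian noise.
rng = np.random.default_rng(1)
fs, n_trials, n_samples = 100, 30, 400
t = np.arange(n_samples) / fs
phases = rng.uniform(0, 2 * np.pi, size=(n_trials, 1))
trials = (np.sin(2 * np.pi * 2 * t)
          + np.sin(2 * np.pi * 1 * t + phases)
          + rng.normal(0.0, 0.5, size=(n_trials, n_samples)))
freqs, evoked, itpc, induced = spectral_measures(trials, fs)
```

Under this formulation, a component that is phase-locked across trials contributes to evoked power and ITPC, whereas a component of similar amplitude with variable phase appears mainly in induced power.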

Statistical comparisons at each frequency of interest are illustrated in Figure 5. For normalized evoked power, we observed a main effect of condition at 1 Hz (F(1.53, 45.9) = 8.16, p < 0.01). Post hoc pairwise Tukey’s tests showed a statistically significant difference between the four-syllable sentence condition and each of the others (all p < 0.01), and no significant difference between the semantically mismatched sentences and the phrases (p = 0.7), between the semantically mismatched sentences and the reversed phrases (p = 0.99), or between the phrases and reversed phrases (p = 0.64). A main effect of condition was also found for the 2 Hz peak (F(2.19, 65.7) = 25.97, p < 0.001). Post hoc pairwise Tukey’s tests showed statistically significant differences between four-syllable sentences and semantically mismatched sentences (p < 0.0001), four-syllable sentences and reversed phrases (p < 0.0001), as well as between phrases and reversed phrases (p < 0.0001). No statistically significant difference was found in the comparison between four-word sentences and two-word phrases (p = 0.97), nor between semantically mismatched sentences and reversed phrases (p = 0.51). There was a marginal effect of condition at the 4 Hz syllable peak (F(2.22, 66.6) = 2.53, p = 0.08).

A nearly identical statistical pattern was observed for ITPC. A main effect at 1 Hz (F(1.77, 53.1) = 8.29, p < 0.01) was supported by pairwise differences (Tukey’s test) between the four-syllable sentences and all other conditions (all p < 0.01); there were no significant differences among semantically mismatched sentences, phrases, and reversed phrases (all p > 0.7). A statistically reliable effect was also found at 2 Hz (F(2.16, 64.8) = 30.77, p < 0.0001). Post hoc tests revealed significant differences between four-syllable sentences and reversed phrases (p < 0.0001), sentences and semantically mismatched sentences (p < 0.0001), phrases and reversed phrases (p < 0.0001), as well as between phrases and semantically mismatched sentences (p < 0.0001). No significant difference was found in the comparison between the four-syllable sentences and phrases (p = 0.92), nor between the semantically mismatched sentences and reversed phrases (p = 0.27). There was no main effect of condition at 4 Hz (F(3, 90) = 1.99, p = 0.12).

No statistically reliable effects were observed for induced power (1 Hz: F(3, 90) = 1.61, p = 0.19; 2 Hz: F(1.98, 59.4) = 1.04, p = 0.36; 4 Hz: F(3, 90) = 2.5, p = 0.06).

DISCUSSION

Low-frequency neural activity in the delta band may become synchronized with abstract linguistic patterns (Ding et al., 2016). We tested between two accounts for the functional interpretation of this synchronization using EEG data and a frequency-tagging experimental protocol where spoken words were presented at a 4 Hz rate with and without syntactic structure. The lexical sequence theory holds that this synchrony emerges due to patterns of sequential lexical or part-of-speech information (Frank & Yang, 2018). The structural account links delta band synchrony with how syntactic structure is encoded across time (Martin & Doumas, 2017); on this account such activity is modulated by hierarchical syntactic information. To tease apart the two accounts, we investigated reversed phrases, which preserve lexical semantics and part-of-speech patterns in comparison to four-word sentences but crucially do not license grammatical structure at the phrasal or sentential level. If delta band neural activity reflects lexical sequence information, reversed phrases should elicit peaks at 1, 2, and 4 Hz, just as seen with regular four-word sentences.
Replicating Frank and Yang (2018), we demonstrated with a series of computational simulations that those predictions are robust across a range of embedding strategies for word meaning (see Figure 2). However, if delta band synchrony is modulated by structural information, then reversed phrases (lacking structure) should elicit synchrony only at the 4 Hz rate of monosyllabic words. Inconsistent with the lexical sequence theory and simulations, but consistent with the hierarchical model, EEG data revealed that the reversed phrases elicit peaks at 4 Hz only, in contrast to regular four-word sentences and two-word phrases (see, e.g., Figure 3). These data support the conclusion that neural activity in the delta band reflects the processing of hierarchical information above and beyond lexical-sequence information.

Our data are consistent with the recent report from Burroughs et al. (2021), who tested for neural synchrony by comparing English phrases that followed a grammatical Adj-N phrasal template versus an ungrammatical Adj-V pattern. We replicated their findings that ungrammatical sequences disrupt neural synchrony at the phrasal level using a new manipulation in Mandarin, and also extended their results to the sentential level.

On the other hand, our observations appear to contrast with the conclusions of Kalenkovich et al. (2022), who reasoned that different syntactic structures in Russian should elicit distinct patterns of neural synchrony under hierarchical, but not lexical, accounts.
That study and ours used very different strategies for manipulating grammatical structure; crucially our manipulation affects grammatical well-formedness, while the dative and genitive target conditions used by Kalenkovich et al. (2022) are both grammatically acceptable. They reasoned that a hierarchical account would predict greater phrase-level synchrony for genitive structures, where phrases appear at regular intervals, as opposed to dative structures. Yet, similar patterns of neural synchrony were found for the two constructions. The interpretation of this result is highly dependent both on the syntactic analysis of the relevant structures and on the theory of parsing of these structures that underlies online sentence recognition. Both of these facets warrant further study. For example, their particular analysis of datives assumes a ternary-branching structure for verb phrases; a layered verb phrase (Larson, 1988, inter alia) carries distinct predictions about the rate of phrases processed per unit time for these stimuli. The dynamics of the parsing process also bear on how distinct constructions affect synchrony, yet little work has modeled the parsing mechanisms associated with these low-frequency signals (see Brennan & Martin, 2019, for discussion). Progress on sorting out these discrepancies will likely require pairing carefully controlled syntactic manipulations in the mold of Kalenkovich et al. (2022) with explicit models that link parsing with neural mechanisms such as phase resetting (Martin, 2020).

Whether the neural synchrony observed for isochronous speech reflects evoked responses or endogenous oscillatory activities remains under debate (Martorell et al., 2020; Zoefel et al., 2018); our results help to sharpen the issue.
In our study, trials built from four-syllable sentences shared the same words as trials built from reversed phrases, and both sequences contained lexical patterns that repeat at 1, 2, and 4 Hz (e.g., 1 verb/second; 2 nouns/second, etc.). If evoked responses are limited to those due to exogenous stimuli, then our results are consistent with the endogenous oscillatory view, perhaps via a phase-reset mechanism (e.g., Martin, 2020). On the other hand, if evoked responses may be attributed to internally generated state transitions, such as recognizing a phrasal node by applying grammatical knowledge, such processing would be time-locked to the isochronous speech rate and thus could give rise to the 1 and 2 Hz patterns of synchrony we observed. That is, the fact that 1 and 2 Hz peaks were only found for regular sentences must be due to endogenous syntactic processing based on the linguistic knowledge of the participant, but whether these signals reflect internally evoked neural responses or the phase resetting of ongoing oscillatory rhythms remains unknown. Meyer et al. (2019) offer more discussion of how synchrony might result from the combination of external acoustic information and endogenous application of linguistic knowledge.

In addition to the target theoretical question, our results also serve to replicate several earlier observations using frequency tagging and isochronous speech. We replicated with EEG several key results from the MEG study by Ding et al. (2016). As previously reported, four-syllable sentences elicited peaks at 1, 2, and 4 Hz and two-syllable phrases elicited peaks at 2 and 4 Hz, but not 1 Hz. We also found, as with previous reports, that semantically mismatched sentences elicited absent or attenuated responses at 1 Hz and 2 Hz. While Ding et al. (2016) only investigated neural synchrony using a measure of total power, Ding et al. (2017) separately analyzed evoked and induced power; the former reflects neural activity that is time-locked and phase-locked to an external stimulus, while the latter reflects neural activity that is time-locked but not phase-locked. They separated out phase locking specifically using ITPC, which measures the phase consistency of neural signals across trials. In line with the EEG findings from English reported by Ding et al. (2017), we observed sentential, phrasal, and syllabic synchrony in evoked power and ITPC, but not induced power. This finding is consistent with patterns of synchrony that reflect a phase-reset mechanism (e.g., Cravo et al., 2011; Kösem et al., 2014).

One concern in the current study is how our results relate to delta band findings from language processing that do not rely on frequency tagging and, more broadly, how results from this less natural experimental protocol might generalize to more naturalistic contexts. Kaufeld et al. (2020) and Coopmans et al. (2022) present one possible avenue forward, where the linguistic properties of more natural stimuli are analyzed in the frequency domain and fit against neural dynamics. Here, rather than isochronous speech, controlled sentences were presented where phrases spanned a narrow temporal window. They observed increased mutual information between EEG signals and the speech envelope within a narrow frequency band defined by the frequency of phrases, but this increase was only observed for structured sentences, not for word lists.
Using another strategy, Luo and Ding (2020) tested for oscillatory effects of structure when participants listened to metrical stories, which were made up of pairs of mono- and disyllabic words in both isochronous speech and natural story listening. They reported no delta band peak in the non-metrical stories, which did not have fixed word onsets and length. These studies provide some insight into the processing of more natural speech, but key questions remain, including how to scale a theory based on relatively narrow-band endogenous rhythms to the higher temporal variation found in quasi-periodic everyday language, and whether the same approach can be applied to longer phrases (and therefore slower neural rhythms).

Other key directions for generalization also remain to be explored. As Martorell et al. (2020) note, it is unclear how neural synchrony of this sort might vary across populations, including in children and patients with aphasia, though see Getz et al. (2018) for an examination of these patterns in a language-learning setting (cf. Maguire & Abel, 2013). Another open question concerns whether these effects generalize across modalities of stimulus presentation (sign vs. speech).

Conclusion

The current study investigated whether neural activity in the delta band represents the processing of sequence-based lexical items alone or also reflects hierarchical structure. Our findings based on a novel reversed-phrases design are inconsistent with the lexical sequence hypothesis. In this condition, peaks were elicited only at 4 Hz, not at 1 Hz or 2 Hz, suggesting that low-frequency delta oscillations are not modulated by part-of-speech or word-sequence patterns. This result contrasts with robust tracking of abstract patterns at 1 Hz and 2 Hz for four-word sentences presented at 4 words per second, and for two-word phrases presented at the same rate.
That tracking was observed in ITPC and evoked power, but not induced power; this replicates Ding et al. (2016, 2017) and Burroughs et al. (2021) and confirms that cortical tracking of abstract hierarchical information, possibly reflecting a phase-reset mechanism, can be detected robustly across languages with different brain-imaging techniques.

ACKNOWLEDGMENTS

We thank Samia Elahi for data collection, and audiences from SNL 2019 and AMLaP 2020 for helpful comments.

AUTHOR CONTRIBUTIONS

Chia-Wen Lo: Conceptualization: Equal; Data curation: Lead; Formal analysis: Lead; Investigation: Lead; Methodology: Equal; Visualization: Equal; Writing – original draft: Lead; Writing – review & editing: Equal.
Tzu-Yun Tung: Conceptualization: Equal; Data curation: Supporting; Methodology: Equal; Writing – review & editing: Equal.
Alan Hezao Ke: Conceptualization: Equal; Data curation: Supporting; Methodology: Equal; Writing – review & editing: Equal.
Jonathan R. Brennan: Conceptualization: Equal; Formal analysis: Supporting; Funding acquisition: Lead; Investigation: Supporting; Methodology: Equal; Project administration: Lead; Supervision: Lead; Visualization: Equal; Writing – original draft: Supporting; Writing – review & editing: Equal.

REFERENCES

Arnal, L. H., Poeppel, D., & Giraud, A.-L. (2016). A neurophysiological perspective on speech processing. In G. Hickok & S. L. Small (Eds.), The neurobiology of language (pp. 463–478). Elsevier. https://doi.org/10.1016/B978-0-12-407794-2.00038-9
Bemis, D. K., & Pylkkänen, L. (2011). Simple composition: A magnetoencephalography investigation into the comprehension of minimal linguistic phrases. Journal of Neuroscience, 31(8), 2801–2814. https://doi.org/10.1523/JNEUROSCI.5003-10.2011, PubMed: 21414902
Benítez-Burraco, A., & Murphy, E. (2019). Why brain oscillations are improving our understanding of language. Frontiers in Behavioral Neuroscience, 13, Article 190. https://doi.org/10.3389/fnbeh.2019.00190, PubMed: 31551725
Boersma, P., & Weenink, D. (2022). Praat: Doing phonetics by computer (Version 6.2.09) [Computer software]. https://www.praat.org
Bonhage, C. E., Meyer, L., Gruber, T., Friederici, A. D., & Mueller, J. L. (2017). Oscillatory EEG dynamics underlying automatic chunking during sentence processing. NeuroImage, 152, 647–657. https://doi.org/10.1016/j.neuroimage.2017.03.018, PubMed: 28288909
Brennan, J. R., & Martin, A. E. (2019). Phase synchronization varies systematically with linguistic structure composition. Philosophical Transactions of the Royal Society B, 375(1791), Article 20190305. https://doi.org/10.1098/rstb.2019.0305, PubMed: 31840584
Burroughs, A., Kazanina, N., & Houghton, C. (2021). Grammatical category and the neural processing of phrases. Scientific Reports, 11(1), Article 2446. https://doi.org/10.1038/s41598-021-81901-5, PubMed: 33510230
Buzsáki, G., & Draguhn, A. (2004). Neuronal oscillations in cortical networks. Science, 304(5679), 1926–1929. https://doi.org/10.1126/science.1099745, PubMed: 15218136
Cohen, M. X. (2014). Analyzing neural time series data: Theory and practice. MIT Press. https://doi.org/10.7551/mitpress/9609.001.0001
Coopmans, C. W., de Hoop, H., Hagoort, P., & Martin, A. E. (2022). Effects of structure and meaning on cortical tracking of linguistic units in naturalistic speech. Neurobiology of Language, 3(3), 386–412. https://doi.org/10.1162/nol_a_00070
Corretge, R. (2020). Praat vocal toolkit [Computer software]. https://www.praatvocaltoolkit.com
Cravo, A. M., Rohenkohl, G., Wyart, V., & Nobre, A. C. (2011). Endogenous modulation of low frequency oscillations by temporal expectations. Journal of Neurophysiology, 106(6), 2964–2972. https://doi.org/10.1152/jn.00157.2011, PubMed: 21900508
Cui, Y., Che, W., Liu, T., Qin, B., & Yang, Z. (2021). Pre-training with whole word masking for Chinese BERT. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29, 3504–3514. https://doi.org/10.1109/TASLP.2021.3124365
Di Liberto, G. M., O’Sullivan, J. A., & Lalor, E. C. (2015). Low-frequency cortical entrainment to speech reflects phoneme-level processing. Current Biology, 25(19), 2457–2465. https://doi.org/10.1016/j.cub.2015.08.030, PubMed: 26412129
Ding, N., Melloni, L., Yang, A., Wang, Y., Zhang, W., & Poeppel, D. (2017). Characterizing neural entrainment to hierarchical linguistic units using electroencephalography (EEG). Frontiers in Human Neuroscience, 11, Article 481. https://doi.org/10.3389/fnhum.2017.00481, PubMed: 29033809
Ding, N., Melloni, L., Zhang, H., Tian, X., & Poeppel, D. (2016). Cortical tracking of hierarchical linguistic structures in connected speech. Nature Neuroscience, 19, 158–164. https://doi.org/10.1038/nn.4186, PubMed: 26642090
Donoghue, T., Haller, M., Peterson, E. J., Varma, P., Sebastian, P., Gao, R., Noto, T., Lara, A. H., Wallis, J. D., Knight, R. T., Shestyuk, A., & Voytek, B. (2020). Parameterizing neural power spectra into periodic and aperiodic components. Nature Neuroscience, 23, 1655–1665. https://doi.org/10.1038/s41593-020-00744-x, PubMed: 33230329
Frank, S. L., & Yang, J. (2018). Lexical representation explains cortical entrainment during speech comprehension. PLOS ONE, 13(5), Article e0197304. https://doi.org/10.1371/journal.pone.0197304, PubMed: 29771964
Getz, H., Ding, N., Newport, E. L., & Poeppel, D. (2018). Cortical tracking of constituent structure in language acquisition. Cognition, 181, 135–140. https://doi.org/10.1016/j.cognition.2018.08.019, PubMed: 30195135
Ghitza, O. (2011). Linking speech perception and neurophysiology: Speech decoding guided by cascaded oscillators locked to the input rhythm. Frontiers in Psychology, 2, Article 130. https://doi.org/10.3389/fpsyg.2011.00130, PubMed: 21743809
Ghitza, O., & Greenberg, S. (2009). On the possible role of brain rhythms in speech perception: Intelligibility of time-compressed speech with periodic and aperiodic insertions of silence. Phonetica, 66(1–2), 113–126. https://doi.org/10.1159/000208934, PubMed: 19390234
Giraud, A.-L., & Poeppel, D. (2012). Cortical oscillations and speech processing: Emerging computational principles and operations. Nature Neuroscience, 15, 511–517. https://doi.org/10.1038/nn.3063, PubMed: 22426255
Glushko, A., Poeppel, D., & Steinhauer, K. (2020). Overt and covert prosody are reflected in neurophysiological responses previously attributed to grammatical processing. BioRxiv. https://doi.org/10.1101/2020.09.17.301994
Hagoort, P., & Indefrey, P. (2014). The neurobiology of language beyond single words. Annual Review of Neuroscience, 37, 347–362. https://doi.org/10.1146/annurev-neuro-071013-013847, PubMed: 24905595
Humphries, C., Binder, J. R., Medler, D. A., & Liebenthal, E. (2006). Syntactic and semantic modulation of neural activity during auditory sentence comprehension. Journal of Cognitive Neuroscience, 18(4), 665–679. https://doi.org/10.1162/jocn.2006.18.4.665, PubMed: 16768368
Jung, T.-P., Makeig, S., Humphries, C., Lee, T.-W., McKeown, M. J., Iragui, V., & Sejnowski, T. J. (2000). Removing electroencephalographic artifacts by blind source separation. Psychophysiology, 37(2), 163–178. https://doi.org/10.1111/1469-8986.3720163, PubMed: 10731767
Kalenkovich, E., Shestakova, A., & Kazanina, N. (2022). Frequency tagging of syntactic structure or lexical properties; a registered MEG study. Cortex, 146, 24–38. https://doi.org/10.1016/j.cortex.2021.09.012, PubMed: 34814042
Kaufeld, G., Bosker, H. R., ten Oever, S., Alday, P. M., Meyer, A. S., & Martin, A. E. (2020). Linguistic structure and meaning organize neural oscillations into a content-specific hierarchy. Journal of Neuroscience, 40(49), 9467–9475. https://doi.org/10.1523/JNEUROSCI.0302-20.2020, PubMed: 33097640
Kösem, A., Gramfort, A., & van Wassenhove, V. (2014). Encoding of event timing in the phase of neural oscillations. NeuroImage, 92, 274–284. https://doi.org/10.1016/j.neuroimage.2014.02.010, PubMed: 24531044
Larson, R. K. (1988). On the double object construction. Linguistic Inquiry, 19(3), 335–391.
Lu, Y., Jin, P., Pan, X., & Ding, N. (2022). Delta-band neural activity primarily tracks sentences instead of semantic properties of words. NeuroImage, 251, Article 118979. https://doi.org/10.1016/j.neuroimage.2022.118979, PubMed: 35143977
Luo, C., & Ding, N. (2020). Cortical encoding of acoustic and linguistic rhythms in spoken narratives. eLife, 9, Article e60433. https://doi.org/10.7554/eLife.60433, PubMed: 33345775
Maguire, M. J., & Abel, A. D. (2013). What changes in neural oscillations can reveal about developmental cognitive neuroscience: Language development as a case in point. Developmental Cognitive Neuroscience, 6, 125–136. https://doi.org/10.1016/j.dcn.2013.08.002, PubMed: 24060670
Makeig, S., Bell, A. J., Jung, T.-P., & Sejnowski, T. J. (1995). Independent component analysis of electroencephalographic data. In D. S. Touretzky, M. C. Mozer, & M. E. Hasselmo (Eds.), NIPS 1995: Advances in neural information processing systems 8 (pp. 145–151). MIT Press.
Martin, A. E. (2020). A compositional neural architecture for language. Journal of Cognitive Neuroscience, 32(8), 1407–1427. https://doi.org/10.1162/jocn_a_01552, PubMed: 32108553
Martin, A. E., & Doumas, L. A. A. (2017). A mechanism for the cortical computation of hierarchical linguistic structure. PLOS Biology, 15(3), Article e2000663. https://doi.org/10.1371/journal.pbio.2000663, PubMed: 28253256
Martorell, J., Morucci, P., Mancini, S., & Molinaro, N. (2020). Sentence processing: How words generate syntactic structures in the brain. PsyArXiv. https://doi.org/10.31234/osf.io/3utpv
Matchin, W., Hammerly, C., & Lau, E. (2017). The role of the IFG and pSTS in syntactic prediction: Evidence from a parametric study of hierarchical structure in fMRI. Cortex, 88, 106–123. https://doi.org/10.1016/j.cortex.2016.12.010, PubMed: 28088041
Matchin, W., & Hickok, G. (2020). The cortical organization of syntax. Cerebral Cortex, 30(3), 1481–1498. https://doi.org/10.1093/cercor/bhz180, PubMed: 31670779
Meyer, L. (2018). The neural oscillations of speech processing and language comprehension: State of the art and emerging mechanisms. The European Journal of Neuroscience, 48(7), 2609–2621. https://doi.org/10.1111/ejn.13748, PubMed: 29055058
Meyer, L., & Gumbert, M. (2018). Synchronization of electrophysiological responses with speech benefits syntactic information processing. Journal of Cognitive Neuroscience, 30(8), 1066–1074. https://doi.org/10.1162/jocn_a_01236, PubMed: 29324074
Meyer, L., Henry, M. J., Gaston, P., Schmuck, N., & Friederici, A. D. (2016). Linguistic bias modulates interpretation of speech via neural delta-band oscillations. Cerebral Cortex, 27(9), 4293–4302. https://doi.org/10.1093/cercor/bhw228, PubMed: 27566979
Meyer, L., Sun, Y., & Martin, A. E. (2019). Synchronous, but not entrained: Exogenous and endogenous cortical rhythms of speech and language processing. Language, Cognition and Neuroscience, 35(9), 1089–1099. https://doi.org/10.1080/23273798.2019.1693050
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. ArXiv, 1301.3781v3. https://doi.org/10.48550/arXiv.1301.3781
Neufeld, C., Kramer, S. E., Lapinskaya, N., Heffner, C. C., Malko, A., & Lau, E. F. (2016). The electrophysiology of basic phrase building. PLOS ONE, 11(10), Article e0158446. https://doi.org/10.1371/journal.pone.0158446, PubMed: 27711111
Oostenveld, R., Fries, P., Maris, E., & Schoffelen, J.-M. (2011). FieldTrip: Open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Computational Intelligence and Neuroscience, 2011, Article 156869. https://doi.org/10.1155/2011/156869, PubMed: 21253357
Pallier, C., Devauchelle, A.-D., & Dehaene, S. (2011). Cortical representation of the constituent structure of sentences. PNAS, 108(6), 2522–2527. https://doi.org/10.1073/pnas.1018711108, PubMed: 21224415
Peirce, J. W. (2007). PsychoPy—Psychophysics software in Python. Journal of Neuroscience Methods, 162(1–2), 8–13. https://doi.org/10.1016/j.jneumeth.2006.11.017, PubMed: 17254636
Peirce, J. W. (2009). Generating stimuli for neuroscience using PsychoPy. Frontiers in Neuroinformatics, 2, Article 10. https://doi.org/10.3389/neuro.11.010.2008, PubMed: 19198666
Poeppel, D. (2012). The maps problem and the mapping problem: Two challenges for a cognitive neuroscience of speech and language. Cognitive Neuropsychology, 29(1–2), 34–55. https://doi.org/10.1080/02643294.2012.710600, PubMed: 23017085
Poeppel, D., & Embick, D. (2005). Defining the relation between linguistics and neuroscience. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 103–120). Routledge.
Pylkkänen, L., & Brennan, J. R. (2019). Composition: The neurobiology of syntactic and semantic structure building. In D. Poeppel, G. R. Mangun, & M. S. Gazzaniga (Eds.), The cognitive neurosciences (pp. 859–868). MIT Press.
Schell, M., Zaccarella, E., & Friederici, A. D. (2017). Differential cortical contribution of syntax and semantics: An fMRI study on two-word phrasal processing. Cortex, 96, 105–120. https://doi.org/10.1016/j.cortex.2017.09.002, PubMed: 29024818
Yamada, I., Asai, A., Sakuma, J., Shindo, H., Takeda, H., Takefuji, Y., & Matsumoto, Y. (2020). Wikipedia2Vec: An efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia. ArXiv, 1812.06280v3. https://doi.org/10.48550/arXiv.1812.06280
Yamada, I., & Shindo, H. (2019). Neural attentive bag-of-entities model for text classification. In Proceedings of the 23rd conference on computational natural language learning (CoNLL) (pp. 563–573). Association for Computational Linguistics. https://doi.org/10.18653/v1/K19-1052
Yamada, I., Shindo, H., Takeda, H., & Takefuji, Y. (2016). Joint learning of the embedding of words and entities for named entity disambiguation. In Proceedings of the 20th SIGNLL conference on computational natural language learning (pp. 250–259). Association for Computational Linguistics. https://doi.org/10.18653/v1/K16-1025
Zaccarella, E., Meyer, L., Makuuchi, M., & Friederici, A. D. (2017). Building by syntax: The neural basis of minimal linguistic structures. Cerebral Cortex, 27(1), 411–421. https://doi.org/10.1093/cercor/bhv234, PubMed: 26464476
Zoefel, B., ten Oever, S., & Sack, A. T. (2018). The involvement of endogenous neural oscillations in the processing of rhythmic input: More than a regular repetition of evoked neural responses. Frontiers in Neuroscience, 12, Article 95. https://doi.org/10.3389/fnins.2018.00095, PubMed: 29563860
