The Neural Correlates of Similarity- and
Rule-based Generalization
Fraser Milton1, Pippa Bealing1, Kathryn L. Carpenter1,
Abdelmalek Benattayallah1, and Andy J. Wills2
Abstract
■ The idea that there are multiple learning systems has be-
come increasingly influential in recent years, with many studies
providing evidence that there is both a quick, similarity-based
or feature-based system and a more effortful rule-based system.
A smaller number of imaging studies have also examined
whether neurally dissociable learning systems are detectable.
We further investigate this by employing for the first time in
an imaging study a combined positive and negative patterning
procedure originally developed by Shanks and Darby [Shanks,
D. R., & Darby, R. J. Feature- and rule-based generalization in
human associative learning. Journal of Experimental Psychology:
Animal Behavior Processes, 24, 405–415, 1998]. Unlike previous
related studies employing other procedures, rule generalization
in the Shanks–Darby task is beyond any simple non-rule-based
(e.g., associative) account. We found that rule- and similarity-
based generalization evoked common activation in diverse re-
gions including the pFC and the bilateral parietal and occipital
lobes indicating that both strategies likely share a range of com-
mon processes. No differences between strategies were iden-
tified in whole-brain comparisons, but exploratory analyses
indicated that rule-based generalization led to greater activa-
tion in the right middle frontal cortex than similarity-based
generalization. Conversely, the similarity group activated the
anterior medial frontal lobe and right inferior parietal lobes
more than the rule group did. The implications of these results
are discussed. ■
INTRODUCTION
The ability to generalize information we have previously
learned to novel stimuli is fundamental for successful
functioning in our everyday environment. An enduring
and contentious question is whether this is achieved by
separable learning systems (e.g., Ashby, Alfonso-Reese,
Turken, & Waldron, 1998; Brooks, 1978) or just a single
system (e.g., Newell, Dunn, & Kalish, 2011; Nosofsky &
Kruschke, 2002). Multiple-system accounts typically posit the
existence of a nondeliberative (Wills, Milton, Longmore,
Hester, & Robinson, 2013) or nonanalytic (Brooks, 1978)
process that is automatic (Smith, Patalano, & Jonides, 1998),
similarity-based (Milton, Longmore, & Wills, 2008), and
driven by associative (McLaren, Green, & Mackintosh,
1994) or implicit (Ashby et al., 1998) processes. A second
system is assumed to be deliberative (Wills et al., 2013) or
analytic (Brooks, 1978), controlled (Smith et al., 1998), rule-
based (Ashby et al., 1998), and requiring of extensive cognitive
resources (Wills, Inkster, & Milton, 2015). In this article,
we refer to these two systems as similarity- and rule-based.
Much of the evidence relevant to this debate has come
from behavioral or comparative studies. Some of this evi-
dence is consistent with multiple learning systems accounts
(e.g., Maes et al., 2015; Ashby & Maddox, 2011; Allen &
1University of Exeter, 2University of Plymouth
© 2016 Massachusetts Institute of Technology
Brooks, 1991; Rips, 1989; Kemler Nelson, 1984), whereas
others maintain that this evidence can be more parsimoniously
explained by a single system (e.g., Edmunds, Milton,
& Wills, 2015; Wills et al., 2015; Newell, Moore, Wills, &
Milton, 2013; Stanton & Nosofsky, 2013). Consequently,
there is currently no clear consensus on this issue. A complementary
and currently relatively underexplored approach
is to use brain imaging to examine whether there
are neurally dissociable learning systems. One such fMRI
study, loosely based on earlier behavioral work by Allen
and Brooks (1991), was conducted by Koenig et al.
(2005), who asked participants to classify a set of cartoon
animals differing on four stimulus dimensions (e.g., legs,
neck type). Participants in the rule condition were in-
formed of a complex rule (category membership requires
the instance to possess three of four characteristic fea-
tures for that category). In the similarity condition, par-
ticipants were not told the rule but instead asked to
make a quick decision using their first impressions about
which category a particular instance was more similar to.
Both groups were provided with trial-by-trial feedback.
Koenig et al. found that similarity-based, compared with
rule-based, categorization recruited greater activation in
bilateral temporoparietal regions as well as bilateral ante-
rior prefrontal regions (BA 10). Conversely, the rule-based
condition led to greater activation than the similarity
condition in the left frontal lobes, left inferior parietal
Journal of Cognitive Neuroscience 29:1, pp. 150–166
doi:10.1162/jocn_a_01024
lobes, and the right superior parietal lobes. One feature of
this study, however, is that it is not clear exactly what
strategy participants in the similarity condition are employ-
ing, which makes interpretation of the imaging results
more complicated. Specifically, Koenig et al. assume that
participants are using a similarity-based approach that
presumably requires the use of most, if not all, of the di-
mensions. Although this is plausible, an alternative ex-
planation is that participants in the similarity condition
are using a simpler rule-based approach, such as a single
dimension-plus-exception strategy (e.g., Ward & Scott,
1987), which could also result in the level of performance
obtained. In this latter case, participants in the similarity
condition are using fewer of the dimensions than those
in the rule condition.
In a study somewhat more closely based on the work
of Allen and Brooks (1991), Patalano, Smith, Jonides, and
Koeppe (2001) also observed activation in bilateral frontal
cortex in the rule-based condition that was not present in
the similarity condition. Occipital lobe and cerebellum
activation was prevalent in both conditions. However,
significant neural differences between the groups were
relatively restricted, even though one-tailed tests were
used. For example, the greater frontal lobe activation in
the rule than the similarity condition was only marginally
significant (p = .06).
Using a slightly different approach—the criterial attri-
bute procedure, based on earlier behavioral work by
Kemler Nelson (1984)—Tracy et al. (2003) investigated
the neural correlates of family resemblance categorization
(assumed to use the similarity system) and unidimensional
categorization (assumed to employ the rule system).
Similar to the category structure employed by Koenig et al.
(2005), a family resemblance category (e.g., Rosch &
Mervis, 1975) possessed a number of characteristic but
not defining features—an item did not have to possess
any single feature or features as long as it possessed
enough characteristic features (three of four typical features)
of that category. In contrast, a unidimensional
category was based around a single defining feature that
the authors assumed required use of the rule-based
system. Tracy et al. (2003) found greater activation in the
extrastriate cortex (BA 18 and BA 19) and the left cere-
bellum for family resemblance (similarity-based) cate-
gorization than unidimensional categorization, whereas
unidimensional categorization led to greater activation in
bilateral frontal lobes than family resemblance categorization.
However, recent behavioral model-based analysis
suggests that family resemblance categorization in the
criterial attribute procedure is often due to the use of a
single noncriterial dimension, which is a strategy not
detectable by the standard analysis employed by Tracy
et al. (for a detailed discussion, see Wills et al., 2015). This
again makes interpretation of the neural differences ob-
served more difficult.
In contrast to Tracy et al.’s (2003) conclusions, Milton,
Wills, and Hodgson (2009) proposed that both family
resemblance and unidimensional categorization are the
result of a single rule-based system, with family resem-
blance categorization requiring a more complex, multi-
dimensional rule than unidimensional categorization
(see also Wills et al., 2013). Consistent with their proposal,
Milton et al. found extensive common activation between
family resemblance and unidimensional categorization
including the dorsolateral frontal cortex and the anterior
cingulate. The most notable difference between groups
was the greater right ventrolateral frontal cortex activation
for family resemblance than unidimensional categorization,
which the authors proposed indicated the greater
working memory resources required to employ a multi-
dimensional rule.
A different approach was taken by Nomura et al.
(2007), who conducted an fMRI study based on the influ-
ential COVIS framework (Ashby et al., 1998); participants
viewed a series of Gabor patches and learned either a
rule-based task that possessed an easily verbalizable,
unidimensional rule (“thinner lines belong in category
A, thicker lines in category B”), which is assumed to
encourage use of the explicit system or an information-
integration task, which requires participants to combine
information from two unrelated stimulus dimensions.
The optimal information-integration category structure
is assumed to be difficult or impossible to verbalize, which
should encourage use of COVIS’s implicit system. In line
with COVIS’s predictions, dissociable neural activation
was found with the medial temporal lobes more activated
in rule-based compared with information-integration
learning, and the caudate body more engaged in information-integration
than rule-based learning.
Although intriguing, the category separation (i.e., the
mean distance between category items as plotted in stim-
ulus space divided by the within-category variance along
the direction of the comparison) was smaller in the rule-
based than the information-integration condition, and
the selective attention demands were greater in the
rule-based than the information-integration condition
(as only one of the two dimensions was relevant to learn
the rule-based structure whereas both were required for
the information-integration structure) meaning that non-
essential differences could have been driving the neural
dissociations. In a recent study conducted by Carpenter,
Wills, Benattayallah, and Milton (in press) when these
nonessential differences between the rule-based and
information-integration conditions were better equated
(by comparing a conjunctive rule-based structure against
a standard information-integration structure), the pattern
of results observed by Nomura et al. (2007) did not
emerge, and instead there was extensive common overlap
between the conditions. Furthermore, the information-integration
condition evoked greater activation in
the medial-temporal lobes than the rule-based condition,
which may reflect the greater memory demands in the
information-integration condition where no rule was
readily available. In another related study, albeit one
which used very different stimuli (the stimuli varied on
rectangle height and width of an ellipse), Milton and
Pothos (2011) compared activation between a rule-based
task and an information-integration-like task but found
minimal neural dissociations and instead found extensive
overlap of activation suggesting that both groups were
using similar neural processes.
Finally, Grossman et al. (2002), using a modified version
of Rips’s (1989) classic procedure, gave participants a
description of an item such as “a round object 2 inches
in diameter” and asked them to assign it to either the category
of “quarter” or “pizza.” The description is more similar in
size to a quarter than a pizza but a pizza has a variable
diameter (so could, in principle, be 2 inches) whereas a
quarter does not (so it cannot be 2 inches). Participants
who chose the quarter category were assumed to be
making a similarity judgment, whereas those who assigned
it to the pizza category were assumed to be following a rule. Grossman
et al. found that there was greater recruitment of the
left dorsolateral pFC for rule than similarity responses
whereas the right inferior parietal lobe, which they noted
is involved in overall feature configuration (Wilkinson,
Halligan, Henson, & Dolan, 2002), was activated more for
similarity- than rule-based responses. However, Nosofsky
and Johansen (2000) have demonstrated that the results
from this procedure can be accommodated by a simple,
single-process, exemplar-based learning system without
requiring qualitatively distinct systems for the different
strategies.
We investigate whether there are neurally separable
rule and similarity generalization systems from a different
angle using a procedure based on Shanks and Darby’s
(1998) Experiment 2, which has not previously been
examined using brain imaging. The design of this exper-
iment is shown in Figure 1. Participants took the role of
an allergist who had to determine whether the meals a
hypothetical patient, Mr. X, eats will cause an allergic
reaction or not. Letters in Figure 1 stand for particular
foods (e.g., pasta or eggs), + indicates that an allergic
reaction will develop, and − indicates that no allergic
reaction will occur. During training, participants learn
two complete negative patterning problems (e.g., A+,
B+, AB−) and two complete positive patterning problems
(e.g., C−, D−, CD+). Critically, however, there
are also four incomplete patterning problems. For example,
participants are trained on I+ and J+ but not on the
outcome of I and J combined, and are trained that eating KL
together leads to an allergic reaction but not on what happens
when K and L are eaten separately. During the test
phase, as well as being tested on items they studied during
training (e.g., I+, J+, and KL+), participants have to
generalize the knowledge they have obtained to what will
henceforth be referred to as the critical items (e.g., IJ, K,
and L; shown in bold in Figure 1) and are provided no
feedback on their responses.

Figure 1. The training and test trial types in the Shanks and Darby
(1998, Experiment 2) allergy prediction task; letters indicate foods
eaten by a hypothetical patient Mr. X, + = patient develops an allergic
reaction; − = patient does not develop an allergic reaction; ? = no
feedback given.
In the case of IJ, if participants are using a similarity-
based strategy then they should predict an allergic re-
action as it is similar to I and J, both of which lead to
an allergic reaction alone. Equally, when presented with
K or L alone they should predict an allergic reaction be-
cause they are similar to KL which results in an allergic
reaction. In contrast, if participants have learned the
“opposites” rule from training (single foods predict the
opposite outcome to their compounds), they can use this to
generalize to novel items. In this case, IJ should lead to no
allergic reaction, because it is the opposite outcome to I
or J when presented alone. Similarly, K or L, when presented
separately, should lead to no allergic reaction as
this is the opposite to KL, which resulted in an allergic
reaction. Shanks and Darby (1998) found that partici-
pants with high accuracy during training produced more
rule-based responses for these critical test items than
participants with lower accuracy. They explained this by
postulating that there is a transition from a similarity to a
rule-based approach, which can only be used when the
basic associations have been acquired (see also Wills,
Graham, Koh, McLaren, & Rolland, 2011).
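The two sets of predictions for the critical items can be made concrete with a small sketch. It is illustrative only: the function names and the overlap-based similarity heuristic are assumptions introduced here, not a model proposed in the article, and only the trained outcomes for I, J, and KL are taken from the text.

```python
# Trained outcomes for the incomplete patterning problems named in the text
# (True = allergic reaction). Each letter stands for one food.
TRAINED = {"I": True, "J": True, "KL": True}

def related_outcomes(item, trained):
    """Outcomes of trained meals that share at least one food with `item`."""
    return [outcome for meal, outcome in trained.items()
            if meal != item and set(meal) & set(item)]

def similarity_prediction(item, trained):
    """Assumed heuristic: predict the majority outcome of overlapping trained meals."""
    outcomes = related_outcomes(item, trained)
    return sum(outcomes) > len(outcomes) / 2

def rule_prediction(item, trained):
    """'Opposites' rule: single foods and their compound have opposite outcomes."""
    return not similarity_prediction(item, trained)

for critical_item in ("IJ", "K", "L"):
    print(critical_item,
          "similarity:", similarity_prediction(critical_item, TRAINED),
          "rule:", rule_prediction(critical_item, TRAINED))
```

For these three critical items the two strategies give opposite answers, which is what allows each participant's responses to be scored as rule-consistent or similarity-consistent.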
While using a novel procedure to compare the neural
correlates of the purported rule and similarity systems is
of value in itself, the Shanks–Darby procedure has some
particular advantages that make it well equipped to provide
new insight into this debate. First, both similarity-
and rule-based responses require utilizing the same
number of stimulus dimensions. This is in contrast to many
of the studies described above (e.g., Milton & Pothos,
2011; Milton et al., 2009; Nomura et al., 2007; Tracy et al.,
2003) where the number of stimulus dimensions utilized
in the similarity condition seems unlikely to be the same
as in the rule conditions (it may either be more, as is
commonly assumed, or sometimes fewer, depending on
how participants approach the task in the similarity
condition). Across a range of different procedures, categorizing
by a larger number of dimensions is more effortful
than categorizing by fewer dimensions (e.g., Edmunds
et al., 2015; Wills et al., 2015; Milton & Wills, 2004). It is
plausible that this could be driving the difference in neural
activation between the groups, rather than indicating the
involvement of qualitatively different systems. A further
advantage of employing the Shanks–Darby procedure is
that a full explanation of the “opposites” rule generalization
is commonly thought to be beyond published associative
accounts (see Maes et al., 2015, for a further discussion)
allowing clear inferences to be drawn. This does not appear
to be the case for the other procedures described above.
For instance, non-rule-based accounts, such as Kruschke’s
(1992) ALCOVE model, which have a mechanism for di-
mensional attention, are able to account for the purported
rule-based classification in these tasks.
In summary, then, this study uses a novel approach to
investigate the neural differences between rule-based
and similarity-based generalization. We predicted, based
on previous behavioral and comparative work with this
procedure (Maes et al., 2015; Wills et al., 2011), that we
would observe neural differences between the generali-
zation strategies. In particular, we hypothesized that
there would be greater frontal lobe activation in the
rule-based condition than the similarity-based condition
(e.g., Milton et al., 2009; Nomura & Reber, 2008; Grossman
et al., 2002; Patalano et al., 2001). Our prediction for which
regions would be implicated in similarity-based generali-
zation was more tentative given the greater heterogeneity
in previous studies but viable options a priori included the
right inferior parietal lobes (Grossman et al., 2002) and
the occipital lobes (e.g., Nomura et al., 2007; Patalano
et al., 2001)/extrastriate cortex (Tracy et al., 2003).
METHODS
Participants
Sixty-two right-handed participants were recruited from
the University of Exeter participant pool. Participants
were either volunteers, received course credits, or were
paid £7. Participants all gave informed consent accord-
ing to procedures approved by the Psychology Ethics
Committee, University of Exeter. A learning criterion was
set as significantly above chance accuracy in the second
half of training to ensure that all participants included
in the analyses had clear evidence of learning. Without
this, one could not reasonably expect true generalization
to occur. This resulted in the exclusion of 10 participants.
A further 14 participants did not show clear evidence of
either rule- or similarity-based generalization—defined as
significantly above chance (64.6%) strategy-consistent re-
sponding for the critical test trials. These participants
who did not adopt a clear strategy would likely obscure
any differences that emerged between participants who
did demonstrate clear rule or similarity generalization
so we consequently excluded them from all the principal
analyses. We do, however, consider the test phase data for
these 14 participants who used a mixture of rule and sim-
ilarity consistent responses separately. This left 38 partici-
pants in total for our principal analyses; 24 rule-based
responders and 14 similarity-based responders. This
trend for a greater proportion of rule responders was not
significant, χ²(1) = 2.632, p = .105.
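The reported chi-square value can be checked with a short sketch. It assumes a goodness-of-fit test of the 24 versus 14 split against equal expected frequencies; the article does not state the expected frequencies, so this is an inference, and the function names are hypothetical.

```python
import math

def chi_square_equal_split(counts):
    """Goodness-of-fit statistic against equal expected frequencies."""
    expected = sum(counts) / len(counts)
    return sum((count - expected) ** 2 / expected for count in counts)

def chi_square_p_one_df(statistic):
    """Upper-tail p value for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# 24 rule-based and 14 similarity-based responders, as reported in the text.
statistic = chi_square_equal_split([24, 14])
p_value = chi_square_p_one_df(statistic)
print(round(statistic, 3), round(p_value, 3))
```

With one degree of freedom the chi-square survival function reduces to the complementary error function, so no statistics library is needed.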
Stimuli
The stimuli (food names) were identical to those used in
Experiment 2 of Shanks and Darby (1998). For half of the
participants, the food names A–P (see Figure 1) were
cheese, garlic, milk, mushrooms, seafood, red meat, olive
oil, coffee, banana, eggs, orange squash, bread, avocado,
peanuts, pasta, and chocolate. For the other half, the
foods assigned to A/B were exchanged for those assigned
to C/D and likewise for E/F and G/H, I/J and K/L, and
M/N and O/P.
Procedure
Before entering the scanner, participants were asked to
take the role of a food allergist and to learn when Mr. X
would develop an allergic reaction after eating a meal
containing certain foods. Twenty-nine participants were
additionally provided with instructions outlining the rule
(e.g., “If Mr. X is allergic to a food when it is presented on
its own, he won’t be allergic to it when it is presented
together with another food. Conversely, if Mr. X is not al-
lergic to a food when it is presented on its own then he
will be if it is presented in combination with another
food”), and 33 participants were provided with instructions
designed to encourage a similarity-based approach
(“When making a response, please use your intuition as
to what you feel is the correct answer based on what
you have previously seen”). The rationale behind this
was to facilitate obtaining a sufficient number of partici-
pants who consistently categorized the critical trials by
either a similarity or a rule-based approach rather than
to look at the neural effects of differing instructions per
se. In practice, however, this instructional manipulation
had no significant impact on the strategy used (we sus-
pect, in hindsight, that it would have been more effective
if, like the training items, it had been presented inside the
scanner in the same context as where learning took place)
and will, consequently, not be discussed further.
Visual stimuli were presented on a back-projection
screen positioned at the foot end of the MRI scanner
and viewed via a mirror mounted on a head coil. Button-
press responses and RTs were measured using a fiber-
optic button box. E-Prime (Psychological Software Tools,
2002, www.pstnet.com) was used for the presentation
and timing of stimuli and collection of response data.
In the training phase, participants received six blocks
of trials, divided into two scanning runs of three blocks,
with each of the 18 training stimuli (see Figure 1) presented
twice in each block in a random order. Each trial
began with a white screen lasting a random interval between
500 and 4000 msec, before a black fixation cross
was presented in the middle of the screen for 250 msec.
A meal, food names presented in black font, was then
presented in the middle of a white screen for 3000 msec
during which time participants indicated whether it
would lead to an allergic reaction (by pressing the left
button box key) or would not lead to an allergic reaction
(by pressing the right button box key). Following this,
feedback (“Correct,” in blue or “Incorrect,” in red) was
presented for 500 msec. For participants who failed to
respond when the meal was presented, the message
“Timeout!!” (in red) appeared instead. The next trial then
immediately began. At the end of each block of 36 trials,
the average accuracy on that block was displayed on the
screen for 12 sec, before the next block began.
The test phase included all 24 stimuli, comprising the
18 training stimuli plus the six critical generalization stimuli
(shown in bold in Figure 1). The 18 training stimuli
were presented once in each block, whereas the six crit-
ical trials were each presented twice leading to 30 trials in
each of the four blocks. Stimuli were presented in a ran-
dom order. The intratrial structure was identical to the
training phase, except that after a response a 500-msec
blank screen appeared rather than feedback. If no re-
sponse was made, a time out message appeared as in
the training phase.
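The test-phase design also makes it possible to see where the 64.6% strategy-consistency criterion given under Participants may come from. The sketch below assumes a one-tailed exact binomial test against chance (.5) at α = .05 on all critical presentations, with no timeouts; the article does not name the test, so this is a plausible reconstruction rather than the authors' stated procedure, and the function names are hypothetical.

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def smallest_significant_count(n, alpha=0.05, p=0.5):
    """Smallest count whose upper-tail probability falls below alpha."""
    for k in range(n + 1):
        if upper_tail(n, k, p) < alpha:
            return k
    return None

# Six critical items, each shown twice per block, over four test blocks.
n_critical = 6 * 2 * 4
k_min = smallest_significant_count(n_critical)
print(n_critical, k_min, round(100 * k_min / n_critical, 1))
```

Under these assumptions the minimum count expressed as a percentage of critical presentations matches the criterion reported in the text.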
fMRI Data Acquisition
Images were collected using a 1.5-T Gyroscan magnet
equipped with a Sense coil (Philips, Amsterdam, The
Netherlands). A T2*-weighted echo-planar sequence was
used (repetition time = 3000 msec, echo time = 45 msec,
flip angle = 90°, 32 transverse slices, field of view =
240 mm, 3.5 × 2.5 × 2.5 mm). The training phase com-
prised two runs of 240 scans, and the test phase one run
of 260 scans. Five dummy scans were performed before
the start of each stimulus sequence. Standard volumetric
anatomical MRI was performed after functional scanning
by using a 3-D T1-weighted pulse sequence (repetition
time = 25 msec, echo time = 4.1 msec, flip angle = 30°,
160 axial slices, 1.6 × 0.9 × 0.9 mm).
Analysis of fMRI Data
Analyses were carried out using SPM8 software (www.fil.
ion.ucl.ac.uk/spm). Functional images were corrected for
acquisition order, realigned to the mean image, and
resliced to correct for motion artifacts. The realigned
images were coregistered with the structural T1 volume,
and the structural volumes were spatially normalized.
The spatial transformation was applied to the realigned
T2* volumes, which were spatially smoothed using a
Gaussian kernel of 8 mm FWHM. Data were high-pass
filtered (1/128 Hz) to account for low-frequency drifts.
The BOLD response was modeled by a canonical hemo-
dynamic response function.
All analyses were conducted using the general linear
model. In the individual participant models, the critical
trials that were consistent with their overall favored
strategy (i.e., rule- or similarity-based generalization)
were included as one regressor, whereas critical trials
inconsistent with this approach were a second regressor.
The familiar items were partitioned into correct and in-
correct responses. The duration of each event was mod-
eled as the participant’s RT for that trial (see Grinband,
Wager, Lindquist, Ferrera, & Hirsch, 2008, for the advan-
tages of using this “variable epoch” approach). Time outs
were included as a fifth regressor of no interest. The six
head movement parameters were included as additional
covariates. Strategy-consistent responses for the critical
trials were contrasted against
the implicit baseline (the intervals between the five event
types listed above; cf. Milton et al., 2009; Tracy et al.,
2003, for a similar approach), and correct familiar trials
were likewise compared with the implicit baseline. These
comparisons were then included in random-effects anal-
yses. For these analyses, participants were divided into
those who provided clear evidence of either similarity
or rule-based generalization (i.e., significantly above
chance strategy-consistent responding on the critical
generalization trials).
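The “variable epoch” idea described above, in which each event is a boxcar lasting the trial's RT and is convolved with a hemodynamic response function, can be sketched as follows. This is not SPM8 code: the double-gamma parameters (peak near 6 sec, undershoot near 16 sec, ratio 6) are a commonly used approximation assumed here, the function names and example onsets are hypothetical, and only the 3-sec repetition time is taken from the article.

```python
import math

def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=6.0):
    """Approximate canonical HRF at time t in seconds (assumed parameters)."""
    if t <= 0:
        return 0.0
    def gamma_pdf(x, shape):
        return x ** (shape - 1) * math.exp(-x) / math.gamma(shape)
    return gamma_pdf(t, peak) - gamma_pdf(t, undershoot) / ratio

def variable_epoch_regressor(onsets, durations, n_scans, tr=3.0, dt=0.1):
    """Boxcars of RT length convolved with the HRF, sampled once per scan."""
    n_fine = int(round(n_scans * tr / dt))
    boxcar = [0.0] * n_fine
    for onset, duration in zip(onsets, durations):
        start = int(round(onset / dt))
        stop = int(round((onset + duration) / dt))
        for i in range(max(start, 0), min(stop, n_fine)):
            boxcar[i] = 1.0
    # HRF kernel on the fine time grid, covering 32 seconds.
    kernel = [double_gamma_hrf(j * dt) * dt for j in range(int(round(32 / dt)))]
    step = int(round(tr / dt))
    regressor = []
    for scan in range(n_scans):
        idx = scan * step
        value = 0.0
        for j, weight in enumerate(kernel):
            if idx - j < 0:
                break
            value += boxcar[idx - j] * weight
        regressor.append(value)
    return regressor

# Hypothetical trial: onset at 10 sec, with a short and a long RT.
short_rt = variable_epoch_regressor([10.0], [1.2], n_scans=20)
long_rt = variable_epoch_regressor([10.0], [2.4], n_scans=20)
print(max(short_rt) < max(long_rt))
```

Because the boxcar length follows the RT, slower trials predict a larger response, which is the property that distinguishes this approach from modeling every trial as an event of fixed duration.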
Whole-brain analyses were completed using a com-
bined statistical threshold of p < .001 (uncorrected)
and a threshold of 100 contiguous voxels, which together
produce an overall corrected threshold of p < .05. These
values were estimated using 3dClustSim as implemented
in the AFNI toolbox (afni.nimh.nih.gov/afni/). For this, we
used a smoothness estimate of 10.1 × 10.1 × 9.6 mm
(this was a group level estimate calculated in SPM8 using
the group residuals from the general linear model, e.g.,
Kiebel, Poline, Friston, Holmes, & Worsley, 1999). In
addition, to measure common activation between rule-
based and similarity-based participants, conjunction anal-
yses were performed. To do this, the relevant contrasts
were combined using a logical “and” function through
the minimum statistic to the conjunction null hypothesis
(MS/CN; Nichols, Brett, Andersson, Wager, & Poline,
2005) technique implemented in SPM8. Both contrasts
were again conducted with a combined threshold of
p < .001 (uncorrected) and a cluster threshold of 100
contiguous voxels. Note that this analysis is conservative
because it reveals only those regions significantly activated
for both the rule (p < .05, corrected) and the similarity
(p < .05, corrected) conditions.
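The logic of a minimum-statistic conjunction can be illustrated with a toy sketch. The t values and the threshold of 3.3 below are invented for illustration, the function name is hypothetical, and the cluster-extent step applied in the actual analysis is omitted.

```python
def conjunction_minimum_statistic(t_map_a, t_map_b, t_threshold):
    """Flag voxels whose smaller t value across the two maps exceeds the threshold."""
    minimum = [min(a, b) for a, b in zip(t_map_a, t_map_b)]
    return [value > t_threshold for value in minimum]

# Toy t values for four voxels in each group-level contrast (illustrative only).
rule_t = [4.1, 5.0, 2.0, 1.0]
similarity_t = [3.9, 2.5, 4.4, 0.5]
common = conjunction_minimum_statistic(rule_t, similarity_t, t_threshold=3.3)
print(common)
```

Only the first voxel survives, because it is the only one where both contrasts exceed the threshold. This is why the test is conservative: a strong effect in one group cannot compensate for a weak effect in the other.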
After performing the whole-brain analyses, we decided
to conduct more exploratory ROI analyses (using the
WFU Pickatlas, e.g., Maldjian, Laurienti, Burdette, & Kraft,
2003) when directly comparing rule and similarity gen-
eralization. These post hoc ROI analyses were based on
our a priori predictions of regions we thought would be
differentially involved between strategies and comprised
the pFC (e.g., Milton & Pothos, 2011; Milton et al., 2009),
the occipital lobes/extrastriate cortex (BA 18 and BA 19;
e.g., Nomura et al., 2007; Tracy et al., 2003), and the right
inferior parietal lobes (Grossman et al., 2002). Although
these exploratory analyses should accordingly be taken
with some caution, we believe that they help to charac-
terize better the nature of our results, which is particu-
larly important given that this is the first imaging study
mance for the familiar items (i.e., those seen during the
training phase) across blocks is displayed in Figure 2A.
The average accuracy (collapsed across blocks) for both
the rule (M = .923, SD = .083, t(23) = 25.111, p < .001)
and similarity (M = .763, SD = .129; t(13) = 7.616, p <
.001) groups was significantly above chance, although as
in the training phase, rule-based participants had higher
accuracy than similarity-based participants, t(19.3) =
4.151, p < .001. Median RTs were nonsignificantly longer
in the similarity group (1208 msec) than the rule group
(1060 msec), t(18.4) = 1.78, p = .09.
Of particular importance, given our interest in general-
ization strategies, both the rule and similarity groups
used their preferred strategy significantly above chance
levels (rule group, M = .862, SD = .106, t(24) = 16.691,
p < .001; similarity group, M = .825, SD = .095, t(13) =
12.868, p < .001) for the critical test items (see Fig-
ure 2B), and there was no significant difference in strategy-
consistent responding between groups, t(29.3) = 1.074,
p = .292, with substantial evidence for the null, BF (Bayes
factor) = 0.25.2 Median RTs were nonsignificantly shorter
in the similarity group (1119 msec) than the rule group
(1294 msec), t(22.3) = 1.86, p = .08.
Although participants were classified on the basis of
being either rule-consistent or similarity-consistent col-
lapsed across all critical items, it does not necessarily follow
that both the critical compound and element stimuli show
this pattern. We therefore consider generalization to com-
pound and element stimuli separately as in past work (e.g.,
Wills et al., 2011; Shanks & Darby, 1998). The mean prob-
ability of predicting an allergic reaction to the critical com-
pound stimuli (i.e., IJ and MN) is shown in Figure 3A. As
expected, there was a significant interaction between strat-
egy used and stimulus type, F(1, 36) = 382.02, p < .001. No
other main effects or interactions were significant ( ps >
.25). The rule group showed rule-consistent generalization
to compounds, T(23) = 25.87, P < .001, whereas the
Figure 2. (A) Accuracy of rule-based and similarity-based responders
for familiar items during the test phase. (B) Proportion of critical
trials across blocks which are consistent with the strategy participants
were assigned to.
of the Shanks–Darby procedure. For these analyses, we
used thresholds of p < .001 and 64 contiguous voxels,
which together produce an overall corrected threshold
of p < .05, as estimated by 3dClustSim. Normalized
MNI space coordinates were transformed to Talairach
space (imaging.mrccbu.cam.ac.uk/imaging/MniTalairach)
to establish activation sites as per the atlas of Talairach
and Tournoux (1988).1
RESULTS
Behavioral Analyses
Training Phase
The proportions of timeouts were low in both the rule
(M = .015, SD = .021) and similarity (M = .020, SD =
.018) groups, and there was no significant difference be-
tween them, t(31.1) = .829, p = .413. One-sample t tests
revealed that the average performance in the second half
of training (Blocks 4–6) was significantly above chance
for both the rule-based (M = .872, SD = .092; t(23) =
19.647, p < .001) and the similarity-based (M = .739,
SD = .088; t(13) = 10.217, p < .001) groups, although,
as in Shanks and Darby (1998), the rule group had higher
accuracy than the similarity group, t(28.6) = 4.409, p <
.001. Median RTs were longer in the similarity group
(1291 msec) than in the rule group (1010 msec), t(21.8) =
3.874, p < .001.
Test Phase
The proportions of timeouts were again low (rule group:
M = .012, SD = .016; similarity group: M = .026, SD =
.031), and there was no significant difference between
conditions, t(17.2) = 1.608, p = .126. Average perfor-
Figure 3. (A) Mean probability of predicting an allergic reaction for
the critical compound stimuli. (B) Mean probability of predicting an
allergic reaction for the critical element stimuli. Also shown are
difference-adjusted 95% confidence intervals for the between-subject
effects (Baguley, 2012).
Milton et al.
155
similarity group showed similarity-consistent generali-
zation, t(13) = 8.21, p < .001. Median RTs were non-
significantly shorter in the similarity group (1313 msec)
than the rule group (1457 msec), F(1, 36) = 1.29, p =
.26. No other effects were significant ( ps > .12).
Figure 3B shows the probability of predicting an aller-
gic reaction to the critical element stimuli (i.e., K/L and
O/P). There was again a significant interaction between
strategy used and stimulus type, F(1, 36) = 201.80, p <
.001. No other main effects or interactions were signifi-
cant ( ps > .4). The rule group showed rule-consistent
generalization to elements, t(23) = 11.75, p < .001,
whereas the similarity group showed similarity-consistent
generalization, t(13) = 9.43, p < .001. Median RTs were
nonsignificantly shorter in the similarity group (1123
msec) than the rule group (1289 msec), F(1, 36) =
3.96, p = .054. No other main effects or interactions were
significant ( ps > .06).
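The fractional degrees of freedom in the between-group comparisons above (e.g., t(28.6) for training accuracy) are what an unequal-variance (Welch) t test produces. As a rough check, the sketch below recomputes such a test from the reported summary statistics. The group sizes of 24 and 14 are an assumption inferred from the one-sample degrees of freedom, the function name `welch_from_summary` is hypothetical, and because the published means and SDs are rounded, the output only approximates the reported values.

```python
import math


def welch_from_summary(m1, sd1, n1, m2, sd2, n2):
    """Welch's unequal-variance t and Welch-Satterthwaite df from summary stats."""
    v1 = sd1 ** 2 / n1  # squared standard error of the mean, group 1
    v2 = sd2 ** 2 / n2  # squared standard error of the mean, group 2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Training accuracy in Blocks 4-6: rule group vs. similarity group.
# n = 24 and n = 14 are assumed from the reported t(23) and t(13).
t, df = welch_from_summary(0.872, 0.092, 24, 0.739, 0.088, 14)
print(f"t({df:.1f}) = {t:.2f}")
```

The same function can be applied to the other accuracy comparisons, with the caveat that medians (as reported for RTs) cannot be checked this way.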
Imaging Analyses
Training
A contrast comparing all trials and groups against the
implicit baseline revealed extensive activation including
diverse regions of the bilateral pFC, bilateral parietal
lobes, and bilateral occipital lobes (see Figure 4A). We
then compared performance on early training (Blocks
1–3) to late training (Blocks 4–6) for all trials and partic-
ipants. A large cluster including the caudate head and
body, which have been extensively linked to category
learning (e.g., Seger, 2008), and the thalamus, was acti-
vated more early in training than later in training (peak
voxel: x = 16, y = −32, z = 16). Conversely, the right
inferior frontal gyrus (peak voxel: x = 24, y = 25, z =
−5) and the anterior cingulate/medial prefrontal gyrus
(peak voxel: x = 8, y = 30, z = 22) were activated more
late in training than early in training.
Figure 4. (A) Brain regions activated across all trials and all participants during training. (B) Regions activated by correct responses by the rule
group during training. (C) Regions activated by correct responses in the similarity group during training. (D) Common brain regions activated by
correct responses in the similarity and rule groups. The coordinates indicate the origin for the image displayed. Lighter colors indicate higher
z scores.
Table 1. Regions Commonly Activated by Rule-based and Similarity-based Generalization in the Training Phase
                                                    Talairach Coordinates
Region                            Cluster Size    BA      x      y      z   z Score
Left anterior cingulate                   1176    32     −6     18     40      6.59
  Right anterior cingulate                        32      6     16     40      5.98
  Left medial frontal gyrus                       32     −8     12     45      5.89
Left middle frontal gyrus                 5721     9    −50      7     29      6.40
  Left precuneus                                  19    −28    −67     29      6.36
  Left inferior parietal lobe                     40    −32    −50     43      6.34
Right inferior occipital gyrus            3147    18     38    −82     −4      5.83
  Right occipital lobe                            19     28    −74     −5      5.71
  Right middle occipital gyrus                    18     36    −84      2      5.66
Left insula                                185    13    −30     20      1      5.65
  Left inferior frontal gyrus                     47    −30     25     −5      4.90
Right inferior frontal gyrus               196    45     32     24      4      5.34
Right superior parietal lobe               406     7     32    −52     45      4.68
  Right precuneus                                  7     18    −62     49      4.19
  Right superior parietal lobe                     7     32    −58     40      4.19
Right precentral gyrus                     354     4     48    −11     48      4.66
  Right precentral gyrus                           6     34    −14     62      4.18
  Right precentral gyrus                           6     40     −7     52      3.87
All activations significant at p < .001. Indented rows indicate voxels in the same cluster as the nonindented row above them. BA = Brodmann’s area.
For the rule group, comparing correct responses against
the baseline revealed extensive activation including in the
pFC, parietal lobes, and the occipital lobes (Figure 4B).
Similar regions were recruited by the similarity group
(Figure 4C). This extensive common overlap of activa-
tion was confirmed by a conjunction analysis (Figure 4D;
Table 1). When directly comparing the groups, no regions
were more activated by the similarity group than the rule
group; however, the bilateral posterior cingulate/pre-
cuneus (cluster size: 166 voxels) was engaged more in
the rule group than the similarity group (Figure 5A). An
exploratory analysis of this rule–similarity contrast, with
Figure 5. (A) Brain regions
more activated by the rule
group than the similarity group
across all blocks of training.
(B) Brain regions more activated
by rule-based responders than
similarity responders across
the second half of training.
The coordinates indicate the
origin for the image displayed.
Lighter colors indicate higher
z scores.
more liberal thresholds ( p < .001, 25 contiguous voxels)
and which should consequently be taken with caution,
revealed two clusters in right middle frontal gyrus (first
cluster, peak voxel x = 32, y = 24, z = 15, cluster size:
41 voxels; second cluster, peak voxel, x = 28, y = 23, z =
39, cluster size: 50 voxels).
No differences emerged between groups when consid-
ering just the first half of training. When looking at the
second half of training alone, there were again no areas
more activated by the similarity group than the rule-
based group. However, the bilateral posterior cingulate/
precuneus was again activated more by the rule group than
the similarity group, and the right anterior cingulate/medial
frontal gyrus was also engaged (see Figure 5B).
Test
Critical trials (generalization). A number of brain re-
gions were activated by similarity responders including
bilateral inferior and superior parietal lobes, right middle
occipital gyrus, and left medial frontal gyrus (Figure 6A).
Rule-based responders engaged the left superior parietal
lobes, bilateral inferior parietal lobes, bilateral middle
frontal gyrus, left medial frontal gyrus, right inferior frontal
gyrus, and bilateral occipital lobes (Figure 6B). A conjunc-
tion analysis (see Figure 6C; Table 2) revealed extensive
common overlap of activation between the similarity-
and rule-based participants, which included the left supe-
rior parietal lobes, bilateral inferior parietal lobes, bilateral
medial frontal gyrus, left middle frontal gyrus, and the
bilateral occipital gyrus.
Next, we directly contrasted brain activation between
the similarity and rule groups. In contrast to the exten-
sive common activation, no differences were identified
in whole-brain analyses. However, in the exploratory
ROI analyses (comprising the pFC, right inferior parietal
lobes, and bilateral occipital lobes, with thresholds of
p < .001 and 64 contiguous voxels), we found that the
right middle frontal gyrus (see Figure 7A; BA 9) was acti-
vated more for the rule group than the similarity group
(in the same region as identified by the exploratory anal-
ysis documented in the rule–similarity comparison for the
training phase). In contrast, we observed greater activa-
tion in the anterior medial frontal lobes (BA 10) and the
right inferior parietal lobes (BA 40) for the similarity group
compared with the rule-based group (Figure 7B).
Element versus compound critical stimuli. As a sup-
plementary question, we assessed whether there were
activation differences in the element (i.e., K/L and O/P)
and compound (i.e., IJ and MN) critical trials. As before,
only trials that were consistent with the preferred strat-
egy of the participants (i.e., rule- or similarity-based)
were included. For the rule group, there was greater
activation in the occipital lobes/cerebellum and the left
caudate body for the compound stimuli than for the
Figure 6. (A) Brain regions significantly activated by similarity responders during the critical trials. (B) Brain regions significantly activated by
rule-based responders during the critical trials. (C) Common brain regions activated by similarity- and rule-based responders during the critical trials.
The coordinates indicate the origin for the image displayed. Lighter colors indicate higher z scores.
Table 2. Regions Commonly Activated by Rule-based and Similarity-based Generalization in the Critical Generalization Trials
                                                    Talairach Coordinates
Region                            Cluster Size    BA      x      y      z   z Score
Right superior parietal lobe               566     7     32    −58     51      5.40
  Right precuneus                                 19     30    −68     29      4.77
  Right inferior parietal lobe                    40     34    −50     41      4.36
Left superior parietal lobe               1040     7    −26    −62     44      5.09
  Left precuneus                                   7    −26    −67     27      5.05
  Left parietal lobe                              39    −28    −62     36      4.97
Left middle occipital gyrus                556    18    −26    −91     10      4.82
  Left middle occipital gyrus                     18    −26    −84     −4      4.74
  Left inferior occipital gyrus                   19    −38    −76     −5      4.23
Right middle occipital gyrus               425    18     24    −89     10      4.44
  Right middle occipital gyrus                    18     16    −94     14      4.21
  Right occipital lobe                            17     22    −88     −2      3.97
Left medial frontal gyrus                  329     6     −2     16     45      4.40
  Right medial frontal gyrus                       6      8     16     42      3.94
Left middle frontal gyrus                  104     9    −50      4     33      3.63
  Left precentral gyrus                            6    −38      0     33      3.46
All activations significant at p < .001. Indented rows indicate voxels in the same cluster as the nonindented row above them. BA = Brodmann’s area.
element stimuli (see Figure 8A). In contrast, no regions
were more activated for the element stimuli than the
compound stimuli. For the similarity group, no areas
were more active for the compound stimuli than the
element stimuli, although the occipital lobes were, as
for the rule-based group, activated at lower thresholds
(this could reflect, in part, the smaller sample size of
the similarity group compared with the rule group).
However, the left precentral/postcentral gyrus was acti-
vated more for the element stimuli than the compound
stimuli (Figure 8B). When comparing the rule and similar-
ity groups, no significant differences emerged.
Familiar items. Although not our primary focus, we
also examined the brain activation for the familiar items
to supplement the critical generalization trials analyses.
Participants in the similarity-based group activated a
diverse set of regions including bilateral occipital gyrus,
left inferior parietal lobes, and bilateral middle frontal
gyrus (Figure 9A). Rule-based generalizers recruited the
Figure 7. (A) Brain regions
more activated by the
rule-based responders than
similarity responders for
the critical trials. (B) Brain
regions more activated by
similarity responders than
rule-based responders for the
critical trials. The coordinates
indicate the origin for the
image displayed. Lighter colors
indicate higher z scores.
Figure 8. (A) Regions more
activated by compound items
than element items during
the critical generalization trials
for the rule-based group.
(B) Regions more activated
by element items than
compound items during
the critical generalization
trials for the similarity group.
bilateral occipital lobes, the left superior parietal lobes,
and the left inferior and middle frontal gyrus (Figure 9B).
A conjunction analysis indicated common activation be-
tween the similarity- and rule-based responders in the bi-
lateral occipital lobes, the left superior parietal lobes, and
left postcentral gyrus (see Figure 9C; Table 3). This pattern
is, broadly speaking, similar to what we observed in the
training phase with the same stimuli.
No differences emerged between the rule and similar-
ity groups in either whole-brain or our exploratory ROI
analyses. This is perhaps not that surprising as these anal-
yses are considerably less sensitive than the analogous
critical trials analyses given that they do not directly mea-
sure generalization and one cannot determine at the in-
dividual trial level whether a response is rule or similarity
consistent. As a further exploratory analysis, we used the
Figure 9. (A) Brain regions engaged by the similarity responders for the familiar items during test. (B) Brain regions engaged by the rule-based
responders for the familiar items during test. (C) Brain regions commonly engaged by the similarity- and rule-based responders for the familiar
items during test. The coordinates indicate the origin for the image displayed. Lighter colors indicate higher z scores.
Table 3. Regions Commonly Activated by the Rule and Similarity Groups during Test for the Familiar Items
                                                    Talairach Coordinates
Region                            Cluster Size    BA      x      y      z   z Score
Right superior parietal lobe               154     7     30    −49     39      5.37
  Right parietal lobe                             39     30    −56     38      4.70
Left middle occipital gyrus                565    18    −36    −86     −2      5.17
  Left middle occipital gyrus                     18    −28    −80     −6      5.08
  Left cuneus                                     17    −14    −91      8      5.01
Left superior parietal lobe                858     7    −30    −56     40      5.07
  Left precuneus                                  19    −28    −68     38      4.88
  Left superior parietal lobe                      7    −30    −62     49      4.81
Right middle occipital gyrus               564    18     14    −91     14      4.47
  Right middle occipital gyrus                    18     38    −82      1      4.44
  Right middle occipital gyrus                    18     30    −87      4      4.33
Left anterior cingulate                    244    32     −8     17     38      4.45
  Right superior frontal gyrus                     8      2     18     47      3.86
  Left superior frontal gyrus                      6     −6      8     49      3.54
All activations significant at p < .001. Indented rows indicate voxels in the same cluster as the nonindented row above them. BA = Brodmann’s area.
WFU Pickatlas (Maldjian et al., 2003) to construct a mask
containing all regions identified in the similarity–rule
analysis of the critical items previously reported (Fig-
ure 7B) and identified whether there was any activation
in these areas for the familiar items at thresholds of p <
.005 (uncorrected) and 10 contiguous voxels. This analy-
sis revealed activation in the right inferior parietal lobes
(peak voxel x = 59, y = −37, z = 31, cluster size: 37 vox-
els). We also conducted the same type of analysis for the
right middle frontal gyrus region (Figure 7A) that was
more activated in the rule than the similarity condition
for the critical generalization trials but did not detect
any activation here.
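The masked, cluster-thresholded search described above can be illustrated with a small sketch. It is not the SPM8 or WFU Pickatlas procedure: the dictionary-based volume representation, the function name `clusters_in_mask`, and the use of face-sharing (6-connectivity) neighbors are all simplifying assumptions made for illustration. The defaults mirror the thresholds reported for this analysis (p < .005 uncorrected, 10 contiguous voxels).

```python
from collections import deque
from statistics import NormalDist


def clusters_in_mask(zmap, mask, p_voxel=0.005, min_extent=10):
    """Return clusters (lists of (x, y, z) voxels) that survive thresholding.

    zmap maps voxel index -> z value; mask maps voxel index -> bool.
    Voxels are neighbors when they share a face (6-connectivity, an
    assumption; other packages use different connectivity rules).
    """
    z_crit = NormalDist().inv_cdf(1.0 - p_voxel)  # one-tailed voxel threshold
    supra = {v for v, z in zmap.items() if z > z_crit and mask.get(v, False)}
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    clusters, seen = [], set()
    for seed in sorted(supra):
        if seed in seen:
            continue
        seen.add(seed)
        queue, members = deque([seed]), []
        while queue:  # breadth-first flood fill from the seed voxel
            x, y, z = queue.popleft()
            members.append((x, y, z))
            for dx, dy, dz in steps:
                nb = (x + dx, y + dy, z + dz)
                if nb in supra and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if len(members) >= min_extent:  # cluster-extent threshold
            clusters.append(members)
    return clusters


# Toy volume: 8 x 4 x 4 voxels, with a hypothetical ROI covering x < 4.
grid = [(x, y, z) for x in range(8) for y in range(4) for z in range(4)]
zmap = {v: 0.0 for v in grid}
mask = {v: v[0] < 4 for v in grid}
for x in range(3):            # 3 * 2 * 2 = 12 active voxels inside the ROI
    for y in range(2):
        for z in range(2):
            zmap[(x, y, z)] = 3.5
zmap[(0, 3, 3)] = zmap[(1, 3, 3)] = 4.0   # 2-voxel blob inside the ROI
for x in range(5, 8):         # 12 active voxels outside the ROI
    for y in range(2):
        for z in range(2):
            zmap[(x, y, z)] = 3.5
found = clusters_in_mask(zmap, mask)
print([len(c) for c in found])
```

In the toy volume, the small blob fails the extent criterion and the block outside the mask is never considered, which is the sense in which a mask restricts the search to regions chosen in advance.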
Participants with a mixture of rule- and similarity-
consistent responses. Fourteen participants met the
learning criterion for training but showed no clear pat-
tern of similarity- or rule-based responding for the critical
generalization trials and were consequently excluded from
the analyses above. Nevertheless, given that these par-
ticipants had a mixture of both types of strategy (rule-
consistent, M = .501; similarity-consistent, M = .499), they
Figure 10. (A) Regions
activated by similarity-consistent
responses for the group who
displayed a mixture of both
strategies. (B) Regions activated
by rule-consistent responses for
the group who displayed a
mixture of both strategies.
provide an opportunity to look at the neural correlates of
rule- and similarity-based responding within participants.
Similarity-consistent responses evoked activation in
the bilateral anterior cingulate/medial frontal gyrus, the
left superior parietal lobe, and the bilateral occipital lobes
(Figure 10A). Rule-based responding also activated bilat-
eral occipital lobes (see Figure 10B). We did not, how-
ever, detect any significant activation elsewhere although
exploratory analyses with more liberal thresholds ( p <
.001 and 25 contiguous voxels) revealed a similar pattern
of activation to what we observed with the similarity re-
sponders. We suspect the reduced activation here com-
pared with the corresponding analysis for the consistent
rule-based sorters reflects the lower number of partici-
pants and trials in the current analysis. We found no indi-
cation of any differences between strategies in either
whole-brain or exploratory ROI analyses.
DISCUSSION
This study used a negative and positive patterning design
originally developed by Shanks and Darby (1998; see also
Wills et al., 2011) to compare the brain activation of rule
and similarity generalization. Participants were divided
into either rule-based or similarity-based generalizers
according to their responses during the critical items at
test. Importantly, participants in both groups were highly
consistent and did not differ in their ability to follow
their preferred strategy. There was extensive overlap of
activation between the rule- and similarity-based groups,
but at the same time there were regions that were dif-
ferentially activated by the two strategies. We discuss the
most notable aspects of our findings below.
In the training phase, there was extensive overlap be-
tween groups including diverse regions of the pFC, the
parietal lobes, and the occipital lobes. Differences be-
tween strategies, in contrast, were more restricted—no
regions were more active in the similarity group than the
rule group, although the posterior cingulate/precuneus
and the anterior cingulate/medial prefrontal cortex (in the
second half of training) were more active in the rule group
than the similarity group. The greater posterior cingulate/
precuneus activation is perhaps somewhat surprising,
although these regions have been implicated in rule-
based category learning (Milton & Pothos, 2011). The
anterior cingulate activation is in line with the key role this
region is thought to play in rule selection in COVIS’s rule-
based system (Ashby et al., 1998).
In the test phase, there was again considerable com-
mon overlap in the regions activated by the rule and
similarity groups for both the critical generalization trials
and the familiar trials. As before, areas activated included
regions of the pFC (including the middle frontal gyrus),
bilateral parietal lobes, and bilateral occipital lobes. These
regions have all previously been implicated in categori-
zation tasks (e.g., Carpenter et al., in press; Milton et al.,
2009). For example, the bilateral parietal lobes have been
heavily implicated in both explicit and implicit classifi-
cation of dot patterns (Aizenstein et al., 2000) and stimu-
lus generalization (Seger, Braunlich, Wehe, & Liu, 2015).
Furthermore, our results add to the growing body of
evidence that diverse regions of the pFC are involved in
categorization (e.g., Milton & Pothos, 2011; Koenig et al.,
2005; Grossman et al., 2002; Seger & Cincotta, 2002). The
common activation shared by the rule and similarity
groups across both the category learning and generali-
zation components of this task is consistent with the
idea that both strategies share a number of common, inter-
related processes, such as stimulus processing, response
selection, stimulus–response mappings, feedback pro-
cessing (whether it is an external signal as in training or
more internally generated as is likely in test), uncertainty,
and attentional and working memory demands to name
just a few likely candidates.
Although no differences were identified between gen-
eralization strategies in whole-brain analyses, exploratory
ROI analyses provided evidence for dissociable activa-
tion between the rule and similarity groups. As predicted,
the rule-based generalizers activated the right middle
frontal cortex to a greater extent than the similarity-based
generalizers. This was in keeping with the trend from the
training phase for there to be greater activation in the
right pFC in the rule group than the similarity group. In
contrast, the similarity-based generalizers preferentially
recruited the anterior medial prefrontal lobe and the right
inferior parietal lobes.
The greater right middle frontal gyrus activation in
rule-based generalizers than similarity-based generalizers
is consistent with a broad range of previous work indi-
cating that the middle frontal lobes are a critical site for rule-
based categorization (e.g., Milton & Pothos, 2011; Seger
& Cincotta, 2002, 2005; Tracy et al., 2003; Grossman
et al., 2002; Patalano et al., 2001) and, more generally,
its role is well established in working memory process-
ing (Owen, 2000). This pattern of findings is, therefore,
in line with the idea that the “opposites” rule is the result of
deliberative, rule-based processes, rather than being driven
by a non-rule-based, associatively mediated system (Maes
et al., 2015).
Turning to the regions preferentially linked to simi-
larity responding, the right inferior parietal lobes were
identified by Grossman et al. (2002) as being linked to
similarity judgments, and they suggested that it had a
role in overall feature configuration processing. Alter-
natively, it could reflect recollection-based memory pro-
cesses, which are often observed in this region (e.g.,
Milton, Muhlert, Butler, Benattayallah, & Zeman, 2011;
Wheeler & Buckner, 2004). This would be consistent with
the idea that similarity processing places particular de-
mands on the retrieval of past, related instances. We found
that the anterior medial prefrontal lobe was activated
more for the similarity than the rule group, in a strikingly
similar location to that observed by Koenig et al. (2005) in
their analogous comparison. Our explanation for this result
is similar to Koenig et al.’s—the activation in this region
may reflect the greater dependence on retrieving specific
exemplars from long-term memory in the similarity-based
condition rather than having generalization supported by
an abstract rule.
Of course, there is always a danger in linking specific
brain regions to a particular function and in knowing
whether the regions identified are essential for the strat-
egy used. For example, the fact that both the similarity
and rule groups activated different parts of the pFC more
than the other strategy appears to challenge any clear-cut
narrative that rule generalization requires higher-order,
deliberative processes whereas similarity generalization
requires more automatic, nondeliberative processes. One
intriguing future approach may be to further explore the
regions where differences emerged (i.e., the right middle
frontal gyrus for the rule generalizers and the anterior
medial frontal lobe and right inferior parietal lobes for
the similarity generalizers) using TMS. For example,
according to our results, one might expect that stimulating
the right middle frontal gyrus would disrupt rule generali-
zation but leave similarity generalization intact, whereas
stimulating the anterior medial frontal lobe (or right inferior
parietal lobes) might impair similarity generalization but
not rule generalization. Furthermore, if the greater activa-
tion in the anterior medial pFC reflects a greater reliance
on retrieving past instances in similarity generalization than
rule generalization, then one might expect that stimulating
this region would disrupt performance in a task using sim-
ilar stimuli which more directly assesses memory capacity.
Regardless of the precise role that these regions may
play, using the Shanks–Darby procedure to examine the
neural correlates of rule and similarity generalization makes
a valuable new contribution to the area because it over-
comes some problems that mar the paradigms used in
previous related imaging studies. In particular, many stud-
ies in this area confound rule-based and similarity-based
learning with single versus multidimensional learning
(e.g., Milton & Pothos, 2011; Nomura et al., 2007; Tracy
et al., 2003). Furthermore, all previous studies investigat-
ing this issue index rule-based learning through behavior
that can also be produced by a simple associative mech-
anism that incorporates some process of selective atten-
tion (e.g., ALCOVE; Kruschke, 1992). In contrast, for the
Shanks–Darby procedure, there is no simple associative
model that can explain both the similarity- and rule-based
generalization observed and the same number of dimen-
sions is relevant in both rule- and similarity-based learning.
There are other notable differences between the
Shanks–Darby procedure and the other tasks previously
used (e.g., Nomura et al., 2007; Koenig et al., 2005; Tracy
et al., 2003; Grossman et al., 2002), which may have im-
pacted the results obtained. For instance, the Shanks–
Darby task enables one to partition items into generaliza-
tion trials (items not encountered during training) and
familiar trials (which had previously been learned during
training). In contrast, other studies have either used cat-
egory learning (e.g., Carpenter et al., in press; Nomura
et al., 2007) or category decision-making (e.g., Milton
et al., 2009; Grossman et al., 2002) procedures where
there is no generalization phase or used imaging analyses
that combine old training items with generalization trials
(e.g., Koenig et al., 2005). Given that, in the test phase,
we only observed evidence for differences in the gener-
alization trials and not in the familiar trials, this could be
an important distinction.
A second important difference is that previous studies
typically use multidimensional stimuli that possess either
binary (e.g., Milton et al., 2009; Tracy et al., 2003) or con-
tinuous values (e.g., Carpenter et al., in press; Nomura
et al., 2007) on a particular dimension. In contrast, our
stimuli have discrete components (e.g., A, B, AB, etc.) that
are either present or absent. This distinction has not been
systematically investigated but could potentially have an
important impact on the pattern of results obtained.
Clearly, in the future, it would be of value to further ex-
plore rule and similarity learning under a more diverse
range of conditions to build up a broader understanding
of how the two types of strategies relate to each other.
Although the Shanks–Darby procedure appears to have
a number of strengths, one potential limitation is that,
although the rule and similarity groups were well matched
in their consistency of applying their preferred strategy
for the critical generalization trials, the rule group signif-
icantly outperformed the similarity group for the familiar
training trials (this is the same pattern that Shanks &
Darby, 1998, found). It is worth noting, though, that
the familiar trials were analyzed separately from the
critical generalization trials where the neural differences
emerged, which should attenuate any influence this dif-
ference would have on our results. Furthermore, activation
differences between groups for the familiar items (where only
correct responses were considered) were extremely re-
stricted, which suggests that the performance difference
between groups had little impact on the pattern of activa-
tion. Nevertheless, in future work with the Shanks–Darby
procedure, it would be desirable to have the groups
better matched on performance for the familiar trials.
One way of doing this could be to introduce a learning
criterion during the training phase, which has been shown
to better equate groups on the familiar test items in this
procedure (Wills et al., 2011).
Another notable aspect of our results is that the key
comparison between rule and similarity generalization
was between participants. While we looked at this from
a within-subject perspective as well by considering those
participants who displayed a mixture of strategies, these
analyses were not particularly revealing. We suspect that
this was due to these participants randomly responding
on the critical trials and/or that there were insufficient
trials of each type to reliably detect any differences. In
the future, one could potentially further increase the
number of critical generalization trials in the procedure
and/or use a combination of more effective instructions
than we used with training outside the scanner to ensure
that participants produce a good mixture of rule and
similarity generalization responses.
Another limitation of this study is that, although the
differences between generalization strategies were con-
sistent with our a priori predictions and in line with past
related work, it must be acknowledged that these differ-
ences were not identifiable in whole-brain analyses and
could only be detected in more exploratory, post hoc
analyses. Clearly, it would be valuable for future work
to try to replicate our basic pattern of results.
Nevertheless, our finding of differences in activation
between rule and similarity generalization (albeit only
in exploratory analyses) is in line with recent behavioral
and comparative studies that have been conducted with
the Shanks–Darby procedure. Specifically, Wills et al.
(2011) found that participants who completed the train-
ing session under a concurrent load went on to produce
significantly less rule-based generalization than participants
who undertook training under no load. Wills et al. sug-
gested that this is consistent with the idea that discovering
the “opposites” rule requires considerable working mem-
ory capacity and that, if this capacity is not available, participants
will fail to transition from using a similarity approach to a
rule-based approach. Related to this, Maes et al. (2015)
found that, although humans (under no concurrent load)
could readily discover the opposites rule, both pigeons
and rats were unable to do so and relied on similarity
generalization. This is consistent with the idea that pigeons
and rats may be forced to rely on the similarity system
whereas humans also have access to a rule-based system.
Our findings complement these two recent behavioral
studies by identifying specific neural correlates that are
associated with rule and similarity-based generalization.
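As a rough illustration of how the two strategies diverge on the critical generalization trials, the sketch below contrasts an "opposites" rule with a simple similarity-based guess for a novel compound built from trained elements. The stimulus labels (I, J, M, N), the boolean outcome coding, and all function names are illustrative assumptions, not the materials or analysis code used in this study.

```python
# Hypothetical training history: True means the element was paired with the outcome.
trained = {"I": True, "J": True, "M": False, "N": False}


def similarity_prediction(trained, compound):
    """Similarity-based guess: a novel compound inherits its elements' outcome."""
    votes = [trained[element] for element in compound]
    # Predict the outcome if most elements were paired with it in training.
    return sum(votes) * 2 > len(votes)


def rule_prediction(trained, compound):
    """'Opposites' rule: a compound predicts the reverse of what its elements predict."""
    outcomes = {trained[element] for element in compound}
    if len(outcomes) != 1:
        # This sketch only covers compounds whose elements all share one outcome.
        raise ValueError("all elements must share one trained outcome")
    return not outcomes.pop()


def classify_response(trained, compound, predicted_outcome):
    """Label a participant's prediction as rule- or similarity-consistent."""
    if predicted_outcome == rule_prediction(trained, compound):
        return "rule"
    return "similarity"


if __name__ == "__main__":
    for compound in ("IJ", "MN"):
        print(compound,
              "similarity:", similarity_prediction(trained, compound),
              "rule:", rule_prediction(trained, compound))
```

Because the two predictions are always opposite for such compounds, each generalization response can be assigned to exactly one strategy, which is what makes these trials diagnostic.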
However, although our results make a novel and valu-
able contribution to the area, our findings stop some way
short of providing clear evidence for qualitatively sepa-
rable generalization systems by any reasonable definition.
For instance, neural differences in themselves should not
be taken as evidence for separable systems given that
past work has shown that items within the same category
can provoke differential activation (e.g., Davis & Poldrack,
2014; DeGutis & D’Esposito, 2009; Grinband, Hirsch, &
Ferrera, 2006). What may be more compelling evidence
for separable learning systems would be large differences
in activation in regions that are not also activated by the
other strategy. This does not appear to be the case in this
study, where the differences in generalization were re-
stricted and located close to regions that were activated
in training and test by both strategies. Furthermore, the
commonalities in activation between strategies clearly out-
weigh the differences.
One potentially fruitful way of viewing the current data,
then, is to consider category learning and stimulus gen-
eralization as cognitively complex processes comprising a
number of subcomponents (e.g., stimulus processing,
hypothesis testing, decision-making, feedback process-
ing, etc.) many of which are likely to be shared by rule
and similarity strategies. Furthermore, one strategy may
place more of an emphasis on one of these subprocesses
than the other strategy does. For example, the “oppo-
sites” rule needed for rule generalization appears likely
to place particular demands on working memory capacity
and rule formation that do not appear to be needed to the
same extent for similarity generalization and this may re-
quire increased activation of the right middle frontal gyrus.
Conversely, a similarity strategy may, for example, impose
higher memory demands than the rule condition (where
there may be more of an emphasis on abstract rules),
which could lead to greater engagement of the anterior
medial prefrontal lobes. Of course, our hypotheses as to
what particular role these brain regions play may not be
correct but as others have recommended (e.g., Davis,
Love, & Preston, 2012) trying to link brain regions to spe-
cific functions of the learning process may be at least as
profitable an approach as focusing on the more general
and contentious question of whether data is more in line
with single- or multiple-system accounts. We suggest that
further examination of the Shanks–Darby procedure, with
its notable strengths, could play an important role in fur-
ther illuminating both these important questions.
Acknowledgments
We thank the Experimental Psychology Society for their support
and two anonymous reviewers for their insightful and construc-
tive comments.
Reprint requests should be sent to Fraser Milton, Discipline of
Psychology, University of Exeter, Exeter, EX4 4QG, United Kingdom,
or via e-mail: f.n.milton@ex.ac.uk.
Notes
1. The raw imaging and behavioral data are available for inter-
ested readers at www.willslab.co.uk/exe10/index.html.
2. By convention, a BF of over three is interpreted as pro-
viding substantial evidence for the experimental hypothesis
(Jeffreys, 1961), whereas a BF below a third provides substan-
tial evidence for the null (Dienes, 2011). BF analysis requires an
estimate of the mean expected difference under the experi-
mental hypothesis; we estimated this from the observed differ-
ence for the familiar test items. Following Dienes (2011), the
expected difference was modeled as a two-tailed normal distri-
bution with a standard deviation equal to half the mean. Calcu-
lations were run using a custom script (Baguley & Kaye, 2010)
within R (R Core Team, 2015).
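The Bayes factor calculation described in Note 2 can be sketched as follows. This is a minimal Python illustration, not the custom R script the authors used: it assumes a normal likelihood for the observed difference, a point null at zero, and a normal prior centered on the expected difference with a standard deviation of half that mean. Under those assumptions the marginal likelihood has a closed form. The function names and the example numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bayes_factor(obs_diff, obs_se, expected_diff):
    """BF for H1 (normal prior, SD = half the expected mean) against a point null."""
    if obs_se <= 0:
        raise ValueError("obs_se must be positive")
    prior_sd = abs(expected_diff) / 2.0
    # A normal likelihood combined with a normal prior gives a normal marginal.
    marginal_sd = math.sqrt(obs_se ** 2 + prior_sd ** 2)
    likelihood_h1 = normal_pdf(obs_diff, expected_diff, marginal_sd)
    likelihood_h0 = normal_pdf(obs_diff, 0.0, obs_se)
    return likelihood_h1 / likelihood_h0


def interpret(bf):
    """Apply the conventional thresholds of three and one third."""
    if bf > 3.0:
        return "substantial evidence for the experimental hypothesis"
    if bf < 1.0 / 3.0:
        return "substantial evidence for the null"
    return "insensitive"


if __name__ == "__main__":
    # Hypothetical values: observed difference, its standard error, expected difference.
    bf = bayes_factor(obs_diff=0.0, obs_se=1.0, expected_diff=4.0)
    print(bf, interpret(bf))
```

The closed form avoids numerical integration; a different prior shape (for example, a half-normal) would need a different marginal likelihood.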
REFERENCES
Aizenstein, H. S., MacDonald, A. W., Stenger, V. A., Nebes, R. D.,
Larson, J. K., Ursu, S., et al. (2000). Complementary category
learning systems using event-related functional MRI. Journal
of Cognitive Neuroscience, 12, 977–987.
Allen, S. W., & Brooks, L. R. (1991). Specializing the operation
of an explicit rule. Journal of Experimental Psychology:
General, 120, 3–19.
Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U., & Waldron,
E. M. (1998). A neuropsychological theory of multiple
systems in category learning. Psychological Review, 105,
442–481.
Ashby, F. G., & Maddox, W. T. (2011). Human category learning
2.0. Annals of the New York Academy of Sciences, 1224,
147–161.
Baguley, T. (2012). Calculating and graphing within-subject
confidence intervals for ANOVA. Behavior Research Methods,
44, 158–175.
Baguley, T., & Kaye, D. (2010). Book review: Understanding
psychology as a science: An introduction to scientific and
statistical inference. British Journal of Mathematical and
Statistical Psychology, 63, 695–698.
Brooks, L. (1978). Nonanalytic concept formation and memory
for instances. In E. Rosch & B. B. Lloyd (Eds.), Cognition
and categorization (pp. 169–211). Hillsdale, NJ: Erlbaum.
Carpenter, K. L., Wills, A. J., Benattayallah, A., & Milton, F.
(in press). A comparison of the neural correlates that underlie
rule-based and information-integration category learning.
Human Brain Mapping. doi:10.1002/hbm.23259.
Davis, T., Love, B. C., & Preston, A. R. (2012). Learning the
exception to the rule: Model-based fMRI reveals specialized
representations for surprising category members. Cerebral
Cortex, 22, 260–273.
Davis, T., & Poldrack, R. A. (2014). Quantifying the internal
structure of categories using a neural typicality measure.
Cerebral Cortex, 24, 1720–1757.
DeGutis, J., & D’Esposito, M. (2009). Network changes in
the transition from initial learning to well-practiced visual
categorization. Frontiers in Human Neuroscience, 3,
1–13.
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which
side are you on? Perspectives on Psychological Science, 6,
274–290.
Edmunds, C. E. R., Milton, F., & Wills, A. J. (2015).
Feedback can be superior to observational training for both
rule-based and information-integration category learning.
Quarterly Journal of Experimental Psychology, 68,
1203–1222.
Grinband, J., Hirsch, J., & Ferrera, V. P. (2006). A neural
representation of categorization uncertainty in the human
brain. Neuron, 49, 757–763.
Grinband, J., Wager, T. D., Lindquist, M., Ferrera, V. P., &
Hirsch, J. (2008). Detection of time-varying signals in
event-related fMRI designs. Neuroimage, 43, 509–520.
Grossman, M., Smith, E. E., Koenig, P., Glosser, G., DeVita, C.,
Moore, P., et al. (2002). The neural basis for categorization
in semantic memory. Neuroimage, 17, 1549–1561.
Jeffreys, H. (1961). The theory of probability (3rd ed.). Oxford,
UK: Oxford University Press.
Kemler Nelson, D. G. (1984). The effect of intention on what
concepts are acquired. Journal of Verbal Learning & Verbal
Behavior, 23, 734–759.
Kiebel, S. J., Poline, J. B., Friston, K. J., Holmes, A. P., & Worsley,
K. J. (1999). Robust smoothness estimation in statistical
parametric maps using standardized residuals from the
general linear model. Neuroimage, 10, 756–766.
Koenig, P., Smith, E. E., Glosser, G., DeVita, C., Moore, P.,
McMillan, C., et al. (2005). The neural basis for novel semantic
categorization. Neuroimage, 24, 369–383.
Kruschke, J. K. (1992). ALCOVE: An exemplar-based
connectionist model of category learning. Psychological
Review, 99, 22–44.
Maes, E., De Filippo, G., Inkster, A. B., Lea, S. E. G., De Houwer,
J., D’Hooge, R., et al. (2015). Feature- versus rule-based
generalization in rats, pigeons and humans. Animal
Cognition, 18, 1267–1284.
Maldjian, J. A., Laurienti, P. J., Burdette, J. B., & Kraft, R. A.
(2003). An automated method for neuroanatomic and
cytoarchitectonic atlas-based interrogation of fMRI data
sets. Neuroimage, 19, 1233–1239.
McLaren, I. P. L., Green, R., & Mackintosh, N. J. (1994). Animal
learning and the explicit/implicit distinction: Or why what
we think of as explicit for us can be implicit for them.
In N. Ellis (Ed.), Implicit and explicit learning of languages
(pp. 313–332). London: Academic Press.
Milton, F., Longmore, C. A., & Wills, A. J. (2008). Processes of
overall similarity sorting in free classification. Journal of
Experimental Psychology: Human Perception and
Performance, 34, 676–692.
Milton, F., Muhlert, N., Butler, C. R., Benattayallah, A., & Zeman,
A. Z. J. (2011). The neural correlates of everyday recognition
memory. Brain and Cognition, 76, 369–381.
Milton, F., & Pothos, E. M. (2011). Category structure and
the two learning systems of COVIS. European Journal of
Neuroscience, 34, 1326–1336.
Milton, F., & Wills, A. (2004). The influence of stimulus
properties on category construction. Journal of Experimental
Psychology: Learning, Memory, & Cognition, 30, 407–415.
Milton, F., Wills, A. J., & Hodgson, T. L. (2009). The neural
basis of overall similarity and single-dimension sorting.
Neuroimage, 46, 319–326.
Newell, B. R., Dunn, J. C., & Kalish, M. (2011). Systems of
category learning: Fact or fantasy? In B. H. Ross (Ed.), The
psychology of learning & motivation (Vol. 54, pp. 167–215).
San Diego, CA: Elsevier.
Newell, B. R., Moore, C. P., Wills, A. J., & Milton, F. (2013).
Reinstating the frontal lobes? Having more time to think
improves implicit perceptual categorization: A comment on
Filoteo, Lauritzen, and Maddox (2010). Psychological Science,
24, 386–389.
Nichols, T., Brett, M., Andersson, J., Wager, T., & Poline, J. B.
(2005). Valid conjunction inference with the minimum
statistic. Neuroimage, 25, 653–660.
Nomura, E. M., Maddox, W. T., Filoteo, J. V., Ing, A. D.,
Gitelman, D. R., Parrish, T. B., et al. (2007). Neural correlates
of rule-based and information-integration visual category
learning. Cerebral Cortex, 17, 37–43.
Nomura, E. M., & Reber, P. J. (2008). A review of medial
temporal lobe and caudate contributions to visual category
learning. Neuroscience and Biobehavioral Reviews, 32,
279–291.
Nosofsky, R. M., & Johansen, M. K. (2000). Exemplar-based
accounts of “multiple-system” phenomena in perceptual
categorization. Psychonomic Bulletin & Review, 7, 375–402.
Nosofsky, R. M., & Kruschke, J. K. (2002). Single-system models
and interference in category learning: Commentary on
Waldron and Ashby (2001). Psychonomic Bulletin & Review,
9, 169–174.
Owen, A. M. (2000). The role of the lateral frontal cortex in
mnemonic processing: The contribution of functional
imaging. Experimental Brain Research, 133, 33–43.
Patalano, A. L., Smith, E. E., Jonides, J., & Koeppe, R. A. (2001).
PET evidence for multiple strategies of categorization.
Cognitive, Affective & Behavioral Neuroscience, 1, 360–370.
R Core Team. (2015). R: A language and environment for
statistical computing. Vienna: R Foundation for Statistical
Computing. www.R-project.org/.
Rips, L. J. (1989). Similarity, typicality, and categorization. In
S. Vosniadou & A. Ortony (Eds.), Similarity and analogical
reasoning (pp. 21–59). Cambridge, UK: Cambridge University
Press.
Rosch, E., & Mervis, C. B. (1975). Family resemblances: Studies
in the internal structure of categories. Cognitive Psychology,
7, 573–605.
Seger, C. A. (2008). How do the basal ganglia contribute to
categorization? Their roles in generalization, response
selection, and learning via feedback. Neuroscience and
Biobehavioral Reviews, 32, 265–278.
Seger, C. A., Braunlich, K., Wehe, H. S., & Liu, Z. (2015).
Generalization in category learning: The roles of
representational and decisional uncertainty. Journal of
Neuroscience, 35, 8802–8812.
Seger, C. A., & Cincotta, C. M. (2002). Striatal activity in concept
learning. Cognitive, Affective & Behavioral Neuroscience,
2, 149–161.
Seger, C. A., & Cincotta, C. M. (2005). The roles of the caudate
nucleus in human classification learning. Journal of
Neuroscience, 25, 2941–2951.
Shanks, D. R., & Darby, R. J. (1998). Feature- and rule-based
generalization in human associative learning. Journal of
Experimental Psychology: Animal Behavior Processes, 24,
405–415.
Smith, E. E., Patalano, A. L., & Jonides, J. (1998). Alternative
strategies of categorization. Cognition, 65, 167–196.
Stanton, R. D., & Nosofsky, R. M. (2013). Category number
impacts rule-based and information-integration category
learning: A reassessment of evidence for dissociable
category-learning systems. Journal of Experimental
Psychology: Learning, Memory, & Cognition, 39, 1174–1191.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas
of the human brain. 3-Dimensional proportional system: An
approach to cerebral imaging. Stuttgart, Germany: Thieme.
Tracy, J. I., Mohamed, F., Faro, S., Pinus, A., Tiver, R., Harvan, J.,
et al. (2003). Differential brain responses when applying
criterion attribute versus family resemblance rule learning.
Brain and Cognition, 51, 276–286.
Ward, T. B., & Scott, J. (1987). Analytic and holistic modes of
learning family-resemblance concepts. Memory & Cognition,
15, 42–54.
Wheeler, M. E., & Buckner, R. L. (2004). Functional-anatomic
correlates of remembering and knowing. Neuroimage, 21,
1337–1349.
Wilkinson, D. T., Halligan, P. W., Henson, R. N. A., & Dolan, R. J.
(2002). The effects of interdistracter similarity on search
processes in the superior parietal cortex. Neuroimage, 15,
611–619.
Wills, A. J., Graham, S., Koh, Z., McLaren, I. P. L., & Rolland,
M. D. (2011). Effects of concurrent load on feature- and rule-based
generalization in human contingency learning. Journal of
Experimental Psychology: Animal Behavior Processes, 37,
308–316.
Wills, A. J., Inkster, A. B., & Milton, F. (2015). Combination
or differentiation? Two theories of processing order in
classification. Cognitive Psychology, 80, 1–33.
Wills, A. J., Milton, F., Longmore, C. A., Hester, S., &
Robinson, J. (2013). Is overall similarity classification
less effortful than single-dimension classification? Quarterly
Journal of Experimental Psychology, 66, 299–318.
Journal of Cognitive Neuroscience
Volume 29, Number 1