Ryan Kaplan** - Recherche en IA spécialisée au MIT

Ryan Kaplan**
Brown University

Joseph Klobušický†
Brown University

Shivendra Pandey
The Johns Hopkins University

‡

David H. Gracias*,§
The Johns Hopkins University

Govind Menon*,†
Brown University

Mots clés
Virus, viral tiling theory, self-folding,
origami, microfabrication, nanotechnology

A version of this paper with color figures
is available online at http://dx.doi.org/10.1162/
artl_a_00144. Subscription required.

Building Polyhedra
by Self-Assembly:
Theory and Experiment

Abstract We investigate the utility of a mathematical framework
based on discrete geometry to model biological and synthetic self-
assembly. Our primary biological example is the self-assembly of
icosahedral viruses; our synthetic example is surface-tension-driven
self-folding polyhedra. In both instances, the process of self-assembly
is modeled by decomposing the polyhedron into a set of partially
formed intermediate states. The set of all intermediates is called the
configuration space, pathways of assembly are modeled as paths in
the configuration space, and the kinetics and yield of assembly are
modeled by rate equations, Markov chains, or cost functions on the
configuration space. We review an interesting interplay between
biological function and mathematical structure in viruses in light of
this framework. We discuss in particular: (je) tiling theory as a coarse-
grained description of all-atom models; (ii) the building game—a growth
model for the formation of polyhedra; et (iii) the application of
these models to the self-assembly of the bacteriophage MS2. We then
use a similar framework to model self-folding polyhedra. We use a
discrete folding algorithm to compute a configuration space that
idealizes surface-tension-driven self-folding and analyze pathways
of assembly and dominant intermediates. These computations are
then compared with experimental observations of a self-folding
dodecahedron with side 300 Am. In both models, despite a
combinatorial explosion in the size of the configuration space, a few
pathways and intermediates dominate self-assembly. For self-folding
polyhedra, the dominant intermediates have fewer degrees of
freedom than comparable intermediates, and are thus more rigid.
The concentration of assembly pathways on a few intermediates
with distinguished geometric properties is biologically and physically
important, and suggests deeper mathematical structure.

Author contributions: DHG, GM: Designed research, analyzed data, wrote the paper; RK, JK, SP: Performed research with equal
contributions.
* Contact author.
** Work performed at Brown University. Current address: Dropbox LLC, 185 Berry Street, San Francisco, Californie 94107.
E-mail: ryankap@gmail.com
† Division of Applied Mathematics, Brown University, 182 George St., Providence, RI 02912. E-mail: joe_klobusicky@brown.edu (J.K.);
govind_menon@brown.edu (G.M.)
‡ Department of Chemical and Biomolecular Engineering, The Johns Hopkins University, Baltimore, MARYLAND 21218.
E-mail: shivendra@jhu.edu
§ Department of Chemical and Biomolecular Engineering and Department of Chemistry, The Johns Hopkins University, Baltimore, MARYLAND
21218. E-mail: dgracias@jhu.edu

Artificial Life 20: 409–439 (2014) est ce que je:10.1162/ARTL_a_00144

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

1 Introduction

Since the mid-1990s, an important goal in engineering has been to understand how to manufacture
materials and devices by mimicking physical and biological mechanisms of self-assembly. Crystalli-
zation is a familiar physical mechanism of self-assembly. More broadly, self-assembly in physics
includes the emergence of order in various geometric models of condensed matter driven by the
minimization of free energy. While all living systems self-assemble while dissipating energy, the term
self-assembly in biology usually refers to the formation of biomolecules such as microtubules, et
small organisms such as viruses. It is a fundamental challenge in materials chemistry and nano-
technology to create experimental paradigms inspired by these natural examples. Such a “bottom-
up” approach to manufacturing eliminates the cost of manipulating individual components at small
scales. De plus, it is conceivable that larger components could be functionalized with a wide range
of properties (chemical, geometric, magnetic, photonic), offering the possibility of designer materials
[26, 74].

Our primary purpose in this article is to explain how an important biological example—the self-
assembly of icosahedral viruses—can inspire and guide the development of self-assembly in tech-
nology. To this end, we introduce the reader to the elegant natural design of viruses and a set of
mathematical models for the structure and self-assembly of viruses. The viral capsid has inspired the
development of several techniques for synthesizing polyhedra on scales ranging from 10 nm to
1 mm. The particular technology studied here is surface-tension-driven self-folding, described in
greater detail below. While these two examples of self-assembly have completely different physics,
we show that they can be modeled by a common mathematical framework rooted in discrete
geometry. In both instances, the process of self-assembly is modeled by decomposing the poly-
hedron into a set of partially formed intermediates. The set of all intermediates is called the config-
uration space, pathways of assembly are modeled as paths in the configuration space, et le
kinetics and yield of assembly are modeled by rate equations, Markov chains, or cost functions
on the configuration space. The particular decomposition that is chosen is distinct for viruses
and self-folding polyhedra. Informally, each decomposition may be thought of as a discrete geo-
metric idealization of experimental observations.

Our emphasis lies in understanding the pathways of self-assembly. Such a study involves an interplay
between natural and synthetic self-assembly. While we are inspired by nature, we hope that an analy-
sis of synthetic self-folding polyhedra can reveal new insights about the self-assembly of viruses. Dans
our work, the synthetic polyhedra have side 300 Am and the self-folding process may be observed
under an optical microscope. Par contre, it is very challenging to describe the equilibrium crystalline
structure of complete viruses at high resolution, and there is no understanding of the pathways at a
comparable level of detail. Pathways cannot be observed by microscopy, and must be inferred from
other experimental data. The interpretation of such data requires a model, and a wide range of
theoretical models have been used to model virus assembly, as reviewed in Sections 4–6. Our work
allows us to test the utility of some of these models by direct observation of assembly pathways in a
distinct, but related, contexte.

In what follows, we first introduce the reader to self-assembly and describe our experiments on
self-folding polyhedra. We then discuss discrete geometric aspects of the self-assembly of viruses,
highlighting recent work by several groups. We focus on viral tiling theory, the building game, et
the role of RNA folding in the assembly of the bacteriophage MS2. We then discuss how these ideas
can be adapted to our experiments, present our results, and review some common lessons from
these studies. As will become clear in our discussion, developing a consistent mathematical theory
of self-assembly (natural or synthetic) is challenging, and requires an interplay between careful
modeling of the experiments, continuous math (applied probability, differential equations), discrete
math (geometric combinatorics, rigidity theory), and scientific computation. A secondary purpose of
our article is to introduce the reader to many unanswered questions in the area, all of which are easily
illustrated in the context of self-assembly of polyhedra.

410

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 1. All nets for the cube.

2 Assembly versus Self-Assembly

Let us first distinguish between assembly and self-assembly. Classroom models of polyhedra are
usually made from a material such as cardboard by folding a net, gluing faces as they meet at edges
[78, p. 12]. A net is a simply connected, non-overlapping polygon made up of faces of the poly-
hedron attached at edges. Informally, a net may be obtained from a polyhedron by cutting along the
edges of a spanning tree, flattening out the faces as they become free to rotate. The reader who
unfolds a cube with this procedure will soon discover the 11 nets for the cube listed in Figure 1.1
Polyhedra fascinated medieval polymaths such as Dürer, Kepler, and Leonardo, and this approach
is at least as old as Dürer and Strauss’ classic work, The Painter’s Manual [17].

Note that this method, and art forms such as kirigami and origami, require an external
creator actively cutting, folding, and gluing the edges in a prescribed sequence. The goal of self-
assembly is to decompose the polyhedron into suitable subunits, and to use physical principles to
cause the subunits to assemble without active manipulation. The dominant physical principles
governing self-assembly depend strongly on the length scale. This is best explained with examples.
On the nano scale, some examples to keep in mind are the following: (je) polyhedral molecules
such as C60 and closo boranes whose subunits are single atoms interacting through covalent bonds
[44]; (ii) synthetic supramolecular cages whose subunits are small molecules interacting through
weak (non-covalent) bonds [47, 53, 71]; (iii) polyhedral cages formed from DNA bricks and tiles
[12, 15, 40]; (iv) viral capsids whose subunits are protein molecules held together by electrostatic
bonds and hydrophobic interactions between protein residues. In these examples, the domi-
nant physical terms are bond energies and thermal fluctuations. Subunits form bonds as they
collide while undergoing stochastically driven motion. Thermal fluctuations also cause bonds to
break and re-form, allowing the polyhedra to self-correct during assembly. While these heuristics
are broadly accurate, the precise kinetics of self-assembly is far more subtle, as we discuss in
detail below.

Our main interest is to mimic similar self-assembly on much larger scales (100 nm to 1 mm).
We illustrate this idea with a historical curiosity. In his delightful book, Mathematical Snapshots
[69], Steinhaus cuts a net of a polyhedron into two “half-nets” that pop up under the effect of
elastic energy as illustrated in Figure 2. Steinhaus is silent on whether he discovered this technique,
but there is no debating its ingenuity. We view it as an early example that shows us how to
use physics to self-assemble polyhedra from a net with just a little help from a creator. In our
experiments, the polyhedra are much smaller (100 nm to 1 mm). Capillarity dominates gravity
on these length scales and may be used to drive self-assembly from a net as explained below.
Although our work in this article is restricted to the use of capillary forces, the self-folding of poly-
hedra from nets could also conceivably be achieved using magnetic forces, residual stress, ou
pneumatics [48].

1 The reader is also cautioned not to try this with the dodecahedron or icosahedron—both these polyhedra have 43,380 nets! These
exact numbers for the Platonic solids are exceptions: Enumerating nets is a subtle problem, and it is not yet known whether every
convex polyhedron can be unfolded along its edges to a (non-overlapping) net [59].

Artificial Life Volume 20, Nombre 4

411

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 2. Self-assembly of a dodecahedron using elasticity. (un) A net of the dodecahedron is cut into two “flowers” at the
edge between them. These two flowers are laid crosswise and an elastic thread in tension is wrapped around the faces.
(b) The upper half rises as the elastic energy of the thread overcomes gravity. The reader is referred to Steinhaus
[69, Plates 229–231]. An interesting historical sidelight on the origin of these photographs appears in Kac’s autobiography
[38, p. 41].

3 Surface-Tension-Driven Self-Folding

Photolithography is an accurate and inexpensive microfabrication process used to pattern thin films.
But its reliance on projective optics inherently restricts patterning to flat sheets. Folding allows us
to transform these flat sheets into three-dimensional shapes. Cependant, while sheets can be folded
by hand or small probes at supra-millimeter length scales, it is necessary to use self-assembly at
sub-millimeter length scales. The mathematical challenge is to code the folding pathways and
outcome through the information implicit in the geometry and interactions of units on the pat-
terned sheet.

Since polygonal faces can be patterned with high accuracy, the first challenge is to build complex
polyhedra with high yield. In our experiments, a small net (side length 300 Am) made of nickel-
coated polygonal panels joined with solder hinges was patterned on a substrate and then released
in a high-boiling-point solvent. When the hinge melts, the molten solder cannot spread laterally,
because solder does not wet nickel. Plutôt, the high surface tension of solder causes the panels to
rotate. Snapshots of surface-tension-driven self-folding of a dodecahedron are shown in Figure 3 (voir
also Figures 13–15). The reader is invited to view movies of polyhedra self-folded by this process
([60, Supplementary Information]; http://www.youtube.com/watch?v=GL0im9b6GgU). This ex-
perimental procedure is discussed in detail in [48]. A simple model for the physics of the surface-
tension-driven hinge is analyzed in Section 7.1.

We stress that there is no external control in these experiments except the temperature of the
solvent and the choice of initial net. Folding usually begins with the outer panels, and the order of
folding is principally determined by the relative position of panels, though there are significant
fluctuations in the sequence in which panels fold. The main sources of randomness are fluctuations
in the amount of solder deposited at a hinge, thermal gradients within the solvent, and mechanical

Chiffre 3. Optical microscope images of surface-tension-driven self-assembly of a dodecahedron from a net. The sides of
each panel (face) of the dodecahedron are 300 Am. Here we illustrate self-folding from the same net as in Figure 2. Ce
net is one of 21 tested in experiment as discussed in Section 9.

412

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

(fluidic) agitation of the solvent. The temperature distribution within the solvent and variations in the
amount of solder determine the order in which hinges liquefy. For small polyhedra, the thermal
gradients on the length scale of a single net are minimal. Our experiments also randomize the
placement of polyhedra in the assembly dish. Fluidic agitation has the beneficial effect of correcting
for small misalignment errors without breaking apart panel seams. Despite these efforts, self-folding
sometimes goes awry and faces can be misaligned. This immediately suggests some natural questions:

(je) Feasibility: Is it possible to build polyhedra of arbitrary complexity by self-folding?

(ii) Rational design: How must an initial net be chosen to maximize yield?

(iii) Assembly pathways: What determines the pathways of self-assembly? Can these be directed

through specific intermediates?

These questions have only been investigated for a few polyhedra to date, and our prior work
addressing rational design is reported in [3, 60]. One of our main conclusions in [60] is that in order
to fold more complex polyhedra, it is necessary to tackle a combinatorial explosion in the space of
nets and folding pathways. Viruses face (and rapidly solve) a similar assembly problem despite a
combinatorial explosion. How could insights from the assembly of viruses be brought to bear on
self-folding? In order to explain the similar conceptual issues in both fields, we now discuss the self-
assembly of viruses, paying particular attention to discrete geometric models.

4 The Structure of Icosahedral Viruses

4.1 Virus Genomes and Structure
Viruses are the most populous and genetically diverse organisms on earth. Several estimates place
the number of viruses at 1031 and the number of distinct genotypes as high as 106–108 [6]. (Note,
though, that most viruses are marine bacteriophages [70], and there are several challenges in esti-
mating and classifying viral population and diversity.) At present, the NIH-NCBI public database
contains complete genetic sequences for approximately 4000 viruses.

Viruses lack the biosynthetic machinery for independent existence. The simplest virus particles
consist of a genome (RNA or DNA) encapsulated in a protein shell (the capsid). The function of the
capsid is to protect the genome outside a host, to disassemble when the virus attacks a host, and to
assemble rapidly when the genome uses the cellular machinery of the host to produce nucleic acid
and protein. The natural design of a virus reflects two deep forms of elegance:

1. Genetic economy: This principle was formulated by Crick and Watson, as discussed later

[13]. It is best illustrated by glancing through the NCBI data on the simplest viruses with
a single-stranded RNA genome. Among the 1017 known ss-RNA genomes, most viruses
typically have genomes that consist of about 5000 nucleotides, which code for very few
proteins (1-dix). A central example is the bacteriophage MS2, which infects the bacterium
E. coli. The MS2 genome was the first to be completely sequenced [24]. The genome is a
single strand of RNA that is 3569 nucleotides long and has four genes that code for four
proteins: coat protein, maturation protein, and lysis and replicase enzymes. Of these, le
coat protein and maturation protein determine the structure of the capsid as discussed
below. The lysis enzyme degrades the cell wall of the host bacteria, and the replicase catalyzes
the production of RNA.

2. Structural symmetry: Early crystallographic studies of viruses in the 1940s and 1950s found that
many viruses possess either helical or icosahedral symmetry. As an example, we illustrate the
structure of MS2 in Figure 4. The MS2 capsid consists of 180 copies of the coat protein and
one unit of the maturation protein, which serves as a terminal point. The units of the coat
protein are arranged in a manner that has rotational icosahedral symmetry. The symmetry

Artificial Life Volume 20, Nombre 4

413

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 4. The structure of bacteriophage MS2. (un) Ribbon diagram showing protein structure. (b) A tiling model. Le
coat protein exists in three distinct conformations (UN, B, and C), which merge in pairs into A/B dimers (blue/green) et
C/C dimers (maroon). A/B dimers cluster into pentamers around the fivefold axes of an icosahedron; three alternating
A/B and C/C clusters form at the threefold axes; and the C/C dimers sit at axes of twofold symmetry. (un) is from
Wikipedia Commons; (b) is adapted from [18].

of the capsid influences the organization of the genomic RNA in proximity to the capsid
(voir [73, Chiffre 4] for cryo-EM pictures, and Figure 5 for a coarse-grained description).
In addition to crystalline symmetry, it is now also known that genetically unrelated viruses
can share very similar virion architecture, down to the persistence of specific protein folds
[45]. We discuss a ubiquitous trapezoidal motif in greater detail below.

4.2 Goldberg Polyhedra and the Caspar-Klug Construction
Crick and Watson hypothesized that the crystallographic symmetry of viruses was due to the repeated
use of a few protein subunits in the construction of the virus [13]. Such a construction is clearly
conducive to genetic economy: If the subunits consist of a small number of proteins (Par exemple,
just one coat protein for MS2), the genome can be very small. Caspar and Klug laid the foundations
for structural virology of icosahedral viruses a few years later [10]. (By icosahedral virus, we mean a virus
whose capsid has icosahedral symmetry.) They assumed that the viral capsid is built out of subunits

Chiffre 5. Discrete geometric skeleton of the (icosahedrally averaged) ss-RNA genome in MS2. (un ) Cryo-EM (icosa-
hedrally averaged) yields a density distribution for RNA that is modeled by the polyhedron shown in red. This RNA
polyhedron has 60 vertices, corresponding to the centers of the 60 asymmetric (A/B) dimers shown in Figure 4b [73].
(b) A decorated net obtained from (un) by unfolding the RNA polyhedron with an icosahedron net. Adapted from [28].

414

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

that assembled into a shape with icosahedral symmetry, and deduced that the viral capsid must consist
of 60T protein subunits, with T ¼ h2 þ hk þ k2; h; k ∈ ℤþ. Let us explain this restriction.

The class of Caspar-Klug viruses are modeled on the Goldberg polyhedra. While studying the iso-
perimetric problem for polyhedra, Goldberg discovered a family of polyhedra with the following
properties: (je) The faces of the polyhedron are pentagons or hexagons; (ii) each vertex is trivalent;
(iii) the polyhedron has icosahedral symmetry. Each such polyhedron (written G(h, k)) is characterized
by its index ðh; kÞ ∈ ℤþ [27]. The combinatorial topology of G(h, k) may be described by folding a
net whose vertices lie on a hexagonal lattice as follows.

Assume given a lattice L of regular hexagons of unit length. Let L0 be the dual lattice formed by
the centers of the hexagons, and let e and f be unit basis vectors for L0. Given an index (h, k),
consider a walk in the dual lattice L0 from a point O to the point P = O + il + kf. Continue this
walk, tracing out the net of an icosahedron with vertices in the lattice L and edges of length |OP|.
When the net is folded into an icosahedron I, the polyhedron formed by the restriction of the lattice
L0 to I gives a triangulation S(h, k) of I. The polyhedron formed by the restriction of the hexagonal
lattice L to I is topologically equivalent to the Goldberg polyhedron G(h, k). (The folding construc-
tion does not give G(h, k) in general, since gluing along edges such as OP gives “folded” hexagons
and “curved” pentagons, whereas G(h, k) has flat hexagons and pentagons). Examples of G(h, k)
are shown in Figure 6. An interesting survey of Goldberg polyhedra in art and science may be found
dans [33].

The topology of G(h, k) can be obtained by duality from S(h, k). We compute the area of each
face of the icosahedron I to see that it contains T = h2 + hk + k2 faces of S(h, k). Ainsi, S(h, k) a
20T faces. Each face has three edges, each of which is shared by exactly two faces. Thus S(h, k) a
30T edges. By an application of Euler’s formula, the triangulation S(h, k) has 10T + 2 vertices. Par
duality, G(h, k) has 10T + 2 faces, of which 12 are pentagons. Ainsi, there are 10(T − 1) hexagons,
and 60T subunits in all.

Goldberg’s construction was rediscovered by Caspar and Klug. When applied to viruses, le
faces in the Goldberg polyhedron correspond to clusters of protein subunits called capsomeres. Dans
the simplest setting, there are two kinds of capsomeres, hexamers and pentamers, corresponding to
the hexagons and pentagons respectively. As a consequence, the capsid has 12 × 5 + 10(T − 1) ×
6 = 60T protein subunits. (While it is simplest to imagine that each subunit is a single protein
molecule, it is also possible for the subunit to consist of a cluster of protein molecules.) The number
of subunits in a capsid is amenable to direct measurement, and the Caspar-Klug “quantization” of
capsid size provides a fundamental classification of viruses.

4.3 Recent Developments: Viral Tiling Theory
The Caspar-Klug theory coarse-grains the all-atom structure of the virus into a Goldberg poly-
hedron. It allows the possibility that other polyhedra with icosahedral symmetry may provide a more

Chiffre 6. Goldberg polyhedra: G(4, 1), G(4, 0), and G(1, 1).

Artificial Life Volume 20, Nombre 4

415

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 7. (un) The type 5 tiling of the plane by a polygon with five sides [30]. (b) A topologically equivalent tiling of the
hexavalent lattice by trapezoids. Adapted from [54].

refined description of the capsid. We present two recent examples that demonstrate such an inter-
play between polyhedral models and experimental data. In both instances, high-resolution crystal-
lography was the motivation for new tilings of the sphere that resolve puzzles in structural virology.
We first discuss Twarock’s resolution of the structure of certain “forbidden” viruses. The Caspar-
Klug theory does not allow certain T-numbers, for example T = 2 and T = 6. Cependant, icosahedral
viruses with these T-numbers are known to exist. Some yeast bacteriophages have T = 2 [11].
De plus, certain tumor-causing polyomaviruses, such as SV40, have a capsid structure modeled
sur, but distinct from, the T = 7 viruses. Roughly, these capsids have pentamers at the centers of the
hexagons of G(2, 1) polyhedra [52, 63]. Ainsi, they have 360 subunits (T = 6), plutôt que 420 (T = 7).
Twarock made the beautiful discovery that the structure of these anomalous viruses corresponds to
tilings of the sphere by darts and kites [75]. There is an interesting parallel here with aperiodic tilings
and quasicrystals with “forbidden” symmetry [68]. The Penrose tiling with darts and kites is the sim-
plest aperiodic tiling of the plane that possesses fivefold rotational symmetry. It is remarkable that the
same tiles may be used to construct tilings of the sphere with icosahedral symmetry, resolving the
structure of anomalous viruses.

Mannige and Brooks have further refined the Caspar-Klug theory to explain a structural similarity
in virus architecture across different genetic strains [54, 55]. An important finding from the first
high-resolution images of viral capsids was that the shape of the individual protein subunit in some
genetically distinct viruses is roughly trapezoidal [1, 32]. This motif has now been seen in several
virus strains. In order to explain the emergence of the trapezoidal motif, Mannige and Brooks tile the
hexavalent lattice L0 with a single polygonal prototile with j sides such that: (je) there is no overlap
between prototiles; (ii) there is no gap between prototiles (c'est à dire., each edge in the tiling is an edge
between precisely two prototiles).2 As in the Caspar-Klug construction, the only admissible tilings
of L0 are those that can be folded into a tiling of the icosahedron. This restriction is surprisingly
rigid: The prototile must have j = 5. De plus, by comparing all known tilings of the hexavalent
lattice L0 by pentagonal prototiles, one sees that the only possibile tiling is the one shown in Figure 7.
Observe that the pentagonal prototiles can be flattened into trapezoidal prototiles. Ainsi, this con-
struction provides a natural discrete geometric model for the trapezoidal motif discovered in struc-
tural virology. The tiling by trapezoidal prototiles is chiral, a property of critical importance for
biomolecules.

Mannige and Brooks compared their discrete model with all-atom structures for all 63 viruses
whose resolved structure is stored in the VIPERdb viral capsid data bank. This comparison requires

2 This is an example of a strongly balanced tiling in the sense of Grünbaum and Shephard [30].

416

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

the introduction of a metric that measures a suitable distance between an all-atom model and the
trapezoidal tiling. Roughly speaking, the method in [54] involves projecting the three-dimensional
all-atom model radially outwards onto a sphere, and analyzing the “overlap” with a pure tiling. With
this metric, they find that their monohedral tilings fit experimental data in the majority of examples,
showing that the individual protein subunits in genetically distinct viruses from eight families exhibit
the same trapezoidal shape.

5 The Self-Assembly of Icosahedral Viruses

Our description so far has been restricted to the equilibrium structure of the viral capsid. We now turn
to the use of discrete geometry to understand the pathways of self-assembly.

5.1 Crytallization and Directed Self-Assembly
In a seminal experiment in 1955, Fraenkel-Conrat and Williams showed that fully functional tobacco
mosaic virus (TMV) could be created from a solution of purified RNA and capsid protein [25].
Their experiment suggested that the self-assembly of viruses was a form of crystallization. Ici
is the early dogma, as stated by Caspar and Klug in 1962 [10]:

Self assembly (of a virus) is a process akin to crystallization and is governed by the laws
of statistical mechanics. The protein subunits and the nucleic acid chain spontaneously
come together to form a simple virus particle because this is their lowest (free) energy
state.

But the dogma soon unraveled. A complete theory must describe the kinetics of energy minimi-
zation and predict the correct time scale for self-assembly. Cependant, the number of intermediate
conformations in virus self-assembly explodes combinatorially. This is similar to protein folding, et
as Levinthal pointed out [50], naive kinetics—such as sequential sampling of all states—predict a
time scale that is far too long. He suggested instead that protein folding is directed through specific
intermediates by sequential nucleation [51]:

We feel that protein folding is speeded and guided by the rapid formation of local
interactions which then determine the further folding of the peptide. This suggests local
amino acid sequences which form stable interactions and serve as nucleation points in the
folding process.

By the 1970s, the role of RNA-directed nucleation in the self-assembly of TMV was emphatically
demonstrated by Klug [41, 42]. He showed that RNA folds served as nucleation sites for conforma-
tion changes in protein subunits, allowing the capsid to flip between a stack of disks and a helix. Though
TMV is rodlike, RNA folds also serve as nucleation sites for conformational changes of capsid protein
in the self-assembly of icosahedral viruses, as we discuss later [9].

En résumé, we should note that the Levinthal paradox does not preclude the possibility that
stochastically forced energy minimization may lead a protein or virus to its minimum-energy state.
But it does emphatically point to the fact that such models are valid only if they predict realistic time
scales and pathways through a few dominant intermediates. En fait, energy minimization is widely
used in protein folding, and virus self-assembly is compatible with a more nuanced view of pathways
[14, 66, 77]. We now introduce a class of discrete geometric models that allows us to quantitatively
analyze the energy landscape and distinguished pathways for self-assembly of polyhedra. Quand
applied to ss-RNA viruses such as MS2, this class of models has recently been used to identify
sequence-specific nucleation sites and stable intermediates as in Levinthal’s original vision (voir [21]
and Section 6).

Artificial Life Volume 20, Nombre 4

417

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

5.2 Zlotnick’s Model, the Building Game, and Energy Landscapes
We now discuss an approach to virus assembly that was pioneered by Zlotnick [81]. A similar de-
scription in synthetic self-assembly was adopted by Hosokawa et al., though the planar geometry in
their model is rather simple [35]. The underlying physical caricature is a process of capsid assembly
driven solely by the attachment and detachment of single subunits. The geometry can be described
by the building game—a growth process on a fixed polyhedron that serves as a simple model for the
formation of fullerenes and viral capsids [76]. We first introduce the model and then discuss its
utility and flaws.

We formulate the building game as a coloring problem. We begin with a connected polyhedron P
with all faces colored white. We call this state □. At the first step, an arbitrary face is colored black.
At each step that follows, a white face that shares an edge with the contiguous, connected cluster of
black faces is colored black. The procedure terminates when all faces in the polyhedron are colored
black. We denote the terminal state by ▪. The configuration space, C, consists of the set of all distinct
(c'est à dire., non-congruent) black clusters formed with this algorithm, ranging from the empty configuration
□ to the complete polyhedron ▪. Each such arrangement of black clusters is called an intermediate. Deux
intermediates are neighbors if they differ by the attachment or detachment of a single black face.
Ainsi, the configuration space C is a graph, or more properly a multigraph, with the intermediates
as vertices and edges linking two neighbors. The number of black faces in an intermediate provides
a natural ordering of C as a partially ordered set. An assembly pathway is an increasing path in C that
begins with the empty cluster □, increases by one black face at each step, and terminates at ▪.

Zlotnick’s model allows both an energetic and a kinetic description of self-assembly. An energy
function is a map E : C → ℝ. In order to define the kinetics of assembly we assume that a state
j ∈ C transforms to a distinct state k ∈ C at a constant rate ajk ≥ 0. Then the evolution of the popu-
lation of states is given by the master equation

dcj
dt

akj ck −

kjk→j

kj j→k

ajkcj ;

dc
dt

¼ Ac;

ð1Þ

où (as is conventional) A denotes the matrix with entries ajk, ajj = −∑k| j →k ajk, and c = (c1,…,
c#C) denotes the connection of each intermediate in a configuration space C, with size denoted by
#C. The matrix A is the generator of a continuous-time Markov process on C, and the master
equation is the forward equation associated to this Markov process.

It is important to note at the outset that both energetic and kinetic descriptions are simple to
formulate. The complexity of self-assembly is a consequence of the complexity of the configuration
space C. The energy functions E and rate constants A used in applications are usually quite simple.
Par exemple, a natural energy function E assigns a unit energy to each edge (bond) between black
faces. This E depends only on the topology of the intermediates. It favors intermediates that are less
spread out (more “compact” in the jargon of chemists), since they form multiple bonds while grow-
ing, and have lower energy than intermediates that grow with the formation of only one bond at
each attachment step. De la même manière, a simple choice for rates is to assume that ajk is proportional to the
number of edges joining intermediates j and k. It is also common to introduce an inverse temper-
−hE on C, and to choose a matrix A that has the Gibbs distribution
ature h > 0 and Gibbs weights e
as a unique equilibrium distribution. (C'est, A has a unique left eigenvector with positive entries kj
proportional to e

−bEj, j = 1,…, #C.)

5.3 Dominant Intermediates and Pathways
This model has been used in applications as follows [81]. D'abord, the entire configuration space C has
been enumerated for a few solids (see Table 1). Suivant, the energy functions and solutions to the rate
equations are computed numerically and the solutions compared with experiments. Some care is
needed in these comparisons, since the number of distinct intermediates in C far outweighs the

418

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Tableau 1. A sample of enumerative results for the building game [36]. All results were determined by exhaustive
enumeration with a computer; we are not aware of theoretical bounds or an exact enumeration formula for the
building game. The numbers in brackets refer to shellable intermediates, edges between shellable intermediates, et
pathways through shellable intermediates, respectivement (voir [80] for more on shellability). The number of intermediates
for the Platonic solids is 1 more than in the results from [23], since we also count the complete polyhedron as an
intermédiaire.

Polyhedron

Tetrahedron

Cube

Octahedron

Dodecahedron

Icosahedron

Truncated tetrahedron

Cuboctahedron

Truncated cube

Truncated octahedron

Non. of faces

Non. of intermediates

Non. of edges in C

Non. of assembly pathways

from □ to▪

5 (5)

9 (8)

15 (12)

74 (53)

4 (4)

10 (8)

22 (14)

1 (1)

3 (2)

14 (4)

264 (156)

17,696 (2,166)

2650 (468)

17242 (1984)

57,396,146,640 (10,599,738)

29 (22)

65 (42)

402 (171)

341 (137)

1636 (470)

10,170,968 (6,258)

500 (248)

2731 (1002)

101,443,338 (5,232,294)

556 (343)

3071 (1466)

68,106,377 (5,704,138)

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

number of distinct intermediates that can be observed in experiments. (For a moderately complex
Goldberg polyhedron, we expect the number of intermediates to be greater than 1023.) De plus,
even for simple polyhedra such as the dodecahedron there has been no systematic attempt to
determine the rates ajk empirically. Plutôt, experimental comparisons are between solutions to
Équation 1 with assumed rate constants, and experimental observations of the population of a
few intermediate species of the capsid. This seems like a limited comparison at first sight, but it is
justified by an interesting empirical observation in the biochemistry literature: For several chemically
natural E and A, such as the examples above, the same intermediates dominate self-assembly [23].
Such a collapse of the complexity of C onto a few dominant intermediates is suggestive of deeper
mathematical structure. At present, this collapse is based on computational experiments in which
both E and A are used to define dominant intermediates and pathways. Every energy function E
may be used to rank states ( j ≤ k if Ej ≤ Ek), and to define the path of least energy between □ and ▪
(voir la figure 8). From a kinetic standpoint, given A, the states j may be ranked by considering the
equilibrium distribution kj associated to A. For several chemically natural choices of E and A, le
top few intermediates are the same. As we show in Section 9, a similar collapse onto a few dominant
intermediates is also seen in self-folding polyhedra. A broader empirical study of C, E, and A will be
presented in the forthcoming article [36].

5.4 Some Flaws in the Building-Game Model
There are many distinct models for virus self-assembly. We have emphasized the building-game
framework because it is best suited to our work on self-folding. Let us briefly discuss some short-
comings of this framework in the context of virus assembly.

The building game assumes that the final polyhedron is “known” even if assembly proceeds by
random attachment, and it cannot account for the formation of distinct or malformed polyhedra
from the same subunits. In the first experiments with fullerenes, it was noted that when one fullerene

Artificial Life Volume 20, Nombre 4

419

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 8. Lowest-energy intermediates for an energy function proportional to the number of bonds between faces [23].
The arrows between intermediates denote the numbers of forward and backward reactions connecting intermediates.
Par exemple, there are five edges of the pentagon at which a second pentagon may attach, giving five forward paths from
the first to the second intermediate. Either of the two faces can be removed from the second intermediate, giving
two backward paths.

was found, several others were found too (par exemple., C60 and C70 were found together) [44]. It is also known
that capsid assembly often leads to malformed and partially formed shells [57, 58]. Another, plus
serious flaw in this model is that it does not account for the maturation of the capsid protein. True
biological assembly of a number of plant viruses includes not just the formation of a proto-shell, mais
a second stage, maturation, that consists of growth and buckling of the capsid into a shape with
icosahedral symmetry that is critical for infectivity in some viruses [7, 37].

It is possible to partly address the formation of different polyhedra with a “semi-discrete” model.
Berger et al. [4] idealized each protein subunit as a point in space labeled with a set of finite con-
formations. A finite set of local rules then describes the geometry of the connections between these
labeled points. This model allows the formation of both complete and malformed capsids. Enfin,
extensive molecular dynamics (MARYLAND) simulations with varying degrees of approximation have been
used to model capsid assembly (see for instance [31, 57, 62]).

6 Self-Assembly of MS2—the Role of RNA Folding

6.1 RNA Folding and Hamiltonian Paths
The NCBI database lists the primary structure (c'est à dire., complete A,C,G,U sequence) of about 1000 ss-RNA
viruses. The tertiary (c'est à dire., folded, packed) structure of these genomes within the capsid in vivo is of
fundamental interest, and it is an important computational challenge to predict the tertiary structure
based on primary structure.

The icosahedral symmetry of the capsid need not be matched by a symmetric arrangement of
the genome within the capsid, but there are beautiful examples where this is indeed the case. Le
icosahedrally averaged distribution of the genomic RNA of MS2 is shown in Figure 5. Another
example of genomic symmetry is provided by nodaviridae such as Flock House virus (FHV) [72].
We summarize the discovery of Hamiltonian paths in the context of FHV below [65]. We then
discuss how Hamiltonian paths were recently exploited to obtain sequence-specific information
about the assembly of MS2 [18, 20, 21].

The FHV genome consists of two single strands of RNA that lie roughly on a dodecahedral cage
within the capsid. The RNA strands have an orientation, and RNA1 and RNA2 each cover the
dodecahedron, traversing every edge in opposite directions. The crossings of RNA1 and RNA2
at vertices can be classified intro three junctions (UN,B,C), with A having least energy and C having
the most energy. The A junctions are chiral and may be further divided into two types, A1 and A2
[65, Chiffre 3]. In the first approximation, the RNA folding problem is an energy minimization

420

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

problem on the set of all decorations of the dodecahedron such that: (je) all edges of the dodecahedron
are occupied by rigid, duplex RNA segments; (ii) all vertices of the dodecahedron are occupied by
an A, B, or C junction; (iii) the decoration has icosahedral symmetry [65]. The set of decorations is
also subject to topological constraints. D'abord, each RNA strand must remain connected. Deuxième, le
genome must contain no knots, since a knotted RNA molecule cannot be translated by the cellular
machinery of the host.

The condition that the RNA strands remain connected forbids a naive minimizer that places 20
identical A junctions at each vertex of the dodecahedron, because this decoration actually separates
each RNA strand into 12 separate rings. A more sophisticated possibility is to consider a decoration
that visits each vertex of the dodecahedron twice, in opposite directions, with a choice of A1 and A2
junctions at each vertex. Cependant, all such paths are knotted.3 Restricting attention to paths with A
and B vertices consisting of a path on the dodecahedron with side branches, Bruinsma and Rudnick
were finally led to consider the RNA molecule as a Hamiltonian path on the dodecahedron. While this
restriction was initially posed to find a topologically admissible least-energy configuration, it also has
interesting implications for assembly. The building game for the dodecahedron can be augmented to
include an RNA decoration. The intermediates now consist of connected black clusters along with a
Hamiltonian path that links the centers of these clusters (voir [20, Chiffre 1]). This is a fully discrete
configuration space, and as before, one can seek the lowest-energy pathway. Surprisingly, the states
of the capsid in the first nine steps of the least-energy Hamiltonian cycle assembly pathway coincide
with the assembly pathway shown in Figure 8.

6.2 Hamiltonian Paths and Packaging Signals in MS2
In the examples so far, virus self-assembly was modeled with ideas from statistical physics and
discrete geometry. None of these models incorporates information on virus self-assembly that is
specific to a particular genetic sequence. In very interesting recent work, Twarock, Stockley, et
their coworkers have combined the previous theoretical models with bioinformatics and experi-
ments to elucidate the assembly pathways of MS2 and the evolutionarily related phage GA, avec
careful attention to the sequence specificity of the assembly process. In order to explain their work,
it is necessary to appreciate the structural role that RNA plays in the assembly process for MS2.

We note at the outset that conformational changes in the capsid protein(s) help balance genetic
economy and structural symmetry. Par exemple, if the genome of an icosahedral virus codes for just
one coat protein, and the coarse-grained capsid is a Goldberg polyhedron, conformation changes in
the protein allow it to flip between pentamers and hexamers. This conformation change is accel-
erated by RNA folding. As the RNA folds, exposed stem loops serve as nucleation sites for the
conformation change in the capsid protein [19, 22]. These loops are known as packaging signals.
The packaging signals associated to a particular genome are sequence-specific. They are not all of
the same strength, and typically only the packaging signals with high affinity are well understood. Nous
now explain how Hamiltonian paths were recently used to discover a hierarchy of packaging signals
for the MS2 genome [18, 20].

As we have noted above, the MS2 genome codes for four proteins. Of these, only two—the
maturation protein and the coat protein (labeled A)—are used in the capsid. The capsid contains
1 copy of the maturation protein and 180 copies of the coat protein. A glance at Figure 4 reveals,
cependant, that the coat protein exists in three different conformations, labeled A, B, and C. Dans
solution, the coat proteins bind in pairs to coexist as asymmetric A/B dimers and symmetric C/C
dimers. RNA stem loops trigger conformation changes in the coat protein from symmetric C/C
dimers to asymmetric A/B dimers.

The discovery of the crystalline structure of the RNA genome, in combination with the crystal-
line structure of the capsid, suggests that the process of self-assembly in some viruses should be
viewed as a co-assembly of RNA and capsid. The RNA is packaged so that it traces the shape of the

3 This is a computational observation in [65], and it is not clear if a rigorous proof exists.

Artificial Life Volume 20, Nombre 4

421

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

polyhedron shown in Figure 5 as the capsid assembles into the polyhedron shown in Figure 4. Le
first conformation change occurs at the highest-affinity packaging signal on the RNA strand. Ce
packaging signal is a well-understood 19-nt sequence of the MS2 genome, called the TR site. Le
coat protein first binds to the RNA at the TR site, flipping conformation in the process in order to
facilitate the formation of fivefold clusters, avoiding electrostatic repulsion of the F-G loops around
these axes. As the RNA binds, the remainder of the RNA is rearranged (some stem loops are already
présent; some others might form in the process), allowing for further binding events to promote
capsid formation. This heuristic picture can be modeled accurately with discrete geometry, by a suit-
able extension of the building game. Intermediates consist of partially formed capsid polyhedra that
are decorated by segments on the RNA polyhedron, with each vertex on the RNA polyhedron
corresponding to a packaging signal (voir [20, Chiffre 1]). Figures 4 et 5 now suggest that there
must be 60 packaging signals on the RNA genome that serve as nucleation sites for conformational
flips as co-assembly proceeds. Cependant, the TR site is atypical—very few packaging signals are
known, certainly not 60 of them. How could such packaging signals be discovered? We now explain
how Hamiltonian paths elucidate a sequence-specific question: Where do packaging signals lie on the
RNA genome?

A solely bionformatic search for packaging signals involves the statistical analysis of primary
structure along with computations of secondary structure. Roughly speaking, the packaging signal
is a short motif within the secondary structure, and these motifs are sought among a large number of
computer-generated possible secondary structures. Such an analysis is possible for short motifs in
short genomes (Par exemple, for the ss-RNA of the virus STMV); cependant, it is computationally
intractable for MS2. In order to search for packaging signals in MS2, Twarock and coworkers
assume that the RNA is folded along a Hamiltonian path on the RNA polyhedron shown in Figure 5.
Il y a 40,678 Hamiltonian paths with icosahedral symmetry linking the 60 vertices. A further
restriction on this set is provided by the single unit of maturation protein—it binds at two sites near
le 50 et 30 ends on the genome, effectively converting it (topologically) to a circle. De la 40,678
Hamiltonian paths, only 66 are consistent with binding the maturation protein. It is computationally
tractable to compute the co-assembly intermediates along each of these 66 paths, and to compare
the icosahedrally averaged density of the RNA with experimental data from mass spectrometry
and cryo-EM. From these computations, they discover that only 3 of the 66 paths are admissible
[20, 18]. The geometric arrangement of the RNA can then be combined with bionformatic analysis
of sequence data to identify packaging signals. Roughly, the search in sequence space for packaging
signal motifs becomes tractable once there are only a few folding pathways that need to be considered.
In this manner, a family of 60 packaging signals and a precise co-assembly sequence have been
discovered for MS2.

Remarquablement, the same (discrete-geometric) assembly pathway is also used by the evolutionarily
related ss-RNA virus GA. Both MS2 and GA are leviviridae that infect enterobacteria. Their genomes
are roughly the same size and code for four proteins. While a comparitive study of four related
enterobacteriaphages (GA, MS2, fr, and Qbeta) revealed that they are T = 3 viruses with similar
crystalline structure [39], the recent studies based on Hamiltonian paths show that these two viruses
share the same assembly pathway. En résumé, these recent studies demonstrate a convergence of
several themes: A basic assembly model rooted in discrete geometry and statistical physics may be
combined with the analysis of genetic sequences and experimental data to provide a concrete and elegant
instance of an assembly pathway from primary sequence to quaternary structure that is conserved by
evolution.

7 Physical Modeling of Surface-Tension-Driven Self-Folding

We now return to surface-tension-driven self-folding and the basic questions that drive our work:
Why do some nets fold better than others? Are there dominant pathways for succesful self-folding?
What is common to natural and synthetic self-assembly?

422

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

We will use a discrete geometric framework to analyze these questions. As in the models of viral
capsid assembly, we first define and compute an ideal configuration space C, define an energy land-
scape and a master equation on C, and use these to define dominant pathways and intermediates. Dans
order to motivate a discrete geometric model, we first review the physics of the surface tension
hinge and some prior work. Our work culminates in a comparison between theory and experiment
for self-folding polyhedra that is described in Section 9.

7.1 The Surface-Tension Hinge
We first contrast the relative magnitudes of capillarity and thermal fluctuations. In the experiments
reported here, the hinges are made of solder. The side of each panel is 300 Am, and the sides of
−8 m2. A typical
hinges are in the range 30–60 Am, so the area of the solder is on the order of 10
value of the surface tension of solder is 0.5 N/m. Ainsi, an approximate scale of the capillarity is
−8 J. The temperature of the solvent is at most 200°C = 473 K, which gives typical thermal
10
energy kT ≈ 6.5 × 10

−21 J. Ainsi, capillarity strongly outweighs thermal fluctuations.

In order to model the hinge, we neglect dynamics, gravity, and three-dimensional effects. In static
equilibrium, the Young-Laplace law tells us that the interface of the solder droplet is a circular arc
that meets the panel at the wetting angle. The experimental setup determines the area of the solder
droplet and the wetting angle. These two variables then determine the angle of hinge rotation. Là
are three cases, as shown in Figure 9. The crucial assumption is that molten solder does not wet the
adjacent parts of the panels. This implies k/2 < h < k. Let L denote the length of the segments AP and BP. We then find with some basic trigonometry that the area of the droplet, A(f), is given by R ¼ L cos f sinðh − fÞ ; AðfÞ ¼ R2ðh − fÞ − LR cos h; k=2 < h < k: ð2Þ In this regime, the area strictly decreases with increasing f and can be inverted to yield f(h, A). Thus, the amount of solder deposited controls the angle of rotation of the hinge. When the droplet wets the panel (h ∈ (0, k/2)), the area is not a monotonically decreasing function of A and there can be two positions of static equilibrium for a given area. To summarize, it is necessary that the panels should not be wetted by the solder. Equation (2) then serves as a first approximation in the laboratory. Figure 9. The surface tension hinge. Shaded area is the solder droplet, h is the wetting angle, and 2f is the angle of rotation of the hinge. (a) Wetting and a concave interface, 0 < h < f; (b) wetting and a convex interface, f < h < k/2; (c) non-wetting and a convex interface, k/2 < h < k. Artificial Life Volume 20, Number 4 423 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u a r t l / / l a r t i c e - p d f / / / / 2 0 4 4 0 9 1 6 6 4 8 9 5 a r t l / _ a _ 0 0 1 4 4 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 R. Kaplan et al. Building Polyhedra by Self-Assembly: Theory and Experiment Figure 10. Experimental schematic of nets: (a) dodecahedron; (b) octahedron. 7.2 Locking Hinges and Compact Nets In order to realize the precise dihedral angle between faces (for example, 116.6° for the dodeca- hedron), it is also necessary to incorporate additional solder hinges at the external edges. We term these locking or self-aligning hinges in contrast with the folding hinges that lie within the net (see Figure 10). While the amount of solder deposited provides control over the dihedral angle of folding, cooperativity between panels also plays an important role. For instance, the truncated octahedron has two distinct dihedral angles (109.47° and 125.26°), and it may be succesfully self-folded with an equal amount of solder on each folding hinge [60]. Our discrete model of self-folding relies on two experimental observations: (i) Folding begins at the outer boundary of the net; (ii) locking hinges impose the dihedral angle. We explain (i) as follows. While inertial forces are much weaker than capillarity, their effect is strong enough to cause the outer faces to fold first. As for (ii), while the amount of solder deposited at the folding hinge provides some control over the angle of folding, it does not ensure the exact dihedral angle. Instead, when the two faces meet, their locking hinges meet and self-align to correct the dihedral angle. The role of the locking hinge is especially strong for corners that are rigid (e.g., when three squares or regular pentagons meet at a corner ). As a consequence, nets that have more corners where locking hinges meet self-fold with higher yield. This heuristic idea was first investigated for self-folding cubes and octahedra [3]. The main guide- line for the design of nets that emerged from [3] is the notion that compactness determines the success of self-folding. The use of the term compactness was motivated by its use in protein folding. Since we do not wish to confuse our use of compactness with its common meaning in mathematics, let us describe precisely what it means here. The nets for a given polyhedron differ in their geometry and combinatorial topology. A natural geometric quantification of the intuitive idea that a compact net is “not too spread out” is to compute the radius of gyration Rg of a planar net, defined by ¼ R2 g Z S jx − x̄j2dA; x̄ ¼ R S xdAR S dA : ð3Þ Here S ⊂ ℝ2 is the planar net, and x ∈ ℝ2. Nets also differ in their combinatorial topology. Since an important role in folding is played by the corners at which locking hinges lock panels in place, it is natural to expect that the more such corners a net possesses, the higher the likelihood of suc- cesful folding. In [3, 60] a vertex on the boundary of the net was called a vertex connection if it is shared by two faces that do not share a common edge (see Figure 10). This motivates the use of the total number of vertex connections of a net (denoted Vc ) as a purely combinatorial measure of 424 Artificial Life Volume 20, Number 4 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u a r t l / / l a r t i c e - p d f / / / / 2 0 4 4 0 9 1 6 6 4 8 9 5 a r t l / _ a _ 0 0 1 4 4 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 R. Kaplan et al. Building Polyhedra by Self-Assembly: Theory and Experiment compactness (again, compactness is meant in the intuitive sense of “not too spread out”). The use of Rg and Vc as predictors of yield was tested systematically for the dodecahedron, icosahedrons, and truncated octahedron in [60]. 8 Discrete Models of Self-Folding Pathways 8.1 Gluing at Vertex Connections and the Configuration Space C We idealize the experimental observations of self-folding in a discrete folding algorithm termed gluing at vertex connections [60]. A self-folding pathway is treated as a discrete sequence of states p ¼ ðS0; S1; …; SnÞ between a net N = S0 and the convex polyhedron P = Sn. While each state Sk is a collection of edges, faces, and vertices, we identify it with the closure of its embedded image in Euclidean space for simplicity of description. We generalize the notion of folding and locking hinges and vertex connections from the planar net S0 to each state Sk: We call the edges on the boundary of Sk locking hinges, and we call all other edges folding hinges. A vertex on the boundary of Sk is called a vertex connection if (i) it lies on the boundary of two faces of Sk that do not share a folding hinge, and (ii) the exterior angle at the vertex in the plane of these faces is less than k.4 The states Sk are generated in sequence from the state S0 as follows. For each k ≥ 0 a vertex connection vk of Sk is chosen, and all locking hinges that meet at vk are glued in pairs to form a state Sk+1. When gluing faces at locking hinges, the faces are only allowed to rotate rigidly through the dihedral angle about folding hinges that meet at vk. The procedure terminates when the polyhedron is formed. As in the building game, the set of distinct (i.e., non-congruent) states obtained by enumerating all pathways is called the configuration space C. (Two states S; S̃ ⊂ ℝ3 are congruent if there is a map L(x) = a + Qx with a ∈ ℝ3 and a rotation matrix Q ∈ SO(3) such that LðSÞ ¼ S̃.) States that can be linked by gluing vertex connections are neighbors. Thus, C is a multigraph with an edge for each move that folds or unfolds a state into a neighbor. We find it more convenient to treat C as a graph and to keep track of the number of ways a state Sj can fold into a state Sk. We call this integer the degeneracy and denote it by Djk. The space C is also a poset, because the number of edges glued provides a partial order on states. We use this ordering in the visual representations of C for the cube [60, Figure 7] and dodecahedron (Figure 16). Some shortcomings of this model must be noted at the outset. First, when the folding algo- rithm gives rise to flexible corners, it is necessary to augment the configuration space with addi- tional states in order to obtain a reasonable description of experiments. Such an extended configuration space for the octahedron is described in [61]. Second, the apparent generality of the formulation should not disguise the fact that it has only been used to compute configuration spaces for a few simple experimentally feasible polyhedra. We have not shown that the algorithm is well posed in general. That is, given an arbitrary convex polyhedron and a net for this poly- hedron, we have not shown that the algorithm above will fold the net into the polyhedron. Within the restricted setting of a few polyhedra, we show that the notion of a discrete configuration space provides a useful description for the analysis and design of experiments and suggests inter- esting mathematical questions. 8.2 The Distance Function Since the states in C are embedded in ℝ3, we assigned a cost d(Sk, Sk+1) to each edge between neighboring states Sk; Skþ1 ∈ C based on the distance traveled by faces in ℝ3 as the polyhedral link- age Sk is transformed into Sk+1 by gluing at a vertex connection [60]. It is convenient to informally refer to the cost d(Sk, Sk+1) as the distance between Sk and Sk+1. The underlying heuristic idea here is that the energy dissipated when moving a panel scales linearly with Euclidean distance in an overdamped flow. Thus, the distance between the net and polyhedron serves as an easily computed 4 Note that condition (ii) was not explicitly stated in [60], though it is implicit in the nets studied there. Artificial Life Volume 20, Number 4 425 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u a r t l / / l a r t i c e - p d f / / / / 2 0 4 4 0 9 1 6 6 4 8 9 5 a r t l / _ a _ 0 0 1 4 4 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 R. Kaplan et al. Building Polyhedra by Self-Assembly: Theory and Experiment proxy for the viscous energy dissipated along an assembly pathway, which incorporates the kine- matic constraints of folding by rigid rotations about folding hinges. We now recall the definition of d from [60]. We first define the distance between two surfaces that differ by a rotation, and then extend this formula to a distance between two neighbors in C. Assume given a surface S that consists of a collection of polygons, an edge e that separates S into two polygonal surfaces V1 and V2 with common boundary e, and an angle of rotation u. We define a new surface S̃ by rotating V1 and V2 rigidly about e through u. In this process, each point on the surface Vi at a distance r from the edge e moves a distance i (r ui)2dA = Ii ui = u. The total squared distance traveled by Vi is then r ui, where u 2, where Ii denotes the first moment of area of Vi about the axis e. We then define the cost of this move, (cid:3) (cid:2) d S; S̃ , by the formula 1 + u R V 2 (cid:3) (cid:2) d 2 S; S̃ ¼ min u1þu2¼u I1u2 1 þ I2u2 2 ¼ I1I2 I1 þ I2 u2: ð4Þ The transformation of a state Sk into a state Sk+1 by gluing all faces at the vertex connection vk consists of a number of substeps each of which involves a single rotation. The distance between Sk and Sk+1 is defined by minimizing the distance over the set of possible surfaces between Sk and Sk+1 that differ by the rotation of a single face. Let us make this precise. Assume faces f1,…, fl are glued at vk by rotating each face about the edge e1,…, el. Let j denote a permutation on l symbols, and assume j = Sk+1 1, fj the faces are rotated in the order fj each of which differs from its preceding state by the rotation of fj j about ej through the dihedral j , 0 ≤ j ≤ l − 1, differ through a rotation, the distance between them is angle. Since T j j do not lie in C except at the endpoints j = 0 and defined by Equation 4. (Note that the surfaces Tj j = l.) We then set l. We then form the surfaces T 0 j = Sk, T 1 j and Tj +1 j ,…, T l 2,…, fj d 2ðSk; Skþ1Þ ¼ minj (cid:2) d 2 T j k ; T j kþ1 (cid:3) : Xl −1 k¼0 ð5Þ ≪ I2 than if I1 The above definition of d has the following important intuitive consequence. For fixed u, d S; S̃ ≈ I2. Thus, it costs less to fold a face on an extremity of S than to is smaller if I1 rotate two equally balanced domains V 2. Hence, when computing shortest paths from a net to a polyhedron we usually find that the faces on the boundary of S fold first, until no such moves are possible. 1 and V (cid:2) (cid:3) 8.3 Computational Determination of Dominant Intermediates We use two approaches to determine dominant intermediates and pathways. The first is based on cost functions on C, and the second on the kinetics of assembly. The method of cost functions takes the following form: We assign a weight to each edge in C. The cost of a path from a net S0 to the polyhedron P is the sum of the costs over each edge in the path. We then look for common structure in the optimal paths linking S0 and P. For example, are these paths spread out on a large set, or do they funnel through a few intermediates? We have mainly used the distance (5) as the cost function. Since the configuration space is discrete, we distinguish between geodesic paths, which are obtained by minimizing the distance over all paths in the configu- ration space that connect S0 and P, and greedy paths, obtained by choosing Sk+1 to be the nearest neighbor “below” Sk at each step. The kinetics of assembly can be modeled by the master equation (1) if we assume that a large number of nets fold simultaneously. In contrast with virus self-assembly, we always assume that folding is irreversible. As a consequence, A is lower triangular in a basis where the states are ordered 426 Artificial Life Volume 20, Number 4 l D o w n o a d e d f r o m h t t p : / / d i r e c t . m i t . e d u a r t l / / l a r t i c e - p d f / / / / 2 0 4 4 0 9 1 6 6 4 8 9 5 a r t l / _ a _ 0 0 1 4 4 p d . f b y g u e s t t o n 0 8 S e p e m b e r 2 0 2 3 R. Kaplan et al. Building Polyhedra by Self-Assembly: Theory and Experiment in levels according to the number of locking hinges glued (as shown in [60, Figure 7] and Figure 16). The matrix A is mainly composed of lower triangular blocks, where the kth block links level k with level k + 1. However, there can be exceptional nonzero entries because of folding moves that jump more than one level. Since A is triangular, its diagonal entries are its eigenvalues and the spectral decomposition is easily computed. For configuration spaces such as the cube and dodecahedron that include a single terminal state, all diagonal entries except the last are strictly negative and the unique normalized equilibrium distribution is concentrated on the terminal state, i.e., c = (0,…, 0, 1). Since folding is irreversible, the equilibrium distribution does not provide much information on dominant intermediates. Instead, we must use the transient behavior of c(t ). We choose a time scale in Equation 1 so that the spectral gap is 1 and consider the average pðtÞ ¼ 1 t Z t 0 (cid:4) cðsÞds ¼ 1 t Z t 0 (cid:5) eAsds cð0Þ ð6Þ at m points t1,…, tm. The initial condition cj (0) is 1 if j corresponds to a net, and 0 otherwise. We rank the intermediates according to their relative magnitude in p(tj ) and consider the top M states at times t1,…, tm. We then score the intermediates by a weight wj for j = 1,…, m, and rank the intermediates that appear in the top-M list according to their cumulative score. Of course, if tm ≪ 1, then all the nets appear, and if t1 ≫ 1 only the complete polyhedron appears. In the examples, we have chosen m = 4, t1 = 2, t2 = 4, t3 = 6, t4 = 8, wj = 1 for all j, M = 10 for the dodecahedron, and M = 5 for the octahedron. We call the resulting intermediates prevalent. In our experiments, we have considered two main choices for A. The first of these is the degeneracy, and we set ajk = Djk. We have also considered the kinetics under rates of the form −x, and f(x) = 1/ ajk = f (d (Sj , Sk )) where f is a positive decreasing function (we used f(x) = 1/x, f(x) = e (log(1+x)), x > 0). Note that the degeneracy Djk relies only on the combinatorial topology of
R and is independent of the distance d(Sj , Sk); thus these two measures sense quite different
properties of R.

These methods of extracting dominant intermediates are admittedly ad hoc, as is our empha-
sis on the distance (5). Par exemple, it is clear from Equation 6 that prevalence is determined by
the spectral decomposition of A. We found the above scheme convenient; other choices are
certainly possible. We have not (yet) shown that the geodesic distance between the net and poly-
hedron is correlated with the energy dissipated in the Stokes flow. Nor have we fully justified the
use of the distance function to define rates A. At this time, the main justification for these
choices is that they yield dominant intermediates that agree with experiments in a manner that
is insensitive to the choice of cost functions and kinetics. We view these results as a reflection
of the robust nature of the energy landscape of self-folding for these polyhedra, and expect them
to be reinforced by detailed studies with more realistic cost functions and rates. The motivation
for these choices in connection with other heuristics about self-assembly are discussed again in
Section 10.2.

9 Self-Folding the Dodecahedron

9.1 Description of the Experiments
Since the dodecahedron has 43,380 nets, we restrict ourselves to a much smaller set of nets to
obtain a manageable comparison with experiment. An exhaustive search through all 43,380 nets
reveals that only 21 of them have the maximal number of vertex connections (Vc = 10 in this
case; voir la figure 11). We compute all intermediates that originate in these nets to find a poset with
2799 vertices that we call the restricted configuration space R. This provides a piece of C that is
large enough to show interesting structure, but small enough to be experimentally tractable. UN

Artificial Life Volume 20, Nombre 4

427

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

Chiffre 11. Le 21 nets with maximal Vc and self-folded dodecahedra. Nets 1–21 are ranked in order of decreasing Rg. Dans
each portrayal, the image on the left is an optical-microscope image of the net. On the right is a scanning electron
microscope (SEM) image of the self-folded dodecahedron. The scale bar is 300 Am. Net 17 was used by Steinhaus as
shown in Figure 2.

brief description of the computational scheme used to compute R is presented at the end of this
section.

In order to formulate questions that allow meaningful contact with the experiments, it is neces-
sary to first describe the scale of the experiments. All 21 nets were patterned on a single mask in
random order in order to minimize the role of manufacturing defects. 50 samples of each net were
self-folded in the experiment. Nets were released from the substrate, folded in batches of 60 chaque
(so approximately 3 from each net), then visually sorted under an optical microscope into three
different grades. All nets and high-quality SEM images of dodecahedra folded from these nets are
shown in Figure 11. The yield from the experiments is shown in Figure 12. This set of experiments

428

Artificial Life Volume 20, Nombre 4

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 12. Yield of self-folded dodecahedra. Each self-folded polyhedron is examined visually and sorted into three
grades. Grade A polyhedra have no visual flaws; grade B polyhedra allow misfoldings by at most 20°; grade C polyhedra
have faces that are misaligned by more than 20°. Only comparisons that differ by more than 5% are statistically mean-
ingful. Par exemple, the experiments do not allow us to compare net 14 and net 21. Cependant, we can conclude that
net 14 is superior to net 5.

allows a statistically meaningful formulation of the yield problem: How can the yield of self-folded
polyhedra from a particular net be estimated a priori?

In addition to the above experiments, movies of the folding process were made for all 21 nets.
Each of these movies was based on the folding of a single net in isolation. In order to compare
the observed folding process with pathways computed from a discrete model, we visually extract
snapshots of the folding process (as shown in Figures 13–15). Since we follow the folding of a
single net in isolation in each movie, a direct comparison between theoretically computed and
experimentally observed pathways is statistically meaningful only when we focus on the emergence

Chiffre 13. Pathways of self-folding: je. Optical-microscope images of folding pathways of nets 1–7.

Artificial Life Volume 20, Nombre 4

429

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 14. Pathways of self-folding: II. Optical-microscope images of folding pathways of nets 8–15.

of a few dominant intermediates (since we then have a sample size of 21). These experimental
comparisons may appear restricted at first sight, but it should be recognized that the experiments
are challenging and the visual sorting of states is time-consuming. The results we report are the first
systematic examination of pathways of synthetic self-assembly on this length scale.

Chiffre 15. Pathways of self-folding: III. Optical-microscope images of folding pathways of nets 15–21.

430

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 16. Part of the configuration space of the dodecahedron. A complete subgraph of the restricted configuration
space R is shown. The space R is organized in ten levels. The uppermost level (level 1) consists of the 21 nets, et le
lowest level (level 10) consists of the complete polyhedron. There are many more edges and vertices than are shown.
This graph was obtained by pruning R according to the prevalence of intermediates for a particular choice of the rate
matrix A. Some of the numbered states are shown in Figure 17.

9.2 Results
The configuration space R is too large to be represented effectively in a figure. In order to help
the reader understand the nature of R, we portray a subgraph of R in Figure 16 and Figure 17. Dans
comparison with the simpler configuration spaces of the cube (voir [60, Chiffre 7]) and octahedron
[61], we note that the dodecahedron involves many more levels of folding intermediates. In the early
étapes, faces fold independently at the boundary of the dodecahedron net. Ainsi, the comparison
with experimental pathways is more meaningful in the later stages of folding.

There are typically between 106 à 107 folding pathways from a net to the dodecahedron
in R. We considered three main computations on these folding pathways: (je) predicting yield;
(ii) computation of shortest paths; (iii) determining prevalent intermediates. We discuss these in
turn.

In order to predict yield, we assigned a probability to each path from the net to the polyhedron
as well as a model yield function. Recall that a pathway p ¼ ðS0; S1; …; SmÞ is a sequence of states
from the net S0 to the polyhedron Sm. In order to assign a probability to each pathway, we first
assign a weight w (Sj, Sj +1) ≥ 0 to each edge and then define the weight of each pathway to be
m −1 wðSj ; Sjþ1Þ. A yield function on a pathway is defined in a similar manner. D'abord, chaque
wðpÞ ¼
j¼0
edge is assigned a cost c S; S̃
≥ 0, and the cost for a pathway p is the sum of costs along the edges
in the path. Both these notions are used to introduce a weighted cost function C(N ) and expected
yield Y(N ) for each net N as follows:

(cid:2)

(cid:3)

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

CðN Þ ¼

p∈PN
P.
p∈P

cðpÞwðpÞ
N wðpÞ

; Y ðN Þ ¼ CðN Þ
CðS0Þ

Artificial Life Volume 20, Nombre 4

ð7Þ

431

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 17. Selected intermediates for the dodecahedron States from the subgraph of R are represented with models.
Upper left: level 3. Upper right: level 6. Lower left: level 8. Lower right: level 9.

Here PN denotes the set of pathways that originate at net N. The expected yield Y(N ) is normalized
by averaging over all initial nets.

The above procedure formalizes the problem of predicting yield. Cependant, it involves both a
choice of weight on paths w and a cost function c. We explored various choices and compared
the numerical computations with the experimental yield. The weight was chosen to be w(Sj, Sj +1) =
exp (−hd(Sj, Sj +1)), where h > 0 is an ad hoc parameter. Various models of the cost function c were
investigated, including the distance between two states, the degeneracy, and the number of non-rigid
vertices in the state S. While it was possible to vary the parameter h and find the best match between
Oui(N ) and the observed yield, these comparisons did not convey much insight. De plus, the com-
parison of predicted Y(N ) with experiment was difficult, since the error bars on experimental yield
measurements are sufficiently large that several nets are clustered together (since only 50 samples are
folded for each net). In presenting this procedure, our main hope is that it provides guidelines for
statistical estimation and prediction at a future time when a much larger number of samples can be
tested simultaneously and sorted automatically.

On the other hand, the comparison of computed pathways and experiments does convey gen-
uine insight, and these results are reported in detail. We computed both geodesics and greedy
paths by evaluating the distance function on these paths. In order not to overwhelm the reader
with data, we classify the observed pathways according to the penultimate intermediate. These
results are shown in Figures 18 et 19 along with a few experimental pathways for ease of
comparison. While the space R contains 23 states on level 9, an important feature of these
computations is that the greedy paths and global geodesics funnel through only four of these 23
states. De plus, all the greedy paths funnel through states that consist of two half dodecahedra
linked by a hinge (states 2788, 2791, et 2797). Of these three, 2797 is the most important, depuis
16 of the 21 nets funnel through it. These three states, et 2797 in particular, are especially con-
ducive to self-folding with no misfolding: Each half dodecahedron is rigid, and the intermediate has
a single rotational degree of freedom about the hinge. When the two halves mate, several edges lock
into place simultaneously.

When we compare with experiments, we find that the greedy paths replicate the observed path-
ways better. These computed pathways are a rough approximation of the true continuous pathway,
and the experiments do not permit a statistically reliable resolution of fine distinctions such as whether

432

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 18. Greedy paths for the dodecahedron and comparison with experiment. Upper: net 3; middle: net 17; lower:
net 18. Nets 3, 4, et 5 pass through state 2788 (upper ). Nets 12 et 18 pass through state 2791 (lower ). All 16
remaining nets pass through state 2797 (middle).

intermédiaire 2788, 2791, ou 2797 best approximates the observed half dodecahedra. Nevertheless, le
trend towards such pathways is clear, as is the fact that the greedy paths are much better approximations
to the observed pathways than the global geodesics.

The kinetically dominant intermediates were determined by the method outlined in Section 8.3.
The prevalent intermediates are shown in Figure 20. It is important to note that all rate functions
yield exactly the same set of prevalent intermediates and that these are in accordance with the domi-
nant intermediates determined from the funnel of greedy paths.

Chiffre 19. Geodesics (global shortest paths) between a net and the dodecahedron. Upper: net 3; middle: net 17; lower:
net 18. Nets 3, 4, et 5 pass through 2788 (upper ). Nets 1, 6, 8, 9, 17, 20, et 21 pass through 2798 (middle). All 11
other nets pass through state 2791 (lower ).

Artificial Life Volume 20, Nombre 4

433

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Chiffre 20. Kinetically dominant intermediates. Le 10 most prevalent states for the dodecahedron in decreasing order
(2799 precedes 2788; 2788 precedes 2703).

9.3 Computational Scheme
A Mathematica package was written to compute R. The first step was to find the 21 nets of the
dodecahedron with maximum Vc. To do so, we enumerated all spanning trees on the 1-skeleton of
the dodecahedron, used these to obtain all 43,380 nets for the dodecahedron, and computed Vc
for each net. The spanning trees were enumerated by algorithm S in Knuth’s book [43, §7.2.1.6].
Algorithm S takes time exponentional in the number of vertices, but it is computationally feasible for
the dodecahedron. We then chose the 21 nets with maximum Vc and computed R by performing
each possible folding move on each net, testing intermediates for congruence as needed. For each
pair of intermediates Sj and Sj +1 on a folding pathway, we kept track of the degeneracy of the edge
linking these states (the number of times that intermediate i folded into j ), as well the distance
between intermediate states. It was easy to calculate greedy paths: We started at an initial vertex
and repeatedly chose the vertex at the end of the cheapest outgoing edge, keeping track of the edges
traversed. Since R is a poset, its geodesics were found with dynamic programming in time pro-
portional to the size of R.

10 Outlook

We present our conclusions in two parts. D'abord, we give a set of findings specific to polyhedra. Deuxième,
we speculate more broadly on the role of mathematical theory in the field of self-assembly. Here we
discuss some recent investigations that are similar in spirit to our work, and suggest areas for common
inquiry.

10.1 Synthetic Self-Assembly of Polyhedra
Viruses are marvels of natural engineering and have inspired several synthetic emulations. While we
have focused on self-folding, other important synthetic examples arise in supramolecular chemistry
and DNA nanotechnology. In the first category, chemists have explored the assembly of polyhedra
from molecular tiles held together by non-covalent bonds. Important recent examples of such
molecular design are the work of Fujita’s and Ward’s groups [53, 71]. DNA nanotechnology
emerged from Seeman’s fundamental insight that the stiffness of DNA molecules could conceivably
be used to design and build molecular crystals independent of any biological context. This is a
booming area of enquiry that we cannot adequately survey (voir [64, 67, 79] for examples of break-
through experiments, et [5, 12, 16] for examples of polyhedra built with synthetic DNA). Le
conceptual framework outlined in this article is well suited to the design and analysis of these
self-assembling systems. While the modular subunits vary from experiment to experiment—they
may be edges [5, 71], faces [53], or junctions [34]—a common feature of the assembly pathways
is that they are described by combinatorial decompositions of polyhedra. Ainsi, in each instance, nous
may construct a configuration space for the analysis of assembly.

434

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

Cependant, few properties of these configuration spaces are understood. À ce jour, the shapes built
by synthetic self-assembly are relatively simple polyhedra—the Platonic solids, simple prisms, et
simpler Archimedean solids. In order to build more complex shapes, even moderately complex
polyhedra (20–30 faces), it is necessary to understand how the physics of a particular technique
of self-assembly can be coupled with geometric combinatorics to provide sufficient control of
the assembly pathway. Our work constitutes mainly an exploration of the geometry of folding, avec
the use of optimal paths serving as a crude approximation of the physics. Let us stress two key
findings.

D'abord, depite a combinatorial explosion in the configuration space, we found that pathways focus
through a few dominant intermediates. We used two measures to quantify “dominance”: (je) metric:
greedy paths on C or E were computed and seen to concentrate on a small subset of possible inter-
mediates; (ii) kinetic: prevalent intermediates were computed using the kinetic equation (1) for several
choices of rates. Both methods give essentially the same set of intermediates. De plus, computed
pathways compare favorably with experiment.

Deuxième, we discovered that the dominant intermediates have good rigidity properties. More pre-
cisely, intermediates such as 2797 for the dodecahedron, and similar intermediates for the cube and
truncated octahedron, have a single rotational degree of freedom about a central hinge. Passage
through these intermediates is particularly well suited to self-folding, since several edges mate simul-
taneously, reducing the chance of misfolding. Note that our cost functions have no a priori connec-
tion to rigidity theory. Ainsi, the rigidity of the dominant intermediates is surprising. Recent
experiments in self-folding show that modifications to precursor nets such as removal of even a
single panel can have a strong influence on self-assembly and direct pathways through more rigid
intermediates. This strategy was used to enrich one octahedral isomer over another [61].

These lessons are not restricted to self-folding. Dominant compact intermediates and pathways
have been discovered in several model systems. Various measures of dominance seem to lead to the
same intermediates, and further these intermediates typically have favorable properties such as com-
pactness, energetic stability, or fewer degrees of freedom. The principal challenge in engineering self-
assembling systems is to understand how to guide a system through these intermediates. Viruses
such as MS2 use an exquisite interplay between RNA folding and capsid assembly to direct succesful
assembly. At present, such precise direction lies outside the scope of any synthetic technique
(though current experiments in DNA nanotechnology are exploring the design of DNA sequences
that break assembly into a hierarchy of steps). The decoding of the intricate sequence-specific math-
ematical structure of the assembly process for MS2 suggests an inspiring inverse challenge: Can we
design a (synthetic) sequence of nucleotides that code for protein shells that assemble as gracefully
as MS2?

10.2 Discrete Geometry and Self-Assembly
Polyhedra are mathematical objects of antiquity. Cependant, studies of self-assembling polyhedra
give rise to new questions that are of practical importance and intrinsic interest. The examples
surveyed in the article show that essential insights into the biological, chemical, or physical
mechanism of self-assembly can be captured with a discrete algorithm for building a polyhedron.
Similar ideas are also used to model protein folding [46, 66] and directed assembly of sphere
packings [2, 56]. The common structure of these models involves a geometrically defined discrete
configuration space, E, and a Markov process on E with transition probabilities defined using bio-
chemical or physical energy functions. As the strength of thermal fluctuations decreases, it is also
natural to consider gradient descent of the energy. In biophysical models, the main goal is to understand
how natural energy functions rapidly drive the system to a preferred minimum energy. In synthetic
examples, the goal is to understand how to direct assembly through preferred pathways.

As this rough description suggests, there are two main mathematical aspects to these studies. Le
first involves geometric combinatorics—to enumerate and classify the configuration space E. Le
second is to understand how the geometry of E affects an energy landscape on E. In the models we

Artificial Life Volume 20, Nombre 4

435

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

consider, the geometry of intermediates is modeled faithfully, while using a naive approximation of
the physics. While the agreement with experiment is surprising at first sight, our cautious view is
that the intrinsic geometry of the configuration space underlies robust self-assembly. En outre
to our work on self-folding, this heuristic idea plays an important role in the kinetics of capsid
formation, both in a crude model using the building game [81], and in a careful study of directed
assembly that takes into account the role of RNA folding [18, 20, 21]. In the protein folding
literature, computations have shown that even randomly chosen energies can lead specific HP
strings into a compact state [66]. In this view, the energy landscape of protein folding is a folding
funnel attracting all initial states into the narrow neck of the funnel, followed by a rapid passage
down the neck [8, 14, 49].

A precise understanding of the geometry of the configuration space for models of self-assembly
promises to be of broad interest. At present, our understanding of the structure of the configuration
spaces relies mainly on computational explorations for simple polyhedra. These computations sug-
gest interesting problems in geometric combinatorics that are yet to be investigated systematically.
Par exemple, how big is the configuration space for the building game? How do growth algorithms
for polyhedra suggested by experiments (the building game, self-folding, local rules) compare with
well-studied combinatorial methods such as shellability [80]? Do the above configuration spaces
have favorable CAT(0) geometry that underlies robust assembly? (Compare with the speculative
ideas of Gromov in [29].)

The size of the configuration space provides a simple measure of the complexity of self-
assembly. By this measure, for the same polyhedron, self-folding is a more complex growth process
than the building game. On the other hand, the configuration spaces we have explored are far
smaller than the configuration space for simple protein-folding models like the 27-mer [66]. Ainsi, soi-
assembly of polyhedra provides problems of intermediate complexity, amenable to experimental
and theoretical investigation. Our hope is that heuristics common to several models of self-assembly,
such as the formulation of a folding funnel and directed pathways, can be better understood in the
investigation of polyhedra. We also hope that the immediacy of the technological challenge of self-
assembly will stimulate further interdisciplinary work in the area.

Remerciements
RK, JK, and GM acknowledge support from NSF grants DMS 07-48482 and EFRI 10-22638
(BECS). SP and DHG acknowledge support from NSF grant CMMI 12-00241 and EFRI 10-22730
(BECS).

Les références
1. Abad-Zapatero, C., Abdel-Meguid, S. S., Johnson, J.. E., Leslie, UN. G. W., Rayment, JE., Rossmann, M.. G.,
Suck, D., & Tsukihara, T. (1980). Structure of southern bean mosaic virus at 2.8 Å resolution. Nature,
286(5768), 33–39.

2. Arkus, N., Manoharan, V. N., & Brenner, M.. P.. (2011). Deriving finite sphere packings. SIAM Journal on

Discrete Mathematics, 25, 1860–1901.

3. Azam, UN., Leong, T. G., Zarafshar, UN. M., & Gracias, D. H. (2009). Compactness determines the success of

cube and octahedron self-assembly. PloS One, 4, e4451.

4. Berger, B., Shor, P.. W., Tucker-Kellogg, L., & King, J.. (1994). Local rule-based theory of virus shell

assembly. Proceedings of the National Academy of Sciences of the U.S.A., 91, 7732–7736.

5. Bhatia, D., Mehtab, S., Krishnan, R., Indi, S. S., Basu, UN., & Krishnan, Oui. (2009). Icosahedral DNA

nanocapsules by modular assembly. Angewandte Chemie International Edition, 48, 4134–4137.

6. Breitbart, M., & Rohwer, F. (2005). Here a virus, there a virus, everywhere the same virus? Trends in

Microbiology, 13, 278–284.

7. Bruinsma, R.. F., Gelbart, W. M., Reguera, D., Rudnick, J., & Zandi, R.. (2003). Viral self-assembly as a

thermodynamic process. Physical Review Letters, 90, 248101.

436

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

8. Bryngelson, J.. D., Onuchic, J.. N., Socci, N. D., & Wolynes, P.. G. (1995). Funnels, pathways, et

the energy landscape of protein folding: A synthesis. Proteins: Structure, Function, and Bioinformatics, 21,
167–195.

9. Casjens, S., & King, J.. (1975). Virus assembly. Annual Review of Biochemistry, 44, 555–611.

10. Caspar, D. L. D., & Klug, UN. (1962). Physical principles in the construction of regular viruses. In Cold Spring

Harbor Symposia on Quantitative Biology, Vol. 27 (pp. 1–24). Cold Spring Harbor Laboratory Press.

11. Castón, J.. R., Trus, B. L., Booy, F. P., Wickner, R.. B., Wall, J.. S., & Steven, UN. C. (1997). Structure of LA
virus: A specialized compartment for the transcription and replication of double-stranded RNA. The Journal
of Cell Biology, 138, 975–985.

12. Chen, J., & Seeman, N. C. (1991). Synthesis from DNA of a molecule with the connectivity of a cube.

Nature, 350, 631–633.

13. Crick, F. H., & Watson, J.. D. (1956). Structure of small viruses. Nature, 177, 473.

14. Dill, K. UN., & Chan, H. S. (1997). From Levinthal to pathways to funnels. Nature Structural and Molecular

Biology, 4, 10–19.

15. Douglas, S. M., Dietz, H., Liedl, T., Högberg, B., Graf, F., & Shih, W. M.. (2009). Self-assembly of DNA

into nanoscale three-dimensional shapes. Nature, 459, 414–418.

16. Douglas, S. M., Dietz, H., Liedl, T., Högberg, B., Graf, F., & Shih, W. M.. (2009). Self-assembly of DNA

into nanoscale three-dimensional shapes. Nature, 459, 414–418.

17. Dürer, UN., & Strauss, W. L. (1977). The painter’s manual: A manual of measurement of lines, domaines, et
solids by means of compass and ruler assembled by Albrecht Dürer for the use of all lovers of art with
appropriate illustrations arranged to be printed in the year MDXXV. New York: Abaris Books.

18. Dykeman, E. C., Grayson, N. E., Toropova, K., Ranson, N. UN., Stockley, P.. G., & Twarock, R.. (2011).
Simple rules for efficient assembly predict the layout of a packaged viral RNA. Journal of Molecular Biology,
408, 399–407.

19. Dykeman, E. C., Stockley, P.. G., & Twarock, R.. (2010). Dynamic allostery controls coat protein conformer

switching during MS2 phage assembly. Journal of Molecular Biology, 395, 916–923.

20. Dykeman, E. C., Stockley, P.. G., & Twarock, R.. (2013). Building a viral capsid in the presence of genomic

RNA. Physical Review E, 87, 022717.

21. Dykeman, E. C., Stockley, P.. G., & Twarock, R.. (2014). Solving a Levinthal’s paradox for virus

assembly identifies a unique antiviral strategy. Proceedings of the National Academy of Sciences of the U.S.A.,
11(14), 5361–5366.

22. Dykeman, E. C., & Twarock, R.. (2010). All-atom normal-mode analysis reveals an RNA-induced allostery

in a bacteriophage coat protein. Physical Review E, 81, 031908.

23. Endres, D., Miyahara, M., Moisant, P., & Zlotnick, UN. (2005). A reaction landscape identifies the

intermediates critical for self-assembly of virus capsids and other polyhedral structures. Protein Science,
14, 1518–1525.

24. Fiers, W., Contreras, R., Duerinck, F., Haegeman, G., Iserentant, D., Merregaert, J., Min Jou, W.,

Molemans, F., Raeymaekers, UN., Van den Berghe, UN., Volckaert, G., & Ysebaert, M.. (1976). Complete
nucleotide sequence of bacteriophage MS2 RNA: Primary and secondary structure of the replicase gene.
Nature, 260, 500–507.

25. Fraenkel-Conrat, H., & Williams, R.. C. (1955). Reconstitution of active tobacco mosaic virus from its

inactive protein and nucleic acid components. Proceedings of the National Academy of Sciences of the U.S.A.,
41, 690.

26. Glotzer, S. C., & Solomon, M.. J.. (2007). Anisotropy of building blocks and their assembly into complex

structures. Nature Materials, 6, 557–562.

27. Goldberg, M.. (1937). A class of multi-symmetric polyhedral. Tohoku Mathematical Journal, 43, 104–108.

28. Grayson, N. E. (2012). Models of molecular self-assembly for RNA viruses and synthetic DNA cages. Ph.D. thesis,

Université d'York.

29. Gromov, M.. (2011). Crystals, proteins, stability and isoperimetry. Bulletin of the American Mathematical Society

(N.S.), 48, 229–257.

Artificial Life Volume 20, Nombre 4

437

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

30. Grünbaum, B., & Shephard, G. C. (1989). Tilings and patterns. New York: W. H. Freeman.

31. Hagan, M.. F., & Chandler, D. (2006). Dynamic pathways for viral capsid assembly. Biophysical Journal, 91,

42–54.

32. Harrison, S. C., Olson, UN. J., Schutt, C. E., Winkler, F. K., & Bricogne, G. (1978). Tomato bushy stunt virus

à 2.9 Å resolution. Nature, 276(5686), 368–373.

33. Hart, G. (2013). Goldberg polyhedra. En M. Senechal (Ed.), Shaping space (pp. 125–138). New York: Springer.

34. Il, Y., Ye, T., Su, M., Zhang, C., Ribbe, UN. E., Jiang, W., & Mao, C. (2008). Hierarchical self-assembly of

DNA into symmetric supramolecular polyhedra. Nature, 452, 198–201.

35. Hosokawa, K., Shimoyama, JE., & Miura, H. (1994). Dynamics of self-assembling systems: Analogy with

chemical kinetics. Artificial Life, 1, 413–427.

36. Johnson, D., & Menon, G. (preprint). Enumerative results for the building game.

37. Johnson, J.. E. (2010). Virus particle maturation: Insights into elegantly programmed nanomachines. Actuel

Opinion in Structural Biology, 20, 210–216.

38. Kac, M.. (1985). Enigmas of chance: An autobiography. Berkeley: Presse de l'Université de Californie.

39. Kaspars, T., Maija, B., Kerstin, F., & Lars, L. (1997). The crystal structure of bacteriophage GA and a

comparison of bacteriophages belonging to the major groups of Escherichia coli leviviruses. Journal of Molecular
Biology, 271, 759–773.

40. Ke, Y., Ong, L. L., Shih, W. M., & Yin, P.. (2012). Three-dimensional structures self-assembled from DNA

bricks. Science, 338, 1177–1183.

41. Klug, UN. (1999). The tobacco mosaic virus particle: Structure and assembly. Philosophical Transactions of the

Royal Society of London. Série B: Sciences biologiques, 354, 531–535.

42. Klug, UN. (2003). From macromolecules to biological assemblies (Nobel Lecture). Angewandte Chemie

International Edition in English, 22, 565–582.

43. Knuth, D. E. (2006). Generating all trees—History of combinatorial generation. In The art of computer

programming, Vol. 4, Fasc. 4. Upper Saddle River, New Jersey: Addison-Wesley.

44. Kroto, H. W., Heath, J.. R., O’Brien, S. C., Curl, R.. F., & Smalley, R.. E. (1985). C60: Buckminsterfullerene.

Nature, 318, 162–163.

45. Krupovic̀, M., & Bamford, D. H. (2008). Virus evolution: How far does the double h-barrel viral lineage

extend? Nature Reviews Microbiology, 6, 941–948.

46. Lau, K. F., & Dill, K. UN. (1989). A lattice statistical mechanics model of the conformational and sequence

spaces of proteins. Macromolecules, 22, 3986–3997.

47. Lehn, J.. M.. (1996). Supramolecular chemistry: Concepts and perspectives. Weinheim, Allemagne: Wiley-VCH.

48. Leong, T. G., Zarafshar, UN. M., & Gracias, D. H. (2010). Three-dimensional fabrication at small size scales.

Petit, 6, 792–806.

49. Leopold, P.. E., Montal, M., & Onuchic, J.. N. (1992). Protein folding funnels: A kinetic approach to the
sequence-structure relationship. Proceedings of the National Academy of Sciences of the U.S.A., 89, 8721–8725.

50. Levinthal, C. (1968). Are there pathways for protein folding? Journal de Chemie Physique et de Physico-chemie

Biologique, 65, 44–45.

51. Levinthal, C. (1969). How to fold graciously. In P. DeBrunner, J.. Tsibris, & E. Munck (Éd.), Mössbauer

Spectroscopy in Biological Systems (pp. 22–24). Urbana, IL: University of Illinois Press.

52. Liddington, R.. C., Yan, Y., Moulai, J., Sahli, R., Benjamin, T. L., & Harrison, S. C. (1991). Structure of

simian virus 40 à 3.8 Å resolution. Nature, 354, 278–284.

53. Liu, Y., Hu, C., Comotti, UN., & Ward, M.. D. (2011). Supramolecular Archimedean cages assembled with

72 hydrogen bonds. Science, 333, 436–440.

54. Mannige, R.. V., & Brooks, C. L. (2008). Tilable nature of virus capsids and the role of topological

constraints in natural capsid design. Physical Review E, 77, 051902.

55. Mannige, R.. V., & Brooks, C. L. (2009). Geometric considerations in virus capsid size specificity, auxiliary
requirements, and buckling. Proceedings of the National Academy of Sciences of the U.S.A., 106, 8531–8536.

438

Artificial Life Volume 20, Nombre 4

D
o
w
n
o
un
d
e
d

F
r
o
m
h

t
t

:
/
/

d
je
r
e
c
t
.

je
t
.

e
d
toi
un
r
t
je
/

un
r
t
je
c
e
–
p
d

F
/

2
0
4
4
0
9
1
6
6
4
8
9
5
un
r
t
je

_
un
_
0
0
1
4
4
p
d

b
oui
g
toi
e
s
t

o
n
0
8
S
e
p
e
m
b
e
r
2
0
2
3

R.. Kaplan et al.

Building Polyhedra by Self-Assembly: Theory and Experiment

56. Meng, G., Arkus, N., Brenner, M.. P., & Manoharan, V. N. (2010). The free-energy landscape of clusters of

attractive hard spheres. Science, 327, 560–563.

57. Nguyen, H. D., Reddy, V. S., & Brooks, C. L. (2007). Deciphering the kinetic mechanism of spontaneous

self-assembly of icosahedral capsids. Nano Letters, 7, 338–344.

58. Nguyen, H. D., Reddy, V. S., & Brooks, C. L. (2009). Invariant polymorphism in virus capsid assembly.

Journal of the American Chemical Society, 131, 2606–2614.

59. O’ Rourke, J.. (2013). Dürer’s problem. En M. Senechal (Ed.), Shaping Space (pp. 77–86). New York: Springer.

60. Pandey, S., Ewing, M., Kunas, UN., Nguyen, N., Gracias, D. H., & Menon, G. (2011). Algorithmic design of
self-folding polyhedra. Proceedings of the National Academy of Sciences of the U.S.A., 108, 19885–19890.

61. Pandey, S., Johnson, D., Kaplan, R., Klobusicky, J., Menon, G., & Gracias, D. H. (unpublished).

Self-assembly of mesoscale isomers: The role of pathways and degrees of freedom.

62. Rapaport, D. C. (2010). Modeling capsid self-assembly: Design and analysis. Physical Biology, 7, 045001.

63. Rayment, JE., Boulanger, T. S., Caspar, D. L. D., & Murakami, W. T. (1982). Polyoma virus capsid structure at

22.5 Å resolution. Nature, 295, 110–115.

64. Rothemund, P.. W. K. (2006). Folding DNA to create nanoscale shapes and patterns. Nature, 440, 297–302.

65. Rudnick, J., & Bruinsma, R.. F. (2005). Icosahedral packing of RNA viral genomes. Physical Review Letters,

94, 038101.

66. Sali, UN., Shakhnovich, E., & Karplus, M.. (1994). How does a protein fold? Nature, 369, 248–251.
67. Seeman, N. C. (2003). DNA in a material world. Nature, 421, 427–431.

68. Senechal, M.. (1996). Quasicrystals and geometry. Cambridge, ROYAUME-UNI: CUP Archive.

69. Steinhaus, H. (1999). Mathematical snapshots. New York: Dover Publications.
70. Suttle, C. UN. (2005). Viruses in the sea. Nature, 437, 356–361.

71. Takeda, N., Umemoto, K., Yamaguchi, K., & Fujita, M.. (1999). A nanometre-sized hexahedral

coordination capsule assembled from 24 components. Nature, 398, 794–796.

72. Tang, L., Johnson, K. N., Balle, L. UN., Lin, T., Yeager, M., & Johnson, J.. E. (2001). The structure of

pariacoto virus reveals a dodecahedral cage of duplex RNA. Nature Structural & Molecular Biology, 8, 77–83.

73. Toropova, K., Basnak, G., Twarock, R., Stockley, P.. G., & Ranson, N. UN. (2008). The three-dimensional
structure of genomic RNA in bacteriophage MS2: Implications for assembly. Journal of Molecular Biology, 375,
824–836.