Florian Wendt, Gerriet K. Sharma,
Matthias Frank, Franz Zotter,
and Robert Höldrich
Institute of Electronic Music and
Acoustics
University of Music and Performing Arts
Graz
Inffeldgasse 10/3, 8010 Graz, Austria
{wendt, sharma, frank, zotter,
hoeldrich}@iem.at
Perception of Spatial Sound
Phenomena Created by the
Icosahedral Loudspeaker
Computer Music Journal, 41:1, pp. 76–88, Spring 2017. doi:10.1162/COMJ_a_00396. © 2017 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license.
Abstract: The icosahedral loudspeaker (IKO) is able to project strongly focused sound beams into arbitrary directions.
Incorporating artistic experience and psychoacoustic research, this article presents three listening experiments that
provide evidence for a common, intersubjective perception of spatial sonic phenomena created by the IKO. The
experiments are designed on the basis of a hierarchical model of spatiosonic phenomena that exhibit increasing
complexity, ranging from a single static sonic object to combinations of multiple, partly moving objects. The results
are promising and open up new compositional perspectives in spatial computer music.
The icosahedral loudspeaker (IKO, cf. Figure 1) is
a compact, 20-sided, 20-channel playback device
that uses acoustic algorithms to steer sound into
freely adjustable directions. These acoustic beams
(referred to in this article as sound beams) are not
only freely adjustable in terms of their radiation
direction and beam width; it is also possible to blend
multiple beams. A metaphorical idea behind using
these beams in music is to “orchestrate” reflecting
surfaces, yielding useful effects in the perceived
spatial impression.
The particular perception of the IKO’s effects
depends on the sonic material, how sound beams are
configured and mixed, and on the room situation.
Over the last six years, two basic staging constel-
lations of the IKO have been shown to be feasible
from an artistic point of view: those in typical
rectangular rooms and those that utilize a concave
setup of reflectors behind the IKO. Staging directly
affects the sound-propagation paths in concert situa-
tions and, thus, the number of discretely localizable
directions.
In rectangular staging situations, the IKO is
placed near a corner of the room, allowing
the orchestration of at least two side walls (see
Figure 2a). For situations that are more complex, a
concave set of reflectors is placed behind the IKO.
This permits more flexibility in setting the number
of reflections, as in Figure 2b. To better control
spatial effects, the IKO’s setup can be fine-tuned by
ear to the given environment.
Existing compositions have been presented
at festivals in configurations like these, including
inSonic2015 in Karlsruhe, the International Summer
Course for New Music in Darmstadt (2015), the
International Computer Music Conference (2012),
and in venues such as House of World Cultures
in Berlin, the Center for Art and Media (ZKM)
in Karlsruhe, the House of Music and Music
Drama (MUMUTH) at the University of Music and
Performing Arts Graz (KUG), the European Forum
Alpbach, and the French Pavilion in Zagreb (shown
in Figure 1).
Figure 1. The icosahedral loudspeaker (IKO) in the French Pavilion of the Student Centre Zagreb at the Showroom of Contemporary Sound Festival 2015. (Photo by Kristijan Smok, Zagreb.)

Figure 2. Staging constellations of the IKO: rectangular (a) and concave (b).
After many concerts performed with the IKO,
listeners reported having perceived auditory objects
that move away from the IKO and that can have
various shapes and layerings, often described as
sonic sculptures or (borrowing a term from the
visual arts) as “plastic.”
The appearance of the terms “sonic sculpture”
and “plastic sound” could be a starting point for
research in this field. They are used in compo-
sitional practice (Wishart 1996; Gonzáles-Arroyo
2012), can be found in theoretical writings (Emmer-
son 2000; Ihde 2007; Peters 2010), and have been
used in many places in the history of organized
sound and computer music. Examples include
Max Neuhaus’s “Times Square” installation, de-
scribed as a sonic sculpture by Collins, Schedel,
and Wilson (2013); Bill Fontana calls his works
sound sculptures (see www.resoundings.org); and
Jonty Harrison (1998) writes of sonic sculptures
in connection with sound diffusion. Moreover,
the fact that a well-known
musical software tool is called AudioSculpt
(http://forumnet.ircam.fr/product/audiosculpt-en)
clearly hints both at a prevalent idea of sound
as sculptural material and at the composition of
electronic music as an act that can be linked to
the sculptural field within the fine arts. Thus, in the
musical context, the use of these terms oscillates
between extended sonic objects, loudspeaker con-
stellations, and sound as a sculptural material itself,
reminding one of Edgar Varèse’s planes, shapes, and
zones of intensities (Varèse 2004).
If we examine some of the historical body–
space relations that have been distinguished in
the theory of sculpture (Klant and Walch 2014),
we find several relations between matter and space
that might be useful in spatialized computer music. What we
can say axiomatically is that “a body” or “matter”
is opposed to an infinite space (Krämer 2011). Both
exist in a reciprocal relation. We can further observe
that, historically, the sculptured body volume opens
step by step towards space, trying to invade it
until finally almost dissolving into it. That means
that space is not just a surrounding shell or an
envelope but has, since modernity, become an active
cocreator of sculpture. Even without a detailed empirical
musicological study, we might assume a similar
idea in spatial sound composition, especially in the
case of electroacoustic music over the last several
decades, where space became a parameter on an
equal footing with timbre or rhythm (Bayle 2007;
Smalley 2007; Nyström 2013).
Using a terminology derived from the theory of
sculpture, we still lack a specific denotation for types
of sonic sculptures that best represents how they
are perceived. This raises the question of whether
such entities are perceived intersubjectively at all,
as intended by the composer, causing something we
call a shared perceptual space (Sharma, Zotter, and
Frank 2015). This space is defined as an open set of
spatiosonic phenomena for which the perceptions of
composers, scientists, and audience intersect.
Strictly speaking, objective evidence regarding the
qualities of perceived sculptural sonic objects can
only be accessed systematically through listening
tests, but until now tests have only seldom been
applied (Landy 2007).
To resolve the question of whether (and which of)
the IKO’s auditory objects and sculptures are
intersubjectively perceivable, we performed listening
tests. Based on the results, we propose a classification
of complexity levels that characterizes sculptured
auditory objects and categories of plastic sound
objects. These can be seen as compositional elements
and might provide a basis for a common vocabulary.
Experimental Framework and Setup
A general approach to the spatial perception of
sound can be found in the psychoacoustic literature.
A comprehensive review of this issue is provided by
Jens Blauert (1983). More specifically, the work of
Rakerd and Hartmann examines the localization of
sound in reverberant environments, such as rooms
(Hartmann 1983; Rakerd and Hartmann 1985, 1986;
Hartmann et al. 1989). A fundamental phenomenon
of localization is the precedence effect. It refers to a
group of phenomena that are thought to be involved
in resolving the competition for perception and
localization that occurs between temporally delayed
sounds with partial coherence, such as a direct
sound and a reflection. Comprehensive reviews
approaching the precedence effect were conducted
by Litovsky et al. (1999) and by Brown, Stecker, and
Tollin (2015). In addition, localization effects of the
IKO in rooms can be partly deduced from the work
on localization in surrounding loudspeaker arrays
at off-center listening positions reported by Frank
(2013) and by Stitt (2015). More specific studies
dealing with the properties of auditory objects
created by variable directivity in a room are still
fairly new (cf. Schmeder 2009; Sharma, Zotter, and
Frank 2014; Zotter et al. 2014; Frank, Sharma, and
Zotter 2015; Laitinen et al. 2015; Zotter and Frank
2015; Wendt et al. 2016).
For the purpose of this article, sculptural sonic
objects, considered as artistically designed entities,
can consist of several time-variant spatiospectral
elements. Owing to the combinatorial explosion
that arises with this number of elements, an exhaustive
investigation would not be practicable. To overcome
this problem of complexity in our experimental
design, we used a hierarchical model of spatiosonic
phenomena consisting of three levels:
1. First-order phenomena, consisting of a single
static percept (i.e., a shape or object) triggered
by a simple element (in the aforementioned
sense) through time-invariant spatial projection.
Figure 3. Setup of the
listening experiments (a)
and frequency-dependent
beam patterns of the IKO
(b). The room layout
shows the position of the
IKO with its directivity
pattern, the listening
positions (P1, P2, and
Pa–Pf), and the materials
used for the six main
boundary surfaces. For the
beam patterns we used a
third-order, horizontal
beam steered to 0◦ over
frequency (in Hz) and
azimuth angle (in degrees)
on the horizontal plane; dB
levels represented as
shades of gray.
These fundamental phenomena are easy
to explain or investigate on the basis of
psychoacoustic research.
2. Second-order phenomena, consisting of time-
variant spatial projections with similar exci-
tation signals. Instances of such projections
can be trajectories such as turns, pendulums,
or movements of greater complexity.
3. Third-order phenomena, which superimpose
several first- and second-order phenomena
and lead to complex spatiosonic objects—
sonic sculptures as artistic entities.
These three phenomena are investigated in a
series of three listening experiments, which we
describe later in this article.
In contrast to experiments that examined local-
ization effects of a virtual realization of the IKO
with simplified settings (Zotter and Frank 2015;
Wendt et al. 2016), the experiments we present in
this work were conducted in a physical room, a
lecture hall with the dimensions 6.8 × 7.6 × 3 m.
The IKO was placed near a corner of the room,
which corresponds to a rectangular performance sit-
uation. To investigate the influence of the listening
position on spatial sonic phenomena, the subjects
performed the tests at various listening positions.
Figure 3a shows the layout of the room, indicating
the positions of IKO and listeners (in Experiments 1
and 2 these are labeled P1 and P2; for Experiment 3,
the labels Pa–Pf are used), and lists the materials of
the six main boundary surfaces. It is safe to assume
that in this room most of the reverberant energy
is caused by reflections of the three frontal walls,
whereas the rear wall has a higher absorption coef-
ficient. Table 1 lists the mean reverberation time,
T60, measured at both listening positions in octave
bands defined by the center frequency fc.
Table 1. Mean Reverberation Times

fc     125 Hz   250 Hz   500 Hz   1 kHz   2 kHz   4 kHz
T60    800      600      530      500     500     540

Reverberation times (in msec) measured at listening positions P1 and P2 for six consecutive octave bands.
The IKO uses “spherical beam forming,” as
developed by Zotter (2009) and Lösler (2014). This
yields beam patterns that are slightly frequency-
dependent and have a relatively narrow width. The
result for a horizontal third-order beam is shown in
Figure 3b.
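The beam-forming filter design itself is documented in the cited theses and is not reproduced here. As a rough, simplified illustration only (our own sketch, not the IKO's actual filters, and ignoring frequency dependence), an ideal axisymmetric beam of order N can be written as a weighted sum of Legendre polynomials of the angle off the steering axis:

```python
import numpy as np
from numpy.polynomial.legendre import legval

def beam_pattern(angle_deg, order=3):
    """Ideal axisymmetric order-N beam, g(theta) = sum_n (2n+1) P_n(cos theta),
    normalized to unity gain on the beam axis (basic weights, no max-rE taper)."""
    theta = np.radians(angle_deg)                   # angle off the steering axis
    coeffs = [2 * n + 1 for n in range(order + 1)]  # simple order weighting
    return legval(np.cos(theta), coeffs) / legval(1.0, coeffs)

# Gain in dB over the horizontal plane for a third-order beam steered to 0 degrees
azimuth = np.arange(-180, 181)
gain_db = 20 * np.log10(np.maximum(np.abs(beam_pattern(azimuth)), 1e-6))
```

Higher orders narrow the main lobe; the patterns in Figure 3b additionally show the frequency dependence of the real 20-channel array.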
Figure 4. Answers
collected at listening
positions P1 (a) and P2 (b)
for four third-order beams,
coded in grayscale. See
Figure 5 for coding of
tested beam directions.
Listening Experiment 1
The first listening experiment investigates whether
the IKO is able to create an intersubjective spatial
perception of different projections. In particular,
localization is evaluated for static sound beams
projected towards various azimuth angles.
Sound projection was achieved by four different,
third-order spherical beams at azimuth angles of
0◦, 90◦, 180◦, and 235◦ (delineated in Figure 4).
Pink noise bursts with four combinations of two
different durations for the onset and release times
(tshort = 10 msec, tlong = 500 msec) were chosen as
sounds. Although broadband noise can be problem-
atic in simple localization tasks (multiple locations
may be perceived), pretrial experience revealed that
it is possible to identify one dominant auditory
object whose location is, however, influenced by
the shape of the envelope. The critical effect of
envelope indicates that localization is determined
by the buildup of the precedence effect rather than
by summing localization (Litovsky et al. 1999;
Brown, Stecker, and Tollin 2015). This assump-
tion is further supported by the magnitude of
time delays between direct sound and wall reflec-
tions (6 msec < Δt < 30 msec at both listening
positions).
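These delays follow directly from the path-length difference between direct sound and reflection; a minimal sketch (the example distances are made up for illustration, not measured values from the experiment):

```python
SPEED_OF_SOUND = 343.0  # m/s, roughly at room temperature

def reflection_delay_ms(direct_path_m, reflected_path_m):
    """Delay of a wall reflection relative to the direct sound, in msec."""
    return (reflected_path_m - direct_path_m) / SPEED_OF_SOUND * 1e3

# Hypothetical example: a 2-m direct path vs. a 6-m reflection path gives about
# 11.7 msec, i.e., within the 6-30 msec range stated for the listening positions.
print(reflection_delay_ms(2.0, 6.0))
```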
As a result of these considerations, envelope
shapes were identified as meaningful parameters for
the burst signal of pink noise. Each of the pink-noise
sounds, denoted S1–S4, is represented by a marker
whose outline indicates the envelope shape (for
instance, the marker for S2 reflects its slow onset
and short release; see Figure 4). Sound
5 (S5, represented by the symbol +) consisted of
a sequence of irregular short bursts, and Sound 6
(S6) was a chain of overlapping
regular grains. For each run, subjects were seated
at one of the two listening positions (P1 and P2),
facing the IKO. Subjects were free to move their
heads while seated. The binaurally rendered stimuli,
using the virtual IKO (Zaunschirm, Frank, and
Zotter 2016), are available on our Web site (P1 at
https://phaidra.kug.ac.at/detail_object/o:34849, P2
at https://phaidra.kug.ac.at/detail_object/o:34853).
The listeners were asked to specify azimuth angle
and distance of the dominant auditory object
within an IKO-centric coordinate system. If perception
can indeed be reduced to a single dominant percept,
this demonstrates the general controllability of auditory objects.
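For readers who wish to construct comparable test signals, the following sketch generates a pink-noise burst with independently chosen linear onset and release ramps. It is our own reconstruction under assumed parameters (sampling rate, burst length, ramp shape), not the authors' exact stimulus-generation code:

```python
import numpy as np

def pink_noise(n, fs, rng):
    """Approximate pink noise by shaping white noise with a 1/sqrt(f) spectrum."""
    spec = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n, 1.0 / fs)
    f[0] = f[1]                                      # avoid division by zero at DC
    x = np.fft.irfft(spec / np.sqrt(f), n)
    return x / np.max(np.abs(x))

def burst(duration_s, onset_s, release_s, fs=44100, seed=0):
    """Pink-noise burst with linear onset and release ramps."""
    n = int(duration_s * fs)
    env = np.ones(n)
    n_on, n_off = int(onset_s * fs), int(release_s * fs)
    env[:n_on] = np.linspace(0.0, 1.0, n_on)         # onset ramp
    env[n - n_off:] = np.linspace(1.0, 0.0, n_off)   # release ramp
    return pink_noise(n, fs, np.random.default_rng(seed)) * env

# Hypothetical 1-sec bursts combining the two stated ramp durations:
slow_onset_short_release = burst(1.0, onset_s=0.5, release_s=0.01)  # as described for S2
short_onset_slow_release = burst(1.0, onset_s=0.01, release_s=0.5)
```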
Fifteen experienced listeners with normal hearing
participated in the experiment. Seven selectable
stimuli, of which six belonged to different sounds
with randomly selected direction, were shown
together on one screen to allow comparative re-
sponses. The seventh stimulus was a randomly
selected repetition of one of the other six stimuli
on the screen. With four such screens per listening
position, each of the 15 participants gave two to
four responses per stimulus. In total, all listeners
gave 420 responses per listening position, as shown
in Figure 4. Sounds are coded as marker shapes, and
beam directions as different shades (see Figure 5 for
the coding of beam direction).
Figure 5. Median values for the localization results shown in Figure 4, for positions P1 (a) and P2 (b). Medians are calculated for each sound and beam direction (arrows indicate corresponding beam angles).
Results
The distinct shades of the marker cloud in Fig-
ure 4 indicate the different intersubjective per-
ceptions of various projections. This is supported
by the two-dimensional median values shown in
Figure 5.
A pairwise analysis of variance (ANOVA) of all
azimuth angles for each beam direction confirms the
different perceptions of various angular projections.
For both positions, all six sounds yield at least three
different directions ( p < 0.05). For some conditions,
however, neighboring directions are perceived to be
statistically identical ( p > 0.95). According to the
ANOVA for P1, beam directions 0◦ and 90◦ tend to
coincide, and for P2 the directions 180◦ and 235◦
coincide.
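As a schematic illustration (our own sketch, not the authors' exact statistical pipeline, and ignoring the circular nature of azimuth data), such a pairwise comparison can be run as a one-way ANOVA for every pair of beam directions, per sound and listening position:

```python
from itertools import combinations
from scipy.stats import f_oneway

def pairwise_direction_tests(responses, alpha=0.05):
    """responses: dict mapping beam direction (deg) to a list of reported azimuths (deg)."""
    outcome = {}
    for d1, d2 in combinations(sorted(responses), 2):
        _, p = f_oneway(responses[d1], responses[d2])   # two-group one-way ANOVA
        outcome[(d1, d2)] = {"p": p, "distinct": p < alpha}
    return outcome

# Hypothetical responses for one sound at one listening position:
example = {0: [2, -5, 10], 90: [80, 95, 102], 180: [170, 190, 181], 235: [230, 241, 228]}
print(pairwise_direction_tests(example))
```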
The distance of auditory objects to the IKO is
the second parameter analyzed in the experiment.
Figure 6a shows the 95 percent confidence intervals
of the perceived distance for all conditions. The
perceived distance of sounds S1–S4 depends on the
onset duration ( pS1S2/S3S4 = 0.0017). Similarly, the
irregular bursts (S5) are localized closer to the IKO
than the grains (S6). This can be explained by a
higher proportion of transient signal components
within S5. Both dependencies support findings on
the specific properties of the precedence effect for
transient signals (see Hartmann 1983; Rakerd and
Hartmann 1985). We see, moreover, that auditory
objects are perceived closer to the IKO for listening
position P1 than for P2 ( p = 0.002).
A combined presentation of azimuth angle and
distance as a function of the onset (long onset:
S1–S2; short onset: S3–S4) can be found in Figure 6b,
where the angular distribution of the median
distance (including interquartile ranges) is shown
for listening position 1. Except for the zone behind
the IKO, S3 and S4 are localized closer to the
IKO.
Figure 6. Perceived
distance of auditory
objects to the IKO: median
and 95 percent confidence
interval of perceived
distance to the IKO over
all beam directions (a);
median and interquartile
range for stimuli S1/S2
(long onset) and S3/S4
(short onset) at P2, split
into twelve equal circular
segments around the
IKO (b).
The results of Listening Experiment 1 show
that the IKO is able to trigger controlled auditory
objects in space by using spherical beam-forming
algorithms. Depending on its staging and the
listening position, different zones around the IKO
can be “orchestrated.” The perceived distance (and
therefore, the reachable spatial extent of the auditory
scene) is signal-dependent. We have shown that
more-transient signals, i.e., signals with short onset
durations, tend to be localized closer to the IKO
than smoother signals.
Listening Experiment 2
The second listening experiment evaluates the
perception of trajectories of sound beams.
The creation of time-variant spatial projections
is a further means of expression and hence a further
step towards the orchestration of space. A narrow
beam rotating in the horizontal plane of the IKO is
a rather simple realization of this kind of projection.
Three different realizations of this trajectory have
been investigated in the second listening exper-
iment: a full turn and two half turns in opposite
directions, all starting at 90◦, shown in Figure 7.
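A sketch of the three trajectories (our own parameterization: the 5-sec duration mentioned below and the 90-degree start follow the text, whereas the sign convention and the sampling grid are assumptions):

```python
import numpy as np

def beam_azimuth(t, duration_s=5.0, start_deg=90.0, sweep_deg=360.0):
    """Beam azimuth (deg) at time t for a constant-speed turn starting at start_deg.
    sweep_deg = 360 for the full turn, +180 or -180 for the two half turns
    (which sign is clockwise depends on the chosen coordinate convention)."""
    return (start_deg + sweep_deg * np.asarray(t) / duration_s) % 360.0

t = np.arange(0.0, 5.0, 0.5)                 # e.g., ten half-second marker times
full_turn = beam_azimuth(t, sweep_deg=360.0)
half_turn_a = beam_azimuth(t, sweep_deg=180.0)
half_turn_b = beam_azimuth(t, sweep_deg=-180.0)
```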
Each trajectory lasted 5 sec and the subjects
were asked to adjust ten markers to the perceived
location in successive half-second intervals during
playback. Markers were flashed successively at
the associated playback time, and they could be
moved using a mouse on a graphical interface
showing the layout of the test setup. Playback
could be repeated until listeners were satisfied
with the match between marker placement and
what they heard. Room and positioning of the IKO
and listener remained the same as for Listening
Experiment 1. Because of the relation between onset
duration and perceived distance, two variant bursts
of pink noise were tested. Sound 1 (S1) consisted of
uniform pink noise representing an infinite onset
duration. Sound 2 (S2) consisted of 200-msec bursts
of pink noise, each with a 10-msec linear fade-in and
fade-out, and with 100 msec of silence
between bursts. Additionally, irregular bursts (S3)
and grains (S4), as in the previous experiment, were
tested.
Again, the experiment was carried out using 15
experienced listeners with normal hearing. The
stimuli S1–S4 were played in random order for
both listening positions, each one with a clockwise
half turn, a counterclockwise half turn, and a
full turn. The binaurally rendered stimuli for
this experiment are also available online (P1:
https://phaidra.kug.ac.at/detail_object/o:34854; P2:
https://phaidra.kug.ac.at/detail_object/o:34856).
Figure 7. Mean and 95 percent confidence area for S1 and S2 at listening position P2: full turn of 360◦ (a), turn of 180◦ counterclockwise (b), and turn of 180◦ clockwise (c).
Results
For representation of the collected data, a two-
dimensional plot for each time step shows its mean
value within the 95 percent confidence area (see
Figure 7).
Figure 8. Mean and 95 percent confidence interval of perceived distance over all trajectories.
The circular trajectory is reflected by almost
perfect circles around the IKO, obtained from the
mean values of the collected data at both listening
positions (see Figure 7a). Furthermore, the 95
percent confidence areas are almost equally spread
around the IKO, which is in contrast to the findings
from Experiment 1, where localization could not
fully encircle the IKO. Not only does the full turn
deliver smooth results, but the half-turn trajectories
shown in Figure 7 also track the idea of the spatial
movement.
In contrast to the responses to the trajectory
of the full turn, which match the angular range
presented to the listener, the responses to the half
turns were larger than the semicircle presented.
This effect could be caused by the slower angular
speed of the spherical beam (half as fast as for the full
turn), or the effect could be due to a psychoacoustic
phenomenon called “auditory representational
momentum” (Getzmann and Lewald 2007), which
describes the displacement of the final position of
a moving sound source in the direction of motion.
A thorough review of phenomena in the perception
of auditory motion is documented by Carlile and
Leung (2016).
Comparing the perceived distance of auditory
objects for the different signal onsets (S1 and S2),
the results at listening position P1 indicate that
transient signals with short onsets are perceived
closer to the IKO than signals with smoother
envelopes ( pP1 < 0.001; see Figure 8), and so confirm
the findings of Experiment 1. This is in contrast
to the results for position P2, however, where
Figure 7 does not indicate any envelope dependency.
Furthermore, the finding of Experiment 1—that at
P2 auditory objects are generally perceived farther
away from the IKO than at P1—is not as pronounced
in this experiment (pS1...S4 = 0.200). Excluding S1,
however, the other results were as statistically
significant as in Experiment 1.
Finally, a direct comparison of all collected
data from Experiments 1 and 2 shows that the
perception of trajectories cannot be fully explained
by extrapolating the perception of static sound
beams. It seems that listeners try to understand the
intended auditory motion and thus experience a
more comprehensive percept.
Listening Experiment 3
The third listening experiment evaluates what we
call third-order phenomena. These are composed
of several sounds that are spatialized in different
directions and with different angular movements.
More specifically, it is an evaluation of the ability
to discriminate stimuli based on their spatiosonic
character. With this first step towards exploring
the composition of sonic sculptures, insights into
the existence of a shared perception of spatiosonic
objects are gained.
To arrive at artistically meaningful and musi-
cally expressive spatiosonic objects, composer and
coauthor Gerriet K. Sharma designed the stimuli of
this experiment making use of his experience and
means of expression developed for the IKO. Building
on this experience, he arrived at an understanding
of spatiosonic objects in terms of body–space re-
lationships deduced from the theory of sculpture.
Following Torsten Krämer (2011), we distinguish
three main categories of body–space relationships,
which we call kernel plastic (abbreviated KP), spatial
plastic (SP), and the kernel–shell principle (KSP).
Table 2. Composition of Stimuli
Element 1
Element 2
Stimulus
Sound
Trajectory
Sound
Trajectory
KPI
KPII
KPIII
KPIV
SPI
SPII
SPIII
SPIV
SPV
KSPI
KSPII
KSPIII
BN
LFD
SMS
LFD cut
180◦, static
180◦, static
0◦, static
210◦, static
CCW 237◦/sec rotation
BN
LFD cut CCW 140◦/sec rotation
SMS
CG
LFD
CW 180◦/sec rotation
CCW 270◦/sec rotation
CCW 120◦/sec rotation
SMS
210◦ static
CCW 180◦/sec rotation
BN
305◦, static
BN
LFD cut CCW 180◦/sec rotation
BN cut CW 180◦/sec rotation
BN del CW 180◦/sec rotation
CW 180◦/sec rotation
SMS
Elements used to compose the stimuli for Listening Experiment 3 are: constant Brown
noise (BN), filtered Brown noise with a low-pass cutoff frequency at 2,426 Hz (BN cut),
Brown noise with the same cutoff and a delay of 15 sec (BN del), a low-to-mid-frequency
drone (LFD), the drone with a high-pass cutoff frequency at 236 Hz (LFD cut), a
multilayered, stretched metal sound with long onset and release (SMS), and a chain of
fine regular grains (CG). The direction of movements are either clockwise (CW) or
counterclockwise (CCW). All angles are in the horizontal plane around the IKO. The
stimulus categories KP, SP, and KSP are described in the article.
Kernel plastic (or “body plastic,” taken from
the German terms Kernplastik and Körperplastik)
is used to describe objects with the attribute
of “displacing” space (raumverdrängend). An
analogous example from the visual arts would be
Auguste Rodin’s The Thinker (1902). Spatial plastic
(Raumplastik) describes those objects with the
attributes of “encompassing” or “binding” space
(raumumfassend, raumbindend). An example of
this would be Naum Gabo’s Linear Construction in
Space No.2 (conceived 1949, executed 1959–1960).
Finally, the kernel–shell principle (Kern-Schale-
Prinzip) describes objects that “embody” space
(raumbildend). Here, an appropriate example might
be Henry Moore’s Mother and Child: Egg Form
(1977, LH 717).
The last of these categories embodies the idea
of creating space by establishing a tension between
two entities, for instance, a focused entity inside
and environmental coordinates at the same time.
These categories have provided meaningful
hermeneutics for artistic practice over decades.
Sharma composed a set of twelve spatiosonic
miniatures following his artistic interpretation of the
categories of body–space relations (KPI...IV, SPI...V,
and KSPI...III). The material used for the creation
of the twelve stimuli was composed from four
different sound sources. The idea was to use easily
recognizable idioms, similar to the stimuli from
preceding experiments, but with a more musical
quality to bridge the gap between the laboratory
situation and more performative situations. The
stimulus composition and grouping is shown in
Table 2. Each of the stimuli had a duration of
30 seconds. To test the ability of listeners to
naively discriminate between sculptural categories,
neither the hierarchical organization scheme nor the
composer’s categories were known to the listeners.
This served to eliminate the potential side effect of
interpreted terminology.
Table 3. Permutation of Stimuli in Triplets
Triplet
1
2
3
4
5
6
KPIII
Stimulus 1 KPI
Stimulus 2
KPIV
SPI
Stimulus 3 KPII KSPI
SPIV KSPII KSPIII
SPII
KSPI
SPV
SPI
KPII
SPIII KPII
SPIII
SPV
Listening Experiment 3 used triplets of the stimuli defined in
Table 2. The “odd-man-out” stimulus in each triplet is
highlighted in boldface. Triplet 3 consisted of three stimuli
from the same category, serving as a control condition in the
listening experiment.
The ability to discriminate between perceived
sculptural shapes was tested by a three-alternative,
forced-choice method (also known as “oddity” or
“odd-man-out”; cf. Kingdom and Prins 2010). Each
stimulus triplet consisted of two stimuli from one
category and one stimulus from another category.
The stimulus triplets were then presented twice
consecutively in a single joint listening session,
in which all subjects took part. The listeners were
asked to indicate the stimulus that differed from
the others by its spatial appearance. As a control
condition, Triplet 3 comprised stimuli from a single
category. Table 3 shows the triplets tested.
The listening session was conducted twice, each
time with six subjects, all of whom were famil-
iar with computer music and were experienced
listeners to spatial audio. To monitor possible im-
pacts of the listening position, the listeners were
spread within the room (at positions Pa and Pb,
see Figure 3a) and changed their position after
the first run. Additionally five of the six subjects
evaluated the same triplets using mono playback
over headphones. Because there is no spatial differ-
ence within the triplets, subjects were requested
to discriminate between them on the basis of ar-
bitrary characteristics. The playback order of the
triplets itself and of the stimuli within triplets
was chosen at random. The stimuli are all avail-
able on the Web. The monophonic stimuli are
at https://phaidra.kug.ac.at/detail_object/o:34857,
the binaural stimuli from position P1 at
https://phaidra.kug.ac.at/detail_object/o:34859,
and the binaural stimuli at P2 at
https://phaidra.kug.ac.at/detail_object/o:34860.

Results
A direct comparison of the results for both play-
back methods provides insights into the impact
of spatialization on the ability to discriminate be-
tween different types of sonic materials. Response
frequencies for Listening Experiment 3 are shown in
Table 4.
Table 4. Counts of Stimuli Deemed Most Different

Playback using the IKO:
Triplet       1    2    3    4    5    6
Stimulus A    0    1    0    0   10    0
Stimulus B   12    0    5    0    2    0
Stimulus C    0   11    7   12    0   12

Monophonic playback over headphones:
Triplet       1    2    3    4    5    6
Stimulus A    0    0    0    4    1    1
Stimulus B    0    0    0    1    4    4
Stimulus C    5    5    5    0    0    0

Number of times each stimulus was perceived as “most different” in each triplet, for playback using the IKO (top) and monophonic playback over headphones (bottom). Bold numbers mark the stimuli that were designed to have a different sculptural shape.
For all test triplets played over the IKO, listeners
identified one stimulus as the oddity, on average
agreeing at least 83 percent of the time. Triplet
3, the control triplet, however, did not produce a
clear agreement. For the triplets in the monophonic
playback over headphones, there was, on average,
a strong agreement of at least 90 percent on the
oddity. Comparing the results of playback using
the IKO with monophonic playback, there is little
correlation regarding stimuli recognized as oddities,
despite both stimulus sets being based on identical
sonic material. The additional spatial character
when using the IKO therefore seemed to dominate
in recognizing the intended oddity stimulus within
the triplets. The exception was Triplet 2, where the
same stimulus was recognized.
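The agreement figures quoted above can be checked against the response counts in Table 4; the following is a small illustrative sketch (the counts are transcribed from the table, the helper function is ours):

```python
# Response counts per triplet (stimulus A, B, C), transcribed from Table 4.
counts_iko = {1: (0, 12, 0), 2: (1, 0, 11), 3: (0, 5, 7),
              4: (0, 0, 12), 5: (10, 2, 0), 6: (0, 0, 12)}
counts_headphones = {1: (0, 0, 5), 2: (0, 0, 5), 3: (0, 0, 5),
                     4: (4, 1, 0), 5: (1, 4, 0), 6: (1, 4, 0)}

def agreement(counts, exclude=()):
    """Per-triplet share of responses that fall on the most-chosen stimulus."""
    return {t: max(c) / sum(c) for t, c in counts.items() if t not in exclude}

print(agreement(counts_iko, exclude=(3,)))  # minimum 10/12 ~ 0.83 outside the control triplet
print(agreement(counts_headphones))         # values average to 0.90 over all six triplets
```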
As none of the listeners were aware of hierar-
chical spatiosonic phenomena or the composer’s
categories, we do not know which feature they
used to distinguish the odd stimulus. We cannot
establish a causal relation, but the listeners’ percep-
tual distinction agrees almost completely with the
differentiation intended by the composer and the
categories he used.
Conclusion
In this article we successfully provided a comprehen-
sive experimental evaluation of sonic phenomena
evoked by the IKO. A hierarchical model of spa-
tiosonic phenomena was proposed and validated by
extensive listening experiments.
Listening Experiment 1 examined single, static
percepts evoked by spatial projections of the IKO.
Depending on the staging of the IKO, different
zones around it could be “orchestrated.” The results
revealed where knowledge from psychoacoustic
research is applicable to auditory objects. For
instance, distance to the IKO is highly dependent
on onsets and envelopes. This relation is known
from studies on the precedence effect (Litovsky et al.
1999; Brown, Stecker, and Tollin 2015).
Listening Experiment 2 examined phenomena of
greater complexity that were evoked by time-variant
spatial projections. In contrast to static sound beams,
the use of trajectories involves perceptual properties
indicated in studies on auditory motion (Getzmann
and Lewald 2007; Carlile and Leung 2016).
Listening Experiment 3 investigated the superpo-
sition of several phenomena of static and dynamic
nature. For this experiment, spatiosonic objects de-
veloped by Sharma as means of expression using the
IKO were reduced to useful, exemplary categories.
Although not generalizable to spatiosonic artistic
expression that other composers would develop
when using the IKO, and despite the relatively small
number of experienced listeners participating, the
experimental results are instructive and remarkably
distinct. The experimental design proved to be a
promising first step to detect intersubjective spa-
tiosonic features that are stronger and more salient
than those of nonspatialized sonic materials.
The terminology that the composer developed for
the compositions in the last experiment was derived
from the theory of sculpture and aims to provide
a useful basis for composition and a classification
of complex auditory objects that can be created
by the IKO. Three categories were proposed, each
describing a body–space relationship that can be
translated into a musical context.
Acknowledgments
Our work was funded by the Austrian Science Fund
(FWF), project no. AR 328-G21, “Orchestrating
Space by Icosahedral Loudspeaker.”
References
Bayle, F. 2007. “Space and More.” Organised Sound
12(3):241–249.
Blauert, J. 1983. Spatial Hearing: The Psychophysics
of Human Sound Source Localization. Cambridge,
Massachusetts: MIT Press.
Brown, A. D., G. C. Stecker, and D. J. Tollin. 2015. “The
Precedence Effect in Sound Localization.” Journal of the
Association for Research in Otolaryngology 16(1):1–
28.
Carlile, S., and J. Leung. 2016. “The Perception of Audi-
tory Motion.” Trends in Hearing 20. Available online at
tia.sagepub.com/content/20/2331216516644254.full.pdf.
Accessed November 2016.
Collins, N., M. Schedel, and S. Wilson. 2013. Cambridge
Introductions to Music: Electronic Music. Cambridge:
Cambridge University Press.
Emmerson, S., ed. 2000. Music, Electronic Media and
Culture. Farnham, UK: Ashgate.
Frank, M. 2013. “Phantom Sources Using Multiple Loud-
speakers in the Horizontal Plane.” PhD dissertation,
University of Music and Performing Arts, Graz, Austria.
Frank, M., G. K. Sharma, and F. Zotter. 2015. “What
We Already Know about Spatialization with Compact
Spherical Arrays as Variable-Directivity Loudspeakers.”
Paper presented at the inSonic Conference, 26–28
November, Karlsruhe, Germany. Available online at
iem.kug.ac.at/fileadmin/media/osil/2015 FrankEtAl
inSonic WhatWeAlreadyKnowAboutSpatialization
WithCompactSphericalArraysAsVariabledirectivity
Loudspeakers.pdf. Accessed November 2016.
Getzmann, S., and J. Lewald. 2007. “Localization of Mov-
ing Sound.” Perception and Psychophysics 69(6):1022–
1034.
Gonzáles-Arroyo, R. 2012. “Towards a Plastic Sound
Object.” In P. Ernst and A. Strohmaier, eds. Raum:
Konzepte in den Künsten, Kultur- und Naturwis-
senschaften. Baden-Baden, Germany: Nomos, pp.
239–258.
Harrison, J. 1998. “Sound, Space, Sculpture: Some
Thoughts on the ‘What’, ‘How’ and ‘Why’ of Sound
Diffusion.” Organised Sound 3:117–127.
Hartmann, W. M. 1983. “Localization of Sound in
Rooms.” Journal of the Acoustical Society of America
74(5):1380–1391.
Hartmann, W. M., et al. 1989. “Localization of Sound
in Rooms, IV: The Franssen Effect.” Journal of the
Acoustical Society of America 86(4):1366–1373.
Ihde, D. 2007. Listening and Voice: Phenomenologies of
Sound. Albany: State University of New York Press.
Kingdom, F., and N. Prins. 2010. Psychophysics: A
Practical Introduction. London: Academic.
Klant, M., and J. Walch. 2014. Grundkurs Kunst, Sekun-
darstufe II, Ausgabe 2014: Plastik, Skulptur, Objekt.
Braunschweig, Germany: Schroedel.
Krämer, T. 2011. Grundlagen der Skulptur und Plastik.
Berlin: Klett.
Laitinen, M.-V., et al. 2015. “Controlling the Perceived
Distance of an Auditory Object by Manipulation of
Loudspeaker Directivity.” Journal of the Acoustical
Society of America 137(6):462–468.
Landy, L. 2007. Understanding the Art of Sound Organi-
zation. Cambridge, Massachusetts: MIT Press.
Litovsky, R. Y., et al. 1999. “The Precedence Effect.” Jour-
nal of the Acoustical Society of America 106(4):1633–
1654.
Lösler, S. 2014. “MIMO-Rekursivfilter für Kugelarrays.”
Master’s thesis, University of Music and Performing
Arts, Graz, Austria.
Nyström, E. 2013. “Topology of Spatial Texture in the
Acoustic Medium.” PhD dissertation, City University,
London.
Peters, N. 2010. “Developing Sound Spatialization Tools
for Musical Applications with Emphasis on Sweet Spot
and Off-Center Perception.” PhD dissertation, McGill
University, Montreal.
Rakerd, B., and W. M. Hartmann. 1985. “Localization of
Sound in Rooms, II: The Effects of a Single Reflecting
Surface.” Journal of the Acoustical Society of America
78(2):524–533.
Rakerd, B., and W. M. Hartmann. 1986. “Localization
of Sound in Rooms, III: Onset and Duration Effects.”
Journal of the Acoustical Society of America 80(6):1695–
1706.
Schmeder, A. 2009. “An Exploration of Design Pa-
rameters for Human-Interactive Systems with Com-
pact Spherical Loudspeaker Arrays.” In Proceedings
of the Ambisonics Symposium. Available online
at ambisonics.iem.at/symposium2009/proceedings
/ambisym09-schmeder-csphlsinteraction.pdf. Accessed
November 2016.
Sharma, G. K., F. Zotter, and M. Frank. 2014. “Or-
chestrating Wall Reflections in Space by Icosahedral
Loudspeaker: Findings from First Artistic Research
Exploration.” In Proceedings of the Joint International
Computer Music Conference and the Sound and Music
Computing Conference, pp. 830–835.
Sharma, G. K., F. Zotter, and M. Frank. 2015. “To-
wards Understanding and Verbalizing Spatial Sound
Phenomena in Electronic Music.” Paper presented at
the inSonic Conference, 26–28 November, Karlsruhe,
Germany. Available online at iem.kug.ac.at/fileadmin
/media/osil/2015VerbaPap OSIL inSonic 2.pdf. Ac-
cessed November 2016.
Smalley, D. 2007. “Space-Form and the Acousmatic
Image.” Organised Sound 12(1):35–38.
Stitt, P. 2015. “Ambisonics and Higher-Order Ambisonics
for Off-Centre Listeners: Evaluation of Perceived and
Predicted Image Direction.” PhD dissertation, Queen’s
University, Belfast, UK.
Varèse, E. 2004. “The Liberation of Sound.” In Audio
Culture: Readings in Modern Music. New York:
Continuum, pp. 17–21.
Wendt, F., et al. 2016. “Directivity Patterns Controlling
the Auditory Source Distance.” In Proceedings of the
International Conference on Digital Audio Effects,
pp. 295–303.
Wishart, T. 1996. On Sonic Art. Reading, UK: Harwood.
Zaunschirm, M., M. Frank, and F. Zotter. 2016. “An
Interactive Virtual Icosahedral Loudspeaker Array.” In
Tagungs-CD der deutschen Arbeitsgemeinschaft für
Akustik, pp. 1331–1334.
Zotter, F. 2009. “Analysis and Synthesis of Sound-
Radiation with Spherical Arrays.” PhD dissertation,
University of Music and Performing Arts, Graz, Austria.
Zotter, F., and M. Frank. 2015. “Investigation of Auditory
Objects Caused by Directional Sound Sources in
Rooms.” Acta Physica Polonica A 128(1):5–10.
Zotter, F., et al. 2014. “Preliminary Study on the Per-
ception of Orientation-Changing Directional Sound
Sources in Rooms.” Paper presented at the Fo-
rum Acusticum, 7–12 September, Krakow, Poland.
Available online at ambisonics.iem.at/Members
/zotter/2014 zotter OrientationDirectionalSource.pdf.
Accessed November 2016.