Pierre Alexandre Tremblay, Gerard Roma,
and Owen Green
Centre for Research in New Music
University of Huddersfield
Queensgate Campus, Huddersfield,
HD1 3DH, United Kingdom
{p.a.tremblay, g.roma, o.green}@hud.ac.uk
Enabling Programmatic
Data Mining as Musicking:
The Fluid Corpus
Manipulation Toolkit
Abstract: This article presents a new software toolbox to enable programmatic mining of sound banks for musicking
and musicking-driven research. The toolbox is available for three popular creative coding environments currently
used by “techno-fluent” musicians. The article describes the design rationale and functionality of the toolbox and its
ecosystem, then the development methodology—several versions of the toolbox have been seeded to early adopters who
have, in turn, contributed to the design. Examples of these early usages are presented, and we describe some observed
musical affordances of the proposed approach to the exploration and manipulation of music corpora, as well as the
main roadblocks encountered. We finally reflect on a few emerging themes for the next steps in building a community
around critical programmatic mining of sound banks.
Developments in general computing continue to
be inspired by the possibilities of exploiting large
amounts of data. For “musicking” (Small 1998),
taken as music making in its widest, most diverse,
and inclusive sense, new data-driven paradigms
are promising but still challenging, owing to the
diversity and size of sound collections amassed
by musicians, as well as to the context-dependent
nature of questions that determine the paths of
musical creativity.
Notably, although there has been much research
around the potential of isolated technologies or
techniques in a musical context, there has been
much less detailing or comparing of musical strate-
gies and tactics arising from sustained practice. The
Fluid Corpus Manipulation project (FluCoMa) aims
to help bridge this gap by making available cross-
platform technologies in tandem with supporting
resources and community, all focused on discover-
ing and developing complete creative workflows for
audio corpora with data-driven techniques.
In previous work (Tremblay et al. 2019), we
introduced tools for decomposing and analyzing
sounds to facilitate the creation of audio corpora and
the extraction of audio descriptors from within the
most popular musical creative coding environments
(CCEs): Max, Pure Data (Pd), and SuperCollider. This
leaves open the challenge of enabling musicking
and musicking-driven research using the resulting
corpora and data.
In this article we present the second iteration
of our toolbox, in which we focus on exploring,
interacting with, and manipulating audio corpora
with a framework of tools for organizing and learning
from data. Our aim is to provide this combined
functionality in a single package for each CCE with
enough commonality in syntax and concepts to
enable discussion and exchange between a range
of musicians, in the interest of supporting a broad
community for future research.
In the next section, we review existing tools for
music creation based on audio corpora and note some
limitations with respect to the musical workflows
they imply. We follow with a description of the aims
and content of our toolbox, followed by a section
describing examples of workflows enabled by the
toolbox from our community of early adopters. We
then dedicate a section to offering some insights
into our design process and early observations on
the project's impact. As described previously (Green,
Tremblay, and Roma 2018), our methodology is
rooted in a pluralist and cross-disciplinary view
of techno-fluent musicking. Here, we focus on
some emerging themes around knowledge transfer,
interface granularity, and points of entanglement.
We finish the article with some anticipated next
steps in the Fluid Corpus Manipulation project,
with a view to augmenting the toolset, as well
as strengthening and diversifying an emerging
community.
Computer Music Journal, 45:2, pp. 9–23, Summer 2021
doi:10.1162/COMJ_a_00600
© 2022 Massachusetts Institute of Technology.
Related Work
There is an extensive body of prior work that
provides some of the tools and techniques in which
we are interested. Although none of this work aligns
precisely with our focus on developing a framework
aimed squarely at flexible music making with audio
corpora, it has been a rich source of learning and
motivation. Our design has also been influenced by
the wealth of libraries and learning materials in data
science communities, notably around the Python
language’s scikit-learn library (Pedregosa et al.
2011), and their reflections on interface (Buitinck
et al. 2013).
One category of existing work focuses on spe-
cific forms of musical interaction. AudioGuide
(Hackbarth, Schnell, and Schwarz 2010) and
Orchidea (Carpentier et al. 2012) are geared to-
wards computer-assisted orchestration. Meanwhile,
CataRT (Schwarz et al. 2006) and the more recent
AudioStellar (Garber, Ciccola, and Amusategui
2021) offer an intuitive interface for corpus explo-
ration through snippets arranged in a 2-D space. In
these cases, the very specific focus means that the
system presents itself as more of a black box than
is our aim in the current work, insofar as certain
decisions about workflow, analysis, presentation,
and general interface are already taken and fixed.
This limits the creative-coding affordances of these
packages: It makes exploration of interaction, cus-
tomization, and integration within the CCE more
difficult, if at all possible.
A second category is work that makes available
particular algorithms or tools in isolation, but not
as part of a framework that also supports gathering,
erkunden, and refining data as part of an overall mu-
sical project. Here one could include the Wekinator
application (Fiebrink and Cook 2010), the Max pack-
ages ml.lib (Bullock and Momeni 2015) and ml.star
(Smith and Deal 2014), and certain SuperCollider
extensions (called quarks) such as the KDTree and
KMeans quarks. Most of these solutions are focused
more on gestural data streams and offer neither
facilities for exploring large corpora nor for dynamic
creative coding between various complementary al-
gorithms. Conversely, toolsets that enable handling
and exploring data within CCEs have been proposed,
but without a focus on audio and machine-learning
algorithms. For example, the Bach family of exten-
sions for Max (Agostini and Ghisi 2012) has recently
been enriched with a suite of objects that provides al-
ternatives to Bach’s core focus on Western common
notation (Ghisi and Agon 2016).
Finally, two prior contributions align more closely
with the proposed work, and thus warrant more
extensive discussion: the SCMIR package for Super-
Collider (Collins 2011), and the MuBu and Friends
package for Max (Schnell et al. 2009). Both of these
offer some version of an end-to-end pipeline for
gathering, analyzing, and exploring data, although
neither addresses the additional challenge of target-
ing multiple CCEs.
The SCMIR package provides a complete pipeline
from audio feature extraction to a range of al-
gorithms and representations, along with some
supporting machinery for conditioning data. The
lack of a native plug-in API for sclang (the language
side of SuperCollider) is worked around by using
external programs as “pseudoplug-ins.” Although
the initial focus for SCMIR was analysis, it also
enables creative work with analysis data and al-
gorithms. Our toolbox similarly enables complete
pipelines, but all the computation is implemented
in server-side SuperCollider plug-ins. This provides
a more natural and granular integration with the
SuperCollider environment.
MuBu and Friends likewise offers a framework
with pipelines from audio feature extraction (via
PiPo, cf. Schnell et al. 2017), through to a suite of
algorithms for exploring and learning from data. Its
focus is more general than our proposed toolbox,
insofar as gestural data is also an explicit concern
for MuBu, although less attention is given to the
framework’s place in an overall creative workflow.
At the heart of the framework is the MuBu (multibuffer)
object itself, which extends the familiar paradigm
of the audio buffer with labels, different datatypes,
time tags, and other useful facilities. Our experience
has been that this richness can be difficult to
grasp, however, especially for newcomers, and that
interoperation with the wider Max environment is
not always simple.
The Toolkit
The tools added in the new iteration of the FluCoMa
toolbox (Tremblay et al. 2019) are well established
in data science and most have been previously
used in audio signal-processing research. Our aim
is to enable programmatic interaction with sound
collections and other signal corpora within CCEs
to enable future research around sustained musical
practice with these sorts of tools and concepts.
In the toolbox’s previous iteration, we pursued
this aim with a focus on musical approaches to
signal decomposition and description. The new
additions comprise a suite of more than 30 ob-
jects available for each of the most widely used
CCEs based on a common C++ architecture. These
CCEs encompass Max, SuperCollider, and Pd; a
command-line interface is also under develop-
ment. All code is open source, under a Berkeley
Software Distribution license (BSD-3), and avail-
able on the GitHub repository of the project. Visit
http://flucoma.org for the source code and the latest
binaries.
The following text discusses the design for this
new suite of tools and then describes its main
functionality.
Aims and Priorities
Our foci for the toolkit design stem from wishing
it to be useful as a general framework in its own
right, as well as from broader goals of the FluCoMa
project. The following aims and priorities inform
our design decisions and trade-offs, and they draw
on our experiences with our previous work in this
area (cf. Harker and Tremblay 2012), on lessons from
developing the first iteration of our toolbox, and
on continuous feedback from a pool of musicians
working with the tools from the earliest stages of
development.
1. Native integration.
Our tools should respect as closely as possible
established idioms in the host CCE and allow
easy transfer of data between the framework
and native data structures.
2. Consistency.
Objects should fit together neatly and offer a
consistent way of working. We have opted for
an analogue to the scikit-learn (Pedregosa et al.
2011) naming conventions for our algorithms.
3. Learnability.
The overall framework and the granular-
ity of its objects should afford early, easy
exploration yet offer rich scope for deeper
experimentation.
4. Configurability.
Objects should offer fine adjustment and
tweaking where the algorithm supports it,
ideally in programmatic ways.
5. Scalability.
Algorithms should be able to cope with rea-
sonably large collections of data, within the
confines of currently available computing
power in central processing unit–based per-
sonal computing hardware.
6. Breadth.
An initial offering of well-known and under-
stood algorithms, touching on a breadth of
possible approaches, helps us draw on existing
knowledge and practice, and helps assess the
kinds of extension that might be of interest to
the community.
7. Completeness.
We aim to enable complete workflows from
corpus building, curation, machine listening,
and machine learning to sound making.
In addition, because of the project's overarching
aim to catalyze more and better musicking research
in this area, close attention has been paid to inter-
face consistency across different CCEs, to enable
researchers working in different environments to
exchange ideas and insights. Variations on similar
ideas can be explored iteratively within the dis-
tinct idiomatic contexts that different CCEs afford
(McPherson and Tahıroğlu 2020).
Finally (and perhaps most elusively), there is a com-
mitment to providing tools that are "artist spec"
(Buxton 1997): As well as being sufficiently robust
and performant for use in artistic projects, der Code
base and supporting resources need to be sustainable
well beyond the lifetime of the research project.
Clearly some of these goals are in tension with
each other; we believe, however, that the resulting
toolkit strikes a reasonable balance and makes
a valuable contribution to the goal of helping to
animate current and future research. The following
discussion briefly describes the different object
categories in the framework; subsequent sections
outline some of the workflows these make
possible.
DataSets
The DataSet object is the core component for
all data-processing algorithms. It is a key-value
store with unique string identifiers as keys and
numerical vectors as values. The purpose is mainly
to store feature vectors, but it aims to be as generic
and versatile as possible within that scope. Other
objects in the toolkit consume and produce DataSets
as input and output.
DataSet entries are mostly introduced via the
CCE's audio buffer objects, which are the main
interface of our audio feature-extraction objects.
Buffers are familiar to creative coders and the
only open-ended memory access available in all
music CCEs. Many of our objects copy data to
new DataSets, rather than having many objects
depend on shared DataSets. We found this approach
works well for practical amounts of data using
current computers and is typically much easier
to reason about than shared access, especially in
multithreaded environments.
DataSet instances have a JSON interface, as do
all other objects in the toolkit that hold significant
amounts of data and variables in their state. This
allows a widely supported way to store, back up,
and transfer data into different environments, as
well as loading to and from native CCE dictionary
structures. A sibling object is the LabelSet, which
holds labels instead of feature vectors, and is used
particularly in classification tasks.
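To make the data model concrete, the following minimal sketch shows the kind of key-value structure a DataSet holds. It is written in Python rather than in a host CCE, and the JSON schema shown is illustrative rather than the toolkit's normative format:

    import json

    # Conceptual sketch of the DataSet data model (not the toolkit's API):
    # unique string identifiers mapped to equal-length numeric vectors.
    dataset = {
        "slice-0001": [0.12, 0.83, 0.40],
        "slice-0002": [0.09, 0.77, 0.55],
    }

    # JSON round-trip for backup and exchange between environments; the
    # {"cols": ..., "data": ...} layout here is illustrative only.
    serialized = json.dumps({"cols": 3, "data": dataset})
    restored = json.loads(serialized)["data"]

    # A sibling LabelSet associates a label, rather than a vector, with
    # each identifier, as used in classification tasks.
    labelset = {"slice-0001": "voiced", "slice-0002": "unvoiced"}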
Nearest Neighbors
Similarity queries are perhaps the most well-
established mechanism in descriptor-driven corpus-
based music making, such as concatenative syn-
thesis or automated orchestration. Usually imple-
mented via nearest-neighbor algorithms, they allow
the association of sounds by proximity according
to some set of acoustic features or derived metric.
Following established practice, we implement a k-d
tree to perform efficient queries. Our implementa-
tion allows querying within a given radius around
a point and can report the distances to the returned
points to allow intuitive use of this feature. This
algorithm is often significantly better than brute-
force search, although limited with respect to data
dimensionality and dynamically adding or removing
tree entries.
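As a rough illustration of these queries, the sketch below uses scikit-learn's KDTree, whose interface our objects echo, as a stand-in for the toolkit's own object; the descriptor data are randomly generated placeholders:

    import numpy as np
    from sklearn.neighbors import KDTree

    features = np.random.rand(500, 8)             # 500 corpus items, 8 descriptors
    ids = [f"slice-{i:04d}" for i in range(500)]  # their string identifiers

    tree = KDTree(features)                       # build the index once
    query = np.random.rand(1, 8)                  # a newly analyzed sound

    # k-nearest query, with distances reported for intuitive weighting.
    dist, ind = tree.query(query, k=5)
    print([(ids[i], d) for i, d in zip(ind[0], dist[0])])

    # Radius query: everything within a given distance of the query point.
    ind_r, dist_r = tree.query_radius(query, r=0.5, return_distance=True)
    print(len(ind_r[0]), "items within radius")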
Supervised Machine Learning
Supervised machine learning models a relationship
between input and output data, both of which are
supplied by the user. It is well established in creative
musical contexts, as has been shown in longitudinal
research examining sustained practice (Fiebrink
and Sonami 2020), as well as in many disciplines
involved with music and audio analysis. This type of
machine learning can be used both for classification
(categorizing inputs into discrete classes) und für
regression (devising a continuous mapping between
a given set of inputs and outputs). These kinds of
tasks can be quite intuitive for musicians, especially
if they have experimented with machine listening
and tried to handcraft code for creative purposes.
The toolbox provides both k-nearest neighbor
(KNN) and multilayer perceptron (MLP) regression
and classification. The KNN models are lightweight
and very quick to train and to query, and they are
conducive to experimentation. Nevertheless, they are limited
in the complexity of relationships they can model.
Models using MLP are more appropriate for problems
that are more difficult: Our implementation allows
flexible specification of the architecture and a
choice of activation functions, which affords a
great number of configurations and combinations.
The neural network regressor implementation also
allows both input and output access to any layer, so
an MLP can be used as a feature extractor or to learn
complex mappings: for example, an autoencoder
(Bourlard and Kamp 1988) setup can be used to learn
hidden representations from data, and our interface
allows this learned representation to be queried
with new data. Although ever more possibilities
and layer types keep appearing in the deep-learning
community, our estimation was that a tailored
implementation has better chances of providing a
good balance of flexibility and complexity within
the computing power available to most of the
creative-coding community. Finally, MLPs are quite
amenable to the relatively small sizes of musicking
datasets, and even outliers or noise may still produce
fruitful artistic results.
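The regression use case can be illustrated with scikit-learn's MLP as a stand-in for the toolkit's object (which, unlike the stand-in, exposes taps into arbitrary layers); the data here are random placeholders for a small, curated training set:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    X = rng.random((40, 2))   # 40 curated points, e.g., (x, y) pad positions
    y = rng.random((40, 16))  # 16 synthesis parameters chosen at each point

    # Two hidden layers of eight tanh units: one of many possible shapes.
    mlp = MLPRegressor(hidden_layer_sizes=(8, 8), activation="tanh",
                       max_iter=5000, random_state=0)
    mlp.fit(X, y)

    # At performance time, a controller position yields a parameter set.
    params = mlp.predict([[0.3, 0.7]])[0]

    # An autoencoder variant fits X to X through a narrow hidden layer and
    # reads the bottleneck activations as a learned representation.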
Unsupervised Machine Learning
In unsupervised learning schemes, only input
data are provided, and the algorithm attempts
to model the data directly. Although this means
that the resulting outputs may be more abstract,
this approach requires less preparation and can
be advantageous for learning general features or
spaces that relate items in a DataSet. Given the
arduousness of obtaining supervised training data
for arbitrary creative endeavors, especially in the
midst of what Impett (2000) calls the “grip-slip
relationship between vision and realization,”
we expect unsupervised machine learning to be
particularly useful in data-driven musicking.
Although recent developments in general ma-
chine learning have not been as spectacular as
in the supervised case, in our view there is still
much promise for establishing existing tools and
paradigms for unsupervised machine learning in
creative domains. Our main foci are data clustering
and dimensionality reduction.
With respect to data clustering, the toolbox in-
cludes an implementation of the classic algorithm
for k-means clustering, which has a long tradition
in audio signal processing (Lloyd 1982). Although
the output of clustering may seem a bit abstract
for a creative workflow, it can be used as a building
block for complex interfaces. We plan to add other
clustering algorithms to overcome the classic lim-
itations of k-means, namely globular shapes and a
predefined number of clusters, as well as metrics for
assessing clustering quality.
The use of dimensionality reduction is perhaps
more straightforward, as it has direct applications
for visualization and parameter mapping, as well
as preprocessing data for other tasks in machine
learning. In previous research we demonstrated the
creative potential for adaptive interfaces based on
the results of different dimensionality-reduction al-
gorithms (Roma, Green, and Tremblay 2019a). Based
on the intuitions gained, we provide a selection in
the toolset: principal components analysis (PCA),
which is also generally useful for linear mapping
and preprocessing of high-dimensional data; mul-
tidimensional scaling (MDS), which allows easy
experimentation with different distance measures
between data points; and uniform manifold approx-
imation and projection (UMAP), which can be seen
as the state-of-the-art algorithm for dimensionality
reduction. Autoencoders based on MLP, discussed
earlier, can also be used for nonlinear dimensionality
reduction.
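As a stand-in illustration of this unsupervised workflow, the sketch below clusters a descriptor space with k-means and projects it to two dimensions with PCA, again with scikit-learn substituting for the toolkit's objects and random placeholder data:

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA

    features = np.random.rand(300, 20)     # 300 sounds, 20 descriptors

    # k-means assigns each sound to one of k globular clusters...
    km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(features)
    cluster_of = km.labels_                # one cluster index per sound

    # ...and PCA projects the descriptors to 2-D for a scatter-plot
    # interface (UMAP or MDS could be swapped in for nonlinear layouts).
    xy = PCA(n_components=2).fit_transform(features)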
Data Scaling and Normalization
Data scaling and normalization are generally nec-
essary when using any of the described machine-
learning algorithms, as imbalances in the scale of
the different dimensions can make it very hard to
converge in multidimensional spaces, and some
models expect inputs to be centered or scaled. The
toolbox provides a set of common preprocessing ob-
jects for normalization, standardization, and robust
scaling, alongside other housekeeping processes
such as thresholding and outlier removal.
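The three flavors of scaling can be sketched as follows, with scikit-learn stand-ins and a hypothetical pair of descriptors whose raw scales differ by orders of magnitude:

    import numpy as np
    from sklearn.preprocessing import (MinMaxScaler, RobustScaler,
                                       StandardScaler)

    # Descriptors on wildly different scales distort distance-based models.
    features = np.column_stack([
        np.random.uniform(200, 8000, 300),  # spectral centroid, in Hz
        np.random.uniform(0, 1, 300),       # spectral flatness, unitless
    ])

    normalized   = MinMaxScaler().fit_transform(features)    # to [0, 1]
    standardized = StandardScaler().fit_transform(features)  # zero mean, unit variance
    robust       = RobustScaler().fit_transform(features)    # median/IQR, outlier-tolerant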
Querying
Many of the early adopters had a keen interest in
manipulating the numerical data in DataSets, typi-
cally those from audio features. Such an expectation
is understandable, as SQL-like interfaces have been
frequently used to interact with audio corpora (e.g.,
Schwarz et al. 2006; Akkermans et al. 2011). The
toolbox provides an object to accommodate this kind
of interaction, allowing manual filtering of the rows
and columns of DataSets as well as compositing
before further processing.
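A rough sketch of this kind of row-and-column filtering, with hypothetical column indices standing in for named descriptors, might read:

    import numpy as np

    ids = np.array([f"slice-{i:04d}" for i in range(300)])
    data = np.random.rand(300, 10)        # 10 descriptor columns per item

    PITCH, DURATION = 3, 7                # hypothetical column positions

    # Keep rows meeting a compound condition, then select three columns.
    mask = (data[:, DURATION] > 0.5) & (data[:, PITCH] < 0.8)
    filtered_ids = ids[mask]
    filtered = data[mask][:, [0, 1, PITCH]]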
Abstractions and Utilities
Finally, utilities are provided for accommodating
common workflows and dealing with specific
affordances of each CCE, including buffer processing
and real-time querying operations. We provide some
flexible abstractions that speed up the process of
assembling a DataSet from a starting collection
of audio samples and files, allowing parameters
for batch analyses to be expressed concisely and
abstracting away boilerplate code.
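What such an abstraction saves the user from writing can be sketched as follows; this is illustrative Python with an assumed soundfile reader and deliberately crude stand-in descriptors, not the toolkit's own analyses:

    import json
    from pathlib import Path

    import numpy as np
    import soundfile as sf  # assumed dependency; any audio reader would do

    def describe(samples, sr):
        # Stand-in analysis: RMS level and a crude spectral centroid.
        mag = np.abs(np.fft.rfft(samples))
        freqs = np.fft.rfftfreq(len(samples), 1 / sr)
        centroid = float((freqs * mag).sum() / (mag.sum() + 1e-9))
        rms = float(np.sqrt((samples ** 2).mean()))
        return [rms, centroid]

    dataset = {}
    for path in sorted(Path("corpus").glob("*.wav")):
        audio, sr = sf.read(path)
        if audio.ndim > 1:
            audio = audio.mean(axis=1)  # mix down to mono
        dataset[path.stem] = describe(audio, sr)

    with open("corpus.json", "w") as f:
        json.dump({"cols": 2, "data": dataset}, f)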
Usage Examples
We believe that the value of the proposed framework
rests in large part on the affordances it yields for
musicians to experiment with these technologies
in environments already in their creative workflow,
so as to stimulate more and better-documented
practical research on creative strategies and tactics
around their use in practice. A key methodological
commitment of the project is that our designs are
informed and interrogated at the earliest opportunity
through intense use by musicians outside the
development team. We were lucky to benefit from
the input of a group of commissioned artists as well
as a pool of enthusiastic and engaged voluntary
alpha users. In the following text we will describe
a few examples of creative workflows enabled by
the toolbox, followed by a more detailed example
from a piece by the first author, to provide an idea
of how objects can be applied to musical practice.
As previously stated, we hope that the resulting
discussions around such practices will go beyond
implementation details and engage across the
various communities around each respective CCE.
To encourage such an outcome, we commissioned
musicians covering a range of practices, all fluent in
Max or SuperCollider, to test the ability to integrate
our framework into their workflows and to challenge
our early interface designs.
Data-Driven Workflows
As well as incorporating the toolset into their mu-
sicking, we asked that our early coinvestigators
contribute to a series of online presentations dis-
cussing their projects and experiences; these presen-
tations are available at www.flucoma.org/plenaries.
In addition, Jacob Hart, a musicology researcher
on the project, has amassed a valuable archive of
analysis, interviews, and code as part of his work on
tracking the creative processes of our commissioned
musicians.
Some common patterns of workflows have been
observed in these early entanglements and can be
grouped into the categories corpus building, dynamic
querying, exploring parameter spaces, and adaptive
interfaces.
Corpus Building
The tools for statistical processing (scaling, cluster-
ing, and dimensionality reduction) have been used to
clean and filter sound corpora obtained in different
ways (e.g., mass generation, sonification, or informal
recording). This includes removal of outliers or near
duplicates, finding good references and ranges for in-
tuitive descriptor values, and sanitizing data across
dimensions. Importantly, the interchangeability of
the components of this process has allowed our
early users to compare their intuition of scaling and
distance with computed results in various descriptor
spaces. These intuitions and the computed results
often differed, which was a source, in equal parts,
of bemusement and of inspiring serendipity for our
artists.
Dynamic Querying
As opposed to fixed workflows based on similarity
queries in concatenative synthesis or automatic
orchestration tools, our toolbox’s modular interface
allows for an approach to complex material that is
more nuanced. For example, sounds can be modeled
into different segment decompositions (e.g., stages of
the energy envelope), which are analyzed separately
and then either merged or indexed into multiple
descriptor spaces. Similarity queries can also be
forked: Different indices can be queried depending
on the value of a given descriptor. A typical example
is using pitch features only for voiced segments of
a certain duration, otherwise reverting to spectral
centroid.

Figure 1. A typical workflow for similarity matching,
showing the progression from source audio through
steps of segmentation and analysis, statistical
processing, and dimensionality reduction, through to
indexing. In the toolkit, there are multiple options
available for these steps. PCA = principal components
analysis; MDS = multidimensional scaling; UMAP =
uniform manifold approximation and projection.
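A minimal sketch of such a forked query, in illustrative Python with placeholder features and a per-segment voicedness flag, might read:

    import numpy as np
    from sklearn.neighbors import KDTree

    n = 400
    pitch_feats    = np.random.rand(n, 3)  # e.g., pitch, confidence, ...
    spectral_feats = np.random.rand(n, 4)  # e.g., centroid, flatness, ...
    voiced = np.random.rand(n) > 0.5       # per-segment voicedness flag

    voiced_idx, unvoiced_idx = np.flatnonzero(voiced), np.flatnonzero(~voiced)
    pitch_tree    = KDTree(pitch_feats[voiced_idx])
    spectral_tree = KDTree(spectral_feats[unvoiced_idx])

    def nearest(seg_pitch, seg_spectral, is_voiced):
        # Route the query to the index that suits the segment's character.
        if is_voiced:
            _, i = pitch_tree.query([seg_pitch], k=1)
            return voiced_idx[i[0][0]]
        _, i = spectral_tree.query([seg_spectral], k=1)
        return unvoiced_idx[i[0][0]]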
Exploring Parameter Spaces
Machine learning can be used to create nonlinear
mappings of parameter spaces of complex synthe-
sizers and processors. Zum Beispiel, MLP objects
are configured and trained to expand from a few
control dimensions (which can be more easily
mapped to input devices and interfaces) to many
synthesis parameters. In the use experiences of our
early adopters, the proposed interface allowed for
ad hoc exploration of larger synthesis parameter
spaces, taking advantage of small, curated training
datasets. Furthermore, experimentation with re-
gression between synthesizer parameters and the
acoustic features of the resulting sounds allowed for
inspiring audio-driven resynthesis models.
Adaptive Interfaces
As shown by Roma, Green, and Tremblay (2019a),
interactive scatter plots for playing sound corpora
can be extended using dimensionality reduction and
visualization (which can also represent clusterings
and classifications) to make playable interfaces that
adapt to the data distribution. Simple examples
include interaction with a touchscreen or other
controllers, or sequencing of spatial trajectories and
patterns. A more sophisticated interface developed
by one of the authors allows for the use of live coding
to control a number of playback heads navigating a
visualization of a sound collection. This visualiza-
tion is generated via dimensionality reduction, Und
also displays the items with descriptor-based sound
icons. This interface affords a view of both structure
and content at a glance. Each playback head follows
a trajectory that can be recorded interactively or live
coded as a function that generates a new position
based on the current position.
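Such a playback head can be sketched as a position-to-position function over the 2-D projection; the following illustrative Python substitutes a random walk for a live-coded trajectory:

    import numpy as np
    from sklearn.neighbors import KDTree

    xy = np.random.rand(500, 2)           # 2-D projection of the corpus
    tree = KDTree(xy)

    def drift(pos, step=0.02):
        # One possible trajectory function: a small, clipped random walk.
        return np.clip(pos + np.random.uniform(-step, step, 2), 0, 1)

    pos = np.array([0.5, 0.5])
    for _ in range(16):
        pos = drift(pos)                  # next position from current one
        _, ind = tree.query([pos], k=1)
        print("play item", ind[0][0])     # trigger the nearest sound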
Tuning Nearest Neighbors
Figur 1 shows an example of dynamic querying
in more detail. This comes from a piece entitled
“Newsfeed,” a studio composition by the first
author in which similarity-based queries were used
to create new utterances by concatenating various
segments from a corpus of spoken voice material.
For context, Figur 1 shows a typical workflow
for similarity matching, alongside options available
in our toolkit for each stage. For training, each stage
will normally be performed in one batch. Collections
of host buffers are converted to a DataSet, and each
modeling stage produces a new DataSet that serves
as input to the next stage. At query time, new
data points (in CCE buffers) are passed through the
pipeline to the k-d tree, which returns the nearest
sounds from the corpus.
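The train-then-query split can be traced with scikit-learn stand-ins for the toolkit's stages; the feature counts below are arbitrary:

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KDTree
    from sklearn.preprocessing import StandardScaler

    corpus = np.random.rand(1000, 24)     # pooled per-segment descriptors

    # Training: fit each stage on the corpus, feeding the next stage.
    scaler = StandardScaler().fit(corpus)
    pca = PCA(n_components=8).fit(scaler.transform(corpus))
    tree = KDTree(pca.transform(scaler.transform(corpus)))

    # Query time: a new analyzed segment takes the identical path.
    new_point = np.random.rand(1, 24)
    dist, ind = tree.query(pca.transform(scaler.transform(new_point)), k=3)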
In practice, a great deal of experimentation is
required. The granularity of segmentation, choice
of features, pooling mechanism, and so on, can
significantly affect the behavior of the similarity
query at the end of the chain. Moreover, in a creative
context, the desired musical behavior is often
neither known nor fixed; instead, it is negotiated
through an interactive process of programmatic
musicking, in which exploring options and their
impact on sonic results in context transforms
musical questions in a fully circular process.
The first author found that he was able to en-
rich the workflow above with custom tools to
fluidly and interactively explore options during
the composition process. The ability to filter and
merge different DataSets using the DataSetQuery
object meant it was possible to quickly try out
different combinations of descriptors, scaling
schemes, and dimension-reduction models. Ad-
ditionally, the segmentation could be adjusted ex
post facto by inserting a clustering model into the
chain prior to pooling: Starting with a deliberately
fine-grained segmentation, the k-means method
was then used to identify adjacent segments that
could be rejoined according to selected descriptors,
enabling an effective way to discover various time
scales and groupings that worked well in different
contexts.
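A sketch of this rejoining tactic, in illustrative Python with arbitrary descriptors and cluster count, might read:

    import numpy as np
    from sklearn.cluster import KMeans

    seg_feats = np.random.rand(200, 12)   # descriptors per fine segment
    labels = KMeans(n_clusters=10, n_init=10,
                    random_state=0).fit_predict(seg_feats)

    # Merge temporally adjacent segments that fall in the same cluster.
    merged = [[0]]                        # groups of consecutive indices
    for i in range(1, len(labels)):
        if labels[i] == labels[i - 1]:
            merged[-1].append(i)          # same cluster: extend the group
        else:
            merged.append([i])            # new cluster: start a new group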
Crucially, new ideas and possibilities suggested
themselves during these explorations in a way
that would have been much less likely in a setting
that was less interactive and more disjointed, als
would have been the case with a solution that was
less configurable. Moreover, this creative hacking
workflow was able to integrate well into the first
author’s distinctive practice, which bridges working
in CCEs and digital audio workstations in ad hoc
ways, at times through audio sends, at other times
via MIDI events, or even full edit decision lists.
These were easily thrown together in context to
best fit the need of the musicking moment.
This last example, along with a few others from the
pool of early users, is available both in the online
presentations mentioned earlier in this article and
in code snippets on the project’s online forum at
discourse.flucoma.org.
Preliminary Assessments
The first version of the new extensions was released
to our community of creative coders in November
2019. At that point the community comprised
eleven participants plus the authors, augmented
by a small group of eager people along the way.
Throughout the process, we released nine private
updates and then transitioned to a public beta phase
at the end of April 2021. Although it is still too
soon to reflect in detail on how well the proposed
framework supports sustained musical research, Und
notwithstanding the encouraging range of possible
uses described above, we have noted some early
trends in people’s encounters with the tools that
will guide our future work.
Overall Framework Aims
The variety and sophistication of the workflows
described in the previous section provide encourag-
ing signals that our aims for the toolkit are being
fulfilled. Our early adopters have been able to in-
tegrate the tools into their diverse range of broader
working patterns and to complete a variety of new
work that would have been otherwise much more
difficult. This suggests that the tools themselves
are sufficiently robust, performant, and usable to
support serious creative work, and that the coverage
of algorithm types provides enough breadth for a
range of ideas to be explored and brought to fruition.
From a framework development perspective, the
C++ architecture has made it easy to add and adapt
objects to the toolbox.
At the time of writing, some avenues for en-
hancement are already clear, based on feedback
from our contributing artists and the authors’ own
experiences of creative work with the tools. Most
prominent among these are some rough edges
around idiomatic integration into CCEs vis-à-vis the
desired ergonomics of our higher-level vision of fluid
corpus manipulation. Indeed, the goal of ensuring
the consistency of our interfaces across CCEs can
often be in tension with the goal of enabling familiar
and idiomatic usage in a particular environment. An
area for some further development, now that there
is a common basis in place, is to look at potential
improvements in environment-specific integration.
This is most apparent in the way in which
we have used audio buffers as a type of scalable
and configurable container that has an equivalent
available in all our target environments. Although
this approach has enabled rapid and simultaneous
development for these diverse architectures, as well
as providing fast, real-time access to scalable data
structures, it has also yielded interfaces that can be
slightly cumbersome in all environments. A possible
approach to improving this could be to apply idioms
that are both higher level and more familiar on top
of these lower-level (but consistent) mechanisms.
Now that a certain amount of usage experience
has accumulated, it is easy to see what these
idioms might look like, given the common types
of “boilerplate” code that we and our collaborators
have had to repeatedly produce. For example, in Max
and Pd, we could provide a mechanism for dealing
transparently with CCE-native lists to store and
recall data in containers and to trigger algorithms.
Likewise, on the SuperCollider server, an equivalent
abstraction would afford working directly with
control-rate signals without needing to address and
access buffers.
Wider Project Aims: Human Learning
Besides some possible areas for ergonomic improve-
ment and functional expansion discussed previously,
it is worth highlighting some areas of conceptual
difficulty that early adopters have encountered with
these new tools, not least as these can be suggestive
of ways in which specifically musical applications of
such technologies may depart from more established
usage.
The workflows presented in the previous section
took shape in a context of continual exchange and
discussion, on the project forum and through regu-
lar online group meetings in which early adopters
shared work in progress on their respective projects.
Such communication on musical mining has been
helped by the consistency of concepts and interfaces
between the Max and SuperCollider implementa-
tions. This communal approach has been significant
not just in showing what can be done, but in help-
ing to form shared understandings and research
ideas, particularly across different disciplines and
styles. It also indicated areas in which the broader
project could help musicians become fluent with,
and critical of, these technologies.
Just as importantly, this group of techno-fluent
composers was able to provide feedback on the
learning curve of working with concepts from data
science while using our emerging toolbox. We have
noticed, in particular, that for those with little
prior experience in the language and techniques
of data science, the conceptual jumps can seem
quite daunting: People are confronted with abstract
features and procedures far removed from musical
intuitions, and with the task of relating objective
validation to subjective expectations. Moreover,
the general hyperbole surrounding the potential of
machine learning and machine listening requires
a careful and tactful calibration of expectations
while still showcasing the creative affordances
they provide when these technologies are embraced
divergently.
By giving our early adopters a forum in which to
voice their concerns, they were able to challenge the
need to understand some of the more obscure data
science concepts and their usefulness in actual mu-
sicking. This approach, through frank dialogue, War
useful to understand where and when to stop with
technical explanations and when to use metaphors.
Broadly speaking, many of these episodes have
a common theme. The objects in our second of-
fering represent a jump in abstractness from those
introduced first. We departed from algorithms that
already had a basis in musical signal processing and
Analyse, and we delved into a range of algorithms
more simply concerned with data in the general
sense. In the following we spell out the most salient
of these difficulties.
Dimensionality
Thinking beyond three dimensions is hard and
quickly becomes highly abstract. Becoming com-
fortable with this mental exercise is, in part, simply
a matter of practice. Many algorithms start to per-
form markedly worse as the dimensionality of input
data increases, but the extent to which this palpably
degrades the results obtained depends not only on
the algorithm but also on the data and what one
is trying to achieve. This represents a pedagogi-
cal challenge in giving sufficiently useful rules of
thumb to observe, in terms of which models might
work reasonably given some quantity of multidi-
mensional input data. In particular, what counts as
“a lot” of dimensions can vary markedly between
different algorithms, as can the interaction between
this dimensionality and the quantity of training
examples.
Breakdowns in terms of dimensionality are
especially acute when dealing with models that use
distance metrics between multidimensional input
points. When trying to develop models in which
a given algorithm's notion of how similar two
corpus items are correlates with a human's perspective,
it is reasonable to, at first, expect that adding
more input features might improve this correlation
by giving the computer more data to work with.
Although as human listeners we are able to latch
fluidly and rapidly onto different saliences, Die
“curse of dimensionality” in machine learning
means that many algorithms simply struggle as the
number of input dimensions increases.
Data Quality
The dependence of models on the quality of their
training inputs is a well-known problem in machine-
learning applications, but not one that admits of very
simple explanations. It is not trivial to ensure that
those factors that could affect the notional quality of
training data in a musicking context are understood.
Such factors could include the abundance of data,
the consistency of labeling, the degree to which the
data’s dimensions are correlated, how the data are
statistically distributed, how much coverage the
training examples offer of expected future inputs,
the relative scales between dimensions, and how
input scales relate to perceptual correlates.
In practical terms, the codependence between
Daten, Algorithmus, and desired outcomes means that
coming to terms with each of these points requires
iterated experimentation for each new application.
Although such empiricism is not alien to musicians,
the number of coupled factors and their relative
abstractness can lead to confusion or frustration. For
example, to improve results, one might try adjusting
the "hyperparameters" of an algorithm, whereas
it could often be better (but more elusive) to try
“improving” the input data. Although the toolset
provides the means to subjectively assess model
performance in relation to musical goals, we have
found in practice that the temptation to keep tuning
models in preference to such “sanity checks” can be
strong and can sometimes lead musicians astray.
This often-frustrating tinkering of various code-
pendent moving parts is especially acute in the
case of the MLP, and neural networks in general,
where finding a workable arrangement of network
architecture and hyperparameters for given training
data is necessarily a case of trial and error. Als solche,
a clear avenue for further research in this area is
to arrive at additional ways of assessing input data,
evaluating model results, and developing useful
guidelines for improvement, such as when to add
more or different data, when to preprocess, and so
forth.
Units and Scales
When working with the data analysis objects, one
quite quickly ceases to have data in physically
relatable units like MIDI pitch or dB—however
perceptually approximate—and instead one needs
to contend with data whose units and scales are
more abstract. Although this is not in and of itself
a problem for creative coding musicians, who deal
routinely with dimensionless ranges (for example,
when mapping between controllers and proces-
sors), developing an intuition for what these new
quantities might represent can be challenging.
This is particularly true of dimensionality-
reduction processes, in which the nature of the
outputs depends on the inputs in a way that can
be hard to anticipate. Zum Beispiel, even for the
simplest such process in our toolkit, PCA, the map-
ping it performs is not easy to explain in general
physical terms, and coming to terms with the range
of the output data requires some familiarity with
the concept of statistical variance. This has caused
particular problems, in our experience, when artists
have wanted to relate the products of independent
dimensionality reductions to each other. In general,
there is no simple mechanism to do this, because
each result is derived wholly from the input data the
model was shown. But understanding why, or how
to get around it, requires an engagement with the
implementation details of an algorithm that may be
daunting.
Future Work
At the time of writing, a small community of
users is gathering momentum, but a few clear next
steps are emerging, both in terms of technological
affordances that are currently missing in the toolset,
and in terms of our ambitious aim to foster a wider,
more diverse community of critical creative coders.
Potential Framework Extensions
Early experiences with the framework have sug-
gested some possible additions to the toolkit for us
to consider. Concentrating on first providing well-
established and well-understood algorithms has
enabled swift development, as well as taking advan-
tage of a range of learning resources online and in the
Literatur. We anticipated that by seeing what people
did with these resources and noting where block-
ages occurred, a principled way to draw on more
recent developments or extend existing techniques
would emerge. Some examples of this happened
quite quickly and have been written about in other
publications—for instance, implementing the com-
paratively recent UMAP algorithm to supplement
andere, more-established, dimensionality-reduction
Objekte, as well as proposing a range of musically
targeted applications of existing techniques such as
nonnegative matrix factorization (NMF; discussed
by Roma, Green, and Tremblay 2019b; for an in-
troduction to the NMF technique itself, cf. Lee and
Seung 1999) and using similarity graphs (Roma,
Tremblay, and Green 2021).
In the final year of the FluCoMa project, we
intend to continue growing the toolbox with new
objects and new object categories. Several signal-
processing tools for “hybridizing” sounds and
interpolating gestures are already available (Roma,
Green, and Tremblay 2020). Moreover, we have
postponed further optimization of the code to this
last segment: By observing the toolkit’s performance
in practice, and what sorts of size and complexity
of collection people end up gravitating towards, Wir
hope to streamline this process with a particular
perspective on more investigation of the real-time
possibilities and hurdles created by musician use.
Meanwhile, based on our own creative work and
feedback from our collaborators, some possible av-
enues for expansion can be described as: temporally
aware analysis; dynamic, high-dimensional index-
ing; alternative time-frequency representations; Und
validation tools.
Temporally Aware Analysis

Time is, of course, of foundational importance to
sonic and musical creative work, and many of the
algorithms we currently provide do not handle it
explicitly, which can lead both to confusion and to
extra effort on the part of users. One avenue would
be to look at making some algorithms available that
are geared at analyzing the temporal evolution of
Merkmale. There is a range of techniques that could
be explored, such as convolutional extensions to
NMF (Smaragdis, Raj, and Shashanka 2008), or the
various approaches to recurrent neural networks
from reservoir computing (Kiefer 2014, 2019).
Dynamic, High-Dimensional Indexing
Although k-d trees are easy to implement and
relatively lightweight, they suffer from being costly
to modify after construction and from scaling poorly
to higher dimensionalities. One possible addition
would be to explore approximate nearest-neighbor
Algorithmen (Slaney and Casey 2008), which scale
more gracefully (at the expense of exact matching),
as well as alternative, mutable tree structures that
would allow corpora to be more easily constructed
and queried in real-time.
Alternative Time-Frequency Representations
Extensions of and alternatives to the well-
established Fourier-domain analysis of audio, solch
as constant-Q transforms (CQTs), the scattering
transform (Salamon and Bello 2015), or auditory
models (Lyon et al. 2010), are by no means new, but
remain broadly unexplored in CCEs. Possible rea-
sons for this may be the lack of invertibility of many
CQT implementations until recently (Necciari et al.
2013), as well as a lack of experience of what modi-
fications or analyses are practical and useful in the
transform domain. Our framework is, we believe,
flexible enough to provide a basis for musicians to
start exploring some of these questions themselves,
should implementations be made available.
Validation Tools

One final area of further work would be to establish
a range of tools for assessing learned models in
supervised-learning workflows, to help streamline
their use.
Breadth of Audience
Our target audience so far has been “fluent” users
of CCEs: See Green, Tremblay, and Roma (2018)
for a discussion of how we conceptualize ideas of
technological dispositions in relation to techno-
fluency. In this way we have hoped that usage
experience will help catalyze a virtuous circle in
which multiple “means of entry” can be curated to
afford less-fluent or more-casual engagement, which
in turn would feed new, divergent, inclusive ideas
into the community. This is of especial importance
to the suite of data-oriented objects discussed here,
in which the extra degrees of abstraction that
come with machine-learning processes that are
more generic may not be immediately musically
suggestive to all. A number of tactics may well be
fruitful here, discussed in the rest of this section.
Online Resources
Work is currently underway on developing a plat-
form for the support of learning resources that
respond to the kinds of need identified in the previ-
ous section, as well as clarifying and supplementing
not only the documentation and examples available
with the current package but also the knowledge
base accumulating on our forum. In addition, we
will run workshops and publish workshop materi-
als and essays, with the hope that we continue to
benefit and learn from diverse feedback and discus-
sions that can inform future technical and critical
work in this area. The goal here is to target a wide
breadth of familiarity and to retain focus specifically
on working with these technologies in a musical
Kontext, while pointing to more general material for
those who wish to go “deeper.” Again, granularity
of information and interface to knowledge and ap-
plications is our main research focus, and we hope
to be able to tap into our first series of workshops to
enable as many inclusive entry points as possible for
the complex questions raised by machine listening
and machine learning.
Enabling Code Snippets
The accumulated experience from our early releases,
commissions, and first workshops reveals a few
recurring tasks that can be encapsulated in the
distributed package. For example, exploring a corpus
visually in a 2-D space, as popularized by CataRT
(Schwarz et al. 2006) and continued by AudioStellar
(Garber, Ciccola, and Amusategui 2021), provides
a useful abstraction in its own right, as well as
a fruitful way to explore the practical effects of
different dimensionality-reduction strategies. Careful
consideration of how much a specific
workflow is implied by such snippets is always at
the forefront of their design.
Higher-Level Idioms
Encapsulation can be taken further by exploring
different idioms altogether. For example, we could
follow the example of the Vizzie package for Max (see
http://cycling74.com/articles/introducing-vizzie) as
an alternative paradigm for working with Jitter
for video processing; we could develop Max4Live
devices aimed specifically at common workflows
with Ableton Live; or we could develop wrappers for
additional host environments, such as browsers, via
a cross-compiler like Emscripten or game engines
like Unity3D.
One motivation for “higher-level” interfaces is
that a set of options that is more curated (and so
necessarily more restricted) can, in practice, be
more attractive or more empowering to a range of
potential contributors. The danger here, however, is
that we simply entrench too many of our ideas of
musical workflow into a less flexible setting.
Our long-term hope is that engagement from a
broad cross-section of the creative coding music
community will help to turn this into a collabo-
rative effort that supports and informs sustained,
collaborative musicking research.
Critical Reflection on Method
Early in the project (cf. Green, Tremblay, and Roma
2018) we situated our methodological approach in
the context of critical philosophy of technology
(Feenberg 2017) and recent critical work on interdis-
ciplinarity (Barry and Born 2013), with the aim that
this would not only help strengthen our particular
musical and technical goals with this project but
also form a basis, more generally, for combining
research into artistic practice with research into
technical design. As such, we have hoped to work
towards what Barry and Born have called a “logic
of ontology,” in which interdisciplinary work leads
to a shared shift in understanding of the objects of
study, in contrast to situations where disciplines
merely end up in service to each other.
These aspirations warrant some sustained inter-
rogation, beyond the scope of this article, as we
approach the end of this first extensive period of re-
search. Work on this basis is proceeding, concerned
with examining how our aspirations for the toolkit’s
ecosystem and the cross-disciplinary methodol-
ogy that produced it fared in practice, and what
lessons we might draw upon to inform future ef-
forts at music technology research that emphasizes
accountability to diverse artistic communities.
Conclusions
This article presented the second iteration of the
Fluid Corpus Manipulation toolbox, a new software
framework for programmatic mining of sound banks
in CCEs. Combined with its previous iteration, it
offers a broad system that enables programmatic
data-driven musicking for interaction with audio
and other data corpora within popular creative cod-
ing environments. Early usage and feedback from
a community of users suggest that, through sev-
eral iterations, the toolbox has fulfilled its design
goals with respect to native integration, consis-
tency, learnability and configurability of interface,
scalability, and breadth, while enabling complete
mining-as-musicking workflows. We have shown a
range of possibilities that emerged from commis-
sioned works, and we hope that further interest in
the toolkit will be stimulated by the premieres of
these works, as well as their documentation. We
also hope that by reporting the various learning
hurdles needed to enable an empowered fluency
with these technologies, and by reflecting on our
next challenges and on the potential additions to
the toolkit, we will foster an open, inclusive, and
critical research community spanning coding environments,
institutional borders, and disciplinary enclaves.

Acknowledgments

This article is an extended version of our paper
presented at the International Computer Music
Conference (Tremblay, Roma, and Green 2021). We
would like to thank the creative coders engaging
in the alpha community (James Bradbury, Rodrigo
Constanzo, Richard Devine, Alice Eldridge, Daniele
Ghisi, "Leafcutter" John Burton, Lauren Hayes, Ted
Moore, Olivier Pasquet, Sam Pluta, Hans Tutschku),
the CeReNeM and its Creative Coding Lab, and
the European Research Council, as this research
was made possible by a project funded under the
European Union's Horizon 2020 Research and
Innovation Program (grant agreement no. 725899).

References
Agostini, A., and D. Ghisi. 2012. "Bach: An Environment
for Computer-Aided Composition in Max." In Proceed-
ings of the International Computer Music Conference,
pp. 373–378.
Akkermans, V., et al. 2011. "Freesound 2: An Improved
Platform for Sharing Audio Clips." In Proceedings of
the International Conference on Music Information
Retrieval, pp. 3–5.
Barry, A., and G. Born. 2013. "Introduction." In A. Barry
and G. Born, eds. Interdisciplinarity: Reconfigurations
of the Social and Natural Sciences. London: Routledge,
pp. 1–56.
Bourlard, H., and Y. Kamp. 1988. "Auto-Association by
Multilayer Perceptrons and Singular Value Decom-
position." Biological Cybernetics 59(4–5):291–294.
10.1007/BF00332918, PubMed: 3196773
Buitinck, L., et al. 2013. "API Design for Machine Learning
Software: Experiences from the scikit-learn Project."
In European Conference on Machine Learning and
Principles and Practices of Knowledge Discovery in
Databases, pp. 108–122.
Bullock, J., and A. Momeni. 2015. "ml.lib: Robust,
Cross-Platform, Open-Source Machine Learning for
Max and Pure Data." In International Conference
on New Interfaces for Musical Expression, pp. 265–
270.
Buxton, B. 1997. “Artists and the Art of the Luthier.”
ACM SIGGRAPH Computer Graphics 31(1):10–11.
10.1145/248307.248315
Carpentier, G., et al. 2012. “Automatic Orchestration
in Practice.” Computer Music Journal 36(3):24–42.
10.1162/COMJ_a_00136
Collins, N. 2011. “SCMIR: A SuperCollider Music In-
formation Retrieval Library.” In Proceedings of the
International Computer Music Conference, pp. 499–
502.
Feenberg, A. 2017. Technosystem: The Social Life of
Reason. Cambridge, Massachusetts: Harvard University
Press.
Fiebrink, R., and P. R. Cook. 2010. "The Wekinator: A
System for Real-Time, Interactive Machine Learning in
Music.” In Proceedings of the International Conference
on Music Information Retrieval. Online verfügbar unter
archives.ismir.net/ismir2010/latebreaking/000012.pdf.
Accessed January 2022.
Fiebrink, R., and L. Sonami. 2020. “Reflections on Eight
Years of Instrument Creation with Machine Learning.”
In International Conference on New Interfaces for
Musical Expression, pp. 237–242.
Garber, L., T. Ciccola, and J. C. Amusategui. 2021.
“AudioStellar, an Open Source Corpus-Based Musical
Instrument for Latent Sound Structure Discovery
and Sonic Experimentation.” In Proceedings of the
International Computer Music Conference, pp. 62–
67.
Ghisi, D., and C. Agon. 2016. “Real-Time Corpus-Based
Concatenative Synthesis for Symbolic Notation.”
In Proceedings of the International Conference on
Technologies for Music Notation and Representation,
pp. 7–13.
Green, O., P. A. Tremblay, and G. Roma. 2018. "Inter-
disciplinary Research as Musical Experimentation:
A Case Study in Musicianly Approaches to Sound
Corpora.” In Proceedings of the Electroacoustic Mu-
sic Studies Network Conference. Online verfügbar unter
www.ems-network.org/spip.php?article471. Accessed
January 2022.
Hackbarth, B., N. Schnell, and D. Schwarz. 2010.
“AudioGuide: A Framework for Creative Exploration
of Concatenative Sound Synthesis.” Research re-
port. Paris: IRCAM. Available online at articles.ircam
.fr/textes/Hackbarth10a/index.pdf. Accessed January
2022.
Harker, A., and P. A. Tremblay. 2012. “The HISSTools Im-
pulse Response Toolbox: Convolution for the Masses.”
In Proceedings of the International Computer Music
Conference, pp. 148–155.
Impett, J. 2000. “Situating the Invention in In-
teractive Music.” Organised Sound 5(1):27–34.
10.1017/S1355771800001059
Kiefer, C. 2014. “Musical Instrument Mapping Design
with Echo State Networks.” In Proceedings of the In-
ternational Conference on New Interfaces for Musical
Expression, pp. 293–298.
Kiefer, C. 2019. “Sample-level Sound Synthesis with
Recurrent Neural Networks and Conceptors." PeerJ
Computer Science 5:Art. e205. 10.7717/peerj-cs.205,
PubMed: 33816858
Lee, D. D., and H. S. Seung. 1999. “Learning the Parts of
Objects by Non-Negative Matrix Factorization.” Nature
401:788–791. 10.1038/44565, PubMed: 10548103
Lloyd, S. 1982. “Least Squares Quantization in PCM.”
IEEE Transactions on Information Theory 28(2):129–
137. 10.1109/TIT.1982.1056489
Lyon, R. F., et al. 2010. “Sound Retrieval and Ranking
Using Sparse Auditory Representations.” Neural Com-
putation 22(9):2390–2416. 10.1162/NECO_a_00011,
PubMed: 20569181
McPherson, A., and K. Tahıro˘glu. 2020. “Idiomatic
Patterns and Aesthetic Influence in Computer Music
Languages.” Organised Sound 25(1):53–63. 10.1017/
S1355771819000463
Necciari, T., et al. 2013. “The ERBlet Transform: Ein
Auditory-Based Time–Frequency Representation with
Perfect Reconstruction.” In IEEE International Con-
ference on Acoustics, Speech and Signal Processing,
pp. 498–502.
Pedregosa, F., et al. 2011. “Scikit-learn: Machine Learning
in Python.” Journal of Machine Learning Research
12:2825–2830.
Roma, G., O. Green, and P. A. Tremblay. 2019a. "Adaptive
Mapping of Sound Collections for Data-Driven Musical
Interfaces.” In International Conference on New
Interfaces for Musical Expression, pp. 313–318.
Roma, G., O. Green, and P. A. Tremblay. 2019b. "Time
Scale Modification of Audio Using Non-Negative Ma-
trix Factorization.” In Proceedings of the International
Conference on Digital Audio Effects, pp. 213–218.
Roma, G., O. Green, and P. A. Tremblay. 2020. "Audio
Morphing Using Matrix Decomposition and Opti-
mal Transport.” In Proceedings of the International
Conference on Digital Audio Effects, pp. 147–154.
Roma, G., P. A. Tremblay, and O. Green. 2021. "Graph-
Based Audio Looping and Granulation.” In Proceedings
of the International Conference on Digital Audio
Effects, pp. 253–259.
Salamon, J., and J. P. Bello. 2015. "Unsupervised Feature
Learning for Urban Sound Classification.” In IEEE
International Conference on Acoustics, Speech and
Signal Processing, pp. 171–175.
Schnell, N., et al. 2009. “Mubu and Friends: Assembling
Tools for Content Based Real-Time Interactive Audio
Processing in Max/MSP.” In Proceedings of the Inter-
national Computer Music Conference, pp. 423–426.
Schnell, N., et al. 2017. “PiPo, a Plugin Interface for Affer-
ent Data Stream Processing Modules.” In Proceedings
of the International Symposium on Music Information
Retrieval, pp. 361–367.
Schwarz, D., et al. 2006. “Real-Time Corpus-Based
Concatenative Synthesis with CataRT.” In Proceedings
of the International Conference on Digital Audio
Effects, pp. 279–282.
Slaney, M., and M. Casey. 2008. “Locality-Sensitive
Hashing for Finding Nearest Neighbors [Lecture
Notes].” IEEE Signal Processing Magazine 25(2):128–
131. 10.1109/MSP.2007.914237
Small, C. 1998. Musicking: The Meanings of Performing
and Listening. Middletown, Connecticut: Wesleyan
University Press.
Smaragdis, P., B. Raj, and M. Shashanka. 2008. “Shift-
Invariant Probabilistic Latent Component Analysis.” In
IEEE International Conference on Acoustics, Speech,
and Signal Processing, pp. 2069–2072.
Smith, B. D., and W. S. Deal. 2014. "ml.∗: Machine
Learning Library as a Musical Partner in the Computer-
Acoustic Composition Flight.” In Proceedings of the
International Computer Music Conference, pp. 1285–
1289.
Tremblay, P. A., et al. 2019. “From Collections to Corpora:
Exploring Sounds through Fluid Decomposition.” In
Proceedings of the International Computer Music
Conference, pp. 223–228.
Tremblay, P. A., G. Roma, and O. Green. 2021. "Digging
It: Programmatic Data Mining as Musicking." In
Proceedings of the International Computer Music
Conference, pp. 295–300.