
Roger B. Dannenberg
Carnegie Mellon University
School of Computer Science
5000 Forbes Avenue
Pittsburgh, Pennsylvania 15213, USA
rbd@cs.cmu.edu

Communication for
Real-Time Music Systems:
An Overview of O2

Abstract: Message passing between processes and across networks offers a
powerful method to integrate and coordinate various music programs,
facilitating software reuse, modularity, and parallel processing. Networking
can integrate components that use different languages and hardware. In this
article we describe O2, a flexible protocol for communication ranging from
the thread level up to the level of global networks. Messages in O2 are
similar to those of Open Sound Control, but O2 offers many additional
features, including discovery, clock synchronization, a reliable message
delivery option, and routing based on services rather than specific network
addresses. A bridge mechanism extends the reach of O2 to web browsers,
shared memory threads, and small microcontrollers. The design,
implementation, and applications of O2 are described.

Dismantling the Monolith

In the early days of interactive music systems, it
was common to dedicate an entire computer to a
single real-time program so that operations could
be carefully scheduled to meet real-time demands.
Programs tended to be large and all-encompassing,
dealing with a user interface, sensing, control
processing, and (when computers became fast
enough) audio signal processing. Over time, software
components have become increasingly complex,
and we cannot expect to find all the functionality
we need in a single monolithic program. Reuse,
repurposing, and integration of multiple large-scale
components now form a practical approach to
system building, especially for creators whose first
priorities are to make music rather than to consider
all details of implementation.

Fortunately, computers have evolved to support
this approach. Now, nearly all computers feature
many cores, allowing them to run multiple real-time
applications in parallel with little interference
and far fewer scheduling concerns. Larger memories
have also enabled multiple applications to run in
parallel without the page swapping that was
disastrous for real-time music processing. Multiple
coordinated software applications can take advantage
of the parallelism available from multicore
processors, making more computing power available.
Unfortunately, “composing” software systems
still requires work to establish communication,
coordination, and control of multiple components.

Computer Music Journal, 45:4, pp. 7–19, Winter 2021
doi:10.1162/COMJ_a_00620
© 2022 Massachusetts Institute of Technology.

These interconnection problems can be solved in
several ways, using any of the following:

1. Any of the specialized standards such as VST,
MIDI, SMPTE, Link, and DMX512, which are
often associated with off-the-shelf hardware
but are not very general;

2. Custom one-off solutions based on TCP/IP,
RS232, ZigBee, and other low-level data
transports, which are often the simplest
solution when only the simplest functionality
is required; or

3. Higher-level message-passing systems,
including many commercial message-oriented
middleware products.

In the experimental music community, Open
Sound Control (OSC) is arguably the most successful
solution, owing to its flexibility, simplicity,
peer-to-peer connections (no third-party intermediary),
and available implementations (Wright, Freed, and
Momeni 2003). However, OSC lacks many desirable
features.

Challenges for Software Communication

An important consideration for communication
in music systems is real-time performance. Few
networks offer hard real-time behavior where there
are absolute guarantees on delivery times, but we
can at least design for good expected performance. At
least on lightly loaded networks, actual performance
can then be quite predictable. Good performance
depends on the right communication abstractions.
For example, remote procedure calls (RPCs) that
invoke an operation and return a result are a nice
programming abstraction, but RPCs typically block
the sender until the result is available. Because
of blocking, this approach is not well suited to
real-time systems. Therefore, asynchronous one-way
messaging, where the sender does not wait for
replies, is typically used.

Assuming asynchronous messaging, what do
messages look like? Systems have been designed
around single-value or attribute/value messages,
but network messages can carry 1,000 bytes almost
as easily as 1. Moreover, collections of values are
often needed to describe operations and events
(even a MIDI note-on message contains a channel,
a key number, and a velocity), so messages should
carry multiple values. The success of OSC, with its
multiple-value messages, is a good indicator that
this is an important capability.

An important aspect of communication is configuring
addresses and connections so that messages
can be delivered to the right destination. Computer
musicians must often configure raw IP (the “Internet
Protocol”) addresses manually because addresses
are assigned by networks, and addresses can change.
Even within a single computer, connections are
made to ports, which are typically assigned manually
to avoid conflicts. Furthermore, at least TCP
connections require the server to exist before making
a connection; otherwise, the client’s connection
request will be dropped. A great deal of effort can be
eliminated through “discovery” protocols that
automatically configure communications. Automatic
discovery also allows services to run together on one
computer for development but later run on multiple
computers to achieve higher performance.

Another challenge is to achieve both good real-time
performance and reliability. In practice,
communication is largely based on IP, which, at a low
level, is a “best effort” packet delivery system.
This simple and direct point-to-point transmission
usually offers the lowest latency available. Network
messages (packets) can be lost, however, when
operating systems or network switches become
overloaded, or (in rare circumstances) when data
is corrupted in transmission. Therefore, a higher-level
protocol (usually TCP) is commonly used to
detect errors, retransmit packets, and ultimately
deliver data with essentially perfect reliability. For
many applications, a combination of best-effort and
reliable transmission is necessary.

The full Internet protocol suite, also known
as TCP/IP, provides a widely supported and solid
foundation for communication, but it is not always
fully available, nor is its full set of communication
features always called for. To list three examples:

1. Web browsers offer a wealth of cross-platform
tools for building interfaces and data displays,
but they are limited to HTTP and WebSocket
APIs, which can only connect to compatible
servers.

2. To achieve low latency, software music
synthesizers are often barred from direct network
communication and use shared-memory
communication instead.

3. Microcontrollers have limited memory and
power, and when used only to collect and send
sensor data, a powerful middleware package
and even a full TCP/IP implementation may
be overly complex and power hungry.

Timing in music is essential. Rather than leave
timing entirely to applications, an important capa-
bility is to synchronize clocks and deliver messages
with accurate timing at their destination. Clock
synchronization and timed messages allow compu-
tations to be synchronized in spite of large amounts
of timing jitter caused by network latency.

To address these challenges, O2 was designed as a
“communications middleware” for interactive music
systems. O2 builds on some existing structures
of OSC because they are widely known and successful.
In the next section, the fundamentals of O2
are described. The following section, Related Work,
describes other communication systems, especially
those for musicians. Then in the New Features
section we describe several interesting capabilities
that have been added to O2 since its original
implementation, including publish/subscribe and a bridge
abstraction used to extend O2’s reach to non-IP
systems. The Implementation Details section discusses
the implementation and some performance
measurements. In the Applications section we describe
how O2 could be, and is, used in practice. Finally,
the Future Work and Conclusions sections cover
possible extensions and summarize what has been
learned from building and using O2.

Introducing O2

O2 is a protocol and an implementation enabling
flexible asynchronous communication between
processes and across threads, especially for interactive
music applications. In this section, principal O2
abstractions are described along with basic
operations from the perspective of programmers and
developers.

O2 connects processes. An O2 process is es-
sentially a single running program. (Threads and
interthread communication will be discussed later.)
O2 communication is potentially global, so it is
important to limit communication to within a se-
lected group called the ensemble. An O2 ensemble
is a collection of peer processes that are allowed to
communicate with one another. Each ensemble has
a distinct name, and each O2 process joins one and
only one ensemble.

An O2 process can provide one or more services.
A service is just a name used to route messages.
Typically, a service is offered by at most one process,
but if multiple processes offer the same service,
O2 will pick one process as the active service
provider and the others will serve as backups in case
the active process terminates or loses its network
connections.

To send a message to a service, one needs an
address. An O2 address is a URL-like text string
beginning with the service name. For example,
/synth/lfo/freq addresses the synth service.
The suffix nodes lfo/freq designate an operation
or resource provided or managed by the service.
Addresses in O2 are similar to OSC addresses,
except that the first node in an O2 address is a
service name used to find the process that offers the
service, whereas in OSC, there is no service name,
and it is the programmer’s responsibility to send the
message to the correct server.
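
The address convention above can be illustrated with a minimal sketch in Python. The function name split_address is hypothetical and is not part of the O2 library; the sketch only assumes that the first node of the address names the service and that the remaining nodes are interpreted by that service.

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an O2-style address into (service name, remaining path)."""
    if len(address) < 2 or not address.startswith("/"):
        raise ValueError(f"not an O2-style address: {address!r}")
    # The first node selects the service; the rest is left to the service.
    service, _, rest = address[1:].partition("/")
    return service, ("/" + rest if rest else "")


service, path = split_address("/synth/lfo/freq")
print(service, path)
```

In an OSC setting only the second element would exist, and the sender would have to know which server to contact.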

The message format in O2 is based on OSC
messages with minor changes. Every O2 message
contains a timestamp that refers to globally
synchronized O2 clock time. If the timestamp is greater
than the current time, message delivery is delayed
until the timestamp. Alternatively, the sender can
use a timestamp of zero to indicate “as soon as
possible.” Like OSC, O2 messages also contain an
address, a type string describing the types of the data
contained in the message, and a set of values. O2
types include standard OSC types such as integer,
float, and string, as well as some new ones, including
vectors.
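
The timestamp rule can be sketched as follows. This is an illustrative Python model, not O2's implementation: the Message and Scheduler classes, their fields, and the use of a binary heap are all assumptions made for the example. A message whose timestamp is zero or already in the past is due immediately; any other message waits in a time-ordered queue until a later poll.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Message:
    timestamp: float   # 0 means "as soon as possible"
    address: str       # e.g. "/synth/lfo/freq"
    types: str         # type string, e.g. "f" for one float
    values: tuple


class Scheduler:
    """Hypothetical scheduler: holds future messages until they are due."""

    def __init__(self):
        self._pending = []   # min-heap of (timestamp, sequence, message)
        self._sequence = 0   # tie-breaker keeps equal timestamps in order

    def submit(self, msg, now):
        """Return [msg] if it is due now; otherwise queue it."""
        if msg.timestamp <= now:
            return [msg]
        heapq.heappush(self._pending, (msg.timestamp, self._sequence, msg))
        self._sequence += 1
        return []

    def poll(self, now):
        """Return all queued messages whose timestamp has been reached."""
        due = []
        while self._pending and self._pending[0][0] <= now:
            due.append(heapq.heappop(self._pending)[2])
        return due
```

With a synchronized clock, a sender can stamp a message slightly in the future so that variable network delay is absorbed by the wait in the receiver's queue.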

O2 from the Developer’s Perspective

After initialization, O2 performs discovery,
communication, and timed message delivery in the
background. To simplify interaction with the
application, all O2 operation is explicitly invoked
by the application, which calls o2_poll() every
1 to 50 milliseconds, depending on the timing
precision required. To receive messages and
respond to them, the application creates a service
using o2_service_new(servicename) and installs
message handlers using o2_method_new(address,
types, handler, info, coerce, parse), where address is
the full address, e.g., /synth/filter/cutoff, types
gives the expected parameter types, handler is the
address of the callback function to process matching
messages, and info is an additional parameter to
pass to this handler function. The coerce and parse
parameters enable options for built-in type
coercion and message unpacking before invoking the
handler. Messages are delivered according to their
timestamps using a built-in scheduler.
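
To make the roles of address, types, handler, and info concrete, here is a small Python model of a handler table. It is only a sketch: LocalService, method_new, and deliver are hypothetical names that echo the description above, the handler signature is invented for the example, and type coercion is deliberately left out. It does not call or reproduce the real O2 API.

```python
class LocalService:
    """Toy model of a service with handlers keyed by full address."""

    def __init__(self, name):
        self.name = name
        self._methods = {}   # full address -> (types, handler, info)

    def method_new(self, address, types, handler, info=None):
        # Handlers must live under this service's own name.
        if not address.startswith("/" + self.name + "/"):
            raise ValueError(f"{address!r} is not under /{self.name}")
        self._methods[address] = (types, handler, info)

    def deliver(self, address, types, values):
        """Invoke the matching handler; return False if nothing matches."""
        entry = self._methods.get(address)
        if entry is None:
            return False
        expected_types, handler, info = entry
        if types != expected_types:
            return False     # this sketch rejects instead of coercing
        handler(address, values, info)
        return True


def set_cutoff(address, values, info):
    print(f"{info}: {address} <- {values[0]} Hz")


synth = LocalService("synth")
synth.method_new("/synth/filter/cutoff", "f", set_cutoff, info="lowpass")
synth.deliver("/synth/filter/cutoff", "f", (440.0,))
```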

Messages can be sent from any process, including
the one that offers the service (in which case
networking is bypassed and the handler is invoked
directly). To send a “best effort” message, one calls
o2_send(address, time, types, val1, val2, ...) with
the destination address, timestamp, type string, and
values val1, val2, etc. To send a message with
guaranteed delivery, o2_send_cmd() is used instead.
This name suggests that the message carries a
“command” that must be delivered.

What if a message is sent but there is no active
service to handle it? In this case, O2 issues a
warning and drops the message. O2 will also drop
timestamped messages (with nonzero timestamps)
if the receiver has not established a synchronized
clock. In some cases, applications will want to
wait for a service to be discovered and synchronized
before sending it messages. The function
o2_status(servicename) can tell if a service exists,
whether it is local or remote, and whether clock
synchronization has been achieved.

To use clock synchronization, at least one process
must provide a reference clock to the ensemble.
This is done by calling o2_set_clock(). Optional
parameters allow the clock reference to be provided
by the application, such as an audio sample clock
instead of the default system clock.

Notice that the application developer does not
deal with IP addresses, port numbers, or even host
names. O2 offers many more capabilities, described
below, but before going into further detail, let us
consider some related work.

Related Work

O2 originated as a project to extend OSC with new
capabilities. The OSC protocol is intentionally
designed to be transport-independent, but that limits
it to simple point-to-point communication
established by some other means. That usually requires
manual configuration of IP addresses and port
numbers and forces developers to choose either UDP
(best effort) or TCP (reliable), but not both. Clock
synchronization, discovery, and other features are
missing. At least discovery has been addressed
by liboscqs (liboscqs.sourceforge.io), OSCgroups
(www.rossbencina.com/code/oscgroups), and
osctools (sourceforge.net/projects/osctools). Discovery
is also discussed and implemented by Essl (2011);
Eales and Foss (2012); and Malloch, Sinclair, and
Wanderley (2015).

Libmapper (Malloch, Sinclair, and Wanderley
2015) is designed to map inputs from sensors to
synthesis control parameters. The model is akin
to connecting systems with patch cords, which
is appropriate for connecting sensors, but not for
general event-based control. An unusual feature is
that libmapper offers various adjustable mapping
functions to transform data between the sender and
the receiver, which can be particularly useful when
working with sensor data.

LANdini (Narveson and Trueman 2013) has
similar goals to O2 in that it offers discovery on
a LAN and reliable transmission. To simplify the
implementation, LANdini messages flow in three
“hops,” requiring a message from the sender to a
local server, from the local server to a remote server,
and from there to the destination. Also, the overall
message rate for N devices is proportional to N^2 due
to traffic used to ensure reliable delivery. LANdini
has inspired some recent new capabilities in O2,
however.

MobMuPlat (www.mobmuplat.com; also cf. Iglesia
2016) can be described as a software framework
for running Pure Data (Pd) on mobile devices. In
addition to support for graphical interfaces and
sensors, MobMuPlat supports a simple discovery
and peer-to-peer connection scheme within a LAN,
addressing some of the problems that O2 also solves.

Networking approaches outside of the music
community are far more numerous. CORBA (Henning
2006) is an example of a distributed object
system with many capabilities, but its complexity
has discouraged its use. ZeroMQ (Hintjens 2013) is
less complex and supports a variety of
communication patterns, but it does not offer discovery or
messaging over UDP, so it is not suitable for many
music applications.

Clock synchronization techniques are well
known, but often omitted from music systems
because of the extra implementation and configura-
tion required. Flaviu Cristian’s (1989) simple method
is the basis for synchronization in O2. Madgwick
et al. (2015) describe a method that uses broadcast
from a reference but assumes bounds on clock
drift rates. Brandt and Dannenberg (1999) describe
a round-trip method with a proportional-integral
controller. OSC bundles have timestamps, but clock
synchronization is rarely included in OSC imple-
mentations. Florian Goltz (2018) describes Ableton’s
Link technology, which uses clock synchronization
specifically for the task of establishing a shared beat
and tempo framework for applications.

New Features

Beyond the basic message-passing functions im-
plemented for the first version of O2 (Dannenberg
2019), significant extensions have been imple-
mented to address various problems or to make
O2 even more versatile. Properties provide a way
to share information about services. Taps allow
communications to be monitored and support the
publish/subscribe communication pattern. Bridges
allow O2 hosts to be connected to processes through
protocols other than IP. Most bridges use a subset of
O2 called O2lite, which runs over WebSockets, IP,
and shared memory interfaces.

Properties

Inspired by LANdini, O2 properties are
attribute/value pairs associated with service providers.
An important use case is finding players in a laptop
orchestra. Each player will be represented by a
different O2 service name, so how does a central
“conductor” process send messages to all players? Note
that in O2, each player could offer a service named
player, but O2 would direct messages to only one of
them. Therefore, each player must be reached via its own
unique service name. Nor can the conductor
simply send to every service, because not all
services are players. The solution is for each player
to attach a property, for example, type:player,
to its service. Then, the conductor can search for
services with a type attribute equal to player to
locate players. Properties are stored in strings, and
copies of property strings are distributed by the
O2 discovery mechanism, so they can be accessed
or searched quickly without additional network
delay.
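
The conductor's search can be sketched in Python as a purely local lookup. The string format used here (attribute:value pairs separated by semicolons) and the names parse_properties and find_services are assumptions for illustration; the article states only that properties are attribute/value pairs stored in strings and replicated by discovery.

```python
def parse_properties(prop_string):
    """Parse 'attr:value;attr:value;' into a dict (format is an assumption)."""
    props = {}
    for item in prop_string.split(";"):
        if item:
            attr, _, value = item.partition(":")
            props[attr] = value
    return props


def find_services(directory, attr, value):
    """Return names of services whose property attr equals value."""
    return sorted(name for name, prop_string in directory.items()
                  if parse_properties(prop_string).get(attr) == value)


# Local copy of the property strings, as discovery might distribute them.
directory = {
    "alice": "type:player;",
    "bob": "type:player;instrument:bass;",
    "mixer": "type:audio;",
}
players = find_services(directory, "type", "player")
print(players)
```

Because every process holds its own copy of the strings, the search involves no network round trip.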

Taps and Publish/Subscribe

Messages arriving at a particular service can be
forwarded automatically to another service at
any process by setting a tap, which consists of a
process and a service name within that process.
The destination service (the “tapper”) is associated
with a specific process so that if the process is
disconnected, the tap can be removed automatically
rather than have it possibly redirect to another
service provider.

Taps were created to enable message monitoring
and diagnosis, but they serve another role that
is perhaps more important. A publish/subscribe
pattern is one in which the sender publishes using
some publishing name, and receivers can subscribe
to the name in order to receive messages. This
pattern allows the publisher (consider a sensor or
global tempo control) to send information without
knowledge of who is interested in that information
or who will receive the messages. It is also a
one-to-many pattern in contrast to the many-to-one
pattern implemented by o2_send() and services.
To implement publish/subscribe, the publisher
creates a local service, which need not have any
message handlers. Publishing means simply sending
to the service. Each subscriber taps the service to
receive a copy of each published message. This
scheme eliminates any need for the application to
manage a subscriber list, including the automatic
removal of tappers when network connections are
lost. (Alert readers will notice that publish/subscribe
offers another solution to the problem of connecting
a conductor to many players.)
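
The publish/subscribe scheme built on taps can be modeled in a few lines of Python. This is a single-process toy, not O2 code: the Hub class and its methods are hypothetical, and a tap here names only a service, whereas a real tap also names the process that owns the tapper. The rewriting of the service name in the forwarded copy follows the delivery description given later in the article.

```python
from collections import defaultdict


class Hub:
    """Toy in-process stand-in for an ensemble with services and taps."""

    def __init__(self):
        self.handlers = {}              # service name -> handler or None
        self.taps = defaultdict(list)   # tapped service -> tapper names

    def add_service(self, name, handler=None):
        self.handlers[name] = handler

    def tap(self, tappee, tapper):
        self.taps[tappee].append(tapper)

    def send(self, address, values):
        service, _, rest = address[1:].partition("/")
        suffix = "/" + rest if rest else ""
        handler = self.handlers.get(service)
        if handler is not None:
            handler(address, values)
        # Each tapper gets a copy addressed with its own service name.
        for tapper in self.taps.get(service, []):
            tap_handler = self.handlers.get(tapper)
            if tap_handler is not None:
                tap_handler("/" + tapper + suffix, values)


hub = Hub()
hub.add_service("tempo")   # publisher: a service with no handlers
hub.add_service("player1", lambda a, v: print("player1 got", a, v))
hub.add_service("player2", lambda a, v: print("player2 got", a, v))
hub.tap("tempo", "player1")
hub.tap("tempo", "player2")
hub.send("/tempo/bpm", (120,))
```

The publisher never sees the subscriber list; adding or removing a subscriber changes only the tap table.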

The Bridge Abstraction

To be as versatile as OSC, O2 should be extensible
to work with new transports such as Bluetooth or
WebSockets. The bridge abstraction allows services
to be provided across any message-passing transport.
A bridge connects a single O2 process, called the
host, via a message transport, to a single process,
the client. The host advertises services on behalf
of the client. O2 messages for those services arrive
at the host and are immediately forwarded to the
client. The client can send arbitrary O2 messages
by sending them across the link to the host, where
they are resent to reach their ultimate destination.
Essentially, the host is both a proxy service provider
for the client and a proxy for sending O2 messages.
In this way, O2 can be extended to support any link
technology, including RS232, Bluetooth, Zigbee,
WebSockets, and shared memory.

O2lite

Most bridges implement a subset of O2 called
O2lite. There is even an O2lite bridge for TCP/IP,
allowing for lightweight implementations on
microcontrollers running Wi-Fi such as ESP32-based
devices. In this case, the difference between O2 and
O2lite is not the transport, since both use IP, but the
fact that O2lite can only send directly to, and receive
directly from, a single O2 host process. Messages to other
processes must be relayed through the host. On
the other hand, a small and simple implementation
that is more suited to microcontrollers can be used,
leaving the rest of O2’s functionality to laptop or
desktop host computers.

Another implementation of O2lite uses WebSockets
for the transport. WebSockets allows browsers
to connect to an O2 ensemble. Thus, browsers can
be used as synthesizers or user interfaces. Mobile
devices with accelerometers and touch surfaces can
communicate with O2 through their built-in web
browsers. A library, o2ws.js, exists in JavaScript
for these applications. Browser applications,
typically written in HTML and JavaScript, must be
downloaded over the HTTP protocol. To be
self-contained, especially on private LANs often used
in performances or art installations, the O2 library
implements a simple web service. Users can
download web applications and connect to O2 without
creating a separate server or website.

A third interesting implementation of O2lite runs
over a shared memory bridge. This is one solution
to the problem that O2 does not directly support
multiple threads. Even if it did, synchronizing
threads and invoking network operations could
introduce unacceptable latency to real-time threads,
especially those used for audio signal processing.
The shared memory bridge uses lock-free queues to
pass messages between threads. Similar lock-free
queues are used to allocate and free memory through
a shared heap structure, supporting very-low-latency
audio processing. In our tests, the shared memory
bridge can send and receive a message using 320 nsec
of central processing unit (CPU) time plus a small
overhead to poll for messages. An application of
this bridge is described later in the Audio Synthesis
section.

OSC Compatibility

One transport of great interest is OSC. To
interoperate with OSC, O2 uses a bridge-like mechanism that
translates to and from OSC messages. To receive
OSC, one calls o2_osc_port_new(servicename,
port, tcpflag) to create an OSC server that receives
incoming messages on the given port and forwards
them to the specified O2 service. For example, if
servicename is sensor1, and the incoming OSC
address is /value, the message is forwarded to
O2 address /sensor1/value. To send to an OSC
server, one calls o2_osc_delegate(servicename,
ipaddress, port, tcpflag), which creates a new O2
service. Any O2 message to that service is translated
to OSC by removing servicename from the address
and is forwarded to the OSC server specified by IP
address, port number, and the protocol indicated by
tcpflag (TCP or UDP).
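
The address translation in both directions amounts to adding or removing the service name as the first node. The following Python sketch shows only that string manipulation; osc_to_o2 and o2_to_osc are hypothetical helper names, and the networking performed by the real OSC bridge is not modeled.

```python
def osc_to_o2(servicename, osc_address):
    """Incoming OSC: prefix the address with the receiving O2 service."""
    return "/" + servicename + osc_address


def o2_to_osc(servicename, o2_address):
    """Outgoing OSC: strip the delegating service name from the address."""
    prefix = "/" + servicename + "/"
    if not o2_address.startswith(prefix):
        raise ValueError(f"{o2_address!r} is not addressed to {servicename!r}")
    return o2_address[len(prefix) - 1:]


print(osc_to_o2("sensor1", "/value"))
print(o2_to_osc("sensor1", "/sensor1/value"))
```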

Wide Area Networking with O2

O2 uses zero-configuration networking
implementations (Bonjour on macOS, Avahi on Linux) for
discovery (cf. Guttman 2001). One limitation of
Bonjour is its use of broadcast messages, which are
restricted to the local area network. This means that
distant machines cannot be discovered. Given the
interest in network performance and collaboration,
O2 implements an extended discovery protocol
that works globally. Global discovery builds upon
MQTT (mqtt.org), a lightweight Internet-of-Things
messaging protocol for which there are open servers
that can be located through conventional domain
name resolution.

The MQTT protocol offers publish/subscribe
services. Processes in O2 construct a topic o2-
ensemblename/disc, based on the ensemble name,
and publish their local and public IP addresses and
port number through MQTT. O2 processes continu-
ally listen for MQTT messages that announce new
members of the ensemble. When a new process is
discovered, a direct TCP connection is attempted to
enable further communication.

An often-encountered problem is receiving messages
behind Network Address Translation (NAT)
firewalls, which are almost standard for home
networks. The NAT mapping translates port numbers
and IP addresses as they transit from local home
networks to the public Internet. In principle, the only
way to receive a message behind NAT is to connect
to a server and specify a reply port. The reply port
in the outgoing message is replaced with a different
number, but NAT updates tables so that any reply
message from the Internet can be directed to the
true reply port on the home network. Thus, outgoing
requests to servers work transparently. On the other
hand, a connection originating from the Internet will
not reach the desired address or port through the
firewall. There are interesting workarounds, including
STUN (dl.acm.org/doi/book/10.17487/RFC3489),
but O2 currently solves this problem by relaying
messages through MQTT using the same server
used for discovery. It is only when direct
peer-to-peer connections cannot be established that MQTT
is used.

Implementation Details

O2 is implemented in C++ but has a purely proce-
dural API accessible from C or any language that can
interface with external libraries. O2 currently runs
on macOS, Linux, and Windows, and is free and open
source (github.com/rbdannenberg/o2). O2 can also
be accessed through Pd using a set of four Pd objects
(“externals”): o2ensemble, o2send, o2receive,
and o2property.

Discovery is based on Bonjour, Avahi, or MQTT,
as described earlier. Upon discovery of another
process, O2 determines if a connection already
exists. If not, a TCP connection is made to the
process. To avoid two peers trying to connect to
one another simultaneously, permanent
connections are only made from lower address and port
numbers to higher ones. To connect to a peer with
a lower address, a temporary connection is made,
and a discovery-like message is sent requesting a
connection in the reverse direction.
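
The tie-breaking rule can be sketched in Python as a comparison of (address, port) pairs. The exact ordering O2 uses is not given here, so comparing the IP address numerically and then the port is an assumption, as are the names process_key and connection_plan.

```python
import ipaddress


def process_key(ip, port):
    """Orderable key for a process; numeric IP first, then port (assumed)."""
    return (int(ipaddress.ip_address(ip)), port)


def connection_plan(local, remote):
    """Decide how the local process should reach a newly discovered peer."""
    local_key, remote_key = process_key(*local), process_key(*remote)
    if local_key == remote_key:
        raise ValueError("a process does not connect to itself")
    if local_key < remote_key:
        return "connect"           # make the permanent connection ourselves
    return "request-reverse"       # ask the lower peer to connect to us


print(connection_plan(("192.168.1.5", 5000), ("192.168.1.9", 4000)))
print(connection_plan(("192.168.1.9", 4000), ("192.168.1.5", 5000)))
```

Because both peers evaluate the same ordering, exactly one of them makes the permanent connection, however the discovery messages happen to interleave.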

Once made, TCP connections are bidirectional,
and processes can reliably exchange their local
services, properties, and UDP ports that are needed
for “best effort” messages. All this information is
sent using ordinary O2 messages to the specially
recognized o2_ service. As a peer-to-peer network,
the number of connections grows as N^2 for N
processes, but we assume O2 ensemble size is
limited to at most 100 processes. Maintaining 100
network connections per process is small-scale
networking in terms of modern servers. Of course,
performance will depend upon message rates and
network capacity, but O2 has been used successfully
over Wi-Fi for a laptop orchestra with 25 processes
and many more devices connected via OSC.

Message delivery operates as follows: First,
the message address is examined to obtain the
destination service name. O2 uses a hash table to
map the service to a list of service providers and
a list of taps (see Figure 1). The service providers
are sorted by decreasing process address, and the
first provider is considered the active provider that
will receive the message. If the active provider is
a remote process, the message is forwarded over
TCP or UDP directly to that process for delivery.
If the active provider is the local process, there
will be a tree of hash tables used to decode the full
address and determine a handler function to call.
If the active provider is a bridge, OSC server, or
MQTT connection, the message is delivered using
the corresponding protocol. After delivery to the
active provider, a copy of the message is delivered to
each tap, if any, replacing the original service name
with that of each tap before forwarding.
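
The choice of active provider can be sketched with a dictionary of sorted lists. This Python model is an illustration under assumptions: ServiceTable and its methods are hypothetical, and a process address is represented as any orderable value, here an (address, port) tuple. Taps and the per-service address tree are omitted.

```python
class ServiceTable:
    """Maps each service name to its providers, highest address first."""

    def __init__(self):
        self._providers = {}   # service name -> list of process addresses

    def add(self, service, process):
        providers = self._providers.setdefault(service, [])
        if process not in providers:
            providers.append(process)
            providers.sort(reverse=True)   # decreasing process address

    def remove_process(self, process):
        """Drop a disconnected process; backups move up automatically."""
        for providers in self._providers.values():
            if process in providers:
                providers.remove(process)

    def active(self, service):
        providers = self._providers.get(service)
        return providers[0] if providers else None


table = ServiceTable()
table.add("synth", (10, 5000))
table.add("synth", (12, 5000))
print(table.active("synth"))
table.remove_process((12, 5000))
print(table.active("synth"))
```

Since every process sorts the same provider list the same way, all processes agree on the active provider without any negotiation.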

The implementation of clock synchronization
requires one process to provide a clock reference that
other processes attempt to follow. Each so-called
follower sends periodic requests for the reference
time. For each follower, the fastest round-trip time
of the five most recent requests is used to estimate
any difference between the follower’s local clock
and the reference clock, and local corrections are
made. The implementation in O2 makes smooth
clock adjustments by temporarily changing the
clock speed, eliminating rhythmic distortions in
music that might be caused by suddenly setting
the time forward or backward. O2 runs an
efficient scheduler to deliver timestamped messages.
Applications can use the scheduler by sending
themselves timestamped messages to initiate timed
events.
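
The offset estimate can be sketched in Python. The sketch assumes the usual round-trip reasoning (the reference's reply is taken to have been generated halfway through the round trip) and keeps the five most recent samples, using the one with the shortest round trip. ClockFollower and its method names are hypothetical, and the smooth rate adjustment described above is not modeled.

```python
from collections import deque


class ClockFollower:
    """Estimates (reference clock - local clock) from round-trip samples."""

    def __init__(self, window=5):
        self._samples = deque(maxlen=window)   # (round_trip, offset) pairs

    def add_reply(self, sent_local, reference_time, received_local):
        round_trip = received_local - sent_local
        # Assume the reply spent half the round trip in transit.
        offset = reference_time + round_trip / 2 - received_local
        self._samples.append((round_trip, offset))

    def offset(self):
        """Offset from the fastest recent exchange, or None if no data."""
        if not self._samples:
            return None
        return min(self._samples, key=lambda sample: sample[0])[1]


follower = ClockFollower()
follower.add_reply(sent_local=100.0, reference_time=200.75, received_local=101.0)
follower.add_reply(sent_local=102.0, reference_time=202.25, received_local=102.5)
print(follower.offset())
```

The fastest exchange is preferred because its transit delays are smallest, so the halfway assumption introduces the least error.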

Figure 1. To deliver messages, each process first maps the service
name (first node of the address) to a list of service providers and
taps. A service provider is either: a remote-process object
encapsulating an open TCP socket, UDP address, and connection state
(top); a local service represented as a tree to decode the address,
e.g., /s2/x/y, mapping it to a handler function pointer or object
(middle); or a bridge interface (or MQTT or OSC) instance, which
implements a specialized transport mechanism (bottom).

The first implementation of O2 used blocking
network send calls. Typically, send operations copy
messages immediately to buffers and return, so in
practice, calls rarely block. However, deadlock can
occur when two processes are sending to each other
over TCP. Now, O2 uses nonblocking calls, and
senders can detect and avoid sending more messages
than the network can handle.

Performance and Evaluation

Communication software such as O2 should provide
useful functions without introducing high overhead
or performance penalties. In practice, sending a network
message is already quite slow and expensive,
so there is not much one can do that will make
network performance significantly better or worse.
For example, we measured a median round-trip time
of 5.5 msec over a local Wi-Fi network, but within
a single computer, the time averaged 28 µsec, indicating
that more than 99 percent of the time is due
to Wi-Fi. The single computer round-trip time could
be reduced to 20 µsec by calling network primitives
directly instead of using O2. This indicates a single
message send overhead of about 4 µsec (the 8 µsec
difference divided between the two sends of a round
trip), but that number (in this simple test) includes
polling all sockets for messages, running the
scheduler, and other tasks for each message.

The shared-memory bridge uses the same message
decoding implementation as regular O2 messages
but avoids networking altogether. By comparing the
time to send and deliver a single message to the
time to send multiple messages back-to-back, we
can factor out the overhead of polling operations
and determine that the time to allocate memory,
write a message, send it, decode it, call its handler,
and finally free the memory is about 320 nsec. This
is likely a best-case scenario because repeatedly
performing the same small task will ensure few
cache misses. Also, keep in mind that this is a
measurement of CPU utilization, whereas the real
time delay is limited by the polling period between
checks for new messages, typically on the order of
1 msec.

Direct comparisons with OSC (using the liblo
implementation) show negligible differences in
performance except in one case. O2 uses a single
bidirectional TCP connection, whereas OSC uses
a one-way connection, so two connections are
required for two-way communication. In our tests,
OSC over TCP using two connections was exactly
half as fast as O2 using a single TCP connection,
but we would expect one-way send times to be
nearly identical. Again, these times are swamped
by network latency when actual networks are
involved.

Network configuration with O2 typically takes
from one to a few seconds, including clock
synchronization. Normally, clock synchronization and
Bonjour discovery messages are infrequent, but
when O2 is initialized, it actively contacts Bonjour
and O2 processes to make connections quickly.
When the reference clock is first discovered, clock
synchronization runs on an accelerated schedule
to reduce the time to estimate the reference clock
time.

Evaluation should also include ease of use, suit-
ability, and generality. These are “soft” attributes
that can best be evaluated with time and experience.
We believe O2 is a strong candidate for computer
music applications because it has grown out of
experience with many system implementations,
including global network music performances, laptop
orchestras, wireless sensors, audio servers, and
single applications with communicating processes.
We hope the community will find O2 to be as useful
as we do.

Applications

To illustrate the potential of O2, we describe several
applications with a focus on how O2 can support
their construction.

Laptop Orchestras

The first application of O2 was in a performance
created by students for laptop orchestra. Discovery
allowed for rapid prototyping, testing, and performance
setup in which “player” laptops connected
to a “conductor” laptop. The conductor established
a shared context including tempo, style, and other
controls for players to interpret. Musical time was
represented as a function of real time, with a few
parameters transmitted by O2, namely

beat(t) = beatoffset + (t − timeoffset)
× beatspersecond.

Because time t is based on O2 clock synchronization,
every laptop was able to compute the current
beat with high precision and schedule beat-based
rhythmic output accordingly. Timed messages were
able to change tempo by sending new parameters
at precise times. After compensating for local
synthesizer delays, synchronization was limited
mainly by the variation in distance from speakers to
audience members at different locations.

Control from the conductor was complemented
by local control by each player, including the use of
TouchOSC (hexler.net), which sends touch screen
data from mobile devices over Wi-Fi via OSC. O2’s
OSC compatibility allowed players to incorporate
TouchOSC data easily.

O2 over WebSockets was used to connect to
animations written in p5js (http://p5js.org) and
running in a browser. Parameters from the conductor
including tempo information were shared with the
animations to synchronize them to the music.
The performance can be viewed at
http://youtu.be/3OYhC3KNt-g.
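
The beat mapping used in the laptop orchestra can be sketched as a small class. The three fields correspond to the parameters beatoffset, timeoffset, and beatspersecond in the formula for beat(t); the class name, the inverse mapping, and the with_tempo helper are assumptions added for illustration and are not taken from the performance software.

```python
from dataclasses import dataclass


@dataclass
class TempoMap:
    beat_offset: float       # beat number at time_offset
    time_offset: float       # synchronized time in seconds
    beats_per_second: float  # current tempo

    def beat(self, t):
        """Beat position at synchronized time t."""
        return self.beat_offset + (t - self.time_offset) * self.beats_per_second

    def time_of(self, beat):
        """Synchronized time at which `beat` occurs (inverse mapping)."""
        return self.time_offset + (beat - self.beat_offset) / self.beats_per_second

    def with_tempo(self, t, beats_per_second):
        """Return a new map that changes tempo at time t.

        The new offsets are chosen so the beat position is continuous
        at t (an assumption about how a tempo change would be encoded).
        """
        return TempoMap(self.beat(t), t, beats_per_second)
```

Because every process evaluates the same parameters against a shared clock, scheduling an event for a given beat reduces to computing time_of(beat) and sending a message with that timestamp.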

Sensors and Synthesis

The original design of O2 anticipated that even
microcontrollers would run a full network stack and
act as full O2 processes. This is certainly possible
with Raspberry Pi and other small Linux-based
systems, but we also felt a need for a lighter-weight
and simpler implementation, which led
to the development of O2lite. We use O2lite on
ESP32-based microcontrollers and the Arduino
development environment, which includes an
implementation of Bonjour for discovery. Recall
that with O2lite, a single O2 process serves as a
bridge to the entire network.

An example application is a small self-contained
inertial sensor that communicates over Wi-Fi.
Without O2, one would hard-code a destination
IP address for data into the microcontroller. To
use the sensor, one would manually disconnect
from the Internet and set a laptop IP address to
match the one in the sensor. With O2, the sensor
can discover the laptop’s dynamic IP address, and
the two-way capability of O2 allows the laptop
to configure the sensor by setting the sample rate
and other parameters. Once data is received at
the laptop, software can map the sensor data to
sound controls and send them, either via O2 or
OSC, to a synthesizer to implement interactive
gestural control. An example can be viewed at
https://youtu.be/cPQQiYs2xeY.

Figure 2. Bridging OSC over the Internet. An OSC
sender sends to a local O2 server (left), which
discovers the remote service and forwards OSC
messages reliably. The remote service converts
messages back to OSC and sends them to a local OSC
receiver. Thus, OSC is delivered reliably across
the Internet without any manual configuration of IP
addresses and ports. If OSC processes are behind
firewalls, O2 will automatically revert to using an
MQTT broker to forward messages.

Network Music

Network music (McKinney 2016) often involves
collaborative control of music-generation systems
and graphical displays, as well as sharing of sensor
data. O2 makes it easy to bridge multiple sites,
especially computers on home Wi-Fi networks
behind NAT, as discussed above. O2 can even
help to bridge existing OSC-based systems by
receiving OSC messages in a local O2 process,
forwarding the messages to a remote site also
running O2, and from there, forwarding from O2
to an OSC system as a final destination (see Figure
2). Because O2 can use reliable transmission across
the Internet, not only does this solve problems with
configuration and NAT, but it also can eliminate
dropped packets, which occur frequently with
wide-area networking. This approach parallels our
“Telematic Soundcool” performance (Scarani et al.
2019), which was implemented directly with TCP
before O2 was available. Even audio streaming is
possible with O2 (Norilo and Dannenberg 2018),
but it may be simpler to implement audio/video
streaming with more-specialized programs and just
handle control information with O2.

Interactive Installations

Computer-based art installations, including those
that feature multimedia and interaction, often use
multiple computers and microcontrollers as sensors,
for control and for output. When existing Wi-Fi
networks are used, artists and staff are faced with
manual configuration tasks or even reprogramming
microcontroller programs with hard-coded IP
addresses. O2 simplifies connections through discovery
while offering flexible open-ended messages
to transmit whatever data is needed to support sensing
and control. Clock synchronization can be used
to coordinate different media controlled by different
processes.

Another possibility is remote monitoring and
control: Artists cannot always keep an eye on
installations in person, so some build monitoring
capabilities into their systems in order to oversee
the operational status remotely. O2’s global-scale
discovery, taps, and other facilities simplify making
connections from anywhere and for monitoring any
message activity within an installation’s computing
ensemble. Any service can be messaged to invoke
operations or query data. O2 has not yet been used for
this purpose, but a prototype for remote monitoring
has been built and described (Dannenberg 2019);
Figure 3 shows an example of how this can be used.

Audio Synthesis

Audio processing in software requires that latencies
be on the order of 1 msec or less, which is near
the limit of what consumer operating systems can
offer. Real-time operating systems and specialized
hardware can do better, but software and hardware

Figure 3. The O2 Spy program uses O2’s tap
facility to “spy” on services. In this
demonstration, an interface process manages
the Buttons window (lower right). Controls
send O2 messages to another process offering
the synth service. The O2 Spy program, running in a
browser and hosted by a third process, can join the
ensemble, display the status of the synth service,
and “tap” the messages it receives. Here we see the
messages produced by clicking Buttons 1 and 2,
adjusting the slider, and clicking Button 3.

choices (including audio interfaces and device
drivers) are more limited and often more expensive.
Assuming a consumer operating system, effective
low-latency audio processing requires developers to
refrain from making most system calls, precluding
normal memory allocation or locks to safely read
and write shared data. Calls within the audio thread
to send and receive network messages are out of the
question.

Developers can either carefully cope with all
these restrictions, or violate them and hope for
the best. Alternatively, O2 uses its own lock-free
memory allocator, and O2’s shared-memory
bridge uses lock-free message queues to provide the
flexibility of O2 messaging to low-latency audio
threads. Audio processing can be tightly coupled to
a control thread (an O2 process) using O2 messages
delivered through shared memory, or the O2 process
can connect to other processes running on the same
machine, on a LAN, or even globally. Controlling
an audio process in this way is comparable to
the use of OSC for control in the SuperCollider
synthesis server (Wilson, Cottle, and Collins 2011),
but O2 provides scheduling, discovery, and clock
synchronization in addition to messaging. A new
experimental synthesis engine based on O2 and
FAUST (http://faust.grame.fr) has been constructed
to explore this direction further.

Modular Performance Systems

O2 was largely inspired by a project to support
human–computer music performance by using
software modules such as MIDI players, audio
players that can time-stretch to synchronize audio
playback, conductors that control the players, and
score displays that show music notation to human
performers, as well as sensors and hardware or software
synthesizers. The need for communication,
coordination, and timing among distributed music
software components led to implementations with
OSC, ZeroMQ (Hintjens 2013), and ultimately O2.
With O2, modules operate standing alone, but if a
conductor is discovered, the modules automatically
connect to it and delegate control over start, stop,
“set position,” and tempo operations. Clock
synchronization is critical for coordinating players, and
publish/subscribe is used to distribute conducting
commands to all players.

Our vision is that future music performance
systems will include intelligent agents as per-
formers, and modular systems will be easy to set
up, just as today’s performing musicians combine
microphones, guitars, effects pedals, mixers, ampli-
fiers, and speakers using off-the-shelf compatible
components.

Future Work

O2 is now fully functional, but there are many
areas for further development. Additional language
support would encourage more use. O2lite can be
ported to most languages. For applications to use
the full O2 implementation, it should be possible
to link to the existing O2 implementation in C++,
which we have done for Pd. Debugging distributed
systems can be difficult, and some monitoring
software has been prototyped but is not ready
for practical use. Perhaps O2 could include more
internal monitoring of latency, CPU utilization, and
other useful information.

Security is a difficult problem, especially for
real-time systems. O2 does not encrypt or protect
data, so it is quite possible for an attacker to snoop
network packets, discover an ensemble name, and
inject malicious messages. For this reason, O2
does not enable global discovery by default. Virtual
private networks (VPNs) are one way to secure O2
communication, but this is likely to affect latency.
Built-in security measures should be based on a clear
threat model and are left to future work.

Audio and video streaming have many applications,
ranging from modular effect chains within a
single computer to network music performances on
a global scale. Audio over O2 has been implemented
(Norilo and Dannenberg 2018), but it is a complex
problem with many conflicting goals. Perhaps wider
adoption of O2 can inspire further work in this area.

Conclusions

O2 offers a new level of communication support
or middleware, especially for real-time interactive
music systems. Conceptually, O2 is similar to OSC
in that it sends one-way messages containing a
URL-like address that names a parameter or function
and a set of typed values. However, O2 also
addresses many additional practical needs of
application builders. First, O2 delivers messages to
services rather than to network addresses, supporting
more reconfigurable and distributed systems of
peer processes as opposed to simple client/server
configurations. Second, O2 includes discovery to
automate network configuration, allowing processes
to connect even globally without fixed IP addresses,
domain names, or port numbers. Third, O2 offers
new modes of communication, including one-to-many
publish/subscribe messages, properties that
are automatically propagated without explicit
messages, taps for monitoring message traffic, and shared
memory communication for very-low-latency
applications including audio signal processing. Fourth,
O2 offers a complete solution to distributed clock
synchronization and timed delivery of messages,
which is important in many music applications.

An important requirement for communication
software is to enable connections among diverse
systems. O2’s C++ API is compatible with C
and therefore easily integrated with many other
languages. For example, external objects allow
access to O2 within Pd. O2 also interoperates with
OSC, not only as an OSC server to receive messages
but also as an OSC client to send messages, with
translation between O2 and OSC message formats.
O2lite allows a client to join an O2 ensemble over
a point-to-point link. O2lite for WebSockets along
with a built-in HTML server enables web browsers to
communicate with O2 processes and bridge to OSC.
O2lite in C has been run on ESP32 microcomputers
with Wi-Fi to create sensors that automatically
connect to O2 networks. O2lite for shared memory
is being used to create a new sound-synthesis server.

As computer music software systems continue
to grow in capability and complexity, we hope
that artists and researchers will build on these
systems through creative combination, control, and
interconnection using O2. It is poised to solve many
interconnection problems.

Acknowledgments

O2 has developed and evolved through many inter-
actions with students, visitors, and faculty in the
School of Computer Science at Carnegie Mellon
University. Zhang Chi contributed to the initial
implementation. Vesa Norilo’s early adoption spurred
many improvements as well as the exploration of
audio over O2.

References

Brandt, E., and R. B. Dannenberg. 1999. “Time in
Distributed Real-Time Systems.” In Proceedings of the
International Computer Music Conference, pp. 523–526.

Cristian, F. 1989. “Probabilistic Clock Synchronization.”
Distributed Computing 3(3):146–158.
10.1007/BF01784024

Dannenberg, R. B. 2019. “O2: A Network Protocol
for Music Systems.” Wireless Communications and
Mobile Computing, Art. 8424381.

Eales, A., and R. Foss. 2012. “Service Discovery Using
Open Sound Control.” In Proceedings of the 133rd AES
Convention, pp. 348–354.

Essl, G. 2011. “Automated Ad Hoc Networking for Mobile
and Hybrid Music Performance.” In Proceedings of the
International Computer Music Conference, pp. 399–402.

Goltz, F. 2018. “Ableton Link: A Technology to
Synchronize Music Software.” In Proceedings of the Linux
Audio Conference, pp. 39–42.

Guttman, E. 2001. “Autoconfiguration for IP Networking:
Enabling Local Communication.” IEEE Internet
Computing 5(3):81–86. 10.1109/4236.935181

Henning, M. 2006. “The Rise and Fall of CORBA.” ACM
Queue 4(5):29–34. 10.1145/1142031.1142044

Hintjens, P. 2013. ZeroMQ: Messaging for Many
Applications. Sebastopol, California: O’Reilly Media.

Iglesia, D. 2016. “The Mobility is the Message: The
Development and Uses of MobMuPlat.” In Proceedings
of the International Pure Data Convention, pp. 56–61.

Madgwick, S., et al. 2015. “Simple Synchronisation for
Open Sound Control.” In Proceedings of the International
Computer Music Conference, pp. 218–225.

Malloch, J., S. Sinclair, and M. Wanderley. 2015.
“Distributed Tools for Interactive Design of Heterogeneous
Signal Networks.” Multimedia Tools and Applications
74(15):5683–5707. 10.1007/s11042-014-1878-5

McKinney, C. 2016. Collaboration and Embodiment in
Networked Music Interfaces for Live Performance,
University of Sussex, PhD thesis. Available online at
core.ac.uk/download/pdf/60240897.pdf. Last accessed
November 2022.

Narveson, J., and D. Trueman. 2013. “LANdini: A
Networking Utility for Wireless LAN-Based Laptop
Ensembles.” In Proceedings of the Sound and Music
Computing Conference 2013, pp. 309–316.

Norilo, V., and R. B. Dannenberg. 2018. “KO2 Distributed
Music Systems with O2 and Kronos.” In Proceedings
of the 15th Sound and Music Computing Conference,
pp. 452–456.

Scarani, S., et al. 2019. “Software for Interactive and
Collaborative Creation in the Classroom and Beyond:
An Overview of the Soundcool Software.” Computer
Music Journal 43(4):12–24. 10.1162/comj_a_00534

Wilson, S., D. Cottle, and N. Collins. 2011. The
SuperCollider Book. Cambridge, Massachusetts: MIT
Press.

Wright, M., A. Freed, and A. Momeni. 2003. “Open Sound
Control: State of the Art 2003.” In Proceedings of
the International Conference on New Interfaces for
Musical Expression, pp. 153–159.
