
Earful Software

This approach to generative music is akin to a musical behavioral ecology. That is, music is composed and realized (or generated) by the relations between groups of characters, with their behaviors and relationships manifested in spatial interaction. That is a dense sentence. Let me explain how this works.


Motivation

I have always been amazed at how satisfying and pleasurable the sound of nature is: aurally, of course, but also dramatically and musically. I remember, as an undergraduate composition student, wondering if birds have a "sense of ensemble". Are they participating with joy in the collective effect of their song and utterance? At the time, the wonderful music of Messiaen offered a view into this phenomenon, but it was the music and approach of Xenakis that I was drawn to most strongly in this regard. So, at the time, I embarked on a study of mathematics and the use of a simulation language (GPSS) to create a work for string orchestra. Fast forward 40 years, with relatively astonishing capabilities for this music now in our pockets (mobile phones), and I have been attempting to make music as natural and alive as the experience of forest or shore, and as musical and dramatic as the music I seek to hear.

 

Definition of Terms

The system is at once very simple but capable of lifelike complexity. It's helpful to have some terminology to name and identify the specific conceptual entities that form the model.

Species

Group

Actor

State

Sound

 

Data Model

We use the Unity game development engine as a spatial platform for the music. This facilitates three-dimensional placement and animation of the musical actors. It also provides the natural behaviors of a physics engine and flexible distance-based sound attenuation and filtering, and it offers a third-party plugin market with all kinds of visual and animation assets useful for achieving spatial music composition.
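As a rough illustration of the kind of distance-based attenuation Unity provides (a generic sketch, not an Earful component), an actor's AudioSource can be configured for 3D falloff like this:

```csharp
using UnityEngine;

// Minimal sketch: configure a Unity AudioSource so an actor's sound
// attenuates naturally with distance from the listener.
[RequireComponent(typeof(AudioSource))]
public class SpatialVoiceSetup : MonoBehaviour
{
    public float maxAudibleDistance = 40f;   // beyond this, the actor is effectively silent

    void Awake()
    {
        var source = GetComponent<AudioSource>();
        source.spatialBlend = 1f;                           // fully 3D
        source.rolloffMode = AudioRolloffMode.Logarithmic;  // natural distance falloff
        source.minDistance = 1f;
        source.maxDistance = maxAudibleDistance;
        source.dopplerLevel = 0f;                           // avoid pitch artifacts when actors move
    }
}
```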

 

On top of this platform I've created a software architecture for sound as interacting autonomous actors. Each belongs to a "species" defined by its musical behavioral states and its responsiveness to other actors by species and state.

Cardinality across these tables (described below) is many-to-many for Sounds and States: a sound file can be associated with many States, and a State can have many Sounds in its repertoire. Cardinality is similarly many-to-many for States and Relations: any Species State can have a relation defined with any other Species State, including itself.

 

The data model for such behavior is expressed in three tables:

 

Sounds

The Sounds table contains references to sound files for each of the species and their associated states. There is an entry in this table for every sound file used in the piece. The table also includes ranges of pitch and loudness variability to be applied to the entry's sound when played. This affords sometimes useful variability in the manifestation of a sound, which, depending on the sound, can contribute to a naturalness in the scene. By default these values are unity.
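As a sketch only (the field names here are illustrative, not the actual Earful schema), a Sounds entry might look something like this in C#:

```csharp
using UnityEngine;

// Illustrative sketch of one row of the Sounds table.
[System.Serializable]
public class SoundEntry
{
    public string species;        // e.g. "Cardinal"
    public string state;          // the species state this sound belongs to
    public AudioClip clip;        // reference to the sound file

    // Variability applied each time the sound is played; unity (1.0) by default.
    public float pitchMin = 1f, pitchMax = 1f;
    public float loudnessMin = 1f, loudnessMax = 1f;

    public float RandomPitch()    { return Random.Range(pitchMin, pitchMax); }
    public float RandomLoudness() { return Random.Range(loudnessMin, loudnessMax); }
}
```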

 

States

While the Sounds table associates specific sound files with species states, the States table provides the temporal sounding behavior for each state. That is, it includes a minimum and maximum delay between sounds being played for an actor in that state, and the probability distribution for weighting the delays in that range. It also includes an expiration time for the state and a next state to enter at expiration. The next state could be the same state, a silent state, or any other state defined for that species.
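Again as an illustrative sketch rather than the actual schema, a States entry might carry fields like these:

```csharp
using UnityEngine;

// Illustrative sketch of one row of the States table.
[System.Serializable]
public class StateEntry
{
    public string species;
    public string stateName;

    // Temporal sounding behavior: wait somewhere between minDelay and maxDelay
    // (seconds) before sounding again, drawn from a weighted distribution.
    public float minDelay = 2f;
    public float maxDelay = 20f;
    public float[] delayWeights = { 1f, 1f, 1f };   // relative weights across the delay range

    // The state expires after this many seconds and transitions to nextState
    // (which may be the same state, a silent state, or any other state).
    public float expirationTime = 120f;
    public string nextState;
}
```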

 

Relations

The Relations table is consulted by each actor every time another actor plays a sound. If there is an entry for the combination of the listener's species and state with the sounding actor's species and state, a calculation is made to consider triggering the listener into a new state. This table includes a threshold value for triggering that state transition, as well as weights for spatial and temporal proximity in the calculation. That is, how close the sounding actor is to the listening actor, and how frequent the recent calls are, are considered in deciding whether to trigger a new state. The accumulated score, however, also fades over time.
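A hypothetical sketch of a Relations entry and its trigger test might look like the following; the field names and the exact scoring formula are illustrative, not the Earful internals:

```csharp
using UnityEngine;

// Illustrative sketch of one row of the Relations table and its trigger test.
[System.Serializable]
public class RelationEntry
{
    public string listenerSpecies, listenerState;   // who is listening, and in what state
    public string sounderSpecies,  sounderState;    // who just sounded, and in what state
    public string triggeredState;                   // state the listener may transition into

    public float threshold = 1f;       // score needed to trigger the transition
    public float spatialWeight = 1f;   // how much proximity matters
    public float temporalWeight = 1f;  // how much recent call frequency matters

    // Combine spatial and temporal proximity into a single score and compare it
    // against the threshold. recentCallRate might be a calls-per-minute measure
    // that decays (fades) over time; distance is in scene units.
    public bool ShouldTrigger(float distance, float recentCallRate)
    {
        float spatialScore  = spatialWeight  / Mathf.Max(distance, 0.1f);
        float temporalScore = temporalWeight * recentCallRate;
        return spatialScore + temporalScore >= threshold;
    }
}
```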

 

These weights can be subtle or deterministic or anything in between. They are an important part of how an actor behaves, and of how the drama and ecology of the piece unfolds and evolves.

 

Software Architecture

The Earful software is a codebase of C# programs that run as components on game objects in the Unity Game Development Engine. The architecture of these components reflects the cardinality of objects in the piece (a Unity game). For example, there are central components that manage global data stores such as the lookup tables above, or sensor handlers. There are components that associate with a group of actors of a specific species and default state configuration; in the bird example, these would be flocks. And there are components associated with each Actor game object; in the bird example, these would be birds. These components and this structure are used regardless of the specific sounds, states, relations, spatial animation behavior, or visual appearance in the app. Earful also includes a variety of additional components for User Interface controls and OSC messaging out to external systems.

 

Central:

EarfulTransitionManager

EarfulStateManager

EarfulEventManager

EarfulShakeDetection

EarfulPitchAnalysis

 

Groups:

EarfulStateActorGroup

EarfulGroup

OSCChannelGroup

 

Actors:

OSCGroupedPlay

OSCGroupedPositionTracker

EarfulStateSoundTimer

EarfulStatePlayNativeSound

 

The codebase is currently around 7,600 lines of code across 60+ classes.
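To give a feel for the central/group/actor layering described above (the class names below are illustrative stand-ins, not the actual Earful components), an actor might subscribe to a global sound-event broadcast like this:

```csharp
using System;
using UnityEngine;

// Illustrative sketch of the event flow between a central broadcast hub and an actor.
public static class SoundEventHub
{
    // Raised whenever any actor plays a sound: (species, state, position).
    public static event Action<string, string, Vector3> OnSoundPlayed;

    public static void Raise(string species, string state, Vector3 position)
        => OnSoundPlayed?.Invoke(species, state, position);
}

public class ActorListener : MonoBehaviour
{
    public string species;
    public string currentState;

    void OnEnable()  => SoundEventHub.OnSoundPlayed += HandleSound;
    void OnDisable() => SoundEventHub.OnSoundPlayed -= HandleSound;

    void HandleSound(string otherSpecies, string otherState, Vector3 position)
    {
        // Here the actor would consult the Relations table (see above) with its
        // own species/state, the sounder's species/state, and the distance,
        // and trigger a state transition if the relation's score exceeds its threshold.
        float distance = Vector3.Distance(transform.position, position);
    }
}
```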

 

Composition Process

For birds and natural sounds, I begin by excerpting the calls that are the most clean, in other words not superimposed with another bird or sound, and at a reasonable amplitude.

Then I group all these isolated sounds into species states. For example, I would take all the cardinal calls and associate them with a species called Cardinals, group similar calls into their own states when that's clearly what they suggest, or in some cases bundle slightly different collections of sounds that could arguably be in the same state but offer useful variance.

I then consider what the call frequency statistics should be. That is, what is the minimum delay until the next sounding, and the maximum, as well as a sequence of weighted numbers that represent the probability distribution for choosing delay values within the species' range.
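One simple way to realize such a weighted choice of delay (a sketch under my own assumption that the weights cover equal sub-ranges between the minimum and maximum delay, not the actual Earful code) looks like this:

```csharp
using UnityEngine;

// Illustrative sketch: pick the next sounding delay from a weighted
// distribution over the [minDelay, maxDelay] range.
public static class DelayPicker
{
    public static float NextDelay(float minDelay, float maxDelay, float[] weights)
    {
        // Treat each weight as the relative likelihood of one equal-width
        // sub-range of the overall delay range.
        float total = 0f;
        foreach (var w in weights) total += w;

        float pick = Random.Range(0f, total);
        float binWidth = (maxDelay - minDelay) / weights.Length;

        for (int i = 0; i < weights.Length; i++)
        {
            if (pick < weights[i])
            {
                // Uniform within the chosen sub-range.
                return minDelay + binWidth * i + Random.Range(0f, binWidth);
            }
            pick -= weights[i];
        }
        return maxDelay;
    }
}
```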

It might be that I want to create slightly different groups with the same set of sounds. That is, I might use the same set of sounds for a species, but with different frequency/delay call behavior, or perhaps a different next state with a different expiration time. These kinds of decisions come, of course, both from what you have in mind for a collection of sounds and from what you discover when you start varying the frequency behavior or next state. You may also want different behavior for collections when they are in groups of different sizes. The frequency behavior works with group size to affect density. How you achieve or evolve density is defined in this way, lending dramatic or even semantic meaning to certain states by certain species. It's not uncommon to discover unexpected results that make sense in hindsight but that you still weren't expecting.

This is really the art of creating your ensembles and the ecology of your piece. One can achieve very prescriptive results, more "honestly" stochastic results, and levels in-between and in combination. It's just like any compositional thinking process, but it can involve a process more like world building or dramatis personae with group dynamics. Also, by using unique species names across different projects and pieces, any of the species can be reused in new combinations and made sensitive to others by defining Relations across the combined species. We've used this to combine birds and animals from different continents, instrumental phrasing, and weather events in projects sensitive to each other. This accumulates an expanding library of musical behavioral ensembles.


Applicability

Applications have included ambisonic sound walk zones, ensembles in audio poetry, a large-scale (30-channel) outdoor shadow puppet soundtrack, and a 16-channel indoor enactment with puppets. Work over the last year has been focused on instrumental pieces with species defined from fragments of 5-string electric violin (viola-violin range) and piano.

 

This music, released as phone apps, also leverages the phone's sensing instruments: accelerometer for shaking, microphone for pitch detection, and camera for image recognition and body tracking. The pieces are designed to interact and evolve beautifully and interestingly on their own, without user interaction. However, shake impact is available for the fidgety, pitch detection for improvisers who want to play with the piece (IRCAM Forum NYC, Fall 2021), and image detection for visual art interaction (IRCAM Forum Paris, 2022; Teresa Parod Murals AR app, 2023). Earful software extensions load additional ensemble players and broadcast events to current players to affect their behavior through the Relations mechanism, just like other virtual objects.
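As an example of the sensing side, shake detection from the accelerometer can be sketched in Unity roughly like this (illustrative only; the actual EarfulShakeDetection component may work differently):

```csharp
using UnityEngine;

// Illustrative sketch of accelerometer-based shake detection.
public class ShakeDetector : MonoBehaviour
{
    public float shakeThreshold = 2.5f;   // acceleration magnitude in g; tune per device
    public float cooldown = 0.5f;         // seconds between detections
    float lastShakeTime;

    void Update()
    {
        // Magnitude of device acceleration; gravity contributes about 1 g at rest.
        if (Input.acceleration.sqrMagnitude > shakeThreshold * shakeThreshold
            && Time.time - lastShakeTime > cooldown)
        {
            lastShakeTime = Time.time;
            // Here a shake event would be broadcast so active players can react
            // through the Relations mechanism.
            Debug.Log("Shake detected");
        }
    }
}
```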


Max/MSP

While Unity is great for 3D scene composition, software extensibility, and multi-platform deployment, one needs to reach outside it in order to play on multi-channel speaker configurations. For this, we use a Max patch that receives OSC messages from Unity whenever a sound file plays in Unity on a specific virtual channel and whenever the Actor for that channel changes position. The Max patch calculates gain and LPF rolloff for each virtual actor at each speaker in the configured speaker array. The patch uses Spat for sound file playback and mc.matrix for mixing. Its primary computations are done in Javascript within the patch. Its use and description are beyond the scope of this submission, but it has been used for multichannel realization on 30-, 16-, and 8-channel arrays of various configurations, as well as for 3rd Order Ambisonic capture.
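The patch's actual computation lives in Javascript inside Max, but the flavor of the per-speaker gain calculation can be sketched as follows (an illustrative inverse-square falloff, not the patch's exact formula):

```csharp
using UnityEngine;

// Illustrative sketch: one gain per speaker for a virtual actor, falling off
// with distance and normalized so the gains sum to 1.
public static class SpeakerGains
{
    public static float[] Compute(Vector3 actorPos, Vector3[] speakerPositions)
    {
        var gains = new float[speakerPositions.Length];
        float sum = 0f;

        for (int i = 0; i < speakerPositions.Length; i++)
        {
            float d = Vector3.Distance(actorPos, speakerPositions[i]);
            gains[i] = 1f / (1f + d * d);   // simple inverse-square falloff
            sum += gains[i];
        }
        for (int i = 0; i < gains.Length; i++) gains[i] /= sum;
        return gains;
    }
}
```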

 

Musical Results

I've used the Earful software for several pieces. Each has had its own aesthetic and dramatic goals and has contributed its own sonic repertoire and behaviors to the larger collection. For example, early pieces used bird sounds and gibbons recorded in the Chicago area, Sumatra (Indonesia), and Fiji. The defined behaviors were meant to mimic an amateur's listening experience of the different species and how they relate to each other. Where the species do not call from the same ecology, the relations are derived and defined artistically. These were useful for applications including ambisonic walk pieces, a large-scale outdoor spatial concert, an indoor gridded spatial concert, and augmented reality mobile apps for use with murals and paintings.

 

Each piece is of course different, but in my opinion, …

Swarm resonances, small group interaction - behavioral duets with perhaps repercussions to others. 


Echoes Mobile App

The Echoes Mobile App is available in the App Store and Google Play. Many audio walks are available there by browsing titles, geographic area, or current proximity to the user.

Blue jay painting; Echoes audio walk cover

Imaginary Flocks of Lurie Garden contains a network of seven pieces or "Echoes", laid over the Garden, each hosting all of the bird species but featuring specific ones. These pieces are displayed geographically in Echoes, as seen in the next figure. Echoes also displays your current location and which zone you are currently in, if any. When in Autoplay, each zone's audio is activated when entering that zone. When not in Autoplay, you may also select zones from a list view, as seen in the 2nd figure, regardless of whether you are geographically present or not.

Ambisonic Audio Implementation

This video shows the 23 flocks of birds laid across the Lurie Garden. A green bounding box is optionally displayed around each flock. Each flock is of a specific species - Cardinal, Robin, Crow, Blackbird, Woodpecker, or Wren. Each has specific vocalizations, influenced by their individual nature and how they react to other birds in their vicinity. These reactions are defined by their species and the current attitude of the other birds they encounter. The "player" in this video shows the location and orientation of a virtual 1st Order Ambisonic "microphone", implemented by Resonance Audio in Unity. The result is intended to be both a natural sounding and musically interesting soundscape.

The color-coded base map is from Lurie Garden. All of the bird artwork is taken from Teresa Parod's Midwestern Birds mural.

The birds in this walk, the recordings of their vocalizations, and the dramas of their interactions will evolve and improve over time, as does the flora of the Lurie Garden.


See also: Augmented Reality version of Midwestern Birds in Earful Vision
