Monday, December 14, 2009

Headphone music and Spatial manipulation

“The Composition of Auditory Space: Recent Developments in Headphone Music”
Author: Durand R. Begault

Source: Leonardo, Vol. 23, No. 1 (1990), pp. 45-52

Published by: The MIT Press

http://www.jstor.org/stable/1578456


Summary: In this article, Durand R. Begault discusses the potential for composing spatial music, meaning music written to occupy a specific space, as opposed to music that is performed in any context regardless of the space the sound occupies. Begault writes, “The composition manipulation of the spatial aspect of music was as inevitable as the manipulation of pitch, timbre or duration...” Using psychoacoustically-based digital signal-processing techniques, composers today can build music for headphones around spatial hearing. Musical-spatial intentions can be conveyed through headphones, creating effects that include the illusion of distance and the manipulation of the environmental context of the sounds.

Begault writes about “four so-called separable musical elements (with their corresponding psychological descriptions); frequency (pitch), spectral content (timbre), intensity (loudness) and duration (perceived duration)”. He finds these categories restrictive, so he adds a fifth element of musical sound: space. A problem arises, according to Begault, when an undesired perceptual mismatch occurs between the composer’s intent and the listener’s perception. He frames this with a source-medium-receptor (SMR) model: the source is the composer’s spatial conception for a sound; the medium comprises the effects of loudspeakers and room acoustics (which can have a huge impact on the sound before it arrives at the listener); and the receptor is the listener who experiences these sound waves. The receptor experiences what Begault calls “immediate perceptual recognition” of the spatial aspects of the sound, as given by cues based on monaural and binaural differences in intensity, spectra and time-delay, along with a higher-level cognition of spatial manipulation.

The advantage Begault claims for headphone music is the elimination of the “medium”, which frees the composer to control the auditory space in which the listener experiences the music. Signals sent to each ear can be predicted and controlled very carefully. There are several types of headphone presentation: diotic, dichotic and binaural. A diotic presentation sends a single signal to both ears. A dichotic presentation feeds a different signal to each ear, and a binaural presentation is essentially a dichotic presentation in which the content of one of the two signals is to some degree present in the other. These signals can be used simultaneously, and when they are, music can sound as though it is coming from above, behind, in front or to the side of the person wearing the headphones.
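To make the three headphone presentations concrete, here is a minimal sketch in Python. It is my own illustration and not anything taken from Begault's article: the function names, the `bleed` parameter and the sample rate are all assumptions, and signals are just lists of numbers.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed; kept low for brevity)


def tone(freq_hz, n_samples, rate=SAMPLE_RATE):
    """Return a sine tone as a list of floats between -1 and 1."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n_samples)]


def diotic(signal):
    """Diotic: the same single signal is sent to both ears."""
    return list(signal), list(signal)


def dichotic(left_signal, right_signal):
    """Dichotic: each ear receives its own signal, with nothing shared."""
    return list(left_signal), list(right_signal)


def binaural(left_signal, right_signal, bleed=0.3):
    """Binaural: like dichotic, but each ear also gets a scaled copy of the
    other ear's signal. `bleed` is a hypothetical mixing amount."""
    pairs = list(zip(left_signal, right_signal))
    left = [l + bleed * r for l, r in pairs]
    right = [r + bleed * l for l, r in pairs]
    return left, right
```

For example, `binaural(tone(440, 100), tone(660, 100))` returns a left channel that is mostly the 440 Hz tone with some of the 660 Hz tone mixed in, and the reverse for the right channel. Real binaural material involves delays and filtering as well, so this only shows the "shared content" part of the definition.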
The perceptual experience is altered significantly. Begault has designed a digital signal-processing algorithm called REFL, which creates spatialized versions of a digital sound file according to an arbitrary model. The algorithm allows compositional specification of a model that includes the position of the listener and of the sound source within a variable environmental context. Its filters create spatial-listening cues by “modifying an input sound in the same way that the outer ears (or pinnae) and the head would modify a sound in an actual environmental context”. The filtering effect changes as a function of the angle of incidence of the sound source, and listeners interpret those changes as changes in the spatial position of the source. Begault has used these techniques in many of his own compositions for headphones, and he continues to expand his work and research as an innovative composer. He ends his article with the anticipation that “We should expect our mind’s ‘aural eye’ to be surprised and challenged in the future.”
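As a rough illustration of how a spatial cue can depend on the angle of incidence, the sketch below computes an interaural time difference with Woodworth's spherical-head approximation and delays the far ear's signal accordingly. This is not REFL and it does not model the pinnae at all; it is a much simpler stand-in that I am assuming for illustration. The head radius, speed of sound, sample rate and function names are all my own choices.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius in metres
SPEED_OF_SOUND_M_S = 343.0   # approximate speed of sound in air


def interaural_time_difference(azimuth_deg):
    """Woodworth's spherical-head estimate of the arrival-time difference
    between the ears, in seconds. 0 degrees is straight ahead and 90 degrees
    is directly to one side."""
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


def delay_in_samples(azimuth_deg, sample_rate=44100):
    """Convert the time difference to a whole number of samples."""
    return round(interaural_time_difference(azimuth_deg) * sample_rate)


def delay_far_ear(signal, azimuth_deg, sample_rate=44100):
    """Return (near_ear, far_ear): the far ear hears the same signal later.
    The delayed copy is padded with zeros and trimmed to the original length."""
    near = list(signal)
    n = min(delay_in_samples(azimuth_deg, sample_rate), len(near))
    far = [0.0] * n + near[:len(near) - n]
    return near, far
```

With these assumed constants, a source directly to one side gives a difference of about 0.66 milliseconds, while a source straight ahead gives none. A fuller simulation would also change loudness and spectrum with angle, which is the part Begault's filters address.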


Reflection: This field is entirely new to me, and fascinating as I have never considered the possibilities for creating music that caters specifically to the manipulation of auditory space. The notion of auditory space itself has only ever come up for me when considering acoustics of a performance venue or the effect of a room’s acoustics on an instrument on a smaller scale. The manipulation of auditory space leads me to think about what else we as listeners might be lacking by experiencing music through the filter Begault calls the “medium”, be it concert hall space, loudspeakers, etc. Is our experience lessened because of the filters through which the music has to travel? Do these filters take away from what is intrinsically enjoyable about music (whatever that may be)? Could we someday manipulate our concert halls to cater to each individual listener in the way that headphones cater to the individual listener?


Although there may be elements of music lacking when auditory space is not controlled, the opposite may also be true. The negative effects of eliminating the medium must be considered. What does the medium offer us as listeners that is unique? For example, taking away the concert hall essentially takes away the social aspect of music. Headphone listening is usually a secluded activity that allows the listener to block out her or his environment. When we cater to this style of listening to and experiencing music, are we in fact cutting ourselves off from a very important aspect of music? By eliminating the social aspect of the musical experience, we lose the energy brought to us by other people, and we lose the opportunity to consciously offer our own energy to the people around us, as happens at a high-energy rock concert. Without the social experience, music may become an introverted activity. This is not to say that seclusion and introversion are bad when it comes to music; it is to point out the combination of elements that make up a musical experience, and how eliminating one (in this case, the social aspect) necessarily alters many other facets.


Despite whatever drawbacks headphone music may have in terms of music’s social aspect, the exciting part of this work is how much manipulation is possible at the level of fine detail and the altering of sound waves. Clearly, there is so much more to music than meets the ear.


2 comments:

Brian Graiser said...

When I studied electronic composition during my undergrad, one of the BIG goals in the course was developing a sense of how to use spatialization (basically, the physical space of sound as you've described) in our music. We had the option to eventually work with quadraphonic sound (from four speakers, usually, but not always, placed in the "four corners" surrounding an audience), but I and most others found the stereophonic field (just two channels) to provide sufficient space with which to work. In particular, I was fascinated by the prospect of simulating three-dimensional sound with just a two-dimensional speaker setup; this, I feel, is much akin to the work you've discussed regarding the two headphone channels. As you mentioned, I found through trial and error that frequency, volume, and of course right/left panning ALL affected the audience's perception not only of side-to-side spatialization, but height and depth as well! For example, in one piece I wrote with electronics ("Wishing Well," for those of you who went to my recital last year), I wanted to simulate a giant coin, having been flipped by some phantom hand, spinning through the air, rising up and then dropping back down. To achieve this with only two channels, I used a metallic source sound and "slowed it down" by dropping the overall pitch. Then, I manipulated the pitch and volume of the sound in tandem, to get the sensation of the coin spinning. Finally, I added a macro-level pitch and volume change to simulate the coin rising up, and then falling back down. By only manipulating pitch and volume, I was able to create a multi-layered effect that simulated height AND depth!

joe schacher said...

Two summers ago I had the pleasure of experiencing Murray Schafer's 'Patria'. It was a musical piece, about 3 hours in length, which was performed outdoors. The audience was led through a forest and stopped to witness various scenes. All the while, music literally surrounded us.

Since there was no stage, instrumentalists and singers played and sang from everywhere. Clarinets and flutes could be heard calling from the bushes and trees around us. A man in an eagle costume sang from up in a tree. In the distance, a beautiful soprano beckoned to us. It was the most immersive musical experience I had ever witnessed and I am sure it had something to do with the spatial nature of the music.

The headphone music which Begault refers to is a simulation of Schafer's idea. Begault is using psychoacoustics to trick our brain into thinking "the sound came from over there". I must say that if it works well, it definitely brings a whole new dimension to sound production. It seems that the entertainment industry is moving towards a more immersive experience (Avatar, anyone?) and I am all for it! I think many people would love to watch a movie, play a video game or listen to music and finish with the feeling that they were actually IN the movie, game or orchestra. Maybe it would be possible to listen to an orchestral piece "as heard by the conductor" or "as heard by the bassoon player".

On Brian's comment:

I have used similar techniques in composition and music production. You are correct: the feeling of 'up' can be simulated with high pitch, and 'closeness' can be simulated with volume, both with great effectiveness. There are many things that can't be simulated with stereo, though. How can you simulate "the sound is right above my head", or "the sound is behind me", or even "the sound came from outside the theatre!"? With stereo it can't be done. With headphone spatialization it can at least be simulated.