Sound on the Internet
By John Maxwell Hobbs
The conventional wisdom among IT professionals is that audio is a toy, for
entertainment purposes only. To date, the best use of audio in user
interfaces has been in video games and CD-ROMs. But thanks to faster
processors, sophisticated new algorithms, and well-thought-out new
APIs, user interface designers stand poised to take advantage of
perhaps the most subtly sophisticated sensing devices human beings have
-- our ears.

The elements of hearing
Human hearing is often overlooked by hardware and software developers,
but it's a rich and subtle sense. It is also always with us. We cannot
close our ears the way we close our eyes. We even hear in our sleep, as
anyone who has been awakened at 6 a.m. by a noisy garbage truck can
tell you. Unlike vision, which focuses on one image at a time, hearing can
be "focused" on several things simultaneously -- it is possible
to hold a conversation while listening to music, and even to follow
several conversations going on at once.

People have been trained by evolution to respond to very subtle sound cues. We can:
- hear in all directions at once.
- hear around corners.
- hear through doors.
- hear through walls.
- identify the location of the source of a sound by hearing alone.
- hear loud noises, like thunder, from miles away.
- identify the dimensions of a space by acoustic clues.
- identify movement behind us via subtle changes in the sound field.
Problems and potential for audio
Currently there are many companies that allow no audible sound from
computers in their workplace. Sound is regarded as intrusive and
unnecessary. Perhaps the problem is not with sound per se, but the way
it has been used in the past. Until recently, the audio-processing
capability of a standard PC was fairly low. Unpleasant bleeps, bloops
and quacks were the only types of sounds that could be produced. For
the most part, audio alerts have been used only to notify the user of
an error -- and nobody wants to share a mistake with the entire office.

Recent innovations are changing this trend. Standard-issue computers are now
able to produce high-quality sound. Most systems come with at least a
16-bit sound card. Sun's JavaSoft division, Microsoft, Intel and Elemedia (a division of Lucent)
have all released high-quality, software-based audio processing tools
that can run on most standard machines. Separately, a number of very
good software-based text-to-speech systems have been released for
several platforms.

Other companies are beginning to take tentative steps toward incorporating
sound into their interfaces. Sun has licensed the Headspace sound
engine (see below) and is releasing it as part of JDK 1.2, as the basis
for the Java Sound API. Sun's Java Media Framework API, out in public beta
now, has a very thorough object-oriented approach to rendering
multimedia files that makes the incorporation of audio events in Java
applications both simple and flexible. Microsoft's Windows 95 and
Internet Explorer 4.0 offer limited options for using sounds as
feedback for certain actions. Qualcomm's Eudora Pro allows the user to select an audio alert instead of an alert box.

Immediate solutions with audio
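The JavaScript approach outlined in this section can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not code from any product named here: makeSound, attachSoundRollover and attachClickSound are hypothetical helper names, and the sound source is assumed to offer play(), pause() and a currentTime property, as a browser Audio element does.

```javascript
// Hypothetical helper: wraps any object that offers play(), pause() and a
// currentTime property (for example, a browser Audio element) so that it
// can be started from the beginning and fully stopped.
function makeSound(audio) {
  return {
    play() {
      audio.currentTime = 0; // restart from the beginning on each trigger
      audio.play();
    },
    stop() {
      audio.pause();
      audio.currentTime = 0; // rewind so the next play starts cleanly
    },
  };
}

// Attach a spoken caption or musical theme to a link, using the same
// mouseover/mouseout pairing that drives graphic rollovers.
function attachSoundRollover(link, sound) {
  link.onmouseover = function () { sound.play(); };
  link.onmouseout = function () { sound.stop(); };
}

// Play a short confirmation sound when the link is clicked, so the user
// knows the click has "taken."
function attachClickSound(link, sound) {
  link.onclick = function () { sound.play(); };
}
```

In a browser this might be wired up as attachSoundRollover(document.getElementById("home-link"), makeSound(new Audio("home-caption.wav"))), where the element id and the file name are placeholders.
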
Without waiting for new products and technologies, sound can easily be
added to Web sites through the use of JavaScript. The same programming
logic behind the use of mouseOver to create graphic "rollovers" can be
applied to sound files to create spoken captions and links, or to
associate a musical theme with a certain link. Other user actions, such
as clicking on links and clicking the forward or back browser buttons,
can be associated with sounds as well.

Sounds provide valuable feedback in the Web environment. Web sites often
feature frustrating delays that may indicate either a process in the
works or a failed action. Because of delayed reactions to form
submissions or clicked hyperlinks, users often click several times,
unsure whether their click has "taken." An audible "click" can reassure
users that a response will be forthcoming. Added assurance would come
if the browser supplied audio feedback when it opened a connection to a
remote server and began downloading data. Yet another sound could
signal the completion of the download. During long downloads, sound can
also be used as the equivalent of the music played over the telephone
to people on hold -- perhaps not a major feature, but a courtesy to the
user. onClick can be used both to initiate the file download and to
start a musical sequence or streaming audio file. When the download is
complete, onLoad can stop playback of the music.

Browser plug-ins

Sound is still in the early stages online, but several products are currently available to exploit the "gee-whiz" factor.

Beatnik is designed to make Web page "sonification" relatively simple to
accomplish. The Beatnik plug-in contains a wavetable synthesis engine
that allows for the rendering of most standard digital audio files such
as .WAV, .AIFF and .AU. It will also play MIDI
files and Headspace's own RMF files. An RMF file can contain both MIDI
sequences and digital audio samples. RMF files are highly compressed
and download quickly. The contents of an RMF file can be addressed
individually via JavaScript. This is particularly useful because it
allows all of the sounds and musical sequences used on a page, or even
an entire site, to be downloaded in one small package that can be
cached. It also allows the audio elements of a site to be updated very
simply.

Beatnik gives users unprecedented control over the playback itself. Users can
not only control volume, they can change the tempo, pitch and even the
instruments used to play musical sequences. A Java applet that watches
the number and speed of user clicks could use that information to
create a customized soundtrack. A user who hopped around a site
quickly would hear an up-tempo version of the site's theme music, while
a more leisurely surfer would get a soundtrack to reflect that.

Sseyo is promoting the use of what it calls "generative music" with its Koan
plug-in for Windows. The Koan plug-in reads a very small file,
sometimes as small as 1K, that acts as a "seed" that creates an
ever-evolving piece of music. The "seed" file specifies things like
feel, tempo, key, basic melodies and instruments used. The plug-in
takes it from there and "improvises" a new composition each time. This
eliminates the use of repetitious "loops" that can quickly become
annoying. Because of the small files used, this technology offers an
incredibly fast download time.

Microsoft has an ActiveX control for Win32 and IE called Interactive Music Control
that has similarities to both Beatnik and Koan. As with Beatnik, sound
events contained in a single file can be scripted, and downloadable
sounds are supported. As with Koan, ever-changing compositions are
generated on the fly based on very simple initial parameter settings.
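
To make the idea of generating music from a few initial parameters more concrete, here is a minimal JavaScript sketch. It does not reproduce how Koan or Interactive Music Control work internally. The seed format, the random-walk rule and the names makeRandom, seed and generatePhrase are all assumptions made for illustration.

```javascript
// Small deterministic pseudo-random generator (a linear congruential
// generator). Any seedable generator would do for this illustration.
function makeRandom(start) {
  let state = start >>> 0;
  return function () {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // value in the range [0, 1)
  };
}

// Hypothetical "seed": a handful of initial parameters, no recorded audio.
const seed = {
  key: 60,                // MIDI note number for middle C
  scale: [0, 2, 4, 7, 9], // major pentatonic intervals, in semitones
  tempo: 96,              // beats per minute
};

// Generate a phrase as a random walk over the scale. A different variation
// number yields a different walk from the same seed parameters.
function generatePhrase(seed, variation, length) {
  const random = makeRandom(variation);
  const notes = [];
  let step = 0; // position in the scale, starting on the key note
  for (let i = 0; i < length; i++) {
    const move = Math.floor(random() * 3) - 1; // -1, 0 or +1 scale steps
    step = Math.min(Math.max(step + move, 0), seed.scale.length - 1);
    notes.push({
      pitch: seed.key + seed.scale[step],
      durationMs: 60000 / seed.tempo, // one beat per note
    });
  }
  return notes;
}
```

Calling generatePhrase(seed, Date.now(), 16) on each visit would produce a phrase from the same few parameters without shipping any audio data, while reusing a variation number reproduces the same phrase.
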
Interactive Music Control can also combine elements of both plug-ins,
so that a user's interaction shapes the qualities of the music.

Conclusion
Sound is still a rarity on the Internet, often used for its novelty
value. As technology progresses and ideas catch up with PCs' new
abilities, the power of one of our most important senses is likely to
play a growing role. The role that sound already plays in computer
games may be an indicator of the future -- imagine Quake silent. The
challenge to software developers today is to bring that quality of
sound design to productivity and communications software.

John Maxwell Hobbs is a musician and has been working with computer
multimedia for over fifteen years. Most recently, he headed up
multimedia research and development for EarthWeb, Inc. He is also on
the board of directors of Vanguard Visions, an organization dedicated
to fostering the work of artists experimenting with technology. He is
the former Producing Director for The Kitchen. John Maxwell Hobbs can
be reached at: jmax@artswire.org.