The Synchronized Multimedia Integration Language
By John Maxwell Hobbs
On June 15, 1998, the World Wide Web Consortium (W3C) released the Synchronized Multimedia Integration Language Specification 1.0, or SMIL, as a recommendation. SMIL, an XML-based language, is intended to allow the easy implementation of sophisticated time-based multimedia content on the Web. According
to the W3C's recommendation, SMIL allows a developer to "describe the
temporal behavior of a presentation, describe the layout of the
presentation on a screen, and associate hyperlinks with media objects."
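As a rough illustration of those three capabilities, the sketch below embeds a small SMIL 1.0 document in a Python string and parses it to list its layout regions, media objects and hyperlinks. The file names, region ids and the example.com URL are hypothetical placeholders, and the Python wrapper is only a convenience for checking that the markup is well-formed; a player would be given the markup itself in a .smil file.

```python
import xml.etree.ElementTree as ET

# Hypothetical SMIL 1.0 document: every src, id and href is a placeholder.
SMIL_DOC = """<smil>
  <head>
    <layout>
      <root-layout width="320" height="240"/>
      <region id="slides" left="0" top="0" width="160" height="120"/>
      <region id="links" left="0" top="120" width="160" height="120"/>
    </layout>
  </head>
  <body>
    <par>
      <audio src="narration.rm"/>
      <seq begin="10s">
        <img src="slide1.jpg" region="slides" dur="5s"/>
        <img src="slide2.jpg" region="slides" dur="5s"/>
      </seq>
      <a href="http://www.example.com/">
        <text src="links.txt" region="links" begin="15s"/>
      </a>
    </par>
  </body>
</smil>"""

# Media object elements used in this sketch.
MEDIA_TAGS = {"audio", "video", "img", "text"}


def summarize(smil_text):
    """Return the regions, media objects and hyperlinks found in a SMIL string."""
    root = ET.fromstring(smil_text)
    regions = [region.get("id") for region in root.iter("region")]
    media = [
        (element.tag, element.get("src"), element.get("begin"))
        for element in root.iter()
        if element.tag in MEDIA_TAGS
    ]
    links = [anchor.get("href") for anchor in root.iter("a")]
    return {"regions": regions, "media": media, "links": links}


if __name__ == "__main__":
    summary = summarize(SMIL_DOC)
    print("Layout regions:", summary["regions"])
    for tag, src, begin in summary["media"]:
        print("Media object:", tag, src, "begin =", begin)
    print("Hyperlinks:", summary["links"])
```

In this sketch the layout element carries the screen layout, the par and seq elements with their begin and dur attributes carry the temporal behavior, and the a element attaches a hyperlink to a media object.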
Because SMIL is a relatively simple declarative language that can be
created with a basic text editor, it has the potential to be widely
adopted and to have as revolutionary an impact on computer-delivered
content creation as HTML. SMIL is supported on Windows by Real
Networks' G2 Player, currently in beta release at http://www.real.com; on Windows, UNIX and Macintosh by GRiNS from CWI at http://www.cwi.nl/GRiNS/; and on UNIX and Java by HPAS at http://lists.w3.org/. The Web site justsmil.com has extensive resources devoted to SMIL.

Capabilities

SMIL allows for the visual layout and synchronization of a variety of media
types -- audio, video, graphics and text -- in time-based
presentations. SMIL can be used to present a sequence of media files in
a timed format like a slide show, or a number of different media types
simultaneously for a sophisticated "TV-style" presentation. For
example, a .smil file might instruct a player to begin a presentation
by playing an audio file, then ten seconds into the audio track, start
displaying a timed sequence of JPEG images in the top left corner of
the screen. Fifteen seconds later, it could begin rendering a streaming
video source while at the same time displaying a series of hyperlinks
in the bottom left corner of the screen.

One of the most important features of SMIL is that, like HTML, it allows a
presentation such as the example above to be created using distributed
media. It is possible for each of the files referenced by a SMIL
document to reside on a different server. SMIL does not require a
presentation to be "containerized" as proprietary formats such as
PowerPoint, Macromedia's Shockwave and Microsoft's Active Streaming
Format (ASF) do.

Behind the scenes

The Synchronized Multimedia Working Group of the World Wide Web Consortium
created the SMIL recommendation. The Working Group is composed of
professionals in the hardware, software, digital media and broadcast
media industries. The companies represented in the group include
Lucent, Apple, Real Networks, Philips, Bell Labs, DEC, Netscape and
CNET, among others. According to a paper entitled "Toward Synchronized
Multimedia on the Web," published by Philipp Hoschka in the Spring 1997
edition of the World Wide Web Journal, the Working Group was formed
to combat the "imminent danger that a plethora of
non-interoperable solutions for integrating real-time multimedia
content into the Web architecture will emerge. These different
solutions will most likely not result from a healthy competition
advancing technological progress. Instead, they will result from a
simple lack of communication between the three very different
communities involved, namely the Web community, the CD-ROM community,
and the community working on Internet-based audio/video-on-demand."

The earliest efforts of the Working Group revolved around the choice
between a declarative format, along the lines of HTML and XML, and a
scripting format similar to Lingo, JavaScript and HyperCard. Ultimately
it was determined that the use of a declarative syntax had many
advantages over the use of scripting. Scripting is much harder to
maintain and does not lend itself to simple authoring tools. The
advantages of a declarative language become apparent during
the creation process: sequence changes can be implemented immediately
by altering the appropriate descriptor, and content changes can be
accomplished by replacing a source file with a new one with the same
name. In addition, the .smil file can contain meta tags containing
keywords and a description of the media contained in the presentation
to allow comprehensive indexing by search engines -- something that is
not possible with containerized multimedia presentations. The wide
variety of tools currently available for converting document processing
formats into HTML is a good indication of how easy it will be to
create similar tools for SMIL.

The promise of SMIL

SMIL is an important step forward for computer-based multimedia in that it
is based on open standards. The world of traditional electronic media
-- recording, film and video -- relies on standards to ensure that
content can be delivered across a broad range of devices and in a wide
range of circumstances and conditions. The use of open standards also
ensures that the content will not become unavailable due to
technological obsolescence. It also allows the specification to grow
along with technology; new standards can easily be incorporated as
they are introduced. This also ensures that legacy content will remain
available. The SMIL implementation from Real Networks supports the
incorporation of Shockwave, ASF and other containerized content in a
SMIL presentation.

Just as HTML has allowed the traditional methods for creating
documents to be extended to a networked environment, SMIL extends the
working methods of traditional media to the Internet. At the heart of
any film or video presentation is the Edit Decision List, or EDL. Like
SMIL, an EDL is a text file in a declarative format that describes the
sequence of cuts, transitions, effects and source material used to
create the final presentation. EDLs are created in a standard format
that can be read by any editing system. This similarity between
creating an EDL and creating a SMIL document should make it easy for
content creators working in traditional media to transfer their
knowledge and working methods to the Web. The process of content
creation has always been hands-on, and one of the greatest barriers to
the Web for traditional content creators -- the need to have knowledge
of computer programming or to have access to a programmer -- can be
eliminated.

SMIL's ability to create a presentation using distributed media has
significant implications for content providers. The first is the
ability to reuse media sources. Because SMIL has the ability to display
a selected segment of a larger file, multiple presentations can use the
same source file. For example, a news site could offer viewers the
option to view an entire press conference or just edited highlights
using the same streaming video file. The same source file could be used
along with simultaneous text or audio translations in a variety of
different languages. A provider could syndicate exclusive media content
for presentation in a wide variety of customized formats.

The use of distributed media also simplifies bandwidth management.
Currently, if versions for multiple bandwidths are required,
containerized formats such as Shockwave and ASF require each version to
be created from scratch. SMIL, combined with a server enabled for
bandwidth detection such as the one from Real Networks, requires only
that the source files be encoded at the required data rates; the same
SMIL file can be delivered to all viewers. As multi-complexity encoding
schemes such as Real Networks' G2 system become widely available,
bandwidth management will become even simpler.

The industry-wide involvement and support behind the creation of SMIL bodes
well for the standard. The ease of development should open the way for
creation of multimedia presentations by those who found the process too
difficult or labor-intensive up to now. The ability to reuse content
and manage bandwidth should prove attractive to sites concerned about
the typical overhead involved in delivering multimedia. Development of
SMIL implementations is proceeding at a rapid pace. Authoring tools are
already available from Real Networks, CWI, Digital Renaissance, and VEON. Additional tools have been announced by a number of other companies.

John Maxwell Hobbs is a musician and has been working with computer
multimedia for over fifteen years. He is currently in charge of
multimedia development at Ericsson CyberLab New York. His interactive
composition "Web Phases" was recently one of the winners of ASCI's Digital '98 competition and is currently on exhibit at the New York Hall of Science.
He is also on the board of directors of Vanguard Visions, an
organization dedicated to fostering the work of artists experimenting
with technology. He is the former Producing Director for The Kitchen.
John Maxwell Hobbs can be reached at: john.maxwell.hobbs@ericsson.com.