SAN FRANCISCO -- Cisco, Microsoft, Intel, Comverse, SpeechWorks and Philips have banded together to come up with a platform-independent standard for speech and multi-modal access to Web-based applications and information.
Despite protestations to the contrary, the Speech Application Language Tags (SALT) Forum, as the group is called, is clearly aimed at establishing an alternative to Voice Extensible Markup Language (VoiceXML) and has seized on the shortcomings of that markup language in handling simultaneous voice and data dialogues.
The Forum will develop a set of tags to supplement HTML, XHTML and XML that will act almost as middleware to let the vast Web developer community build applications with a speech layer. At the heart of the initiative is an effort to build multi-modal applications -- combining speech, data and graphics -- that can run across multiple devices. The plan is to have a public specification of the tags assembled by the first quarter of next year, followed by a submission to an as-yet-undetermined standards body by the middle of 2002.
Forum members apparently have not yet finalized how other companies will join the organization, although Microsoft and Comverse told the451 the specifications would be open to contributors outside the founding companies.
Since SALT is aimed at creating a developer environment, it's likely that Microsoft is spearheading the initiative. Cisco and SpeechWorks both have developer tools, but nothing on the scale of Microsoft's. According to Kai-Fu Lee, vice president of Microsoft's user interface technologies division, the companies sat down to come up with the idea for a forum only in recent weeks.
The SALT participants are adamant the organization won't promote an alternative developer environment to VoiceXML, but it is clear that they are aiming to bypass that markup language for the hearts and minds of the Web developer community. Forum members contend that VoiceXML has been a positive development for the interactive voice response industry, which has traditionally used proprietary equipment, but that it falls short when it comes to providing a platform for multi-modality.
Forum members have participated in deliberations involving the World Wide Web Consortium (W3C) and 50 companies to finalize VoiceXML 2.0, and SpeechWorks, Microsoft and Philips have contributed to draft specifications. In fact, other members of the W3C group contend that Microsoft has played more of a disruptive role, bogging down proceedings in technical disputes. To be fair, the standards process has also been held up by the original VoiceXML developers -- Lucent, IBM, AT&T and Motorola -- disputing the inclusion in the standard of what they contend is patented technology. The VoiceXML Forum, an industry trade group, has 560 members, although many are smaller startups.
It's clear, however, that despite the industry support, Microsoft has misgivings about VoiceXML.
"SALT will be a superset of VoiceXML," Lee told the451.
"VoiceXML is a good thing in the context of moving away from proprietary hardware and developing reusable components. However, it is designed for speech only, and if you have a screen it is already using HTML. What we are saying is that don't reinvent graphics to accommodate speech."
"VoiceXML does not achieve what it sets out to do. It does not support the separation of data from the presentation layer. Speech is part of the presentation layer," he added.
Why is Microsoft interested in speech? According to Lee, the software giant considers speech "the perfect way" to access its .Net Web services. "Speech is in sync with the .Net vision. It's not just a PC experience. We intend to support speech in Internet Explorer, Pocket PC Internet Explorer and definitely Visual Studio.Net," Lee told the451. Microsoft's inability, so far, to produce a commercial product from its huge investments in speech technology is legendary.
Some of the forum members have made strategic commitments to VoiceXML. SpeechWorks in July launched a speech recognition engine with VoiceXML support; Comverse, which has worked on both infrastructure and VoiceXML applications, such as unified messaging, last month introduced a VoiceXML gateway. According to Israeli press reports, Comverse lost out to its rival Openwave in a bid for AT&T Wireless' unified messaging business.
With current wireless networks there are tremendous difficulties in synchronizing a VoiceXML and data browser, for instance one using WAP, for simultaneous input and output of data and voice. However, the difficulties are not insurmountable, and some startups are working on solutions. SpeechWorks and Nuance are both developing multi-modal access for 2.5G and 3G networks. Also, the W3C has begun the standards process on the Multimedia Markup Language, which will unify VoiceXML, WAP and other markup languages.
Observers contend, however, that the SALT Forum is only likely to add another alternative architecture for creating multi-modal applications.
the451 (www.the451.com) is an analyst firm that provides timely, detailed and independent analysis of news in technology, communications and media.