Speech Applications
Fiscal 1995 Project Portfolio Report
Nicole Yankelovich and Paul Martin, Principal Investigators
nicole.yankelovich@East.Sun.COM
Overall Objectives
To develop within Sun a center of expertise on speech technologies, to build a robust, effective environment for speech applications, and to help define platform-level requirements needed to support sophisticated speech interaction.
Objectives for FY95
The primary objectives for FY95 were to user-test the prototype software, enhance the speech application development environment, and scale up the prototype software for experimental deployment around the company.
Description
The Speech Project is a multi-year effort aimed at building a testbed for the design and study of speech applications. The initial project focus has been on users who would like telephone access to online data while they are away from their computers. The prototype speech applications we have developed allow users to call up their Sun workstations and verbally interact with applications such as electronic mail, Calendar Manager, stock quotations, and national weather forecasts.
To construct these speech applications, we have created a prototype speech application framework called SpeechActs, in which multiple speech-driven applications can be integrated. This framework supports multiple speech recognizers and synthesizers, and includes a natural language component and a unified grammar language.
Accomplishments
After constructing the backbone of the SpeechActs Framework in FY94, in FY95 we focused on user testing, feature enhancement, and SpeechActs deployment, as well as defining new directions for the group.
In the beginning of FY95, the Speech Project team subjected the prototype applications to laboratory user testing. Based on the results, the application interfaces, as well as parts of the underlying Framework, were substantially redesigned. The user tests also brought to light the need for additional features, both for end-users and for application developers. On the end-user side, the most significant features we added were tools for filtering, grouping, and prioritizing electronic mail messages; a mechanism for disambiguating user and other names; and a faster strategy for switching between applications.
The SpeechActs development environment was enhanced with several "domain specialists" that were added to the Discourse Manager to handle name disambiguation and date resolution across applications. In addition, the team implemented a shared data repository (known as the "blackboard"), extended the use of "regions" to speech recognizers other than BBN's Hark system, and added tools to make dynamic grammar changes possible for those speech recognizers that support such a feature.
To deploy the SpeechActs applications to a group of field test users around the company, the project team had to scale up the prototype. Scaling up involved tackling a number of difficult security-related issues, putting new administrative procedures into place, and enhancing the software. The enhancements included the development of "short lists" so that each user could have a private vocabulary for talking about other users, a graphical user interface for registering new users, the identification and integration of more reliable sources of stock and weather data, and the graceful handling of problems related to network latency.
Once the field study was launched, the team focused on initiating several new projects that will form the basis of work in FY96. Two small projects involve designing and implementing SpeechActs applications to allow users to check the time around the world (e.g., "What time is it in Paris now?") and to look up and convert currencies (e.g., "What's the currency in Japan?" or "How much is $57.00 in Yen?"). Larger projects include building a prototype interactive television catalog with a speech interface, experimenting with the use of a large-vocabulary, isolated-word dictation system, and exploring the integration of other text-to-speech systems into SpeechActs.
References
Publications
Yankelovich, Nicole. "Personification in SpeechActs." Lifelike Computer Characters '94 (October 1994). SML-94-0316.
Yankelovich, Nicole and Eric Baatz. "SpeechActs: A Framework for Building Speech Applications." AVIOS '94 Conference Proceedings (September 1994). SML-94-0243.
Yankelovich, Nicole, Gina-Anne Levow, and Matt Marx. "Designing SpeechActs: Issues in Speech User Interfaces." CHI '95 Conference on Human Factors in Computing Systems (May 1995). SML-94-0394.
Video
Sun Microsystems Laboratories. "SpeechActs: A Conversational Speech System." ACM Multimedia '95 (Forthcoming). SML-95-0098.
webmaster@sunlabs.eng.sun.com