Vocalocity Voice Browser Features

The Vocalocity Voice Browser is a server-based software solution that integrates VoiceXML applications into telephony networks. It connects enterprise applications to phone networks and provides scalable call handling for IVR and speech applications. The Voice Browser terminates phone calls, selects appropriate speech recognition and text-to-speech resources, and communicates with web applications to execute VoiceXML. It uses an open architecture and supports various third party speech technologies and standards.

Uploaded by

nikhil adatkar

Features of the Vocalocity Voice Browser

The Vocalocity Voice Browser is a server-based software solution that gives companies
the ability to integrate enterprise VoiceXML-based applications into a telephony network.
It bridges the enterprise tier with the Public Switched Telephone Network (PSTN) or IP
network to provide a highly scalable, carrier-class call-time platform for executing
IVR and speech applications.
The Vocalocity Voice Browser terminates the calls from the PSTN/IP network,
intelligently selects the appropriate resources for call handling (such as ASR and TTS),
and communicates with the Web-tier to execute the application.
The Vocalocity Voice Browser features an open, standards-based architecture that
incorporates best-of-breed third-party technologies, such as automated speech recognition
and text-to-speech, with a set of software and tools for managing voice applications and
infrastructure. The Vocalocity Voice Browser allows companies to move to next-generation
standards and technologies while preserving the ability to migrate existing
IVR systems. The Vocalocity Voice Browser has an n-tier, component-based architecture. It is
designed around industry standards, such as VoiceXML, SSML, and SRGS, and runs on leading
operating systems: Microsoft Windows and Red Hat Linux.

Logical Architecture
The following illustration shows the logical architecture of the Vocalocity Voice Browser.

There are four major high-level logical components of the Vocalocity Voice Browser:
Telephony Interface, ASR, TTS, and Interpreter.
Telephony Interface
The telephony component handles communication with the PSTN or IP network and
handles all call control and media related functions, such as answering the call, playing
audio prompts, and receiving spoken utterances.
The telephony component interacts with the telephony hardware through a telephony
extension point, or TEP.

Automated Speech Recognition (ASR)


The speech recognition component manages the associated application
grammars and recognition state. It processes spoken utterances and attempts to
match them against a set of known valid inputs, which drive the flow and
logic of the application.
Vocalocity provides integrations to leading ASR vendors, including SpeechWorks and
LumenVox.
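As a rough illustration of matching an utterance against a set of known valid inputs, the sketch below uses a toy in-memory grammar. The names GRAMMAR and recognize, and the phrases themselves, are hypothetical and are not part of any Vocalocity or vendor API. A real ASR engine works on audio and on SRGS grammars rather than on text.

```python
from typing import Optional

# Hypothetical grammar: each valid phrase maps to the semantic value the
# application acts on. A real engine would load an SRGS grammar instead.
GRAMMAR = {
    "check balance": "BALANCE",
    "account balance": "BALANCE",
    "transfer funds": "TRANSFER",
    "speak to an agent": "AGENT",
}


def recognize(utterance: str) -> Optional[str]:
    """Return the semantic value for a valid phrase, or None on a no-match."""
    # Normalize case and whitespace before looking the phrase up.
    normalized = " ".join(utterance.lower().split())
    return GRAMMAR.get(normalized)
```

A None result corresponds to the case where the caller said something outside the grammar, which the application would typically handle by reprompting.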
Text-to-Speech (TTS)
The text-to-speech component is responsible for turning textual output into synthesized
audio that can be played back to the user as if it were spoken by a human. Text-to-speech
is useful when dynamic content does not lend itself to pre-recording.
Vocalocity provides integrations to many leading TTS engines, including SpeechWorks
Speechify and RealSpeak, and VoiceWare VoiceText.
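To illustrate dynamic content that does not lend itself to pre-recording, the following sketch builds an SSML prompt from values known only at call time. The function name balance_prompt and its parameters are assumptions made for this example. How a given TTS engine accepts SSML depends on the vendor integration.

```python
from xml.sax.saxutils import escape

# Namespace defined by the W3C SSML 1.0 specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def balance_prompt(customer_name: str, balance: str) -> str:
    """Build an SSML prompt whose content is only known at call time."""
    # escape() protects characters such as & and < in the dynamic values.
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">'
        f"Hello {escape(customer_name)}. "
        f"Your balance is {escape(balance)}."
        "</speak>"
    )
```

A fixed greeting could be pre-recorded once, whereas the customer name and balance differ on every call, which is why they are passed to TTS here.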
VoiceXML Interpreter
The VoiceXML interpreter manages the dialog state and the application context during a
given call and manages the communication back to the application server that is
delivering the XML content. Depending on the language, execution of the document will
follow the dialog programming standard set forth in the appropriate specification.
The Vocalocity VoiceXML Interpreter supports all required elements in VoiceXML 2.0
and VoiceXML 2.1.
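The sketch below shows the kind of XML content an application server might deliver and how its form items can be read in document order. It only parses the document and is not a description of how the Vocalocity interpreter works internally. The document itself, the grammar file name departments.grxml, and the function list_prompts are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

VXML_NS = "http://www.w3.org/2001/vxml"

# Illustrative VoiceXML 2.0 document (hypothetical application content).
DOCUMENT = """
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="main">
    <field name="department">
      <prompt>Which department would you like?</prompt>
      <grammar src="departments.grxml" type="application/srgs+xml"/>
    </field>
    <block>
      <prompt>Connecting you now.</prompt>
    </block>
  </form>
</vxml>
"""


def list_prompts(vxml_text: str) -> List[Tuple[str, str]]:
    """Return (form item type, prompt text) pairs in document order."""
    root = ET.fromstring(vxml_text.strip())
    ns = {"v": VXML_NS}
    items = []
    for form in root.findall("v:form", ns):
        for item in form:
            # Tags look like "{namespace}field"; keep only the local name.
            local_name = item.tag.split("}", 1)[-1]
            prompt = item.find("v:prompt", ns)
            if prompt is not None:
                items.append((local_name, (prompt.text or "").strip()))
    return items
```

For the document above, list_prompts(DOCUMENT) returns the field prompt followed by the block prompt.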
Additional Components
In addition to the major components, the Vocalocity Voice Browser includes several
lower-level components that model the entire interaction of the system, resources, and
components in the Vocalocity Voice Browser.
In most cases, you will not work directly with or modify these components; they are part
of the underlying architecture.
Component          Purpose

Call ID Generator  Generates a unique ID for each call. This globally unique
                   identifier is commonly called a GUID.

Call Router        Maps an incoming call to a specific URI, which in turn delivers
                   the necessary XML for the call.
                   Set up call routing in Vocalocity Control Center.

Call Manager       Manages a call and maps the appropriate resources, such as ASR,
                   TTS, and Interpreter, to the call.

Channel Manager    Manages one or more physical or logical channels. Channels
                   define an endpoint that can receive and/or originate a call.
                   Set up channels in Vocalocity Control Center (as part of
                   configuring the instance).

Caching Manager    Caches content used by the Interpreter during a call.
                   The Caching Manager keeps the most frequently used content
                   local, while preserving the dynamic aspects of any voice
                   application and abiding by the HTTP and VoiceXML
                   specifications with regard to caching rules.
                   A distributed version of the Caching Manager can be used in
                   larger networks where multiple gateways would benefit from one
                   or more larger caching servers instead of spreading the disk
                   requirements across individual servers.
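The roles of the Call ID Generator and the Call Router can be sketched as follows. This is a minimal illustration, not Vocalocity code: routing on the dialed number is an assumption (the source says only that an incoming call is mapped to a URI), and the phone numbers, example.com URIs, and function names are hypothetical.

```python
import uuid
from typing import Dict

# Hypothetical routing table: dialed number -> URI of the starting document.
# In the product, routing is configured in Vocalocity Control Center.
ROUTES: Dict[str, str] = {
    "8005550100": "http://apps.example.com/banking/start.vxml",
    "8005550199": "http://apps.example.com/support/start.vxml",
}
DEFAULT_URI = "http://apps.example.com/default.vxml"


def new_call_id() -> str:
    """Generate a globally unique identifier (GUID) for a call."""
    return str(uuid.uuid4())


def route_call(dialed_number: str) -> str:
    """Map an incoming call to the URI that delivers its XML."""
    return ROUTES.get(dialed_number, DEFAULT_URI)
```

Falling back to a default URI for unknown numbers is a design choice of this sketch; a deployment might instead reject calls that match no route.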

Expanded View of the Vocalocity Voice Browser


In the logical architecture diagram under Logical Architecture, the Vocalocity Voice Browser
combines the telephony and voice (ASR, TTS, and Interpreter) building blocks of a voice
application system.
The following illustration expands on the high-level logical architecture diagram. It
shows the various layers that make up the Vocalocity Voice Browser (bounded in blue)
and how they integrate with the required third-party voice and telephony components.
Note: The application server and application components are not included in this
illustration.

The Vocalocity Voice Browser (bounded by a light blue box) includes the following
layers:
• Dialog, or interpreter, layer
• Middleware layer: the VocalOS Communication Framework
• Integration layer: APIs for integrating with voice browser components, as well as
  delivered "managers" that handle common processing aspects
• Extension points: custom adapters (or interfaces) to vendor-provided telephony and
  speech components
Also included are the vendor-specific ASR and TTS engines, as well as the telephony
hardware.
Dialog or Interpreter Layer
The Vocalocity Voice Browser includes an XML interpreter that conforms to all required
VoiceXML standards. In addition, an open-source VoiceXML interpreter, called
OpenVXI, is available from Vocalocity.
Integration Layer
The integration layer ties the Vocalocity Voice Browser with required telephony, ASR,
and TTS components.
Vocalocity provides APIs that enable developers to create extension points that handle
messaging between the Vocalocity Voice Browser and those third-party components.
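The idea of an extension point as an adapter between the browser and a vendor component can be sketched with an abstract interface. The class and method names below are hypothetical and do not reflect the actual Vocalocity APIs; the placeholder engine returns silence so that the example runs without any vendor software.

```python
from abc import ABC, abstractmethod


class TTSExtensionPoint(ABC):
    """Hypothetical contract the browser expects from any TTS adapter."""

    @abstractmethod
    def synthesize(self, text: str) -> bytes:
        """Return audio data for the given text."""


class PlaceholderTTS(TTSExtensionPoint):
    """Stand-in adapter: returns 80 zero bytes per character, no real speech."""

    def synthesize(self, text: str) -> bytes:
        return b"\x00" * (80 * len(text))


def render_prompt(engine: TTSExtensionPoint, text: str) -> bytes:
    """Caller code depends only on the interface, not on a specific vendor."""
    return engine.synthesize(text)
```

Because render_prompt accepts any TTSExtensionPoint, swapping one vendor adapter for another would not require changes to the calling code.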
In addition to the Vocalocity APIs, these four components (or managers) handle common
processing aspects. These components are internal to the Vocalocity system; it is unlikely
that you will need to customize them.
Component       Purpose

Audio Manager   Handles the playing of audio prompts and the recording of audio
                (for example, voice mail messages).

DTMF Handler    Handles the recognition of DTMF digits entered by the caller.

Dialog Manager  Manages the requesting and releasing of dialog sessions, which
                are responsible for all the components necessary for a call:
                a document interpreter, reco session, TTS session, media
                session, call session, and channel session.
                The Dialog Manager is an internal component; although it manages
                the Interpreters, it is not the same as the Dialog layer in the
                illustration.

Media Manager   Manages the storage, retrieval, and caching of raw media data
                retrieved from a URI-addressable location, such as a web server
                or file.
                The Media Manager's primary responsibility is to deliver the
                media to the requestor.
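The request-and-release pattern attributed to the Dialog Manager can be sketched with a context manager, which guarantees that a session is released even if call handling fails. The classes below are a hypothetical illustration and are not the internal Vocalocity component; the six session kinds are taken from the Dialog Manager description above.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, Iterator, List

# Session kinds named in the Dialog Manager description.
SESSION_KINDS = ("interpreter", "reco", "tts", "media", "call", "channel")


@dataclass
class DialogSession:
    call_id: str
    resources: List[str] = field(default_factory=list)


class DialogManager:
    """Hypothetical manager that hands out and reclaims dialog sessions."""

    def __init__(self) -> None:
        self.active: Dict[str, DialogSession] = {}

    def request(self, call_id: str) -> DialogSession:
        session = DialogSession(call_id, list(SESSION_KINDS))
        self.active[call_id] = session
        return session

    def release(self, call_id: str) -> None:
        self.active.pop(call_id, None)

    @contextmanager
    def session(self, call_id: str) -> Iterator[DialogSession]:
        session = self.request(call_id)
        try:
            yield session
        finally:
            # Always release, even if call handling raised an exception.
            self.release(call_id)
```

A caller would wrap call handling in `with manager.session(call_id) as session:` so that all six resources are returned when the call ends.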
