Call Browser Reference

The Call Browser shows details about specific calls and their interactions. It lists all currently loaded and filtered calls and, for the selected call, all of its interactions.

The Call Browser window is divided into three sections: the Calls List at the top of the screen, the Interactions List in the middle, and the Audio Controls at the bottom.

Keyboard Shortcuts

To move quickly between calls and interactions, users can use the keyboard arrow keys. The up and down arrows will move between calls, and the left and right arrows will move between interactions.

Calls List

The Calls List provides a list of calls and information about each call. Users can move between calls by clicking the Prev Call or Next Call buttons. If the last call is currently selected, the Next Call button returns to the first call.

The following information is provided:

Call ID

The ID number for the call within the database.

Call Time

The time the call was made.

Interactions

The number of interactions within the call.

No Match

The number of interactions within the call where the engine was unable to match the caller's audio.

No Input

The number of interactions within the call where the caller did not provide any input.

DTMF

The number of DTMF interactions within the call.

SRE

The number of times the speech engine recognized speech.

Avg. Incorrect Conf

The average confidence score of the interactions in which the speech engine interpreted a phrase incorrectly.

Avg. Correct Conf

The average confidence score of the interactions in which the speech engine interpreted a phrase correctly.

Incorrect

The total number of incorrect interpretations.

Correct

The total number of correct interpretations.

Call GUID

The GUID for that call.
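
As a rough illustration of how the per-call columns relate to the interaction rows, the following Python sketch derives several of them from a list of interactions. The Interaction class, its field names, and the correct flag are hypothetical and are not part of the Call Browser; how the tool itself judges an interpretation as correct, and which event types it includes in each count, may differ.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Interaction:
    """Hypothetical stand-in for one row of the Interactions List."""
    event_type: str                     # e.g. "SRE", "DTMF", "NO_INPUT"
    confidence: Optional[float] = None  # None when no score is available
    correct: Optional[bool] = None      # None when correctness is unknown


def summarize(interactions):
    """Derive Calls List style columns from a call's interactions."""
    def count(event_type):
        return sum(i.event_type == event_type for i in interactions)

    def avg_confidence(flag):
        scores = [i.confidence for i in interactions
                  if i.correct is flag and i.confidence is not None]
        return mean(scores) if scores else None

    return {
        "Interactions": len(interactions),
        "No Input": count("NO_INPUT"),
        "DTMF": count("DTMF"),
        "SRE": count("SRE"),
        "Correct": sum(i.correct is True for i in interactions),
        "Incorrect": sum(i.correct is False for i in interactions),
        "Avg. Correct Conf": avg_confidence(True),
        "Avg. Incorrect Conf": avg_confidence(False),
    }
```

The averages are returned as None when a call has no interactions of that kind, which avoids dividing by zero.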

Interactions List

The Interactions List contains details about each interaction within a call. Users can move between interactions by clicking the Prev Interaction or Next Interaction buttons. Clicking Next Interaction when the last interaction in a call is selected moves to the next call.
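
The navigation rules for the Next Call and Next Interaction buttons can be modeled with a short Python sketch. The Cursor class and both functions are hypothetical names used only for illustration. The sketch also assumes that moving to a new call selects its first interaction and that Next Interaction on the last call wraps to the first call, neither of which is stated in this reference.

```python
from dataclasses import dataclass


@dataclass
class Cursor:
    """Current selection: index of the call and of the interaction."""
    call: int = 0
    interaction: int = 0


def next_call(cursor, interaction_counts):
    """Select the next call, wrapping from the last call to the first.

    interaction_counts[i] is the number of interactions in call i.
    Assumption: the first interaction of the new call is selected.
    """
    return Cursor((cursor.call + 1) % len(interaction_counts), 0)


def next_interaction(cursor, interaction_counts):
    """Select the next interaction, or the next call after the last one."""
    if cursor.interaction + 1 < interaction_counts[cursor.call]:
        return Cursor(cursor.call, cursor.interaction + 1)
    return next_call(cursor, interaction_counts)
```

For example, with two calls holding 2 and 3 interactions, calling next_interaction twice from the initial Cursor() ends on the first interaction of the second call.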

Clicking Launch in Tester or Launch in Transcriber opens the selected interaction in the Grammar Tester or the Transcriber. Right-clicking an interaction opens a menu that offers the same options.

The following information is provided in the Interactions List:

Interaction ID

The number of the interaction within the loaded database.

Event Type

START_DECODE_SEQUENCE:

The start of the call.

SRE:

Indicates that the speech application recorded audio for this event. Speech events usually also have recognition results. With some speech recognition engines and platforms, the audio may not have been recorded even though the interaction is marked as a speech event. This is usually done for security, for example to avoid recording the caller's credit card number. In these cases, the interaction is still marked as a speech event, but no audio and, most likely, no speech recognition results are available.

DTMF:

The caller chose to key in a response using the telephone keypad rather than speaking.

NO_INPUT:

Generally, this means the application expected speech or DTMF, but the caller neither spoke nor used the telephone keypad.

UNKNOWN:

These events are typically engine-specific. Check the SRE output window to see what kind of event the interaction was.

END_DECODE_SEQUENCE:

The end of the call.
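
Because START_DECODE_SEQUENCE and END_DECODE_SEQUENCE mark the boundaries of a call, a flat sequence of event types can be grouped into calls. The Python sketch below shows one way to do this. The function name and the idea of processing the events as a plain list of strings are assumptions for illustration and do not describe how the Call Browser stores or loads its data.

```python
def split_calls(event_types):
    """Group a flat sequence of event type names into per-call lists.

    The START/END marker events delimit each call and are not included
    in the returned lists. Events outside any START/END pair are skipped.
    """
    calls = []
    current = None  # None means "not inside a call"
    for event in event_types:
        if event == "START_DECODE_SEQUENCE":
            current = []
        elif event == "END_DECODE_SEQUENCE":
            if current is not None:
                calls.append(current)
            current = None
        elif current is not None:
            current.append(event)
    return calls
```

A stream containing two START/END pairs therefore yields two lists, each holding the SRE, DTMF, NO_INPUT, or UNKNOWN events that occurred during that call.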

Raw Text

The raw text the engine recognized.

Confidence Score

The confidence score for that interaction.

Transcript

The text of the entered transcript, if one exists for that interaction.

Sequence Number

The sequence number is unique to a call and denotes the order in which interactions took place.

Interaction Name

The original interaction identifier, generated by the speech engine.

View Details

Clicking on the View Details button brings up the Interaction Details window, which displays more information about the selected interaction.

The following details are provided:

Acoustic Model

The acoustic model used by the speech engine that achieved the best answer.

Decode Time

The amount of time from decode request to completion, in milliseconds. The decode time is generally not the same as the recognition time: it covers the time needed to issue the decode request, compile the grammar, transport the audio, decode on the server, and receive the result. It is therefore closer to a round-trip time measurement, and it more accurately reflects the delay a caller experiences when using the speech system.
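
To make the round-trip nature of the decode time concrete, the Python sketch below adds up the stages listed above. The DecodeTiming class and its field names are hypothetical, and the sample numbers are invented; the Interaction Details window reports only the total, not a per-stage breakdown.

```python
from dataclasses import dataclass


@dataclass
class DecodeTiming:
    """Hypothetical per-stage timings for one decode, in milliseconds."""
    request_ms: float          # issuing the decode request
    grammar_compile_ms: float  # compiling the grammar
    audio_transport_ms: float  # transporting the audio
    server_decode_ms: float    # decoding on the server (recognition)
    result_return_ms: float    # receiving the result back

    @property
    def decode_time_ms(self):
        """Round-trip time from decode request to completion."""
        return (self.request_ms + self.grammar_compile_ms
                + self.audio_transport_ms + self.server_decode_ms
                + self.result_return_ms)


timing = DecodeTiming(5, 40, 30, 200, 5)
print(timing.decode_time_ms)  # 280, versus 200 for recognition alone
```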

NBest Rank

NBest is the list of candidate matches that the speech engine generates for an utterance. The top candidate is ranked 0.
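
The zero-based ranking can be illustrated with a short Python sketch. It assumes that candidates are ordered by confidence score, highest first, which is common for N-best lists but is not confirmed by this reference. The function name and the sample candidates are invented for the example.

```python
def rank_nbest(candidates):
    """Assign zero-based NBest ranks to (text, confidence) pairs.

    Assumption: a higher confidence score means a better candidate.
    Returns (rank, text, confidence) tuples, best candidate first.
    """
    ordered = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [(rank, text, score)
            for rank, (text, score) in enumerate(ordered)]


ranked = rank_nbest([("austin", 410), ("boston", 820), ("houston", 350)])
print(ranked[0])  # (0, 'boston', 820)
```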

Semantic Interpretation

The semantic interpretation of the answer, together with a confidence score that indicates how confident the engine is in this interpretation. Any semantic interpretation properties included in the grammar are displayed here as well.

Raw Text

Raw text contains the actual words recognized by the speech engine. If available, the raw text also has confidence scores attached to each word or phrase and that word's individual phonemes. Some conventions used within the raw text include <s> for the start of an utterance and </s> to mark the end. "SIL" under phonemes indicates silence.
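
When raw text is copied out for further processing, the utterance markers and silence entries usually need to be removed. The Python sketch below shows one way to do that. It assumes the raw text is a whitespace-separated string of tokens and that phonemes are available as a list of names; the exact layout shown in the Interaction Details window may differ, and both function names are hypothetical.

```python
UTTERANCE_MARKERS = {"<s>", "</s>"}


def spoken_words(raw_text):
    """Return the recognized words without the <s> and </s> markers."""
    return [token for token in raw_text.split()
            if token not in UTTERANCE_MARKERS]


def drop_silence(phonemes):
    """Return the phoneme names with "SIL" (silence) entries removed."""
    return [p for p in phonemes if p != "SIL"]


print(spoken_words("<s> check my balance </s>"))
# ['check', 'my', 'balance']
```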

Clicking the Detailed radio button at the top of the window provides some additional information, including the system configuration of the speech engine and any custom tags generated by the speech recognition application.

Audio Controls

Clicking Play or Stop starts or stops the audio. The drop-down box lets users choose between the Decoded Audio, which is the normalized and cleaned-up audio that comes out of the speech engine, and the Actual Utterance, which is the raw audio the caller spoke. Moving the volume slider left or right adjusts the playback volume.

The Export as WAV button allows users to export the call audio to their hard drive as a WAV or RAW file.

© 2012 LumenVox LLC. All rights reserved.