Speech recognition is now widely considered the preferred way to interact with a telephony application. It is also feasible from a cost and ROI perspective: the price of the core technology has gone down, and software standards have opened up options for the consumer. So what is the best way to approach a "speech overhaul" for your DTMF application? And what are some of the compelling reasons to speech-enable your DTMF (sometimes referred to as Touch Tone) solution?
In this paper we will cover some techniques and tips you can follow to convert your DTMF application to speech — and delve into the reasons for taking the plunge in the first place! With careful planning and a methodical approach, you can leverage speech recognition technology to bring increased caller satisfaction and improved customer service to your organization.
At this point you may be asking yourself: what are the advantages of speech recognition, and is it the right move for my company? To illustrate the advantages, let's take a look at an application that in DTMF goes something like this:
Application: "Thank you for calling Umbrellas-R-Us. To get store hours press one, to find the location of the closest store, press two, to order umbrellas over the phone, press three."
Caller Input: 3
Application: "We have 5 choices of colors for umbrellas. Press 1 for green, 2 for red, 3 for yellow, 4 for blue and 5 for purple. Press 6 to repeat this menu."
Caller Input: 4
Application: "We have two sizes of umbrellas, press 1 for compact and 2 for full-size."
Caller Input: 2
Now, let's take a look at how a well-designed speech application handles the same transaction:
Application: "Thank you for calling Umbrellas-R-Us. Would you like to order an umbrella or get store hours and locations?"
Caller: Order an umbrella
Application: "Great, would you like a green, red, yellow, blue or purple umbrella?"
Caller: Blue
Application: "Would you like a compact or full-sized umbrella?"
Caller: Full-size
What did we achieve? Speech has transformed this passive DTMF menu structure into a more interactive and natural experience for the caller. Instead of simply listing the menu items, the application asks the caller questions. While the DTMF caller in our example managed to remember which number to press for which color of umbrella, a caller could easily get confused, need to hear the options a second time, and start to get frustrated. In general, long and complex DTMF menus are cumbersome for any caller to navigate. Other benefits of speech include:
Having said all that, make sure the application you choose to convert is one that is best served by speech recognition. Perhaps start with the application that has the lowest call completion rate, or with an application where you'd like to collect additional information from the caller but simply can't today through DTMF.
The goal of any telephony application is to route a call, provide information, complete a transaction, or all of the above. If implemented well, speech can smooth all of these processes and move a call forward more quickly.
So, before you begin, keep in mind the following:
Before getting into grammars and prompts, let's discuss the difference between Directed Dialog and Natural Language speech applications. Directed Dialog is a way of developing the application to "guide" the caller to use specific phrases in their utterances. For example, the application would ask the caller, "Would you like sales, support or accounting?" Natural Language applications simply ask, "How may I help you?"
It is highly recommended to build your application using Directed Dialog, because the Speech Engine recognizes utterances far more accurately, and with higher confidence scores, against the focused grammar of a Directed Dialog solution. In addition, the time and resources needed to build an effective Natural Language application are often cost prohibitive for most organizations: you would need to load the grammar with every possible response (very difficult!), which in turn increases development time and cost.
Grammars
Grammars refer to the list of expected responses from your callers. For example, for a front-end auto attendant for your company,
the grammar would include the list of names of the employees or departments, plus words or phrases such as "Main Menu",
"Operator," or "Cancel."
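To make the idea of a grammar as "a list of expected responses" concrete, here is a minimal Python sketch of an auto-attendant lookup. It is only an illustration of the concept, not how a speech engine works internally: the department names, the employee name, and the extension values are hypothetical, and a real recognizer matches audio against the grammar rather than comparing text.

```python
import re

# Hypothetical auto-attendant grammar: each expected phrase maps to a destination.
# The names and extensions are made up for illustration.
GRAMMAR = {
    "sales": "ext-200",
    "support": "ext-300",
    "jane smith": "ext-412",
    "main menu": "MAIN_MENU",
    "operator": "OPERATOR",
    "cancel": "CANCEL",
}


def normalize(utterance: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z\s]", "", utterance.lower())
    return " ".join(cleaned.split())


def match(utterance: str):
    """Return the destination for an in-grammar utterance, else None."""
    return GRAMMAR.get(normalize(utterance))
```

Anything the caller says that is not in the list comes back as `None`, which is the point at which an application would typically reprompt.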
Other points about grammars:
Prompts
Audio prompts are where the rubber meets the road for a speech solution. This is how you direct your callers through a
successful transaction. Speech applications ask questions, and each question needs to line up with the responses the
grammar expects. Learn from past speech application mistakes and do not write "say" menus. For example, "To get account
status say Account Status, for the main menu say Main Menu." This frustrates the caller and does not utilize the strengths of speech.
Simply write the prompt as "You can get Account Status or return to the main menu."
The grammars then can include the filler words, for example:
$AccountStatus = [get] account status;
This enables the user to say either "get account status" or "account status."
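The effect of the optional filler word can be sketched in a few lines of Python. The `expand` helper below is hypothetical and handles only single-word optional tokens in square brackets; it is meant to show which phrases a rule like the one above accepts, not to stand in for a real grammar compiler.

```python
from itertools import product


def expand(rule_body: str) -> set:
    """Expand a rule body such as '[get] account status;' into every phrase it accepts.

    Only single-word optional tokens written as [word] are supported.
    """
    choices = []
    for token in rule_body.rstrip(";").split():
        if token.startswith("[") and token.endswith("]"):
            choices.append(["", token[1:-1]])  # optional word: absent or present
        else:
            choices.append([token])            # required word
    # Build every combination, dropping the empty slots left by skipped optional words.
    return {" ".join(word for word in combo if word) for combo in product(*choices)}
```

Each additional optional word doubles the number of accepted phrases, so a handful of filler words covers many natural phrasings without listing them one by one.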
The menu items in your speech prompts do not need to follow the same order as those in your DTMF prompts. They can be ordered by common usage or simply to make the prompt sound better. You have this freedom because speech is not tied to key numbers, whereas the order of DTMF prompts is almost always, "For A press 1, for B press 2 ... and for Z press 0."
One last thing about prompts: if at all possible, use recorded prompts rather than text-to-speech.
Other Tips
Example: $MainMenu = (Main Menu):"1"
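The paper does not spell out why this rule returns "1". One plausible reading, and it is an assumption here, is that the grammar hands back the same value the caller would have pressed, so the existing DTMF call-flow logic can be reused unchanged. A minimal Python sketch under that assumption follows; only the "main menu" entry comes from the example above, and the other phrases, digits, and route names are hypothetical.

```python
# Phrase -> return value, mirroring $MainMenu = (Main Menu):"1".
# Only "main menu" comes from the paper; the other entries are hypothetical.
SPOKEN_TO_DIGIT = {
    "main menu": "1",
    "account status": "2",
    "operator": "0",
}


def handle_digit(digit: str) -> str:
    """Stand-in for the existing DTMF call-flow logic (hypothetical routes)."""
    routes = {"1": "main_menu", "2": "account_status", "0": "operator"}
    return routes.get(digit, "reprompt")


def handle_input(caller_input: str) -> str:
    """Accept a key press or a recognized phrase and reuse the DTMF routing."""
    text = caller_input.strip().lower()
    digit = text if text.isdigit() else SPOKEN_TO_DIGIT.get(text, "")
    return handle_digit(digit)
```

With this arrangement the speech front end and the touch-tone front end feed the same routing function, so callers can still press keys where that is kept as a fallback.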
It cannot be over-emphasized: once you've built your speech application, you are only halfway there! A significant part of your time should be spent testing and tuning the application. Deploy the application, gather call data, listen to calls, and then adapt and tune the application to your callers, not your callers to the application.
Test with real people, not people who are trying to break it or those who are intimately familiar with the call flow. Create a repeatable tuning cycle every week or every month and follow through. The more you learn about how real callers are reacting to your system, the better the application will become. Learn more about Speech Tuning Strategies and tools that will help you through the tuning process.
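As a rough illustration of what one pass of a tuning cycle might compute, the Python sketch below summarizes call data per prompt. The log format, prompt names, and sample utterances are all hypothetical; real tuning tools work from recorded audio and recognizer logs, not a hand-written list.

```python
from collections import Counter

# Hypothetical call-log records: (prompt name, what the caller said, recognized?)
CALL_LOG = [
    ("main_menu", "order an umbrella", True),
    ("main_menu", "buy umbrella", False),
    ("main_menu", "store hours", True),
    ("color", "blue", True),
    ("color", "navy", False),
    ("color", "navy", False),
]


def tuning_report(log):
    """Return per-prompt recognition rate and the most common unrecognized phrases."""
    totals, hits, misses = Counter(), Counter(), {}
    for prompt, said, recognized in log:
        totals[prompt] += 1
        if recognized:
            hits[prompt] += 1
        else:
            misses.setdefault(prompt, Counter())[said] += 1
    return {
        prompt: {
            "rate": hits[prompt] / totals[prompt],
            "top_misses": misses.get(prompt, Counter()).most_common(3),
        }
        for prompt in totals
    }
```

Phrases that show up repeatedly among the misses are candidates for adding to the grammar or for rewording the prompt, which is the "adapt the application to the caller" step described above.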
In this paper we covered the basic considerations when planning to port your application from DTMF to speech.
Call LumenVox today at 1-877-977-0707 to discuss your project and learn about ways we can partner to make your speech application a success.