Video Transcription
Distributed Architecture
What & Why
- Distributed architecture means taking a single computing task and distributing it among
multiple machines.
- The primary reason for distributed architecture is redundancy. Let's say you have a
critical system, such as a PBX, which handles your incoming calls. If the power supply
in the PBX machine catches fire, you could lose all of your calls. In a distributed
environment, when one machine fails, the other would pick up the slack.
- Load balancing is another good reason for distributed architecture. This allows you
to split up tasks among multiple machines. If any one machine becomes overburdened, the
others help out. The load is evenly distributed so no single computer becomes bogged
down.
- Distributed architecture is very scalable, which means it's easy to add more machines.
So as the need grows, as you do more business, as traffic grows, you simply plug in more
machines to the cluster of machines and the increased traffic is handled.
Client-Server Model
The LumenVox Speech Engine works well in distributed environments because internally we
use a client-server model. When the LumenVox Speech Engine is installed, the application
has two parts.
- The first part is the speech client. This is the part that talks to your application.
You may have a platform that runs your IVRs, or a PBX, or any other application that
communicates with the Speech Engine using the speech client. This communication could be
by way of MRCP or an API integration. Your application sends audio and grammars, and the
LumenVox Speech Engine sends back decodes with the actual text of what a user said.
- Internally, the speech client takes the speech and passes this data to a speech
server. The server application is what actually performs the decode. The client and the
server speak to each other using standard TCP/IP port connections. This matters because
an API integration, which connects your platform to the LumenVox engine, cannot traverse
a network; the API is used on the local machine only. The LumenVox client and server, by
contrast, speak to each other using TCP/IP, which allows communication over a network.
- We can have the speech client and the speech server on the same machine if we choose
to, but we can also put the speech server on a separate machine and have the client and
server still able to talk to one another. The advantage of doing this is that it allows
us to balance the load. Speech recognition can be very resource intensive: it can require
a good deal of memory and use many CPUs. You may not want this all happening on your PBX
or IVR server, so you could add just the client to the PBX or IVR server and have the
speech server perform decodes on a separate machine, which could be dedicated to speech
recognition if you choose.
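The client-server split described above can be illustrated with a minimal TCP sketch. This is not the LumenVox protocol: the newline-terminated message format and the names `DecodeHandler`, `start_speech_server`, and `request_decode` are all assumptions made for illustration, and the "decode" is a stub that only reports how many bytes it received.

```python
import socket
import socketserver
import threading


class DecodeHandler(socketserver.StreamRequestHandler):
    """Stand-in for a speech server: reads audio bytes, replies with text."""

    def handle(self):
        # Hypothetical wire format: one newline-terminated request per connection.
        audio = self.rfile.readline().strip()
        # A real server would run speech recognition here; this stub does not.
        text = f"decoded {len(audio)} bytes"
        self.wfile.write(text.encode() + b"\n")


def start_speech_server(host="127.0.0.1", port=0):
    """Start the stub server on a background thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), DecodeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_decode(address, audio: bytes) -> str:
    """Stand-in for the speech client: send audio over TCP, read back the text."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(audio + b"\n")
        return conn.makefile("rb").readline().decode().strip()
```

Because the client only needs a host and port, the same `request_decode` call works whether the server runs on the same machine or on a dedicated one elsewhere on the network.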
Server Monitor
- For all of this to work, the speech client contains a small routine called the server
monitor. Your client application supplies the monitor with the locations of all of the
speech servers so that it can routinely check each of them. Say you have three speech
servers, A, B, and C: the client is set up on the PBX and the three servers are
elsewhere. The server monitor will constantly be communicating with all three servers.
If B suddenly becomes disabled, the server monitor will realize this and will no longer
send decodes to server B; it will simply use A and C only. However, if server B comes
back online, the monitor will see this and will begin sending decodes to it once again.
This is good for redundancy and failover.
- The server monitor also keeps track of how busy the servers are. So if server A
suddenly becomes very busy, the request for a decode would be sent to B or C. New decode
requests are sent to the server that is least busy. So LumenVox is automatically
performing a type of load balancing if you have multiple servers set up.
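The failover and least-busy behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the actual server monitor: the class names, the `report` method, and the use of an active-decode count as the measure of "busy" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServerStatus:
    name: str
    online: bool = True
    active_decodes: int = 0  # hypothetical measure of how busy a server is


class ServerMonitor:
    """Tracks the status of each speech server and picks one for a new decode."""

    def __init__(self, names):
        self.servers = {name: ServerStatus(name) for name in names}

    def report(self, name, online, active_decodes=0):
        """Record the latest health-check result for one server."""
        status = self.servers[name]
        status.online = online
        status.active_decodes = active_decodes

    def pick_server(self):
        """Return the least busy online server, or None if all are offline."""
        candidates = [s for s in self.servers.values() if s.online]
        if not candidates:
            return None
        return min(candidates, key=lambda s: s.active_decodes).name
```

With servers A, B, and C, marking B offline through `report("B", False)` removes it from consideration until a later report marks it online again.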
Licensing
- The final piece that allows LumenVox to work in a distributed environment is our
licensing, which is very convenient for this purpose. LumenVox is not concerned with how
many installations of the engine and the client you have, at least from a licensing
perspective. We don't have per-seat or per-server types of licenses. The LumenVox
licenses are all per decode: a license is used when a speech port is opened to perform a
decode. You can install the speech engine and client on as many machines as you want, but
they will not take up a license until decodes are done. When the client needs a license
during a decode, it asks a license server. Licenses can be installed on the same machine
as the client or on a separate machine. So you could have ten separate speech clients
communicating with twenty separate speech servers, all speaking to one license server
located on yet another machine.
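A per-decode license pool can be sketched like this. It is only an illustration: `LicenseServer` and `decode_license` are hypothetical names, and the sketch assumes that a license is returned to the pool as soon as the decode finishes, which the transcript does not spell out.

```python
import threading
from contextlib import contextmanager


class LicenseServer:
    """Hypothetical in-process stand-in for a shared license server."""

    def __init__(self, total):
        self.total = total
        self.in_use = 0
        self._lock = threading.Lock()

    def acquire(self):
        """Take one license if any are free; return True on success."""
        with self._lock:
            if self.in_use >= self.total:
                return False
            self.in_use += 1
            return True

    def release(self):
        """Return one license to the pool."""
        with self._lock:
            self.in_use -= 1


@contextmanager
def decode_license(license_server):
    """Hold a license only for the duration of a single decode."""
    if not license_server.acquire():
        raise RuntimeError("no license available")
    try:
        yield
    finally:
        # Assumption: the license is freed when the decode ends.
        license_server.release()
```

Installing more clients or servers changes nothing in this model; only the number of simultaneous `with decode_license(...)` blocks is limited by the license count.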
The following is a graphic representation of the above:
In this graphic, one of our customers, Ontelnet, has two PBX servers that take incoming
SIP traffic. The servers are set up in a cluster and perform load balancing. Since the
PBX is integrated with LumenVox, our client application is on the systems that the PBXs
are running on. However, the PBXs are very busy, and the customer does not want to do
speech decodes on the PBX servers because they are handling so many calls. What they have
chosen to do is move the speech decodes onto different servers. As you can see, there are
multiple speech servers set up, so when there is a request for speech recognition, that
request goes off to the most appropriate speech server because, as we discussed, the
server monitor on each client has a list of all the speech servers and their status. Also
note that we have a license server that all the clients communicate with, and a backup
license server that can be used if the primary license server fails for whatever reason.
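The deployment just described can be written down as a small data structure, together with the primary-then-backup choice of license server. The host names, the count of three speech servers, and the `choose_license_server` helper are assumptions for illustration only; the transcript says just that there are two PBX servers, multiple speech servers, and a primary and backup license server.

```python
# Hypothetical host names; only the roles come from the description above.
TOPOLOGY = {
    "clients": ["pbx-1", "pbx-2"],  # speech client runs on each PBX
    "speech_servers": ["speech-1", "speech-2", "speech-3"],
    "license_servers": ["license-primary", "license-backup"],  # priority order
}


def choose_license_server(topology, reachable):
    """Return the first license server, in priority order, that is reachable."""
    for name in topology["license_servers"]:
        if name in reachable:
            return name
    return None
```

If the primary is missing from the set of reachable hosts, the helper falls through to the backup, mirroring the failover described for the license server.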