The hardware resources required by the LumenVox Text-to-Speech Server are slightly lower than those of our Speech Recognizer. Synthesizing speech requires less processing power and memory than decoding it.
Determining appropriate hardware is a matter of knowing how many concurrent syntheses you will be generating, as well as what else the host machine will be responsible for. If the application platform, databases or other resource-intensive services run on the same machine, TTS performance may be affected.
For production telephony systems, LumenVox recommends a modern server running a 64-bit operating system with 4-8 CPU cores and 6 GB of RAM.
However, different applications and use cases may have significantly lower or higher requirements. Please contact us for help sizing your specific hardware needs.
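As a rough illustration, a candidate host can be compared against the recommended baseline with a short script. This is a minimal sketch and not a LumenVox tool: the names `meets_baseline` and `detect_host` are hypothetical, the low end of the 4-8 core range is assumed to be the minimum, and the RAM probe relies on `os.sysconf`, which is available on Linux and most Unix-like systems but not on Windows.

```python
import os

# Baseline taken from the recommendation above. Treating the low end of the
# 4-8 core range as a minimum is an assumption of this sketch.
MIN_CPU_CORES = 4
MIN_RAM_GB = 6


def meets_baseline(cpu_cores, ram_gb):
    """Return True if the core count and RAM (in GB) reach the baseline."""
    return cpu_cores >= MIN_CPU_CORES and ram_gb >= MIN_RAM_GB


def detect_host():
    """Return (logical_cpu_count, ram_gb); ram_gb is None if it cannot be read."""
    # os.cpu_count() reports logical CPUs, which may exceed physical cores.
    cores = os.cpu_count() or 0
    try:
        # Page size times physical page count gives total RAM in bytes.
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        ram_gb = ram_bytes / (1024 ** 3)
    except (AttributeError, ValueError, OSError):
        # os.sysconf or these names are missing on this platform.
        ram_gb = None
    return cores, ram_gb


if __name__ == "__main__":
    cores, ram_gb = detect_host()
    if ram_gb is None:
        print(f"{cores} logical CPUs detected; RAM could not be determined")
    else:
        verdict = "meets" if meets_baseline(cores, ram_gb) else "is below"
        print(f"{cores} logical CPUs, {ram_gb:.1f} GB RAM: host {verdict} the baseline")
```

The operating system usually reports slightly less memory than is physically installed, so a machine with exactly 6 GB fitted may show as just under the threshold. The check also says nothing about load from other services on the same host or about the number of concurrent syntheses, both of which still need to be sized separately.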