If you have not already done so, please see our SRGS Introduction for more information about our SRGS tutorial.
We will begin our look at writing SRGS grammars with a simple grammar that lets the Engine recognize the words "yes" or "no." Yes-or-no grammars are the "hello world" of grammar writing.
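As a reference point for the rest of this section, a complete yes-or-no grammar might look like the sketch below. The rule names $yes, $no, and $yesorno are the ones this tutorial walks through; the header values (UTF-8, en-US, voice, semantics/1.0) are assumptions chosen to mirror the Engine defaults, not a copy of any particular shipped grammar.

```abnf
#ABNF 1.0 UTF-8;

language en-US;
mode voice;
tag-format <semantics/1.0>;
root $yesorno;

$yes = yes;
$no = no;
$yesorno = $yes | $no;
```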
Any SRGS grammar written in ABNF notation must begin with a self-identifying header line such as "#ABNF 1.0 UTF-8;".
This line tells the LumenVox grammar compiler that the file being read is an ABNF grammar, as opposed to an SRGS GrXML grammar. Following the grammar type and version number is an optional declaration that indicates the character encoding, e.g. UTF-8, UTF-16, or ISO-8859-1.
The line ends with a semicolon, as do all header declarations and rule definitions in an ABNF grammar.
Following the identifier, a well-formed grammar will contain information about the language the grammar is written in, the expected interaction mode (voice or DTMF), and the name of the rule where the Engine will begin its search (the root rule). In addition, the header may contain one or more tags, and an identifier describing the tag format for this grammar. Tags will be discussed later in this tutorial.
After the identifier line, the remaining header declarations may appear in any order, but no header data may occur in the file after the first rule is written. The Speech Engine only requires the identifier line in the header; if the interaction mode, language, or tag format is omitted, the Engine uses a default value: voice for the interaction mode, en-US for the language, and semantics/1.0 for the tag format. If no root rule is specified in the header, every rule is treated as a root rule, meaning the grammar is matched if any rule is matched.
It is good practice to always explicitly assign the header information instead of relying on the default values.
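To illustrate, a header that spells out every value instead of leaning on the defaults could be sketched as follows. This is a header fragment only: the values simply restate the defaults described above, the root rule name $yesorno is taken from this tutorial's example, and the order of the declarations after the identifier line is arbitrary.

```abnf
#ABNF 1.0 UTF-8;

// Explicit header: each declaration restates the documented default.
mode voice;                   // interaction mode: voice or dtmf
language en-US;               // language of the words in the grammar
tag-format <semantics/1.0>;   // format used by any tags in the grammar
root $yesorno;                // rule where the Engine begins its search

// Rule definitions, including $yesorno, follow the header.
```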
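ABNF grammars may contain comments anywhere in their body (with the exception of the first line, which contains the grammar identifier). The comment format is the same one used by the C, C++, and Java programming languages: // starts a comment that runs to the end of the line, and /* ... */ encloses a comment that may span several lines.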
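A short sketch of these comment styles, reusing the yes-or-no rules from this tutorial, is shown below. The comment text is purely illustrative, and the header values are assumptions; the point is that the identifier line comes first and stays comment-free, while comments may appear after it.

```abnf
#ABNF 1.0 UTF-8;

// A single-line comment runs to the end of the line.
language en-US;

/* A block comment
   can span several lines. */
mode voice;
root $yesorno;

/** A documentation-style comment placed just before a rule. */
$yes = yes;
$no = no;          // a comment may also follow a statement
$yesorno = $yes | $no;
```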
A grammar's rules specify which words and combinations of words the Engine can recognize. They are the heart of the grammar. Each rule has a name, appearing on the left side of an = sign, and a rule expansion, appearing on the right side.
A rule name is written with a leading $ character. Immediately after the $ comes the name itself, which must start with a letter and may be followed by additional letters, numbers, or underscore characters. The first rule in our yes-or-no grammar is the $yes rule, written as "$yes = yes;".
The rule expansion describes to the Engine what sequences of words will allow a rule to be matched. In the above rule, the expansion consists entirely of the word "yes," and thus the rule is matched if the word "yes" is spoken. An entire grammar is matched only if its root rule is matched.
The second rule, $no, is matched if the word "no" is detected. The third rule, $yesorno, contains a pipe symbol (the | character), which is a logical "or" operator. So the third rule is matched if either the $yes rule or the $no rule is matched.
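Taken together, the three rules described above can be sketched as the fragment below. It omits the header, and the comments are added here only for illustration.

```abnf
// Matched when the word "yes" is spoken.
$yes = yes;

// Matched when the word "no" is spoken.
$no = no;

// Matched when either $yes or $no is matched; | means "or".
$yesorno = $yes | $no;
```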
When the Engine begins decoding your audio, it starts at the root rule of the grammar (in this case the rule $yesorno). It then steps through all legal expansions. It moves into the rules $yes and $no, since it can match against either rule. Since the first words in the rules $yes and $no are "yes" and "no," the Engine knows that it is allowed to recognize either word.
If the Engine detects "yes" as a possibility, it then looks for the next word it can recognize in the $yes rule. Since there are no more words in the $yes rule, the rule is matched. And since the $yes rule is matched, the $yesorno root rule is matched, so the entire grammar is matched.
Building more complex rules via rule expansions is the next step in our SRGS tutorial.