Simple and natural: use your voice
Grammars represent the core of voice-based applications
When generating and testing grammars, one of the most delicate steps, and one of the most expensive without the appropriate tools, is the tuning phase, which takes place before or after the grammar goes into production.
Consider these two cases:
- grammar in the release phase: in the tests section, every time we test the grammar with our own voice we can save the audio files (dump files) and use them afterwards.
- grammar in the production phase: here too we can capture the file containing the user's speech (dump file), save it and reuse it.
Without the chance to save the dump files and examine them in detail, it is impossible to determine the exact source of a recognition error.
The available data, for example in a log file, are:
- the utterance: what the user has said;
- the recognised value: the output the system returned for that specific utterance.
These two pieces of data are not always sufficient to determine the kind of recognition error. For example, it could happen that:
- the user speaks too early, so the utterance is cut off even though the grammar is formally correct;
- the user pronounces a sentence that has not been included in the grammar, or whose word sequence was not anticipated.
The importance of dump file analysis becomes clear when we modify a grammar: we can test it again with the pool of files saved in the two cases above and immediately evaluate the effect of the changes.
This operation would normally require several hours, but with Grammar Studio the time needed for the analysis is reduced to a few minutes (less than 5 for hundreds of files).
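The re-test idea can be sketched in a few lines of Python. This is only an illustration of the comparison step, not how Grammar Studio works internally: the `DumpResult` record, the file names and the recognised values are all invented for the example, and we assume each dump file has a known expected value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DumpResult:
    """Hypothetical record: one dump file replayed against one grammar version."""
    dump_file: str   # name of the saved audio file
    expected: str    # value the grammar should return for this audio
    recognised: str  # value actually returned ("" means no match)


def accuracy(results):
    """Fraction of dump files whose recognised value equals the expected one."""
    if not results:
        return 0.0
    hits = sum(1 for r in results if r.recognised == r.expected)
    return hits / len(results)


def compare_runs(before, after):
    """Return (fixed, broken): dump files that the grammar change repaired or regressed."""
    ok_before = {r.dump_file for r in before if r.recognised == r.expected}
    ok_after = {r.dump_file for r in after if r.recognised == r.expected}
    return sorted(ok_after - ok_before), sorted(ok_before - ok_after)


# Invented sample data: the same three dump files replayed before and after a grammar edit.
before = [
    DumpResult("call_001.wav", "yes", "yes"),
    DumpResult("call_002.wav", "no", "yes"),
    DumpResult("call_003.wav", "operator", ""),
]
after = [
    DumpResult("call_001.wav", "yes", "yes"),
    DumpResult("call_002.wav", "no", "no"),
    DumpResult("call_003.wav", "operator", ""),
]

fixed, broken = compare_runs(before, after)
print(f"accuracy before: {accuracy(before):.2f}, after: {accuracy(after):.2f}")
print("fixed:", fixed, "broken:", broken)
```

Keeping the list of regressed files separate from the overall accuracy matters: a change can raise the overall score while still breaking utterances that used to work.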
One of the features deriving from the integration with MultiModalBerry is the advanced log of all the recognition events.
This log lets you write to an XML file all the details of a single voice recognition event, both when using a voice profiler and when using pre-recorded files from the file profiler.
Logs record various details including:
- information about recognition: utterances, confidence, n-best list, event result;
- information about the kind of recognition: algorithm used (multimodal only);
- tag filing: time, context, dump file.
This information is collected in a Report through the Berry Log-Report, which shows all the events grouped by context, with a high level of information about each context and the details of every single recognition event.
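As a rough illustration of what a per-context summary involves, the sketch below parses a small XML log and aggregates the events by context. The XML layout (`events`, `event`, `utterance`, `result` and their attributes) is an assumption made up for this example; the actual log format and the Berry Log-Report output are not documented here and may differ.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical log layout, invented for this example only.
SAMPLE_LOG = """\
<events>
  <event context="main_menu" dump="a.wav">
    <utterance confidence="0.9">check balance</utterance>
    <result>match</result>
  </event>
  <event context="main_menu" dump="b.wav">
    <utterance confidence="0.5">chek balans</utterance>
    <result>nomatch</result>
  </event>
  <event context="confirm" dump="c.wav">
    <utterance confidence="0.8">yes</utterance>
    <result>match</result>
  </event>
</events>
"""


def summarise_by_context(xml_text):
    """Group recognition events by context and compute simple per-context totals."""
    totals = defaultdict(lambda: {"events": 0, "matches": 0, "confidence_sum": 0.0})
    for event in ET.fromstring(xml_text).iter("event"):
        entry = totals[event.get("context", "unknown")]
        entry["events"] += 1
        if event.findtext("result") == "match":
            entry["matches"] += 1
        utterance = event.find("utterance")
        if utterance is not None:
            entry["confidence_sum"] += float(utterance.get("confidence", "0"))
    # Convert the running sums into a mean confidence per context.
    return {
        context: {
            "events": t["events"],
            "matches": t["matches"],
            "mean_confidence": t["confidence_sum"] / t["events"],
        }
        for context, t in totals.items()
    }


report = summarise_by_context(SAMPLE_LOG)
for context, stats in report.items():
    print(context, stats)
```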
In this case too, as shown for the dump files, it is possible to: