add confidence key to the MbrResult method in Kaldi Recognizer #563

Open: wants to merge 3 commits into base: master

saisyam commented May 30, 2021

Our IVR system is built on Freeswitch + Rayo + Adhearsion. We use Unimrcp with vosk-api and Kaldi for speech-to-text in our on-premises solution. We noticed that the result returned by the MbrResult method does not include an overall confidence for the recognized utterance. We need a confidence key in the result so that the Adhearsion application can make a decision based on it. This PR adds that key.

@@ -423,6 +424,7 @@ const char *KaldiRecognizer::MbrResult(CompactLattice &clat)
word["start"] = samples_round_start_ / sample_frequency_ + (frame_offset_ + times[i].first) * 0.03;
word["end"] = samples_round_start_ / sample_frequency_ + (frame_offset_ + times[i].second) * 0.03;
word["conf"] = conf[i];
confidence += conf[i];
Contributor

I don't believe it should work this way. Assuming a confidence is a value in the [0, 1] range, you can't just sum all the values to get an overall confidence: the sum can exceed 1 and grows with the number of words.
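
To illustrate the objection, here is a minimal sketch. The helper names `SumConfidence` and `MeanConfidence` are hypothetical and not part of vosk-api, and the input values are made up. It shows that a plain sum leaves the [0, 1] range once an utterance has more than one confident word, whereas dividing by the word count keeps the result in range.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helpers for illustration only; not part of vosk-api.

// Plain sum of per-word confidences: grows with the number of words.
double SumConfidence(const std::vector<double> &conf) {
    return std::accumulate(conf.begin(), conf.end(), 0.0);
}

// Sum divided by the word count: stays in [0, 1] when every input does.
// Returns 0.0 for an empty utterance (an arbitrary choice for this sketch).
double MeanConfidence(const std::vector<double> &conf) {
    if (conf.empty()) return 0.0;
    return SumConfidence(conf) / static_cast<double>(conf.size());
}
```

For three words with confidences 0.9, 0.8 and 0.7, the sum is well above 1, while the mean is roughly 0.8.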

Author

Fixed the overall confidence.

Collaborator

nshmyrev commented Jun 6, 2021

I don't think this is the right way to calculate confidence. Why average and not minimum, for example? I need to think more about it.

Author

saisyam commented Jun 10, 2021

We need the confidence value to make a decision within our application. The average gives a better approximation of the overall confidence than the maximum or minimum. We also compare the recognized text against a list of accepted words or sentences using Levenshtein distance, and we use these two values together to make the decision.
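
A decision of the kind described above might look like the following sketch. It is an assumption about how the two values could be combined, not the application's actual logic: `AcceptResult`, its parameters, and the thresholds are hypothetical. The edit distance is the standard dynamic-programming Levenshtein distance, computed over bytes.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance (insert, delete, substitute, each with cost 1),
// computed with two rolling rows of the dynamic-programming table.
std::size_t LevenshteinDistance(const std::string &a, const std::string &b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical decision rule: accept only if the overall confidence reaches
// min_confidence AND the text is within max_distance edits of some phrase.
bool AcceptResult(const std::string &text, double confidence,
                  const std::vector<std::string> &accepted,
                  double min_confidence, std::size_t max_distance) {
    if (confidence < min_confidence) return false;
    for (const std::string &phrase : accepted) {
        if (LevenshteinDistance(text, phrase) <= max_distance) return true;
    }
    return false;
}
```

Both thresholds would need tuning against real call data; the sketch only shows that the overall confidence and the edit distance act as two independent gates.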

3 participants