Seeking Guidance on Custom Urdu ASR Training Data and Vocabulary Expansion #4900

Shaukataliii · 2024-01-03T18:36:04Z

Hello,
I am a developer building an Urdu Automatic Speech Recognition (ASR) system with the Kaldi ASR toolkit. I have run into two specific challenges and would greatly appreciate your insights.

Challenges

Acquiring Transcriptions for Custom Urdu Dataset:
- Issue: Obtaining accurate transcriptions for a substantial custom Urdu language dataset, tailored for industry-specific use, has proven challenging.
- Request: Seeking guidance or suggestions on cost-effective solutions or resources that could assist in obtaining accurate transcriptions.

Optimizing Kaldi ASR for Recognizing Unseen Words:
- Issue: We aim to optimize the Kaldi ASR model to efficiently recognize new words it may encounter during inference, especially industry-specific jargon.
- Request: Looking for insights or recommendations on approaches to handle previously unseen words and enhance the model's adaptability.

Thank you for your time and consideration.
