new WER script #2824
Are those Java class files redundant?
Ah yes, I think those Gradle files are redundant; please ignore them. Only use the files in wer_testing.
test.py
This file is not used.
Fixed in the new script.
I have also added a preview to the CLI, which is updated on every iteration of the audio loop.
Downloadable datasets are huge, with several hundred hours of audio. I think this work item, however, is meant to create lightweight WER tests that can be integrated into GitHub workflows. If a tiny dataset is included in the repo, WER benchmarking will work out of the box and on the fly. A selected tiny dataset can also cover English and non-English speech, both clean and noisy, which would be handy. For example, if some sort of noise cancellation is added, noisy audio files can be added to the dataset and benchmarked easily and quickly. Anyway, I was the one who suggested adding audio files to this repo. My apologies to @harvestingmoon.
I tried short audio inputs from the Google Speech Commands dataset, whose audio files are approximately 1 s each. However, whisper.cpp is unable to capture any words at all from them (I believe because the clips are just too short), which makes it difficult to calculate WER. Hence, I switched to the Hi-Fi TTS dataset. No worries @foldl! I am glad to help and contribute 😄 I can continue developing the script gradually if given the green light 👍🏼
WER testing is based on speaker 6097 of the Hi-Fi TTS dataset. The included audio is ~10 MB and consists of dozens of short clips of about 10 seconds each. WER_Scripting.py then calculates the WER via dynamic programming (DP).
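For readers unfamiliar with the DP approach mentioned above, here is a minimal sketch of how WER can be computed as a word-level edit distance. It is not the code in WER_Scripting.py; the function name `word_error_rate` and the normalization (lowercasing and whitespace splitting) are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Hypothetical helper: WER = word-level edit distance / reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (rolling single-row DP table).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion (reference word missing)
                curr[j - 1] + 1,                  # insertion (extra hypothesis word)
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

In this sketch an empty hypothesis scores 1.0, because every reference word counts as a deletion. That is the score a clip would get if the transcriber returned no words at all.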