RealtimeSTT: A low-latency speech-to-text library with advanced voice activity detection
- Published 7 Sep 2024
- github.com/Kol...
Features wake word activation and instant transcription. Designed for real-time applications like voice assistants.
Thank you for showing us your library in action as well as letting us know how we can support it!
Nice one! I look forward to trying this out
This is awesome! Thanks
That's incredibly accurate. Nice work! Can you transcribe continuously AND listen for a wake word for commands? It'd be great if you could have it always listening and then do something on the wake word.
No, currently not. The idea is good, I can see some use-cases for this. I'll think about that.
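Until something like that exists in the library, one workaround is to keep transcribing and scan the resulting text for a wake word. This is only a rough sketch of that idea and is not RealtimeSTT functionality: the wake word "jarvis" and the helper `extract_command` are hypothetical names made up for illustration, and the function works on plain strings from any transcription source.

```python
import re

# Hypothetical wake word, chosen only for this example.
WAKE_WORD = "jarvis"


def extract_command(transcript, wake_word=WAKE_WORD):
    """Return the text spoken after the first wake word, or None.

    The match is case-insensitive and only accepts the wake word as a
    whole word, so "jarvisx" does not trigger it.
    """
    pattern = rf"\b{re.escape(wake_word)}\b[\s,.:!?]*(.*)"
    match = re.search(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    command = match.group(1).strip()
    # A wake word with nothing after it is not a command yet.
    return command or None
```

Each finished sentence from the transcriber could be passed through `extract_command`; a non-`None` result is the command to act on, while everything else stays ordinary dictation.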
I'm gonna try to make a VRChat STT app that puts the words above my head using their OSC system :D
Great work. Do you have any ideas for reducing latency in text to speech? I'm working on it.
Is it possible to make the voice dictation instantaneous at the cost of accuracy? I want to try controlling the servos on an animatronic mouth with voice dictation. It doesn't have to be accurate, it just needs to be accurate enough to be convincing and as fast as possible
You probably want to use whisper.cpp with a quantized tiny model and grammar sampling, look up Georgi Gerganov's chess example.
You could also train a wake word model to do this. They are crazy fast and reliable but specialized on a few keywords. Check openWakeWord or Porcupine (pvporcupine).
I don't understand how to use it...
What do you want to do?
The "tests" folder contains some examples of how you can use it:
github.com/KoljaB/RealtimeSTT/tree/master/tests
The "tests" folder of RealtimeTTS may also help; those examples use RealtimeSTT a lot:
github.com/KoljaB/RealtimeTTS/tree/master/tests
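For a first try, something along these lines may be enough. It is a minimal sketch, assuming the package is installed with `pip install RealtimeSTT`, a working microphone is available, and the default settings are acceptable; `transcribe_once` and `main` are helper names introduced here, not part of the library.

```python
def transcribe_once(recorder):
    """Wait for one spoken utterance and return its transcription.

    `recorder` is expected to behave like RealtimeSTT's
    AudioToTextRecorder, whose text() method blocks until speech
    has been recorded and transcribed.
    """
    return recorder.text()


def main():
    # Imported here so the helper above can be used without the
    # library (and a microphone) being present.
    from RealtimeSTT import AudioToTextRecorder

    recorder = AudioToTextRecorder()  # default model and settings
    print("Speak now...")
    print(transcribe_once(recorder))
```

Call `main()` from under an `if __name__ == "__main__":` guard in your own script, since the library starts worker processes. The examples in the "tests" folders show the remaining options, such as wake words and real-time updates.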