Low latency AI voice talk in 60 lines of code using faster_whisper and elevenlabs input streaming.
- Published 16 Aug 2023
- Short proof-of-concept code for a real-time AI companion. Note: This demo was recorded on a 10 Mbit/s connection, so actual performance may be better on faster connections.
Project link: github.com/KoljaB/AIVoiceChat
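To illustrate why streaming text into the TTS can cut latency, here is a minimal sketch (not taken from the project): it forwards each sentence as soon as it is complete instead of waiting for the whole reply. The function name `sentence_chunks` and the choice of sentence terminators are assumptions made for illustration only.

```python
def sentence_chunks(token_stream, terminators=".!?"):
    """Yield text as soon as a sentence ends, instead of waiting for the full reply.

    Hypothetical helper: token_stream is any iterable of text fragments,
    e.g. the pieces of a streamed LLM answer.
    """
    buffer = ""
    for token in token_stream:
        buffer += token
        # A sentence is considered complete when the buffer ends with a terminator.
        if buffer.rstrip().endswith(tuple(terminators)):
            yield buffer.strip()
            buffer = ""
    # Flush whatever is left when the stream ends without a terminator.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded chunk could be handed to the speech synthesis step right away, so audio for the first sentence can start while later sentences are still being generated.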
Incredible work!
Found your projects today and I cannot describe in words how impressive this all is. +1!
very impressive work!
Impressive work, thanks
Great work! If you have a strong enough computer, you can run a smaller 13B model with fast TTS for much lower latency.
Incredible! I was working on the same project and had the issue of TTS latency: every cloud TTS service I tried has latency that is too high for real-time purposes. Definitely going to implement your approach. Thanks!
May I also point you to this one which can greatly help with TTS and latency: github.com/KoljaB/RealtimeTTS
It's impressive! Which GPU are you using?
Thank you. I have an RTX 2080 Super.
@Linguflex Thanks for your answer! I have some questions. I've seen your email in the comments, can I email you?
hi Buddy!!
I'm trying this approach but getting an error. I have trained a voice assistant using LangChain and GPT-3.5 Turbo, with the ElevenLabs API and the OpenAI API, but the latency is not going down.
Very nice. Great job ❤
Out of curiosity, how would you handle back-to-back conversation with interruption handling without using space?
Thank you. We talked about how to do solid interruption in my discord channel recently: discord.gg/f556hqRjpv
Highly encourage you to join, it's a great place to ask questions, share progress and get support from tech enthusiasts. Would love to see you there!
Wow, this project is insane. Is it possible to swap OpenAI for a local LLM to get a 100% offline voice assistant?
A local LLM is not the problem. Local TTS is much harder. There is only Tortoise or Bark afaik and they are not comparable to Elevenlabs quality sadly.
Please make a tutorial video for installing ai, I tried following the guide but I couldn't do it.
Do you have Python installed? Mail me your install problems or send a screenshot to lonligrin@gmail.com and I will help as well as I can.
It's basically: copy the files, enter your API keys there, open a cmd shell as admin, and run "pip install openai elevenlabs pyaudio wave keyboard faster_whisper numpy torch" there. After that, run "python voice_talk_vad.py" or "python voice_talk.py".
Not sure how to do a good tutorial video...
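If the install still fails, a quick check like the following can show which packages Python cannot find. This is only a sketch: it assumes each package from the pip command is imported under the same name, and `find_missing` is a made-up helper, not part of the project.

```python
import importlib.util

# Import names assumed to match the packages from the pip command.
# "wave" also ships with Python's standard library.
REQUIRED_MODULES = [
    "openai", "elevenlabs", "pyaudio", "wave",
    "keyboard", "faster_whisper", "numpy", "torch",
]

def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Save it as a .py file and run it from the same shell you use to start voice_talk.py, so it checks the same Python installation.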
@Linguflex wow you're really nice mate
Hey brother! When I run your program it shows a rate limit error. Btw I am using the free tier of OpenAI.
Either the ElevenLabs or the OpenAI API ran into a rate limit. Check the characters used in ElevenLabs and the limits in your OpenAI account settings.
@Linguflex it says the OpenAI limit was exceeded.
I am using the free tier of OpenAI. Is the free tier enough for this program to run, or must I upgrade to the paid tier?
You need a paid account; it needs an OpenAI API key.
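For short-lived rate limits on an account that does have quota, retrying with a growing delay sometimes helps. Below is a rough sketch of that idea; `RateLimited` and `call_with_backoff` are hypothetical names, and you would have to catch whatever error your API client actually raises. It will not help if the account has no quota at all.

```python
import time

class RateLimited(Exception):
    """Hypothetical stand-in for the rate-limit error raised by an API client."""

def call_with_backoff(request, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call request(); on RateLimited wait base_delay * 2**attempt, then retry.

    Re-raises the error once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise
            # Delay doubles after every failed attempt: 1s, 2s, 4s, ...
            sleep(base_delay * 2 ** attempt)
```

You would wrap the API call in a small function and pass that function as `request`.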
Actually, you want about 100 ms of delay at the very least. We're human and take time to process information, and it would just seem unnatural to have a conversation where you felt like someone was finishing your sentences for you all the time.