Using Groq / Mistral AI instead of OpenAI will greatly reduce the latency issue you have in your demo.
can you fine tune groq?
Great suggestion, we will explore this in the next tutorial. This one was meant to be as accessible as possible so that people could build quickly.
@logannon No, it's not possible to fine-tune on Groq; that's the problem. You have to use RAG instead of fine-tuning. But if you want to make a chatbot for a specific domain, you should try another service.
How much does Mistral API cost?
How is it from scratch? You are using three APIs. Also, the AssemblyAI API doesn't transcribe live audio streams without setting up billing; you have to put a minimum of 10 dollars in for that too. I don't have a problem with that. But I do have a problem with you not telling us this in advance, at the start of the video. You never actually mention it anywhere in the video. The program doesn't respond after the introduction, and that's how you find out the problem is billing, not from the video. That was quite annoying, to be honest. A potentially great video ruined by lack of transparency.
If you wish to make an apple pie from scratch, you must first invent the universe - Carl Sagan
I guess they have "Start building with the $50 free credit!" policy now
amazing lady and also an engineer omg)) thank you a million, I'll just add this to my stack
Two questions: How can we improve the latency between the patient's response and the AI voice reply? and What can be done for the AI Voice to account for patient input if the patient speaks while the AI voice is speaking?
Hi Jeffrey, two very good questions! These deserve a video of their own, to be honest. To improve latency, one thing you could try is running the LLM locally so you get faster inference than calling OpenAI's API. As for handling overlapping speech, I've written the program to stop listening while the AI voice is responding. But what you could do is run another thread that keeps listening while the AI voice is speaking.
As for the latency, I was assuming the majority of the latency was actually coming from ElevenLabs? And likely also from whatever functions might be needed to actually check the availability of the dentist and then also to schedule the actual appointment in the end. Am I wrong?
So yeah I think running the LLM locally will surely help, or using Groq, but I'm not convinced yet that that is the biggest bottleneck.
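One way to settle which stage is the biggest bottleneck is to time each stage of a turn separately. A minimal sketch, assuming the pipeline can be split into an LLM call and a text-to-speech call; `fake_llm` and `fake_tts` are hypothetical stand-ins that only sleep, and would be replaced with the real calls:

```python
import time
from typing import Callable, Dict


def timed(label: str, timings: Dict[str, float], fn: Callable, *args, **kwargs):
    """Run fn, record its wall-clock duration under label, and return its result."""
    start = time.perf_counter()
    try:
        return fn(*args, **kwargs)
    finally:
        timings[label] = time.perf_counter() - start


# Hypothetical stand-ins for the real LLM and text-to-speech calls.
def fake_llm(prompt: str) -> str:
    time.sleep(0.05)
    return "You are booked for 3 pm."


def fake_tts(text: str) -> bytes:
    time.sleep(0.10)
    return text.encode("utf-8")


def run_turn(prompt: str) -> Dict[str, float]:
    """Run one conversational turn and return seconds spent per stage."""
    timings: Dict[str, float] = {}
    reply = timed("llm", timings, fake_llm, prompt)
    timed("tts", timings, fake_tts, reply)
    return timings


if __name__ == "__main__":
    for stage, seconds in run_turn("I need a dentist appointment.").items():
        print(f"{stage}: {seconds:.3f}s")
```

Wrapping any calendar lookup or booking function in `timed` the same way would show whether those calls matter more than the LLM or the voice synthesis.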
Great job, Smitha... awesome video!
Thank you! Such a useful application and well explained ❤
Exactly what I was intending on making. Thanks!
Hey there, are you learning to create generative AI models?
If yes, please reply. I have a project for you.
By creating this project you can practice how to create an AI model, and you can include it in your resume for your job search. It will also be helpful for me.
@MuskaanKhan.31 I am interested in a project! I just need the required data and the objective you have in mind.
Would you consider adding a web UI like Gradio to this app so that we can send the demo to anyone if needed? This version only works if you run the actual code in your own environment.
The program does not respond after the initial introduction shown in the video, even when using the GitHub code. Is there an alternative with a step-by-step instruction video?
I am getting the error "Cannot find reference 'generate' in '__init__.py'" on the line "from elevenlabs import generate, stream". Can you please help me resolve this issue?
The tutorial depends on a specific ElevenLabs library version: elevenlabs==0.3.0b0
Hi, nice tutorial. I have coded a real-time voice bot for phone conversations in Twilio.
The latency comes mostly from text-to-speech and from GPT response time.
I'm guessing that if the speed of either one could be improved by about 2-3x, the response time would be fast enough. In human conversation, we expect a response within 1 second, and anything above that seems unnatural. I'm sure the speed issues will be solved with new Nvidia GPUs or other hardware innovations.
Hi, can you please let me know how you developed the voice bot using Twilio? I'm looking for that kind of bot too. It would be helpful.
@rammohanbethi Hi, how can I let you know? It's a lot of complicated server-side code in Node.js and some Python...
The setup is too complex to explain in a comment. We make this as part of AI automation services for businesses.
@randotkatsenko5157 Bro is speaking as if he coded all the voice bot logic. Bruh, you're just hitting APIs, that ain't complex...
But I still have a problem. It says [from elevenlabs import generate, stream
ImportError: cannot import name 'generate' from 'elevenlabs']. How come?
I have the exact same error. Did you fix it?
Change your Python version to 3.10 or later.
The tutorial depends on a specific ElevenLabs library version: elevenlabs==0.3.0b0
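A quick way to confirm which release is actually installed before debugging the import further. This is a sketch using only the standard library; the pinned version string comes from the reply above, and `check_pin` is a hypothetical helper name:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pin(package: str, expected: str) -> str:
    """Describe whether the installed version matches the expected pin."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed; try: pip install {package}=={expected}"
    if found != expected:
        return f"{package} {found} is installed, but the tutorial pins {expected}"
    return f"{package} {found} matches the pinned version"


if __name__ == "__main__":
    print(check_pin("elevenlabs", "0.3.0b0"))
```

Run it with the same interpreter that runs the tutorial code; an IDE such as PyCharm may be using a different virtual environment than the terminal where pip was run.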
Please make a tutorial on using the LLaVA vision model to analyze live video with cv2.
Also, I am unable to get my API token from the AssemblyAI website. Please fix it.
Thanks. First time I hear of AssemblyAI. Everyone talks about faster_whisper and Deepgram. Is AssemblyAI better for STT?
No, it's not.
On my laptop, brew is not installing, and the program is not working either.
I followed this tutorial and then at the end I realized that AssemblyAI doesn't support Japanese in the live RealtimeTranscriber. Which sucks... lol, can't use it. Any help? @assemblyAI
Hi there, I was just looking at the code. Where are the appointment-setting details coming from?
All that is coming from the LLM we are using, so it's not hard-coded.
how would you handle interruptions while the ai is talking?
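One approach suggested elsewhere in this thread is a second thread that keeps listening while the AI voice is speaking. A minimal sketch of that idea, assuming playback happens in small chunks and transcripts arrive on a queue; the microphone and speaker are simulated here, and all function names are hypothetical:

```python
import queue
import threading
import time
from typing import List, Tuple


def speak(chunks: List[str], stop_event: threading.Event, spoken: List[str]) -> bool:
    """Play chunks in order; stop early if stop_event is set. Returns True if finished."""
    for chunk in chunks:
        if stop_event.is_set():
            return False
        spoken.append(chunk)  # stand-in for sending audio to the speaker
        time.sleep(0.01)
    return True


def listen(mic: "queue.Queue", stop_event: threading.Event) -> None:
    """Background listener: the first non-empty transcript interrupts playback."""
    while not stop_event.is_set():
        try:
            text = mic.get(timeout=0.05)
        except queue.Empty:
            continue
        if text is None:  # sentinel: playback finished, stop listening
            return
        if text.strip():
            stop_event.set()


def speak_with_barge_in(chunks: List[str], mic: "queue.Queue") -> Tuple[bool, List[str]]:
    """Speak while a listener thread watches the mic queue for an interruption."""
    stop_event = threading.Event()
    spoken: List[str] = []
    listener = threading.Thread(target=listen, args=(mic, stop_event), daemon=True)
    listener.start()
    finished = speak(chunks, stop_event, spoken)
    mic.put(None)  # unblock the listener if it is still waiting
    listener.join(timeout=1)
    return finished, spoken
```

With a real microphone, the AI's own voice is likely to be picked up as input, so some form of echo cancellation or a headset is probably needed for this to work well.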
why not chunk text and output instead of output after all text is generated?
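A sketch of the chunking idea: buffer the streamed LLM output and hand each complete sentence to text-to-speech as soon as it ends, instead of waiting for the full reply. It assumes the LLM yields text fragments as an iterable; the sentence splitter is a simple heuristic, not what the tutorial uses:

```python
import re
from typing import Iterable, Iterator

# Split after ., ! or ? when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text fragments."""
    buffer = ""
    for token in tokens:
        buffer += token
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    stream = ["Hello", " there.", " How", " can I", " help?", " Bye"]
    for sentence in sentences_from_stream(stream):
        print(sentence)  # each sentence could be sent to text-to-speech here
```

The heuristic also splits after abbreviations such as "Dr." when they are followed by a space, so a production version would need a more careful splitter.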
How can I integrate this into actual phone calling? Through Twilio, I assume, but how would I incorporate it?
How do you grab the audio once the RealtimeTranscript has finalized? For example, could you pass the audio rather than the text to generate_ai_response?
Yeah but it doesn't actually check the time or book anything?
Can it be integrated into an Arduino board?
Hi, thanks for your video. I want an API for real-time conversation with Python in Farsi. Does the LLM support Farsi?
How do I install this? brew is not available for Windows. Is there another option for Windows?
❤❤❤❤❤so wonderful project
Any way to make one with the Adam voice, like the one in ElevenLabs? 😊
This video is so great! I'm following along, but I ran into a problem: I can install the package in PyCharm on Windows, but I get this error: OSError: Cannot find mpv-1.dll, mpv-2.dll or libmpv-2.dll in your system %PATH%. I'm a researcher in the art field with only beginner-level Python knowledge. Could you help me solve this problem? Thanks a lot!
How can I connect this to my phone number and Google Calendar? 🙏🏼
You can make use of the Google API for google calendar and something like Twilio's API for making phone calls.
For some reason, the microphone isn't picking up my voice. I enabled all permissions on my mac and am still having trouble. Is there any way to fix this?
I think you need to pay AssemblyAI for real-time transcription for this.
Streaming from AssemblyAI is a paid service, so first you need to add a balance to your account if you have not done that yet. Hope that helps :)
How to install mpv on windows?
The only downside is the fact it takes a while to respond with voice.
I can't run the commands on Windows. Can you do a video for Windows users, or say what the Windows commands are for these?
brew install portaudio
pip install "assemblyai[extras]"
pip install elevenlabs==0.3.0b0
brew install mpv
pip install --upgrade openai
We will look into making a Windows version in the future!
Hello ma'am, can you build an AI model for me for this?
Source code Not Available
I am facing the mpv ValueError on Windows. I have already installed it many times. How can I fix that?
Just use VLC instead of mpv, bro.
@sethuraman9884 Thank you, guys.
Or check that mpv is on your PATH environment variable. When you run mpv --version in cmd, you should see it respond.
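The same PATH check can be done from Python, using the interpreter that runs the tutorial code. A minimal sketch with the standard library; `find_player` is a hypothetical helper, and including VLC as a fallback candidate only reflects the suggestion above, not something the tutorial code is known to support:

```python
import shutil
from typing import Optional, Sequence


def find_player(candidates: Sequence[str] = ("mpv", "vlc")) -> Optional[str]:
    """Return the full path of the first candidate executable found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    found = find_player()
    if found:
        print(f"Found a player at: {found}")
    else:
        print("No player found on PATH; add the mpv install folder to PATH and restart the terminal or IDE.")
```

If this prints a path in a terminal but not inside the IDE, the IDE was probably started before PATH was changed and needs a restart.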
An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)
What should I do about this error?
Same error. Did you find a solution?
Most likely your microphone is switched off, please check.
I've got the same:
An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)
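CERTIFICATE_VERIFY_FAILED with "unable to get local issuer certificate" usually means Python could not find a root certificate bundle, which is unrelated to the microphone. On python.org installs for macOS, running the bundled "Install Certificates.command" is the usual fix. The sketch below only inspects where Python looks for certificates and, as an assumption that may or may not resolve this particular error, points `SSL_CERT_FILE` at the `certifi` bundle if that package is installed:

```python
import os
import ssl
from typing import Dict, Optional


def describe_ca_setup() -> Dict[str, Optional[str]]:
    """Report where this Python looks for trusted root certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
    }


def use_certifi_bundle() -> bool:
    """Point SSL_CERT_FILE at certifi's bundle; returns False if certifi is missing."""
    try:
        import certifi  # third-party; install with: pip install certifi
    except ImportError:
        return False
    os.environ["SSL_CERT_FILE"] = certifi.where()
    return True


if __name__ == "__main__":
    for key, value in describe_ca_setup().items():
        print(f"{key}: {value}")
    print("certifi bundle in use:", use_certifi_bundle())
```

If `cafile` and `capath` are both empty, no system bundle was found. `use_certifi_bundle()` would need to be called before the transcriber opens its connection to have any effect.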
your free api does not work in my project
The AssemblyAI API is not free.
Can you make just a text-to-voice chatbot?
The AssemblyAI API requires a credit card for this task.
nice but the lagging time is too long unfortunately.
why are you using Mac omg
Excellent .
I am very api to have found this
From scratch is misleading as others already commented.
Why are you saying "from scratch" if you're only using APIs?
TOO SLOW !
No, that's not from scratch. I have no money, stop getting my hopes up.
after watching your video, i think i prefer interacting with humans
😂
TOO SLOW!
make it faster