Coding an AI Voice Bot from Scratch: Real-Time Conversation with Python


COMMENTS • 85

  • @NatGreenOnline
    @NatGreenOnline 10 months ago +39

    Using Groq / Mistral AI instead of OpenAI will greatly reduce the latency issue you have in your demo.

    • @logannon
      @logannon 10 months ago

      Can you fine-tune Groq?

    • @AssemblyAI
      @AssemblyAI 10 months ago

      Great suggestion, we will explore this in the next tutorial. This one was meant to be as accessible as possible so that people could build quickly.

    • @조바이든-r6r
      @조바이든-r6r 9 months ago

      @logannon No, it's impossible to fine-tune Groq; that's the problem. You have to use RAG instead of fine-tuning. But if you want to make a chatbot for a specific domain, you should try another service.

    • @TrilioniME
      @TrilioniME 6 months ago

      How much does Mistral API cost?

  • @fatmayonca1723
    @fatmayonca1723 7 months ago +57

    How is it from scratch? You are using 3 APIs. Also, the AssemblyAI API doesn't transcribe live audio streams without setting up billing; you have to put a minimum of 10 dollars in for that too. I don't have a problem with that, but I do have a problem with you not saying this in advance, at the start of the video. You never actually mention it anywhere in the video. The bot doesn't respond after the introduction, and that's how you find out the problem is billing, not from the video. That was quite annoying, to be honest. A potentially great video ruined by a lack of transparency.

    • @yashmehta9299
      @yashmehta9299 3 months ago +4

      If you wish to make an apple pie from scratch, you must first invent the universe - Carl Sagan

    • @thesohailjafri
      @thesohailjafri 2 months ago

      I guess they have "Start building with the $50 free credit!" policy now

  • @euginekholmogorov5196
    @euginekholmogorov5196 10 months ago +2

    amazing lady and also an engineer omg)) thank you a million, I'll just add this to my stack

  • @JeffreyJohnson-vy1zm
    @JeffreyJohnson-vy1zm 10 months ago +3

    Two questions: How can we improve the latency between the patient's response and the AI voice reply? And what can be done for the AI voice to account for patient input if the patient speaks while the AI voice is speaking?

    • @AssemblyAI
      @AssemblyAI 10 months ago +1

      Hi Jeffrey, two very good questions! These deserve a video of their own, to be honest. To improve latency, one thing you could try is running the LLM locally so you get faster inference than calling OpenAI's API. As for handling overlapping speech, I've written the program to stop listening while the AI voice is responding. What you could do instead is run another thread that keeps listening while the AI voice is speaking.
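
The "another thread that keeps listening" idea could be sketched roughly like this. It is only an illustration under assumptions, not the tutorial's code: `speak`, `listen`, and the `mic` queue are hypothetical stand-ins for the real playback and transcription calls.

```python
import queue
import threading
import time


def speak(chunks, interrupted, delay=0.01):
    """Play chunks one by one; stop early once `interrupted` is set."""
    spoken = []
    for chunk in chunks:
        if interrupted.is_set():
            break              # the user started talking: stop the AI voice
        spoken.append(chunk)   # stand-in for sending audio to the speaker
        time.sleep(delay)      # stand-in for playback time
    return spoken


def listen(mic, interrupted):
    """Runs in its own thread and flags an interruption when speech arrives."""
    while True:
        item = mic.get()
        if item is None:       # sentinel value: stop listening
            return
        interrupted.set()


if __name__ == "__main__":
    mic = queue.Queue()
    interrupted = threading.Event()
    listener = threading.Thread(target=listen, args=(mic, interrupted))
    listener.start()

    # Simulate the user speaking shortly after the AI voice starts.
    threading.Timer(0.12, mic.put, args=("wait, actually...",)).start()
    said = speak([f"chunk {i}" for i in range(10)], interrupted, delay=0.05)
    print(f"spoke {len(said)} of 10 chunks before being interrupted")

    mic.put(None)
    listener.join()
```

In a real bot the listener would be fed by the microphone stream, and the speaking side would also need to stop the audio player itself, which this sketch does not attempt.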

    • @EvertvanBrussel
      @EvertvanBrussel 9 months ago

      As for the latency, I was assuming the majority of the latency was actually coming from ElevenLabs? And likely also from whatever functions might be needed to actually check the availability of the dentist and then also to schedule the actual appointment in the end. Am I wrong?
      So yeah I think running the LLM locally will surely help, or using Groq, but I'm not convinced yet that that is the biggest bottleneck.
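
One way to settle which stage is the bottleneck is to time each stage separately. A minimal sketch, assuming hypothetical stand-in functions (`fake_llm`, `fake_tts`) in place of the real LLM and text-to-speech calls:

```python
import time


def timed(label, func, *args, timings):
    """Run func(*args), record its duration under `label`, return its result."""
    start = time.perf_counter()
    result = func(*args)
    timings[label] = time.perf_counter() - start
    return result


# Hypothetical stand-ins: swap in the real LLM and text-to-speech calls.
def fake_llm(prompt):
    time.sleep(0.001)
    return "We have a slot at 3 pm."


def fake_tts(text):
    time.sleep(0.05)
    return b"audio-bytes"


def profile_turn(prompt, llm=fake_llm, tts=fake_tts):
    """Time one conversational turn and report the slowest stage."""
    timings = {}
    reply = timed("llm", llm, prompt, timings=timings)
    timed("tts", tts, reply, timings=timings)
    slowest = max(timings, key=timings.get)
    return timings, slowest


if __name__ == "__main__":
    timings, slowest = profile_turn("I need an appointment")
    for stage, seconds in timings.items():
        print(f"{stage}: {seconds * 1000:.1f} ms")
    print("slowest stage:", slowest)
```

The numbers from the stand-ins mean nothing; the point is that the same wrapper around the real calls shows where the time goes.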

  • @christiankamguia7076
    @christiankamguia7076 2 months ago

    Great job, Smitha... awesome video

  • @yitaowang8547
    @yitaowang8547 5 months ago

    Thank you! Such a useful application and well explained ❤

  • @thebackpainmiracle
    @thebackpainmiracle 9 months ago

    Exactly what I was intending on making. Thanks!

    • @MuskaanKhan.31
      @MuskaanKhan.31 7 months ago

      Hey there, are you learning to create generative AI models?
      If yes, please reply; I have a project for you.
      By building this project you can practice creating an AI model and include it in your resume for your job search, and it will also be helpful for me.

    • @avataraang3334
      @avataraang3334 6 months ago

      @MuskaanKhan.31 I am interested in a project! I just need the required data and the objective you have in mind.

  • @mehmetbakideniz
    @mehmetbakideniz 6 months ago +2

    Would you consider adding a web UI like Gradio to this app so that we can send the demo to anyone if needed? This version only works if you run the actual code in your own environment.

  • @simonsandeep4977
    @simonsandeep4977 9 months ago +4

    The program is not responding after the first introduction shown in the video, even when using the GitHub code. Is there an alternative with a step-by-step instruction video?

  • @PalashDandge
    @PalashDandge 9 months ago +6

    I am getting the error "Cannot find reference 'generate' in '__init__.py'" on the line "from elevenlabs import generate, stream". Can you please help me resolve this issue?

    • @EngIsraelOlguin
      @EngIsraelOlguin 2 months ago

      The tutorial needs a specific version of the ElevenLabs library: elevenlabs==0.3.0b0
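
A quick way to check whether the installed package matches that pin is to ask Python's package metadata. This is a rough sketch: the pinned version is the one quoted in this thread, and `installed_version` and `pin_status` are hypothetical helper names.

```python
from importlib import metadata

# Version quoted in this thread; other releases are not checked here.
PINNED = "0.3.0b0"


def installed_version(package):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pin_status(package="elevenlabs", pinned=PINNED):
    """Describe how the installed version compares with the pinned one."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed; try: pip install {package}=={pinned}"
    if found != pinned:
        return f"{package} {found} is installed, but the tutorial pins {pinned}"
    return f"{package} {found} matches the pinned version"


if __name__ == "__main__":
    print(pin_status())
```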

  • @randotkatsenko5157
    @randotkatsenko5157 8 months ago +1

    Hi, nice tutorial. I have coded a real-time voice bot for phone conversations in Twilio.
    The latency comes mostly from text-to-speech and the GPT response time.
    I'm guessing that if either one's latency can be cut by about 2-3x, the response time would be fast enough. In human conversation we expect a response within 1 second, and anything above that seems unnatural. I'm sure the speed issues will be solved with new Nvidia GPUs or other hardware innovations.

    • @rammohanbethi
      @rammohanbethi 7 months ago

      Hi, can you please let me know how you developed the voice bot using Twilio? I'm looking for that kind of bot too. It would be helpful.

    • @randotkatsenko5157
      @randotkatsenko5157 7 months ago

      @rammohanbethi Hi, how can I let you know? It's a lot of complicated server-side code in Node.js and some Python...
      The setup is too complex to explain in a comment. We build this as part of AI automation services for businesses.

    • @Sibixpur
      @Sibixpur 6 months ago

      @randotkatsenko5157 bro is speaking as if he coded all the voice bot logic, bruhh, you're just hitting APIs, that ain't complex....

  • @FaisalKhrisan
    @FaisalKhrisan 8 months ago +5

    But I still have a problem. It says:
    from elevenlabs import generate, stream
    ImportError: cannot import name 'generate' from 'elevenlabs'
    How come?

    • @Ghosty0069
      @Ghosty0069 8 months ago

      I have the exact same error. Did you fix it?

    • @LO-FI_walah_BABA
      @LO-FI_walah_BABA 3 months ago

      Change the Python version to 3.10 or later

    • @EngIsraelOlguin
      @EngIsraelOlguin 2 months ago

      The tutorial needs a specific version of the ElevenLabs library: elevenlabs==0.3.0b0

  • @JokerJarvis-cy2sw
    @JokerJarvis-cy2sw 11 months ago +2

    Please make a tutorial on using the LLaVA vision model to analyze live video with cv2.
    Also, I am unable to get my API token from the AssemblyAI website; please fix it.

  • @bens4446
    @bens4446 8 months ago +2

    Thanks. This is the first time I've heard of AssemblyAI. Everyone talks about faster_whisper and Deepgram. Is AssemblyAI better for STT?

  • @nagarajdoddamani697
    @nagarajdoddamani697 4 months ago +1

    On my laptop brew is not installing, and the program is not working either.

  • @uttamdwivedi7709
    @uttamdwivedi7709 10 months ago +1

    I followed this tutorial, then at the end I realized AssemblyAI doesn't support Japanese in the live RealtimeTranscriber. Which sucks, lol, I can't use it. Any help? @AssemblyAI

  • @iainhmunro
    @iainhmunro 9 months ago +2

    Hi there, I was just looking at the code. Where are the appointment-setting details coming from?

    • @AssemblyAI
      @AssemblyAI 9 months ago

      All that is coming from the LLM we are using, so it's not hard-coded.

  • @theghostyced
    @theghostyced 8 months ago +1

    How would you handle interruptions while the AI is talking?

  • @TheBestgoku
    @TheBestgoku 10 months ago

    Why not chunk the text and output it as it arrives, instead of outputting only after all the text is generated?
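
Chunking of this kind could look roughly like the sketch below: buffer the streamed text and hand each finished sentence to the voice as soon as it ends. `sentence_chunks` is a hypothetical helper, and the token list is a stand-in for a streaming LLM response.

```python
def sentence_chunks(tokens, enders=".!?"):
    """Group streamed text fragments into sentences, yielding each one
    as soon as it is complete instead of waiting for the full reply."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(tuple(enders)):
            yield buffer.strip()
            buffer = ""
    if buffer.strip():
        yield buffer.strip()   # flush whatever is left at the end


if __name__ == "__main__":
    # Stand-in for fragments arriving from a streaming LLM response.
    stream = ["Hello", " there.", " How", " are", " you?", " Bye"]
    for sentence in sentence_chunks(stream):
        print(sentence)        # each sentence could go to text-to-speech here
```

Splitting on sentence-ending punctuation is a simple heuristic; abbreviations and decimals would need extra handling.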

  • @rafaychaudry320
    @rafaychaudry320 2 months ago

    How can I integrate this into actual phone calls? Through Twilio, I assume, but how would I incorporate it?

  • @shissncg
    @shissncg 5 months ago

    How do you grab the audio once the RealtimeTranscript has finalized? For example, could you pass the audio rather than the text to generate_ai_response?

  • @PeterBardenhagen
    @PeterBardenhagen 21 days ago

    Yeah but it doesn't actually check the time or book anything?

  • @abdulazad8432
    @abdulazad8432 5 months ago

    Can it be integrated into an Arduino board?

  • @sarap.sadegh4691
    @sarap.sadegh4691 10 months ago

    Hi, thanks for your video. I want a real-time conversation API with Python for the Farsi language. Does the LLM support Farsi?

  • @Akash-nb9sv
    @Akash-nb9sv 5 months ago

    How do I install this? brew isn't available for Windows. Is there another option for Windows?

  • @pawanmaurya1554
    @pawanmaurya1554 3 months ago

    ❤❤❤❤❤ Such a wonderful project

  • @urekmazino1327
    @urekmazino1327 8 months ago

    Any way to make one with the Adam voice, like the one in ElevenLabs? 😊

  • @yuchengpeng7706
    @yuchengpeng7706 10 months ago

    This video is so great! I'm following along, but I ran into a problem: I can install the package in PyCharm on Windows, but I get this error: OSError: Cannot find mpv-1.dll, mpv-2.dll or libmpv-2.dll in your system %PATH%. I'm a researcher in the art field with only beginner-level Python knowledge. Could you help me solve this problem? Thanks a lot!

  • @Alex-qo5je
    @Alex-qo5je 10 months ago +1

    How can I connect this to my phone number and Google Calendar? 🙏🏼

    • @AssemblyAI
      @AssemblyAI 9 months ago

      You can use the Google Calendar API for the calendar and something like Twilio's API for making phone calls.

  • @vishalsaichindepalli2798
    @vishalsaichindepalli2798 10 months ago

    For some reason, the microphone isn't picking up my voice. I enabled all permissions on my mac and am still having trouble. Is there any way to fix this?

    • @michaelnumnum
      @michaelnumnum 10 months ago +1

      I think you need to pay for the real-time transcription for this at AssemblyAI

    • @Vrilogs
      @Vrilogs 9 months ago

      Streaming from AssemblyAI is a paid service, so first you need to add balance to your account, if you haven't done that yet. Hope that helps :)

  • @abibusiness1085
    @abibusiness1085 3 months ago

    How to install mpv on windows?

  • @daeralbra
    @daeralbra 10 months ago +2

    The only downside is the fact it takes a while to respond with voice.

  • @I-Am-No-One-GG
    @I-Am-No-One-GG 17 days ago

    I can't run the commands on Windows. Can you do a video for Windows users, or say what the Windows commands are for these?
    brew install portaudio
    pip install "assemblyai[extras]"
    pip install elevenlabs==0.3.0b0
    brew install mpv
    pip install --upgrade openai

    • @AssemblyAI
      @AssemblyAI 5 days ago

      We will look into making a Windows version in the future!

  • @btslovers___01
    @btslovers___01 2 months ago

    Hello ma'am, can you build me an AI model for this?

  • @alifetechgenius3804
    @alifetechgenius3804 5 months ago

    Source code not available

  • @viditsharma6990
    @viditsharma6990 9 months ago

    I am facing the mpv value error on Windows. I have already installed it many times. How can I fix that?

    • @sethuraman9884
      @sethuraman9884 9 months ago

      Just use VLC instead of mpv, bro

    • @조바이든-r6r
      @조바이든-r6r 9 months ago

      @sethuraman9884 thank you guys

    • @조바이든-r6r
      @조바이든-r6r 9 months ago

      Or check the PATH environment variable for mpv. When you run mpv --version in cmd, you should see it running.
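
The same PATH check can be done from Python with the standard library. A small sketch, where `find_player` is a hypothetical helper; it only reports whether the executables are visible on PATH and does not change which player the tutorial code uses:

```python
import shutil


def find_player(candidates=("mpv", "vlc")):
    """Return (name, full_path) for the first player found on PATH, else None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return name, path
    return None


if __name__ == "__main__":
    found = find_player()
    if found:
        print(f"found {found[0]} at {found[1]}")
    else:
        print("no player on PATH; add its install folder to PATH and reopen the terminal")
```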

  • @nithishreddy7684
    @nithishreddy7684 9 months ago

    An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)
    What should I do about this error?

    • @islamicinterestofficial
      @islamicinterestofficial 9 months ago

      Same error. Did you find the solution?

    • @chittisai47
      @chittisai47 9 months ago

      Most likely your microphone is switched off, please check

    • @rachid6904
      @rachid6904 7 months ago

      I've got the same:
      An error occured: Could not connect to the real-time service: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)
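
An "unable to get local issuer certificate" error usually means Python could not find a usable set of trusted CA certificates, so a first diagnostic step is to see where it is looking. This sketch only inspects the configuration and fixes nothing; `describe_ca_config` is a hypothetical helper name.

```python
import os
import ssl


def describe_ca_config():
    """Collect where this Python installation looks for trusted CA certificates."""
    paths = ssl.get_default_verify_paths()
    return {
        "cafile": paths.cafile,                # None if no CA bundle file was found
        "capath": paths.capath,                # None if no CA directory was found
        "openssl_cafile": paths.openssl_cafile,
        "SSL_CERT_FILE": os.environ.get("SSL_CERT_FILE"),
        "has_ca_source": paths.cafile is not None or paths.capath is not None,
    }


if __name__ == "__main__":
    for key, value in describe_ca_config().items():
        print(f"{key}: {value}")
```

If `has_ca_source` is False, the installation has no CA bundle it can find, which would be consistent with the error above. The remedy depends on how Python was installed.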

  • @CharlesZulu-v8g
    @CharlesZulu-v8g 3 months ago

    Your free API does not work in my project

  • @jeevanjaison9646
    @jeevanjaison9646 8 months ago +1

    The AssemblyAI API is not free.

  • @mrunexpected10
    @mrunexpected10 10 months ago

    Can you make just a chatbot, text to voice?

  • @jhinaouiroudayna4275
    @jhinaouiroudayna4275 7 months ago +2

    AssemblyAI's API requires a credit card for this task

  • @JR-joren
    @JR-joren 4 months ago

    Nice, but the lag is too long, unfortunately.

  • @ac3inlondon531
    @ac3inlondon531 7 months ago

    why are you using Mac omg

  • @mehdismaeili3743
    @mehdismaeili3743 8 months ago

    Excellent.

  • @MiguelCayazaya
    @MiguelCayazaya 8 months ago

    I am very api to have found this

  • @FructuredEchoes
    @FructuredEchoes 4 months ago +1

    From scratch is misleading as others already commented.

  • @urekmazino1327
    @urekmazino1327 8 months ago +1

    Why are you saying "from scratch" if you're only using APIs?

  • @BernardoCastro-eb6rp
    @BernardoCastro-eb6rp 8 months ago

    TOO SLOW !

  • @Marvinzock34
    @Marvinzock34 4 months ago

    No, that's not from scratch. I have no money; stop getting my hopes up.

  • @drmarioschannel
    @drmarioschannel 11 months ago +3

    after watching your video, i think i prefer interacting with humans

  • @BeRMaNyA
    @BeRMaNyA 8 months ago +1

    TOO SLOW!