Using OpenAI Realtime API to build a Twilio Voice AI assistant with Node.js

  • Published Jan 19, 2025

COMMENTS • 199

  • @TwilioDevs
    @TwilioDevs  3 months ago +5

    What should we build next?
    Next up on the channel is likely going to be the Python version of this tutorial followed by some updates regarding interruptions and having the AI talk first.

    • @ethereal-rzn
      @ethereal-rzn 3 months ago +1

      AI talk first pleaseeee. Couldn't find any tutorial on that on the web

    • @mhazwan
      @mhazwan 3 months ago

      Want to see how the AI talks first

  • @georgedukic9955
    @georgedukic9955 3 months ago +5

    This makes things so much easier. I was trying to do this manually: converting voice to text, sending the prompt to OpenAI, and then converting the response back to voice.

    • @nags9723yt
      @nags9723yt 3 months ago

      Yeah. This is a great feature. Imagine the lag from passing data between the different APIs. 😊

    • @wissammoussa7540
      @wissammoussa7540 16 days ago

      Yeah and the old way you only got back a 'reading' of the text, no emotions at all

  • @nlarchive
    @nlarchive 3 months ago +10

    That Twilio robotic voice needs an update. Thanks for the content!!!

    • @TwilioDevs
      @TwilioDevs  3 months ago +2

      There are other voice options available that sound better, but I definitely agree that one is from a different era 😅

  • @riley_blackwell
    @riley_blackwell 3 months ago +8

    Now you just have to provide customer data from Segment to the model. Then when a customer calls the model can give a personalized answer.
    For example, a customer calls a car repair shop. Then the model using RAG accesses a customer’s data to check on the status of a car repair. Lastly, the model responds with the status of the car repair.
    All the customer has to do is call the car repair shop and ask a simple question with voice. A great customer experience if you ask me 😊

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      Yes, this is a great scenario! That's exactly the type of exciting things that can be enabled by combining all of the pieces. Thanks for watching and for the comment!

    • @CodyDietzofficial
      @CodyDietzofficial 3 months ago +2

      I am literally building this right now...

    • @riley_blackwell
      @riley_blackwell 3 months ago

      @CodyDietzofficial Awesome! Can’t wait to see it :)

    • @LettersAndNumbers300
      @LettersAndNumbers300 1 month ago

      Yes, car repair shops are where the big bucks are to be made

    • @cheeky1699
      @cheeky1699 5 days ago

      That's for us to do, not Twilio :) They just give you the utensils; you gotta make the meal.

  • @Sa-if
    @Sa-if 3 months ago +20

    This will start a new age of AI...

    • @TwilioDevs
      @TwilioDevs  3 months ago +3

      It's really impressive how interactive it is!

    • @EDashMan
      @EDashMan 3 months ago

      @TwilioDevs Yoo that’s crazy. I’m going to test the repo myself first, seeing is believing haha!

    • @TwilioDevs
      @TwilioDevs  3 months ago

      @EDashMan Let me know how it goes! I know OpenAI is rolling this out in stages, so if it doesn't work at first, check to make sure you have access to the OpenAI Realtime API. I was blown away the first time I got this working though. Feel free to mix up the SYSTEM_MESSAGE prompt and the temperature a bit too. It's pretty amazing. I feel like I should have it coach me through making a meal :D

    • @EDashMan
      @EDashMan 3 months ago

      @TwilioDevs Yeah I'm getting: Error in the OpenAI WebSocket: Error: Unexpected server response: 403
      I don't even have gpt-4o-realtime-preview-2024-10-01 in my playground. I guess I can't use it yet :(

    • @TwilioDevs
      @TwilioDevs  3 months ago

      @EDashMan Bummer! Yeah hopefully it'll roll out pretty quickly.

  • @markustrasberg3957
    @markustrasberg3957 3 months ago +4

    There's a small bug in the blog post guide. The WebSocket connection URL is mistyped: it should contain a single model= parameter, but at the moment it has two.

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Thanks, I'll let Paul know!
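
One way to rule out a duplicated `model=` parameter is to build the URL with the standard `URL` class instead of concatenating strings. This is a minimal sketch, not the blog post's code: the endpoint `wss://api.openai.com/v1/realtime` is an assumption on my part, the model name is the one quoted elsewhere in these comments, and `buildRealtimeUrl` is a hypothetical helper.

```javascript
// Build the Realtime WebSocket URL with exactly one `model` query parameter.
// Assumption: the endpoint below is the one the tutorial connects to.
function buildRealtimeUrl(model = 'gpt-4o-realtime-preview-2024-10-01') {
  const url = new URL('wss://api.openai.com/v1/realtime');
  // searchParams.set() replaces any existing value, so `model=` cannot appear twice.
  url.searchParams.set('model', model);
  return url.toString();
}

console.log(buildRealtimeUrl());
```

The returned string can be passed to the WebSocket constructor in place of a hand-typed URL.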

  • @mohibahmed5098
    @mohibahmed5098 1 month ago +1

    It doesn't handle interruptions while the AI is speaking. Am I missing something?

    • @TwilioDevs
      @TwilioDevs  1 month ago

      Check the repo that is linked in the description. We figured out how to add that after the video shipped. Thanks for watching!

  • @saedsaify9944
    @saedsaify9944 14 days ago

    Great. How difficult is it to modify this to use the Realtime API with an OpenAI assistant trained on a specific knowledge base instead of the generic OpenAI model?

  • @fantasticshorts167
    @fantasticshorts167 3 months ago +1

    Hey! I have used function calling in this Realtime API for calendar booking, but I am struggling with how to send the function's response back to the API for TTS. Can you please help me with that?
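
For returning a function result so the model can speak it, the usual pattern in the Realtime API is to add a `function_call_output` item to the conversation and then request a new response. The sketch below only builds the two events; `buildFunctionResultEvents` is a hypothetical helper, `openAiWs` is assumed to be the open socket from the tutorial code, and `callId` is assumed to come from the function call item the model emitted.

```javascript
// Build the two events that return a tool result to the model and ask it to respond.
// `callId` must match the call_id of the function_call item the model produced.
function buildFunctionResultEvents(callId, result) {
  return [
    {
      type: 'conversation.item.create',
      item: {
        type: 'function_call_output',
        call_id: callId,
        // `output` is a string, so structured results are JSON-encoded.
        output: JSON.stringify(result),
      },
    },
    // Without a follow-up response.create the model does not speak the result.
    { type: 'response.create' },
  ];
}

// Hypothetical usage with the tutorial's socket:
// for (const event of buildFunctionResultEvents(callId, { booked: true })) {
//   openAiWs.send(JSON.stringify(event));
// }
```

Sending the events in this order matters: the result has to be in the conversation before the response is requested.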

  • @jothamdudley4116
    @jothamdudley4116 3 months ago +5

    Got this working using my Azure endpoint with some help from ChatGPT!
    I did notice this example doesn't handle interruptions. Will you be updating the repo with more features in the future?

    • @TwilioDevs
      @TwilioDevs  3 months ago +4

      that's awesome! thanks for giving it a try.
      We decided to leave interruptions out for this blog post/video because the code was already pretty long. We talked about doing follow-ups for things like interruptions and function calling. I'll check with the team and see what the plan is.

    • @gurumack
      @gurumack 3 months ago +1

      @TwilioDevs to be honest, I'd really appreciate this - this is a huge part of what makes this tech so amazing. Any high level support on how to accomplish this, if it's even possible? thanks!

    • @limebulls
      @limebulls 3 months ago

      Great! Would you mind sharing your code?

    • @jonasmassieAI
      @jonasmassieAI 3 months ago

      @TwilioDevs looking for this also...

    • @ethanfossett5835
      @ethanfossett5835 3 months ago

      @TwilioDevs Also looking for this - even just code samples would be great, no need for a full video.

  • @ziv4gamer
    @ziv4gamer 1 month ago +1

    Is there a way to trigger the first response without needing to say something first?

    • @TwilioDevs
      @TwilioDevs  1 month ago +1

      Check the GitHub repo. It has an "assistant speaks first" option in it that got added after this video was made.

  • @ArmaanSood-y9d
    @ArmaanSood-y9d 3 months ago +3

    Hey, this is amazing, revolutionary even! How do I connect my model to a vector_store / a knowledge base that it can refer to? Or is that not supported yet? I am trying to figure out whether I should implement that in the function calling tools {} parameter or not. Thanks!!!!

    • @natevance3661
      @natevance3661 3 months ago

      I'm wondering if this is possible / how to do this as well

    • @ArmaanSood-y9d
      @ArmaanSood-y9d 3 months ago

      @natevance3661 I have found out about some crazy shit, trying to piece it all together but you gotta use make

    • @titimiti1984
      @titimiti1984 3 months ago

      Did you figure out how to do that? Let me know if you do

    • @clairedubiel1
      @clairedubiel1 2 months ago

      Please let me know as well!

  • @randotkatsenko5157
    @randotkatsenko5157 3 months ago +2

    One thing I don't understand: how do I make OpenAI speak first when it answers the call?

    • @TwilioDevs
      @TwilioDevs  3 months ago +4

      Right after the code sends the sessionUpdate object you can send something like this (feel free to modify the prompt):
      const event = {
        type: 'conversation.item.create',
        item: {
          type: 'message',
          role: 'user',
          content: [
            {
              type: 'input_text',
              text: 'Please greet the caller and say "hi there, how can i help you?"'
            }
          ]
        }
      };
      openAiWs.send(JSON.stringify(event));
      openAiWs.send(JSON.stringify({type: 'response.create'}));

    • @randotkatsenko5157
      @randotkatsenko5157 3 months ago

      @TwilioDevs Thank you very MUCH! I got the code, but still no access to the Realtime API. Hopefully soon! Thanks again. Twilio is good.

    • @Bangs_Theory
      @Bangs_Theory 3 months ago

      @randotkatsenko5157 try LiveKit

  • @bhargavpatel5208
    @bhargavpatel5208 17 days ago

    Hello sir, I am able to place the call, but the AI does not connect. I do not see the incoming-call request when I receive the call.

  • @cscrowley1
    @cscrowley1 3 months ago +1

    Also, do you guys have any thoughts you would care to share on outbound calling?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      What specifically are you looking for thoughts on?

  • @bahubaliavenger472
    @bahubaliavenger472 1 month ago

    Hello sir, I did something similar in Python Flask, but I am getting a huge delay (5 seconds) downloading the audio file from Twilio. Is there any alternative? Please reply.

  • @mspicela
    @mspicela 2 months ago

    Thanks again for the tutorial. What is needed to make it possible to interrupt the AI? I think Twilio may be buffering received audio from OpenAI, which it finishes playing even when interrupted.
    I tried several changes to fix this. I wonder if the audio from OpenAI is sent to Twilio, which buffers it, and that is why it keeps playing what it has already received after an interruption. Is there a way to tell Twilio to stop playing what has already been sent when an interruption is detected?
    The web-only implementations with WebRTC handle interruptions immediately, just like the official ChatGPT app. I know phone networks have a delay, but this is more than that: it seems to keep talking for many seconds.
    Thank you in advance.

    • @TwilioDevs
      @TwilioDevs  2 months ago +1

      Hey hey! Check out this timestamp from our recent livestream where I helped Alex and Bianca add this (I'm the robot 😂). The timestamp starts at their first interaction with it, where they see how the lack of interruptions impacts things, and then we walk through how to add a version of interrupt to it: ua-cam.com/video/_itrbiszfiE/v-deo.htmlfeature=shared&t=2843

    • @mspicela
      @mspicela 2 months ago

      @TwilioDevs Perfect and thank you! I watched the livestream recording and rebased my stuff on the newer version. It is working well now.
      What are you using to be a robot in the livestream?

    • @TwilioDevs
      @TwilioDevs  2 months ago

      @mspicela Total custom build inside of OBS (obsproject.com). It's a pile of PNG files, a waveform generator for the mouth, and some subtle motion effects.

    • @TwilioDevs
      @TwilioDevs  2 months ago

      @mspicela Also super glad you got it working! Let us know if there's anything else we can help with!

  • @gurumack
    @gurumack 3 months ago +4

    has anyone here figured out how to modify this code for interrupts?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Working on this at the moment. Hopefully we'll have an update early this coming week.

    • @gurumack
      @gurumack 3 months ago

      I was able to figure it out! thanks

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      @gurumack Happy to hear it!

    • @exploretheworld1736
      @exploretheworld1736 3 months ago

      @gurumack can you please share how to handle interruptions?

  • @craigsdennis
    @craigsdennis 3 months ago +4

    Love the video Brent! 💪🚀

  • @NexGenUltra
    @NexGenUltra 3 months ago +2

    for the interruption issue :
    you need to clear the twilio buffer and then send response.cancel

    • @johns332
      @johns332 3 months ago +1

      Can you share how you implemented this? I tried sending the following commands when the response type is input_audio_buffer.speech_started:
      await openai_ws.send(json.dumps({"type": "response.cancel"}))
      await openai_ws.send(json.dumps({"type": "output_audio_buffer.clear"}))
      No dice though :( Your help here would be greatly appreciated!

    • @NexGenUltra
      @NexGenUltra 3 months ago

      @johns332 Use this:
      case 'input_audio_buffer.speech_started': {
        console.log('Speech Start:', response.type);
        // Tell Twilio to drop any audio it has already buffered for playback.
        twilioWs.send(
          JSON.stringify({
            streamSid: streamSid,
            event: 'clear',
          })
        );
        console.log('Cancelling AI speech from the server');
        // Ask OpenAI to stop generating the in-progress response.
        const interruptMessage = {
          type: 'response.cancel'
        };
        openaiWs.send(JSON.stringify(interruptMessage));
        break;
      }
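
A variation on the snippet above also trims the assistant's last item so the model's history matches what the caller actually heard. This is a sketch under assumptions, not the repo's code: `buildInterruptMessages` is a hypothetical helper, and `lastAssistantItemId` and `elapsedMs` are values the app would have to track itself (for example from the `item_id` on audio delta events and the media timestamps Twilio sends).

```javascript
// Build the messages to send when the caller starts speaking over the assistant.
function buildInterruptMessages({ streamSid, lastAssistantItemId, elapsedMs }) {
  return {
    // Twilio Media Streams: discard audio that is queued but not yet played.
    toTwilio: { event: 'clear', streamSid },
    // OpenAI Realtime: cut the assistant item at the point the caller interrupted.
    toOpenAi: lastAssistantItemId
      ? {
          type: 'conversation.item.truncate',
          item_id: lastAssistantItemId,
          content_index: 0,
          audio_end_ms: Math.max(0, Math.round(elapsedMs)),
        }
      : null, // nothing is playing, so there is nothing to truncate
  };
}

// Hypothetical usage inside the speech_started handler:
// const { toTwilio, toOpenAi } = buildInterruptMessages(state);
// twilioWs.send(JSON.stringify(toTwilio));
// if (toOpenAi) openaiWs.send(JSON.stringify(toOpenAi));
```

Clearing Twilio's buffer stops the playback; the truncate event only keeps the conversation state consistent.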

  • @PraiseYeezus
    @PraiseYeezus 3 months ago +4

    Would like to see a tutorial about using OpenAI to get on-screen transcriptions of phone calls

    • @TwilioDevs
      @TwilioDevs  3 months ago +4

      That's a cool idea. I'll see what we can do!

    • @limebulls
      @limebulls 3 months ago +1

      @TwilioDevs yes please!

  • @DanBorgia
    @DanBorgia 3 months ago +1

    Perfect timing!

  • @wissammoussa7540
    @wissammoussa7540 16 days ago

    Can I use this as it is and deploy it on Twilio itself as a function/build?

  • @xlretard
    @xlretard 3 months ago +1

    I needed this 18 months ago lol

  • @HarborProjectB
    @HarborProjectB 3 months ago +1

    This is great. But I have been struggling with the ability to interrupt the AI when on a call with Twilio.

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Working on something for this! Stay tuned.

  • @thechannel8x
    @thechannel8x 1 month ago +1

    Deployment? Great work, great explanation - what's the best place to deploy this? TW Services? Or that wouldn't work?

    • @TwilioDevs
      @TwilioDevs  1 month ago

      I usually leave out deployment since it can be a fairly personal choice and outside of the scope of the tutorial. That said, this code should work anywhere you can deploy a full Node.js app. Some popular options include Render (render.com), Railway (railway.app), DigitalOcean or building your own setup within a VPS.
      Lots of options out there! Thanks for watching and let us know if you need any further help.

  • @VibeTech311
    @VibeTech311 2 months ago +1

    This was a great video. I am looking for a way to output the conversation: both what was received and how it responded. Is that possible through the Realtime API? Currently I can capture the response in text, but I have not figured out how to capture what is said to it in text via the Realtime API.
    Thanks again.

    • @TwilioDevs
      @TwilioDevs  2 months ago +1

      I'll see if I can put something together for that. First up is the Python version of this tutorial which got delayed a little bit.

    • @TwilioDevs
      @TwilioDevs  2 months ago +1

      So for clarity, you want the text of what the caller says to the AI?

    • @VibeTech311
      @VibeTech311 2 months ago +1

      @TwilioDevs yes, and thank you so much. I can get the text for the Realtime API response, but the text for the caller is where I am struggling. I don’t know if Realtime has a way, and I recently saw something in Twilio that could possibly help. But thank you again, I truly appreciate your response and consideration.

    • @TwilioDevs
      @TwilioDevs  2 months ago +1

      No promises but I'll see what I can do. If not a video perhaps we can at least get you a code snippet.

    • @VibeTech311
      @VibeTech311 2 months ago

      @TwilioDevs you are amazing thank you 🙏
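
For the caller-side text discussed in this thread, the Realtime API can transcribe input audio if `input_audio_transcription` is enabled in the session, after which transcripts arrive as separate server events. A minimal sketch, assuming the `whisper-1` model mentioned elsewhere in these comments; `withInputTranscription` and `transcriptLine` are hypothetical helper names.

```javascript
// Add caller transcription to an existing session.update payload.
function withInputTranscription(sessionUpdate, model = 'whisper-1') {
  return {
    ...sessionUpdate,
    session: { ...sessionUpdate.session, input_audio_transcription: { model } },
  };
}

// Map transcript-bearing server events to { speaker, text }; ignore everything else.
function transcriptLine(event) {
  switch (event.type) {
    case 'conversation.item.input_audio_transcription.completed':
      return { speaker: 'caller', text: event.transcript.trim() };
    case 'response.audio_transcript.done':
      return { speaker: 'assistant', text: event.transcript.trim() };
    default:
      return null;
  }
}

// Hypothetical usage inside the OpenAI 'message' handler:
// const line = transcriptLine(JSON.parse(data));
// if (line) console.log(`${line.speaker}: ${line.text}`);
```

Caller transcripts are produced asynchronously, so they can arrive after the assistant has already started answering.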

  • @dawid_dahl
    @dawid_dahl 3 months ago +2

    Can you show how we can integrate Function Calling as well?

    • @TwilioDevs
      @TwilioDevs  3 months ago +3

      That's a good idea for a follow-up video, thanks!
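
Until a follow-up exists, here is a rough sketch of the first half of function calling: declaring a tool in the session and spotting when the model calls it. The tool itself (`book_appointment` and its parameters) is invented for illustration, and `withTools` and `extractFunctionCall` are hypothetical helpers, not part of the tutorial code.

```javascript
// Hypothetical tool definition; the name and parameters are illustrative only.
const bookAppointmentTool = {
  type: 'function',
  name: 'book_appointment',
  description: 'Book a calendar appointment for the caller.',
  parameters: {
    type: 'object',
    properties: {
      date: { type: 'string', description: 'ISO 8601 date, e.g. 2025-01-19' },
      time: { type: 'string', description: '24-hour time, e.g. 14:30' },
    },
    required: ['date', 'time'],
  },
};

// Merge tools into an existing session.update payload without dropping other fields.
function withTools(sessionUpdate, tools) {
  return {
    ...sessionUpdate,
    session: { ...sessionUpdate.session, tools, tool_choice: 'auto' },
  };
}

// Return the completed function call carried by a server event, or null.
function extractFunctionCall(event) {
  if (event.type !== 'response.output_item.done') return null;
  const item = event.item;
  if (!item || item.type !== 'function_call') return null;
  // `arguments` arrives as a JSON string and has to be parsed before use.
  return { callId: item.call_id, name: item.name, args: JSON.parse(item.arguments) };
}
```

Once the app has run the function, it still needs to send the result back to the model before the assistant can talk about it.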

  • @mustaphaboutzoua8046
    @mustaphaboutzoua8046 3 months ago +1

    Thank you, Brent! Do I need a Twilio subscription for communication between two valid numbers? (The trial only provides one valid number.) When I try to make a call using the Twilio Dev Phone with the same number, I don't receive anything. It seems I need two numbers?

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      You can add a verified number to test your app with your own phone during trial: help.twilio.com/articles/223180048-Adding-a-Verified-Phone-Number-or-Caller-ID-with-Twilio

  • @sarzzfish8420
    @sarzzfish8420 3 months ago +1

    Can I use it in Danish, Turkish, or German?

    • @WaiZe0
      @WaiZe0 3 months ago

      I struggled to make the audio input detect a specific language, even with Whisper’s language parameter. Tell me if you were able to choose any other language.

  • @cyruszad
    @cyruszad 3 months ago +1

    This is going to really help you guys. I worked on this immediately when this was dropped but this setup has a weakness. Interruptions don’t work when you interrupt the agent in the middle of a larger audio playback (ask it to read an example paragraph) and then try to interrupt it in the middle - it won’t work. I tried messing with it but nothing worked.

    • @TwilioDevs
      @TwilioDevs  3 months ago

      We're working on it! I should have something to share this week.

    • @josephbesgen4729
      @josephbesgen4729 3 months ago

      @TwilioDevs Fantastic video! Just curious if you've uploaded anything regarding how to deal with interruptions

  • @zhangxiang18
    @zhangxiang18 3 months ago

    Thanks for the fantastic video! Do I need to upgrade my Twilio account to a full version to use this? I set everything up based on the tutorial, but there is no response from the AI even after I speak the first sentence. Alas...

  • @ryanroman6589
    @ryanroman6589 3 months ago +1

    running `twilio dev-phone` launches the dev phone but also updates the webhooks. anyone get this to work?

    • @TwilioDevs
      @TwilioDevs  3 months ago +2

      You need to use a different phone number than the one you are testing.

  • @esek-2
    @esek-2 2 months ago

    Is this still working? I got it to work some weeks ago, but strangely, it is not working anymore - When I call my Twilio Phone Number, in the nodejs output I get the event "input_audio_buffer.speech_started", and after I finished speaking, nothing happens, and the bot does not answer me.

    • @AbhishekMishra-db2tj
      @AbhishekMishra-db2tj 2 months ago

      Hey, I am also facing the same problem, did you find anything to solve this?

    • @TwilioDevs
      @TwilioDevs  2 months ago +1

      Should still be working, yes. We were just building it again on our livestream today and it was working.

    • @esek-2
      @esek-2 2 months ago

      @AbhishekMishra-db2tj Hey, somehow it does not properly detect when I have finished speaking on my phone. When I tried from a different phone, it worked. Not sure why that is the case.

  • @ankitrawat7211
    @ankitrawat7211 3 months ago

    How can I load my own trained models in this?

  • @momoya8373
    @momoya8373 2 months ago

    To avoid any confusion, it’s important to clearly state that even using the development phone incurs charges for both making and receiving calls (x2 charges), as some users might assume it’s free otherwise. Why not be clear?

    • @TwilioDevs
      @TwilioDevs  2 months ago

      The Twilio Dev Phone documentation page states that it is using one of your own Twilio numbers to make the call. There's no intended deception here. I used the Dev Phone in the video as an option to not use my personal phone for the demo since it's easier to see the interaction and logs. It's just an option.

  • @mspicela
    @mspicela 2 months ago

    Thank you for the tutorial. I built an AI phone agent/bot with this combined with function calling from OpenAI and it worked very well. Unfortunately, now I can no longer edit my phone numbers configuration -- "Voice configuration is unavailable for this phone number" -- but this isn't true because it lists my URL still and worked for days. To make things worse, the support spins and spins so I can't submit a trouble ticket.

    • @TwilioDevs
      @TwilioDevs  2 months ago

      Hi! Thanks for watching and I'm happy you built this. Sorry you're having trouble though (both with the app, and support).
      If you go here: help.twilio.com/ and ask a question, see if anything there helps resolve this.
      If not, there's a section at the bottom asking "Is this helpful?" and you can hit the thumbs down which will prompt you to either log in to submit a ticket or click the link next to it to submit a ticket without logging in.
      Once you have a ticket number, I can try to help escalate (no promises but worth a try!).

    • @TwilioDevs
      @TwilioDevs  2 months ago

      Hello Michael,
      Thank you for getting in touch with our Social Support Team. We sincerely apologize for the inconvenience caused.
      Could you please dm us the email address on file?

    • @mspicela
      @mspicela 2 months ago

      @TwilioDevs thank you for the reply. It's working now! I didn't do anything to change it but it resolved itself.

    • @TwilioDevs
      @TwilioDevs  2 months ago

      Awesome news! That's much easier to triage 😀 Glad it's working again!

  • @RobertSpartacus
    @RobertSpartacus 3 months ago

    Twilio Folks,
    Is there any tutorial on using the Realtime API for outbound calls, i.e. triggering a call and taking it forward?
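
For outbound calls, one plausible approach is to start the call through Twilio's REST API and hand it TwiML that connects the answered call to the same media-stream WebSocket the inbound flow uses. The sketch below only builds that TwiML; `buildStreamTwiml` is a hypothetical helper, the `/media-stream` path is an assumption about the server's route, and the commented usage assumes the Twilio Node helper library.

```javascript
// Escape characters that are not allowed inside an XML attribute value.
function escapeXmlAttr(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build TwiML that connects the answered call to a media-stream WebSocket.
function buildStreamTwiml(host, path = '/media-stream') {
  const url = `wss://${host}${path}`;
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Connect><Stream url="${escapeXmlAttr(url)}" /></Connect></Response>`
  );
}

// Hypothetical usage with the Twilio Node helper library:
// const client = require('twilio')(accountSid, authToken);
// await client.calls.create({ to, from, twiml: buildStreamTwiml('example.ngrok.app') });

console.log(buildStreamTwiml('example.ngrok.app'));
```

Because nobody on the far end has spoken yet, an outbound flow usually also needs the assistant to talk first.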

  • @Philosophicflix
    @Philosophicflix 3 months ago

    any replacement instead of ngrok? having issues with my terminal

    • @TwilioDevs
      @TwilioDevs  3 months ago

      There's a full list of alternatives here: github.com/anderspitman/awesome-tunneling

  • @EswaraNadh
    @EswaraNadh 3 months ago

    How to make OpenAI speak the function_call results? Like if the appointment is created successfully, then how to let the user know that the appointment is created successfully.

  • @SaminYasar_
    @SaminYasar_ 3 months ago +6

    Already built this on my channel will be crazy

  • @natevance3661
    @natevance3661 3 months ago

    Is there a way to connect this to a GPT assistant?

  • @MohsinAli-x8r5r
    @MohsinAli-x8r5r 3 months ago

    How can we get access to the Realtime API on an OpenAI account? (I already have a paid account.) I integrated the code and added my OpenAI key, but during the call it starts talking and does not listen to me (no two-way communication). Can someone help me out?

  • @carloslfu
    @carloslfu 3 months ago

    This is great! Thanks for sharing!

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Glad you enjoyed it! Thanks for watching 🎉

  • @krloschavarriasauceda151
    @krloschavarriasauceda151 3 months ago +1

    What VS Code theme are you using?

  • @wordpressobsessed9067
    @wordpressobsessed9067 3 months ago +1

    So can we host this on Twilio serverless? If so, which file would we point the incoming call to? Also, it can be modified to greet the caller first, correct? I'm thinking for a business AI assistant to take calls, give information etc. I have created these AI apps with Vapi, but it gets pretty expensive. Twilio would be so much cheaper.

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      I think with the need for a persistent WebSocket connection you're probably going to be best served doing this outside of our serverless Twilio Functions. I can double check with the team though!
      As for greeting, you can definitely change the <Say> tags to customize the greeting from Twilio, or I believe you could pre-prompt OpenAI with a text prompt using the Realtime API if you want the greeting to come from the assistant.

    • @wordpressobsessed9067
      @wordpressobsessed9067 3 months ago

      @TwilioDevs Thanks, I'll mess around with it some. Is that voice coming from AWS? I've never heard that voice, but it's really good and would be terrific for most professional business applications. The latency is next to nothing, which seems to have been the biggest hurdle with these voice AI assistants. Good to see Twilio is now in the game!

    • @TwilioDevs
      @TwilioDevs  3 months ago

      @wordpressobsessed9067 It's one of OpenAI's voices. I agree it's very natural sounding!

    • @0xb1sh0p8
      @0xb1sh0p8 3 months ago +1

      @TwilioDevs Correct, you'll need a persistent WebSocket listening for a unique stream for each number/assistant you're hosting.

    • @aiplaygrounds
      @aiplaygrounds 3 months ago

      You can probably run it through your CRM before answering to get all the info on the phone number, if any.

  • @WaiZe0
    @WaiZe0 3 months ago

    How can I set the input language to something other than English?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      You can change the system prompt to indicate the language you want to use. It will also usually match whatever language you speak to it.

    • @WaiZe0
      @WaiZe0 3 months ago

      @TwilioDevs I’ve created a Twilio program before, and with the Gather method I was able to choose the language, but with the OpenAI Realtime API I tried their language parameter for whisper-1 and it doesn’t work.
      And sadly the current state of auto-detection is 75% flawed in my tests.

    • @TwilioDevs
      @TwilioDevs  3 months ago

      @WaiZe0 At 03:02 we set up a system prompt. You can tell it what language you'd like it to use in that prompt (and also tell it how to greet the caller, etc.). In my testing it has obeyed that quite well. I told it to converse only in Spanish and I wasn't able to get it to break out of that even by insisting I only knew English.

    • @WaiZe0
      @WaiZe0 3 months ago

      @TwilioDevs I noticed it works well in English and Spanish, but I’m working with Arabic and it gets it right only 1 in 10 times, even with the clearest system prompt. Is there a way to set the language like Twilio’s Gather method?

  • @IdkJustCookingDude
    @IdkJustCookingDude 3 months ago

    I'm so frustrated, I'm literally at the last step. I got Twilio and the OpenAI API to work together, and when I call the phone number it plays the "please wait" greeting about the AI agent brought to you by OpenAI and Twilio, then says "okay, you can speak", and then hangs up. Can anyone help? I have been using ChatGPT and Claude and they're both making me run around in circles.

    • @TwilioDevs
      @TwilioDevs  3 months ago

      The symptoms sound like an OpenAI Realtime API key issue. Seems like the call is hanging up at the point the OpenAI Realtime API should be getting connected. Are you getting any errors in the terminal?
      Please refer to the blog post or GitHub repo in the video description to make sure your code is 100% correct. You can also check on your API key's access at platform.openai.com

  • @RobertSpartacus
    @RobertSpartacus 3 months ago

    Any guide on how to add function calling? Also, can't we buy an Indian number right now?

  • @muhammadatif9263
    @muhammadatif9263 3 months ago

    What is the reason for using fastify over express?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      The websocket module for fastify is nice to work with and fastify is more performant than Express for this use case.

  • @BrainCandyQuiz
    @BrainCandyQuiz 3 months ago +1

    Confused. The instructions say "Step 2: Get your Account Sid and Auth Token from the Twilio Console to get started.", but nowhere do they say what to do with them. Also, the call connects OK, but it can't seem to hear me, and then it disconnects after 5 seconds. Related? Here is the log:
    Connected to the OpenAI Realtime API
    Sending session update: {"type":"session.update","session":{"turn_detection":{"type":"server_vad"},"input_audio_format":"g711_ulaw","output_audio_format":"g711_ulaw","voice":"alloy","instructions":"You are a helpful and bubbly AI assistant who loves to chat about anything the user is interested about and is prepared to offer them facts. You have a penchant for dad jokes, owl jokes, and rickrolling - subtly. Always stay positive, but work in a joke when appropriate.","modalities":["text","audio"],"temperature":0.8}}
    Disconnected from the OpenAI Realtime API

    • @TwilioDevs
      @TwilioDevs  3 months ago

      If the call is working at all, the Twilio side of this is fine which means you're okay on the Twilio credentials front. This looks like it's not getting audio over to the OpenAI API. There are some more logging types you can enable with the code in the blog post. Can you try turning those on and see what you get in the terminal?
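
As an illustration of the extra logging suggested here, the sketch below summarizes a handful of Realtime server events, including `error`, which often explains an early disconnect. The list of event types is my own selection rather than the blog post's exact list, and `describeEvent` is a hypothetical helper.

```javascript
// Event types worth logging while debugging; extend or trim the list as needed.
const LOG_EVENT_TYPES = new Set([
  'error',
  'session.created',
  'session.updated',
  'input_audio_buffer.speech_started',
  'input_audio_buffer.speech_stopped',
  'input_audio_buffer.committed',
  'response.done',
]);

// Return a one-line summary for events of interest, or null to stay quiet.
function describeEvent(rawMessage) {
  let event;
  try {
    event = JSON.parse(rawMessage);
  } catch {
    return 'unparseable message from OpenAI';
  }
  if (!LOG_EVENT_TYPES.has(event.type)) return null;
  if (event.type === 'error') {
    // Error events carry their details in a nested `error` object.
    return `error: ${event.error?.message ?? 'unknown'}`;
  }
  return `event: ${event.type}`;
}

// Hypothetical usage inside the OpenAI 'message' handler:
// const summary = describeEvent(data.toString());
// if (summary) console.log(summary);
```

If no `speech_started` event ever shows up in the log, caller audio is probably not reaching OpenAI at all.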

  • @radoslav07
    @radoslav07 3 months ago

    What if I want to use this example without a Twilio call, directly from my mic and a web page?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      You'll need to stream audio from your local microphone to the OpenAI websocket.
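
As a rough sketch of what that streaming involves, microphone samples have to be converted to the session's input format and sent as `input_audio_buffer.append` events. This example assumes the session uses `pcm16` input rather than the `g711_ulaw` format the Twilio tutorial configures, and it uses Node's `Buffer` for base64 encoding; `floatToPcm16` and `buildAppendEvent` are hypothetical helpers.

```javascript
// Convert Float32 samples in [-1, 1] to 16-bit little-endian PCM.
function floatToPcm16(samples) {
  const buf = Buffer.alloc(samples.length * 2);
  samples.forEach((sample, i) => {
    // Clamp first so out-of-range samples do not overflow the 16-bit range.
    const clamped = Math.max(-1, Math.min(1, sample));
    buf.writeInt16LE(Math.round(clamped * 32767), i * 2);
  });
  return buf;
}

// Wrap a chunk of PCM audio in the event the Realtime API expects.
function buildAppendEvent(pcmBuffer) {
  return { type: 'input_audio_buffer.append', audio: pcmBuffer.toString('base64') };
}

// Hypothetical usage: `chunk` is a Float32Array from the microphone.
// openAiWs.send(JSON.stringify(buildAppendEvent(floatToPcm16(chunk))));
```

In a browser the same idea applies, but the encoding would need a browser-side base64 routine instead of `Buffer`, and the API key should stay on a server.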

  • @RobertSpartacus
    @RobertSpartacus 3 months ago

    Is there a way to buy Indian numbers on Twilio? If not, what is the workaround right now?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Hi Bharath,
      Thank you for getting in touch with our Social Support Team. Unfortunately, Twilio does not offer the ability to purchase Indian phone numbers directly. However, there are some workarounds and considerations you can explore.
      Kindly dm us for more information.

  • @KirkBell
    @KirkBell 3 months ago

    Can the default voices' accents be changed to accents like Australian, English (UK), and others?

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      I believe I read that OpenAI will detect the regional accent and speak the responses in that accent. I think you can add that to the instructions (SYSTEM_MESSAGE) in the app to help reinforce the goal.

  • @SathishM-n8i
    @SathishM-n8i 3 months ago

    This is for incoming calls, right? What about outgoing calls?

  • @limebulls
    @limebulls 3 months ago

    Can you make a tutorial for this on azure as well?

  • @riley_blackwell
    @riley_blackwell 3 months ago +1

    This is great!

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      Thanks for watching!

  • @ompawaskar507
    @ompawaskar507 3 months ago

    Does this work with Gemini 1.5 Flash?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      This tutorial is specifically for the OpenAI Realtime API.

  • @EDashMan
    @EDashMan 3 months ago +1

    Is the speed really this fast?

    • @TwilioDevs
      @TwilioDevs  3 months ago +4

      Yes! The phone calls shown are not sped up or edited 😃

    • @0xb1sh0p8
      @0xb1sh0p8 3 months ago +2

      I can vouch for the speed. I'm just wrapping up development on a project that uses this flow along with some other options for generating assistants.

    • @EDashMan
      @EDashMan 3 months ago

      @0xb1sh0p8 How do you know if you have access to the API? Other than a server 403 error, I’m not getting any exact message regarding the API. Do you have it available in the playground?

    • @0xb1sh0p8
      @0xb1sh0p8 3 months ago

      @EDashMan I don't have anything public right now. When you sign up with Twilio, you'll create an account. When you go to that account's dashboard and scroll down, it will show you your SID and Auth Token to access the API

    • @0xb1sh0p8
      @0xb1sh0p8 3 months ago

      @EDashMan hmm, did my last comment get deleted? You'll have access to the API when you sign up and create an account. At the bottom of the account page you'll see your SID and Auth Token to use.

  • @akelebelay1025
    @akelebelay1025 3 months ago +1

    can you do it using python?

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      Yes! Should we make a Python video tutorial?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      For now, here's a blog post: www.twilio.com/en-us/blog/voice-ai-assistant-openai-realtime-api-python

    • @TwilioDevs
      @TwilioDevs  2 months ago

      Sorry for the delay!
      ua-cam.com/video/OVguB1h-eTs/v-deo.html

  • @sfsadfsadfasdf
    @sfsadfsadfasdf 3 months ago +1

    This is the future. The problem is that OpenAI's voices don't sound very good in Spanish; they have something like an American accent. Is there a way to swap in an ElevenLabs voice instead of GPT's voice without losing the real-time benefit of the Twilio and OpenAI integration?

    • @mandrews817
      @mandrews817 3 months ago +3

      If you use advanced mode, switch your system language to Spanish, open a new conversation, and tell the assistant: "Can you speak to me using a Castilian Spanish accent?"

    • @boytenesee3494
      @boytenesee3494 3 months ago

      The Realtime API allows either a speech or a text response. You can send the text response to ElevenLabs and then push the resulting audio back into Twilio.

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Have you tried the options provided by the other commenters yet? Would love to help you find success.

    • @sfsadfsadfasdf
      @sfsadfsadfasdf 3 months ago

      @mandrews817 But is advanced mode available in the API, or are you talking about the voice assistant that OpenAI is currently launching? If it's the former, could you please tell me where I can read more about it? I have never heard of an advanced mode in the speech-to-text API.

    • @sfsadfsadfasdf
      @sfsadfsadfasdf 3 months ago +1

      @boytenesee3494 I'll try this. It may delay the responses a little, but I don't think it will be very noticeable. Thank you for the idea.
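
The text-then-TTS idea from this thread could be sketched roughly as below. This is only an illustration under several assumptions: the function names are made up for this example, `synthesize` is a hypothetical stand-in for whatever third-party TTS call you use (it is assumed to return 8 kHz mu-law audio bytes, the format Twilio Media Streams play back), and `send` is assumed to write a string to the Twilio WebSocket. The `session.update` shape mirrors the one used in the tutorial, with `modalities` reduced to text.

```javascript
// Ask the Realtime API for text-only responses, so no OpenAI audio is generated.
function buildTextOnlySessionUpdate(instructions) {
  return {
    type: "session.update",
    session: {
      modalities: ["text"],
      instructions,
    },
  };
}

// Wrap raw mu-law audio bytes in the "media" message shape that
// Twilio Media Streams accept on the WebSocket.
function buildTwilioMediaMessage(streamSid, ulawAudio) {
  return {
    event: "media",
    streamSid,
    media: { payload: Buffer.from(ulawAudio).toString("base64") },
  };
}

// Hypothetical glue code: `synthesize` is NOT a real library call. It stands
// in for an external TTS request that resolves to mu-law audio bytes.
async function relayTextAsAudio(text, streamSid, synthesize, send) {
  const ulawAudio = await synthesize(text);
  send(JSON.stringify(buildTwilioMediaMessage(streamSid, ulawAudio)));
}
```

The extra network hop to the TTS provider is where the added delay mentioned above would come from, so streaming the synthesized audio in small chunks rather than waiting for the full clip is probably worth trying.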

  • @60pluscrazy
    @60pluscrazy 3 months ago +1

    Thanks 🎉

  • @johns332
    @johns332 3 months ago +1

    Anyone else getting 403 errors?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      From the video description:
      "OpenAI is rolling out Realtime API access incrementally. Please watch their site for updates."
      That incremental rollout is the likely cause of the 403 errors.

    • @johns332
      @johns332 3 months ago

      Darn, thanks for the video and response though! @TwilioDevs

    • @TwilioDevs
      @TwilioDevs  3 months ago

      @johns332 Thank you for watching 😃 Let us know when you get access. Happy building!
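
For anyone trying to tell a 403 apart from other connection failures, here is a small sketch, assuming the OpenAI connection is made with the `ws` package as in the tutorial. `ws` clients emit an `unexpected-response` event when the handshake returns a non-upgrade HTTP status. The helper names and the wording of the hints are assumptions made for this example, not official error descriptions.

```javascript
// Turn the HTTP status from a failed WebSocket handshake into a readable hint.
// The explanations are this sketch's guesses, not official error text.
function describeHandshakeStatus(statusCode) {
  switch (statusCode) {
    case 401:
      return "401: the API key was missing or rejected.";
    case 403:
      return "403: the request was refused; the account may not have Realtime API access yet.";
    default:
      return `${statusCode}: unexpected response during the WebSocket handshake.`;
  }
}

// Attach the hint to a `ws` client socket. Once this listener is registered,
// the listener is responsible for ending the failed request.
function attachHandshakeDiagnostics(socket, log = console.error) {
  socket.on("unexpected-response", (request, response) => {
    log(describeHandshakeStatus(response.statusCode));
    if (request && typeof request.destroy === "function") request.destroy();
  });
  return socket;
}
```

Calling `attachHandshakeDiagnostics(openAiWs)` right after creating the socket (where `openAiWs` is whatever you named the OpenAI connection) should print the hint instead of a generic connection error.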

  • @aiplaygrounds
    @aiplaygrounds 3 months ago +1

    My next project ❤

    • @TwilioDevs
      @TwilioDevs  3 months ago +1

      Let us know how it goes!

  • @TwilioDevs
    @TwilioDevs  2 months ago

    Would you prefer to see this tutorial in Python? Check it out here: ua-cam.com/video/OVguB1h-eTs/v-deo.html

  • @musumo1908
    @musumo1908 2 months ago

    Using tools and the Azure realtime endpoint

  • @nixoncode
    @nixoncode 3 months ago +1

    Somewhat helpful, but why would you want this?

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Probably lots of use cases. This example is very basic, but imagine an assistant that replaces the typical phone tree at a company with something that speaks naturally to callers, can answer some of their questions, and can ultimately redirect the call to an actual human if it detects that it needs to.

  • @NexGenUltra
    @NexGenUltra 3 months ago

    There is an issue with the quality of the answers, especially when dealing with local dialects. While it can somewhat handle English (not well), it struggles significantly with dialects like Darija and other regional languages. The difference in transcription accuracy between the current implementation and the OpenAI Playground is very noticeable.

  • @mohamudalifarah7722
    @mohamudalifarah7722 3 months ago +1

    Node.js 18+

    • @TwilioDevs
      @TwilioDevs  3 months ago

      Correct, version 18 or higher. Not sure why I said 18+ like it was an age or something 🤣
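
A quick way to enforce the "version 18 or higher" requirement at startup is a check like the one below. It is a minimal sketch: `meetsMinimumNode` is a name invented for this example, and only the major version is compared.

```javascript
// Return true when a Node.js version string such as "18.19.0" or "v20.11.1"
// has a major version at or above the required minimum.
function meetsMinimumNode(version, minimumMajor = 18) {
  const major = Number.parseInt(String(version).replace(/^v/, "").split(".")[0], 10);
  return Number.isInteger(major) && major >= minimumMajor;
}

// process.versions.node holds the running version without the leading "v".
if (!meetsMinimumNode(process.versions.node)) {
  console.error(`Node.js 18 or higher is required; found ${process.versions.node}`);
  process.exitCode = 1;
}
```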

  • @cscrowley1
    @cscrowley1 3 months ago

    The OpenAI dashboard's billing limits page says I do have access:
    "Realtime
    gpt-4o-realtime-preview 20,000 TPM 5,000 RPM
    gpt-4o-realtime-preview-2024-10-01 20,000 TPM 5,000 RPM"
    But I can only hear the clunky Twilio TTS at the beginning of the call and never get connected. Also, a DTMF button press seems to end the session:
    "Server is listening on port 5050, Client connected
    Received non-media event: connected
    Incoming stream has started MZcbf17dca62564c8a46602ce815cd43bd
    Connected to the OpenAI Realtime API
    Sending session update: {"type":"session.update","session":{"turn_detection":{"type":"server_vad"},"input_audio_format":"g711_ulaw","output_audio_format":"g711_ulaw","voice":"alloy","instructions":"You are a helpful and bubbly AI assistant who loves to chat about anything the user is interested about and is prepared to offer them facts. You have a penchant for dad jokes, owl jokes, and rickrolling - subtly. Always stay positive, but work in a joke when appropriate.","modalities":["text","audio"],"temperature":0.8}}
    Received non-media event: dtmf
    Disconnected from the OpenAI Realtime API
    Received non-media event: stop
    Client disconnected."
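
The log above shows a `dtmf` event arriving shortly before the disconnect, but the output alone does not show whether the two are related. One way to narrow it down is to handle each Twilio Media Streams event explicitly and log what arrives. The sketch below is a hypothetical helper (`classifyTwilioEvent` is not part of the tutorial code) and assumes the message shapes Twilio documents for Media Streams, including a `dtmf.digit` field on DTMF events.

```javascript
// Parse one raw WebSocket message from Twilio Media Streams and reduce it
// to a small description of what happened.
function classifyTwilioEvent(rawMessage) {
  let data;
  try {
    data = JSON.parse(rawMessage);
  } catch {
    return { kind: "invalid" };
  }
  if (data === null || typeof data !== "object") {
    return { kind: "invalid" };
  }
  switch (data.event) {
    case "connected":
      return { kind: "connected" };
    case "start":
      return { kind: "start", streamSid: data.start?.streamSid ?? null };
    case "media":
      return { kind: "media", payload: data.media?.payload ?? null };
    case "dtmf":
      // A key press; keep the digit so it can be logged or acted on.
      return { kind: "dtmf", digit: data.dtmf?.digit ?? null };
    case "stop":
      return { kind: "stop" };
    default:
      return { kind: "other", event: data.event ?? null };
  }
}
```

Logging the result for every message, together with any `close` or `error` events on the OpenAI socket, should make it clearer which side ends the session first.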