Tutorial: Create a Voice-Enabled chatbot using OpenAI & Azure Cognitive Services in Python
Вставка
- Опубліковано 29 чер 2024
- 🔗 Links:
OpenAI: platform.openai.com
Source repository: github.com/CarolineChiari/Con...
❓ What did you think of this video❓
www.carolinechiari.com/feedba...
💪🙌Help support the channel🙌💪
Yotta Savings: withyotta.page.link/EV3MpjCEG...
⏱ Timestamps ⏱
00:00:00 Demo
00:00:31 Introduction
00:01:21 Creating Azure Speech Service
00:02:43 Azure Speech Pricing
00:04:16 Getting Azure Speech Keys
00:04:56 Getting the OpenAI API Key
00:05:42 Creating the .env file
00:06:33 Creating the gitignore file
00:07:02 Drawing the application outline
00:10:44 Creating the application skeleton
00:11:18 Loading the environment variables
00:11:42 Installing dotenv library
00:13:16 Creating output folder
00:14:33 Writing the speech-to-text code
00:15:36 Installing the cognitive services speech library
00:16:13 Importing the environment settings for speech services
00:16:47 How Asynchronous speech services works
00:19:15 Speech config
00:20:18 Audio config
00:20:55 Speech Recognizer
00:21:27 Speech Recognizer events
00:22:35 Starting/stopping recognition
00:24:24 First try
00:24:59 Handling transcription results
00:30:26 Recording until done
00:37:34 Caroline Messed up...
00:39:00 Adding sound indicators
00:41:42 Returning speech processing results
00:42:48 Creating OpenAI processing
00:48:34 Creating the Dialog loop
00:50:24 Engineering the dialog prompt
00:54:38 Talking to an AI
00:55:22 Fixing Typo
00:56:31 Writing the test-to-speech code
00:59:00 Playing back the generated speech
01:00:00 Finalizing the dialog loop
01:00:34 Final result
01:02:41 Conclusion - Наука та технологія
This was so fun to create and record! I hope you enjoyed it. Here are a few things missing from this so you can practice your skills:
- Save all the the data to the output folder (Speech to text/text to speech/OpenAI completion)
- Use the new chat API instead of completions: platform.openai.com/docs/guides/chat
Hello @carolinechiari , can u please help me with one issue . I am running this with no errors but my session is getting started and automatically ended suddenly. I am unable to put my speech in it
An updated version with less delay would be great! If you could do one tutorial all on azure (llm & speech) with low delay (async by example) I would even spend 50 Euro to you as a thank you :)
I was making something like this and had a few gaps I couldn't fix. I think this covers everything I was stuck on. Thank you, thank you, thank you! 💜
You’re very welcome!
This is cool! Thanks for sharing. Any easy way to get this in to a mobile app or website? (For a developer test of a public facing app, we could ask the user to provide API keys themselves on startup, and have a pointer to a tutorial on how to create the Azure/OpenAI services to obtain them)
Hi Caroline, great video learned allot, just struggling to get it running im getting a error that i think is related to a environmental problem but cannot figure it out. Error = zsh: segmentation fault /usr/local/bin/python3
Nice tutorial - Could anyone please share how can this be done on a web app
For Example: A web app taking input from client's browser using microphone and returning the response.
It’s a lot of work but I can do it. I have multiple webapps that use this.
@@carolinechiari Thanks for your amazing work. Even I am looking for how we can put just speech to text on web app. Could you point to the video link which does that?
@maazkhan2312 were you able to work out speech to text on web app?
@@saurabhkhodake yes I did, I can share a reference to follow
I have a few that do it, but creating a tutorial would take a few hours to make and even more to edit. I’ll do it, I just need my voice to recover from surgery a little more.
Hello @carolinechiari , can u please help me with one issue . I am running this with no errors but my session is getting started and automatically ended suddenly. I am unable to put my speech in it
hello there, how did you open Azure speech project in your VS Code I tried to find a way to upload the Project on GitHub. Thanks for making this video.
You can download it and right click on the folder and click open in VS Code. I Hope that helps. Let me know if it doesn’t
@@carolinechiari Thanks for your support
Can I do this with the free subscription? It is telling me i cant create new resources?
How to stream the audio while the text is being generated by the LLM? can you give me the sample code?
I don’t have code to do this, but essentially, you would have to use the asynchronous api and send a text to speech request at the end of every sentence.
@@carolinechiariyeah an example / updated code tutorial would be awesome!