It's actually easy you first generate a face of your choice on midjourney or dall-e and you import that picture in an app like facedance, ifunface, speakpick, etc..
idk if i commented before but i really enjoy this. its simple to understand and easy to follow especially youre clean code and the use of comments makes it verry easy to code allong and to customize stuff as needed.
I had some problems with the speak and talk part, so it ended up like a chatbot that works with "hotkeys"/command triggers for input to make specific things. Like, command trigger "music" opens a youtube playlist and things like that. I'm happy with the results :) edit: now It can "talk"... I generated some phrases on viocevox, downloaded the audio files and made it play along with the texts in the code at some key points
Great example, well explained and acutally works. I have tried multiple youtube examples and forever end up in a rabbit hole spiral with chatgpt providing corrections to then only create further errors. I really liked how you explained each function and process. I was a great tutorial in provide clear and precise instructions that were very informative. Thank you.
@@Ai_Austin Hi, i just wanted to ask about an error I am getting! I have done pip3 to install everything, and when I run it I get an error saying pyaudio wants installed. I go and do everything "pip3 install pyaudio" "pip install pyaudio" etc. Nothing is working, it does like half of it then says that "Could not build wheels for pyaudio" blah blah blah. Then it says that there's an error with "#include 'portaudio.h' ". Do you know how to fix this?????
I have just tried to do something like this for my program, but you are the first one, thank you very much, great job. Now I will use it for my program. Thank you.
If you are looking for a topic for your next video, I would love to see you take this to a web interface using flask. I have been trying so many different ways from other videos but always end up in a dark rabbit how with a chatbot, unable to find something that works. It keeps recommending code that breaks in so many ways and loses the original function that was working.
Great tutorial, Austin. Simple, to the point. Would it make sense to upgrade this to the Turbo model now? Also, could you do a tutorial about fine-tuning {prompt: x, response: y} to clone your friends using chat history data?
Absolutely. If you just change the engine variable in the open ai function of the code, you can just specify “gpt-3.5-turbo”. Then it will send your prompts to the new version of the API. Fine tuning is absolutely in the video pipeline. Have a few others ahead of it but will be creating a fine tuning tutorial here soon.
I understood nothing, but damn IT stuff and programing is fascinating, it would probably take me 1000 years to learn it, that is why all I can do is admire people like you.
I bet you could learn it. Its not reserved for some super high iq humans. Checkout the free online book “automate the boring stuff with python”. Give yourself a month. Study it 1-2 hour 3 times a week, this program will look like fluent english!
The big problem with this is that chatGPT is only relevant for many queries up to 2021. You really need to make this to interact with Bing Chat which has access to current data.
Great idea Mike, I got a Bing AI Voice Assistant Tutorial coming soon. You are right, having access to current data for our voice assistant is a huge improvement and I’m working on getting that out for you guys now! The bing voice assistant I am making will be completely free if you have beta access to bing as well. Unlimited questions.
@@Ai_Austin Sounds good, though I noticed this morning I already have both voice input and output available on Bing Chat. Don't know when Microsoft added that. Sadly you have to press the microphone icon to activate whereas it would be much more useful to be able to start with some sort of voice activation like Google assistant (especially if it could be customised). What we really need is something as interactive as in the movie 'Her' (I and many like me would pay a monthly fee for that btw) ....Keep up the good work
This is great, and I'd love to try it, but the text is so small and kind of blurred that it's a challenge to make out the code. Will you add it to the description or pinned comment? That'd be really helpful.
I made a similar code, it s very easy, but you can improve yours by saving in a txt file all the questions and answers so it can memorize what you said before. You just have to give all the content of the file for each request
WOW you are an incredible tutor i have been an instructor/teacher for 30 years now and i NEVER seen code writing and concepts explained so clearly and understandable like you just did here explaining and teaching code is not so trivial as many would thing and there are plenty examples for that on the net GREAT video (and note that im not even talking about the specific content itself) keep up the good work, SUBSCIBED
c'mon bro really? you've never in you're 30 whole years of teaching never seen it explained better amongst professional teachers? I mean sure the video is informative but c'mon.
Sure, here's an improved version of your statement: "I used ChatGPT to analyze the script of this video and engaged in a conversation where ChatGPT3 generated a micro-detailed strategy to guide you through every last detail that you might need to know. In summary, during our conversation, you asked about creating a GPT-3 powered voice assistant with Python. I provided you with a step-by-step guide that covers everything from importing necessary libraries and setting up the OpenAI API key to defining functions for transcribing audio to text, generating responses, and speaking responses. We also discussed the importance of error handling and adding additional features to improve the accuracy and usefulness of the voice assistant."
this is a great tutorial!. I really love it if you upgrade it. What i mean by upgrade is that, import the python programme in to any type of device such as arduino or raspberry pi ( If possible). Make it wireless.
Thanks Mate! Through this i was able to completly copy famous Chatbots like Siri or Alexa and thanks to the python statement "in", i was able to create a bot, who can filter my commands from whole and variable sentences. My Bot almost feels like a human teacher i can ask any question 😁 ... well ... almost ... davinci seems not to be able to tell the correct date and time since both is created from learning and not from actual live data (i asked GPT directly, Davinci refused to give me a usefull answer 😂)
I watched your video and really enjoyed it. Please make another project like this where it will be a mobile application and whenever I call genius it will respond like Siri or google assistant. And if you make a video let me know with a little reply. In the end, I will say one thing, you are a wonderful teacher
This code works, but it is not optimal. Using speech_recognition to detect the initial command is slow because it requires sending the audio to a server, waiting for the server to process it with a large model, and then receiving the result. Ideally, a pre-trained KWS model that can recognize a single command and runs locally should be used instead.
Great video! I found it really informative and helpful. Thanks for sharing your knowledge and expertise with us. Looking forward to more videos like this in the future!
Very cool, make it and I'll use it, especially would love it if we could upload a Mid-journey etc talking avatar of our choice (or photo that could be adapted).
In case anyone is wondering as of today (2024) basically everything is outdated in this video unfortunately ;-; Hopefully this can save some people from trying and failing.
can you make this with GPT4all? would love to see a video on how to get this running on a offline system since you dont want to be depending on their model, if it gets out of hand we need backup models
with sr.Microphone() as source: recognizer sr.Recognizer() audio = recognizer.listen(source) its highlightin "sr" as an error and when i run it it says invalid syntax, and when i try to pip install the library it says that its already installed
I like your delivery style, however to be really effective the code needs to be legible, at least for those of us that are great coders. Even after magnification itsome of it was just a blur. It would be excellent if you could provide a file with the code in it.
I appreciate the feedback, is that happening even on 1080p with a computer monitor? Either way Ill make sure zoom in on the code and start linking a github repo for the projects. Thanks Nicholas!
@@Ai_Austin On my MacBook Pro Retina from 2015, the code is very readable at 1080p, only slightly blurred. Still, it would be convenient not having to write the code, but it might be a better learning experience writing the code myself.
Hello, loved the video works wonders. Would you be able to make a video series on how to add other features? such as opening apps, opening websites, setting alarms, adding a todo list & having it speak at cirten times of the day, say you want an alarm at 7am the bot would say good morning (name) today is (Date) with the weather being (weather info) & so forth I think it would be really cool
This is a good video it could be even better though with a release of GPT 3.5 turbo if you would take and show this again using GPT 3.5 turbo and whisper I think he would have a lot better response and a lot of people will really jump on wanting to do this. Thanks.
I have been researching Whisper. Its barrier to entry is a lot higher. Meaning if you want to run Whisper without having to pay for every question to transcribe, it needs to be done locally. Which puts you in the position of either needing a PC with 10+ GB of video ram. I also have not seen any evidence that the whisper api performs better in transcription than google speech recognition. OpenAI is the hype but I don’t want to make people feel obligated to shell out money for something that is currently possible for free. If one needed offline transcribing and has a beast of a pc to power the python program, Whisper would be a great choice.
I think today’s computers are probably powerful enough to handle text to speech I am a blind individual and I use several apps on my phone on my computer that dude just this kind of conversion and they’re not high power apps or high power computer. Some of them sent off to the Internet for processing but one of the things that could be done. If CPU horsepower is a real concern is push it off to the GPU most computers have Decent graphics processing units that would process much faster than a CPU ever could and it doesn’t take a lot of code to do that. I do think there’s a little more involved in writing code but I don’t think it’s any strong barrier. I think it’s just something Hass to be learn how to do. I’m in the process of trying to learn some of these things myself and I don’t see it as difficult as what you think it might be Again being blind it’s a little hard for me to quickly ramp up to the stuff but I’m getting there
Thanks for the great video :) One nit-pick, the text is so small it's a struggle to read, I'm constantly leaning into the screen just to know what I'm looking at. There's heaps of dead space around your avatar, maybe consider zooming in a bit on your next vid.
the video was good and i followed it but al last what files Did you download while you were running the programm can you tell and if i want to convert the voice to jarvis's voice how can i do it
It took me a long time to figure out a few things, and by the end all I found out is that the API key isn't free. WARN THE PUBLIC NEXT TIME. Thank you.
Yeah for future reference, OpenAI doesnt give any API access for free. This is more of a Python tutorial for people already aware of the API’s and wanting to create something or learn something with Python than a “here is everything you could know about OpenAI as a business” If learning Python isn’t of value, I definitely would watch my videos that are not Python tutorials. Thanks for your feedback! 🙏
Hi, how do we change the voice to sound a bit like normal voice. And how do we make this work like google AI. For it to come up on our phones when we say 'Hey Genius' Or just call her name.
I've done the same in PHP using a few different APIs and streaming the data as to reduce the latency as much as possible, but its still laggy. Reducing the lag between a question and response is the tricky bit.
Id check out my new Bard voice assistant tutorial! Its faster than openai's api's and free. The past week I have been using Bard way more than chatgpt. Its just better for fact based responses that need to check recent internet data to verify its answers. And somehow faster than chatgpt without back-searching google.
Had to get rid of the underscore in speech_recognition to get that to work. And I had to run pip install pyaudio to get it to work, but it works. Does this thing have contextual memory? Will it remember by conversations with it? I don't see any logging or context, so I don't think it does.
LOL, I built this app thinking "oh I'll to the old one becuase it will be less complicated" I finally got this thing to work and my AI is dumb as a brick, good exercise but now I will build the newer version.
😂 Yeah it's crazy how fast these things are advancing. My newest Gemini tutorial would be my recommendation. It's going to be the smartest, fastest and I put a lot more work into that video to make it as thorough as possible. It being longer doesn't make it harder. It just is a a more thorough tutorial!
That is a great question. Ive yet to find the need to learn whisper. Its my understanding that its superior for language translation and perception of accents. It also isn’t free like the speech recognition method i showed.
@@Ai_Austin I was testing Whisper over the weekend. It works great - English is excellent, while even small languages are acceptable with an editor. API is not that expensive, you can transcribe a movie for around 0,50 EUR. However, there is also possibility to install it on your server, running it locally and with that it will only cost the price of the infrastructure.
Yes we can modify parameters of the tts. Ask chatgpt how you can modify the parameters of the tts and you will have a little code snippet. just copy and paste the three lines after the initialization, you can modify the values for testing different voices and speech rates
i need help, it says "Python was not found; run without arguments to install from the microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases." what do i do??
Bro you have used gpt-3 not chat gpt. Chatgpt have extra features such as it can answer question based on previous question and responses. For example on chatgpt Question :-Creat a basic html page Answer:- *code* Question:- now add the background colour to black Answer:- * modified code with background colour black * But when u use gpt-3 for the same it will treat both the questions differently.. and not give the upgraded HTML code.
Sounds like you found some feature you want to add. Feel free to add a fork to the github repo linked in the new voice assistant if you wanted to actually contribute :)
@@arjund1173 ngl so I just copy and pasted my code into ChatGPT and told it that it wasn't hearing my voice and ChatGPT fixed it for me lol. Also make sure you have the packages installed too
All are actually pretty simple tasks with Python. Just a matter of adding a wake word for the new task and adding the few lines of code needed for each desired task you mentioned. ChatGPT could probably even do it for you!
i don't know anything about python but this is fascinating....i would also like to store all my chats realtime in some sort of folder and hierarachy so that i may catalogue them or use them later for referal
One idea would be to create an SQL database or a spreadsheet. Then have your python program write that data every time you ask a question. Those type of tasks python is really fast at, so you wont even notice any slowing of your program.
Is it possible to retain a session-like memory of previously asked questions with the API like we can do on the ChatGPT web interface? For instance if I ask "Where is the oldest tree located"? and follow it by "How tall is it?", can we make API responses retain the context?
Using 3.5-turbo it is possible to have contextual memory. It would definitely add some complexity and would potentially want to create a command to refresh memory if you did so.
so i'm having a slight problem when i say genius after starting the bot it comes back with: An error occurred: local variable 'filename' referenced before assignment. is there a fix for this?
Nice. Is it possible to create a Telegram bot using OpenAI's latest model released 3 days ago? Using the chat endpoint? It would be nice if you create a tutorial for that.
Hi, please do you know how to run the script in windows using VS code? I'm a junior developer trying to run this code and dont work :( I have already python installed in my pc. Copy and paste the code, paste my OpenAI apy key in place, run the python3 app.py in a VS terminal but did't work
@@marekkupis3086 Hi, No, I can't solve yet :( if you can solve anytime please let me know. I'm almost sure it is about a configuration that we don't know. Maybe install a dev dependency or something
@@carlosmontes7088 Hey Carlos I was using VS Code today to run the script and it worked perfectly. Just had to rewrite it a bit. Moved the generate response function above transcribe audio to text in order for it to work.
Currently working on integrating some Amazon API's to make the a much more usable experience. Including no wake word. I have 0 technical background but in the last 6 hours with the help of chatgpt I have a working model
My goal exactly with these tutorials is you add your own preferences and upgrade upon these. Super cool to hear you’re doing it with no coding background man!
Debug your code: - openai module not present even if you installed it? uninstall and reinstall python3, then pip install every library. - pip install SpeechRecognition - pip install setuptools - DON'T FORGET TO INSERT YOUR API KEY!!! (because I did lol) Now it should work BUT you'll still have to upgrade the code to get a fully functional assistant. Otherwise, if you miss the first audio recording opportunity, it just bugs in an infinite loop.
Build Your Own GPT-4o Voice Assistant in Python with Groq, Llama3, OpenAI-TTS & Faster-Whisper
ua-cam.com/video/pi6gr_YHSuc/v-deo.html
Can you make a video showing how you doing this avatar talk. Cheers
+1
I would love to know how....
It's actually easy you first generate a face of your choice on midjourney or dall-e and you import that picture in an app like facedance, ifunface, speakpick, etc..
Unreal Engine Metahumans can do it but it is intense, Metahumans looks more realistic also
Cheers
idk if i commented before but i really enjoy this. its simple to understand and easy to follow especially youre clean code and the use of comments makes it verry easy to code allong and to customize stuff as needed.
I had some problems with the speak and talk part, so it ended up like a chatbot that works with "hotkeys"/command triggers for input to make specific things. Like, command trigger "music" opens a youtube playlist and things like that. I'm happy with the results :)
edit: now It can "talk"... I generated some phrases on viocevox, downloaded the audio files and made it play along with the texts in the code at some key points
can you please share your code with me? im interested in the music part. thank you :D
you prolly doing it in windwos. the say function doesnt work in windows its the same case as mine.
Bro, can you please share your code? I kind of need it for a school project.
I'll make it worth your time I swear!
Great example, well explained and acutally works. I have tried multiple youtube examples and forever end up in a rabbit hole spiral with chatgpt providing corrections to then only create further errors. I really liked how you explained each function and process. I was a great tutorial in provide clear and precise instructions that were very informative. Thank you.
Very glad that helped, Thanks for the feedback Marc! More python tutorials coming!
@@Ai_Austin Hi, i just wanted to ask about an error I am getting! I have done pip3 to install everything, and when I run it I get an error saying pyaudio wants installed. I go and do everything "pip3 install pyaudio" "pip install pyaudio" etc. Nothing is working, it does like half of it then says that "Could not build wheels for pyaudio" blah blah blah. Then it says that there's an error with "#include 'portaudio.h' ". Do you know how to fix this?????
@@PhoenixVids123 in windows it will show for me like that, so i am using raspberry pi the pip install pyaudio worked on the pi
I have just tried to do something like this for my program, but you are the first one, thank you very much, great job. Now I will use it for my program. Thank you.
If you are looking for a topic for your next video, I would love to see you take this to a web interface using flask. I have been trying so many different ways from other videos but always end up in a dark rabbit how with a chatbot, unable to find something that works. It keeps recommending code that breaks in so many ways and loses the original function that was working.
Great tutorial, Austin. Simple, to the point. Would it make sense to upgrade this to the Turbo model now? Also, could you do a tutorial about fine-tuning {prompt: x, response: y} to clone your friends using chat history data?
Absolutely. If you just change the engine variable in the open ai function of the code, you can just specify “gpt-3.5-turbo”. Then it will send your prompts to the new version of the API.
Fine tuning is absolutely in the video pipeline. Have a few others ahead of it but will be creating a fine tuning tutorial here soon.
@@Ai_Austin looks like fine tuning is not yet available for Turbo. I've tried with DaVinci without much success
@@Ai_Austin Hmm I'm getting " Engine does not exist" error... What am I overlooking?
@@sebaccimaster it s not working like that you have all another syntax for the completion. Look in internet you will find your answers
I understood nothing, but damn IT stuff and programing is fascinating, it would probably take me 1000 years to learn it, that is why all I can do is admire people like you.
I bet you could learn it. Its not reserved for some super high iq humans. Checkout the free online book “automate the boring stuff with python”. Give yourself a month. Study it 1-2 hour 3 times a week, this program will look like fluent english!
@@Ai_Austin thx buddy, I'll give it a try :)
Not sure if anyone asked you this... How did you create your AI visual in place of your face?
I'm totally intrigued!
Thanks laddie! I've been scratching my head with assemblyAI for days to make this to work, this went seemingly!
i know right assembly ai is so much money
that is the best animated avatar ive ever seen
The avatar is so well made, every one i saw blinks so much but this one does it at a reasonable phase.
The big problem with this is that chatGPT is only relevant for many queries up to 2021. You really need to make this to interact with Bing Chat which has access to current data.
Great idea Mike, I got a Bing AI Voice Assistant Tutorial coming soon. You are right, having access to current data for our voice assistant is a huge improvement and I’m working on getting that out for you guys now! The bing voice assistant I am making will be completely free if you have beta access to bing as well. Unlimited questions.
@@Ai_Austin Sounds good, though I noticed this morning I already have both voice input and output available on Bing Chat. Don't know when Microsoft added that. Sadly you have to press the microphone icon to activate whereas it would be much more useful to be able to start with some sort of voice activation like Google assistant (especially if it could be customised). What we really need is something as interactive as in the movie 'Her' (I and many like me would pay a monthly fee for that btw) ....Keep up the good work
Even in the comments you talk like a robot my guy.
@@VirginMostPowerfulllmaooo
This is great, and I'd love to try it, but the text is so small and kind of blurred that it's a challenge to make out the code. Will you add it to the description or pinned comment? That'd be really helpful.
I made a similar code, it s very easy, but you can improve yours by saving in a txt file all the questions and answers so it can memorize what you said before.
You just have to give all the content of the file for each request
How did you create your avatar and his speech? It sounds much better than the pyttsx's generated voice?
It's 11labs probably
WOW you are an incredible tutor
i have been an instructor/teacher for 30 years now and i NEVER seen code writing and concepts explained so clearly and understandable like you just did here
explaining and teaching code is not so trivial as many would thing and there are plenty examples for that on the net
GREAT video (and note that im not even talking about the specific content itself) keep up the good work,
SUBSCIBED
c'mon bro really? you've never in you're 30 whole years of teaching never seen it explained better amongst professional teachers? I mean sure the video is informative but c'mon.
Great video. As someone else mentioned, the code is a little small. In future videos would you be able to make it larger so it's easier to read.
Sure, here's an improved version of your statement:
"I used ChatGPT to analyze the script of this video and engaged in a conversation where ChatGPT3 generated a micro-detailed strategy to guide you through every last detail that you might need to know. In summary, during our conversation, you asked about creating a GPT-3 powered voice assistant with Python. I provided you with a step-by-step guide that covers everything from importing necessary libraries and setting up the OpenAI API key to defining functions for transcribing audio to text, generating responses, and speaking responses. We also discussed the importance of error handling and adding additional features to improve the accuracy and usefulness of the voice assistant."
Opa
help please it says An error occured: You exceeded your current quota, please check your plan and billing details.
You can make a iPhone shortcut with the api. It’s one response though, but works really well. Using your api key. No programming needed.
this is a great tutorial!. I really love it if you upgrade it. What i mean by upgrade is that, import the python programme in to any type of device such as arduino or raspberry pi ( If possible). Make it wireless.
This can be very beneficial for those who may have a disability. Very cool
Thanks Mate! Through this i was able to completly copy famous Chatbots like Siri or Alexa and thanks to the python statement "in", i was able to create a bot, who can filter my commands from whole and variable sentences. My Bot almost feels like a human teacher i can ask any question 😁 ... well ... almost ... davinci seems not to be able to tell the correct date and time since both is created from learning and not from actual live data (i asked GPT directly, Davinci refused to give me a usefull answer 😂)
I watched your video and really enjoyed it. Please make another project like this where it will be a mobile application and whenever I call genius it will respond like Siri or google assistant. And if you make a video let me know with a little reply.
In the end, I will say one thing, you are a wonderful teacher
This code works, but it is not optimal. Using speech_recognition to detect the initial command is slow because it requires sending the audio to a server, waiting for the server to process it with a large model, and then receiving the result. Ideally, a pre-trained KWS model that can recognize a single command and runs locally should be used instead.
how would you do that ?
Yes we're curious how to do that
@@hugosilva5842 you can use speech recognition library and it s source code to run it locally, it would be faster, but not that faster...
@@hugosilva5842 this requires only a lil bit of machine learning skills and a PC that costs about 2k
Great video! I found it really informative and helpful. Thanks for sharing your knowledge and expertise with us. Looking forward to more videos like this in the future!
Thanks Austin and increase the font size in the editor next time
I actually learned some things completely unrelated to the video. Thanks dude.
Very cool, make it and I'll use it, especially would love it if we could upload a Mid-journey etc talking avatar of our choice (or photo that could be adapted).
Can you edit or train speedchrecognition library so that it will able to convert our dialect/unknown language to text
SOOOOO sick dude!! can you teach us how to implement an avatar to integrate it with the gpt responses? such as yours in the video
In case anyone is wondering as of today (2024) basically everything is outdated in this video unfortunately ;-;
Hopefully this can save some people from trying and failing.
Watch the newer tutorials. Code tutorials don't last for ever when they use 3rd party API's!
My newer tutorials are far better than this one anyways!
That's me 😂
can you make this with GPT4all? would love to see a video on how to get this running on a offline system since you dont want to be depending on their model, if it gets out of hand we need backup models
how did you create the host animation..please explain
This is a really good video compared to most on here. Cheers
Can u help me? The code is all correct but there is a huge error saying that google is not an attribute in the module recognizer
Can some explain were we are headed with but me its gonna be wild 😂 loved the Dis-song😅🎉
Hey! How did you make your avatar? Can you make a tutorial on that?
ua-cam.com/video/yWRx-jCDBqo/v-deo.html
what are the dependencies that are required to be installed for this project?
pyttsx3, pyaudio, openai and SpeechRecognition
with sr.Microphone() as source:
recognizer sr.Recognizer()
audio = recognizer.listen(source)
its highlightin "sr" as an error and when i run it it says invalid syntax, and when i try to pip install the library it says that its already installed
I like your delivery style, however to be really effective the code needs to be legible, at least for those of us that are great coders. Even after magnification itsome of it was just a blur. It would be excellent if you could provide a file with the code in it.
I appreciate the feedback, is that happening even on 1080p with a computer monitor? Either way Ill make sure zoom in on the code and start linking a github repo for the projects. Thanks Nicholas!
@@Ai_Austin Thanks! Great video. ... Yes, text is fuzzy on an iMac 27" monitor.
@@Ai_Austin On my MacBook Pro Retina from 2015, the code is very readable at 1080p, only slightly blurred. Still, it would be convenient not having to write the code, but it might be a better learning experience writing the code myself.
@@ingmarxhoftovningsr6144 or you could get chat gpt to write it
@@StoutProper Yes, that's the way to go, I guess!
What was the process you used to create the speaking animation of your avatar?
I'm also interested in how this is done.
Hello, loved the video works wonders.
Would you be able to make a video series on how to add other features? such as opening apps, opening websites, setting alarms, adding a todo list & having it speak at cirten times of the day, say you want an alarm at 7am the bot would say good morning (name) today is (Date) with the weather being (weather info) & so forth I think it would be really cool
Bro you're just awesome,
Can U please make an app like this...
Coming very soon! 🫣
@@Ai_Austin thanks broooo
Amazing man....
This is really helpful stuff, this level of quality get an insta like from me
lol the end :D nice one
This is a good video it could be even better though with a release of GPT 3.5 turbo if you would take and show this again using GPT 3.5 turbo and whisper I think he would have a lot better response and a lot of people will really jump on wanting to do this. Thanks.
I have been researching Whisper. Its barrier to entry is a lot higher. Meaning if you want to run Whisper without having to pay for every question to transcribe, it needs to be done locally. Which puts you in the position of either needing a PC with 10+ GB of video ram.
I also have not seen any evidence that the whisper api performs better in transcription than google speech recognition. OpenAI is the hype but I don’t want to make people feel obligated to shell out money for something that is currently possible for free.
If one needed offline transcribing and has a beast of a pc to power the python program, Whisper would be a great choice.
I think today’s computers are probably powerful enough to handle text to speech I am a blind individual and I use several apps on my phone on my computer that dude just this kind of conversion and they’re not high power apps or high power computer. Some of them sent off to the Internet for processing but one of the things that could be done. If CPU horsepower is a real concern is push it off to the GPU most computers have Decent graphics processing units that would process much faster than a CPU ever could and it doesn’t take a lot of code to do that.
I do think there’s a little more involved in writing code but I don’t think it’s any strong barrier. I think it’s just something Hass to be learn how to do. I’m in the process of trying to learn some of these things myself and I don’t see it as difficult as what you think it might be Again being blind it’s a little hard for me to quickly ramp up to the stuff but I’m getting there
"That sounds like NLE CHOPPA" ahahah you got me there m8 love the video
Thanks for the great video :)
One nit-pick, the text is so small it's a struggle to read, I'm constantly leaning into the screen just to know what I'm looking at. There's heaps of dead space around your avatar, maybe consider zooming in a bit on your next vid.
How did you make the avatar to talk? (The guy talking). Do you have a guide for yhat?
the video was good and i followed it but al last what files Did you download while you were running the programm can you tell and if i want to convert the voice to jarvis's voice how can i do it
Great! Good coding,
This is amazing 💯, it would be cool if you could also create a video on how to control other applications, using this module, btw is it possible ?
Yes, absolutely possible.
Great video !. what if you add the talking avatar feature to the ai assistant like the one in the video?
Thank you 🙏 That is the future but for now the tools to do it would make an extremely expensive and slow assistant
Opa quero ver
Simply, GENIIUS🤣
this is good bro, thank you
It took me a long time to figure out a few things, and by the end all I found out is that the API key isn't free. WARN THE PUBLIC NEXT TIME. Thank you.
Yeah for future reference, OpenAI doesnt give any API access for free. This is more of a Python tutorial for people already aware of the API’s and wanting to create something or learn something with Python than a “here is everything you could know about OpenAI as a business”
If learning Python isn’t of value, I definitely would watch my videos that are not Python tutorials. Thanks for your feedback! 🙏
Hi, how do we change the voice to sound a bit like normal voice. And how do we make this work like google AI. For it to come up on our phones when we say 'Hey Genius' Or just call her name.
I've done the same in PHP using a few different APIs and streaming the data as to reduce the latency as much as possible, but its still laggy. Reducing the lag between a question and response is the tricky bit.
Id check out my new Bard voice assistant tutorial! Its faster than openai's api's and free. The past week I have been using Bard way more than chatgpt. Its just better for fact based responses that need to check recent internet data to verify its answers. And somehow faster than chatgpt without back-searching google.
@@Ai_Austin help please it says An error occured: You exceeded your current quota, please check your plan and billing details.
Had to get rid of the underscore in speech_recognition to get that to work. And I had to run pip install pyaudio to get it to work, but it works. Does this thing have contextual memory? Will it remember by conversations with it? I don't see any logging or context, so I don't think it does.
Damn I was thinking about it this morning, but it appears that someone already did it 1 month ago before me.
Great video austin but my program cannot access the voice of the microphone. Do I need to save my voice as a file for this program to recognize
amazing! congrats
i'm facing some troubles running it, it requires FLAC conversion utility. any suggestions ?
btw: i'm using mac..
same
LOL, I built this app thinking "oh I'll to the old one becuase it will be less complicated" I finally got this thing to work and my AI is dumb as a brick, good exercise but now I will build the newer version.
😂
Yeah it's crazy how fast these things are advancing. My newest Gemini tutorial would be my recommendation. It's going to be the smartest, fastest and I put a lot more work into that video to make it as thorough as possible.
It being longer doesn't make it harder. It just is a a more thorough tutorial!
After Whisper API release, does it make sense to use Python function for transcribing audio? How much better is Whisper in understanding?
That is a great question. Ive yet to find the need to learn whisper. Its my understanding that its superior for language translation and perception of accents. It also isn’t free like the speech recognition method i showed.
@@Ai_Austin I was testing Whisper over the weekend. It works great - English is excellent, while even small languages are acceptable with an editor. API is not that expensive, you can transcribe a movie for around 0,50 EUR. However, there is also possibility to install it on your server, running it locally and with that it will only cost the price of the infrastructure.
I am creating perfect javas myself.
will it run the same on python or do I need to change something? because I'm trying to run it on Python and it does not run like VSC.
Looks cool bro
Hey, nice tutorial, but can you let me know if there is a way to change the voice of the model? If yes, how?
Thanks!
Yes we can modify parameters of the tts. Ask chatgpt how you can modify the parameters of the tts and you will have a little code snippet. just copy and paste the three lines after the initialization, you can modify the values for testing different voices and speech rates
Beultiful pipeline this channel.
I need help when I run and say "genius" it says An error ocurred : module 'speech_recognition' has no attribute 'recognize'
change it to sr.Recognizer()
cannot access local variable 'audio' where it is not associated with a value (error) How do I fix
Awesome Video
Thank you 🙏
Nice i try next
Bro this is so cool
This is all cool, but what if you wanted to do it with ChatGPT-4? Or are the libraries for it not available to the public yet?
could you redo this for the new GTP 3.5 turbo API?
awesome, I love this. how you do the animated face?
i need help, it says "Python was not found; run without arguments to install from the microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases." what do i do??
Bro you have used gpt-3 not chat gpt.
Chatgpt have extra features such as it can answer question based on previous question and responses.
For example on chatgpt
Question :-Creat a basic html page
Answer:- *code*
Question:- now add the background colour to black
Answer:- * modified code with background colour black *
But when u use gpt-3 for the same it will treat both the questions differently.. and not give the upgraded HTML code.
Sounds like you found some feature you want to add. Feel free to add a fork to the github repo linked in the new voice assistant if you wanted to actually contribute :)
@@Ai_Austin thanks for the help
Hi ! I love this video, but once I have the code, how I can launch the program ? (run the program don't work) 🥺
hello, very nice tuto but i have a problem with the microphone. the script doesnt want to hear my voice
I'm having the same problem have you found a solution yet?
I am having the same issue please help.
@@arjund1173 ngl so I just copy and pasted my code into ChatGPT and told it that it wasn't hearing my voice and ChatGPT fixed it for me lol. Also make sure you have the packages installed too
Can we add the basic functionalies like, open a website, open a file, launh/terminate a program ...etc?
All are actually pretty simple tasks with Python. Just a matter of adding a wake word for the new task and adding the few lines of code needed for each desired task you mentioned. ChatGPT could probably even do it for you!
i don't know anything about python but this is fascinating....i would also like to store all my chats realtime in some sort of folder and hierarachy so that i may catalogue them or use them later for referal
One idea would be to create an SQL database or a spreadsheet. Then have your python program write that data every time you ask a question. Those type of tasks python is really fast at, so you wont even notice any slowing of your program.
@@Ai_Austinthks whom would i commission to do that ...:)
Is it possible to retain a session-like memory of previously asked questions with the API like we can do on the ChatGPT web interface?
For instance if I ask "Where is the oldest tree located"? and follow it by "How tall is it?", can we make API responses retain the context?
Using 3.5-turbo it is possible to have contextual memory. It would definitely add some complexity and would potentially want to create a command to refresh memory if you did so.
it s very easy, you just have to save your questions and the answers in a .txt file , then you give for each new request the content of the file !
so i'm having a slight problem when i say genius after starting the bot it comes back with: An error occurred: local variable 'filename' referenced before assignment. is there a fix for this?
Nice. Is it possible to create a Telegram bot using OpenAI's latest model released 3 days ago? Using the chat endpoint? It would be nice if you create a tutorial for that.
Which AI tool you use avtar and voice-over
Thanks for video
Thank you for your advice! It worked for me! It's great! And I received some help from GPT-Chat too!
Hi, please do you know how to run the script in windows using VS code? I'm a junior developer trying to run this code and dont work :( I have already python installed in my pc. Copy and paste the code, paste my OpenAI apy key in place, run the python3 app.py in a VS terminal but did't work
@@carlosmontes7088Hi, Carlos. Did you solved your problem? Bc i have the same...
@@marekkupis3086 Hi, No, I can't solve yet :( if you can solve anytime please let me know. I'm almost sure it is about a configuration that we don't know. Maybe install a dev dependency or something
@@carlosmontes7088 Hey Carlos I was using VS Code today to run the script and it worked perfectly. Just had to rewrite it a bit. Moved the generate response function above transcribe audio to text in order for it to work.
Thanks brother 👍👍
Awesome!
Currently working on integrating some Amazon API's to make the a much more usable experience. Including no wake word. I have 0 technical background but in the last 6 hours with the help of chatgpt I have a working model
My goal exactly with these tutorials is you add your own preferences and upgrade upon these. Super cool to hear you’re doing it with no coding background man!
Debug your code:
- openai module not present even if you installed it? uninstall and reinstall python3, then pip install every library.
- pip install SpeechRecognition
- pip install setuptools
- DON'T FORGET TO INSERT YOUR API KEY!!! (because I did lol)
Now it should work BUT you'll still have to upgrade the code to get a fully functional assistant. Otherwise, if you miss the first audio recording opportunity, it just bugs in an infinite loop.
can you explain it pls ???
Very interesting. Thanks. What software do you use to create teh talking head? Thanks.
Is there anyway to attach this kind of virtual assistant code to a virtual avatar that can respond like the one in this video?
how to get source code
No one is saying how.
Awesome :), will this work on a raspberry pi?