The process was quite interesting. It shows that you just have to be curious enough to get some things done 😅❤
Absolutely! Curiosity and enough poking around is often all it takes :)
Thanks for recording this! I've been reading the book Make Python Talk and it seems earlier versions of Python work beautifully. Thanks for the example and the link to your script.
No problem! Happy to help!
Thanks for the vid... Really helpful👍🏾
Happy you enjoyed it!
Please make more videos of this type. Great work, keep it up man!
Thank you so much! ❤️ / Jake
It's nice to see the discovery process (without focusing too much on it), sharing the issues along the way, as long as it works in the end! This is the learning process.
Interesting video, and nice to see the thought process. One probably obvious question: is the hard drive filling up with MP3 and WAV files, or are they just temp files?
Thanks for the comment! That’s a great question. It actually shouldn’t be a problem as each time the assistant responds it overwrites the previous version. So you will have at most one MP3 and one WAV file. I probably could have mentioned that 😅
Which website are you using for the coding?
I'm using Visual Studio Code. I have a video on my channel where I go over it a bit if you're interested :)
- Jake
Amazing 😯♥️
Thank you! Happy you enjoyed the video! 🙂
nice video
Thank you :)
I wonder if someone has already put together a local Jarvis-like assistant with Python...
Yeah I do wonder that! Maybe a fun personal project if you can hook it up to your house lights and stuff :)
@@jakeeh Not sure if it's local, but plenty of people have managed to pull off pretty impressive Jarvis-style assistants; one used an Arduino to switch lights on and off.
I was actually thinking of starting my own, but limited to PC management like opening tabs, research, or music, all from a single program.
What if I want to search in a TXT or PDF file as an alternative to Wikipedia? 🤔
Yeah you could definitely do that. There is a Wikipedia pip package you could install to search Wikipedia easily.
For a text file, you would just need to read the text from it. Ideally, split it on new lines and then search for keywords in each line until you find a match. :)
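A minimal sketch of the line-by-line keyword search described in the reply above. The function name `find_matching_lines` and the file name `notes.txt` are made up for illustration, and the match is a simple case-insensitive substring test rather than anything smarter. PDF files are not covered here, since they would need an extra package to extract the text first.

```python
from pathlib import Path


def find_matching_lines(path, keywords):
    """Return (line_number, line) pairs whose text contains any keyword."""
    wanted = [k.lower() for k in keywords]
    matches = []
    text = Path(path).read_text(encoding="utf-8")
    # Split the file on new lines and test every line against every keyword.
    for number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        if any(k in lowered for k in wanted):
            matches.append((number, line.strip()))
    return matches
```

For example, `find_matching_lines("notes.txt", ["ffmpeg", "wav"])` would return every line that mentions either word, and the assistant could read the first match aloud.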
@@jakeeh Can you please make a video about it, like searching only in your own TXT or PDF file?
I can probably make a quick video to show something like that. No guarantee on a release date though. :)
@@jakeeh That will be great. Thanks 👍
Is it possible to store the user's voice and then have the voice assistant compare the live voice with the stored one, so that it responds to the command only if both voices match and otherwise doesn't respond? Like a voice biometric system. Please explain.
Oh that’s a fun idea! As far as I know, none of the current packages we’re using in this project are capable of doing that. There might be some packages that are capable of that though. Voice biometrics is not a super easy thing to solve with any high level of accuracy, so it is probably trickier than you might imagine.
@@jakeeh I know it's funny, but my project is creating an authorised voice assistant, meaning it responds only to commands from the owner of the device. But I don't know how the machine can identify its owner's voice. If you know anything about that, help me 😍
Yeah, it’s a cool project for sure. 👍
Unfortunately biometrics is a complicated problem, so there don’t seem to be many free resources. I did find this GitHub project (github.com/Raymo111/voiceprint) but it will require you to train a model using it.
Skippy the Magnificent!
Is that the name of your personal assistant? :)
@@jakeeh No, sorry, you mentioned putting your favorite fictional AI in the comments. Skippy is from the Expeditionary Force series by Craig Alanson. Highly recommended!
Oh that's awesome! I'll have to check it out :)
pyttsx3 is really good at TTS imo. It does sound a bit more robotic, but it's easier to use and it doesn't make a file or anything (that is an option though).
I haven’t used pyttsx3 before but that sounds like a great option! I was just happy to find something so quickly. It’s amazing what’s available to us nowadays :)
@@jakeeh I second this, it's what I use. Much better than letting Google know what you're up to 😁
I’ll be sure to take a look at pyttsx3. Thanks for the suggestions :)
Yes, for text to speech it's fast! So for output it's great 👍
@@drak4188 Yes, I was wondering, is it being converted online? (Perhaps disconnect the internet and test it?)
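For anyone who wants to try the pyttsx3 suggestion from this thread, here is a rough sketch. It assumes pyttsx3 is installed (`pip install pyttsx3`); the wrapper function `speak` and the file name in the usage note are invented for this example. pyttsx3 drives the speech engine already installed on the machine, so it should keep working with the network disconnected, which is one way to answer the "is it converted online?" question.

```python
def speak(text, engine=None, out_file=None):
    """Speak `text` aloud, or save it to `out_file` when one is given."""
    if engine is None:
        import pyttsx3  # third-party package, imported lazily on first use
        engine = pyttsx3.init()
    if out_file is None:
        engine.say(text)                     # queue the text for playback
    else:
        engine.save_to_file(text, out_file)  # queue a file export instead
    engine.runAndWait()                      # process whatever was queued
```

Calling `speak("Hello there")` plays the sentence directly, while `speak("Hello there", out_file="response.wav")` covers the optional file output mentioned in the comment.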
Is it possible to make a voice assistant (on Windows) when you are not a local administrator?
On my work PC I'm not a local admin. I have Python installed and can get modules, but I can't install FFmpeg.
Hey, unfortunately it might not be possible to install FFmpeg without admin rights. You might be able to find an alternative that can work with the WAV files instead, though :)
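One possible workaround, offered as an untested assumption rather than a confirmed fix: if workplace policy allows unpacking a portable ffmpeg build into a folder you can write to, the script can add that folder to its own PATH at startup instead of relying on a system-wide install. The function name `use_portable_tools` and the folder path in the usage note are hypothetical.

```python
import os
import shutil


def use_portable_tools(folder):
    """Prepend `folder` to this process's PATH and report which tools resolve."""
    # Only the running Python process is affected; no admin rights are needed.
    os.environ["PATH"] = folder + os.pathsep + os.environ.get("PATH", "")
    # shutil.which returns the full path of an executable, or None if not found.
    return {name: shutil.which(name) for name in ("ffmpeg", "ffprobe")}
```

A call such as `use_portable_tools(r"C:\Users\you\tools\ffmpeg\bin")` would go at the top of the script, before pydub is imported. If either value in the returned dictionary is `None`, the executables are not in that folder.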
This video aligns with my current interests, thank you very much. Google speech recognition service is not free beyond 60 minutes per month, I think. So how are you managing past that ? Are you using any other free options ?
Happy to help! I have only used this for small amounts of time, so I haven’t hit the 60 minutes before. That said, I’m not entirely sure of good free alternatives. With a quick search I did find a few that seem promising, but I can’t vouch for any myself at this time.
Another super creative video, nice job! Do you have a discord community set up?
Thank you!
I don’t have a discord community yet, but I plan to have one set up in ~1 week once I’m back home from vacation. I’ll reply here to let you know :)
Just created my discord - I added a link to the video description :)
@@jakeeh thanks for the update!
anything that works is good !!
the hardest thing is the ffmpeg pesky thingy
ffmpeg can be annoying to work with at first but it can be super powerful if you get used to it :)
@@jakeeh I got the speech in, image in, text in...
I got the speech out, the text out (no diffusion yet)...
I would like video in! Perhaps to pictures to description...
I would like sound in to description (like "that's the sound of a man driving a car") and sound out (generate the sound of a dolphin?)...
So this version of speech in is actually not bad. It's still a toss-up between this and Whisper, but this has enabled me to have the microphone live instead, so it's great 👍
Yeah, that would be an amazing tool if you can get all that working! Diffusion in my experience takes a while, although I don’t have an Nvidia GPU so it’s just using CPU.
@@jakeeh The audio diffusion is actually quite quick. I have Stable Diffusion for images too, but I think it was quite heavy. I think there may be a different option. I think the Stable Diffusion method is actually bad, not for sound but for vision: it's too heavy and has too many parts!
If you're still seeing comments, whenever I run the code it does say "Listening for commands..." but doesn't pick up my voice. Any fix : )
I'm always checking comments :)
You should try adding a couple of print statements after the recognizer and "audio = " line. See if you end up getting there or if you're getting stuck there and it's just not taking input.
Are you seeing any error or just no output outside of the "Listening for commands..."?
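The print-statement idea above could look roughly like this. It is a sketch, not the code from the video: the function names `listen_with_debug` and `run_with_microphone` are invented here, and the 5 and 10 second limits are arbitrary choices. The calls on the recognizer (`listen`, `recognize_google`, `adjust_for_ambient_noise`) are from the SpeechRecognition package. Adding a timeout means a silent or wrong input device shows up as an error instead of an endless wait after "Listening for commands...".

```python
def listen_with_debug(recognizer, source, log=print):
    """Capture one phrase and log each step so a stall is easy to locate."""
    log("Step 1: recognizer ready, waiting for speech...")
    audio = recognizer.listen(source, timeout=5, phrase_time_limit=10)
    log("Step 2: audio captured, sending it for recognition...")
    text = recognizer.recognize_google(audio)
    log(f"Step 3: recognized: {text!r}")
    return text


def run_with_microphone():
    """Real-world wiring; needs SpeechRecognition, PyAudio and a microphone."""
    import speech_recognition as sr  # third-party package

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise first.
        recognizer.adjust_for_ambient_noise(source, duration=1)
        try:
            return listen_with_debug(recognizer, source)
        except sr.WaitTimeoutError:
            print("No speech started within 5 seconds; check the input device.")
        except sr.UnknownValueError:
            print("Audio was captured but could not be understood.")
        except sr.RequestError as err:
            print(f"The recognition service could not be reached: {err}")
    return None
```

If "Step 2" never prints when `run_with_microphone()` is called, the microphone is the likely culprit. If it prints but "Step 3" does not, the problem is on the recognition side.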
Can we implement this as hardware?
I think you probably could if you have a raspberry pi or something + a mic.
What were you hoping to make?
@@jakeeh A voice assistant which can also project.
It's similar to what you have done, but I want to make it as a device instead of on a computer.
I'm not sure I understand exactly what you want to make. But try to do some searching online and you might be able to find pieces of what you want and then you just need to piece them together :)
I get an error while installing the audiosegment library. Could you please help me?
Hey Sreesanjanabose,
What error did you see when you tried installing it?
@@jakeeh Can I mail you, sir? I can share a screenshot of the error with you. It said it needs Visual C++, which I downloaded, but I still can't crack it.
Feel free to join the Discord! You're welcome to post there asking for help and I and others can see if we can help out :)
Very Nice 🙂
Thanks 😊
When I ran pip install winsound it didn't work. It said:
ERROR: Could not find a version that satisfies the requirement winsound (from versions: none)
ERROR: No matching distribution found for winsound
Which python version are you using?
@@jakeeh Never mind, I'm using a Mac. Can it still work on a Mac?
Yeah there should be a way to make it work for Mac that does the same thing that winsound does :)
@@jakeeh can you make a small tutorial on that, please?
@minecraftmacjava123 I can look into it, although I’d suggest asking in the Discord to see if anyone else has tried this on a Mac and had any luck :)
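Some background on the pip error in this thread: winsound is part of Python's standard library on Windows only, so there is nothing for pip to install, and it is not available on a Mac. A rough sketch of one way to play the WAV file on other systems is below. It is an assumption about how to adapt the script, not something shown in the video, and the helper names `player_command` and `play_wav` are invented. It relies on the `afplay` command that ships with macOS and on `aplay`, which is commonly present on Linux.

```python
import platform
import subprocess


def player_command(path, system=None):
    """Return the external command that plays a WAV file, or None on Windows."""
    system = system or platform.system()
    if system == "Windows":
        return None               # handled by the built-in winsound module
    if system == "Darwin":
        return ["afplay", path]   # command-line player that ships with macOS
    return ["aplay", path]        # ALSA player, common on Linux


def play_wav(path):
    """Play a WAV file with whatever the current platform provides."""
    command = player_command(path)
    if command is None:
        import winsound  # standard library, Windows only
        winsound.PlaySound(path, winsound.SND_FILENAME)
    else:
        subprocess.run(command, check=True)
```

Replacing the winsound call in the script with `play_wav("response.wav")` would keep the Windows behaviour and add a fallback for the other two platforms.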
The next step in my process would be to have the AI activate on a trigger word, like Siri and Alexa do, but I always find I have to run the program manually again after I exit the AI, and it stops responding to commands after a few minutes of inactivity. I don't want it to listen continuously, so as not to take up space, but I need it to activate on command 🤷🏼
Great point! I think what you could do would be similar to what many home assistants do, which is have a small lightweight program that runs all the time just listening for a single word or a stream of words. If it hears the keyword at the start, like “Alexa”, it saves the audio. Then it launches a separate program that does the heavy lifting of understanding the audio and returns a response.
Still not super ideal, but you’ll need something running to listen :)
@@jakeeh I think that you need a two-word wake phrase, like "hey Alexa". The extra word reduces confusion with other simple words that might activate it by accident. So something like "hey Jarvis" or "hey computer" is perfect.
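The wake-phrase check discussed in this thread can be sketched in a few lines. This assumes the lightweight listener already produces a text transcript of each short clip; the function name `extract_command` and the default phrase "hey jarvis" are placeholders. Only when the function returns a command would the heavier program be started.

```python
def extract_command(transcript, wake_phrase="hey jarvis"):
    """Return the text after the wake phrase, or None if it was not said first."""
    # Compare whole words so that "hey jarvisx" does not count as a match.
    words = transcript.lower().replace(",", " ").split()
    wake = wake_phrase.lower().split()
    if words[:len(wake)] != wake:
        return None
    return " ".join(words[len(wake):])
```

With this, `extract_command("Hey Jarvis, open the browser")` gives `"open the browser"`, while a transcript that does not start with the phrase gives `None` and can simply be discarded.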
Can one fit this tech into a toy doll or toy animal?
In theory yes. You would need some kind of small computer capable of running this and a mic and set of speakers. But that should be possible.
This would also need an internet connection, though. I do have another video showing how you could do it all offline.
Can I try it?
You sure can! The code is in the link in the description to my Github :) Feel free to just download that and give it a run :)
Hey, maybe you can help me. I'm building a droid and I want him to communicate with an AI and deliver the answer back to me using his custom-made voice. Is this possible?
It should be possible, yeah. You’ll need to investigate voice mimicking. I think all you would need to do is take the input, run it through a model of your choice, and then get the string as output and run it through your custom voice.
If you want it to be smart based on the response, though, that would be much more difficult. You’d need custom models, and that’s something big companies are still figuring out.
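The three steps in that reply (input, model, custom voice) can be wired together as plain functions. This is only a structural sketch: `transcribe`, `generate_reply` and `speak` are hypothetical callables that you would supply yourself, for example a speech recognizer, a language model client, and a voice synthesis tool.

```python
def answer(audio, transcribe, generate_reply, speak):
    """Run one turn: audio in, spoken reply out. Returns the reply text."""
    text = transcribe(audio)        # speech to text
    reply = generate_reply(text)    # model of your choice produces a string
    speak(reply)                    # string goes through the custom voice
    return reply
```

Keeping the stages as separate arguments means each one can be swapped out or tested on its own without touching the others.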
An up-and-coming YouTuber, you say? I'd better like and subscribe then!
My sudden ideas:
1. As I am currently building a Discord bot, I thought that combining this with Discord would be cool.
2. Make a voice assistant to control my smart home.
3. Make a voice assistant to do tasks on my PC.
Thanks for the support!
There are a ton of ways you can use this. It’s a great time to be a programmer :)
Consider joining our Discord community as well. I always love to hear about what projects people are working on!
hi JakEh, thank you for the video. I tried it but I get an error.
ImportError: cannot import name 'AudioSegment' from 'pydub'
How can I import AudioSegment from pydub, please?
It's possible that you haven't installed pydub, which could be the reason for this error. If you have already installed pydub, then try upgrading it with "pip install --upgrade pydub".
I think @akankshayadav is right. Try installing that and see if you get a different result. If not, post back here or in the Discord :)
@@akankshayadav3087 Thank you so much. But I already did that and still got the error 😕
@@jakeeh Thank you for your answer. But I already tried it and nothing changed.
Would you mind posting the stack trace and maybe a picture of your directory into the Discord? Without more information it's difficult to help out :)
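One thing worth checking before posting the directory listing, offered as a guess since the thread never confirms the cause: this kind of ImportError often appears when a file in the project folder is itself named pydub.py, so Python imports that file instead of the installed package. The small helper below (its name `module_origin` is made up) shows which file Python would actually load.

```python
import importlib.util


def module_origin(name):
    """Return the file Python would load for `name`, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin
```

Running `print(module_origin("pydub"))` from the project folder should print a path inside `site-packages`. If it points at a file in your own folder, renaming that file is the likely fix; if it prints `None`, the package is not installed for this interpreter.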
Thank you
No problem!
Add gpt-4 and you have Google assistant 2.0!
Absolutely! With all the free tools it’s amazing the power when you mash a few things together :)
@@jakeeh You can, but you need an API key from your ChatGPT login, plus it costs you. Not much, but it is still a cost.
Can you give me the source code?
Just added it to the description. Thanks for the heads up that it wasn’t there yet :)
Source code sir?
There’s a link in the description to the GitHub for the code :)
Thanks for your comment!
Please send me this code file, please
There’s a link in the description with the code :)
interesting
Thanks!
Holo Lili 😊
Did you name your assistant Lily?! :)
@@jakeeh At the end of the '90s there was a TV series called Earth: Final Conflict. One of the characters, called Augur, fancied a woman who wasn't available/interested, so he built a virtual holographic assistant with her image and voice.
It got pregnant and gave birth to a new baby AI, if I remember correctly.
Something I've never forgotten lol
Oh wow! I grew up in the 90s but I certainly missed that one :)
I copy-pasted the code into VS Code but it doesn't work. Is there anything else I need to do?
Did you install the PIP packages for the project? If not, you’ll need to do that. Otherwise, if you could reply with what error you’re seeing when you’re running the code that would be helpful :)
@@jakeeh It doesn't come up with a problem so I believe I've installed all the packages. This is the error I'm getting:
PS C:\Users\kiki__871k8t> & C:/Users/kiki__871k8t/AppData/Local/Programs/Python/Python312/python.exe c:/Users/kiki__871k8t/Downloads/assistant.py
C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\site-packages\pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
This has been building a virtual assistant with Python
C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\site-packages\pydub\utils.py:198: RuntimeWarning: Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work
warn("Couldn't find ffprobe or avprobe - defaulting to ffprobe, but may not work", RuntimeWarning)
Traceback (most recent call last):
  File "c:\Users\kiki__871k8t\Downloads\assistant.py", line 75, in <module>
    respond("This has been building a virtual assistant with Python")
  File "c:\Users\kiki__871k8t\Downloads\assistant.py", line 31, in respond
    sound = AudioSegment.from_mp3("response.mp3")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\site-packages\pydub\audio_segment.py", line 796, in from_mp3
    return cls.from_file(file, 'mp3', parameters=parameters)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\site-packages\pydub\audio_segment.py", line 728, in from_file
    info = mediainfo_json(orig_file, read_ahead_limit=read_ahead_limit)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\site-packages\pydub\utils.py", line 274, in mediainfo_json
    res = Popen(command, stdin=stdin_parameter, stdout=PIPE, stderr=PIPE)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\subprocess.py", line 1026, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\kiki__871k8t\AppData\Local\Programs\Python\Python312\Lib\subprocess.py", line 1538, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [WinError 2] The system cannot find the file specified
It looks like you don’t have ffmpeg installed. It's a program that converts files from one format to another, and in the video it’s what the pydub package uses to convert the voice file from MP3 to WAV.
I’ll add a link to the description for where to download it :)
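The traceback above ends inside pydub's `mediainfo_json`, where it tries to launch an external program and Windows reports that the file does not exist, which fits the two "Couldn't find ffmpeg" and "Couldn't find ffprobe" warnings printed earlier. A small pre-flight check like the sketch below (the function name `require_tools` is made up) can turn that long traceback into one readable message.

```python
import shutil


def require_tools(names=("ffmpeg", "ffprobe")):
    """Raise a clear error if any required executable is missing from PATH."""
    missing = [name for name in names if shutil.which(name) is None]
    if missing:
        raise RuntimeError("Not found on PATH: " + ", ".join(missing))
```

Calling `require_tools()` near the top of the script, before the first `AudioSegment.from_mp3(...)` call, makes a missing install obvious straight away.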
@@jakeeh I've installed it, but now it's saying that speech_recognition, gtts, pydub and pyautogui could not be resolved, even though I've installed all of them.
Are you using the same version of Python to run it as you installed the packages with pip? If you’re not sure, install with ‘python -m pip install pyautogui’ and then run with ‘python myscript.py’, so that both commands use the same interpreter.
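To see which interpreter is actually running the script and which of the four modules it cannot find, something like the sketch below can help. The function names are invented, and the pip package names in the mapping are the usual ones for these modules. A "could not be resolved" message typically comes from the editor rather than from Python itself, so if this script reports nothing missing, the editor is probably pointed at a different interpreter (in VS Code, the "Python: Select Interpreter" command changes it).

```python
import importlib.util
import sys

# Module name used in `import ...` mapped to the package name given to pip.
PIP_NAMES = {
    "speech_recognition": "SpeechRecognition",
    "gtts": "gTTS",
    "pydub": "pydub",
    "pyautogui": "PyAutoGUI",
}


def missing_modules(names):
    """Return the module names that this interpreter cannot find."""
    return [n for n in names if importlib.util.find_spec(n) is None]


def report():
    """Print the interpreter in use and an install command per missing module."""
    print("Running under:", sys.executable)
    for name in missing_modules(PIP_NAMES):
        print(f'Missing {name}: "{sys.executable}" -m pip install {PIP_NAMES[name]}')
```

Calling `report()` prints install commands that use the exact interpreter running the script, which avoids installing into one Python and running with another.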
Yeah, but without internet access none of this will work.
That’s right. We’re using a couple packages that call services online. So this won’t work if you’re offline.
Yeah but I think Alexa, Siri, google, ChatGPT all need internet anyway. I think Apple Watch can work offline. Or without the phone.
That's true! Most usually need an active internet connection. This is because they don't have any kind of local model that can interpret speech, so the audio is sent to a server to compute the response/action.
That being said, there are some 'mini'-models as you described where the model that does the interpreting is small enough to be on the device. It's certainly possible to make one of those yourself, but it's not going to be very easy :)
Hey Jake, I love your energy in your videos. Do you have a Twitter (X) or an email where we can correspond? I’d love to chat.
Hey 👋
I actually have a Discord where you’re welcome to join and chat :) A link should be in the description of the video. I do have Twitter as well, but I’m less active there, it’s @JakeEhTv.