DIY Alexa: Create Your Own Voice Assistant with ESP32 & TensorFlow Lite!

  • Published 5 Sep 2024

COMMENTS • 224

  • @atomic14
    @atomic14 3 years ago +12

    Interested in ESP32 Audio: ua-cam.com/play/PL5vDt5AALlRfGVUv2x7riDMIOX34udtKD.html
    Looking for all my ESP32 projects: ua-cam.com/play/PL5vDt5AALlRdN2KyL30l8j7kLCxhDUrNw.html

  • @tektronix475
    @tektronix475 3 years ago +14

    Wow, your Alexa version left me speechless.

    • @digitronix532
      @digitronix532 10 months ago

      How do I upload the program to the ESP32?

  • @7Trident3
    @7Trident3 3 years ago +16

    Wow!! I didn't think the ESP32 had the guts for any AI stuff! Great video!!

    • @atomic14
      @atomic14 3 years ago +5

      It's definitely starting to push the limits - but I think it's easy to forget just how powerful the ESP32 is. One of the problems is the size of the models, which can get quite large (relative to the amount of RAM we have to play with). Processing time is also a factor, especially when trying to run in real time as in this project.

  • @trueintellect
    @trueintellect 3 years ago +9

    I'm so glad I found your channel!! This is really cool. You've helped free me from my Raspberry Pi dependence.

    • @atomic14
      @atomic14 3 years ago +4

      The ESP32 is an amazing device. Really powerful.

    • @codewithdaniel-1
      @codewithdaniel-1 3 years ago +1

      Same case😊

  • @clydealcott3379
    @clydealcott3379 2 years ago +4

    Thank you so much for this awesome and very educational video... I got my ESP32 recently...
    It's time to roll along!👍

  • @naafff1
    @naafff1 3 years ago +4

    I thought you were gonna be using a Raspberry Pi. Speechless... I'm gonna make one like you.

    • @naafff1
      @naafff1 3 years ago

      @Taylor Van I have got many messages like these. They ask you for money and once you pay, they don't give you the account you wanted to hack.

  • @trevorwslee
    @trevorwslee 1 year ago +3

    What an insightful project! I really hope to be able to adapt your idea (including some code snippets and likely the TensorFlow model) and come up with my own little ESP32 experiment.

  • @mariomedina
    @mariomedina 1 year ago

    Got it working! Now I need to learn how to change the activation word, and how to add multiple activation words that activate different code

  • @OnePunchHeizou
    @OnePunchHeizou 3 years ago +1

    This channel was really helpful for understanding many edge-AI-related concepts, thank you @atomic14.

    • @atomic14
      @atomic14 3 years ago

      Thanks for the kind words - much appreciated!

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 He is using the terminal/command prompt for that.

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 I think anything should work for this purpose. Preferably use Linux.

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 I used this video for reference. These commands work on Linux/Windows, I don't know about the Mac terminal.

    • @OnePunchHeizou
      @OnePunchHeizou 2 years ago +1

      @@sltechgalaxy1677 Bro, clone this project's git repository; in data you will find all the audio files.
      Go to that directory and try using these commands.

  • @photopicker
    @photopicker 3 years ago +2

    Very educational. Coming from an ESP32 background I found it very helpful to create a real target for the AI modeling tools. Great introduction.

    • @digitronix532
      @digitronix532 10 months ago

      How to interpret esp32 and this program

  • @paulsimpson9544
    @paulsimpson9544 3 years ago +3

    Really fascinating. Thank you so much for sharing.

    • @paulsimpson9544
      @paulsimpson9544 3 years ago

      I've a follow-on question if you don't mind.. I see you using the Arduino framework, but you also have the ESP-IDF icon in PlatformIO. Do you have any particular preference? I'm considering switching to the IDF as I'm already using xtimers. I like the idea of more control, but also like the easy access to the Arduino ecosystem of libraries..

    • @atomic14
      @atomic14 3 years ago

      I've been mixing in quite a lot of functions from the IDF with my Arduino code. But it seems the IDF that comes with Arduino is now quite out of date. I've been trying to get Arduino working as a component in the IDF so I can use the latest IDF but still take advantage of the Arduino ecosystem, but I've not had much luck. For my Asteroids game I did it all in the IDF - mainly because I wanted to use the PSRAM with malloc and there's no way to do that when using Arduino. But I really missed simple things like uploading firmware OTA - especially with my custom board not having a USB port...
      I think, unless there's a compelling reason (APIs that aren't available from the IDF when using Arduino), I'd be tempted to stick with Arduino. If you aren't using any libraries or you can easily port them over then IDF is definitely worth giving a go. But I don't think there are any huge advantages to it.

  • @sambidpradhan32
    @sambidpradhan32 3 years ago +2

    This is awesome.. thinking of implementing this on a custom dataset, and this model looks lightweight as well.. it can run in real time I guess

    • @atomic14
      @atomic14 3 years ago +2

      It's amazing what you can do with quite a small model. I have seen that the micro-speech example in the main TensorFlow codebase is now available for the ESP32 - might be worth taking a look at that as well.

  • @iotan09
    @iotan09 3 years ago +2

    How kind you are, thanks for sharing

    • @atomic14
      @atomic14 3 years ago +3

      No problem at all, it's a privilege to be able to give something back to the community.

  • @JohnLauerGplus
    @JohnLauerGplus 3 years ago +2

    Wow. Nice work here.

  • @marcush.6632
    @marcush.6632 2 years ago

    You are an absolute genius in my eyes.....

  • @ChrisHalden007
    @ChrisHalden007 3 years ago +1

    Amazing!!!! Will definitely give it a try. Thank you

    • @atomic14
      @atomic14 3 years ago

      Let us know how you get on!

    • @digitronix532
      @digitronix532 10 months ago

      How to integrate esp32 and this program

  • @engrwaqas2904
    @engrwaqas2904 2 years ago +1

    Absolutely amazing, great work.

  • @OMNI_INFINITY
    @OMNI_INFINITY 5 months ago

    Thanks! Seems I should make a touchscreen voice AI app

  • @ehabelbwab1783
    @ehabelbwab1783 4 months ago +1

    You should mix the audio clips with background noise separately and then use the mixed clips for training, because adding the _background folder to the training data is a bad choice.
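A minimal sketch of the mixing idea suggested above, assuming clips are plain lists of float samples at the same sample rate. The function names and the signal-to-noise parameter `snr_db` are illustrative assumptions and are not taken from the project's notebook.

```python
import math
import random


def mean_power(samples):
    """Average of the squared samples."""
    return sum(s * s for s in samples) / len(samples)


def mix_with_noise(speech, noise, snr_db, rng=random):
    """Add a random slice of `noise` to `speech` at roughly `snr_db` dB SNR."""
    if len(noise) < len(speech):
        raise ValueError("noise clip must be at least as long as the speech clip")
    # Pick a random window of the noise clip with the same length as the speech.
    start = rng.randrange(len(noise) - len(speech) + 1)
    segment = noise[start:start + len(speech)]
    speech_power = mean_power(speech)
    noise_power = mean_power(segment)
    if speech_power == 0 or noise_power == 0:
        return list(speech)  # nothing sensible to scale against
    # Scale the noise so that speech_power / scaled_noise_power == 10^(snr_db / 10).
    scale = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, segment)]
```

Each keyword clip could be passed through this several times with different noise clips and SNR values to produce extra training examples.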

  • @user-sr9ss3xd4q
    @user-sr9ss3xd4q 8 months ago

    Excellent.. I really enjoy the content of the channel.. I suggest you make a video about Picovoice Rhino on the ESP32

    • @atomic14
      @atomic14 8 months ago

      Looks interesting, but I don't think it works on the ESP32 yet - might need a more powerful processor.

  • @jorgemota879
    @jorgemota879 2 years ago +1

    Amazing, fantastic, thank you very much, really a great project

  • @sasisekharmg7823
    @sasisekharmg7823 3 years ago +2

    Amazing work!

  • @gxbs2318
    @gxbs2318 8 months ago

    I'm going to apply it to two IoT devices that I have up and running. Excellent video

  • @gitaran24
    @gitaran24 3 years ago +1

    Absolutely amazing.. you do great things.. you are smart.. it challenges me to make one

    • @atomic14
      @atomic14 3 years ago

      You should definitely go for it - report back on how you get on.

  • @jeffzor
    @jeffzor 2 years ago +3

    Thank you for the learning opportunity, master!

  • @Nerdsking
    @Nerdsking 8 months ago

    It would be more interesting (and useful) if there was a way to merge this with another ESP32 project that runs ChatGPT, so it could be not only a DIY Alexa, but also a general assistant

  • @rolyantrauts2304
    @rolyantrauts2304 3 years ago

    Also many thanks, as I didn't realise TensorFlow could do the job on an ESP32. There is a lack of open-source Linux beamforming algorithms, which you have probably just solved.
    The ESP32 is so cheap that you could build a distributed microphone array where the mic with the highest keyword match is used for that ASR session.
    Vosk has a streaming API alphacephei.com/vosk/ so it just needs a streaming RTP protocol with current keyword match info, and no beamforming is needed as the nearest mic is automatically used...

    • @atomic14
      @atomic14 3 years ago +1

      Sounds interesting - the only issue is that you may start to hit performance problems when processing multiple microphones at once. Currently the wake word detection takes around 100ms so you may start running out of CPU time with more than one or two microphones. You might also hit memory issues with the audio buffering - though using a WROVER module might fix this.

    • @rolyantrauts2304
      @rolyantrauts2304 3 years ago

      @@atomic14 I dunno, I thought I would ask you as I'm a total noob with the ESP32, but on Linux, irrespective of processing power, we still lack open-source beamforming. The PulseAudio addition just doesn't work; I don't think it ever did, which is probably why it's been dropped from WebRTC upstream.
      What I am thinking is that we are not 'processing' multiple microphones at once: the I2S data for mono is just doubled and the L/R hi/lo word select is not used.
      A single channel would be fed into a delay buffer and then, I guess, just summed with the inverse of the current value of the other channel?
      It is really a single channel in a short delay ring buffer whose length matches the speed-of-sound travel time over the mic spacing, and what is present on the other I2S channel is just subtracted.
      For a noob who is staring blankly at a $5 AliExpress WROVER after a brief journey through the documentation, it makes me curious whether you could do it with the 2 cores, but to be honest I haven't a clue how yet :)
      I cannot even work out whether HTTP streams are client-only or whether you can create a server stream, or if you could present AMR-WB on a port?!?
      Just got my fingers crossed it might pique your interest.
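The delay-and-subtract idea described in this comment can be sketched in a few lines of Python. This is only an illustration of the arithmetic on lists of samples, not ESP32 firmware; the function names, the mic spacing and the 343 m/s figure for the speed of sound are assumptions made for the example.

```python
from collections import deque

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def delay_in_samples(mic_spacing_m, sample_rate_hz):
    """Samples needed for sound to travel the distance between the two mics."""
    return round(mic_spacing_m / SPEED_OF_SOUND_M_S * sample_rate_hz)


def delay_and_subtract(front, rear, delay):
    """Delay `front` by `delay` samples in a ring buffer, then subtract `rear`."""
    ring = deque([0.0] * delay, maxlen=delay) if delay > 0 else None
    out = []
    for f, r in zip(front, rear):
        if ring is None:
            delayed = f
        else:
            delayed = ring[0]  # oldest sample, i.e. front[i - delay]
            ring.append(f)     # newest sample pushes the oldest one out
        out.append(delayed - r)
    return out
```

With this arrangement, a sound that reaches the front mic exactly `delay` samples before the rear mic cancels out, while sound arriving from other directions does not cancel completely. Whether that gives useful directionality on real hardware is not something this sketch demonstrates.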

    • @rolyantrauts2304
      @rolyantrauts2304 3 years ago

      @@atomic14 PS: the point about the lack of beamforming was that each ESP32 could be a streaming KWS feeding a central ASR.
      Each one broadcasts from keyword to silence with some metadata giving the keyword hit score, and a central ASR would be able to use the stream with the best hit score, so an array of ESP32s could be a distributed array with the best and nearest mic always used.
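The "best keyword hit wins" selection described above could look something like the following on the central server. This is a hedged sketch: the `KeywordHit` record, its fields and the 500 ms grouping window are hypothetical and do not come from the project or from any existing protocol.

```python
from dataclasses import dataclass


@dataclass
class KeywordHit:
    device_id: str     # which ESP32 reported the hit (hypothetical field)
    score: float       # keyword confidence reported by that device
    timestamp_ms: int  # when the keyword was detected


def pick_best_hit(hits, window_ms=500):
    """Among hits within `window_ms` of the earliest one, return the highest score."""
    if not hits:
        return None
    first = min(h.timestamp_ms for h in hits)
    # Hits far apart in time are treated as separate utterances and ignored here.
    candidates = [h for h in hits if h.timestamp_ms - first <= window_ms]
    return max(candidates, key=lambda h: h.score)
```

The server would then forward only the audio stream from the winning `device_id` to the speech recogniser.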

  • @YigalBZ
    @YigalBZ 3 years ago +1

    Great video and project. This is my next project. Thank you !

    • @atomic14
      @atomic14 3 years ago +1

      Let us know how you get on!

  • @ei23de
    @ei23de 3 years ago +2

    This is super great!
    We should do some kind of collaboration!
    Some time ago I tried out Rhasspy with a Raspberry Pi as an offline Alexa (I call it "Axel", you can see it in my "DIY Open Source Home Automation with a Raspberry Pi [EN]" video).
    Rhasspy is great, but I need some kind of satellite hardware like this, or an ESP32 Audio Kit, which I saw as quite a challenge.
    But you obviously did it right away!

    • @atomic14
      @atomic14 3 years ago +2

      I had a quick look at Rhasspy and you could easily modify my code to talk to it. I have a few projects to complete but will come back to it when I have some more time.

    • @ruifreitas7475
      @ruifreitas7475 3 years ago

      @@atomic14 This is something I was looking into when I saw your video. Great timing. Passing audio commands from the ESP32 to Rhasspy (via MQTT or not) to be recognized and trigger intents or actions in Home Assistant would be great. Thank you for sharing this. community.rhasspy.org/

    • @synesthesiam
      @synesthesiam 3 years ago +1

      @@atomic14 Rhasspy author here. Your project looks awesome! I'd be very interested in collaborating, so feel free to ping me whenever :)

    • @ei23de
      @ei23de 3 years ago +1

      @@synesthesiam What great people here!
      Thank you for Rhasspy!
      I'm currently working on my DIY Video Doorbell (ESP32 Cam) and Face Detection with OpenCV. You know the drill.
      But the video will soon be finished and after that my smart doorlock will get some spotlight... but after that! I will definitely spend time on this! This is super exciting and needs more attention.
      Hope I'll find time for this soon.

  • @prof.tahseen6104
    @prof.tahseen6104 2 years ago

    the voice from those meme videos 😂

  • @ernstgennial7064
    @ernstgennial7064 3 years ago +2

    Very interesting!

    • @atomic14
      @atomic14 3 years ago

      Glad you think so!

  • @rachitkachhiya3458
    @rachitkachhiya3458 1 month ago

    What if I want to make a chatbot with real-time responses, for example one that updates me on the current weather? Could you guide me through that?

  • @ajanthahimali8491
    @ajanthahimali8491 2 years ago +1

    Can you simplify the firmware code please? It's very difficult to understand.

  • @Gauthamphongalkar
    @Gauthamphongalkar 2 years ago +2

    Marvelous content, thank you very much!

    • @Gauthamphongalkar
      @Gauthamphongalkar 2 years ago +1

      @@sltechgalaxy1677 I'm not sure what you are referring to.. to play audio, if you are on Linux you can use aplay

    • @Gauthamphongalkar
      @Gauthamphongalkar 2 years ago +1

      @@sltechgalaxy1677 aplay is a Linux utility.. you can't use it on Windows.. on Windows you can try playing the file as RAW format in VLC

    • @Gauthamphongalkar
      @Gauthamphongalkar 2 years ago +1

      @@sltechgalaxy1677 yes.. also read about ALSA

  • @ankitthealchemist
    @ankitthealchemist 3 years ago +2

    Hey! Great work dude!! Could we implement a simple command like "turn off the light" offline, just like the wake word detection?

    • @atomic14
      @atomic14 3 years ago +2

      I'm looking at this right now - it is a more difficult problem than the simple wake word detection. The model needs to have an output for each possible command word which means it is a larger model so will take longer to run on the ESP32. Hopefully, I'll be able to do another video soon showing it working - though just to be clear, this would be very limited commands - like: "on", "off", "left", "right" etc...
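To illustrate what "an output for each possible command word" means in practice, here is a small Python sketch that turns a vector of model outputs into a command. The label list, the `_background` class name and the 0.9 threshold are assumptions chosen for the example, not values from the project.

```python
def pick_command(scores, labels, threshold=0.9):
    """Return the best-scoring label, or None if it is weak or a background class."""
    if len(scores) != len(labels):
        raise ValueError("need exactly one score per label")
    best = max(range(len(scores)), key=lambda i: scores[i])
    # Labels starting with "_" (for example "_background") are not commands.
    if scores[best] < threshold or labels[best].startswith("_"):
        return None
    return labels[best]
```

Every extra command adds one more entry to `labels` and one more output to the model, which is why the model grows as the vocabulary grows.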

  • @WagnerUlisses
    @WagnerUlisses 3 years ago +1

    Very cool!

  • @legal_hack5626
    @legal_hack5626 2 years ago +1

    Everything is OK, but...
    for file_name in tqdm(get_files("_problem_noise_"), desc="Processing problem noise"):
        process_problem_noise(file_name, words.index("_background"))
    In these lines you are processing noise, but I don't have a dataset of problem noise. Where can I download it? I have downloaded the Google speech dataset but there is no _problem_noise_ folder. What can I do now?

    • @atomic14
      @atomic14 2 years ago +1

      Hi there, the problem noise files are optional (as are the mar sound files). I just recorded some additional audio of my office noises that seemed to be confusing the neural network. You can either add a folder yourself and record some audio or you can comment out that section of the notebook.
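As an alternative to commenting out the section, the loop can be made to skip a missing folder. The helper below is a hypothetical stand-in written for this example (the notebook's own `get_files` is not reproduced here), and the `.wav` filter is an assumption about the file type.

```python
import os


def optional_files(data_dir, folder):
    """Return sorted .wav paths in data_dir/folder, or [] when the folder is missing."""
    path = os.path.join(data_dir, folder)
    if not os.path.isdir(path):
        return []  # optional folder is absent, so there is nothing to process
    return sorted(
        os.path.join(path, name)
        for name in os.listdir(path)
        if name.endswith(".wav")
    )
```

Looping over `optional_files(data_dir, "_problem_noise_")` then does nothing when the folder does not exist, and picks up any recordings you add later.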

    • @legal_hack5626
      @legal_hack5626 2 years ago +1

      @@atomic14 Thanks

  • @h4l050
    @h4l050 3 months ago

    I'm getting weird output values from the NN, outputs like 0.01... 0.14... and it can't detect the word Marvin. I'm using the model (model.cc) that comes in the source code but I can't get any correct predictions. What's happening? Do you know what it could be? Thanks, and amazing project :D

  • @shufnagl
    @shufnagl 3 years ago +3

    Hi, as others already mentioned...great work, great video. BTW, would it make sense to use other ESP32 Hardware with included Mic/Speaker like Atom Echo?

    • @atomic14
      @atomic14 3 years ago +1

      I don't see why not - you may need to modify the code to use whatever pins and interface the Atom Echo uses for the microphone and speaker. It should work really well.

    • @shufnagl
      @shufnagl 3 years ago +1

      @@atomic14 My AtomEcho arrived and I will give you feedback about the results. BTW, where should we discuss the technical aspects? UA-cam or Git? Thx

    • @atomic14
      @atomic14 3 years ago

      Probably best on GitHub as we can share code snippets a bit more easily.

    • @shufnagl
      @shufnagl 3 years ago

      @@atomic14 Should I create a separate branch (to avoid polluting your code)?

    • @atomic14
      @atomic14 3 years ago +2

      @@shufnagl You'll need to fork the repository and then you can do pull requests back to my code - there's a good guide here - github.com/firstcontributions/first-contributions Looking forward to seeing what you do!

  • @devmishra4131
    @devmishra4131 2 years ago +2

    I am studying mechanical engineering at Stanford, batch of 2023, and your video is pretty good.
    I had one query: can we use a PAM8403 instead of the MAX43434 for the output?

  • @jspark4171
    @jspark4171 9 months ago

    Your answer was very helpful to me. Thank you very much.

    • @atomic14
      @atomic14 8 months ago

      Thanks! Very much appreciated!

  • @chockman3833
    @chockman3833 3 years ago +2

    I had to login to my other account to give this video another like, this was incredible!
    How hard would it be to extend the model to have some amount of offline NLP so we don’t have to rely on Facebook?

    • @atomic14
      @atomic14 3 years ago

      I'm having a look at that right now, got slightly sidetracked looking at building an AGC. It's possible to a limited extent: the command dataset does contain some other words that we can try using. Getting performance from a small enough model looks doable. Hopefully I should have something up this week.

  • @maul6117
    @maul6117 11 months ago

    Do you have to watch these in a certain order? Is there a playlist for just the DIY Alexa project?

  • @erikpratama7685
    @erikpratama7685 3 years ago +2

    Hello, nice project, can I use an ESP32-CAM??

  • @dreyreis
    @dreyreis 3 months ago

    Is it possible to use a pre-trained voice model and install it on a device (like a model of a famous person, perhaps)? If so, how would we do this?

    • @atomic14
      @atomic14 3 months ago

      The ESP32 isn’t really powerful enough to do that locally. But there are APIs that you can call that will do Text To Speech (TTS). And some of them offer custom voices.

  • @alphoncemutabuzi6949
    @alphoncemutabuzi6949 2 years ago

    Thanks a lot, brother

  • @HassanPhiri-kx1im
    @HassanPhiri-kx1im 3 months ago

    Does any kind of esp32 work or does it have to be the ESP32-S2 saola 1R DEV KIT?

  • @your.free.electrons
    @your.free.electrons 3 years ago +1

    Hey, this one's awesome :')

  • @ei23de
    @ei23de 3 years ago

    The following question may fall below the standard of your channel, but since you introduced me to Jupyter Notebook, I have to know which software you are using for presentation.
    This is not PowerPoint, is it?

    • @atomic14
      @atomic14 3 years ago +1

      I use a bit of a mix for videos - I'm on a Mac so use Keynote (the Mac equivalent of Powerpoint). I've been trying to learn the manim library which is what the guy who does 3Blue1Brown uses. I've also got my own homegrown animation library that I use for some things - but it's definitely not really ready. I've used Apple Motion for a couple of videos, there is quite a learning curve with it and I'm nowhere near proficient.

    • @ei23de
      @ei23de 3 years ago

      @@atomic14 I like your video style, it looks professional.

    • @atomic14
      @atomic14 3 years ago

      @@ei23de Thanks!

  • @kavishchattoor1729
    @kavishchattoor1729 11 months ago

    Sorry, I know this might be late, but I am replicating a similar project. Did you use the ESP32 to capture the audio signal? My ESP32 doesn't have enough memory to capture enough data.

  • @kingsleybaros2095
    @kingsleybaros2095 7 months ago

    In the end, what code are you uploading to the ESP32 to run your entire system, please?

  • @OMNI_INFINITY
    @OMNI_INFINITY 5 months ago

    Found where rabbit AI maybe started

  • @aisolutions834
    @aisolutions834 3 years ago +1

    Hi there!
    Nice work. Is it possible to run a TensorFlow object detection model like MobileNet on the ESP32? OpenMV has this capability using the TFLite library, but I am interested in running object detection on the ESP32, which is very low cost in comparison. Thanks!

  • @maul6117
    @maul6117 11 months ago

    Is there a step-by-step video for the hardware build?

  • @dariovicenzo8139
    @dariovicenzo8139 2 years ago

    Great video! What I don't understand (I'm at the basics of TF) is why we need to use a cloud AI service when we are trying to make an edge device. In other words, we lose the advantage of building an edge system if we need a cloud service. So I could skip the lite model and do everything in the cloud, using the ESP32 as an audio transmitter. I hope I have understood the purpose of the Facebook service correctly. Thanks.

    • @atomic14
      @atomic14 2 years ago

      Hi Dario, that is a very good question. One of the issues with using the ESP32 as an audio transmitter and doing the wake detection in the cloud is privacy - you really want the user to be in charge of when the device is actively listening and sending their data to a third-party service. So you really want the device doing the wake word detection and only sending audio data to the internet once the wake word has been detected. Currently, doing full intent recognition on the edge is too difficult on a device like the ESP32 - however, there is software for the Raspberry Pi that looks very promising - rhasspy.readthedocs.io/en/latest/

  • @guilhermevini65
    @guilhermevini65 2 years ago

    Amazing !!!

  • @nielspaulin2647
    @nielspaulin2647 1 year ago

    EXCELLENT!

  • @pruthvirajvenkatesha6897
    @pruthvirajvenkatesha6897 2 years ago +2

    Thanks for this! Amazing work! I have a few questions and it would be helpful if you could reply. Can we use this procedure to build the same thing for the ESP32-S3? It seems you used the Arduino framework, which I checked and is not available for it yet in VSCode. Are there any other approaches to building this firmware on the ESP32-S3?
    Also, do we have info on the number of computations in the KWS model? Based on a few algorithm papers which are validated on the Google speech dataset, it is always a trade-off between accuracy and total computations, so I wanted to know the procedure used to select an algorithm.
    Last question: can we build any TFLite model using the TFLM framework?

  • @dicle6714
    @dicle6714 2 years ago

    I can't compile this application with Arduino IDE. I made the necessary file edits.

  • @faizabdulchakim8796
    @faizabdulchakim8796 3 years ago +1

    This is the ESP32-S2 Saola-1, right? Is it possible to use another type of ESP32?

    • @atomic14
      @atomic14 3 years ago +1

      Definitely - pretty much any ESP32 dev board will work - I'm not using any special features.

  • @rafaelmatos8754
    @rafaelmatos8754 7 months ago

    How do you get so many examples of the word Marvin?

    • @atomic14
      @atomic14 7 months ago +1

      Weirdly, it was in the training data. I guess the people who compiled the audio samples were fans of Douglas Adams.

  • @SinanAkkoyun
    @SinanAkkoyun 3 years ago +2

    Wow wtf!!!!!! 😍😍😍😍😍😍

  • @digitronix532
    @digitronix532 11 months ago

    Kindly help me with programming the ESP32... how do I integrate the Python program and the ESP32?

  • @khaoulakanna4227
    @khaoulakanna4227 3 years ago +1

    Can this be done in a language other than English?

  • @DayanandKushwaha-ef6oi
    @DayanandKushwaha-ef6oi 8 months ago

    I am not getting audio output, please help...

  • @DJ1TJOO
    @DJ1TJOO 1 year ago

    Can this work with a normal sound sensor that just has an analog output?

  • @SonuRauniyar
    @SonuRauniyar 3 years ago

    Pretty cool stuff :). I want to make my own wake-up word detection system using a custom audio dataset. Let's say my wake-up word is "Hey Marvin", which I assume is longer than 1 second. How many data points would be enough to train the model? And since I will use the Google speech dataset to add noise for better accuracy, do you think the 1-second time frame will matter here?

    • @spacecdr
      @spacecdr 2 years ago

      A linux terminal with alsa and curl installed! "software"...😂

  • @devisnugroho
    @devisnugroho 2 years ago

    What kind of software do you use at 3:58, where the waveform and spectrogram are shown in real time?

  • @ahlamhusni6258
    @ahlamhusni6258 1 year ago

    From what distance can the microphone pick up the voice?

  • @emilianotl3572
    @emilianotl3572 2 years ago

    Do you know if I can use Dialogflow to control devices that are connected to Google Home?

  • @amarjeetkumarfor
    @amarjeetkumarfor 10 months ago

    Can I have the circuit diagram, please?

  • @devmishra4131
    @devmishra4131 2 years ago

    Hi sir, I have one more question: what did you use as the terminal at 16:13? I tried many ways to run the link (I used my own recording, saved on the desktop, and pasted the path) which I got from my wit.ai account in my Windows terminal, but it didn't work. I also tried to find other ways to do that, but nothing worked. So, please reply as soon as possible.

  • @gsge
    @gsge 3 years ago

    Apart from your vast knowledge of hardware and software, you are the best teacher at making a quite complicated subject very easy to understand for a newbie like me.
    Is it possible to bypass a cloud service like wit.ai and host it on a local Raspberry Pi for a totally local solution?
    Thank you.

    • @atomic14
      @atomic14 3 years ago

      Yes - there's a solution called Rhasspy - rhasspy.readthedocs.io/en/latest - I think in theory you should be able to swap out Wit.ai for it. The code for decoding the response will probably need to change, but it looks doable.

    • @gsge
      @gsge 3 years ago

      @@atomic14 Thank you.

    • @digitronix532
      @digitronix532 10 months ago

      How do I upload the program to the ESP32?

  • @TechnicalShubhamofficial
    @TechnicalShubhamofficial 2 years ago

    Hey, can you tell me how to program the ESP32 and where the final code is?

  • @EricSouzarys
    @EricSouzarys 2 years ago

    Do you think it's possible to train the model so it can detect a ringtone?

  • @Techn0man1ac
    @Techn0man1ac 3 years ago

    Thank you very much

  • @fiottovotre7202
    @fiottovotre7202 1 year ago

    How can I navigate the dataset plz? Actually, I can't find it

  • @izigoldenberg218
    @izigoldenberg218 3 years ago +1

    Is there any chance this could work with ESP8266 instead of the ESP32?

    • @atomic14
      @atomic14 3 years ago +2

      I think that might be difficult - it is pretty much pushing the limits of the ESP32.

  • @typingcat
    @typingcat 2 years ago +1

    I need an off-the-grid system, not one using a voice recognition service from Facebook. Who knows what Zuckerberg is going to do with your data. Also, as I see in the demo, there is quite a significant delay, like 3 seconds. One of the reasons why I want to create my own is that I don't like the delay of Google Home.
    I don't know how other people use voice assistants, but I have found that they are dumb. Not really "A.I.", just responders scripted by some programmers. So, I don't really try to "speak" to it, but just say some fixed-structure phrases that I know it will understand, like "turn on the light", etc. In short, all I need is speech to text. If I could get a string like "turn on the light", I could parse it and turn on the light myself. Is the ESP32 powerful enough to convert speech to text on its own?
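The "parse it myself" step for fixed-structure phrases is small once the text is available. The sketch below is a Python illustration only; the phrase pattern and the `(state, device)` return shape are assumptions, and it says nothing about whether the speech-to-text step itself can run on the ESP32.

```python
import re

# Matches phrases such as "turn on the light" or "Turn OFF kitchen lamp".
COMMAND_PATTERN = re.compile(
    r"^\s*turn\s+(on|off)\s+(?:the\s+)?(.+?)\s*$", re.IGNORECASE
)


def parse_command(text):
    """Return (state, device) for a 'turn on/off <device>' phrase, else None."""
    match = COMMAND_PATTERN.match(text)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()
```

For example, `parse_command("turn on the light")` gives `("on", "light")`, which could then be mapped to a GPIO pin or a smart-home call.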

  • @JernD
    @JernD 3 years ago

    This is probably a silly question, but why did you take the log(audiodata) after audio normalization? Would it be superior to swap those operations?

    • @atomic14
      @atomic14 3 years ago

      Hey John, definitely not a silly question, the audio is normalised and then we calculate the spectrogram of the normalised audio. The log operation is applied to the spectrogram output. The spectrogram can end up with some very large values and the log operation brings them down into a more sensible range for the neural network to train against.
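The order of operations described in this reply (normalise the audio, compute the spectrogram, then take the log of the spectrogram) can be sketched with the standard library alone. This is an illustration under assumptions: the mean-removal and peak scaling, the frame sizes, the `eps` offset and the naive DFT are choices made for the example and are not claimed to match the notebook, which would normally use an FFT library.

```python
import cmath
import math


def normalise(samples):
    """Remove the mean and scale so the largest absolute value is 1."""
    mean = sum(samples) / len(samples)
    centred = [s - mean for s in samples]
    peak = max(abs(s) for s in centred)
    return centred if peak == 0 else [s / peak for s in centred]


def power_spectrum(frame):
    """Naive DFT power for bins 0..N/2 (fine for a sketch, slow for real use)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(frame))) ** 2
        for k in range(n // 2 + 1)
    ]


def log_spectrogram(samples, frame_len=64, step=32, eps=1e-6):
    """Normalise first, then take the spectrogram, then apply log10 to it."""
    audio = normalise(samples)
    frames = [audio[i:i + frame_len]
              for i in range(0, len(audio) - frame_len + 1, step)]
    # eps avoids log(0); the log compresses the large spectrogram values.
    return [[math.log10(p + eps) for p in power_spectrum(f)] for f in frames]
```

Taking the log last is what brings the very large spectrogram values down into a range that is easier for the network to train against.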

  • @AryanKapur0605
    @AryanKapur0605 2 years ago

    Hi! Can I use an ESP32-CAM instead of the ESP32? Thanks!

  • @edgull_tlt
    @edgull_tlt 2 years ago

    Thanks for the video. It was interesting.

  • @apoorvanavin3300
    @apoorvanavin3300 5 months ago

    Which language does this work in? Python?

  • @55cancri_e76
    @55cancri_e76 1 year ago

    Hi sir,
    Thank you for the great video.
    My teammates and I are trying to make a similar project to yours. I would like to ask how you linked the Python code with the C code. Also, how did you upload the code to the ESP32? Was it the C code or the Python code?

  • @thomasob42
    @thomasob42 3 years ago

    Can this project be implemented using the Arduino Nano 33 BLE Sense?

  • @Pavana_sai
    @Pavana_sai 3 years ago +1

    Hi, wonderful project.
    I'm interested in building the same project. Can you help me?

  • @THEbonny95
    @THEbonny95 1 year ago

    Can't this be done with Google Assistant?

  • @ei23de
    @ei23de 2 years ago

    Hey, I hope you don't mind if I mention this video (and your channel) in one of my future videos?

    • @atomic14
      @atomic14 2 years ago +1

      Go for it :)

    • @ei23de
      @ei23de 2 years ago

      @@atomic14 ua-cam.com/video/-Hfow7KMCK8/v-deo.html
      (but it's in German...)

  • @data_resources
    @data_resources 2 years ago

    Hello, I followed your instructions and did almost all of the project, but I am having trouble getting the output sound when I give the commands

  • @francegall-web9819
    @francegall-web9819 3 years ago

    Mr. atomic14, really impressive. Since you are very good at programming, can you help us reprogram the HLK-V20 speech recognition module? It is a very cheap chip - three dollars - which provides offline speech recognition, but its manufacturer does not explain how it is programmed. (There is also the SU-10A, which is the same thing from a different manufacturer.)

  • @keithsummers2842
    @keithsummers2842 3 years ago

    You didn't really mention the size of the project. What is the expected memory footprint of the Flashed program?

    • @atomic14
      @atomic14 3 years ago

      It uses about 1.1 MB of flash. When running, memory is tight: making the HTTPS connection to Wit.ai leaves about 30K of RAM.

    • @keithsummers2842
      @keithsummers2842 3 years ago

      @@atomic14 I'm working on a project right now where just WiFi and BLE implemented is soaking up about 1.5M of flash. As long as the entire project remains below about 3M then OTA continues to be possible in the WROOM-32 with 16M flash. I was most concerned about OTA memory space. Thank you for the response and the excellent video post here on UA-cam.

    • @atomic14
      @atomic14 3 years ago

      @@keithsummers2842 No problem - thanks and good luck with your project!

    • @keithsummers2842
      @keithsummers2842 3 years ago

      @@atomic14 You seem to be very knowledgeable. Could I hire you for consultations just to keep us on track with our project? I can be reached at Keith@SSLEDLighting.com

  • @devmishra4131
    @devmishra4131 2 years ago

    I followed all of your steps and found it really amazing and helpful!! But I have a question: how are we going to upload this code to the ESP32 or ESP8266? You don't have any .ino file, so you must not be using the Arduino IDE for that. Which IDE are you using, and if it is VSCode, what settings did you use? Please tell us, it would really help everyone.

    • @atomic14
      @atomic14  2 years ago +1

      I'm using PlatformIO, just install VSCode and download the PlatformIO plugin.

    • @devmishra4131
      @devmishra4131 2 years ago

      @@atomic14 Thanks a lot sir for your reply, it means a lot to me.
      Looking forward to a successful test!!

    • @data_resources
      @data_resources 2 years ago

      @@devmishra4131 Can you explain how you did it?

    • @digitronix532
      @digitronix532 10 months ago

      How do I upload the program to the ESP32?

  • @hokuspokus8570
    @hokuspokus8570 2 years ago +1

    Marvin tell me a joke .... OK

  • @Yakroo108
    @Yakroo108 7 months ago

    👍👍👍

  • @MaxSMoke777
    @MaxSMoke777 3 years ago +11

    Oh... you had me going... right up until you said the voice processing was by Facebook. I'm trying to get big brother OUT of my life!

    • @atomic14
      @atomic14  3 years ago +5

      Have a look at Rhasspy - rhasspy.readthedocs.io/en/latest/ - it should be possible to replace wit.ai with it.

    • @horrorhotel1999
      @horrorhotel1999 1 year ago

      Thanks, you just saved me a lot of time

    • @ARIJITRAKSHIT_Create_Marvel_
      @ARIJITRAKSHIT_Create_Marvel_ 1 year ago

      @@atomic14 Can it respond to commands other than those?

  • @techs5564
    @techs5564 4 months ago

    My unit doesn't respond to "Marvin". What should I do?

    • @h4l050
      @h4l050 3 months ago

      Same here... Have you found the solution?

    • @techs5564
      @techs5564 3 months ago

      Now it responds to "Marvin", but only if you say "Marvin" in a particular way, and only sometimes, not always. The remaining issue is that it doesn't turn any lights ON/OFF. There are also some errors in the Jupyter notebook. The process is complex and it is not explained properly in this video.

  • @yaowang4490
    @yaowang4490 3 years ago

    I have installed PlatformIO in VSCode, but when I open the diy-alexa-master folder, VSCode tells me "this is not PlatformIO project (should contains platformio.ini file)". Please tell me how to open your project. Looking forward to your reply. Thank you.

    • @atomic14
      @atomic14  3 years ago

      Ah sorry - you need to be in the "firmware" folder. That should fix your problems.

    • @yaowang4490
      @yaowang4490 3 years ago

      @@atomic14 😄 Would you be able to give me your contact information?
      I still can't import your project into VSCode. I feel a little frustrated. My setup is VSCode + ESP32-WROOM-32U (hardware) + PlatformIO.

    • @atomic14
      @atomic14  3 years ago

      @@yaowang4490 Easiest way is to raise an issue on the GitHub repo - I can help you from there and there are other people who will be able to help as well.

    • @yaowang4490
      @yaowang4490 3 years ago

      @@atomic14 OK. Following your instructions, I successfully imported your project, but when I clicked the "ESP-IDF build project" button, I got this error: CMake Error: The source directory "C:/Users/liu/Downloads/diy-alexa-master/diy-alexa-master/firmware" does not appear to contain CMakeLists.txt.

    • @atomic14
      @atomic14  3 years ago

      @@yaowang4490 I think you are trying to import it as an ESP-IDF project. It's a PlatformIO project. You just need to open it in VSCode - no need to import or do anything like that. Just make sure you have the PlatformIO plugin (platformio.org/) installed in VSCode and open the firmware folder. As I say, these kinds of questions are much easier as a GitHub issue, where we can share images and code snippets.

  • @yaowang4490
    @yaowang4490 3 years ago

    Hello, can you tell me how to import the project into VSCode? Looking forward to your reply. Thank you.

    • @atomic14
      @atomic14  3 years ago

      Hi Yeo, you'll need the PlatformIO extension installed and then you just open the folder the project is in.

    • @yaowang4490
      @yaowang4490 3 years ago

      @@atomic14 I have successfully run your project, but I don't know whether the INMP441 is working. How can I print the data from the INMP441? Thank you.

  • @digitallifetanzania2373
    @digitallifetanzania2373 2 years ago

    Can it answer any questions?

  • @luisfelipesaldivar5100
    @luisfelipesaldivar5100 3 years ago

    I have a question. Maybe it's stupid or too obvious for you, but how are the files uploaded to the ESP32, and which ones? I don't really understand how it's done.
    And can I use the Arduino IDE to upload the code?

    • @atomic14
      @atomic14  3 years ago

      Hi Luis, I'm using PlatformIO for the project - it's a lot better than the Arduino IDE when you have a lot of files to manage. You can upload directly from PlatformIO; it will handle it all for you.

    • @luisfelipesaldivar5100
      @luisfelipesaldivar5100 3 years ago

      @@atomic14 I've downloaded PlatformIO and loaded the "firmware" folder into it, but it gives me 27 errors when I try to upload to my ESP32. I searched for the errors that appeared, but I don't understand them.

    • @atomic14
      @atomic14  3 years ago

      @@luisfelipesaldivar5100 Hi Luis, check the platformio.ini file. It may be that upload_port and monitor_port are set to the wrong values. You can delete these entries or change them to the correct ones.

    • @data_resources
      @data_resources 2 years ago

      Hey, I was wondering how you upload the files to the ESP32?
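
      The suggestion earlier in this thread (delete the upload_port and monitor_port entries from platformio.ini so PlatformIO can pick the port itself) can be sketched as a small script. This is only a sketch: the sample file contents and the port name below are hypothetical and the project's real platformio.ini may look different. The only names taken from the thread are upload_port and monitor_port.

```python
import configparser
import io

# Hypothetical platformio.ini contents; the real file in the repo may differ.
SAMPLE_INI = """\
[env:esp32dev]
platform = espressif32
board = esp32dev
framework = arduino
upload_port = /dev/cu.SLAB_USBtoUART
monitor_port = /dev/cu.SLAB_USBtoUART
monitor_speed = 115200
"""


def strip_port_overrides(ini_text):
    """Return ini_text with upload_port and monitor_port removed from every section."""
    # interpolation=None so values are kept verbatim and never expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    for section in parser.sections():
        for key in ("upload_port", "monitor_port"):
            parser.remove_option(section, key)  # does nothing if the key is absent
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


cleaned = strip_port_overrides(SAMPLE_INI)
print(cleaned)
```

      Note that configparser drops comments and lowercases option names when it rewrites a file, so for a one-off fix, deleting the two lines by hand in an editor works just as well.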