Wio Terminal TinyML Course #6 Speech recognition on MCU - Speech-to-Intent
- Published 12 Jul 2024
- Learn how to directly parse user utterances into actionable output in the form of intents/slots. In this video I will share techniques to train a domain-specific speech-to-intent model and deploy it to a Cortex-M4F based development board with a built-in microphone, the Wio Terminal from Seeed Studio.
Link to the article:
www.hackster.io/dmitrywat/tin...
Hardware.ai Twitter
/ hardwareai
Wio Terminal
www.seeedstudio.com/Wio-Termi...
Speech-to-Intent-Micro Github:
github.com/AIWintermuteAI/Spe...
Speech-to-intent model deployment to low-power low-footprint devices TinyML Talk
forums.tinyml.org/t/tinyml-ta...
0:00 Demo
1:12 Intro
2:40 What is Speech to Intent
5:01 Training code for reference model
7:27 Fluent.ai Speech commands dataset
8:21 Data processing and model training
18:10 MCU Inference code explanation
26:20 Testing the inference on device
27:57 Improvements and conclusion
Credits for the music:
Sahara - A Chillwave Mix
• Sahara - A Chillwave Mix
Axium Waves - Paradise
whilefalse - Petrichor
- Science & Technology
Thank you very much for the information, everything is explained very clearly.
Thank you for the continued support!
This series is very useful for me
Glad to hear that
🔥🔥🔥🔥🔥
Fire detected!
The def_create_aug_pipeline function is showing an error. Also, which path should I use in place of "../data/wavs/background_noise"? Whichever path I use from the dataset shows an error. Please, someone help me out with this.
I fixed this recently here github.com/AIWintermuteAI/Speech-to-Intent-Micro/commit/3f5d8b533be4891a7d4732bb6fdf5dd33c841695
but perhaps there is something else going on. Can you create an issue in GH? Please post detailed error description there.
I'm a little confused between machine-learning, deep-learning (neural-networks).. and plain old OpenCV!
For example, is the idea that you must collect the data (audio, images, or sensor data) via the embedded device, but the training/learning is always done on the cloud (ex. Edge Impulse) or a PC (tensorflow, keras, etc)?
And how do you tune the training/model for a particular application (or the type of data you are expecting)?
And what does the (Neurona, TinyML, etc.) output look like? Is it C/Python code, or more like data that you must then transform into an algorithm?
And will that work only on the embedded device that the training data was captured on? I assume the CPU/speed will determine how fast the algorithm will work, but what are the memory requirements?
If the embedded device has an internet connection, can the real-time classification be done on the cloud?
So many questions :)
I'll pick one
If the embedded device has an internet connection, can the real-time classification be done on the cloud?
Depends on the required latency - for some devices it can be as low as 1 ms. Hard to beat that if sending the data to the cloud.
Very interesting. One question: output = [intent_output, slot_output]. Intent and slot have different shapes. How do you calculate the cross entropy? In other words, how do you combine the two cross entropies?
Tensorflow takes care of that automatically for models with multiple outputs. During the training you can see loss and accuracy printed for both outputs, then they get combined for total loss.
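The multi-output loss that Keras computes under the hood can be sketched in plain NumPy (a minimal illustration, not the repo's actual code; the shapes and loss weights below are made up): each head gets its own categorical cross-entropy over its last axis, and the total loss is just a weighted sum of the per-output losses, with weights defaulting to 1.0.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-9):
    """Mean cross-entropy over a batch; works for any output rank
    as long as the last axis holds class probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)
    # Sum over the class axis, then average over batch (and sequence) axes
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=-1)))

# Hypothetical shapes: one intent per utterance, four slot positions
intent_true = np.array([[0.0, 1.0, 0.0]])        # (batch, n_intents)
intent_pred = np.array([[0.1, 0.8, 0.1]])
slot_true = np.eye(5)[[0, 2, 4, 1]][None, ...]   # (batch, seq_len, n_slots)
slot_pred = np.full((1, 4, 5), 0.2)              # uniform guess over 5 slots

intent_loss = categorical_crossentropy(intent_true, intent_pred)
slot_loss = categorical_crossentropy(slot_true, slot_pred)

# Keras-style total: weighted sum of per-output losses
loss_weights = {"intent": 1.0, "slot": 1.0}
total_loss = (loss_weights["intent"] * intent_loss
              + loss_weights["slot"] * slot_loss)
print(intent_loss, slot_loss, total_loss)
```

In Keras itself you get the same behavior by passing a list/dict of losses to `model.compile(...)`, optionally with `loss_weights` to rebalance the two heads.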
hi..
We need more AI projects using this Wio Terminal, good job! Which one is better, the Arduino Nano 33 BLE Sense or the Wio, especially for the microphone? Can we connect an external mic, such as I2S or even a USB mic?
So, for microphone, nano 33 BLE sense is better. You can also try XIAO BLE Sense, which is smaller and might be more available at this time.
USB won't be possible, but I2S should be fine.
Can you please explain how to do speech recognition that trains directly on microcontroller and only works with the voice of person who trains it, similar to dfrobot voice recognition module
Hmm. From what I understand, the DFRobot module does not use a NN - those are very hard to train on device. There are a few approaches you might take - the one I would try is: 1) train a keyword feature extractor, 2) use a k-means classifier on top of that feature extractor - it is very easy to "re-train" a k-means classifier, feasible even on an MCU.
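The extractor-plus-centroids idea above can be sketched in a few lines of NumPy (a toy illustration, not working firmware: the class name, labels, and the random vectors standing in for extractor embeddings are all made up). Enrolling a new speaker's keyword only requires averaging a handful of embedding vectors, which is why this kind of "re-training" is feasible on an MCU.

```python
import numpy as np

class NearestCentroid:
    """Tiny nearest-centroid classifier: enrolling a keyword just
    means averaging a few embeddings from a frozen feature
    extractor - no gradient descent, so it fits on an MCU."""

    def __init__(self):
        self.centroids = {}  # label -> mean embedding

    def enroll(self, label, embeddings):
        # embeddings: (n_examples, dim), e.g. from a keyword-spotting
        # NN's penultimate layer
        self.centroids[label] = np.mean(embeddings, axis=0)

    def predict(self, embedding):
        labels = list(self.centroids)
        dists = [np.linalg.norm(embedding - self.centroids[l]) for l in labels]
        return labels[int(np.argmin(dists))]

# Toy stand-in for extractor outputs: two well-separated clusters
rng = np.random.default_rng(0)
clf = NearestCentroid()
clf.enroll("lights_on", rng.normal(loc=0.0, size=(5, 8)))
clf.enroll("lights_off", rng.normal(loc=3.0, size=(5, 8)))

print(clf.predict(np.full(8, 3.0)))  # lands near the "lights_off" centroid
```

Because only the enrolled speaker's embeddings define the centroids, utterances from other voices tend to fall farther away, which loosely gives the speaker-dependence the question asks about.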
@Hardwareai I found a new ML model light enough to train on the microcontroller itself. It's called the "SEFR classifier" and it's new, from 2020.
I installed board version 1.8.2 but somehow I get the "board not supported" error. What could be the cause?
Can you create an issue in Github repo?
I'm facing a problem deploying this project on the Arduino Nano 33 BLE.
Can you help me deploy it?
Yes, it won't work as-is. You basically need to take care of the microphone driver, using the existing PDM examples. It should not be too difficult.
I have a problem
tensorflow/lite/micro/system_setup.h: No such file or directory
Are you using the right version of Tensorflow Lite for microcontrollers?
wiki.seeedstudio.com/Wio-Terminal-TinyML-TFLM-1/#install-the-arduino-tensorflow-lite-library
Here is detailed explanation.
Can this device recognize the voiceprints of different mosquitoes?
The microphone most likely won't be precise enough. You'll need a professional grade microphone for that.
For example, what kind of microphone device? An I2S microphone? Please suggest one.
While training, it's stopping early because accuracy is not improving. Any way to fix this?
Not possible to say without inspecting the data :)
I'm using the dataset given in the updated Jupyter notebook as a Drive link. I'm just trying to run all the cells to see if I am able to generate the .h file on my machine.
If you could help me out, it would be great. I haven't made any changes to the other cells. Can you verify it's working on your machine? The validation loss/accuracy does not improve after the 3rd epoch. Thanks so much!
@tabrezahmed1000 Okay. Can you post a description of the problem together with relevant logs as a Github issue?
@Hardwareai I have posted the issue.
I have opened a new issue can you please check it out
You are getting really good at these presentations. But it's time to lose the "um". All the great presenters do stand-up comedy to perfect their speech.
Um, I agree ;)