Thanks for taking a bullet for the rest of us.
anytime :)
It's actually working much better than that. :)
Glad to see a fellow citizen doing such videos 🎉 On the topic: I guess the more we work on this and keep pushing in this direction, the faster we will get there. Surely a curious person can build such a device on their own with a better mic and speaker. If I don't forget, I'll get back here later this year with feedback on the DIY topic once I'm done with the whole renovation and start tinkering with the whole Homelab.
Thanks for the comment and please don't forget the feedback when you are ready!
Had the same problems and obstacles with this device and sent it back. Now I use Voice Assist on my Android tablet which works like a charm.
I just finished setting up Assist to work from my Android tablet that I use with the HA Companion app, usually in split screen with Kindle or YT...
Seems to work well, once I created aliases for my entities...
Did you use the Hotword Plugin and Snowboy to generate your wake word, with Tasker or Automatic? Or is there another better or simpler way?
can you talk to it from a distance?
@@gdaaps No! The microphone of the tablet is not built for that. But in my camper van this is not a big problem.
I think we all want to get rid of our Alexa devices but their hardware, no doubt backed by millions in research, is very good, particularly their microphones and speakers, for their price. Microphones have to be very good for seamless integration into a smart home and I'm not used to yelling at Alexa.
True, sad but true. At least we are heading in the right direction, and I will keep trying to replace my cloud voice assistant with something local.
Also, it is more expensive.
Agreed, and that is why I think the projects that reuse those companies' devices and de-Amazon or de-Google them are more promising.
If that's the case, it makes me wonder why people didn't just reverse engineer it to see what it was and use something similar.
I think the time has come, sooner or later someone will definitely be able to run all this locally.
If you accept that your music device can be separate from the voice assistant, it makes the switching process much faster. I have already switched to Wyoming satellites in pretty much every room except the living room, and only because I'm still writing an Assist integration to play music from Music Assistant on a user-defined speaker by voice.
Thank you for this so much! The timing is perfect, since I just got mine yesterday :) I like dark theme when my eyes are tired or hurt. I like light theme when it's the middle of the day :)
Wonderful! Hope you enjoy the device, and yes, you are right: eye care is more important than color preferences :)
Thank you for this material. I see a need for follow-up material in which you could show how things improved after you've put some time into adjusting the configuration (and maybe a comparison with cloud processing?)
Noted! Thanks for the idea!
Thanks for the video. If the report is true, Nabu Casa is supposed to release a microphone interface later this year.
I also heard about that, I hope they are going to make it ;)
Wow, I am really glad I watched this because I had never messed around with the models before. When I looked at them, it had mediums-us chosen. The local response took about 5 to 7 seconds. I changed to the default (tiny-int8) and response times are just as fast as the cloud, if not faster.
I am running the x86 image on a 2-year-old AMD chip (no dedicated GPU) with 32GB of RAM. Overkill, but my 12-year-old Intel NUC, which was my previous HA server, was dying. The add-on also uses less RAM. Odd, but I'll take it.
Finally switching my default voice pipeline to local only! I've been using Nabu Casa cloud for years (and will continue to) for external access, so using the cloud for voice was just an option for me. I certainly trust their servers, and whatever data they do collect, more than Google or Amazon.
You can get a discount if you get a year, which I believe is $35 to $40. At least it used to be.
Glad that I helped you go local :) And yes, changing models is a great way to optimize response times for your system. Supporting the Nabu Casa team is a good cause and I highly appreciate it, but just to mention that there are at least 3 other very good and free methods for Home Assistant remote access, and I covered them all in my previous videos :)
I won't be throwing out my Echos for now, though I am interested in local voice control. I'll check back in 6 months to see what improvements HA has made
fair enough :)
First of all, thanks for the video 🫶
Secondly, the different “stands” for the screen have other functions; in particular, one of them is filled with sensors, such as an IR receiver/emitter, a temperature sensor, etc.
The question that interests me is: do you happen to know whether it is possible to use the device as a voice assistant and, at the same time, make good use of the mentioned sensors as well as the screen itself, not only for displaying 6 states but also for displaying a Home Assistant dashboard, video output or something like that, using the touchscreen to control these widgets? I'd appreciate your answer 🤞
Thanks for this video. By the way, most of the time I use the dark theme. I find it more comfortable for my eyes.
You're welcome! Yes, that is the biggest advantage of dark themes
I've been using the Reolink cams I already have, plus some cheap-o cameras, as listening devices, and the Amazon Echo devices as the media speakers and outputs, with the mics turned off on the Echo devices. They have been working fairly well after fixing a few bugs and such. The cameras like to turn the mic off, but a quick IFTTT took care of all that.
Do you have any docs on how to use Reolink cams as a listening device for commands?
wow it's working just after 10 attempts
Haha, correct :) And it is not working every time, but nevertheless this is a hell of a lot of progress for a local voice assistant and I love it. I can't wait for the future developments
Great vid - very tempted to buy one and just have a play lol. Maybe just to open some blinds in my bedroom.
Go for it, it is fun... :)
Great tutorial, I am thinking about buying this unit for my setup. Has it been reliable?
It depends on what your expectations are :) Same as with everything...
Hi Kiril, yeah, thanks for the bullet :) and I have some questions :)
What is the max distance at which it hears and correctly understands what you say?
What about the speaker? It seems that it doesn't have one. If not, then no Piper is needed?
Does it handle Bulgarian well?
I didn't test the distance; I'm always invoking it from 1-3 meters and it is fine. Regarding the audio response, it's kind of a bug that I got lucky to hit during the video shooting. Now it's fine and there is an audio response after upgrading to the latest ESPHome FW. As for Bulgarian, it leaves something to be desired, but it still works, which is great. On the display, however, squares show up instead of text; I haven't played around to see whether that can be fixed :)
@@KPeyanski Great! It's nice that it has audio. I'm looking for something like that which is able to do audio announcements, and that's important for me. Thanks!
So cool!!
Thanks, are you going to try it?
@@KPeyanski I am! Your video helps out a ton. I originally found it because I accidentally bought a couple of ESP32 S3 modules thinking they might work with WLED or Awtrix (they do not lol). So I was looking for compatible projects like voice recognition, etc. But now that I’ve seen the Box version that you have, I think I’m going to just get it cause it’s cool. 😂 Like BMO from Adventure Time.
Thanks for your hard work
It's my pleasure :)
What hardware can you recommend in order to have the best and fastest performance for Piper and Whisper? Could it be a separate NVIDIA AI machine?
I ordered this one (still didn’t arrive) do you know if it’s possible to integrate it with ChatGPT for voice control?
yes, it is possible, check this video - ua-cam.com/video/1dRcfqAOuyo/v-deo.html
@ thank you!!!
I saw a tutorial with a Raspberry Pi Zero 2 W, the Wyoming protocol and a microphone installed on it. It looks like it works very well. Maybe you could try it for the rest of us :).
I'll think about it, but what will be different in terms of performance in such a setup?
@@KPeyanski The Raspberry Pi Zero 2 W only handles the Wyoming protocol and the microphone, and the Home Assistant server on a Raspberry Pi 4 handles everything else. So I suppose it's enough for good performance. But I am suspicious about that....
I have two Pi Zero 2's with the Seeedstudio speaker hat as Wyoming Satellites and the performance is comparable to this video here. It's far from being anywhere as good as an Echo device. Sometimes I need to repeat the wake word several times until it is recognized. I regularly need to power cycle them despite watchdog scripts. When music is playing on an attached speaker they don't hear you at all.
Nice for playing around, but no good for day to day use. Spouse approval factor 1/10.
Dark mode is good, but for YT videos, light mode is very much better!
Yeah I agree
The S3 Boxes are frustrating me - the mics are not very well designed - my raspi satellites work much better! Looking forward to their own hardware....
what's a raspi satellite
My experience exactly. I tried it in German and almost every time it didn't understand anything but the wake word. Sadly…
It's a shame that Nabu Casa doesn't have the business acumen to realize they need a rock-solid voice assistant WITH off-the-shelf hardware to match, rivaling Google/Amazon. 2023's Year of the Voice was a complete bust, as we are nearing the end of 2024 and they still have very little to show for it, publicly at least. All the DIY solutions (yes, to date I have seen them all) are still trash for the most part, as this video shows. Unfortunately, I get much better and faster results from cheap, old, low-end, cloud-based Google speakers. As a program/project manager and business owner, it is obvious to me that Nabu Casa is seriously lacking in focus and direction business-wise. It's no wonder they are nearly 7 years old (not to mention HA has been around much longer) and have fewer than 50 employees. That is abysmal growth for a for-profit tech company nowadays. Google/Amazon assistants = 2024, Apple = 2021, HA assistant = 2016. Don't misunderstand, I want nothing more than a local voice assistant in HA and to ditch Google and the cloud once and for all.
ordered ;-) thanks
Enjoy!
How do you get the log screen to show on the right side?
I am using HA on an RPi 4 and tried this with an ESP32 and a microphone, but HA puts a lot of load on the processor.
I would like to see the load on the processor and memory being used in your tutorials.
hmm, good idea! I will include that in my next video tests ...
I have one of these... I found that if I stand too close, it doesn't understand me?
Strange, mine works better from closer positions....
If anyone is considering buying one of these - just don’t. Honestly, just do not do it to yourself. It barely responds to wake words which is the most frustrating part of the whole voice experience. It’s junk. One of the most disappointing products I’ve ever bought.
Additionally, when I first started, the speaker didn’t work. I flashed 4 different firmwares - no audio. Then I reflashed ESPHome again for the fourth time and suddenly the speakers started working. No idea what the issue was. Maybe the circuit board needed to warm up 😂
I notice that your device is also not speaking back, just writing on the screen!
Yes, it is strange, because the $13 M5 Atom Echo responded with voice and this device does not. I guess it is some sort of setting somewhere, or I have to use another mount.
@@KPeyanski It's really odd because if you watch the Home Assistant videos the S3 Box is speaking. I read a post on the forum saying they rolled back the software and then the voice worked!
Making smart use of supply and demand. It's certainly not worth 80 euros.
I think they increased the price recently. Not sure why, but it was a bit cheaper before...
Yeah, sticking to Alexa for now. The Year of the Voice has passed and this hasn’t matured beyond what we saw in the beginning.
Oh, it has matured a lot! We didn't have a wake word at all in the beginning. Actually, we didn't have anything local that worked in the beginning. Also, another voice chapter is coming later this month, so nothing has passed yet :)
80+ euros for a device that has a crappy small screen.....
microphones that are 50-50 working...
and a system that is 50-50 working....
nope. i will wait a year and see if things change.
if i can't beat my Google Nest's performance on microphone and reaction... i will not be spending any money on devices like these
Fair enough :) As @INSIGHTSAU said in another comment, I took the bullet for you...
Where is the program code?
Struggling with your accent?
Sorry to hear that, I'm accepting English accent consultations as long as they are free and good ;)
@@KPeyanski It struggles with mine too and I am English.
In my opinion, it is not the device but the processing that needs advancing. Love to see a video with processing on Home Assistant cloud.
@@KPeyanski I think he meant to say that the device is struggling with your accent.