How to Program Speech Synthesis in an Animatronic Mouth Using Python and Arduino
- Published 24 May 2020
- Here's a closer look at the programming behind my animatronic mouth. Using Arduino, Python, and a few open-source libraries, I take a typed sentence and convert it into an animation sequence.
Support me on Patreon! / nilheimmechatronics
Contact: enquiries@willcogley.com
Discord Server: / discord
Open source animatronic mouth design: www.nilheim.co.uk/latest-proje...
Instructable: www.instructables.com/id/Simp... - Наука та технологія
As a linguist, I can say that this is awesome sauce.
I did the same thing with a talking head program. Used the same dictionary and mapped the visemes to the phonemes. You could have a conversation in real time with my app (called Ayako). Awesome that you took it further with the robotic mouth. Well done.
That looks great! I'm working on the same kind of controller, but I'm using a stereo track that is analyzed by a Teensy Audio Board. The right track contains the speech and controls the jaw; the left track contains tones that correspond to the mouth positions of the other servos.
For example, 200 Hz for A, E, I, and 250 Hz for B, M, P.
So the actor who's going to perform with this mouth only has to make an audio track with the right tones in the right positions on the left track.
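One common way to detect a single control tone like the 200 Hz / 250 Hz cues described above is the Goertzel algorithm, which measures the energy at one target frequency in a block of samples. A self-contained sketch (the sample rate, block size, and frequencies are illustrative, not from the commenter's setup):

```python
# Goertzel algorithm: energy of one target frequency in a sample block.
# Enough to distinguish e.g. a 200 Hz cue from a 250 Hz cue on a control track.
import math

def goertzel_power(samples, sample_rate, target_hz):
    """Return the power of `target_hz` in `samples`."""
    k = round(len(samples) * target_hz / sample_rate)  # nearest DFT bin
    w = 2.0 * math.pi * k / len(samples)
    coeff = 2.0 * math.cos(w)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

# Synthetic 200 Hz control tone sampled at 8 kHz, 800 samples (20 full cycles)
rate, n = 8000, 800
tone = [math.sin(2 * math.pi * 200 * i / rate) for i in range(n)]
```

Running `goertzel_power(tone, rate, 200)` yields a much larger value than `goertzel_power(tone, rate, 250)`, so thresholding per candidate frequency is enough to decode the cue track.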
Very good video, and even better work.
It's really amazing and interesting.
Thanks for your work.
Excellent work ! Awesome !!
Excellent work. My upcycled Teddy Ruxpin uses a much simpler set of visemes.
Nice work !!!
Good work!
Instead of mapping each servo position for each viseme in an array, you could try building an inverse kinematic model and just map each end position for the mouth to a set of servo positions; you could even drive that from motion capture.
For emulating audio you would probably need to look at how the frequency domain representation of the signal is changing over time.
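The commenter's suggestion, watching how the frequency-domain representation changes over time, is essentially a short-time Fourier transform: slice the audio into short frames and take a spectrum of each. A plain-Python sketch for clarity (a real project would use numpy/scipy; the frame size and signal here are illustrative):

```python
# Basic STFT: one magnitude spectrum per short frame of audio.
import cmath
import math

def stft_magnitudes(samples, frame_len, hop):
    """Return a list of magnitude spectra, one per frame."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        spectrum = []
        for k in range(frame_len // 2):  # keep positive frequencies only
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n, x in enumerate(frame))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

# A steady 440 Hz tone at 8 kHz: the peak bin should stay put across frames
rate = 8000
sig = [math.sin(2 * math.pi * 440 * i / rate) for i in range(1024)]
spectra = stft_magnitudes(sig, frame_len=256, hop=128)
peak_bins = [s.index(max(s)) for s in spectra]
```

For lip-sync from audio, you would track how the spectral shape (e.g. formant peaks) moves from frame to frame rather than just the single loudest bin.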
Beautiful video awesome robot
Looks awesome I got to say
Such a cool video
Make your mouth the narrator in the bottom right of all your videos!
How loud are the servos in real life?
I seem to remember using a regular expression to make the UK version of the ARPAbet for my speech projects in the past. I think that's right. This looks like a fun project. I used a C# library under Unity the last time I faffed around with this. Good job overall, it's a tricky subject :)
Also, the iPhone X does a decent job these days. apps.apple.com/us/app/face-cap/id1373155478 might be worth an eyeball, for example.
I did cmusphinx.github.io/2013/03/speech-recognition-on-kindle-touch-with-cmusphinx/ donkey's years ago, showing that one can indeed get a decent extraction of words from those tools you mentioned. You need to faff a bit though. Good luck, this one looks like fun.
Wow thanks!
Fantastic
Are you planning to put a skin on the mouth? It seems like there are bits around the mouth to deform the lips etc., but it looks a bit odd without a skin.
It did strike me as odd that he was setting servo positions for certain expressions when he doesn't even know what those expressions will look like on a completed face. Surely it would make more sense to finish the head before setting something like that? But what do I know?
I'm sure that if I wanted to, I could probably write a Blender extension that avoids all this phoneme stuff and instead sends direct pose data from animated frames. That way, the mouth is animated the way someone might animate a character.
For local ASR try DeepSpeech, on a RPI4 DeepSpeech Lite works in real time.
Local Speech recognition is still tricky, works better on full sentences than single words.
You might be able to use speech-to-text software first and then convert the text it gives you into phonemes. Both Microsoft and Bing offer "free" speech recognition for Python; I use both. I'm planning on doing something similar with a human figure.
Respect
Great work Will! I was thinking, what about using an ESP32 instead of an Arduino? The ESP32 can run MicroPython, which would eliminate the need for two microcontrollers and a serial connection.
NLTK will not run on an ESP32. Raspi + arduino combo would be ideal for a contained system.
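For the Raspberry Pi + Arduino combo suggested above, the Pi would do the heavy text processing and stream animation frames to the Arduino over serial. A sketch of a fixed-size binary frame format for that link (the header byte, servo count, and layout are illustrative assumptions, not the format Will uses):

```python
# Pack one animation frame (viseme id + servo angles) into a fixed-size
# binary packet for a Pi -> Arduino serial link. Layout is hypothetical.
import struct

HEADER = 0xA5          # hypothetical sync byte to detect lost framing
FRAME_FMT = "<BB6B"    # header, viseme id, six 0-180 servo angles

def pack_frame(viseme_id, angles):
    """Serialize one animation frame for transmission over serial."""
    if len(angles) != 6:
        raise ValueError("expected six servo angles")
    return struct.pack(FRAME_FMT, HEADER, viseme_id, *angles)

def unpack_frame(packet):
    """Receiver-side equivalent: recover the viseme id and angles."""
    fields = struct.unpack(FRAME_FMT, packet)
    if fields[0] != HEADER:
        raise ValueError("lost sync")
    return fields[1], list(fields[2:])

pkt = pack_frame(3, [90, 45, 120, 90, 10, 170])
```

On the Pi side the packet would be written to the port with something like pyserial's `serial.Serial("/dev/ttyUSB0", 115200).write(pkt)`; the Arduino reads eight bytes, checks the sync byte, and writes the angles to its servos.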
Could you approximate the mouth positions for Leeds by just leaving the mouth open all of the time?
where did you get that amazing servo tester????? I NEED one of those! --Thomas
Use a loop and a table instead of a bunch of 'if's; those will stack up real quickly lol
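The table-driven approach suggested above looks like this: store each viseme's servo targets in one dict and look them up, instead of writing one `if` branch per viseme. The viseme names, servo names, and angles below are illustrative, not the project's actual values:

```python
# Table-driven servo lookup: one dict replaces a chain of if/elif branches.
SERVOS = ("jaw", "lip_left", "lip_right", "tongue")

VISEME_TABLE = {
    "rest":      (10, 90, 90, 0),    # angles in degrees, illustrative
    "open":      (60, 80, 80, 0),
    "round":     (35, 120, 120, 0),
    "tongue_up": (25, 90, 90, 45),
}

def servo_targets(viseme):
    """Return {servo_name: angle} for a viseme, falling back to rest."""
    angles = VISEME_TABLE.get(viseme, VISEME_TABLE["rest"])
    return dict(zip(SERVOS, angles))
```

Adding a new viseme then means adding one table row rather than another branch, and the same pattern translates directly to a lookup array on the Arduino side.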
Can I buy the code
2:44
Does this mean switch statements do not exist in arduino?
Yes, but bear in mind that they are not very useful when you do logic operations within the 'if'. You would need to map the possible results to values so that you could use switch statements.
Ok im making the mad hatter and march hare
gona make me act up
So if you modded this just a hair it could be a phonetic filter.
Will, take a look at audioservocontroller dot com. I'm not sure how many servos it is capable of controlling at once as I haven't grabbed one yet; I can inquire though, as someone from the FB group has one. There are a few different audio servo controllers out there, but they don't seem to be very common. Scary Terry is another one. I saw someone post the hardware layout at some point, so it might be easier to throw one together depending on the components. Might be the only way to get that many servos running in sync.
Anymore thoughts on the potential forum? XD
*edit* Fernando from the group has a video up of him testing it on his DARA robot. Search "DARA robot lip sync" if you're curious. I can ask him whether he has played with it any more since. I think he had it just hooked up to the jaw and not his tongue model, but he would know better, mind you. :9
Scary Terry and similar just work off the volume of the sound source, not visemes. Adequate for a jaw on a prop or toy (I use them, and I'm developing a similar thing using a Raspberry Pi), but nowhere near what Will is doing with visemes, jaw, face, and tongue movements.
People who made this
*it took me 2 years to make*
#10
Unfortunately this isn't "open source". The source is available, as are the STLs, but there is no license on them, so they cannot legally be used, redistributed, or altered.
The commonly held definition of "open source" is (from en.wikipedia.org/wiki/Open-source_license):
"... a type of license for computer software and other products that allows the source code, blueprint or design to be used, modified and/or shared ... Licenses which only permit non-commercial redistribution or modification of the source code for personal use only are generally not considered as open-source licenses."
From the "terms of service" page at www.nilheim.co.uk/terms-of-service.html:
" .. not to (or permit anyone else to) do or attempt any of the following:
* distribute, rent, loan, lease, sell, sublicense, or otherwise transfer or offer the Service for any commercial purpose;
"
Which puts it in direct contradiction with the definition of "open source" most widely used.
Please consider removing the term "open source" for something more appropriate like "source available", or putting the source code and STL files under a free/libre license.
Of course I agree that prohibiting commercial use means it's not "open source" under the common definition, but I can certainly see reasons for doing so. I do think the developer should consider some standard license rather than the current "terms of service", which has some clear wording errors; more importantly, a non-standard license restricts uses the creator intended to allow, since no one is familiar with it or with exactly how its terms may be interpreted. Besides just changing to an open-source license, wouldn't other options include: 1) while not intended for software, using the Creative Commons license limiting commercial use; 2) licensing under the very unrestrictive GPL, with options for commercial users to pay for closed licenses (this doesn't actually prohibit commercial use provided the user abides by the GPL's terms, opening up their own changes under the same terms, but it may make it more attractive for commercial users to pay for a restrictive license); or 3) while I haven't seen it used, the Commons Clause (commonsclause.com/)?
On L, the tongue should touch the top of the mouth.
Are you a furry? You could make a lot of money by selling this to them :3
It might be hard to fit all this stuff into a mask, but I remember there being a few people building animatronic fursuit heads like this.