Unreal Engine 5 - Ultimate Voice AI Tutorial - Masterclass from scratch
- Published 11 Jun 2024
- Because there has been so much demand for a tutorial, here it is now.
Explaining how to use a microphone to communicate with an NPC/AI and get a voice response using ChatGPT and ElevenLabs TTS, from scratch!
Project Files downloadable for Patrons:
9sj9.short.gy/UE5AI
The following plugins need to be installed for UE 5.1:
____________________________________
www.unrealengine.com/marketpl...
www.unrealengine.com/marketpl...
www.unrealengine.com/marketpl...
Websites:
openai.com/
API: platform.openai.com/docs/api-...
beta.elevenlabs.io/
API: docs.elevenlabs.io/api-refere...
Voice IDs:
Name Voice ID
Rachel 21m00Tcm4TlvDq8ikWAM
Domi AZnzlk1XvdvUeBnXmlld
Bella EXAVITQu4vr4xnSDxMaL
Antoni ErXwobaYiN019PkySvjV
Elli MF3mGyEYCl7XYWbV9V6O
Josh TxGEqnHWrfWFTfGW9XjX
Arnold VR6AewLTigWG4xSOukaG
Adam pNInz6obpgDQGcFmaJgB
Sam yoZ06aMxZJJ28mfd3POQ
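For reference, here is a sketch of how one of the voice IDs above plugs into an ElevenLabs text-to-speech request. This is plain Python; the endpoint path and `xi-api-key` header follow the ElevenLabs API docs linked above (verify the details there), and the API key is a placeholder:

```python
# Sketch: build (but do not send) an ElevenLabs TTS request for one of the
# voice IDs listed above. Endpoint/header names per the ElevenLabs docs.
API_KEY = "YOUR_ELEVENLABS_API_KEY"  # placeholder, not a real key

def build_tts_request(text, voice_id="21m00Tcm4TlvDq8ikWAM"):  # "Rachel"
    """Return the URL, headers, and JSON body for a TTS call."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    headers = {"xi-api-key": API_KEY, "Content-Type": "application/json"}
    body = {"text": text}
    return url, headers, body

url, headers, body = build_tts_request("Hello there, traveler!")
# Sending it (e.g. requests.post(url, headers=headers, json=body)) returns
# audio bytes you can feed to the runtime audio importer in the tutorial.
```

In the tutorial the same request is assembled with VaRest nodes in Blueprints; the structure is identical.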
Don't forget to like, comment, and subscribe for more exciting showcases.
/ discord
_______________________________________
Want to support me?
/ marvelmaster
or
www.paypal.me/marvelmasteryt
All donation money goes into better software, equipment, assets, and cloud space.
_______________________________________
Time stamps:
0:00 Introduction
1:39 Requirements and limitations
4:51 Setting up the project
6:21 Speech recognition
16:07 ChatGPT communication
34:46 Fixing ChatGPT bugs
37:47 Make ChatGPT remember
44:36 Elevenlabs TTS
1:04:44 TTS Bug fixing
1:12:00 Outro
- Science & Technology
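The "Make ChatGPT remember" step (37:47) comes down to resending the accumulated chat history with every request. A minimal sketch of that bookkeeping in plain Python — the message format mirrors the OpenAI Chat Completions API linked above, while the model name and NPC backstory are placeholder assumptions:

```python
# Conversation memory: keep every exchange and resend it each turn.
# The system message doubles as the NPC's backstory (placeholder text).
history = [
    {"role": "system", "content": "You are a friendly NPC blacksmith."},
]

def add_user_turn(text):
    history.append({"role": "user", "content": text})

def add_assistant_turn(text):
    history.append({"role": "assistant", "content": text})

add_user_turn("What do you sell?")
add_assistant_turn("Swords and shields, friend.")
add_user_turn("How much for the sword?")

# The whole history becomes the "messages" field of the next request,
# so the model sees prior turns and can answer in context.
payload = {"model": "gpt-3.5-turbo", "messages": history}
```

In the tutorial this list lives in a Blueprint array that is appended to and resent via VaRest, but the shape of the payload is the same.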
Kind reminder for any security freak who is banging their head against the wall after an hour of debugging: please enable your mic access in Windows under Privacy Settings > Microphone
:D. I almost fell off my chair when I figured it out. Many thanks for the tutorial btw, it's truly amazing, and thank you for publishing it open source. People like you are the reason the Unreal community flourishes (*^ ‿ *)♡
For people wondering about the {Blank Audio} error: I found mine was capturing the audio from this YouTube tutorial. Go to the Start Capture function that is copied and pasted at 8:28 in the video and set the Device Id from 0 to maybe 1 or 2, depending on your setup. For me, 0 was my PC audio and 1 was my microphone. I hope that clears up some confusion.
Not here. I'm getting [Blank Audio] every time, no matter how I set up the Device ID. I know my mic signal is getting into Unreal because I made an input level meter and used a channel from the Audio Capture component. I thought maybe that was interfering somehow, so I set up a blank project just to test it. Still nothing. I'm using UE 5.3.2, in case anyone happens to have any insights into what's going on. The Marketplace says the plug-in works up to 5.4.
Having the same issue. Did you manage to solve it?
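The device-index trick from this thread can be illustrated outside of Unreal, too. This sketch uses plain Python with a made-up device table standing in for whatever your OS reports; it shows why index 0 may be a loopback/PC-audio device rather than your mic:

```python
# Hypothetical device table -- on a real machine you would get this from
# the OS or an audio library; names and order here are invented.
devices = [
    (0, "Stereo Mix (loopback of PC audio)"),
    (1, "USB Microphone"),
    (2, "Webcam Microphone"),
]

def pick_microphone(devices):
    """Prefer the first device whose name mentions a microphone."""
    for index, name in devices:
        if "microphone" in name.lower():
            return index
    return 0  # fall back to the default device

mic_index = pick_microphone(devices)  # index 1, not the loopback at 0
```

In the Blueprint, the equivalent is simply trying Device Id values until the capture stops returning [Blank Audio].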
Thanks for creating this amazing tutorial! Subscribed, and I'm looking forward to seeing more of your creations :)
Excellent. Thank you. I've been looking for something like this for a while. Well done.
Thank you for this tutorial, it's some amazing knowledge you are sharing.
Very nice tutorial, thank you for creating this. The future of dynamic NPC interactivity is here today :)
indeed
Superb tutorial!
Genius! Pure genius!!
Thanks for the tutorial!
Great tutorial thanks heaps
This is just magic, thank you, time to finally install UE5 since there's no plugin for UE4 :)
Great tutorial, I want to understand how you set up the digital character to change from a casual animation to an intro animation
Exactly what I was looking for ^^
👌
Amazing work, how can I connect animations to it?
Nice nice :)
Now we just need an AI like Dolly running locally, quite a bit smaller, fed with only game-specific content. Hmmm, my to-do list is growing.
Lovely tutorial, thank you for creating this. Just one question: because you make use of the VaRest plugin, do you need an internet connection in order to use ChatGPT? And if so, how is the AI going to work in a build when the API is dependent on a network connection?
yes, GPT requests are made online... as well as text-to-speech... I think the computation would be too intense locally at the moment.
Hi, this is amazing. You are a lifesaver. I'm facing a little problem: I've been trying to use 'Get Response Content', but that node does not show up; instead I'm seeing all the others — 'Get Response Value', 'Get Response Content as String', etc. What can I do, and what could be wrong? I am using UE 4.27.
Hi. Can I use my RVC voice model in this pipeline? so that my character can speak with special voice
awesome
Thank you for the content!! It helped me a lot! Is it available on unreal engine 5.2??
i think the plugins are not available yet for 5.2
very good tutorial.
i just have one problem: my engine crashes when i hit # multiple times
it crashes with an access violation at 0xfff...
any ideas how to fix this?
I really love what you do, in my app I need to download the audio file and storage it at a specific file path. Do you know how could I do that ? Any small advice would be really helpful!! It is not clear to me what the runtime importer does with that file.. I need to turn it into an asset, because I have another API that requires the audio, but also the asset of the audio. Thank you!
did you check the documentation of the audio importer... there is even a setup for non streaming
Thanks so much for the effort and clearly structured tutorial! Eventually I made it work for me :) I guess you are right and this is still basic with lots of things that can be tweaked.
My biggest wish: How can I find an ID for generated voices to implement in UE?
Btw, the German language model works quite well and I do not see any latency difference to English.
different voice IDs are in the description, or you can create your own if you have an ElevenLabs subscription
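If you created a custom voice and need its ID, the ElevenLabs API exposes a voice-listing endpoint. A sketch of extracting the ID from its response in plain Python — the exact response shape shown here is an assumption based on the ElevenLabs API reference linked in the description, so verify it there:

```python
import json

# A real call would be e.g.:
#   requests.get("https://api.elevenlabs.io/v1/voices",
#                headers={"xi-api-key": "YOUR_ELEVENLABS_API_KEY"}).json()
# Example response shape (assumed; check the API reference):
sample_response = json.loads("""
{"voices": [
  {"voice_id": "21m00Tcm4TlvDq8ikWAM", "name": "Rachel"},
  {"voice_id": "abc123CustomVoiceId0", "name": "MyClonedVoice"}
]}
""")

def find_voice_id(response, name):
    """Return the voice_id whose name matches, or None."""
    for voice in response["voices"]:
        if voice["name"] == name:
            return voice["voice_id"]
    return None

my_id = find_voice_id(sample_response, "MyClonedVoice")
```

The ElevenLabs web dashboard also shows each voice's ID, which is how the list in the description above was assembled.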
Thank you! It finally worked. Now I'm trying to add a MetaHuman with Quixel Bridge; I can add it, but I don't know how to start O_o, also using lip sync like the "text to speech" function...
yeah, maybe there is some realtime plugin for that
Hello, I followed the tutorial, and it only allows me to speak once. No error or string printed out after the second time voice recognition. Do you know why it's happening?
Thanks for this! I am not so clear about what the streaming option is doing though. Isn't this supposed to output a stream that we can play through before the request is completed? Like we start reading the stream and continue updating the playback until the stream is over? The way it is now I still have to wait till the audio is completely through before playing so I don't see the streaming part of this?
yes, the way it is now it just waits till the end... but in theory with streaming you can split inputs and outputs to make it faster... if one knows how to do that
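One way to approximate the streaming idea discussed above without true audio streaming is to split the LLM reply into sentences and send each one to TTS as soon as it is complete, so the first sentence can play while later ones are still being synthesized. A rough sketch of just the splitting step in plain Python (the TTS call itself is left as a placeholder):

```python
import re

def sentence_chunks(text):
    """Split a reply into sentence-sized chunks for piecewise TTS."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

reply = "Hello there. How can I help you today? Ask me anything!"
for chunk in sentence_chunks(reply):
    # Placeholder: send `chunk` to the TTS endpoint and queue the audio
    # as it arrives, instead of synthesizing the full reply at once.
    pass
```

This trades a little naturalness at sentence boundaries for much lower time-to-first-audio.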
Thanks for the tutorial! For your patreon what tier would we join to gain access to the project files?
as a minimum-tier supporter 😁
@@MarvelMaster Thank you!
any way to do this with a local LLM instead so that it doesn't have all the filters that chatgpt does?
Is there anyway to setup a memory for the character like give him a backstory?
Hi, great tutorial! In my case the imported soundwave didn't work, so in order to find out why, I added a "Switch on ETranscodingStatus" attached to the "OnResult" event that continues from "Status".
Well... this switch returns "Failed to read". Any idea why this would happen? Did I put the switch in the right place? (Like, should I trust it? If yes, why would it fail to read the audio file? I did exactly what you did!) Thank you!
maybe there is no audio file... did you print-debug the response?
Hello does anyone know how to fix the "the audio data could not be processed to the recognizer since the thread is stopped" problem with this plugin?
Hi, thanks for the tutorial! I have a question. I created my voice in ElevenLabs but can not find voice ID. Where can I get it? Thanks for help.
the elevenlabs website surely can tell that
Hi, can this be used to generate lip sync for non-MetaHuman characters?
Hey, your waterline video was for UE4 and needed world displacement and tessellation. Is there any way you could show me how to do the waterline without those options in UE5? Thanks
I read that you can use a virtual heightfield mesh instead of tessellation in UE5
@@MarvelMaster I figured it out!! Though now I just need to figure out how to get swimming and buoyancy to work in a custom water mesh!
Hey!
So I've been looking into this Speech Recognition, is there any way to make it more efficient or faster?
I'm noticing it takes minimum of 3 seconds for the Model to recognize what I say and return an answer, is there a way to make it "instant"? I already use the tiny model size and put english only, so can't reduce the sample size. :D
hm, maybe better hardware? And maybe there is a way to get a streaming answer from ChatGPT and ElevenLabs... meaning not waiting till whole sentences are processed, but getting the responses while they are partly generated?
Hey, I just wanted to mention that since then, the plugin has undergone significant improvements, and its speed is now much faster compared to half a year ago
So cool man, can you plz tell me how to make the animation while talking
just start animation when sound starts
@@MarvelMaster ohh, I was thinking you're using an NPC meta plugin or SDK meta plugin, thank you
Hi! Everything works fine in the project. However, when I package the project, only "VoiceRecognitionStarted" and "VoiceRecognitionStopped" are printed. So it does not work when packaged. Do you have any suggestions?
iirc plugins need to be packaged manually in project settings
i'm having a lot of trouble with creating save game slots of the chat array history. would you be able to help with that please? this video is great thanks again.
isn't it just a string to save?
@@MarvelMaster i have it connected to twitch and twitch messages trigger this whole process to happen instead of a microphone. i'd like it so that when I load it only loads based on who just sent the message. so it would pull the array save game slot associated with that username. what's driving me mad is that I figured this out earlier this week and forgot to back it up. running in circles trying to remember what I did. lol
@@MarvelMaster oh right i didn't explain the problem. it's saving all messages onto one array. even though I set the username variable as the save game slot name.
@@MarvelMaster chatgpt and all them have been completely useless in getting this to work XD
@@ai_and_chill you can try use something different than array
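For the per-user history problem in this thread, the usual fix is to key the stored arrays by username rather than saving every message into one shared array. A minimal sketch of that bookkeeping in plain Python, standing in for the Blueprint save-game logic (which the tutorial itself does not cover):

```python
# One message list per username, instead of a single shared array.
histories = {}

def record_message(username, message):
    """Append to the history slot belonging to this username only."""
    histories.setdefault(username, []).append(message)

def load_history(username):
    """Return only the history for the user who just sent a message."""
    return histories.get(username, [])

record_message("viewer_a", "hello bot")
record_message("viewer_b", "what's up")
record_message("viewer_a", "tell me a joke")
```

In Blueprint terms, this corresponds to using the username as the save-game slot name and making sure each slot stores only that user's array, not the shared one.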
I have one question: when I change the language in the settings, it won't work; if I change it back, the issue is fixed. Why? Anyway, thanks for your video, it helps me a lot. If you can reply about how to change the language, that would be great, and it will help a lot of people who, like you and me, are not native English speakers.
depends on what does not work... you can set the voice recognition language in the plugin and then also use the multilingual model of ElevenLabs
Great content, How to do animations when voice is playing
you can detect when audio is playing and then start an animation
How can I mix the idle animation and the talking animation, is there more tutorials coming up, because this has been a great help 😀
@@titan3rd474 Animation is a topic itself...you can use an animation BP or just Trigger animations in the character BP
Is it possible to make microphone work without clicking any button but when you start game? Thanks!
you mean voice activated? Not sure if the plugin has an option for that or if another plugin does
Did you find your answer? I'd like to do that too, a bit like how we say "hey Google" to activate Google Assistant, but I've got a god game, so you'd have to pray "dear god" lol.
When I press # in the blueprint search, keyboard events aren't appearing as an option.
Today is my first day in Unreal Engine, so I'm asking here so that everyone who has the same issue in the future doesn't have to leave the video to debug. Thanks to whoever provides the answer to this issue :)
Found it, though it may be specific to the latest version of UE (5.4.1).
You have to type "keyboard event" and then the key you want, i.e. "keyboard event T", in order to get the precise key in the search.
If you just type the word "keyboard" and then #, you'll have to scroll up and locate it manually; alternatively, if you just type the word "key" and then #, you'll have to scroll down and find it manually.
where should I connect Make SpeechRecognitionParameters, in order make it work? 14:40
there is probably a set parameters node or something
@@MarvelMaster can you let me know how to do it?
Does it work for only 5.1? or include above?
haven't tried others; it depends on whether the plugins are available for newer versions
how to change the mic input?
don't know, maybe in project settings or Windows settings... usually it takes the standard mic input from Windows afaik
great solid tutorial. Seems like they have updated their GitHub page and the how-to-use-the-plugin page is not there anymore. We have to create all the nodes for the blueprint by ourselves
no, the copyable code is in the runtime speech recognizer docs, not the audio importer docs
@@MarvelMaster Sorry my bad! you are right. Followed the eleven labs integration. All is going well but from the eleven labs whatever is returned, its not playing from the buffer. I am not using chatgpt response but actually just passing the text generated based on voice recognition to the custom event of playing voice reply. Any help here would be really appreciated.
@@AICineVerseStudios you can debug-print the response from ElevenLabs... if it's cryptic, then it's a sound; if it's text that says some error, then something is wrong
@@MarvelMaster It's working! Eureka! 😀😀😀😀
@@MarvelMaster Ok actually I'm building this to run on a cellphone as a person avatar. Is it recommended to follow your approach to building the same for Iphone or Android ? Can there be processing issues ? Or would it even be able to use the plugins for voice recognition ?
Great tutorial! For some reason, when it prints out what it processes from speech recognition (step 1 of the video), it always returns a series of exclamation points (!!!!!!!!). Any idea why?
step 1 is voice recognition... make sure your mic works and is not set too loud or quiet
@@MarvelMaster right now I'm just using the mic on my Mac and am having issues - would this only work (well) with an external mic?
@@MarvelMaster bump on this -- is this a Mac issue? Would I need an external mic?
@@user-pf2se2df8v try it on another pc then
Hey, that mentioned problem occurred some time ago due to a resampling issue on the engine side caused by audio chunks being too small for proper resampling. But it has been fixed since then, and now it shouldn't cause these problems anymore
how did you manage to animate her when she talks?
just start a random animation when audio plays
hello, can you help with solving the "Audio_Blank" issue?
never heard of that...
I made a comment, it might help you
the audio won't play, this is the log: "LogRuntimeAudioImporter: Warning: Imported sound wave ('CapturableSoundWave_1') data will be cleared because it is being unloaded
LogRuntimeAudioImporter: Warning: Imported sound wave ('CapturableSoundWave_0') data will be cleared because it is being unloaded"
did you try googling it or looking into the documentation... and which engine version?
@@MarvelMaster i use UE 5.2
I'm at 20%; every time I speak it prints "you", any ideas?
you have to hold the button
Good except the lips do not move when she speaks.
yeah, unfortunately I did not find an easy solution for that
@@MarvelMaster I'd like to see you do it though. MetaHuman comes with the visemes, so text to visemes, then set the visemes, would be the logic, but that's way over my head.
When I restart the project, the VaRest JSON nodes get deleted and the speech recognizer nodes break. Know what that's about? I could be dumb
oh, I think it was because I created a cpp project instead of a blueprint project
ok 😬
Awesome tutorial!