I went along with this tutorial and everything went smoothly until i opened the program. Above I only have the "clear settings" button when there should also be "reload" and "select vc". My screen looks like 3:07 without those buttons and without the voices to choose from. The "edit" section for the voices is completely empty for me and I'm now stuck and don't know what to do since I'm not too advanced when it comes to computers. Does anyone know how I could fix this? EDIT: Nevermind I fixed it! If anyone else ran into this it's easy to fix. Under "NOISE" you have the "F0 DET." thing. It's on "dio" by default and when you switch it to one of the other modes the different voice models will appear.
Immediately one of the first things I thought about lol, it's gonna get wild. But also, the more you're in the know, the less likely you're to fall for any types of these things as well.
So more males are gonna be applying to mid-range agencies (the ones that can't do much of a background check) using these filters lol. Catfishing and also contract breaking (one female can work at 2 or 3 agencies without her voice being recognized lmao).
@@MaxKrovenOfficial It's also bad because agencies and companies want to see you literally in person in order to set up any sort of contract or deal, but it also opens things up to scams and such because you can impersonate other people very easily... For instance, it might hide the Indian scammer's bad accent and fool a lot more people who are usually aware of these people and their bad voices.
I thought it would be REALLY good, but I didn't realize it takes more resources than Chrome and Photoshop. As soon as anything else needed some GPU, it started stuttering and became unusable. I hope it gets good enough to not need more than 12GB of VRAM.
If you don't have the voice actor icons, you downloaded a past version or the server (I am incompetent, don't ask me). You need to download a client version; it is on the same page near the start and currently directs you to download it from Hugging Face. It is a 2.+ version. It helps if you choose English on the git page... Took me a day to figure it out, never again.
can you make a tutorial or is there already one of how to make your own model for real time talking? i know there's ones for singing but if I want a better talking model how much data should I use? podcasts maybe?
I do have to link it here: ua-cam.com/play/PLknlHTKYxuNshtQQQ0uyfulwfWYRA6TGn.html The same models trained in RVC can be either singing or talking models; it just depends on what audio data you curate and train them with. I recommend starting with 10 minutes of super high-quality, clear data and then increasing it if the model isn't good enough.
It's a bit of a crapshoot. I've somehow had amazing results with something like 2 minutes of ludicrously high quality audio data and not quite as good results with several hours of also very high quality data. It seems that there are a few types of voice which just happen to work better. Whatever you do, make sure to only use data which is as good as you can get.
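For checking how much audio a dataset actually holds against the 10-minute starting point mentioned above, here is a minimal sketch. It assumes the clips are uncompressed .wav files sitting in a single folder; the function name `total_minutes` and the folder layout are made up for illustration and are not part of RVC.

```python
import wave
from pathlib import Path


def total_minutes(folder):
    """Sum the duration of every .wav file in `folder`, in minutes.

    Assumption: clips are plain PCM .wav files readable by the
    standard-library `wave` module (no mp3/flac support here).
    """
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # frames / frames-per-second = clip length in seconds
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 60.0
```

Calling `total_minutes("dataset")` on a hypothetical `dataset` folder gives a quick number to compare against the 10 minutes suggested above. It says nothing about quality, which the comments above suggest matters at least as much as length.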
I give it 5-10 years and we can just prompt a website to generate media to consume. At least that means I can finally get a second season for all those shows that didn't get one...
It's an odd thing, but CPU draw seems to go up when using it selected on my 2070 super I've noticed. Dunno why, but it doesn't show GPU usage. Might be something to raise to the author eventually though.
@@CertifiedAsher I'm using the GPU version, and my GPU should be plenty to process it at lower bitrates at least, but it always ends up sounding staticy.
For some reason my client doesn't have the "Select VC" button to select RVC. Does anyone know how to fix this? I can see the default models downloaded in the files, but they don't appear in the client as RVC isn't selected. I've also realised that it doesn't seem to be detecting my GPU, as the CPU selection is the only one in the list.
having a problem, after starting start_http for the first time it said it failed because it could not find win.api or something like that. Now when i try to run start_http it opens for 1 second and immediately closes
Nothing of this works, I followed all directions perfectly and my mic isn't working. When i change the input it goes to some "No error message" and "Intialize"
You're not alone. I don't know if it was the update I did because it was working fine this morning. Now gpu and server giving me a "ERR_CONNECTION_REFUSED" error.
Appreciate it guys, but I can't distribute the models unfortunately! However, I can share the knowledge required to train the models and I have those videos on my channel. I'm working to get it all a bit more organized, but you'll have to gather audio data on your own (though there are plenty of tutorials on how to get audio data out there).
Hi Jarods, I have tried to download this voice changer multiple times, but every time I click the drive (normal) link it doesn't work. It says too many people are downloading this file, so the download fails. Please help me with how to download it if this way isn't working. Thank you,
One interesting thing is this could theoretically be combined with a translator first, though that would take a whole new, probably larger model. As hardware and software improve, this is just the beginning!
This is truly an amazing find and piece of software. I would be very interested in messing around with it, but unfortunately I have an AMD build and I can't find a way to use my 5700XT GPU to process the sounds, and it doesn't seem to be faring well with my Ryzen 7 2700X CPU :( Any potential help would be greatly appreciated!
Might have to adjust the settings in the client to try and help it out, but it doesn't run too well on CPU unfortunately. Did you download the directml version? That one should support AMD.
Whenever I speak into it. I can hear the voice quite well the only issue is that after I stop speaking a second later it will play a very quiet voice of it back to me. It only does it with one voice tho so I'm assuming it's just something to do with that voice.
Hey guys, I wonder, can you just use an audio input instead of real time voice so that it still mimics your intonations? Or maybe there is some other software that can help adjust intonations?
Yep, Japanese voice models work better with Japanese. I mean... they were trained on Japanese speakers )) This is most noticeable with consonants, since those are usually treated differently in these programs (e.g. unvoiced consonants don't have pitch). For example, I couldn't make an English model pronounce 's' from my language, or any variations of it actually (regarding tongue position being more forward/backward). But with vowels it followed my speech pretty closely even if those vowels were not typical for English phonetics... with some exceptions on high vowels (pretty decent though). Though the last problem might be due to the relative lack of palatalized consonants in English.
I tried speaking Turkish with it and it works just fine, probably due to the fact that Turkish is pretty similar to Japanese when it comes to pronunciation and stuff.
@@wargreysama To be honest Turkish is close to Japanese even at grammar (up to some degree) )) It works pretty decent with many languages. I was just curious about possible limitations and since I'm a bit into languages I tried to -make poor anime girl suffer- to play with different sounds non-typical for Japanese. As for now - my favorite thing is a word 'tractor'. Those voices make it more like 'toractor' which is adorable )
yes, in the input, you can select file. for me it errored the 1st time but after reloading, it let me select a file from my computer as the input, and record the converted audio by clicking the record / save button on the bottom to get the converted output
Figured out the link, but do you have any advice for getting clearer audio? When I speak it chirps and distorts pretty constantly (sort of like when you lowered the chunk down really low, but I get that effect in all chunks)
Hello, i followed all the instructions but mine still doesnt work. Now i want to delete everything, do i just delete the file in file manager? How to uninstall the one downloaded in the cmd program? Sorry for my bad english, please let me know @anyone.
This is the main reason why I rejected my banks offer to secure my bank account using my voice over the phone! AI is freaking scary in the wrong hands!
@@Jarods_Journey You can use it to change your voice live and make a phonecall. That is how a Chinese investor got tricked and lost a ton of money. He thought he was talking to his business partner and sent him money for a business deal. The crook even facetimed the victim using a deepfake of the business partner face!
That certainly gives V-Tubers a break. I don't believe there's any way they could do the same thing everyday and not become even a little tired of it. I don't usually watch most of them exclusively, sometimes clips; but for example, I watched from the last 2.5 hours of Mumei's livestream karaoke and she was already tired and bored after the 1st hour when I joined their stream; not that I'd know what she's like but it definitely seemed like it was possible that it was someone else filling in for the night using such a voice changing program.
Ah, it's not that good to be at that level yet, you can definitely tell when someone is trying to use an AI voice still but to this point, I think V-tubers are youtubers so even if you're an IRL streamer, it's not like you have someone else sub in for you when you're tired xD
It works very well but the latency is very high (more than 2 seconds!) even with Colab Pro when using the notebook page. Any idea to help solve this? Thx!
It just says in the cmd prompt thing "warming up... generating sola buffer." and that's after I have it turned on, but it doesn't do anything from there and my mic stays sounding default.
Tbh, I wasn't sure if the Colab worked, but if it does, the delay would come from having to use Google's servers to host the client instead of your local device.
@Jarods_Journey, that's what I expected (I've subscribed to Google Colab Pro) but I still have 2 sec latency. That's frustrating! I would LOVE to use your great tool for my next live performance in Taiwan! Can you help? That would be fantastic
Hey Jarods! A fellow Mechatronics Engineering Grad here, you make a lot of quality content and I wish to message you regarding some technicalities of AI voice cloning. And maybe some career advice for degree holders in Mech? haha Where can I reach you?
Hey Patrick, always great to see a fellow mecha :D! I would say linkedin is going to be the best bet for professional stuff, if not, there, then the next best bet is discord as I'll generally respond on there. I do get a lot of PMs but it shouldn't be a problem if you pmed me from my group.
Every time I try to launch the voice changer it shows this error and doesn't work: "Failed to load URL: localhost:18888/ with error: ERR_CONNECTION_REFUSED". Or, when it does load, I click on a voice and it says "Cannot read properties of null (reading 'enableServerAudio')".
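One way to narrow down an ERR_CONNECTION_REFUSED like the one above is to check whether anything is listening on that port at all. This is only a rough sketch using Python's standard `socket` module; the helper name `is_port_open` is hypothetical, and port 18888 is simply taken from the error message quoted above.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port is accepted."""
    try:
        # create_connection raises OSError (e.g. connection refused,
        # timeout) when nothing is accepting connections on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `is_port_open("localhost", 18888)` while the voice changer window is open would hint at whether the local server part started at all. If it returns False, the problem is probably on the server side (it crashed or never finished booting) rather than in the browser window.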
Welp, this is going to put old voice actors out of business, but on the other hand it's also going to allow VAs to be easily replaced in case of illness, death or jail sentence (yes, that last one has happened)
they still need the datasets to train the ai with which will need VAs to make so they will still have jobs just making datasets rather than the exact lines
@@muzz4355 Which is why I specified *old* actors are out of business - they have plenty of recorded voice to train their doppelgangers on. Newer actors are safer by virtue of having less data to train on.
@@csolisr It's less the VAs that are in danger but rather the specific characters they voice. A VA is always changing tone, accents etc. between characters. Old actors will still be wanted to come in for new characters but less likely to return to their existing characters.
For those who have an AMD graphics card, when you set up the whole software and have a model selected, open your Task Manager and test the said model while switching from GPU1, GPU2 etc.. One of those is your graphics card, so it won't have to use your CPU. I have an RX6800XT and still couldn't find my GPU and it was lagging due to it using my CPU. Following the steps above will sort that out, at least it did for me. GPU1 for example had my CPU at 80%. GPU0 on the other hand had my CPU at 20%, which means that GPU0 is actually my RX6800XT.
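The comparison described above boils down to picking the device index where CPU load is lowest. A tiny sketch, assuming you note the CPU percentage from Task Manager by hand for each index; `likely_gpu_index` is a made-up helper, not something the voice changer provides.

```python
def likely_gpu_index(cpu_usage_by_index):
    """Return the device index with the lowest observed CPU load.

    `cpu_usage_by_index` maps a device index (0, 1, ...) to the CPU
    percentage seen in Task Manager while that index was selected.
    The lowest CPU load suggests the work moved off the CPU.
    """
    return min(cpu_usage_by_index, key=cpu_usage_by_index.get)


# Numbers from the comment above: GPU0 -> 20% CPU, GPU1 -> 80% CPU.
print(likely_gpu_index({0: 20, 1: 80}))
```

This is only a heuristic: low CPU load on one index hints that the real graphics card is doing the work, but it is not proof.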
I don't know why, but on the input sound test feature I have quite severe noise, even though my room is quiet. It was quite annoying and affected the voice changer results. help :')
I’m having a problem when I try to use the voice changer: when I talk I hear a static sound, and when I do get it to work it sometimes has a very long delay.
I have got the strangest problem. When I choose my microphone, RVC is hearing everything. If I play a YouTube video it starts changing that audio. In Discord it changes my friends' voices too. How the heck do I make RVC only hear my microphone?
lol, i'm about to call my friend just to say "Uh, Hello! Hello Hello? Eh, i've a menssage for you, Freddy Fazbear's Pizza sends u an Happy Birthday Johnn," with phoneguy model, lol, thats crazy
I have a really good computer, so I don't understand why the voice has a 3 second delay :( If anyone can help me with this, please give me some advice.
You mentioned training your own models, but I don't really understand the process; that file type, for example, isn't one I'm familiar with. Maybe I need to watch some of your other videos 🤷🏾‍♂️
README!! Not downloading? 👇 The VC is continually being updated, so the version shown in the video is no longer available. If you run into errors, you may have to try out the other versions to see if that resolves the issues.
Latest version as of this update: 1.5.3.8a
I expect that I'll have to make a follow-up video.
If the Google Drive link is down, use the Hugging Face website. Look at the names there and determine what to download based on:
cuda - Nvidia
directml - AMD
mac - MAC
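The mapping above can be written down as a small lookup. This is just an illustrative sketch: `pick_build` is a hypothetical helper, and the keywords are only the ones listed above, so the actual file names on the download page may differ.

```python
# Keyword to look for in the download name, per the list above.
BUILD_KEYWORDS = {
    "nvidia": "cuda",
    "amd": "directml",
    "apple": "mac",
}


def pick_build(gpu_vendor):
    """Return the build keyword for a GPU vendor, or None if unknown.

    Assumption: the vendor is given as a plain string such as
    "Nvidia" or "AMD"; anything not in the list returns None.
    """
    return BUILD_KEYWORDS.get(gpu_vendor.strip().lower())
```

For example, `pick_build("AMD")` points at the directml download. For a vendor that is not in the list it returns None, in which case the CPU-only download mentioned in the replies is probably the fallback.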
Ah yes. The Mac uses mac, with a side of mac.
i downloaded it but it wont show my gpu it only shows my cpu and yes i use amd
@@giggy_rook bro check if you dl-ed the cpu version. Because the first one says it'll work with your cpu, the mid one is cuda, and the last one's amd.
Edit: if you're downloading from the archive...
@@nokinirus i also have amd and it doesnt show the gpu only cpu. and yes i downloaded the directml one. I even tried downloading other versions and different types
@@nokinirus still doesnt work
I have a creeping suspicion that the vtuber market is going to get a whole lot weirder with this tech improving.
those voice changer jokes are gonna be legit now
TBF there's a popular one that's literally all AI atm, though her creator has arguably become even more popular lmao, it's kinda nuts what he's been able to make
Have always been
@@Reydriel Neuro
In reality there is no shortage of women willing to do the job. Although many of them are self-conscious and still use a voice changer just to make themselves sound cuter/younger/whatever they felt inadequate about. Don't forget that 50% of the world are women and plenty find vtubing appealing.
In real life the highest Youtubing earners are MEN, so if anything you lose money by voice changing into a female.
Under Audio options i found that choosing Server instead of Client, makes you sound a lot more realistic and takes away most of the robotic features
where do i find that
?
@@zak_facts2676 pretty sure it's right under the S.Thresh, there's an option next to AUDIO: for client or server
yes but it messed with my audio setup though
@sorasong6780 under the download there is a "huggingface" button if you click that it works :)
What should I put the audio output to? There's MME but idk what that does
I'm not ready for fluently English speaking Marine-senchou.
It sounds like Ame
"kimitachi!"
you must be ready
I very much like hearing the actual before and after effects and the detailed walkthrough. Thank you for posting!
Some Japanese VA agencies are cracking down on AI copies of their VAs' voices.
It's gonna be interesting to see where this will go. While I hope I'd be able to use some of my favorite JP VAs' voices for English content, I'm betting there will be far worse abusers passing the voices off as their own.
Rules and regulation are gonna be needed for sure, but there's no legal precedent for this yet so it's all up in the air for how it's gonna be dealt with in the courts. As with all things, there will be bad actors and despite me looking, I am yet to find any good resources or detection tools that can keep up with these advances.
and I mean...somewhat rightfully
If someone isn't comfortable with their voice being replicated, you shouldn't do so/leave a replication up
Always remember when dealing with strict Japanese laws:
They can't touch you if you're outside their country or if your country has no reciprocation laws to back them up.
If you get copystriked, just make another account ad infinitum, assuming you're anonymous.
When all hope is lost, the worst-case scenario is to upload on BiliBili. Let's just say China and Japan aren't really on speaking terms.
@@Jarods_Journey how do you get more ai models? i cant seem to find other voices to download
imagine buying license for gura voice
Holy shit, the amount of power of turning into a Hololive Girl is getting closer!
Also, that Marine voice when she’s speaking English fluently is just so damn uncanny to imagine that there’s a timeline where Marine learned English SUPER WELL.
I hope there’s a program that compiles all the complicated setup into an easier way of setting up since I’m tech savvy 😂
True. Now people are going to view Hololive creators a little differently... Especially since most of the Hololive girls go to great lengths to hide their true identity. Makes you wonder...
@user-xp9kq7xb6p You'll do it to me >:)
help,i have too much errors
Recommended to me randomly. You are super underrated.
Appreciate it 🙏
2:25 for anyone wondering why it's stuck at
Booting PHASE :__main__
Voice Changerを起動しています。 (Starting Voice Changer.)
please wait, it may take a few minutes.
Great help Thanks!
日本語上手 (Your Japanese is so good)
@@shinigamiwolfen hahah nani kore got jozued
thanks literally after reading this it finally did it haha
Thnx for the tip I was searching for!
I did not know a tutorial video can be this pleasant and nice. I know this is a weird compliment, but you're really good at making tutorial videos
This is going to be so great for online role playing games. Being able to make a custom voice that you can use that matches your avatar will really increase immersion.
This is insane 😭 crazy how people can replicate voices by using AI in real time
@@freedomofwordbruh 2 seconds is real-time for most purposes
This was incredibly helpful! I saw your video on TikTok and came here right away. Thank you so much for making this video; I couldn't have figured out that program without it!
Appreciate it. I'm surprised at how much traction it gained haha.
If anyone is experiencing very choppy sound, like your voice cutting off after every 'chunk', you can try changing the AUDIO to "server" instead of "client". Eliminated all choppiness for me.
Doesn't that mean you're using a server somewhere, and likely giving them all the audio data you're creating?
@@boombattlefields9123 ☠ bro... let's go
Just don't worry about it. If you've ever mysteriously had an advert for a product you just mentioned, outloud near an active device, pop up in your feed, you're already having everything you say parsed by some sort of analytical algorithm. This, while an additional outgoing stream of data from you, is at least one that you are aware of and have some control over.
The only thing I could really do to guarantee my phone isn't listening to me type, even this sentence right now, is to put it in the microwave to block any outgoing signals. At least all you have to do is shut off the program and they aren't able to parse your data anymore.
can you help i cant hear my voice in program
Tried this and there seems to be no change in the voice even after changing the tunes hmm...
Your chopping is because the threshold is set all the way up. It cuts off any input below a certain volume. Set it a lot lower (almost all the way at the other end, actually).
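What the comment above describes is roughly a noise gate. The sketch below is a generic one, not the voice changer's actual code; it assumes audio arrives as chunks of float samples between -1.0 and 1.0, and the names `chunk_rms` and `gate_chunk` are invented for illustration.

```python
import math


def chunk_rms(chunk):
    """Root-mean-square level of a chunk of float samples."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(s * s for s in chunk) / len(chunk))


def gate_chunk(chunk, threshold):
    """Pass the chunk through if it is loud enough, else return silence."""
    if chunk_rms(chunk) >= threshold:
        return list(chunk)
    # Below the threshold: the whole chunk is muted, which is heard
    # as the voice "chopping" when the threshold is set too high.
    return [0.0] * len(chunk)
```

With a high threshold, normal speech falls under the cutoff and gets muted chunk by chunk, which matches the choppiness described. Lowering the threshold lets quieter speech through, at the cost of letting more background noise through as well.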
It works but latency is pretty high. Lowering the chunk size improves that, but you lose a lot of the content of what you are saying. One thing to note: to make it sound most natural, always tune it to the number that is closest to the voice's natural sound. Like around 22 for that first model. Also, any idea how to turn off real-time playback? It's easier to use the record and then playback for any projects.
If you don't need the realtime functionality of it, you might be better off recording audio and then converting it in the RVC interface. You could always increase the chunk size, and there is a record function on the client.
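The latency trade-off above can be put into rough numbers. This sketch assumes the chunk size is counted in samples, which may not match the unit the client's chunk setting uses, and it ignores model inference time; `chunk_delay_ms` is a hypothetical helper.

```python
def chunk_delay_ms(chunk_samples, sample_rate_hz):
    """Minimum buffering delay, in milliseconds, for one chunk.

    A chunk has to be completely recorded before it can be
    converted, so its length is a lower bound on the latency.
    """
    return chunk_samples / sample_rate_hz * 1000.0


# Example: a 24000-sample chunk at 48 kHz is half a second of audio.
print(chunk_delay_ms(24000, 48000))
```

Halving the chunk halves this lower bound, but it also gives the model less audio to work with per conversion, which fits the observation above that very small chunks lose content.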
@@Jarods_Journey this is very helpful knowledge thanks for this.
And thanks for asking this question also.
Your stuff sounds SO much cleaner than mine, and I even try to use a very clear voice
Jarod thanks for these videos!
you've really helped me out a lot appreciate your content man it's fun keeping up with the new stuff you showcase!
Appreciate it man! It's all wild and crazy tech and it's an adventure everyday checking these things out!
The reason it says SmartScreen protected you is because the dev hasn't signed the app with Microsoft, but that's because doing that costs 300 a year.
It’s crazy to believe that there are actually people who design voices for these VTubers, to their preference on how they want it to sound. More power to them.
It will be used for this purpose, yes. However, it actually exists so that Horny men can privately moan at themselves in Waifu-speak.
@@tripleheadedmonkey420 bro why did you put this idea in their head
@@SirGlazer "Their head" he says while desperately trying to hold back the tears as his Tsundere anime waifu life begins anonymously.
@@tripleheadedmonkey420 😭
I swear GitHub is like the holy grail. I just learned about it recently, but now I realise that every kind of software can be obtained from there, and for free
Yuppppp, hometown of lots of open source and many, many awesome things on there.
Is it weird I've seen this exact comment, word for word, pop up on almost every video related to AI in the last few weeks? xD
Tsukuyomi-chan's project & her creator are so inspiring! She is a free voice project across a whole number of engines, mostly any free Japanese speech & singing synthesis programs. I definitely recommend that people check out some of her other resources & samples because she is really a treasure of a voice!
For those of you watching this and you cant see your GPU listed under the GPU tab this is what you do. Where the Audio section is where it says "Client or server" click on server, go back to the GPU tab to make sure your GPU shows in the drop down list, and then you can click back to client or leave it on server. It worked for me.
what gpu do u have? AMD or Nvidia
@@jamesduke151 AMD Ryzen 5 3600
@@Chrispyy__ does a drop down menu for the GPU appear like in the video for you? On mine there is a 0 1 2 3 instead
@@jamesduke151 mine just shows my GPU name. I don’t have any numbers
@@Chrispyy__ ok thanks. Are you using the latest version?
Sloppy Walrus you're a MENACE to society for setting this up for your video XD
I've started using RVC for some of my videos. I was able to change Clint's voice (Clint from LGR) to Duke Nukem's voice for a Duke Nukem review he did some years ago.
Haha that's awesome. RVC is quite good so I can see it being used in a lot of places.
@@Jarods_Journey Definitely. Compared to SVC, it's the best in regards to replicating consonants with the least amount of smearing.
Just wondering, are there any resources online where people can post their own trained voice weights? It'd be convenient as you won't have to keep training your own voice for the voice changer in case somebody else already happened to do so.
Someone let me know of one called AIhub discord group
@@Jarods_Journey is there a quick invite link anywhere? I can't seem to find the group anywhere
Anything on huggingface?
As a highly trained vocalist, I have been waiting for this to be a reality so that I can create cover songs that one could only dream to hear, like Freddie Mercury, Kurt Cobain, and Steve Perry singing on the same ballad!! If anybody seeing this has the capability to train voices and would like to collab on a project, let me know! I’ve yet to figure out the training but I have an entire professional quality studio set up and I’m ready to get to sangin!! Let’s GOOO!!!! 🚀🚀🚀🚀
Funny you comment this because a video I'm gonna be releasing is talking about the potential to turn my untrained voice into something that is bearable... simply by using a trained model xD. Another use case is say you throw a bunch of filters onto a voice and don't want to do post-processing ever again. Well, if you just get enough voice samples... you could essentially just "sing" and then BOOM, it's all edited. Still a little bit of issues ofc, but.......... it's super exciting lol.
People have already been doing similar with rap. It's definitely cool!
@@Jarods_Journey count me in… I am chasing real-time conversion to help me sing like AXL ROSE… how can I help to make the real-time effect a reality?
Personally for me i’d use it for music production, so so much easier to draft a song when you can hear fitting voice with it for an actual artist to sing later. I usually sing a bit myself but having a fairly low male voice, i can never do a female voice
Hey, i am a music producer, where can I contact you? This is my first private account
I went along with this tutorial and everything went smoothly until i opened the program. Above I only have the "clear settings" button when there should also be "reload" and "select vc". My screen looks like 3:07 without those buttons and without the voices to choose from. The "edit" section for the voices is completely empty for me and I'm now stuck and don't know what to do since I'm not too advanced when it comes to computers. Does anyone know how I could fix this?
EDIT: Nevermind I fixed it! If anyone else ran into this it's easy to fix. Under "NOISE" you have the "F0 DET." thing. It's on "dio" by default and when you switch it to one of the other modes the different voice models will appear.
God bless you my friend.
thank you so much
@@sparta_vov_rmd7579 Eyy no problem. Have fun!
You're my heroooooooo!!!
hearing Houshou Marine speaking fluent English is something I did not think my brain could comprehend, holy shyet
The catfishing is gonna be wild..
Anyways, thank you for uploading this video for others like me to see, It is gonna be cool to try out.
Immediately one of the first things I thought about lol, it's gonna get wild. But also, the more you're in the know, the less likely you are to fall for any of these types of things as well.
LOL ME
so more males are gonna be applying to be part of a middle range agency (that cannot do much of a background check) using these filters lol
Catfishing and also contract breaking (one female can work at 2 or 3 agencies without her being voice recognized lmao)
This is gold for the Vtubing community, actually.
@@MaxKrovenOfficial It's also bad because agencies and companies want to see you literally in person in order to set up any sort of contracts or deals, but it also opens up to scams and such because you can impersonate other people very easily...
For instance it might hide the indian scammer's bad accent and fool a lot more people who are usually aware of these people and their bad voices.
I thought it would be REALLY good, but I didn't realize it takes more resources than Chrome and Photoshop. As soon as anything else needed some GPU, it started stuttering and became unusable. I hope it gets good enough to not need more than 12GB of VRAM.
You might be able to offload it to run on CPU instead of GPU, but yeah, most of these AI projects are pretty compute hungry.
@@Jarods_Journey how
WOAHHH 6:25 that's totally Marine's voice speaking fluent English. Crazy.
whoa! good to see you finally get the views you deserve brutha!!
Haha thanks man 🤟
If you don't have the voice actor icons, you downloaded a past version or server ( I am incompetent, don't ask me). You need to download a client version and it is at the same page in the start and currently will direct you to download it from hugging face. It is 2.+ version. It helps if you choose english on the git page... Took me a day to figure it out, never again.
thank you
That was so weird hearing Senchou speaking native-like english
can you make a tutorial or is there already one of how to make your own model for real time talking? i know there's ones for singing but if I want a better talking model how much data should I use? podcasts maybe?
I do have to link it here: ua-cam.com/play/PLknlHTKYxuNshtQQQ0uyfulwfWYRA6TGn.html
The same models used to train in RVC can be either singing or talking models, it just depends on what audio data you curate and train it with. I recommend starting with 10 minutes of super high-quality data that is clear and then increasing it if the model isn't good enough.
It's a bit of a crapshoot. I've somehow had amazing results with something like 2 minutes of ludicrously high quality audio data and not quite as good results with several hours of also very high quality data.
It seems that there are a few types of voice which just happen to work better.
Whatever you do, make sure to only use data which is as good as you can get.
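To check how much audio you have actually gathered against figures like the "10 minutes" suggested above, a minimal sketch along these lines could add up the length of a folder of clips. It assumes the clips are plain uncompressed `.wav` files in one folder; the function names and the folder layout are made up for illustration and are not part of RVC.

```python
import wave
from pathlib import Path


def wav_seconds(path):
    """Return the length of one WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def dataset_minutes(folder):
    """Sum the length of every .wav file directly inside `folder`, in minutes."""
    clips = sorted(Path(folder).glob("*.wav"))
    return sum(wav_seconds(clip) for clip in clips) / 60.0
```

Calling `dataset_minutes("my_voice_clips")` on a hypothetical folder gives a rough total. It says nothing about quality, which the comments above suggest matters more than length.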
I give it 5-10 years and we can just prompt a website to generate media to consume. At least that means I can finally get a second season for all those shows that didn't get one...
2:24 when i open the bat file, for some reason the download you mentioned isnt starting, is there any way to fix?
This is only a problem with the new file, download the old version and everything will work
Crepe and Harvest both seem to be CPU dependent.
CPU runs at like 100% Ryzen 7 5700G
While gpu is at like 38% gtx 1080 ti
It's an odd thing, but CPU draw seems to go up when using it selected on my 2070 super I've noticed. Dunno why, but it doesn't show GPU usage. Might be something to raise to the author eventually though.
For the index option, it will improve your quality. It wanted you to choose the file starting with added_IVF5870 rather than the npy file.
Such a great video, Thank You very much bro!
why dont the characters pop up for me
Still has a very distinguishable "robotic quality" to it, but that will probably improve
This is the worst the technology will ever be... so yeah, a bit spooky.
Cool video!! Try to make more often longer videos, more fun and exciting! ❤️
For the song covering maybe, but this seems crossing the line into creepy. 😂
It was much, much better than I thought 😅, but mind-blowing tech nonetheless.
This would be hilarious for messing with people in VOIP games like Battlebit Remastered
I've done everything but when I get 2:22 here the black menu doesn't show up
I still can't use my AMD GPU. I've followed the method twice and it still runs on the processor, not on the graphics card 😢
4:27 Sounds like an old lady who just finished giving a toothless deepthroat gum job to a BBC.
thats a little specific
what the fuck
For some reason, whenever I try to use any of the voices there's a lot of background noise/static, that seems to be coming from nowhere.
Yeah I either hear nothing or just static for me as well
Dont use cpu, be sure to have good gpu : (
@@CertifiedAsher I'm using the GPU version, and my GPU should be plenty to process it at lower bitrates at least, but it always ends up sounding staticy.
@@CertifiedAsher cope
For some reason my client doesn't have the "Select VC" button to select RVC. Does anyone know how to fix this? I can see the default models downloaded in the files but they don't appear on the client as RVC isn't selected. I've also realised that it doesn't seem to be detecting my GPU, as the CPU selection is the only one in the list
Any ideas on how to get this to output as a virtual microphone? This could be really fun in discord.
VB Cable / Virtual Audio Cable - should be easy to find with google
LOL
:) ua-cam.com/video/IS_SPQVv5iY/v-deo.html
maybe set your recording device as Stereo Mixer , i don't know, it might work
having a problem, after starting start_http for the first time it said it failed because it could not find win.api or something like that. Now when i try to run start_http it opens for 1 second and immediately closes
same problem
None of this works, I followed all directions perfectly and my mic isn't working. When I change the input it goes to some "No error message" and "Intialize"
You're not alone. I don't know if it was the update I did because it was working fine this morning. Now gpu and server giving me a "ERR_CONNECTION_REFUSED" error.
Where do I find the voices like Botan and Marine? Would love to get em! (aswell as ur settings with em)
Exactly what I was about to ask!
Appreciate it guys, but I can't distribute the models unfortunately! However, I can share the knowledge required to train the models and I have those videos on my channel. I'm working to get it all a bit more organized, but you'll have to gather audio data on your own (though there are plenty of tutorials on how to get audio data out there).
@@Jarods_Journey if you cant distribute it why even make the video at all or even showcase it LMFAO
@@Boredness90 It's educational content and falls under fair use. Distribution does not, falls under more murky waters.
this is super cool!! is there a way to use this as an input for a discord call or something?
Hi Jarods, I have tried to download this voice changer multiple times. But every time I click on the drive link (normal) it does not work. It says that too many people are downloading this file, so the download fails. Please help me with how to download this if this way is not working. Thank you,
same
Try hugging face link
This would be awesome for online TTRPG's. As a deep voiced male, the best I can do is an intimidating Hag. I'd like to get some other female voices.
love how it upgraded and now you can download it from hugging face
one interesting thing is this could theoretically be combined with a translator first, though that would take a whole new, probably larger model,
As hardware and software improves, this is just the beginning!
This is truly an amazing find and piece of software, I would be very interested in messing around with it but unfortunately I have an AMD build and I can't find a way to use my 5700XT GPU to process the sounds, and it doesn't seem to be faring well with my Ryzen 7 2700X CPU :( Any potential help would be greatly appreciated!
i have the exact same build let me know if you find anything ahaha
@@GondoMan21 will do, likewise -so far no luck but I will do some checking each day and come back with anything I learn
Might have to adjust the settings in the client to try and help it out, but it doesn't run too well on CPU unfortunately. Did you download the directml version? That one should support AMD.
@@Jarods_Journey well, it doesn't
Whenever I speak into it. I can hear the voice quite well the only issue is that after I stop speaking a second later it will play a very quiet voice of it back to me. It only does it with one voice tho so I'm assuming it's just something to do with that voice.
Discord nitro here I come
Free money, here i come!
Hey guys, I wonder, can you just use an audio input instead of real time voice so that it still mimics your intonations? Or maybe there is some other software that can help adjust intonations?
Owls need HUGS
Use the standard RVC, so via inference in Python or in a UI like RVC GUI.
can you explain that more @@IelmaoUfo-lp9bd
if i speak and hear it through my headphones i get an echo and the model says what i said many more times... creepy... how can i fix this???????
the "crepe" is unavailable for me, it says "crepe (N/A)"
Yep, Japanese voice models work better with Japanese. I mean... they were trained on Japanese speakers ))
This is most noticeable with consonants, since those are usually treated differently in those programs (e.g. unvoiced consonants don't have pitch). For example, I couldn't make an English model pronounce 's' from my language, or any variations of it actually (regarding tongue position being more forward/backward). But with vowels it followed my speech pretty closely even if those vowels were not typical for English phonetics... with some exceptions on high vowels (pretty decent though). Though the last problem might be due to the relative lack of palatalized consonants in English.
I tried speaking Turkish with it and it works just fine, probably due to the fact that Turkish is pretty similar to Japanese when it comes to pronunciations and stuff.
@@wargreysama To be honest Turkish is close to Japanese even at grammar (up to some degree) ))
It works pretty decent with many languages. I was just curious about possible limitations and since I'm a bit into languages I tried to -make poor anime girl suffer- to play with different sounds non-typical for Japanese.
As for now - my favorite thing is a word 'tractor'. Those voices make it more like 'toractor' which is adorable )
Is it possible to use this with audio files, in terms of converting the audio file to the respective voice?
yes, in the input, you can select file. for me it errored the 1st time but after reloading, it let me select a file from my computer as the input, and record the converted audio by clicking the record / save button on the bottom to get the converted output
Figured out the link, but do you have any advice for getting clearer audio? When I speak it chirps and distorts pretty constantly (sort of like when you lowered the chunk down really low, but I get that effect in all chunks)
this is both with custom pth files and the provided stock ones
it has to be your gpu
@@kennethnathantagalog5597 its set to my GPU (seems to be putting the load on my CPU anyway)
Might be hardware specs, I'll be going over this a little bit more in a vid
Hello, i followed all the instructions but mine still doesnt work. Now i want to delete everything, do i just delete the file in file manager? How to uninstall the one downloaded in the cmd program?
Sorry for my bad english, please let me know @anyone.
FINALLY I can do Jack Sparrow vs Barbosa Standoff in Red Dead Online
Wow thank you for making this tutorial. I'm wondering can I add more models? If so, where to find it?
Subscribed!!!
Appreciate it! I recommend you train models, but there's a discord group called AI Hub where you can go find some models
@@Jarods_Journey thanks, super excited about this
This is the main reason why I rejected my banks offer to secure my bank account using my voice over the phone! AI is freaking scary in the wrong hands!
I have yet to try it on voice security systems... but that'll be an interesting topic to explore.
@@Jarods_Journey You can use it to change your voice live and make a phonecall. That is how a Chinese investor got tricked and lost a ton of money. He thought he was talking to his business partner and sent him money for a business deal. The crook even facetimed the victim using a deepfake of the business partner face!
That certainly gives V-Tubers a break. I don't believe there's any way they could do the same thing everyday and not become even a little tired of it. I don't usually watch most of them exclusively, sometimes clips; but for example, I watched from the last 2.5 hours of Mumei's livestream karaoke and she was already tired and bored after the 1st hour when I joined their stream; not that I'd know what she's like but it definitely seemed like it was possible that it was someone else filling in for the night using such a voice changing program.
Ah, it's not that good to be at that level yet, you can definitely tell when someone is trying to use an AI voice still but to this point, I think V-tubers are youtubers so even if you're an IRL streamer, it's not like you have someone else sub in for you when you're tired xD
It works very well but the latency is very high (more than 2 seconds!) even with Colab Pro when using the notebook page. Any idea to help solve this? Thx!
Im so lost on how to even get it to work lol
it just says in the cmd prompt thing "warming up... generating sola buffer." and that's after I have it turned on, but it doesn't do anything from there and my mic stays sounding default
Tbh, I wasn't sure if the Colab worked, but if it does, the delay would come from having to use Google's servers to host the client instead of your local device
@Jarods_Journey, that's what I expected (I've subscribed to Google Colab Pro) but I still have 2 sec latency. That's frustrating! I would LOVE to use your great tool for my next live performance in Taiwan! Can you help? That would be fantastic
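To reason about where a multi-second delay like this might come from, here is a rough back-of-the-envelope sketch. It assumes the total delay is roughly the sum of three parts: the audio buffered before conversion, the model's processing time, and (for a hosted notebook) a network round trip. The function names and example numbers are hypothetical, not measured values or settings from the voice changer.

```python
def buffer_ms(samples, sample_rate):
    """Milliseconds of audio held in a buffer of `samples` at `sample_rate` Hz."""
    return 1000.0 * samples / sample_rate


def estimated_delay_ms(samples, sample_rate, inference_ms, network_round_trip_ms=0.0):
    """Rough delay estimate: buffered audio + processing time + network round trip."""
    return buffer_ms(samples, sample_rate) + inference_ms + network_round_trip_ms


# Hypothetical local run: no network hop at all.
local = estimated_delay_ms(24000, 48000, inference_ms=200.0)

# Hypothetical hosted run: same settings plus a 300 ms round trip.
hosted = estimated_delay_ms(24000, 48000, inference_ms=200.0, network_round_trip_ms=300.0)
```

Under these assumptions a faster subscription tier would only shrink the processing term. The buffer and the network round trip stay the same, which would fit the reply above about the delay coming from hosting on Google's servers.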
Awesome, I'm sure this will be put to a very very very.. Very good use.
Hey Jarods! A fellow Mechatronics Engineering Grad here, you make a lot of quality content and I wish to message you regarding some technicalities of AI voice cloning. And maybe some career advice for degree holders in Mech? haha
Where can I reach you?
Hey Patrick, always great to see a fellow mecha :D! I would say linkedin is going to be the best bet for professional stuff, if not, there, then the next best bet is discord as I'll generally respond on there. I do get a lot of PMs but it shouldn't be a problem if you pmed me from my group.
Gotcha! How do I find you on Linked In btw hehehe
DUDE RTX 4090 🙂
Every time I try to launch the voice changer it shows this error and doesn't work: "Failed to load URL: localhost:18888/ with error: ERR_CONNECTION_REFUSED". Or, when it does load, I click on a voice and it says "Cannot read properties of null (reading 'enableServerAudio')"
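ERR_CONNECTION_REFUSED generally means nothing was accepting connections at that address when the page tried to load. One way to check whether the local server has actually come up is a small sketch like the one below. The port 18888 is taken from the error message above; the function name is made up for illustration, and this only tests that the port answers, not that the voice changer itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if something accepts a TCP connection on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running `port_is_open("localhost", 18888)` a little while after starting the bat file should return `True` once the server is listening. If it stays `False`, the server probably never started, and the cmd window is the place to look for the reason.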
Welp, this is going to put old voice actors out of a business, but on the other hand it's also going to allow VAs to be easily replaced in case of illness, death or jail sentence (yes that last one has happened)
AI will erase many jobs
i mean real life people can still say a vowel for a long time without fail so
they still need the datasets to train the ai with which will need VAs to make so they will still have jobs just making datasets rather than the exact lines
@@muzz4355 Which is why I specified *old* actors are out of a business - they have plenty of recorded voice to train their doppelgangers on. Newer actors are safer in virtue of having less data to train on.
@@csolisr its less the VAs that are in danger but rather the specific characters they voice. a VA is always changing tone, accents etc between characters . Old actors will still be wanted to come in for new characters but less likely to return to their existing characters.
It's cool, I really admire someone like you❤
Don't let these Nigerian dating scammers know about this
RVC is amazing, but the latency is a huge problem
can i use it on 750 ti?
I just used the voice of this video on my phone to configure my settings. Thanks
Followed along, did the same settings, clicked start and nothing....
The google drive says "Sorry, you can't view or download this file at this time."
I have a question, how do you get the trained voice material? I mean, where do you download Marine's voice? (I'm not finding Marine)
For those who have an AMD graphics card, when you set up the whole software and have a model selected, open your Task Manager and test the said model while switching from GPU1, GPU2 etc.. One of those is your graphics card, so it won't have to use your CPU. I have an RX6800XT and still couldn't find my GPU and it was lagging due to it using my CPU. Following the steps above will sort that out, at least it did for me. GPU1 for example had my CPU at 80%. GPU0 on the other hand had my CPU at 20%, which means that GPU0 is actually my RX6800XT.
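The trial-and-error above boils down to: test each device entry, note the CPU usage from Task Manager, and keep the entry that leaves the CPU least busy. A tiny sketch of that comparison, using the two readings quoted in the comment (the function name is made up, and the readings are typed in by hand rather than measured by the script):

```python
def likely_real_gpu(cpu_percent_by_device):
    """Return the device label whose test run left the CPU least busy."""
    return min(cpu_percent_by_device, key=cpu_percent_by_device.get)


# CPU usage noted by hand from Task Manager while testing each entry.
readings = {"GPU0": 20, "GPU1": 80}
choice = likely_real_gpu(readings)
```

With those readings `choice` is `"GPU0"`, matching the commenter's conclusion. Low CPU usage is only a hint here, so confirming that GPU usage rises for the same entry is a sensible cross-check.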
i don't hear anything, how to fix pls
edit: also i have a white screen when im opening the voice changer app help pls
I don't know why, but on the input sound test feature I have quite severe noise, even though my room is quiet.
It was quite annoying and affected the voice changer results. help :')
I’m having a problem when I try to use the voice changer when I talk I hear a static sound and when I get it to work sometimes it has a very long delay
this is great, i can finally act like im some online characters on video games
Awesome!
Is there a colab notebook for "realtime" voice changing? I saw the so-vits-svc-fork repo but it does not work for "realtime" voice changing.
omg i loved that u used houshou marines voice LMAO
When I open the Google Drive the file does not show up, there is an error.
I have got the strangest problem. When I choose my microphone, RVC is hearing everything. If I play a YouTube video it starts audio changing that. In Discord it changes my friends' voices too. How the heck do I make RVC only hear my microphone?
lol, i'm about to call my friend just to say "Uh, Hello! Hello Hello? Eh, i've a message for you, Freddy Fazbear's Pizza sends u a Happy Birthday Johnn," with phoneguy model, lol, thats crazy
I have a really good computer, so I don't understand why the voice has a 3 second delay :(
If anyone can help me with this, please give me some advice.
You mentioned training your own models but I don't really understand the process like that file type I'm not familiar with. Maybe I need to watch some of your other videos🤷🏾♂️
why mine echoes, like it repeats the word i say many times
Same did u find a solution ?
I tried it, but voice lags and comes with lags like I had 1990 pc. It speaks with pauses, so...me...thing... ... like... th-this