So is this for English voices only, or can I do it for any language, e.g. train it on German and speak German?
Any language should work. So if you train a German voice and then use another native German voice as input to clone it, it should sound perfect. If you use input in a different language than the one the voice was trained on, you will end up with an accent (similar to real accents).
@@AiVOICETUTOR Hi man, will this guide work for any language? I want to clone my voice in Vietnamese. If it won't work, please help me with a link to any other guides!
Sorry I missed your reply. Please post exactly what didn't work for you here so I and others can help you.
@anyonecancode8329 Train the model with a Vietnamese voice and it'll work.
@@AiVOICETUTOR I'm trying to do voices, but keep ending up with an Australian accent to anything I do. Is there a way to sound less Australian?
It's amazing that you keep updating this! Thank you very much! Keep it going!
Thanks, will do!
Thanks for uploading this! I couldn't find a beginner friendly tutorial anywhere until I stumbled upon this upload. Thank you so much!
Thanks for taking time to comment and I'm glad it was helpful!
Very good, detailed, and clear tutorial. I understand you have just started your YouTube journey, but I encourage you to keep going. I can see your channel growing because the content quality is excellent.
Thank you so much for your kind words! Much appreciated!
not clear at all xd
The quick GUI is very helpful. I spent 3 days installing RVC with different commands; as I'm not a programmer, it was difficult for me. After everything worked, the main issue was training the model: when I start training, my GPU temps go to 83+ within 5 minutes (I live in a hot area). I tried community pretrained models for my work, but RVC never detected them. Luckily this video helped me.
Thank you, man!
Glad it was helpful. Thanks for taking the time to comment!
Is Harvest the best quality, or Crepe? Do you have a setting for the highest quality? Thanks!
I think Harvest is the best quality.
Hey mate. First of all, *GREAT JOB*. Everything works great. I have a question: what about giving it audio in English and converting it to another language? Any idea?
This is awesome. Quick question: if I increase or decrease the number of epochs (you've mentioned 300), will that make any difference? Sorry, I'm new to this.
This "first video tutorial" got me subscribed. I'm waiting to try this but I'm waiting for another speech training program I'm running to complete. I'm interested in program/tutorial that utilizes Speaker Diarization to identify and separate speakers in an audio file and segments each speaker's audio into separate audio files. E.g., create a training dataset from video/movie of favourite characters and clone them (responsibly of course!).
Thanks for your feedback! What you described would be really really useful and I'm sure it won't be too long before we'll be able to do this somehow
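A diarization workflow like the one described above can already be scripted. Below is a rough sketch using pyannote.audio for the speaker diarization and pydub for cutting the clips; the model name, Hugging Face token and file paths are placeholders (not from the video), and pyannote's gated models require accepting their terms first.

```python
# Rough sketch: split a recording into per-speaker clips for a training dataset.
# Assumes pyannote.audio and pydub are installed; the model name, Hugging Face
# token and file paths below are placeholders, not taken from the video.
import os
from pyannote.audio import Pipeline
from pydub import AudioSegment

HF_TOKEN = "hf_..."  # hypothetical access token for the gated pyannote model
pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1", use_auth_token=HF_TOKEN
)

audio = AudioSegment.from_file("movie_audio.wav")
diarization = pipeline("movie_audio.wav")

# Write one clip per speaker turn, grouped into a folder per speaker label.
for i, (turn, _, speaker) in enumerate(diarization.itertracks(yield_label=True)):
    os.makedirs(f"dataset/{speaker}", exist_ok=True)
    clip = audio[int(turn.start * 1000):int(turn.end * 1000)]
    clip.export(f"dataset/{speaker}/clip_{i:04d}.wav", format="wav")
```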
Hi, first of all I would like to thank you for such a clear guide. Secondly I would like to ask one question. I made my voice AI model, but since I have a weak video card, I've only been able to make 25 epochs so far (each one takes me 20-25 minutes), so here's a question: if I use the same voice file, then specify my existing model in the "Load pre-trained base model (G and D) path" tab and make 20 epochs this time, will they add up? So in total my model will have 45?
Hi, thanks, I'm glad you like the video. You can easily resume by going back to the tool and clicking "Train Model". But when you start training, you should set the total number of epochs you want to train, and then you can interrupt it after the .pth has been saved (after whatever you set as the "saving frequency"). More info about resuming here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/953
Thank you !@@AiVOICETUTOR
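Conceptually, resuming is ordinary checkpoint loading: the saved epoch counter is read back and training continues from there, so 25 saved epochs plus 20 more does end up at 45. The sketch below is a generic PyTorch illustration of that idea, not RVC's actual training code; the function and file names are made up for the example.

```python
# Generic PyTorch checkpoint-resume sketch, for illustration only (not RVC's code).
# A checkpoint saved at epoch 25, resumed and trained for 20 more epochs,
# ends up at epoch 45 in total.
import os
import torch

def save_checkpoint(model, optimizer, epoch, path):
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

def load_checkpoint(model, optimizer, path):
    if not os.path.exists(path):
        return 0  # nothing to resume, start from scratch
    ckpt = torch.load(path, map_location="cpu")
    model.load_state_dict(ckpt["model_state"])
    optimizer.load_state_dict(ckpt["optimizer_state"])
    return ckpt["epoch"]

def train(model, optimizer, dataloader, total_epochs, ckpt_path, train_one_epoch):
    start_epoch = load_checkpoint(model, optimizer, ckpt_path)
    for epoch in range(start_epoch, total_epochs):
        train_one_epoch(model, optimizer, dataloader)
        save_checkpoint(model, optimizer, epoch + 1, ckpt_path)
```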
Thanks man. Now I can do animation without hiring someone else to record voices.
Glad it was useful to you!
awesome
thanks
Good tutorial, clear, concise and well presented.
Thank you very much! Glad you like it
Thanks for the tutorial. 🙂 Unfortunately, I have a problem in the Czech language. I have trained my model for 500 epochs on roughly 26 minutes of training data. When I finally apply the model, the voice timbre is perfect, but the pronunciation is sometimes quite garbled. The original and the target language are the same (Czech). Do you have any advice?
You're fantastic; thank you so much 👍
You're welcome and thanks for taking the time to comment 🙏
What's better, RVC or Applio, then?
very helpful
thank you
Thanks for the clear tutorial. Do you have any tips for getting a more realistic end result? I guess the quality depends a lot on the input source. But is it possible to do extra training, increasing epochs maybe? Or would using more than ten minutes help too?
Thanks for your comment. You could try what my latest edit in the video description suggests and use RMVPE to train the model. Or you could try it with more or fewer epochs. I also found that it works better with some input voices than with others, and some input voices need more epochs than others, etc. It's really a science in itself.
Thanks, I'll experiment a bit. It's pretty cool stuff though; when it works it's very impressive. Although the end result really depends on the audio you use to convert, because it retains all the mannerisms and even some of the accent. @@AiVOICETUTOR
Yes totally agree. So many factors that are influencing the end result. The good thing is that the tech will only get better from here :)
How do we do this for a singing voice, please?
Thanks! Any chance you'll do a tutorial for singing voice cloning?
I can confirm that this works in Spanish.
Also, is there a way to improve the sound quality? I'm using your suggested values and I get a pretty good output, but you can tell it's fake because it's slightly "robotic". Would I get better results by changing any parameter, even if it takes more time to process?
Thank you for the tutorial!
Hi, you could try what my latest edit in the video description suggests and use RMVPE to train the model. Or you could try it with more or less epochs. I also found that it works better with some input voices than with others.
Thank you so much for this. I was stuck for hours on some other tutorial only because the boob who made it left out the instruction on the Train Feature Index button. I wish this was the tutorial I came across first.
Awesome! I'm glad you found the tutorial and that it was helpful to you
Hello. The .pth is created during training, but the output voice sounds like the original??
Hi, I just saw your channel and the video was easy to understand.
Will it work on my HP Victus? NVIDIA GeForce RTX 3050 4GB graphics card, Core i5, 11th gen.
Please let me know.
Sorry for the late reply, but I just noticed that your comment had been held back by YouTube and I had to approve it. Yes, the tool will definitely work on your machine! Training will be slower than in the tutorial, but it will work.
It doesn't work @@AiVOICETUTOR
I'm way too dumb. It's well explained, I just couldn't get it to work. Thanks.
Thanks, it's very easy to follow. Btw, what microphone are you using?
Thanks. For this video I was only using my iPhone (can't really remember if I was using a headset microphone connected to the iPhone though). Since then I have upgraded to a Razer Seiren V2 X.
@@AiVOICETUTOR Cool thanks.
Can I use it for singing?
Yes, if you train it with a voice extracted from songs (use ultimatevocalremover.com to extract the actual vocals) and then clone a voice that's singing, it should work fine.
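ultimatevocalremover.com is a GUI tool. If you would rather script the vocal extraction for a whole folder of songs before training, here is a rough sketch using Demucs, a different open-source separator; the folder names are placeholders and the CLI flags are from its documentation from memory, so double-check them before relying on this.

```python
# Rough sketch: batch-extract vocals from songs before building an RVC dataset.
# Uses the Demucs CLI (a separate open-source separator, not UVR); install with
# "pip install demucs". Folder names are placeholders; verify the flags locally.
import pathlib
import subprocess

songs_dir = pathlib.Path("songs")        # hypothetical folder of mp3 files
output_dir = pathlib.Path("separated")   # Demucs writes its stems under here

for song in sorted(songs_dir.glob("*.mp3")):
    subprocess.run(
        ["demucs", "--two-stems", "vocals", "-o", str(output_dir), str(song)],
        check=True,
    )
# The vocal stems should end up under separated/<model_name>/<song_name>/vocals.wav
# and can then be used as the RVC training folder.
```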
Can you add text-to-speech? A web GUI to enter text and convert it to audio?
I just made a video about a new text to speech tool: ua-cam.com/video/-brbxJ43F1c/v-deo.html And here's another method: ua-cam.com/video/P1HIOvKg5Ko/v-deo.html
Can I change my current voice model online in the way you showed, with my computer having 4 GB of RAM?
It's recommended to use a card with more than 4GB of VRAM, but it might just work (see: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/wiki/FAQ-(Frequently-Asked-Questions)#q8cuda-errorcuda-out-of-memory)
I guess I can't do it on Mac since MacOS can't run .bat files.
You can install Pinokio (ua-cam.com/video/ln1qEglnpMo/v-deo.html) on Mac and install RVC easily. Hope it works for you
It all worked flawlessly! Thank you very much, I will look out for more content from this channel :)
One thing: can you give some references/papers to cite for the technology used by the repo?
Glad it worked for you without issues! I couldn’t find any papers (maybe because the authors of the tool are Chinese) but the keyword is “retrieval based voice conversion“. Maybe you‘ll be able to find something
@@AiVOICETUTOR Hey, the repo references HiFi-GAN, which is a paper that is already a few years old. I'm just not used to this "old" tech suddenly giving results like this because the community delivers in such a way. Audio is just a totally different world compared to text processing and LLMs, where new papers are thrown out on a daily basis. Thank you for your help though!
Ugh. One note says you need 30 minutes. Then it's only 10 minutes. Could we please make the information clearer up front?
What is the actual minimum amount of time needed? 2 minutes? 30 seconds?
I did not expect step 2 to take a literal 24 hours lol, but hopefully this all works in the end.
Sorry it took so long for you but I hope you got a good result in the end
Great video. I have noticed that when my Gradio page comes up, it doesn't have the same options as yours. It might need to be updated; can you please advise how I can do this?
Also, will you be creating a Colab version of this too?
Thanks, and sorry for the delayed reply. If you want to use the exact same version as I did, download this archive: huggingface.co/lj1995/VoiceConversionWebUI/blob/main/RVC-beta-v2-0528.7z. I still haven't been able to look into Colab yet, but it's on my ToDo list.
@@AiVOICETUTOR I downloaded this version and in the GPU information I get "No supported GPU is found. Training may be slow or unavailable."
Which GPU do you have?
I think it's most likely a problem with PyTorch, but I don't know how to update it...
Could be far fetched but maybe have a look here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/350
Hi!
Is it possible to use voices trained with this method in real-time? For example in Discord app?
Hey! Technically, it should already be possible since the voice conversions are running faster than real time on some hardware. I haven’t seen any implementations so far but it must be only a matter of time.
@@AiVOICETUTOR Make a tutorial?
It's possible. The author made a video about this: ua-cam.com/video/vFKm-G-dxHo/v-deo.html
I have followed every step in this to a tee, but when I try to save the .pth file, nothing shows up in weights, but every other file is saved. I cannot find the pth through a search either. I'm not sure why this is.
Training with 01:35 is not good, right? I cleaned the audio of the only video I have of my deceased grandmother; training with 300 epochs took all day. I'm using an NVIDIA GeForce GTX 1060 6GB and set the GPU value to 10. The result was not good, and I noticed the file size did not increase any more, so the training could not improve any further. I guess 01:35 is insufficient data.
Yeah, I'm afraid the sample size is too small for this tool. You could try two tools (XTTS and Bark Voice Cloning) via Pinokio (ua-cam.com/video/ln1qEglnpMo/v-deo.html) that require only a few seconds of input data. However the quality of the results is not comparable to RVC. Maybe the next version of RVC will work better with shorter input audio samples
Am I doing something wrong? When I do "process data", I get this in the command prompt window:
runtime\python.exe trainset_preprocess_pipeline_print.py E:\AL Speech Stuff\RVC-beta\RVC-beta0717\sheapk 40000 11 E:\AL Speech Stuff\RVC-beta\RVC-beta0717/logs/just a test False
Traceback (most recent call last):
File "E:\AL Speech Stuff\RVC-beta\RVC-beta0717\trainset_preprocess_pipeline_print.py", line 8, in
sr = int(sys.argv[2])
ValueError: invalid literal for int() with base 10: 'Speech'
Is this right or is something wrong? I can't make head nor tail of this robot language, but I don't think this is right.
Remove the spaces in your folder path and it should work.
Still can't figure this out.
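For anyone still stuck on this error: judging by the traceback, the most likely cause is the space in the dataset path ("AL Speech Stuff"), which makes the path split into several command-line arguments so that int(sys.argv[2]) receives the word 'Speech' instead of the sample rate. That is an educated guess rather than a confirmed diagnosis; the small check below only flags risky paths before you hit "Process data".

```python
# Quick sanity check for an RVC dataset folder path before hitting "Process data".
# Educated guess at the failure mode: a space in the path makes the helper script
# receive the wrong argument, hence "invalid literal for int(): 'Speech'".
import re
import sys

def check_dataset_path(path: str) -> list:
    problems = []
    if " " in path:
        problems.append("path contains spaces (e.g. 'AL Speech Stuff')")
    if re.search(r"[^A-Za-z0-9_\\/:.\-]", path):
        problems.append("path contains special or non-ASCII characters")
    return problems

if __name__ == "__main__":
    default = r"E:\AL Speech Stuff\RVC-beta\RVC-beta0717\sheapk"
    path = sys.argv[1] if len(sys.argv) > 1 else default
    issues = check_dataset_path(path)
    print("Looks safe." if not issues else "Rename the folder: " + "; ".join(issues))
```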
Why does it say "Please select a model and input audio file"???????????
I think I'm gonna make pororo ambatukam
Had to look it up. Go for it!
Yes
No supported Nvidia cards. Using CPU for inference.
CPU only. A solution, please?
You need to use an NVIDIA GPU.
Did everything as explained, but I am prompted with a FileNotFoundError. What could be the issue?
Oh snap, I am definitely gonna animate using this!
Awesome!
@@AiVOICETUTOR I'll make Phineas and Ferb parodies
Would love to see them! If you want to, drop me a link once you’re done
@@AiVOICETUTOR* o h i w i l l *
Hi, I wonder whether I should use GPU or CPU processing mode.
Hi, you should use the GPU. CPU should also work but will be slower.
@@AiVOICETUTOR How slow is the CPU method? GPU is greyed out so I can't really use it and it would be nice to know how long this could take.
Thank you!
Isn't there an option to use text to generate audio with these AI models?
Sadly not. Check this video for text-to-speech: ua-cam.com/video/P1HIOvKg5Ko/v-deo.html
Would it be possible to use this process to change your voice live through a microphone?
Yes and I am working on a tutorial for that at the moment
So, my samples aren't showing up in the "weights" folder, nor are there any new .pth files anywhere on my system. So... now what? Is there an update to this tutorial, because I am not familiar with this enough to troubleshoot on my own--that's why I'm using a tutorial in the first place.
Thank you very much, I've been trying to find this program.
Everything works perfectly, only one issue: the epochs are really slow despite a powerful PC. What could be the reason?
Glad it works for you. If you’re not getting any errors, I think it’s normal that it’s slow. Even on powerful PCs
Please, how do I livestream this in a Zoom video call?
Works the same way for Zoom as for discord: ua-cam.com/video/vFKm-G-dxHo/v-deo.html
Bro, how do I train a voice and download the .pth and index files? Please give me a website link.
Can you please make a tutorial on how to clone a singing voice?
Yep, it's on my ToDo list.
My friend, what is the retrieval rate? What does it do?
Hey I couldn't find a lot of information on it but here's what Bard has to say:
The retrieval rate in RVC beta is not publicly available information. However, the developers of RVC have stated that they are working on improving the retrieval rate, and they believe that it will be significantly improved in the future.
In the meantime, you can use the following tips to improve the retrieval rate in RVC beta:
Use a high-quality audio sample as the query.
Speak clearly and slowly into the microphone.
Avoid background noise.
If you are having trouble getting RVC to recognize your voice, you can try adjusting the settings in the RVC webui.
Hope this helps
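As I understand it, the "retrieval" in retrieval-based voice conversion refers to looking up, for every frame of the input, the nearest feature vector from the training set (the index built in the "Train feature index" step) and mixing it back in; the retrieval/index rate is the mixing weight. The snippet below is a conceptual NumPy illustration of that blending under those assumptions, not RVC's actual implementation.

```python
# Conceptual NumPy illustration of retrieval blending (not RVC's actual code).
# Each frame of the input is matched to its nearest training-set feature vector,
# and the retrieval/index rate controls how much of that retrieved vector is
# mixed back in; 0 leaves the input features untouched.
import numpy as np

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(500, 64))   # stand-in for features of the training data
input_feats = rng.normal(size=(120, 64))   # stand-in for features of the audio to convert
index_rate = 0.75                          # the "retrieval rate"-style knob

# Nearest-neighbour lookup; RVC builds a FAISS index for this in "Train feature index".
dists = np.linalg.norm(input_feats[:, None, :] - train_feats[None, :, :], axis=-1)
retrieved = train_feats[dists.argmin(axis=1)]

blended = index_rate * retrieved + (1.0 - index_rate) * input_feats
print(blended.shape)  # (120, 64): same shape, nudged toward the training voice
```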
What should the batch setting be for my GPU?
I have a 6GB RTX 3060.
How do I train my existing model?
Could you explain in a bit more detail what you want to achieve?
I created a model thanks to the video and now I want to improve it (train it further) @@AiVOICETUTOR
Oh you want to finetune a model. To my knowledge that isn't possible yet. If you want to improve a model, you'll have to train it from scratch again with different parameters
You said this isn't the most efficient method, can you guide me to the most efficient and highest quality method? I would like to generate a realistic model with fast inference.
me too
What I meant by “not the most efficient method“ was that I’m using a separate tool (RVC GUI) instead of doing the voice changing in RVC Beta. Therefore using twice the space. Although it’s not perfect, to my knowledge, training a model in RVC is the highest quality free method available currently.
Hi, please answer me. I am trying to train a model, but I know I can't do it in one sitting; the PC will restart. Is there a way to continue training from where I left off? Let's say I am at epoch 120 and the PC shuts down; I don't want to start again from 1. I know we can back up every few epochs, but can we continue training from, say, the 50th or 100th epoch if I am backing up every 50 epochs?
Hi. Yes this seems to be possible but it's not very straight forward. Check this thread on Git for more info: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/606. Hope it helps
Hello friend, sorry for my English. You download the RVC-beta.7z file; is it for macOS too, is there another one, or can't I use it on a Mac?
Thank you!
Hey, yeah this works on Mac too. I haven't tried it myself though but you can find some info on how to run it on Mac here: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/blob/main/docs/en/README.en.md
When I'm training I get this error: loss_disc=nan, loss_gen=nan, loss_fm=nan, loss_mel=nan, loss_kl=9.000
I have this error
ValueError: invalid literal for int() with base 10: 'Voice\\xxx
Make sure you have no spaces or special characters in the folder path
@@AiVOICETUTOR But mine doesn't have any spaces or special characters in the path. I've been trying to solve this for 3 hours without success, because I'd never want to give up even if it's driving me insane. Come up with a better solution instead of copying and pasting the same thing onto other comments with the same issue as mine. If you can't, then you could of course link me some other YouTube video that goes in depth on the same tool you're using in your video.
How do I use this in Google Colab?
I hope someone else can answer this. Although it's way up on my ToDo list, I haven't used Colab yet.
Hi!
When deciding the number of epochs... What would be a rough scale to follow corresponding to the amount of voice lines we have?
Hi, this is a tough one as it might depend a lot on the input voice. 40 minutes with 300 epochs gave me the best results overall so far but I'd love to hear from others.
I don't have "one click training button" any idea why?
These are my laptop specifications. Please suggest the frequency, epochs, batch size, and GPU settings for me.
CPU: 13th Gen Intel(R) Core(TM) i7-13620H, base speed 2.40 GHz, 1 socket, 10 cores, 16 logical processors, virtualization enabled, L1 cache 864 KB, L2 cache 9.5 MB, L3 cache 24.0 MB
GPU 1: NVIDIA GeForce RTX 4050 Laptop GPU, driver version 31.0.15.5161 (2/15/2024), DirectX 12 (FL 12.1), 6.0 GB dedicated GPU memory, 7.9 GB shared GPU memory
thanks!
Thanks for watching!
Can it be used in real time?
Absolutely! Check out this video: ua-cam.com/video/vFKm-G-dxHo/v-deo.html
I have a question:
is this real-time voice cloning, or do you record audio, change it afterwards, and it comes out as a file?
Yes this tutorial is for cloning prerecorded voices. If you want to clone your voice in real time, you can watch that video: ua-cam.com/video/vFKm-G-dxHo/v-deo.html
@@AiVOICETUTOR Thank you man, I appreciate it.
5:20 In the cmd window I have this as the last line: ValueError: invalid literal for int() with base 10: 'Music\\RVC-beta-v2-0528\\Lecturer'
My audio is 9 minutes.
Very stupid program, but my directory was too long... I had to edit it from Music\\RVC-beta-v2-0528\\Lecturer
to M\\RVC-beta-v2-0528\\Lecturer (make it short).
Interesting. Glad you figured it out and thanks for sharing the solution!
Can we use text to create audio with the voice we have trained?
Yes but the models are not compatible with this method. Check this video for Text-To-Speech: ua-cam.com/video/P1HIOvKg5Ko/v-deo.html
@@AiVOICETUTOR thank you
Is there any way to do this with an AMD graphics card?
From what I can tell, you need to be on Linux to use AMD cards and not all cards are supported yet
Is this process compatible with an NVIDIA GTX 1650 4GB?
And can you please tell me the best settings for 4GB of VRAM?
4GB is borderline for training a voice (8GB recommended), but if you're lucky it might work. You can try the following things if you're having memory issues: 1. Lower the batch size to 1. 2. Cut the audio into clips shorter than 10 seconds. 3. Reduce the size of the input audio.
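To make point 2 above concrete, here is a small sketch that splits a long recording into clips under 10 seconds using pydub; the paths are placeholders and pydub needs ffmpeg installed to read compressed formats.

```python
# Sketch: split a long recording into clips under 10 seconds for a low-VRAM run.
# Assumes pydub (and ffmpeg) are installed; the file paths are placeholders.
import os
from pydub import AudioSegment

CLIP_MS = 9_000  # stay safely under 10 seconds per clip
audio = AudioSegment.from_file("dataset/long_recording.wav")

os.makedirs("dataset/clips", exist_ok=True)
for i, start in enumerate(range(0, len(audio), CLIP_MS)):
    clip = audio[start:start + CLIP_MS]
    if len(clip) < 1_000:  # skip a tiny leftover tail
        continue
    clip.export(f"dataset/clips/clip_{i:03d}.wav", format="wav")
```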
Hmm, I got an error when trying to input an audio file: RuntimeError: Failed to load sound: [WinError 2] The specified file could not be found. What did I do wrong? Help me ASAP.
Make sure you don’t have any spaces or special characters in the name of the folder path
Can I use Mac to follow these steps?
I think it’s possible but I haven’t done it myself and can’t find a lot of good info about it. Maybe keep an eye on this: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/575
How much time does this whole process need to train a model?
Because in my case the software hangs and gets stuck on step 2b. What should I do?
Please help and guide me.
Step 2b, the feature extraction, should be very quick unless maybe you have many audio files and are using a regular hard disk (and not an SSD)
I've downloaded it and it works, but now I want to uninstall it and I don't know how. It's taking up too much space for me.
You can just delete the folder manually.
Thanks for the tutorial.
You're welcome and thanks for watching
Typed in the path for the audio, hit process, and I get this message:
Expecting value: line 1 column 1 (char 0)
Make sure your folder path doesn't have any spaces or special characters in it
I encountered the "CUDA out of memory" error while performing step 3. Does it mean my GPU is too weak? I'm using an RTX 2060 graphics card. Please guide me on how to solve this issue, thank you so much.
Sorry it didn't work for you. This seems to be one of the most common issues/bugs. You can try the following things: 1. Lower the batch size to 1. 2. Cut the audio into clips shorter than 10 seconds. 3. Reduce the size of the dataset.
@@AiVOICETUTOR I reduced the batch size to 1, restarted the computer and it's now functioning. Thank you for your help!
Awesome! Glad it worked
It said "move model to cuda". What does that mean?
Hi, I wonder whether I should use the GPU or the CPU.
It just takes longer to train (very long).
Yes, always use the GPU instead of the CPU if you can. But the CPU will work too.
Don't you need to install the requirements before using this?
Nope you only need to do the steps in the video
@@AiVOICETUTOR OK, but when I try to train my voice, the first 2 options work just fine. When it comes to One Click Training, it just stops at all-feature-205 / all-feature-done, so basically it stops at feature extraction.
Make sure your file names and folder paths don't contain any spaces or special characters
where does that "Lecturer" folder came from?
I downloaded a video lecture off the internet and put it in a folder called "lecturer". Hope that helps
Hi, for the Epoch step, is it necessary for it to have 300? It takes a very long time for it to process all of them especially if you don't have 12GB of VRAM for your GPU
Hi, it’s not necessary to use 300 epochs but from what I know, with the 10 minute samples, 300 works the best. You could try it with a lower value and see how well the voice performs. For testing purposes maybe start with 20 or 50 epochs
@@AiVOICETUTOR Hi, after testing I found that 300 epochs sounds best and less robotic than lower values. What would happen if the epochs were raised above 300? Would it sound more accurate, or is it a waste of time?
There is a risk of “overtraining“ a model but I’d say you can definitely try going higher than 300. Some people went as high as 1000 but it also depends on how long your input data is
@@AiVOICETUTOR How much input data would you need if it required 400 epochs? Also, is it recommended to put in audio clips with different tones of the voice (yelling, whispering, etc.)? Thanks
I wouldn’t go much higher than maybe 15-20 minutes for 400 epochs. And yes, I think ideally the samples should contain different emotions (whisper only if it’s clearly audible). It’s still on my ToDo list though so I can’t tell how well it’ll work
When I had just started training the voice model, during the 5th epoch I received a warning that the computer's disk space was full, and I immediately deleted my movies and games and freed up 40 GB of space. But until I deleted them (I do not know whether the epoch progressed or not), my disk was full for a short time during the training (5-10 minutes). Will it harm the whole process? It will take a long time to go up to 300 epochs, and I don't want to start the training all over :(. By the way, I made space on my computer and the training is ongoing. One last question: what exactly does "epoch" mean for an AI voice model? It would be great for both me and your followers who may have the same problem if you reply. I am waiting for your answer, and thank you very much in advance.
Hey, if it ran out of space, it should have stopped the training process. So I hope you ended up with a good voice model in the end. Good question about the epochs. I didn't have a clear answer in my head so here's what Bard says:
An epoch is one complete pass through the training dataset. This means that the model will see each audio file in the dataset once, and then it will start over at the beginning of the dataset and see all of the audio files again. The number of epochs that you specify when you train a model will determine how long the training process will take.
For example, if you have a training dataset of 100 audio files and you specify 20 epochs, the model will make 20 complete passes through the dataset, i.e. 2,000 individual file presentations in total. This means that the model will see each audio file 20 times during the training process.
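The arithmetic from that example, spelled out under the same assumptions (100 files, 20 epochs):

```python
# Epoch arithmetic for the example above: 100 training files, 20 epochs.
n_files = 100
n_epochs = 20
passes_through_dataset = n_epochs            # one full pass per epoch -> 20
file_presentations = n_files * n_epochs      # 100 * 20 = 2,000 in total
times_each_file_is_seen = n_epochs           # each file is seen 20 times
print(passes_through_dataset, file_presentations, times_each_file_is_seen)
```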
Can it be used with a foreign language?
Yep, if you train on a foreign language and clone a voice in the same language, it will work well.
good
🙏
I don't have Nvidia but AMD. When I get to the training, it tells me that it has finished successfully, but it has not done any epochs. Is it because I don't have the right graphics card?
Yeah, AFAIK RVC doesn't work with AMD GPUs on Windows yet.
I don't have a GPU... can I get this done with Colab?
Yes, even though I haven't tested it, here's a Colab for the training: colab.research.google.com/drive/1TU-kkQWVf-PLO_hSa2QCMZS1XF5xVHqs?usp=sharing
To run RVC, what PC specs do I need? Is 8GB RAM and a 2GB GPU OK?
A 2GB GPU is not enough for training a voice (4GB is the bare minimum, 8GB recommended), but you can clone a voice using pretrained models in RVC GUI with the CPU alone; it will just be much slower.
anything for Mac users?
I think it’s possible but I haven’t done it myself and can’t find a lot of good info about it. Maybe keep an eye on this: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/575
RVC GUI is telling me "please select a model and input audio file" when I click convert, despite the fact that I definitely have both selected. I even tried changing the models and testing that the audio file was not corrupted.
Sorry it's not working for you. I couldn't find any info on it but hope it will be fixed in a future version of RVC
It works without the index??!!!
Yeah the index is optional (and I don’t notice much of a difference when I don’t use it)
Can I do all of this if the audio is only 3 minutes, or does it need to be 10 minutes or more?
Yeah, you can do it with 3 minutes, but you should use fewer epochs. It's hard to tell the perfect number (it depends on more than just the length of the input voice). Maybe start with 150 epochs and see how it sounds.
@@AiVOICETUTOR thanks
Hey, can rvc-pkg be installed locally and is it free to use forever?
Hey, yes, you can use this locally, free, forever.
Sir, after I trained the model's voice, I can't find the index file in the logs or weights. What should I do?
Make sure you don't have any special characters or spaces in your folder paths, and that it's not a cloud or network drive.
How did you download the lecturer file?
There are many ways to download videos off YouTube, if that's what you meant. For example, you can do it on some websites if you google "YouTube downloader".
7:08 How long does it take to finish?
The "One Click Training" can take a few hours depending on your GPU and number of epochs. "Train feature index" will only take a few seconds at most.
Hello brother, I have a problem in model inference: 'NoneType' object has no attribute 'dtype'
Hey check this out: github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI/issues/1020. Hope that fixes it!
What f0 method should I use in the RVC GUI if I use RMVPE to train the model?
That's a great question. I haven't tried it in RVC GUI yet since when I last checked, the latest version didn't have RMVPE. What I did so far is use the "Model Inference" tab in the RVC-WebUI to clone the voice, since that lets you select RMVPE. Over the weekend I'm gonna have a look at RVC GUI and see what f0 sounds best with RMVPE trained models. Guess "Harvest" should work well with it.
I have tried the RMVPE trained voice in RVC-GUI and it works great when using "Harvest" as f0 method.
Thank you. The Model Inference tab also works very well. Initially I assumed that it might be too complicated for a layman and that RVC GUI offered an easier workflow, but now I feel the inference tab is pretty good too.
Same here. I think we got used to the UI by using the training tab a couple of times :)
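For anyone curious what an "f0 method" such as Harvest actually computes, here is a tiny sketch that extracts a pitch curve with the Harvest algorithm via the pyworld package; the file name is a placeholder, and RVC's own pipeline wraps its f0 extractors (pm/harvest/crepe/rmvpe) internally, so this is only to show what the option estimates.

```python
# Sketch: extract an f0 (pitch) curve with the Harvest algorithm via pyworld,
# just to show what an "f0 method" computes. The path is a placeholder; RVC's
# own pipeline wraps its f0 extractors (pm/harvest/crepe/rmvpe) internally.
import numpy as np
import librosa
import pyworld as pw

wav, sr = librosa.load("input_voice.wav", sr=16000, mono=True)
f0, times = pw.harvest(wav.astype(np.float64), sr)  # f0 in Hz, 0 where unvoiced

voiced = f0[f0 > 0]
median_f0 = float(np.median(voiced)) if voiced.size else 0.0
print(f"{len(f0)} frames, {voiced.size} voiced, median f0 ~ {median_f0:.1f} Hz")
```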
Is there any AI tool where I can enter text and it gives me audio, but where I can first train it on a particular voice model and then start making audio from text, rather than actually recording my voice to clone it?
Yes, check out this tool for text-to-speech: ua-cam.com/video/P1HIOvKg5Ko/v-deo.html