XTTS FAQ | Interview with Josh Meyer from Coqui AI

Thorsten-Voice


Published 10 Jan 2025

COMMENTS • 35

@MarcoManzo 1 year ago ⁺⁴
Yes, please make a local voice cloning XTTS tutorial!
@ThorstenMueller 1 year ago ⁺⁵
It's already on my TODO list 😊.
@MightyReiti 1 year ago ⁺²
Thanks, great interview, and very good information, on point as always.
@ThorstenMueller 1 year ago
Thank you very much, @MightyReiti, for your nice feedback 😊
@mobobi 1 year ago ⁺²
Great interview. Good to hear Josh's voice.
@ThorstenMueller 1 year ago
Thanks for your nice feedback 😊. Would you like to see interviews like this on my channel?
@mobobi 1 year ago
Yes. Interviews like these are very helpful, and we'd sure like to see more.
@juanjesusligero391 1 year ago
Oh! Cloning my voice using several clips, each with a different emotion, seems like a really interesting thing! I hope you make a video about that! :D (preferably on Windows)
@ThorstenMueller 1 year ago ⁺¹
This is already work in progress 😉.
@juanjesusligero391 1 year ago
@ThorstenMueller That's great to hear! :D
I think you could (and probably even should) split the content into two videos: one for the local installation (XTTS working offline), and another for the experiment with different voices (same voice but with different emotions). Short and concise videos can be easier to follow, and having more videos will help bring more people to this great channel :)
@Berlin2Best 1 year ago ⁺¹
Hi Thorsten! So if you want to make a video on your (monetized, i.e., commercial) channel involving XTTS (e.g. a tutorial or comparison), you can't legally play anything generated with it. Right?
@ThorstenMueller 1 year ago ⁺¹
So the question is whether it makes a difference if a video with XTTS audio is monetized or not. That's a good point 👍. As Josh mentioned, I'd recommend contacting Coqui AI with that question.
@bobbyboe 1 year ago
@ThorstenMueller Exactly what I thought, too. It might have been a better license fit if "small commercial use", like videos here on YT, had also been included in the free usage.
@pupattolino75 11 months ago
This AI is awesome and great. However, I have noticed when cloning various voices that for male baritone voices I cannot recreate the same vocal impression in the low frequencies. (I use the Italian language prompt.) I have tried several samples, and they seem to confirm that the software generally works better for female voices than for male voices. I wonder if there might be some development, or a different system, that would improve this. Is it a frequency bandwidth issue? Thanks anyway. I am creating my own private audiobooks with this system and have found a way to read while listening. (I am an audiobook subscriber anyway.) But I find that creating something from scratch with a favorite book is a must, obviously not for commercial purposes. In any case, I will use different 6-second samples of the same voice, as suggested in the interview, to see if this can make the voice more realistic, especially in the lower frequencies for male voices.
@ThorstenMueller 11 months ago ⁺¹
Normally I'd recommend asking this specific question (on frequency stuff) in the Coqui TTS community. But as the Coqui AI company has sadly shut down, I'm not sure how fast you will get a response. As I have no experience with this type of adjustment, I guess I can't really be helpful on that issue.
@bobbyboe 1 year ago
Hello Thorsten, great content, these are exactly the questions I would have asked him too. Brilliant, thank you!
I have a beginner's question:
I tried to install Coqui locally using Pinokio. It also seems to be running... according to Pinokio, ...but there is no localhost link I could click on... somehow I can't get to the program... it's surely an embarrassing beginner's question... I'm looking for a way to generate my voice from text (for explainer videos)... at least that was the plan. Best regards, Bobby
@ThorstenMueller 1 year ago
Thank you very much for your very kind comment 😊. I haven't worked with "Pinokio" yet. Just so I understand your question correctly: do you want to clone your own voice with XTTS to use it for explainer videos, or do you want to use an existing (someone else's) Coqui TTS voice for that?
@bobbyboe 1 year ago
@ThorstenMueller Hello, thanks for your reply. I primarily want to be able to use my own voice for text-to-speech recordings of the highest possible quality. I have developed a system for myself in which I design visual and spoken content more or less simultaneously, in 8-second blocks, where the speech then has to match the visuals. The audio recordings are quite laborious because they have to fit exactly. It would be great if this could be done in acceptable quality via text. But I shy away from tackling this with ElevenLabs because I want to avoid subscription dependencies.
@ennething 1 year ago
"I am a student (thus relatively poor) and have ADHD; it is extremely difficult for me to read longer texts. Therefore, I have been using text-to-speech (TTS) for a long time to have texts read aloud. However, there is a huge gap in quality between free TTS and cloud-based, natural-sounding TTS, where one often pays per punctuation mark. XTTS sounds so incredibly promising. I would even consider getting a new graphics card for it. I would be very interested in an installation guide."
@ThorstenMueller 1 year ago ⁺¹
Thanks for your feedback.
My step-by-step tutorial on XTTS is online now 😊:
ua-cam.com/video/HJB17HW4M9o/v-deo.html
Do you know Piper TTS? It could be an interesting alternative for you too.
ua-cam.com/video/GGvdq3giiTQ/v-deo.html
@maryamnazari1281 6 months ago
Hi Thorsten! I want to know whether it is possible to train an XTTS model for a language that is not supported by the Coqui dev team. How much data do I need to train a new language with XTTS?
@ThorstenMueller 6 months ago
If I understood Josh Meyer (from Coqui) correctly, XTTS should be able to clone voices in languages it has not been trained on. Have you watched my interview with Josh on XTTS?
@ThorstenMueller 6 months ago
Argh sorry, obviously you've seen the interview video 🤦.
@priyankagrawal2916 1 year ago
Thanks Thorsten, good interview.
PS: There are way too many ads, pretty much one every few minutes. I would appreciate slightly fewer ads.
@ThorstenMueller 1 year ago
Thanks for your nice and helpful feedback 😊. I have limited configuration options for how many ads are shown inside the video, but I'll keep an eye on that.
@iamjamiilkhan 1 year ago
Please make a tutorial on XTTS on Windows or an Ubuntu server.
@ThorstenMueller 1 year ago
This is already work in progress 😉.
@Fadelabdalkhalik 1 year ago
Thank you so very much. I use your voice to learn German.
@ThorstenMueller 1 year ago
Thank you very much for the comment, that makes me very happy 😊. I wish you lots of success with your learning.
@juergenmarsch7198 1 year ago
Second attempt at commenting.
I like the video very much. What I don't like is the fact that my first comment disappeared.
Before typing a lot of text for nothing, I'll try with this short one.
@ThorstenMueller 1 year ago
Hi, thanks for your comment. I see a second and longer XTTS-related comment by you on this video: ua-cam.com/video/9e1Wt82GKmU/v-deo.html&lc=UgzvCm2Ch50-OPeExI94AaABAg
Maybe you posted the comment on the wrong video 😉?
@unlovabledeadsquirrel 1 year ago
The idea of getting to know the people behind these awesome technologies is exciting, so this format is definitely a keeper, even if only done on an irregular basis.
It would also be interesting to hear more about the history and development of text-to-speech technology. Maybe you could try to get hold of someone who was involved in developing the SAM voice on the Commodore 64.
@ThorstenMueller 1 year ago
Thanks for your nice feedback 😊. I already have a special guest in mind for next time.
Not sure if I can get in contact with these pioneers of voice tech (like on the C64).
@poly06033 1 year ago
❤
@ThorstenMueller 1 year ago
❤️
