I tried that on an Arabic dataset and it didn't work. I tried increasing the steps to 5,000. Still didn't work. Any advice?
Hello Abdelrahman. Can you share the code and the dataset with me? I guess the alphabet must be the problem here. We need to define a function to convert it into the English alphabet.
We solved the problem with Abdelrahman. Indeed, if you're working with a language whose alphabet differs from English, you should convert it to the English alphabet.
Example: convert السلام عليكم to alsalam alekum.
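A minimal sketch of what that conversion could look like (the tiny character map here is just for illustration; a full romanization scheme such as Buckwalter covers the whole alphabet and diacritics):

# Illustrative Arabic-to-Latin character map; not a complete romanization.
ARABIC_TO_LATIN = {
    "ا": "a", "ل": "l", "س": "s", "م": "m",
    "ع": "e", "ي": "y", "ك": "k", " ": " ",
}

def romanize(text):
    # Characters without a mapping are dropped; a real scheme would cover them.
    return "".join(ARABIC_TO_LATIN.get(ch, "") for ch in text)

print(romanize("السلام عليكم"))  # -> roughly "alslam elykm"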
Yes, the problem is that the model's tokenizer can't understand anything other than English letters. Thus, the training data should be "audio + text converted to the English alphabet" (transliterated, not translated). It should work fine right after that.
@@abdelrahmanmohsen6393 did you find a solution, bro?
I am working on the same problem. I already converted from the Arabic alphabet to English, but the output is not clear at all. Are there any suggestions to solve this issue?
Solid explanations, learned a lot! Thanks!
Glad you liked it :) 🥰
Congratulations, your work will be a light for those who aim to make progress in this direction. I wish you continued success. May your path be clear, Emirhan.
Thank you so much!
The great Turkish robot from Mardin teaches us how to fine-tune itself. AI is really something else.
trrrrum,
trrrrum,
trrrrum!
trak tiki tak!
Thanks to YouTube I've seen this video; hope Mr. Bilgiç will bless us with new videos.
Thank you Hüseyin 😄
This video is helpful for people who want to understand text-to-speech (TTS) and how to make it better. Emirhan, who made the video, knows a lot about it, and the part where he shows how to write the code is useful even though I don't know much about coding.
Thank you for the support Carlos 🙂
Very successful. I subscribed to your channel. You deserve more followers, but for that, I think you need to produce a little more content. 💪 Congratulations...
Thank you very much :)
Congratulations Emirhan, I wish you continued success 😊
Looking solid! Congrats Emirhan.
Thank you so much Zaur!
Thank you for the great explanation!❤️💯
Thank you Yunus :)
Congratulations my friend, it's a very clean and explanatory video 💯
Thanks for the support :) I could make something more detailed if there's interest.
Congrats brother 👏
Thank you so much 😊
Great, make more videos on TTS, voice cloning, multilingual TTS
Thank you! Will try :)
Great work, it turned out super. I wish you success 🤝
Thank you very much :)
Congratulations son, it turned out very nice ❤
Thank you, Dad ☺❤
Congratulations, I wish you continued success
I wish you continued success, endless success
Thank you very much :)
best Indian YouTuber so far ✋🏻 no cap 🧢
Thank you but I am not Indian 😄
This isn't my field, I saw it on Twitter and thought I'd take a look, and this comment cracked me up 😂 @@emirhanbilgicai
Hi Emirhan, I am one of your new viewers. I am currently learning machine learning, and now I have to fine-tune a TTS model for interview-related technical words like OAuth, API, etc. Can you help me with it, or can we connect personally? That project is really important to me.
Hey! I can give you some tips if you share the details
Best wishes. Congratulations
Thanks :)
Hi, is it possible to train the model in English with only certain words that it's currently pronouncing incorrectly?
Hello, if you mean the abbreviations, or something else, you can define a custom function to handle that case like this:

import re
from string import punctuation

# number_normalizer comes from the Parler-TTS demo app linked below;
# a pass-through stub is enough to illustrate the abbreviation handling.
def number_normalizer(text):
    return text

def preprocess(text):
    text = number_normalizer(text).strip()
    text = text.replace("-", " ")
    if text[-1] not in punctuation:
        text = f"{text}."
    abbreviations_pattern = r'\b[A-Z][A-Z\.]+\b'

    def separate_abb(chunk):
        # "A.P.I" -> "API" -> "A P I", so the model spells it out
        chunk = chunk.replace(".", "")
        return " ".join(chunk)

    abbreviations = re.findall(abbreviations_pattern, text)
    for abv in abbreviations:
        if abv in text:
            text = text.replace(abv, separate_abb(abv))
    return text
I took it from: huggingface.co/spaces/parler-tts/parler_tts/blob/main/app.py
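For illustration, with the pass-through stub above, the function spells out all-caps abbreviations letter by letter:

print(preprocess("Use the API"))  # -> "Use the A P I."

Note that mixed-case tokens like "OAuth" won't match the all-caps pattern, so they'd need a rule of their own.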
Even if you don't handle it with an additional function, you can get there by providing enough samples (more than a thousand) to the model.
For contact and everything: emirhanbilgic.github.io
Hello, my model is generating speech, but it's only producing about two words and cutting off after approximately 0.1 seconds. Do you have any advice or help? Is there a Discord where I can reach you?
Hello, this could be due to three reasons:
1. Your individual data samples are small, such as having only two words per sample, making it difficult for the model to learn longer sequences.
2. Your dataset is small, for example only containing 300 sentences. I recommend increasing the size of your dataset.
3. The model hasn't been trained enough, or you may need to experiment with different hyperparameters.
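A quick sanity check on the first two points, as a minimal sketch (assuming a Hugging Face dataset with "audio" and "text" columns; the dataset name is a placeholder):

from datasets import load_dataset

# Placeholder name: replace with your own dataset.
ds = load_dataset("your-username/your-tts-dataset", split="train")

# Audio duration in seconds and word count per sample.
durations = [len(ex["audio"]["array"]) / ex["audio"]["sampling_rate"] for ex in ds]
word_counts = [len(ex["text"].split()) for ex in ds]

print(f"samples: {len(ds)}")
print(f"avg duration: {sum(durations) / len(durations):.1f}s")
print(f"avg words per sample: {sum(word_counts) / len(word_counts):.1f}")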
@@emirhanbilgicai my audio is like 2-10s long
@@emirhanbilgicai if that's the case, will fine-tuning with 20-minute audios produce 10-20 minute long audio?
@@og_23yg54 yes, but it would take ages to train a model on 20-minute samples (with a large enough number of samples)
🧑💻💯
Thank you!
That AI version of Harry Potter is pretty convincing.
Thank you 😅
🤣
At least add Turkish subtitles lol
It takes way too long :((