- 152
- 705 216
Thorsten-Voice
Germany
Joined 12 Nov 2013
Guude! (hi, nice to see you) 👋,
I'm Thorsten 😊.
Do you like open-source, privacy-aware, locally running voice technology? Me too 😎. You'll find recipe-style tutorials on TTS, STT, voice assistants, AI, ML and way more cool stuff here. So hop on and join my amazing community 🥰.
#opensource #voice #cloning #technology #news #tutorial #local #privacy #tech #tts #stt #voiceassistant #raspberrypi #smarthome #homeassistant
* My project website: www.Thorsten-Voice.de
* Me on GitHub: github.com/thorstenMueller
Your AI Voice Sounds WRONG! Here's Why 🤖 → 🗣️
Learn how to dramatically improve your AI text-to-speech output through proper text cleaning and normalization techniques. In this tutorial, I'll show you:
✓ Common text issues that ruin TTS quality
✓ Step-by-step text cleaning process
✓ How to handle numbers, abbreviations, and special characters
✓ Universal techniques that work with any TTS engine
Whether you're using commercial or open-source TTS solutions, these text preprocessing steps will help you achieve more natural-sounding speech output. I will use @NVIDIA NeMo for text cleaning / normalization as one possibility.
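As a rough illustration of the cleaning steps listed above, here is a minimal Python sketch that uses only the standard library. The lookup tables and function names (`ABBREVIATIONS`, `SYMBOLS`, `number_to_words`, `normalize`) are made up for this example and are not taken from the video, the Colab notebook or NeMo. A real pipeline would more likely rely on num2words or NVIDIA NeMo text processing.

```python
import re

# Hypothetical lookup tables, for illustration only.
ABBREVIATIONS = {"Dr.": "Doctor", "e.g.": "for example", "approx.": "approximately"}
SYMBOLS = {"%": " percent", "&": " and "}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    return ONES[hundreds] + " hundred" + (" " + number_to_words(rest) if rest else "")


def _spell_match(match: re.Match) -> str:
    """Replace a run of digits with words; leave numbers >= 1000 untouched."""
    value = int(match.group())
    return number_to_words(value) if value < 1000 else match.group()


def normalize(text: str) -> str:
    """Expand abbreviations, symbols and small numbers, then tidy whitespace."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    for symbol, word in SYMBOLS.items():
        text = text.replace(symbol, word)
    text = re.sub(r"\d+", _spell_match, text)
    return re.sub(r"\s+", " ", text).strip()


print(normalize("Dr. Smith paid 42 % & left."))
# → Doctor Smith paid forty-two percent and left.
```

Plain string replacement like this is deliberately naive: it has no notion of context, which is exactly where dedicated normalization libraries earn their keep.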
#TextToSpeech #TTS #AIVoice #Tutorial #VoiceAI
00:00 Intro & samples / goals
01:32 What to achieve by tutorial end
03:13 What is the actual problem?
04:26 Showing the cleaned / normalized text
06:10 Options for text cleaning (num2words, Huggingface, NVIDIA NeMo)
10:35 Google colab for text cleaning
15:30 Bring it together (voice processing pipeline)
* Thorsten-Voice Blogpost: www.thorsten-voice.de/2025/01/09/your-ai-voice-sounds-wrong-heres-why-%f0%9f%a4%96-%e2%86%92-%f0%9f%97%a3%ef%b8%8f/
* Link to Google Colab Notebook: colab.research.google.com/drive/1mGskJoVRr-QjKvV5ipm-k9LcdUqjFzlv?usp=sharing
* num2words: pypi.org/project/num2words/
* Huggingface: huggingface.co/docs/tokenizers/api/normalizers
* German language only: github.com/repodiac/german_transliterate
* NVIDIA NeMo text cleaning: github.com/NVIDIA/NeMo-text-processing
Please subscribe to my channel 😊.
ua-cam.com/users/ThorstenMueller
---
- www.Thorsten-Voice.de
- github.com/thorstenMueller/Thorsten-Voice/
Views: 562
Videos
🎙️ Home Assistant Voice Preview Edition (VPE) #03 | Local Setup with Whisper & Piper 🗣️
1.2K views, 21 days ago
Welcome to our new series about Home Assistant Voice! In this episode, we'll set up the device for local voice processing using whisper (STT) and piper (TTS). 📋 What we'll cover: - Install / configure OpenAI whisper for speech recognition - Install / configure Piper TTS for speech synthesis - Add Wyoming protocol - Configure voice assistant - Turning on/off entities with local voice control ⚡ Dev...
🎙️ Home Assistant Voice Preview Edition (VPE) #02 | First Setup & Connection 🔌
485 views, 21 days ago
After unboxing Home Assistant Voice in the previous episode, let's get this device up and running! In this video, we'll go through the initial setup process and connect the device to your Home Assistant installation. 📋 What we'll cover: - Powering on the device for the first time - Connecting to Home Assistant - Exploring created entities - Demo entities overview ⚡ Using Home Assistant version:...
🎙️ Home Assistant Voice Preview Edition (VPE) #01 | Unboxing & Tech Specs 📦
371 views, 21 days ago
Welcome to our new series about Home Assistant Voice! In this first episode, we'll unbox this exciting new device and take a detailed look at its technical specifications. 📋 What we'll cover: - Complete unboxing experience - Box contents overview - Hardware specifications - Quick look at documentation - Preview of upcoming episodes ⚡ Device Details: - Home Assistant Voice Preview Edition - Rele...
F5 Text to Speech Tutorial | Hit "Refresh" on Your AI Voice!
6K views, 2 months ago
🔥🔥🔥 Impressive voice cloning with F5 TTS! Clone your voice with a few seconds audio data for your personal AI voice. Step-by-step tutorial For comparison reason - here's my computer spec: * CPU: 4x Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz * RAM: 16GB * GPU: NVIDIA GeForce GTX 1050 Ti Based on some comments you might want to watch it on 1.5x speed 😁. Thanks to @kardiokode-g8v for pointing out lic...
3 steps to run HuggingFace 🤗 "Parler TTS" AI Voice on your local machine
7K views, 2 months ago
How to run "Parler TTS" from @HuggingFace on your local machine in 3 simple steps (using python code)! Including audio samples. #python #parler #tts #huggingface 00:00 Intro 02:22 Parler TTS Github repo 03:10 Dataset basis for Parler TTS 05:40 Huggingface space to try it out 06:20 Set up Python venv for Parler TTS & Install 09:47 Using python script to synthesize audio 14:45 Synthesizing audio ...
Best AI Voice Generator | 2024.08
20K views, 4 months ago
Free #TTS with #Mars5 #Parler #MetaVoice #Toucan and #ChatTTS. First look and comparison video on voice cloning and more. Thanks to you great #opensource text to speech projects and @HuggingFace for providing cool spaces to play around with 🤗. And thank you "VB" for pointing to these cool projects on LinkedIn 👏: www.linkedin.com/posts/vaibhavs10_text-to-speech-ecosystem-has-been-booming-activit...
Automate Voice Dataset Creation Using Whisper AI
2.1K views, 6 months ago
Easy tutorial on creating a structured voice dataset on raw audio data using Python and Whisper by OpenAI for speech recognition. #ai #whisper #tts #voice #data #python 00:00 Intro 01:10 Set up python virtual environment 03:00 Working with "the magic" script :) 07:00 Run voice dataset generation with Whisper AI STT 07:58 Checking results 09:45 Outro * github.com/thorstenMueller/Audio-to-Voice-D...
TTS Voice Dataset | LJSpeech | Voice Cloning
2.8K views, 6 months ago
A close look at the LJSpeech voice dataset and its structure for TTS voice cloning. The LJSpeech voice dataset is widely supported by TTS voice cloning software. This video describes the structure and how you can create it for your personal voice clone. 00:00 Intro 02:23 LJSpeech info and download 04:15 LJSpeech in research (Google Scholar) 05:17 Close look to the voice dataset file structure 06:25 ...
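For orientation, the LJSpeech layout boils down to a `wavs/` folder plus a pipe-separated `metadata.csv` whose rows are `id|transcription|normalized transcription`. The following Python sketch writes such a file. The function name `write_ljspeech_metadata` and the sample clip IDs are invented for illustration, and the clip text is placeholder data rather than anything from the video.

```python
import tempfile
from pathlib import Path


def write_ljspeech_metadata(dataset_dir, entries):
    """Write an LJSpeech-style metadata.csv.

    entries: iterable of (clip_id, transcription, normalized_transcription).
    Each clip_id is expected to match a file wavs/<clip_id>.wav.
    """
    root = Path(dataset_dir)
    (root / "wavs").mkdir(parents=True, exist_ok=True)
    lines = []
    for clip_id, text, normalized in entries:
        # "|" is the field separator, so it must not occur inside a field.
        if "|" in text or "|" in normalized:
            raise ValueError(f"'|' is not allowed in the text of {clip_id}")
        lines.append(f"{clip_id}|{text}|{normalized}")
    metadata_path = root / "metadata.csv"
    metadata_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return metadata_path


# Demo with placeholder data in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_ljspeech_metadata(tmp, [
        ("myvoice_0001", "I have 2 dogs.", "I have two dogs."),
        ("myvoice_0002", "Hello there.", "Hello there."),
    ])
    print(path.read_text(encoding="utf-8"))
```

The third column is where normalized text (numbers and abbreviations spelled out) goes, which is what most training recipes actually read.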
Unlock AI Superpowers with NVIDIA CUDA: Boost Performance in Python!
1.6K views, 6 months ago
Boost your AI performance by using NVIDIA CUDA on Windows. Step by step tutorial on how to use CUDA with Python / pytorch and performance comparison with Coqui TTS. #performance #nvidia #python #ai #machinelearning #tts Please subscribe to my channel 😊. ua-cam.com/users/ThorstenMueller Thanks dear @MightyReiti for your inspiration and support on my new recording setup ❤️. 00:00 Intro 01:55 What...
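A quick way to check whether Python actually sees the GPU is sketched below. It assumes PyTorch is installed (ideally a CUDA-enabled build) and falls back to "cpu" otherwise. `pick_device` is a hypothetical helper name, while `torch.cuda.is_available()` is the regular PyTorch call.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable CUDA GPU, else "cpu"."""
    try:
        import torch  # third-party package; assumed to be installed
    except ImportError:
        # Without PyTorch there is nothing to accelerate, so report CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Running on: {pick_device()}")
```

If this prints "cpu" on a machine with an NVIDIA card, the usual suspects are a CPU-only PyTorch build or a driver mismatch.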
Home Assistant ❤️ Voice - Tutorial 05 - Wyoming protocol
5K views, 10 months ago
Home Assistant ❤️ Voice - Tutorial 04 - Piper TTS
8K views, 10 months ago
Home Assistant ❤️ Voice - Tutorial 03 - Conversation / NLP
1.6K views, 10 months ago
Home Assistant ❤️ Voice - Tutorial 02 - Text Assist
2K views, 10 months ago
Home Assistant ❤️ Voice - Tutorial 01 - Basic setup & demo entities
4.5K views, 10 months ago
Running a local Piper TTS server with Python on Linux
7K views, 10 months ago
🔥 Voice interview Michael Hansen | HA | Raspberry | Piper | Rhasspy
2.2K views, 11 months ago
Local voice cloning with 6 seconds audio | Coqui XTTS on Windows
47K views, 1 year ago
🇩🇪 Künstliche Sprachausgabe uff Hessisch | Kostenlos und OHNE CLOUD !
1.1K views, 1 year ago
TEXT TO SPEECH | Piper TTS on Windows 🚀 AI voice 10x faster Realtime!
30K views, 1 year ago
XTTS FAQ | Interview with Josh Meyer from Coqui AI
2.3K views, 1 year ago
Python virtual environment / venv | Windows, Linux & Mac OS X
3.3K views, 1 year ago
Free voice recording for BEST voice cloning | Piper-Recording-Studio | Windows
10K views, 1 year ago
Is Mycroft Mark 2 the better Alexa?! | Private | Voice Assistant
3.8K views, 1 year ago
Create your AI digital voice clone locally with Piper TTS | Tutorial
53K views, 1 year ago
Increase Text to Speech pronunciation quality with eSpeak | Tutorial
13K views, 1 year ago
Talk locally (no ChatGPT) with your documents 😄 | PrivateGPT + Whisper + Coqui TTS
6K views, 1 year ago
Raspberry Pi | Local TTS | High Quality | Faster Realtime with Piper TTS
33K views, 1 year ago
Thorsten-Voice TTS in Windows nutzen | DDC / VITS
6K views, 1 year ago
Thorsten-Voice TTS in Linux nutzen | DDC / VITS / Piper
3.5K views, 1 year ago
Couldn't follow you when you were showing code in Colab - was busy gazing at the dogs running on top
Omg dude, I really appreciate your work, and the voice of my AI powered cohost will be amazing because of it, I am very grateful
Hello, Thorsten. I've noticed that YouTube has been deleting my comments everywhere. It's really frustrating because your channel is my main source for TTS tutorials, and I can't participate in the discussions, whether it's to compliment your work, thank you, or ask questions.
Great content! I teach a class locally on getting better results from ChatGPT, and normalizing inputs makes a huge difference, especially with RAG.
Thanks Thorsten! Good normalisation makes a huge difference and there are plenty of subtle edge cases. It's interesting that even the big companies mess this up at times: in London Google Maps' TTS incorrectly transforms some bus-stop letter codes into expanded abbreviations. So "Stop LT" becomes "Stop Lieutenant"!
I appreciate your work on this, but this is at the Commodore 64 level of computer programming. A tremendous amount of manual labor is required by every individual end user. This approach is for computer nerds, not ordinary computer users who just would like their normal text read properly.
Clearly this is for technical types working on TTS, not average consumers - your comment implies you think this is a huge revelation 👏😆
This is barely a step above using “find & replace” in Word. If it’s above your pay grade, it’s a good opportunity to learn. These principles apply to using AI, too, because you get much better response if you sanitize your inputs to something it prefers, e.g., using markdown files vs PDFs.
Honestly, these are problems that should actually be solved with proper computer programming. We are supposed to be in an age of AI, but so far it's very weak, extremely poor or totally awful. You are only working in one direction, text to speech, but the problems are equally massive with speech to text. I'm exclusively a Linux user, and Linux has been way behind in this area. Only recently have I been able to get decent text to speech out of my computer using a program called "Speech Note".
I was going to say this: why does it need a GPU? TTS engines can solve this issue themselves programmatically while reading stdin. Btw, I solved this type of issue using a bash script and, in some cases, C/C++.
Thank you!!!!
I don't really understand this, but I just want to ask: is it possible to make a voice model for the Indonesian language using Piper?
Yes, this should be possible. Do you know my tutorial on creating a new Piper TTS voice model? ua-cam.com/video/b_we_jma220/v-deo.htmlsi=2FuUErT2fofm2iel
Dude, you rule so hard 🤘I have spent well over 12 hours binging your stuff in the last week, on repeat, and I am incredibly excited to have faster-whisper playing nice with piper! Now to give it a custom voice and a mind...
Thanks for your nice feedback 😊.
What are the system requirements to run it locally?
Apart from python there is no big dependency. But GPU (NVIDIA CUDA) is recommended for performance reasons. But it should run even on CPU (but way slower).
Hey man! I'm on an older machine, a Dell OptiPlex 780. Can I run this on it? The newest version of TTS is 0.22 (if I'm not wrong), and I tried for several hours, but it didn't work, so I quit. But I really need it for my YouTube video voice-overs. Can you help me? And a short yes-or-no answer on using this as a voice-over for YouTube, please! (Even though it's open source, I really want to make sure.)
Do you get an error when installing/using Coqui TTS?
Hello, is there any way to use the model directly in Python, like XTTS v2? Gratefully
Didn't try this myself yet. But as the codebase is pure Python, I "guess" this should be achievable.
In English, V is pronounced like a German W, and an English W like a German u
Thanks for the hint. I even picked it up in my latest video 😉. ua-cam.com/video/-99WPCIlq-s/v-deo.html
I couldn't resist. 😂
🤣 Have you been able to break the browser refresh loop yet 😉?
Can I build my own assistant with a special wake word and a special interface using Rhasspy?
Thanks for your question, i guess yes, but i am not really sure. Maybe you can ask this question on their community to get a more helpful answer.
@ Thanks!
I was giving Piper a try for one of my projects when I stumbled upon your channel. It was a pure joy to watch your excitement. 🙂
Thanks for your nice comment, happy you enjoyed watching it 😊. Yes, I really like this niche of technology and am enthusiastic about it.
Amusingly, when training these on a 4090 the checkpoints blow by so fast you cannot actually load and test them. By the time it tried, the checkpoint it found using *.ckpt was already 3 versions behind and long gone.
Thorsten, installing it all with Pinokio is so much better. Installing everything with Pinokio in the AI world is so much better.
Thanks for your great hint 👍🏼. I've heard about Pinokio a few times but haven't given it a try yet. But this might be a good idea 😊.
Excellent content, I will give it a try :)
Hi! Thank you for your videos. Can you please tell me in your opinion the best TTS software available for meditation and also personal trainer voices in each gender. I can do some fine tuning but preferably looking for a Simple and cheap option. Any thoughts?
Hi, you're welcome 😊. I'd recommend listening to some Piper or Coqui TTS voice samples. There should be voices in multiple genders available (at least in English). Maybe there's a voice available that fits your needs (you might have to check each voice model's individual license).
Thank you so much for your video; it’s truly valuable to me. Oh, and I really like your teaching style in the video as well. 😁😁
Thank you a lot, dear LearnOpsViet for your kind feedback - happy you like it 😊.
hello sir
Hello :)
Sadly, none of the reviewed models and frameworks work locally. I have an Nvidia 2080 and tried the frameworks on Ubuntu. All of the frameworks have very poor documentation. Toucan has an issue with the code that is meant to execute finetuning and kept coming up with division-by-zero errors (I think it has a lower limit on the number of samples, but that is not mentioned anywhere in the docs). Mars5 needs more than 16 GB VRAM (also not mentioned anywhere). ChatTTS does NOT support finetuning and needs training from scratch. And MetaVoice has a stated 12 GB VRAM requirement, which meant I did not even try.
Do you know Piper TTS? I made some tutorials on it. It runs locally, is really performant (even on a raspberry pi 4) and offers voice cloning.
DISREGARD: I got a fork and was able to install it! Thank you! I tried to install coqui TTS to Python (version 3.11.1) using VS Code, but receive an error message: "ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (TTS)". Please help! Thank you for the video. Best.
Did you update pip before? So running "pip install pip -U"? As coqui tts is not maintained any more i am not sure if python 3.11 is fully supported.
Can you tell me whether we can use this for data augmentation of low-resource languages like `URDU`? Non-English or Asian languages, like Urdu or Hindi? As Urdu is an under-resourced language, I want to create a very large dataset for it.
This is pure joy! Do play around with other voices, there are some really good ones.
Thanks for your nice feedback 😊. I agree, there are really good voices available.
Which ones can we use with Swift CoreML? Is it possible to make them run locally in Swift?
I had to lookup Swift CoreML because i didn't know about it (yet) ;-).
@@ThorstenMueller oh ok 😀 let me know please
It would have been nice to see an example of how it works. Does it respond well? How fast does it respond? There really wasn't anything in the video showing how well it works.
Do you know my video on listening to the samples? ua-cam.com/video/HojuVmW5LUI/v-deo.html
Yeah would be interesting to see if it can be used with other software as well
I agree, thanks for your feedback 😊.
Do you have a specific software in mind?
@@ThorstenMueller if it can be used just as an external regular speaker or USB microphone in any typical audio software.
Hello, I want to make a text-to-sound conversion model for Farsi, which videos should I watch now, where can I contact you?
Oh, I found your LinkedIn
@@font_net I've received your message. But it takes some time to respond due to some other topics ;-).
What is the device with the microphone, button and round LED light on your table? Can you please give some more details?
That is what the video is about. Watch his previous videos.
@DichtMe thanks a lot. Will do
New nvidia minicomputer (jetson if I'm not mistaken) which was released just recently is a good replacement for regular bulky PC with full sized GPU card.
Can I use it for German?
Hello, IMHO currently not. But you can use my german Thorsten-Voice in Piper or Coqui 😉 (thorsten-voice.de/).
Which languages are supported, and can you give me information about model support for Arabic?
@ThorstenMueller I am facing this error please help me: PS C:\Users\Apple Compter\tts> pip install TTS--0.22.0 Defaulting to user installation because normal site-packages is not writeable ERROR: Could not find a version that satisfies the requirement TTS--0.22.0 (from versions: none) ERROR: No matching distribution found for TTS--0.22.0
@Thorsten-Voice I am installing the latest TTS, but it does not work for me. Please help...
Good question. Do you run it with admin privileges? Is free diskspace available? Just thinking because of this message "installation because normal site-packages is not writeable".
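One more thing worth checking in the command quoted above: pip pins a version with `==`, so `TTS--0.22.0` is read as a package name that does not exist, which would fit the "from versions: none" part of the error. The small Python sketch below only builds the corrected command. `pip_install_command` is a hypothetical helper, and whether TTS 0.22.0 then installs cleanly on a given Python version is not something this snippet can tell.

```python
import sys


def pip_install_command(package: str, version: str) -> list:
    """Build a pip command that pins an exact version using '=='."""
    # Using "python -m pip" targets the interpreter that runs this script.
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


command = pip_install_command("TTS", "0.22.0")
print(" ".join(command))
# To actually run it: import subprocess; subprocess.run(command, check=True)
```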
is there any way to run this in python code?
That's an interesting question that I've thought about too. But the last time I looked at it, there was just an early codebase for Python integration. According to this (github.com/rhasspy/piper/tree/master/src/python_run), there are no recent updates on that.
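Until a Python API settles, one workaround is to call the Piper command-line tool from Python. The sketch below assumes a `piper` executable on the PATH and a downloaded voice model. The helper names and the file names are placeholders, while `--model`, `--output_file` and `--speaker` are the flags documented in the Piper README, and the text is passed via stdin.

```python
import subprocess


def build_piper_command(model_path, output_wav, speaker=None):
    """Assemble the Piper CLI call; speaker applies to multi-speaker models."""
    command = ["piper", "--model", model_path, "--output_file", output_wav]
    if speaker is not None:
        command += ["--speaker", str(speaker)]
    return command


def synthesize(text, model_path, output_wav, speaker=None):
    """Send text to Piper via stdin and write the audio to output_wav."""
    subprocess.run(
        build_piper_command(model_path, output_wav, speaker),
        input=text.encode("utf-8"),
        check=True,  # raise if Piper exits with an error
    )


# Placeholder file names; only the command is printed, Piper is not started.
print(build_piper_command("voice.onnx", "welcome.wav"))
```

Calling `synthesize("Guude!", "voice.onnx", "welcome.wav")` would then produce the WAV file, provided the executable and model are actually present.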
Coqui appears to have folded now. Confusingly there is a community run fork that is sorted but its docs look very similar to the original.
Coqui already shut down at the beginning of 2024, and IMHO the code in the original repo is not maintained any more. I heard about a fork too but haven't had time to give it a try.
@@ThorstenMueller the license is garbage and prevents any further interest... why should anyone keep developing it if he cannot use it for further commercial projects...
I love you
Hello Thorsten, I have heard that some languages in Piper TTS sound pretty bad, for example the Swedish model. When you train a new voice by fine-tuning from the existing checkpoint model, it supposedly sounds quite bad. Is that true? The default Swedish NST voice sounds very monotone, but when you fine-tune from it, will it sound like me, or will it sound different, just with the pronunciation errors? And when you train from scratch, how many hours of speech do you need? I have an RTX 4060 16 GB card, so is that good for AI training? Also, do I need to set up Linux and Windows at the same time and fiddle around with complicated stuff? It's just easier to have a Windows setup and not worry about Windows Subsystem for Linux, so can I just do it with a command?
Hello, I have only trained my German "Thorsten-Voice" Piper TTS voice, so I have no experience with other languages, their quality, or how much training material they need. I used multiple hours of audio (around 10 for finetuning my Piper model), but I also played around with just 1000 phrases, and these worked too. It's a bit of trial and error.
It is a nice and useful video. Thank you. I am looking at various options right now.
Thanks for your nice comment 😊.
The voice quality is OK, but not great. Did you ever figure out a way to make it better?
Not in XTTS, but (just in case you're looking for an English solution) do you know my F5 tutorial? ua-cam.com/video/ASFoTNpkM8o/v-deo.htmlsi=gyYl6R8W1xuKoZZM
What's the best TTS for use in an Apple and Android app locally (ie no server connecting)?
That's a good question. Honestly i have not taken a closer look to tts on smartphones so i can't tell you (yet).
Wow, that's great, thanks for showing this! Subscribed :)
Thanks for your nice feedback and welcome 😊.
Hello Thorsten, can you have a check and review of PopPop AI text to speech?
Thanks for your topic suggestion 😊. I've added it to my todo list.
Tried getting this to work on my own and couldn't. Came here, watched this twice, and it's up and running. Thank you @ThorstenMueller
Thanks for your nice feedback, happy you got it working 😊.
can piper switch between voices? previously i have used mimic3 server and requested texts with different specific voices, can piper do the same or is it limited to the voice you start server with? Nevermind, i've found where i can make a few adjustments to the script to pass in a speaker together with a text.
AFAIK, SSML (which you are looking for) is not yet supported in Piper. Mimic3 was able to do it. As the developer (Mike) is the same for both projects, I am optimistic that SSML will come to Piper sometime in the future.
Okay, very nice. It works in the Piper add-on. In an automation, a female voice says: "set public URL in configuration". Does something else need to be added to the Config.yaml?
I'm not sure right now whether Piper TTS voices can be used in automations. Good question, but I'd have to test that myself first.
As a Ghanaian i'm happy to hear TTS in Twi and Ewe and Hausa
More effort for underrepresented languages is really important to provide open voice technology for everybody. Happy you found a matching tts voice 😊.
I tried this morning and the cloned voices are the best I have ever used. I wonder if I can use the cloned voices in some way with Home Assistant through, I don't know, maybe Piper? I can't find out whether this is possible with this software. Is it TTS only? Is it possible to synthesize a dataset with this? Thanks
AFAIK you can use Piper TTS voices in Home Assistant. But for this you have to record way more audio data to train/finetune a Piper TTS model. Do you know my video about Piper TTS voice cloning? ua-cam.com/video/b_we_jma220/v-deo.html