Thorsten-Voice
Your AI Voice Sounds WRONG! Here's Why 🤖 → 🗣️
Learn how to dramatically improve your AI text-to-speech output through proper text cleaning and normalization techniques. In this tutorial, I'll show you:
✓ Common text issues that ruin TTS quality
✓ Step-by-step text cleaning process
✓ How to handle numbers, abbreviations, and special characters
✓ Universal techniques that work with any TTS engine
Whether you're using commercial or open-source TTS solutions, these text preprocessing steps will help you achieve more natural-sounding speech output. I will use @NVIDIA NeMo for text cleaning / normalization as one possibility.
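To make the idea concrete, here is a minimal rule-based sketch of the kind of cleaning described above. It is not the NeMo approach from the video; the abbreviation and symbol tables and the function names `number_to_words` and `clean_text` are illustrative assumptions, and only integers from 0 to 999 are spelled out.

```python
import re

# Hypothetical lookup tables; extend them for your own texts.
ABBREVIATIONS = {"Dr.": "Doctor", "e.g.": "for example", "approx.": "approximately"}
SYMBOLS = {"%": " percent", "&": " and ", "@": " at "}

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def clean_text(text: str) -> str:
    """Expand abbreviations, symbols and short integers, then tidy whitespace."""
    for abbreviation, expansion in ABBREVIATIONS.items():
        text = text.replace(abbreviation, expansion)
    for symbol, word in SYMBOLS.items():
        text = text.replace(symbol, word)
    # Spell out standalone integers of up to three digits; longer digit runs
    # are left untouched (decimals, dates and currencies need extra rules).
    text = re.sub(r"\b\d{1,3}\b", lambda m: number_to_words(int(m.group())), text)
    return re.sub(r"\s+", " ", text).strip()


print(clean_text("Dr. Smith owns 42 cats & 7 dogs."))
```

A rule table like this breaks down quickly on edge cases, which is the reason to reach for a dedicated normalization library.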
#TextToSpeech #TTS #AIVoice #Tutorial #VoiceAI
00:00 Intro & samples / goals
01:32 What to achieve by the end of the tutorial
03:13 What is the actual problem?
04:26 Showing the cleaned / normalized text
06:10 Options for text cleaning (num2words, Huggingface, NVIDIA NeMo)
10:35 Google Colab for text cleaning
15:30 Bring it together (voice processing pipeline)
* Thorsten-Voice Blogpost: www.thorsten-voice.de/2025/01/09/your-ai-voice-sounds-wrong-heres-why-%f0%9f%a4%96-%e2%86%92-%f0%9f%97%a3%ef%b8%8f/
* Link to Google Colab Notebook: colab.research.google.com/drive/1mGskJoVRr-QjKvV5ipm-k9LcdUqjFzlv?usp=sharing
* num2words: pypi.org/project/num2words/
* Huggingface: huggingface.co/docs/tokenizers/api/normalizers
* German language only: github.com/repodiac/german_transliterate
* NVIDIA NeMo text cleaning: github.com/NVIDIA/NeMo-text-processing
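As a rough sketch of how the NeMo option could be wrapped for a TTS pipeline: the helper name `clean_for_tts` is an assumption for illustration, and the snippet assumes the `nemo_text_processing` package from the repository linked above is installed. Passing your own normalizer object lets you try the wrapper without NeMo.

```python
def clean_for_tts(text, normalizer=None):
    """Normalize text before handing it to a TTS engine.

    `normalizer` is any object with a `normalize(text, verbose=...)` method.
    If none is given, a NeMo text normalizer for English is created.
    """
    if normalizer is None:
        # Imported lazily: the package is large and the first construction
        # of a Normalizer can take a while.
        from nemo_text_processing.text_normalization.normalize import Normalizer
        normalizer = Normalizer(input_case="cased", lang="en")
    return normalizer.normalize(text, verbose=False)
```

In practice you would build the `Normalizer` once and reuse it for every sentence, since creating it on each call is slow.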
Please subscribe to my channel 😊.
ua-cam.com/users/ThorstenMueller
---
- www.Thorsten-Voice.de
- github.com/thorstenMueller/Thorsten-Voice/
Views: 562

Videos

🎙️ Home Assistant Voice Preview Edition (VPE) #03 | Local Setup with Whisper & Piper 🗣️
1.2K views · 21 days ago
Welcome to our new series about Home Assistant Voice! In this episode, we'll set up the device for local voice processing using whisper (STT) and piper (TTS). 📋 What we'll cover: - Install / configure OpenAI whisper for speech recognition - Install / configure Piper TTS for speech synthesis - Add Wyoming protocol - Configure voice assistant - Turning on/off entities with local voice control ⚡ Dev...
🎙️ Home Assistant Voice Preview Edition (VPE) #02 | First Setup & Connection 🔌
485 views · 21 days ago
After unboxing Home Assistant Voice in the previous episode, let's get this device up and running! In this video, we'll go through the initial setup process and connect the device to your Home Assistant installation. 📋 What we'll cover: - Powering on the device for the first time - Connecting to Home Assistant - Exploring created entities - Demo entities overview ⚡ Using Home Assistant version:...
🎙️ Home Assistant Voice Preview Edition (VPE) #01 | Unboxing & Tech Specs 📦
371 views · 21 days ago
Welcome to our new series about Home Assistant Voice! In this first episode, we'll unbox this exciting new device and take a detailed look at its technical specifications. 📋 What we'll cover: - Complete unboxing experience - Box contents overview - Hardware specifications - Quick look at documentation - Preview of upcoming episodes ⚡ Device Details: - Home Assistant Voice Preview Edition - Rele...
F5 Text to Speech Tutorial | Hit "Refresh" on Your AI Voice!
6K views · 2 months ago
🔥🔥🔥 Impressive voice cloning with F5 TTS! Clone your voice with a few seconds audio data for your personal AI voice. Step-by-step tutorial For comparison reason - here's my computer spec: * CPU: 4x Intel(R) Core(TM) i5-3550 CPU @ 3.30GHz * RAM: 16GB * GPU: NVIDIA GeForce GTX 1050 Ti Based on some comments you might want to watch it on 1.5x speed 😁. Thanks to @kardiokode-g8v for pointing out lic...
3 steps to run HuggingFace 🤗 "Parler TTS" AI Voice on your local machine
7K views · 2 months ago
How to run "Parler TTS" from @HuggingFace on your local machine in 3 simple steps (using python code)! Including audio samples. #python #parler #tts #huggingface 00:00 Intro 02:22 Parler TTS Github repo 03:10 Dataset basis for Parler TTS 05:40 Huggingface space to try it out 06:20 Set up Python venv for Parler TTS & Install 09:47 Using python script to synthesize audio 14:45 Synthesizing audio ...
Best AI Voice Generator | 2024.08
20K views · 4 months ago
Free #TTS with #Mars5 #Parler #MetaVoice #Toucan and #ChatTTS. First look and comparison video on voice cloning and more. Thanks to you great #opensource text to speech projects and @HuggingFace for providing cool spaces to play around with 🤗. And thank you "VB" for pointing to these cool projects on LinkedIn 👏: www.linkedin.com/posts/vaibhavs10_text-to-speech-ecosystem-has-been-booming-activit...
Automate Voice Dataset Creation Using Whisper AI
2.1K views · 6 months ago
Easy tutorial on creating a structured voice dataset on raw audio data using Python and Whisper by OpenAI for speech recognition. #ai #whisper #tts #voice #data #python 00:00 Intro 01:10 Set up python virtual environment 03:00 Working with "the magic" script :) 07:00 Run voice dataset generation with Whisper AI STT 07:58 Checking results 09:45 Outro * github.com/thorstenMueller/Audio-to-Voice-D...
TTS Voice Dataset | LJSpeech | Voice Cloning
2.8K views · 6 months ago
A close look at the LJSpeech voice dataset and its structure for TTS voice cloning. The LJSpeech voice dataset is widely supported by TTS voice cloning software. The video describes the structure and how you can create it for your personal voice clone. 00:00 Intro 02:23 LJSpeech info and download 04:15 LJSpeech in research (Google Scholar) 05:17 Close look to the voice dataset file structure 06:25 ...
Unlock AI Superpowers with NVIDIA CUDA: Boost Performance in Python!
1.6K views · 6 months ago
Boost your AI performance by using NVIDIA CUDA on Windows. Step by step tutorial on how to use CUDA with Python / pytorch and performance comparison with Coqui TTS. #performance #nvidia #python #ai #machinelearning #tts Please subscribe to my channel 😊. ua-cam.com/users/ThorstenMueller Thanks dear @MightyReiti for your inspiration and support on my new recording setup ❤️. 00:00 Intro 01:55 What...
Home Assistant ❤️ Voice - Tutorial 05 - Wyoming protocol
5K views · 10 months ago
Home Assistant ❤️ Voice - Tutorial 04 - Piper TTS
8K views · 10 months ago
Home Assistant ❤️ Voice - Tutorial 03 - Conversation / NLP
1.6K views · 10 months ago
Home Assistant ❤️ Voice - Tutorial 02 - Text Assist
2K views · 10 months ago
Home Assistant ❤️ Voice - Tutorial 01 - Basic setup & demo entities
4.5K views · 10 months ago
Running a local Piper TTS server with Python on Linux
7K views · 10 months ago
🔥 Voice interview Michael Hansen | HA | Raspberry | Piper | Rhasspy
2.2K views · 11 months ago
Local voice cloning with 6 seconds audio | Coqui XTTS on Windows
47K views · 1 year ago
🇩🇪 Künstliche Sprachausgabe uff Hessisch | Kostenlos und OHNE CLOUD !
1.1K views · 1 year ago
TEXT TO SPEECH | Piper TTS on Windows 🚀 AI voice 10x faster Realtime!
30K views · 1 year ago
XTTS FAQ | Interview with Josh Meyer from Coqui AI
2.3K views · 1 year ago
Python virtual environment / venv | Windows, Linux & Mac OS X
3.3K views · 1 year ago
Free voice recording for BEST voice cloning | Piper-Recording-Studio | Windows
10K views · 1 year ago
Is Mycroft Mark 2 the better Alexa?! | Private | Voice Assistant
3.8K views · 1 year ago
Create your AI digital voice clone locally with Piper TTS | Tutorial
53K views · 1 year ago
Increase Text to Speech pronunciation quality with eSpeak | Tutorial
13K views · 1 year ago
Talk locally (no ChatGPT) with your documents 😄 | PrivateGPT + Whisper + Coqui TTS
6K views · 1 year ago
Raspberry Pi | Local TTS | High Quality | Faster Realtime with Piper TTS
33K views · 1 year ago
Thorsten-Voice TTS in Windows nutzen | DDC / VITS
6K views · 1 year ago
Thorsten-Voice TTS in Linux nutzen | DDC / VITS / Piper
3.5K views · 1 year ago

COMMENTS

  • @alx8439
    @alx8439 11 hours ago

    Couldn't follow you when you were showing code in Colab - was busy gazing at the dogs running along the top

  • @gearscodeandfire
    @gearscodeandfire 1 day ago

    Omg dude, I really appreciate your work, and the voice of my AI powered cohost will be amazing because of it, I am very grateful

  • @fabiano8888
    @fabiano8888 1 day ago

    Hello, Thorsten. I've noticed that YouTube has been deleting my comments everywhere. It's really frustrating because your channel is my main source for TTS tutorials, and I can't participate in the discussions, whether it's to compliment your work, thank you, or ask questions.

  • @sblowes
    @sblowes 1 day ago

    Great content! I teach a class locally on getting better results from ChatGPT, and normalizing inputs makes a huge difference, especially with RAG.

  • @nmstoker
    @nmstoker 1 day ago

    Thanks Thorsten! Good normalisation makes a huge difference and there are plenty of subtle edge cases. It's interesting that even the big companies mess this up at times: in London Google Maps' TTS incorrectly transforms some bus-stop letter codes into expanded abbreviations. So "Stop LT" becomes "Stop Lieutenant"!

  • @JoeLinux2000
    @JoeLinux2000 1 day ago

    I appreciate your work on this, but this is at the Commodore 64 level of computer programming. A tremendous amount of manual labor is required by every individual end user. This approach is for computer nerds, not ordinary computer users who just would like their normal text read properly.

    • @nmstoker
      @nmstoker 1 day ago

      Clearly this is for technical types working on TTS, not average consumers - your comment implies you think this is a huge revelation 👏😆

    • @sblowes
      @sblowes 1 day ago

      This is barely a step above using “find & replace” in Word. If it’s above your pay grade, it’s a good opportunity to learn. These principles apply to using AI, too, because you get much better response if you sanitize your inputs to something it prefers, e.g., using markdown files vs PDFs.

  • @JoeLinux2000
    @JoeLinux2000 1 day ago

    Honestly these are problems that should actually be solved with proper computer programming. We are supposed to be in an age of AI, but so far it's very weak, extremely poor or totally awful. You are only working in one direction, text to speech, but the problems are equally massive with speech to text. I'm exclusively a Linux user, and Linux has been way behind in this area. Only recently have I been able to get decent text to speech out of my computer using a program called "Speech Note".

    • @vixxkigoli345
      @vixxkigoli345 1 day ago

      I was going to say this: why does this need a GPU? TTS engines could solve this issue themselves by processing stdin programmatically. By the way, I solved this type of issue using a bash script and, in some cases, C/C++.

  • @s-androjclic
    @s-androjclic 1 day ago

    Thank you!!!!

  • @MuhammadRickyRizaldi
    @MuhammadRickyRizaldi 3 days ago

    I don't really understand this, but I just want to ask: is it possible to make a voice model for the Indonesian language using Piper?

    • @ThorstenMueller
      @ThorstenMueller 13 hours ago

      Yes, this should be possible. Do you know my tutorial on creating a new Piper TTS voice model? ua-cam.com/video/b_we_jma220/v-deo.htmlsi=2FuUErT2fofm2iel

  • @gearscodeandfire
    @gearscodeandfire 4 days ago

    Dude, you rule so hard 🤘 I have spent well over 12 hours binging your stuff in the last week, on repeat, and I am incredibly excited to have faster-whisper playing nice with piper! Now to give it a custom voice and a mind...

  • @BBZ101
    @BBZ101 4 days ago

    what is the system requirement to run it locally

    • @ThorstenMueller
      @ThorstenMueller 13 hours ago

      Apart from python there is no big dependency. But GPU (NVIDIA CUDA) is recommended for performance reasons. But it should run even on CPU (but way slower).
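A small sketch of the GPU-or-CPU choice mentioned in this reply. It assumes the tool in question is PyTorch-based, which the thread does not state; `pick_device` is a hypothetical helper name.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch sees an NVIDIA GPU, otherwise "cpu"."""
    try:
        import torch  # assumption: the TTS tool runs on PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(f"Running on: {pick_device()}")
```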

  • @Shakeel_productions
    @Shakeel_productions 5 days ago

    Hey man! I'm on an older machine, a Dell OptiPlex 780. Can I run this on it? The newest version of TTS is 0.22 (if I'm not wrong); I tried for several hours but it didn't work, so I quit. But I really need it for my YouTube video voice-overs. Can you help me? And a short yes-or-no answer, please: can I use this for YouTube voice-overs? (Even though it's open source, I really want to make sure.)

    • @ThorstenMueller
      @ThorstenMueller 13 hours ago

      Do you get an error when installing or using Coqui TTS?

  • @loyd1298
    @loyd1298 6 days ago

    Hello, is there any way to use the model directly in Python, like XTTS v2? Gratefully

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      I didn't try this myself yet. But as the codebase is pure Python, I "guess" this should be achievable.

  • @MaHo-b4v
    @MaHo-b4v 6 days ago

    In English, V is pronounced like a German W, and an English W like a German u

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      Thanks for the hint. I even picked it up in my latest video 😉. ua-cam.com/video/-99WPCIlq-s/v-deo.html

  • @fabiano8888
    @fabiano8888 6 days ago

    I couldn't resist. 😂

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      🤣 Have you been able to break the browser refresh loop yet 😉?

  • @Qumzer
    @Qumzer 6 days ago

    Can I build my own assistant with a special wake word and a special interface using Rhasspy?

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      Thanks for your question, i guess yes, but i am not really sure. Maybe you can ask this question on their community to get a more helpful answer.

    • @Qumzer
      @Qumzer 14 hours ago

      @ Thanks!

  • @fabiano8888
    @fabiano8888 6 days ago

    I was giving Piper a try for one of my projects when I stumbled upon your channel. It was a pure joy to watch your excitement. 🙂

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      Thanks for your nice comment, happy you enjoyed watching it 😊. Yes, I really like this niche of technology and am enthusiastic about it.

  • @madcatandrew
    @madcatandrew 7 days ago

    Amusingly, when training these on a 4090 the checkpoints blow by so fast you cannot actually load and test them. By the time it tried, the checkpoint it found using *.ckpt was already 3 versions behind and long gone.
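One way around the race described above is to copy the newest checkpoint to a stable location before loading it, and to tolerate files that vanish in between. This is a sketch under assumptions: the `*.ckpt` pattern comes from the comment, while the helper name `snapshot_latest_checkpoint` and the copy-first strategy are illustrative and not part of any trainer.

```python
import shutil
from pathlib import Path


def _mtime(path: Path) -> float:
    """Modification time, or -1.0 if the file was deleted in the meantime."""
    try:
        return path.stat().st_mtime
    except FileNotFoundError:
        return -1.0


def snapshot_latest_checkpoint(ckpt_dir, dest):
    """Copy the newest *.ckpt in `ckpt_dir` to `dest`.

    Returns the path that was copied, or None if no checkpoint survived
    long enough to be copied.
    """
    candidates = sorted(Path(ckpt_dir).glob("*.ckpt"), key=_mtime, reverse=True)
    for checkpoint in candidates:
        try:
            shutil.copy2(checkpoint, dest)
            return checkpoint
        except FileNotFoundError:
            continue  # the trainer removed it already; try the next newest
    return None
```

Testing then runs against the copy at `dest`, so the trainer can keep rotating its own files.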

  • @MarcRitzMD
    @MarcRitzMD 7 days ago

    Thorsten, installing it all with Pinokio is so much better. Installing everything with Pinokio in the AI world is so much better.

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      Thanks for your great hint 👍🏼. I've heard about Pinokio a few times, but didn't give it a try yet. But this might be a good idea 😊.

  • @gillesreyna1272
    @gillesreyna1272 7 days ago

    Excellent content, I will give it a try :)

  • @LifeWithTaranSmith
    @LifeWithTaranSmith 8 days ago

    Hi! Thank you for your videos. Can you please tell me in your opinion the best TTS software available for meditation and also personal trainer voices in each gender. I can do some fine tuning but preferably looking for a Simple and cheap option. Any thoughts?

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      Hi, you're welcome 😊. I'd recommend listening to some Piper or Coqui TTS voice samples. Voices in multiple genders should be available (at least in English). Maybe there's a voice available that fits your needs (you might have to check the voice model's individual license).

  • @LearnOpsViet
    @LearnOpsViet 9 days ago

    Thank you so much for your video; it’s truly valuable to me. Oh, and I really like your teaching style in the video as well. 😁😁

    • @ThorstenMueller
      @ThorstenMueller 8 days ago

      Thank you a lot, dear LearnOpsViet for your kind feedback - happy you like it 😊.

  • @072_tushar
    @072_tushar 10 days ago

    hello sir

  • @nikosterizakis
    @nikosterizakis 10 days ago

    Sadly, none of the reviewed models and frameworks work locally. I have an Nvidia 2080 and tried the frameworks on Ubuntu. All of the frameworks have very poor documentation. Toucan has an issue with the code that is meant to execute finetuning and kept coming up with division by zero errors (I think it has a lower limit on the number of samples, but this is not mentioned anywhere in the docs). Mars5 needs more than 16 GB VRAM (not mentioned anywhere either). ChatTTS does NOT support finetuning, but needs training from scratch, and MetaVoice has a stated 12 GB VRAM requirement, which meant I did not even try.

    • @ThorstenMueller
      @ThorstenMueller 15 hours ago

      Do you know Piper TTS? I made some tutorials on it. It runs locally, is really performant (even on a raspberry pi 4) and offers voice cloning.

  • @aprosimracing
    @aprosimracing 12 days ago

    DISREGARD: I got a fork and was able to install it! Thank you! I tried to install coqui TTS to Python (version 3.11.1) using VS Code, but receive an error message: "ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (TTS)". Please help! Thank you for the video. Best.

    • @ThorstenMueller
      @ThorstenMueller 8 days ago

      Did you update pip before? So running "pip install pip -U"? As coqui tts is not maintained any more i am not sure if python 3.11 is fully supported.

  • @EhsanIrshad
    @EhsanIrshad 12 days ago

    Can you tell me whether we can use this for data augmentation of a low-resource language like `URDU`? Does it work for non-English or Asian languages like Urdu or Hindi? Urdu is an under-resourced language, so I want to create a very large dataset for it.

  • @djbet653
    @djbet653 15 days ago

    This is pure joy! Do play around with other voices, there are some really good ones.

    • @ThorstenMueller
      @ThorstenMueller 13 days ago

      Thanks for your nice feedback 😊. I agree, there are really good voices available.

  • @musakurel
    @musakurel 16 days ago

    which ones we can use with Swift CoreML ? Is it possible to make them run swift locally?

    • @ThorstenMueller
      @ThorstenMueller 13 days ago

      I had to look up Swift CoreML because I didn't know about it (yet) ;-).

    • @musakurel
      @musakurel 9 days ago

      @ThorstenMueller oh ok 😀 let me know please

  • @AS-ol6os
    @AS-ol6os 17 days ago

    It would have been nice to see an example of how it works. Does it respond well? How fast does it respond? There really wasn't anything in the video showing how well it works

    • @ThorstenMueller
      @ThorstenMueller 13 days ago

      Do you know my video on listening to the samples? ua-cam.com/video/HojuVmW5LUI/v-deo.html

  • @alx8439
    @alx8439 19 days ago

    Yeah would be interesting to see if it can be used with other software as well

    • @ThorstenMueller
      @ThorstenMueller 13 days ago

      I agree, thanks for your feedback 😊.

    • @ThorstenMueller
      @ThorstenMueller 13 days ago

      Do you have a specific software in mind?

    • @alx8439
      @alx8439 12 days ago

      @ThorstenMueller if it can be used just as an external regular speaker or USB microphone in any typical audio software.

  • @font_net
    @font_net 22 days ago

    Hello, I want to make a text-to-sound conversion model for Farsi, which videos should I watch now, where can I contact you?

    • @font_net
      @font_net 22 days ago

      Oh, I found your LinkedIn

    • @ThorstenMueller
      @ThorstenMueller 13 days ago

      @font_net I've received your message. But it takes some time to respond due to some other topics ;-).

  • @alx8439
    @alx8439 22 days ago

    What is the device with the microphone, button and round LED light on your table? Can you please give some more details?

    • @DichtMe
      @DichtMe 22 days ago

      That is what the video is about. Watch his previous videos.

    • @alx8439
      @alx8439 20 days ago

      @DichtMe thanks a lot. Will do

  • @alx8439
    @alx8439 22 days ago

    New nvidia minicomputer (jetson if I'm not mistaken) which was released just recently is a good replacement for regular bulky PC with full sized GPU card.

  • @ianicius
    @ianicius 23 days ago

    Can I use it for German?

    • @ThorstenMueller
      @ThorstenMueller 20 days ago

      Hello, IMHO currently not. But you can use my german Thorsten-Voice in Piper or Coqui 😉 (thorsten-voice.de/).

  • @rabeemohammed5351
    @rabeemohammed5351 25 days ago

    Which languages are supported? Can you give me information on whether the model supports Arabic?

  • @MuhammadShahid-bl5hh
    @MuhammadShahid-bl5hh 25 days ago

    @ThorstenMueller I am facing this error please help me: PS C:\Users\Apple Compter\tts> pip install TTS--0.22.0 Defaulting to user installation because normal site-packages is not writeable ERROR: Could not find a version that satisfies the requirement TTS--0.22.0 (from versions: none) ERROR: No matching distribution found for TTS--0.22.0

    • @MuhammadShahid-bl5hh
      @MuhammadShahid-bl5hh 25 days ago

      @Thorsten-Voice I am installing latest TTS but it not work for me please help...

    • @ThorstenMueller
      @ThorstenMueller 20 days ago

      Good question. Do you run it with admin privileges? Is free diskspace available? Just thinking because of this message "installation because normal site-packages is not writeable".
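One more thing worth checking in the command from the error above: pip pins a version with `==`, so `TTS--0.22.0` is read as a package name, which matches the "from versions: none" message. A minimal sketch of building the pinned command follows; the helper name `pip_install_command` is hypothetical, and whether 0.22.0 installs on a given Python version is not verified here.

```python
import sys


def pip_install_command(package: str, version: str) -> list[str]:
    """Build a pip command that pins `package` to an exact `version`."""
    # "-m pip" runs pip for the interpreter that is executing this script.
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


# Show the command; pass the list to subprocess.run(..., check=True) to run it.
print(" ".join(pip_install_command("TTS", "0.22.0")))
```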

  • @herofahimshahriargaming8288
    @herofahimshahriargaming8288 26 days ago

    is there any way to run this in python code?

    • @ThorstenMueller
      @ThorstenMueller 20 days ago

      That's an interesting question I thought about too. But last time I looked at it, there was just an early codebase for Python integration. According to this (github.com/rhasspy/piper/tree/master/src/python_run) there are no recent updates on that.

  • @jez9999
    @jez9999 28 days ago

    Coqui appears to have folded now. Confusingly there is a community run fork that is sorted but its docs look very similar to the original.

    • @ThorstenMueller
      @ThorstenMueller 20 days ago

      Coqui already shut down at the beginning of 2024, and IMHO the code in the original repo is not maintained any more. I heard about a fork too but didn't have time to give it a try.

    • @mrechbreger
      @mrechbreger 19 days ago

      @ThorstenMueller the license is garbage and prevents any further interest... why should anyone keep developing it if he cannot use it for further commercial projects...

  • @font_net
    @font_net 29 days ago

    I love you

  • @adamrastrand9409
    @adamrastrand9409 1 month ago

    Hello Thorsten, I have heard that some languages in Piper TTS sound pretty bad, for example the Swedish model: when you train a new voice by fine-tuning from the existing checkpoint model, it sounds quite bad. Is that true? The default Swedish NST voice sounds very monotone. When you fine-tune from it, will the result sound like me, or will it sound different, just with the pronunciation errors? And when you start from scratch, how many hours of speech do you need? I have an RTX 4060 16 GB card; is that good for AI training? Also, do I need to set up Linux and Windows at the same time and fiddle around with complicated stuff? It's just easier to have a Windows setup and not worry about Windows Subsystem for Linux, so can I just do it with a command

    • @ThorstenMueller
      @ThorstenMueller 20 days ago

      Hello, I only trained my German "Thorsten-Voice" Piper TTS voice, so I have no experience with other languages, their quality and their need for training material. I used multiple hours (around 10) for fine-tuning my Piper model, but I additionally played around with just 1000 phrases and these worked too. It's a bit of trial and error.

  • @EdTimTVLive
    @EdTimTVLive 1 month ago

    It is a nice and useful video. Thank you. I am looking at various options right now.

  • @jimmyjam77
    @jimmyjam77 1 month ago

    The voice quality is OK, but not great. Did you ever figure out a way to make it better?

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      Not in XTTS, but (just in case you're looking for an English solution) do you know my F5 tutorial? ua-cam.com/video/ASFoTNpkM8o/v-deo.htmlsi=gyYl6R8W1xuKoZZM

  • @rvanner
    @rvanner 1 month ago

    What's the best TTS for use in an Apple and Android app locally (ie no server connecting)?

    • @ThorstenMueller
      @ThorstenMueller 20 days ago

      That's a good question. Honestly i have not taken a closer look to tts on smartphones so i can't tell you (yet).

  • @CarlinComm
    @CarlinComm 1 month ago

    Wow, that's great, thanks for showing this! Subscribed :)

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      Thanks for your nice feedback and welcome 😊.

  • @charlenechen2507
    @charlenechen2507 1 month ago

    Hello Thorsten, can you have a check and review of PopPop AI text to speech?

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      Thanks for your topic suggestion 😊. I've added it to my todo list.

  • @TimothyBakerhistorygym
    @TimothyBakerhistorygym 1 month ago

    Tried getting this to work on my own and couldn't. Came here, watched this twice, and it's up and running. Thank you @ThorstenMueller

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      Thanks for your nice feedback, happy you got it working 😊.

  • @funcSAGE
    @funcSAGE 1 month ago

    can piper switch between voices? previously i have used mimic3 server and requested texts with different specific voices, can piper do the same or is it limited to the voice you start server with? Nevermind, i've found where i can make a few adjustments to the script to pass in a speaker together with a text.

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      AFAIK, SSML (which you are looking for) is not yet supported in Piper. Mimic3 was able to do it. As the developer (Mike) is the same for both projects, I am optimistic that SSML will come to Piper sometime in the future.

  • @maraka0100
    @maraka0100 1 month ago

    Okay, very nice. It works in the Piper add-on. In an automation, a female voice says: set public URL in configuration. Does something still need to be added to Config.yaml?

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      I'm not sure right now whether Piper TTS voices can be used in automations. Good question, but I'd have to test it myself first.

  • @BlackOtaku_Edits
    @BlackOtaku_Edits 1 month ago

    As a Ghanaian i'm happy to hear TTS in Twi and Ewe and Hausa

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      More effort for underrepresented languages is really important to provide open voice technology for everybody. Happy you found a matching tts voice 😊.

  • @mercuryin1
    @mercuryin1 1 month ago

    I tried this morning and the cloned voices are the best I have ever used. I wonder if I can use the cloned voices in some way with Home Assistant, maybe through Piper? I can't find out whether this is possible with this software. Is it only TTS? Is it possible to synthesize a dataset with this? Thanks

    • @ThorstenMueller
      @ThorstenMueller 1 month ago

      AFAIK you can use Piper TTS voices in Home Assistant. But for this you have to record way more audio data to train/fine-tune a Piper TTS model. Do you know my video about Piper TTS voice cloning? ua-cam.com/video/b_we_jma220/v-deo.html