Setting up Openvoice version 2 and MeloTTS for AI voice cloning

  • Published 27 Apr 2024
  • NOTE: This video is part of the Text-to-speech Comparison Series
    I'll be setting up Openvoice version 2 and MeloTTS by MyShell AI. We'll follow their documentation closely and set up MeloTTS to work both independently as a text-to-speech engine and as the base speaker TTS for the Openvoice voice cloning engine, so you can easily integrate it into your AI application.
    🔗 LINKS
    Code Repo: github.com/brainiakk/youtube/...
    MyShell.ai HF page: huggingface.co/myshell-ai
    Openvoice V2 Huggingface download page: huggingface.co/myshell-ai/Ope...
    MeloTTS English V2 Checkpoints download page: huggingface.co/myshell-ai/Mel...
    🔗 MY LINKS
    Twitter: x.com/alhajibrain
    Instagram: / _alhajibrain
    Github: github.com/brainiakk
    #ai #aivoice #aivoices #texttospeech #tts #openvoice #melotts

COMMENTS • 23

  • @eddysaoudi253 2 months ago +4

    Great. Thank you.

  • @socialexperiment8267 1 month ago

    great bro!

  • @charlietech892 1 month ago +1

    Great job. Can you show how a PDF or TXT file can be uploaded and used instead of cutting and pasting or typing text? Most videos show short phrases but if you want a paper or a document made text to speech, how would you go about doing this? Thanks again.

    • @techgiantt 1 month ago +1

      Interesting 🤔 I might do a video on it, but it’s as simple as adding a function that parses the text from the PDF or TXT file and passes it on to the text-to-speech function. The text can be passed in chunks, maybe paragraph by paragraph. You could also accept the document's file path as input when you run the Python script, or make it more lively by using the TTS function to say “please provide the file you want me to read” and then opening a file dialog in your file explorer. It all depends on what you want, but it’s possible.
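The paragraph-by-paragraph idea in this reply can be sketched with the standard library alone. This is a minimal sketch, not code from the video: `iter_paragraphs`, `read_aloud`, and the `speak` callback are hypothetical names, and `speak` stands in for whatever text-to-speech function you already have. It handles plain-text files only; a PDF would first need its text extracted with a third-party parser, which is not shown.

```python
import re
from pathlib import Path
from typing import Callable, Iterator


def iter_paragraphs(path: str) -> Iterator[str]:
    """Yield non-empty paragraphs from a plain-text file, split on blank lines."""
    text = Path(path).read_text(encoding="utf-8")
    for block in re.split(r"\n\s*\n", text):
        paragraph = " ".join(block.split())  # collapse line breaks and extra spaces
        if paragraph:
            yield paragraph


def read_aloud(path: str, speak: Callable[[str], None]) -> int:
    """Pass each paragraph to `speak` in order and return the number of chunks."""
    count = 0
    for paragraph in iter_paragraphs(path):
        speak(paragraph)  # hypothetical hook: plug your TTS function in here
        count += 1
    return count
```

Calling `read_aloud("paper.txt", print)` is a quick way to check the chunking before wiring in a real TTS call. Feeding short chunks rather than the whole document should also let playback start before the full text has been synthesized.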

  • @vivekgangurde9685 7 days ago

    Does it work well on audio in other languages? I have tried Bark and Tacotron 2 but did not get good results for Hindi. Thanks for the video, keep making good content 😊

    • @techgiantt 6 days ago

      I think it’s mostly English, Japanese, Chinese, French, Spanish, and Korean that are supported, but it also has an Indian accent.
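For reference, here is a hedged sketch of using MeloTTS on its own with the Indian-accent English speaker. The `TTS` class, `hps.data.spk2id`, and `tts_to_file` follow the usage shown in the MeloTTS README as I understand it; the language codes and speaker keys are taken from the same place and should be checked against your installed version. The `synthesize` wrapper is a hypothetical helper. Hindi does not appear in that list, which matches the reply above.

```python
# Language codes and English speaker keys as listed in the MeloTTS README;
# treat them as assumptions and verify them against your installed version.
LANGUAGES = ("EN", "ES", "FR", "ZH", "JP", "KR")
ENGLISH_SPEAKERS = ("EN-US", "EN-BR", "EN_INDIA", "EN-AU", "EN-Default")


def synthesize(text, output_path, language="EN", speaker="EN_INDIA",
               device="auto", speed=1.0):
    """Write `text` to a WAV file using MeloTTS as a stand-alone TTS engine."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language code: {language}")
    # Imported lazily so this module loads even without MeloTTS installed.
    from melo.api import TTS

    model = TTS(language=language, device=device)
    speaker_ids = model.hps.data.spk2id  # maps speaker names to ids
    model.tts_to_file(text, speaker_ids[speaker], output_path, speed=speed)
    return output_path
```

With MeloTTS installed, `synthesize("Hello there", "en-india.wav")` should produce an Indian-accented English clip.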

  • @everybodyguitar5271 6 days ago

    A bit confusing. What's the relation between MeloTTS and OpenVoice V2?

    • @techgiantt 6 days ago +1

      MeloTTS can act as a stand-alone text-to-speech engine or as the base speaker for Openvoice v2. Openvoice is both a TTS and a voice cloning engine. Openvoice v1 can work without MeloTTS as the base speaker.

    • @everybodyguitar5271 6 days ago

      @techgiantt Thanks for your reply. I'm able to play English voice without any issue, but when I play Chinese I get the following error message: RuntimeError: Placeholder storage has not been allocated on MPS device! Any suggestions? Thanks.
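The relation discussed in this thread can be sketched as a two-step pipeline: MeloTTS speaks the text as the base speaker, then the Openvoice V2 converter transfers the tone colour of a reference recording onto that audio. The calls below follow the Openvoice V2 demo as I recall it, so treat every signature and the `checkpoints_v2` layout as assumptions to verify. `clone_voice` and `base_speaker_file` are hypothetical helper names, and `device` defaults to `"cpu"` only as a conservative choice.

```python
def base_speaker_file(speaker_key):
    """Map a MeloTTS speaker name to the embedding file name used in the demo."""
    return speaker_key.lower().replace("_", "-") + ".pth"


def clone_voice(text, reference_audio, output_path,
                ckpt_dir="checkpoints_v2", speaker_key="EN-US", device="cpu"):
    """Speak `text` with MeloTTS, then convert it to the reference tone colour."""
    # Lazy imports: these need torch, MeloTTS and Openvoice installed.
    import torch
    from melo.api import TTS
    from openvoice import se_extractor
    from openvoice.api import ToneColorConverter

    # Step 1: MeloTTS acts as the base speaker and produces plain speech.
    tts = TTS(language="EN", device=device)
    base_wav = "tmp_base.wav"
    tts.tts_to_file(text, tts.hps.data.spk2id[speaker_key], base_wav, speed=1.0)

    # Step 2: the Openvoice V2 converter changes the tone colour of that speech.
    converter = ToneColorConverter(f"{ckpt_dir}/converter/config.json", device=device)
    converter.load_ckpt(f"{ckpt_dir}/converter/checkpoint.pth")
    source_se = torch.load(
        f"{ckpt_dir}/base_speakers/ses/{base_speaker_file(speaker_key)}",
        map_location=device,
    )
    target_se, _ = se_extractor.get_se(reference_audio, converter, vad=False)
    converter.convert(audio_src_path=base_wav, src_se=source_se,
                      tgt_se=target_se, output_path=output_path)
    return output_path
```

Stopping after step 1 gives you MeloTTS as a stand-alone engine; step 2 is the part that adds the cloning.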

  • @eucharisticadoration 2 months ago +1

    Hello, first thank you for the tutorial. Currently there are not many out there 🙂
    But when running it, I get this error:
    Loaded checkpoint 'modules/openvoice/checkpoints_v2/converter/checkpoint.pth'
    missing/unexpected keys: [] []
    Any idea what might be wrong? Thank you!

    • @techgiantt 2 months ago

      Did you set the directories up exactly as I did? Also, make sure you copied the downloaded checkpoints_v2 folder to the openvoice directory properly. If you did all that, you could go to their Huggingface page and redownload the Openvoice V2 converter/checkpoint.pth to replace the old one. I'll add their Huggingface link in the description.

    • @techgiantt 2 months ago +1

      Hold on, do you mean: [ ] ? Because what I'm seeing in your comment is this: [] []

    • @eucharisticadoration 2 months ago

      @techgiantt Yes, exactly! Maybe a copy and paste issue.

    • @techgiantt 2 months ago +1

      @eucharisticadoration Then that's not an issue. I don't know why they didn't hide that output in this version. Just let it run. If there were a missing key it would be written inside the square brackets, like a list, but since they're empty, everything is okay.

    • @eucharisticadoration 2 months ago

      @techgiantt Ok, thank you very much! I finally realized that I had to change a few more things (paths) in voice.py, and now it is running 🙂
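To make the `missing/unexpected keys: [] []` line in this thread concrete, here is a plain-Python illustration of what the two lists mean. It is not the Openvoice or PyTorch code. I am assuming the log comes from a non-strict state-dict load, which in PyTorch reports the same two lists. The key names and `diff_state_keys` are made up for the example.

```python
def diff_state_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists for a checkpoint and a model.

    missing:    keys the model expects but the checkpoint lacks
    unexpected: keys the checkpoint has but the model does not use
    """
    ckpt = set(checkpoint_keys)
    model = set(model_keys)
    missing = [k for k in model_keys if k not in ckpt]
    unexpected = [k for k in checkpoint_keys if k not in model]
    return missing, unexpected


# Hypothetical parameter names, for illustration only.
model_keys = ["enc.weight", "enc.bias", "dec.weight"]

# A complete checkpoint: both lists are empty, so nothing is wrong.
print("missing/unexpected keys:", *diff_state_keys(model_keys, model_keys))
# prints: missing/unexpected keys: [] []

# A mismatched checkpoint: the brackets now name the problem keys.
print(diff_state_keys(model_keys, ["enc.weight", "extra.bias"]))
# prints: (['enc.bias', 'dec.weight'], ['extra.bias'])
```

Two empty lists mean the checkpoint matched the model exactly, so the message is informational rather than an error.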

  • @justindaniels5923 1 month ago +1

    What operating system are you running? Will this work for Windows?

    • @techgiantt 1 month ago +1

      I'm using macOS (Apple MacBook). I think it should work fine on Windows.

    • @justindaniels5923 1 month ago +1

      @techgiantt Appreciate the response! I'm going to give it a shot. Subbed!

    • @techgiantt 1 month ago

      @justindaniels5923 Thanks for the sub

  • @komakaze1 14 days ago

    Is there a way to make these TTS engines more expressive?

    • @techgiantt 14 days ago +1

      Yes, but you need a beefy GPU to use it with an AI model, since you won’t want extra latency. I’ll create a video for that.

    • @komakaze1 14 days ago

      @techgiantt I think it would be amazing if they could act, expressing emotions: anger, sadness, sorrow, compassion, confidence, hesitation, shyness, embarrassment, bravado, whisper, fear, shout, laugh, etc. Moods and personality expressed via voice.