Kokoro Local TTS + Custom Voices

  • Published Jan 14, 2025

COMMENTS • 31

  • @andherium
    @andherium 5 hours ago +8

    hmm Tiny TTs is definitely an interesting name

  • @mageshyt2550
    @mageshyt2550 4 hours ago +1

    Would love to see a video on conversations with local agents.

  • @MeinDeutschkurs
    @MeinDeutschkurs 4 hours ago +1

    Sky is back! Wooohooo!!! ❤❤❤❤

  • @lovol2
    @lovol2 4 hours ago

    Thanks for making this video.

  • @djstraylight
    @djstraylight 4 hours ago +2

    Were there any instructions on how to train voicepacks?

  • @MojaveHigh
    @MojaveHigh 2 hours ago

    Very helpful, thanks!
    Any chance you could take a look at RealtimeSTT? And maybe put that and Kokoro into a single local conversational AI agent?

  • @MeinDeutschkurs
    @MeinDeutschkurs 4 hours ago

    What I'd use it for? Voice chat, based on aya-expanse.

  • @altmediamedia9654
    @altmediamedia9654 28 minutes ago

    Sam, I can't access the shortened URL links. I can't name the URL shortener in my comment, but you know which one you are using. It either times out or is unreachable. Anyone else bothered by this issue?

  • @helloworld7796
    @helloworld7796 4 hours ago +1

    Is it possible to train your own model from scratch for a language other than US English?

    • @samwitteveenai
      @samwitteveenai  3 hours ago

      Yes, or you could fine-tune this for another language, but you would also need training code, which currently isn’t in the repo.

  • @MeinDeutschkurs
    @MeinDeutschkurs 4 hours ago

    Is it possible to fade from one voice to another voice? Could help to find great voices. (With values in terminal)

    • @samwitteveenai
      @samwitteveenai  4 hours ago

      Good question. Unfortunately, it’s not really possible to fade between them, because you need to pass the full embedding in at generation time, and you can only pass one.

    • @MeinDeutschkurs
      @MeinDeutschkurs 4 hours ago

      @samwitteveenai, OK, so I should iterate word by word from 0.0 to 1.0 for both of the values. 😆 Why not? Or at least generate the same sentence multiple times to compare.
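
The comparison idea in this thread (same sentence, several mix values) can be sketched without any TTS library. This is only a minimal sketch, assuming a voice is a fixed-length vector of floats and that a blended voice is a weighted average of two such vectors. `blend_voices` and `sweep_weights` are hypothetical helper names, and the step that passes each blended embedding to the model is left out because that API is not shown in this thread.

```python
from typing import List, Sequence


def blend_voices(voice_a: Sequence[float], voice_b: Sequence[float],
                 weight_b: float) -> List[float]:
    """Linearly interpolate two voice embeddings of equal length.

    weight_b = 0.0 returns voice_a unchanged, 1.0 returns voice_b.
    Assumption: embeddings are plain float vectors (hypothetical layout).
    """
    if len(voice_a) != len(voice_b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= weight_b <= 1.0:
        raise ValueError("weight_b must be in [0, 1]")
    return [(1.0 - weight_b) * a + weight_b * b
            for a, b in zip(voice_a, voice_b)]


def sweep_weights(steps: int) -> List[float]:
    """Return evenly spaced mix weights from 0.0 to 1.0 inclusive."""
    if steps < 2:
        raise ValueError("need at least two steps")
    return [i / (steps - 1) for i in range(steps)]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings"; real voicepacks are much larger.
    voice_a = [0.0, 2.0]
    voice_b = [4.0, 2.0]
    for w in sweep_weights(5):
        blended = blend_voices(voice_a, voice_b, w)
        # One full generation per weight: render the same sentence with
        # `blended` here and compare the results by ear.
        print(f"weight_b={w:.2f} -> {blended}")
```

Each weight still needs its own full generation, which matches the one-embedding-per-generation limit described above: this compares blends across runs rather than fading within one.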

  • @pin65371
    @pin65371 4 hours ago

    This would be good for people who want to run something like Alexa locally at home. I know some people have been putting together systems for Home Assistant. While the OpenAI integration might sound slightly better, I'd consider this more than good enough to replace it, without having to send your data to OpenAI.

    • @samwitteveenai
      @samwitteveenai  4 hours ago

      Yeah, that is how I feel too. It’s not the best, but it is damn good.

  • @moundercesar3102
    @moundercesar3102 4 hours ago

    Very interesting. Can we use it as a PDF reader that reads in real time rather than after processing the whole text?

    • @samwitteveenai
      @samwitteveenai  4 hours ago +1

      You would probably process a sentence or a line at a time (maybe even a paragraph, to help it with prosody), but it should be possible.
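
The sentence-at-a-time approach suggested in this reply could look roughly like the following. It is a minimal sketch that only covers the text-splitting side: `iter_chunks` and `max_chars` are hypothetical names, the regex is a naive sentence splitter (it will mis-split abbreviations such as "e.g."), and the call that synthesizes each chunk is omitted because the model's API is not shown in this thread.

```python
import re
from typing import Iterator

# Split after ., ! or ? when followed by whitespace (naive heuristic).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def iter_chunks(text: str, max_chars: int = 300) -> Iterator[str]:
    """Yield groups of whole sentences, about max_chars long at most.

    Grouping several sentences gives the model more context for prosody,
    while keeping chunks small lets playback start before the whole
    document is processed. A single sentence longer than max_chars is
    yielded as-is rather than cut mid-sentence.
    """
    buffer = ""
    for sentence in _SENTENCE_END.split(text.strip()):
        if not sentence:
            continue
        candidate = f"{buffer} {sentence}".strip()
        if buffer and len(candidate) > max_chars:
            yield buffer          # emit what we have, start a new chunk
            buffer = sentence
        else:
            buffer = candidate
    if buffer:
        yield buffer


if __name__ == "__main__":
    page = "First sentence. Second one! A third, slightly longer sentence?"
    for chunk in iter_chunks(page, max_chars=30):
        # Synthesize and play `chunk` here, then move on to the next one.
        print(repr(chunk))
```

Because `iter_chunks` is a generator, the first chunk is available as soon as it is assembled, so audio for it can be generated while later text is still unread.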

  • @figs3284
    @figs3284 1 hour ago

    Transformers.js version coming soon from Xenova 👀

  • @VanillaGun
    @VanillaGun 2 hours ago

    Is there a defined context length it can parse and process at a time? I want to test it out for large text sources.

    • @finbenton
      @finbenton 1 hour ago

      Idk, but I just generated a 25-minute audio file, and it took 5-10 minutes to generate.
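
For a rough sense of what those numbers mean, the figures in this reply work out to between 2.5x and 5x faster than real time. The arithmetic is sketched below; `speedup_over_real_time` is a hypothetical helper name, and the result depends on the commenter's hardware, which is not stated.

```python
def speedup_over_real_time(audio_minutes: float,
                           generation_minutes: float) -> float:
    """Return how many minutes of audio are produced per minute of compute."""
    if generation_minutes <= 0:
        raise ValueError("generation_minutes must be positive")
    return audio_minutes / generation_minutes


if __name__ == "__main__":
    # Figures quoted in the comment: 25 min of audio in 5 to 10 min.
    slowest = speedup_over_real_time(25, 10)
    fastest = speedup_over_real_time(25, 5)
    print(f"between {slowest}x and {fastest}x real time")
```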

  • @Quantum_Nebula
    @Quantum_Nebula 5 hours ago

    Interesting -- definitely is fast for the quality

  • @SyamsQbattar
    @SyamsQbattar 3 hours ago +1

    Do you know how to add a new language, like Indonesian?

    • @samwitteveenai
      @samwitteveenai  3 hours ago

      To get a good result, you would probably need to mix some real Bahasa audio into the training mix, or fine-tune it later. You might be able to do something with a phoneme dictionary, but you really need some example audio.

    • @SyamsQbattar
      @SyamsQbattar 3 hours ago +1

      @samwitteveenai Is there a step-by-step tutorial on this?

    • @miklosprisznyak9102
      @miklosprisznyak9102 3 hours ago +1

      Yes, adding a new language is what I would be also interested in...
      Please enlighten us if you have any clue. 😊

    • @Notifest
      @Notifest 28 minutes ago

      I would appreciate a fine-tuning tutorial for a custom voice in any language.

  • @concretec0w
    @concretec0w 3 hours ago

    Is it better than piper-tts? Piper is sooooo fast and decent.