SUPER Fast AI Real Time Speech to Text Transcribtion - Faster Whisper / Python

  • Published 15 Nov 2024

COMMENTS • 132

  • @bim-techs
    @bim-techs 10 months ago +35

    Tips: You can transform your device's audio output into a "microphone" on Windows, so you don't need to place your headphones over your microphone.
    1. Press Windows key + R -> type "mmsys.cpl"
    2. In the Recording tab, enable the Stereo Mix option. Now, "Stereo Mix" is an available microphone option! You can select it as the audio input.

    • @weekendmakeit7760
      @weekendmakeit7760 10 months ago +3

      this really helped me! Thank you!

    • @aoeu256
      @aoeu256 8 months ago +2

      This is a great idea. I was using Voicemeeter as a virtual audio thingy and it's complicated to use.

    • @TimothyHuey
      @TimothyHuey 18 days ago

      I enabled the microphone but I don't know how to select it in the code. It doesn't hear anything when I start the app.
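
A minimal sketch of one way to pick the "Stereo Mix" input from Python. It assumes the script captures audio with PyAudio, which this thread does not confirm, and the helper names `find_input_device` and `list_pyaudio_devices` are hypothetical. The idea is to list the audio devices, find the first input-capable one whose name contains "Stereo Mix", and pass its index to the stream.

```python
def find_input_device(devices, name_fragment):
    """Return the index of the first input-capable device whose name
    contains name_fragment (case-insensitive), or None if none matches."""
    wanted = name_fragment.lower()
    for dev in devices:
        has_input = dev.get("maxInputChannels", 0) > 0
        if has_input and wanted in dev.get("name", "").lower():
            return dev["index"]
    return None


def list_pyaudio_devices():
    """Collect the device info dicts reported by PyAudio.

    Assumes PyAudio is installed (pip install pyaudio). It is imported
    lazily so find_input_device can be used and tested without it.
    """
    import pyaudio

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i)
                for i in range(pa.get_device_count())]
    finally:
        pa.terminate()


# Possible usage (not executed here):
#   idx = find_input_device(list_pyaudio_devices(), "stereo mix")
#   stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000,
#                    input=True, input_device_index=idx,
#                    frames_per_buffer=1024)
```

If `find_input_device` returns `None`, the device is probably still disabled in the Recording tab or is listed under a different name, so printing the names from `list_pyaudio_devices()` is a reasonable first check.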

  • @OliNorwell
    @OliNorwell 8 months ago +7

    Epic! These videos are some of the best stuff on YouTube. Love the idea with the image generation at the end.

  • @theraybae
    @theraybae 10 months ago +6

    This is amazing and inspiring. I love the ending of the video and can’t wait for Wednesday. As a dyslexic person I think you unlocked a new use case for learning.

  • @jaujud
    @jaujud 16 days ago

    5:51 Neutral = I'm gonna go troll now. Funny stuff, great video! Thanks

  • @filipphenderson6342
    @filipphenderson6342 8 months ago +105

    Pulling in people with a flashy thumbnail of Python code that works, and then trying to monetize your code based on a library that is already supposed to be open source, is in my opinion BS. It is not fair to beginners who might not know Python or Whisper very well. For that I give you a thumbs down!

    • @christianmccauley7340
      @christianmccauley7340 3 months ago

      Wow, an AI channel scamming people? Who would’ve ever heard of such a thing!
      Tired of the fucking grifternet man, how did this happen?

    • @jbtesla3581
      @jbtesla3581 2 months ago

      for real this is a fking scam, the code is on GitHub for free

  • @benscottbongiben
    @benscottbongiben 10 months ago

    Good to see transcription and generated audio responses in real time for phone calls

  • @reddyparthu5978
    @reddyparthu5978 7 months ago +5

    how to get the code for this?

  • @ReadyMedia-no
    @ReadyMedia-no 9 months ago +4

    There is a product for live video transcription there. Live text services are expensive and do not work for many languages. Set up a server/service that will ingest an RTMP video source, delay the video and overlay text on the video in perfect sync, then offer RTMP output with burned-in live text. :) There is a need for this service.

  • @ferluisch
    @ferluisch 5 months ago +1

    Hey man this is really cool! I'd like to know if you:
    1) used the whisper v3 model? or the v2?
    2) If you have seen the demos from gpt4, they also showed that gpt ASR is better than whisper v3, wonder if it will be open like whisper.

  • @ArmandoMenicacci
    @ArmandoMenicacci 10 months ago +1

    Fantastic !!! A bit fast in explaining and showing, but I can always pause!

  • @HammerOnTheNet
    @HammerOnTheNet 10 months ago

    Amazing and inspiring work! Kris, what about something less powerful but more accessible in terms of hardware?

  • @JohannaKarlsson
    @JohannaKarlsson 1 month ago

    Hello, and great to see this kind of content.
    I actually have a question about speech to text in another language, for example Swedish, and passing it through Llama for correction, maybe for a meeting or conference or something like that. What do you suggest?

  • @cristobalmunoz84
    @cristobalmunoz84 3 months ago

    Nice video!! Thanks for your help with these topics!!

  • @aoeu256
    @aoeu256 8 months ago

    This will be a good tool for language immersion chinese / japanese / indonesian along with the deepl clipboard tool, edge browsers tts engine.

  • @enesgul2970
    @enesgul2970 10 months ago +1

    You are really very good.

  • @radudamianov
    @radudamianov 10 months ago

    Excellent! Thank you so much for sharing!

  • @bigswede88
    @bigswede88 3 months ago

    Go Sweden! Good job

  • @kimsteinhaug
    @kimsteinhaug 10 months ago

    Interesting stuff on the image creation at the end while talking. Not sure if you are taking punctuation in your sentences into consideration? I'm pretty sure this would have to do with something cool, maybe keeping an overview of all the text that has moved out of the "buffer" for style? Looks like something I could have a lot of fun with. I do not have the GPU though :/ Colab, however.

  • @calvinapollos
    @calvinapollos 7 months ago

    Great video! Thanks for going through this in such an easy-to-understand way! Can you share the python scripts?

  • @renatox5288
    @renatox5288 12 hours ago

    faster whisper or whisper turbo?

  • @henrijohnson7779
    @henrijohnson7779 8 months ago

    @Kris : I already joined as an Adept member on Jan 18th 2024 and requested access to the Github Repo via email and also via Discord but have not had any response from you yet ?

  • @leucome
    @leucome 10 months ago

    Faster Whisper and Insanely Fast Whisper don't seem to have AMD GPU support yet, so I had to go with an alternative for the 7900xt. I used whisper.cpp with cuda/HIP + a distilled Whisper model. Seriously, this combination is kinda real-time too, even when using the distil large v2. There is a downside though: the TTS and Whisper on the GPU gobble up about 8 GB of VRAM. This puts some limit on the LLM I can use at the same time.

  • @magnoliasphinkter8622
    @magnoliasphinkter8622 2 months ago

    thanks this is great! Where can I find the actual code you have on your screen? Struggling to find it on the github

  • @hjoseph777
    @hjoseph777 3 months ago

    I have been looking where to start, fantastic work, where can I have the code for testing

  • @ryanjames3907
    @ryanjames3907 10 months ago

    Wow!! Great video!!! Thank you for being so generous and teaching this to us, this is epic stuff! I can already start to see all kinds of use cases. I can't wait to get it running, and I'm really looking forward to Wednesday's video. Thanks again from Canada

  • @claudiobalderrama1599
    @claudiobalderrama1599 8 months ago

    Do you think this could be used to transcribe, for example, phone calls made through the browser? I would greatly appreciate your response :)

  • @AC2006Uk
    @AC2006Uk 5 months ago

    That image gen project was pukka!

  • @ItsNsour
    @ItsNsour 7 months ago +1

    can it translate?

  • @unrealminigolf4015
    @unrealminigolf4015 10 months ago

    Awesome bro! ❤

  • @110gotrek
    @110gotrek 10 months ago +8

    Now make it translate and do phone calls

    • @rne1223
      @rne1223 10 months ago +1

      Noooo…pls nooo. We got plenty auto callers already.

    • @ibrahimelshenhapy9179
      @ibrahimelshenhapy9179 8 months ago

      ​@@rne1223
      Where?

    • @luluw9699
      @luluw9699 1 month ago

      Hello ur computer has a virus

  • @fredericpaillot2570
    @fredericpaillot2570 10 months ago

    Hi Kris! I love what you do, I would like to become a member of your channel, but I can't access the page to subscribe, do you have a direct link? the one in description doesn't work for me.. have a good day!

  • @svenborgers6908
    @svenborgers6908 8 months ago +4

    I have tried to get this to run on M1 MacBook. No joy. The CPU maxes out even with the tiny model. But then I tried with the Whisper.cpp implementation which is compiled for apple silicon. I found a whisper-cpp-python wrapper for that library. That actually runs and is far less CPU bound. It has a bit of a stutter, it is not as clean, it misses words between the chunk processing but you can see that with just a little bit more power it could work.

    • @MrThaitrinh
      @MrThaitrinh 8 months ago

      Hi Seven, could you please share your code with me? Thank you very much!

  • @maizizhamdo
    @maizizhamdo 6 months ago

    I love your videos, man. Please make a video about faster-whisper with a Docker API, please.

  • @lutusp
    @lutusp 10 months ago

    Hey, it's in your video description, therefore easily fixed: the word is "transcription". Why not avoid the irony of a video that extols modern AI voice to text ... transcription ... in which the AI engine will surely avoid this mistake, and at the speed of light.

  • @t-dsai
    @t-dsai 10 months ago

    Thanks for sharing your knowledge/experience.
    I'm a bit perplexed. The description here mentions 45+ prompts in the PDF book, the newsletter website says 40+, and the PDF doc says 35+. Which number is correct?

    • @gcardinal
      @gcardinal 18 days ago

      None, it's a scam.

  • @isaacmasinde1994
    @isaacmasinde1994 3 months ago

    Which gpu are you using ?

  • @maverick1901
    @maverick1901 9 months ago

    Running fully local is one thing... doing this via the Web Audio API towards a backend is a different topic. Is an implementation for that planned as well?

  • @ytemre
    @ytemre 7 months ago +1

    I became a member. How do I get access to the code and the GitHub for this?

    • @AllAboutAI
      @AllAboutAI 7 months ago

      Hello :D Send me an e-mail at kris@allabtai.com

  • @kebman
    @kebman 10 months ago

    I might be jaded but... I mean really, how about an AI that calculates the probability of drone attacks or artillery attacks? How about an AI that calculates the probability of soldiers hiding in terrain? I mean, there are already good search algorithms out there, that one may-or-may-not use to carry out artillery strikes. I'm just thinking aloud here. Probably nothing.

  • @kebman
    @kebman 10 months ago

    The sentiment analysis really scares me. I mean, there's absolutely no chance that'll be abused by big tech in terms of political marketing. I mean, like, there's no way in hell right?

  • @prakashsahu-xn6qy
    @prakashsahu-xn6qy 2 months ago +1

    How can I get the code you used in this video? I need the same code.

  • @mattaylor-qg4yw
    @mattaylor-qg4yw 6 months ago

    just joined. would be good to get my grubby paws on the files for this.

  • @danielgh4814
    @danielgh4814 8 months ago +1

    Hi, I'm a subscriber but I do not have access to your GitHub. Can you help me, please?

  • @unleashAI23
    @unleashAI23 24 days ago

    where do I get the code sir?

  • @thedoctor5478
    @thedoctor5478 10 months ago +1

    I think there's an even faster whisper module but I forget what it's called

  • @royzac7829
    @royzac7829 9 months ago

    How does the transcription performance compare to assemblyAI?

  • @kate-pt2ny
    @kate-pt2ny 10 months ago

    Kris, you are a genius. Real-time speech transcription can do a lot of things. The last example is great. I can’t wait to watch the video released on Wednesday. My computer is a Mac M chip computer. I found the code in your github and changed it to run on the CPU. Later, some problems occurred, such as incomplete transcribed content and OSError. Can you release a version suitable for Mac computers? grateful

  • @agardner-to7vi
    @agardner-to7vi 4 months ago

    That is awesome. So I am trying to do something like this. My sister is deaf and I want something that can also label who is speaking. For a small group it would say user 1, user 2, user 3, and whoever is speaking, it would let the person know. Do you think that is possible? How could I do that? I've got everything but that last part.

  • @haloBean
    @haloBean 7 months ago

    Hi,
    Can I get the GitHub repo of the above code?
    Thanks

  • @ShariqueAM
    @ShariqueAM 1 month ago

    I want to do speech to text on audio from the browser speaker and not from the mic. How can we do that in real time?

  • @martinvizar6430
    @martinvizar6430 9 months ago

    Impresario thank you

  • @AlexPopov-hv3kp
    @AlexPopov-hv3kp 5 months ago

    What is the transcribe_chunk function in the code? It seems that it's not from faster_whisper?
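
For what it's worth, `transcribe_chunk` is not part of the faster_whisper package as far as I know, so it is most likely a small helper defined in the author's own script. The video's version is not available here, so the following is only a guess at what such a helper could look like: it wraps `WhisperModel.transcribe`, which returns a lazy generator of segments plus an info object, and joins the segment texts into one string.

```python
def transcribe_chunk(model, audio_path, language=None):
    """Transcribe one short audio file and return its text as one string.

    Hypothetical helper, not the code from the video. `model` is expected
    to be a faster_whisper.WhisperModel or any object whose transcribe()
    returns (segments, info) in the same way.
    """
    segments, _info = model.transcribe(audio_path, beam_size=5,
                                       language=language)
    # `segments` is a generator, so decoding only runs while we iterate.
    return " ".join(segment.text.strip() for segment in segments)


# Possible usage (needs `pip install faster-whisper`, not executed here):
#   from faster_whisper import WhisperModel
#   model = WhisperModel("base.en", device="cpu", compute_type="int8")
#   print(transcribe_chunk(model, "chunk.wav"))
```

A real-time loop would then record a second or so of audio to a temporary file, call the helper on it, and append the result to the running transcript.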

  • @gurbachhansingh5715
    @gurbachhansingh5715 29 days ago

    Confused. Can you please create a step-by-step video and provide the code as well?

  • @aseel6910
    @aseel6910 6 months ago

    If there is any way to translate this text into other languages, it would be awesome

  • @digitalsoultech
    @digitalsoultech 10 months ago +1

    The accuracy sucks. Many words are incorrect which you can see in the image itself.
    This isn't usable in the real world.

  • @thnmanucian7993
    @thnmanucian7993 7 months ago

    Hello. I'm a beginner in this field. How can I get your code for reference? Thank you

  • @عبدالرحيمعبدالرحيم-غ5غ
    @عبدالرحيمعبدالرحيم-غ5غ 10 months ago +2

    could you do another demo to see how it can translate in real time?

    • @gregh7457
      @gregh7457 10 months ago

      Yes! There are no really good or fast translation apps available. YouTube auto-translate is horrible!
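
On the translation question: Whisper models have a built-in translate task, and faster_whisper exposes it through the `task="translate"` argument of `WhisperModel.transcribe`. It only translates into English, so any other target language would need a separate translation step. A rough sketch, with `translate_chunk` as a hypothetical helper name and no claim that the video does it this way:

```python
def translate_chunk(model, audio_path, source_language=None):
    """Return an English translation of the speech in audio_path.

    Hypothetical helper. `model` is expected to be a
    faster_whisper.WhisperModel loaded with a multilingual model
    (for example "small", not "small.en"). If source_language is None,
    the model tries to detect the spoken language itself.
    """
    segments, _info = model.transcribe(audio_path, task="translate",
                                       language=source_language)
    return " ".join(segment.text.strip() for segment in segments)


# Possible usage (not executed here):
#   from faster_whisper import WhisperModel
#   model = WhisperModel("small", device="cpu", compute_type="int8")
#   print(translate_chunk(model, "chunk.wav", source_language="sv"))
```

Passing `language="sv"` with the default task instead would give a plain Swedish transcript, which also covers the "different languages" questions in this thread.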

  • @TimothyHuey
    @TimothyHuey 18 days ago

    All I get is "Thank you! Thank You! Thank you!" as my transcribed output... so weird

    • @naczelnyh8rpolskiegoyt167
      @naczelnyh8rpolskiegoyt167 16 days ago

      hey, same problem here, actually exact same problem, have you figured it out?

    • @TimothyHuey
      @TimothyHuey 11 days ago

      @@naczelnyh8rpolskiegoyt167 Yes I did. I went to Sound Recorder and made a test to see what was actually being recorded and playing it back. There was No Sound. Windows wasn't recording anything for some reason. I guess when nothing is recorded, Whisper hallucinates "Thank You" or sometimes just "You." So weird. But anyway, had to find a way to get the mic that this app was working with to record sound. So I would investigate that route, find out if the mic that this app is accessing is actually hearing anything at all.
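
One way to guard against the silent-input case described above is to measure the level of each recorded chunk and skip transcription when it is essentially silence. The sketch below assumes 16-bit mono PCM in native byte order, which is what a typical `paInt16` capture would produce; the function names and the threshold of 200 are arbitrary assumptions that would need tuning per microphone. Separately, faster_whisper's `transcribe` accepts `vad_filter=True`, which may be the simpler fix.

```python
import array
import math


def rms_int16(pcm_bytes):
    """Root-mean-square level of 16-bit signed PCM samples."""
    samples = array.array("h")
    # Drop a trailing odd byte so frombytes() always gets whole samples.
    samples.frombytes(pcm_bytes[: len(pcm_bytes) // 2 * 2])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(pcm_bytes, threshold=200.0):
    """True if the chunk is quieter than the (assumed) threshold."""
    return rms_int16(pcm_bytes) < threshold


# Possible usage inside a capture loop (not executed here):
#   data = stream.read(1024)
#   if is_silent(data):
#       continue  # nothing was recorded, so do not ask Whisper to guess
```

Printing `rms_int16` for a few chunks is also a quick way to confirm whether the selected microphone is delivering any signal at all.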

  • @himanshujaviya6021
    @himanshujaviya6021 6 months ago

    Can we get the code used in this video that would be really helpful

  • @crazyforhyunwoo119
    @crazyforhyunwoo119 8 months ago

    Can I do this with JavaScript?

  • @avgplayer
    @avgplayer 10 months ago

    Waiting for the in-depth video :) Btw, your Discord invite link has expired.

  • @vallu-Tech
    @vallu-Tech 7 months ago

    Bro, can you put out a video about live-streaming voice to text?

  • @nouriensha2873
    @nouriensha2873 2 months ago

    Can i convert this code to cpp and implement using Arduino without api

  • @maxstauss9579
    @maxstauss9579 5 months ago

    i cant find the script of the realtime translation pls help me finding it :((

  • @saqqara6361
    @saqqara6361 4 months ago

    how to access your sourcecode as a paid channel member?

  • @mujahidali2369
    @mujahidali2369 3 months ago

    Well done

  • @kylebolt5861
    @kylebolt5861 9 months ago +1

    How do we join your community?

    • @AllAboutAI
      @AllAboutAI 9 months ago

      Link in desc :) youtube member

    • @najafzawar8168
      @najafzawar8168 9 months ago

      @@AllAboutAI just subscribed to your channel but not getting GitHub code..

  • @huhaifan
    @huhaifan 1 month ago

    cannot find the code in github

  • @George-kx8fl
    @George-kx8fl 9 months ago

    Would it be possible to do speaker recognition then pipe it into translation

  • @Siri-tz7dz
    @Siri-tz7dz 7 months ago

    where do i get the setup/python code

  • @RicardoMaciasYepez6913
    @RicardoMaciasYepez6913 5 months ago

    Can this run on raspberry pi?

  • @jotixh
    @jotixh 7 months ago

    Is there a way to connect a live streaming url?

  • @harshitsingh3061
    @harshitsingh3061 10 months ago +1

    where can we get the code

  • @thebigbigdaddy
    @thebigbigdaddy 10 months ago +1

    how can we identify different speakers?

    • @ickorling7328
      @ickorling7328 8 months ago

      Microsoft Copilot in a Teams call recording transcription. You can't simply call, it needs to be a meeting call... subtle difference. Try 'Meet now' in the Teams calendar view, or make a calendar event.

  • @erenkaraboga8570
    @erenkaraboga8570 8 months ago

    Can we get the source code?

  • @eliasbosc
    @eliasbosc 2 months ago

    Can you pls share your code?

  • @ahmedelkamash9323
    @ahmedelkamash9323 6 months ago

    how can we download this script?

  • @Edward_ZS
    @Edward_ZS 10 months ago

    Has anyone updated the code from the previous video to use this recording method instead?

  • @joaopaulonadal8484
    @joaopaulonadal8484 9 months ago

    How can I get access to this code?

  • @TonyHoangPodcast
    @TonyHoangPodcast 7 months ago

    Does it support speaker diarization?

    • @ShariqueAM
      @ShariqueAM 1 month ago

      I want to do speech to text Audio from the browser speaker and not from the mic , how can we do that in real time ?

  • @maxstauss4821
    @maxstauss4821 5 months ago

    I am a member but I can't access the GitHub, pls HELP

    • @maxstauss4821
      @maxstauss4821 5 months ago

      This is my GitHub:
      maxaxaxaxxaxaxaax

  • @nusretalikok823
    @nusretalikok823 10 months ago

    where can we find the code that you used?

  • @slimshady91bat
    @slimshady91bat 3 months ago

    But is it free?

  • @gurbachhansingh5715
    @gurbachhansingh5715 29 days ago

    Please provide the code

  • @Onlyindianpj
    @Onlyindianpj 4 months ago +1

    This is a presentation, not a tutorial

  • @ScaryLasers
    @ScaryLasers 4 months ago

    how do i get access to the github?? TAKE MY MONEY!
    lol no but seriously how

  • @fufu9352
    @fufu9352 8 months ago

    Zero latency? I have been checking your video timeline. The terminal output and the audio do not correspond. You must be living in a world 1-2 seconds ahead of our timeline. 😅

  • @MiguelCayazaya
    @MiguelCayazaya 3 months ago

    pip install patience and kindness

  • @MarxOrx
    @MarxOrx 10 months ago

    BROOOO 🎉 FIRST

  • @ramadanhasan1574
    @ramadanhasan1574 10 months ago

    Where is the link to this source code ? Thanks amazing

  • @AlphaScraperOne
    @AlphaScraperOne 8 months ago

    🧡

  • @curtisnewton895
    @curtisnewton895 9 months ago

    transcriPtion

  • @gcardinal
    @gcardinal 18 days ago

    What a disgusting practice of hiding very basic and poorly written code behind a paywall. No effort, no skill, GPT generated based on million dollar investments shared for free - slamming behind a paywall is as low as you can get as a YouTuber. But you don't care.

  • @tharosen-g4q
    @tharosen-g4q 10 months ago

    🎈

  • @rahar6009
    @rahar6009 8 months ago +1

    It is BS to monetize open-source code! So sorry for you and your kind... unsubbed.

  • @Velnio_Išpera
    @Velnio_Išpera 8 months ago

    Can you use different languages?

  • @01karthikrajan40
    @01karthikrajan40 1 month ago

    Can anyone please help me run this code? I'm trying to, but it doesn't work the way it is shown, with zero latency.
    Any GitHub link access?

  • @vaibhavmishra1100
    @vaibhavmishra1100 8 months ago

    can you tell me the solution of this error : Could not load library cudnn_ops_infer64_8.dll. Error code 126
    Please make sure cudnn_ops_infer64_8.dll is in your library path!

    • @劉育安
      @劉育安 8 months ago

      try "pip install nvidia-cudnn-cu12"

    • @vaibhavmishra1100
      @vaibhavmishra1100 8 months ago

      @劉育安 It didn't work

  • @HungBui-r7z
    @HungBui-r7z 7 months ago

    I have registered as a member, please check your email

  • @gmazuel
    @gmazuel 3 months ago

    Where can I find the code?