I used the BEST Open Source LLM to build a GPT WebApp (Falcon-40B Instruct)

  • Published 20 Aug 2024

COMMENTS • 123

  • @NicholasRenotte • 1 year ago +11

    Thanks a million to NordPass Business for helping me end my Shin Ramen speed run…and for sponsoring this video! Whatcha waiting for?
    Grab a 3-month free trial here nordpass.com/nicholasnord with code nicholasnord!

    • @kakamoora7874 • 1 year ago

      Bro, how can we do web scraping with AI? Please make a video… or give some tips, please.

  • @careyatou • 1 year ago +80

    Yes. A fine tuning video would be amazing.

    • @officialdiadonacs • 1 year ago +3

      Bump

    • @sujith_25 • 11 months ago +1

      Yeah, please do it, sir.

    • @muhammadumaranzar3431 • 9 months ago +1

      Bump

    • @AIEntusiast_ • 6 months ago

      Don't think we'll get it from this guy; all his videos are superficial "this is possible" clips that never show the detailed how.

  • @zd676 • 1 year ago +7

    It’s one thing to build something cool with LLM, it’s completely different to bring it to production.

  • @Someone17122 • 1 year ago +10

    I started following quite a long time ago, and I have seen an incredible transition in video quality and content. I will always be waiting to explore more new tech from you.

    • @NicholasRenotte • 1 year ago +4

      Thank you so much, I've been trying to improve them dramatically as of late!

  • @gianlucafiorini • 1 year ago +8

    Amazing content!!!! I was doing almost the same with Falcon 7B and LangChain last week, happy to see a better explanation of what I was doing heheh!

    • @NicholasRenotte • 1 year ago +2

      hahahah, I've been experimenting like crazy with it!! Whatcha building?

    • @gianlucafiorini • 1 year ago

      @@NicholasRenotte Trying to create a private expert on a specific subject, but I'm seeing I will need to go through fine-tuning to do what I want. I love the way you explain this tech; as a beginner, it's hard for me to really understand what I'm doing, so I'm always watching your content to really get an understanding of what I'm actually doing.

  • @shawn.builds • 11 months ago

    I've seen a bunch of your videos and this one has got to be the best.
    Such a complicated topic turned engaging + informative is no joke. Thanks Nicholas!

  • @jannik3475 • 1 year ago +7

    Nice! I am currently looking for a way to make a document Q&A Chatbot with Falcon.
    Also a Video on fine tuning would be helpful!
    Thanks Nicholas!

  • @noobking5056 • 1 year ago +4

    Please make a video on the difference between LangChain fine-tuning and normal fine-tuning!

  • @ikurious • 1 year ago +4

    did you run that on your local machine?

  • @pkmnjourney • 1 year ago +1

    I cannot wait for a finetuning video! Looking forward to it.

  • @ummnine6938 • 1 year ago +1

    I ran Falcon 7B on the free Google Colab today and played with it a lot. It's OK for just playing, but there are so many things it can't do that it's no good for me, since I do programming and want to give it complex questions and have it write code. I'll keep using ChatGPT for now, waiting for the time when a good laptop NVIDIA GPU can run something of the class of Falcon 40B; then the game changes. The way progress is going, this should happen this year, pretty sure.

  • @elchippe • 1 year ago +1

    Use the GGML model; it can run on a CPU with a lot of memory and can share memory with a GPU running CUDA.

  • @rahulkiroriwal8779 • 1 year ago +5

    Why the Among Us sounds lmao 😂😂😂😂

  • @essentials9030 • 8 months ago

    Thanks for this video, but please make a video implementing object detection using an LLM or a multimodal model, please.

  • @fahnub • 1 year ago

    10/10 Content. Engaging, Informative, Precise. ❤

  • @DominicMarrocco • 1 year ago

    your style of production, personality and content is excellent

  • @Arvolve • 1 year ago +1

    Wonderful informative and concise content!
    A video on Fine tunning potentially on a cloud service would be awesome!

  • @ShifraTech • 1 year ago

    Have been fine-tuning this model to learn new languages, a fun experiment indeed... I think more people need to play around with this. 😇

  • @deanchanter217 • 1 year ago

    Llama 2 with fine-tuning... crazy that as you were producing this video, a new and better model dropped.

  • @fulltimefrontend • 1 year ago +3

    How many RTX 4090s are required for Falcon 40B? 160 GB of VRAM via RTX 4090 means 6-7 cards, which is roughly $10k and still cheaper than 2 A100s. Already using RTX for Stable Diffusion.

    • @NicholasRenotte • 1 year ago

      Huh, interesting. I couldn't easily find many cloud providers that were using 4090s, there were a bunch offering H100 and A100s though!

    • @kotykd6212 • 1 year ago +1

      @@NicholasRenotte Data centers can't rent them out, but places like RunPod and Vast.ai can, since they have individual hosts instead of only data centers.

  • @abbeynguyen8396 • 11 months ago

    It is amazing. Could you please do a video about Tree-of-thoughts?

  • @rexlaurus5894 • 1 year ago

    Would have been cool to see how you set up RunPod.

  • @guimaraesalysson • 1 year ago

    Why install the torchvision and torchaudio libraries, which are for IMAGE and AUDIO data, for a text-to-text app?

  • @bvdlio • 1 year ago +1

    Yay, finally first

  • @farrukhzamir • 1 year ago

    Can we not use model.save_pretrained to save the model in the form of shards, so that when device_map="auto" is used, accelerate kicks in and offloads the shards to disk and memory? I think that's why you were getting OOM errors.

  • @ParthPatel-db4tk • 1 year ago +1

    Hello Nicholas sir, this video was really helpful for learning how to make my own chatbot. It would be helpful if you made a video on how to use LLMs to perform classification using fine-tuning techniques such as zero-shot and few-shot learning. Thanks.....

  • @chongdashu • 6 months ago

    Have you been able to use VSCode connected to a remote jupyter instance that still allows Pylance to work? e.g., so you can make use of VSCode's nifty features like cmd/ctrl clicking to see a method definition, etc.

  • @luis96xd • 1 year ago

    Amazing video, everything was well explained!

  • @DCinzi • 1 year ago

    This is so exciting and yet so demoralizing at the same time. Unless you have a really good understanding of coding and LLMs, it looks like an impossible task. And rightly so... but I wish there were more effort out there to make this way more accessible to people focused on other subjects, also because we may just end up with a lot of very superficial products.

  • @wasgeht2409 • 1 year ago +2

    Nicholas, thank you :) I have two questions. First, is it possible to use this on my personal laptop? I don't have a robust GPU; instead, I have a Mac M1 with 8 GB RAM. Second, could I train the pre-trained 40B model for a specific task in German? I'd like to use it for classifying sentences into labels. Is that feasible?

    • @AkolytosCreations • 1 year ago

      No, this requires large GPUs to run. At best it can run on a CPU, and even then extremely slowly.

    • @NicholasRenotte • 1 year ago +1

      Realistically, no, you'll need a beast GPU to run it. Didn't work on my mac.
      You would probably need to fine-tune it for that, but you would need even more GPU compute to achieve that. Tbh, if it's just sentence classification, there are much easier ways to do it: you could just use a small encoder-only model and it would probably work well!

    • @elchippe • 1 year ago +1

      TheBloke's GGML Falcon version can run on a CPU, but 8 GB is way too low for this model.

  • @heltengundersen • 1 year ago

    Please, a video on fine-tuning Falcon 40B on a large code base.

  • @i2c_jason • 1 year ago +2

    On the math thing... is the word on the street that we're going to handle math just by increasingly larger parameter counts? Because that scares the crap outta me for engineering applications where the math becomes very technical and obscure. Almost like we need a separate ALU baked into the model to make math feasible on lightweight small parameter count models.

    • @NicholasRenotte • 1 year ago +3

      Mark my words, some new architecture will come out that will boost performance with dramatically smaller parameter counts. You're right though using a separate ALU could work as well, e.g. Langchain using Wolfram. Also, I can share some of the work our research teams are doing for efficient fine tuning and building smaller parameter efficient models!

    • @LowestofheDead • 1 year ago +1

      I think they only do arithmetic tests to see how well the model can generalize. Like he said, people already use Langchain or the Wolfram plugin to do math properly.

    • @i2c_jason • 1 year ago

      @@LowestofheDead Yes, agree, but that's not going to accelerate us very far. It means you still have to be super specialized in mathematics to know how to use those tools. The promise of AI would be to get to a point where the AI model can use those tools to output highly mathematical solutions with simple prompts. For example, "imagine a geometrically correct STEP file assembly of a handheld drill"... then open it in Fusion360 and print or machine all of the parts. That is the next fundamental step change in this tech, IMO. "3D" images don't count, because they are not geometrical engineering files of reproducible physical objects.

  • @sindoc42 • 1 year ago

    What's the cheapest hardware we can find on the market to be able to run this? And I prefer it be local. Can someone help me order the required hardware for this?

  • @fuba44 • 1 year ago

    If you quantize the weights to maybe 8-bit or less, can it fit on a high-end consumer-grade GPU?

  • @adelekefikayomi8351 • 11 months ago

    Please, can it work with TensorFlow????

  • @divaxshah9424 • 1 year ago +1

    Really loving this walkthrough technique, what an amazing video, thanks a lot. Also I have 2 questions:
    1) Can I run Falcon 40B Instruct on the Colab free version, which has a Tesla T4 16GB?!
    2) Can you make a video on fine-tuning a Stable Diffusion model like SD 2.1 or SDXL to make our own checkpoints?!
    PS: really amazing video, thanks a lot ❤

    • @Woollzable • 1 year ago

      Answer to your first question: no, you cannot run Falcon 40B Instruct on the Colab free version. Falcon 40B needs 85 GB - 100 GB of VRAM at 16-bit precision. Even with precision reduced to 8-bit, it still requires some 45 GB of VRAM; at 4-bit precision, it requires 35 GB of VRAM. You need to load the entire model onto GPU memory (could be multiple GPUs).
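The VRAM figures in this reply can be sanity-checked with back-of-envelope math: the weights alone take parameter count × bytes per weight. A minimal sketch (weights only; real usage is higher because of the KV cache, activations, and framework overhead, which is consistent with the somewhat larger numbers quoted above):

```python
# Rough VRAM needed just to hold a model's weights; actual usage is higher
# (KV cache, activations, framework overhead are not counted here).
def weight_vram_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate gigabytes required for the weights alone."""
    return n_params * bits_per_weight / 8 / 1e9

for bits in (16, 8, 4):
    gb = weight_vram_gb(40e9, bits)
    print(f"Falcon-40B at {bits}-bit: ~{gb:.0f} GB for weights alone")
```

This gives roughly 80 GB, 40 GB, and 20 GB at 16-, 8-, and 4-bit for the weights alone, so the quoted 85-100 GB, 45 GB, and 35 GB figures are plausible once runtime overhead is added.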

  • @user-ui2lv9kl2v • 1 year ago

    Hey, very nice video. Can you tell me what the system requirements are to train our own model?

  • @jafferaliumar • 1 year ago +1

    Nice video and very informative. We're about to start the same testing, and it's definitely useful.

  • @eeshanchanpura3356 • 1 year ago

    Can I run this on Google Colab?
    It does provide me cloud memory, which can be helpful.

  • @Movierecap998 • 1 year ago

    I have purchased an AMD 6800 XT and I want to learn about AI and ML. Is there any chance I will be able to learn?

  • @kevynkrancenblum5350 • 1 year ago +2

    What a video! You are the best, Nic!!! Love the new video style! Looks so nice 💪🏻💪🏻

    • @NicholasRenotte • 1 year ago

      KEV?!! I didn't know you spoke French? Also, YESSSS, stoked you liked it!

  • @gluttony4778 • 1 year ago

    Any idea how one can deploy a web app like this, or one using Streamlit, online? Maybe hosted on a domain, or just integrating a chatbot into a site.

  • @wasgeht2409 • 1 year ago +1

    Thanks!

  • @pranavagrawal4324 • 1 year ago

    Hey, a video on how to train an LLM from scratch using multiple datasets from Hugging Face would be amazing.

  • @rbanondo • 1 year ago +1

    Best teacher on YouTube. Is there any chance you will make a video about working with medical images?

  • @patchshorts • 1 year ago

    what video card is required to run this?

  • @americanwayformation8717 • 1 year ago +1

    Your French isn't too bad, and it's better than my colleagues' English 😜. As an American who has lived in France for more than 10 years, I've both said and heard a lot worse.
    Love the videos! So much great info and things to learn. Thanks so much for sharing 🙏

    • @NicholasRenotte • 1 year ago

      Hahahahahah, I was honestly crying with laughter when watching the edit but let's be real it was a 3/10 performance!

  • @conradcaldeira7131 • 11 months ago

    A fine-tuning video for non-GPU users would be most appreciated.

  • @luis96xd • 1 year ago

    What is the best LLM for low RAM usage, for example for deployment on a free-tier hosting service?

    • @wichawt3079 • 1 year ago +1

      Perhaps a 13B model; "ehartford/Wizard-Vicuna-13B-Uncensored" is the highest-ranked 13B on Hugging Face's leaderboard.

    • @luis96xd • 1 year ago

      @@wichawt3079 Thank you so much for your answer! I will try

  • @foreignconta • 1 year ago

    Excellent!!!!!!! If only I could run Falcon 40B on my 4 GB GDDR6 GPU. 😂

  • @alirezagoudarzi1915 • 1 year ago

    Thanks! Hey Nick, how can I integrate LangChain code?

  • @TheAbdallahk • 1 year ago +1

    You are the GOAT! This tutorial is fire! 🔥🔥🔥🔥

  • @DanielCampbellYT • 1 year ago +1

    Great Video Nick!

  • @so_i_learn_3d549 • 1 year ago +1

    Hey Nich, thanks for your video. I have already developed a Q&A chatbot and I'm trying to implement functionality to make it read an Excel file using text generation. Do you have any idea how I can implement this? The only way I found is to use LangChain and the OpenAI API, but I'm trying to do it without the OpenAI API.

  • @ragunanthan7499 • 1 year ago

    Wonderful content, sir. Can you put up a video on how to train an LLM on our own data?

  • @rverm1000 • 1 year ago +1

    How can you overlay photos? OpenCV? I'm looking at photos taken one after another. What I find interesting is the level of detail: at first glance they look like photos taken in the 1950s, until you hit the zoom button. There are thousands of stars and a lot of stuff moving around in space, and in the raw photos you can see all of it. What I want to do is overlay 100 photos of the same area and color everything that's not in all 100 photos, to see if we can discover new objects moving in space. Here's the starting photo: jw0157126001_04201_00001_nis_trapsfilled.jpg; the target is Antennae. These photos start around 790 in the list.

  • @FunCodingwithRahul • 1 year ago +1

    Excellent video, Nich.
    I am also exploring Falcon for my domain-specific requirement using the concept of RAG with LangChain, but the model is taking too much time to generate results even after quantization. Do you have any suggestions on how to reduce the runtime? If I set max_length to less than 1000, the model is unable to generate anything. Kind of stuck with the issue!!

    • @NicholasRenotte • 1 year ago

      Yeah, I ran into this as well; the only way to see fast results is running it on big GPUs.

  • @mmmhhh. • 1 year ago +1

    What's the best model for multi-target, emotionally informed hate speech detection?

    • @NicholasRenotte • 1 year ago

      Think there's a bunch of those, probably encoder only models. I've seen a few in the HuggingFace model repo.

  • @Imnotsoumyajit • 1 year ago +1

    Congrats on your new position Nick 👏👏👏👏👏👏

  • @renegadezed • 1 year ago +1

    I'd be more interested in integrating this into a Discord or Twitch bot than some random web app...

  • @jzam5426 • 1 year ago

    Thanks for the great content!! Newish to the channel. Has it been tested against GPT-4?

  • @i2c_jason • 1 year ago +1

    Excellent work, as always. Thank you so much!!

  • @Tripp111 • 1 year ago +1

    Thank you!

  • @jorgefelipegaviriafierro705 • 1 year ago +1

    Great as always!

  • @guimaraesalysson • 1 year ago

    Great video, man

  • @adeelhasan7536 • 1 year ago +2

    PLEASE UPLOAD A VIDEO ON FINE-TUNING

  • @tkololfi5999 • 1 year ago

    Please run MPT-30b

  • @fishnchips6627 • 1 year ago

    Congratulations on your promotion!

  • @sauravkumar-sz5zx • 8 months ago

    Fine tuning video please

  • @kakamoora7874 • 1 year ago +1

    Bro, how can we do web scraping with AI? Please make a video… or give some tips, please.

    • @NicholasRenotte • 1 year ago

      I don't think you really need AI or ML for it, BeautifulSoup is your best friend!

    • @kakamoora7874 • 1 year ago

      @@NicholasRenotte Actually, we have 2000 websites; if I try BeautifulSoup it takes one month, it's such a long process…. Selenium also isn't working.
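For the 2000-site case in this thread, the bottleneck is usually sequential network I/O rather than BeautifulSoup itself, so overlapping fetches with a thread pool is the first thing to try. A minimal stdlib-only sketch (it swaps BeautifulSoup for the standard html.parser so the example stays dependency-free; the worker count and any URL list are placeholder assumptions):

```python
# Sketch: fetch and parse many sites concurrently using only the stdlib.
# html.parser stands in for BeautifulSoup so the example is dependency-free.
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags as the parser walks the document."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for k, v in attrs if k == "href" and v)

def extract_links(html):
    """Return all <a href=...> targets found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links

def scrape(url):
    """Fetch one page and return its links; the network wait dominates runtime."""
    with urlopen(url, timeout=10) as resp:
        return extract_links(resp.read().decode("utf-8", errors="replace"))

def scrape_all(urls, max_workers=32):
    """Overlap the network waits across many sites with a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(urls, pool.map(scrape, urls)))
```

Calling `scrape_all(list_of_2000_urls)` overlaps the waiting across ~32 concurrent fetches instead of doing 2000 in sequence; JavaScript-heavy pages that BeautifulSoup can't see would still need a headless browser.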

  • @telesniper2 • 6 months ago

    is it censored?

  • @sabarishrajksabarishrajk292 • 1 year ago +1

    Nice one, man. Keep rocking...

  • @cvs2010 • 1 year ago

    Best video ever

  • @CMT-p6q • 4 months ago

    I have to leave a comment for the family

  • @jonnybrabals • 1 year ago

    Give us LipNet working with our own videos! Pleaseeeeeeee man

  • @ChewDaPi • 1 year ago +1

    You should have used SageMaker, chief; it could have been cheaper.

  • @deathspainvincentblood6745 • 1 year ago +1

    Guys, I'm introducing Programming Helper. Programming Helper is so powerful and much better than OpenAI. What are you waiting for? There is Lua language support in AI Chat, and many more.

  • @chrisweeks8789 • 1 year ago +1

    So 2 A100s = 30 sec response? 😅
    Inner Tom Ford
    🤣🤣

    • @NicholasRenotte • 1 year ago +1

      LOL, honestly it started out as 30 minutes with no response on my local machine.

  • @philtoa334 • 1 year ago +1

    You speak French well, Nicho!! Thanks for your video.

    • @NicholasRenotte • 1 year ago +1

      Thanks a lot, Phil!! I don't know if it's that great though 😂

  • @khalidal-reemi3361 • 1 year ago

    Fine Tuning Pleeeeeeeeeeeeeeeeeeeeeeeeeeeeese

  • @danielgormly6064 • 1 year ago +1

    12 days later this is outdated... damn, things are moving fast.

  • @emperor1337 • 1 year ago

    Another pro-Falcon, low-impression-count video... Falcon is a third-tier LLM at best.