StableVicuna: The New King of Open ChatGPTs?

  • Published 17 Nov 2024

COMMENTS • 80

  • @jayhu6075
    @jayhu6075 1 year ago +3

    Great to see another open-source model come along. I am very curious about everything you mention with LangChain. Many thanks.

  • @rakeshrajgopalasaikrishnan5562

    Hey Sam, awesome video as usual. Can you make a video on fine-tuning as well? I hope that will be helpful for the community.

  • @4.0.4
    @4.0.4 1 year ago +13

    Considering Stability AI has the resources, I hope they eventually develop an open foundational model to replace LLaMA.

    • @woongda
      @woongda 1 year ago +1

      RWKV

    • @adamconrad5249
      @adamconrad5249 1 year ago

      @@woongda yes

    • @4.0.4
      @4.0.4 1 year ago

      @@woongda RWKV is not by Stability, and it is also a whole different architecture (nothing wrong with that). I mean, Stability AI spent something like $300 to train Vicuna, which isn't bad, but they spent around $600k to train the first few versions of Stable Diffusion.

    • @zandrrlife
      @zandrrlife 1 year ago +1

      That's what I'm saying, bro. This is their business model, and so far their models and the datasets used to train them are subpar. StableLM is trained on less than 60B tokens of GitHub data, even though there are tons of papers highlighting that learning representations from code leads to higher levels of generalized reasoning. It's basic stuff to me. They have too much money to release subpar products. 😮

  • @kenfink9997
    @kenfink9997 1 year ago

    Very excited to see your Colab/video on running this locally with LangChain.

    • @samwitteveenai
      @samwitteveenai 1 year ago

      Working on this, but unfortunately it only works 30% of the time with this model.
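
      A minimal sketch of what hooking the model into LangChain could look like (assuming LangChain's HuggingFacePipeline wrapper for local models; this is not the exact notebook code):

      import torch
      from transformers import LlamaForCausalLM, LlamaTokenizer, pipeline
      from langchain.llms import HuggingFacePipeline

      tokenizer = LlamaTokenizer.from_pretrained("TheBloke/stable-vicuna-13B-HF")
      model = LlamaForCausalLM.from_pretrained(
          "TheBloke/stable-vicuna-13B-HF",
          torch_dtype=torch.float16,
          device_map="auto",
      )
      pipe = pipeline("text-generation", model=model, tokenizer=tokenizer,
                      max_new_tokens=256)
      llm = HuggingFacePipeline(pipeline=pipe)  # drop-in LLM for chains/agents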

    • @kenfink9997
      @kenfink9997 1 year ago

      @@samwitteveenai Thanks for the reply! Can LangChain adapt the prompts for Vicuna? Have you tried other models with LangChain, like OpenAssistant-SFT-7-Llama-30B or Wizard Vicuna 13B? Which local models have you found work best or worst so far?

    • @samwitteveenai
      @samwitteveenai 1 year ago

      @@kenfink9997 To be honest, for work we just trained our own. I did try adapting the prompts for a few of these, and I do think that should be possible, but I haven't had great success doing it that way yet. I may just release a model aimed at this.

  • @bingolio
    @bingolio 1 year ago

    Excellent as always, running out of superlatives (note to self: query GPT 😊) THANKS

  • @vishalgoklani
    @vishalgoklani 1 year ago +2

    Could you please make a video on how to do the 4-bit conversion of a model like this, thank you!
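
    In the meantime, a rough sketch of 4-bit loading (assuming a transformers version recent enough to support BitsAndBytesConfig with 4-bit, plus the bitsandbytes and accelerate packages installed; not something verified in the video):

    import torch
    from transformers import BitsAndBytesConfig, LlamaForCausalLM

    # 4-bit quantized weights; compute still happens in fp16
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    model = LlamaForCausalLM.from_pretrained(
        "TheBloke/stable-vicuna-13B-HF",
        quantization_config=bnb_config,
        device_map="auto",
    )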

  • @vicentejavieraguilerayeven14
    @vicentejavieraguilerayeven14 1 year ago +1

    Hi Sam, thanks for your content! You are amazing. I'm really excited to see your next video about LangChain, ReAct and StableVicuna. I've been struggling to move off the OpenAI LLM and run LangChain tools in a decent way without spending a sh*t ton of money 😂

  • @luis96xd
    @luis96xd 1 year ago

    Amazing video, this model looks interesting, thanks for sharing

  • @zoombapup
    @zoombapup 1 year ago +5

    Another nice video, thanks Sam. I've been getting into LangChain a lot recently as a decent abstraction over the different LLMs, but in terms of videos, have you considered looking over the alternatives to LangChain? I've seen about half a dozen similar ones, like Chameleon and that C# one whose name escapes me right now. I've tried a bunch of the different models, but honestly the whole licensing thing for LLaMA/Alpaca/Vicuna and the like really annoys me. Because I can't build anything commercially, I have to mess around finding patches to models, etc. I just wish someone would release a properly open-source base model that everyone can customize, and be done with the LLaMA licensing issues.

    • @samwitteveenai
      @samwitteveenai 1 year ago +2

      The truly open-source models are coming. As for other frameworks, there is at least one I am looking at doing some vids for.

    • @ranu9376
      @ranu9376 1 year ago +4

      @@samwitteveenai Waiting for the RedPajama model to be out!

    • @samwitteveenai
      @samwitteveenai 1 year ago +2

      Agreed. I look at all of these as trial runs for once that is available, etc.

  • @scitechtalktv9742
    @scitechtalktv9742 1 year ago +1

    I was trying to run the Colab notebook, but it crashes due to insufficient-memory errors. Is it ONLY possible to run this on the paid Colab Pro version? How could I run this on the free version of Colab, or perhaps even locally on a PC in a Jupyter notebook? Perhaps using the 4-bit version of the LLM, as you mention?

    • @samwitteveenai
      @samwitteveenai 1 year ago +1

      Yeah, unfortunately you need a GPU with a lot of VRAM to run this, so Colab free isn't going to work.

  • @fontenbleau
    @fontenbleau 1 year ago

    I like to play with AI by setting up a competition between two of them; they can really write a book together, correcting each other's errors, like Microsoft's cognitive AI (ChatGPT-4) with Open Assistant from Hugging Face, each reviewing and consulting the other (both have competing pros and cons). For now it's only possible with manual copy-paste; someday it will be automatic, in one window.
    In 30 minutes they made a great business project together, with careful document creation (correcting each other) and even a full memo on how best to pitch it to investors; the presentation part was thoroughly edited, about 5 times, down to the most distilled, simplest form, with advice for the presenter.
    It will be an interesting world soon.

  • @MarcelSamyn
    @MarcelSamyn 1 year ago

    Would it be easier to first train a model that predicts RLHF preferences, and then use that to "self-train" one of these LLMs? So you'd build a kind of generative adversarial network that way.

    • @samwitteveenai
      @samwitteveenai 1 year ago

      Yeah, this is what Constitutional AI does. I made a couple of videos about that and go through RLAIF in them.

  • @ChrisadaSookdhis
    @ChrisadaSookdhis 1 year ago +1

    If I want to further fine-tune these RLHF fine-tuned models on domain-specific data, is preparing an instruction-response dataset the only way? I have domain-specific data, but it is just a corpus.

    • @samwitteveenai
      @samwitteveenai 1 year ago +1

      It is probably the best way currently. You can do things like more pre-training on a particular corpus, just doing next-token prediction, and then do SFT on instruction/response data. I.e., for a finance model you can do more pre-training on that kind of vocabulary and then do the instruction/response tuning.
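
      A minimal sketch of stage 1 of that recipe (the file name finance_corpus.txt is a hypothetical placeholder; standard transformers Trainer APIs are assumed, and a 13B model would in practice need multiple large GPUs for this):

      from datasets import load_dataset
      from transformers import (AutoModelForCausalLM, AutoTokenizer,
                                DataCollatorForLanguageModeling, Trainer,
                                TrainingArguments)

      tok = AutoTokenizer.from_pretrained("TheBloke/stable-vicuna-13B-HF")
      tok.pad_token = tok.eos_token  # LLaMA tokenizers ship without a pad token
      model = AutoModelForCausalLM.from_pretrained("TheBloke/stable-vicuna-13B-HF")

      # Stage 1: plain next-token prediction over the raw domain corpus
      ds = load_dataset("text", data_files="finance_corpus.txt")["train"]
      ds = ds.map(lambda ex: tok(ex["text"], truncation=True, max_length=512),
                  batched=True, remove_columns=["text"])

      trainer = Trainer(
          model=model,
          args=TrainingArguments(output_dir="domain-pretrain",
                                 per_device_train_batch_size=1),
          train_dataset=ds,
          data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
      )
      trainer.train()
      # Stage 2 would repeat this with instruction/response pairs rendered
      # into the model's prompt template (the SFT step).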

    • @ChrisadaSookdhis
      @ChrisadaSookdhis 1 year ago

      @@samwitteveenai Thank you, Sam. I appreciate your comment. I don't have the means to collect instruction/response data atm, and my single 24GB GPU is likely not able to pre-train the whole model (just LoRA should be OK). Maybe I will first try some automated way to generate QA data from the corpus.

  • @fulin3397
    @fulin3397 1 year ago

    Hi Sam. Great video as always! I have learned a lot from you. One quick question: I don't think the temperature parameter has any effect in this case, since sampling is not enabled and the model just does greedy search at inference time (i.e., by default do_sample=False)?
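
    That matches how transformers' generate() behaves: with the default do_sample=False it decodes greedily and temperature is ignored. A short sketch (assuming the model and tokenizer loaded earlier in the notebook, and StableVicuna's "### Human / ### Assistant" prompt format):

    prompt = "### Human: What is your name?\n### Assistant:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=128,
        do_sample=True,    # without this, decoding is greedy and
        temperature=0.7,   # temperature has no effect
        top_p=0.9,
    )
    print(tokenizer.decode(output[0], skip_special_tokens=True))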

  • @Hypersniper05
    @Hypersniper05 1 year ago +1

    This model can follow complex instructions the best so far, on par with the GPT-3.5 Turbo API. I think this is the new open-source king for us peasants lol

  • @patrickmcguinness1363
    @patrickmcguinness1363 1 year ago +1

    You say you need an A100 to run this. I am wondering if a 24GB Nvidia RTX 4090 is big enough for this 13B model with 8-bit quantization. Possible? How much A100 memory is it using?

    • @samwitteveenai
      @samwitteveenai 1 year ago

      I didn't check the usage, but I know others have run some of the similar models on a 3090/4090 with 24GB, etc., so I think it should work.
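
      A sketch of the 8-bit load (assuming bitsandbytes and accelerate are installed; not verified on a 24GB card here):

      from transformers import LlamaForCausalLM, LlamaTokenizer

      tokenizer = LlamaTokenizer.from_pretrained("TheBloke/stable-vicuna-13B-HF")
      model = LlamaForCausalLM.from_pretrained(
          "TheBloke/stable-vicuna-13B-HF",
          load_in_8bit=True,   # roughly halves memory vs fp16
          device_map="auto",
      )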

  • @Atlent112
    @Atlent112 1 year ago +1

    Pretty sure by now all teams working on camelids and other LLMs are making sure to train their models to answer your standard questions :D

    • @samwitteveenai
      @samwitteveenai 1 year ago +1

      lol this is something that did occur to me as I recorded this.

  • @zandrrlife
    @zandrrlife 1 year ago

    Tbh their LLM releases have been subpar compared to what's already out. The datasets could have been better. I mean, their whole business model is around open source, so where is the quality? The fact that I have a better dataset is comical to me. Meta has honestly been the 🐐 in the space lately.
    Great content as usual.

  • @winglight2008
    @winglight2008 1 year ago

    I opened the Colab and ran all, then it failed with "CUDA error: invalid device function". Is that caused by my free Colab account?

    • @samwitteveenai
      @samwitteveenai 1 year ago

      Yes, this won't work with the free Colab, unfortunately.

  • @wanfuse
    @wanfuse 1 year ago

    Use this model (or better, another) to gather data from free sources, refactor, and retrain; that avoids the license issue, yes?

  • @anthanh1921
    @anthanh1921 1 year ago

    Great quality as always! I notice you have done a lot of videos using an A100 in Colab Pro+. Have you ever faced the case of running out of compute units? If that happens, what GPU does Google give us for the rest of the month?

    • @samwitteveenai
      @samwitteveenai 1 year ago

      Good question. I think it goes back to just CPU; I am not sure these days. I tend to make the vids in Colab Pro, but I often use a custom Colab with a different backend.

  • @janalgos
    @janalgos 1 year ago

    How does this compare to the GPT4AllxAlpaca model? That seems to be the best one I've come across so far.

  • @hiawoood
    @hiawoood 1 year ago

    You are my favorite llama 🎉

  • @dragonwave2652
    @dragonwave2652 1 year ago

    Hi Sam, thank you for your videos. I have 12 GB of RAM; how can I train/fine-tune LLM models? Or how much video memory should I buy to fine-tune them?

    • @samwitteveenai
      @samwitteveenai 1 year ago

      You should be able to run some of the 3B models. Generally, for the bigger models you will need at least a consumer card with 24GB of RAM.

    • @dragonwave2652
      @dragonwave2652 1 year ago

      @@samwitteveenai Thank you for your answer. I am trying to fine-tune Vicuna on 24GB of RAM, but got this error: "Cannot copy out of meta tensor; no data!"
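
      For reference, a common LoRA recipe of the era using the peft library, which keeps a 13B model within roughly 24GB by freezing the 8-bit base weights (a sketch of the usual setup, not a confirmed fix for the meta-tensor error):

      from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training
      from transformers import LlamaForCausalLM

      model = LlamaForCausalLM.from_pretrained(
          "TheBloke/stable-vicuna-13B-HF", load_in_8bit=True, device_map="auto")
      model = prepare_model_for_int8_training(model)  # freeze base weights, cast norms

      lora_config = LoraConfig(r=8, lora_alpha=16,
                               target_modules=["q_proj", "v_proj"],
                               lora_dropout=0.05, task_type="CAUSAL_LM")
      model = get_peft_model(model, lora_config)
      model.print_trainable_parameters()  # only the small adapter weights train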

  • @sonOfLiberty100
    @sonOfLiberty100 1 year ago

    It's hard to find a good commercial-use model.

  • @holthuizenoemoet591
    @holthuizenoemoet591 1 year ago

    Can you run it on a 24GB card?

    • @samwitteveenai
      @samwitteveenai 1 year ago

      Yes, I think so, though I was using a 40GB card in the video.

  • @siegfriedcxf
    @siegfriedcxf 1 year ago +1

    Don't know why they call this open source; none of the LLaMA models is really open source.

  • @emmanuelkolawole6720
    @emmanuelkolawole6720 1 year ago

    How does the RLHF piece work?

    • @DJWESG1
      @DJWESG1 1 year ago

      Reinforcement Learning from Human Feedback.

    • @samwitteveenai
      @samwitteveenai 1 year ago

      When I get some time I will make a proper video going through how it works. If you look at my video on Constitutional AI, it has a bit about it in there.

    • @emmanuelkolawole6720
      @emmanuelkolawole6720 1 year ago +1

      Is it continuously learning from human feedback, or was that during the training period only?

  • @JoshuaGabriel-x9d
    @JoshuaGabriel-x9d 1 year ago

    Hi Sam, great content as usual! I am a doctor from Australia working on a SaaS product for GPs to use and would love some guidance regarding AI integration. Do you offer consulting services, and if so, can you be reached by email? Thank you and kind regards, Dr Gabriel

  • @andrewdunbar828
    @andrewdunbar828 1 year ago

    Vicuñas eat jalapeños

  • @NNokia-jz6jb
    @NNokia-jz6jb 1 year ago

    You need a $2000 graphics card.

  • @avi7278
    @avi7278 1 year ago +5

    I just don't get how any of these models are useful. They are really poor in comparison to OpenAI's models. I would like to hear your opinion on that. I don't see any more huge advances coming out of these sorts of models. Also, who wants a model that is worse than OpenAI's but still has all the same filters and restrictions? I just don't get it.

    • @samwitteveenai
      @samwitteveenai 1 year ago +11

      Lots of people want models that don't require them to share their data with OpenAI. I agree that these still have a long way to go compared to OpenAI for open-domain chat. Most business cases don't need open-domain chat; they need something that can be fine-tuned to be really good within a limited, closed domain. I do understand that people want to have non-restrictive models. This is something I have actually spent most of today testing and working on.

    • @sharperd2
      @sharperd2 1 year ago

      So, Sam, would you say that these latest open LLMs, Vicuna, etc., could be used as a base for fine-tuning in specific domains? Is that what private businesses are looking for?

    • @adamstewarton
      @adamstewarton 1 year ago

      OpenAssistant is not a bad model if you are looking to chat with a bot.

    • @samwitteveenai
      @samwitteveenai 1 year ago

      These, and more often the 30B models.

    • @darkestmagi
      @darkestmagi 1 year ago +3

      In addition to what others said: using OpenAI is a privacy nightmare for anything that needs to be kept secure, such as customer records, proprietary code, business secrets and planning.

  • @klammer75
    @klammer75 1 year ago

    I'm really liking these datasets, if not the models, but those just keep getting better and better! Awesome work once again, Sam, and thank you for what you do! 🦾🤖😎

  • @PhilipZeplinDK
    @PhilipZeplinDK 1 year ago +1

    Bummer that the Colab isn't working anymore :(
    It spits out an error when you get to the tokenizer = LlamaTokenizer.from_pretrained("TheBloke/stable-vicuna-13B-HF") line.

  • @maformedrequest1889
    @maformedrequest1889 1 year ago

    Hey Sam, can you change your Colab to load the model with these settings? It's running 3x faster, with better accuracy, at only 26GB memory usage:

    import torch
    from transformers import LlamaForCausalLM

    # fp16 weights with automatic device placement keeps memory usage down
    base_model = LlamaForCausalLM.from_pretrained(
        "TheBloke/stable-vicuna-13B-HF",
        torch_dtype=torch.float16,
        low_cpu_mem_usage=True,
        device_map='auto',
    )

  • @clray123
    @clray123 1 year ago

    Based on some initial testing, this model (TheBloke/stable-vicuna-13B-GGML, the 4_2-bit quantized version with llama.cpp) is way more incoherent than eachadea/ggml-vicuna-13b-1.1. I also notice it is broken by all the censorship BS and keeps babbling about content policies and restrictions. I'd recommend using the uncensored version of Vicuna instead.

  • @taimoorneutron2940
    @taimoorneutron2940 1 year ago

    Your code throws this error, can you help me out here?
    NameError: name 'init_empty_weights' is not defined
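
    One possible cause, as an assumption rather than anything from the video: init_empty_weights lives in the accelerate package, which from_pretrained relies on when device_map is set, so this NameError often just means accelerate is missing or outdated in the runtime:

    !pip install -U accelerate   # then restart the runtime and re-run the notebook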

    • @samwitteveenai
      @samwitteveenai 1 year ago

      Make sure you have a GPU that can run the model.

    • @taimoorneutron2940
      @taimoorneutron2940 1 year ago

      @@samwitteveenai Yes, I am using the Colab GPU, and my local card is a 3040.

    • @samwitteveenai
      @samwitteveenai 1 year ago

      @@taimoorneutron2940 Are you using Colab Pro+? You will need an A100 to run it (check with !nvidia-smi -L).

    • @taimoorneutron2940
      @taimoorneutron2940 1 year ago

      @@samwitteveenai I have Pro+; I will show you updates, let me update you.

  • @harrykekgmail
    @harrykekgmail 1 year ago

    thanks