WizardLM 2 - First Open Model Outperforming GPT-4

  • Published on 31 May 2024
  • In this video, we test the first Open LLM that outperforms GPT-4 on MT-Bench. Open LLMs are catching up really fast.
    🦾 Discord: / discord
    ☕ Buy me a Coffee: ko-fi.com/promptengineering
    🔴 Patreon: / promptengineering
    💼Consulting: calendly.com/engineerprompt/c...
    📧 Business Contact: engineerprompt@gmail.com
    Become Member: tinyurl.com/y5h28s6h
    💻 Pre-configured localGPT VM: bit.ly/localGPT (use Code: PromptEngineering for 50% off).
    Signup for Advanced RAG:
    tally.so/r/3y9bb0
    LINKS:
    How was it trained? / 1
    How it performs? : / 1779899325868589372
    Github Repo: wizardlm.github.io/
    TIMESTAMPS:
    [00:00] Ground breaking Open LLM
    [00:58] Deep Dive into Model Training and Performance
    [01:58] Testing it with LM Studio
    [05:08] Exploring the Model's Reasoning and Writing Skills
    All Interesting Videos:
    Everything LangChain: • LangChain
    Everything LLM: • Large Language Models
    Everything Midjourney: • MidJourney Tutorials
    AI Image Generation: • AI Image Generation Tu...
  • Science & Technology

COMMENTS • 64

  • @borisrusev9474 1 month ago +13

    Command R+ is the first open model to outperform GPT-4-0314 according to the LMSYS Chatbot Arena Leaderboard.

    • @engineerprompt 1 month ago +4

      Agreed, everyone is using different benchmarks, whichever ones suit the model creators :)

  • @SoonerStoneAI 1 month ago +18

    We aren’t running out of human generated data. We are just running out of the easily internet accessible data.

    • @DynamicUnreal 1 month ago +1

      True. Humans generate insane levels of data every day outside of the internet. Companies have to look for ingenious ways to try and capture some of that data.

  • @stratos7755 1 month ago +22

    Remove censorship and it will be good.

  • @tawansunflower 1 month ago

    Thank you for the informative video! By the way, how did you record this video? The zooms and the cursor look super smooth!

  • @unclecode 1 month ago +2

    Isn't it crazy that u uploaded the video 13hrs ago, and about 5hrs later, Llama3 came out with an impressive claimed benchmark and a 400B version in training? Just 9 days ago Mixtral8x22B, then 3 days ago with WizardLM, and now Llama3! I think the table is changing; now open-source models are pushing proprietary models to improve themselves. Tbh, I think the only thing left for OpenAI to impress the market is to drop AGI :D:D

    • @engineerprompt 1 month ago

      I agree, the pace is just crazy. Hard to keep up. It's OpenAI's turn now :D

    • @engineerprompt 1 month ago

      Btw, any plans to add function calling to Llama3? That would be great.

    • @unclecode 1 month ago

      @engineerprompt haha u read my mind 😎 Working on it since morning, stay tuned, will update you soon

    • @engineerprompt 1 month ago +1

      @unclecode Awesome, will be waiting for it.

    • @cucciolo182 1 month ago

      The thing is how to pack all those tools so we can sell custom GPTs and merge them into websites 😂

  • @SeeFoodDie 1 month ago +1

    Now that Llama3 is here we can ignore all these models for a few days. Until the next best thing is released! The pace is breathtaking.

    • @engineerprompt 1 month ago +1

      I agree, I wonder if people are actually using every new model or just sticking to their old stack.

    • @kc-jm3cd 1 month ago

      Once I start downloading these I will run everything of quality that comes out looking for mostly storytelling abilities and some general knowledge ai

  • @legendarystuff6971 1 month ago +3

    First.. you know.. I miss 2015 😢

    • @PazLeBon 1 month ago +1

      i miss 2005 :/

    • @Nihilvs 1 month ago +1

      @PazLeBon I miss 500 BC

  • @BlackMita 1 month ago +8

    Zenzorzhip bad

  • @user-qb2jn9zh9i 1 month ago

    Unfortunately, I missed it and then couldn't find the part in the video that said which version was being tested.
    Maybe someone understands - the author managed to download a version that the manufacturer later removed, or will he get access to a new, improved version?

    • @pedrogorilla483 1 month ago +1

      He didn’t explain it well. What happened was that the weights for 7B and 8x22B were uploaded and then deleted. However, the license used was Apache 2.0, which allows copying and reuploading. So people who managed to download the weights before they were deleted reuploaded them fully legally. Just search on Hugging Face. Only the 70B is missing, which they never uploaded.

    • @LibertyRecordsFree 1 month ago

      MaziyarPanahi/WizardLM-2-8x22B-GGUF
      WizardLM-2-8x22B.IQ3_XS-00003-of-00005.gguf

    • @user-qb2jn9zh9i 1 month ago

      Thank you for the clarification!
      We still managed to download and post it! :)
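
      Editor's sketch: the file name quoted above ends in `-00003-of-00005`, which suggests the quantized model is split into five shards. Assuming every shard follows that same naming pattern (inferred from the single file name in this thread, not confirmed), the full list could be generated and fetched roughly like this. The helper names `shard_filenames` and `download_shards` are hypothetical; `hf_hub_download` is the download function from the `huggingface_hub` package.

```python
def shard_filenames(prefix: str, total: int) -> list[str]:
    """Build shard names of the form <prefix>-0000N-of-0000M.gguf.

    The pattern is an assumption inferred from one file name in the thread.
    """
    return [f"{prefix}-{i:05d}-of-{total:05d}.gguf" for i in range(1, total + 1)]


def download_shards(repo_id: str, filenames: list[str]) -> list[str]:
    """Download each shard and return the local file paths.

    Imported lazily so the naming helper works without huggingface_hub installed.
    """
    from huggingface_hub import hf_hub_download

    return [hf_hub_download(repo_id=repo_id, filename=name) for name in filenames]


if __name__ == "__main__":
    names = shard_filenames("WizardLM-2-8x22B.IQ3_XS", 5)
    for name in names:
        print(name)
    # Uncomment to actually download (tens of GB in total):
    # paths = download_shards("MaziyarPanahi/WizardLM-2-8x22B-GGUF", names)
```

      All shards need to end up in the same local folder before a GGUF loader can open the model.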

  • @ziad_jkhan 1 month ago +3

    Why not use open Ollama instead of closed LM Studio?

    • @kylequinn1963 1 month ago +1

      Because LM Studio has a wicked user interface and Ollama barely functions on Windows. That's my reason anyway.

    • @engineerprompt 1 month ago +1

      I tested it on Ollama but the model is generating gibberish. Still figuring out what the issue is there.

    • @ziad_jkhan 1 month ago

      @kylequinn1963 Well, it might also be wicked in the real sense. How can we know without access to the source?

    • @ziad_jkhan 1 month ago

      @engineerprompt Maybe report the issue on GitHub or Discord. That's why it is open-source after all.

    • @ziad_jkhan 1 month ago

      @engineerprompt The GitHub repository accepts bug reports

  • @ilianos 1 month ago

    If I don't want to/can't use this model locally: Does anyone know if it's already hosted somewhere online and available via API?

    • @engineerprompt 1 month ago +2

      Not this but the instruct fine-tuned version by Mistral AI is available on their platform.

    • @Gatrehs 1 month ago

      You could try checking Infermatic, not sure how their API runs though.

  • @lancemarchetti8673 1 month ago

    Does anyone know where I can test the Mistral 8x22b online, as I don't have a system that supports local models?

    • @engineerprompt 1 month ago

      Check out labs.perplexity.ai/. It's the base version, not the instruct version.

  • @snuwan 1 month ago

    There is a version of it in Ollama. Is it different?

    • @engineerprompt 1 month ago

      I have tried the latest version of Ollama (1.32) and have issues running the 4-bit version. The 8-bit version works but is too 🐌

    • @snuwan 1 month ago +1

      @engineerprompt I have an NVIDIA 3090 with 24GB VRAM so I might be able to load it. Need to try it with Ollama

  • @MonkeySimius 1 month ago

    As far as the trick question about whether Sally is John's sister and it figuring out its mistake once you pointed it out:
    You should do another test where you do specify that Sally is John's sister and then gaslight it saying the initial prompt didn't say that. I'm curious how it would respond.

    • @engineerprompt 1 month ago +1

      Interesting, will try that for sure with this and llama3.

  • @NavneetRingania_from_Bangalore 1 month ago +2

    Would the price of a hosted version of this be lower than GPT-4?

    • @engineerprompt 1 month ago

      Self hosting will be cheaper in the long run but in short term it will be more expensive.

    • @Gatrehs 1 month ago

      @engineerprompt What kinda hardware are you running this on?
      Edit: Never mind, I saw it further down.

  • @coreyhughes1456 1 month ago +1

    What are the VRAM requirements to run these models?

    • @kylequinn1963 1 month ago +1

      Massive. I'm running the Q3 variant on my machine with a 4090 and 128 GB of RAM, and the model itself is around 65 GB. That's for the 8x22B model specifically.

    • @engineerprompt 1 month ago

      I am running this on M2 Max 96GB RAM. Can run the Q3 only.

    • @williamcoleman7869 1 month ago

      I am running the Q8 model on a desktop with a 3060 12gb. It takes about 4 seconds to start writing. That's fine with me.
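
      Editor's sketch: the sizes quoted in this thread follow roughly from parameter count times bits per weight. The snippet below is a back-of-the-envelope estimate only. The figures it plugs in are assumptions, not numbers from the video: about 141B total parameters for an 8x22B mixture-of-experts model, about 3.5 bits per weight for a Q3-class quantization, and a guessed 10% allowance for KV cache and buffers. The function names are hypothetical.

```python
def estimate_weight_memory_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 10^9 bytes)."""
    total_bytes = total_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


def fits_in_memory(weights_gb: float, available_gb: float, overhead_factor: float = 1.1) -> bool:
    """Rough check; overhead_factor is a guessed allowance for KV cache and buffers."""
    return weights_gb * overhead_factor <= available_gb


if __name__ == "__main__":
    # Assumed values (see the note above), not measurements.
    q3_gb = estimate_weight_memory_gb(141, 3.5)
    print(f"8x22B at ~3.5 bits/weight: ~{q3_gb:.0f} GB")
    print("Fits in 96 GB unified memory:", fits_in_memory(q3_gb, 96))
    print("Fits in 24 GB VRAM alone:", fits_in_memory(q3_gb, 24))
```

      Under these assumptions the estimate lands in the same range as the sizes reported above. By the same arithmetic a 7B model at 8 bits needs only about 7 GB, which would fit on a 12 GB card.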

  • @alexsov 1 month ago

    Why not just finetune on benchmark questions?)

    • @ilianos 1 month ago +2

      I'm just genuinely curious: are you being sarcastic? :)
      We would need new benchmark questions then. But in my opinion, we need new benchmarks (regularly) anyway, to prevent false advertising of new models.

  • @pabloe1802 1 month ago

    Is it possible to run it using 2 GPUs? Any tutorial with LangChain?

    • @engineerprompt 1 month ago

      Yup, depending on the VRAM you have in each GPU. You will need about 48GB in total.

  • @RickySupriyadi 1 month ago +1

    I had these naughty poems in my notes. They were converted notes from my teens, and somehow they ended up inside one of my daily notes, so no wonder I could never find them. Using Ollama + Obsidian Copilot with the Dolphin model I got those old notes back, then I called all my buddies from the '90s and we all had a great time; they even remembered those silly naughty poems... ah, the beauty of uncensored LLMs.
    Without censorship, all kinds of information can be used in all kinds of ways, whether for good or for bad. Censorship in my country has already been misused in all kinds of creative, corrupt ways to secure a monopoly for the profit of a few ~they censor yet they access ~they censor yet they gain strategically ~they censor in favor of their ideology ~they censor in favor of their politics (this is fact)
    Uncensored = good will gain, bad will also gain. Let us humans thrive in information and tech.

  • @mohammadhamidi5517 1 month ago

    What hardware spec does it need to run?

    • @engineerprompt 1 month ago

      I am running this on an M2 Max with 96GB, and it takes about 50GB.

  • @efifragin7455 1 month ago +1

    The current model is not 1106... there is an updated April GPT-4 Turbo version.

  • @jaysonp9426 1 month ago +1

    This didn't age well 😂

    • @engineerprompt 1 month ago

      That's so true 😂😂😂

    • @jaysonp9426 1 month ago

      @engineerprompt I'm glad you made this though. With the news cycle I would have completely missed it!

  • @rude_people_die_young 1 month ago

    Then there was llama3 🎉😂