Claude 3 vs ChatGPT in Street Fighter | Local 7B Model Tournament (Mistral, Gemma ++)

  • Published 23 Oct 2024

COMMENTS • 39

  • @FerGodSakes219 6 months ago +3

    Just wanted to say thanks for another great video. Discovered your channel last weekend and really appreciate your content. thank you!

    • @AllAboutAI 6 months ago +1

      thnx a lot! really appreciate it, glad you're enjoying the content :) let me know if you have any questions!

  • @joannot6706 6 months ago +1

    That was so interesting, please do more of this!

    • @AllAboutAI 6 months ago +2

      thnx :) yeah this was a lot of fun to create. i have way more ideas so will def do more of these in the future!

    • @joannot6706 6 months ago

      @AllAboutAI Awesome

  • @TheHistoryCode125 6 months ago +1

    This video demonstrates how to set up and use an open-source project called LLM Coliseum that allows you to evaluate large language models in real-time using the video game Street Fighter. The process involves installing Docker, cloning the GitHub repository, and setting up API keys. The video then shows how to pit OpenAI's GPT-3.5 model against Anthropic's Claude in the game.

    • @AllAboutAI 6 months ago +2

      nice, sounds like a super fun project! i've had a blast playing around with it. as i mentioned in the video, i've made a few tweaks to the open source code so you can use models like claude 3 haiku as well. def checkout the github if you become a channel member, i upload all my code and experiments there. let me know if you have any other questions!

    • @orterves 6 months ago +3

      This is an AI generated summary of the video

  • @negadan77 5 months ago +1

    We need to do a fight test with GPT-4o vs GPT-4

  • @DemiGoodUA 6 months ago

    It would be cool to see the same thing, but without the response time dependency. It is more interesting to see who is smarter, not who is faster.

  • @peterkonrad4364 6 months ago +1

    What I've always wondered, also with games like Connect 4 and so on: if there's a strategy A that always beats strategy B, and strategy B always beats strategy C, does that automatically mean that A beats C? Or could it be that C beats A? That would mean we have a kind of rock-paper-scissors situation, which would be much more fun. You would first have to identify which strategy your opponent is using and then react with something you know beats it. It would also mean there is no single strategy that beats everything; a single dominant strategy would be kind of boring.

    • @AllAboutAI 6 months ago +1

      thnx, that's a really good point. i think you're absolutely right, it could be a rock paper scissors type situation. that's something i've been thinking about as well. it would definitely make it more fun and challenging, having to identify your opponent's strategy and then counter it. i'm going to have to experiment more with that. i agree, having a single optimal strategy that beats everything would be a bit boring. the fun is in trying to outsmart and outmaneuver your opponent. great insights, thanks for sharing!

    • @coryarmbrecht 6 months ago

      @AllAboutAI Is this where things like AlphaStar and AlphaGo come in?
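The rock-paper-scissors question raised in this thread can be checked mechanically once you have match results. Below is a minimal sketch, assuming the results are summarised as a "who reliably beats whom" table. The strategy names and both tables are invented for illustration; they are not results from the video or from LLM Coliseum.

```python
from itertools import permutations


def find_cycles(beats):
    """Return triples (a, b, c) where a beats b, b beats c, and c beats a.

    `beats` maps each strategy name to the set of strategies it beats.
    Each cycle is reported once, starting from its alphabetically first member.
    """
    cycles = []
    for a, b, c in permutations(sorted(beats), 3):
        if a < b and a < c and b in beats[a] and c in beats[b] and a in beats[c]:
            cycles.append((a, b, c))
    return cycles


def dominant_strategies(beats):
    """Return the strategies that beat every other strategy in the table."""
    names = set(beats)
    return [s for s in sorted(beats) if beats[s] >= names - {s}]


# Hypothetical, made-up tables (assumptions, not measured data).
cyclic = {
    "aggressive": {"defensive"},
    "defensive": {"counter"},
    "counter": {"aggressive"},
}
ranked = {
    "a": {"b", "c"},
    "b": {"c"},
    "c": set(),
}

print(find_cycles(cyclic), dominant_strategies(cyclic))
print(find_cycles(ranked), dominant_strategies(ranked))
```

If `find_cycles` returns anything, "A beats B" and "B beats C" do not imply "A beats C" for that table, and `dominant_strategies` will not contain any member of a cycle.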

  • @nic-ori 6 months ago

    Thank you. Useful information.👍👍👍

    • @AllAboutAI 6 months ago

      thnx mate :) happy to help! let me know if there is anything else i can assist with.

  • @App2bits 6 months ago +2

    What is your hardware setup to work with these models in parallel?

    • @AllAboutAI 6 months ago +1

      thnx for tuning in! i am actually running this on my personal GPU, so i have a decent rig. but i agree, having multiple gpus would be ideal to play with different models in parallel. im just using this for a bit of fun and learning, not for anything production ready. let me know if you have any other questions!

    • @bossgd100 3 days ago

      @AllAboutAI yes, but which GPU?

  • @App2bits 6 months ago

    @AllAboutAI, could you explain why you used WSL instead of plain Windows? :)

    • @AllAboutAI 6 months ago

      ah yeah, i just found it a bit easier to run everything on linux for this project. the docker setup and all that just seemed to work a bit better for me on wsl. but no worries if you just wanna use windows instead, should still work fine :)

  • @rishabhsingh1406 6 months ago

    I think using Groq for inference and then having the models battle could make it even more fun

    • @AllAboutAI 6 months ago

      oh yeah, that's a great idea! i've actually been looking into using groq myself recently. i think it could add a really cool extra layer to the battle sim. i'll definitely give that a try and see how it goes. thanks for the suggestion, it sounds super fun!

  • @Grandork 6 months ago

    Why use GPT-3.5 and Claude Haiku instead of GPT-4 and Claude Opus?

    • @AllAboutAI 6 months ago +1

      thnx for the question. i decided to use gpt 3.5 turbo and claude haiku for a few reasons. first, they are a bit more accessible and affordable compared to the more powerful gpt-4 and claude opus models. i wanted to make this project something anyone could try out. plus, i've found the haiku and 3.5 models to be surprisingly capable for things like this. but you're right, the newer models could potentially offer even better performance! i'll have to experiment more with those in the future.

  • @jana171 6 months ago +1

    It's like you just taught Skynet that a less aggressive battleplan will get you the win in the end... maybe humanity will survive a few more years due to your research 🙂

    • @AllAboutAI 6 months ago +2

      haha cheers mate :) yeah i def think there is a lot of potential in using llms strategically for different tasks. its just a bit of fun and learning, but who knows what the future holds!

    • @jana171 6 months ago

      @AllAboutAI Yeah, I totally loved this. We could be looking at an entirely new sport here, or a complete shift in how models and hardware are measured. Epic!

  • @wurstelei1356 6 months ago

    Aw, I thought it was doing the multimodal thing, but this is just the keyboard combos as text. Multimodal would be way too slow, I guess.

    • @AllAboutAI 6 months ago +1

      yeah, i agree. the keyboard combos are just a quick demo, the real power is in the multimodal stuff. but that does take a lot more compute, so its not quite ready for prime time yet. i'll try to do a more in-depth tutorial on the multimodal stuff soon!

    • @wurstelei1356 6 months ago

      @AllAboutAI Nice, I am collecting everything I can about multimodal AI, especially the robot-controlling kind. It would be nice to see a tutorial here on controlling a cheap robot arm with a multimodal model, or maybe with a Google robot transformer.

  • @ReNiCGaming 6 months ago

    holy shit

    • @AllAboutAI 6 months ago +1

      tnx! yeah, i tried to make this project as fun and engaging as possible. if you wanna get the code, just sign up as a member and i'll invite you to the community github :)

    • @ReNiCGaming 6 months ago

      I've played a whole lot of SF3 (check the channel). If you want help refining prompts/"moves" and optimal strategy, I'd love to help.

    • @ReNiCGaming 6 months ago

      @AllAboutAI if I can find the time I may.. it would be interesting to play AGAINST AI.

  • @qadirtimerghazin 6 months ago

    I guess it may be OK in WSL, but seeing stuff done as root freaks me out :)

    • @AllAboutAI 6 months ago +2

      haha yeah, i know what you mean. i try to avoid root where possible too. but sometimes it just makes things easier, ya know? anyways, hope you're still enjoying the vids! let me know if you have any other questions.