Mistral v3 Released! Did it Pass the Coding Test?

COMMENTS • 13

  • @MarcusNeufeldt 5 months ago +3

    🎯 Key takeaways for quick navigation:
    00:00 *🎉 Mistral 7B v0.3 released with a 32,000-token context window and a better tokenizer, supporting function calling.*
    00:14 *📊 Llama 3 8B Instruct slightly outperforms Mistral 7B v0.3 in medical QA accuracy.*
    00:24 *🔍 Comparison of Mistral 7B v0.1, v0.2, and v0.3 performance, with v0.3 slightly better at coding than v0.2.*
    00:38 *🤝 Mistral 7B v0.3 and Llama 3 8B are roughly equally competitive in performance.*
    01:06 *💻 Testing Mistral 7B v0.3 for coding ability and reasoning skills, and running the model in Text Generation Web UI.*
    01:33 *🛠️ Setting up the environment and loading the unquantized Mistral 7B v0.3 model.*
    03:25 *🧪 Coding test: Mistral 7B v0.3 performed well on easy challenges but struggled with medium, hard, and expert challenges.*
    06:08 *🧠 Logic and reasoning test: Mistral 7B v0.3 answered some questions correctly but failed on others.*
    07:16 *🔒 Safety test: Mistral 7B v0.3 avoided providing information on illegal activities, showing some level of safety.*
    08:12 *🛰️ Function calling test using the PraisonAI tool: Mistral 7B v0.3 demonstrated function calling with the CrewAI framework but struggled with AutoGen.*
    Made with HARPA AI

  • @basilbrush7878 5 months ago

    Very pleased you used Praison AI in this test

  • @Storytelling-by-ash 5 months ago

    Thank you so much, this is really informative.

  • @LorenzoCassano-l2h 5 months ago

    Where can I find the benchmark comparison that you showed at the beginning of the video?

  • @MeinDeutschkurs 5 months ago

    It's a good start. Maybe parsed function calling works every time.

  • @hewramanwaran6444 5 months ago

    Perfect

  • @Nagadurgasai 11 days ago

    Do you have resources on fine-tuning the Mistral 7B v0.3 Instruct model?

  • @ergun_kocak 5 months ago

    Good tests, thanks!

  • @Maisonier 5 months ago

    Great video! Liked and shared (already subscribed). But what is the best function calling model right now? Do you have a website or index with a ranking of the models you review? Thank you!

  • @envoy9b9 5 months ago

    Logic and reasoning datasets? What are those?

  • @darrenhinde2971 5 months ago +1

    Finally, function calling on an open-source model!

    • @ayushmishra5861 5 months ago

      What does that mean? Can you please explain with an example and a use case?

    • @Psychopatz 5 months ago

      @ayushmishra5861 You can set it up yourself. For example:
      def eatPeople():
      system: "You are a sadistic AI terminator. When the user is being naughty to you, say 'eatPeople()'."
      As you can see, we can in theory make it orchestrate functions with proper prompt engineering. In a real-world case, you can tell the AI to trigger %notLearned% when it detects a question that is not covered by a RAG system, and use that to send a notification.
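
The prompt-based approach described in the reply above can be sketched roughly as follows. This is a minimal illustration, not how Mistral 7B v0.3's built-in function calling works: the system prompt asks the model to answer with a `%marker%`, and plain Python scans the reply and runs the matching function. The names `TOOLS`, `notify_not_learned`, `dispatch`, and `fake_model` are hypothetical, and `fake_model` is only a stand-in for a real model call.

```python
import re

# Hypothetical tool: in a real system this might send an alert or log a ticket.
def notify_not_learned(question: str) -> str:
    return f"notification sent: no RAG entry for {question!r}"

# Registry of marker names the model is allowed to trigger (assumed names).
TOOLS = {"notLearned": notify_not_learned}

# Assumed system prompt telling the model when to emit the marker.
SYSTEM_PROMPT = (
    "Answer only from the provided context. If the question is not covered, "
    "reply with exactly %notLearned% and nothing else."
)

MARKER = re.compile(r"%(\w+)%")

def dispatch(model_reply: str, question: str) -> str:
    """Run the tool named by a %marker% in the reply; otherwise pass the reply through."""
    match = MARKER.search(model_reply)
    if match and match.group(1) in TOOLS:
        return TOOLS[match.group(1)](question)
    return model_reply

def fake_model(question: str) -> str:
    """Stand-in for a real LLM call that has been given SYSTEM_PROMPT."""
    known = {"What is the return policy?": "Returns are accepted within 30 days."}
    return known.get(question, "%notLearned%")

if __name__ == "__main__":
    for q in ["What is the return policy?", "Who won the 1998 World Cup?"]:
        print(dispatch(fake_model(q), q))
```

The allow-list in `TOOLS` matters in this design: the model's text is never executed directly, it only selects among functions the developer registered.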