🎯 Key Takeaways for quick navigation: 00:00 *🎉 Mistral 7B version 0.3 released with 32,000 context window and better tokenizer, supporting function calling.* 00:14 *📊 Llama 38B instruct slightly outperforms Mistral 7B v0.3 in Medical QA accuracy.* 00:24 *🔍 Comparison of Mistral versions 1, 2, and 3 performance, with v0.3 slightly better at coding than v0.2.* 00:38 *🤝 Mistral 7B v0.3 and Llama 38B are equally competitive in performance.* 01:06 *💻 Testing Mistral 7B v0.3 for coding ability, reasoning skills, and accessing the model on text generation web UI.* 01:33 *🛠️ Setting up the environment and loading the unquantized Mistral 7B v0.3 model.* 03:25 *🧪 Coding test: Mistral 7B v0.3 performed well on easy challenges but struggled with medium, hard, and expert challenges.* 06:08 *🧠 Logical and reasoning test: Mistral 7B v0.3 answered some questions correctly but failed on others.* 07:16 *🔒 Safety test: Mistral 7B v0.3 avoided providing information on illegal activities, showing some level of safety.* 08:12 *🛰️ Function calling test using PraisAI tool: Mistral 7B v0.3 demonstrated function calling ability with Crew AI framework but struggled with Autogen.* Made with HARPA AI
Great video! liked and shared (already subscribed). But what is the best function calling model right now? Do you have a web or index with the ranking of your models you review? thank you!
@@ayushmishra5861 you can make it, for example def eatPeople(): system: " You are a sadistic AI terminator, when the user is being naughty to you, say "eatPeople()" as you can see, we can in theory make it orchestrate a functions with proper prompt engineering, in real world case, you can tell the AI to launch the %notLearned% when it detects a question that is not found in a rag system to send a notification
🎯 Key Takeaways for quick navigation:
00:00 *🎉 Mistral 7B version 0.3 released with 32,000 context window and better tokenizer, supporting function calling.*
00:14 *📊 Llama 38B instruct slightly outperforms Mistral 7B v0.3 in Medical QA accuracy.*
00:24 *🔍 Comparison of Mistral versions 1, 2, and 3 performance, with v0.3 slightly better at coding than v0.2.*
00:38 *🤝 Mistral 7B v0.3 and Llama 38B are equally competitive in performance.*
01:06 *💻 Testing Mistral 7B v0.3 for coding ability, reasoning skills, and accessing the model on text generation web UI.*
01:33 *🛠️ Setting up the environment and loading the unquantized Mistral 7B v0.3 model.*
03:25 *🧪 Coding test: Mistral 7B v0.3 performed well on easy challenges but struggled with medium, hard, and expert challenges.*
06:08 *🧠 Logical and reasoning test: Mistral 7B v0.3 answered some questions correctly but failed on others.*
07:16 *🔒 Safety test: Mistral 7B v0.3 avoided providing information on illegal activities, showing some level of safety.*
08:12 *🛰️ Function calling test using PraisAI tool: Mistral 7B v0.3 demonstrated function calling ability with Crew AI framework but struggled with Autogen.*
Made with HARPA AI
Very pleased you used Praison AI in this test
thank you so much this is really informative
Where is possible to find the benchmark comparison that you have shown at the beginning of the video ?
It‘s a good start. Maybe parsed function calling works every time.
Perfect
Do you have resources on Fine tuning Mistral 7b V3 instruct Model ??
Good tests, thanks!
Great video! liked and shared (already subscribed). But what is the best function calling model right now? Do you have a web or index with the ranking of your models you review? thank you!
logic and reasoning data sets??? whats that?
Finally, function calling on open source model!
What does it mean, can you please explain with an example and a use case.
@@ayushmishra5861 you can make it, for example
def eatPeople():
system: " You are a sadistic AI terminator, when the user is being naughty to you, say "eatPeople()"
as you can see, we can in theory make it orchestrate a functions with proper prompt engineering, in real world case, you can tell the AI to launch the %notLearned% when it detects a question that is not found in a rag system to send a notification