- 20
- 16 475
Florenz Erstling
Приєднався 6 лис 2016
"A person who never made a mistake never tried anything new." - Albert Einstein
Don't Use Vision AI Before Watching This (2024 Complete Test) ⚠️
🤖 Ultimate Vision AI Model Comparison 2024 - Complete Benchmark & Testing Guide
🌟 Official Model Documentation:
- GPT-4V: platform.openai.com/docs/guides/vision
- Claude Vision: docs.anthropic.com/en/docs/build-with-claude/vision
- Mistral Vision: docs.mistral.ai/capabilities/vision/
- LLaVA-13B: replicate.com/yorickvp/llava-13b
- Together AI: docs.together.ai/docs/vision-overview
🔍 In this video I put the leading Vision AI models through an extensive real-world test. From document analysis to visual classification, I reveal which models excel at practical tasks and which ones fall short. Whether you're building enterprise solutions or choosing the right model for your project, this comprehensive benchmark will save you weeks of testing.
💡 Key Highlights:
- Complete comparison of GPT-4V, Claude 3.5, Pixtral, LLaVA, and Llama 3.2
- Real-world testing scenarios including document verification & quality control
- Cost comparison between commercial and open-source options
- Step-by-step **AI Tutorial** for implementing vision models
- Advanced **AI Automation** techniques for scaling vision tasks
🎯 Perfect for:
- Developers building vision-enabled applications
- Companies evaluating vision AI solutions
- Anyone interested in practical **AI Agents** implementation
- Engineers working on **tutorial** projects
⚡ The benchmark includes:
- Document Analysis & Text Extraction
- Object Detection & Classification
- Quality Assessment Systems
- Visual Verification Workflows
- Performance Metrics & Cost Analysis
⚙️ Testing Categories:
- Signature Detection & Document Verification
- Object Counting & Scene Understanding
- Quality Control & Safety Assessment
- Scientific Classification & Recognition
- Real-time Performance Analysis
🔥 Don't forget to like, subscribe, and hit the notification bell to stay updated on the latest AI developments!
#VisionAI #AIBenchmark #ArtificialIntelligence #GPT4Vision #AITutorial #ComputerVision #DeepLearning #AIAgents #MachineLearning #AIAutomation
🌟 Official Model Documentation:
- GPT-4V: platform.openai.com/docs/guides/vision
- Claude Vision: docs.anthropic.com/en/docs/build-with-claude/vision
- Mistral Vision: docs.mistral.ai/capabilities/vision/
- LLaVA-13B: replicate.com/yorickvp/llava-13b
- Together AI: docs.together.ai/docs/vision-overview
🔍 In this video I put the leading Vision AI models through an extensive real-world test. From document analysis to visual classification, I reveal which models excel at practical tasks and which ones fall short. Whether you're building enterprise solutions or choosing the right model for your project, this comprehensive benchmark will save you weeks of testing.
💡 Key Highlights:
- Complete comparison of GPT-4V, Claude 3.5, Pixtral, LLaVA, and Llama 3.2
- Real-world testing scenarios including document verification & quality control
- Cost comparison between commercial and open-source options
- Step-by-step **AI Tutorial** for implementing vision models
- Advanced **AI Automation** techniques for scaling vision tasks
🎯 Perfect for:
- Developers building vision-enabled applications
- Companies evaluating vision AI solutions
- Anyone interested in practical **AI Agents** implementation
- Engineers working on **tutorial** projects
⚡ The benchmark includes:
- Document Analysis & Text Extraction
- Object Detection & Classification
- Quality Assessment Systems
- Visual Verification Workflows
- Performance Metrics & Cost Analysis
⚙️ Testing Categories:
- Signature Detection & Document Verification
- Object Counting & Scene Understanding
- Quality Control & Safety Assessment
- Scientific Classification & Recognition
- Real-time Performance Analysis
🔥 Don't forget to like, subscribe, and hit the notification bell to stay updated on the latest AI developments!
#VisionAI #AIBenchmark #ArtificialIntelligence #GPT4Vision #AITutorial #ComputerVision #DeepLearning #AIAgents #MachineLearning #AIAutomation
Переглядів: 411
Відео
AI Agents Build me a Website While I Slept (Here's The Code!) 🚀
Переглядів 71514 днів тому
🚀 Complete Code for this Tutorial: github.com/Florenz23/ai-agent-videos/tree/master/webDev Discover how I create a fully automated web development workflow using AI Agents ! In this video, I show you step-by-step how to build a powerful system that automates the entire web development process through AI Automation . 🎯 What you'll learn in this AI Tutorial : - Setting up AI Agent swarms for web...
Web Development is Dead: AI Agents Just Killed It 😱
Переглядів 1,6 тис.14 днів тому
🔥 Discover How to Build an Automated Web Development System with AI Agents ! 📂 Get the Complete Code Here: github.com/Florenz23/ai-agent-videos/tree/master/webDev 🚀 In this video I revolutionize web development by creating a fully automated system using AI Agents . Through this comprehensive AI Tutorial , I demonstrate how to orchestrate multiple AI agents to handle every aspect of web developm...
Hands-On with Swarm: OpenAI's AI Agent Framework Delivers!
Переглядів 1,4 тис.28 днів тому
🚀 OpenAI Swarm in Action: Live Test and Quick Dive! 🤖 In this hands-on AI Tutorial, I put OpenAI's Swarm through its paces in a live demonstration! Join me as I explore this revolutionary AI Agent framework and test its capabilities in real-time. 🔬 📊 Key highlights of this AI Agent Tutorial: • Live Swarm AI test and comprehensive review • Quick dive into Swarm's features and functionality • Rea...
Swarm AI: OpenAI's Game-Changer for Ai Agents ?
Переглядів 1,7 тис.Місяць тому
🚀 Discover the Future of AI: OpenAI Swarm Unveiled! 🤖 In this video, I dive deep into OpenAI's groundbreaking Swarm framework - the next evolution in AI Tutorial and AI Agent technology! 🌟 🔍 Learn about: • What is OpenAI Swarm? • How Swarm revolutionizes multi-agent AI • The power of AI orchestration and collaboration • Comparing Swarm to other AI frameworks (CrewAI, Langgraph, Agent Zero, Auto...
Future of Coding: ChatGPT Canvas vs Claude Artifacts (Must Watch!)
Переглядів 717Місяць тому
🚀 Dive into the Future of Coding with AI! 🤖💻 In this video, I explore the cutting-edge AI coding tools that are revolutionizing software development: ChatGPT Canvas and Claude Artifacts. As an AI Tutorial expert, I'll guide you through a thrilling comparison of these two powerhouses in automated programming. 🔍 Discover how ChatGPT Canvas, developed by OpenAI, is reshaping the landscape of AI-as...
AI Agent Framework Final: CrewAI vs AutoGen vs LangGraph vs AgentZero Final Results Shock [5/5]
Переглядів 4,2 тис.Місяць тому
📊 AI Agent Frameworks Showdown: CrewAI vs. AutoGen vs. LangGraph vs. Agent Zero 🤖 In this video, I dive deep into the world of AI Agents and put four powerful frameworks to the test! 🚀 Watch as I compare CrewAI, AutoGen, LangGraph, and the underdog Agent Zero in an epic battle of performance and functionality. 🔬 The Challenge: I tasked each framework with a real-world stock analysis problem, ap...
AI Agent Framework Battle: CrewAI vs. AutoGen vs. LangGraph vs Agent Zero | Underdog Triumphs? [4/5]
Переглядів 661Місяць тому
📊 AI Agent Frameworks Showdown: CrewAI vs. AutoGen vs. LangGraph vs. Agent Zero 🤖 In this video, I dive deep into the world of AI Agents and put four powerful frameworks to the test! 🚀 Watch as I compare CrewAI, AutoGen, LangGraph, and the underdog Agent Zero in an epic battle of performance and functionality. 🔬 The Challenge: I tasked each framework with a real-world stock analysis problem, ap...
AI Agent Framework Battle: CrewAI vs. AutoGen vs. LangGraph vs Agent Zero | Underdog Triumphs? [3/5]
Переглядів 8242 місяці тому
📊 AI Agent Frameworks Showdown: CrewAI vs. AutoGen vs. LangGraph vs. Agent Zero 🤖 In this video, I dive deep into the world of AI Agents and put four powerful frameworks to the test! 🚀 Watch as I compare CrewAI, AutoGen, LangGraph, and the underdog Agent Zero in an epic battle of performance and functionality. 🔬 The Challenge: I tasked each framework with a real-world stock analysis problem, ap...
AI Agent Framework Battle: CrewAI vs. AutoGen vs. LangGraph vs Agent Zero | Underdog Triumphs? [2/5]
Переглядів 9542 місяці тому
📊 AI Agent Frameworks Showdown: CrewAI vs. AutoGen vs. LangGraph vs. Agent Zero 🤖 In this video, I dive deep into the world of AI Agents and put four powerful frameworks to the test! 🚀 Watch as I compare CrewAI, AutoGen, LangGraph, and the underdog Agent Zero in an epic battle of performance and functionality. 🔬 The Challenge: I tasked each framework with a real-world stock analysis problem, ap...
AI Agent Framework Battle: CrewAI vs. AutoGen vs. LangGraph vs Agent Zero | Underdog Triumphs? [1/5]
Переглядів 7432 місяці тому
📊 AI Agent Frameworks Showdown: CrewAI vs. AutoGen vs. LangGraph vs. Agent Zero 🤖 In this video, I dive deep into the world of AI Agents and put four powerful frameworks to the test! 🚀 Watch as I compare CrewAI, AutoGen, LangGraph, and the underdog Agent Zero in an epic battle of performance and functionality. 🔬 The Challenge: I tasked each framework with a real-world stock analysis problem, ap...
No-Code Revolution? AI Builds Entire App in Minutes | AI - Tutorial
Переглядів 1672 місяці тому
In this video, I demonstrate the groundbreaking potential of AI in app development, showcasing how artificial intelligence can build an entire app with minimal coding required. Watch as I create a Dev Habit Tracker app in minutes using AI assistance! Witness the power of AI in app creation 💡 Learn how to leverage AI for rapid prototyping 🛠️ Explore the future of no-code and low-code developmen...
I Built an AI That Watches YouTube and Summarizes EVERYTHING (You Can Too!) | CrewAi Tutorial
Переглядів 1283 місяці тому
🚀 I Built an AI That Watches UA-cam and Summarizes EVERYTHING (You Can Too!) | CrewAI Tutorial 🔗 Get the Code: github.com/Florenz23/ai-agent-videos/tree/master/youtube_insights Ever wished you could extract key insights from UA-cam without watching endless videos? In this AI Tutorial , I'll show you how to build an AI that does just that using CrewAI and AI Agents ! 🧠💡 📺 What This AI Can Do: -...
I Let AI Plan My Meals for a Month. The Results Were Shocking! | CrewAi Tutorial
Переглядів 593 місяці тому
Discover the power of AI Agents in this mind-blowing AI Tutorial ! In this video, I explore how CrewAI , a cutting-edge framework for building AI agent systems, can revolutionize your meal planning and cooking routine. Watch as I demonstrate a practical AI Automation solution that plans an entire month's worth of meals, creating a personalized cooking schedule and shopping list. 🔧 Get the code:...
Mind-Blowing Multiplayer Game Creation with WebSim AI | AI Coding Tutorial
Переглядів 3483 місяці тому
Welcome to my tutorial on creating an exciting multiplayer game using WebSim AI! In this video, I'll guide you through the entire process of AI-powered game creation, from start to finish. Follow along as I show you how to utilize WebSim AI for fast game creation and easy AI game development. This tutorial covers everything you need to know about prompt-based game creation, making it simple to ...
Build Stunning Games in Minutes with WebSim | Step-by-Step Tutorial + Prompts
Переглядів 9294 місяці тому
Build Stunning Games in Minutes with WebSim | Step-by-Step Tutorial Prompts
Mind-Blowing AI Creates Full Websites in 5 Seconds - FOR FREE!
Переглядів 4514 місяці тому
Mind-Blowing AI Creates Full Websites in 5 Seconds - FOR FREE!
The Most Insane YouTube Automation System (crewai ai-agents)[FREE CODE INCLUDED]
Переглядів 3574 місяці тому
The Most Insane UA-cam Automation System (crewai ai-agents)[FREE CODE INCLUDED]
The Future of Blogging: AI Agents and Crew AI in Action
Переглядів 1415 місяців тому
The Future of Blogging: AI Agents and Crew AI in Action
What temperature do you use at the experiments?
Great video! Thanks. Only challenge I faced is I questioned my eyesight when I heard 68% instead of 86%, and caused me a mild anxiety 😅
Video is very good. We need more independent benchmarks! But you need to prove it's actually calling the models.
Good video, I like that you were strict and ran the tests many times. However without the code being in the video or linked in description, we don't really know how well the tests were executed.
I’m trying to make my game multiplayer but when I go on a different account and look at the game it doesn’t have the people symbol
Nice content , I want to ask what is happening to coding now since current ai tools able to code by their own, so as an software developer we have to provide pseudo code to these agents or something else ?
This was really helpful. Thanks. Are there any differences in the tools available between them? For example when I tried crewAI - it couldn’t do instagram scraping. Wondering if others offer better more rich tools that can invoked easily. For reference - l’m a newbie coder.
Good idea! But you didn't talk about what actually happened. How many iterations? How long did it take? What are the exit criteria? Would have liked to see the final output. Also, what if the Web page is longer than the screen? Will a screenshot capture a long page?
Don't worry about all the hateful comments. I found the video useful! . Web development will change dramatically for sure. There are a lot of people who will resist the change. I'm waiting for the follow up video. Thx
please, do a tutorial...I need to do the same with Buffet and more knowledge
I can definitely do that, what kind of tutorial you want exactly
@@FlorenzErstling I'm really noob, I don't know how to put all together with CrewAI, I see in github and it's really difficult
Thank you for sharing this experiment! I have an idea about using claude models with swarm. OperRouter service wraps different providers (including anthropic) and gives us API casted to the OpenAI API (with custom endpoint). I hope, it works with images too.
I checked the documentation for model claude-3.5-sonnet:beta on OpenRouter (sorry, I can't give the link, because youtube hides comments with links) and find in the API tab example with image. So, it should work.
Oh, thats very nice, thanks a lot for the input, I definitely gonna try that 👍🙏🏻
now do authentication, now do multi-user accounts, now do multi-tenanting, now do subscriptions, now do SEO, now do A11Y, now do cookie acceptance, now do ads integration and metrics. You can't build an aeroplane by just drawing an aeroplane.
are you AI?
not yet 😅
you should not put your video over the slide contents ;-)
Good point, thanks :)
No, its not dead. You are living in another world maybe.
coding we know it like it was one year ago is pretty dead, you can automate a lot already
Spoken like a another AI youtuber that hasn't actually coded anything even remotely complex.
everything complex you can break down into something less complex, give me one concrete problem in coding that ai is not able to handle
@@FlorenzErstling I never said it can't handle it. But it can't be done by somebody without development knowledge. My point is that all you AI youtubers claim somebody with no coding background can just prompt it to build anything...they can't. AI doesn't think, it regurgitates what it thinks matches your prompt. A non-technical person will end up going in circles and burning through their credits and rate limits if they try to create something of any complexity.. And then let that person, having no understanding, try to refactor a large app using AI.. I like AI and I use it everyday.. But I don't depend on it and I know how to prompt it past its sticking points..or just fix the issues manually. Most people won't.
I have an idea for your next title: AI is slaughtering fake-news-youtubers with German accents. Im just getting sick and tired of this everyone tries to make a penny on AI but lacks the quality. The UA-cam algorithm gives me strange conspiracy theories just by searching for AI models just to give you an idea hoe bad it is
thats how the biz works bro, but you are right, those titles tend to be misleading sometimes, I will remove some spice for my next one ;)
@@FlorenzErstlingI just wish that the biz will be based on facts as it supposed to be. You seems like an intelligent guy to me and dont need this to become successful with your content just take it as some advice .. well cheers bud
@ I like your advice, its well appreciated, thank you 🙏🏻. Next one will be more conservative.
that is it. we are all obsolete!
ah, it will definitely take some more time ;)
dudue, why is your head in the graph? 👎
sry for that, shall I send it to you?
@@FlorenzErstling naw, s'all good, just pointing it out for your next vid. I'm ocd and i really wanted to see anything with agent zero. so far it's the only one i've gotten o work on my system, windows or ubuntu. also tried open-interpreter, AiLiceAI, agent K, phidata and AIOS. sad thing is, in windows AND linux, i can't get Any of them running in docker. bad builds. i'd really love to see someone do vids on how to debug and fix bad builds in docker.
@@fatherfoxstrongpaw8968 ok thanks for pointing it out 🙏🏻, will see that I avoid it in the next one
I prefer LangGraph 🦜🎡
Eyo, Florenz, can we setup a meeting to talk about AI agents? I have some questions and you seem like someone who could teach me the best way to procide
sure, please send me an email to hello@florenzerstling.de and we schedule a meeting, looking forward to it :)
That's an interesting get_weather function you have there 🤔🤔🤔🤔🤔🤔
just a demo :D
bro you shared you api keys
He probably disabled it before uploading the video
🙏🏻 thanks a lot for pointing it out and not abusing it!
now yes 😅
@@FlorenzErstling np
:)
Thanks for the video but there is something I don’t understand in this « benchmark ». What about the frameworks « output » ?? Does all the frameworks generated exactly the same final report ? I guess not ? Then for a « benchmark » to be meaningful I think it should includes a careful rating of the results. Why would I pick a framework that it slightly faster or easier to install if the produced results are poorer ?? Or did I miss something ?? Many thanks for your thoughts on that…
thats a good point, thank you, the results where actually very similar indeed, but not exactly the same, I will definitely consider it for my next benchmark
I was lost and then I found it. Super helpful and down to earth explanation. May the Almighty bless you, man!
thanks a lot, I am very happy to hear that, if you have another topic you struggle with, just let me know in the comments
great video!
Thank you 😊
Please stop stomping or maybe hitting with your hand. It is very disturbing.
ok, gonna try :)
Fyi, you can combine langgraph with crewai too
Ah really? Did not know that, did you try it? How were your results?
@@FlorenzErstlingI have not myself, but I believe one of the examples in the example repo does that
Where is agency swarm ? If you want fast, use groq, if you want cheap use ollama, if you need something done like this kind of research, speed does not matter, but insightfull results do. So improve prompt and or use better model. If you cant code, dont use ai frameworks, there zare plenty of other options now, or was opensource a requirement ?
Yes, thats true, of course speed is not a requirement for this task, just wanted to compare the different kpi, then everyon has to choose the framework what fits best for a specific use case.
You are comparing apples and oranges here. Agent zero is a single autonomous agent, not a multi agent agentic framework. There are a bunch of others now popping up in that space. Prompts and which llm you use, determine everything. Agent zero prompts can be easily improved, very visible in the code, not hidden deep diwn in pip package ...
Yes, I realized that too, you are right, concerning agent zero the comparasion does not fit that well, anyway I wanted to make it part of the serie as its pretty popular right now. You know any other frameworks similar to agent zero, I tried autogpt, but was not that happy with it.
@@FlorenzErstling @aicodeking has several recent reviews
Really liked the series but you have the playlist the wrong way around. Currently the results are the first video :D
Thanks a lot, I'm glad you liked it. I ordered the list now correctly, thank you for the hint :)
Your ease of use VS instalation complexity is misleading. The ease of use trait should have it's maximum on the right as that's the established way to present it, yet "easy" is on the left side
True, thank you for pointing it out.
Sure, I should also mention: thank you for this content, it made me install Autogen :)
good content!
Thank you for this review, I always like seeing different perspectives. That being said, I don't agree with your final assessment of Agent Zero. For example benchmarks step 1 through 6 could use a little more analysis. I'm thinking that cost and execution time depends on desired goal as interpreted by the framework. So if normalization of data before comparison wasn't taken into account, I'd probably reconsider what the data is actually telling us if we don't adjust or adapt for discrepancies between the different frameworks. Finally, step 7 and 8 contradict each other and so do steps 9 and 10. Step 11 which is the summary, declares the framework not to be "agentic". I think this surprised me the most and made me question what "agentic" really means when referring autonomous AI agent frameworks. This is all IMHO of course, so any feedback or insight would be appreciated!
Thanks for watching :) Concerning your question, check some videos of Andrew Ng, he explains the word "agentic" very well. It can be missleading in many cases as its still a very new area, I am pretty sure that it will change and evolve a lot in the near feature.
Good video, thanks for sharing the code. Can you please make font in the video a bit bigger to understand what I see in sync with your explanations?
Thanks a lot. Yes, good tip, I will definitely do that in the next ones.
Btw, now gemini (in android) supports asking about the youtube video, which i thinks a very cool and practical feature
oh, thats nice, I think perplexity supports that too, thanks for the info 🙏
@@FlorenzErstling yes and also copilot.. 👍👍
Thank you. What do you think about agentzero? I want the framework to do the work for me but i dont want it to reinvent the wheel when a similar task is asked
Thank you, agentzero is part of the benchmark serie (check my most recent videos), its scheduled for next monday, if you need the results write me an email info@florenzerstling.de and I send you the linke to the not yet published video
Thank you! I love this series.
Thank you, im very happy to hear that 😃
This was a great video. You're a little ahead of the crowd on this one so I bet this video stacks up views in teh months to come.
Thanks a lot! Im happy that you liked it. But what do you mean exactly with being ahead of the crowd ? 😅
@@FlorenzErstling I thought the same thing as @joaob1226: I believe he meant most of the videos and tutorials out there are still talking about multi-agent system theory/concepts, about building small agent systems on your own, or, at most, about the installation and basic features of one of the main agent frameworks out there. These agent frameworks involve a lot of boiler-plate code which makes it hard for developers to actually manage understanding the inner-workings and way-to-use of more than just one agent framework, let alone building a same use case with two of them and comparing results. In summary, I would dare say you are one of the first to actually use several different agent frameworks at the same time and prepare a video comparing the results, i.e., "a little ahead of the crowd". Amazing work and amazing video series, thanks very much.
My conclusion on LangChain is the same as yours: very very powerfull tool, but kind of hard to understand because it can do so many things in so many ways (I've tested it for RAG implementation). Did you try Flowise? It's a "low code" interface to create flows in Langchain (and others). Just a suggestion for your videos: when you explain code, please zoom your IDE and collapse the terminal panel. It will be more readable for the viewers. Thank you for you videos.
That's the kind of benchmark I wanted to make. Thank you for avoiding me the pain of reading a ton of documentation 👍 ps: you deserve more views
@@PhunkyBob Thank you, im happy that you like it. Let me know if there are any other frameworks you would like to benchmark.
@@FlorenzErstling One framework I have in sights is "Haystack" from Deepstep.
I'm following A.I closely and I found this website a few months ago, I think it's really cool what it can do and I can finally start on my childhood dream to create my own games without any coding knowledge. Only downside is both of my games I have been working on just totally breaks after a while, I always seem to come to a point where tha a.i simply can't add anything else without breaking the rest of the game's functions. And bow recently the context menu(right click on PC) for mobiles has dissapeared, I always find the main prompt bar at the top gets progressively worse the longer you work on a project so I always use the edit function instead
But with that said, this is probably the worst this technologys gonna be, in a few years we can probably create our own tripple A games ourselves with a few text prompts 😀
thanks for your feedback, you have a link to your project? I could take a look and maybe help if you like 🤔
@@MickeRamone probably yes 😅
Beautiful nice idea, 😊 Keep going ❤👍
Thanks a lot 🙏🏻
Good tutorial, Nice work....... Please add Ollama Integration as another option to new videos about crewai 🙏
Thank you 🙏🏻, good idea, I will do that 👍
Awesome ❤🙏 Very great idea and tutorial Please, Go Go Go for more 🎉
Thank you 🙏🏻, anything specific you would like me to make a video about?
Nice content, thx!
@@YupengChao-u1o Thank you, if you have issues with a specific topic, let me know.