Claude Opus vs. GPT4 - A Practical Review (with code examples)

Dave Ebbelaar

Додати в
- Мій плейлист
- Переглянути пізніше
Поділитися

Поділитися

Вставка

Розмір відео:

Показувати елементи керування програвачем

Автоматичне відтворення

Автоповтор

Опубліковано 2 жов 2024

КОМЕНТАРІ • 57

@daveebbelaar 7 місяців тому
What do you think of these new models? Let me know in the comments!👇🏻
@icandreamstream 7 місяців тому ⁺²⁷
Worked with Claude yesterday for a while. It is “smarter” in some unstructured language processing and seems to deal with larger contexts better, and slightly better in abstract thought, sometimes. But it’s not better at everything. It trades blows with GPT-4, so that’s a good start. We really haven’t had any viable competitors before, so this is needed.
@videos6505 7 місяців тому ⁺²
GPT-5 will be released (soon).
@daveebbelaar 7 місяців тому
Probably ;)
@Mourne84 6 місяців тому
How soon?
@burdenedbyhope 7 місяців тому ⁺⁵
I always have good experiences with Claude. Claude 1.0 instant back then had the better reasoning and instructions following than gpt3.5 (most of the cases) and sometimes even gpt4 (in some specific cases).
Unfortunately they figured out I’m in Vietnam and revoked my access 😢.
Replacing gpt4 with claude 3 sonnet (via openrouter) yesterday for my game master AI and clear 40% of the issues just with the better and more consistent reasoning and instructions following.
@burdenedbyhope 7 місяців тому ⁺²
and json schema works much better with claude, gpt4 sometimes still return invalid json
not sure about python, but with typescript you can use typebox or zod to create the schema for prompt and use it as static type to infer in your code as well
@Accuface2000 6 місяців тому ⁺¹
ChatGPT-4 is overrated. Its not better than 3.5 in my honest view and looks worse. ChatGPT-4 has some stubborn qualities and laziness, it will skip one or two clear instructions in a detailed but brief prompt. It puts words in you mouth and is too lazy to complete. Claude Opus is super attentive, excellent in understanding detail and producing long answers.
@endoflevelboss 7 місяців тому ⁺³
My favorite quick question for these models... "What is the first palindromic multiple of 13" Answer should be 38 x 13 = 494. Claude Sonnet couldn't get it (e.g. 91 is the first because it reads the same forward [91] as backward [19] !? ) --- and repeated wrong answers like this over and over. ChatGPT 4 writes a Python program to solve it correctly. I would be interested to see if Opus gets it right.
@endoflevelboss 6 місяців тому
Cool thanks for the info. I won't take my OpenAI flag down just yet 👍
@ktb1381 6 місяців тому ⁺²
I tried opus three times in a row and it failed each time. But then I tried it with Chad GpT4 and it failed that time too. So....
Here's the last run on opus:
Okay, let's approach this step-by-step:
1) First, let's understand what a palindromic number is. A palindromic number is a number that remains the same when its digits are reversed. For example, 121, 34543, 11611, etc.
2) Now, we need to find the first multiple of 13 that is a palindrome.
3) Let's start listing out the multiples of 13:
13 x 1 = 13
13 x 2 = 26
13 x 3 = 39
13 x 4 = 52
13 x 5 = 65
13 x 6 = 78
13 x 7 = 91
13 x 8 = 104
13 x 9 = 117
13 x 10 = 130
13 x 11 = 143
4) None of these are palindromes. Let's keep going.
13 x 12 = 156
13 x 13 = 169
13 x 14 = 182
13 x 15 = 195
13 x 16 = 208
13 x 17 = 221
5) Here we go! 221 is the first palindromic multiple of 13.
Therefore, the first palindromic multiple of 13 is 221.
@endoflevelboss 6 місяців тому
@@ktb1381 yeah what I said, the wrong answer. And why did ChatGPT 4 give you the wrong answer when it gives me the right answer every time?
@ktb1381 6 місяців тому
@@endoflevelboss oh sorry I thought you were interested in seeing someone try it with opus. Regarding chat GPT differences, I double checked and actually I did get it right when I tried that so same results. Although in theory LLMs can give different answers when they are prompted the same way multiple times. Although maybe not now with the Python code check who knows.
@rogierhelmus3481 7 місяців тому ⁺³
@Dave: you mention that you prefer using your own wrappers instead of being dependant on langchain for instance. Does that mean that you have been able to build agent&team setups yourself? If so, I'm interested how since I was just about to dive into langchain and crew-AI. Ps: I'm beginner when it comes to Python.
@daveebbelaar 7 місяців тому ⁺²
Yea I mostly build the agent frameworks myself. Will consider this for a future video.
@Mar3o-0-o 7 місяців тому ⁺²
Hello, how do you do !
first thanks for all your efforts I really appreciate ,
I want you to suggest me a plan to learn aws or Google cloud or the one you suggest me.
Also as a data scientist should I learn deep learning and computer vision and nlp and llms (I am talking as a beginner or a junior to apply a job what are the specifications needed)
thanks again
@micbab-vg2mu 7 місяців тому ⁺²
Gemini Ultra is worse than GPT-4. Claude 3 Opus is the real deal. Based on my preliminary tests, it's better than GPT-4. It is my default LLM now. :)
@videos6505 7 місяців тому ⁺¹
Opus is very expensive.
@micbab-vg2mu 7 місяців тому
For longer text and output I use chat Poe. When I use API I set up words output limitation - In my case quality of output is more imprortant than price. For less demanding task I think haiku model will be perfect is cheap and quality is between GPT3.5 and 4. In the future for my projects I will mix Heiku/ Sonet with Opus model for the best cost quality balance. @@videos6505
@daveebbelaar 7 місяців тому
Thanks for sharing!
@vdeomkr70 6 місяців тому
Until tomorrow.
@dontwannabefound 6 місяців тому ⁺¹
The bracket thing seems so stupid - does no one try this shit before releasing it
@catsanzsh Місяць тому
hai im late btw theres websim support xd
@No_Masterpiece 6 місяців тому ⁺²
When am i getting replaced as a software engineer soon?
@qozia1370 5 місяців тому
2 years max.
You are already being replaced.
@qozia1370 5 місяців тому
"AI Engineer" Hahahaha, he says so because he can type things into chatgpt.
Do you create CNNs professionally? You do not.
Do you implement ai algorithms in robots? You do not.
@takeabreakism 6 місяців тому ⁺¹
Hey Dave, what theme and font do you use for vscode? Looks great. I am trying to get the look of Gemini code window, but can't get there quite yet.
@daveebbelaar 6 місяців тому ⁺¹
Thanks! Check out this video: ua-cam.com/video/3sIzCFuLgIQ/v-deo.html
@roberth8737 7 місяців тому ⁺¹
What cool tidbit regarding the "{", never read that deeply through their docs to find that - good find!
@daveebbelaar 7 місяців тому
Yea that’s a neat little feature, useful for many other cases as well a certain output it desired.
@diaryofafounder 6 місяців тому ⁺¹
Hey Dave, can I ask how you record yourself like that in Portrait mode? What gear are you using?
@daveebbelaar 6 місяців тому ⁺⁴
Sure! Ecamm, Sony ZV-E10, Sigma 16mm F1.4, Camlink 4k, Amaran 60d
@diaryofafounder 6 місяців тому
Thanks! Keep up the great videos!
@WeAskToAI 3 місяці тому
You are an AI developer? And what you were 1 year ago? A developer? Man if this is not hype...
@daveebbelaar 3 місяці тому ⁺¹
Got a bachelor and master in AI. Started in 2013 😉
@PeterPan-hs5tu 5 місяців тому
great to see Dave’s vid again, somehow youtube stop pushing your vid to my list 😅
@BangaloreYoutube 6 місяців тому
I was just about to go test this 😂😂 thanks for doing it for me the json output part,
Funny thing UA-cam pointed me straight to this video and to the timestamp of the exact time
My search was "Claude json output" damn.
@daveebbelaar 6 місяців тому
Wow that's awesome haha
@johnamckinley 7 місяців тому
Was quite disappointed with Claude and initial efforts to do parallel code gen comparison with gpt4. Both have a long way to go, and the hype factor for new Claude release for code gen is not warranted imho
@Dave-cg9li 7 місяців тому ⁺¹
I've tested some pdf summaries and having it teach me some concepts based on the documents, and Claude was on a completely different level there - both in retrieving the information and teaching it. The documents were short, so the context didn't play a role.
Haven't yet tested it for writing code though.
@lugaidster 4 місяці тому
3 minutes in and still no review.... Dude...
@superresistant0 7 місяців тому ⁺¹
What’s your browser?
@daveebbelaar 7 місяців тому ⁺¹
Arc Browser
@alexanderpopelyuk741 6 місяців тому
come to see Claude review, instead got a lecture on how to get json output 😂
@daveebbelaar 6 місяців тому
Structured output is important 😂
@jasonholdener9036 6 місяців тому
Great video Dave, keep them coming!
@eugenmalatov5470 6 місяців тому
isn't it blocked in the EU?
@kirschdieblp4242 6 місяців тому ⁺¹
Yes but not the API
@alfglobalservices 7 місяців тому
Thanks for the video!
@daveebbelaar 7 місяців тому
You're welcome!
@henrydeutsch5130 6 місяців тому ⁺²
Almost your entire video is about JSON format, no one except you cares about this. Nearly your entire video is irrelevant to the vast majority of your viewers, not good.
@daveebbelaar 6 місяців тому
Fair point. The video unexpectably went into that direction as I was doing my research, but the main goal of this video is to provide you with a practical framework to start running and comparing your own tests with both models.
@dstutz 6 місяців тому ⁺¹
You can leave feedback without being a total jerk about it you know
@henrydeutsch5130 6 місяців тому ⁺¹
@@dstutz ye my b I forgot creators read comments lol
@henrydeutsch5130 6 місяців тому
@@dstutz ye my b I forgot creators read comments sometimes lol
@jamminrebel3614 7 місяців тому
top content, may the algo pick u up and carry this faar xD
@daveebbelaar 7 місяців тому
📈🙏🏻

Наступне

Автоматичне відтворення

Why Agent Frameworks Will Fail (and what to use instead)