Worked with Claude yesterday for a while. It is “smarter” in some unstructured language processing and seems to deal with larger contexts better, and slightly better in abstract thought, sometimes. But it’s not better at everything. It trades blows with GPT-4, so that’s a good start. We really haven’t had any viable competitors before, so this is needed.
I always have good experiences with Claude. Claude 1.0 instant back then had the better reasoning and instructions following than gpt3.5 (most of the cases) and sometimes even gpt4 (in some specific cases). Unfortunately they figured out I’m in Vietnam and revoked my access 😢. Replacing gpt4 with claude 3 sonnet (via openrouter) yesterday for my game master AI and clear 40% of the issues just with the better and more consistent reasoning and instructions following.
and json schema works much better with claude, gpt4 sometimes still return invalid json not sure about python, but with typescript you can use typebox or zod to create the schema for prompt and use it as static type to infer in your code as well
ChatGPT-4 is overrated. Its not better than 3.5 in my honest view and looks worse. ChatGPT-4 has some stubborn qualities and laziness, it will skip one or two clear instructions in a detailed but brief prompt. It puts words in you mouth and is too lazy to complete. Claude Opus is super attentive, excellent in understanding detail and producing long answers.
My favorite quick question for these models... "What is the first palindromic multiple of 13" Answer should be 38 x 13 = 494. Claude Sonnet couldn't get it (e.g. 91 is the first because it reads the same forward [91] as backward [19] !? ) --- and repeated wrong answers like this over and over. ChatGPT 4 writes a Python program to solve it correctly. I would be interested to see if Opus gets it right.
I tried opus three times in a row and it failed each time. But then I tried it with Chad GpT4 and it failed that time too. So.... Here's the last run on opus: Okay, let's approach this step-by-step: 1) First, let's understand what a palindromic number is. A palindromic number is a number that remains the same when its digits are reversed. For example, 121, 34543, 11611, etc. 2) Now, we need to find the first multiple of 13 that is a palindrome. 3) Let's start listing out the multiples of 13: 13 x 1 = 13 13 x 2 = 26 13 x 3 = 39 13 x 4 = 52 13 x 5 = 65 13 x 6 = 78 13 x 7 = 91 13 x 8 = 104 13 x 9 = 117 13 x 10 = 130 13 x 11 = 143 4) None of these are palindromes. Let's keep going. 13 x 12 = 156 13 x 13 = 169 13 x 14 = 182 13 x 15 = 195 13 x 16 = 208 13 x 17 = 221 5) Here we go! 221 is the first palindromic multiple of 13. Therefore, the first palindromic multiple of 13 is 221.
@@endoflevelboss oh sorry I thought you were interested in seeing someone try it with opus. Regarding chat GPT differences, I double checked and actually I did get it right when I tried that so same results. Although in theory LLMs can give different answers when they are prompted the same way multiple times. Although maybe not now with the Python code check who knows.
@Dave: you mention that you prefer using your own wrappers instead of being dependant on langchain for instance. Does that mean that you have been able to build agent&team setups yourself? If so, I'm interested how since I was just about to dive into langchain and crew-AI. Ps: I'm beginner when it comes to Python.
Hello, how do you do ! first thanks for all your efforts I really appreciate , I want you to suggest me a plan to learn aws or Google cloud or the one you suggest me. Also as a data scientist should I learn deep learning and computer vision and nlp and llms (I am talking as a beginner or a junior to apply a job what are the specifications needed) thanks again
For longer text and output I use chat Poe. When I use API I set up words output limitation - In my case quality of output is more imprortant than price. For less demanding task I think haiku model will be perfect is cheap and quality is between GPT3.5 and 4. In the future for my projects I will mix Heiku/ Sonet with Opus model for the best cost quality balance. @@videos6505
"AI Engineer" Hahahaha, he says so because he can type things into chatgpt. Do you create CNNs professionally? You do not. Do you implement ai algorithms in robots? You do not.
I was just about to go test this 😂😂 thanks for doing it for me the json output part, Funny thing UA-cam pointed me straight to this video and to the timestamp of the exact time My search was "Claude json output" damn.
Was quite disappointed with Claude and initial efforts to do parallel code gen comparison with gpt4. Both have a long way to go, and the hype factor for new Claude release for code gen is not warranted imho
I've tested some pdf summaries and having it teach me some concepts based on the documents, and Claude was on a completely different level there - both in retrieving the information and teaching it. The documents were short, so the context didn't play a role. Haven't yet tested it for writing code though.
Almost your entire video is about JSON format, no one except you cares about this. Nearly your entire video is irrelevant to the vast majority of your viewers, not good.
Fair point. The video unexpectably went into that direction as I was doing my research, but the main goal of this video is to provide you with a practical framework to start running and comparing your own tests with both models.
What do you think of these new models? Let me know in the comments!👇🏻
Worked with Claude yesterday for a while. It is “smarter” in some unstructured language processing and seems to deal with larger contexts better, and slightly better in abstract thought, sometimes. But it’s not better at everything. It trades blows with GPT-4, so that’s a good start. We really haven’t had any viable competitors before, so this is needed.
GPT-5 will be released (soon).
Probably ;)
How soon?
I always have good experiences with Claude. Claude 1.0 instant back then had the better reasoning and instructions following than gpt3.5 (most of the cases) and sometimes even gpt4 (in some specific cases).
Unfortunately they figured out I’m in Vietnam and revoked my access 😢.
Replacing gpt4 with claude 3 sonnet (via openrouter) yesterday for my game master AI and clear 40% of the issues just with the better and more consistent reasoning and instructions following.
and json schema works much better with claude, gpt4 sometimes still return invalid json
not sure about python, but with typescript you can use typebox or zod to create the schema for prompt and use it as static type to infer in your code as well
ChatGPT-4 is overrated. Its not better than 3.5 in my honest view and looks worse. ChatGPT-4 has some stubborn qualities and laziness, it will skip one or two clear instructions in a detailed but brief prompt. It puts words in you mouth and is too lazy to complete. Claude Opus is super attentive, excellent in understanding detail and producing long answers.
My favorite quick question for these models... "What is the first palindromic multiple of 13" Answer should be 38 x 13 = 494. Claude Sonnet couldn't get it (e.g. 91 is the first because it reads the same forward [91] as backward [19] !? ) --- and repeated wrong answers like this over and over. ChatGPT 4 writes a Python program to solve it correctly. I would be interested to see if Opus gets it right.
Cool thanks for the info. I won't take my OpenAI flag down just yet 👍
I tried opus three times in a row and it failed each time. But then I tried it with Chad GpT4 and it failed that time too. So....
Here's the last run on opus:
Okay, let's approach this step-by-step:
1) First, let's understand what a palindromic number is. A palindromic number is a number that remains the same when its digits are reversed. For example, 121, 34543, 11611, etc.
2) Now, we need to find the first multiple of 13 that is a palindrome.
3) Let's start listing out the multiples of 13:
13 x 1 = 13
13 x 2 = 26
13 x 3 = 39
13 x 4 = 52
13 x 5 = 65
13 x 6 = 78
13 x 7 = 91
13 x 8 = 104
13 x 9 = 117
13 x 10 = 130
13 x 11 = 143
4) None of these are palindromes. Let's keep going.
13 x 12 = 156
13 x 13 = 169
13 x 14 = 182
13 x 15 = 195
13 x 16 = 208
13 x 17 = 221
5) Here we go! 221 is the first palindromic multiple of 13.
Therefore, the first palindromic multiple of 13 is 221.
@@ktb1381 yeah what I said, the wrong answer. And why did ChatGPT 4 give you the wrong answer when it gives me the right answer every time?
@@endoflevelboss oh sorry I thought you were interested in seeing someone try it with opus. Regarding chat GPT differences, I double checked and actually I did get it right when I tried that so same results. Although in theory LLMs can give different answers when they are prompted the same way multiple times. Although maybe not now with the Python code check who knows.
@Dave: you mention that you prefer using your own wrappers instead of being dependant on langchain for instance. Does that mean that you have been able to build agent&team setups yourself? If so, I'm interested how since I was just about to dive into langchain and crew-AI. Ps: I'm beginner when it comes to Python.
Yea I mostly build the agent frameworks myself. Will consider this for a future video.
Hello, how do you do !
first thanks for all your efforts I really appreciate ,
I want you to suggest me a plan to learn aws or Google cloud or the one you suggest me.
Also as a data scientist should I learn deep learning and computer vision and nlp and llms (I am talking as a beginner or a junior to apply a job what are the specifications needed)
thanks again
Gemini Ultra is worse than GPT-4. Claude 3 Opus is the real deal. Based on my preliminary tests, it's better than GPT-4. It is my default LLM now. :)
Opus is very expensive.
For longer text and output I use chat Poe. When I use API I set up words output limitation - In my case quality of output is more imprortant than price. For less demanding task I think haiku model will be perfect is cheap and quality is between GPT3.5 and 4. In the future for my projects I will mix Heiku/ Sonet with Opus model for the best cost quality balance. @@videos6505
Thanks for sharing!
Until tomorrow.
The bracket thing seems so stupid - does no one try this shit before releasing it
hai im late btw theres websim support xd
When am i getting replaced as a software engineer soon?
2 years max.
You are already being replaced.
"AI Engineer" Hahahaha, he says so because he can type things into chatgpt.
Do you create CNNs professionally? You do not.
Do you implement ai algorithms in robots? You do not.
Hey Dave, what theme and font do you use for vscode? Looks great. I am trying to get the look of Gemini code window, but can't get there quite yet.
Thanks! Check out this video: ua-cam.com/video/3sIzCFuLgIQ/v-deo.html
What cool tidbit regarding the "{", never read that deeply through their docs to find that - good find!
Yea that’s a neat little feature, useful for many other cases as well a certain output it desired.
Hey Dave, can I ask how you record yourself like that in Portrait mode? What gear are you using?
Sure! Ecamm, Sony ZV-E10, Sigma 16mm F1.4, Camlink 4k, Amaran 60d
Thanks! Keep up the great videos!
You are an AI developer? And what you were 1 year ago? A developer? Man if this is not hype...
Got a bachelor and master in AI. Started in 2013 😉
great to see Dave’s vid again, somehow youtube stop pushing your vid to my list 😅
I was just about to go test this 😂😂 thanks for doing it for me the json output part,
Funny thing UA-cam pointed me straight to this video and to the timestamp of the exact time
My search was "Claude json output" damn.
Wow that's awesome haha
Was quite disappointed with Claude and initial efforts to do parallel code gen comparison with gpt4. Both have a long way to go, and the hype factor for new Claude release for code gen is not warranted imho
I've tested some pdf summaries and having it teach me some concepts based on the documents, and Claude was on a completely different level there - both in retrieving the information and teaching it. The documents were short, so the context didn't play a role.
Haven't yet tested it for writing code though.
3 minutes in and still no review.... Dude...
What’s your browser?
Arc Browser
come to see Claude review, instead got a lecture on how to get json output 😂
Structured output is important 😂
Great video Dave, keep them coming!
isn't it blocked in the EU?
Yes but not the API
Thanks for the video!
You're welcome!
Almost your entire video is about JSON format, no one except you cares about this. Nearly your entire video is irrelevant to the vast majority of your viewers, not good.
Fair point. The video unexpectably went into that direction as I was doing my research, but the main goal of this video is to provide you with a practical framework to start running and comparing your own tests with both models.
You can leave feedback without being a total jerk about it you know
@@dstutz ye my b I forgot creators read comments lol
@@dstutz ye my b I forgot creators read comments sometimes lol
top content, may the algo pick u up and carry this faar xD
📈🙏🏻