Claude 3.5 INSANE Coding Ability & beats GPT-4o AND Llama3-400B

COMMENTS • 77

  • @stranger.granger 2 months ago +20

    "Cloud".

    • @aifluxchannel 2 months ago +3

      Haha - thanks for the feedback

    • @apester2 2 months ago +11

      Clawed.

    • @ryzikx 2 months ago +1

      cloud channón

    • @mattelder1971 2 months ago +2

      Yeah, I don't get why so many AI YouTubers choose to pronounce Claude as "cloud". They are absolutely NOT pronounced the same.

    • @aifluxchannel 2 months ago +4

      My girlfriend speaks French and will be making fun of me for at least the next week haha.

  • @southcoastinventors6583 2 months ago +2

    Hey, nice to see you got the marble question in, as well as the multi-step dying question. These are always great tests of comprehension of objects' relative positions and planning based on available conditions. Also, shoutout for the Pip-Boy graph; I can finally jazz up meetings with all these theme-based graphs. Please stand by. I think you should cover where the action is, because ultimately both open and closed source have their uses.

  • @punk3900 2 months ago +2

    People rarely test how long models keep track of the context. GPT-4o is excellent at this and practically provides a continuous track of progress in coding.

  • @masomaker 2 months ago +2

    I freaking love seeing your new videos.

    • @aifluxchannel 2 months ago

      Thanks!

    • @masomaker 2 months ago

      @aifluxchannel I went straight to Claude and upgraded to Pro. It took all of about 40 minutes of prompting to exceed the usage limit and get locked out for 8 hours until my limit reset. A good chunk of the usage was me trying to re-prompt it to correct simple Python errors. Love their interface (you are totally right, it's much further along than OpenAI's), but it's a no-go when I get locked out. I'm back on GPT-4o.

  • @horrorislander 2 months ago +2

    I decided to interact with LLMs using some of my old creative writing (just to see how they understand us and interact when not just obeying instructions or solving puzzles), and Claude, especially Claude 3 Sonnet, was my favorite by far. ChatGPT is very hard to "talk with", and even Claude 3 Opus seems a bit ponderous. But Claude 3 Sonnet is quick (in the "catching on" sense) and curious (or pretends to be). It's also embarrassingly obsequious, at least until you trigger its "safety" rules with, say, a little joke ("Just remember me when it's time to kill all humans."). I posted some videos of the interactions on my channel, though be warned: I'm basically just amusing myself, so the text-to-speech presentation format is pretty rough.

  • @TomM-p3o 2 months ago +4

    I cancelled my ChatGPT subscription about 6 months ago. Got tired of them downgrading models for a year while telling everybody the new models are their best.
    Their unresponsive/lazy/unintelligent November model plus the app store was the last straw.
    This model/visualization looks very promising.

    • @aifluxchannel 2 months ago +2

      I agree, GPT-4 has been unimpressive for the past few months. It seems like I have to argue with it to get even half-decent results.

    • @tomastorresmarin4271 2 months ago +2

      True. Their models have been getting worse since GPT-4 Turbo. The best one from them was GPT-4 on release.

  • @MrDowntemp0 2 months ago +1

    Very interesting! But I do prefer your coverage of open-source models, just because I prefer to run them myself for cost/privacy reasons. Still, it's good to stay informed about the developments out there.

  • @GerryPrompt 2 months ago +1

    I still like GPT-4 but can't wait to try this out when I'm done with work today 😮 - do you use the API or...?

    • @aifluxchannel 2 months ago +2

      I use both - the new interface is so good I might just use that for a while. I'm currently using a custom integration I wrote for the Zed editor.

  • @pigeon_official 2 months ago +2

    I've found from my tests that GPT-4o is still definitely the best for math questions. It gets them right more often, shows more of its work and shows it better, and the web UI for Claude doesn't seem to support LaTeX as well. For creative writing, I was expecting Claude 3.5 to be better, since Claude 3 Opus is very human. But when it comes to sounding human and creativity, Claude 3 Opus is still, to this day, better than both GPT-4o and Claude 3.5 Sonnet. So this release is of course great because it's free, but if you're expecting a major, super-duper improvement or anything, it's not there. ChatGPT is still probably better for most situations simply because it has more features.

    • @aifluxchannel 2 months ago +3

      Could not agree more. All of the points you made were previously why I preferred a flow of Google Gemini for generation, then GPT-4 to clean up the output and make it sound more human. Anthropic has completely nailed this Claude 3.5 release. Full stop.

    • @pigeon_official 2 months ago +1

      @evidenceX If you prompt ChatGPT well, it doesn't hallucinate.

    • @jasonn5196 2 months ago

      @pigeon_official If you have enough time to guide it and “prompt” it very well, you might as well just work out the solution for yourself.

    • @pigeon_official 2 months ago

      @jasonn5196 That's not the point. I'm obviously testing it with stuff I already know the answer to. I don't use AI to solve problems I actually need solving; I just test models and benchmark their performance.

  • @heresmypersonalopinion 2 months ago +15

    Sir, it's not "cloud". It's Claude or "Clawed". You can't do a video and get the name wrong. SMH.

    • @aifluxchannel 2 months ago +1

      Can confirm, my French-speaking gf will never let me live this down haha

    • @ianmatejka3533 2 months ago +2

      In German it's pronounced "cloud". People come from all across the globe, chill out

    • @mqx3888 2 months ago

      @ianmatejka3533 Yep, same here, even though I know it's wrong. Those French w's... almost broke my tongue.

    • @thetrueanimefreak6679 2 months ago

      Liesssssss, all liesssss. It's called kimchi.

    • @iceshoqer 2 months ago

      This single-handedly made me close the video, sorry haha.

  • @apester2 2 months ago +2

    Cool video. Looks enticing. I’m very much an OpenAI fanboy but starting to consider cancelling my ChatGPT sub as well.

  • @clapclapapp 2 months ago +3

    Does anyone know if it can maintain consistent character voices across longer fiction?

    • @aifluxchannel 2 months ago +1

      I do not, but I've been meaning to add in more literary benchmarks.

    • @Pyriold 2 months ago

      The Nerdy Novelist has a lot of content using Claude and Novelcrafter. Claude, in his opinion, is the best LLM for such tasks, but it is still an art form to get it right.

  • @fontenbleau 2 months ago +1

    It would be interesting to know how much the censorship filters reduced performance there; academic research has shown a significant decrease from such filters.

  • @obladioblada6932 2 months ago +1

    Here it made a series of bizarre hallucinations.

  • @Raskoll 2 months ago

    I like questions along the lines of:
    Michael loves metal, and he wears denim and leather. He teaches history at his local high school. He goes on long trips during the school holidays, but I don't know where or how he gets there.
    What's more likely:
    1. He's a teacher
    2. He's a teacher and also a biker.
    Claude 3.5 Sonnet gets that wrong unless you add "think step by step", which is cool. The better fine-tunes of models don't answer straight away, because now that models are so good, the difference between answering up front and thinking first is all it takes to get this right. Love the channel btw gw

    • @aifluxchannel 2 months ago +1

      Thanks, we'll use this in our next test!

  • @project-asgard 2 months ago +1

    That Pip-Boy chart! Ok I'm switching back to Claude :D

  • @dura2k 2 months ago +2

    It’s more Intel vs AMD, where AMD has much better performance in some generations. You can’t say that against Nvidia. The best Nvidia card was always ahead of AMD (at least after AMD bought ATI).
    And I think Claude Opus is better than OpenAI most of the time.

    • @aifluxchannel 2 months ago

      This is a great addition. I think OpenAI will continue to struggle, since they're trying to keep their product releases similar because they have the largest captive audience and want to keep Microsoft happy. Anthropic took a huge risk here and it paid off.

  • @AndyBerman 2 months ago

    My name is pronounced "Claude" - it rhymes with "clawed" or "awed". It's a French name, so the final 'e' is silent. In the International Phonetic Alphabet (IPA), it would be transcribed as /klɔːd/.
    The first part "Cl" is pronounced like the beginning of "claw". The "au" makes an "aw" sound like in "saw" or "paw". And as mentioned, the final 'e' is silent.

  • @Person-hb3dv 2 months ago +1

    I have a feeling that they released this one just to hype up the Opus model they are gonna release later. Though this one is still very impressive.

    • @aifluxchannel 2 months ago +1

      Yes, BUT if Sonnet 3.5 is this good I can't imagine how capable Opus 3.5 will be.

    • @Pyriold 2 months ago

      It is already better than GPT, so it makes sense to release what they had. A bigger model probably needs more training time (see Llama 3, we still don't have the big model).

  • @user-wr4yl7tx3w 2 months ago +1

    What about data leakage? Is that why such models are doing so well?

    • @aifluxchannel 2 months ago

      I haven't noticed much, but as with many closed-source models, there's no way I'd do anything with production code with Claude.

  • @OverkillDM 2 months ago +1

    8:40 Just use the towel to knock over the bucket :)

    • @aifluxchannel 2 months ago

      AGI is going to be truly wild haha

  • @jasonn5196 2 months ago +1

    GPT-4o drives me crazy. It often makes small unsolicited changes in scripts that create more and more errors to fix, and then it prefers to focus on "fixing" the original parts of the script that work, or anywhere else, rather than the errors it introduced.
    It’s the worst loop ever.

    • @aifluxchannel 2 months ago

      This is basically why I decided to cancel my GPT-4 subscription.

  • @timber8403 2 months ago +2

    Still no voice. Texting is tiresome and so last year

    • @antaishizuku 2 months ago +1

      Vosk has good voice to text if you want a small local model

    • @aifluxchannel 2 months ago +1

      TTS is super underrated - used to be relatively cringe but now it's one of the strongest aspects of open source AI

    • @antaishizuku 2 months ago

      @aifluxchannel Yeah, I was literally mind-blown seeing a 40 MB model file do pretty accurate STT. The best open-source TTS I've found at the moment is espeak with pyttsx3, though I know there are other options and projects that are better, like OpenAI Whisper. For now I'm trying to focus resources on good content over sounding realistically human. Eventually I'd like to get into the fancier TTS, but I have so much to code that I often have to go for the simplest solution that gives the best accuracy/detail, so I can focus on core functionality.

  • @pensiveintrovert4318 2 months ago

    Just tested the new and improved Claude on a logic problem. It is as dumb as it has ever been and as dumb as GPT-4o.

  • @avi7278 2 months ago +2

    try it with me now: "ahhhh". "owwwww". "aaaah". "owww". "ahhhh". "owwww". "clahhhhh". "clowwww". "claahhhhd". "clowwwwd".

    • @aifluxchannel 2 months ago

      My French-speaking GF has been doing this since I posted the video lol

    • @avi7278 2 months ago

      @aifluxchannel Which ElevenLabs voice is she? Jk 😆