Can ChatGPT solve the world's hardest puzzles?

  • Published 18 Dec 2024

COMMENTS • 84

  • @sanjey-ww8jn
    @sanjey-ww8jn 1 year ago +201

    This guy talks to ChatGPT exactly the same way interviewers do during my technical interviews xd

  • @nurichbinreel4782
    @nurichbinreel4782 1 year ago +53

    You didn't even have to go this hard. Ask GPT-4 to solve a simple Caesar cipher. You can even tell it the exact letter shift and it will still fail to apply it.

    • @EvanG529
      @EvanG529 10 months ago +4

      ChatGPT doesn't do well with the actual details of a string of text. It can't really store information you give it, it just knows what word probably comes next.
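
For reference, the Caesar shift discussed in this thread is a purely mechanical, letter-by-letter operation. A minimal sketch in Python, assuming the plain 26-letter English alphabet; the function name `caesar_shift` is only illustrative:

```python
import string


def caesar_shift(text: str, shift: int) -> str:
    """Shift every ASCII letter by `shift` positions, wrapping around the alphabet.

    Non-letters (spaces, digits, punctuation) are left unchanged.
    """
    result = []
    for ch in text:
        if ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        elif ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)
    return "".join(result)


encoded = caesar_shift("Hello, World!", 3)
decoded = caesar_shift(encoded, -3)  # a negative shift undoes the encoding
```

The operation depends on the position of each individual letter, which is exactly the kind of detail the reply above says the model handles poorly.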

  • @makesnosense6304
    @makesnosense6304 1 year ago +18

    What's funny about all these "Coded an entire website using ChatGPT" is that 1. It's not really an entire website. It's just basic stuff. Unless you count some one page site with some simpler functionality and buttons an "entire website"... And 2. There were plenty of corrections before ending up with whatever was made.

  • @FAB1150
    @FAB1150 1 year ago +27

    4:09 FYI it didn't bug out, it ran out of tokens for the answer. You can tell it "continue" or "go on", and it will go on with the answer!

    • @xBINARYGODx
      @xBINARYGODx 1 year ago +2

      yes, this and other things make me think he doesn't understand language models and their limitations too.

  • @IHaventDiedYet
    @IHaventDiedYet 1 year ago +71

    Thought this was like a 50k subs channel, only 148? Greatly underrated

  • @Viniter
    @Viniter 11 months ago +6

    It produced text that looks convincingly like an answer to a logic puzzle, which is exactly what it's trained to do, so 10/10.

  • @sandoh9500
    @sandoh9500 1 year ago +39

    This channel is destined for big stuff

    • @Jake28
      @Jake28 1 year ago +1

      you're destined for big stuff

  • @colouredmirrorball
    @colouredmirrorball 1 year ago +1

    I was trying to concentrate on the puzzles, but I kept getting distracted by THE LICC

  • @Xeverous
    @Xeverous 1 year ago +6

    ChatGPT can't solve anything because it doesn't understand the meaning of words. All it does is pattern matching and probability models. The answers to simple puzzles probably come out right just because the training input already had them and the bot correlated these answers with the questions in the input.

  • @GeoRoze
    @GeoRoze 1 year ago +5

    If anyone wants some peak humour:
    Ask chatgpt to draw an ascii art of yoshi.
    You’ll be surprised at what you find

  • @kipchickensout
    @kipchickensout 1 year ago +6

    It's only good for stuff that does not require too much thinking or calculation. If I told it to give me certain logic in code, it was only able to do it for very common things such as a Levenshtein distance algorithm, but not for something lesser known.

    • @orterves
      @orterves 1 year ago

      That's because it doesn't do any thinking or calculations. It's a word predictor. It predicts words.
      Don't hammer your nails with a sponge. Don't run mathematical calculations with a word predictor.
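
For readers who have not met it, the Levenshtein distance mentioned above is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into another. A minimal sketch of the textbook dynamic-programming version in Python (the function name is illustrative, and this is not the code ChatGPT produced for the commenter):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only one previous row in memory."""
    # prev[j] is the distance between the first i-1 characters of a
    # and the first j characters of b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        prev = curr
    return prev[-1]


distance = levenshtein("kitten", "sitting")
```

Because this algorithm appears in countless tutorials, it is a plausible example of the "very common things" the commenter describes.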

  • @KidJV
    @KidJV 1 year ago +1

    underrated channel is underrated

  • @orterves
    @orterves 1 year ago +13

    ChatGPT is a word predictor with a bias towards attempting to match the current conversation context.
    The more you try to correct it in a conversation, the more tied up in the context it gets.
    Don't correct the bot with further conversation; edit your statements or restart the conversation entirely.

    • @charleystello1822
      @charleystello1822 1 year ago

      Yes! While these puzzles are really difficult and I'm not so sure it would have gotten them anyway, the way he was talking to GPT was not exactly "correct". As you said, it relies heavily on context. It is possible to correct simple mistakes, but on difficult tasks correcting it does more harm than good, because it starts to get confused about what is fact and what is not, based on both what it said previously and what the user has input. The majority of the hallucinations I have witnessed with GPT come from trying to correct it without doing it the right way, if that makes sense.

  • @mcwolfbeast
    @mcwolfbeast 1 year ago +17

    ChatGPT is a language model based thing. Don't expect it to understand problems that fall outside of the scope of basic logic and language comparison.

    • @orterves
      @orterves 1 year ago +7

      This absolutely. I wonder though if a similar model trained purely on mathematics would have better success with maths problems?

    • @SgtSupaman
      @SgtSupaman 1 year ago +6

      So, it failing to provide a word with 11 letters has nothing to do with language?

  • @xrayian
    @xrayian 1 year ago +1

    Kinda feeling happy for being a subscriber before 1k, you'll do great if you keep at it!

  • @herzogsbuick
    @herzogsbuick 1 year ago

    that LifeAdviceLamp "Buy Lottery Tickets" tweet, is the king of the city of my heart

  • @IcecubeX
    @IcecubeX 1 year ago +7

    this is really cool

    • @kevinfaang
      @kevinfaang 1 year ago +7

      you're pretty cool yourself, IcecubeX 2000

  • @sharpieman2035
    @sharpieman2035 11 months ago +1

    Are you the Kevin Fang that works at Jane Street or is that a different Kevin Fang? He’s on LinkedIn if you’re not him and want to find him.

  • @Ramonatho
    @Ramonatho 1 year ago

    Hitting the AI with "bruh" is what's gonna lead to the robot uprising isn't it

  • @I_Was_Named_This_Way...
    @I_Was_Named_This_Way... 1 year ago +7

    You are very underrated ):

  • @asdfssdfghgdfy5940
    @asdfssdfghgdfy5940 1 year ago +1

    Man I would die on the hill about seed being an early stage of a plant.

  • @lightning_11
    @lightning_11 1 year ago +1

    7:08 all that math looks impossible to me.

  • @perelmanych
    @perelmanych 1 year ago +12

    I asked ChatGPT if it knows the Bulls and Cows game and suggested we play it. The bot thought of a number and I had to guess it. After the third answer, the limitations of a bot that just tries "to continue a sequence of words with the most probable candidate" became very obvious)) Its answers were inconsistent, and when I pointed out the inconsistency it agreed it had made a mistake, but the new answer it gave was as inconsistent as the previous one. To sum up, when ChatGPT has seen something similar to a problem in its training set, as I believe was the case for the single-cross problem, it can produce wonderful results, but do not expect real reasoning from it.
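
The inconsistency described above can be checked mechanically. A minimal sketch in Python, assuming the classic rules (a 4-digit secret with no repeated digits; a "bull" is a correct digit in the correct place, a "cow" is a correct digit in the wrong place). The names `score` and `consistent_secrets` are hypothetical helpers for illustration:

```python
from itertools import permutations


def score(secret: str, guess: str) -> tuple[int, int]:
    """Return (bulls, cows) for a guess; assumes no repeated digits in either string."""
    bulls = sum(s == g for s, g in zip(secret, guess))
    common = len(set(secret) & set(guess))  # digits shared regardless of position
    return bulls, common - bulls


def consistent_secrets(history: list[tuple[str, tuple[int, int]]]) -> list[str]:
    """List every 4-digit secret that agrees with all (guess, answer) pairs so far."""
    candidates = ("".join(p) for p in permutations("0123456789", 4))
    return [
        secret for secret in candidates
        if all(score(secret, guess) == answer for guess, answer in history)
    ]


# If the answers given so far leave no possible secret, they contradict each other.
contradictory = consistent_secrets([("1234", (4, 0)), ("1234", (0, 0))])
```

An empty result means no secret number could have produced the answers, which is the situation the commenter ran into.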

  • @lancemarchetti8673
    @lancemarchetti8673 1 year ago +2

    I created a simple steganography challenge for ChatGPT, which only required 5 steps to uncover my pseudo Google account details. The bot could not solve it. I used no password encryption, only standard ASCII reversal, encoded to binary, then to Base64. I then advanced every 3rd character in the Base64 by 1. I then added the resulting string into metadata in a standard JPEG file depicting a red rose. It would have been cool if the AI could have uncovered the hidden data. Perhaps we still have far to go before AI can achieve this?

    • @samuelthecamel
      @samuelthecamel 1 year ago +1

      ChatGPT is really just a fancy next-word predictor, so it can't really do stuff like that and probably won't for a long time. It's like trying to use a fork to eat soup. That being said, if there's an AI that is specially trained for this task, it may be able to recover your account details.

    • @lancemarchetti8673
      @lancemarchetti8673 1 year ago

      @samuelthecamel Agreed
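
The encoding steps described at the top of this thread can be sketched in a few lines of Python. The comment leaves several details open, so the following are assumptions rather than the commenter's actual procedure: "ASCII reversal" is read as reversing the string, "binary" as space-separated 8-bit groups, "every 3rd character" as zero-based indices 2, 5, 8, and so on, and "advanced by 1" as adding one to the character's code point. Writing to JPEG metadata is left out.

```python
import base64


def encode(secret: str) -> str:
    """Reverse -> 8-bit binary text -> Base64 -> bump every 3rd character by one."""
    reversed_text = secret[::-1]
    bits = " ".join(format(byte, "08b") for byte in reversed_text.encode("ascii"))
    b64 = base64.b64encode(bits.encode("ascii")).decode("ascii")
    # Assumption: "advance by 1" means the next code point, applied at indices 2, 5, 8, ...
    return "".join(chr(ord(c) + 1) if i % 3 == 2 else c for i, c in enumerate(b64))


def decode(blob: str) -> str:
    """Undo encode() by applying the inverse of each step in reverse order."""
    b64 = "".join(chr(ord(c) - 1) if i % 3 == 2 else c for i, c in enumerate(blob))
    bits = base64.b64decode(b64).decode("ascii")
    data = bytes(int(chunk, 2) for chunk in bits.split())
    return data.decode("ascii")[::-1]


hidden = encode("user@example.com")  # placeholder text, not real account details
recovered = decode(hidden)
```

Every step is deterministic and reversible once the recipe is known, so the difficulty of the challenge lies in guessing the recipe, not in the arithmetic.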

  • @ryanm2648
    @ryanm2648 1 year ago +2

    The issue is that you found these tests online. ChatGPT has scanned the internet so it can get many of the word riddles. Some of them it just doesn't know what you're asking.

    • @AgentFire0
      @AgentFire0 1 year ago +3

      Yup, I suspected as much when I copy-pasted Einstein's famous riddle into ChatGPT and it immediately blurted out the right answer. However, when I simply replaced "German guy" with "Russian guy" in the puzzle's description, ChatGPT fucking exploded with wrong answers, illusions of logical thinking, and other nonsense.
      So, in the end, it couldn't even match my input against the IDENTICAL input it was trained on, save for ONE replaced word.

    • @ryanm2648
      @ryanm2648 1 year ago

      @AgentFire0 I have managed to actually get it to do problem solving by making word riddles that are not anywhere on the internet.
      I asked it something like this.
      There are three boxes, box A, B, and C, all placed side by side. These all look identical.
      Fred places a coin in box A for storage, and leaves the room.
      While Fred is not anywhere nearby, box A is switched places with box C.
      The coin is removed, and placed in box B.
      When Fred returns, he looks in the storage where he placed his coin, which box will Fred check first?
      It got this right for me.
      And then as another test, you could say the boxes are labelled BUT you need a way to ask it "Which position does he check" rather than "Which box does he check". Because, if they are labelled, he will see it has switched places, and check the third position (where box C was) BUT this is still technically box A. So even though the answer is the same if they're labelled (He will check box A) the place where box A is has changed.

  • @danieltao261
    @danieltao261 1 year ago

    Can you add what bgm you used to the description?

    • @kevinfaang
      @kevinfaang 1 year ago

      All original music in this one - the intro one is on this channel (the davie504 video)

  • @JohnDlugosz
    @JohnDlugosz 1 year ago +20

    After seeing so many reports of astonishing things ChatGPT can do, it appears the worm has turned and now we find it interesting where it fails.

    • @fergalhennessy775
      @fergalhennessy775 1 year ago +15

      it's an nlp language model, not the oracle of delphi, i wouldn't be concerned if your job requires brain power.

    • @television9233
      @television9233 1 year ago

      Not really. Seeing the difficulty of the puzzles, I would be surprised if it got any of the reasoning correct (although I did suspect it would have seen at least some of those answers on the internet previously, but I guess not).

  • @NuncNuncNuncNunc
    @NuncNuncNuncNunc 1 year ago +6

    A one story house with a basement is still considered one story. Only above grade level counts. Edge cases are everywhere.
    Tug of war: You might be able to convince ChatGPT that the correct answer is incorrect.

    • @Alex_Vir
      @Alex_Vir 1 year ago

      Also is a roof balcony counted as an additional story?

    • @NuncNuncNuncNunc
      @NuncNuncNuncNunc 1 year ago

      @Alex_Vir Balcony/deck is not a habitable space, not even inside the house, so no. Inside there could be stairs up to a roof deck in a one story house.

  • @pal181
    @pal181 1 year ago +1

    I once tried something like this and it did the same crap. Now I wonder how many hours they spent to get those ad results.

  • @djmips
    @djmips 1 year ago +1

    Any improvement with ChatGPT4?

    • @kevinfaang
      @kevinfaang 1 year ago +1

      Not sure (not going to sign up for premium). I think Bing AI uses GPT4 though...

    • @jean-lucsedits4319
      @jean-lucsedits4319 1 year ago

      I have been using Bing for a while now to search for things in a very obscure language. To be very honest, it's good at providing general answers but really bad at giving very precise results; in that case Google actually beats it. Also, I hate that it doesn't open YouTube videos on YouTube. In conclusion, I don't really see much of an improvement atm :)

  • @SunnyNagam
    @SunnyNagam 1 year ago

    Language models are horrible at letter-wise questions like the first one. The model basically doesn't even read letter by letter or know where the letters are, since it turns the words into chunks and the chunks into embedding space vectors. Language models are also not really built for math, since the neural networks they're based on have no way to perform calculations outside of memorization and pattern matching.
    That being said, even if these two limitations didn't exist, it would probably still get the questions wrong, since AI just isn't there yet to do this level of multi-level creative reasoning... Yet.
    I'd be curious to see the results with GPT-4 and "chain of thought" prompting, as I'm sure that would perform much better.

  • @Ikxi
    @Ikxi 1 year ago +1

    I gave up on chatgpt when it just kept giving me the same code over and over again.
    It's so painful.

  • @codewizard58
    @codewizard58 1 year ago

    chatgpt is a very chatty talking dice.

  • @Ceidonianphysicist
    @Ceidonianphysicist 1 year ago +3

    I asked it to play tic tac toe with me. A game famous for always ending in a draw, been solved by computers since the 50s and famously used in the 80s film Wargames to teach the rogue ai in that film about no win scenarios. I won every game against chatgpt. It’s a very clear word prediction algorithm but intelligence it is not.
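
The claim that tic-tac-toe always ends in a draw under perfect play can be verified with a short exhaustive search. A minimal sketch in Python using plain minimax with memoization; the board is assumed to be a 9-character string read row by row, with `" "` for an empty square, and the function names are illustrative:

```python
from functools import lru_cache
from typing import Optional

# The eight winning lines: three rows, three columns, two diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board: str) -> Optional[str]:
    """Return "X" or "O" if that player has completed a line, otherwise None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def value(board: str, player: str) -> int:
    """Game value with `player` to move: +1 if X wins, -1 if O wins, 0 for a draw."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if " " not in board:
        return 0
    other = "O" if player == "X" else "X"
    results = [
        value(board[:i] + player + board[i + 1:], other)
        for i, square in enumerate(board) if square == " "
    ]
    # X picks the move that maximizes the value, O the move that minimizes it.
    return max(results) if player == "X" else min(results)


empty_board_value = value(" " * 9, "X")
```

A value of 0 for the empty board means neither side can force a win, so losing every game, as the bot did here, requires repeated mistakes.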

  • @ai-spacedestructor
    @ai-spacedestructor 1 year ago

    I'm not surprised by this. Puzzle solving was probably a fairly low percentage of the training data, and since it can't access the internet it's not able to learn or look up how it works, and is therefore just randomly guessing like most people probably would.

  • @Paulo27
    @Paulo27 1 year ago

    I'm actually crying, this was hilarious.

  • @CC21200
    @CC21200 4 months ago

    "Difficulty" of a puzzle (to humans) is not a relevant metric here. Rather, it's more a matter of how well-documented the solutions are in its training dataset.

  • @natew4724
    @natew4724 1 year ago

    Answer: No, but it sure thinks it can.

  • @mikesum32
    @mikesum32 1 year ago

    Puzzle one sounds like the ABCs song.

  • @danraine9009
    @danraine9009 1 year ago

    bruh is my most used comment back to chat-gpt's answers hahaha had me laughing there

  • @alepouna
    @alepouna 1 year ago +2

    Would be fun to see this revisited with chat gpt 4

  • @TMinusRecords
    @TMinusRecords 1 year ago +1

    Funny how the language model is terrible at the language puzzle, but great at the maths one

  • @samuelthecamel
    @samuelthecamel 1 year ago

    ChatGPT can't actually read letters. Instead, words are simplified into "tokens," which may be a full word or a part of a word. This puts it at a severe disadvantage with any word puzzles.
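
To make the point about tokens concrete, here is a toy illustration in Python. It is not the tokenizer ChatGPT uses (real tokenizers learn their vocabularies from data with schemes such as byte-pair encoding); the vocabulary `TOY_VOCAB` and the greedy longest-match rule are invented purely for this sketch:

```python
# Hypothetical vocabulary: each entry maps a chunk of text to an integer id.
TOY_VOCAB = {"straw": 0, "berry": 1, "puzz": 2, "le": 3}


def toy_tokenize(word: str, vocab: dict[str, int] = TOY_VOCAB) -> list[int]:
    """Greedily split `word` into the longest known chunks and return their ids."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                ids.append(vocab[word[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


token_ids = toy_tokenize("strawberry")
```

In this toy setup the model would receive two integers for "strawberry" rather than ten letters, so a question about individual letters has to be answered without ever seeing them directly.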

  • @babywaffles
    @babywaffles 1 year ago

    Try GPT-4

  • @herzogsbuick
    @herzogsbuick 1 year ago

    You have truly doubled Dolly with extra care.

  • @maxmustermann8447
    @maxmustermann8447 1 year ago

    Duuude, you new, you good! please keep it up! :D
    Sup from me

  • @Veptis
    @Veptis 1 year ago

    The model sees token ids, not words or letters

  • @polygonalcube
    @polygonalcube 1 year ago

    I'd give it a score of 1/2.

  • @adre2194
    @adre2194 7 months ago

    Language models are impressively bad with anything related to math. I once gave one a string and asked it to count the characters and it failed in the most spectacularly impressive ways.

  • @Batman_akzo
    @Batman_akzo 1 year ago

    It can't answer simple questions and you're talking about jane street puzzles

  • @fosy6991
    @fosy6991 1 year ago

    i was here before he got big.

  • @EvanBear
    @EvanBear 1 year ago

    ChatGPT doesn't actually understand or analyze anything, it just makes shit up. Its main goal is to "sound" true, whether or not it's actually true doesn't matter.

  • @atmavighyan6710
    @atmavighyan6710 1 year ago

    Worth trying again with v4

  • @1234567qwerification
    @1234567qwerification 1 year ago +1

    The Python code is cringe.

  • @davronsherbaev9133
    @davronsherbaev9133 1 year ago

    Important note: you started with GPT-4 and continued with ChatGPT. Next time try to use GPT-4, it's much smarter)

  • @aze4308
    @aze4308 1 year ago

    yoo

  • @monoco1159
    @monoco1159 1 year ago

    Prompt engineering is an actual skill, sir. You are not leveraging the complete potential of cGPT with your prompts. Retry this again but this time craft the prompts in instructions format. Look at its training for reference.

  • @SgtSupaman
    @SgtSupaman 1 year ago

    Bot solves 0/3 problems, scores 2/5... Sure, ok.