What does AI believe is true?

  • Published 17 Jan 2025

COMMENTS • 19

  • @farrael004
    @farrael004 1 year ago +4

    I'm really glad to see such a well-researched video about this paper instead of clickbaity headline reporting that glosses over the more interesting details. Adding that second paper at the end also helps to show that CCS is not the silver bullet for LLM hallucination that some might believe after reading the original paper.

  • @jobobminer8843
    @jobobminer8843 1 year ago +2

    Thanks for the video

  • @Subbestionix
    @Subbestionix 1 year ago +1

    I'm glad I found your channel :3
    This is extremely interesting

  • @quantumjun
    @quantumjun 1 year ago +2

    It might be interesting if they could do True/False and Yes/No at the same time to check the consistency.

    • @SamuelAlbanie1
      @SamuelAlbanie1 1 year ago +1

      Do you mean to supervise the model to predict this as well? One idea could be as follows:
      If the authors trained a regressor that could predict yes/no from the normalised features, then they would have proof that this signal is leaking. So instead, they could learn a projection and then use a trick from domain adaptation (reversing gradients) to ensure that the projected features contain no information about yes/no labels.
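
      A minimal sketch of that gradient-reversal idea in PyTorch (illustrative only; the names proj, truth_probe and yes_no_head, and the dimensions, are assumptions, not from the paper):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity on the forward pass; multiplies the gradient by -lambd on the backward pass.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

hidden_dim, proj_dim = 1024, 64                 # assumed sizes
proj = nn.Linear(hidden_dim, proj_dim)          # learned projection of the normalised features
truth_probe = nn.Linear(proj_dim, 1)            # the probe we actually care about
yes_no_head = nn.Linear(proj_dim, 2)            # adversary trying to recover the yes/no phrasing

def step_loss(features, truth_target, yes_no_label):
    z = proj(features)
    # Main objective: predict the "truth" signal from the projected features.
    truth_loss = F.binary_cross_entropy_with_logits(
        truth_probe(z).squeeze(-1), truth_target)
    # Adversarial objective: the reversed gradient pushes proj to remove yes/no
    # information, while yes_no_head itself still tries to recover it.
    adv_loss = F.cross_entropy(yes_no_head(grad_reverse(z)), yes_no_label)
    return truth_loss + adv_loss
```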

  • @TheThreatenedSwan
    @TheThreatenedSwan 1 year ago

    Are you reading David Rozado? I've noticed that chat AI has gotten better at keeping things consistent: it doesn't give one answer to one thing and then a completely contradictory answer to other things, even when they depend on the former. But this only seems to work linguistically. You can also ask things in a different mode, like analytically, where you ask it to examine data, analyze it, and then make a statement; but when you then ask for what should be the same thing in other ways, it gives you a completely different answer. Similarly, the framing can give you one answer, even if it generally goes back to what is PC for the model. It would be nice if it could establish what exactly is meant in material terms, what is communicated rather than merely what the words are, and also establish Bayesian priors to then draw more extended conclusions, but I don't see how this could be done for GPT and other chatbot-style models.

  • @juliangawronsky9339
    @juliangawronsky9339 1 year ago +2

    Interesting work. As I understand it, it's trying to capture the validity, or logic, rather than the soundness of a concept or the objective nature of a claim.

    • @SamuelAlbanie1
      @SamuelAlbanie1 1 year ago +2

      Thanks for sharing your perspective. My interpretation of the work is that the goal is to infer which claims the model "thinks" are true, in an unsupervised manner.
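
      For context, a minimal sketch of the kind of unsupervised probe CCS trains (details such as feature normalisation and the training loop are simplified; the hidden size of 1024 is an assumption):

```python
import torch
import torch.nn as nn

# CCS-style probe: maps a hidden state to an estimated P(statement is true).
probe = nn.Sequential(nn.Linear(1024, 1), nn.Sigmoid())

def ccs_loss(h_pos, h_neg):
    """h_pos / h_neg: normalised hidden states for a statement and its negation."""
    p_pos = probe(h_pos).squeeze(-1)
    p_neg = probe(h_neg).squeeze(-1)
    consistency = (p_pos - (1.0 - p_neg)) ** 2      # the two probabilities should sum to ~1
    confidence = torch.minimum(p_pos, p_neg) ** 2   # discourage the degenerate p ~ 0.5 answer
    return (consistency + confidence).mean()
```

      At inference time, a claim's score is typically taken as 0.5 * (p_pos + (1 - p_neg)) and thresholded at 0.5.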

  • @JustAThought01
    @JustAThought01 1 year ago

    Reality is generated by random events. Knowledge is defined to be logically related, non-random facts.

  • @XeiDaMoKaFE
    @XeiDaMoKaFE 1 year ago

    Yeah, let's base AI on the current peer-reviewed consensus BS and not the actual truth of the scientific method.

    • @SamuelAlbanie1
      @SamuelAlbanie1 1 year ago +2

      I suspect modern large language models (GPT-4, Claude, etc.) are often trained on large collections of peer-reviewed articles, so they will pick up on these. But I'm not sure I understand your comment (the focus of this work is on trying to determine what the AI thinks is true).

    • @XeiDaMoKaFE
      @XeiDaMoKaFE 1 year ago +1

      @@SamuelAlbanie1 My focus is on the root of the problem: who decides what counts as consensus truth among humans in the first place vs. the actual truth in the real world. AI could very well use the principles of logic to determine if something is true or not by picking out the fundamentals instead of the assumptions. For example, when you ask whether Michelson-Morley means there is no aether, or means there is no static aether on a moving Earth, it's trained to pretend the consensus is the truth instead of looking into the actual roots of Michelson-Morley and relativity to understand that the interference of the light can also mean a moving aether on a stationary Earth.
      My point is: they will never make AI actually solve problems about truth.

  • @younesprog2629
    @younesprog2629 1 year ago

    What about the LLMs used by the CIA, NSA, or DARPA? They're classified projects.

    • @SamuelAlbanie1
      @SamuelAlbanie1 1 year ago

      Unfortunately (or perhaps fortunately), I don't know much about the LLMs of the CIA and NSA...

    • @younesprog2629
      @younesprog2629 1 year ago

      @@SamuelAlbanie1 What I'm trying to say is: how can we verify the data they're using?