Stealing LLMs (MIT, Microsoft, Harvard)

  • Published 15 Nov 2024

COMMENTS • 21

  • @code4AI
    @code4AI  1 day ago +1

    Hi community, it is only natural that when we encounter a new method, we search for the nearest "known method" or "explanation" for it. Because why should we learn something new when we have already learned something simpler? The same happens here: "Is this not simple model distillation?"
    Well, while both methods aim to replicate the behavior of an existing (large language) model, our old "model distillation" focuses on compressing a known model into a smaller one using supervised learning techniques, leveraging full access to the teacher model's outputs and maybe even additional info.
    In contrast, the method I've presented in this video is about reverse-engineering an unknown model's distribution through interactive queries, employing advanced mathematical techniques to construct a compact approximation without direct access to the model's structure, parameters, training data, or internally learned sequences.
    I know it is more complex, and if you think "it is about the same" ... that is absolutely okay with me. But please don't post as if you were a professor of mathematics with complete understanding and declare "it is absolutely the same" ... you might misinform others.
    However, if you want a comparison or further clarification, why not read the arXiv pre-print from MIT for a more detailed understanding? You might discover new ideas ...
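
To make the contrast in the pinned comment concrete, here is a minimal, hedged sketch (not the algorithm from the MIT pre-print; all names such as `distillation_step` and `collect_conditional_queries` are hypothetical): classic distillation trains a student against the teacher's fully visible output distribution, while in the black-box setting the unknown model is reachable only through queries that return conditional next-token probabilities, and any compact approximation must be fit from those answers alone.

```python
# Illustrative contrast only -- a hedged sketch, not the method from the paper.
# The teacher/student models, the `api` callable, and all function names are hypothetical.
import torch
import torch.nn.functional as F

# (a) Classic distillation: full (white-box) access to the teacher's logits.
def distillation_step(student, teacher, tokens, optimizer, T=2.0):
    """One supervised step: pull the student toward the teacher's distribution."""
    with torch.no_grad():
        teacher_logits = teacher(tokens)          # teacher outputs are visible
    student_logits = student(tokens)
    loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# (b) Black-box setting: the unknown model answers only conditional queries.
def collect_conditional_queries(api, prefixes):
    """Record (prefix, next-token distribution) pairs; a compact approximation
    of the unknown model has to be fit from a table like this alone."""
    return {prefix: api(prefix) for prefix in prefixes}
```

The point of the contrast: in (a) the teacher's full output distribution is available for every training batch, while in (b) the only levers are which prefixes to query and how to use the returned distributions.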

  • @MusingsAndIdeas
    @MusingsAndIdeas 2 days ago +5

    I'm just going to put this out there: this is exactly how we develop a Theory of Mind. Replace "steal" with "understand" and "LLM" with "person" and you get something intimately familiar to any human being.

    • @zacharygray3078
      @zacharygray3078 12 hours ago

      When you go really hard on predicting the next token, you are building something that creates a near-perfect model of the representations of that token.

  • @LuanVasconcelosCorumba
    @LuanVasconcelosCorumba 13 hours ago

    I'm very curious about which prompts you use to simplify the papers while keeping the formulas.

  • @joehopfield
    @joehopfield 10 hours ago

    Oh no, our collective "private" data that models trained on without permission might be revealed... and that would be evidence of an obvious corporate crime. 👩‍⚖

  • @ulamss5
    @ulamss5 1 day ago +1

    Is this different from teacher-student "distillation"?

    • @code4AI
      @code4AI  1 day ago

      Great question. I pinned a reply to your question to the top of the comments, since multiple subscribers asked the same question.

  • @mshonle
    @mshonle 2 days ago +5

    Why not try asking o1 or Claude 3.5 about that last theorem? Given that you can ground the discussion with the paper, you may have that second brain within your reach. It's interesting that they cast it in terms of model stealing when it seems this could work as distillation in general? (Perhaps that was the original goal, and this black-box case fell out as an interesting idea?)

    • @code4AI
      @code4AI  2 days ago +4

      Smile. No chance with current AI. Maybe in 2 to 5 years?

  • @chakery3
    @chakery3 1 day ago

    Isn't it the same as knowledge distillation and model pruning?

    • @code4AI
      @code4AI  1 day ago

      Great question. I pinned a reply to your question to the top of the comments, since multiple subscribers asked the same question.

  • @apoage
    @apoage 1 day ago +1

    This sounds like a distillation technique... It could be used for stripping out isolated or dormant areas of a model... but meh

    • @code4AI
      @code4AI  1 day ago

      Great question. I pinned a reply to your question to the top of the comments, since multiple subscribers asked the same question.

  • @bediosoro7786
    @bediosoro7786 2 days ago +2

    Great job. Unfortunately, there is no experimental proof. I believe you can mimic the LLM on a specific task, but stealing all of its abilities will be very difficult.

    • @code4AI
      @code4AI  2 days ago +2

      If there is a mathematical proof that is valid (and that I can understand), I need no experimental proof. It is not about a single specific task; it is about the complete model, since we test the complete mathematical space of all representative sequences.

    • @egoincarnate
      @egoincarnate 2 days ago +2

      @@code4AI Just because it is theoretically possible doesn't mean modern LLMs are sufficiently low-rank for this "stealing" to be practical. Theoretically, the large integers used in public-key crypto can be factored, but that is not a practical means of attack. Experimental results would provide a calibration of expectations. It's possible that, while theoretically feasible, it could cost more to "steal" a model than to train a new one from scratch.

    • @code4AI
      @code4AI  2 days ago

      @egoincarnate Do you have any facts that would support your statement "it would cost more to 'steal' a model than to train a new one from scratch"? Since this idea was just published yesterday, you can't have empirical data.

    • @egoincarnate
      @egoincarnate 2 days ago

      @@code4AI Sorry, that should have been "could", not "would". Corrected.

    • @bediosoro7786
      @bediosoro7786 2 days ago

      @@code4AI I see, thank you. I have not read the paper; I just scrolled through to see the end. The story was quite exciting and ended up as an open question.