RING Attention explained: 1 Million Token Context Length

  • Published 28 Sep 2024

COMMENTS • 7

  • @bilalviewing
    @bilalviewing 1 month ago

    The library example is super helpful, great explanation, thanks a lot!

    • @code4AI
      @code4AI 27 days ago

      You're very welcome!

    • @senx8758
      @senx8758 21 days ago

      Maybe not a good idea. You still need to know the details if you want to fully understand it. Some important details are skipped, yet too much time is spent on Q, K, V in the library example.
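
For readers new to the terminology in this thread: Q (queries), K (keys), and V (values) are the three projections at the heart of attention, and the per-token attention output is softmax(QK^T / sqrt(d)) V. A minimal NumPy sketch of scaled dot-product attention, the operation Ring Attention distributes across devices (shapes and names here are illustrative, not taken from the video):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted average of the values

# Toy example: 4 tokens, head dimension 8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)  # (4, 8)
```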

  • @Slappydafrog_
    @Slappydafrog_ 5 months ago

    Don't forget about LongRoPE! Validated up to a 2-million-token context length!

  • @senx8758
    @senx8758 21 days ago

    Where is the proof of permutation invariance? E.g. that the blocks are combined "correctly" for rescaling — what do "correctly" and "rescaling" mean here?

    • @senx8758
      @senx8758 21 days ago

      This needs some background from the FlashAttention paper (see the sketch after this thread).
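
To unpack the rescaling question above: FlashAttention, which Ring Attention builds on, computes softmax attention one K/V block at a time, keeping a running maximum of the scores, a running softmax denominator, and a running unnormalized numerator. Whenever a new block raises the running maximum, the old partial sums are rescaled by exp(m_old - m_new) so that all terms stay expressed relative to the same maximum. Because max and sum are commutative and associative, the blocks can be combined in any order and still yield the exact softmax — that is the permutation invariance being asked about. A minimal single-query sketch under these assumptions (names and shapes are illustrative, not from the video):

```python
import numpy as np

def blockwise_attention(q, kv_blocks):
    """One query row attended over K/V blocks with FlashAttention-style
    online softmax: keep a running max m, denominator l, and numerator acc,
    rescaling the old stats whenever a new block raises the max."""
    d = q.shape[-1]
    m = -np.inf          # running max of the attention scores
    l = 0.0              # running softmax denominator
    acc = np.zeros(d)    # running (unnormalized) weighted sum of values
    for K, V in kv_blocks:
        s = K @ q / np.sqrt(d)        # scores of this block
        m_new = max(m, s.max())
        scale = np.exp(m - m_new)     # rescale old stats to the new max
        p = np.exp(s - m_new)
        acc = acc * scale + p @ V
        l = l * scale + p.sum()
        m = m_new
    return acc / l

# max and sum are order-independent, so any permutation of the blocks
# (e.g. the order devices pass them around the ring) gives the same output.
rng = np.random.default_rng(0)
q = rng.normal(size=8)
blocks = [(rng.normal(size=(4, 8)), rng.normal(size=(4, 8))) for _ in range(3)]
out_forward = blockwise_attention(q, blocks)
out_reversed = blockwise_attention(q, blocks[::-1])
print(np.allclose(out_forward, out_reversed))  # True
```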

  • @jflu1
    @jflu1 5 months ago

    Can you explain… but of course we can, it is simple, it is AI!
    You know, when I watch your videos, you do such an amazing job of explaining things that it almost feels simple. I learn so much from every video you produce!