Byte Latent Transformer - BLT explained (Entropy of Next Byte, META)

  • Published 19 Dec 2024

COMMENTS • 9

  • @code4AI  14 hours ago +3

    Please note, with the automatic dubbing from UA-cam /Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.

  • @mrpocock  18 hours ago +8

    Byte-level LLMs are obviously the way forward for that first round of training where you're predicting 1..n tokens given the prefix, particularly for multi-language models. Tokenization is clearly a hack, like the dark ages of image neural networks, when we would hand-craft feature-detection kernels.

  • @wwkk4964  13 hours ago

    Thank you so much for covering this paper! I had been thinking about this specific implementation for a year, and I believe it's a significant step toward a truly general learning architecture that minimizes hand-crafted human priors.

  • @themax2go  12 hours ago +1

    i'm having a plantbased BLT right now

  • @King_Deundel  1 hour ago

    BLT seems the way to go in an ideal world, but there are definitely problems with it. I think tokenizers have accomplished tremendous work, and we got to this state thanks to improvements in vocab size and tokenization mechanisms. From this point on, though, we may have the technology and resources to try BLT on a model (I still don't think it would work that much better).

  • @davidwynter6856  10 hours ago

    Can you clarify whether pre-training will have to use the BLT embeddings? I.e., unless models pre-trained with BLT start appearing on Hugging Face or elsewhere, we mere mortals will not be able to take advantage of this new method?

  • @TalsBadKidney  18 hours ago +1

    very very cool

  • @JeomonGeorge  15 hours ago

    Does the small transformer use BPE? And for H(x_i), is it computing the cross-entropy? 26:13
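    The H(x_i) asked about here is the Shannon entropy of the next-byte distribution produced by a small byte-level language model; BLT uses it to decide where patch boundaries fall (high entropy means the next byte is hard to predict, so a new patch starts). A minimal sketch of that entropy test, with a made-up threshold value and toy distributions standing in for real model outputs:

    ```python
    import math

    def next_byte_entropy(probs):
        """Shannon entropy H(x_i) = -sum_v p(v) * log p(v), in nats,
        over a 256-way next-byte distribution from a byte-level LM."""
        return -sum(p * math.log(p) for p in probs if p > 0.0)

    # Toy distributions (real ones would come from the small transformer):
    # uniform over 256 bytes gives the maximum entropy, log(256) ≈ 5.545 nats;
    # a confident, peaked distribution gives low entropy.
    uniform = [1.0 / 256] * 256
    peaked = [0.99] + [0.01 / 255] * 255

    # Hypothetical patching threshold, not a value from the paper.
    THRESHOLD = 1.5
    print(next_byte_entropy(uniform) > THRESHOLD)  # True: start a new patch
    print(next_byte_entropy(peaked) > THRESHOLD)   # False: extend the patch
    ```

    Note this is the model's predictive entropy, not the cross-entropy against the observed byte; the two coincide in expectation only when the model matches the data distribution.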

  • @ivangoncharuk607  5 hours ago

    Bacon Lettuce Tomato