Can't wait for Large Concept Models (another paper by Meta) combined with BLT. It might noticeably improve models while using the same training data.
First-principles thinking: why not go down to the fundamental level of all things digital? Byte-level LLMs (byte -> byte) are the most universal. No more trying to tokenize many human languages; byte-level models could handle ANY type of digital data. o1 predicts that byte-level LLMs will be cost-effective by 2026-2028. We will see if that is correct...
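As a minimal sketch of what "byte-level" means in practice (not from the BLT paper; the function name is just illustrative): any digital input, whether text in any script or arbitrary binary data, reduces to a sequence of integers in 0..255, so one fixed 256-symbol vocabulary covers everything and no language-specific tokenizer is needed.

```python
def to_byte_ids(data) -> list[int]:
    """Convert text or raw bytes into a byte-level 'token' sequence."""
    if isinstance(data, str):
        data = data.encode("utf-8")  # any human language serializes to UTF-8 bytes
    return list(data)                # each byte is already an ID in 0..255

if __name__ == "__main__":
    print(to_byte_ids("hi"))         # [104, 105]
    print(to_byte_ids("日本語"))      # 9 UTF-8 bytes, no special tokenizer required
    print(to_byte_ids(b"\x89PNG"))   # binary data works exactly the same way
```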
My friend showed me this. I like the content, the detail, and the way you explain things. Earned a sub!
Thank you! Now I understand the paper better!!! BLT was really hard to understand from just reading, not gonna lie.
Glad to hear that!
Huge thanks to you. I can't thank you enough; your videos make things so much easier to understand.
Thank you so much for your kind words! It's amazing to know that 😊
This was beautiful, thanks.
Awesome