Inside the LLM: Visualizing the Embeddings Layer of Mistral-7B and Gemma-2B

  • Published 9 Mar 2024
  • We look deep inside the AI to see how the embeddings layer of a Large Language Model such as Mistral-7B or Gemma-2B actually works.
    You will learn how tokens and embeddings work, and you will extract the embeddings layer from Gemma and Mistral and load it into your own simple model, which we then use to visualize the embeddings (a minimal extraction sketch follows this description).
    You will see how an AI clusters similar words together and builds connections that cover not just similar words but also groupings of concepts such as colors, hotel chains, and programming terms.
    If you really want to understand how an LLM works, or even build your own, then the first layer of a Generative AI model is the best place to start.
    Github
    -----------
    github.com/chrishayuk/embeddings
  • Science & Technology
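
  • A minimal sketch of the extraction described above, assuming the Hugging Face transformers library (the checkpoint name and word list are illustrative, not the repo's actual code; see the GitHub link above for that):

      import torch
      from sklearn.decomposition import PCA
      from transformers import AutoModel, AutoTokenizer

      # Assumed checkpoint; Gemma-2B works the same way via "google/gemma-2b".
      model_name = "mistralai/Mistral-7B-v0.1"
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      model = AutoModel.from_pretrained(model_name)

      # The embeddings layer is just a lookup table of shape (vocab_size, hidden_dim).
      embedding_layer = model.get_input_embeddings()

      # A few words to project; multi-token words are truncated to their first
      # token here to keep the sketch short.
      words = ["red", "green", "blue", "hilton", "python", "java"]
      ids = [tokenizer(w, add_special_tokens=False).input_ids[0] for w in words]
      vectors = embedding_layer(torch.tensor(ids)).detach().numpy()

      # Squash the high-dimensional vectors down to 2-D so they can be plotted.
      points = PCA(n_components=2).fit_transform(vectors)
      for word, (x, y) in zip(words, points):
          print(f"{word}: ({x:.3f}, {y:.3f})")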

COMMENTS • 29

  • @chrishayuk
    @chrishayuk  2 months ago +2

    this is the github repo: github.com/chrishayuk/embeddings

  • @scitechtalktv9742
    @scitechtalktv9742 2 months ago +3

    Fantastic video!
    I am wondering: it would also be very interesting to visualize not only the static embeddings you already covered, but also the so-called contextualized embeddings from a later layer of the model! These are the embeddings that have passed through the attention mechanism, which is why they are also called dynamic embeddings.
    That adds another layer of abstraction, but they are better embeddings because they can distinguish between homonyms: words that are written the same but have completely different meanings in different contexts. A good example is the word “bank”, which can mean a financial institution, a river bank, and several other things depending on context. As a consequence, the word “bank” will be represented by several different vectors in embedding space, depending on the context it is used in!
    This technique is called Word Sense Disambiguation (WSD).
    Would it be possible to visualize that too? I am curious… (see the sketch after this thread).

    • @chrishayuk
      @chrishayuk  2 months ago +1

      yep, you got what i'm doing... i'm literally walking the stack

    • @chrishayuk
      @chrishayuk  2 months ago +1

      so those videos will be coming

    • @scitechtalktv9742
      @scitechtalktv9742 2 months ago +1

      @chrishayuk Fantastic! Those embeddings are crucially important to the workings of Large Language Models!
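
    A minimal sketch of the contextualized-embeddings idea in this thread, assuming the Hugging Face transformers library (the checkpoint and sentences are illustrative): the vector for “bank” is read from the last hidden layer, after attention has mixed in the surrounding context, so the two sentences yield different vectors.

      import torch
      from transformers import AutoModel, AutoTokenizer

      # Assumed checkpoint; any causal LM that exposes hidden states works.
      model_name = "google/gemma-2b"
      tokenizer = AutoTokenizer.from_pretrained(model_name)
      model = AutoModel.from_pretrained(model_name, output_hidden_states=True)

      def bank_vector(sentence: str) -> torch.Tensor:
          inputs = tokenizer(sentence, return_tensors="pt")
          with torch.no_grad():
              # hidden_states[-1] is the last layer; [0] selects the batch item.
              hidden = model(**inputs).hidden_states[-1][0]
          # Simplified position lookup (assumes " bank" maps to one token).
          bank_id = tokenizer(" bank", add_special_tokens=False).input_ids[0]
          pos = (inputs.input_ids[0] == bank_id).nonzero()[0].item()
          return hidden[pos]

      v_money = bank_vector("I deposited cash at the bank")
      v_river = bank_vector("We sat on the bank of the river")
      cos = torch.nn.functional.cosine_similarity(v_money, v_river, dim=0)
      print(f"cosine similarity between the two 'bank' vectors: {cos:.3f}")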

  • @johntdavies
    @johntdavies 2 months ago +2

    Great insight, thanks for posting this. It would be interesting to show how a fine-tuned model differs in similarities and "vocabulary". I'm also curious about the effects of quantisation, e.g. Q4, Q6, Q8, fp16, on the internal "workings" of the LLM. Thanks again.

    • @chrishayuk
      @chrishayuk  2 months ago +1

      It’s almost like you’re reading my roadmap

  • @NERDDISCO
    @NERDDISCO 2 months ago +3

    This came at exactly the right time! Thank you very much! I was just trying to understand this. Now I know how it works ❤

    • @chrishayuk
      @chrishayuk  2 months ago +1

      Glad it was helpful!

  • @khalilbenzineb
    @khalilbenzineb 2 months ago +2

    I was playing a bit with finetuning to force an output schema on some 7B models, but lately I discovered schema grammars: a way to dynamically constrain the model's token choices (including the EOS token) by limiting them to a specific set, so it generates the output you want. This is very stable and far more efficient for many cases we might think require finetuning. For me it felt like a new dimension for keeping the model's intentions in line. I love the unique and efficient way you create your videos, so I wanted to ask if you could create a video about this; I feel it's very important.
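
    A minimal, hypothetical sketch of the token-limiting idea described above, using a custom logits processor with the Hugging Face transformers library (the allowed token set and model are illustrative; a full schema grammar would vary the allowed set per decoding step rather than fixing it):

      import torch
      from transformers import (AutoModelForCausalLM, AutoTokenizer,
                                LogitsProcessor, LogitsProcessorList)

      class AllowedTokens(LogitsProcessor):
          """Masks the logits so sampling can only pick from an allowed set."""
          def __init__(self, allowed_ids):
              self.allowed = torch.tensor(allowed_ids)

          def __call__(self, input_ids, scores):
              mask = torch.full_like(scores, float("-inf"))
              mask[:, self.allowed] = 0.0  # allowed ids keep their scores
              return scores + mask

      tokenizer = AutoTokenizer.from_pretrained("gpt2")  # small model for the sketch
      model = AutoModelForCausalLM.from_pretrained("gpt2")

      # Hypothetical "schema": JSON-ish punctuation plus a couple of literals.
      allowed = tokenizer('{}[]":, true false', add_special_tokens=False).input_ids
      out = model.generate(
          **tokenizer("Answer as JSON:", return_tensors="pt"),
          max_new_tokens=20,
          logits_processor=LogitsProcessorList([AllowedTokens(allowed)]),
          pad_token_id=tokenizer.eos_token_id,
      )
      print(tokenizer.decode(out[0]))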

  • @sumandawnmobile
    @sumandawnmobile 2 months ago +1

    It's a great video for understanding the internals via the visualization. Thanks Chris.

  • @kenchang3456
    @kenchang3456 2 months ago +1

    Thanks, the visualization really helped me.

    • @chrishayuk
      @chrishayuk  2 months ago +1

      so glad, seeing it at a lower level really demystifies what's going on

  • @andypai
    @andypai 2 months ago +1

    Thank you! Great video!

    • @chrishayuk
      @chrishayuk  10 days ago

      thank you, glad it was useful

  • @enlightenment5d
    @enlightenment5d 1 month ago +1

    Good! Where can I find your programs?

    • @chrishayuk
      @chrishayuk  10 days ago

      in my GitHub repo: github.com/chrishayuk

  • @gregherringer7700
    @gregherringer7700 2 months ago +1

    This helps thanks!

  • @Memes_uploader
    @Memes_uploader 2 months ago +1

    Thank you so much! And thank you, YouTube algorithm, for showing me such a great video!

  • @lfzuniga31
    @lfzuniga31 2 months ago +1

    based