Coding Llama 3 from scratch in PyTorch - Part 1

23:59

LLMOps: Deploying LLMs and Scaling using Modal, LangChain and Huggingface

34:13

Coding Llama 2 from scratch in PyTorch - Part 3

50:14

Get started with Command-R Cohere's new LLM: RAG and Tool Calling on Consumer GPUs

3:47

Claude 3: The GPT-4 Killer That Will Shock You!

28:21

Get started with Gemma Google's NEW open-source LLM model

40:19

Coding Llama 3 from scratch in PyTorch - Part 2

In this video series, you will learn how to train and fine-tune the Llama 3 model from scratch.
The goal is to code LLaMA 3 from scratch in PyTorch to create models with sizes 3B, 6B, 35B and 45BM params. In this second video, you'll learn about continuous pretraining and LLM benchmarks, and you'll also get to see the results.
🤖 Models:
- Llama-3-6B-v0.1: huggingface.co/prince-canuma/Llama-3-6B-v0.1
- Llama-3-6B-v0.1 adapters: huggingface.co/prince-canuma/Llama-3-6B-v0.1-adapters
- Llama-3-6B-v0 (Untrained): huggingface.co/prince-canuma/Llama-3-6B-v0
📚Papers:
- LoRA: Low-Rank Adaptation of Large Language Models: arxiv.org/abs/2106.09685
- QLoRA: Efficient Finetuning of Quantized LLMs: arxiv.org/abs/2305.14314
💻 To follow along you can use this colab notebook:
- github.com/Blaizzy/Coding-LLMs-from-scratch/tree/main/Llama-3
🎥 Coding Llama 3 from scratch video series
Part 1: ua-cam.com/video/6nYfl_iOKFM/v-deo.html

Videos

Coding Llama 3 from scratch in PyTorch - Part 1

23:59

2.7K views, 1 month ago

In this video series, you will learn how to train and fine-tune Llama 3 model from scratch. The goal is to code LLaMA 3 from scratch in PyTorch to create models with sizes 3B, 6B, 35B and 45BM params. In this first video, you'll learn about upcycling, downcycling and infini-attention. 📚Papers: - Sparse Upcycling Training Mixture-of-Experts from Dense Checkpoints : arxiv.org/abs/2212.05055 - Pre...

LLMOps: Deploying LLMs and Scaling using Modal, LangChain and Huggingface

34:13

399 views, 2 months ago

In this video, you'll learn about LLMOps, the practice of deploying and scaling LLMs using Modal, Langchain and Huggingface. In the rapidly evolving domain of Large Language Models (LLMs), businesses and researchers grapple with the challenges of efficiently deploying, monitoring and scaling these models. The operational complexities, from infrastructure management to ensuring context-aware res...

Coding Llama 2 from scratch in PyTorch - Part 3

50:14

1.1K views, 3 months ago

In this video series, you will learn how to train and fine-tune the Llama 2 model from scratch. The goal is to code LLaMA 2 from scratch in PyTorch to create models with sizes 100M, 250M and 500M params. In this third video, you'll learn about KV cache, RoPE, and the Hugging Face Trainer in detail. 📋 KV cache: - ua-cam.com/video/80bIUggRJf4/v-deo.html 🪢 RoPE: - ua-cam.com/video/o29P0Kpobz0/v-deo.html - nn...

Get started with Command-R Cohere's new LLM: RAG and Tool Calling on Consumer GPUs

3:47

456 views, 3 months ago

In this video, you will learn how to do tool calling and RAG with ⌘-R while running it on a consumer GPU (e.g., 4090, A5000, T4) with just 24GB VRAM. Model weights 🧠 Transformers: huggingface.co/prince-canuma/c4ai-command-r-v01-4bit MLX-LM: huggingface.co/mlx-community/c4ai-command-r-v01-4bit

Claude 3: The GPT-4 Killer That Will Shock You!

28:21

427 views, 3 months ago

In this video, you'll learn how to use Anthropic's Claude 3 models to extract information from large documents, including the vision variants. ⚙️ Essential Tools We'll Be Using: - Anthropic LLM API: Gain access to Anthropic's cutting-edge language models through their powerful API. - Langchain: a framework for developing applications powered by language models. 🌟 What is Claude 3? Claude 3 is a...

Get started with Gemma Google's NEW open-source LLM model

40:19

3.1K views, 3 months ago

In this video, I'll show you how to summarize large PDF documents locally on your laptop using Gemma 2B and 7B instruct models. ⚙️ Essential Tools We'll Be Using: - MLX-LM: A library based on MLX, which is a framework for Machine learning research on your laptop or in a data center - by Apple. - Huggingface: The Hugging Face Hub is a platform with over 120k models, 20k datasets, and 50k demos i...

RAGOps: Advanced Retrieval Strategies with LangChain, Langsmith and Supabase.

48:27

2K views, 4 months ago

Learn to build advanced RAG applications in this video. I'll guide you through setting up each pipeline step, along with monitoring, evaluation, and enhancing your prompts and document processing. Also, get an inside look at how we optimize our RAG pipelines at Kulissiwa.com using LangChain, Langsmith, and Supabase. Follow me on: - LinkedIn: www.linkedin.com/in/prince-canuma/ - X: P...

COMMENTS

@Tuscani2005GT 16 hours ago
This channel is pure gold. Keep it up!
@sharjeel_mazhar 8 days ago
So in this series, you don't use any pre-trained weights? You build and train the model from scratch on a custom dataset?
@marinepower 10 days ago
Removing every other layer or something along those lines would be much more effective. If you think about it, this just means that one layer needs to do the work of two layers (one layer + one missing layer). Whereas if you just lop off half the network you suddenly need to learn 16 layers' worth of processing in one fell swoop. And not only that, but your old layers need to be retrained since it is no longer sufficient for them to just do their one layer of work they were doing before. Basically, removing every other layer is a finetune, lopping off half the network is a cataclysmic change that (almost) requires training a brand new model from scratch.
@marinepower 10 days ago
The only thing that saves this technique is using the learned embeddings / the learned output layer, but you get that with strided layer removal too. Wish I had seen this video earlier, I'd have saved you $500 lol.
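The two pruning strategies contrasted in this thread can be sketched with plain Python lists. This is only an illustration of which layers survive in each case, not the code used in the video: the 32-layer stack is an assumption (inferred from the "16 layers" figure in the comment), and `truncate_half`, `strided_removal` and `largest_gap` are hypothetical helper names.

```python
# Assumption: a 32-layer decoder stack, represented only by layer indices.
NUM_LAYERS = 32
layers = list(range(NUM_LAYERS))


def truncate_half(stack):
    """Keep the first half of the stack and drop the rest."""
    return stack[: len(stack) // 2]


def strided_removal(stack, stride=2):
    """Keep every `stride`-th layer, starting from the first one."""
    return stack[::stride]


def largest_gap(kept, total):
    """Largest run of consecutive original layers missing after a kept layer."""
    boundaries = kept + [total]
    return max(b - a - 1 for a, b in zip(boundaries, boundaries[1:]))


truncated = truncate_half(layers)
strided = strided_removal(layers)

# Both strategies keep the same number of layers...
print(len(truncated), len(strided))
# ...but differ in how much missing depth any single kept layer must absorb.
print(largest_gap(truncated, NUM_LAYERS))
print(largest_gap(strided, NUM_LAYERS))
```

Under these assumptions, truncation leaves one block of 16 missing layers after the last kept layer, while strided removal leaves gaps of a single layer, which is the asymmetry the comment describes. Whether that translates into better downstream quality would still need to be measured.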
@wilfredomartel7781 12 days ago
😊
@wilfredomartel7781 12 days ago
😊🎉
@RadRebel4 12 days ago
Amazing video! Could you please upload the training scripts as well?
@fliptip 16 days ago
Such a high-quality piece of content.
@sharjeel_mazhar 17 days ago
Can you please make sure that your future videos have higher resolution? Maybe 1440p or above? Other than that, great job! 💯
@linz4213 18 days ago
Well made, Prince! Learned a lot.
@maslaxali8826 19 days ago
CS programmers are vampires. My eeeeyyyes. Great content though.
@sergey_a 19 days ago
Why are there only 3 likes? I put 4 on HF :)
@spkgyk 21 days ago
Why do you use a 32-bit paged optimizer when the model is being fine-tuned with QLoRA? Surely QLoRA stores the weights in 8bit double quantized form, so using a 32-bit optimizer makes no difference, and the weight updates need to be converted back to 8 bit anyway? Please help me understand this.
@princecanuma 21 days ago
Additionally, 8bit states are dequantized to 32bit for the update anyway. huggingface.co/docs/bitsandbytes/main/en/explanations/optimizers
@spkgyk 21 days ago
@@princecanuma Thank you for the quick response. With 8-bit optimizers, large models can be finetuned with 75% less GPU memory without losing any accuracy compared to training with standard 32-bit optimizers. The reduced memory requirements means 8-bit optimizers are 4x faster than a standard optimizer, and no hyperparameter tuning is required. Surely this means that using 32 bit just wastes compute power? Please correct me if I'm wrong, I'm really trying to understand the benefits. Is it because training with 32 bit means that despite converting to 8 bit for the weight update, the conversion leads to small accuracy gains?
@princecanuma 20 days ago
There are no accuracy gains, only reduced GPU usage and potentially some extra speed. In terms of speed, I personally didn’t notice any changes. I tested it yesterday and besides reduced GPU usage I noticed that it would take just as long as the 32bit to complete training.
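The 75% figure quoted in this thread follows from simple arithmetic on optimizer state size, sketched below. This is a back-of-the-envelope estimate, not a measurement from the video: it assumes an Adam-style optimizer with two states per trainable parameter, ignores the small overhead of quantization constants, and uses a hypothetical count of trainable adapter parameters. `adam_state_bytes` is an illustrative helper, not a library function.

```python
def adam_state_bytes(trainable_params, bits_per_state):
    """Bytes needed for Adam's two per-parameter states (first and second moment)."""
    states_per_param = 2
    return trainable_params * states_per_param * bits_per_state // 8


# Hypothetical number of trainable adapter parameters; not a figure from the video.
trainable = 50_000_000

mem_32 = adam_state_bytes(trainable, 32)
mem_8 = adam_state_bytes(trainable, 8)
saving = 1 - mem_8 / mem_32

print(f"32-bit states: {mem_32 / 1e6:.0f} MB")
print(f"8-bit states:  {mem_8 / 1e6:.0f} MB")
print(f"saving: {saving:.0%}")
```

Optimizer states are kept only for trainable parameters, so with adapter-based fine-tuning the absolute saving scales with the adapter size rather than with the full model. That is consistent with the reply above reporting lower GPU usage but no accuracy gain.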
@PaoloTshiyole 22 days ago
Your English is nice
@princecanuma 21 days ago
Thank you very much!
@leiray7465 22 days ago
cool
@princecanuma 21 days ago
Awesome, I’m happy you liked it :)
@kishoretvk 22 days ago
Thanks for committing to open source and educating people on cutting-edge knowledge.
@princecanuma 21 days ago
Most welcome, it’s my pleasure!
@yoanijosias 22 days ago
Very good, can’t wait to see updates to it.
@princecanuma 21 days ago
You and me both!
@vivekpadman5248 1 month ago
Bro, how did you train Llama 3 without a paper?
@princecanuma 28 days ago
Could you elaborate?
@vivekpadman5248 27 days ago
@@princecanuma As far as I know there hasn't been an official Llama 3 paper released, and no data info either. But I could be wrong... 😅
@princecanuma 27 days ago
@@vivekpadman5248 True, they only released a blog post detailing the data, model architecture and performance. Here is how I did it: Llama-3 has the exact same architecture as Llama-2, which we already covered on this channel. ua-cam.com/play/PLDn_JsyofyfQp4td_ub6LfIg5vxyu6YJK.html&si=0Gyt9mdaA-ydiWOA Finally, if you understand how these models work you don't need the paper; the code implementation is more than enough.
@vivekpadman5248 27 days ago
@@princecanuma Oh, understood, thanks. I'll check it out and also your video 💙
@princecanuma 27 days ago
Most welcome :)
@ngamcode2485 1 month ago
This is very impressive and great content. Thank you.
@princecanuma 28 days ago
You're very welcome!
@jihoonjung2776 1 month ago
Best video I've ever seen. Thanks!
@princecanuma 1 month ago
Most welcome!
@princecanuma 1 month ago
It’s my pleasure
@sheikhakbar2067 1 month ago
Command-R is one of the best models out there for non-English / non-European languages. I tried it in Arabic and it's almost perfect; not as good as Claude (which is also perfect for Arabic), but as far as I understand Command-R from Cohere (the community version, I guess) is free! Is that true? Is it free? (I know Command-R-plus is not free.)
@kishoretvk 1 month ago
Super impressive. Great value. One question: how do I further train the model on my custom content instead of using LoRA? Can we do further full training on it and add new memory?
@princecanuma 1 month ago
Most welcome! You can do that, but that can be very expensive.
@AC-go1tp 1 month ago
This is a very thoughtful and great initiative! Researchers with enough gray matter but limited means can still be in the game. Thank you PC🙏!
@princecanuma 1 month ago
Most welcome! It’s my pleasure :) I lived through this so others don’t have to.
@ojasvisingh786 2 months ago
🥳🤩👏💐
@philgoddard8606 2 months ago
Thank you for the really nice entry into using Gemma locally! Could you share how to utilize GPUs on Mac? I just got a Mac Studio and saw you had referenced some code earlier for NVIDIA. Thanks in advance :)
@princecanuma 2 months ago
Most welcome! You can use MLX: github.com/ml-explore/mlx-examples/tree/main/llms
@sayantan336 2 months ago
Great work 🎉. It would be great if you could introduce a tutorial on coding GPT and BERT from scratch as well, using only PyTorch, and then show how to do their pre-training on custom data.
@princecanuma 2 months ago
Thank you very much! Llama is pretty close to GPT so I think BERT is more differentiated. What kind of data would you suggest?
@morningstar3996 2 months ago
Can we have the presentation please?
@princecanuma 2 months ago
Sure, here you go! www.canva.com/design/DAF7MlJ2Zoc/f75ryYIZnLc80NlIFZhS5A/edit?DAF7MlJ2Zoc&
@morningstar3996 2 months ago
@@princecanuma Appreciate it my friend
@girijeshthodupunuri1300 2 months ago
Great video! Learnt a lot.
@princecanuma 2 months ago
Thank you very much! I’m happy you liked it :) There is so much more on the way.
@girijeshthodupunuri1300 2 months ago
@@princecanuma Could you go over how to implement a Parent Document retriever?
@princecanuma 2 months ago
@user-vd7im8gc2w Why do you need position ids? You use them to map the input ids to their respective positions in the sequence. Example:
input_ids = [100, 20, 4, 50]
position_ids = torch.arange(input_ids.shape…)
print(position_ids)
>> [0, 1, 2, 3]
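A runnable version of the idea in this reply is sketched below in plain Python, so it works without PyTorch installed. The argument elided in the reply is not reconstructed here; using the sequence length is an assumption made for this sketch.

```python
input_ids = [100, 20, 4, 50]  # token ids taken from the reply above

# One position index per token. Assumption: the range runs over the sequence
# length; in PyTorch, torch.arange(len(input_ids)) yields the same values
# as a tensor.
position_ids = list(range(len(input_ids)))

print(position_ids)

# Pair each position with the token id that sits there.
for position, token_id in zip(position_ids, input_ids):
    print(position, token_id)
```

The position ids carry no information about the tokens themselves, only about where each one sits in the sequence, which is what positional encodings such as RoPE consume.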
@Frost-Head 3 months ago
Keep up the good work
@princecanuma 3 months ago
Thank you!
@sayantan336 2 months ago
Brilliant 🎉
@princecanuma 2 months ago
Thanks!
@afzalharun8975 3 months ago
First time watching your video. Keep going bro 💪, it's your friend Afzal
@princecanuma 3 months ago
Thank you very much brother! It's been long my friend :)
@RemekKinas 3 months ago
Really great job!
@princecanuma 3 months ago
Thank you very much, Remek! I’m happy you liked it :)
@dossantos4415 3 months ago
Hey, please continue with the coding Llama 2 from scratch series
@princecanuma 3 months ago
Hey, thanks for watching and pinging me for part 3. Don’t worry, Coding Llama 2 from scratch part 3 should be up soon. Potentially tomorrow :) The video has been recorded; however, it was delayed due to my first ever graduation, which occurred today, a very important moment for me. 👨🏾‍🎓
@tharunbhaskar6795 3 months ago
Waiting for the training part
@princecanuma 3 months ago
Working on it 👌🏽 The video should be out this week.
@banbephanboi4708 3 months ago
Great work! Waiting for your next videos
@princecanuma 3 months ago
Thank you very much! New videos dropping soon.
@CarlosAntunesZ 3 months ago
Amazing video 🖖🏽
@princecanuma 3 months ago
Thank you very much! I’m happy you enjoyed it :)
@shihab-soft 3 months ago
Thank you very much, this was very useful
@princecanuma 3 months ago
Most welcome :)
@illia_user 3 months ago
Great job! Thank you!
@princecanuma 3 months ago
Hi, thank you very much!
@buddhu9364 3 months ago
Is there a way I could go about doing the same thing on Windows with Gemma?
@princecanuma 3 months ago
Hi, thanks for watching! Yes, there is and I will cover it in a future video soon. 👌🏽
@NitoKuvell 3 months ago
Congratulations, Prince! It makes me proud to see what you have become in the world of technology. Onward!
@princecanuma 3 months ago
Thank you very much brother! It means a lot coming from you :) Long time no see, let’s catch up.
@steliomoiane494 4 months ago
Wow, amazing, Prince! Thanks for sharing this very useful content
@princecanuma 4 months ago
Most welcome :) Thank you for watching, Stelio!

Prince Canuma