There's a 7B model, which takes up only about 4GB of memory. I wasn't sure whether the 7B model would work at the time, though, because there had been a breaking change in this project. So the authors not only run the models in C++ but also make them smaller (via quantization).
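To see why the ~4GB figure is plausible, here's a rough back-of-envelope sketch. It assumes roughly 4.5 bits per weight (4-bit quantization plus per-block scale factors); the exact overhead depends on the quantization format, so treat the numbers as an estimate, not a spec.

```cpp
// Back-of-envelope memory estimate for a 7B-parameter model.
// Assumption: ~4.5 bits per weight (4-bit quant + scales), not an exact spec.
#include <cstdio>

int main() {
    const double params          = 7e9;   // 7 billion parameters
    const double bits_per_weight = 4.5;   // assumed quantized size per weight
    const double quant_bytes     = params * bits_per_weight / 8.0;
    const double fp16_bytes      = params * 2.0;  // 2 bytes per weight at fp16

    std::printf("quantized weights: ~%.2f GB\n", quant_bytes / 1e9);  // ~3.9 GB
    std::printf("fp16 weights:      ~%.2f GB\n", fp16_bytes  / 1e9);  // ~14 GB
    return 0;
}
```

The fp16 line shows why the unquantized model is out of reach for most consumer GPUs, while the quantized version fits comfortably in ordinary system RAM.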
Because otherwise you would have to use a GPU, which has its own memory (VRAM) separate from system RAM. You can also get more system RAM for less money than multiple GPUs. Most consumer GPUs have very little VRAM, typically 4-8GB, which usually isn't enough. That said, GPU inference is much, much faster than CPU inference, since you get massively parallel compute for the matrix multiplications behind each next-token prediction.
Everyone is saying it's because this way you can load the model into regular RAM, but if I'm not mistaken PyTorch can already do that, so you don't need to reimplement everything in C++ just to control where the model is loaded. I think the real difference is that you do need to reimplement things if you want to use custom formats (like the ggml format here) and control how they are handled at a low level for better efficiency, so I guess that's the main reason. See the sketch below.
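For illustration, here's a generic sketch of what "load the weights straight into system RAM" can look like in C++ using mmap. This is not the actual ggml loader, just the general idea of handling a custom binary format directly, with no Python runtime or GPU involved.

```cpp
// Generic sketch: memory-map a weights file into system RAM (POSIX).
// NOT the actual ggml loader; it only illustrates low-level control
// over a custom binary format.
#include <cstdio>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char** argv) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <weights-file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { std::perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { std::perror("fstat"); close(fd); return 1; }

    // Map the whole file read-only; pages are pulled into RAM on demand.
    void* data = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { std::perror("mmap"); close(fd); return 1; }

    // A real loader would now parse the header (magic number, tensor
    // shapes, quantization type) and set up pointers into the mapping.
    const uint32_t first_word = *static_cast<const uint32_t*>(data);
    std::printf("mapped %lld bytes, first 4 bytes: 0x%08x\n",
                static_cast<long long>(st.st_size), first_word);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}
```

The point is that once you own the file format and the loading path, you decide exactly how and where the weights live in memory, which is hard to do through a high-level framework.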
Is there a particular reason why they ported the model to C++ (newbie question), other than to make the model smaller?
C++ allows the entire model to be loaded into regular RAM. This is helpful for those of us without beefy GPUs.
How do I quit the chat? I asked the AI and it said Ctrl+T, but that doesn't work. In the end I just closed the prompt window, but I think there must be a proper way to quit?
Just press Ctrl+C two or three times (in case the prompt doesn't catch the first one); that sends the interrupt signal (SIGINT) on Linux.
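For context, here's a minimal sketch of how a chat CLI might treat Ctrl+C: the first press asks the generation loop to stop, a second press exits immediately. This is generic POSIX-style signal handling for illustration only, not the project's actual code.

```cpp
// Minimal sketch: first Ctrl+C (SIGINT) requests a clean stop,
// a second Ctrl+C exits immediately. Illustration only.
#include <csignal>
#include <cstdio>
#include <cstdlib>

static volatile sig_atomic_t g_interrupts = 0;

static void on_sigint(int) {
    g_interrupts = g_interrupts + 1;
    if (g_interrupts > 1) {
        std::_Exit(130);  // conventional exit status for SIGINT
    }
}

int main() {
    std::signal(SIGINT, on_sigint);
    std::printf("generating... press Ctrl+C to stop (twice to force quit)\n");
    while (g_interrupts == 0) {
        // the token generation loop would run here; we just spin for the demo
    }
    std::printf("\ninterrupted, shutting down cleanly\n");
    return 0;
}
```

This is why pressing Ctrl+C more than once is a reasonable habit: if the program's own handler doesn't react, the repeated signal (or the shell) usually terminates it anyway.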
Did you slow down this video?
Nope. It's the original speed.
How did you convert the model? (.tmp?) I get a "too old, regenerate your model files or convert..." error when trying to use it.
I followed the comment at github.com/ggerganov/llama.cpp/issues/382#issuecomment-1479091459 to convert the model.
But I've noticed there are some newer Alpaca-LoRA projects with a more user-friendly setup, like github.com/nomic-ai/gpt4all. Maybe you can try one of those.
Your computer must be slow.