How to Run a ChatGPT-like AI on Your Raspberry Pi
- Published 6 Aug 2024
- Services like ChatGPT run on big servers, with lots of CPU and GPU power. But did you know that the humble Raspberry Pi can also run Large Language Models (LLMs)? Thanks to the llama.cpp project, it can. Here is my step-by-step guide on how you can run a ChatGPT-like AI bot on your Raspberry Pi.
---
Instructions: github.com/garyexplains/examp...
#garyexplains
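For anyone who wants the shape of the process before opening the instructions page, the usual llama.cpp flow looks roughly like this. Note this is a sketch, not Gary's exact steps: the repo URL, model filename, and flags below are assumptions, and the heavy commands are shown as comments.

```shell
# Typical llama.cpp workflow (hypothetical; see the linked instructions):
#
#   git clone https://github.com/ggerganov/llama.cpp
#   cd llama.cpp && make -j
#
# Then point the binary at a quantised GGUF chat model. Only the
# command-line construction below actually executes here.
MODEL="llama-2-7b-chat.Q4_K_M.gguf"   # assumed quantised model file
PROMPT="What is a Raspberry Pi?"
CMD="./main -m $MODEL -p \"$PROMPT\" -n 400"
echo "$CMD"
```

On a Pi the quantised (e.g. Q4) variants matter: the full-precision 7B weights would not fit in 8 GB of RAM.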
Very nice Gary, I liked and the main reason was because it was done with a Raspberry Pi.
Very interesting video, as always. Even a few months ago I wouldn't have believed a generative AI could run on a Raspberry Pi with such accuracy. The speed with which AI is developing is truly breathtaking.
Can't believe the Raspberry Pi 4 is more than 4 years old! Hoping the Raspberry Pi 5 reflects the almost 5-year gap between it and the 4 when it comes out!
And hopefully there will be enough units for everyone who wants one...
I hope its SOC is on par with the RK3588S, though I seriously doubt it for cost reasons. What I consider a certainty is that it's going to show up in 2024. It is already way too late
Quick setup, thank you Gary. Just to let others know, it will require a lot more to get a decent response, but it is awesome nonetheless.
Great video Gary!
Thanks, Gary! Thanks for explaining this. I really look forward to trying this on the raspberry pi 5
Replying to my own comment... Got my Pi5 today. Got it working!
Yes, it is working! Thanks very much Gary 😁! I turned off the Wi-Fi and it still works!
Noticed a typo on your instruction page. First model has an errant ) at the end.
Thanks for the easy instructions!
Quick and easy instructions. Thanks! It also works fine on a Pi4 4GB.✌😉
very nice tutorial, thx and god bless u and the world... regards from hong kong ^_^
I've got a 118 gb optane m.2 ready to go when my rpi 5 arrives. This is exactly the first project I'm trying out.
That is what I did today, the minute my Pi5 arrived. But still using an SD card. NVMe stuff should be later this week. But it works. Proof of concept.
Thanks, Gary! I got my Rpi 5 today and used your method. I am currently testing openbuddy-zephry-7b-v14-1. It is pretty slow, even on the Pi5. Response is marginal so far. But it is functional. I am going to experiment with an "optimized for Pi" model called Orca next. My use case is a robotic platform.
It works fine on my Raspberry Pi 5 8G. I have been asking it Python questions and it knows the answers.
Agreed, and it seems to run noticeably quicker on the Pi 5.
can it code and execute by itself?
when I run sudo apt uptade it gives me an error
Would love to see the same model running on the Pi 4 but overclocked! Maybe it gets to readable speeds?
Excellent video! Will try it as soon as possible
Any chance of repeating this with a Pi 5 with an M.2 drive? And is there any hardware add-on for the Pi that improves its ability to run LLMs (like the USB Coral AI accelerator)? A different approach I was considering was using a Pi 4 or 5 as a web gateway to a PC running the LLM, because Pis are easier to connect to from elsewhere.
Nice
You're good!
What are some ways you could make it go faster? Jetson Nano perhaps?
What is the performance penalty for the narrow floating point representation that is not natively supported by the ARM v8 ISA?
This same set of instructions should work on other similar SBCs that have enough RAM and storage. I have a Freescale T2080 (PowerPC 64 bit multicore, 2 threads per core) running Debian and might try it on that.
I wonder if it could work on a PI Zero. It would be amazing if it did. Can you please try on some other models considering that speed is not an issue and we care only about compatibility?
More interesting for me would be network cards with a fibre-optic connection.
GARY!!!! I had so much fun with this! On an HP with an Intel i5, 8 GB RAM, a recent Linux version. It's ABSOLUTELY amazing.
A good thing would be text-to-speech... I told it to teach me Spanish and it was appropriate :)
It's pretty easy to make a USB drive your root partition nowadays, so that'd be the best option: SD as boot partition and SSD as root, straight from the RPi Imager.
It works FANTASTICALLY!! Thanks Gary!! The way you explained it, I think I understand now how to do this with other models that are compatible with Llama 2 and Pi 4. Let the experimenting begin!! *Insert diabolical laughter here* I mean, relatively diabolical laughter of course... LOLza!!
Very nice thank you.
Just tried testing it; I think the whole thing is outdated at this point. I think I'm going to try another approach.
Gary, is there an update on this, considering ChatGPT now has an official voice feature?
Now I just need a good offline speech recogniser and TTS, and the droid I'm building can finally be free lmao
Hey, have you heard there is a new box86-alike called FEX 2404 or something like it? Just saw it on The Linux Experiment news
Fex fex-emu.com/
Hello, thank you for your video, but ls -la main => bash: ./main: No such file or directory. So how do I fix this problem?
Hi, please try llama-cli instead of main. For me this works. Looks like there was an update in llama.cpp recently:
[2024 Jun 12] Binaries have been renamed w/ a llama- prefix. main is now llama-cli, server is llama-server
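Given that rename, a small guard like the following keeps scripts working against both old and new checkouts. This is only a sketch; the build-directory layout is an assumption:

```shell
# pick_bin: return whichever llama.cpp chat binary exists in a build dir.
pick_bin() {
    dir="$1"
    if [ -x "$dir/llama-cli" ]; then
        echo "$dir/llama-cli"   # new name, after the 2024-06-12 rename
    elif [ -x "$dir/main" ]; then
        echo "$dir/main"        # old name, earlier checkouts
    else
        echo ""                 # neither found: build with make -j first
    fi
}
```

An empty result means the build has not run yet, which also matches the "No such file or directory" reports in this thread.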
Is it possible to chat with it using an offline AI assistant or a speech recogniser? It would essentially make this much more powerful, and very dope. Regardless, thanks Gary, I've been following you since your AA days and have always been a fan.
For that I would assume you need to have an AI model running locally, i.e. a trained neural network loaded alongside your speech code
I'm trying the same thing: I'm using Whisper AI running locally for speech recognition, and sox to actually record the audio
@@dalo2571 I got ChatGPT to work with the Python SpeechRecognizer, and TTS. Was pretty cool for a starter project I did for college.
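The pipeline sketched in this thread (sox for recording, a local speech-to-text model, llama.cpp, then TTS) can be outlined as below. Every tool name and path here is an assumption; only the prompt-building helper actually runs, and the [INST] wrapper is the Llama-2 chat format:

```shell
# Hypothetical offline voice loop; the heavy stages are comments only:
#   sox -d question.wav trim 0 5                  # record 5 s of audio
#   ./whisper-cli -f question.wav > question.txt  # speech to text
#   ./llama-cli -m model.gguf -p "$(make_prompt "$(cat question.txt)")"
#   espeak "the model's answer"                   # text to speech
#
# Glue that does run: wrap a transcript in Llama-2's chat template.
make_prompt() {
    printf '[INST] %s [/INST]' "$1"
}
```

On a Pi the LLM stage dominates the latency, so a voice front end mostly changes the interface, not the wait.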
Is there a way to unlock all potential of the AI? answering to any prompt without restriction?
Use an uncensored model
Followed the instructions but there was no "main" file.
Really strange. Anyone know something about that?
It is truly sad that the GPIO library had to be abandoned on the Raspberry Pi 5. Can you go over the new gpiod? It seems to just be all new coding for I/O, and OMG, what a can of worms.
I followed your steps and the results are as follows.
(a) pi5 : m2 ssd
(b) pi4 : sd card 128g
with Llama-2-7B-Chat-GGUF
------------------------------
(1) Load Time:
PI4: 28,346.89ms
PI5: 728.95ms
PI5 has a significantly shorter load time compared to PI4.
(2) Sample Time:
PI4: 42.70ms (400 runs)
PI5: 18.97ms (400 runs)
PI5 has a shorter sample time than PI4.
(3) Prompt Eval Time:
PI4: 17,747.83ms (19 tokens)
PI5: 4,250.20ms (19 tokens)
PI5 has a shorter prompt evaluation time compared to PI4.
(4) Eval Time:
PI4: 463,391.22ms (399 runs)
PI5: 123,724.81ms (399 runs)
PI5 has significantly shorter evaluation time than PI4.
(5) Total Time:
PI4: 481,415.62ms (418 tokens)
PI5: 128,070.61ms (418 tokens)
PI5 has a significantly shorter total time compared to PI4.
----
"I'm considering if there's a way to make it faster using Google Coral TPU (USB)."
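From the eval-time figures above you can back out tokens per second (399 runs over the eval time); a quick awk helper, using the numbers from this comment:

```shell
# tokens/sec = runs / (eval time in ms / 1000)
tok_per_s() {
    awk -v runs="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", runs / (ms / 1000) }'
}
tok_per_s 399 463391.22   # Pi 4: about 0.86 tokens/s
tok_per_s 399 123724.81   # Pi 5: about 3.22 tokens/s
```

So the Pi 5 generates roughly 3.7x faster here, which matches the "noticeably quicker" impressions elsewhere in the thread.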
I followed step by step instructions and when it comes to ls -la main, it says ls: cannot access 'main': No such file or directory
me too.. Did you solve it?
Thanks Gary for sharing. However, trying to get an 8 GB RasPi is almost impossible. Can you try it on a 4 GB RAM version (which is the most common config) and see if it works?
It works, it's just very very slow.
I recently tried out llama 30B and oh boy, replies take forever compared to 7B, but they are much better. On a Pi I can see the 7B model being good enough to generate responses for animatronics etc., by pipelining into voice synth and moving a jaw based on amplitude.
7b needs about 16gb ram, so for a Pi you should be using a 3b model which is 8gb minimum
Is the llama.cpp.git file now outdated? I see build errors on the "~$ make -j" command.
you only need to type make -j
Do you think the Pi 5 will require the same steps as the 4 for the chatbot? I placed my pre-order and this is the first project I will do with it.
Yes, I don't see any difference for the Pi 5.
Can confirm. Works on the Pi5. @@GaryExplains
Maybe it can run a 3B model with a Q8 GGUF? XD
Runs on my game boy
Followed the instructions but there is no "main" file.
Then with the greatest of respect, you didn't follow the instructions correctly.
I'm about to try this on a Raspberry Pi 5 8GB
Let us know how things turn out!
No matter what, the AI doesn't know whether its answer is correct or wrong. Results need to be confirmed. It's just a tool, not like a script that is 100% correct and can be left to run automatically.
Interesting, but unusable. You basically type your question and you can go make some coffee before you get the response.
Is it possible to run it on a Raspberry Pi 3?
As I say at ua-cam.com/video/idZctq7WIq4/v-deo.htmlfeature=shared&t=60 you need a Pi 4 with 8GB of RAM.
Ah, OK, I don't speak English very well @@GaryExplains
Next: Running it on a micro:bit
Which OS are you using???
Raspberry Pi OS, i.e. Linux.
@@GaryExplains which distro/version? is it x32 or x64?
Why are you using a big language model like LLaMA when you can use some SLM that gets better performance?
I didn't use an SLM for various reasons, the most important being that SLMs weren't even invented when I made this video, and I haven't invented time travel yet. 🤦♂️
@@GaryExplains they were not called SLMs, but we had some minimal AI language models lol. AI is not recent
@@GaryExplains also lmao that face palm
So you are claiming that there are minimal AI language models that predate the current crop of LLMs, that are better than LLMs, because AI is not recent. I would love to hear more about these.
@@GaryExplains where did you read "as powerful as an LLM" lol. I'll edit the comment, let me give you the name of one.
'Running' on a Pi is at best hugely optimistic; it crawls in what is an unusable way. So slow he couldn't wait for it to end.
Arm v8.2 added Mat/Mul instructions that greatly increase running speed, and with a Cortex-A76 rather than the Pi 4's A72 you can get nearly 5x the ML performance with some models.
The RK3588 SoCs out there produce probably what is a minimum usable LLM, but we are still talking the smallest parameter counts and the most quantised (less accurate) models.
LLMs are inherently server-based due to the diversification of use, and a single home LLM on better hardware, shared among many devices, is a much better infrastructure; as llama.cpp is so incredibly optimised, it is likely already optimal.
LLMs are just bad on a Pi, and even on RK3588 SoCs it's not great, but a single home LLM on something like a Mac mini comes into its own, with its ability to quickly return results and serve multiple prompts, as it's unlikely they will hit at the exact same moment.
All hail our robot masters and overlords? 😞
This is very much unusable on a Raspberry Pi
Does it work on the 4GB version of the RPi 4, yes or no?
The quantised 7B model is 3.9 GB, so with swap, yeah (use zram). I run it on an OPi5 4GB with little perf hit.