How to Run a ChatGPT-like AI on Your Raspberry Pi
- Published 6 Aug 2024
- Services like ChatGPT run on big servers, with lots of CPU and GPU power. But did you know that the humble Raspberry Pi can also run Large Language Models (LLMs)? Thanks to the llama.cpp project, it can. Here is my step-by-step guide on how you can run a ChatGPT-like AI bot on your Raspberry Pi.
---
Instructions: github.com/garyexplains/examp...
#garyexplains
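For anyone who wants the shape of the process before opening the instructions page, the usual llama.cpp flow looks roughly like this. Note this is a sketch, not Gary's exact steps: the repo URL, model filename, and flags below are assumptions, and the heavy commands are shown as comments.

```shell
# Typical llama.cpp workflow (hypothetical; see the linked instructions):
#
#   git clone https://github.com/ggerganov/llama.cpp
#   cd llama.cpp && make -j
#
# Then point the binary at a quantised GGUF chat model. Only the
# command-line construction below actually executes here.
MODEL="llama-2-7b-chat.Q4_K_M.gguf"   # assumed quantised model file
PROMPT="What is a Raspberry Pi?"
CMD="./main -m $MODEL -p \"$PROMPT\" -n 400"
echo "$CMD"
```

On a Pi the quantised (e.g. Q4) variants matter: the full-precision 7B weights would not fit in 8 GB of RAM.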
Very nice Gary, I liked and the main reason was because it was done with a Raspberry Pi.
Very interesting video, as always. Even a few months ago I wouldn't have believed a generative AI could run on a Raspberry Pi with such accuracy. The speed with which AI is developing is truly breathtaking.
Can't believe the Raspberry Pi 4 is more than 4 years old! Hoping the Raspberry Pi 5 reflects the almost 5-year gap between it and the 4 when it comes out!
And hopefully there will be enough units for everyone who wants one...
I hope its SOC is on par with the RK3588S, though I seriously doubt it for cost reasons. What I consider a certainty is that it's going to show up in 2024. It is already way too late
Quick setup, thank you Gary. Just to let others know, it will require a lot more to get a decent response, but it is awesome nonetheless.
Great video Gary!
Thanks, Gary! Thanks for explaining this. I really look forward to trying this on the raspberry pi 5
Replying to my own comment... Got my Pi5 today. Got it working!
Yes, it is working! Thanks very much Gary 😁! I turned off the Wi-Fi and it still works!
Noticed a typo on your instruction page. First model has an errant ) at the end.
Thanks for the easy instructions!
Quick and easy instructions. Thanks! It also works fine on a Pi4 4GB.✌😉
very nice tutorial, thx and god bless u and the world... regards from hong kong ^_^
I've got a 118 gb optane m.2 ready to go when my rpi 5 arrives. This is exactly the first project I'm trying out.
That is what I did today, the minute my Pi5 arrived. But still using an SD card. NVMe stuff should be later this week. But it works. Proof of concept.
Thanks, Gary! I got my Rpi 5 today and used your method. I am currently testing openbuddy-zephry-7b-v14-1. It is pretty slow, even on the Pi5. Response is marginal so far. But it is functional. I am going to experiment with an "optimized for Pi" model called Orca next. My use case is a robotic platform.
It works fine on my Raspberry Pi 5 8G. I have been asking it Python questions and it knows the answers.
Agreed, and it seems to run noticeably quicker on the Pi 5.
can it code and execute by itself?
when I run sudo apt uptade it gives me an error
Would love to see the same model running on the Pi 4 but overclocked! Maybe it gets to readable speeds?
Excellent video! Will try it as soon as possible
Any chance of repeating this with a Pi 5 with an M.2 drive? And is there any hardware add-on for the Pi that improves its ability to run LLMs (like the USB Coral AI accelerator)? A different approach I was considering was using a Pi 4 or 5 as a web gateway to a PC running the LLM, because Pis are easier to connect to from elsewhere.
Nice
You're good!
What are some ways you could make it go faster? Jetson Nano perhaps?
What is the performance penalty for the narrow floating point representation that is not natively supported by the ARM v8 ISA?
This same set of instructions should work on other similar SBCs that have enough RAM and storage. I have a Freescale T2080 (PowerPC 64 bit multicore, 2 threads per core) running Debian and might try it on that.
I wonder if it could work on a PI Zero. It would be amazing if it did. Can you please try on some other models considering that speed is not an issue and we care only about compatibility?
More interesting for me would be network cards with a fibre-optic connection.
GARY!!!! I had so much fun with this! On an HP with an Intel i5, 8 GB RAM, a recent Linux version. It's ABSOLUTELY amazing.
A good thing would be text-to-speech... I told it to teach me Spanish and it was appropriate :)
It's pretty easy to make a USB drive your root partition nowadays, so that'd be the best option: SD as boot partition and SSD as root, straight from the RPi Imager.
It works FANTASTICALLY!! Thanks Gary!! The way you explained it, I think I understand now how to do this with other models that are compatible with Llama 2 and Pi 4. Let the experimenting begin!! *Insert diabolical laughter here* I mean, relatively diabolical laughter of course... LOLza!!
Very nice thank you.
Just tried testing it; I think the whole thing is outdated at this point. I think I'm going to try another approach.
Gary, is there an update on this, considering ChatGPT now has an official voice feature?
Now I just need a good offline speech recogniser and TTS, and the droid I'm building can finally be free lmao
Hey, have you heard there is a new box86-alike called FEX 2404 or something like it? Just saw it on The Linux Experiment news
Fex fex-emu.com/
Hello, thank you for your video, but ls -la main => bash: ./main: No such file or directory. So how do I fix this problem?
Hi, please try llama-cli instead of main. For me this works. Looks like there was an update in llama.cpp recently:
[2024 Jun 12] Binaries have been renamed w/ a llama- prefix. main is now llama-cli, server is llama-server
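Given that rename, a small guard like the following keeps scripts working against both old and new checkouts. This is only a sketch; the build-directory layout is an assumption:

```shell
# pick_bin: return whichever llama.cpp chat binary exists in a build dir.
pick_bin() {
    dir="$1"
    if [ -x "$dir/llama-cli" ]; then
        echo "$dir/llama-cli"   # new name, after the 2024-06-12 rename
    elif [ -x "$dir/main" ]; then
        echo "$dir/main"        # old name, earlier checkouts
    else
        echo ""                 # neither found: build with make -j first
    fi
}
```

An empty result means the build has not run yet, which also matches the "No such file or directory" reports in this thread.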
Is it possible to chat with it using an offline AI assistant or a speech recogniser? It would essentially make this much more powerful, and very dope. Regardless, thanks Gary, I've been following you since your AA days and have always been a fan.
For that I would assume you need to have an AI model running locally, i.e. a trained neural network loaded alongside your speech code
I'm trying the same thing: I'm using Whisper AI running locally for speech recognition, and sox to actually record the audio
@@dalo2571 I got ChatGPT to work with the Python SpeechRecognizer, and TTS. Was pretty cool for a starter project I did for college.
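The pipeline sketched in this thread (sox for recording, a local speech-to-text model, llama.cpp, then TTS) can be outlined as below. Every tool name and path here is an assumption; only the prompt-building helper actually runs, and the [INST] wrapper is the Llama-2 chat format:

```shell
# Hypothetical offline voice loop; the heavy stages are comments only:
#   sox -d question.wav trim 0 5                  # record 5 s of audio
#   ./whisper-cli -f question.wav > question.txt  # speech to text
#   ./llama-cli -m model.gguf -p "$(make_prompt "$(cat question.txt)")"
#   espeak "the model's answer"                   # text to speech
#
# Glue that does run: wrap a transcript in Llama-2's chat template.
make_prompt() {
    printf '[INST] %s [/INST]' "$1"
}
```

On a Pi the LLM stage dominates the latency, so a voice front end mostly changes the interface, not the wait.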
Is there a way to unlock all potential of the AI? answering to any prompt without restriction?
Use an uncensored model
Followed the instructions but there was no "main" file.
Really strange. Anyone know something about that?
It is truly sad that the GPIO library had to be abandoned on the Raspberry Pi 5. Can you go over the new gpiod? It seems to just be all new coding for I/O, and OMG, what a can of worms.
I followed your steps and the results are as follows.
(a) pi5 : m2 ssd
(b) pi4 : sd card 128g
with Llama-2-7B-Chat-GGUF
------------------------------
(1) Load Time:
PI4: 28,346.89ms
PI5: 728.95ms
PI5 has a significantly shorter load time compared to PI4.
(2) Sample Time:
PI4: 42.70ms (400 runs)
PI5: 18.97ms (400 runs)
PI5 has a shorter sample time than PI4.
(3) Prompt Eval Time:
PI4: 17,747.83ms (19 tokens)
PI5: 4,250.20ms (19 tokens)
PI5 has a shorter prompt evaluation time compared to PI4.
(4) Eval Time:
PI4: 463,391.22ms (399 runs)
PI5: 123,724.81ms (399 runs)
PI5 has significantly shorter evaluation time than PI4.
(5) Total Time:
PI4: 481,415.62ms (418 tokens)
PI5: 128,070.61ms (418 tokens)
PI5 has a significantly shorter total time compared to PI4.
----
"I'm considering if there's a way to make it faster using Google Coral TPU (USB)."
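From the eval-time figures above you can back out tokens per second (399 runs over the eval time); a quick awk helper, using the numbers from this comment:

```shell
# tokens/sec = runs / (eval time in ms / 1000)
tok_per_s() {
    awk -v runs="$1" -v ms="$2" 'BEGIN { printf "%.2f\n", runs / (ms / 1000) }'
}
tok_per_s 399 463391.22   # Pi 4: about 0.86 tokens/s
tok_per_s 399 123724.81   # Pi 5: about 3.22 tokens/s
```

So the Pi 5 generates roughly 3.7x faster here, which matches the "noticeably quicker" impressions elsewhere in the thread.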
I followed step by step instructions and when it comes to ls -la main, it says ls: cannot access 'main': No such file or directory
me too.. Did you solve it?
Thanks Gary for sharing. However, trying to get an 8 GB RasPi is almost impossible. Can you try it on a 4 GB RAM version (which is the most common config) and see if it works?
It works, it's just very very slow.
I recently tried out llama 30B and oh boy, replies take forever compared to 7B, but they are much better. On a Pi I can see the 7B model being good enough to generate responses for animatronics etc., by pipelining into voice synth and moving a jaw based on amplitude.
7b needs about 16gb ram, so for a Pi you should be using a 3b model which is 8gb minimum
Is the llama.cpp.git file now outdated? I see build errors on the "~$ make -j" command.
you only need to type make -j
Do you think the Pi 5 will require the same steps as the 4 for the chatbot? I placed my pre-order and this is the first project I will do with it.
Yes, I don't see any difference for the Pi 5.
Can confirm. Works on the Pi5. @@GaryExplains
Maybe it can run a 3B model with a Q8 GGUF? XD
Runs on my game boy
Followed the instructions but there is no "main" file.
Then with the greatest of respect, you didn't follow the instructions correctly.
I'm about to try this on a Raspberry Pi 5 8GB
Let us know how things turn out!
No matter what, the AI doesn't know whether its answer is correct or wrong. Results need to be confirmed. It's just a tool, not like a script that is 100% correct and can be left to run automatically.
Interesting, but unusable. You basically type your question and you can go make some coffee before you get the response.
Is it possible to run it on a Raspberry Pi 3?
As I say at ua-cam.com/video/idZctq7WIq4/v-deo.htmlfeature=shared&t=60 you need a Pi 4 with 8GB of RAM.
Ah, OK, I don't speak English very well @@GaryExplains
Next: Running it on a micro:bit
Which OS are you using???
Raspberry Pi OS, i.e. Linux.
@@GaryExplains which distro/version? is it x32 or x64?
Why are you using a big language model like LLaMA when you can use some SLM that gets better performance?
I didn't use an SLM for various reasons, the most important being that SLMs weren't even invented when I made this video, and I haven't invented time travel yet. 🤦♂️
@@GaryExplains they were not called SLMs, but we had some minimal AI language models lol. AI is not recent
@@GaryExplains also lmao that face palm
So you are claiming that there are minimal AI language models that predate the current crop of LLMs, that are better than LLMs, because AI is not recent. I would love to hear more about these.
@@GaryExplains where did you read "as powerful as an LLM" lol. I'll edit the comment, let me give you the name of one.
'Running' on a Pi is at best hugely optimistic; it crawls in what is an unusable way. So slow he couldn't wait for it to end.
Arm v8.2 added Mat/Mul instructions that greatly increase running speed, and with a Cortex-A76 rather than the Pi 4's A72 you can get nearly 5x the ML performance with some models.
The RK3588 SoCs out there produce probably what is a minimum usable LLM, but we are still talking the smallest parameter counts and the most quantised (less accurate) models.
LLMs are inherently server-based due to the diversification of use, and a single home LLM on better hardware, shared among many devices, is a much better infrastructure; as llama.cpp is so incredibly optimised, it is likely already optimal.
LLMs are just bad on a Pi, and even on RK3588 SoCs it's not great, but a single home LLM on something like a Mac mini comes into its own, with its ability to quickly return results and serve multiple prompts, as it's unlikely they will hit at the exact same moment.
All hail our robot masters and overlords? 😞
This is very much unusable on a Raspberry Pi
Does it work on the 4GB version of the RPi 4, yes or no?
The quantised 7B model is 3.9 GB, so with swap, yeah (use zram). I run it on an OPi5 4GB with little perf hit.