Run this Small Language Model on Raspberry Pi for Fun and PROFIT
- Published Jan 5, 2024
- Is TinyLlama any good? Let's run it on a Raspberry Pi and benchmark the baseline and possible optimizations! Then we talk with our TinyLlama living inside the Raspberry Pi, using a simple barebones web server for inference.
Github Gist (benchmark commands, benchmark results, prompts):
gist.github.com/AIWintermuteA...
TinyLlama/TinyLlama-1.1B-Chat-v1.0
huggingface.co/TinyLlama/Tiny...
TinyLlama/TinyLlama-1.1B-Chat-v1.0 (GGUF)
huggingface.co/TheBloke/TinyL...
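The barebones web server mentioned in the description is llama.cpp's `server` example, which exposes an HTTP completion endpoint once it is running. A minimal Python sketch of talking to it from another machine (the host, port, and parameter names below are assumptions based on llama.cpp's defaults, not taken from the video):

```python
import json
import urllib.request

def build_request(prompt, n_predict=64, temperature=0.7):
    """Build the JSON payload for llama.cpp server's /completion endpoint."""
    return {
        "prompt": prompt,
        "n_predict": n_predict,    # max number of tokens to generate
        "temperature": temperature,
    }

def complete(prompt, host="http://127.0.0.1:8080"):
    """POST the prompt to a running llama.cpp server and return the text."""
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        host + "/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]
```

On a Pi the interesting part is latency, so keeping `n_predict` small is the easiest way to keep responses snappy.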
More no-bullsh**t tutorials! 👍
More to come!
Great stuff as always! Will be trying this as soon as I’m back in the workshop…
Please do! :)
Very good tutorial - well done! Btw, following the chat format of the model (i.e. adding the model's special chat tokens) should improve the accuracy, though at the moment there is no easy way to do it with "server". It's possible to do it with "main", but it takes some extra command-line arguments.
Appreciate your work on llama.cpp!
Thanks for the info, it will definitely be useful for people watching.
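For readers wondering what the chat format mentioned above looks like: TinyLlama-1.1B-Chat-v1.0 was fine-tuned on a Zephyr-style template, so the prompt has to wrap each turn in special tokens. The exact token strings below are an assumption based on that template, not something confirmed in this thread:

```python
def format_chat(user_msg, system_msg="You are a helpful assistant."):
    """Wrap a message in the Zephyr-style chat template assumed for
    TinyLlama-1.1B-Chat-v1.0, ending with the assistant header so the
    model continues as the assistant."""
    return (
        f"<|system|>\n{system_msg}</s>\n"
        f"<|user|>\n{user_msg}</s>\n"
        f"<|assistant|>\n"
    )

# The resulting string can be passed to llama.cpp's `main` via -p
# (llama.cpp's -e flag makes it interpret the \n escapes), or sent
# as the raw prompt to the `server` example.
```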
Does the new Hailo AI module offer any improvement on running any of these LLMs? I know it speeds up the vision side of things but I haven't seen anyone use it for LLMs yet.
Oh, and would it speed up Whisper at all? Allowing a larger model to be run?
re: LLMs, not really. The question has been asked many times in different places; here is one of the replies:
www.reddit.com/r/LocalLLaMA/comments/1d7shcr/comment/l71q04c
re: Whisper: given that it is a transformer as well, Hailo chips are not geared towards this type of NN. But I remember seeing a paper about modifying BERT to run on the Google Coral USB, so your mileage may vary, and it is going to be very far from plug-and-play.
😮
😊
this tutorial needs a tutorial
Does it though?