What if I use a model by pasting the command into the Terminal, but then I no longer want that model? What if I wanted to try a different model in Ollama? How do I uninstall the first model that I pulled in my Terminal and replace it with a new one?
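For anyone else wondering the same: Ollama keeps pulled models on disk, and the CLI can list and delete them. A minimal sketch, assuming the Ollama CLI is installed (the model names here are just examples):

```shell
# List installed models, remove one, and pull a replacement.
# Model names are examples; the commands run only if the Ollama CLI is present.
OLD_MODEL="llama3"
NEW_MODEL="mistral"
if command -v ollama >/dev/null 2>&1; then
  ollama list               # show what is currently installed
  ollama rm "$OLD_MODEL"    # delete the model you no longer want
  ollama pull "$NEW_MODEL"  # download the replacement
fi
```

`ollama list` afterwards should show only the new model.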
good video.
Can we run solar 10.7B uncensored just in the same way?
Yes, Ollama supports it. Refer to ollama.com/library/solar:10.7b. You need to pull the model first using the command "ollama pull solar:10.7b" before you can select it inside Open WebUI. Hope this helps
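The steps above, as terminal commands (guarded so they only run if the Ollama CLI is installed):

```shell
# Pull the model, then either chat with it in the Terminal
# or select it from the model dropdown in Open WebUI.
MODEL="solar:10.7b"
if command -v ollama >/dev/null 2>&1; then
  ollama pull "$MODEL"   # download the model from the Ollama library
  ollama run "$MODEL"    # start an interactive chat in the Terminal
fi
```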
Thank you, very useful. If you get an error when you paste in the Docker command (when using the Windows version), make sure you run through all the Docker install steps and restart if needed (you will need to register).
Thank you for adding more context for people who may hit any issues with Docker
Can you use a voice interface with the offline models?
Would love to understand what's required.
Yes, please look at ua-cam.com/video/RELQNYa4qNc/v-deo.html
Amazing and very clear step-by-step instructions! I was able to replicate the work on my computer.
Thank you so much for this excellent tutorial!
You're welcome! Glad to know that it worked for you
Even if I'm connected to the internet, will the model still not use the internet?
Most LLMs do not use the internet. Using some OSS tools, you can create a RAG-based system that can fetch webpages and give the text to local LLMs. Hope this helps
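To make the idea concrete, here is a rough sketch of that "fetch a page, hand the text to a local model" pattern using Ollama's REST API on its default port 11434. The URL, model name, and crude text sanitization are all illustrative assumptions, not a production setup:

```shell
# Hedged sketch: fetch a web page and ask a local Ollama model about it.
# PAGE_URL and MODEL are examples; runs only if the Ollama CLI is installed.
PAGE_URL="https://example.com"
MODEL="llama3"
if command -v ollama >/dev/null 2>&1; then
  # Crude sanitization for this sketch: keep only plain characters so the
  # fetched text can be dropped into a JSON string. Real code should
  # JSON-escape the page text properly.
  CONTEXT=$(curl -s "$PAGE_URL" | tr -cd '[:alnum:] .,' | head -c 2000)
  printf '{"model":"%s","prompt":"Summarize this page: %s","stream":false}' \
    "$MODEL" "$CONTEXT" > /tmp/rag_body.json
  curl -s http://localhost:11434/api/generate -d @/tmp/rag_body.json
fi
```

Everything here stays on your machine except the page fetch itself.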
Can a MacBook Air M1 handle this model?
Honestly, I do not know, as I do not know your machine configuration. Why not try it out? Then you will know.
Can we train this with our own personal data, and if yes, how?
Any open model can be fine-tuned. We will make videos in the future to demonstrate this use case. Thanks for asking
Thank you for this. I would like to upload/ingest files into PrivateGPT. Is that possible?
Thank you for your contribution. We just discovered PrivateGPT and will follow up with a video soon! Be sure to subscribe, if you haven't already, so that you get the notification when the new video is live. Thank you again!
What's a good machine that you can recommend if I want to load Llama 3 70B?
Hello, this link might be useful for your answer
stackoverflow.com/a/78390633
Excerpt is here
A 70B model uses approximately 140GB of RAM (each parameter is a 2-byte floating point number). If you want to run with full precision, I think you can do it with llama.cpp and a Mac that has 192GB of unified memory, though the speed will not be that great (maybe a couple of tokens per second). If you run with 8-bit quantization, the RAM requirement drops by half and speed is also improved.
I hope this helps
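The arithmetic in that excerpt is easy to check yourself: at 16-bit precision each parameter takes 2 bytes, so each billion parameters needs roughly 2 GB of RAM.

```shell
# Back-of-the-envelope RAM estimate for a 70B-parameter model.
PARAMS_B=70                  # parameters, in billions
RAM_FP16=$((PARAMS_B * 2))   # 2 bytes/param at 16-bit precision -> ~GB
RAM_INT8=$((PARAMS_B * 1))   # 1 byte/param at 8-bit quantization -> ~GB
echo "fp16: ~${RAM_FP16} GB, int8: ~${RAM_INT8} GB"
```

This ignores overhead such as the KV cache, so treat these as lower bounds.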
Hello. Can we mask the model name "llama3:latest"? Thanks
Hello, as per ollama.com/library/llama3, you need to use "ollama pull llama3:latest". This should work
@@bonsaiilabs Thanks for your timely reply. I may have asked the question in the wrong way. I should ask: can we delete this "llama3:latest" label, or just leave it blank? Because I don't want the user to know what's behind it. Thanks again.
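One possible approach, if the goal is to hide the underlying model name from users: the Ollama CLI can copy a model under a new name with "ollama cp", after which the original entry can be removed. A sketch, where the alias "assistant" is just an example:

```shell
# Copy llama3 under a neutral alias, then remove the original entry.
# Runs only if the Ollama CLI is installed; "assistant" is an example alias.
ALIAS="assistant"
if command -v ollama >/dev/null 2>&1; then
  ollama cp llama3:latest "$ALIAS"   # duplicate the model under the alias
  ollama rm llama3:latest            # drop the original name
  ollama list                        # only the alias should remain
fi
```

The model list in Open WebUI would then show only the alias, not the original name.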
Great video. Very clear. Thank you!
Glad it was helpful!
Is it possible to upload PDF files and ask for summarization?
Yes, it is definitely possible. Stay tuned and we will share a video about that soon
@ricardoribeiro3281, the video is almost finished and will be out in the next few days. Make sure you subscribe and click the bell icon so that you get the notification once it is available. Thanks
Finally I found an easy tutorial, thank you
You're welcome!
Can it be trained? Or will it answer from its own data only?
You will need to fine-tune the model for your own use case. The base models can be fine-tuned, but you cannot train a base model from scratch as is. Hope that helps.
Can it run without a GPU?
I believe it can, but inference might be slow. I would encourage you to try it out and let me know how things go for you!
"Ollama is a popular library for running LLMs on both CPUs and GPUs". I found this reference on skypilot.readthedocs.io/en/latest/gallery/frameworks/ollama.html.
Hope that helps!
Super cool stuff
Thank you very much!