Challenge time! Share the most interesting response you got from DeepSeek R1 in the comments.
Great content! Thank you Leon.
My pleasure!
Thanks man, your tutorial was much easier to follow.
Well done ✅ very well explained
Awesome! 💪❤️
Thanks, Francois!
A very big thanks for the great content!
You're welcome!
I have an RTX 3080 with 10 GB of GPU memory, an i7-12700F, and 32 GB RAM. Which parameter size should I use? I know the 671B is perfect (above ChatGPT o1 performance). I am wondering about 32B and 70B performance. How do those compare to GPT-4 Turbo?
Hello, I set everything up as in the video, but when I ask something the chat goes blank and it takes a very long time to get responses. What can I do? I've got an RTX 3050 and a Ryzen 7 5800H.
Which is best for an AMD Ryzen 5 3600 and a 1080 Ti?
Wow, does it support file attachment?
Yes
One more question: can we run a 7B model on laptops that have no dedicated GPU, only the integrated GPU that is part of the processor (i3, i5, etc.)? Thanks for your earlier reply. I haven't seen anyone address this in any YouTube video, and I couldn't find it anywhere.
Well, you need to just try for yourself.
I was able to run the Llama 3.2: 3b model on a 10 year old laptop with an internal / onboard GPU.
To do this at "acceptable" performance, these minimum requirements might help as a starting point for CPU-only inference:
7B params: C1, 16 GB RAM
13B params: C2, 32 GB RAM
30B params: C3, 64 GB RAM
70B params: C4, 128 GB RAM
Legend
C1: Quad-core CPU (e.g., Intel i5/Ryzen 5)
C2: Hexa-core CPU (e.g., Intel i7/Ryzen 7)
C3: High-end CPU (e.g., Intel i9/Ryzen 9)
C4: Workstation-grade CPU (e.g., Xeon/Threadripper)
Obtained via Perplexity.
Sources checked. Please double-check before investing in anything.
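As a rough sanity check on the figures above, a model's memory footprint can be estimated from its parameter count. A minimal Python sketch, assuming roughly 0.5 bytes per parameter for 4-bit quantization (common for local GGUF builds) and a hypothetical 20% overhead factor for KV cache and runtime buffers (both figures are my assumptions, not from the thread):

```python
# Rough memory estimate for a locally hosted, quantized LLM.
# Assumptions: 4-bit quantization ~= 0.5 bytes per parameter,
# plus ~20% overhead for KV cache and runtime buffers.

def approx_model_gb(params_billions: float,
                    bytes_per_param: float = 0.5,
                    overhead: float = 1.2) -> float:
    """Return an approximate memory footprint in GB."""
    return params_billions * bytes_per_param * overhead

for size in (7, 13, 30, 70):
    print(f"{size}B params -> ~{approx_model_gb(size):.1f} GB")
```

By this estimate a 7B model needs roughly 4 GB, so 16 GB of system RAM leaves comfortable headroom for the OS and context, in line with the table above.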
What is the main difference between the 1.5b and 671b models, and what is its impact on the responses?
671b is the full, real one with all the benefits. Anything else is just garbage and not worth it. There are too many YouTube videos about hosting the small models, but they don't function like the full model.
Will the output responses be different for 1.5b vs 671b? What about speed?
Of course it will be. Check the benchmarks.
Which R1 model can I run on an RTX 4090? Will 32b struggle? I've got bad internet here and I don't want to download and try different models.
32B should be fine
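For anyone else sizing this up: a quick way to reason about it is to compare a model's quantized file size against the card's VRAM. A toy sketch (`pick_model` is a hypothetical helper, not part of Ollama or any tool, and the size figures are approximate 4-bit download sizes; check the model page before relying on them):

```python
# Approximate 4-bit quantized sizes (GB) for the DeepSeek R1
# distills; rough figures, verify against the actual model page.
MODEL_SIZES_GB = {"1.5b": 1.1, "7b": 4.7, "8b": 4.9,
                  "14b": 9.0, "32b": 20.0, "70b": 43.0}

def pick_model(vram_gb: float, headroom: float = 0.9) -> str:
    """Return the largest tag whose weights fit in vram_gb * headroom."""
    budget = vram_gb * headroom  # reserve some VRAM for context/runtime
    fitting = [t for t, gb in MODEL_SIZES_GB.items() if gb <= budget]
    # Pick the largest model that fits the budget.
    return max(fitting, key=MODEL_SIZES_GB.get) if fitting else "none"

print(pick_model(24))  # RTX 4090 (24 GB)
print(pick_model(10))  # RTX 3080 (10 GB)
```

With 24 GB of VRAM the 32b weights (roughly 20 GB) fit, which matches the answer above; on a 10 GB card you'd be looking at the 8b or 14b distills instead.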
Can I use Open WebUI / AnythingLLM with API keys?
Can we upload a PDF and ask questions about it in a local setup?
Yes. WebUI includes file uploads and web search.
@leonvanzyl Nope, unfortunately it can't read PDF files or describe images.
does it have tools? can it search the internet?
Nope
Can you make a video on adding search plus V3/R1 API keys from, say, Hyperbolic to WebUI / AnythingLLM (which has temperature control)?
Nice one, pal.
Great video 🤗 Btw, I've always wanted to ask you: Flowise vs Pydantic AI. I know one is no-code and the other is code, but I still wonder which framework is more powerful and worth learning. I don't know coding or Flowise, but I'm ready to learn whichever is more future-proof and powerful. So please do guide me.
Like you said, they're different, as one is no-code and the other is a programming library.
I like Pydantic, but if I had to learn one specific library, it would be Langchain and, by extension, LangGraph.
@leonvanzyl Thank you for the reply. Can I ask why you chose those? What makes them more powerful? Flowise uses LangChain and LangGraph, right? Is it better to learn both through programming or through Flowise? What do you suggest?
Nvm, I didn't realise that if you use the Docker one you need to import a model.
How else does it get access to models?
@leonvanzyl I tried a different one and it downloaded a model by itself. I assumed this one would too.
671b is where it’s at though
Ask it to spell strawberry 😂
And tell it it's wrong. And again. I love the internal thoughts it's having, and at some point it just gives in.
Thank you for the video, bro. Is Phidata better than Pydantic AI, LangChain, or any other framework for building AI agents? If you had to pick one framework to build an AI agent, which would you pick and why? 🤔🫡