Thank you for a very informative video. I just bought an RTX 4060 laptop with 8GB VRAM and 16GB RAM. Looking to run smaller LLMs, learn to build AI agents, and also use Stable Diffusion. Hopefully it's enough for a start. Saving up for an RTX 4090 PC, a bit far-fetched at the moment 😂
Thanks for the feedback! Congratulations on the new RTX 4060 laptop. With 8GB of VRAM and 16GB of RAM, you're well-equipped to start working with smaller LLMs and building AI agents. I also have an RTX 4060, and it runs models like Gemma 2 2B in FP16, LLaMA 3.1 8B in Q4 quantization, and Qwen 2.5 3B in Q8 smoothly. These models should perform well with your setup, but I would recommend paying attention to the laptop temperature. Keep going with your AI projects!
Thank you for the model recommendations. That's what I was thinking as well, but a little assurance goes a long way. I'm thinking of moving on to a PC in a couple of months. Probably not an RTX 4090 😅. I'd probably go for a 4070 Ti or 4080 Super; I don't really need a very large model for what I'll be doing. Keep up the good work and hope you build up a large subscriber base 💪
You're welcome! I'm glad the recommendations were helpful. Moving to a PC sounds like a great plan, and the 4070 Ti or 4080 Super are solid choices, as they both have 16GB of VRAM, which will allow you to run larger models. Thank you for the kind words! I'm excited about the journey ahead and appreciate your support. Best of luck with your AI endeavors, and feel free to reach out if you have any questions along the way. By the way, I would recommend playing around with Whisper, which is an open-source speech-to-text model. I'm using it to create subtitles for the videos, and "Medium" and all the sizes below it work well on the RTX 4060.
@@AIFusion-official I also have a laptop with an RTX 4060, 32GB RAM, and an i9-13980HX processor... Could you suggest a good coding LLM, and which quantized version of Qwen 2.5 Coder could I run?
Hey, I'm thinking of getting a new laptop with an RTX 4060 and 8GB of VRAM. But I'm also considering using Google Colab or Jupyter notebooks to learn about and play around with Large Language Models (LLMs) and their applications.
The thing is, I need a new laptop anyway because my current one is ancient and barely hanging on. So, I'm wondering if it makes more sense to just buy my own machine or if I should go the Colab/Jupyter route.
What do you think?
Very nice, thanks. Can you do a video on TPUs?
Great informative video. Could you kindly suggest, if I should choose an AMD 9800X3D or AMD 9900X if I want to run LLMs locally?
My dedicated GPU has 4GB of memory, and my current RAM is 23GB. Can I add 16GB more RAM to my laptop to run LLM models? My current configuration is an ASUS TUF F15 with a 1650 Ti and 23GB of RAM. If I want to run qwen2.5-coder:35b on my laptop, can it be run?
I'm glad I found this video, but I got lost at the first part: GPU memory.
I know most people refer to the graphics part as a 'dedicated' component,
yet I'm curious: can we include an iGPU to contribute to our setup?
Considering the latest Ryzen 880M is quite a 'capable' iGPU..
Thank you for your comment! While the Ryzen 880M is a powerful iGPU for many tasks, running large language models (LLMs) typically requires a dedicated GPU with substantial VRAM (often 8GB to 24GB) due to the heavy memory and computational demands. iGPUs like the 880M share system memory and lack the dedicated resources and bandwidth needed to efficiently handle LLMs, so they wouldn't contribute significantly in this context. For best results, a discrete GPU with sufficient VRAM is essential.
What if you just want to run a model specifically for better speech recognition? It would be very small, a subset. Could that be done on the integrated GPU to keep the dedicated GPU free?
Do you know if a model that is 16GB in size could run on a graphics card with 16GB of VRAM?
A model that is 16GB in size might not fit perfectly into a 16GB VRAM graphics card due to additional memory requirements for computations and overhead. While it’s theoretically possible, practical use often requires more VRAM. Techniques like quantization and reducing batch size can help manage memory usage.
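The overhead point above can be sketched with a rough rule of thumb. This is a minimal estimate, assuming a flat 20% overhead factor for activations, KV cache, and framework buffers; real usage varies with the runtime, batch size, and context length.

```python
# Rough VRAM estimate for loading an LLM: weight size plus runtime overhead.
# The 1.2x overhead factor is an assumption, not a measured value.

def estimated_vram_gb(model_size_gb: float, overhead: float = 1.2) -> float:
    """Model file size times an overhead factor for activations,
    KV cache, and framework buffers."""
    return model_size_gb * overhead

def fits(model_size_gb: float, vram_gb: float) -> bool:
    """True if the model plus estimated overhead fits in VRAM."""
    return estimated_vram_gb(model_size_gb) <= vram_gb

print(estimated_vram_gb(16))  # a 16GB model needs roughly 19.2GB in practice
print(fits(16, 16))           # does not fit on a 16GB card
print(fits(16, 24))           # a 24GB card leaves headroom
```

Under this estimate, a 16GB model wants roughly 19GB of VRAM, which is why quantizing down a step or offloading layers is usually needed on a 16GB card.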
Does RAM speed (DDR5 vs. DDR4) matter much for LLM performance?
This does give system requirements and talks about things that are important as far as specs go, but unless you want to use FP16 or have a 16GB GPU, it's really kinda useless.
Hello, thanks for that great video. How about this system: i5-13600F, 32GB of DDR5 RAM, and an RTX 3090, for the currently published LLMs?
Hello, do you have the link for the quantization calculation you're showing in the video? Please, thank you.
That's not a tool; it's just something I made using HTML, CSS, and JS for the sake of the video. I made a tool right after this video where you can choose a large language model and see which GPUs could run it (and how many of them) in FP32, FP16, INT8, and INT4. (You can find the link to this tool in the description.)
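The core of such a tool is a simple size table: weight memory scales linearly with parameter count and bytes per parameter. A minimal sketch, assuming 1GB = 10^9 bytes and ignoring runtime overhead; the precision names match those mentioned above:

```python
# Weight memory per precision: bytes needed to store each parameter.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "INT8": 1.0, "INT4": 0.5}

def weight_size_gb(params_billion: float, precision: str) -> float:
    """Memory for the weights alone: 1B params at 1 byte each is ~1GB."""
    return params_billion * BYTES_PER_PARAM[precision]

# An 8B model shrinks from 32GB in FP32 down to 4GB in INT4.
for precision in BYTES_PER_PARAM:
    print(f"8B model in {precision}: {weight_size_gb(8, precision):.1f} GB")
```

Comparing that number against a card's VRAM (minus some headroom) tells you whether a given GPU, or how many of them, could hold the model at each precision.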
@@AIFusion-official thanks a lot for your answer, I looked but can't find the link 🤔
You're welcome! Here it is: aifusion.company/gpu-llm . Hope it helps!
@@AIFusion-official thank you so much
You're welcome!
Pshh, I've been running LLMs on 2012 hardware with 4-year-old CPUs. Very little budget.
I want to use AI tools like Stable Diffusion and ComfyUI, do 3D game development, and also train my AI models. Searching for a beast laptop under $1800. Also want to do video editing 😢
For the tasks you're looking to do (AI tools like Stable Diffusion and ComfyUI, 3D game development, AI model training, and video editing), getting a powerful laptop under $1800 is going to be tricky. While some laptops in that range offer decent specs, the problem is that they often struggle with heat management, especially during heavy workloads. Overheating can cause thermal throttling, which severely impacts performance and productivity, making it frustrating for tasks like AI model training and video editing. Honestly, you'd be better off going with a desktop. Desktops provide much better cooling systems, which means they can handle long hours of heavy use without slowing down. Plus, they're way more price-friendly for the same level of performance. With $1800, you can get a desktop with a much more powerful GPU, more RAM, and more storage space than any laptop in the same price range. And as a bonus, desktops are more upgradable, so you can easily improve the specs down the line as your needs grow.
@@AIFusion-official but I'm a student, I need portability. I'll also build a PC after 3-4 years.
For AI tools like Stable Diffusion, 3D game development, and video editing, finding a strong laptop under $1800 is tough but doable. Look for one with at least an NVIDIA RTX 3060 or RTX 4060/4070 GPU, paired with an Intel i7 or Ryzen 7 processor. You’ll need 16GB RAM (or ideally 32GB) and 1TB SSD storage. Good cooling is important to avoid overheating during heavy tasks. Some solid options are the ASUS ROG Strix G15, MSI Katana GF76, and Lenovo Legion 5 Pro. They’ll give you the portability you need as a student while handling your workload.
@@AIFusion-official What about a 3070 Ti laptop??
A laptop with an RTX 3070 Ti is a solid choice! It'll handle AI tasks, game development, and video editing really well. Just pair it with a good i7 or Ryzen 7, 16GB or 32GB RAM, and a 1TB SSD. If you find one under $1800, go for it! Just check the cooling to avoid overheating.
Apologies for the lack of knowledge... but why no AMD video cards?
AMD video cards work too. I didn't mention them in the video, but there are some good AMD GPUs for LLM tasks.
Nvidia GPUs have tensor cores to perform massive simultaneous calculations for AI. AMD has cuda which is more of a general processor.
@@jefflane2012 You have no idea what you're talking about.
@@jefflane2012 AMD has no CUDA at all; AMD has ROCm, which is kinda the same but different.
Tensor cores help, but other capabilities factor into it also; RDNA3 can work fine.
Hello, and thanks for the video. At the moment I have the possibility to choose between buying an Nvidia RTX 3060 12GB or an Nvidia RTX 4060 8GB GPU to use 8B LLMs in private testing.
Which one will be the better one? I understand that both GPUs will work at T4 85% accuracy; is this so?
Thank you for answering.
I have an RTX 4060, and it works fine. However, I would recommend getting the RTX 3060 because it has more video RAM, which allows it to load larger models. Hope this helps!
@@AIFusion-official, yes, thanks, it helps a lot; overall your stream was very informative.
But will the accuracy with 12GB VRAM be the same, or a bit better, as I understand it?
@@AIFusion-official, and how big is your RAM? About 32GB or bigger?
I can choose between 32 and 64GB, but at a higher price.
I have 32GB RAM.
The accuracy would be the same; the only difference is the speed, specifically how many tokens per second it can generate.
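The speed difference can be ballparked with a common heuristic: single-stream token generation is usually memory-bandwidth bound, since each token requires reading every weight once. A rough sketch, assuming published bandwidth specs (~360 GB/s for the RTX 3060, ~272 GB/s for the RTX 4060) and a ~4.7GB Q4-quantized 8B model; treat the results as ceilings, not benchmarks:

```python
# Upper bound on generation speed when memory-bandwidth bound:
# tokens/sec <= bandwidth / bytes read per token (~ the model size).

def max_tokens_per_second(model_size_gb: float, bandwidth_gb_s: float) -> float:
    """Bandwidth-bound ceiling on single-stream token generation."""
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 4.7  # example: an 8B model in Q4 quantization

print(f"RTX 3060 ceiling: {max_tokens_per_second(MODEL_GB, 360):.0f} tok/s")
print(f"RTX 4060 ceiling: {max_tokens_per_second(MODEL_GB, 272):.0f} tok/s")
```

Real throughput lands well below these ceilings, but the heuristic explains why VRAM capacity decides *what* you can run while bandwidth decides *how fast* it runs.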
"cutting edge research" read: "for more serious science or commercial tasks"