Should You Buy nVidia RTX 4060 for Stable Diffusion? AI Gaming?

  • Published 15 Jul 2024
  • The RTX 4060 Ti is not better for running generative AI applications than the 12GB 3060 Ti. In fact, it is slightly worse. The RTX 4060 Ti has a narrower memory bus than the 3060 Ti, which means that it can't transfer data as quickly. This can be a bottleneck for generative AI applications, which often require a lot of data to be processed. Additionally, the RTX 4060 Ti has a slightly lower number of CUDA cores than the 3060 Ti. CUDA cores are the processing units that are responsible for performing the calculations in generative AI applications.
    As you can see, the two graphics cards are very similar in terms of specifications. However, the 3060 Ti has a slight edge in terms of memory bandwidth and CUDA cores. This means that it will be slightly faster for generative AI applications.
    If you are looking for a graphics card for generative AI applications, the RTX 3060 Ti 12GB is a better choice than the RTX 4060 Ti. It is slightly faster, has more memory, and is available at a lower price.
    ----
    Mining Footage Attribution: ‪@RedPandaMining‬
  • Science & Technology

COMMENTS • 64

  • @shao-qiko7048
    @shao-qiko7048 11 months ago +65

    For those researching this topic who come across this video: there are some clear problems with its conclusion. First, the reviewer seems to have mixed up the 3060 and its Ti variant (and mixed up the 4000 series as well); you can't be mixing them up when trying to give a recommendation. Second, VRAM bandwidth is usually not the bottleneck for LLMs or Stable Diffusion (not saying it never is). What matters more is, first, whether you have enough VRAM to load/run the model, and second, the FP16/FP32 performance of your chip (some models/applications might have slightly different preferences). Now, enough theory on paper: benchmarks of the 4060 and 4060 Ti running Stable Diffusion are already out there, and both are better than the 3060 (12GB). If we only talk about pure performance (so assume you have enough VRAM to do what you want to do), the 4060 actually performs very close to the 3060 Ti, and the 4060 Ti performs close to the 3070 (obviously better than the 3060 Ti). Now, if you add VRAM into the equation, the 3060 (12GB) does still have its place over the 4060 (8GB), i.e. it can load larger models; it's also a lot cheaper on the second-hand market, sure, but the 4060 (8GB) is most certainly not "slower" than the 3060 (12GB). The AI space is evolving really quickly, so the actual percentages might change; if you want more information, check www.reddit.com/r/StableDiffusion/comments/15h4g7z/nvidia_amd_intel_gpu_benchmark_data_update/

    • @BlueprintBro
      @BlueprintBro 8 months ago +1

      Thank you

    • @TheParadoxy
      @TheParadoxy 7 months ago +4

      Yeah, thanks for pointing this out. I just upgraded from a 2060 Super to a 4060 Ti and it's definitely an improvement, despite having half the bus width. You can also check the Tom's Hardware review for a Stable Diffusion benchmark; they show the 4060 Ti beating the 3060 Ti (once again, with the 3060 Ti having twice the bus width).

    • @TylerDurden-up8hz
      @TylerDurden-up8hz 6 months ago

      @@TheParadoxy and the Intel Arc A770 beats the 4060 at a lower price and with double the VRAM 😂

    • @TheParadoxy
      @TheParadoxy 6 months ago +1

      @@TylerDurden-up8hz If you think I'm an Nvidia fanboy, you're wrong. I have another rig with a Radeon 6750 XT that I absolutely love, and I've found it to actually be a better user experience for machine learning than the Nvidia cards for the most part. But then I'm SOL in certain contexts, like running the newest Leela Chess Zero networks on that card, because OpenCL doesn't support the attention mechanism.
      So if Intel is fully functional for Stable Diffusion, that's awesome! I checked the Tom's Hardware article and saw that it's beating not just the 4060 but also the 4060 Ti. It looks like in the original article they couldn't get 768 resolution working but later figured it out. If people say it's working smoothly for the new SDXL models, then I think the A770 is hands down the best option for Stable Diffusion on a budget.
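A rough way to sanity-check the "enough VRAM to load the model" point in the thread above: inference memory is dominated by the weights, roughly parameter count times bytes per parameter, plus some working overhead. A minimal sketch; the 20% overhead factor and the SD 1.5 parameter count are rough assumptions, not measured values:

```python
def model_vram_gb(n_params: float, bytes_per_param: int = 2,
                  overhead: float = 1.2) -> float:
    """Rough inference VRAM estimate: weights at the given precision
    (2 bytes per parameter for FP16), times an assumed ~20% overhead
    for activations and framework workspace."""
    return n_params * bytes_per_param * overhead / 1e9

# SD 1.5's UNet has roughly 0.86B parameters; at FP16 the weights
# plus assumed overhead come to about 2 GB, which is consistent with
# commenters below running it on 4-6GB cards.
print(round(model_vram_gb(0.86e9), 1))
```

This is only a lower bound: higher resolutions, ControlNet, and batch sizes all add activation memory on top of the weights.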

  • @Kwipper
    @Kwipper 11 months ago +19

    There isn't a 12GB 3060 Ti. There is a 12GB 3060, but there is no 12GB Ti variant.

    • @aifluxchannel
      @aifluxchannel  11 months ago +4

      You're right! We'll add a correction in a card.

  • @fatidicusaeternus6498
    @fatidicusaeternus6498 1 year ago +3

    DLSS applies to all rendering, not just ray tracing. Additionally, the "7/8ths of all frames" is based on upscaling a traditionally rendered frame from 1080p to 4k and using DLSS 3 to insert an estimated frame between each of those, meaning only 1/8th of all generated pixels are based on actual rendering from the game engine. All this adds graphical artifacts and latency of course.
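The "1/8th of all generated pixels" arithmetic in the comment above checks out: upscaling 1080p to 4K means only a quarter of the displayed pixels are rendered, and inserting one generated frame per rendered frame halves that again. A quick check:

```python
# Fraction of displayed pixels actually rendered by the game engine
# under DLSS upscaling (1080p -> 4K) plus DLSS 3 frame generation.
upscale_ratio = (3840 * 2160) / (1920 * 1080)      # 4x as many output pixels
rendered_fraction = (1 / upscale_ratio) * (1 / 2)  # frame gen halves it again
print(rendered_fraction)  # 0.125, i.e. 1/8 rendered, 7/8 estimated
```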

  • @gcardinal
    @gcardinal 1 year ago +16

    Sorry, can't agree. 16GB is huge for SD: when you generate at 512 you simply won't ever get the results you would from a larger initial image. Also, if you are doing this as a business to sell images, size matters a lot, and if your source is low quality you won't ever be able to upscale it. For a rig that will just generate images 24/7 and upscale them, time is not as important as high-quality end results. But for generally having fun, the 3060 Ti is indeed just as good.

    • @TheBobo203
      @TheBobo203 11 months ago +3

      So has anybody tried SD on the 16GB 4060 Ti already? I still have to decide if it is better than, say, the 4070 12GB at high resolutions.

    • @gcardinal
      @gcardinal 11 months ago +3

      @@TheBobo203 It is for sure. 16GB simply gives you many more options; 12GB is just limiting. You will constantly get errors and will need to adjust resolution.
      But the 4070 is faster, so it all depends on what your end goal is.

    • @generalawareness101
      @generalawareness101 11 months ago +2

      @@TheBobo203 Most of us who buy x060 cards are on PCIe gen 3 boards. FFS, Nvidia not only halved the bus width (even my 1060 has 192-bit) but also made it demand gen 4 to take full speed advantage; gen 3 users will be at x8 max with a 128-bit bus. That is horrid, no matter how anyone tries to spin it.

    • @KillFrenzy96
      @KillFrenzy96 11 months ago +2

      Now that Stable Diffusion XL is out, 12GB is barely enough to scrape by. The RTX 3060 12GB is still an excellent budget card and is barely enough to generate 1024x1024 SDXL images. However, if you need to train models at 1024x1024 or generate images larger than this, you will need at least 16GB or even 24GB of VRAM.

    • @generalawareness101
      @generalawareness101 11 months ago +2

      @@KillFrenzy96 If you want to train a full-on checkpoint (fine-tune/DreamBooth) you now need 32GB to 48GB for XL at 1024x1024, batch size 1, as the full UNet alone is 23GB. I hope we get some refinement/optimizations soon to lower that to more sane levels.
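The 32-48GB training figure quoted above is consistent with standard training-memory accounting: full-precision training with Adam keeps the weights, the gradients, and two optimizer moments in VRAM, with activations on top. A rough sketch; the SDXL UNet parameter count is an approximation, and activation memory is deliberately excluded:

```python
def train_vram_gb(n_params: float, bytes_per_param: int = 4,
                  optimizer_states: int = 2) -> float:
    """Lower-bound VRAM for full FP32 training with Adam: one copy
    each for weights and gradients, plus two optimizer moments.
    Activation memory (which grows with resolution and batch size)
    is excluded, so real usage is higher."""
    copies = 2 + optimizer_states  # weights + grads + moment estimates
    return n_params * bytes_per_param * copies / 1e9

# SDXL's UNet has roughly 2.6B parameters, giving ~42 GB before
# activations, which lands inside the 32-48GB range quoted above.
print(round(train_vram_gb(2.6e9)))
```

Mixed-precision training and optimizer tricks (8-bit Adam, gradient checkpointing) reduce these numbers substantially, which is presumably the kind of optimization the commenter is hoping for.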

  • @64jcl
    @64jcl 8 months ago +2

    Huh, I am using InvokeAI running SD 1.5 (and variants of this model) on an MSI 2060, and that only has 6GB of video RAM. It still churns out 768x768 pictures pretty fast. The same goes for my mobile 3050 Ti with only 4GB of RAM, so I'm not sure about what you say about memory being so critical here. I was wondering about upgrading my standard 2060 to a 4060 Ti, as it would also be nice to have better gaming performance, and it seems SD would run at 2x the speed, which is decent.

    • @lefourbe5596
      @lefourbe5596 8 months ago

      If you want to generate at higher resolution you want more VRAM. Faces look terrible at base generation (full-body shots), and details are weak too.
      If you add ControlNet, which is standard, you will absolutely need more than 8GB. Therefore, the more VRAM, the better it gets.

    • @64jcl
      @64jcl 8 months ago

      @@lefourbe5596 I generally just upscale the image, then do individual improvements like the face, which always generates an excellent face as I describe it using a 768x768 area.

  • @mrbabyhugh
    @mrbabyhugh 2 months ago +1

    It is ignorant for people to say don't buy EVGA cards because they don't make them anymore. What does that have to do with what they had already made?

  • @kostadintzvetkov1636
    @kostadintzvetkov1636 9 months ago +2

    The RTX 3060 has a 192-bit memory bus, not 256 bits. You should get your facts straight.

  • @JohnDoe-cv8iw
    @JohnDoe-cv8iw 7 months ago +2

    I had a 4090 up until a few days ago. I had to sell it because I'm going back to school and need to pay some bills ahead of time. I had someone trade me a 3060 to knock $100 off the price, and I'm surprised: I thought it would be extremely slow compared to the 4090, but it's not as bad as I expected! Still useful, actually.

  • @Shadow_Shinigami
    @Shadow_Shinigami 1 year ago +5

    DLSS/frame gen is only an enhancement for the gaming experience in slower-paced, exploration- and story-based single-player games, because of the high latency that gets added on. For everything else it's a worse experience.

    • @aifluxchannel
      @aifluxchannel  1 year ago

      Definitely agree! Nvidia generally oversells this feature so it can spend less on expensive components like VRAM :(

  • @kehindeadesuyi6741
    @kehindeadesuyi6741 1 month ago +1

    What about someone upgrading from a GeForce MX150 to a laptop with an RTX 4060?

  • @haon2205
    @haon2205 1 year ago +3

    I thought only the non-Ti 3060 has 12GB.

  • @jamesjonnes
    @jamesjonnes 1 year ago +3

    If I were doing only generative AI, I'd buy an M40 or P40, cheaper than a 3060 Ti, for 16 more GB of VRAM.

    • @aifluxchannel
      @aifluxchannel  1 year ago +1

      Also a very good choice!

    • @pokepress
      @pokepress 1 year ago

      I don’t think I’ll be picking up a data center GPU for myself any time soon, but the prospect of using one is interesting. From what I understand, the speed isn’t as fast as the new cards, but the extra VRAM can be helpful.

    • @jamesjonnes
      @jamesjonnes 1 year ago

      @@pokepress How fast they are depends on the application and the system. There will probably be no gain in speed if you add more than one to a consumer-grade system. But if you want to run an LLM, it will be much faster than offloading some layers to RAM, and not much slower than a consumer-grade GPU with the same amount of VRAM.

    • @mordokai597
      @mordokai597 1 year ago +3

      TL;DR: EVERY developer making popular AI apps, including Auto1111 (SD), Bmaltis (Kohya_SS), Oobabooga (textgen), and several others, has told me the only solution for numerous recent SYSTEM-DESTROYING incompatibility errors was "get rid of the M40". Those were each and every one's exact words.
      I have a 3060 XC and an M40. Keeping that M40 running, with driver/board/PCIe resource/ECC cache issues and incompatibility with most 4-bit/8-bit/bf16 optimizations, makes everything about that setup more complicated. Most common AI applications are ALL moving toward "cross-platform stability", which means making everything more Windows-user friendly, then rolling the closest equivalent of the new Windows setup out to the other platforms as an auto-installer/auto env/venv setup (I had to do a complete system reinstall two weeks ago because a Kohya_SS Linux Docker had literal Windows stuff in it). These changes are quickly making it impossible to configure and run M40s on dual-GPU systems; M40s are becoming viable only on single-GPU, headless, or iGPU-video-out setups.

    • @generalawareness101
      @generalawareness101 11 months ago +2

      The P100 as well, but what has kept me away from them is the noise. OMG, blast blowers from hell are not my idea of fun, nor is having to rig up a water-cooling loop for one. Stuck between a rock and a hard place since Feb.

  • @silentwindstudio
    @silentwindstudio 5 months ago +1

    So no real-world testing included in the video? Huh.

  • @jazzlover10000
    @jazzlover10000 4 months ago +1

    I'm looking for good reviews comparing the 4060 Ti 16GB to the 3070 8GB, as this is how they are generally found in the wild. I don't think anyone (sane) buys the 4060 8GB card, but because of the marketing error, you _must_ put the gigabytes in the title, comments, and intro. Otherwise we'll just assume the video is not pertinent.

  • @RealShinpin
    @RealShinpin 1 year ago +3

    I still think 30-series cards are able to run DLSS 3. I firmly believe they are withholding it to drive people to buy the 40 series... Hopefully someone ports it over to the 30 series.

    • @RandomClips22023
      @RandomClips22023 11 months ago +1

      No, DLSS 3 needs a special processing unit.

    • @kallethefox1968
      @kallethefox1968 26 days ago

      Well, you could, but unlike AMD, which does it in software, Nvidia uses hardware on the die itself. Only 40-series chips have those units on the die; the 30 series only got tensor cores for DLSS, not for frame generation. Unless they make software like AMD's, it's not going to happen.

  • @karionwhite2367
    @karionwhite2367 2 months ago

    2:30 There was Peter!

  • @kalibtv8001
    @kalibtv8001 7 months ago

    You are confusing the RTX 3060 12GB with the RTX 3060 Ti 8GB.

    • @jazzlover10000
      @jazzlover10000 4 months ago

      Well, and was that the 4060 Ti with 16GB? The 8GB card doesn't perform well.

  • @hp2073
    @hp2073 7 months ago +1

    I have two options: a 3060 12GB for $200 and a 3090 24GB for $600. I'm not looking for fun; I'm looking for a way to make money. What is your suggestion?

  • @tompom3513
    @tompom3513 10 months ago +1

  • @Mehdital89
    @Mehdital89 7 months ago +1

    Best card for the money is the 3090

  • @not_milk
    @not_milk 1 year ago +3

    Honestly, it makes the most sense at this point to get an M2 Mac with 32 or 64GB of RAM to be able to run LLMs and video generators, since you can't get that kind of VRAM without a very large budget.

    • @lefourbe5596
      @lefourbe5596 11 months ago +2

      It is still slower, though not by that much. A friend's M1 Max generates about 2.2x slower than my 3090 (at an 85% power preset).
      Impressive, since Apple silicon runs below 45W. But that 64GB M1 Max also cost him €4000!

    • @amarnamarpan
      @amarnamarpan 3 months ago

      I would argue against it. Firstly, that's costlier than so many GPUs, and it's still slower than most Nvidia GPUs.

  • @Feelix420
    @Feelix420 11 months ago +2

    The only thing the RTX 4060 is good at is consuming less power.

    • @aifluxchannel
      @aifluxchannel  10 months ago

      Yep, Nvidia has been prioritizing this more and more with each new generation of GPUs.

    • @jazzlover10000
      @jazzlover10000 4 months ago

      Nahhh, it's cheap and it's a great card for SDXL gen. But y'know what really makes this card go? The 16GB of VRAM.