Were you surprised by the true size of the chip inside the RTX 4090? And how much bigger did you guess it was compared to the 3090 Ti?
I was surprised that it is slightly smaller in mm²
For transistor count I guessed >2.2x, since it's about 80% faster in games. The smaller total area was a surprise; it shows how tiny 4N is. It couldn't be much larger, 700-800mm² is the maximum size for the process, right?
Keep up the good work, really enjoyed the information you provided and the way you did it .
Makes me think there is no point in more power other than to saturate 8K. For 4K, this is all you need. If they based a console off this chip, it would render reality.
very few channels go this deep inside the chip technology, thanks for the videos.
That's simply not true. Just off the top of my head, Asianometry is one of many other creators who go deep into these subjects. But by all means, HighYield is a really great creator!
@@WAY2PWN that's still very few
'Cause it is not a very popular topic. Look at the number of comments in the comment section. "Very few" people understand or wish to know what a transistor or chip is. You are probably the only person in your immediate social circle who knows anything about transistors, unless you are an IT engineer. 😄
I keep rooting for your channel to get some exposure. I know it won't mean shit, but the bunch of us out here watching appreciate every frame of your videos. Keep it up man, this is one of the very few "hardcore arch" channels that tickle me just the right way.
The usual excellent job on content, presentation, editing and sound. Keep them coming!
FYI - I always make it to the end, you have good analysis and perspective. Yes, I was surprised at the silicon size; from other tubers I expected it to be larger. That is a staggering number of transistors too! Now I'm so curious what AMD will release, and how it will be priced.
Great video, very informative. I wonder if the 88% binning for the 4090 might mean yields are good but not great on 4N. I do find it strange that they didn't consider enabling more transistors but clocking the chip lower. It would have had similar performance at much lower power.
350 watts? Then create an unlimited-power version, clock it 10% higher, and that's your Ti or whatever. Then Nvidia would have to bin more dies down to a 4080 or 4080 Ti.
Always such an interesting video - great pace and nice calm voice too!
here, thanks for making these in-detail videos. I really like every video of yours so far for going into real depth. Keep up this Awesomeness!
I really love this type of content. As a render programmer, I find this stuff really fascinating.
The RTX 4090 was tuned down a lot from what was planned, making it pretty efficient.
I was curious about how much of the chip was disabled, given its huge size on a cutting-edge process. Thanks for the info.
I thought it still retained the hefty 96 MB of L2$ the leaks mentioned. At 72 MB it looks like the rumored 4080-class part we heard about in the first rounds of rumours about AD.
Found your channel recently, loving the detail and clarity of your analysis! Keep it up 👍
What an incredibly interesting deep dive. I am looking forward to future videos!
I know I'm not the only one who glanced at the thumbnail and read AD102 as "ADIOS"... For a second I thought Nvidia was calling it quits.
More like, Adios hasta la vista baby ;)
My guess was twice as big. I guess that’s not as bad if you consider how much ends up being dark silicon on the 4090!
I promise I did not look this up now. I have seen the counts before, but may have misremembered them.
My guess for how many more transistors AD102 has:
GA102: 2.7x
TU102: 4x
GP102: 6.5x
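For reference, the guesses above can be checked against commonly quoted transistor counts. This is a minimal Python sketch; the counts are public figures added here for illustration (they are not from the comment), and the GP102 number in particular is sometimes rounded to 12 billion, which would shift its ratio slightly.

```python
# Commonly quoted transistor counts in billions (added for illustration).
TRANSISTORS_BN = {
    "AD102": 76.3,
    "GA102": 28.3,
    "TU102": 18.6,
    "GP102": 11.8,  # sometimes quoted as 12
}


def ratio_to_ad102(chip: str) -> float:
    """Return how many times more transistors AD102 has than `chip`."""
    return TRANSISTORS_BN["AD102"] / TRANSISTORS_BN[chip]


for chip in ("GA102", "TU102", "GP102"):
    print(f"{chip}: {ratio_to_ad102(chip):.1f}x")
```

With these inputs the three guesses land within about a tenth of the computed ratios.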
Best explanation yet! Well done!
Great deep dive, thanks!! Not that it is unexpected from you. FYI I did watch all the way through yesterday, I just couldn't comment on my TV and didn't have my phone handy, so I came back to make sure after DLing the new video for my drive to get the little one. I'm sure it's great. Thanks again!
Don't usually comment, but as an undergrad student in comp engineering, your channel has taught me so many things our course has left out. And other tech channels haven't tried as hard as you have. Thank you very much 🥰. Hope I get to learn a lot more.
There's a lot of tech channels but you stand out as one of the better in-depth knowledgeable channels. You're not just vomiting unconfirmed news or repeating the same thing everyone else does. I'll watch your channel until you stop bro, and I hope you don't stop.
This video is amazing, I'm watching all the way through.
Good video, but I have some points:
1. From die shots available, RT & Tensor units use very little die space (
Another solid deep dive into a part that lots of gamers talk about without really caring to look into it. Thank you for dissecting that beast, it's even more impressive after all the info from your presentation. Also congrats on the excellent production value of your content. Like in football (USA), you got to go ALL THE WAY! Hats off to Green Team.
Thanks for doing what you do!! Good content is often less hyped but it takes the same or even more effort!!!
Love this high end stuff! Really, though, I'm most interested to see if this trickles down to lower priced cards this generation.
Love the in depth explanations! keep em coming
I really love your videos. They are so precise and informative. 👍
I already knew the transistor count and the die-size (I'm such a nerd ... 😂).
I think Nvidia might have binned the RTX 4090 that way to obtain a higher yield rate (remember, the N4 process is still relatively new and TSMC will probably have a higher percentage of faulty transistors than the "old" and well-known 8 nm Samsung process). It doesn't really make sense to push a die with 100% active transistors when the process is not yet ready to deliver such a high yield rate. Nvidia will buy some time with the RTX 4090/4080 for TSMC to streamline the new process, and they will deliver a fully activated AD102 when the yields justify such a step.
It doesn't have to be this way, just some thoughts on it. 👌
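The yield argument above can be illustrated with a simple Poisson defect model. This is only a sketch under stated assumptions: the defect density is a made-up placeholder (not a TSMC figure), the roughly 608 mm² die area is the commonly quoted AD102 size, and the idea that a die with up to two defects can be salvaged by disabling units is purely illustrative.

```python
import math

DIE_AREA_CM2 = 6.08     # AD102 is roughly 608 mm^2
DEFECTS_PER_CM2 = 0.1   # hypothetical defect density, NOT a published figure
MAX_SALVAGEABLE = 2     # assumed number of defects a cut-down part can tolerate


def poisson_cdf(k: int, mean: float) -> float:
    """Probability of at most k defects when defects follow Poisson(mean)."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))


mean_defects = DIE_AREA_CM2 * DEFECTS_PER_CM2
perfect_yield = poisson_cdf(0, mean_defects)                  # fully working dies
salvage_yield = poisson_cdf(MAX_SALVAGEABLE, mean_defects)    # dies usable as a binned part

print(f"fully enabled dies: {perfect_yield:.0%}")
print(f"usable with some units disabled: {salvage_yield:.0%}")
```

Under any such model the share of usable dies rises once a few defective units may be fused off, which is the trade-off the comment describes.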
Yes, a comment on watching all the way. The binning examples were fascinating. Now to go search for your other deep dives.
Thank you so much for making this video, it's refreshing to see specialised insight with the perfect balance of existing knowledge assumptions and detailed explanations.
these videos are always so well produced!
I’m a simple man- I see a High Yield post? I click
The 4090 has the most over-engineered cooler we will probably ever see. Well, wait.. the 4080 16GB will use the same one!!! Damn, crazy. I’m capped at 120fps at 4K in Destiny 2 at max settings and I don’t see more than 50°C peak and 58°C hotspot. Peaks at 60% power draw (HWiNFO). It’s insanely efficient.
Good work...love the deep dive
New to this channel. Really like the fine details of getting into the "nitty gritty" behind the generation on generation
7:41 The use of "exponential" as stated is plainly incorrect. Doubling the base number of transistors on the chip results in double the scaled number of transistors, likewise, quadrupling the base quadruples the scaled number of transistors. The factor on the base and scaled transistor count is the same, which describes a *linear* mapping.
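The distinction can be made concrete with a few lines of Python. The scale factor below is a hypothetical constant chosen for illustration; the point is only that a constant factor preserves ratios, while a genuinely exponential function does not.

```python
SCALE = 0.89  # hypothetical constant factor applied to the base count


def linear(base: float) -> float:
    """Constant-factor scaling: doubling the input doubles the output."""
    return SCALE * base


def exponential(base: float) -> float:
    """A truly exponential mapping, for contrast."""
    return 2.0 ** base


base = 10.0
# Under linear scaling the ratio is 2 regardless of the starting value.
linear_ratio = linear(2 * base) / linear(base)
# Under exponential scaling the ratio is 2**base, so it depends on the start.
exp_ratio = exponential(2 * base) / exponential(base)

print(f"linear: doubling the base multiplies the result by {linear_ratio:.1f}")
print(f"exponential: doubling the base multiplies the result by {exp_ratio:.1f}")
```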
Watched till the end to get the full scope of your review.
Thanks for these analysis vids.
Finally, someone going into DETAIL about leading edge tech, I always learn something fun every video
Was the media engine changed, i.e. was AV1 decoding hardware added? Enjoyed the watch!
Quality video, excellent! Thank you for all this information.
Such a great video!
Fantastic breakdown, thank you!
Leaving the comment as asked lol. I've been binging your videos because I guess i love my job animated
This is the first of your videos that I watched, seems cool!
Still watching - fascinating stuff, especially back to back with the AMD presentation
I am both excited & excited !!!!!!!!!!!!!!!!!!!!!!!!
Every product I see using TSMC's 4nm node delivers an extreme uplift in performance and efficiency. Just shows how strong TSMC's fab processes are.
Good work! New sub🍻🏆
I’m enjoying the content you’re putting out, keep up the great work!
I appreciate it!
The power limiting, from my experience, isn't quite as good as some make out. I ran some benchmarks at 90% PL and noticed some small wobbles in clock speed; the end result of the bench had the same average fps and just 1 less max. But it should be noted that at 100% PL the card wasn't pulling near 100%, so limiting it to 90% wasn't really a 10% drop in power draw.
I then tested again at 80% PL, and the clocks were very noticeably wobbling. I can't remember now what the fps drop was; it was small, but there was a drop. Which is fine, since you are now actually cutting power (in this instance) by the full 10% below the 90% marker. It probably does increase efficiency, but I dislike such wobbling.
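The arithmetic behind that observation can be sketched as follows. The 450 W figure is the card's rated limit; the 410 W uncapped draw is a hypothetical number chosen only to illustrate the effect, and real cards fluctuate with workload rather than hitting a hard cap.

```python
TDP_W = 450.0            # RTX 4090 rated board power (100% power limit)
UNCAPPED_DRAW_W = 410.0  # hypothetical draw of one benchmark at 100% PL


def effective_draw(limit_pct: float) -> float:
    """Power drawn under a given power limit (simplified as a hard cap)."""
    cap_w = TDP_W * limit_pct / 100.0
    return min(UNCAPPED_DRAW_W, cap_w)


for pct in (100, 90, 80):
    draw_w = effective_draw(pct)
    saved_pct = 100.0 * (UNCAPPED_DRAW_W - draw_w) / UNCAPPED_DRAW_W
    print(f"PL {pct}%: {draw_w:.0f} W, {saved_pct:.1f}% below the uncapped draw")
```

In this toy case the 90% limit trims only 5 W, while the step from 90% to 80% removes the full 45 W.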
You deserve a million subscribers for this kind of content
Great content. My kind of spoon-fed knowledge dump. Keep feeding me these awesome videos.
Great analysis!
Awesome analysis! Great video!
Watched to the end.
Idk much about those super high end setups some people use, but spending transistors on new and evolving technologies rather than just scaling up seems a much better use of resources. I'd really expect diminishing returns on scaling up these beasts even more an irritating Moore's law
Rewatched again, great overview
Rare quality analysis!
Excellent video!
Nice one, thanks. Was wondering if you could make a vid comparing AMD and Nvidia architectures, and more specifically CUDA vs CU. I think with RDNA3 such a comparison would be very interesting.
This video is *fantastic* . You have no idea how great it is to just "appreciate" how much work goes into these GPU designs rather than having people mope and bitch about prices and bad marketing on NVIDIA's part. I plan to buy an RTX 4090 FE in the coming months to start making YT content and I have *no* doubt that I will not have to change my GPU for some years to come!
Consumers will always demand more for less, that's just how it works. If we stopped demanding better, it would be bad for the market.
Strictly looking at density, N4 is 2.4-3x denser than 8LPP. So, I’d go with 2.7x.
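As a rough cross-check, whole-die density can be computed from commonly quoted figures. This is a sketch with numbers added for illustration (transistor counts and die areas as usually published for AD102 and GA102); average die density mixes logic, cache and I/O, so it is not the same thing as the node's quoted logic density.

```python
# Commonly quoted figures: (transistors in billions, die area in mm^2).
CHIPS = {
    "AD102": (76.3, 608.5),  # TSMC 4N
    "GA102": (28.3, 628.4),  # Samsung 8 nm
}


def density_mtr_per_mm2(chip: str) -> float:
    """Average density in millions of transistors per mm^2."""
    transistors_bn, area_mm2 = CHIPS[chip]
    return transistors_bn * 1000.0 / area_mm2


density_ratio = density_mtr_per_mm2("AD102") / density_mtr_per_mm2("GA102")

for chip in CHIPS:
    print(f"{chip}: {density_mtr_per_mm2(chip):.1f} MTr/mm^2")
print(f"AD102 vs GA102 density: {density_ratio:.2f}x")
```

With these inputs the ratio falls inside the 2.4-3x range mentioned above.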
I am surprised that people start the video and then do not finish it. Unless interrupted of course. This is great stuff!
I'm happy you liked it!
I feel Nvidia's decision to max out the 4090 is based upon their belief that OC users are a big part of their top-of-the-line customers.
I made it that far :) interesting video, thank you
I reckon at some point they will release an unbinned version of the GPU as the 4090 Ti once they improve their defective chip %
Fully agree! I've talked about it more in my RTX 4070 & 4090 Ti video: ua-cam.com/video/BpM6zcusweY/v-deo.html
Made it to 13:42… gonna watch it all
We got bamboozled. TU102 was significantly bigger than AD102 but AD102 cost significantly more.
Driver improvements could probably make use of that extra TDP
Very much excited for RDNA3, I see no competitor from AMD, to the 4090, but if they can price their GPUs right, they can win big
I have a 4090 FE. I for one am glad that they gave us 450w and up to 600w with the proper PSU connections. The 4090 is an ultra enthusiast class card. Ultra enthusiasts like to tweak and play with their hardware and see what it's capable of. I would take a power limit slider and 600w capability even if it's hot and super inefficient and have the option to limit it myself, or push it hard to see what it can do 10 times out of 10 over Nvidia bios limiting it to 350 watts and forcing us to have to flash the vbios or worse, do some kind of hardware mod. I think the fact that it is so efficient and can easily just be power limited to 350w while losing practically no performance is just a bonus.
I am watching the full video. You are a genius YouTuber. Best and most honest tech YouTuber
I would love more videos like this 💝 detailed yet simple words ✨
wish u all the success 💖🏆
My guess is that the area is 430mm² and that it's 75% larger than the GA chip.
Maybe the TDP is high because they’re gonna reuse the same board and cooler for 4090ti.
That was a great video, thank you 👌
With quality videos like this, I hope everyone watches till the end.
Thanks for the kind words!
Subbed! Great channel.
I agree that Nvidia could and should have released a much more attractive product by using a lot less power and being a LOT smaller.
It would have been an instant buy for me.
And the reason for all this seems to be the monster RDNA3, we'll see.
Thanks for the video, very interesting
Impressive analysis.
Yep, I lowered my power limit to 80%.. saving power and hitting 3 GHz.. around 350 watts, and it stays lower most of the time... my case is pretty good at cooling, it has 3 nice-sized fans at the bottom blowing right on the card btw... 4090 FE.
Turing's top die has around 3000+ CUDA cores. Not sure about the 1080 Ti, but I know it is similar to the 2070 Super at 2560 CUDA cores.
I have a 4090, and I overclocked it, resulting in about a 10% improvement in render times. This is huge when considering how long some renders might take.
What program are you using for rendering?
@@HighYield Octane Render. The improved raytracing performance has decreased render times by 2x compared to 3090Ti; unlike gaming, this engine sees a much bigger leap in performance. You can go and check out Octanebench results and exclude multi-GPU setups to see the scores; you can also set it to display average or maximum scores.
Good video!
9bn transistors are disabled, which means a 4090 Ti would be very interesting, because there is a lot of room to improve, unlike older generations
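A back-of-the-envelope version of that estimate, assuming the full AD102 has 144 SMs and the RTX 4090 enables 128 of them. The proportional split of transistors across SMs is a deliberate simplification introduced here: cache, memory controllers and other blocks are not tied to the SM count.

```python
FULL_SMS = 144               # SMs on a full AD102 die
RTX_4090_SMS = 128           # SMs enabled on the RTX 4090
TOTAL_TRANSISTORS_BN = 76.3  # commonly quoted AD102 transistor count

enabled_fraction = RTX_4090_SMS / FULL_SMS
# Naive estimate: assume transistors are spread evenly across SMs.
disabled_bn = TOTAL_TRANSISTORS_BN * (1.0 - enabled_fraction)

print(f"enabled SMs: {enabled_fraction:.1%}")
print(f"naively disabled transistors: {disabled_bn:.1f} bn")
```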
watched to the end 😀
Can they make a 700 mm²+ die on the 4nm node? If yes, would this mean more transistors?
I've been very curious to better understand this binning business. Would the same chip go into a 4080 as a 4090, with the difference being how many "bad" transistors were found? Love the vids, tech geek nerds all the way!!!!
I don't have to move the power slider. It is anyway using less power in the games I play.
I was going to guess about 40% bigger.
The full die could have a decent 10 to 12% uplift at 10% lower power than the 4090. Up to 20% stronger if Nvidia lets it pull 600 watts.
Very interesting. Can't wait to see your take on Radeon's 7000 offering
How does it relate to their deep learning accelerator?
Do they use the same chip with other features activated or a completely different chip?
Nvidia has different architectures, for AI acceleration they have "Hopper" instead of "Ada Lovelace". Parts of the AD102 chip use tensor cores (mainly to accelerate DLSS), which is also used on Hopper for AI stuff (like ChatGPT). So its related, but Nvidia has a different chip with a much larger AI focus.
Well that was informative
Still here!
I wonder if they could bring back the multiple gpus on one card approach like their older 90 series cards used
13:40 I'm here.
Nvidia 4N is a custom optimized N5 node (5nm). The N in 4N is Nvidia.
It doesn't use 4nm which is N4
N4 is also just an optimized 5nm node, TSMC doesn't have a "real" 4nm process node.
300%. But I knew that since I looked it up a couple weeks ago and cross referenced it a few times since I didn't believe it at first.
I thought it was a mistake when I first saw the number :p
I'm curious to see how far the 5000 series cards will go. Will the RTX 5090 run Cyberpunk 2077 at a full 4K (no DLSS) at 120fps+?