Deep-dive into the technology of AMD's MI300

  • Published Nov 21, 2024

COMMENTS • 304

  • @HighYield
    @HighYield  Рік тому +40

    Do you think all five technologies will "trickle down" into the consumer market? Will we buy a single "gaming SoC" at some point in the future?

    • @ChristianHowell
      @ChristianHowell Рік тому +6

      We're almost there... I have a mini PC with a 6900HX and I can play EVERY game at 720p... When Strix Point comes out next year I think we'll be at 1080p60 using FSR3...
      I haven't seen any benches with 7x40 APUs but they should be 50% faster than the 6900HX... Especially if they can get 30% higher clocks...
      And then Strix should be 30-40% faster than that...

    • @closerlookcrime
      @closerlookcrime Рік тому +1

      @@ChristianHowell Those system on a chip machines are really cool. I hope they make a bunch of improvements over the next 5 years.

    • @boredgunner
      @boredgunner Рік тому +4

      @@ChristianHowell Those performance numbers are good for something like a portable system (Deck) or standalone VR headset

    • @FfbeEXVIUS
      @FfbeEXVIUS Рік тому

      No: heat dissipation, leakage, memory bandwidth, and latency issues. Just look at modern GPUs from Nvidia and AMD, way too costly.
      Physics isn't magic.

    • @hristobotev9726
      @hristobotev9726 Рік тому

      Probably Zen 6, 2026: 8+8 cores + a 24-32 WGP GPU chiplet, on the bottom IO + 128MB cache and 2-channel DDR6 memory. All for $500-600.

  • @SirMo
    @SirMo Рік тому +173

    One correction I would mention: the latest version of PyTorch, PyTorch 2.0, is moving away from CUDA. For ever-increasing AI models, CUDA on its own can't optimize at that scale; the frameworks themselves have to be optimization-aware. This is why ML frameworks are shifting from eager mode to graph mode, which sidesteps CUDA (cuDNN) and provides better performance. Instead of using CUDA, they will use tools like Triton (this is what OpenAI's ChatGPT uses), which interfaces directly with Nvidia's NVPTX and AMD's LLVM AMDGPU backends. So CUDA is on its way out, and with it Nvidia's software moat. MI300 will be a monster.
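
    For readers who haven't used PyTorch 2.x, here is a minimal sketch of the eager-to-graph shift described above (assuming PyTorch 2.x; on supported GPUs the default inductor backend lowers the captured graph to Triton-generated kernels):

    ```python
    # Minimal sketch of PyTorch 2.x "graph mode": torch.compile() captures the model
    # as a graph and compiles it, instead of dispatching each op eagerly.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)

    compiled = torch.compile(model)          # graph capture + backend compilation

    x = torch.randn(8, 1024, device=device)
    with torch.no_grad():
        y_eager = model(x)                   # eager mode: op-by-op dispatch
        y_graph = compiled(x)                # compiled graph, fused kernels

    print(torch.allclose(y_eager, y_graph, atol=1e-5))
    ```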

    • @HighYield
      @HighYield  Рік тому +51

      I'm not up to date with all the AI/ML software. Maybe this is AMD's chance to catch up to Nvidia?

    • @SirMo
      @SirMo Рік тому +39

      @@HighYield I believe so. There is a big push at AMD to improve their AI offerings. MI250 was kind of a test, with limited matrix multiplication units, and built more for pure scientific HPC applications (like the ones used by Oak Ridge's supercomputers). MI300 will focus on AI. And I do believe it's going to be very competitive with Nvidia.

    • @benjaminoechsli1941
      @benjaminoechsli1941 Рік тому +18

      ​@@SirMo Excellent. AMD has been lacking in that market, and some competition against Nvidia's monopoly is welcome. I know they've had their CUDA compatibility layer for ages, but a proper alternative is always best.

    • @Shubhamrawat364
      @Shubhamrawat364 Рік тому

      Another blind fanboy drinking the AMD Kool-Aid. AMD has zero chance in AI because Nvidia is light years ahead with a full-stack ecosystem and partnerships. All AMD can do is make chips, but that won't take them anywhere; it's all about the ecosystem and the advantage of scale, where Nvidia absolutely dominates. Poor AMD was only able to catch up to Intel because Intel fell behind TSMC in process node technology and AMD used TSMC. But with Nvidia there are no low-hanging fruits, because unlike Intel, Nvidia is not sitting on its laurels; they are moving ahead at light speed in AI, and here AMD is just warming up.

    • @closerlookcrime
      @closerlookcrime Рік тому +2

      You think so? I am learning CUDA and working on 12.1. I am just getting started... I just want to do some small-scale work to learn this new tech. There are a ton of processors on the market for cheap that I would have been thrilled to have just 5 years ago. Did you see CUDA has a new communication system, not only under development but already being used, that reduces the I/O bottleneck? I hope AMD does something great though, to keep the competition going. Anyway, gotta run.

  • @lefthornet
    @lefthornet Рік тому +78

    This may be the first step of the coming war between AMD and Nvidia. I'm waiting for Intel to react, but the advances AMD is making are huge.

    • @beachslap7359
      @beachslap7359 Рік тому +6

      They don't have enough supply to compete with nvidia no matter how competitive their architecture is.

    • @coladict
      @coladict Рік тому +5

      @@beachslap7359 If Nvidia doesn't have chiplets with Blackwell, they're screwed. It's known they're working on it, but what they themselves don't know is if they'll get it done on time.

    • @SirMo
      @SirMo Рік тому +11

      @@beachslap7359 AMD produces more chips than Nvidia. They have all the supply they need.

    • @chrisvicera6696
      @chrisvicera6696 Рік тому +4

      @@beachslap7359 AMD has the supply, they just don’t have the demand or market share. If there was proper demand, they would shift wafer capacity from their Ryzen and gaming gpus

    • @beachslap7359
      @beachslap7359 Рік тому +6

      @@coladict They have a far superior architecture right now compared to AMD. Why would that change for the next generation just because they don't have chiplets? Especially considering the node jump is going to be slim for both companies.

  • @hanspeter24
    @hanspeter24 Рік тому +36

    Your videos are amazing and I learn so much new, interesting information. Even though I don't understand everything, it's still rewarding to watch you explain and to develop my own knowledge just as you did. Thanks for that, and greetings from Hamburg, Germany :)

  • @jelipebands1700
    @jelipebands1700 Рік тому +15

    Awesome job covering the MI300. It's so impressive and beyond anything ever made that no one knows how to cover it or even talk about it. The technology is going to make it into gaming consoles. I predict next gen will be so integrated that the thought of adding RAM and separate cards to a PC will feel ancient.

  • @earllemongrab7960
    @earllemongrab7960 Рік тому +27

    1. Excellent video. I don't follow the data center innovations that closely, I'm more of a desktop gaming guy, so this video was absolutely fascinating to me. Well explained, well segmented. And it's exciting to think about what this will mean for the desktop for the upcoming decade.
    2. Before you introduced the 5 new technologies, I paused the video and gave it a quick think of myself. I basically came up with the same categories. Except I came up with "heterogeneous design". In my head that was something that takes the SoC and disaggregates it into chiplets but also includes mixing process nodes and possibly chiplets made by different vendors / foundries. We're not quite there yet. But in my head mixing 5 with 6 nanometers is a part of it. So I basically mushed your "SoC" and "Packaging" category together and added a bit of my own flavor.
    3. The classic 'von Neumann' architecture on PC can't keep up anymore. We see this with the consoles, how a smaller, much cheaper design can yield incredible performance. Mid to high-end PCs that cost 3 times more struggle to play the latest console game ports. This is ridiculous; something's got to change. I'm curious what a next-gen PC architecture will look like. Will we still have a modular design, what will cooling look like, and will manufacturers be able to agree on a standard in time, before consoles make the PC look even more boomer than it already looks to some people?
    4. Exciting times ahead.

    • @HighYield
      @HighYield  Рік тому +5

      It doesn't really matter what you call it. Remember when AMD's company slogan was "the future is fusion"? It's kind of ironic that now that they've achieved this complete "fusion", it's not their slogan anymore. But the technology has been a long time coming. Long before their Ryzen comeback, AMD was forced to innovate to stay alive. The design lead AMD currently has is a result of this.
      Fully agree, exciting times ahead!

  • @bartomiejpopielarz8283
    @bartomiejpopielarz8283 Рік тому +8

    I think you are spot on.
    Though I would say that another tech to look out for is in-chip fluid cooling. Heat is a huge problem, especially with 3D stacking. Efficient extraction of heat allows for higher frequencies and lower energy use (as heat increases resistance).

    • @HighYield
      @HighYield  Рік тому +2

      Next-gen cooling tech is definitely worth a future video :)

  • @RobBCactive
    @RobBCactive Рік тому +9

    Incidentally, when AMD was buying ATI for Radeon, the "Fusion" idea of not just seamless GPU FP compute but a unified address space was used as justification.
    It's been over a decade, but at last this is becoming more feasible.
    Not just MI300 but SAM and recent DX12 extensions are aimed at shared address space.

    • @HighYield
      @HighYield  Рік тому +8

      From "the future is fusion" to the actual "fusion future". AMD's slogan back then was more than just marketing, it was a promise.

    • @peterconnell2496
      @peterconnell2496 Рік тому

      Exactly, and the sale was formalized in 2006, almost a generation ago.

    • @RobBCactive
      @RobBCactive Рік тому

      @@peterconnell2496 the bottom line was the APU wasn't that feasible, but now we see Intel & AMD strengthening gfx, while Nvidia are launching ARM CPUs for the data centre.
      So this memory unification is the direction for high performance.
      The old APUs were too compromised by cost limits, restricted memory bus, cache; despite AMD having a transparent NUMA in their server CPUs, they didn't have the investment funds to realize the possibilities.
      A sceptical observer suspects the justifications were rationalisation for a panic reaction to CUDA.
      IIRC ATI were in trouble, AMD wanted a slice of GPGPU, so the Fusion concept of another VLSI step was born.
      The problem was Intel had successfully responded to AMD64 and the x2 chips with Core Duo, had bribed the key OEMs and had a voucher system of rebates with Intel Inside that small dealers relied on.
      So AMD were squeezed from both sides, not able to realise the profit from their real innovations and NOT having the financial muscle to buy an OpenCL counter to CUDA of sufficient quality & application support.
      AMD were following and trying to catch up; Intel had gone awry with P4/Itanic but commercial power kept them strong.
      Nvidia meanwhile reaped the rewards of the collapse of competition with their new main competitor having to divert funds away from future GPU designs.

  • @HighYield
    @HighYield  Рік тому +10

    Something I wanted to add: since we don't know the packaging method yet, when I talk about the "interposer" it doesn't have to be a large silicon interposer, it might be a small "organic interposer" like on Navi31, using TSMC's "Fan-Out" technology. Once we know more, another video will follow!
    www.tsmc.com/english/dedicatedFoundry/technology/InFO

    • @ChristianHowell
      @ChristianHowell Рік тому

      IIRC, TSMC is using CoWoS and TSVs with the caches like X3D... They renamed it all to 3D Fabric for the whole infrastructure... AMD is at the cutting edge of silicon right now... They had to sell their own fabs and came out even better in the long run... All their 5nm products haven't even come out yet and the 4/3nm Zen 5 ones will have even more instructions and accelerators...

    • @HighYield
      @HighYield  Рік тому

      @@ChristianHowell AFAIK, SoIC is also a Chip-on-Wafer technology.

    • @ChristianHowell
      @ChristianHowell Рік тому

      @@HighYield Yeah, they folded it all into 3D Fabric terminology... But the basis is CoWoS because it lets them package huge chips like MI300...

    • @Cooe.
      @Cooe. 11 місяців тому +1

      It's 100% confirmed to be a standard, absolutely massive silicon interposer (as in it pushes TSMC's maximum reticle size right up to the limit). 🤷
      AMD were originally planning to do what MI250X did to connect its HBM, which is to use TSMC's "CoWoS-R" packaging: a hybrid of a standard interposer a la MI300 and fan-out wiring a la Navi 31, using a small silicon bridge plus fan-out technology. (Think Intel's EMIB, but instead of the tiny silicon mini-interposer bridges being directly attached on both ends with die-to-die TSVs [which TSMC can now do too, called "CoWoS-L"], they're connected with less dense fan-out wiring [although still plenty dense enough for HBM3].)
      But then, about as late as they possibly could, AMD had to switch to a traditional, gargantuan silicon interposer (i.e. TSMC's "CoWoS-S" packaging) due to reliability concerns, thanks to the massive package flexing/warping when kept at its full 750W load for seriously extended periods.
      Basically, with MI300 having a much larger overall package than even MI250X, the tiny silicon bridges connecting the HBM stacks were simply much more prone to failure than a single massive contiguous silicon interposer across the entire package under the kind of "100% load, 24/7/365" conditions that are endemic in HPC/data center land. 🤷

    • @Cooe.
      @Cooe. 11 місяців тому

      @@ChristianHowell It's CoWoS, sure, but TSMC has something like ten completely different technologies under that single banner, making it a mostly meaningless label on its own beyond telling you the product is multi-chip. 🤷 MI300 uses a classic single massive (literally reticle-limit-pushing) contiguous silicon interposer base layer (just like in, say, Vega), which in TSMC's marketing speak is their "CoWoS-S" (silicon interposer) packaging technology.
      For example, CoWoS-S/a traditional large silicon interposer is an entirely different packaging tech from what's in MI300's predecessor, MI250X, which is also CoWoS but uses "CoWoS-R"!
      ("CoWoS-R" is a hybrid between Intel's EMIB-style tiny silicon interposer bridges [which TSMC also now has a proper version of, called "CoWoS-L"] and the ultra-dense fan-out wiring tech used by Navi 31/32 and Apple's M-series Ultra SKUs. Basically think EMIB-style silicon bridges, but connected on both sides with less dense, yet still plenty fast for HBM3, fan-out wiring versus EMIB's/CoWoS-L's direct chip-to-chip TSVs.
      They didn't end up using CoWoS-R again on MI300 like they'd originally planned because of reliability issues with the tiny silicon bridges at its full 750W load for truly extended periods, as is the standard HPC/data center usage environment.)

  • @ChristianHowell
    @ChristianHowell Рік тому +5

    MI300 is something I've been waiting for since I saw the initial HPC chiplet APU patent... The interesting thing is that older CPUs used to remove functions from the CPU die because of limited transistors at 180nm etc...
    But the biggest thing about it is that I heard some autonomous driving folks say that 2000 TOPs are needed for an FSD experience and since MI250X has 383 TOPS, 8X that is almost 3500 TOPS... AMD can now theoretically provide all the chips for automobiles out of nowhere it seems (NOT!!!)... They can use an edge like appliance with a Pensando front end for network relays for traffic and weather, etc. for a LARGE MAP area, while an upcoming Ryzen APU can do the entire system, including 4K video and gaming... Companies are selling mini PCs with Ryzen and Radeon 6000 that can do 4K30+... Zen4 telco servers can do edge processing while EPYC can stream games and all types of data including AI for predictive routing...

  • @WXSTANG
    @WXSTANG 7 місяців тому +2

    AMD is way ahead of the curve vs the competition. They just need someone to market the tech better. It's a true heterogeneous system and it gets better and better every year. Now AMD is sharing GDDR between the CPU / GPU and other AI accelerators.

  • @NootNoot.
    @NootNoot. Рік тому +6

    The packaging/chiplet design is quite brilliant (speaking of, gratz on the sponsorship)! One day, hopefully, we'll see all of these techniques trickle down to desktop/consumers! The Zettascale strategy is interesting because it pulls you into real-world limitations, that is physics, which will inevitably halt performance if we don't invest in new techniques. Like with 3D V-Cache: although it is a great solution for more L3$, there are still thermal limits. AMD investing in R&D is a long-term goal. And investing in and brute-forcing today's technologies, like monolithic designs, will prove unreasonable in the near future.

    • @HighYield
      @HighYield  Рік тому +3

      With X3D CPUs and Navi31, AMD is really aggressive in using their top technology for consumer products, though as you said, it's always "trickle down". Zen 2 chiplets were designed for servers, not desktop, same with 3D V-Cache. But they work great for desktop too.

  • @peterdoyle8571
    @peterdoyle8571 Рік тому +1

    1) LightMatter wafer scale optical interconnect
    2) Ultraram replacing most chiplet cache, HBM, DRAM, and NVM
    3) Accelerators on chip/package
    4) Combining CPU, GPU, FPGA on package
    5) Backside power delivery
    6) VTFET
    7) Deep trench capacitor on wafer with direct 3D bonding integration
    8) Glass based motherboards with integrated photonics, power delivery, and microfluidic cooling

  • @BrandonMeyer1641
    @BrandonMeyer1641 Рік тому +3

    Fugaku with the Fujitsu A64FX walked so El Capitan and MI300 could run. Seriously.

  • @adoredtv
    @adoredtv Рік тому +5

    Great video, cheers!

  • @closerlookcrime
    @closerlookcrime Рік тому +1

    I hope all manufacturers start stacking chips and putting them in our desktops. Gonna get some cool stuff soon.

  • @OneAngrehCat
    @OneAngrehCat Рік тому +1

    To me, the most memorable part of the keynote in the entire Zettascale race was logic on memory.
    I can't really imagine just how much you could realistically put on RAM, probably only basic math operations, as anything too complex would probably be too costly.
    But if you can even just do basic math, even just add/subtract/jump, it'd be a true revolution. So many basic operations would be offloaded from the CPU and live in the RAM. The CPU would just have to send the request, and that would seriously cut down transfers. You could go from 20 transfers and operations down to something like one CPU -> RAM transfer, the operations done by the RAM, and then a RAM -> CPU transfer when done, to send back the updated data the CPU wanted.
    It's truly revolutionary in speed and efficiency. How costly/plausible... don't know. But I find it to be the most impressive thing.
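
    A toy sketch of the arithmetic above (purely illustrative, not real hardware): counting bus crossings for a read-modify-write of ten values done CPU-side versus a single command issued to hypothetical memory-side logic.

    ```python
    # Toy model of "logic on memory": how many transfers cross the memory bus
    # when updating N values, conventionally vs. with in-memory compute.

    def cpu_side_updates(n_values: int) -> int:
        # Each value is read from RAM to the CPU, then the result is written back.
        return n_values * 2

    def memory_side_updates(n_values: int) -> int:
        # One command sent to the RAM's logic, one completion/result back;
        # the actual adds happen next to the data.
        return 2

    n = 10
    print("CPU-side transfers:   ", cpu_side_updates(n))     # 20
    print("memory-side transfers:", memory_side_updates(n))  # 2
    ```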

  • @craftspro
    @craftspro 5 місяців тому

    Such a great channel and an amazing video explanation. Even big YouTubers like Linus Tech Tips don't explain chip design like this. Very underrated channel.

  • @u-def
    @u-def Рік тому +1

    I feel this is the best ad transition ever. It kinda convinced me to learn something on Brilliant 😂

    • @HighYield
      @HighYield  Рік тому +1

      I get lots of sponsorship offers, but I only take the ones I actually think are really useful and Brilliant is definitely useful.

  • @wayofflow955
    @wayofflow955 Рік тому +3

    Interconnects are the key to the new age of 3D-stacked chips, I think. We will get to a point where the processor is not 2D but more like a solid cube, with semiconductor all the way through.

    • @HighYield
      @HighYield  Рік тому

      And if you come up with a good way of cooling this cube, you could be the next Bill Gates!

    • @franchocou
      @franchocou Рік тому

      I want a solid cube of GaN

  • @igavinwood
    @igavinwood Рік тому +2

    New sub. Thank you for highlighting the tech along with the exciting and somewhat scary aspects of modern computing. This vid is a good indicator of how fast the computing field has moved in recent years. I expect that the AI at work today is being used to create the next generation of SoCs and AI, something of a self-fulfilling prophecy; thus the technical questions on the road to zettascale are likely to be answered in the next 15 years. The bigger question, however, is the impact it has, a discussion that has yet to really reach the larger population.

  • @El.Duder-ino
    @El.Duder-ino Рік тому +2

    Excellent vid, thank u for making it!

    • @HighYield
      @HighYield  Рік тому

      Thanks for watching :)

    • @El.Duder-ino
      @El.Duder-ino 10 місяців тому

      @@HighYield 110% agree with you that MI300 is a prototype of the chip we will see in the future, also in the consumer market, with lots of 3D-stacked L3/L4 V-Cache and possibly even some HBM memory. I would say next-gen consoles, maybe even the PS5 Pro, will utilize V-Cache because of its huge benefit for gaming.
      In the enterprise segment, chips similar to MI300 will also expand to the next level, not just on the points you mentioned (especially 3D stacking & packaging) but mostly through chip/chiplet customization. Companies will be able to fully customize (for an extra price, of course) what kind of chiplets/accelerators they get in their chips. It won't be surprising to see future MIxxx AMD chips with a Xilinx FPGA or, for example, a Tenstorrent chiplet.

  • @zetaDirective
    @zetaDirective 11 місяців тому +1

    The most fun part there is the 3d stacked L3 CPU cache. A-freaking-some!

  • @APHRODIZZYAC
    @APHRODIZZYAC Рік тому +2

    Excellent video, really enjoyed it. I'm not an expert by any means, but I do think once we are able to produce graphene transistors at scale, both speed and efficiency are going to make a gigantic leap forward; combine that with optical data transfer off-chip and the leap will be incredible.

  • @lasbrujazz
    @lasbrujazz Рік тому +2

    And with the chiplet design, AMD can scale their products far more easily than the competition. As shown last week, MI300 has 2 variants: "A" with 6 GCDs and 3 CCDs, and "X" with the CCDs replaced entirely by GCDs, making it GPU-only. This modularity is going to please all kinds of customers.

  • @TrueThanny
    @TrueThanny Рік тому +3

    The eight extra "dies" are probably what I would call fillers, not spacers. I don't think they are there to provide any structural purpose, but simply to take up space that would otherwise have to be taken up by the resin used to make the whole package flat.
    As for the stacking method, I can only make a wild guess. The bottom dies are apparently mostly for I/O, so they could be quite large, given the lack of scaling with I/O. They would then also have a lot of dead space, making TSV's an easy option on that side of the equation. If I had to guess, I'd say they connect to the upper chiplets using the normal connection points those chiplets would have if they sat directly on a package. Meaning the bottom dies replace the substrate as far as the stacked dies are concerned, and use TSV's as needed to connect to the actual substrate for reaching the socket pins. That would make the overall design chiplets on top of active interposers, on top of a passive interposer with HBM at the sides.

    • @Cooe.
      @Cooe. 11 місяців тому

      Wrong. According to both AMD and TSMC, they are officially and explicitly there to maintain structural integrity across the massive chip package, just like OP claimed.
      If it were just a massive valley of pure in-fill material placed over those areas in-between the HBM3 stacks instead of hard silicon, it would have allowed significantly more flex and warping under the package's full 750W power/thermal load, which could easily break the ridiculously fragile ~1100mm² active silicon interposer everything sits on top of. 🤷
      (Not break as in "crack it" or anything, but rather cause some of the countless microscopic TSVs ["Through-Silicon Vias"] connecting the massive interposer to all of the various chips on top of it to become disconnected.)

    • @Cooe.
      @Cooe. 11 місяців тому

      And the bottom dies sit on top of a massive package size active silicon interposer, NOT on the package substrate itself! The gargantuan silicon interposer underneath all the active chips above is what's connected to the actual package substrate.

  • @CharlieboyK
    @CharlieboyK Рік тому +1

    Great video about the technology. AMD has certainly embraced chiplets well. The earlier acquisition of Xilinx ensures that they have the best substrate and packaging technology, as Xilinx is considered the best in this area.
    AMD can make custom SoCs to target high-end AI companies. Exciting times ahead for this technology.

  • @6SoulHunter9
    @6SoulHunter9 Рік тому

    I noticed that I've started to anticipate your videos! When I have free time it's the first thing that I look for.
    There are lots of technology channels, but many have filler and content oriented toward entertainment, which isn't bad, but I am a huge nerd and I enjoy this channel more.
    Keep going!

    • @HighYield
      @HighYield  Рік тому +1

      Not stopping anytime soon, but I've had this on-and-off cough for almost a month now, which makes recording videos harder. I will try to get back to my original "once a week" schedule, or at least not keep the current "once every 2-3 weeks" :X

    • @6SoulHunter9
      @6SoulHunter9 Рік тому

      @@HighYield Don't worry! I view your videos for free, so I cannot complain, and I'd rather have a few good-quality ones than stream-of-consciousness videos.

  • @forrestnorrod1547
    @forrestnorrod1547 Рік тому +2

    Very good video. Great analysis and insights on what AMD is doing and why.

    • @HighYield
      @HighYield  Рік тому

      Thank you, good to hear you liked it!

  • @PunmasterSTP
    @PunmasterSTP 8 місяців тому +1

    MI300? More like "Magnificent technology, and they're going way ahead!" 👍

  • @closerlookcrime
    @closerlookcrime Рік тому

    You can get an IBM x3650 M4 with an Intel Xeon E5-2670, 2.6GHz base frequency, and set it to run at 3.3GHz in high-power mode. There are two silicon chips with 8 cores in each chip and 2 threads per core. The unit also comes with 64GB of memory at 1333MHz for $150. I then added a GTX 1660 Super with 6GB GDDR6, and 128GB of additional RAM. I love it. I put Ubuntu 22.04 LTS on it for the operating system. I picked up ten 900GB SAS drives, put 8 in the machine to start, and set them up with RAID 5. There are a ton of these servers on eBay and they can be obtained in a server or desktop-server package. Thought I would share. Good luck. Have fun.

  • @lordvass3377
    @lordvass3377 Рік тому +3

    And honestly, I don't think anyone can beat AMD in raw power at this point, and big tech corps are seeing that now.

    • @Kratochvil1989
      @Kratochvil1989 Рік тому +1

      Weeks ago I saw a stupid comment: "AMD is just all about brute force, that's useless." I saw that comment and I couldn't stop laughing. Hell yes, efficient brute force, best combination ever, and that random guy was just upset about it :D. It seems like AMD has some insanely talented engineers; it's great to see. If I am not mistaken, Apple thought about chiplets for a pretty long time many years ago and failed. Making this architecture a real thing was really an exceptional and kind of scary challenge.

  • @smactardian
    @smactardian 8 місяців тому

    The transition to GAAFET/MBCFET in near-future process nodes like 18A strongly indicates that process will still prove to be a driving factor in performance. Despite TSMC still using FinFET for 3nm, GAA may be feasible for the nodes below that.

  • @legobuildingsrewiew7538
    @legobuildingsrewiew7538 10 місяців тому

    This is so cool. I want this for my home lab.

  • @jet_mouse9507
    @jet_mouse9507 Рік тому +4

    With memory integrated onto the chip, the memory timing will probably be REALLY good! Plus, with a GPU at the performance level of dedicated GPUs on the same die, that's gotta be a huge performance increase. If nobody posts a YouTube video of gaming performance with FPS graphs on this thing, I will be VERY disappointed. Maybe Linus will do that.

    • @HighYield
      @HighYield  Рік тому +3

      While CDNA3 isn't really a gaming architecture like RDNA3, I'm sure you can run games on it. If you got the drivers working, MI300 would be a powerhouse!

    • @honkhonk8009
      @honkhonk8009 Рік тому

      GPUs are one of the few things I would rather not have directly integrated onto the CPU, because it destroys the modularity modern systems have.
      I like the idea of having memory integrated onto the GPU die.
      I think it's much wiser to just offload more and more things onto the GPU than to integrate the CPU and the GPU more closely.
      For gaming especially, the CPU shouldn't be doing much when it comes to rendering. It should mostly be dedicated hardware doing the job.

    • @raisofahri5797
      @raisofahri5797 11 місяців тому

      @@honkhonk8009 Sadly, the industry is moving toward unified systems.

  • @TheRandomguy06
    @TheRandomguy06 Рік тому +1

    You said in the video that you think in the future semiconductor packaging will be more important than the process node. Can you expand on that? Or maybe make that its own video?

    • @HighYield
      @HighYield  Рік тому +1

      The basic idea is that with process node progress slowing down, how you package/combine chips becomes much more important. I'm sure it will be a topic in future videos :)

  • @marce.fa28
    @marce.fa28 Рік тому +1

    You are amazing, thank you 😊

  • @moncyn1
    @moncyn1 Рік тому

    3:06 so many artifacts, must be the algorithm

  • @ZAGAN-OZ
    @ZAGAN-OZ Рік тому

    This gets me excited for RDNA4.

  • @cocosloan3748
    @cocosloan3748 Рік тому

    Love this YT channel . Excellent info -so educational 👍

    • @HighYield
      @HighYield  Рік тому +1

      Thank you! But it's always important to realize I'm not 100% correct on everything, especially when some details are still unknown.

    • @cocosloan3748
      @cocosloan3748 Рік тому

      @@HighYield Still - we learn so much about the technologies you talk about. TY 👍

  • @jihadrouani5525
    @jihadrouani5525 Рік тому +1

    Very informative videos, please keep up the awesome content =]

    • @HighYield
      @HighYield  Рік тому

      As long as ppl are watching I won't stop!

  • @VnikXum
    @VnikXum Рік тому +1

    Thank you for the video! I think changing chip materials from silicon to carbon (or something else) is an additional way to improve efficiency.

    • @kingkrrrraaaaaaaaaaaaaaaaa4527
      @kingkrrrraaaaaaaaaaaaaaaaa4527 Рік тому

      Do you mean carbon nanotubes or graphene when talking about carbon?

    • @VnikXum
      @VnikXum Рік тому

      @@kingkrrrraaaaaaaaaaaaaaaaa4527 Yes, maybe, probably a diamond structure instead of silicon.

  • @miyagiryota9238
    @miyagiryota9238 Рік тому +1

    Yea absolutely great presentation!

  • @anttimaki8188
    @anttimaki8188 Рік тому +2

    The genius part is making a sort of universal chiplet that can actually be used for more than one purpose. Whoever figured it out needs someone to go grab him a coffee.

    • @HighYield
      @HighYield  Рік тому

      Intel's Meteor Lake will also be chiplet-based (Intel calls them "tiles") and I'm sure sooner rather than later Apple & Nvidia will join the club.

    • @Cooe.
      @Cooe. 11 місяців тому

      There will even be a PCIe accelerator version that's basically half of an MI300X on a half-size interposer and thus package! (2x base tiles + 4x GCDs and HBM3 stacks vs MI300X's 4x/8x, so 152 CDNA 3 CUs w/ 96GB HBM3 vs MI300X's 304 CUs/192GB.) Think of it as the MI210 equivalent to MI300X's direct MI250X replacement.

  • @kirkseywysinger2112
    @kirkseywysinger2112 Рік тому

    Thank you for doing what you do. New fan here; really appreciate the content and your presentation.

  • @yanhao5703
    @yanhao5703 Рік тому

    Is the packaging option you mentioned, where the base chiplets are smaller than the silicon on top so that there can be a direct connection between the interposer and the silicon, similar to what Intel presented some time ago as Foveros Omni?

  • @adela5561
    @adela5561 Рік тому

    Great video and well explained. Thanks

  • @justinpatterson5291
    @justinpatterson5291 Рік тому

    Imagine a (professional) consumer-grade APU with 24+ Compute Units of RDNA 3-4, with 128MB of 3D V-Cache or on-die HBM. Filling a PCIe slot with other expensive silicon could become optional in some cases.

  • @vincentlzl921
    @vincentlzl921 Рік тому

    Over these past 5 years AMD has never ceased to amaze the market with better products.

  • @squeezedoz
    @squeezedoz Рік тому

    Fantastic summary!

  • @ttb1513
    @ttb1513 Рік тому

    @HighYield Great channel. If I may, please pay attention to how long you keep a slide on the screen, especially when text is flashed onto the slide at the last moment. Just a little thing.

    • @HighYield
      @HighYield  Рік тому

      Which slide are you referring to? A timestamp would help :)

    • @ttb1513
      @ttb1513 Рік тому

      @@HighYield Sure:
      2:19
      9:39 2.5/3D stacking difference: diff is "both (MEM/compute) are actually active". At 9:47 irrelevant fine print comes into view for 2s. Even tho that text is unimportant, it creates the impression that I’m missing something.
      9:50 "The 5nm MCD and CCD chiplets …". I couldn’t find MCD in chip diagram or "AMD MI300" text column at right.
      11:54 slide phase-in effect made text unreadable for 4s. Slide was on for 10s, but disappeared once motion of text froze. Combined, only once text froze was I ready to read, so reading felt very rushed.
      13:29 uses the same phase-in effect, but the slide stays longer.
      14:19 same
      17:15 4s
      The slide phase-in effect where the start of all sentences are not visible at first and text keeps moving but disappears once it stops moving makes a slow reader like me feel rushed. (Secret: especially for someone like me who likes to listen at faster speeds, but reads slowly. I understand it’s not possible to cater to making readable slides at faster speeds).
      The section on 2.5D vs 3D at 9:39 created confusion because the explanation of "in 3D, both chips are active" made me investigate the diagram more, and I couldn’t find MCD, so I was trying to find that or determine if you actually said GCD. The main point is that the explanation of both chips being active in 3D seemed off: the difference strikes me as more about how intensive the transistor switching density (and heat) is for the chips stacked, how tolerant the circuitry is of heat (DRAM, SRAM are finicky), and heat dissipation for the stack. After all, in HBM, all die in the stack can have dram banks active at the same time.
      Please take my original comment as a bit nit picky. I really like your channel and appreciate the time you put into such thorough content research and video production. I’d like to see your channel develop larger viewership.

  • @dralord1307
    @dralord1307 Рік тому +1

    Thanks for the interesting video

  • @GustavoNoronha
    @GustavoNoronha Рік тому +2

    Great video as always, I could do without the repetitive background music, though ;P

    • @HighYield
      @HighYield  Рік тому +2

      You mean no bg music at all or just change it up more? I feel like without bg music it's too "empty".

    • @GustavoNoronha
      @GustavoNoronha Рік тому +3

      @@HighYield I am quite sensitive to repetitive noise, so I am probably not a good reference. I think if you change it up it should be good, as I did not feel it wearing on me in the shorter videos. Maybe use the chapters to change it up or something like that? Still, the video was very worth it anyway =)

  • @kkgt6591
    @kkgt6591 Рік тому

    What is the 128GB of HBM for? Is it the RAM? Can customers upgrade for additional RAM?

    • @HighYield
      @HighYield  Рік тому +1

      Yes, that's the RAM. And it's not upgradeable, as it's HBM that sits on the package itself.
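
      For illustration, a minimal sketch of how that on-package HBM appears to software, simply as the accelerator's fixed device-memory pool (assuming a CUDA or ROCm build of PyTorch; ROCm devices are exposed through the same torch.cuda API):

      ```python
      import torch

      # The on-package HBM is reported as the device's total memory; there is no
      # separate, user-expandable VRAM pool to add to later.
      if torch.cuda.is_available():
          props = torch.cuda.get_device_properties(0)
          print(props.name)
          print(f"device memory: {props.total_memory / 2**30:.1f} GiB")  # the HBM pool on an MI300-class part
      else:
          print("no accelerator visible to PyTorch")
      ```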

  • @JxcksonSF
    @JxcksonSF Рік тому +1

    Is HBM still more expensive than GDDR?
    Will it return to gaming GPUs someday?

    • @HighYield
      @HighYield  Рік тому +1

      I think at some point HBM might have its comeback, but not anytime soon. GDDR7 will be next.

    • @julianfiacconi709
      @julianfiacconi709 Рік тому

      MCR DIMM is the future. Latest pursuit in memory tech, from what I’ve recently read.

  • @RealLifeTech187
    @RealLifeTech187 Рік тому

    Nice video 👍 Very interesting.

  • @markvietti
    @markvietti Рік тому +3

    AMD should no longer acknowledge Intel... as a kick in the face. Intel deserves it.

    • @HighYield
      @HighYield  Рік тому

      Meteor Lake will be Intel's make-or-break moment. If they can execute their version of a chiplet architecture well, they are back in the game.

  • @m_sedziwoj
    @m_sedziwoj Рік тому

    5:50 I would not agree; spacers would not be split in half, it's double the trouble to place them and keep the height in check.

  • @Artoooooor
    @Artoooooor Рік тому

    I want to have this monster in my desktop computer...

  • @samghost13
    @samghost13 Рік тому

    I really enjoy your videos. Please more : )

  • @GIANNHSPEIRAIAS
    @GIANNHSPEIRAIAS Рік тому

    Can you imagine if MI300 variants replaced the current Threadrippers?

  • @judehariot8076
    @judehariot8076 Рік тому +1

    Great coverage and analysis, but could you perhaps remove the 'fizzy' distracting background music please? Or would it be possible to upload different version of the video that has no background music?

    • @HighYield
      @HighYield  Рік тому

      Is it bg music in general or you just dont like the one I used for this video specifically?

    • @judehariot8076
      @judehariot8076 Рік тому

      @@HighYield It would be the one used in this video in particular. It's the quick drum percussion synonymous with college/high school football bands, but also commonly used in modern hip hop. It tends to be effective in drawing one's attention to the beat, away from what you are trying to say. I hope that's clear enough. Cheers.

  • @radicalrodriguez5912
    @radicalrodriguez5912 Рік тому

    Great engineering is usually simple.

  • @alb.1911
    @alb.1911 Рік тому +1

    Thank you

  • @jarenpocopio6033
    @jarenpocopio6033 Рік тому

    Can't wait to equip the Sandevistan prototype Mk 1.

    • @HighYield
      @HighYield  Рік тому

      Just wait another 54 years ;)

  • @npip99
    @npip99 11 місяців тому

    2:30 I think Unified Memory will really not be a thing. 32GB of RAM of the best-in-class DDR5 Hynix A-die is $80 at its cheapest. The top-of-the-line CPU and GPU combine to thousands of dollars. Unified Memory involves a lot of management for divvying up RAM between the two, which is likely to be slower than just giving the CPU and GPU their own separate RAM, especially given just how cheap RAM is.
    For data centers, Unified Memory for training AI models is important since we're talking terabytes of RAM. But at the moment even 16GB vs 32GB shows virtually no difference to consumers.

  • @jaynorwood2
    @jaynorwood2 4 місяці тому

    Intel's PVC GPUs have 16 compute tiles sitting on top of a base layer that has SRAM cache.

  • @joehopfield
    @joehopfield 2 місяці тому

    UMA makes data access *faster* (lower latency) as well as more efficient. Managing caches and external memory access was a huge burden for traditional multi-processors.
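
    As a rough illustration of the burden being removed, here is the explicit staging a discrete CPU+GPU setup needs today, which a truly unified address space would make unnecessary (a sketch, assuming a GPU-enabled PyTorch build):

    ```python
    # On a discrete setup, data must be explicitly copied into device memory
    # before the GPU can touch it, and results copied back. With CPU and GPU
    # sharing one memory pool, these hops largely disappear.
    import torch

    x = torch.randn(4096, 4096)                   # lives in host DRAM

    if torch.cuda.is_available():
        x_dev = x.to("cuda", non_blocking=True)   # explicit host -> device transfer
        y = (x_dev @ x_dev).sum()                 # compute runs out of device memory
        result = y.item()                         # device -> host transfer of the result
        print(result)
    ```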

  • @theevilmuppet
    @theevilmuppet Рік тому

    Wonderful video! May I offer one correction?
    The units for die size are square millimetres, not millimetres squared: en.wikipedia.org/wiki/Square_metre

    • @HighYield
      @HighYield  Рік тому +1

      Thanks for pointing that out. Honestly, I never thought there would be a difference. In German (my native language) it's also "square millimeters"; idk why I say it the other way around when I talk in English.

    • @theevilmuppet
      @theevilmuppet Рік тому

      @@HighYield because English is a terrible language!
      Many native English speakers who speak no other languages make the same mistake, saying "well, that's how it's written".
      By the way - your English is perfect!

  • @denvera1g1
    @denvera1g1 Рік тому

    When I suggested to Moore's Law is Dead, back when the first chiplet CPUs were announced (2019), that AMD would make a mega APU in a few years, he laughed and called me stupid.

    • @denvera1g1
      @denvera1g1 Рік тому

      To be fair to MLID, it was a superchat, so I couldn't clarify the timeline or the market, so he may have been thinking of a Threadripper APU in 2019 based on Zen 2 and CDNA/RDNA 1.

  • @theminer49erz
    @theminer49erz Рік тому

    I say APUs will be the next wave, as they are able to perform equal to or better than a console for less money. Small form factor, mid-range gaming PCs will be their replacement. I would like to see mobos using dual APUs and shared VRAM-style RAM for system and graphics. I really like where they are going. It's what I was hoping for, so I think it lines up with the console prediction: maybe one more "next gen" system from MS and Sony, then done... unless they start making pre-builts under the name, but they probably won't want to deal with that if there is no licensing revenue. Nintendo will probably keep making consoles though, I hope. Anyway, my same old ramblings. Thanks for the update!

  • @RRsalin
    @RRsalin Рік тому

    Maybe we shouldn't enter the zettascale era. Maybe we should use what we already have and fix climate change (thank you people of the internetz for not responding to this comment if you disagree)
    PS: great video.

  • @andersolsen1478
    @andersolsen1478 Рік тому

    It is not only that AMD has a better product than Nvidia, but they are also going to use open source software, which will be better and cheaper than Nvidia's software. It is a win-win for AMD. 🎉

  • @ericpickering2406
    @ericpickering2406 Рік тому

    Thanks!

  • @A-BYTE64
    @A-BYTE64 Рік тому

    140B transistors 😮😮😮😮😮

  • @josiahmoorhouse8036
    @josiahmoorhouse8036 Рік тому +3

    But can it run Crysis?

  • @RafaGmod
    @RafaGmod Рік тому

    The SoC design for HPC makes a lot of sense! I can see us normal people buying a PC or notebook and needing to upgrade its RAM because of greedy software. But HPC systems are normally specced for exactly what's needed and don't get many upgrades; if the whole system needs an upgrade, you normally change the platform.
    But okay, when will these chips show up on AliExpress on a sketchy motherboard at a low price? 2028? I can wait with my Zen 3, hahaha.

  • @hstrinzel
    @hstrinzel Рік тому

    Wow, I don't even know yet how to use my Raspberry Pi 4 to its fullest...

  • @RickBeacham
    @RickBeacham 2 місяці тому

    When will AMD sell these for the gaming PC market, or is this the next PlayStation?

  • @pavankumarreddy7888
    @pavankumarreddy7888 8 днів тому

    Thanos comparison was 😂😂😂

  • @MaxKrumholz
    @MaxKrumholz Рік тому

    AMD BEST

  • @MrHighway2000
    @MrHighway2000 Рік тому

    Will MI300 be able to access off-chip regular memory? Say 1TB DDR5 type.

    • @HighYield
      @HighYield  Рік тому

      That's a good question. 128GB of HBM3 is a lot, but modern servers have TBs of RAM per CPU. Could be possible.

  • @jackskalski3699
    @jackskalski3699 Рік тому

    Isn't an SoC kind of counter to chiplets?

    • @raisofahri5797
      @raisofahri5797 11 місяців тому

      An SoC is pretty much just a general name for a chip that has more than one component on it.

  • @NaumRusomarov
    @NaumRusomarov Рік тому

    What's the purpose of the CPU cores here?

    • @HighYield
      @HighYield  Рік тому +4

      So it can function as an APU and you don't spend energy transferring data between CPU and GPU over a motherboard. It's for maximum efficiency.

    • @samlebon9884
      @samlebon9884 Рік тому +3

      CPUs do the serial computing, CDNA cores do the parallel part. Together it's called heterogeneous computing.
      AMD has a nice paper on heterogeneous computing; just Google "AMD HSA paper" or something like that.

    • @SirMo
      @SirMo Рік тому +2

      GPUs are accelerators, only used to perform a specific set of operations. CPUs are still needed to feed the GPUs and run the actual programs. For instance, if you're doing AI training, you still need the CPU to parse and provide the data to the GPU, and then compile those results into a model.
      CPUs are really good at executing serial code, while GPUs are good at executing highly parallel code. You need both. And AMD is the only company that can provide best-of-breed CPUs and GPUs.
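
      A minimal sketch of that division of labor (hypothetical toy data and model, assuming a GPU-enabled PyTorch build): the CPU-side loop batches and stages the data, the GPU does the parallel math, and the scalar result comes back to the CPU.

      ```python
      import torch
      from torch.utils.data import DataLoader, TensorDataset

      device = "cuda" if torch.cuda.is_available() else "cpu"

      data = TensorDataset(torch.randn(1024, 256), torch.randn(1024, 1))
      loader = DataLoader(data, batch_size=64, shuffle=True)   # CPU: parsing/batching (serial work)

      model = torch.nn.Linear(256, 1).to(device)               # GPU: highly parallel math
      opt = torch.optim.SGD(model.parameters(), lr=1e-3)
      loss_fn = torch.nn.MSELoss()

      for x, y in loader:
          x, y = x.to(device), y.to(device)                    # CPU feeds the accelerator
          opt.zero_grad()
          loss = loss_fn(model(x), y)
          loss.backward()
          opt.step()

      print(f"final batch loss: {loss.item():.4f}")            # result gathered back on the CPU
      ```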

  • @seanjorgenson7251
    @seanjorgenson7251 10 місяців тому

    They are using Atomera's MST.

  • @ivesennightfall6779
    @ivesennightfall6779 9 місяців тому

    Can't wait to find out if this thing runs Doom.

  • @sgtnik4871
    @sgtnik4871 Рік тому

    Now bring it into next-gen consoles, call them "Pro Max" and charge $1.5k. I will happily buy them :D

    • @HighYield
      @HighYield  Рік тому

      I think MI300 will be more like $10,000+, but it would be an amazing console for sure :D

  • @NKG416
    @NKG416 Рік тому

    I do think the music is too loud

  • @burakozc3079
    @burakozc3079 Рік тому

    nice laptop chip. 👍🏿

    • @HighYield
      @HighYield  Рік тому

      With a 5 min battery life :D

    • @burakozc3079
      @burakozc3079 Рік тому

      @@HighYield Gaming laptops don't last much longer.

  • @Nobe_Oddy
    @Nobe_Oddy Рік тому +1

    I think PHOTONICS are THE FUTURE... Why continue pushing electrons around a board and chips, where every bit of distance requires more power? Its inefficiency, like you said, will require a nuclear power plant just to run a zettascale supercomputer and the building it's housed in... Photons (light) are so much better to do the work with (IMO), although we do not have a processor that can compute using photons instead of electrons (I think)... But there has been some progress made in this field, just not very much, not enough to make a difference right now. We still use traditional electronic methods with some photonic pieces in the system currently... We NEED to make a 100% photonic computer to blow people's minds enough to fully pursue it, instead of using a mix of both systems...
    I'm hoping you've done a video on this already (I JUST came across your channel when this video was suggested to me, but this vid got me to sub... I really like the cut of your jib, sir! lol). If not, then I hope you can do one in the near future... There is just too much potential in photonics for it to keep being ignored :)

    • @HighYield
      @HighYield  Рік тому

      Photonics are definitely a path for the future, but we are quite a bit off in my opinion. No video on photonics yet, but it has been on my list for a while now!

    • @shanent5793
      @shanent5793 Рік тому

      Unless you mean to return to analog, application-specific optical Fourier transforms, I don't see why photons would be better to work with than electrons. Photons still have to interact with electrons to change their phase and direction, and the structures have to be much larger than a wavelength to overcome diffraction. Current wavelengths are in the tens or hundreds of nanometers, so the density would be lower.

  • @lamhkak47
    @lamhkak47 Рік тому

    I crave the FLOPS

  • @BurningDrake39
    @BurningDrake39 Рік тому +1

    Hope Linus manages to make a video on it, I want to see it run games.

  • @chrismurphy2769
    @chrismurphy2769 7 місяців тому

    I want one but I'll never be able to afford it

  • @picblick
    @picblick 10 місяців тому

    2:00
    "Apple has been a pioneer in this area..."
    No, no they have not. They licensed ARM just like thousands of companies before them. If they had integrated RISC-V, maybe you would have a point.
    What about Chromebooks? Many of those run on SoCs and they have been around for many years.
    Is somebody who shows up at some point after all the technologies exist a pioneer? Damn, I should do some pioneering.

  • @DB-nl9xw
    @DB-nl9xw Рік тому

    Will Nvidia produce a CPU?

  • @closerlookcrime
    @closerlookcrime Рік тому

    The stacking of chips was started by Tesla and is used in their cars. This one technique increases processing power and reduces power consumption just by doing that alone. Distance matters.

  • @AK-ox3mv
    @AK-ox3mv 2 місяці тому

    #6 different AI chips for different purposes

  • @gstormcz
    @gstormcz Рік тому

    I think AMD wants to either take over or share in the profits from all the main computer components (CPU, RAM, GPU); it would be logical from a business point of view, a simple basic reason.
    3D stacking technology is beyond my imagination, but an explanation from an expert of how it works can make it easy.
    As an amateur, I can imagine that if a single layer of a chip adds some mass over the substrate, then creating conducting areas on the bottom transistor layer reaching up into the vertical space would enable stacking another layer of transistors, functioning either as 2.5D or fully connected.
    Or rather, some kind of placement of another layer of substrate at single-transistor accuracy in the X and Y axes.
    The real way it is made needs explanation, if I haven't missed its concept in this already scientific publication on the upcoming product.

    • @HighYield
      @HighYield  Рік тому +1

      For me, the easiest way to imagine 3D stacking is with the use of TSVs, which are literally tiny copper wires drilled into the silicon that connect both chips. It's like a network of pipes.

  • @GraveUypo
    @GraveUypo Рік тому +1

    Eh... I don't want SoCs. That will just make everything more expensive to upgrade and less modular, and will remove competition and/or options from some markets (like RAM).

    • @hughjassstudios9688
      @hughjassstudios9688 Рік тому +2

      True, but at some point, the interconnect becomes the bottleneck unless we can make it optical instead of electrical

  • @realshompa
    @realshompa 11 місяців тому

    Are Nvidia and Apple the last big bastions of huge monolithic chips? Both of those companies seem to have the same attitude: throw money at the problem and let the consumer pay for it. Nvidia maxes out chip size for each node. Apple's M3, with 3 different chip designs, is reported to have cost a billion dollars to develop. Why did Apple not release a 4-chip M Ultra-Ultra as was rumored for the Mac Pro replacement? Pro machines with 192GB of memory are a no-go for many IT pros (and 256GB does not solve it either; 4 chips would at least bump it up to 512GB, compared to Intel's Mac Pro with 1.5TB).

    • @darrell857
      @darrell857 8 місяців тому

      Cerebras is banking on wafer-scale chips (the entire wafer is one chip), so it can work to have big chips with the right design. You have to be able to fuse off all the bad silicon and/or have enough binning and market margin/demand to make it work. I read that on the average Nvidia GPU die, 10-20% of the silicon is defective. The chiplet strategy is more economical, especially with a simple 2D interposer as on Ryzen/EPYC. The advanced packaging is much less economical; it would seem easier to do than EUV, but if there is any flaw you are throwing out a lot of chips (no fusing option at this stage). So it really depends on the packaging cost, capacity, and reliability as far as how it shakes out in the end.