Why AMD's Chiplets Work

  • Published 12 Jan 2025

COMMENTS • 411

  • @randalljones4370
    @randalljones4370 2 роки тому +50

    Back in the early 80's I was in a design team working on a Hardware Modeling Library at Mentor.
    Our device allowed 'chips' of up to 256 active pins (400 pins total) to be included in software simulations (pre-widespread use of VHDL, obviously). I designed the physical interface into the customers' 'chip' (among other things). It was very interesting to query packaging folks (from Intel, Fairchild, Wakefield, Brit-Telecom, Mercedes, etc.) on what their upper limit on pin count was... often I got a very cautious glance and
    ... "Well, how many can you give us?" as an answer.
    Many of them were only willing to talk about a more traditional hybrid-on-ceramic packaging. Whenever I turned to 3D packaging, I got a variety of answers, from "Nope, not for at least 5 years" to ... "Well, that depends on the vertical height we can have"
    [our device had 8 card-slots spaced 1.25 inches on center and had to account for a 0.125 inch thick controlled impedance PCB, and ZIF socketing.... either 4x64 pin, 2x 128 pin or 1x400 pin (256 active pins)].
    Just for fun, I'd ask if the full 12 inches was enough. The answer would be: "Of course, yes, but we still want to have up to 8 devices installed in the card cage... and 1.25 inches each is a bit tight," which I always interpreted as them wanting over 1 inch EACH for their concept of a 3D multi-chip interconnect including all cooling heatsinks, fans, etc.
    Our answer was "just pull one interface board (running 7, not 8 devices) and then you have 2.5 inches to work with... otherwise, buy a second HML".
    Some customers did not smile at that suggestion. At over 100 grand, this was not a cheap device, bitd...but you could run up to 4 boxes under one multi-unit license.
    Only one young engineer at an unnamed aerospace company did not flinch at the 1 inch headroom... I imagine THEY were the ones who had the most compact version of the 3D packaging at the time.
    BTW, this was the same set of informational "interviews" that forced us to go to 256 active pins.
    When we started this, we thought we could get away with "just" 128 active pins. Virtually EVERYONE told us to double it.
    Our engineering manager CRIED when he heard that... us design-grunts were cheering it on!! MOAR POWER is always good, right?
    Now, designing the backplane and send/receive data lines and phasing clocks to get insane state-transition control times, THAT was fun to do.
    Controlling crosstalk and race conditions in the PCB layout just about cost me my sanity, but I made it work and it remains THE QUIETEST system (as measured on a FCC testing facility at Mariposa) that I was ever associated with... and the biggest.
    It was an amazing box of rawk.
    Tho no one asked for it, getting up to 512 active pins for one model would have required a change in the way we transmitted/stored the data for vector-in, vectors-out, tri-state-data and timing analysis data.

    • @warpspeedscp
      @warpspeedscp Рік тому +3

      Man, this is just so cool. I'm glad you shared your story here.

  • @michaellau5329
    @michaellau5329 2 роки тому +148

    Great video! I taught a course on Advanced Packaging a number of years ago and it's nice to see the industry moving towards MCM/Chiplet designs and, now, stacking of chips. Perhaps a future video can be done on passive vs active interposers?

    • @feynstein1004
      @feynstein1004 2 роки тому +2

      You're lau? I don't think you're lau.

    • @michaellau5329
      @michaellau5329 2 роки тому

      @@feynstein1004 ?

    • @feynstein1004
      @feynstein1004 2 роки тому +2

      @@michaellau5329 It's a reference to this video:
      ua-cam.com/video/h1sCiXTlR8Q/v-deo.html
      Thank me later 😂😂

    • @theman3282
      @theman3282 2 роки тому +4

      the arm of cyborg arnold left behind at cyberdyne really helps this subject.

    • @ale895
      @ale895 2 роки тому

      I remember a professor saying that optimizing separated parts doesn't necessarily mean optimizing the full result when the parts are working together... so in general, integrating and considering the entire system is usually better. It is kind of crazy to me that we went from discrete chips, to almost fully integrated, and now back to somewhat discrete... But I guess some systems have become too complex for us to optimize in an integrated way nowadays.

  • @lil----lil
    @lil----lil 2 роки тому +401

    Intel joked that AMD just "glue" their chips together.
    Also Intel: Yep, we'll be using chiplet design soon.

    • @suit1337
      @suit1337 2 роки тому +43

      Intel glued CPUs together in the past - the Pentium II and III in slot versions had on-package cache, for example - and later the first Core generation had multiple "glued-together" CPUs too - like the Core 2 Extreme QX6700, the Core 2 Quad Q6600 or the Pentium D 945

    • @gblargg
      @gblargg 2 роки тому +14

      Of course the problem with stacking is heat dissipation. A cube of silicon won't be able to get the heat out of the center fast enough. I guess they'll go to fluid cooling of these cubes if they ever stack that deep.

    • @suntzu1409
      @suntzu1409 2 роки тому +3

      Not really.
      Intel's tiles are very different from AMD's chiplets.

    • @Megalomaniakaal
      @Megalomaniakaal 2 роки тому +11

      @@suntzu1409 EMIB, baby. But it's worth noting there are multiple ways to skin a cat, and each has its own pros and cons.

    • @name-ic3vo
      @name-ic3vo 2 роки тому +2

      That joke is on Intel. They weren't even able to use glue the right way when they glued the IHS to the PCB (for example on the i7 8700K), which caused terribly bad temperatures. I really don't think it would be a good idea for Intel to start gluing anything together again.

  • @foxfoxfoxfoxfoxfoxfoxfoxfoxfox
    @foxfoxfoxfoxfoxfoxfoxfoxfoxfox 2 роки тому +432

    And now Intel, AMD, ARM, and others are getting together to come up with an interconnect standard. We could see chiplets from different vendors on the same package. Imagine AMD cores alongside ARM and even FPGA fabric, tasked with handling different aspects of a system.

    • @kekkocheng
      @kekkocheng 2 роки тому +59

      This might also solve the licensing hell Intel created to prevent vendors from integrating both the ARM and x86 ISAs into one single die.

    • @dongshengdi773
      @dongshengdi773 2 роки тому +6

      @@kekkocheng Why not connect 7 billion brains to Make a more powerful computer ?

    • @dorinxtg
      @dorinxtg 2 роки тому +31

      It's already done. Check each Threadripper, EPYC and Ryzen Pro - each CPU has a small ARM CPU inside for the Platform Security Processor (PSP).

    • @liamness
      @liamness 2 роки тому +7

      Will be interesting to see what new solutions it makes possible. E.g. AMD has dominated the console space because they are one of the only players that has all the technology and IP required to produce a monolithic SoC that can provide a good experience for gaming. My hope is that eventually it allows manufacturers to shop around to some degree, combining whichever parts are available and best suited for their product. For instance I could imagine a high-end tablet or gaming handheld using a CPU from Qualcomm and a GPU from Nvidia, and using UCIe to avoid those two companies having to collaborate too closely or share sensitive IP.
      Of course you can already have a separate CPU and discrete GPU from different companies, but this complicates the cooling, and I suspect packaging everything together could have cost and latency advantages too.

    • @mohdk2299
      @mohdk2299 2 роки тому +5

      @@liamness Nvidia is not part of UCIe. It’s not possible (yet)

  • @johnmyviews3761
    @johnmyviews3761 2 роки тому +106

    As I understand it, the reverse happened when a Japanese calculator company wanted Intel to make a number of different chips for their calculators; however, a person at Intel decided to combine all the functions into the first CPU chip.

    • @graealex
      @graealex 2 роки тому +31

      Usually integration (aka "fewer components") is the way to save on money and complexity. Like it saves big time, in some cases multiple orders of magnitude. But he explained why currently the reverse is true, at least for highly complex CPUs.

    • @michaellau5329
      @michaellau5329 2 роки тому +17

      @@graealex 100%. It's a balance between silicon yield and costs versus post-silicon process costs. In this day and age, silicon costs are astronomical so it makes sense to invest in designs that reduce them.

    • @thunderb00m
      @thunderb00m 2 роки тому +10

      If I remember correctly Intel were making ASICs.
      The ASICs were fully integrated, whereas the CPUs were general purpose and needed additional components to function.
      Same thing here, monolithic CPUs are great with everything on one chip but if you can break it down, generalize the hardware and make it profitable then it will be a success.

  • @HypeLevels
    @HypeLevels 2 роки тому +24

    Quick tip for you: if you use OBS for recording, you can set your microphone audio to a different audio track than the desktop audio, making it possible to treat the audio in an external program and then import it back during the editing process. That way you can remove the breath sounds from your audio. It can be done by importing the audio into Audacity (first use your preferred video editor to export only the audio, as an MP3 or WAV for example), choosing a part of the audio where the breathing is noticeable, and going to Effect -> Noise Reduction -> Get Noise Profile. Then do CTRL + A, go to Effect -> Noise Reduction, adjust the settings (in your case use low settings, since the breathing is noticeable but not intense) and click "OK". After that, export the audio from Audacity and replace the audio of the video with the exported one in your preferred video editor :)
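
    If someone prefers to do that cleanup in code rather than clicking through Audacity, a rough equivalent is a simple spectral gate. This is only a sketch under stated assumptions: the file names are placeholders, the audio is assumed to be a mono WAV, and the 2x threshold is an arbitrary starting point.

        # Rough programmatic analogue of Audacity's Noise Reduction: estimate a
        # noise spectrum from a breath-only clip, then duck STFT bins below it.
        import numpy as np
        import soundfile as sf                       # pip install soundfile
        from scipy.signal import stft, istft

        voice, rate = sf.read("voiceover.wav")       # placeholder paths, mono WAV assumed
        breath, _ = sf.read("breath_only.wav")

        _, _, B = stft(breath, fs=rate, nperseg=1024)
        noise_profile = np.mean(np.abs(B), axis=1, keepdims=True)  # per-bin noise level

        _, _, V = stft(voice, fs=rate, nperseg=1024)
        gate = np.abs(V) > 2.0 * noise_profile       # assumed threshold factor
        cleaned_spec = V * np.where(gate, 1.0, 0.1)  # attenuate quiet bins, don't hard-mute
        _, cleaned = istft(cleaned_spec, fs=rate, nperseg=1024)

        sf.write("voiceover_cleaned.wav", cleaned, rate)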

    • @that_tabby
      @that_tabby 2 роки тому +2

      yeah dude the breathing is a bit distracting (although natural)

    • @ShienChannel
      @ShienChannel 2 роки тому

      same thing i did back in 2015 x) old times

    • @HypeLevels
      @HypeLevels 2 роки тому +2

      @@ShienChannel Yeah :) It's an old trick I learned from the Minecraft gameplay times back in 2014 when we all had shit microphones

    • @joonhshin
      @joonhshin 2 роки тому +3

      I didn't really notice his breathing before but now I can't un-hear it. Thanks.

    • @raylopez99
      @raylopez99 2 роки тому

      Maybe the breathing is for effect, like "breathless prose".

  • @優さん-n7m
    @優さん-n7m 2 роки тому +8

    I worked in the aerospace industry some time ago.
    They had a multi-chip module that contained an ASIC die which they had designed in house, and dies that were basically off the shelf: Ethernet PHY, flash, MRAM, and an oscillator. MCMs have been used there for quite some time, it seems. They did it because they had basically no choice.

  • @TrueThanny
    @TrueThanny 2 роки тому +23

    Computers started out with multi-chip designs. Central processing units weren't a single chip - they were several chips, all doing different functions. It was Intel's big innovation to pull all those functions together and put them onto a single piece of silicon. Not quite all (I/O and memory controllers remained external to the CPU for a long time), but close enough.
    The funny thing is that AMD is deliberately going backwards. They were the first to move the memory controller onto the CPU die, which provided a considerable performance boost, and now they've reversed that. You really can't stress enough how important a very high-speed interconnect is when doing such things.

    • @supercool_saiyan5670
      @supercool_saiyan5670 2 роки тому +3

      Looking through computing history, you will notice they do this a lot.

    • @RobBCactive
      @RobBCactive Рік тому

      VLSI integrated functions onto one die, lowering cost and improving power efficiency, while benefitting from Dennard scaling, which made shrinks smaller, cheaper and faster WITHOUT increasing thermal density.
      Now not only does I/O not scale, but neither does cache on current processes, and the energy density in high-performance logic has become a real limitation.
      Thus chiplets space out heat generation in many-core CPUs, while large local caches allow gangs of cores to work mostly with local data, and trips to main memory become so proportionately slow that performance-sensitive software does all it can to avoid unpredicted memory accesses.
      Not only is a 64c/128t part cheaper as chiplets, it would be impossible to market as a profitable product if built monolithically.
      Binning of chiplets, with selective disabling of cores, allows for far greater core efficiency or clock speed in premium models, because the chiplets can be chosen to suit a SKU.
      A monolithic die has to space out cores to reduce logic density, wasting expensive wafers, and cannot mix and match parts to meet SKU requirements.
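
      To put a number on the binning point above, here is a tiny Monte-Carlo sketch; the per-core defect probability and the SKU rules are illustrative assumptions, not AMD's real figures.

          import random

          random.seed(0)
          P_CORE_DEFECT = 0.05     # assumed chance that any single core is bad
          N_CHIPLETS = 100_000     # chiplets "manufactured" in the simulation

          bins = {"8-core SKU": 0, "6-core SKU": 0, "scrap": 0}
          for _ in range(N_CHIPLETS):
              good_cores = sum(random.random() > P_CORE_DEFECT for _ in range(8))
              if good_cores == 8:
                  bins["8-core SKU"] += 1
              elif good_cores >= 6:        # fuse off 1-2 cores, sell as a 6-core part
                  bins["6-core SKU"] += 1
              else:
                  bins["scrap"] += 1

          for sku, count in bins.items():
              print(f"{sku}: {100 * count / N_CHIPLETS:.1f}%")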

  • @peterconnell2496
    @peterconnell2496 2 роки тому +71

    What is missed is that AMD chiplets DONT just WORK. They overcome the downsides of familiar MCM and revolutionize computing from the stagnant Intel sinecure it was. They have been incredibly clever.
    They had one shot at avoiding oblivion - they needed a single design which cost competitively scaled to ~all markets.
    They didnt link existing cpuS, but designed a teamable processor core unit (4 core CCX) from the ground up.
    as others here say, the key downside is the power & lag used by inter core data transmissions over any distance, so this was the focus of the CCX design - each of the 4 cores had a direct hardware link to each of the other (adjacent) cores - this needed only 3 such ultra fast intra ccx links on each core.
    inter CCX links were lavishly and liberally cached to minimise lag.
    Having given Fabric the optimum hardware, they then put disproportionate focus on Fabric.
    They initially strategically yielded to Intel on the sexy IOPS at which monolithic ~wins (gaming), & gradually gained a flanking foot hold & then advantage based on raw power and cheap costs of harvesting - even at inferior process initially.
    So I have a problem with dismissing it as something Intel etc. can just whip up in the lab & marketing - BS.
    a/ it requires a ground up re-design of their entire processor range & discarding a lot of IP
    b/ its been over a decade of hard yards - validation, patents, ....
    Infinity Fabric is a huge moat for amd - its clear that Intel were caught utterly flat footed by it in 2017
    & have mostly just tried to ignore it since- not take it for the serious threat it is.

    • @niks660097
      @niks660097 Рік тому +3

      That's not everything either: before Zen 2, most interconnects between separate pieces of silicon were very slow, with huge latencies (hundreds of nanoseconds). AMD found a way to get a 512-bit-wide IF (for servers) and under 100 nanoseconds of latency by working with TSMC, which paved the way for others.

  • @HavokR505
    @HavokR505 2 роки тому +22

    the infinity fabric is the secret sauce. having a suitable interconnect for the fabric is what makes this possible for sure.

    • @monad_tcp
      @monad_tcp 2 роки тому +1

      The irony is that the QPU would work if they dropped their UMA. You could push data from any core to any adjacent one. Supercomputers with thousands of cores work like that, a core can't randomly address any other core, not without calling the "network".
      But then you would have the Cell architecture with their PPUs.
      Fabric is easy to program I guess, because it's just like a network switch.

  • @dgillies5420
    @dgillies5420 Рік тому +5

    The great thing about chiplets is the ability to mix different types of process tech in the same chiplet device. For example, the CPU (which is built on a process tuned for low capacitance) and the DRAM (which is built on a process tuned for high capacitance...).

  • @scottfranco1962
    @scottfranco1962 2 роки тому +79

    MCMs faced two major issues during my time at Intel several years ago. The first was the most obvious: putting multiple chips in the same package meant the heat dissipation of the package now had to be designed for the two chips - ostensibly doubling the required heat dissipation. We handled that by throttling down the chip clocks, a process that has been largely automated by onboard temperature sensors and variable clock generators. IE., if you put two of today's advanced chips in a single package, they will self-throttle downwards.
    The second issue is that anytime you have a signal leave or enter a chip, it goes in or out via a pad driver, essentially an amplifier of that signal designed to drive the large capacitance of a PCB trace. This adds delay to the line. A lot of it. Internal trace drivers only have to worry about driving very short lengths of narrow and flat aluminium/copper. Not so with PCB traces. MCM pad drivers could be backed off a bit to compensate for minimal traces in the MCM, but typical MCMs unite chips that would normally be packaged separately, requiring a design change.
    PS., like the old movie "support your local gunslinger", keep in mind I am a stupid guy in the process world. The real process guys have to keep a fan on their heads to keep their brains from overheating :-)
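
    A back-of-the-envelope way to see the pad-driver penalty described above: treat the driver plus its load as a first-order RC circuit, whose 10-90% rise time is about 2.2*R*C. The resistance and capacitance values below are illustrative assumptions, not measurements of any real part.

        # Compare rise times for an on-die wire, an MCM substrate trace and a PCB trace.
        R_DRIVER = 50.0                      # ohms, assumed effective driver resistance
        LOADS_F = {
            "on-die wire":          50e-15,  # ~tens of femtofarads (assumed)
            "MCM substrate trace":   1e-12,  # ~1 pF (assumed)
            "PCB trace + package":  10e-12,  # ~10 pF (assumed)
        }

        for name, cap in LOADS_F.items():
            t_rise_ps = 2.2 * R_DRIVER * cap * 1e12   # 10-90% rise time of an RC step
            print(f"{name:22s} ~{t_rise_ps:6.1f} ps")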

    • @michaellau5329
      @michaellau5329 2 роки тому +4

      It isn't really doubling the heat dissipation though. Taking an existing device at X wattage and splitting it up into different MCM components doesn't raise the wattage of the combined chips by any appreciable amount -- even if we include an active interposer. However, adding *additional* chips (read: cores, features, etc) to a design would *definitely* increase the thermals.
      Put another way, the advantage of a fully integrated chip vs MCM isn't about thermals, but moreso that fully integrated chips were a result of continuous process improvements over the course of history. It became *cheaper* to produce an integrated chip rather than go with MCM or COB. The overall performance improvements were a side benefit.
      Asianometry does touch upon the latency issues with MCM in the video too -- and it's something that AMD has seemingly solved architecturally.

    • @scottfranco1962
      @scottfranco1962 2 роки тому

      @@michaellau5329 Didn't follow that at all. Two chips in a package with X and Y heat generation somehow is not X+Y? New math?

    • @michaellau5329
      @michaellau5329 2 роки тому +8

      @@scottfranco1962 No, X/2 + X/2 = X. However, X + Y is definitely X + Y.
      A theoretical chip that is designed in a monolithic fashion versus MCM will consume roughly the same amount of power.

    • @michaellau5329
      @michaellau5329 2 роки тому +3

      @@scottfranco1962 Or put another way, designing a *device* that has a power dissipation of X watts in a particular package will be the same regardless of the number of chips inside. There is of course the arrangement of the individual chips underneath the heat spreader, but that's something else for another day.

    • @absalomdraconis
      @absalomdraconis 2 роки тому +1

      Far as I know, chip manufacturers aren't incorporating heat pipe design concepts into the surface of their chips- there should be plenty of heat performance still available.

  • @antoniomaglione4101
    @antoniomaglione4101 2 роки тому +21

    Remember the Pentium II? Intel had problems placing all that cache memory on the processor chip, so they used the multi-chip approach, with the cache being the added chiplets. The cache ran at half the processor clock speed anyway.
    They returned to a monolithic structure with the next generation, after one year...

    • @ClockworksOfGL
      @ClockworksOfGL 2 роки тому +4

      IIRC, the (earlier) Pentium Pro had a separate, interconnected cache within the same package. Unfortunately, it was expensive and ran 16bit code poorly, but it really cranked with pure 32 bit code.

    • @TrueThanny
      @TrueThanny 2 роки тому +6

      Returned? The Pentium II was the first Intel processor that included a L2 cache at all. If you wanted L2 prior to that, it had to be SRAM chips on the motherboard. Or, with the 386, if you wanted any cache at all, it was on the motherboard.

    • @Charlie-zj3hw
      @Charlie-zj3hw Рік тому

      @@TrueThanny It was way more fun back then! Nothing like discovering the newsgroups and getting a JPEG viewer to look at your first digital nudie pics!

  • @jpierce2l33t
    @jpierce2l33t 2 роки тому +1

    YES!!! I love all your videos, but I get especially excited when I see you post a tech one!

  • @ByWire-yk8eh
    @ByWire-yk8eh 2 роки тому +9

    IBM developed the first large MCMs in the late 1970s, with volume shipments starting in the 1980s. Their first MCMs had 100 and 118 chips on each module. They were originally used in mainframes.

    • @AvgAtBes2
      @AvgAtBes2 2 роки тому +1

      I'm not completely knowledgeable, but every time I come across some technology/architecture I hear "yeah, IBM made a product with that back then and it dipped, but then it gained traction" lol

    • @densepixel
      @densepixel Рік тому

      @@AvgAtBes2 Lisa Su (AMD CEO) worked at IBM from 95-07. So technically she is an IBM alumna.

  • @soren6045
    @soren6045 2 роки тому +3

    You are missing where the packaging innovation came from. Multi-die packages were driven by memory and image sensors (processors). For example, through-silicon via technology was developed and matured for these applications. I still think those are the leading edge for this topic. A modern image sensor is an amazing piece of technology, with wafer-level direct bonding and the image sensor chip thinned to just 10 µm so it can be illuminated from the backside. Memory is producing stacks with up to 16 chips.

  • @ParkyCat
    @ParkyCat 2 роки тому +57

    This was so successful at the time, now Apple and Intel do it too with similar techs. good topic!

  • @albyx
    @albyx 8 місяців тому

    Great video. I didn't really understand why bigger chips are more expensive, but once you talked about using smaller chips to 'approximate' the size of a bigger one, it made sense (my 'aha!' moment) - bigger dies use more wafer space, so there's less space to make more of them; if one has errors, it is a waste and now you've wasted all that space. If it had been made smaller, then at least the cost of a bad chip is lower. Thanks!
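
    That intuition maps directly onto the classic Poisson yield model, yield = exp(-D*A): bigger dies both fail more often and waste more silicon per failure. The defect density and die sizes below are illustrative assumptions, not real fab numbers.

        import math

        D = 0.1   # assumed defect density, defects per cm^2

        def poisson_yield(area_cm2: float) -> float:
            """Fraction of dies with zero killer defects under a Poisson model."""
            return math.exp(-D * area_cm2)

        for area in (0.8, 1.5, 3.0, 6.0):      # die areas in cm^2 (assumed)
            y = poisson_yield(area)
            wasted = (1 - y) * area            # average silicon scrapped per die printed
            print(f"{area:4.1f} cm^2 die: yield {y:6.1%}, ~{wasted:.2f} cm^2 scrapped per die")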

  • @maverickvgc4220
    @maverickvgc4220 2 роки тому +46

    A funny thing about chiplets is that they're helping overall, but they are so good that they leave no silicon for the low end. The low availability of the 3300X and the lack of a 5300X is because AMD is getting little to no chiplets with fewer than 6 working cores.

    • @johndoh5182
      @johndoh5182 2 роки тому +9

      This is the fault of the way AMD approached Zen 2 and 3 and nothing inherent with chiplets.
      Something as simple as a 4 or 6 core part doesn't require chiplets and if AMD does this for Zen 4 I think Intel will eat AMD for lunch at the low end.
      If I were designing AMD's Zen 4 Ryzen based CPUs and APUs, I would use 5 chips in all. The first would be a 6 core monolithic die, with no graphics, that allows a tile based interface to add a graphics chiplet. I would create 2 graphics chiplets for different power levels. I can now use this single chip for 4 - 6 core CPUs and APUs. That's a lot of versatility still. I'd then make another chiplet that has 8 cores, the IO functionality and the ability to add graphics just like the 6 core chip, and on another side be able to add another 8 core chiplet, with no IO functionality, so it's simply 8 more cores.
      AMD did the whole IO die thing with GloFo because they were still under contract to GloFo for die.
      I don't for a single second believe Intel is going to move to tiling for every single part because it just doesn't make sense to add complexity when the die isn't very big.
      So this is where AMD failed, at low cost items, and they shouldn't have to rely on having enough bad monolithic APU die to make lower core count parts. That's just bad. And, using an 8 core chiplet to make a 4 core part is BAD. It's a waste of 50% of the die.

    • @jakejakedowntwo6613
      @jakejakedowntwo6613 2 роки тому +17

      @@johndoh5182 That's a lot of tooling though; it isn't cost effective to have that many different designs. It's also a waste of capacity since you have so many SKUs hogging precious laser space.
      AMD has 3 designs: CPU, GPU, IO.
      Intel has maybe 1 or 2 designs: APU.
      Every Intel SKU is a binned-down 12900K or is a 12900K.

    • @edhofiko7624
      @edhofiko7624 2 роки тому +17

      @@johndoh5182 You seem to have the whole chiplet idea backwards. Remember when AMD was on the brink of collapse: high-end desktop CPUs got like 4 cores, and maybe a halo product with 8 cores. With chiplets AMD raised the bar; nowadays the most common gaming machine has 6 cores. AMD is now becoming Intel, since no competition is in sight (thankfully we got ADL, but that's not enough). Instead of asking AMD to downgrade their effort, I would instead ask Intel to up their game, so we can raise the bar again and the midrange can have more cores, like 12, hopefully making 4-6 cores the low end. That's the beauty of competition, although I don't see it coming anytime soon since prices are slowly creeping up.

    • @suntzu1409
      @suntzu1409 2 роки тому +4

      Suffering From Success™ 🗿🗿🗿🗿

    • @suntzu1409
      @suntzu1409 2 роки тому +1

      @@jakejakedowntwo6613
      No?
      AMD makes one die each for the APU, CCD and IO, and a bunch of dies for discrete GPUs.
      A 6-core CCD is the only new one that AMD would need to make.
      And Intel makes more than one die for Alder Lake.

  • @geneballay9590
    @geneballay9590 2 роки тому +2

    Another well presented and very informative video. Thank you.

  • @coraltown1
    @coraltown1 2 роки тому +9

    The original Intel P6 (1995) had 2 dies, a CPU plus a wire connected cache. That's the processor I started my long verification career on. Fun times!

  • @scottfranco1962
    @scottfranco1962 2 роки тому +2

    Thanks for the overview. I have a high end threadripper CPU, nice to know how it got manufactured. First class video as usual.

  • @droknron
    @droknron 2 роки тому +35

    I'm surprised you didn't mention that Intel tried this a few times. They did it with the Pentium 4 to get dual-cores (really dual dies) to market. Then again with the QX6700 (2006) and Q6600 (2007), which contained two dual-core chiplets to deliver quad-core processors, and they used the MCM naming for those too. There were also Xeon equivalents at the time.
    In Intel's case, AMD got to 4 cores with a single die and Intel was a bit behind, so they had to use an MCM to catch up, then they went back to a single-die design. Now it has switched around and chiplets are being used to get higher core counts at a reduced cost, like you said in the video.

    • @MrGreghome
      @MrGreghome 2 роки тому +4

      Intel double cheeseburgers....
      Those were the days.

    • @tringuyen7519
      @tringuyen7519 2 роки тому +4

      Intel is so far behind AMD and Apple on chiplet technology that it's not even funny. The AMD 5800X3D has 64MB of stacked vertical L3 cache at $450! Intel's Foveros/EMIB technology announced in 2019 is still nonexistent!

    • @dogman2387
      @dogman2387 2 роки тому +2

      Intel started doing it in 1997 with Pentium II

    • @Zgreed66
      @Zgreed66 2 роки тому +2

      @@tringuyen7519 more like TSMC, not AMD.

    • @IAmPattycakes
      @IAmPattycakes 2 роки тому

      @@dogman2387 Intel started doing it in 1981 with the iAPX 432

  • @kelamulenga9164
    @kelamulenga9164 2 роки тому +5

    Hey, could you do a video on BYD, their blade batteries and their semiconductor division?

  • @catsspat
    @catsspat 2 роки тому +152

    Quick tip: Xilinx is pronounced like, "zai links."

    • @deusexaethera
      @deusexaethera 2 роки тому +1

      How do you know?

    • @apidas
      @apidas 2 роки тому +10

      @@deusexaethera uh maybe chinese pronounce "Xi" as "Zai"

    • @chubbymoth5810
      @chubbymoth5810 2 роки тому +7

      Ssst... don't tell that secret! Nobody else knows how to pronounce it other than "Ksee Links" or "Shi Links". It's one of those PR failures.

    • @666Tomato666
      @666Tomato666 2 роки тому +29

      no, it's pronounced "gif links"

    • @AgentOrange96
      @AgentOrange96 2 роки тому +10

      Alternatively, it's pronounced "Ayy Em Dee" ;p

  • @photonboy999
    @photonboy999 2 роки тому +3

    *voltage, frequency & chiplet efficiency*
    Multi-chip has some very interesting EFFICIENCY options available. Since POWER DRAW does not scale linearly, and is often the limit (especially in laptops), adding MORE chiplets at lower performance per chiplet can result in better performance. If you plan to run at a max of, say, 1500 MHz, you can also optimize around that as a max frequency, further saving power. So there's a lot more flexibility compared to a traditional monolithic design... and not being on the most cutting-edge NODE might not matter so much if you have this flexibility. Maybe put those cost savings into adding more chiplets on less efficient nodes? There's definitely going to be a BALANCE of everything that results in the best performance per dollar, and that's going to vary depending on the product, current node costs, etc.
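
    A toy model of that trade-off, assuming dynamic power ~ C*V^2*f and that voltage has to rise with frequency (the V/F curve below is invented for illustration, not a real AMD table, and it assumes the workload parallelizes perfectly across chiplets):

        def dynamic_power(freq_ghz: float) -> float:
            """Toy dynamic-power model: P ~ C * V^2 * f, with V rising with f."""
            v = 0.7 + 0.15 * freq_ghz    # assumed voltage needed at this clock
            c = 1.0                      # arbitrary capacitance/activity constant
            return c * v * v * freq_ghz

        # Same nominal throughput: one chiplet at 4 GHz vs. two chiplets at 2 GHz each.
        one_fast = dynamic_power(4.0)
        two_slow = 2 * dynamic_power(2.0)
        print(f"1 chiplet  @ 4.0 GHz: {one_fast:.2f} (arbitrary power units)")
        print(f"2 chiplets @ 2.0 GHz: {two_slow:.2f} (arbitrary power units)")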

  • @MarkWongUSA
    @MarkWongUSA 2 роки тому +1

    Great topic to cover, especially given the rise of importance of substrates and interposers and inter-die interconnects

  • @williamholmes7529
    @williamholmes7529 Рік тому +1

    How about a video about the Inmos Transputer? Loving what you do, nice 😉

    • @vulpo
      @vulpo Рік тому

      I am wondering how chiplets are fundamentally different from transputers.

    • @williamholmes7529
      @williamholmes7529 Рік тому

      @@vulpo the Wikipedia article on the Transputer is quite good.

  • @universalparadox4144
    @universalparadox4144 2 роки тому +9

    Can you do a video on yourself...how you generate so many high quality videos in such fast iterations? Truly amazing! I seriously would be interested how you do this!

    • @aniksamiurrahman6365
      @aniksamiurrahman6365 2 роки тому +2

      Yeah, like "Why Asianometry videos work".

    • @brodriguez11000
      @brodriguez11000 2 роки тому

      Cloning technology.

    • @scottfranco1962
      @scottfranco1962 2 роки тому +1

      You don't want too much of a paper trail after the Chinese invade... :-)

    • @carstenraddatz5279
      @carstenraddatz5279 2 роки тому +2

      He explains his process in this audio podcast episode: compoundingpodcast.com/ep24/ - worth your while!

    • @megalonoobiacinc4863
      @megalonoobiacinc4863 2 роки тому

      @@scottfranco1962 one minute they are the world's leading chip producer, with military and bleeding edge computing resting on their back, and the next minute they are a highly armed guerilla fighting for their existence...
      its crazy how close both versions seems to be

  • @soviut303
    @soviut303 2 роки тому

    Just a friendly heads up about your audio; your VO seems to have a bit of ringing to it. I noticed it around the 8:30 mark. This might be your mic arm resonating in which case it should be tightened. If the mic is desk mounted, put it on something soft so the desk can't resonate. Lastly, it might be feedback if you're listening to yourself with speakers but I'm doubtful of that last one.

  • @graealex
    @graealex 2 роки тому +7

    Back in the days, mainframe CPUs were basically a bunch of chiplets in a single package and a heat spreader on top. "CPU Galaxy" is a good channel to see some of these old beasts.

  • @ClockworksOfGL
    @ClockworksOfGL 2 роки тому +35

    Credit needs to go to software developers for finding ways to parallelize tasks. All the chiplets in the world won’t work if the software isn’t designed to run on multiple cores. You young folks probably won’t remember the days before multithreaded software, but trust me when I say it sucked. Programs would freeze as they completed a task, which in the days of slow CPUs and mechanical drives was quite often.

    • @scottfranco1962
      @scottfranco1962 2 роки тому +2

      Nothing has changed. Most true multicore/threaded designs are one-off. Think supercomputer clusters. True multithread/multicore software engineers are hard to find. The average software engineer is scared to death of multithreaded designs, and it would not be a stretch to imagine that most multithread apps have latent bugs.

    • @gblargg
      @gblargg 2 роки тому +4

      @@scottfranco1962 The more services the OS provides, the more those can be multi-threaded. Even the natural layers allows for it, e.g. GUI in a separate thread from the app.

    • @prashanthb6521
      @prashanthb6521 2 роки тому +8

      @@scottfranco1962 Nowadays almost all apps are multithreaded. Go/Julia languages offer multithreading out of the box.

    • @Baronvonbadguy3
      @Baronvonbadguy3 2 роки тому +2

      @@scottfranco1962 its dumb easy to do with Go give it a spin! 👍

    • @realshompa
      @realshompa 2 роки тому +4

      That is Windows thinking. Dev tools for multithreading have existed for 30-plus years on real OSes like Unix. Even today Win10/11 can't handle over 16 threads in a single process. Most Windows apps start to taper off after 4 cores. Apple solved this almost 10 years ago with Grand Central Dispatch. Microsoft should do the same, and if Microsoft wants to make an OS that actually works they should use a Linux kernel and put their brilliant desktop system on top of it. That would also solve the multithreading problem in Windows, if MSFT wants to do it.

  • @TheDude50447
    @TheDude50447 2 роки тому +13

    The Ryzen chiplet design and the huge investment needed to make it work were a very risky move by AMD, almost an all-in. With Bulldozer selling extremely poorly and no new CPU products on the market for about 5 years, I'm not sure the AMD CPU division would've survived if Ryzen had been a failure. My guess is that it was the 2 major consoles using Bulldozer and GCN 1 keeping AMD alive. Of course we can thank the silicon gods that it worked out, or we'd be paying a pretty penny for Intel 4-core CPUs on 14 nm to this day :D

  • @Quxxy
    @Quxxy 2 роки тому +1

    **Edit**: explained by a reply to my comment.
    10:03 - "[...] that handles analog functions like USB and SATA [...]". Neither of those are analog, and I'm not aware of any homophone that would make that statement make sense. Did you perhaps mean something like "auxiliary"?

    • @BusAlexey
      @BusAlexey 2 роки тому +4

      In chip fabs, chip design is split into 3 general categories: logic, SRAM, and analog. Analog (also known as I/O) is everything that touches the outside of the chip.

    • @Quxxy
      @Quxxy 2 роки тому

      @@BusAlexey Aah, okay. Thank you. Do you know if that is a legacy from when a lot of the external I/O *would* have been analog?

    • @BusAlexey
      @BusAlexey 2 роки тому +6

      @@Quxxy To the chip's circuitry that's running at multiple gigahertz, any I/O is "analog", in the sense that the signal doesn't arrive cleanly in a well-defined form for the chip's circuits. There are special circuits in the chip that translate those "analog" voltages into a more manageable form inside the chip, and they are what's known as analog circuits.

  • @edwardcasati3374
    @edwardcasati3374 2 роки тому +27

    Around 20 years ago we saw the end of the MHz race between CPUs. Slower chips doing things smarter and more efficiently substituted for faster clock speeds.
    Chips have been mired in the same 'bloatware' that software finds itself in... we don't program smarter, we just throw more lines of code and billions more transistors at a problem.
    I believe that the 7 nm 'node' is over-extended, and just like with clock speeds, we will take a step back towards more realistic geometries.
    AMD has taken an interesting tack towards simplification. A database server does not need ANY graphics functionality, so why include it? Design a couple of chiplets that implement database functionality in hardware, omit the graphics engine and glue them together with a CPU. Fewer transistors, easier geometries and better performance.

    • @deusexaethera
      @deusexaethera 2 роки тому +11

      Scaling up the easiest way makes sense, until it stops making sense, and then scaling up the next-easiest way makes sense. Repeat until full theoretical capability is achieved.

    • @minespeed2009
      @minespeed2009 2 роки тому +3

      That's one of the problems with x86 (and ARM to some extent). About 90% of all instructions a program executes are jump, compare, add, subtract, load, store, and x86 has ~15,000 instructions, which makes decoding A LOT more difficult and requires more bits to encode an instruction, eating away silicon space from more useful things such as more pipelines or a better branch predictor. That's one of the reasons why RISC-V is appealing, since it wants to keep things as simple as possible, which also allows for higher code density.
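
      As a small illustration of how cheap fixed-width decode can be, here is a sketch that pulls the fields out of one 32-bit RISC-V R-type instruction; the field layout follows the published RV32I encoding, and the example word encodes add x1, x2, x3.

          def decode_rtype(word: int) -> dict:
              """Extract the fields of a 32-bit RISC-V R-type instruction (RV32I)."""
              return {
                  "opcode": word & 0x7F,           # bits 6:0
                  "rd":     (word >> 7) & 0x1F,    # bits 11:7
                  "funct3": (word >> 12) & 0x07,   # bits 14:12
                  "rs1":    (word >> 15) & 0x1F,   # bits 19:15
                  "rs2":    (word >> 20) & 0x1F,   # bits 24:20
                  "funct7": (word >> 25) & 0x7F,   # bits 31:25
              }

          # 0x003100B3 encodes "add x1, x2, x3"
          print(decode_rtype(0x003100B3))
          # {'opcode': 51, 'rd': 1, 'funct3': 0, 'rs1': 2, 'rs2': 3, 'funct7': 0}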

    • @niewazneniewazne1890
      @niewazneniewazne1890 2 роки тому

      Database acceleration has been done by Oracle with their SPARC cpus

    • @scottfranco1962
      @scottfranco1962 2 роки тому +2

      Sure, what is the clock speed of the human brain? Probably less than a 1khz. Yet all of the computers in the world cannot equal one. Parallel processing in 3d solids baby!
      Oh, yea, and all that for the cost of a beer or so...

  • @MathysWalma
    @MathysWalma 2 роки тому

    Xilinx (now AMD) used a multidie approach for ultrascale too. Some designs that ran fine on V7 have problems meeting timing at the die crossing interface in ultrascale.

  • @MassimoTava
    @MassimoTava 2 роки тому +3

    Can you do a video on camera image sensors?

    • @MassimoTava
      @MassimoTava 2 роки тому

      @TacticalMoonstone I like to know how similar or different the process technology is between sensor manufacturing vs processors. I have a feeling that image sensors use older equipment/process. Also, how far behind is full frame / large sensor semiconductors compared to latest cell phone cameras. If you would scale up for example, iPhone main sensor to larger size, how much would it cost? TowerJazz, Samsung, Sony, sigma…I’m also curious about those.

  • @TomAtkinson
    @TomAtkinson 2 роки тому +1

    Remember the jump in performance that came with the Pentium II Xeon / Pro? It seemed like a big jump in performance. That used a primitive version of chiplet based L2 cache memory if I recall correctly; This is why it had a weird rectangular shape and case. I think big CPU cache is the way to go, generally.

    • @Charlie-zj3hw
      @Charlie-zj3hw Рік тому

      Pentium 2 and 3.. It plugged into a socket like the old agp slot

  • @TigeroL42
    @TigeroL42 2 роки тому +2

    The Threadripper series is simply unbeatable in the prosumer market!

    • @scottfranco1962
      @scottfranco1962 2 роки тому

      All it takes to beat them is an Arm... and a leg.... :0

  • @rydplrs71
    @rydplrs71 2 роки тому +1

    SoC is the opposite of what AMD is using. They are doing SiP, system in package; it's been in use for decades. The next evolution of that is stacked die, where you use through-silicon vias to directly connect dies without bond wires and footprint increases. SoC is putting multiple dies on the same wafer, making large dies with sub-dies only connected by the top metal layers or bond wires in the package. That was a large expense where board space was the primary concern.
    The other huge advantage of this approach is that you don't need as many different devices on a single die, reducing mask layers for each and not having to share thermal budgets at each layer. This reduces cost and complexity. The reduction in base and interconnect layers also greatly improves the cost of scrap. Your net yield increases slightly, but anything you scrap at end of line has many fewer masks, so it costs less. These advantages are shared by traditional SiP, but stacked die is a way to take Moore's law into 3D and continue it beyond atomic-size constraints.

  • @unreliablenarrator6649
    @unreliablenarrator6649 2 роки тому +1

    The biggest roadblock for MCMs was cache memory latency bottlenecks. Consequently, integration of memory on die for CPUs, GPUs and SoCs became a priority from the mid-90's onward, while on the server side, massively parallel blade architectures solved the problem temporarily, but by ~2010 latency was again an issue. Chiplets solve this another way: splitting up the processors into manageable increments.

  • @sebastianelytron8450
    @sebastianelytron8450 2 роки тому +4

    10:00 "Analog functions like USB and SATA"
    Wtf? Can someone explain how USB is analog?

    • @ulwen
      @ulwen 2 роки тому +2

      It's a bit of hair splitting, but technically USB sends digital communication over analog differential pairs. The USB controller interface he mentions translates the external analog electrical signal into internal digital data.
      PAM4 in PCIe 6 adds a more clearly analog touch, with the communication signal voltage being able to represent 00, 01, 10 and 11 instead of just 0 or 1.
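
      A tiny sketch of that "2 bits per symbol" idea: each bit pair is mapped to one of four amplitude levels, so the symbol rate is half the bit rate. The Gray-coded mapping and the -1..+1 levels below are the commonly described convention, used here as an assumption rather than a quote from the PCIe 6.0 spec.

          # Gray-coded PAM4: two bits per transmitted symbol.
          PAM4_LEVELS = {"00": -1.0, "01": -1/3, "11": +1/3, "10": +1.0}

          def pam4_encode(bits: str) -> list:
              assert len(bits) % 2 == 0, "PAM4 consumes bits two at a time"
              return [PAM4_LEVELS[bits[i:i + 2]] for i in range(0, len(bits), 2)]

          print(pam4_encode("00101101"))   # 8 bits -> 4 symbols: [-1.0, +1.0, +1/3, -1/3]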

  • @2001pulsar
    @2001pulsar 2 роки тому +4

    What about die stacking?

  • @j7ndominica051
    @j7ndominica051 Рік тому

    Why not separate the systems-on-a-chip into multiple processors like it used to be? Have a separate video card and a neural AI calculator. What does "diffused in USA" mean on the AMD processor?

  • @locusgaudi
    @locusgaudi 2 роки тому +5

    I wonder whether I saw this approach with the first Intel quad cores. As far as I remember, these were two 2-core chips "glued together" in a single package.

    • @AlfaPro1337
      @AlfaPro1337 2 роки тому

      Intel already proved that chiplets work back in the P4D/Core 2 era, except AMD shamed Intel for the gluing and went as far as calling it a fake dual core and later a fake quad core.

    • @dylon4906
      @dylon4906 2 роки тому +1

      @@AlfaPro1337 and then Intel did the same thing to AMD after the success of ryzen. capitalism is wonderful

    • @TrueThanny
      @TrueThanny 2 роки тому +1

      Intel's first "dual-core" chip, the Pentium D, was two single-core chips placed on the same package. There were zero interconnects. When one chip needed to communicate with the other, it had to do so via the FSB, so it was actually SMP on a single package, not a dual-core chip.
      Their first "quad-core" was similar. It was a pair of dual-core chips on the same package, with the same lack of interconnects.

    • @TrueThanny
      @TrueThanny 2 роки тому +4

      @@AlfaPro1337 AMD was right. The Pentium D was SMP on a single package, and the Core 2 Quad was dual-core SMP on the same package. AMD had the first actual dual-core processor, and also the first actual quad-core processor. Many workloads showed the consequences of this, too.

    • @AlfaPro1337
      @AlfaPro1337 2 роки тому

      @@TrueThanny There were 0 interconnects, because the I/O die is basically on the North Bridge?
      Plus, I'm guessing Intel opted for a ring bus-like, thus, the main core die communicate with the NB and fills up, until it hits a certain workload that it needs to shift some work to the second core die.
      It's still a dual-core in a sense--though, not a single package, but hey, AMD was being a dick in the 1st place.
      Heck, even the early consumer 1st gen Core i series are basically 2 die, but, one is the core and the other is the I/O die.
      Technically, Intel has been gluing for years.

  • @MoritzvonSchweinitz
    @MoritzvonSchweinitz 2 роки тому +2

    But how are those chiplet interconnects actually made physically? Bond wires? Solder balls?

    • @mawkzin
      @mawkzin 2 роки тому +2

      TSMC calls it TSV if you want to know more.

    • @MoritzvonSchweinitz
      @MoritzvonSchweinitz 2 роки тому +2

      @@mawkzin Thank you! But TSV seem to be "through-silicon-vias", i.e. to connect different layers to each other, no?
      But searching for TSV led me to pages that mention "microbumps", which I believe is how the individual chiplets are bonded to the base material. It seems that microbumps are µm-level solder balls.
      Now I'd love to know even more about this part of IC packaging! How are those microbumps manufactured? How are they placed so precisely? Are they heat-soldered?

  • @depth386
    @depth386 2 роки тому +2

    Maybe we can go back to having a north bridge on the motherboard in typical PCs? It would be a small step.
    This also reminds me of how 80286 and 80386 CPUs needed an optional “math co-processor” to be installed in the motherboard in order to perform some floating point calculations.

    • @insu_na
      @insu_na 2 роки тому +2

      A North bridge would come with massive latency costs, tho. They didn't integrate the North bridge just because it was cheaper, they also did it because having an imc massively increases performance

    • @blkspade23
      @blkspade23 2 роки тому +2

      We wouldn't actually want that. You're just adding more lengths of wire and logic between a device and the CPU, which becomes a bottleneck and adds latency. Everything either needs access to the CPU, RAM or both. The chipset that still exists on desktop systems still has the same problem. AMD's Epyc based systems don't even have a chipset because there is room in the IO die for all those auxiliary functions. It's probably key to understand that the Northbridge is mainly only really different from the modern chipset (and southbridge) in that it housed the memory controller. That specifically makes way more sense to be on the CPU package. The northbridge was removed in favor of the Integrated Memory Controller. It would be a step backwards.

    • @depth386
      @depth386 2 роки тому

      @@blkspade23 Okay so not worth it, I guess Ryzen already does the logical thing and puts the IMC in the I/O chiplet. It maxes out at moderate memory speeds but it makes sense from a value perspective.

    • @depth386
      @depth386 2 роки тому

      @@blkspade23 by the way, thank you for your lengthy explanation/reply. I appreciate that you shared your knowledge.

  • @dgillies5420
    @dgillies5420 Рік тому +1

    The brilliance of AMD is that they have - essentially - only a single 8-core CPU die. By connecting many of them with different fabric/IO chips and/or disabling broken cores by lasering them out of the circuit, they can build a product anywhere in size from 2 cores to 128 cores - all with, roughly, only one CPU design. So AMD really makes only ONE CPU die, and it has 8 cores, period.
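
    A quick sketch of that "one die, many products" idea: pick how many chiplets go on the package and how many cores are left enabled per chiplet after binning, and the product stack falls out. The combinations below are illustrative, not AMD's actual lineup.

        # Core counts reachable from a single 8-core chiplet design (illustrative).
        CHIPLETS_PER_PACKAGE = (1, 2, 4, 8, 12, 16)   # small desktop up to big server packages
        CORES_ENABLED_PER_CHIPLET = (4, 6, 8)         # after fusing off defective cores

        core_counts = sorted({n * e for n in CHIPLETS_PER_PACKAGE
                                    for e in CORES_ENABLED_PER_CHIPLET})
        print("possible product core counts:", core_counts)
        # One mask set and one compute die cover everything from 4 cores to 128 cores.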

  • @joedotcom888
    @joedotcom888 2 роки тому

    Asianometry should be a guest on MLID's Broken Silicon podcast.

  • @LuigiSimoncini
    @LuigiSimoncini 2 роки тому

    Thanks. I was expecting more information about the packaging technology and process, that’s where the real innovation is as far as I understood. Is that covered by secrecy?

  • @whatthefunction9140
    @whatthefunction9140 2 роки тому +16

    Large chips also have internal lag issues: even at the speed of light, a signal that needs to move 2 centimeters always takes twice as long as one that only needs to travel 1 cm.
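
    Rough numbers for that, assuming signals propagate at about half the vacuum speed of light (a common rule of thumb for good interconnect; real RC-limited on-die wires are slower still):

        C_VACUUM = 3.0e8              # speed of light, m/s
        V_SIGNAL = 0.5 * C_VACUUM     # assumed effective propagation speed

        for distance_cm in (1.0, 2.0, 5.0):
            t_ps = (distance_cm / 100) / V_SIGNAL * 1e12
            cycles_at_5ghz = t_ps / 200.0          # one 5 GHz clock period = 200 ps
            print(f"{distance_cm:.0f} cm: {t_ps:5.0f} ps  (~{cycles_at_5ghz:.2f} cycles at 5 GHz)")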

    • @ihmpall
      @ihmpall 2 роки тому

      That’s like an hour longer

  • @UnrealVideoDuke
    @UnrealVideoDuke 2 роки тому +1

    The older server CPUs were split up into separate dies on the same substrate based on functionality and the needs of the customer. There were different variations as needed for server processing; if it needed to do more calculations, it would need more CPU cores on the same substrate. Back in the 90's, Intel had separate cache dies in their Pentium 2 line because they found too many failures when manufacturing CPU dies with cache at the time. It really came down to the cache itself being the reason for most of the manufacturing failures. Instead of wasting a whole CPU chip just because the cache was faulty, they decided to manufacture the cache separately. It does not cost more to have more functionality on the same die size; it costs more to have a higher percentage of failed chips, i.e. bad chips = no money. It may cost more to use a finer lithography process on the same die size because of the expense of the equipment and development costs, and the risk of higher failure rates. It may cost even more if the design of the chip requires more layers of processing. This is the idea behind AMD's "chiplets" in their new high-end processors on their 7 nm process. Very few chip manufacturers use a 7 nm process at this time; it seems to be getting better, but it is risky in terms of higher failure rates. I'm sure the failure rates are getting lower, but by splitting the dies into "chiplets" the manufacturers create fewer wasted failures and earn more profit.

  • @justice929
    @justice929 2 роки тому +2

    Lisa Su was an MIT superstar in computing as a student. And now it's Dr. Lisa Su...

  • @m_sedziwoj
    @m_sedziwoj 2 роки тому

    4:00 This is not what Moore said, because chiplets and tiles are in the same package, and he was talking about separately packaged chips. Don't give credit where it isn't deserved.

  • @Doctrina_Stabilitas
    @Doctrina_Stabilitas 2 роки тому +10

    Great coverage as always. I wonder if Apple will ditch the monolithic dies for the Pro lineup going forward; they really are pushing that monolithic SoC design with their on-silicon interconnect.

    • @Theoryofcatsndogs
      @Theoryofcatsndogs 2 роки тому

      I don't think it is possible for the A-series chips, as they need to be small for iDevices. It's possible for future M-series. The current M1 Ultra is 2 chips in the same package, with the RAM (unified memory) as chiplets on the side.

    • @Doctrina_Stabilitas
      @Doctrina_Stabilitas 2 роки тому +4

      @@Theoryofcatsndogs not really, the M1 ultra is really one monolithic die, since the interconnect is on silicon. they just cut off one of the dies if it fails which is how they preserve yield

  • @Alex_Bket
    @Alex_Bket 2 роки тому

    Nice vid !!!!

  • @gijbuis
    @gijbuis 2 роки тому +6

    As I understand this, using 'chiplets' means breaking up a single chip into several components. This means that each chiplet would need extra circuitry to concentrate the signals into channels for exporting and importing data from the other chiplets. This extra channel circuitry would not be needed in a single integrated chip. That makes the integrated chip inherently more efficient than a bunch of interconnected chiplets.

    • @lyq232
      @lyq232 2 роки тому +3

      Yup, had the technology allowed them to keep shrinking the process node in Integrated chips, they would've done so

    • @afriendofafriend5766
      @afriendofafriend5766 2 роки тому

      Except for that it's not because it doesn't work. So sure.

    • @lonyo5377
      @lonyo5377 2 роки тому

      More efficient until you reach a size limit. More efficient except more prone to manufacturing defects. More efficient except more expensive. In an ideal world it wouldn't be necessary, but it is necessary for various reasons.

    • @Random-Stranger
      @Random-Stranger 2 роки тому

      What you described is how Moore's Law basically allowed Intel to keep squeezing performance out of its chips over the past decades. Problem is: Moore's Law is dead. We are pretty much at the limit of how much we can shrink transistor dimensions. You can't just shrink them further and pack more of them into an integrated circuit. The failure rate is significantly higher. This is why Intel's chips and semiconductor roadmap got fucked because of the consistently poor yields they were getting.

  • @TJ-vh2ps
    @TJ-vh2ps 2 роки тому

    Great video! I love all of them: thanks for the great work you do!
    When you refer to the USB and SATA connections to the AMD Ryzen I/O chip as "analog", what about them is analog? Is it the transceiver (not sure if this is the correct term in this context) that converts the external digital USB and SATA 5v signals to the 1.2-1.5v (I believe) signals that the I/O chip uses internally?

    • @thewheelieguy
      @thewheelieguy 2 роки тому +4

      Nope, when you put the data rate of a digital signal high enough, it's not sharp edges and neat transitions on clock edges. You 100% have to treat them as analog signals when they come in from off-chip. On chip you can control levels and delays to make things near-enough digital ideals to treat them as crisp signals.

    • @TJ-vh2ps
      @TJ-vh2ps 2 роки тому

      @@thewheelieguy Oh yeah, that totally makes sense. At the signal level everything is analog, but we (software and systems engineers) can usually ignore that. Thanks for the explanation!

  • @boots7859
    @boots7859 2 роки тому +2

    Intel 2 years ago, "Well, if you wanna glue 2 chips together...."

  • @BlunderMunchkin
    @BlunderMunchkin 2 роки тому

    Agnus, Denise, and Paula would like to discuss the Amiga home computer.

  • @tomorrowland2684
    @tomorrowland2684 2 роки тому +1

    @Asianometry If you could find time to make a video on the next generation of interconnects, both data and power, for dealing with multiple chips on a substrate, the companies who are doing it, and whether Intel can compete with TSMC and Samsung with their next-generation foundries, that would be awesome.
    When I look for information, it's all fairy tales. Could you cover what kinds of materials would be used for interconnects and interposers, and why they are challenging?

  • @Artoooooor
    @Artoooooor Рік тому

    It reminds me of the Pentium II or III "cartridge" processors. The reason was similar - yield issues from trying to put too much on a single chip.

  • @SianaGearz
    @SianaGearz 2 роки тому

    Mhm why isn't L3 a separate unit or unit group integrated with the memory controller?
    Also why would MCM be a curse word?

  • @accessiblenow
    @accessiblenow 2 роки тому

    Good review.

  • @Rugged-Mongol
    @Rugged-Mongol 2 роки тому

    12:03 - "Pana-see-ya" bud.

  • @HanSolo__
    @HanSolo__ Рік тому

    4:25 Dr Lisa Su is wearing a 1.5ct diamond.

  • @annieshedden1245
    @annieshedden1245 2 роки тому +1

    no mention of IBM mainframe MCMs

    • @thewheelieguy
      @thewheelieguy 2 роки тому

      He specifically showed the big perforated copper blocks with chips embedded and called them MCMs. I don't recall if he mentioned the letters I, B, and M but they're in there.

  • @blanamaxima
    @blanamaxima 2 роки тому

    I work in embedded , no such issues there :) You would have to be mad to go chiplets there. I see the point for high end designs if you want to have some scalability and lower cost with some performance drawbacks.

    • @monad_tcp
      @monad_tcp 2 роки тому

      Isn't going chiplet when you have more than 1 MCU in a PCB, lol.

  • @thatguy7595
    @thatguy7595 2 роки тому

    Why do you say the first chiplet product was EPYC, when AMD released desktop Ryzen parts earlier than the server ones?

    • @TrueThanny
      @TrueThanny 2 роки тому

      Because they only had one chip. First-gen Zen was monolithic. Everything was on the one die, which had two memory controllers. The consumer platform got one chip, giving it up to eight cores and two memory channels. EPYC got four chips, which is up to 32 cores and eight memory channels. Later, Threadripper got two chips, for up to 16 cores and four memory channels.
      With Zen 2, all the I/O, including memory channels, moved into a separate die. That allowed an arbitrary number of cores for any platform, only really limited by space on the package.

  • @bslay4r
    @bslay4r 2 роки тому

    Some people here in the comments confuse chiplets with MCM. They're not the same.
    In an MCM design, a single chip can work by itself.
    In a chiplet design, a single chip can't work by itself without an additional chip. In the case of AMD, one chip (the CCD) contains the CPU cores and another chip contains the I/O interfaces and connections (the I/O die). Neither works without the other.
    So a 386 motherboard or an IBM mainframe from 1965 is not based on chiplets.
    One thing is missing from this video: chiplets improve not just yields but also binning. Smaller chips can reach higher clock speeds.

  • @Hansengineering
    @Hansengineering 2 роки тому +3

    lol at 1:10 - the person in the clean room on top of the machine has a fall-protection harness. However, the distance they have to fall is so short, it won't activate before they hit something.

    • @ShienChannel
      @ShienChannel 2 роки тому

      The harness isn't there to protect the worker from the machine; it's there to protect the machine from the worker.

  • @deilusi
    @deilusi 2 роки тому +4

    I recommend checking caly-technologies' die-yield-calculator.
    I think the numbers given here (8:18) are for small chips on a mature process. With a new, unrefined process, big chips like EPYC would have a yield of 1-5 dies per 300mm wafer, and wafers cost ~$10,000 each (yield in the 10% range), which would push CPU cost way above $500,000 each once you pay for every step of the process.
    The same wafer with a chiplet config drops 840 working chiplets, plus maybe 10 partially working ones, which can be configured into 850 working Ryzens, 400 32-core Threadrippers, or 200/100 EPYCs (assuming perfect interconnecting; knock off 5% if you want to be realistic). I/O dies on an older process are easy to get and WAY cheaper.
    Since TSMC and the other foundries always run at full load, with companies fighting over every wafer, AMD can make 10x as many chiplet-based CPUs as monolithic ones with the number of wafers they get from TSMC.
    AMD is already shipping more than would be possible without splitting into smaller pieces, and if Intel had gone chiplet despite their 10nm issues, they would have reached good yields a few years sooner.
    I know it's extra complexity, extra time, and an extra problem layer, but we're already past the point where a monolith is physically possible, and every manufacturer works around that in its own way.
    NVIDIA just absorbs any defects by disabling 1 or 2 of its ~70 cores outright and advertising the product with fewer cores than the die physically has, so it still ships as a working product.
    I like AMD's way - divide and conquer is what IT has found to be the best solution for a wide variety of problems...
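    To reproduce the flavour of those numbers, here is a minimal Python sketch using a Poisson yield model (Y = exp(-D*A)) and the standard gross-die-per-wafer estimate. The ~$10,000 wafer cost comes from the comment above; the die areas and defect densities are illustrative assumptions, not TSMC figures, and a real tool like the calculator mentioned above uses more refined models (Murphy, negative binomial, etc.):
        import math

        def gross_dies(wafer_diameter_mm, die_area_mm2):
            # Standard geometric estimate of candidate dies on a round wafer.
            r = wafer_diameter_mm / 2
            return int(math.pi * r**2 / die_area_mm2
                       - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

        def poisson_yield(die_area_mm2, defects_per_cm2):
            # Fraction of dies with zero defects: Y = exp(-D * A), area converted to cm^2.
            return math.exp(-defects_per_cm2 * die_area_mm2 / 100)

        WAFER_COST = 10_000  # USD per 300mm wafer, figure taken from the comment above
        dies = {"~750 mm^2 hypothetical monolithic die": 750.0,
                "~74 mm^2 8-core chiplet": 74.0}
        for name, area in dies.items():
            for d0 in (0.1, 0.5):  # defects/cm^2: mature vs. immature process (assumed)
                n = gross_dies(300, area)
                good = n * poisson_yield(area, d0)
                print(f"{name:40s} D0={d0:3.1f}  gross={n:4d}  good~{good:6.1f}"
                      f"  cost per good die ~ ${WAFER_COST / good:,.0f}")
    The point survives even with rough numbers: at high defect density the big die loses most of its candidates to defects, while the small chiplet barely notices.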

  • @andersjjensen
    @andersjjensen 2 роки тому +7

    And AMD's latest chiplet innovation is 3D V-Cache, which stacks additional cache on top of the base chiplet.

    • @abaj006
      @abaj006 2 роки тому +2

      Man AMD should just stop gluing together all these parts. Give the competition time to copy their designs first... ohh wait, Intel just patented the AMD Ryzen architecture. Never mind.

    • @anarsosoroo2891
      @anarsosoroo2891 2 роки тому +1

      @@abaj006 what? What do you mean?

    • @drbali
      @drbali 2 роки тому +3

      @Anar he is spreading fake news that he didn't understand or read fully

    • @anarsosoroo2891
      @anarsosoroo2891 2 роки тому

      @@drbali ohh, i was very confused at first xD

    • @andersjjensen
      @andersjjensen 2 роки тому +3

      @@anarsosoroo2891 He, humorously, said: "Intel used to dunk on AMD by calling chiplets 'glued-together cores', but Intel recently published a specification for a supposedly industry-wide, cross-vendor approach to mixing and matching different accelerator chiplets into a coherent package. It 'incidentally' happens to be perfectly compatible with AMD's Infinity Fabric."
      But @J Bali is also right, as AMD (and a few specific AMD employees) is actually mentioned in the white paper. So AMD was in on it right from the inception. Hence Intel didn't just "patent Ryzen".

  • @himanshusingh5214
    @himanshusingh5214 2 роки тому

    Thank you for telling me this.

  • @JJC007
    @JJC007 Рік тому

    You didn't mention anything about how packaging technology suddenly enables the integration of multi-chiplet modules without sacrificing performance. And what's the difference between an SiP and a chiplet structure?

  • @RoodeMenon
    @RoodeMenon 2 роки тому

    I don't see Chiclets anywhere these days. Are they banned? I miss them.

  • @yash_kambli
    @yash_kambli 2 роки тому +1

    Why haven't we seen any chiplets in smartphones yet, instead of SoCs? Are they really that power hungry, or is there some other issue?

    • @Random-Stranger
      @Random-Stranger 2 роки тому

      You're likely going to see it more in mobile gaming going forward. For example, the chip in the Steam Deck is a 4-core APU based on AMD's Zen 2 microarchitecture paired with 8 RDNA 2 compute units.
      As for phones, battery technology probably hasn't caught up enough yet for phone OEMs to look beyond the SoCs they're currently using.

  • @pinkipromise
    @pinkipromise 2 роки тому

    12:03 pana key not pana cia

  • @BleughBleugh
    @BleughBleugh 2 роки тому

    Thanks for this

  • @harryniedecken5321
    @harryniedecken5321 11 місяців тому

    In 1983, I was at Intel and we were bonding together multiple 16K DRAMs in a package for IBM.

  • @Graeme_Lastname
    @Graeme_Lastname Рік тому

    Voice synth needs a bit of a touch up. 🙂

  • @axl1002
    @axl1002 2 роки тому

    Also, the first dual-core Pentium 4 and the Q6600 were two dies glued together :)

  • @dgillies5420
    @dgillies5420 Рік тому +1

    Mobile phone sales per year are 1.2 - 1.3 BILLION, not "in the millions each year".

  • @edwardcasati3374
    @edwardcasati3374 2 роки тому

    Your link to the Podcast is broken.

  • @AlexanderSylchuk
    @AlexanderSylchuk 2 роки тому +2

    I remember being obsessed with HyperTransport technology. What always bothered me was the idea that, theoretically, we could unite all the best solutions from competitors to make one "ultimate" product. At the same time, the prospect of losing to the competition (or at least the threat of it) was the major reason a company would develop its "killer feature" to stay afloat in the market or to outcompete its rivals. On the other hand, there is no such thing as an ultimate product with all the cutting-edge technology in the real world, since every product is just a means of fulfilling a certain need or completing a certain task. For each need or task, the best solution might not require the best-of-the-best or "ultimate" tech in every regard.

    • @scottfranco1962
      @scottfranco1962 2 роки тому +1

      It kinda happens already. AMD and Intel engineers can practically throw forks at each other during lunch...

    • @AlexanderSylchuk
      @AlexanderSylchuk 2 роки тому

      @@scottfranco1962 My dilemma stems from the assumption that technological advancement comes either from competition or from some extreme external limitation. It seems logical to me that in the future we will compete directly on technology (at every level of the process) rather than on products. I might simply be wrong, but isn't it the limitations and weak points of your product that make you develop new technologies to overcompensate for your weaknesses?

    • @scottfranco1962
      @scottfranco1962 2 роки тому +1

      @@AlexanderSylchuk Oh, you are begging for my favorite story. I interviewed with a company that made precision flow valves. These were mechanical nightmares of high precision that accurately measured things like gas flow in chemical processes. This is like half the chemical industry (did you know a lot of chemical processes use natural gas as their feedstock?). Anyways, what has that got to do with this poor programmer? Well, like most industries they were computerizing. They had a new product that used a "bang bang" valve run by a microprocessor. A bang bang valve is a short piston driven by a solenoid: when not energized, it is retracted by a spring and opens an intake port that lets a small amount of gas into a chamber. Then the solenoid energizes, pushing the piston up and the gas out another port. Each time the solenoid activates, a small amount of gas is moved along. Hence the "bang bang" part. If you want to find one in your house, look at your refrigerator. It's how the Freon compressor in it works.
      Ok, well, that amount of gas is not very accurately measured no matter how carefully you machine the mechanism. But it turns out to be "self accurate"; that is, whatever the amount of gas moved IS, it is always the same. The company, which had gotten quite rich selling their precision valves, figured they could produce a much cheaper unit that used the bang bang valve. So they ginned it up, put a compensation table in it so the microprocessor could convert gas flows to bang bang counts, and voila - here's the product! It worked. Time to present it to the CEO! The CEO asks the engineers, "Just how accurate is it?" Engineer says:
      Well... actually it is more accurate than our precision valves. And far cheaper.
      The story as told to me didn't include just how many drinks the CEO needed that night.
      So the CEO, realizing that he had seen the future, immediately set into motion a plan to obsolete their old, expensive units and make the newer, more accurate, and cheaper computerized gas flow valves.
      Ha ha, just kidding. He told the engineers to program the damn thing to be less accurate so that it wouldn't touch their existing business.
      Now they didn't hire me. Actually, long story: they gave me a personality test that started with something like "did you love your mother". I told them exactly where, in what direction, and with how much force they could put their test, and walked out.
      I didn't follow up on what happened, mainly because I find gas flow mechanics to be slightly less interesting than processing tax returns. But I think if I went back there, I would find a smoking hole where the company used to be.
      And that is the (very much overly long) answer to your well-meaning response.

    • @AlexanderSylchuk
      @AlexanderSylchuk 2 роки тому

      @@scottfranco1962 Great story! For some reason I imagined something like the water valve from a washing machine as a "bang bang" valve. To me, fridge compressors work more like a one-piston combustion engine, but they don't make distinct sounds because they run continuously. Maybe with a larger piston that stops at every cycle they would produce a "bang-bang" sound. It actually reminded me of "Zero to One" by Peter Thiel, and your story touches on my main concern about that book: when you outcompete your market, you will find it a lot harder to disrupt your own business with new technology. It's just like it was in the Soviet Union - the only tech that kept improving was military, and only because of competition with the West.

  • @ablazedguy
    @ablazedguy 2 роки тому

    A CPU you get for your computer is not an SoC, not even the chiplet-based Ryzens. An SoC includes not just the CPU and its I/O, but also at least a GPU, storage, and memory. Phones and consoles have SoCs, but personal computers don't.

  • @zunriya
    @zunriya 2 роки тому

    It would be better to explain SerDes, NRZ, or PAM4 - the signaling used for data transfer between chiplets in chiplet technology.
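    For anyone unfamiliar with the terms: a SerDes serializes parallel on-die data onto a few very fast lanes, NRZ signals one bit per symbol (two voltage levels), and PAM4 signals two bits per symbol (four levels), doubling throughput at the same symbol rate at the cost of noise margin. A toy Python sketch of just the bit-to-symbol mapping (not a model of any real chiplet link):
        def nrz(bits):
            # 2 levels, 1 bit per symbol
            return [-1 if b == 0 else +1 for b in bits]

        def pam4(bits):
            # 4 levels, 2 bits per symbol, Gray-coded so adjacent levels differ by one bit
            gray = {(0, 0): -3, (0, 1): -1, (1, 1): +1, (1, 0): +3}
            return [gray[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

        data = [1, 0, 1, 1, 0, 0, 1, 0]
        print("NRZ :", nrz(data))   # 8 symbols for 8 bits
        print("PAM4:", pam4(data))  # 4 symbols for the same 8 bits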

  • @Steven_Edwards
    @Steven_Edwards Рік тому

    The Athlon Tri-Core was just enterprise quad-core chips where one core failed to yield due to manufacturing defects, so they turned off the defective core in microcode, sold it to consumers as the Tri-Core, and we ate it up.
    Also, because the disabled cores were sometimes still semi-usable, we figured out ways to turn them back on.

  • @tuy60
    @tuy60 2 роки тому

    In the 1980s AMD had the Am2900 bit-slice microprocessor family. I don't know if anyone actually used these, but they were an interesting idea - a multi-chip processor. Improvements in chip manufacturing made them redundant.

    • @thewheelieguy
      @thewheelieguy 2 роки тому

      I can remember reading the data books on those when I was in my undergrad. The only industry applications I can think of were in very high speed signal processing, before the advent of integrated DSPs.

  • @dwaynezilla
    @dwaynezilla 2 роки тому

    Intel, 2017: "We'Re GoNnA gLuE ChIpS ToGethER"
    Intel, 2022: "They're called _Tiles"_

  • @GoogleUser-ee8ro
    @GoogleUser-ee8ro 2 роки тому

    What kinds of chips/applications are not suitable for chiplets?

  • @matchedimpedance
    @matchedimpedance 2 роки тому

    Xilinx is pronounced ZI-links, not ZEE-links. And panacea is typically pronounced with emphasis on the first and third syllables, not the second syllable.
    Otherwise, good video.

  • @_exilon_
    @_exilon_ 2 роки тому

    They worked because, despite all of the performance disadvantages of chiplets, Intel spent half a decade relaunching Skylake on 14nm because of their initial 10nm failure. Sacrificing 10-20% of the power envelope and adding 10-20ns of memory latency to do hops over the package is fine if the competition is THAT stagnant. Had Ice Lake and Alder Lake launched in 2017 and 2018 as originally planned, this video would've been titled "Why AMD's Chiplets Failed".

  • @gamma_draconis9905
    @gamma_draconis9905 2 роки тому

    I dunno why it never occurred to me that they would use the same chiplets across both server and desktop processors.

  • @josefsadiksson3642
    @josefsadiksson3642 2 роки тому +1

    AMD survived thanks to the economics of chiplets.
    Intel is behind AMD there, but will soon have better yields for server chips thanks to chiplets too.

  • @ktkace
    @ktkace 2 роки тому +1

    Remember Terminator 2: Judgment Day? Chiplets.

  • @Dr7-1
    @Dr7-1 2 роки тому

    What are the differences between Habana and Intel? In terms of passive data transfer, I mean.

  • @soapbar88
    @soapbar88 2 роки тому +1

    Help us AMD, you're our only hope

  • @aniksamiurrahman6365
    @aniksamiurrahman6365 2 роки тому

    Next up, someday: "Why Asianometry Videos Work"?

  • @jpthiran
    @jpthiran 2 роки тому

    excellent - very interesting