This Server CPU is so FAST it Boots without DDR5

  • Published 18 Dec 2024

COMMENTS • 374

  • @DigitalJedi
    @DigitalJedi 1 year ago +354

    I worked on this CPU! Specifically the bridge dies between the CPU tiles. I figured I'd share some fun facts about those CPU tiles here for you guys:
    Each CPU tile has 15 cores. Yes, 15. The room that the 16th would occupy is instead taken up by the combined memory controllers and HBM PHYs.
    There is not one continuous interposer. Instead, each CPU tile sits on top of EMIB "bridge" dies, as I call them. This strategy is more similar to Apple's than AMD's, or even Meteor Lake's, because Sapphire Rapids is so enormous that it exceeds the reticle limit of the machines that make normal interposers.
    There are 4 CPU tiles and 10 bridges. The tiles each have 5 connections: 3 on one edge and 2 on the neighboring edge. 2 of the tiles are mirror images of the other 2. You can get a diagonal pair by rotating one about the center axis 180 degrees, but the other 2 have to be mirrored to keep the connections in the right place.
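
The tile/bridge counts above are self-consistent; a quick sanity check (numbers taken from the comment, not from any Intel documentation):

```python
# Each of the 4 tiles exposes 5 EMIB links (3 on one edge, 2 on the
# neighboring edge), and every bridge die joins exactly 2 tiles, so
# the bridge count is (tiles * links per tile) / 2.
tiles = 4
links_per_tile = 3 + 2
bridges = tiles * links_per_tile // 2
print(bridges)  # -> 10, matching the "10 bridges" in the comment
```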

    • @ummerfarooq5383
      @ummerfarooq5383 1 year ago +15

      Can it play starfield

    • @marcogenovesi8570
      @marcogenovesi8570 1 year ago +6

      @@ummerfarooq5383 can Starfield play?

    • @DigitalJedi
      @DigitalJedi 1 year ago +53

      @@ummerfarooq5383 There is enough PCIE and RAM for 7 players to each have the P-cores of a 12900K and their own full bandwidth 4090.

    • @johnmijo
      @johnmijo 1 year ago +13

      @@DigitalJedi Thanks for your insight, always nice to see engineers talk about the work they do ;)
      I'm busy playing Starfield and porting it to my C128. Why? Because I think that Z-80 will work as a nice co-processor to the 8510 CPU, ha :p

    • @GeekProdigyGuy
      @GeekProdigyGuy 1 year ago +2

      Any special reason why there's an asymmetric 3+2 bridge layout instead of 3 on both edges?

  • @stefannilsson2406
    @stefannilsson2406 1 year ago +151

    I hope they evolve this and bring it to the workstation Xeons. I would love to have an unlocked Xeon with built-in memory.

    • @jondadon3741
      @jondadon3741 1 year ago

      Yo same

    • @stefannilsson2406
      @stefannilsson2406 1 year ago +19

      @@startrekkerll5635 What do you mean? You still have memory slots that you can put memory in...

  • @L0S7N01S3Deus
    @L0S7N01S3Deus 1 year ago +41

    Considering the new AMX instructions and all the bandwidth afforded by HBM, it would be very interesting to see benchmarks for AI tasks, like running Stable Diffusion or Llama models. How would they stack up against GPUs performance-wise, or power- and cost-efficiency-wise? It would be very relevant in the current datacenter GPU shortage!

  • @Mr76Pontiac
    @Mr76Pontiac 1 year ago +16

    One of the nice things about "Serve the HOME" (emphasis on HOME) is that we get a glimpse of what we'll be running in our homes as low-end servers in 30 years....
    I'm 5 minutes in and I can't imagine the cost of these things when they come to market, not to mention the cost of the REST of the hardware.

  • @maxhammick948
    @maxhammick948 1 year ago +39

    Without the RAM slots taking up width, you could pack a HBM-only server incredibly dense - maybe 3 dual socket modules across a 19" rack? Not many data centres could handle that power density, but it would be pretty neat to see

    • @RENO_K
      @RENO_K 1 year ago +5

      💀💀 the cooling on that bad boy is gonna be insane

    • @sanskar9679
      @sanskar9679 9 months ago

      @@RENO_K With 3M's liquid that boils at almost 50 °C you could maybe pack almost a thousand per rack

  • @shammyh
    @shammyh 1 year ago +17

    Great content Patrick!! Been waiting to hear about these for a while... And you always get the cool stuff first. 😉

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 1 year ago +5

      This one took a long time. Partly due to the complexity but also moving STH to Scottsdale and doing like 40 flights over the summer. I was hoping to get this live before Taiwan last week.

  • @Gastell0
    @Gastell0 1 year ago +17

    Damn, that localized memory is incredible for SQL instance/shard, web server cache and so much more.
    HBM memory runs at lower wattage than DDR memory, with significantly higher bus width and lower frequency required to achieve high bandwidth (afaik).
    p.s. Didn't show the bottom of it even once =\
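
The width-vs-frequency trade-off described above comes straight from the peak-bandwidth formula. A sketch with typical per-pin rates (illustrative figures, not measurements of this CPU):

```python
# Peak DRAM bandwidth: (bus width in bits / 8) * transfer rate in GT/s.
def bandwidth_gbps(bus_width_bits: int, gigatransfers_per_s: float) -> float:
    """Peak interface bandwidth in GB/s."""
    return bus_width_bits / 8 * gigatransfers_per_s

# One HBM2e stack: a very wide 1024-bit bus at a modest 3.2 GT/s per pin.
hbm_stack = bandwidth_gbps(1024, 3.2)    # 409.6 GB/s per stack
# One DDR5-4800 channel: a narrow 64-bit bus at a much higher 4.8 GT/s.
ddr5_channel = bandwidth_gbps(64, 4.8)   # 38.4 GB/s per channel
print(hbm_stack, ddr5_channel)
```

The wide bus is exactly why HBM can run slower (and cooler) per pin and still come out an order of magnitude ahead.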

    • @aarrondias9950
      @aarrondias9950 1 year ago

      Bottom of what?

    • @Gastell0
      @Gastell0 1 year ago

      @@aarrondias9950 the cpu module/pcb

    • @aarrondias9950
      @aarrondias9950 1 year ago

      @@Gastell0 1:01

    • @Gastell0
      @Gastell0 1 year ago

      @@aarrondias9950 Ooh, that was in the introduction; I looked everywhere but there, thanks!

  • @Strykenine
    @Strykenine 1 year ago +8

    Love a good datacenter CPU discussion!

  • @BusAlexey
    @BusAlexey 1 year ago +4

    Yes! Waited long time for this monster cpu

  • @sehichanders7020
    @sehichanders7020 1 year ago +18

    8:53 I always figured HBM was the endgame for the entire Optane thing. Too bad it never really panned out, since it had mad potential and could have changed how we think about, for example, database servers altogether. Intel is sometimes so far ahead of themselves that even they can't catch up (and then something like Arc happens 🤦‍♀)

    • @TheExard3k
      @TheExard3k 1 year ago

      HBM gets wiped on power loss like any other memory. It has nothing to do with Optane and persistent memory

    • @sehichanders7020
      @sehichanders7020 1 year ago +1

      @@TheExard3k It's not about persistence. But when your persistent storage is as fast and low-latency as Optane was supposed to be, you can get away with much, much smaller memory pools, and hence you can use faster HBM.
      The entire promise behind Optane was that it is so fast (especially IOPS-wise) that you don't need to keep your entire application's data in memory.

    • @noth606
      @noth606 5 months ago

      @@sehichanders7020 Well, it is a bit higher-level in a sense. Tiered pipelining is not _either_ HBM _or_ Optane; used correctly, both are a boost, just at different levels. Optane has been used as a lower RAM tier and HBM here as a higher tier, so a CPU looking for data would go L1, L2, HBM, RAM, Optane, then mass storage if it needs to go that far, with each tier progressively slower and larger. It would most likely work very well, just be hella expensive and a bit of a bear to manage on the back end. But it would boost performance a lot for almost all data-intensive applications, since you'd never lose more clocks than absolutely necessary to get the needed data, so the CPU would spend much less time waiting than it does now. As it stands, if the data is not in L2 or L3 you take an immediate hit of 10+ clocks to check RAM, and if you draw the short straw you have to go fish in mass storage, SOL.

  • @chaosfenix
    @chaosfenix 1 year ago +13

    I hope this is something that filters down to consumer parts. Especially for APUs with integrated graphics, we are pretty clearly getting to the point where they are limited by memory bandwidth. The Z1 Extreme with 8 CPU cores and 12 GPU cores is only about 5-30% faster than the Z1 with only 6 CPU cores and 4 GPU cores. These two chips are meant to operate within the same power limits and run the same architectures. Given all that, you would think that something with 3x as many GPU cores would be much faster, but that just isn't the case, and my guess is that it is probably due to memory bandwidth. GPUs are bandwidth-hungry, and there is a reason GPUs pack their own specialized memory. I wonder if combining this with an APU couldn't let that iGPU stretch its legs to its full potential. Here is hoping.

    • @ummerfarooq5383
      @ummerfarooq5383 1 year ago

      I want someone to run Starfield on it just for show. Of course, let the CPU be overclocked to 5GHz

    • @chriswright8074
      @chriswright8074 1 year ago +1

      AMD Instinct

    • @DigitalJedi
      @DigitalJedi 1 year ago +12

      The issue is that HBM is very expensive, and doing HBM right means a pretty much ground-up design for your chip, not only to fit in the PHYs for the kilobit+ bus, but also the different controllers, and possibly dual controllers if you still want DDR5 options.
      I've worked with HBM, and when you get to that class of connection density, you need to spend the big bucks on a silicon interposer. Radeon Fiji did this; Vega, the VII, and the Titan V also come to mind. That is a whole massive die you need to make and then stack at least 2 other dies on top of.
      An HBM APU sounds awesome, I agree; we even saw a glimmer of it with the i7-8809G, which had a 24-CU Vega M GH GPU and 4GB of HBM. The more practical approach for right now, though, would be something with a dedicated GDDR controller; even just 128-bit 8GB would be plenty, as that is already around 288GB/s of bandwidth you aren't fighting the CPU over.

    • @-szega
      @-szega 1 year ago +2

      Meteor Lake has hundreds of megs of L4 cache in the interposer, presumably mostly for the iGPU and as a low-power framebuffer (somewhat like the M1).

    • @chaosfenix
      @chaosfenix 1 year ago +2

      @@DigitalJedi Yeah, I know there are definite issues. HBM has a 4096-bit bus, which is gigantic compared to anything else and is why you need the complex interposer. Intel's EMIB looks interesting and may help in that respect, but we will have to see. Personally, I would not offer the option of additional DDR5; this would replace it. Many systems already use soldered memory, so this would simply be an extension of that. I would dare say 90% of consumers don't bother upgrading the RAM in their computers anyway, so if it is balanced properly it wouldn't be much of an issue.

  • @BlackEpyon
    @BlackEpyon 1 year ago +7

    Some of us remember when CPUs had L2 cache external to the CPU. Then the Slot 1 had the cache integrated onto the same card as the CPU, and when the Pentium III came out, L2 cache was completely internal to the CPU die. I don't see external RAM going away any time soon, just because of how useful it can be to just add more RAM, but this seems to be following the same evolution, and the performance it brought. Perhaps one day we'll see internal RAM on consumer CPUs as well!

    • @RENO_K
      @RENO_K 1 year ago

      That's seriously cool

    • @fangzhou3235
      @fangzhou3235 1 year ago +1

      No, the original Pentium III (0.25µm Katmai) did not have on-die L2. That only came with the 0.18µm Coppermine version, which was super cool. The 500MHz Coppermine could OC to 666MHz without breaking a sweat.

    • @maxniederman9411
      @maxniederman9411 10 months ago +1

      Ever heard of M-series macs?

  • @edplat2367
    @edplat2367 1 year ago +8

    I can't wait for 5-10 years from now, when we'll see this come to high-end gaming machines.

    • @Alex-wg1mb
      @Alex-wg1mb 4 months ago

      Or to buy used Xeon Max chips from AliExpress, with specially crafted motherboards for midi-tower cases

  • @thatLion01
    @thatLion01 1 year ago +1

    Amazing content. Thank you intel for sponsoring this.

  • @gsuberland
    @gsuberland 1 year ago +14

    On the topic of 1000W power draw, I believe these are the same CPU power delivery topology that Intel showed a while back during some of the lab tours (e.g. I believe one of der8auer's videos in the extreme OC labs showed this off), where you have a relatively small number of VRM phases on the motherboard providing an intermediate package voltage, followed by a massive number of on-die power stages (100+) parallelised into a huge segmented polyphase buck converter, which helps reduce ohmic losses and PDN impedance by moving the regulation closer to the point of load on the die. The combined continuous output current of the on-package converters appears to be 1023A, logically limited by the number of bits in the relevant power management control register. This kind of current delivery would be unworkable with a traditional VRM, but since the phases are physically distributed around the package the average current density is heavily reduced.
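
The 1023A ceiling described above is exactly what a 10-bit unsigned control field gives you; a one-liner to confirm (the 1A-per-count scaling is my assumption):

```python
# An unsigned n-bit register holds values 0 .. 2**n - 1, so a 10-bit
# current field with 1A resolution tops out at 1023A.
field_bits = 10
max_amps = 2**field_bits - 1
print(max_amps)  # -> 1023
```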

  • @hermanwooster8944
    @hermanwooster8944 1 year ago +4

    I remember you telling me this episode was coming a few weeks ago! The idea of memory-on-a-chip would be sweet for the consumer audience. It was worth the wait. :)

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 1 year ago +2

      Took a little longer than expected because of a trip to Taiwan. I hope you have a great week

    • @BlackEpyon
      @BlackEpyon 1 year ago

      Similar to how L2 cache used to be external to the CPU, then moved adjacent to the CPU with the Slot 1 and Slot A, and then moved completely internal to the CPU die, gaining performance with each evolution.

  • @cy5911
    @cy5911 1 year ago +7

    Can't wait to buy these 5 years from now and use it for my homelab 🤣

  • @Superkuh2
    @Superkuh2 1 year ago +10

    64GB is kind of small for any AI workload that would take advantage of the memory bandwidth.
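
For scale, a rough lower bound on model memory is parameters × bytes per parameter (weights only; activations and KV cache come on top):

```python
# Weights-only footprint in GB for a dense model: one billion
# parameters at N bytes each is about N GB.
def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    return params_billions * bytes_per_param

print(weights_gb(7, 2))   # fp16 7B model:  ~14 GB  -> fits in 64GB of HBM
print(weights_gb(70, 2))  # fp16 70B model: ~140 GB -> does not fit
```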

    • @GeekProdigyGuy
      @GeekProdigyGuy 1 year ago +2

      Compare it to GPU VRAM: sure, top-of-the-line GPUs have slightly more, but the H100 is pretty much the industry standard and has 80GB. Considering CPUs are definitely going to have way lower throughput than GPUs, it doesn't seem like capacity would be the issue.

    • @ThelemaHQ
      @ThelemaHQ 1 year ago

      It's HBM2e, which also works like VRAM; it's super fast. BTW, my Tesla P40 24GB with GDDR5 gets 2.50 s in Stable Diffusion, while the P100 16GB with HBM gets 0.8-1.5 s. Now imagine I use double P100s

    • @Superkuh2
      @Superkuh2 1 year ago

      @@ThelemaHQ Stable Diffusion isn't really memory-bandwidth limited. Things like, say, transformer-based large language models are.

  • @CobsTech
    @CobsTech 1 year ago +11

    While I work with virtualisation a lot, as opposed to specific high-performance workloads, this has always raised a question for me, even when playing around with a legacy Xeon Phi 5110P coprocessor: how would a chip like this handle memory failure? Nowadays, whenever we have memory failure, ECC kicks in as a first resort, and then you have options such as memory mirroring so your workloads can continue with a reduced amount of available memory.
    How would a chip like this handle it if, say, one of the HBM packages was defective or outright didn't work? Does the BIOS of the system have any form of mirroring? Considering this is four separate packages working as one, would this prevent the chip from booting at all?
    Great coverage though, always fun to see what new products in the HPC sector bring to the table.

    • @skunch
      @skunch 1 year ago +2

      If the memory fails, throw it out. This is the way now: integration of core components at the sacrifice of modularity and repairability

    • @autohmae
      @autohmae 1 year ago +1

      I don't know if this system supports it, but CPU hotplugging exists. Maybe the least useful way to do it, but that would be 1 way

  • @EyesOfByes
    @EyesOfByes 1 year ago +2

    8:52 My thought is: why didn't Apple try to acquire the Optane IP and patents? Then we wouldn't have to worry about write endurance, and we'd also get an even lower-latency SoC in combination with the massive amount of L2 cache Apple has

    • @uncrunch398
      @uncrunch398 1 year ago +2

      Optane drives have failed from write endurance being exceeded, when being used as DRAM extensions IIRC. Its best placement is as a large swap space, or as a cache for tiered storage, to preserve the endurance and power-on time of the other tiers. Intel stopped production and development and sold it off because it didn't sell well enough; the purchaser, IIRC, was a company primarily focused on memory. Enterprise and high-end prosumer SSDs serve sufficiently where it fits best, at a tiny fraction of the cost per capacity.

    • @Teluric2
      @Teluric2 1 year ago

      Because Apple knows they have no chance in the HPC biz; Apple rules where looks matter.

  • @MrHav1k
    @MrHav1k 1 year ago +3

    Good call out of the Intel Developer Cloud there at the end. It's so important to try these kinds of systems out to see if you'll even benefit from these features before you go out and drop a massive bag of $$$ on procuring one.

    • @magfal
      @magfal 1 year ago +2

      Does AMD have a similar service?
      I've been wondering about the benefits of buckets of L3 cache.

    • @MrHav1k
      @MrHav1k 1 year ago +1

      @@magfal AMD doesn’t offer anything like the IDC to my knowledge. Just another edge Intel’s size and resources can deliver.

    • @shanent5793
      @shanent5793 1 year ago

      ​@@magfal Supermicro has their Jumpstart remote access, they can lend you an AMD server. Bergamo was even available pre-release

  • @GeoffSeeley
    @GeoffSeeley 1 year ago +6

    @2:23 Ah, so Intel isn't above "gluing" together chips like AMD eh? Ya Intel, we remember.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 1 year ago +4

      You know I was sitting in the front row when that presentation was given in Oregon back in 2017

    • @billymania11
      @billymania11 1 year ago

      @@ServeTheHomeVideo Kind of a long time ago. Things can change in that length of time, right Patrick?

  • @Jibs-HappyDesigns-990
    @Jibs-HappyDesigns-990 1 year ago +1

    thanks 4 the tech vid Patrick!! wowee 4 Intel Xeon Max!! gotta get a few!! giddy up!!

  • @shiba7651
    @shiba7651 1 year ago +14

    Pfff the cpu in my server is so fast it boots with ddr3

  • @waldmensch2010
    @waldmensch2010 1 year ago

    I tested Xeon Max a few months ago for KVM/VMware and it did not perform well. It is only useful for HPC. Nice video

  • @MYNAME_ABC
    @MYNAME_ABC 1 year ago +2

    What are the Cinebench results, single and multi? That is all that counts at the end of the day....

  • @georgeindestructible
    @georgeindestructible 1 year ago

    The ventilation in these looks great.

  • @ytmadpoo
    @ytmadpoo 1 year ago +2

    I'm wondering how it would do running Prime95. With multiple cores per worker, it can hammer the memory pretty hard so the throughput of HBM should significantly boost the per-iteration speed, assuming the clock rates of the cores are decent. Tuning the worker threads to stick with the NUMA nodes would give the ideal performance (4 worker threads, each using all 14 cores on the same NUMA node). We did some similar tests way back when on a Xeon Phi and it was pretty decent although the HBM on there was much smaller so it still had to go out to "regular" memory quite often which slows things down. I've found that going over regular DDR4, it only takes a couple of cores in a worker to saturate the memory bus, although you do still get marginal improvements as you add cores. By the time I got above 10-12 cores per worker though, you can actually see a degradation as the individual cores are just sitting there waiting for RAM so the overhead can make iteration times drop.
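
The layout the comment describes (4 worker threads, one per HBM NUMA node, 14 cores each) can be sketched as a core-to-worker map. The sequential core numbering is an assumption; real core IDs come from the OS on the target box:

```python
# One Prime95-style worker per NUMA node, each pinned to that node's
# 14 cores: 4 nodes * 14 cores = 56 cores total.
nodes, cores_per_node = 4, 14
workers = {
    node: list(range(node * cores_per_node, (node + 1) * cores_per_node))
    for node in range(nodes)
}
print(workers[0])   # cores 0-13 for the worker on node 0
print(workers[3])   # cores 42-55 for the worker on node 3
```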

  • @benedicteich8697
    @benedicteich8697 6 months ago

    Just got my hands on an ES version. Can't wait to run it.

  • @gheffz
    @gheffz 1 year ago +1

    Thanks!! Subscribed, All.

  • @OVERKILL_PINBALL
    @OVERKILL_PINBALL 1 year ago +5

    Interesting CPU for sure. All about finding the best use case. I was thinking this CPU might also be used to drive faster networking if it is using the HBM memory. Not sure if that was tested.

  • @jmd1743
    @jmd1743 1 year ago +3

    Honestly, it feels like once AMD did their monster-sized CPU, everyone stopped caring about keeping things conventional, like how it only took one couple to get everyone dancing at the school dance.

  • @D.u.d.e.r
    @D.u.d.e.r 1 year ago

    Enterprise and personal chips will continue to be even more tightly integrated and they'll mimic more a motherboard than chips we see today (also with the size). Just check Cerebras chip... memory system is still way behind the compute.

  • @RR_360
    @RR_360 1 year ago +1

    I would love to have one of those old servers in your studio.

  • @EyesOfByes
    @EyesOfByes 1 year ago +3

    So, GDDR6X has higher latency than standard DDR5. How is HBM2e in this sense?

  • @ted_van_loon
    @ted_van_loon 1 year ago

    RAM in an APU would eventually also greatly reduce cost. HBM of course is expensive and such,
    but it might become normal to see APUs become the new general CPUs and become more like SoCs; essentially it allows adding many more features, and RAM in the CPU allows for much simpler and cheaper motherboards,
    meaning that RAM integration in low-end chips (with more normal memory modules) allows making super cheap and power-efficient chips.
    That said, despite HBM being much more expensive, it is great on these high-end systems. Actually, many years ago, when HBM and HBM2 were still cheap to make (cheap enough to be used in mid-tier gaming GPUs), I recommended doing essentially the same thing: using something like HBM directly in a CPU.

  • @Veptis
    @Veptis 1 year ago +2

    Isn't this also the kind of Xeon where you pay to "unlock" some of the accelerators and the frequency curve?
    Also, it's not really a workstation part, sadly. Intel is marketing their Xeons for workstations, while I want a GPU Max 1100 (PVC-56) as a workstation card. I have hopes for announcements next week. Intel is demoing it on the Intel Developer Cloud and I had a chance to try it.
    I believe my workstation will still get an i9-14900K with custom loop cooling (slight chance of a TEC)

  • @nobodhilikeshu4092
    @nobodhilikeshu4092 1 year ago

    My computer boots without DDR5 too. Nice to see they're starting to catch up. ;)

  • @whyjay9959
    @whyjay9959 1 year ago +2

    Do you think DIMMs could disappear in favor of embedded DRAM and CXL memory?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 1 year ago +1

      I think CXL memory in the PCIe Gen6 generation will have more bandwidth and be more interesting, but some applications will still like locally attached. More interesting is if there is optically attached memory.
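
A back-of-envelope for the Gen6 point above: PCIe 6.0 signals 64 GT/s per lane, so before protocol overhead an x16 link moves roughly:

```python
# Raw PCIe 6.0 x16 throughput, one direction, ignoring FLIT/protocol
# overhead: 64 GT/s * 16 lanes / 8 bits per byte.
gts_per_lane, lanes = 64, 16
raw_gb_per_s = gts_per_lane * lanes / 8
print(raw_gb_per_s)  # -> 128.0 GB/s, vs ~38.4 GB/s for one DDR5-4800 channel
```

That is why a Gen6-generation CXL device can plausibly compete with several channels of locally attached DRAM on bandwidth, if not on latency.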

  • @stevesloan6775
    @stevesloan6775 1 year ago +1

    I’m keen to see full high performance computers on die utilising a derivative of this tech.

  • @noth606
    @noth606 5 months ago

    "We are getting more bandwidth using on-carrier, direct-attached HBM than with off-carrier DDR5 somewhere on the board - crazy!" *** Ehm, no: if you DIDN'T, it would be crazy, and not just that, it would be a failed design that should just be binned and never released, since anyone buying it would be certifiable and should be put in a padded room for their own and others' safety...
    It is a cool design, following the roadmap/wishlist/path to glory that was laid out many years ago, of which Optane is also part. Intel has been at this quite some time; they have explained and reiterated it many times for those who stop and listen. They are building systems that allow for multiple tiers of storage, where you'd have L1 - L2 - HBM2 - RAM - Optane - SSD cache - SSD - HDD, sort of, all in one system where data is promoted or demoted between the tiers depending on how *close* to the actual CPU cores it needs to be, since closer = faster. It's like a guru and disciples in a sense, with the memory tiers as the disciples sitting in ever larger circles around the guru dispensing the wisdom and word of god. The closer one sits, the easier and clearer it is to hear and understand what the guru says.

  • @matsv201
    @matsv201 1 year ago

    I used to work on developing telecom servers that were ultra-efficient, just running on normal Intel i-series CPUs.
    The ones we had could go down to 10W for the whole board, with a full Intel Xeon CPU, if the memory was removed. With the memory they drew more like 40 watts. (This was quite a while back, around the Sandy Bridge era.)

  • @gl.72637
    @gl.72637 1 year ago +1

    Is this comparable to the Nvidia Grace ARM-based CPU with 144 cores that Linus Tech Tips showed 3 months back? Or is this just Intel trying to catch up? I would like to see a video comparing server against server.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo 1 year ago

      This has been in production and is being installed into the Aurora supercomputer which will likely be the #1 in the world in November. Grace Superchip you cannot buy yet (we covered it on the STH main site) despite the hype.

  • @PingPong-em5pg
    @PingPong-em5pg 1 year ago +1

    "HBM memory" resolves to "High Bandwidth Memory memory" ;)

  • @ted_van_loon
    @ted_van_loon 1 year ago

    The missing sleep states are probably an early-version problem, since it likely has to do with the memory needing constant power. In the future, with a motherboard that supports 2 separate CPU voltages at the same time (based on pin groups), or if the CPUs get some added logic, it should probably work.
    Of course, they might not have given it priority, since honestly a CPU like this makes the most sense in a server right now. While it is also great for video editing, 3D modeling, rendering and simulation, most such software likely doesn't support it well enough yet.
    And while good, well-maintained FOSS software like Blender might support it quite rapidly and quite well,
    many companies have shown themselves to be very slow and ignorant in adopting new tech, like Adobe (even though they seem to accept AI pretty well now), and the likes of SolidWorks, which still doesn't understand that modern computers have more than 1 CPU core.

  • @tomstech4390
    @tomstech4390 1 year ago +7

    Imagine if AMD started adding HBM2E or HBM3 (that Samsung connection they have) onto their Epyc, as well as the 1152MB of L3 cache and the 96 fast cores.

    • @IamBananas007
      @IamBananas007 1 year ago +4

      Mi300 APU

    • @tomstech4390
      @tomstech4390 1 year ago

      @@IamBananas007 24 cores, but yeah fair point. :D

    • @post-leftluddite
      @post-leftluddite 1 year ago +2

      Well, Phoronix published reviews of the newest AMD Epycs, including Bergamo, and they literally destroyed even the HBM version of the Sapphire Rapids chips... so apparently AMD doesn't need HBM

    • @VideogamesAsArt
      @VideogamesAsArt 1 year ago +1

      @@tomstech4390 Their MI300C has no GPU cores at all and is 96 Zen 4 cores with HBM, but it's unclear whether they will release it, since there might not be enough demand: their V-Cache already gives them a lot of memory on-die

  • @степанстепаненко-б1э

    You are saying words faster than this processor can handle. I wanted to see traditional tests of this processor in AIDA64, Cinebench, 3DMark

  • @davelowinger7056
    @davelowinger7056 1 year ago +1

    You know, I imagine the CPU of the future would be a CPU sandwich, with 4 to 64 FireWire ports. First the Northbridge, now system memory

  • @lamhkak47
    @lamhkak47 1 year ago

    Is it possible to apply such a design to a GPU? A bit like HBCC for AMD, but where you can install DIMM modules on the GPU to give extra RAM for various purposes, such as running large AI models, running heavily modded KSP, and trying novice shitty programs that leak memory for no reason.

  • @richfiles
    @richfiles 1 year ago

    I wish Apple would adopt this memory style for their Apple Silicon SoCs. No current Mac has upgradable memory. You buy the SoC configured with a memory capacity from the factory, and that's it... Sure would be nice to have off the factory floor fast RAM, and _user expandable_ memory expansion slots for future upgrades.
    I really am liking the direction Intel is going with these!

    • @billymania11
      @billymania11 1 year ago

      Everybody thinks Apple is being stingy or playing games with RAM, but memory of that type can't be slotted. Because of timing and signal propagation, LPDDR memory has to sit close to the CPU and be soldered, which in a way leads to HBM memory. I think that will happen, and Apple might do it in the consumer or Pro space at some point.

    • @richfiles
      @richfiles 1 year ago

      @@billymania11 What are you even talking about? Numerous laptops and desktops have slotted RAM. Your high-speed RAM remains factory-determined, as part of the SoC, and "slow" RAM can be slotted in at a later date by the user.
      Many computers have used fast/slow RAM configurations. Every modern computer already does this, to a degree, with cache. This is merely adding one more layer in between: SoC fast RAM, and slower socketed RAM.

    • @billymania11
      @billymania11 1 year ago

      @@richfiles Sure Rich, whatever you say.

    • @richfiles
      @richfiles 1 year ago

      @@billymania11 I am literally describing what is inside laptops _today..._ I work in a PC repair shop. I have been building and repairing computers most of my life; my first computer repair was in 1989. Look up how cache memory works. Computers have had different amounts of different-speed memory on and off die for decades. Most CPUs have at least 2 or 3 levels of cache memory, plus the external RAM accessed through the memory controller (also on-die with modern CPUs). Some computers (mostly long ago) had both fast and slow RAM, accessed directly by the CPU for the fast RAM and through a memory controller for the slow RAM. The Amiga did this. Even many modern PCs can do this: if you have a matched pair of faster RAM modules in a pair of DIMM sockets on one channel, and a slower matched pair in the DIMM sockets of a separate memory channel, many CPUs will be able to run each channel at its best speed. There is no reason you can't have a high-speed memory controller with some channels directed to on-SoC chiplet RAM (HBM or HBM-like), while _ALSO_ having some memory channels reserved for slower slotted RAM (either SODIMM or the newly developed CAMM socket). There is literally no reason a computer manufacturer can't do this, particularly in lower factory memory configurations, where less high-speed factory-installed chiplet SoC RAM is installed.
      You say "sure" like it's something unbelievable... I work on laptops every weekday. More have slotted RAM than don't, and some already solder some RAM on board and have a secondary slot for expansion. No reason you can't have some higher-speed RAM on the SoC, as configured from the factory, and use other memory channels for slower socketed RAM.
      I'd LOVE to have sockets in my Mac Studio, so I could add to the already present 32GB of high-speed RAM... But YES, Apple is being stingy, because they are profiting from people buying the RAM they _expect to use someday_ right now, while it's still expensive, rather than just buying the RAM they know they need to be high-speed and adding slower RAM later to alleviate usage for miscellaneous tasks, freeing up the high-speed RAM for more intensive work.

  • @jlficken
    @jlficken 1 year ago +1

    I love enterprise hardware!
    I'm still rocking E5-26XX V4 CPU's at home though 😞

  • @shanent5793
    @shanent5793 1 year ago +1

    What is so difficult about the integration that Intel does but AMD does not? Why is this harder to do than AMD Instinct HBM or Versal HBM? If HBM is used as cache how many sets does it support and how long does it take to search 16GB of cache for a hit?
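
On the "search 16GB of cache" part of the question above: a set-associative cache never scans its full capacity; the address selects one set, and only that set's ways are tag-compared in parallel. A sketch with assumed (not published) parameters:

```python
# With 64B lines and 16-way associativity, 16 GiB of cache breaks into
# capacity / (line_size * ways) sets; a lookup touches exactly one set,
# i.e. compares only 16 tags regardless of total capacity.
capacity = 16 * 2**30   # 16 GiB of HBM-backed cache
line_size = 64          # bytes per cache line (typical)
ways = 16               # associativity -- an assumption for illustration
sets = capacity // (line_size * ways)
print(sets)             # -> 16777216 sets; each lookup checks just one
```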

    • @lukas_ls
      @lukas_ls 1 year ago +1

      It’s "3D" Stacking, that makes it much more expensive. It’s similar to HBM Packaging (but still different) and not just a couple of Dies glued together on the same package. AMD could so it but they want lower costs.
      AMD uses these packaging techniques but not in Ryzen/EPYC CPUs

  • @scarecrow5848
    @scarecrow5848 1 year ago

    3:14 "thats why intel does it but AMD doesnt." Wrong, AMD started doing chiplets back in 2015 with the... uh... i forget the name of it. Ill edit the comment lol. It was one of their GPU's. And also starting in 2023 with their 7000 series GPU'S theyve gone back to doing chiplets. Still no HBM to replace VRAM entirely yet but its still a chiplet design for the core.

    • @billymania11
      @billymania11 Рік тому

      We have to give credit to AMD on this one, regarding chiplets. The danger though is the approach AMD chose to implement chiplets. Later designs like Intel's might be superior in a range of functions not initially considered. I do expect a pendulum swing in favor of Intel as their approach gets validated.

  • @shadowarez1337
    @shadowarez1337 Рік тому +1

    Hmmm, Nvidia should take a stack of that HBM2e for a new Shield console. And they are sort of hybridizing the next consumer CPU with on-die RAM like Apple did with the M1/M2 SoCs. Interesting times ahead. I could get a frequency-tuned Epyc with enough cores and cache to build out a nice fast NAS.

  • @matthiaslange392
    @matthiaslange392 Рік тому +1

    These Xeons will Serve The Home - all homes at once 😎
    But who needs this power? Usually the storage is the slowest part of a system, and you're better off investing in faster storage than in faster CPUs. Most of the time several cores are idling.
    But I'm sure there are some strange physics simulations as a use case... simulating earthquakes, weather, or nuclear fusion... or simply having the fastest Minecraft server of all 😉

  • @uncrunch398
    @uncrunch398 Рік тому +1

    No sleep states are needed for any platform, though they're preferred when running on battery. A workstation or gaming PC benefits from disabling them, except for power-choking unused cores to boost the heavily used ones, or when cooling is insufficient and sleep states help with that. Lacking them is no reason not to use the same CPU as in this video for those workloads. What is always relevant is performance per cost. Or just performance, if cost doesn't matter.

  • @SilverKnightPCs
    @SilverKnightPCs Рік тому +1

    I just don't understand where in the current marketplace it makes sense to buy Xeons. You can buy an AMD Epyc with double the core count, half the power consumption, and usually 3/4 the price.

    • @billymania11
      @billymania11 Рік тому

      Goes to show you there is more to these decisions than a PC benchmark. I can imagine it gets quite complex comparing all the features.

  • @ravnodinson
    @ravnodinson 7 місяців тому +1

    What kind of place would be using something like this and what would they be running on it? This kind of tech is fascinating to me and I don't even know what it's used for.

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  7 місяців тому +1

      Often supercomputer clusters. See the new Intel Aurora supercomputer as an example.

    • @ravnodinson
      @ravnodinson 7 місяців тому

      @@ServeTheHomeVideo It is amazing!! 2 billion billion calculations per second. One thing that interests me, mentioned as being done on Aurora, is doctors studying neurology and mapping out the brain's neurological pathways. What does the program running that even look like, and why does it need such mind-bending computational power? I know I'm in way over my head, but to me it's such awe-inspiring work.

  • @LaserFur
    @LaserFur Рік тому +1

    I wonder how long it will be before the system boots up with just the cache and then an ACPI message tells the OS when the main memory is online. This would help with the long DDR5 training time.

    • @bradley3549
      @bradley3549 Рік тому

      Something like that would be valuable in the consumer market I reckon. Servers are already notorious for long boot times so I don't think there is a lot of incentive at the moment to enable a fast boot.

  • @gusatvoschiavon
    @gusatvoschiavon Рік тому +2

    I would love to have an ARM CPU with HBM memory

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  Рік тому

      That is powering the former #1 supercomputer: www.servethehome.com/supercomputer-fugaku-by-fujitsu-and-riken-revealed-at-no-1/

  • @velo1337
    @velo1337 Рік тому +1

    are you doing a follow up with this cpu?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  Рік тому +1

      We will have a video with Xeon Max in it later this week.

    • @velo1337
      @velo1337 Рік тому

      @@ServeTheHomeVideo would be nice to get some superpi, cpuz, 7zip and geekbench benchmarks for the 9480

  • @aarcaneorg
    @aarcaneorg Рік тому

    I called it! Less than a year after I asked when we would be able to boot servers without even needing to add RAM right here on one of your videos, and here we are! Somebody saw my comment and made it happen!

    • @bradley3549
      @bradley3549 Рік тому +1

      Hate to burst your bubble, but the CPU design timeline is such that they would have been actively working on this CPU design for *years* prior to review samples being available.

    • @aarcaneorg
      @aarcaneorg Рік тому

      @@bradley3549 Be that as it may, a lot of things, like the ability to use the onboard cache as system RAM, are minor revisions that can be made in firmware or opcodes, the kind of tweaks that can happen at the end. The extra cache was planned for years. Booting from it was my idea.

    • @bradley3549
      @bradley3549 Рік тому

      @@aarcaneorg You're definitely not the first to think of integrating ram and CPU and then booting from it. That's been a feature of CPUs for a LONG time. Just not x86 CPUs. Sorry.

  • @exorsuschreudenschadenfreude
    @exorsuschreudenschadenfreude Рік тому +1

    sick bro

  • @Jerrec
    @Jerrec Рік тому +1

    HBM is the future. I wonder how long it takes until it reaches consumer CPUs. Though upgrading RAM wouldn't be possible then anymore.

    • @whyjay9959
      @whyjay9959 Рік тому +1

      CXL could allow upgrading RAM then.

    • @Jerrec
      @Jerrec Рік тому +1

      @@whyjay9959 HBM2 has a bandwidth of 420 GB/s. There is quite some way to go for PCIe to allow CXL RAM expansion at that speed.
      PCIe 7 x16 only manages 240 GB/s, and PCIe 7 isn't even out yet, while HBM3 is already beginning rollout in 2024 with a whopping 512 GB/s bandwidth.
      Even the latency on the bus would be way too high, even if the bandwidth were reached.
      With HBM, memory expansion dies out. CXL only helps for "slow" DDR5 and DDR6. The HBM standard even states that the RAM must sit on the processing logic die.

    • @whyjay9959
      @whyjay9959 Рік тому +1

      ​@@Jerrec I think you mean bytes? Found a chart showing 128 gigabytes per second for PCIe gen6 x16. But sure, it's all a tradeoff. CPU-integrated chiplets get inherent performance advantages from having the shortest simplest connections but cannot be changed, so they will probably continue to be combined with slower, more flexible types of memory as preferred.

    • @Jerrec
      @Jerrec Рік тому

      @@whyjay9959 I don't get your point. You are right about PCIe 6 x16 and 128 gigabytes; I wrote about PCIe 7, which comes out in 2025.
      You are right, I meant bytes. Sorry about that. If I can correct it, I will.
      Anyway, not considering HBM3 in 2025, it means HBM2 runs at 25% or maybe 50% speed. That's not a tradeoff, that is ... unusable for such a memory.
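
    The bandwidth gap being argued over in this thread is easy to sketch with rough numbers. PCIe x16 payload bandwidth roughly doubles each generation; the figures below are approximate one-direction rates, and the 420 GB/s HBM2e figure is the one quoted above, used purely for illustration:

    ```python
    # Rough comparison of PCIe x16 throughput vs one HBM2e stack.
    # Approximate one-direction payload rates in GB/s; PCIe doubles per generation.
    pcie_x16_gbs = {3: 16, 4: 32, 5: 64, 6: 128, 7: 256}
    hbm2e_per_stack = 420  # GB/s, figure quoted in the thread

    for gen, bw in sorted(pcie_x16_gbs.items()):
        share = bw / hbm2e_per_stack
        print(f"PCIe {gen}.0 x16: ~{bw:>3} GB/s -> {share:.0%} of one HBM2e stack")
    ```

    Even taking the optimistic numbers, a full PCIe 7.0 x16 link only reaches a bit over half of a single HBM2e stack, before considering the latency penalty of leaving the package; that is the core of the argument that CXL complements slower DDR tiers rather than replacing on-package HBM.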

  • @alastor2010
    @alastor2010 Рік тому +1

    Isn’t using HBM to cache DDR5 just like using DRAM to cache DRAM?

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  Рік тому

      In a way, yes. But think of it more as caching slower/ higher latency/ higher trace power far DRAM to faster/ lower latency/ lower trace power close HBM. There is a big difference between access over a few mm on package and going out of the package, through the socket, through the motherboard, through the DDR5 socket, onto the DDR5 module and so forth.
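
    The "caching far DRAM in near HBM" idea above is the classic average-memory-access-time model. The latency figures below are hypothetical placeholders (Intel hasn't published exact numbers here), just to show how hit rate drives the effective latency toward the near tier:

    ```python
    # Back-of-the-envelope effective latency when HBM caches DDR5.
    # Latency numbers are hypothetical, chosen only to illustrate the model.
    def effective_latency_ns(hit_rate: float, t_hbm_ns: float = 100.0,
                             t_ddr_ns: float = 130.0) -> float:
        """Average access time: hits served from near HBM, misses from far DDR5."""
        return hit_rate * t_hbm_ns + (1.0 - hit_rate) * t_ddr_ns

    for hr in (0.5, 0.9, 0.99):
        print(f"hit rate {hr:.0%}: ~{effective_latency_ns(hr):.1f} ns")
    ```

    With a high hit rate the workload mostly sees HBM latency and bandwidth; with a poor hit rate it degrades toward plain DDR5, which matches the caching-mode results discussed elsewhere in these comments.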

  • @ZanderSwart
    @ZanderSwart Рік тому +1

    as a Xeon 2650v2 daddy this makes me proud

  • @KiraSlith
    @KiraSlith 10 місяців тому

    I'm usually an Intel hater, but man, Threadripper and the GPU mining boom destroying Phi completely messed up superscalar development, and AMD just never filled that market niche back in with Epyc (they had it nicely locked down with Opteron), so there was a pit that ARM was slowly trickling into like groundwater. Database hosting apps really needed these bulk-core chips ages ago, but it's good we're at least getting something comparable now in the form of Xeon Max.

  • @ThelemaHQ
    @ThelemaHQ Рік тому

    I'm still waiting for a Xeon comeback. I stuck with Xeon through the Gold 6140, before switching to red team with dual EPYC 7742s.

  • @zilog1
    @zilog1 Рік тому +2

    I give it 5 years before we start seeing these chips in massive lots on eBay for $80 each :p
    No, this is not commentary on "this is a flop"; this is more about the nature of server/enterprise-grade equipment eventually being dumped on eBay for cheap because no one wants it after a large company/school upgrades :p

    • @marcogenovesi8570
      @marcogenovesi8570 Рік тому +2

      Maybe not supercomputer-grade stuff like this, but yes, it will be on eBay in less than 10 years if AMD and Intel keep barraging each other every year with new and better stuff

    • @zilog1
      @zilog1 Рік тому +1

      @@marcogenovesi8570 I absolutely think so. Look at Tesla cards and the Xeon Phi. A uni buys thousands of them, then once they upgrade, shut down, downsize, something, there will be a massive LOT of these. The kinds of people that buy these products don't just buy one, they buy dozens.

  • @davelowinger7056
    @davelowinger7056 Рік тому

    You know, if you were born 50 years earlier you would be a horse race caller. Oh wait a minute, you still are.

  • @matthiaslange392
    @matthiaslange392 Рік тому

    With the tiles it looks a little like the chip that's pulled out of Schwarzenegger's head in Terminator 2. 😎

  • @MNGermann
    @MNGermann Рік тому

    “I will use this photo that I took at Intel event and I look awesome “ :) :P

  • @uncrunch398
    @uncrunch398 Рік тому +1

    I foresee people trying this for everything they'd ever do that fits within 64GB, without any DRAM installed.

  • @aarcaneorg
    @aarcaneorg Рік тому +1

    These RAM-on-CPU or L4-as-RAM (whichever you want to call it) solutions would be excellent for storage-only or Ceph OSD nodes. Excellent options for low-cost computing to save cost and power when all you really need is a few PCIe lanes and some compute

    • @berndeckenfels
      @berndeckenfels Рік тому +2

      That totally fits the strategy of making Ceph as expensive as possible to make it usable :)

    • @aarcaneorg
      @aarcaneorg Рік тому

      @@berndeckenfels the idea is to save cost by getting a cheap motherboard with only 0 or 1 LRDIMM slots per CPU and use these as low-cost high-thread chips to power the cluster.

    • @berndeckenfels
      @berndeckenfels Рік тому +1

      @@aarcaneorg They are not low cost (and neither are there cheap mainboards for them).

    • @aarcaneorg
      @aarcaneorg Рік тому

      @@berndeckenfels Yes, as with all fancy new hardware that solves low-cost problems, it launches at obscene prices, then eventually comes down in price. Eventually these, or their successors, will become affordable and mainstream.

    • @berndeckenfels
      @berndeckenfels Рік тому

      @@aarcaneorg That sounds rather unlikely; it's a specialized HPC model with an expensive production process and a special socket and BIOS... not like a mass-market Xeon D

  • @m5a1stuart83
    @m5a1stuart83 Рік тому

    But how long does it take to compile a C++ project?

  • @TheAnoniemo
    @TheAnoniemo Рік тому

    Can't wait for ASRock to create a mini-ITX board for this and just have no DDR5 slots.

  • @shoobidyboop8634
    @shoobidyboop8634 Рік тому

    When will this be available for desktop PCs?

  • @kenzieduckmoo
    @kenzieduckmoo Рік тому +1

    So what I’m seeing here is that Apple's complaint that they couldn't add DDR5 slots to the Mac Pro because of unified memory was just their engineers not being allowed to do it

  • @AlexandruVoda
    @AlexandruVoda Рік тому +2

    Well, that is certainly a chip that will not Serve The Home, but it is very cool nonetheless.

  • @majstealth
    @majstealth Рік тому +2

    Damn, these 2 CPUs alone each have half the RAM of my ESX hosts. Wow.

  • @CyberdriveAutomotive
    @CyberdriveAutomotive Рік тому +1

    I like how Intel made fun of AMD for using chiplets, saying they're "glued together" and now they're doing it lol

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  Рік тому +1

      You have to remember I was one of the people in the room when that presentation was made (and we had EPYC 7601 already in the lab at the time)

  • @EyesOfByes
    @EyesOfByes Рік тому

    But can it run Crysis, or Minecraft at max render distance?

  • @ewenchan1239
    @ewenchan1239 Рік тому +1

    It is AMAZING to me that for CFD, the AMD Genoa-X still TROUNCES this Xeon Platinum Max, despite the HBM2e: Genoa-X sits at nearly DOUBLE the Xeon Platinum 8490H baseline, whereas the Max in caching mode actually performs WORSE than the baseline, and with HBM2e only it performs slightly better, but nowhere CLOSE to what the Genoa-X is able to do.
    That, for me, is a much better marketing slide for AMD than it is for the Xeon Platinum Max.

    • @billymania11
      @billymania11 Рік тому

      LOL! If you say so.

    • @ewenchan1239
      @ewenchan1239 Рік тому

      @@billymania11
      The data shows that.
      Watch the video.

  • @benardmensah7688
    @benardmensah7688 Рік тому +1

    Apple silicon, Intel Max CPU!! I feel AMD will be in trouble next year when this goes mainstream

  • @IBM29
    @IBM29 10 місяців тому +1

    I wonder how long it takes to amortize engineering / development / fab setup at $13,000 each...

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  10 місяців тому

      Also, how much is shared with the standard SPR parts, since a lot of the difference is in the packaging?

  • @gh975223
    @gh975223 Рік тому +1

    Why would I care about sleep states on a workstation? The CPU should never go to sleep!

  • @Clobercow1
    @Clobercow1 Рік тому

    I'm curious how well this thing can run Factorio. It might set records. That game needs hella cache and memory bandwidth / latency.

  • @JohnKhalil
    @JohnKhalil Рік тому

    First official Windows cpu!

  • @nikolausluhrs
    @nikolausluhrs Рік тому

    Many people don't do any tweaking because corporate IT doesn't allow us to. Also, support agreements.

  • @simonhazel1636
    @simonhazel1636 Рік тому

    Question on video quality: everything looks fine except Patrick's face is super red. The pictures of Patrick shown in the video look fine.

    • @simonhazel1636
      @simonhazel1636 Рік тому

      Just to note, it's only on the 4K YouTube setting; if I bump it down to 1440p or 1080p, the issue disappears

  • @hgbugalou
    @hgbugalou Рік тому

    This is the future. It's inevitable that all RAM will end up on the CPU.

  • @lordbacon4972
    @lordbacon4972 Рік тому

    Actually, I was wondering if the Intel Xeon Max would be a good gaming CPU?

  • @pete3897
    @pete3897 Рік тому +3

    115 pounds?! Wow, that's really cheap ;-)

  • @maou5025
    @maou5025 11 місяців тому

    Can you do some gaming benchmarks with HBM only? To see infinite-money performance lol.

  • @Mihonisuto
    @Mihonisuto Рік тому

    DSG PCIe5 Accelerator?

  • @SP-ny1fk
    @SP-ny1fk Рік тому +1

    Yeah yeah yeah but when can I expect this in my homelab? lol

  • @miigon9117
    @miigon9117 Рік тому

    I think "without RAM sticks" is a better title than "without DDR5"

  • @SB-qm5wg
    @SB-qm5wg Рік тому +1

    115lbs in a 2U. That's a thick boi 💪

    • @concinnus
      @concinnus Рік тому +1

      Seriously. And it's not even water cooled! IME, 115 lbs would be ~5U; 2U was ~60 lbs.

  • @czolus
    @czolus Рік тому

    So, like the now-defunct Xeon Phi?

  • @davidlucavish7948
    @davidlucavish7948 Рік тому +1

    I think that jab that AMD doesn't know how to do HBM memory is not correct, since AMD used HBM on the AMD Radeon R9 Nano back in 2015. Let's keep to the benefits and keep it civil!

    • @ServeTheHomeVideo
      @ServeTheHomeVideo  Рік тому

      AMD knows how to do HBM, and you will see it on the MI300 this year. Intel's packaging, however, is a step beyond what AMD is doing on its SP5 socket currently.

    • @davidlucavish7948
      @davidlucavish7948 Рік тому

      @@ServeTheHomeVideo The first GPU utilizing HBM was the AMD Fiji, which was released in June 2015 powering the AMD Radeon R9 Fury.
      In January 2016, Samsung Electronics began early mass production of HBM2. The same month, HBM2 was accepted by JEDEC as standard JESD235a. The first GPU chip utilizing HBM2 was the Nvidia Tesla P100, officially announced in April 2016.
      In June 2016, Intel released a family of Xeon Phi processors with 8 stacks of MCDRAM, Micron's version of HBM. At Hot Chips in August 2016, both Samsung and Hynix announced new-generation HBM memory technologies. Both companies announced high-performance products expected to have increased density, increased bandwidth, and lower power consumption. Samsung also announced a lower-cost version of HBM under development targeting mass markets. Removing the buffer die and decreasing the number of TSVs lowers cost, though at the expense of decreased overall bandwidth (200 GB/s).
      Nvidia announced the Hopper GH100 GPU, the world's first GPU utilizing HBM3, on March 22, 2022.
      This is right from Wikipedia.
      The point I was trying to make is: yes, this is cool technology, and I really like that you brought it to the attention of YouTubers, but take the opinions out of the equation. It was almost like you were taking what Intel told you and not keeping Intel honest as to the information they were giving you.
      Keep up the good work! I really enjoy your enthusiasm!