"And is 2 Exaflops" big smile 😊 Nice to see someone that is passionate about his work!
It's not even "a" computer, it's basically just a botnet installed locally...
The only difference I can see between this and a botnet is that this is installed inside a single room, so they'll get better latency between the different RAM modules, CPUs and GPUs. What I'm trying to say is this technology is not very impressive; they're using pretty much the same CPUs and GPUs that we are using in our cheap desktops. Just a lot of them...
@@Alfred-Neuman For a while they used special processors in these systems, developed directly for them. Cray comes to mind. When clock speeds stopped scaling, they started to use consumer hardware. In this one they have 8.7 million processor cores instead of 16 or 64 (talking about high-end desktop machines).
@@Alfred-Neuman Why don't you build a better one and impress us all..................
One of the coolest aspects of Frontier's network architecture is at the node level. Since all the compute is done on GPUs, the network fabric connects directly to the GPUs instead of something like a PCIe bus. So simulations can transfer directly between GPU memory and the network fabric without involving the CPU or having to move data on or off the GPU to get to the network. It allows for incredibly efficient internode communication.
So the GPUs have NICs connected directly to them? With some sort of second MMU with its own NIC? It's a tad unclear from the way you describe it, but I wonder how it connects to the GPU since you say it's not using PCIe?
@@noth606 I slightly misspoke. The NICs use PCIe ESM but are connected directly to a PCIe root complex on one of the GPUs. Each node has 4 GPUs, each with 2 dies (so 8 visible GPUs) and a dedicated NIC, so 4 NICs per node. Thus any CPU operation that has to use the fabric actually traverses one of the GPUs to get to a NIC.
Source: you can find a bunch of architecture docs for Frontier but I also worked for several years on developing some of the library and software stack for this machine and a few others that were just beginning to come online.
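To put rough numbers on the node layout described above, here is a minimal sketch. The GPU, die and NIC counts come from the comment; the 200 Gb/s per-NIC figure is borrowed from other comments in this thread and is an assumption here, and `FrontierNode` is only an illustrative name, not anything from the actual software stack.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontierNode:
    """Illustrative model of one compute node; not an official spec."""
    gpu_packages: int = 4   # physical GPUs per node (from the comment)
    dies_per_gpu: int = 2   # each GPU exposes two dies (from the comment)
    nics: int = 4           # one NIC per physical GPU (from the comment)
    nic_gbps: int = 200     # ASSUMED per-NIC link speed, quoted elsewhere in the thread

    @property
    def visible_gpus(self) -> int:
        # What the software sees: every die shows up as its own GPU.
        return self.gpu_packages * self.dies_per_gpu

    @property
    def injection_gbps(self) -> int:
        # Total bandwidth a node can push into the fabric, one direction.
        return self.nics * self.nic_gbps


node = FrontierNode()
print(node.visible_gpus)    # 8
print(node.injection_gbps)  # 800
```

Under those assumptions a single node can inject about 800 Gb/s into the fabric, which is why hanging the NICs off the GPUs rather than the CPU matters.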
The biggest difference between HPC networks and corporate networks is the lack of security in favor of performance at all costs. The compute nodes directly access remote memory over the network (RoCE).
Great video. Thanks for taking us along with you all!
Take note of the power cables for each rack, similar to the amount a large house might use, per rack. Removing the heat from those racks is a big part of the design. Air flows from the floor and out the top in active exhausts. A little hard to believe, but compactness is a top priority.
Could you imagine their HVAC systems!!!! Chillers rated in swimming pools per minute
@@mikepict9011 They have so much heat to get rid of, the concept of blowing cold air is no longer valid. Fluid is far more effective at conducting heat away from a metal structure and processors are manufactured with built-in liquid cooling. Each rack is built for purpose with an exchanger which takes it directly out of the room, then returns cold for the next batch. If you work on your home’s HVAC unit, you’re familiar. A widely distributed system like that can be monitored and adjusted for best efficiency.
@@artysanmobile Yeah, that's usually part of a larger cascading system when you consider the envelope. The heat in the liquid ultimately needs to be rejected outside, and that's done by a chiller in a liquid system and a condenser in a direct exchange system. But yeah, vapor compression, pipe joining. It's what I do.
@@artysanmobile I serviced the mini chillers that cool MRI machines. They still had one air-to-refrigerant DX heat exchanger (HX) and two coaxial heat exchangers with two pumps. Simple systems compared to real low-temp refrigeration.
Frontier is water cooled. You have water doing the heat exchange, not your traditional HVAC. There is still HVAC in the room since there are other non water cooled systems in the same room (storage and commodity gear). The switches, controllers and nodes in Frontier are all water cooled.
Awesome tour/interview. Dan seems like a real genuine dude. 👍
ORNL is my dream job. I'd honestly sweep floors just to be in the building.
It isn't all rainbows and unicorns
@@iamatt did you have a traumatic experience at Oak Ridge National Laboratory?
Good attitude. Doesn’t matter where you get on the ladder, just get on, work hard, learn and be agile.
I visited there long ago. It was the most awesome place I have ever seen.
@@jinchey it's an interesting place to work when you get in the mix and actually see how the politics are, let's just say that.
incredibly good interview you did there.
Thank you!
You can tell when the machine is running some serious workloads because the lights flicker in the offices next to it.
Sorry for nitpicking but he got one thing wrong. The reason you don't use electrical network cables for longer distances is not primarily because of interference from the power cables but has all to do with attenuation.
At these speeds it is very hard to get the signal more than a few meters, it will be heavily attenuated and very hard to distinguish a 1 from a 0. The solution to the problem is to use fibre optics instead.
This is correct.
It's impedance matching and standing wave ratio that cause the attenuation when sending pulses down copper. The copper needs to be matched at both ends so you don't get reflections. Stick to fiber.
@@steveking7719 No, attenuation is caused by the cable impedance and skin effect (and dielectric losses).
Impedance mismatch creates reflections that make it harder for the receiver to receive the signal correctly.
It is true that both the transmitter and receiver must be impedance matched, but that is already the case. No high speed copper networking would be possible if that wasn't the case.
But still, even if you have matched tx/rx, you still get very high attenuation that makes copper networking unusable for lengths over just a few meters.
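The skin effect mentioned above can be put into numbers with the standard skin-depth formula, δ = sqrt(ρ / (π f μ₀)). This is only a sketch: the copper resistivity is a textbook room-temperature value, and the frequencies are picked purely for illustration, not taken from the video or from any cable datasheet.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, H/m
RHO_COPPER = 1.68e-8        # resistivity of copper at room temperature, ohm*m


def skin_depth(freq_hz: float, rho: float = RHO_COPPER) -> float:
    """Depth in metres at which current density falls to 1/e of its surface value."""
    return math.sqrt(rho / (math.pi * freq_hz * MU_0))


# Illustrative frequencies only; real links use their own signalling rates.
for f in (1e6, 1e9, 25e9):
    print(f"{f:.0e} Hz -> {skin_depth(f) * 1e6:.2f} um")
```

At 1 GHz the current is confined to roughly the outer 2 micrometres of the conductor. Because the depth shrinks with the square root of frequency, the effective resistance, and with it the loss per metre, keeps climbing as link speeds go up, even on a perfectly matched cable.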
@@alexanderahman4884 rut row... here we go again... two people disagreeing and neither will back down from their position.
@@steveking7719 I'm just stating facts.
What happens to the hardware after it is removed from Oak Ridge? Is there still some value in it besides recycling?
It can be sold as spare parts, and/or recycled.
No matter how big or small:
The network IS the computer…
For the past few decades, outside of embedded applications (and even in many situations there), computers have to be connected to a network to have any practical value; every piece of software, and most if not all its data, is sent over a network at some time in its lifecycle.
Never underestimate the bandwidth of a FedEx truck.
This is amazingly quiet for a system of that size
Water cooled! The other half of the data center not shown in the video was all storage and that side was LOUD!
Great Video, thank you. Interesting would have been the type of Failures they see - Overheating, Bad Solder, Caps fail, Fans/Plumbing fails etc
We did learn that they have full service staff provided by OEMs of the supercomputer. They were there performing maintenance that day. Our POC didn't have specifics on hardware failures of the HPC environment, I'll see if he has anything on the networking components.
L3 cache errors, for one.
@@artofneteng MTBF was 1 hour at first
@@artofneteng blue jackets are pushing carts all day
Did I see Cray - oh my - that is just awesome
And AMD not NVDA 😂
I’m surprised they can even talk in there. I’ve been in some major data centers and communication can be difficult.
They were water cooled so no fans on that side of the DC. The other side was storage which still had traditional cooling and was very loud!
@@artofneteng Ah, that makes sense.
I can't imagine working there with all these computers and so much electric field energy; hopefully it's not affecting people's health. Any EMI/EF Faraday cage?
Why not check out the visualization suite? That's the coolest part.
We were there specifically to talk about the super computing network. We did have access to and see other things while onsite, but were only authorized to record the networking piece seen here.
I am in Security now but I really miss being a network engineer. Thank you for sharing on this platform.
I’ve been in the computer rooms at Fort Meade. Awe-inspiring.
Fort Meade is also volumes faster than this system. It’s just the specs are classified - someone will know those specs eventually (perhaps in 20-30 years). Even Snowden knew the NSA has had the best computer in the world since 2002
You know, if they switched off all those small diodes on each server, blinking all the time, consuming power, I wonder how many watts that is in total. You really only need those lights to debug whether something is working, right? There could be a little switch instead to toggle those on and off.
You can think of five LEDs using about 1 watt of power. In the grand scheme of things, if they were switched off, most people would not notice that any energy was saved.
If you look at a home computer, it costs (on average) about $35-$40 a year to run one 8 hours a day (possibly much less). Those same five LEDs (diodes) that you mentioned would cost 35-40 cents to run 8 hours a day for a full year (or just over a dollar per year if running 24/7).
@@sky173 But it's quite redundant to have them, right? You don't need them at all really
If they're blinking you know it's working.
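Those figures are easy to sanity-check. A minimal sketch, assuming an electricity price of $0.13 per kWh and a 100 W average draw for the home computer; neither number is from the comment, they are just plausible values that reproduce its estimates.

```python
HOURS_8H_PER_DAY = 8 * 365     # 2,920 hours in a year at 8 h/day
HOURS_24H_PER_DAY = 24 * 365   # 8,760 hours in a year running 24/7

RATE_USD_PER_KWH = 0.13        # ASSUMED electricity price, not from the comment
HOME_PC_WATTS = 100            # ASSUMED average draw of a home computer
LED_WATTS = 1                  # five LEDs at about 1 W total (from the comment)


def annual_cost_usd(watts: float, hours: float, usd_per_kwh: float) -> float:
    """Energy cost of a constant load running for the given number of hours."""
    kwh = watts * hours / 1000
    return kwh * usd_per_kwh


print(round(annual_cost_usd(HOME_PC_WATTS, HOURS_8H_PER_DAY, RATE_USD_PER_KWH), 2))  # 37.96
print(round(annual_cost_usd(LED_WATTS, HOURS_8H_PER_DAY, RATE_USD_PER_KWH), 2))      # 0.38
print(round(annual_cost_usd(LED_WATTS, HOURS_24H_PER_DAY, RATE_USD_PER_KWH), 2))     # 1.14
```

With those assumptions the results land inside the ranges quoted above: about $38 a year for the computer, about 38 cents for the LEDs at 8 hours a day, and a little over a dollar if they run around the clock.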
The power supply behind them is unbelievable. Enough for a town.
About 10y ago when they built the Trinity supercomputer at Los Alamos they were able to save a bunch on power costs by partially diverting the river that runs through the town into giant pipes that run under the lab's datacenter for cooling. That was a wild one.
@@chuckatkinsIII Intelligent use of resources makes me happy.
After working with one we heard the gruntiest one is in Japan now rather than Oak Ridge.
Computers are like watches now. We need to start making computers that last hundreds of years, in my opinion.
200 Gbps for the out-of-band network?! I never thought of that. I was wondering whether it might be 2 Gbps for the management network.
Not sure I understand the "noise" issue with copper Ethernet? It is transformer coupled at each end, self balancing with common mode induced noise rejection via the twist. I've seen it run around along with the wiring for 3 phase CNC equipment with no issues. Even at those scales I am not sure I buy that explanation. Length would be a real issue at that scale rather than noise I would have thought.
The speed of the link and acceptable error rates for encoding on those links is a limiting factor. Noise, heat and other factors all play a role in maintaining high speed links.
Amazing tour, mind blowing stuff
Glad you enjoyed it!
It would have been nice to know a few things about how that plethora of processors is organised, how they work together and, most of all, how the output from all the processors is combined into one meaningful result. I can imagine a number of cores where each core works on a part of a programme, but with such a huge number of processors surely this can't be done any more.
Granted this is a 'network' overview. Go read about MPI. Swear a lot. And it will start to come together. Jobs that run on systems like this can run on hundreds of nodes at the same time. It's not at all impossible. And yes, they can run containers if you wanted to do so.
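As a rough picture of how the output from many processors gets combined, here is a single-process sketch of the pattern behind an MPI sum reduction. It is not real MPI: the "ranks" are just loop iterations, and `partial_sum` and `simulated_reduce` are hypothetical names. On a real system each rank would be a separate process, often on a different node, and the combining step would be a call such as `MPI_Reduce`.

```python
from functools import reduce
from operator import add


def partial_sum(rank: int, size: int, n: int) -> int:
    """Work done by one 'rank': sum every size-th integer below n, starting at rank."""
    return sum(range(rank, n, size))


def simulated_reduce(size: int, n: int) -> int:
    """Combine the per-rank partial results into one total, as a sum reduction would."""
    partials = [partial_sum(rank, size, n) for rank in range(size)]
    return reduce(add, partials)


# Eight simulated ranks split the work; the combined answer matches the serial one.
total = simulated_reduce(size=8, n=1_000_000)
print(total == sum(range(1_000_000)))  # True
```

The point is that each worker only ever sees its own slice of the problem, and one collective step merges the slices, which is why the same idea keeps working when the rank count goes from eight to many thousands.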
But can it run doom ?
How many instances of doom can it load?
Fascinating... still don't understand much of what that "time machine" is all about, but fascinating nevertheless... even though I think a DMC DeLorean properly retrofitted for time travel offers a bit more practicality and excitement in terms of time travelling!! Haha
The time machine reference was that the supercomputer has done in a shorter amount of time what would have taken us years to complete without it. It dramatically speeds up research.
Simply Awesome!
Where does the old system go, eBay?
I think it will go to auction
It's usually auctioned off.
Are these super computers shielded against EMP
Great question! I don't recall if he said whether they are or not.
In supercomputing it's either Network or Notwork :DD
thats really funny lil buddy 😊
Where are the NSA stickers?
I'm sure the NSA has their own super computers.
At Fort Meade. :P
@@artofneteng yeah, and they're plugged into this.
Not any more. LLNL's El Capitan is now the leading supercomputer.
200Gb, lol I have that between the switches at work which I put in like 3 years ago.
You don’t understand. Each device in this platform has 200Gb so every CPU, GPU and storage connection has 200Gb direct to the platform.
Thank you!
Yes, you likely did a few uplinks as ISLs. That's cool. However, on a Slingshot switch, every single one of the 64 ports does 200 Gbps, both to the nodes connected to it and towards the fabric.
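To make the difference concrete, a quick sketch of the aggregate numbers. The 64 ports and 200 Gbps per port are from the comment; the "four uplinks" office comparison is a made-up example, not something anyone in the thread stated.

```python
PORTS_PER_SWITCH = 64   # from the comment
GBPS_PER_PORT = 200     # from the comment
OFFICE_UPLINKS = 4      # HYPOTHETICAL: a handful of 200G inter-switch links at work


def aggregate_tbps(ports: int, gbps_per_port: int) -> float:
    """Total one-direction bandwidth across the given ports, in Tb/s."""
    return ports * gbps_per_port / 1000


print(aggregate_tbps(PORTS_PER_SWITCH, GBPS_PER_PORT))  # 12.8
print(aggregate_tbps(OFFICE_UPLINKS, GBPS_PER_PORT))    # 0.8
```

So a single switch with every port at 200 Gbps moves 12.8 Tb/s in each direction, versus under 1 Tb/s for a few 200G uplinks between otherwise slower access ports.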
AWS has 400 Gbit per GPU over EFA (3.2 Tbit in total per host), and can sustain 3.6 Tbit on NVLink intranode, any to any.
We will have the same processing power in a phone in around 20 years. I watched a documentary about a supercomputer the size of a factory, and it wasn't as fast as a new phone 10-15 years later.
Moore's law says yes. We just need people learning engineering so that we can push those limits. Innovation and education go hand in hand.
Great video
You’re here to look at the networking in here as an electrician looking at the electrical.
..."the types of problems" they are targeted to solve is the real answer. The problem dictates the architecture. It's still built on geometries and silicon, "Same - Same" but targeted differently. Tools for science not tools for enterprise etc.
Beowulf Cluster of Doom!
The guy in light blue needs a trimmer wardrobe.
No way do you get access to the world's fastest computer...
Hypersonic missile systems are classified :P
Open research; the classified systems are in other DCs.
But can it run crysis?
no
You spelled Doom wrong.
how many chrome tabs can it handle?
In all seriousness, not very well.
Games have such minuscule latency budgets that any distributed system is immediately going to fall on its face. Even chiplet-to-chiplet latency within the same CPU package has proven to be enough to affect the game experience: reviews of the Ryzen 9 7950X all identified that frame pacing was affected dramatically when threads moved between CCDs, let alone between entire racks.
Now, playing Crysis on a single unit, especially if it has both CPU and GPU compute...
Nothing can.
But can it run Crysis?
So SkyNet is a Tennesseean.
7:50 his head got a head 😂 i cant stop seeing this
I would love to work there. Tired of making 10 Gb as fast as possible. Mind you, I got into a teraflop
But can it run doom
I didn't enjoy this as much as I thought I would. My criticism and suggestions below:
1. It would have made more sense to plan/script some of these conversations. For example, the explanation of what a FLOP/exaFLOP is was pointless. Even saying this would have helped: "Add 1.5 + 1.5, that's a FLOP. If everyone on Earth does that calculation at the same time, once per second, that's 8 gigaFLOPS. Now if we had 125,000,000 planet Earths each doing that, this would be the same power."
2. Really didn't talk much about the network; very wishy-washy/high level. A 5-minute whiteboard session could have added more detail. It seems there was another team who managed the HPC fabric vs. the "classic network" that could have been consulted. Really not much detail; most of it had to be gleaned from the b-roll.
3. The question about the applications used on the HPC would have been better put to a user/scientist.
4. A tour of the operations centre/NOC and other supporting areas (HVAC, power) would have been interesting.
5. This could have been 10 minutes long
6. The guest mentions early in the video it's 200 Gbps Ethernet, yet the question is asked again in the video
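The arithmetic in point 1 holds up. A quick check, assuming a world population of roughly 8 billion and one calculation per person per second (the population figure is the only number here not stated outright in the comment):

```python
PEOPLE_ON_EARTH = 8_000_000_000   # ASSUMED: roughly 8 billion people
OPS_PER_PERSON_PER_SECOND = 1     # one "1.5 + 1.5" each, every second
EARTHS = 125_000_000              # from the comment

flops_one_earth = PEOPLE_ON_EARTH * OPS_PER_PERSON_PER_SECOND
flops_all_earths = flops_one_earth * EARTHS

print(flops_one_earth == 8 * 10**9)   # True: 8 gigaFLOPS
print(flops_all_earths == 10**18)     # True: 1 exaFLOPS
```

So 125 million Earths' worth of people, each adding two numbers once a second, comes out at exactly one quintillion operations per second, which is the scale this machine works at.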
Great feedback, thank you!
For playing games?
did he really tell what is the use of these machines?
Asks what an exaflop is, proceeds not to explain an exaflop.
(EXA FLoating point OPerations per Second) One quintillion floating point operations per second.
I would hate to see their electric bill.
Can it play Minecraft?
OMG he asks so many stupid and repeated questions about the network cables....
Some clarifying questions never hurt, and this channel is Network Engineering focused.
love the mildly autistic and awkward geeky conversations
Quantum computers are exponentially faster.
Very limited though
Panduit.
Those poor bastards having to deal with AMD GPU drivers in HPC.
Is that a problem? Can you elaborate?
@@evanstayuka381 Driver and firmware stability vs nvidia, look at the drama around Geohot / tinygrad / tinycorp having to abandon AMD as their primary platform due to the lack of stability.
None of that matters when they use custom software and write their own code. These HPC systems are used for open computing; CUDA won't matter. Now you better go back to your GTX 1650, fanboy.
@@gunturbayu6779 Your statement makes no sense, and why the hate? As they say, when you do not have a good argument you resort to personal attacks.
@@evanstayuka381 I thought I already replied to this, or perhaps the post got deleted as the truth was too hard to handle.
The AMD driver and firmware are comparatively unstable next to the NVIDIA driver and firmware stack.
Look at the drama that Geohot / Tinygrad / Tinycorp had trying to go with AMD GPUs: they had to abandon AMD as the tinybox standard and offer the NVIDIA option as the primary option, as they could not get the driver / firmware stability required for a shippable platform. Let's see if this post gets deleted.
This guy doesn’t seem like he’s ever seen the inside of a data center before. What embarrassingly basic questions, which didn’t even get to what’s special about their setup or capability.
Bitcoin miners😂
Balder dude has been wearing headphones for toooooo long....
Well, he's a podcaster too so...
But can it run Minecraft? 😂
Get that man clothes that don't look like he was just shrunk into
Heheh, all I'm thinking is how cool it would be to play Minecraft on it
Horrible soundtrack!
We appreciate your feedback
it should mine BTC :P
Not impressed. It was outdated as soon as it was deployed.
Thanks for your feedback
They need to generate crypto to pay for future machines and upgrades.
lol. No. Who needs crypto when you can literally just print more dollars? DoE has a massive budget regardless.
@@kilodeltaeight Then they could construct a supercomputer that had no final size or limitations.
Chose amd to save money, could be faster with intel omegalul
Faster with Intel ? Are you living under a stone? Lol
The real crunching of data here is happening on GPU cores, not the CPUs. Those are just managing the GPU cores, effectively. With a system like this, your biggest concern is power and cooling, so efficiency is what matters. AMD very much wins there, and has the experience with building large systems like this - ergo, they won the contract.
The funny thing is, Oak Ridge will be number 2 once the El Capitan project is done, and AMD will be number 1 and 2 for fastest supercomputer. Intel will have number 3 with 80% more power usage lmao.
@@gunturbayu6779 You forgot Aurora!
Are you sure it's the fastest super computer in the world? China has come a long way in this field and would be very competitive.
At the time that we recorded this video, which was August of 2023, Frontier held the title of fastest supercomputer in the world.
You build the world's fastest car and run it on your own private hidden race track. How do you then convince the world your car is faster than the fastest car out there when you refuse to show it to anyone? That's what China is doing at the moment. They very likely have some fast system(s) hidden somewhere. Maybe. :)
It's just a lot of servers clustered together.... what's the big deal? Server clustering has been around for decades....
Right, but this particular cluster of servers has a MASSIVE amount of compute that has done amazing things for us!
@@artofneteng Who cares? It's the exact same thing, it's just a bunch of servers clustered together. With enough money you can make one twice the size. There is nothing revolutionary here.
@@EvoPortal to a certain degree you are correct. HPC systems are just "bigger meaner" versions of a traditional cluster. Your traditional cluster could be optimized for a multitude of things where most HPC clusters are optimized for parallelism where speed is the key to running and processing massive sets of calculations. Every bit of the HPC cluster is optimized for speed. A traditional cluster is usually not built explicitly for speed. It's built for function. HPC is built for both speed and function. The innovation these days is coming (mostly, and imo) from faster/better connected fabrics. And with PCIe 7.0 soon to come out too. Evolution... with the occasional revolution thrown in there for fun.
You talk like a child!
What a great video, so informative must be a real privilege to work on that system.
Reading about it here as well so impressive.
en.wikipedia.org/wiki/Frontier_(supercomputer)