Thanks so much for including our RAK system, Wendell! (Replaced axe handle or not!) Your 8-channel vs. 4-channel testing results mirror our own - it's incredible what a performance boost 8-channel memory gives to UE compiles!
"So I bought 3 other Threadripper systems!"... Never change Wendell.
"So I -bought- was given 3 more Threadripper systems!"
Half a mill subscribers and you don't need to buy anything ;)
@@ChrisM541 that part he was specifically talking about buying them. Every part he got from the companies he specified it.
@@lucasljs1545 Again, half a million subscribers does not guarantee honesty ;)
Throw a little white lie here and there ;)
@@ChrisM541 that's a pretty strong accusation to make without any evidence that I can see - is it *possible*? Sure. I think it is unlikely though.
Thanks for the mention and the hardware!
Please make more videos geared towards game dev workloads; there is very little good information out there on hardware configs for things like build servers.
This is 100% true; there is documentation on the Unreal website, but it doesn't tell you how that translates to hardware. It also depends on whether you're doing build farms or solo/small-team setups.
The current best bet is to buy the most horsepower you can, then keep a mid-range rig around that matches what you're actually targeting (at least for small teams).
100%! I'm a developer, and I want to know this kind of information.
+1 for this :)
Just get a cheap SSD, max out the CPU cores, and make sure you have enough memory to support all the cores (it's not that much). I'd be surprised if 64 GB was not enough.
Most of everything beyond that is just tiny incremental improvements. Having lots of CPU cores, an SSD, and enough memory makes a big difference to compile times, and CPU cores also help with multitasking without stuttering.
Most of everything else is optional. I'd recommend high-bandwidth networking if you can.
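For anyone sizing memory to cores on the build-server question above: UnrealBuildTool scales its parallel compile actions by available RAM, and that behaviour is tunable in BuildConfiguration.xml. A minimal sketch, assuming UE5's ParallelExecutor; the element names are from memory, so verify them against your engine version's docs:

```xml
<?xml version="1.0" encoding="utf-8"?>
<Configuration xmlns="https://www.unrealengine.com/BuildConfiguration">
  <ParallelExecutor>
    <!-- Use every core instead of leaving some in reserve -->
    <bAllCores>true</bAllCores>
    <!-- RAM budgeted per compile action (1.5 GiB here); UBT lowers the
         parallel action count if free memory can't cover one slot per core -->
    <MemoryPerActionBytes>1610612736</MemoryPerActionBytes>
  </ParallelExecutor>
</Configuration>
```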
You're living your best hardware life building these EPYC systems, you look a lot healthier, and the content is always very detailed. Very inspiring, bro.
Real world utilization! YAAAAAAYYYY!! Thanks, much appreciated 😁
It would have been interesting to include the fastest AM5 alternative as a baseline in these Threadripper comparisons, since that's what individuals with limited budgets would actually buy for this.
OMG I love you! Some of the best information for Threadripper systems. This video was super helpful, please give me more!!
Really useful comparison info. Would like to see some modeling workloads, like mapping and water modelling. That's a common use case for these big budget workstations.
As a long time Unreal Engine developer and licensee, I found this video interesting.
I actually built an Intel Xeon W7-2495X, W790-ACE, 512 GB, RTX 3090 system. I couldn't afford the extra to go W-3400 and W790-SAGE, but that would have been nice. My Xeon system still performs really well. A W-2400 vs. W-3400 UE5 comparison test would be nice if you have the hardware at Level1.
Even though I'll probably never use a Threadripper machine, I always love the content you do with these kinds of systems. Keep up the great work, Wendell!
Good stuff, definitely looking forward to more of this.
Really amazing performance overall, but as you mentioned, the CPUs have gotten so fast that they are specifically good at things like compiling. I have a 3960X system and it is still blazing fast for my development work. My software isn't large, so compile time is like literally instant.
Zram (also zswap) led to regular deadlocks on our big-memory HPC nodes. Not worth the trouble.
Amazing! Thank you Wendell, I have been looking into building a ramdisk-based TR / EPYC dev machine for UE.
Now if only AMD and the industry could kick off the CXL memory option too, so that we can have some additional and larger RAM storage or cache options.
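Until CXL materializes, tmpfs already covers the ramdisk part of that plan on Linux. A minimal sketch (the mount point and size are placeholders, and the contents vanish on reboot):

```sh
sudo mkdir -p /mnt/uebuild
sudo mount -t tmpfs -o size=128G tmpfs /mnt/uebuild
# Then point the project's Intermediate/ directory (or a symlink) at it.
```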
I'd have killed for this performance a few years ago… I was running an XR research project for the BBC, and we were recompiling an Unreal Engine based prototype between users to explore accessibility customisations… Waiting 15-20 minutes each time was brutal and limited how much we could experiment and learn from each user.
I've been running a 14-core Broadwell Xeon for a while now with quad-channel DDR4, and I have constantly been getting weird stutters despite my system supposedly being pretty beefy. Sure, it's old for some games, but it's really strong for nearly everything I ever have to do with it.
Fast forward to a month ago, when I started trying to play Helldivers and the stuttering was at its worst. Looking for solutions, I found a few Windows settings that were supposedly for helping laptops save battery power. Now, I have maybe rebuilt Windows 4-5 times on this rig, probably more. So I've tried countless settings and fresh installs and kept seeing the same issues. Memory tests never showed any problems either.
The setting that I think helped the most was disabling the dynamic tick. It was either that, disabling multi-plane overlay, or disabling core parking via the registry.
Ever since those changes, Windows has been razor sharp. Helldivers is butter-smooth, and fully ray-traced Cyberpunk went from 45 fps to an easy 60. It was never the GPU; I was always CPU-bound in all of these tasks.
My theory was that my 14-core CPU was barely breaking a sweat, so Windows was clocking everything down to needlessly save power. Have you encountered anything like that with high-core-count chips like the Threadripper?
PS: this is somewhat unrelated, but blocking Windows Search in the firewall is a handy way to tame that nuisance as well.
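For anyone wanting to try the three tweaks described above, these are the commonly circulated commands; run them from an elevated prompt, and note the MPO value is an unofficial toggle (export the key before changing it):

```bat
rem Disable the dynamic tick (the tweak that seemed to help most above)
bcdedit /set disabledynamictick yes

rem Disable multi-plane overlay (unofficial DWM registry toggle)
reg add "HKLM\SOFTWARE\Microsoft\Windows\Dwm" /v OverlayTestMode /t REG_DWORD /d 5

rem Stop core parking on the active power plan
powercfg /setacvalueindex scheme_current sub_processor CPMINCORES 100
powercfg /setactive scheme_current
```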
Oh, these big compiles... I remember the initial compilation tests on the Threadripper 1950X (also a different use case, when I was comparing some OpenCL workloads run on it vs. on a Fiji Pro); that platform was stellar after years of Intel quad-cores 😁 The 1950X is still going strong today in my main PC, but it's simply no competition for the current gen whatsoever, not even the normal Ryzens haha
Very interesting that CPUs have gotten so fast that compressed memory is worth using. I'm thinking about a PL which should enable less memory to be consumed. And I didn't know about DPC and such, which I looked up after a comment here. Learned stuff. Nice.
If AMD Threadripper had a Marvel movie, it would be "Threadripper in the Multicore of Madness".
Multicoreverse
It’s hard to justify Threadripper Pro or even Threadripper when EPYC Genoa 96 core with all the server goodies is available on eBay for less than the cost of a 24 core Threadripper….
But aren’t those like 2 times slower in MHz?
Saying you have a 'problem' really is an understatement. I get that this is your business and all, but those are €10,000+ systems (probably €20,000+ by the time they're fully outfitted) and you have like 5 of them.
Just to mirror some of the other comments here, I would also love to see more content like this. Have you done any - or would you consider doing - some similar tests for real shipping games made in Unity?
On the 4-channel Threadripper platform, it is clear that memory bandwidth has not scaled as fast as the number of cores and compute performance, and cache can only help up to the point where raw bandwidth becomes critical. Computing the GB/s per core, or GB/s per GFLOP, is a good indicator. The size of the active dataset is also critical: if you have 100 GB+ of active data that is accessed constantly, a memory bottleneck is probable, as in CFD, for example, where a large set of data is processed iteratively at each simulation step.
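As a back-of-the-envelope version of that GB/s-per-core indicator (the DDR5-5200 speed and the channel/core pairings are illustrative assumptions, not the video's exact configs):

```cpp
#include <cstdio>

int main() {
    // Peak per-channel DDR5 bandwidth: MT/s * 8 bytes per transfer.
    const double per_channel_gbs = 5200.0 * 8.0 / 1000.0;  // ~41.6 GB/s

    const int configs[][2] = {{4, 24}, {4, 64}, {8, 64}, {8, 96}};  // {channels, cores}
    for (const auto& cfg : configs) {
        const double total = cfg[0] * per_channel_gbs;
        std::printf("%d ch x %2d cores: %6.1f GB/s total, %4.1f GB/s per core\n",
                    cfg[0], cfg[1], total, total / cfg[1]);
    }
    return 0;
}
```

The same arithmetic makes the CFD point: at roughly 3-5 GB/s per core, a kernel that streams its whole dataset every step saturates the channels long before it runs out of cores.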
Most CPU reviewers forget about developers... It would probably be good to do some tests with Java, C, C#, C++, Kotlin, and Python...
If I recall correctly, the TPM stutter on AMD was caused by the fTPM. Maybe using an actual TPM module eliminates it? All of the TR Pros I have deployed at the office have a discrete TPM and do not suffer this issue when compiling UE.
If you watch the video: I added a hardware TPM, which helped a lot but didn't eliminate it entirely. Were you experiencing stutters without the TPM on TR 7000?
zRAM! It's like Shazam!, but for your Linux system (cue drum riff)...
Jokes aside, this was another very informative presentation. I've been running Arch on all my systems for several years, and among all the lesser-known optimizations, zRAM is not one I knew about until today. It also seems like some distros, like Pop!_OS, are enabling zRAM by default.
My current Threadripper build (7970X) just got a lot more exciting!
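For anyone else trying it after this video: on distros that ship systemd's zram-generator (Arch packages it under that name), enabling zRAM is just a config file away. A minimal sketch of /etc/systemd/zram-generator.conf; the size cap is an assumed policy to tune, with values in MB:

```ini
[zram0]
# Compressed swap sized at half of RAM, capped at 32 GiB (assumption)
zram-size = min(ram / 2, 32768)
compression-algorithm = zstd
```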
@level1techs At 2:53, the black server rack in the middle - could you share the name please? :) (Looking to buy a new server chassis)
After a Google search, it looks like a SilverStone Technology RM44.
Silverstone - we've got a separate video on it somewhere.
@@Level1Techs Thank you! :)
@@joshluvhalo Thank you :)
OMFG!! YES!!!
I hope it ends well, this might be my move....
Would love a follow-up on the TPM stutter stuff... I purchased a discrete TPM for my 7970X build specifically to (hopefully) address this issue.
A discrete TPM didn't seem to fully resolve the stutters.
@@Level1Techs It could just be the good old DPC latency / HPET issue that pops up time to time in Windows.
I'm going to be really sorry if ATX goes away; I have some -axes- PC cases from the 90s still going strong.
I wish I could play (and work, I know you work hard) with all that hardware. Instead, if I want to learn CUDA, I have to scrounge up over $300 for a Jetson Nano 2GB Dev Kit. The Jetson Nano itself is cheaper, but it has to go on the carrier board, which isn't cheap. And I know there were some extensions... getting off track here. I am jealous and envious, Wendell - you're a beautiful human being, but I do love so much hardware!
Wendell definitely has nice toys around... But to get started with CUDA, a second-hand GeForce 10xx or an old Quadro with a Linux install is probably cheaper than the Jetson.
Thank you for doing this video!!
*_AHHHHHHHH!!!_* 😖
My BRAIN hurts! 🧠🔨
I just saw that ASRock WRX90 motherboard at NewEgg earlier today.
It's *BEASTLY.* 👹
I've compiled UE4 on a 3700X before, and it took hours to do so. These systems are certainly fast.
One game that I was working on is in Unity 3D and… damn, it took an hour to build. Turns out it runs external tools to convert the textures (and it runs through a Node.js script to boot). I usually start my work from a clean build, so I do have a few hours for making coffee with every enhancement I work on lol.
Bro, I really appreciate this deep dive. I always assumed that the 64-core was going to curb-stomp the 32-core, but that isn't really the case. Unreal (and other programs like Houdini) can be real zebra cases lol
Can you elaborate on how TPM stuttering affects compilation, and how you detected that? Windows or Linux?
Only Windows. I'm not 100% sure it's the TPM, but it seemed to be 90% improved with a hardware TPM module. Linux didn't do it at all when ramping to 100% load from idle.
@@Level1Techs Thanks. My build workflow very often jumps from 0 to 100% (because Visual Studio/.idl files), so if a $20 TPM chip can have a positive effect, I’ll definitely check that out (Asus TRX50/7970X).
Yeah, mystical hardware indeed. As I saw it, the Threadripper 4000 series launched with no DIY hardware; just the OEMs are getting them. But 69 minutes, NICE!! HUH-HU!!
6:26 Wait... you compiled Clang???? That's a brave man right there!
I'm curious if one of the EPYC 9004X 3D V-Cache series chips would outperform the TR Pros, even if you went with lower-core-count EPYCs.
For some applications it would probably perform well. For the UE compilations tested here, probably not. The 3D V-Cache is nice, but only in applications where most of the operations can be served from the cache. As soon as you get a cache miss, it's back to memory speed, and with only two banks there will be a big penalty.
The stutter problem is interesting. Why could that be? Are you sure it's the TPM?
I am not sure it's the TPM, except that adding a hardware module improved things a lot. But it would still stutter on Windows.
Those cases are huge.
I'm so glad a channel like this exists.
Wendell, you're the only one I know who's buying Threadripper. Most have given up on AMD after it screwed them over with X399 and TRX40.
I've got a TRX40 system. Yeah, it's not the best - I can't enable power savings since it becomes unstable - but aside from that, it's pretty good. How did TRX40 owners get screwed over? I'm in the dark on what you are referring to.
@@bitcoredotorg AMD only gave TRX40 a single generation, the Zen 2 Threadripper 3000, while stating at TRX40's release that the platform was the necessary move, having already killed the previous platform, X399, which saw Threadripper 1000 & 2000. AMD said they would have long-term support for TRX40, but that didn't last: after 2020 Q4, when the Zen 3 Ryzen 5000 launched to mainstream, AMD went zip quiet on anything TRX40. We would later find out they had prepped a Threadripper 5000 "Chagall" CPU but cancelled it. They've left TRX40 in the dust, while their mainstream AM4 continued to get new chips from Zen 3 to Zen 3D, all of which are cut-downs of its EPYC Milan and Milan-X server CPUs.
I will go for the Threadripper 7000, but my expectation is the same: I don't expect long support on this socket. However, I really hope they do support it.
CPUs have come a long way. When I first started compiling UE4, a full build took over 4 hours. Even a modern i7 can now do it in about an hour. However, unless you are making low-level engine changes, you are probably doing an incremental build and not a full build of the engine. An incremental build is usually less than 10 minutes on most modern desktop CPUs. These machines are great for a subset of the development team, but probably overkill for most devs.
Regardless of how far they've come, LEDT (low-end desktop) simply isn't that good or that fast. Over an hour is a LONG time to wait for a compile. And for anyone making a real game, you'll still have to do an incremental build for every single change you make; over the course of development that is tens of thousands of compiles. Not to mention that without HEDT you're looking at only 16 PCIe lanes and 2 memory channels on LEDT platforms like AM5 and Z690. AM5 and Z690 (your "modern i7") are toy platforms that are heavily cut down, with toy processors.
5:47 I guess Optane P5800X comes to good use with that
The bogeyman of "Processor Groups" emerges again to scare the children, but yet again there's no data to substantiate the scary legends. Benchmark the build on Windows and Ubuntu and share the results, and then if the Windows results are markedly lower, perhaps that's the explanation. Until then, the sky's not falling in the Windows scheduler, as far as I can tell.
And I say that as (a) someone who runs Proxmox on my own 7995WX, and (b) someone who hasn't touched the kernel in 25 years. But without numbers, it's just an oft-repeated opinion.
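In that spirit, here's a quick Win32 sketch for getting one of those numbers yourself: how many processor groups the machine exposes and how many logical processors each holds (both APIs have existed since Windows 7):

```cpp
#include <windows.h>
#include <cstdio>

int main() {
    // Windows splits >64 logical processors into groups; legacy APIs
    // that aren't group-aware only see the calling thread's group.
    const WORD groups = GetActiveProcessorGroupCount();
    const DWORD all = GetActiveProcessorCount(ALL_PROCESSOR_GROUPS);
    std::printf("groups: %u, logical processors: %lu\n", groups, all);
    for (WORD g = 0; g < groups; ++g)
        std::printf("  group %u: %lu logical processors\n",
                    g, GetActiveProcessorCount(g));
    return 0;
}
```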
The "Unity building process" you refer to is actually called a "Unity build" or a "Jumbo build", and has nothing to do with Unity the game engine.
Instead, it entails combining many source files (and their headers) into a single translation unit, which can lead to significant speedups, because headers don't get repeatedly compiled in every TU and the linker has less work to do.
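A minimal illustration of the idea (file names hypothetical): the unity TU is just a generated .cpp that includes the batch's source files, so their shared headers get parsed once:

```cpp
// UnityBatch0.cpp - the only file in this batch handed to the compiler.
// Each included .cpp would normally be its own translation unit,
// re-parsing every common header; combined, that work happens once.
#include "PlayerController.cpp"
#include "EnemyAI.cpp"
#include "Inventory.cpp"
```

The trade-off is that editing any one file rebuilds the whole batch, and file-local names (statics, anonymous namespaces) from different .cpp files can collide in the combined TU.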
Can you run CFD benchmarks, like WPCcfd 😭🙏
Very interesting. It seems like when you have twice as many channels as you do CCDs, performance suffers (as we see with the 32-core, 8-channel config). It's almost like each CCD is working extra hard because it has to handle more channels. As for the 24-core with 4 channels, understand there are still 4 CCDs here (6 cores per CCD), so we don't see it impacted vs. the 32-core, 4-channel config (which you verbally mentioned). Correct me if I am wrong here.
Need to test if memory ranks also have an impact. Would like to test my 2990WX with 256 GB vs. the 7000s :)
I think that when you compare price/performance, it looks like the sweet spot might actually be the 64-core non-Pro Threadripper, with 4 channels (TRX50) and 256 GB of RAM.
I haven't run the calculations yet, but to get down to sub-10-minute compiles, you're just throwing money and hardware at the problem.
Wendell looking healthier every video
Which DDR5 modules did you use to hit the 1TB load?
I hope Zen 6 will use 12-core CCDs, making the top-of-the-line Ryzen a 24-core CPU and making a 48-core Threadripper with 4 CCDs a thing.
I had to double-check that the first 8 seconds weren't sped up
How did the Silverstone Rack Case go with the non-Pro Threadripper?
Can you run MS SQL Server under Linux and avoid the processor group nonsense?
Yeah
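For reference, Microsoft ships SQL Server natively for Linux and as a container image; a minimal sketch of the documented container route (password and image tag are placeholders to change):

```sh
docker run -d --name mssql -p 1433:1433 \
  -e ACCEPT_EULA=Y -e MSSQL_SA_PASSWORD='Chang3Me!Please' \
  mcr.microsoft.com/mssql/server:2022-latest
```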
I didn't understand 1/10 of this 😂, I'm just fascinated by these Threadripper CPUs.
I run a 7900X with 32 GB of 6000 MT/s DDR5 RAM for focus stacking extreme macro photography.
LOL... literally, as I am developing on UE5.
Through the magic of buying 3 more of them..
Speedups going from 32 to 24 cores? Perhaps memory-starved cores? I do know that currently, with Linpack and other memory-heavy workloads at 6200 or 6400 MT/s, Buildzoid can manage to fully utilize only between 10.5 and 11.5 cores (21 to 23 threads) on 12- and 16-core desktop parts. Now, 8 channels of memory should prevent that, but if there is a bottleneck somewhere in the Infinity Fabric, maybe that warrants a small investigation? Like checking whether all cores fully utilize memory, and seeing whether an IF overclock alone improves results.
Also, it would be cool to see how X3D parts handle it, but maybe that is further in the video... I keep pausing and commenting... well, at worst it shouldn't be bad for the algorithm.
If that were the case, wouldn't the issue also present itself at 64 core and 128 GB ram?
@@origamitom Depends how many CCDs it has and how many memory channels there are. Also whether the bottleneck is in the IF, and whether it comes from the IO die or from the links to the CCDs. Not to mention, AFAIK EPYC and Threadripper have a different IO die that might alleviate this issue (I suspect it does; the routing of data for desktop Ryzen is... not super bandwidth-efficient).
I need to mention again - this is only when cores have as much memory thrown at them as they can handle. In "normal" use, 12 and 16 cores on a system like desktop Ryzen should be fine. The issue is most likely how much bandwidth each CCD gets from the IO die (which is why most EPYCs and Threadrippers use a lot of CCDs with a lot of disabled cores for lower-end parts), plus how much traffic there is between CCDs.
In desktop parts with 2 CCDs, CCD1 communicates with CCD0 through the IO die. CCDs in EPYC and TR that are on opposite sides of the IO die might have the same issue (IDK how it is for the ones in the same row, so I'm picking the worst case possible). But again, the IO die there is much "beefier". Still, one should check whether programs necessitate a lot of inter-CCD communication - because then it might throttle bandwidth in very odd ways.
I don't think we actually have 32-core numbers. Here it's the old 5000-series 32-core; you can't compare that one with the 7000, which isn't even in the table. We also don't have an 8-channel 24-core listed...
Outstanding content
The overhype of UE is actually counterproductive, as it keeps graphics evolution stuck with old screen-space and shadow-map techniques. Convenience-minded developers may forget about path tracing and use mediocre Lumen instead, which has lots of light leaking, inaccurate scene lighting, flat reflections, etc.
What’s an alternative except writing your own?
You have more computing power and memory there than many universities here in Spain.
Waiting for SQL video :)
Opinion wanted: should I buy a 24-core Threadripper Pro CPU, or is it so much better to get the 24-core 7000 series - Pro or non-Pro?
I think you should go for the non-Pro, unless you really, really need the Pro.
What is the AIO water cooler model at 4:29? @wendell
Thanks for posting this. It would be cool if you could do a real-world rendering test using DreamWorks' MoonRay (now free and open source), even if just for a 10-second scene.
Can you give me a step by step I can run?
@@Level1Techs Not me personally, but if you contact Dreamworks, I bet they would love to work with you. They've made a serious commitment to the open source community.
Gigabyte just announced a new TR motherboard :)
The TRX50 AI TOP
Will you test it ?
Damn, sad to see we're still doomed by the TPM stutter issue.
TPM is overrated anyway. Just like Windows is overrated.
But how does it bench on MS Office Suite?
@1:10 - I wonder will the axe/blade/handle story "Trigger" anyone?
In a case like this, deactivating Hyperthreading may help.
The TPM stutter still wasn't fixed! :-0
It took me a long time to learn a valuable lesson about computers: PCs are by far the fastest-depreciating things there are. I built computers for 25 years and lost my ass on all of them. I now buy only last-gen PCs from Dell or Lenovo. I am on a refurbished i7-13700 that I got for $560. It's so much more satisfying getting a deal than building your own at an extreme loss.
One of the sad things about the experience building on Windows being better? Windows sucks as a platform. It's a shame that all that great engineering has been done to have it all tossed on a bonfire with bullshit features and spyware.
Oh well, it could be worse. Try to build the JVM on Linux or Windows. Back in the day, Linux took 40-45 minutes to build it, Windows took half a day, and that was after all the downloads and setup to get the required libraries installed in the first place. Real devs run Linux, or possibly Apple for some tasks.
When are we going to get 3d v-cache on threadripper?
Probably never. Even on Ryzen with two CCDs, they would put it on just one of them.
One could imagine a future where all CCDs are basically under the same die with a thick 3D V-Cache layer on top, maybe in 20-30 years...
@@KoRNeRd Zen 5 Threadripper will have it.
7 months later and there's STILL f* all motherboards for TRX50. Where is the Zenith Extreme? The Aorus Xtreme? Etc., etc. Compared to the extended functionality and greatly increased motherboard range of TRX40, TRX50 is a bit... sh#t. AMD putting another nail into enthusiast/non-Pro Threadripper!
Board partners are burned out from AMD's lack of action on the previous TR chipsets.
MSI put a lot of money into 2nd-gen X399 with that 2018-released MEG Creation board that only saw a single generation of use, once AMD killed X399 and claimed it would have been impossible to put Zen 2 on it. MSI came back with a new Creator board for TRX40 and got slapped with single-generation use again. They are doing FAFO with AMD, and can you blame them?
@@vh9network And to think, AMD marketing for the non-pro TRX50 was effectively "Hey! Non-Pro TR is back! We love our enthusiasts!"
In reality, AMD killed any hope of a return to enthusiast TR with that insane and very deliberate TRX40 instant-EOL manoeuvre, killing their entire enthusiast userbase in the process. Such a shame for enthusiasts, since TRX40 was a superb base and PCIe 4 still has years of life left to maximise its potential. AMDumb indeed.
@@vh9network The pandemic caused most of that.
Yowza. This is an expensive video.
Interesting. Would like to see a follow-up video on an Intel platform, including a workflow for any kind of real-time VR.
Nah. No Intel.
Who is upvoting the comments from YouTube's porn bots? What?!
No one is doing full rebuilds for every run. Plus, there are also distributed builds.
He's using the source, can't use distributed build.
@@bac483 I am not sure what you mean. You need the source to do a build.
The 'rip of Theseus
Just let me know when you don't want one of those anymore.
If you really like technology, you can't like Windows.
Learn how to use GNU/Linux. Zram is something too basic for you not to know.
You mentioned testing Grok. Are these Threadrippers related to that?
Still disappointed with AMD's decision to only use a 4-channel memory configuration with 1DPC motherboards for the non-PRO variant and 8-channel/1DPC for the PRO variant. They f'ing knew that 2DPC is a pain in the ass on these platforms, so they shouldn't have castrated the theoretically available 12 memory channels - at least on the PRO configuration - which would have given users more memory slots.
That's what happens when they butchered the previous TRX40 and then split all that extended functionality in two for TR 7000. AMD doing what AMD does best... shooting itself in the foot. Where's the TRX50 equivalent of all those TRX40 motherboards... Zenith/Aorus, etc.?
So 96 cores without SMT is the sweet spot? 😄
It's a lanes problem: you want to send lossless 8 bytes and 8 bytes in real time. No studio uses Win 11 or these core-overloaded chips; they're sending in bits to memory as they can't send natively and instantly. It's better on the older tech; these huge CPUs are too slow and lean on memory. CL16 16 GB XMP x2 on a Z270 Gigabyte with an i7-7700K would have done it better at lossless in studio, stored as audio separate at HQ from video. Hardware today is a scam to sell shares. All lies on specs and binned CPUs, dependent on Win AI and 4 GB RAR codecs and compression. Hence more legs on the CPU: three halves tied together trying to do one high-end part but can't, so memory isn't on the CPU - it's stored, fetched, checked; not ISO 4-bit studio pretending to be 16.
People still run Windows for something that isn't desktop?
If your game needs a 10k PC it's going to fail
There is a huge difference between developing on a PC and merely running on a PC: compile times, running the game in a debugger, performing automated/unit testing, running an IDE, running tracers and other applications, having many documents open, running a server next to the client, etc., etc.
Nice tax write-off...oh, my bad, wrong channel!
Unreal engine more like possed engine lol they them engine
Wat?