The base floor of the Apple Silicon Mac GPUs is pretty high compared to the iGPUs found in x86, meaning any Apple Silicon Mac has enough GPU horsepower to play modern titles in HD at low-mid settings. While it does suck that the high-end GPU options are a far cry from the top offerings from AMD and Nvidia, the biggest obstacle is Apple's refusal to support Vulkan and a few other common libraries.
While the memory on the Intel Macs does get shared with the iGPU, it is not unified, it is reserved. The memory address space that the GPU can access is separate from that of the CPU and is non-resizable (in effect, at boot the OS selects a portion of your memory and reserves it just for the Intel iGPU, even if the GPU does not need it, and it can't change that separation post boot). Very different from the unified memory model Apple is using, where CPU and GPU can both point to the same memory page tables and read and write the same memory at once (be careful if you do this), etc. Sharing memory like this with a GPU is very uncommon in consumer devices (some data centre systems have done this before). With the PS there is some shared memory like this, but it is a reserved section of memory: when the game starts it needs to pre-allocate the region of memory it wants to share, and that region is read-only for the GPU. I believe the Xbox does not support this shared address space model at all, but it does support re-sizing the amount of memory reserved for the GPU on the fly; however, you can't have the GPU and CPU address the same memory pages.
That's just false. A portion of memory is reserved, usually 32 or 64 MB; how do you expect the GPU to work with so little VRAM? Memory is shared with the GPU during runtime. Apple's 'unified memory' is just marketing bullshit; they didn't invent anything new. Both Intel and AMD CPUs have different types of accelerators: they also have a GPU, media codecs, even an NPU and ISP in laptop SoCs. Apple just copy-pasted its iPhone chip and changed some core counts in a Verilog/VHDL file. Apple Silicon is more efficient only because of 5nm and lower frequency.
I'm actually impressed. I have the M1 Max and am waiting until the M3 series to upgrade and pass this one down to my partner. Now I definitely want to wait, seeing how much improvement they got from M1 to M2; I can imagine what the M3 will be. I'm hoping they focus more on GPU performance since they are trying to push their way into gaming more. Right now I'm playing Hogwarts Legacy through the Game Porting Toolkit, and it works pretty well for me, about 30 fps consistently at 1440p ultra settings. Hopefully this will convince some companies to start porting games over so they are optimized.
I always consider Apple to be one or two gens behind, but for me it's more than good enough. I don't notice these differences unless I am doing extreme 8K rendering. Most of Apple's user base is just casual users anyway. Also, for the M2/M3 X-treme chip, I hope we see Tim Cook with a backwards hat on and a skateboard saying: 'With the unveiling of the new X-treme chip, here's Tony Hawk..'
There is only one question I would like anybody who is criticizing the Apple M2 Ultra to answer: why the hell are you mentioning OpenCL?! Bringing up OpenCL, which Apple deprecated in 2018, to reach the conclusion that Apple is not a match for Nvidia has two possibilities: 1- You don't know Apple and you are not aware that they deprecated OpenCL a long time ago. 2- You are an Apple hater and you are not OK with Apple offering a good product. When you test Apple Silicon you use only Metal. This is the only fair way to test. With Metal, the M2 Ultra is matching the RTX 4080 using CUDA, which is a very, very good result. Imagine if we got an M2 Extreme; even the Nvidia RTX 4090 would be way behind. People use Metal with Apple and CUDA with Nvidia. This is the fact. Apple does not need a dedicated GPU because their integrated GPUs are matching dedicated GPUs with much less power consumption. PC enthusiasts need to be open minded about what is good for performance. Things don't have to be PC-like in order to perform well!
I'm giving a good faith reply here; it's long as I wanted to fully address this.

1: First, I did use Metal. Second, we go with the benchmarks we have. You cannot compare Metal scores against OpenCL; they're different metrics, as evidenced by chipsets that exist on both Windows/Linux and macOS. Same GPU = different scores between the two. Cross-platform is tough to compare. We already know that the M2 Ultra isn't even as powerful as a 6900 XT in Metal compute; the 6900 XT is about 17% faster. In OpenCL, the 6900 XT is about 32% ahead of the M2 Ultra. That leaves us with a discrepancy of 15-16%. If we were to, say, bump the M2 Ultra's score of 120,006 by 16%, that'd be 139,207, which is below the RX 6800 XT. Assuming by some miracle that the M2 Extreme is 2x as fast (the M2 Ultra isn't 2x the GPU speed of the M2 Max), it'd still not be a 4090. That's completely expected; the 4090 draws 450W, is clocked very high, and has a staggering 16384 ALUs. Apple is a generation behind in its GPU tech. I don't know what else to say, but when it comes to battery life in the laptop space, AMD and Nvidia do not have an answer. Apple's primary concern is not top performance but efficiency, as laptops drive Mac sales by a considerable amount. Desktops are the afterthought.

2: I must be the most confused Apple hater, since I own a Mac Pro 2019, MacBook Pro M1 Max, Apple Watch, iPhone 14 Pro, Apple TV, and a 2 TB iCloud subscription with Apple Music, have a work-provided MacBook Pro M1 Pro, and have been using Macs since the PowerPC era; the first computer I had all to my own was a PowerMac G3, and I've never had a PC as my primary computer. I do not blindly accept what Apple tells me, and my only hope that Apple gets better is if I and others demand better from them. You can check my YouTube work or the guides I've written for the Mac Pros. The M2 Ultra is an absolute monster for certain tasks, chiefly video editing, but it's not perfect and comes in at a high price for something that has zero upgrade opportunity, and that's the big problem here. Next year in fall 2024, Nvidia and AMD will release new GPUs. However, Mac Pro 2023 owners would have to sell their Mac Pros and buy an entire new computer to get Apple's newest GPU performance.
@@dmug First, sorry for the late reply. Second, I did not mean you specifically; I meant that generally people don't like to see Apple making something good. Third, you can't use OpenCL at all. Are you aware that the benchmark you got from OpenCL is unrealistic? OpenCL does not work well with Apple Silicon, so how do you accept it?! There are other methods, and if not, then there is no way to compare. Did you try applications with the same output and compare? Or did you try a game optimized for both and compare? You simply used something that does not work with one of them and compared; I don't know how that is logically acceptable. Apple's GPU is not a generation behind, and efficiency doesn't mean less performance, it means higher performance per watt. Apple's GPU in fact is more technologically advanced than Nvidia's and AMD's. Have you heard about something called Tile Based Deferred Rendering (TBDR) and how it is drastically different from Immediate Mode Rendering (IMR)? Apple GPUs give much better performance per watt because of this and because of the unified memory. Higher performance per watt does not mean the GPU is weaker; it means the GPU gives a certain performance, say X, with less power, a performance which, if you try to get it from Nvidia or AMD, requires higher power. Teraflops is not a measurement either, because TBDR GPUs need less computing to draw a scene than IMR, since TBDR doesn't overdraw the hidden parts of the scene and then discard them as IMR GPUs do. The only way to compare the two different approaches is to use what is optimized for each one, like Metal for Apple and OpenCL/CUDA for Nvidia/AMD, or you can use real-life comparisons with any application or game optimized for them. But definitely, deprecated OpenCL is not the way we should go. People are really stuck in the traditional PC way of understanding computing, while there are always other philosophies. And on PC advantages, I particularly agree with you. There are things which PCs shine at.
@@sayafalwagdany6111 Perf/W on desktop is useless. In ML, CUDA is ahead of Metal, so why use Metal? Why not use an open source, standard API for the benchmark comparison? It's fair and square. Remember, if the Apple GPU were superior, why can't it handle UE5 and Blender well? Why do certain games stutter? Is their tech backwards? An iGPU is still an iGPU; there's no comparing it to a dGPU, because an iGPU's main RAM pool and graphics RAM pool are the same resources being shared. Sounds like a pathetic Apple shenanigan btw.
@@sayafalwagdany6111 Game optimization like RE for the Mac M2 is still poor compared to PC, not even close in compute performance to Nvidia and AMD. Stop spreading FUD; see Linus Tech Tips' independent review of this M2 Ultra GPU. There's no magic alchemy in the M2 series CPU either; raw performance still lags behind AMD, Intel and Nvidia.
A fully loaded Mac Studio M2 Ultra is USD 8.6k with 192 GB of RAM. That is like 15% cheaper than 8 x RTX 4090 (24 GB each), without the computer yet. The M2 Ultra may not be a good gaming machine, but it's a steal for people doing AI stuff.
I'm not sure what you're referencing, but the price of a 4090 is $1599, not $24,000, and it shouldn't be any sort of shocker that a card with the TDP of a 4090 walks all over the Ultra, especially in ML/AI. As someone who's done some ML training, many of the popular stacks are hyper-optimized for, or only available in, CUDA unfortunately. Also don't take my word for it; here's a vid on the previous generation (basically the 3080 Ti is 3x faster than the M1 Ultra): ua-cam.com/video/k_rmHRKc0JM/v-deo.html Generally the cope counter-argument is that for serious loads you'd be leasing cloud services rather than doing it locally, thus the lesser-performing NPU/GPU combo is less of a factor vs an expensive GPU. There's a lot to like about the M2 Ultra, but when you're willing to forget about power draw, it's not the performance king in CPU and is even further away in GPU.
@@dmug I understand your channel is not focusing on AI. I myself am just starting recently. Just want to throw in my 2 cents.
- Alex's video: made before PyTorch had support for the Apple GPU, and this is important. No idea how much RAM his box has, but the M1 Studio maxes out at 128 GB of RAM and the M2 Ultra Studio at 192 GB (not sure if RAM mattered for that specific test, I am too new to this). RTX 4090 RAM is 24 GB, so 8 of them is $12k.
- Cloud GPU price: H100 (80 GB) = $2/GPU/HR (not including the machine) (Lambda Labs) = $1440/30 days (assuming running 24/7), before machine cost. Quadro RTX 6000 (24 GB) with minimum machine config = $1000/mo (Linode).
- AI model: the full LLaMA model needs ~130 GB (yes, VRAM, that's why people are building boxes with 5-8 cards).
With a fully loaded Studio M2 Ultra, one can do model fine tuning and run it on a local box at a cheaper price. I saw a non-English video doing a comparison using a 128G M2 Ultra. Hope in a few months we will see more AI-related performance reviews, not just test scripts. Cheers!!
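For anyone trying to size this up, here's a quick back-of-the-envelope sketch in Swift (my own illustration; the byte-per-weight figures are just the usual rules of thumb for weights only, ignoring KV cache and runtime overhead):

```swift
import Foundation

// Rough memory footprint for just the weights of a model, ignoring the
// KV cache, activations, and runtime overhead.
func weightFootprintGB(parameters: Double, bytesPerWeight: Double) -> Double {
    parameters * bytesPerWeight / 1_000_000_000
}

let params = 100_000_000_000.0  // a hypothetical 100B-parameter model

for (label, bytes) in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)] {
    let gb = weightFootprintGB(parameters: params, bytesPerWeight: bytes)
    print("\(label): ~\(Int(gb)) GB just for weights")
}
// fp16 ~200 GB, int8 ~100 GB, int4 ~50 GB -- which is why quantization matters
// so much for fitting big models into a 192 GB M2 Ultra's unified memory.
```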
I literally work repairing MacBooks and this is such a dumb name. But then again, Apple Silicon is magic at this point, because my base model M1 MacBook Air outperforms my custom-built PC with a 3080 Ti that costs multiple times more in my video editing workflows. (Mainly H.265 HDR, so that's probably why, since M-series chips have the H.265 encoders from the iPhone.) Idk how this is possible since the MacBook is fanless and can maintain that performance for 8h on a charge.
Hi, so do you think the M2 Ultra can be used to run an LLM with 100B parameters? I'm so curious about that, since 100+B parameter LLMs often need 5 or more A100 cards, which are much more expensive than an M2 Ultra... Thanks.
@@saraili3971 I'd be leaning towards no, however you may want to look at videos that discuss the performance of the Neural Engine, because based on a video I saw by Ziskind, in some tests the Neural Engine could handle wayyyyyyy more than the default GPU (by like multiples of 10 times better performance). However, I'm not 100% sure what tests would be relevant to you.
@@D0x1511af hey genius Linus video literally didn’t show h265 encoding + Linus unironically said that the product is good and plenty powerful it’s the marketing he disagrees with And like I said, I literally have a pc with a 3080ti in it as we speak, I have an s22 ultra as a 2nd phone (main is 14 pro max, which i only use cause it’s better camera, battery and brighter screen)
@@himynameisryan apple silicon?? there's no magic silicon? only custom ARM ISA extension developed specifically for MacOS IOS and IpadOS or any apple products ... there's no alchemist power suddenly make Apple sliicon nuke x86 ISA ...or beat Nvidia hardware GPU? even M2 ultra still lagging behind in ML and blender OTOY and DR? what Apple silicon could offer? NO CXL? no ECC RAM? no NVLink throughput no CUDA ...Pfft ....CUDA far mature than Metal? no RT? NO DLSS ? and tonnes Apple silicon still lags behind x86 and Nvidia based machine?I owned an M based apple products too... the only major advantages they given only superior battery life capability
This is hardly limited to Apple; the tech space values bombastic statements as they drive share prices. Jensen Huang of Nvidia and Sam Altman of OpenAI make Apple look tame by comparison. Jensen has made absurd statements like "programming is dead." Altman thinks he should control 1/5 of the world's economy and that people should be paid in "compute" shares. That's not even touching Elon Musk, who's promised a gazillion things like self-driving cars that earn you money while you're at work by acting as a taxi. Apple making claims about GPUs and RAM is pretty tame in this space. It's still cringe-inducing nonetheless.
Raw performance is half the story; Apple uses a unified system, so it has its own advantages, and it even smokes the 4090 in some situations. And remember, it uses less than half the power of the 4090 itself, Apple Silicon runs much cooler with no crazy heating issues, and there are the 🤞 media engines.
Not even counting the bandwidth and latency, there's the shared RAM, which shortens the processing cycles needed to work on the same pieces of data, and the cache pipeline. The Apple team smacked a home run on this one. Not a single complaint about the hardware here.
I've yet to see any comprehensive benchmarks comparing the M2 Ultra. Considering where the compute benchmarks are (not as fast as the 6900 XT), my guess is it'll perform as such. This isn't to say the M2 Ultra is slow, but it won't be besting a 13900KS build with a 3090 or 4080 or better.
@@dmug Matthew Moniz did a comparison of a 13900K with an RTX 4090 against the M2 Ultra; the M2 Ultra actually beats it in all the video editing work (Lightroom export, Puget, ProRes, timeline smoothness, etc.), where the Mac usually wins, and the PC had the upper hand in 3D, Blender and gaming. And consider that it uses half the power and is almost dead silent the whole time.
Yeah, that's expected; the ProRes workflow on the media engine is just dummy fast. My M1 Max is a bit quicker than my Mac Pro 2019. That's the one place Apple really has it nailed down, and it's totally expected. It's a nice system as it frees the GPU to do the compute instead of codec mashing. There's a give and take with the media engine, as I think Apple has hard-baked the codecs into it, thus it can't do AV1 (I'd like macOS to even support AV1 at this point), whereas in a modular computer you'd just buy a new card for AV1 support. First time watching Matthew Moniz, that was a nice quick vid.
I look at a lot of companies, and I hate Apple as a company. I would only get an iPhone to impress people, but I hate all Apple hardware due to their business model and decision making. You should be supporting NVIDIA. They're a trillion dollar company and make modular hardware, and any hardware is able to use an NVIDIA GPU.
@@dmug There are many reasons to hate Apple as a company, and they get lots of profit. They make the RAM and storage non-removable, and you can't upgrade those hardware components on your own. Apple hates modular hardware, and with iPhones the battery can't be removed. Give me reasons why NVIDIA is a bad company? They make modular graphics cards that can be put in any machine, which is much different from Apple's business model. I wouldn't mind working for NVIDIA. I can live without Apple products, but I would feel sad if NVIDIA died as a company, because no one else can compete with NVIDIA. I am simply saying Apple as a company pisses me off, but NVIDIA with their graphics cards like the RTX 4090 aren't the thing causing me problems. What's your issue with NVIDIA? And this isn't like some smartphone where everyone can easily come up with their own phone hardware. Graphics cards are superior to a smartphone, and NVIDIA has caused me fewer issues with their business model compared to Apple.
Minor correction:
In the video, I mentioned Intel Macs had a shared memory design, which is true, but I really should have clarified that the Intel GPU has a reserved pool of RAM. That is not the same as Apple Silicon, which is a much superior design where the processing units (vertex shaders, pixel shaders, texture units) are part of the same pipeline.
The key aspect of this is that with Intel iGPUs you still need to copy data to and from the GPU's reserved memory. With Apple Silicon you don't need to copy at all; just pass the memory address to the GPU and it can read the data in place. And that reserved capacity is static, it can't be increased or reduced, so it eats into your system memory even if you're just using very basic apps.
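To make the "read it in place" part concrete, here's a minimal sketch of what that looks like from the software side, using Metal's public buffer API in Swift (my own illustration; the sizes and data are placeholders):

```swift
import Metal
import Foundation

// CPU fills ordinary page-aligned memory, then the very same pages are wrapped
// as a Metal buffer with no copy; .storageModeShared lets the GPU read them in place.
let device = MTLCreateSystemDefaultDevice()!

let pageSize = Int(getpagesize())
let length = 4 * pageSize                      // bytesNoCopy requires page-aligned sizes

var raw: UnsafeMutableRawPointer?
posix_memalign(&raw, pageSize, length)         // page-aligned allocation
let floats = raw!.bindMemory(to: Float.self, capacity: length / MemoryLayout<Float>.stride)
for i in 0..<(length / MemoryLayout<Float>.stride) {
    floats[i] = Float(i)                       // CPU writes the data
}

let buffer = device.makeBuffer(bytesNoCopy: raw!,
                               length: length,
                               options: .storageModeShared,
                               deallocator: { ptr, _ in free(ptr) })!

// A compute or render encoder can now bind `buffer` and read the exact memory
// the CPU just wrote. On a discrete GPU you'd allocate a private buffer and
// blit the data across instead.
print("\(buffer.length) bytes visible to both CPU and GPU without a copy")
```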
@@hishnash thank you for the clarification
Not superior, because you can't run the older macOS, can't do 32-bit apps and games, can't run 4-6 monitors, and there are still games and apps coming out for Intel vs the Apple junk. And if it's superior, why haven't they given that tech to the Apple TV? Because it sucks for gaming.
Intel macs still king
@@hishnash I think you are wrong; where did you get this information? Reserved memory doesn't mean that only reserved memory is used; if it's reserved, how can the CPU write to it?
This whole video was literally what I've been asking my brain about all week. I thought I'd never get an answer. Thanks for reaching into my mind and making a video about it
Really informative and helpful. Thanks for making a really clear breakdown between Apple's GPU cores vs. NVIDIA's
Finally a reasonable explanation of what "cores" are on the very different platforms. Thank you. It seems that right up to the Studio Ultra (small desktop), the tremendous power savings and quiet operation are the main selling features of the M architecture, and graphics performance is good enough. If the M3 doesn't change design to include thousands of additional external PCI lanes plus that hybrid graphics architecture you mentioned Apple had patented, I would say they're completely done competing in that space.
The new AMD architecture is almost as power efficient as ARM. We will see in 2024 how the impressive benchmarks are turning out in reality.
I think the media engines help for those that are doing video particularly, along with the Neural Engine, but yep, the silence and, weirdly, the portability are pretty damn impressive. Having been on PC for decades, this has become quite the surprise.
Apple might have some really good technology and whatnot, but it all comes down to 3rd party apps adopting such tech. Problem is, Apple has always been a moving target. Not to mention all the secrecy, which makes it really tough for the 3rd parties to adopt. And the drama with some other companies…
Yep. The whole Nvidia vs Apple thing is two companies who make great products and are also jackasses. Apple could solve its gaming issues nearly overnight if they brought native Vulkan support or at least a hyper-optimized translation layer like DXVK.
To reply to my own comment, I'm kinda glad that no 3rd party other than Pegasus made any MPX modules for the 2019. Imagine them spending resources to come up with a product only for it to become obsolete with the 2023.
FFS Apple, whatever you make and call innovative, goddamnit stick to it! With the 2019's modularity being the marquee feature and then ditching it just one generation later, I'm just sick of the moving target thing.
@@yjchoi17 MPX struck me as doomed from the start. The entire concept was so Apple could deliver video passthrough via Thunderbolt, with the bonus of higher power delivery, but no sane person was going to pay $5999 for a $900 GPU just for those two features. Apple became obsessed with video over Thunderbolt, which makes sense for laptops; I personally love it, as I connect my work laptop to my two 4K displays and it charges my computer at the same time via one cable. However, I give zero shits about Thunderbolt on the desktop as long as Thunderbolt cables are prohibitively expensive.
It's frustrating, as Apple almost certainly could develop some sort of modular RAM upgrade (at the very least for desktops), and they do not need to integrate the SSD controller into the SoC. There are plenty of examples of ARM chipsets that have these things.
@@dmug Exactly, just look how fast Rosetta 2 is. If a company like Valve can bring Proton to Linux, Apple, a 3 trillion dollar company, can easily bring the best alternative for Metal.
@@soraaoixxthebluesky The fun part is, the best way to do that is to just throw money at the company doing that anyway (CodeWeavers, who also did Proton), and it would result in an open source solution.
Good editorial sir.
I enjoyed listening to your take. What I like about Apple SOC is the lack of heat during basic operations. 😮
Oh for sure, I don't mean to downplay how nice it'd be to have a helluva lot less heat output than a Mac Pro 2019. I just wish Apple had the option for dGPUs for the times when you don't care about TDP.
@@dmug Yes exactly!
Not everyone cares about efficiency. Especially when time is of the essence.
Another great video mate!
Apple: we want gamers in our platform.
Also apple: we are not interested.
Amd: we have an Apple silicon killer.
Apple: nah ur lying.
💫strix halo entered the market.
Apple:
It's about to get really interesting with AMD; we're already seeing extremely small performance-minded devices like the ASUS ROG.
Great video!
Thanks! Glad this is still getting viewers.
Appreciate your honesty. Our studio waited so long for Apple to catch up to PC. The M's were incredibly disappointing. We finally bit the bullet, and switched to PC for the first time in 14 years and we're so happy
I've seen this same comment on 4 videos, can u chill with the PC dickriding
Would you mind sharing what is the main software used by your studio?
@@SpaceChimes Lightroom, Photoshop, Premiere. Lightroom uses all the physical cores and E-cores. It utilizes the GPU a fair bit as well. My 13900K gets hotter running Lightroom than it does gaming.
Hey mate, would you mind doing a follow-up video on this looking at the supposedly 'improved ray tracing' on the M4s? Thanks. You make great content; I'm a new happy subscriber!
Possibly, I haven't really paid it much attention since there's not a ton of applications that support it. If I had to guess, it's further optimization of the Bounding Volume Hierarchies (BVH). It's unclear if they have an analog to the RT cores found in Nvidia GPUs, where there's physically something different.
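For context on what "the BVH" even is here, this is roughly how you build one with Metal's public ray tracing API (a hedged sketch of my own; the geometry is a placeholder triangle, and nothing here reflects what Apple actually changed in the M4):

```swift
import Metal

// The acceleration structure Metal builds over your triangles is the BVH that
// rays are traced against. Whether its traversal gets dedicated hardware help
// is the part Apple doesn't document in detail.
let device = MTLCreateSystemDefaultDevice()!
let queue = device.makeCommandQueue()!

// Placeholder vertex buffer: one triangle, three packed float3 vertices.
let vertices: [Float] = [0, 0, 0,  1, 0, 0,  0, 1, 0]
let vertexBuffer = device.makeBuffer(bytes: vertices,
                                     length: vertices.count * MemoryLayout<Float>.stride,
                                     options: .storageModeShared)!

let triangles = MTLAccelerationStructureTriangleGeometryDescriptor()
triangles.vertexBuffer = vertexBuffer
triangles.vertexStride = 3 * MemoryLayout<Float>.stride
triangles.triangleCount = 1

let descriptor = MTLPrimitiveAccelerationStructureDescriptor()
descriptor.geometryDescriptors = [triangles]

// Ask the device how big the BVH and its scratch space need to be, then build on the GPU.
let sizes = device.accelerationStructureSizes(descriptor: descriptor)
let accel = device.makeAccelerationStructure(size: sizes.accelerationStructureSize)!
let scratch = device.makeBuffer(length: sizes.buildScratchBufferSize, options: .storageModePrivate)!

let cmd = queue.makeCommandBuffer()!
let encoder = cmd.makeAccelerationStructureCommandEncoder()!
encoder.build(accelerationStructure: accel,
              descriptor: descriptor,
              scratchBuffer: scratch,
              scratchBufferOffset: 0)
encoder.endEncoding()
cmd.commit()
// Shaders then intersect rays against `accel`.
```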
In Perf/W, Apple is quite a bit ahead of NV.
It is worth noting the OpenCL score on macOS is very poor (only relevant if your task only supports OpenCL), since the OpenCL driver has a LOT of issues (it's deprecated). The math that is done in the Metal test and the OpenCL test in GB6 is the same, and the score is computed in the same way, so you can compare between backends; in fact, that is the point. The aim is that you can see the massive perf benefit you get from using Metal (or, in other words, how poor the OpenCL driver is).
Yeah, I think the Metal score is more interesting, as you see that Apple really pushed the GPU quite a bit, which is why I led with it. It's now performing at least near top-tier previous-gen AMD, so with that in mind, it probably would be closer to AMD's performance. I know Apple hasn't put a lot into it, but I haven't seen much on how good or bad it is. Got any sources? Always appreciate boring tech docs.
Nobody cares about Perf/W if performance is shit. What is required is low idle W and fast performance when needed. Either Apple moves to another architecture or there is a bad future coming ... and one day a switch back to Intel/AMD again.
I'm also ready to predict that we won't have CPUs smaller than 3nm; we've hit physical limits now. That's why Apple had to put an Ultra into the Mac Pro instead of an M2 Extreme (the one they planned with 96 cores).
The biggest issue with Apple & GPUs is that the only API that's reliable on them is Metal.
@@BlinkHawk Metal is not that much of an issue.
@@hishnash It's a big issue for a developer. I work on 3D graphics, and Metal is very lacking/restrictive compared to other APIs. Not only that, but it's a pain if you are making an app or software that's multiplatform. You can either use Vulkan and deal with the inconsistencies of MoltenVK, or write a Metal backend which may not have all the features you need. There's a reason a lot of AAA developers stopped releasing games on Mac ever since Metal came out, and they did try making backends for that API.
Nice video. Can you do an introduction on Apple's NPU and how it accelerates machine learning?
I probably will at some point but I have a lot of reading/research to be done and as I understand it, Apple hasn't released a ton of information about the NPU.
@@dmug Cool! Thanks for the hard work.
I think they are trying to force the industry into developing for their integrated GPU before they start thinking about external GPUs. To be honest, I don't think Apple ever dreamed of the success of Apple Silicon.
the level of detail in this video is outstanding!
Thanks, been trying to do explainers like this now and again. I did one on macOS memory management and two on sepOS, and plan to do one on the neural engine, and probably SSD life spans.
One thing I will never do is be on "team _____" regarding a computer. Great video!
Basically my philosophy here: I am an Apple-focused UA-cam channel as it's my preferred platform, but I pay Apple for stuff; they don't pay me.
I don't understand people who are absolutists, as Apple does some good things and bad things; the same can be said about, say, Microsoft or various Linux distros.
Basically:
One Apple GPU core = one NVIDIA SM (streaming multiprocessor) = 2 AMD CUs = 128 FP32 units (shader cores / ALUs)
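And to see why equal "core" math still doesn't make the chips equal, here's a quick back-of-the-envelope calculation (core and ALU counts are the public figures; the clocks are approximate values I'm assuming, so treat the TFLOPS as rough):

```swift
import Foundation

struct GPU {
    let name: String
    let coreCount: Int       // "cores" as each vendor markets them
    let alusPerCore: Int     // FP32 lanes per "core"
    let clockGHz: Double     // approximate sustained/boost clock
}

let chips = [
    GPU(name: "M2 Ultra (76-core GPU)", coreCount: 76, alusPerCore: 128, clockGHz: 1.4),
    GPU(name: "RTX 4080 (76 SMs)",      coreCount: 76, alusPerCore: 128, clockGHz: 2.5),
]

for gpu in chips {
    let alus = gpu.coreCount * gpu.alusPerCore
    // An FMA counts as 2 FLOPs per ALU per clock.
    let tflops = Double(alus) * 2 * gpu.clockGHz / 1000
    print("\(gpu.name): \(alus) ALUs, ~\(String(format: "%.1f", tflops)) FP32 TFLOPS")
}
// Same ALU count, very different clocks, so very different peak throughput.
```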
Thank you for explaining this Apple marketing naming.
Hey Greg. Great channel with tons of relevant info, thanks so much. I have a question regarding cores: what are the differences between the performance cores, efficiency cores, and the virtual cores that I see in my Activity Monitor but that never seem to get used?
Thanks,
AM
Edit: Another question: What are PCI pools? I just found out about this management aspect of the Mac Pro 2023.
In Apple Silicon and the most modern AMD and Intel CPUs, there are two core types (AMD's are a little different): performance cores, which are high-clocked and aimed at maximum processing, and efficiency cores, which are lower-power cores designed for background tasks that do not require the full might of a performance core.
Virtual cores are assigned to virtual machines as a way to ensure performance by assigning some prioritization when scheduling execution. There's no additional performance beyond priority.
In Intel Macs, "virtual" cores used to refer to Hyper-Threading, which is entirely different: Intel CPUs could execute two instructions in a clock cycle using Hyper-Threading when certain conditions were met, increasing performance. The term has now evolved.
Here's a secret: I learn a LOT from Eclectic Light's blog.
eclecticlight.co/2023/10/23/how-does-macos-manage-virtual-cores-on-apple-silicon/
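If you want to see the P-core/E-core split in action yourself, here's a tiny sketch using the public Dispatch API (my own example; the busy-work is just filler): you never pick cores directly, you declare a quality of service and the scheduler routes the work.

```swift
import Dispatch
import Foundation

// You never address P-cores or E-cores directly on Apple Silicon; you tag work
// with a QoS class and the scheduler picks the cluster. Watching Activity
// Monitor's CPU history while this runs shows the split.
func burnCPU(label: String) {
    var x = 0.0
    for i in 1...50_000_000 { x += sin(Double(i)) }
    print("\(label) finished (\(x))")
}

let group = DispatchGroup()

// .background work is eligible to stay on the efficiency cores.
DispatchQueue.global(qos: .background).async(group: group) {
    burnCPU(label: "background / E-core candidate")
}

// .userInteractive work is pushed toward the performance cores.
DispatchQueue.global(qos: .userInteractive).async(group: group) {
    burnCPU(label: "userInteractive / P-core candidate")
}

group.wait()
```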
PCI pools are bandwidth groups within the Mac Pro 2019. I'm going off my memory, but each MPX slot is on its own separate pool as they're direct CPU lanes, whereas the other two pools exist behind the Northbridge chip. The two behind the Northbridge can be load balanced by switching the PCIe lanes to the adjustable PCIe slots (the non-direct lane access ones). I assume in the Mac Pro 2023, the PCIe controller has load balancing as well, but I'm not clear on the division as MPX is dead and it's not an Intel standard where there's a ton of info floating around. I'd love to get my hands on one but UA-cam doesn't pay nearly enough to justify it.
My next video will be of interest, as I get really nerdy about memory management in macOS.
Thanks for the fantastic response. That clarifies things nicely and I'll check out the blog. Can't wait for your next upload. Thanks. @@dmug
I think it has to do with the cost factor; if Nvidia were integrated into Macs, a Mac Mini would cost around 2500 dollars.
Hey, love your videos man. I had a quick question. As a Final Cut Pro and DaVinci editor, with some occasional AI, would it make sense to buy a 2019 Mac Pro? Get it used and upgrade it with beefy parts, or will Apple Silicon catch up quickly enough to make it a bad investment/upgrade path? Thanks!
Not really, video editing is the place where Apple's SoC is probably the most optimized. Even the base M2 now has the media engine, which is essentially a hardware encoder/decoder, so the CPU/GPU are freed from those operations when using supported codecs. It doesn't stop there; Apple's media engine is very powerful and can handle multiple streams of 4K (about 5 4K 422 streams if I remember right, and 3+ 8K might be more).
An M2 Mac Mini with even 8 GB of RAM will run circles around the Mac Pro 2019 in a lot of basic editing operations, as video editing itself isn't super RAM intensive, despite popular belief. The Mac Pro 2019 would have advantages in edge cases: certain GPU-bound plugins or serious visual compositing. The base M1 didn't include the media engine. It's never been cheaper to get professional speeds when video editing.
The M2 Max has dual media engines, so it can do something silly like 11 8K 422 ProRes streams, which at that point would require serious infrastructure to store and stream that much data to the Mac.
The two Macs I'd be most interested in for video editing are the base M2 Pro Mac Mini (it has the beefier CPU, more Thunderbolt, and comes with 16 GB of RAM) or the base M2 Max Studio (if you want 32 GB of RAM and 10 Gb Ethernet).
Right now even Nvidia and AMD don't really have an answer to Apple when it comes to video editing. It's the one spot where Apple is playing a different game.
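For the curious, this is roughly what "the media engine takes over" looks like from an app's side: a hedged Swift sketch using VideoToolbox, where the only real ask is requiring the hardware encoder (the frame size and codec are placeholders):

```swift
import VideoToolbox

// Apps don't drive the media engine directly; they ask VideoToolbox for a
// compression session and require the hardware encoder. When the encode block
// takes the job, the CPU/GPU stay mostly free during export.
var session: VTCompressionSession?

let encoderSpec: [String: Any] = [
    kVTVideoEncoderSpecification_RequireHardwareAcceleratedVideoEncoder as String: true
]

let status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: 3840,                        // placeholder 4K frame size
    height: 2160,
    codecType: kCMVideoCodecType_HEVC,  // H.265 here; ProRes codec types exist too
    encoderSpecification: encoderSpec as CFDictionary,
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: nil,                // frames would use the per-frame output handler API
    refcon: nil,
    compressionSessionOut: &session)

if status == noErr, let session = session {
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
    print("Hardware HEVC encoder session created")
    // Each frame then goes through VTCompressionSessionEncodeFrame(...).
} else {
    print("No hardware encoder available (status \(status))")
}
```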
@@dmug I've been thinking about getting a Mac for video editing for exactly the reasons you mentioned. Unfortunately, I find it very difficult to estimate the individual configurations of the Macs and their performance, so that you don't spend a lot of money unnecessarily for a few seconds of render time that doesn't make that much difference in the end anyway. Could you recommend which model is best for video editing and color grading? Maybe even in video form? Thanks for your great content. It is very helpful to have these things explained so clearly.
@@chrgans4619 I'd really look to see if you can score an M1 Pro laptop (I've seen people scoring them refurbished/used for as little as about $1400), or go all in on an M2 Pro Mac Mini. Color grading is tough as that's fairly intensive; there are several colorists on UA-cam who've made videos about the M1 series and M2 series. For the most part, you can get away with even the base M2 Mini, especially if you're at a hobbyist/prosumer level. I had an M1 Air for a while and even without a media engine, it was a 4K editing beast. The only upgrade that I'd highly recommend is getting the M2 Mini in a 16 GB/512 GB configuration, as you get the much faster SSD with the 512 GB (the M2's 256 GB SSD is half the speed), and the RAM certainly will improve performance and save the SSD wear and tear. Fortunately video editing and grading aren't the biggest RAM hogs, and it's enough for a decent level of compositing.
Here's a good real world guy who got the M2 Pro Mini and his experience doing professional video for real estate.
ua-cam.com/video/YX8hVEMmG0o/v-deo.html
Of course, if you have $2 grand to toss around, the M2 Max Studio has dual media engines and double the GPU cores. The M2 Ultra only really becomes worth it if you have either gobs of money or are at a point where your time is so valuable that the seconds saved on exports or compiles or whatever you're doing make the M2 Ultra an easy business expense.
Personally, if I had to replace all my computers tomorrow and needed to keep the cost down, I'd get an M2 Max Studio (32 GB of RAM / 512 GB base storage) and then use the extra money for some sort of storage solution.
I wanna know why Intel iGPUs can't be more like the Apple Silicon iGPU.
It's an older design that maximizes compatibility with multiple OSes, without needing to rearchitect an OS to support the unified memory model. Apple has the advantage of being able to tightly integrate this hardware with its operating system.
I guess they didn’t learn their lesson from Mac Pro 2013
Thanks for going more in depth than others. It's becoming clear to me that Apple will make graphics cards (Mac only, of course). They just have to figure out how to tie them into the system.
If they wanted to take a market by storm, they could build monster cards for server farm compute (competing against the A100).
You are much more hopeful than I am.
The reason for ALU count, as opposed to any other metric, is not as complex as that. It's simply because it's the highest number they can cite out of all the potential "core" counts. :P
It's like calling the Neo Geo console "24 bit" when it was really a 16 bit CPU and 8 bit co-processor (mainly used for audio control) added together. :P
Marketing likes big numbers.
This video is good, but there are two unmentioned nerdy details which have a big impact on performance.
Cache hierarchy and dispatch style.
Within a GPU process a lot of data is used multiple times. Unified memory is fast, but it’s not fast enough to keep the GPU fed. With this in mind the M series chips include a huge amount of cache right within the GPU cores so frequently used data is fast to access. NVIDIA uses cache layering too, but the caches used are smaller and slower. It’s part of why they consume more power than the M series chips as data is being shuffled around more often.
The second interesting difference is dispatch style.
The NVIDIA approach is to split the task across huge numbers of fairly slow cores which operate in parallel to produce a frame. Each core produces a single tile.
The Apple GPUs are 'deferred tile renderers'. There are fewer cores, but each core is much faster and can render multiple tiles per frame per core.
The difference in dispatch approach is why 'core' numbers seem so different, but the performance isn't as far off as the numbers would suggest.
Apple take the ‘few but fast’ approach rather than the ‘many but slow’ approach used by others. Software needs to be written with this in mind. It’s not hard. But it does require a bit of engineering time to get the best performance.
I hope that's a useful bit of context for why an apples-to, erm, apples comparison is so tricky.
The power efficiency side of things is also deeply impressive, though less relevant for desktop gaming.
If Apple ever made a competitively priced server component with the performance per watt of the M series part they would give NVIDIA a big headache. Luckily… Apple probably don’t have the fab capacity to compete in that market.
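One concrete way the tile-based design shows up in code (a hedged sketch of my own; the resolution is a placeholder): on Apple GPUs an intermediate render target can be declared memoryless, living entirely in on-chip tile memory and never touching unified memory at all.

```swift
import Metal

// On a TBDR GPU an intermediate attachment (depth, G-buffer, etc.) can be
// memoryless: it exists only in on-chip tile memory while a tile is being
// shaded and is never written back to RAM. Immediate-mode GPUs have no
// equivalent; the attachment would need a real VRAM allocation.
let device = MTLCreateSystemDefaultDevice()!

let depthDesc = MTLTextureDescriptor.texture2DDescriptor(
    pixelFormat: .depth32Float,
    width: 1920, height: 1080,          // placeholder resolution
    mipmapped: false)
depthDesc.usage = .renderTarget
depthDesc.storageMode = .memoryless     // tile memory only; no backing allocation

let depthTexture = device.makeTexture(descriptor: depthDesc)!

let pass = MTLRenderPassDescriptor()
pass.depthAttachment.texture = depthTexture
pass.depthAttachment.loadAction = .clear
pass.depthAttachment.storeAction = .dontCare  // required: nothing to store back

print("Memoryless depth attachment created:", depthTexture.storageMode == .memoryless)
```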
Good stuff. I debated going deeper with this, but I'm always trying to balance entertainment vs. information, and making visuals for hyper-abstract concepts is tedious, not to mention my own self-education. I'm often on the edge for Apple-centric channels as it is in terms of technicality, and I'd like to be the bridge or gateway to the extreme niche content.
Apple uses the same fab as Nvidia: TSMC. Apple would really have a tough go against CUDA in the server world; so many Docker images you'd use are entirely chained to CUDA for, say, ML/AI, and they'd need to dump oodles of resources into Linux, which Apple hasn't shown any interest in.
Bro, you gotta change the camera angle.
Tried a camera mount on the monitor for a video or two. This is one of them.
I think it's a bit of a mischaracterization to say we've had shared memory architectures for a long time now, as though Apple isn't doing anything new here. This isn't the second rate hack job approach of simply hooking an integrated GPU up to the existing system memory architecture like we've seen done before (which doesn't actually avoid memory duplication, as the GPU memory is partitioned away from the main system memory, and the GPU is basically treated like a classical setup). Apple has actually structured the system memory to be much more similar to a traditional GPU memory architecture, than a traditional CPU memory architecture (so it's more like hooking the CPU up to the GPU than the other way around). They have also piped everything so that everything fed by the memory is a first class citizen. Because of this tight integration, designed from the ground up, and not just tacked into place after the fact, Apple's approach is far more flexible, dynamic, efficient, parsimonious, and performant than previous "shared" designs. Just as an example, Apple's approach allows for a situation where you can throw an utterly monstrous ML model at the GPU/NPU, something you couldn't easily accomplish otherwise, and totally not worth doing on a classical integrated approach.
Yeah, it kinda is; I pinned a comment and put it in the description: " I really should have clarified as the GPU has a reserved pool of RAM and not the same as Apple Silicon which is a much superior design as the processing (vertex shaders, pixel shaders, texture units) part of the same pipeline."
Games made for Mac actually perform amazingly well; it's just that the PC-to-Mac ports are still games designed for DirectX. There are some things you just have to turn off on a Mac, like ray tracing, and then a Mac can play the same games just fine.
But this isn't a surprise to anybody who games on a Mac. We see the comparison graphs that run benchmarks designed for DirectX, and we know the point of the video isn't intended to demonstrate how well a Mac can game. It's like running a DirectX benchmark against a PlayStation 4 and an Xbox Series X. That's not a benchmark designed to show what the PlayStation 4 can do; it's more akin to a sponsored hitpiece.
Yes, games on windows with dedicated graphics cards provide the highest fidelity graphics gaming experiences. And they are big, loud, and hot. And RGB 🤡
But I game on an M2 mini, and the games made for mac all run extremely well.
I typically don't have ray tracing, but I also can't hear my computer, and it's not embarrassing to keep in the living room
Why does it say “Chrome is my enemy”?
Making the age-old joke about Chrome RAM usage….
Benchmarks, Schmenschmarks,
Synthetic benchmarks of CPU and GPU performance are for people who don't understand computers. It's not about how fast your CPU or GPU is in a synthetic test, it's about how fast the whole system is.
Compare 3 systems with the same CPU/GPU/SoC: one with 8 GB of RAM, one with 16 GB, and one with 64 GB. Now start doing real-world work on them: heavy rendering, batch processing of RAW photos, or an 8K video edit of a 20-minute clip.
Then you will see how big the difference is and why the total system is important
What you just described is a form of benchmarking…
People who have nothing to add always start whining about grammar or semantics.
I wish they would improve the architecture for gaming
The base floor of the Apple Silicon Mac GPUs is pretty high compared to the iGPUs found in x86, meaning any Apple Silicon Mac has enough GPU horsepower to play modern titles in HD at low-mid settings. While it does suck that the high-end GPU options are a far cry from the top offerings from AMD and Nvidia, the biggest obstacle is Apple's refusal to support Vulkan and a few other common libraries.
While the memory on the Intel Macs does get shared with the iGPU, this is not unified, it is reserved. The memory address space that the GPU can access is separate from that of the CPU and is non-resizable (in effect, at boot the OS selects a portion of your memory and reserves it just for the Intel iGPU, even if the GPU does not need it, and it can't change that separation post boot).
Very different from the unified memory model Apple are using, where the CPU and GPU can both point to the same memory page tables and read and write the same memory at once (be careful if you do this), etc.
Sharing memory like this with a GPU is very uncommon in consumer devices (some data centre systems have done this before). With the PlayStation there is some shared memory like this, but it is a reserved section of memory: when the game starts it needs to pre-allocate the region of memory it wants to share, and that region is read-only for the GPU. I believe the Xbox does not support this shared address space model at all, but it does support resizing the amount of memory reserved for the GPU on the fly; however, you can't have the GPU and CPU address the same memory pages.
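As a concrete illustration of that unified model, here is a minimal sketch of my own (assuming an Apple-silicon Mac, not something from the video): a Metal buffer created with shared storage is a single allocation that both the CPU and GPU address directly, with no staging copy into a reserved pool.

```swift
import Metal

// Minimal sketch, assuming Apple silicon: one allocation, visible to both
// CPU and GPU, with no copy into a separate GPU-reserved pool.
let device = MTLCreateSystemDefaultDevice()!
let count  = 1_000_000
let buffer = device.makeBuffer(length: count * MemoryLayout<Float>.stride,
                               options: .storageModeShared)!

// The CPU writes straight into the same pages the GPU will read.
let values = buffer.contents().bindMemory(to: Float.self, capacity: count)
for i in 0..<count { values[i] = Float(i) }

// A compute or render encoder can now bind `buffer` directly; there is no
// didModifyRange()/blit step like the managed storage mode needed on Intel Macs.
```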
Yep, I purposely said shared memory; I probably should have made the distinction that it's not the same.
That's just false. A portion of memory is reserved, usually 32 or 64 MB; how do you expect the GPU to work with so little VRAM? Memory is shared with the GPU during runtime. Apple's 'unified memory' is just marketing bullshit; they didn't invent anything new. Both Intel and AMD CPUs have different types of accelerators: they also have a GPU, media codecs, even an NPU and ISP in laptop SoCs. Apple just copy-pasted its iPhone chip and changed some core counts in a Verilog/VHDL file. Apple silicon is more efficient only because of 5nm and lower frequency.
I'm actually impressed. I have the M1 Max, and am waiting until the M3 series to upgrade and pass this one down to my partner. Now I definitely want to wait seeing how much improvement they got from M1 to M2. I can imagine what the M3 will be. I'm hoping they focus more on GPU performance since they are trying to push their way into gaming more. Right now I'm playing Hogwarts Legacy through the game porting toolkit, and it works pretty well for me. About 30 fps consistently at 1440p ultra settings. Hopefully this will convince some companies to start porting games over so they are optimized.
Did he say goodest….
So you need to tell us what your half-Notepad, half-Excel app is
Soulver 3
I want an Apple computer!!
A North American with a sense of humour. Fair play.
M1 and M2 use LPDDR5X RAM!
The M2 uses LPDDR5-6400, not LPDDR5X. Rumor is the next iPhone will make the jump.
I always consider Apple to be one or two gens behind, but for me it's more than good. I don't notice these differences unless I am doing extreme 8K rendering. Most of the user base for Apple are usually just casual users anyway.
Also for the M2 M3 X-treme chip, hope we see Tim Cook with a backwards hat on with skateboard saying: 'With the unveiling of the new X-treme chip, here's Tony Hawk..'
There is only one question I would like anybody who is criticizing the Apple M2 Ultra to answer: why the hell are you mentioning OpenCL?!
Bringing up OpenCL, which was deprecated by Apple in 2018, to come out with the conclusion that Apple is not a match for Nvidia has two possibilities:
1- You don't know Apple and you are not aware that they deprecated OpenCL a long time ago.
2- You are an Apple hater and you are not OK with Apple offering a good product.
When you test Apple Silicon you use only Metal. This is the only fair way to test. With Metal, the M2 Ultra is matching the RTX 4080 using CUDA, which is a very very very good result. Imagine if we got an M2 Extreme; even the Nvidia RTX 4090 would be way behind. People use Metal with Apple and CUDA with Nvidia. This is the fact. Apple does not need a dedicated GPU because their integrated GPUs are matching dedicated GPUs with much less power consumption. PC enthusiasts need to be open-minded about what is good for performance. Things don't have to be PC-like in order to perform well!
I'm giving a good faith reply here; it's long as I wanted to fully address this
1: First, I did use Metal. Second, we go with the benchmarks we have. You cannot compare Metal scores against OpenCL; they're different metrics, as evidenced by chipsets that exist both on Windows/Linux and macOS. Same GPU = different scores between the two.
Cross-platform is tough to compare. We already know that the M2 Ultra isn't even as powerful as a 6900 XT in Metal compute. The 6900 XT is about 17% faster. In OpenCL, the 6900 XT is about 32% faster than the M2 Ultra. That leaves us with a discrepancy of 15-16%. If we were to, say, bump the M2 Ultra's score of 120006 by 16%, that'd be 139,207, which is still below the RX 6800 XT. Assuming by some miracle that the M2 Extreme is 2x as fast (the M2 Ultra isn't 2x the GPU speed of the M2 Max), it'd still not be a 4090. That's completely expected; the 4090 draws 450W, is clocked very high, and has a staggering 16384 ALUs.
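For anyone following that arithmetic, here's the rough scaling being described, as a back-of-envelope sketch with the cross-API discrepancy rounded to 16% (my rounding, not an official figure):

```swift
// Back-of-envelope sketch of the cross-API adjustment described above.
let m2UltraMetalScore = 120_006.0   // M2 Ultra Metal score cited above
let apiDiscrepancy    = 0.16        // ~15-16% Metal vs OpenCL gap estimated above
let adjusted = m2UltraMetalScore * (1 + apiDiscrepancy)
print(adjusted)                     // ≈ 139,207, still below an RX 6800 XT in OpenCL
```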
Apple is a generation behind in its GPU tech. I don't know what else to say, but when it comes to battery life in the laptop space, AMD and Nvidia do not have an answer. Apple's primary concern is not top performance but efficiency, as laptops drive Mac sales by a considerable amount. Desktops are the afterthought.
2: I must be the most confused Apple hater, since I own a Mac Pro 2019, MacBook Pro M1 Max, Apple Watch, iPhone 14 Pro, Apple TV, and a 2 TB iCloud subscription with Apple Music, and have a work-provided MacBook Pro M1 Pro. I've been using Macs since the PowerPC era; the first computer I had all to my own was a PowerMac G3, and I've never had a PC as my primary computer. I do not blindly accept what Apple tells me, and my only hope that Apple gets better is if I and others demand better from them. You can check my YouTube work or the guides I've written for the Mac Pros.
The M2 Ultra is an absolute monster for certain tasks, chiefly video editing, but it's not perfect and comes in at a high price for something that has zero upgrade opportunity, and that's the big problem here. Next year, in fall 2024, Nvidia and AMD will release new GPUs. However, Mac Pro 2023 owners would have to sell their Mac Pros and buy an entire new computer to get Apple's newest GPU performance.
@@dmug
First sorry for the late reply.
Second, I did not mean you specifically with my reply; I meant that, generally, people don't like to see Apple making something good.
Third, you can't use OpenCL, not at all. Are you aware that the benchmark you got from OpenCL is unrealistic? OpenCL does not work well with Apple silicon, so how do you accept it?!
There are other methods, and if not, then there is no way to compare. Did you try applications with the same output and compare? Or did you try a game optimized for both and compare? You simply used something that does not work with one of them and compared; I don't know how this is logically acceptable.
The Apple GPU is not a generation behind, and efficiency doesn't mean less performance; it means higher performance per watt. The Apple GPU in fact is more technologically advanced than Nvidia's and AMD's. Have you heard about something called Tile Based Deferred Rendering (TBDR) and how it is drastically different from Immediate Mode Rendering (IMR)? The Apple GPU gives much better performance per watt because of this and because of the unified memory. Higher performance per watt does not mean the GPU is weaker; it means the GPU gives a certain performance, say X, with less power, a performance which, if you try to get it from Nvidia or AMD, will need higher power. Teraflops is not a good measurement either, because TBDR GPUs need less compute to draw a scene than IMR GPUs, since TBDR doesn't overdraw the hidden surfaces and then discard them as IMR GPUs do.
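To put a number on the overdraw point, here is a toy model with assumed figures (my own illustration, not a benchmark): an IMR GPU shades every rasterized fragment, while a TBDR GPU resolves visibility per tile first and, for opaque geometry, shades roughly one fragment per pixel.

```swift
// Toy overdraw model with assumed numbers, purely to illustrate the TBDR claim.
let pixels          = 3_840.0 * 2_160.0  // a 4K frame
let depthComplexity = 3.0                // assume each pixel is covered by ~3 opaque surfaces

let imrFragmentsShaded  = pixels * depthComplexity  // IMR shades fragments that later get overwritten
let tbdrFragmentsShaded = pixels                    // hidden opaque fragments are never shaded

print("IMR: \(Int(imrFragmentsShaded)) fragments, TBDR: ~\(Int(tbdrFragmentsShaded))")
```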
The only way to compare these two different approaches is to use what is optimized for each one, like Metal for Apple and OpenCL/CUDA for Nvidia/AMD. Or you can use real-life comparisons with any application or game optimized for them. But deprecated OpenCL is definitely not the way we should go.
People are really stuck in the traditional PC way of understanding computing, while there are always other philosophies.
And as for PC advantages, I particularly agree with you. There are things which PCs shine at.
yadax2 look at the Linus Tech Tips review again, pal... stop being a brand apologist... I'm typing this on an iPhone 13
@@sayafalwagdany6111 Performance per watt on desktop is useless. Is ML with CUDA ahead of Metal? Then why use Metal? Why not use an open-source, standard API for the benchmark comparison? It's fair and square. Remember, if the Apple GPU were superior, why can't it handle UE5 and Blender well? Why do certain games stutter? Is their tech backwards? An iGPU is still an iGPU; there's no comparison with a dGPU, because an iGPU shares the main RAM pool and the graphics RAM pool, so both need to share the same resources. Sounds like a pathetic Apple shenanigan, btw
@@sayafalwagdany6111 Game optimization, like RE for Mac on M2, is still poor compared to PC... not even close in compute performance to Nvidia and AMD... Stop spreading FUD; see Linus Tech Tips and make an independent review of this M2 Ultra GPU... there's no magic alchemy in the M2-series CPU either... raw performance still lags behind AMD, Intel, and Nvidia
Who is supporting Metal in the free software world? 😂
A fully loaded Mac Studio M2 Ultra is USD 8.6k with 192 GB of RAM. That is like 15% cheaper than 8 x RTX 4090 (24 GB each), and that's before the rest of the computer. The M2 Ultra may not be a good gaming machine, but it is a steal for people doing AI stuff.
I'm not sure what you're referencing, but the price of a 4090 is $1599, not $24,000, and it shouldn't be any sort of shocker that a card with the TDP of a 4090 walks all over the Ultra, especially in ML/AI. As someone who's done some ML training, many of the popular stacks are hyper-optimized for or only available in CUDA, unfortunately. Also, don't take my word for it; here's a vid on the previous generation (basically the 3080 Ti is 3x faster than the M1 Ultra):
ua-cam.com/video/k_rmHRKc0JM/v-deo.html
Generally, the cope counter-argument is that for serious loads you'd be leasing cloud services rather than doing it locally, thus the lesser-performing NPU/GPU combo is less of a factor vs an expensive GPU.
There's a lot to like about the M2 Ultra, but when you're willing to forget about power draw, it's not the performance king in CPU and it's even further away in GPU.
@@dmug
I understand your channel is not focusing on AI. I myself just started recently. Just want to throw in my 2 cents.
- Alex's Video
- Made before PyTorch had support for the Apple GPU; this is important
- No idea how much RAM his box has, but the M1 Studio max RAM is 128 GB, and the M2 Ultra Studio max RAM is 192 GB (not sure if RAM is important for that specific test, I am too new to this)
- RTX 4090 RAM is 24 GB; 8 of them is ~$12k
- Cloud GPU Price
- H100 (80 GB) = $2/GPU/hr (not including the machine) (Lambda Labs) = $1440/30 days (assuming running 24/7), before machine cost
- Quadro RTX 6000 (24 GB) with minimum machine config = $1000/mo (Linode)
- AI Model
- The full LLaMA model needs ~130 GB (yes, VRAM; that's why people are building boxes with 5-8 cards; see the sizing sketch below)
With a fully loaded Studio M2 Ultra, one can do model fine-tuning and run it on a local box at a cheaper price. I saw a non-English video doing a comparison using a 128 GB M2 Ultra.
Hope in a few months we will see more AI-related performance reviews, not just test scripts.
Cheers!!
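As a sanity check on the "~130 GB" figure above, the weight footprint is roughly parameter count times bytes per parameter. A rough sketch of that arithmetic (my own numbers, using the 65B LLaMA as the example):

```swift
// Rough weight-footprint sketch (my own arithmetic, not from the thread above).
let params = 65_000_000_000.0               // e.g. LLaMA 65B
let fp16GB = params * 2.0 / 1_000_000_000   // ≈ 130 GB at 2 bytes per parameter
let int4GB = params * 0.5 / 1_000_000_000   // ≈ 32.5 GB if 4-bit quantized
print("fp16: \(fp16GB) GB, int4: \(int4GB) GB")

// A 192 GB M2 Ultra can hold the fp16 weights in a single unified address space,
// while a 24 GB RTX 4090 needs quantization, offloading, or several cards.
```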
@@JSiuDev For big LLMs, there's software like FlexGen and CPU offloading.
@@dmug Thx for the info, will definitely look into those when I reach that stage :)
I literally work repairing MacBooks and this is such a dumb name
But then again, Apple silicon is magic at this point, because my base model M1 MacBook Air outperforms my custom-built PC with a 3080 Ti that costs multiple times more in my video editing workflows. (Mainly H.265 HDR, so that's probably why, since M-series chips have the H.265 encoders from the iPhone.)
Idk how this is possible since the MacBook is fanless and can maintain that performance for 8h on a charge.
Hi, so do you think the M2 Ultra can be used to run an LLM with 100B parameters? I'm so curious about that, since a 100+B-parameter LLM often needs 5 or more A100 cards, which are much more expensive than an M2 Ultra... Thanks.
@@saraili3971 I'd be leaning towards no; however, you may want to look at videos that discuss the performance of the Neural Engine, because in a video I saw by Ziskind, in some tests the Neural Engine could handle wayyyyyyy more than the default GPU (like multiples of 10 times better performance). However, I'm not 100% sure what tests would be relevant to you.
You serious... look at the Linus video, Apple fanboy, don't write bad comments
@@D0x1511af Hey genius, the Linus video literally didn't show H.265 encoding. Plus, Linus unironically said that the product is good and plenty powerful; it's the marketing he disagrees with.
And like I said, I literally have a PC with a 3080 Ti in it as we speak, and I have an S22 Ultra as a 2nd phone (my main is a 14 Pro Max, which I only use because of its better camera, battery, and brighter screen).
@@himynameisryan Apple silicon?? There's no magic silicon, only a custom ARM ISA with extensions developed specifically for macOS, iOS, iPadOS, and other Apple products... there's no alchemist power that suddenly makes Apple silicon nuke the x86 ISA or beat Nvidia GPU hardware. Even the M2 Ultra still lags behind in ML, Blender, OTOY, and DR. What can Apple silicon offer? No CXL, no ECC RAM, no NVLink throughput, no CUDA... pfft... CUDA is far more mature than Metal. No RT, no DLSS, and in tons of areas Apple silicon still lags behind x86 and Nvidia-based machines. I own M-based Apple products too... the only major advantage they offer is superior battery life.
In every video when I see these Apple executives they seem ultra arrogant. I wonder why this is so.
This is hardly limited to Apple; the tech space values bombastic statements as they drive share prices. Jensen Huang of Nvidia and Sam Altman of OpenAI make Apple look tame by comparison. Jensen has made absurd statements like "programming is dead." Altman thinks that he should control 1/5 of the world's economy and people should be paid in "compute" shares. That's not even touching Elon Musk, who's promised a gazillion things like self-driving cars that earn you money while you're at work by acting as a taxi.
Apple making claims about GPUs and RAM is pretty tame in this space. It's still cringe-inducing nonetheless.
Raw performance is half the story; Apple uses a unified system, so it has its own advantages, and it even smokes the 4090 in some situations. And remember, it uses less than half the power of the 4090 itself, Apple silicon runs much cooler with no crazy heating issues, and it has 🤞 media engines.
Not to mention the bandwidth and latency of the shared RAM, which shortens the processing cycles needed to work on the same pieces of data. And the cache pipeline.
The Apple team smacked a home run on this one. Not a single complaint about the hardware here.
I've yet to see any comprehensive benchmarks comparing the M2 Ultra. Considering where the compute benchmarks are (not as fast as the 6900 XT), my guess is it'll perform as such. This isn't to say the M2 Ultra is slow, but it won't be besting a 13900KS build with a 3090 or 4080 or better.
@@dmug Mathew Moniz did a comparison of a 13900K with an RTX 4090 against the M2 Ultra; the M2 Ultra actually beats it in all the video editing work: Lightroom export, Puget, ProRes, timeline smoothness, etc., where the Mac usually wins, and the PC had the upper hand in 3D, Blender, and gaming. And consider that it uses half the power and is almost dead silent the whole time.
Yeah, that's expected the ProRes work flow for the media engine is just dummy fast. My M1 Max is a bit quicker than my Mac Pro 2019.
That's the one place Apple really has it nailed down, and it's totally expected. It's a nice system as it frees the GPU to do the compute instead of code mashing. There's a give and take with the media engine, as I think Apple has hard-baked the codecs into it, thus it can't do AV1 (I'd like macOS to even support AV1 at this point), whereas in a modular computer you'd just buy a new card for AV1 support.
First time watching Mathew Moniz, that was a nice quick vid.
@@dmug Those dedicated encoders for 8K and ProRes work are a boon to editors
He looks similar to Zelensky 😅
Every now and again I get someone who says this but I think I’m like discount Zelensky, but if you google Napoleon….
after failing to run the country, Zelensky is now a tech YouTuber
bro you look like president zelensky ❤
Of things people could say about me, I’ll take this 1000x over ;)
I look at a lot of companies, and I hate Apple as a company. I would only get an iPhone to impress people, but I hate all Apple hardware due to their business model and decision making. You should be supporting NVIDIA: they're a trillion-dollar company, they make modular hardware, and any hardware is able to use an NVIDIA GPU.
Nvidia isn’t your friend either, there isn’t a good guy mega tech company.
@@dmug There are many reasons to hate Apple as a company, and they make lots of profit. They make the RAM and storage non-removable, and you can't upgrade these hardware components on your own.
Apple hates modular hardware, and with iPhones the battery can't be removed.
Give me reasons why NVIDIA is a bad company. They make modular graphics cards that can be put in any machine, which is much different from Apple's business model.
I wouldn't mind working for NVIDIA. I can live without Apple products, but I would feel sad if Nvidia died as a company, because no one else can compete with NVIDIA.
I am simply saying Apple as a company pisses me off, but NVIDIA, with graphics cards like the RTX 4090, isn't the thing causing me problems. What's your issue with NVIDIA? This isn't like some smartphone where everyone can easily come up with their own phone hardware. Graphics cards are superior to a smartphone, and NVIDIA has caused me fewer issues with their business model compared to Apple.
@@dmug I know white people can't resist Apple hardware; if I had to rely on Apple or Nvidia, I am choosing NVIDIA every single day.
Apple GPU cores are worse.