Don't PC handhelds already do this, but the opposite, with Zen 4? Instead of the CPU being nonexistent and integrated into the GPU, the GPU CUs are integrated into the CPU? Or am I missing something?
A GPU without a CPU? That's dumb. There can be a CPU without a GPU, and a CPU can also have an integrated GPU for graphical capability. CPUs and GPUs have different instructions, and for a computer a CPU is a must; a CPU is also much faster at general computation than a GPU. Without a CPU a GPU can't work, since it gets its instructions on what to work on from the CPU.
Technically a CPU is needed in the current computing paradigm. However, it's not out of the question that a new computing paradigm could be invented that greatly reduces the complexity of the CPU and distributes those tasks over a "GPU-like" architecture. Currently, we cannot do this, but Nvidia and others could be working on this.
So basically, Nvidia will in the future develop an open-source DX-to-Vulkan or OpenGL translation layer, similar to DXVK, but this one will work for the Windows operating system itself, not just for games. It's also possible that they make ARM CPU chips the new standard if this comes to pass, since ARM is very efficient and in many ways better than x86; it's just that Windows is built around the older standard and ARM is a bit newer. Also, gaming shouldn't be much of a problem because Proton, DXVK, WineD3D, and dgVoodoo (to OpenGL) all exist. The best of them are DXVK (made and maintained by one guy, with Valve helping him financially) and Proton (made by Valve for their SteamOS). Both are open source and can be used with pretty much any game that isn't already running on Vulkan or OpenGL, which ARM supports. *Hit me up with a response when this actually happens.*
Let me predict the future: Nvidia will integrate RAM into their CPUs, starve all but the highest-end model of RAM to push people to buy that one, and that one will cost like $1500.
14:31 lol, all it's saying is to bypass the CPU and system RAM. How do you do that? Make an SoC with the CPU and GPU fused together over a high-bandwidth link, similar to how Apple silicon works.
Choose one:
Amazon Graviton2: Up to 64 cores based on the ARM Neoverse N1 architecture.
Amazon Graviton3: Also up to 64 cores.
Ampere Altra: Up to 80 cores based on the ARM Neoverse N1 architecture.
Ampere Altra Max: Scales up to 128 cores per processor.
Fujitsu A64FX: 48 compute cores based on ARMv8.2-A SVE, plus 4 assistant cores for managing tasks.
NVIDIA Grace CPU: Combines up to 144 ARM cores with high memory bandwidth and energy efficiency.
Huawei Kunpeng 920: Up to 64 cores based on the ARMv8 architecture.
Marvell ThunderX2: Up to 32 or 64 ARMv8-A cores, targeting HPC and data center applications.
Marvell ThunderX3 (announced): Expected to scale up to 96 cores.
SiPearl Rhea: Designed for European exascale computing; ARM Neoverse cores (exact count expected to be high).
Phytium processors: Used in Chinese high-performance systems, up to 64 cores.
Alibaba Yitian 710: Up to 128 cores based on the ARMv9 architecture.
10:07 This GPU is exactly half of the Ampere Quadro lineup's A500, which has 2048 CUDA cores and 64 tensor cores. There's an even lower-end Quadro, the A400 (which would be even worse than the GPU in this Jetson), with 768 cores and 24 tensor cores.
Microsoft needs to intervene and strong arm Intel to license x86 to Nvidia, Qualcomm, and everyone else who wants it. It's probably a good time to do that now since they're in trouble LOL
Is it possible for Nvidia to acquire an x86-64 license, or any future development of the x86 instruction set? I feel like x86 is on its deathbed, so if Intel and AMD want to save it then they need to license it out like ARM does.
Nvidia and MediaTek have already been working together for quite some time. The so-called Nvidia CPU being talked about is most likely a MediaTek CPU with an Nvidia GPU instead of the usual Arm GPU. In the automotive market MediaTek is already licensing Nvidia GPUs.
I would prefer getting rid of GPUs tbh. I'm a huge fan of the APU as a concept; AMD did a great job making it a thing and they should continue to improve it.
It could be a good MacBook Air alternative (one that can do some light gaming). I'd love to see a passively cooled, silent Windows ultrabook (that doesn't suck).
14:45 you're missing some important info. Much like a GPU vs an FPGA or ASIC for coin mining, a GPU core is hardware that does a very limited set of calculations (we'll call them instructions) very rapidly. Because it's limited to those instructions, and it's set up specifically for them, it does them very fast. For mining, a GPU does general-purpose calculations on its CUDA cores (slow), an FPGA is programmed to have hardware support for that specific calculation (decent speed), and an ASIC miner is built from the factory to ONLY do that calculation, with all of its circuitry designed around it (extremely rapid).

In day-to-day use an x86/64 CPU is set up as a jack of all trades; it has many, many instructions, allowing it to do all those things. This makes it more power hungry, but it can handle a wider array of tasks at a reasonable speed. It also means the cores have to be physically bigger, because the designers have to decide which calculations get used most often and dedicate more hardware to those particular instructions in each core.

ARM CPUs have a much more limited set of instructions. Any program that stays inside that set runs like the chip is very powerful, at good speed and very little power, but the moment you ask it to do something it doesn't have an instruction for, it becomes horribly, painfully slow, because it has to rely on software trickery to work out what it doesn't have the hardware to do directly. ARM gets to save power because there are far fewer instructions fighting for circuitry in each core, so more space can be dedicated to the instructions it does have.

A GPU has an even more limited set of instructions, limited to video- and rendering-related work. Some of those calculations also fall inside general use, and it does those fast (which is why CUDA/compute exists, so people can use them for things that actually benefit), but outside of that it would be useless. This means GPUs can have tiny cores and lots of them, because each core only needs a few instructions, so they can be split into many, many tiny cores that work in groups or alone as needed, all doing the same job.

So the tradeoff is: many instructions, with hardware for each specific calculation type, at a reasonable speed; or few instructions, doing the few calculations they need to do very, very rapidly. Granted, GPUs have been adding more lately, but as they add more they aren't going to outpace a CPU at what it does. There's no magic to it.

TL;DR: the x86/64 CPU is an axe/wrecking-bar/hammer combo tool: it does all three decently, but it's not the best at any of them. An ARM CPU is a wrecking bar: it does wrecking-bar things really well, it's the best wrecking bar, and you can use it to whack in a nail or split some wood slowly and badly, but it will do it. A GPU is a ball-peen hammer: it can hit a nail really well and is useless for the rest.
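A minimal Python sketch of that last point, the "software trickery" fallback: the built-in `int.bit_count()` (Python 3.10+) stands in for an operation the hardware "knows", while the loop below emulates the same thing one bit at a time. This is an analogy only, not a claim about any real ISA.

```python
# Analogy only: a built-in, "hardware-style" operation vs. emulating the same
# result in software with many small steps (the "software trickery" above).
import timeit

N = 0xDEADBEEFCAFEBABE

def popcount_builtin(x: int) -> int:
    return x.bit_count()          # one cheap call (Python 3.10+)

def popcount_emulated(x: int) -> int:
    count = 0
    while x:                      # many small steps to get the same answer
        count += x & 1
        x >>= 1
    return count

assert popcount_builtin(N) == popcount_emulated(N)
print("builtin :", timeit.timeit(lambda: popcount_builtin(N), number=200_000))
print("emulated:", timeit.timeit(lambda: popcount_emulated(N), number=200_000))
```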
This video is kind of misinformation. Comparing core counts between architectures is apples to oranges. Also, theoretically you could run everything on just a GPU, but it would require an entirely new software stack.
Kinda, but it would suck. CPUs are jacks of all trades; GPUs are masters of one, and that one thing is doing lots of complex math really, really fast.
@baronvonslambert not necessarily, like I said, it would require an entirely new stack, which means solving all the problems fundamentally differently than they're currently solved
Theoretically you could, but everything would be incredibly slow. GPUs are only fast because they use hundreds or thousands of cores in parallel, while most general software uses 1 or 2 threads to perform tasks, with complex software using a few more. This is because most tasks simply can't be efficiently split into a thousand small ones, and each logic step requires the outcome of the previous one. Also, the cores on a GPU are designed to do one thing only: batches of very fast floating-point calculations required to render complex 3D geometry and effects. Lately they have been repurposed for similar tasks such as crypto mining or AI processing, but they aren't efficient at general computing at all.
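To make that concrete, here is a tiny Python sketch of the two workload shapes being described: a chain where every step needs the previous result (extra cores can't help), versus an element-wise operation where every item is independent (exactly the shape a GPU is built for). Purely illustrative.

```python
def serial_chain(steps: int) -> float:
    # Each iteration depends on the previous one, so it cannot be split up.
    x = 1.0
    for _ in range(steps):
        x = 0.5 * x + 1.0
    return x

def parallel_friendly(values: list[float]) -> list[float]:
    # Every element is independent; each one could run on its own core.
    return [0.5 * v + 1.0 for v in values]

print(serial_chain(1_000_000))                  # converges toward 2.0
print(parallel_friendly([1.0, 2.0, 3.0]))       # [1.5, 2.0, 2.5]
```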
Oh, I actually have some of the information you wanted. Orin is the Tegra 234 architecture, composed of a 12-core Arm Cortex-A78AE and 2 GA107 GPCs, so that's 2x8 SMs for a total of 2048 CUDA cores and 64 tensor cores. These are not GeForce parts, so they don't have ray tracing cores; I guess they would be categorized with the GA100 products rather than GeForce/RTX. The Orin Nano is a binned version of this, with only 8 SMs and 6 A78AE cores active. However, the rest of the chip is still active, and that includes custom AI/automotive components like the PVA and, the big one, the dual DLAs (Deep Learning Accelerators), which are basically extra tensor cores on steroids. That's a big deal because the use case for the Orin Nano is primarily AI/ML.

Remember your comparison to the 1070 Ti? You were only comparing FP32, which is the primary metric for games (or rather was until recently, but I digress). The 1070 Ti has a 64x performance penalty for FP16, so it gets 8 TFLOPS FP32 but only 0.128 TFLOPS FP16. It also doesn't have hardware integer support, so it can face up to a 4x performance penalty for integer ops depending on the instruction. Those are the bread and butter of AI/ML performance. This Orin Nano gets 1.2 TFLOPS FP32 from its CUDA cores, compared to the 1070 Ti's 8 TFLOPS FP32; however, it gets 25 TFLOPS FP16, 50 TOPS INT8, and 100 TOPS INT4 at 10 watts out of its tensor cores and DLAs. That's the performance it needs most for its use case: running convolutional inference on AI models.

Moving on to the CPU, the Arm Cortex-A78AE is a specialized safety CPU that only gets "half" the performance of a normal A78, as is the case with all Arm CPUs carrying the AE designator. Its purpose is to operate in lockstep mode, where each half of the CPU cores runs exactly the same operations, to double-check against any kind of computational error. Completely unsuitable for running games.

Which brings us to the last subject: the Switch 2. The Switch 2 is using the Tegra 239 SoC, not Orin, which is T234. The CPU is an 8-core Arm Cortex-A78C, a variant of the A78/Cortex-X family with extra cache and the ability to put 8 cores on a single cluster instead of just 4. The GPU is GA10F, as opposed to Orin's GA10B. It is a derivative of the GA102 Ampere architecture, with 1 GPC of 12 SMs (1536 CUDA cores), and unlike Orin it is GeForce/RTX, so gone are things like the DLAs, and instead it has the trappings of the RTX GPUs, like ray tracing cores. Years ago this would have fallen under the GeForce Go branding, but it's all been folded under GeForce now.
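A quick back-of-the-envelope in Python using the figures quoted above (which I haven't verified against spec sheets), just to show why the FP16 gap matters far more than the FP32 one for this board's use case:

```python
# All numbers below are taken from the comment above, not verified.
gtx1070ti_fp32_tflops = 8.0
gtx1070ti_fp16_tflops = 0.128        # the quoted 1/64-rate FP16 figure
orin_nano_fp32_tflops = 1.2
orin_nano_fp16_tflops = 25.0         # tensor cores + DLAs combined

print(f"FP32: 1070 Ti {gtx1070ti_fp32_tflops:.1f} vs Orin Nano {orin_nano_fp32_tflops:.1f} TFLOPS")
print(f"FP16: 1070 Ti {gtx1070ti_fp16_tflops:.3f} vs Orin Nano {orin_nano_fp16_tflops:.1f} TFLOPS "
      f"(~{orin_nano_fp16_tflops / gtx1070ti_fp16_tflops:.0f}x)")
```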
You're actually describing what the M series has already accomplished. I'd say budget computers on the level of today's 4080, in that dev kit form factor, within 10 years.
Hey Vex, I think if we see something like CPUs being replaced by GPUs, it'd likely cease to be a GPU in the first place. I mean, isn't that technically what an APU is?
11:25 Lock the voltage and core speed on your 13700K. Sync it with the RAM and ring bus using some ratio. Use HWiNFO to make sure you're not wasting too much power or putting out excessive heat (which is all subjective).
I don't think "Jetson" is a play on Jensen's last name so much as it is a reference to "The Jetsons", which was a Hanna Barbara cartoon from back in the day that we had re-runs of on TV in the 90s.
Instead of an onboard graphics card, we'd get an onboard CPU. What a time to be alive. Interesting to see how this pans out; GPU tech has advanced impressively indeed.
Dude, CPUs and GPUs have totally different machine code, architectures, instructions... GPUs are specialized for floating-point work; CPUs are "catch-all" machines with a lot of specific instructions built into the hardware, the same way GPUs have their own special hardware paths. They have different architectures, like ARM, x86 (32/64), RISC-V... even worse, GPUs only have all those cores because the cores are "tiny and simple" with a very limited list of tasks they can do. You just have to go back a "few" years and compare games using software rendering (CPU) versus accelerated/hardware rendering with Glide, OpenGL, Direct3D/DirectX (GPU). In the late 90s/early 2000s you could have a flagship CPU and run some early-90s games in software mode and have a good experience; by now CPUs and GPUs have diverged light years apart.
@@saricubra2867 It was an example from old CPUs; I suppose newer ones "have" what 90s-era GPUs had an edge on. The problem with this video is that he is saying something like "Wow, look what all those ray tracing/CUDA cores do, why not run Windows on them?" It's a ridiculous premise, like asking a paintbrush to cut a ceramic tile. CPU = handyman + construction worker + electrician + ... GPU = Picasso (extreme example, but you could say "professional house painter").
That would pretty much be impossible. It's possible to have a GPU emulate SOME of the work of a CPU. The CPU is what connects all the different buses (expansion systems like PCIe, USB, SATA, etc.) and allows them to communicate by acting like an interpreter. However, one then runs into the same problem as going from x86 to ARM: different architectures require different machine languages, so any program would effectively need to be written for that "CPU" (the GPU) architecture. And there are multiple different GPU architectures, just like there are different CPU types.
I'm looking forward to this already. The more competition, the more the CPUs have to get better, and the less old CPUs (that are still decent) may cost now.
I don't think it will be that easy to replace a CPU with a GPU anytime soon. But this is Nvidia; who knows, maybe they can finally turn their GPU into an actual CPU and we'll never have to buy a new one.
ARM already has a way to make GPUs work; it's a standard called PCIe lol. It'd be rather simple to pair GPUs with ARM, but most companies aren't bothering yet, since ARM CPUs currently focus on performance per watt rather than raw performance.
12:14 Both processors use 8 cores for gaming, so it's misleading to say 8 cores is faster (like, wtf). Intel just has 16 extra E-cores that handle small background work, not gaming. If you wanted a real comparison you should have found an 8-core chip vs a Threadripper; with that you could argue what's better or worse.
That'd be nice, Vex. Imagine seeing a motherboard without a CPU socket or whatever, with only one or two PCIe slots for the GPU. Or, like you're saying, the CPU and GPU could be one thing, one big-looking chip. Then you wouldn't have any PCIe slots for the GPU, since you wouldn't need them unless you needed a PCIe slot for something else, idk. I'm sure that as more technology goes into developing it, a combined CPU and GPU would be nice. It might even work better, or be more efficient.
We already have APUs, which are a CPU with a GPU strapped on, and sometimes the GPU portion is larger than the CPU. I take issue with Carmack's wording, because if the "GPU" becomes the central processor, then by definition that would be a CPU, which would then have accelerators for the tasks it's not very good at... just like current-day CPUs have GPUs to accelerate specific workloads, and even GPUs have separate accelerators attached to them (like hardware video encoding/decoding). On the software side, it would be very difficult to parallelize current code bases to take advantage of very weak but very high-core-count GPUs. It's not a new idea, it's just been very impractical up until now, and would still probably be another 10+ years away in a realistic setting, and even longer before it becomes mainstream. Just look at how long it's taken for ARM / RISC-V to really be put in the spotlight.
I think the whole point of the Jetson Orin Nano is that we get AI cores on an SoC, which is great for anyone looking to get into AI development on the cheap. This isn't going to revolutionize the CPU/GPU market, but it certainly makes AI development more accessible to those interested who would rather not drop a ton of money on the idea. If you just wanted an ARM-based SoC, there are plenty already on the market, but you really can't do AI development on something like a Raspberry Pi. Maybe in the future we'll see AI emulating x86 instruction sets on non-x86 platforms?
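For anyone curious what "getting into AI development" on such a board looks like in practice, a first sanity check might be something like the sketch below. It assumes a CUDA-enabled PyTorch build is installed (NVIDIA ships JetPack wheels for the Jetson line); the rest is a hypothetical illustration, not a setup guide.

```python
# Minimal check that the on-board GPU is visible before doing any ML work.
import torch

if torch.cuda.is_available():
    device = torch.device("cuda")
    print("GPU:", torch.cuda.get_device_name(0))
    x = torch.randn(1024, 1024, device=device)
    y = x @ x                     # tiny matmul just to exercise the GPU
    print("matmul ok:", tuple(y.shape))
else:
    print("No CUDA device visible; falling back to CPU only.")
```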
I've often wondered whether a small CPU on the GPU wouldn't be potentially useful. Add in just 4 decent cores and healthy local RAM and cache to get tighter integration with the GPU, to optimize and handle tasks better suited for CPU execution, not to mention forward flexibility.
In theory, a GPU could do stuff without a CPU, but it needs some way to know what to do. Either you run an OS on the GPU, essentially making it an APU, or you still have a CPU and change how operating systems work so they basically just select a program that the GPU then handles all the processing for. The bigger hurdle is that GPUs are fast because they are optimized for specific functions; CPUs are WAAAAY more general. Some of the stuff a CPU is fantastic at would be dog slow on a GPU and would require a major infrastructure change to make viable.
I think you got something wrong in the video. CPUs are designed to run pretty much every task you can throw at them, using hardware that's designed to do every kind of mathematical or logic operation you can think of. They need a lot of different logic sub-circuits for that, which is why each core is larger than anything a GPU has. You can make a CPU able to do more things (to a certain point) by updating the microcode to "tell the hardware" how to do the new tasks. It might take longer, but if a task can be broken down into something the CPU has built-in hardware for, it can do it.

GPUs, on the other hand, are designed to do specific tasks in hardware that cannot do anything besides those specific tasks, even if you updated the software. If you want the GPU to do a task it was not designed for, there's no way to make it possible. If you look at FSR, for example, it only works on older GPUs because the algorithm was designed in a way that fits the hardware those GPUs already had. So the software was written to work with the hardware.

If we come back to the video now, you can probably guess what I'm getting at. Nvidia will not be able to run something like Windows on a GPU without basically building an SoC (System on a Chip). Modern GPUs are getting more and more like those in-between chips that can basically do everything (essentially not GPUs anymore), which is why it's also possible to run Doom entirely on a GPU (there has been an article about it). Whether this trend is a good direction or not is not for me to decide; I'll leave that to the people who design the chips. I'm just a random guy on the internet who knows how the hardware works at a probably mid-level, because that's my daily job (VHDL coder).

Imo at some point we'll hit the physical constraints of the way we build PCs now, and we'll have to think about a workaround, like making the chips even bigger, splitting tasks across multiple chips, or something entirely different. That will create other issues like signal integrity, latency, or just keeping the thing from overheating (look at der8auer's video about Threadripper overclocking, for example). Anyways... enough rambling.

Another point, which was not wrong but maybe just interesting for you, is that Nvidia has been working with ARM for well over 15 years now. They've been working on their Tegra lineup since 2008-ish (plus a few years of development before showing it to the public).
NVIDIA CPU? Do I hear 2 cores for $580?
id buy that
That's the deal, you didn't hear about the 8-core for $4800.
🤣
Only in bulk and you sign an NDA. $749.99 otherwise
And if you don't buy it you are poor
Oh god no, now the CPU price will be $1500 for a 4-core CPU.
But the Shills will pay their premium price and the rest of us suffer.
now we don't even need to load the textures thanks to Nvidia 🙏🏿💪🏿
More like 8 core.
da fuck? my comment got removed why?
@@dolpoof2335 We're talking about Nvidia pricing
Nvidia and 8 gigs of RAM name more iconic duo.
Intel and Quad Core Processors before the creation of Ryzen
@@kcato5879 I'll never forget that.
ok, Apple and 8gb of ram.
@@kcato5879 Year after year of quad core idk how we lived through that
@@kcato5879 they had ~190 single-core back in 2014; AMD only reached those speeds with the Ryzen 3000 series ahahhaha
It's crazy to me the 3060 has sold that much recently at $280... Arc really needs a restock.
fr
Fr I'm building a pc and my only other option is the 3060 since b580 is hard to get in my country
It won't matter; Nvidia will still outsell Intel. Even when AMD has faster GPUs for less money, people will still buy the slower Nvidia card every time, even if it's a slower GPU that costs more. It's crazy dumb, but it is what it is. Also, for Linux AMD is king... Nvidia is complete garbage on Linux.
@@SemperValor Nvidia GPUs have a better name ;-;
Because the 3060 is the most sensible graphics card to get. AMD and Arc prices aren't as cheap as the 3060 with its 12 GB of VRAM, and I'm using "cheap" loosely, because in this country $300 isn't cheap.
NVIDIA using Linux for their test, knowing how they spit on the open-source environment, is uncanny
You mean "insulting"?
Not sure if you've been living under a rock lately, but the Nvidia Linux drivers are pretty good now. Even Wayland support, VRR, and HDR work, and gaming under Linux is just 5-10% slower than Windows now.
Nvidia probably does significantly more open-source stuff than AMD.
@@arenzricodexd4409 like what?
@@arenzricodexd4409 that's the most fanboy delulu comment I read in a long while
Architecturally, a core in a GPU doesn’t perform the same tasks as a core in a CPU.
The fundamental difference between an Arm and an x86/x64 processor is the instruction set, that is, the actual machine language instructions that the CPU can execute. Arm features a reduced instruction set, which has some theoretical advantages, but this is a conversation mostly for CS PhDs and processor engineers.
A GPU has a massively reduced instruction set… it essentially can only do addition and multiplication. CPUs, Arm or x86/x64, do significantly more.
A video card will always need a CPU to feed it. It's essentially a graphics co-processor... which is to say a secondary processor which handles specific tasks that the CPU offloads to it... you're a bit too young to remember, but before the Pentium processor was released, it was common for CPUs to be sold with and without a math co-processor. A math co-processor was a second chip that sat alongside the CPU and handled floating point calculations for it.
You could think of a GPU as a kind of math co-processor that is designed specifically for the math required by video game graphics.
We have had, in the past, computers with only one kind of chip that handled both. The performance was so bad that they invented GPUs. I'm not talking about CPUs with embedded GPUs, I'm talking about the days before 3D graphics and GPUs.
To clarify, it wasn't a "kind of chip that handles both"; it was just a CPU but one that handled graphics as well. Once graphics processing became too complex and intensive for the CPU to do (meaning a very large chunk of time was allocated to just 1 type of workload i.e. graphics), it was decided to create a separate processor which would offload that burden from the CPU, a solution which made graphics processing better and the CPU could breathe again.
Generally the idea is that once a specific task/workload becomes too much for the CPU, a separate processor (co-processor) is made to do just that task. At some point the difficulty of that task stops increasing but CPU technology keeps improving, so later on that task gets reabsorbed back into CPU processing (removing the need for additional silicon and eliminating the communication lag between the CPU and something that would otherwise be off-chip). It's all about optimization.
There are other tasks which benefit from having a dedicated processor, such as the Digital Signal Processor (DSP) or Image Signal Processor (ISP) which can be found in smartphones, for example.
@ you're right, in the sense that a CPU can perform those calculations. They used to. And, in fact, when I took the computer graphics class in university, many years ago, the final assignment was to write a 3D rendering engine that did that very thing on the CPU. It's significantly fewer lines of code than one would expect, though it was missing all the features we'd expect of a modern game engine.
The reality is, however, that a video card can do it more efficiently and faster than a CPU because a video card is engineered specifically for that purpose.
However, what you suggested in your video, the thing which I was responding to, running a PC from a GPU, isn’t going to happen unless the architecture is significantly different from what we call a GPU today.
@@thejontao I'm not the author of the video, I'm just another commenter here :D
I wanted to give a clearer picture for anyone who happens to read these comments. I know what you mean; Vex (the video author) doesn't understand that a CPU is needed but a GPU isn't, and that their architectures are quite different, so a GPU would be quite unfit for any general computing.
This was what Sony was hoping to get when it partnered with IBM and Toshiba to make the CELL BE processor. One chip that could do everything well and scale to massive capability. The original concept for the PS3 was that it would have several CELL processors and developers would allocate the processing as needed for all the tasks a game console needed to perform. The problem was that what they had when approaching deadline for freezing the spec was a very expensive chip that would not scale beyond two units. The console had to be redesigned to use a dedicated GPU with a single CELL. This is why the first public demo of the PS3 (which didn't actually exist in usable form yet) at E3 had CELL demos and Nvidia GPU demos running on a PC but no demos of the PS3. There simply hadn't been time to do the work on the platform combining the processors.
You think you know what you're talking about, but you sadly have no clue. ARM blows x86 out of the water. Apple showed the way, Nvidia will come out with some powerful CPUs as well; x86 will be done before 2030.
Your facial expressions vibe : "I have no idea what I'm talking about for about 90% of this video"
More like 100%. Nvidia making cpus hasn't been news in 2 decades
@@TylerTroglensooo.....whats a tegra processor then??? Nvidia shield? 🤡
@SWOTHDRA I mean, that's what he meant. It's not really news because Nvidia has been doing CPUs for so long.
Pretty sure Jetson is in reference to the old sci-fi comedy cartoon “The Jetsons”
I’m 24 and I know that. My generation is uncultured lmao. Not saying I know everything either
"Let me explain how CPUs are not needed while not knowing how CPUs and GPUs handle data for gaming"
Gaming on ARM is like the CPU version of your mom saying ''We have x86 at home!''
fr
N Switch (ARM+NV) has tons of modern games and they run efficiently
lol do you even have a mobile phone ?
@@sapphireluv2705 everyone has
@@sapphireluv2705 I do, and it's still subpar. Maybe in a decade or two ARM may be useful as a desktop.
13:30 "Most of the work is then parallelized by the GPU"
Well, a lot of the work done by day-to-day-programs just cannot be parallelized. You have to wait for the thread to finish (serialization bottleneck). That's just the bane of general-purpose computing. So there will always be general-purpose computing cores for that kind of work and highly-parallel special-purpose computing cores for work that can be parallelized (SIMD work). Some work is so specialized that it will eventually disappear into very specialized machinery: analog computing cores for neural networks.
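The textbook form of that serialization bottleneck is Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n the core count. A tiny Python calculation shows why thousands of GPU cores don't help code that is even partly serial:

```python
def amdahl_speedup(p: float, n: int) -> float:
    # p = parallelizable fraction of the program, n = number of cores
    return 1.0 / ((1.0 - p) + p / n)

for p in (0.5, 0.9, 0.99):
    print(f"p={p:.2f}:", {n: round(amdahl_speedup(p, n), 1) for n in (4, 64, 4096)})
```

Even with 99% of the work parallelizable, 4096 cores top out below a 100x speedup; the remaining 1% of serial work dominates.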
THAT'S why SoC architecture wins in this regard. I hope NVIDIA releases not only an 8 GB version of such machines but also large memory configurations, and preferably fast ones like the M2 Ultra (one M2 Ultra gives 5-10 tokens per second running a 70B LLM while drawing 60-90 W, which even a 4090 can't do).
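Those token rates are at least plausible if you treat decoding as memory-bandwidth bound. A rough Python estimate follows; the 800 GB/s figure and fp16 weights are assumptions for illustration, not measurements.

```python
# Rough upper bound: each generated token streams the whole weight set through
# memory once, so tokens/s ~ bandwidth / bytes_per_token.
params = 70e9
bytes_per_param = 2.0        # assume fp16 weights; 4-bit quantization would be ~0.5
bandwidth_gb_s = 800.0       # M2 Ultra is commonly cited around 800 GB/s (assumption)

bytes_per_token = params * bytes_per_param
tokens_per_s = bandwidth_gb_s * 1e9 / bytes_per_token
print(f"~{tokens_per_s:.1f} tokens/s upper bound at fp16")   # roughly 5.7
```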
As a software engineer I agree. Not everything can be done using parallel processing.
I.e., why they included 6 CPU cores on the SoC.
With our dollar tanking and taxes, the 5090 will be well over $3k in Canada. Unimaginable.
Mate, there's a reason for the CPU-GPU split. GPUs may have a lot of cores, but those cores are really bad at general CPU tasks -- they have almost no cache per core so things constantly go out to memory, there's no IO die to work with storage, and graphics memory in general has much higher latency due to high bandwidth requirements (basically, more bandwidth means more data needs to be sent at once, which takes more time to process). There's also a lot of specialized circuitry on the CPU side made to fast-track certain instructions. Even if you could perfectly parallelize your application, one CPU core would be significantly faster than one GPU core, and most applications don't scale well past, at most, 20 threads.
Same applies in the opposite direction, of course -- a CPU will be significantly worse than a GPU at doing a ton of parallel computation tasks, like matrix math or image processing (which is mostly matrix math, but that's besides the point), both due to the architecture and because a GPU will have a hundred times more cores for the job. That said, it can handle almost any task eventually, while running stuff on the GPU has no such guarantee.
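A toy contrast of those two workload shapes in Python (assuming NumPy is installed): the matrix multiply is the wide, regular arithmetic that GPUs and SIMD units chew through, while the branchy loop below it is the dependent, irregular work that wants a few fast general-purpose cores. Neither actually runs on a GPU here; it just shows the shape of the work.

```python
import time
import numpy as np

a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)

t0 = time.perf_counter()
c = a @ b                                   # regular, massively parallelizable
t1 = time.perf_counter()

total, x = 0, 12345
for _ in range(1_000_000):                  # branchy; each step depends on the last
    x = (x * 1103515245 + 12345) & 0x7FFFFFFF
    total += 1 if x % 3 else -1
t2 = time.perf_counter()

print(f"matmul: {t1 - t0:.3f}s, branchy serial loop: {t2 - t1:.3f}s, checksum: {total}")
```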
I wouldn't buy Nvidia anything after all the scumbaggery they do.
As soon as AMD FSR 4 matches DLSS quality and has dedicated hardware RT, I will never need to buy Nvidia again...
You don't have to, millions of others will.
Good for you, more Nvidia for us, Win-Win
@@JasonKing-m6m you really NEED ray tracing and DLSS?
@@JasonKing-m6m nvidia will just keep on inventing new gimmicks like ray tracing and the stupid consumers will buy it.
Bro you managed to invent the SoC in this video, good job lol
Managed to credit Windows with trying to move to ARM this year too, though they've been on ARM for over 10 years (and failing at it).
GPUs and CPUs are completely different devices. You can't replace a CPU with a GPU, you can only replace a GPU with a CPU.
A CPU is very general.
It isn't just that GPU cores are smaller or anything.
A GPU can't even do some things that a CPU can.
Wrong, you can replace a CPU with a GPU.
Look at the Nintendo Switch... it runs an Nvidia Tegra X.
The operating system on the Switch is very lightweight, but it still runs on it.
Can you ask a CPU to do heavy graphical calculations? Yes you can, but it will be VERY slow. This is why you CAN'T replace a GPU with a CPU.
@@VintageCR Tegra X is an arm SoC not a GPU.
fr
@@VintageCR brother is confidently incorrect
The Nintendo Switch is powered by an Nvidia Tegra X1 SoC, while a plethora of other devices, like the Surface RT series of tablets, were powered by various versions of Tegra SoCs. Nothing new here.
That architecture is from 2013 though, and shows how slow Nvidia has been to do anything. Their first attempt used only 2 GB of RAM.
They are also used for car infotainment systems; if I'm not wrong they are in the Audi MMI systems.
I had an LG phone with an Nvidia Tegra 3 CPU over 10 years ago. It was the LG P895. The low 1 GB RAM capacity was the main bottleneck, but I don't know if that was due to the Tegra chip's limitations or LG's decision.
0:42 ngl I thought Jensen had green hair
I didn’t think he was real
@@junyaiwasei thought he was ai generated.
Nvidia CPU: just works. Only 999 dollar 😂😂😂
But that's the thing... their products just work and perform, lmao. The difference between $999 and $500 over the course of 4 years (a generation) works out to $124.75 a year. No enthusiast is noticing that.
@@thedeleted1337 The difference is still 499$, I don't understand what it is you are calculating.
@@adridell idk how you don't notice what I'm calculating when it's literally in the comment..... $499 spread over the course of 4 years is a drop in the bucket..... lmao
@@thedeleted1337 That doesn't change the product's cost of ownership. If it's $999, then it will cost you $999, not $500 plus $124.75 a year for 4 years. Your logic is flawed.
@adridell the longer you own something before it breaks or you replace it, the more ownership value you get. You're either like 12 years old or just not very financially literate
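For what it's worth, the arithmetic the two of you are arguing about is the same number viewed two ways; the real disagreement is only whether amortizing over an assumed 4-year ownership window is a fair framing. A tiny Python check, using the figures from the thread:

```python
price_a, price_b, years = 999, 500, 4   # figures from the comments above

difference = price_a - price_b          # what you pay extra up front
per_year = difference / years           # the same money spread over the window

print(f"up-front difference: ${difference}")             # $499
print(f"per year over {years} years: ${per_year:.2f}")   # $124.75
```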
How is this a big deal? I ran Linux on my CFW Nintendo Switch, which uses a Tegra; even Tegra 3 back in 2011 could run Linux. Now they make basically a Raspberry Pi and suddenly it's breaking news?
Ah wait, Nvidia has been making the Jetson Nano for a while now.
they have been at it for like a decade now.
My honest reaction : No.
Why not?
@@SterileNeutrinohe’s not a dreamer
I could be wrong, but in the past I've heard something along the lines of: CPUs are good at handling a variety of complex tasks, and GPUs are good at handling a huge load of more uniform tasks. That's why they are each good at what they do. If that's the case, we won't see computers without CPUs anytime soon.
PC gaming in a nutshell:
2008: Intel CPU + AMD GPU
2016: Intel CPU + Nvidia GPU
2020: AMD CPU + Nvidia GPU
2024: AMD CPU + Intel GPU
2030: Nvidia CPU + Intel GPU?
Wth is going on?
Corporations just working together at different years to screw us
2008 was a core 2 duo + 8800 GTX
It's gonna be hard making GPU cores do the work of CPU.... Really hard
I bought the low-end CPU of its generation for my current, now old, PC. When I told a computer store salesman my PC specs, he told me that to upgrade my PC I'd need a new GPU. I was surprised, because I bought a top-of-the-line GPU back then, but, again, I skimped on the CPU, buying the low-end model of the generation. But it really is all about the GPU, yeah. I could probably replace my old AMD Radeon with a new Intel Arc B-series and it might run Tekken 8 perfectly. Good video!
The 3 big companies are all making CPUs and GPUs now. It’s somewhat unbelievable.
Just wait... just like back in '22, when Huang said of GPUs that "the idea that a chip is going to go down in cost over time, unfortunately, is a story of the past," he's eagerly waiting to say it again, but about CPUs...
@@JasonKing-m6m and people believe Jensen's weapons-grade horseshit
So, x86 and ARM are completely different design paradigms. ARM is extremely power efficient, with several low-power features. I'm not so familiar with all the power-saving features of either, but I know that ARM is extremely power efficient. ARM uses a Reduced Instruction Set Computer (RISC) instruction set architecture (ISA) while x86 is a Complex Instruction Set Computer (CISC) ISA. What this means is that x86 has many times more instructions, capable of performing hyperspecific tasks, and this is what allows x86 to be so powerful: you can perform the same processes with significantly fewer instructions compared to a RISC architecture. The advantages of RISC are, obviously, reduced complexity, better modularity, power efficiency, and compatibility with third-party intellectual property (IP, hardware not software).
In short, ARM has to overcome its inherent disadvantages over x86 to be useful for computationally complex tasks. x86 is still not very power efficient compared to ARM, but AMD has made several leaps in innovation to make their Ryzen architecture more power efficient.
All that said, I'm barely scratching the surface of this topic and my knowledge is limited. My source is that I have a degree in computer engineering and remember some of this information.
That's all I'm gonna comment on, this is a complicated topic that I don't think I'm fully qualified to explain off the top of my head.
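A toy illustration of the instruction-count point above (these are made-up mnemonics, not real x86 or Arm encodings): a CISC-style machine might express "add the value at address b into address a" as a single instruction, while a load/store RISC machine spells out every step.

```python
cisc_program = [
    ("ADD_MEM_MEM", "a", "b"),     # one complex instruction does it all
]

risc_program = [
    ("LOAD",  "r1", "a"),          # the same work, as explicit simple steps
    ("LOAD",  "r2", "b"),
    ("ADD",   "r1", "r2"),
    ("STORE", "r1", "a"),
]

print("CISC instructions:", len(cisc_program))
print("RISC instructions:", len(risc_program))
```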
Do you know how good ARM is in the supercomputer market? The A64FX from Fujitsu inside the Fugaku supercomputer is a beast and sits in the top 10. The RISC vs CISC debate is already settled; Arm no longer even stands for Advanced RISC Machines.
@pid1790 you're talking about a supercomputer. It's not really applicable to a normal desktop environment. Also, I don't know anything about this supercomputer or what its purpose is, but I highly doubt it is doing general-purpose computing tasks like your desktop is. They're usually meant for highly parallelized computations, which makes RISC far more suitable since it is easier to scale. CISC is "complex" and is meant to be singularly powerful. CISC also uses more physical space and is more expensive to produce. Nothing I said has anything to do with a debate about supercomputing; I didn't even mention the word "supercomputer" in my original post.
☝️
I hate the fact that you PC guys forget Apple has been showing what ARM can do for over 4 years now; your whole comment about ARM doesn't make any sense. Apple runs a full desktop on ARM and it's on average more power efficient and powerful than x86.....
@SWOTHDRA point out to me where I mentioned Apple in my original post, or how what I said has anything to do with Apple.
Looking forward to having an NVIDIA CPU running Steam OS
8 MB of L3 cache is actually perfectly fine, guys; you just don't know what you're talking about.
The reason why we can't just ditch the CPU and have the GPU doing it all is because GPUs are extremely specialized processors, while CPUs are general use.
Think of it this way, imagine having a scientific calculator that can do multiple functions and even be programmed and compare it to a simple specialized calculator that can only do additions. To make a single addition on the scientific calculator you'd need to press far more buttons than you'd have to on the specialized calculator, but you'd be able to also substract, multiply, etc. but with each new function you can perform you need to press more buttons to tell the calculator what to do.
This is why CPUs are "slower" compared to GPUs, because the additional functions take more instructions to specify what to do which take more processing power and time in order to be selected and executed. This is also why GPUs can parallelize loads, because they are very specialized (same reason why you need RT cores to process Ray tracing, since regular raster cores just can't do it.)
On the same scenario imagine adding more simple addition calculators, you would be able to perform additions in parallel orders of magnitude faster, but you wouldn't be able to do anything else. That's the trade of, and the reason why you can use dedicated cards for specialized loads but still need a general processor to handle non-specialized tasks. Also, the reason why ARM is more efficient is because it drops many additional functions that x86 can do but that aren't really used. So instead of running a scientific calculator you are running a regular calculator but since most people never use scientific functions it's ok for most use scenarios.
(PS > By the way, having fewer instructions means many functions that x86/64 can do natively have to be "emulated" on ARM if needed, which makes them far less optimized. Check out videos on how game devs used programming tricks on the NES/SNES to make things that weren't natively supported by the systems, it's an eye opener. But it also means more pressure is put on programmers to know how to work the magic, and you know how that usually ends.)
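To make the calculator analogy concrete, here's a minimal CPU-only sketch in Python (numpy is the only assumption, and it's not real GPU code): one bulk "specialized" addition over a million numbers versus a general-purpose loop that has to decide what to do for each element. The names and numbers are just for illustration.

```python
import time
import numpy as np

# "Specialized calculator": one bulk operation applied to a million numbers at once.
a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

t0 = time.perf_counter()
bulk = a + b                         # one vectorized add over the whole array
vector_time = time.perf_counter() - t0

# "Scientific calculator": a general-purpose loop that branches per element.
t0 = time.perf_counter()
out = []
for x, y in zip(a, b):
    if x > 0.5:                      # different work depending on the data
        out.append(x - y)
    else:
        out.append(x + y)
scalar_time = time.perf_counter() - t0

print(f"bulk add: {vector_time*1e3:.2f} ms, branching loop: {scalar_time*1e3:.2f} ms")
```

The loop isn't slow because the math is hard; it's slow because every single element pays for the flexibility of deciding what to do next.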
I disagree that CPUs are slower, they are amazing at scalar math and boolean logic. You just don't use programs that are 100% CPU bottlenecked.
I do music production and it's 100% CPU bottlenecked. Graphics cards can't handle audio because it's a very logical and serial processing thing which is what CPUs are designed to do.
NPUs can't handle audio accurately as well, they work through approximations of things like the human brain, thanks to matrix math.
My previous CPU was an i7-4700MQ from 2013 and the performance gap isn't that big unless I have a lot of audio tracks in parallel (4 cores and 8 threads vs 12 cores and 20 threads); the 4700MQ is still pretty fast for audio work.
Also, GPU architecture is heavily flawed. For example, your average GPU silicon die for good gaming graphics and performance is 300 mm2, and GPU obsolescence is way faster than with CPUs.
My i7-12700K was cheap at 330 dollars (months after launch) for the ridiculously big performance it offers, all in a tiny 215 mm2 silicon die.
This explains why I hate the GPU market so much: a typical die is 300 mm2, roughly 40% bigger than my i7-12700K's, and those graphics cards just age poorly and are so expensive.
Easily my i7-12700K would stay fast for a decade, just like my i7-4700MQ laptop chip.
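On the audio point: the reason it stays on the CPU is that each output sample depends on the one before it. Here's a tiny illustrative sketch in Python (plain numpy, nothing audio-library specific; the filter and sample rate are just assumptions) of a one-pole low-pass filter, showing the dependency that stops you from spreading the work across thousands of GPU cores.

```python
import numpy as np

def one_pole_lowpass(samples, alpha=0.1):
    """Each output sample depends on the previous output, so the loop has to
    run in order; it can't be split across thousands of cores."""
    out = np.empty_like(samples)
    prev = 0.0
    for i, x in enumerate(samples):
        prev = prev + alpha * (x - prev)   # y[i] needs y[i-1]
        out[i] = prev
    return out

audio = np.random.randn(48_000).astype(np.float32)  # one second of noise at 48 kHz
filtered = one_pole_lowpass(audio)
```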
@@RaydeusMX "Functions that x86 can do but that aren't really used" are already mostly emulated within x86 with basically no dedicated chip space, so the idea that ARM is inherently more efficient is kinda silly. x86 CPU engineers aren't stupid, they know which instructions are used by their clients and which aren't.
That's not a play on his name, but I don't blame you for not knowing. Too young.
"Meet George Jetson."
Jane - His wife
A long time ago (before Nvidia even wanted to buy ARM) I was saying that the natural evolution of current PC tech would be having the CPU and GPU on the same card, sharing the same memory, with the mobo being used for I/O instead. And it sounds like we could see something like that in the near future.
(And honestly it's about time we go from integrated graphics to integrated cpus.)
Apple has been doing this for over a decade.....
@@RaydeusMX The mobo can't be used for IO, how the hell would it know which resources to provide with no instruction from the CPU? Outside of that, that's what most CPUs look like these days, a CPU and iGPU sharing system memory.
Oh, the irony of Nvidia running on Linux. Torvalds will be enraged.
Not for Nvidia SoCs. In fact, he openly said he gave Nvidia a thumbs up for the work they did with Tegra on Linux. This was back in 2013.
Jetson has been a product for almost 10 years; it's based on Tegra, which was their GPU-integrated SoC product. Nothing new here.
We already build games for Android (Linux + Vulkan), iOS, and Switch, all ARM devices. Also, a lot of Linux apps run perfectly on ARM devices like the Raspberry Pi. I made my own emu-- "personal" console with it many years ago.
Right now, I don't think you could get rid of a CPU but my expertise is limited. Maybe there is a way to make some kind of hybrid. After all, we already use things like APUs.
Let's hope Microsoft and Nvidia don't get more powerful with this. Valve, hurry and spread SteamOS, please!
I think Wendel has been saying that soon we might be plugging the PC into the GPU
Don't PC handhelds already do this, but the opposite, with Zen 4? Instead of the CPU being nonexistent and integrated into the GPU, the GPU CUs are integrated into the CPU? Or am I missing something?
I like how deeply you investigate the different subjects you make videos on... thanks again, man.
How is your idea of “a gpu with a few cpu cores” different than what is generally called a SoC?
a PC build with an nvidia CPU and intel GPU feels like a fever dream to me.
A GPU without a CPU? That's dumb. There can be a CPU without a GPU, and a CPU can also have an integrated GPU for graphical capability. CPUs and GPUs have different instructions, and for a computer the CPU is a must; a CPU is also much faster than a GPU at serial computation. Without a CPU a GPU can't work, since it gets its instructions about what to work on from the CPU.
Technically a CPU is needed in the current computing paradigm. However, it's not out of the question that a new computing paradigm could be invented that greatly reduces the complexity of the CPU and distributes those tasks over a "GPU-like" architecture. Currently, we cannot do this, but Nvidia and others could be working on this.
So, basically, Nvidia in the future will develop an open-source DX-to-Vulkan or OpenGL translation layer, similar to DXVK, but this one would only work for the Windows operating system, and not for the games.
It's also possible that they make ARM CPU chips the new standard if this does become reality, since ARM is very efficient and in many ways better than x86; it's just that Windows targets the older standard and ARM support is newer. Also, again, gaming shouldn't be much of a problem because of Proton, DXVK, WineD3D or DgVoodoo-to-OpenGL, which all exist. The best of them are DXVK (made and maintained by one guy, though Valve helps him financially) and Proton (made by Valve for their SteamOS system); both are open source and can be used in pretty much any game that isn't already running on Vulkan or OpenGL, which ARM supports.
*Hit me up with a response when this actually happens.*
Let me predict the future: Nvidia will integrate RAM into their CPUs, starve all but the highest-end model of RAM, and push people to buy that one, and it will cost like $1500.
14:31
lol all it's saying is bypass the cpu and system ram
How do you do that? Make an SoC with the CPU and GPU fused together over a high-bandwidth link, similar to how Apple's fused silicon works.
intel going cpu to gpu and nvidia going gpu to cpu
Holy NVIDIA CPU! You paid $350 for each core
That's the beginning of a CPU/GPU war between AMD, Intel, and Nvidia.
the amount of Nvidia glazing in this market is disgusting.
So if I get a Jetson Orin Development kit, it'd be a beast for retro gaming?
I want to build a little retro gaming machine.
Yes and no? I mean, it runs very specific things, from my understanding.
you would be better off with one of those ryzen mini pc (likely cheaper and faster)
@@legendp2011 those cost hundreds more.
No
@@ekayn1606 where I am the ryzen mini pc are the same cost (does not have to be the newest version, even an older discounted version is much faster)
Never heard of a jetson board? 1:07
lol I know, right. I think the average viewer of this channel isn't used to the embedded software industry.
Nintendo Switch 😶
Are you sure the name isn't a nod to "The Jetsons", the old cartoon that was futuristic but in an old-school way?
Someone needs to put 5 of these JONs together into one tiny supercomputer.
Choose one:
Amazon Graviton2: Up to 64 cores based on the ARM Neoverse N1 architecture.
Amazon Graviton3: Also up to 64 cores
Ampere Altra: Up to 80 cores based on the ARM Neoverse N1 architecture.
Ampere Altra Max: Scales up to 128 cores per processor.
Fujitsu A64FX: 48 compute cores based on the ARMv8.2-A SVE, additional 4 assistant cores for managing tasks.
NVIDIA Grace CPU: Combines up to 144 ARM cores with high memory bandwidth and energy efficiency.
Huawei Kunpeng 920: Up to 64 cores based on ARMv8 architecture.
Marvell ThunderX2: Up to 32 or 64 ARMv8-A cores, targeting HPC and data center applications.
Marvell ThunderX3 (announced): Expected to scale up to 96 cores
SiPearl Rhea: Designed for European exascale computing. ARM Neoverse cores (exact count expected to be high).
Phytium Processors: Used in Chinese high-performance systems, up to 64 cores.
Alibaba Yitian 710: Up to 128 cores based on the ARMv9 architecture.
😂
12:15 yea, cuz cache is more important than core count
10:07 This GPU is exactly half of the Ampere Quadro lineup's A500, which has 2048 CUDA cores and 64 tensor cores. There is an even lower-end Quadro A400 (which would be even worse than the GPU in this Jetson) with 768 cores and 24 tensor cores.
Microsoft needs to intervene and strong arm Intel to license x86 to Nvidia, Qualcomm, and everyone else who wants it. It's probably a good time to do that now since they're in trouble LOL
Everyone: "Oh wow, NVidia CPU! Incredible!"
Me, who adopted the NVidia Shield TV back in 2015 and still uses it: "eh"
In fact, Jetson boards have been available since 2017 or so. So there's no news here.
Is it possible for Nvidia to acquire the x86-64 license or any future development of the x86 instruction set? I feel like x86 is on its deathbed, so if Intel and AMD want to save it then they need to license it out like ARM.
Anyone heard of the Nvidia Tegra SoC series? It's what every Nintendo Switch uses.
I think this is the first video from anyone I've caught within the first 5 minutes
fr
4 cores and 4 threads, GDDR7X Pro Max, for $1500. It's a steal.
What about power consumption and heat generation?
it's for AI drones, not desktop replacement
Nvidia does CPUs, Intel does GPUs.
Homies, what kind of alt history fanfiction is this?!
Valve is working on a Proton for Arm processors to run x86 games. It will be a game changer when they release it.
wasn't there a rumour about MediaTek and NVIDIA working together?
Nvidia and MediaTek have already been working together for quite some time. The so-called Nvidia CPU being talked about is most likely a MediaTek CPU with an Nvidia GPU instead of the usual Arm GPU. In the automotive market MediaTek is already licensing Nvidia GPUs.
It always was, but it was for automotive.
You can buy 192-core / 384-thread Epyc CPUs from AMD, but the average user does not need that many threads.
Man your voice is so soothing to the ears , keep it up 🤗🤗
I would prefer getting rid of GPUs, tbh. I am a huge fan of the APU. AMD did a great job making this a thing and they should continue to improve it.
APUs have their drawbacks and limitations. That's why, despite APUs existing since 2011, they haven't made GPUs obsolete.
It could be a good MacBook Air alternative (that can do some light gaming). I would love to see a passively cooled, silent Windows ultrabook (that doesn't suck).
An Nvidia-powered PC running Linux natively would be a new era in the PC industry.
What if GPU had an on-card CPU that can coordinate with the GPU faster and the BIOS is set up to look for the CPU on the PCI devices?
14:45 You're missing some important info.
Much like a GPU vs an FPGA or ASIC for coin mining, a GPU core is hardware built to do a very limited set of calculations (we'll call them instructions) very rapidly; because it's limited to those instructions and set up to do them specifically, it does them very fast.
For mining, a GPU would be doing general-purpose calculations on its CUDA cores (slow), an FPGA is programmed to have hardware support for that specific calculation (decent speed), and an ASIC miner is made from the factory to ONLY do that calculation, with all of its circuitry designed around it (extremely fast).
In day-to-day use, an x86/64 CPU is set up as a jack of all trades; it has many, many instruction sets allowing it to do all those things, which makes it more power hungry, but it can do a wider array of tasks at a reasonable speed.
This also means the cores have to be physically bigger, because the designers have to decide which calculations get used most often and dedicate more hardware to those particular instructions in each core.
ARM CPUs have a much more limited set of instructions: any program that falls inside that set will run fast and use very little power, but the moment you try to get it to do a task it doesn't have an instruction for, it becomes painfully slow, as it has to rely on software trickery to work out what it doesn't have the hardware to do directly.
ARM gets to save power because there are a lot fewer instruction sets fighting for circuitry in each core, so more space can be dedicated to circuitry for the instructions it actually uses.
A GPU has an even more limited set of instructions, restricted to video- and rendering-related work. There are some calculations for video/rendering that overlap with general use as well, which it can do fast (this is why they made CUDA/compute, to let people use these for workloads that can actually exploit them), but outside of that it would be useless.
This means GPUs can have tiny cores and lots of them: because each core only needs a few instructions, the chip can be split into many, many small cores that work in groups or alone as needed, since they are all doing the same job.
So the trade-off is many instruction sets using hardware made for each specific calculation type at a reasonable speed, or a few instruction sets doing the few calculations they need to do very, very rapidly.
Granted, GPUs have been adding more lately, but as they add more they aren't going to outpace a CPU at what it does; there isn't any magic to it.
TL;DR:
The x86/64 CPU is an axe/wrecking-bar/hammer combo tool: it does all three decently, but it's not the best at any of the three.
An ARM CPU is a wrecking bar: it does wrecking-bar things really well, it's the best wrecking bar, and you can use it to whack in a nail or split some wood slowly and badly, but it will do it.
A GPU is a ball-peen hammer: it can hit a nail really well and is useless for the rest.
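A tiny sketch of the "software trickery" idea from above: if a chip has no hardware multiplier (plenty of early CPUs didn't), a multiply has to be emulated out of shifts and adds, looping over simple instructions where a fuller-featured core would use a single one. Python here is only for illustration; the function name is mine.

```python
def multiply_by_shift_and_add(a: int, b: int) -> int:
    """Multiply two non-negative integers using only shifts and adds,
    the way a core without a hardware multiplier would have to."""
    result = 0
    while b:
        if b & 1:          # if the lowest bit of b is set...
            result += a    # ...add the current shifted copy of a
        a <<= 1            # a * 2, ready for the next bit position
        b >>= 1            # move to the next bit of b
    return result

assert multiply_by_shift_and_add(123, 456) == 123 * 456
```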
I wonder if you could flash a custom Switch OS or Android onto the Jetson and get like a turbo Nvidia Shield gaming console.
This video is kind of misinformation. Comparing cores between architecture is apples to oranges. Also, theoretically you could run everything on just a GPU but it would require an entirely new software stack.
Kinda, it would suck though, CPU's are jacks of all trades, GPU's are masters of one. And that one thing is doing lots of complex math really really fast.
@baronvonslambert not necessarily, like I said, it would require an entirely new stack, which means solving all the problems fundamentally differently than they're currently solved
Theoretically you could, but everything would be incredibly slow. GPUs are only fast because they use hundreds or thousands of cores in parallel, while most general software uses 1 or 2 threads to perform tasks, with complex software using a few more. This is because most tasks simply can't be efficiently split into a thousand small ones, and each logic step requires the outcome of the previous one. Also, the cores on a GPU are designed to do one thing only, which is batches of very fast floating-point calculations required to render complex 3D geometry and effects. Lately they have been repurposed for other, similar tasks such as crypto mining or AI processing, but they aren't efficient for general computing at all.
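A back-of-the-envelope way to see the dependency problem; the clock speeds and core count below are made-up but plausible numbers, not anything from the video. A chain of dependent steps only ever runs on one core at a time, so the extra GPU cores do nothing for it, while independent per-element work scales with the core count.

```python
# Toy latency model: serial chain vs independent per-element work.
SERIAL_STEPS  = 1_000_000     # each step needs the previous step's result
CPU_CLOCK_GHZ = 5.0           # one fast general-purpose core (assumed)
GPU_CLOCK_GHZ = 1.3           # one of many simple GPU cores (assumed)
GPU_CORES     = 2048

# Dependent chain: only one core can work at a time, on either chip.
cpu_chain = SERIAL_STEPS / (CPU_CLOCK_GHZ * 1e9)
gpu_chain = SERIAL_STEPS / (GPU_CLOCK_GHZ * 1e9)

# Independent work of the same size: the GPU can spread it over every core.
cpu_parallel = SERIAL_STEPS / (CPU_CLOCK_GHZ * 1e9)
gpu_parallel = SERIAL_STEPS / (GPU_CLOCK_GHZ * 1e9 * GPU_CORES)

print(f"dependent chain : CPU {cpu_chain*1e3:.2f} ms vs GPU {gpu_chain*1e3:.2f} ms")
print(f"independent work: CPU {cpu_parallel*1e3:.2f} ms vs GPU {gpu_parallel*1e6:.2f} us")
```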
1:37 Is that CPU to CPU, or does it take into account that the Arm CPUs come with a GPU?
Oh. I actually have some of the information you wanted.
So Orin is the Tegra 234 architecture, and is composed of a 12-core Arm Cortex-A78AE and 2 GA107 GPCs, so that's 2x8 SMs for a total of 2048 CUDA cores and 64 tensor cores. These are not GeForce, so they don't have ray-trace cores; I guess they would be categorized with the GA100 products rather than GeForce/RTX.
The Orin Nano is a binned version of this, with only 8 SMs and 6 A78C cores active. However, the rest of the GPU is still active, and that contains custom AI/automotive components like the PVA and, the big one, the dual DLAs (Deep Learning Accelerators), which are basically extra tensor cores on steroids.
And that's a big deal because the use case for the Orin Nano is primarily AI/ML. Remember your comparison to the 1070 Ti? Well, you were just comparing FP32, which is the primary metric for games. Or rather, it was until recently, but I digress.
Well, the 1070 Ti has a 64x performance penalty for FP16, so it gets 8 TFLOPS FP32 but only 0.128 TFLOPS FP16. It also doesn't have hardware integer support, so it can face up to a 4x performance penalty for integer ops depending on the instruction. And these are the bread and butter of AI/ML performance.
This Orin Nano gets 1.2 TFLOPS FP32, compared to the 1070 Ti's 8 TFLOPS FP32 out of its CUDA cores; however, it gets 25 TFLOPS FP16, 50 TOPS INT8, and 100 TOPS INT4 at 10 watts out of its tensor cores and DLAs (rough peak-FP32 math at the end of this comment). This is the performance it needs most for its use case: running convolutional inference on AI models.
Moving on to the CPU, the Arm Cortex-A78AE is a specialized safety CPU that only gets "half" the performance of a normal A78, as is the case with all Arm CPUs with the AE designator. This is because its purpose is to operate in a mode called lockstep, where each half of the CPU cores runs the exact same operations to double-check against any kind of computational error. Completely unsuitable for running games. Which brings us to the last subject.
The switch 2. The switch 2 is using the Tegra 239 SOC, not the orin, which is T234.
The CPU is an 8-core Arm Cortex-A78C, a variant of the A78/Cortex-X family with extra cache and the ability to have 8 cores on a single cluster instead of just 4.
The GPU is the GA10F, as opposed to Orin's GA10B. It is a derivative of the GA102 Ampere architecture, with 1 GPC of 12 SMs (1536 CUDA cores), and unlike Orin it is GeForce/RTX, so gone are things like the DLAs, and instead it has the trappings of the RTX GPUs, like ray-trace cores. Years ago this would have fallen under the GeForce Go branding, but it's all been folded under GeForce now.
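Circling back to the FP32 figures above, here's the rough peak-FP32 math. The core counts are from the comment; the clock speeds are my assumptions (roughly 625 MHz for the Orin Nano 8GB and roughly 1.68 GHz boost for a 1070 Ti), so treat this as a sanity check rather than a spec.

```python
# Peak FP32 ~= CUDA cores x 2 FLOPs per clock (one fused multiply-add) x clock speed.
def peak_fp32_tflops(cuda_cores: int, clock_ghz: float) -> float:
    return cuda_cores * 2 * clock_ghz / 1000.0   # GFLOPS -> TFLOPS

print(f"Orin Nano 8GB: {peak_fp32_tflops(1024, 0.625):.2f} TFLOPS")  # ~1.28, close to the 1.2 quoted
print(f"GTX 1070 Ti:   {peak_fp32_tflops(2432, 1.683):.2f} TFLOPS")  # ~8.2, close to the 8 quoted
```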
You're actually describing what the M series has already accomplished. I'd say budget computers on the level of today's 4080s, in that dev-kit form factor, in 10 years.
Hey Vex, I think if we see something like CPUs being replaced by GPUs, it'd likely cease to be a GPU in the first place. I mean, isn't that technically what an APU is?
Vex, can you test Intel APO in games?
Cool video. I still can't get over the fact that the 3060 is still this popular with buyers.
Lock the voltage and core speed on your 13700K 11:25 and sync it with the RAM and ring bus using some ratio.
Use HWiNFO to make sure you're not wasting too much power or producing excessive heat output (which is all subjective).
I don't think "Jetson" is a play on Jensen's last name so much as it is a reference to "The Jetsons", which was a Hanna Barbara cartoon from back in the day that we had re-runs of on TV in the 90s.
instead of onboard graphics card, we would get onboard cpu. what a time to be alive. interesting to see how this pans out. gpu tech has advanced impressively indeed.
Dude, GPUs and CPUs have totally different machine code, architectures, instructions... GPUs are specialized for floating-point tasks; CPUs are a "catch-all" machine with a lot of specific instructions built into the hardware, the same way GPUs have their own special hardware paths.
They have different architectures, like ARM, x86 (32/64), RISC-V... even worse, GPUs have all those cores because they are "tiny and simple", with a very limited list of tasks they can do.
You just have to go back a "few" years and compare games using "software rendering" (CPU) or "accelerated/hardware rendering" Glide, OpenGL, Direct3D, DirectX (GPU).
In the late 90s-early 2000s you could have a flagship CPU and run some early-90s games in software mode and have a good experience; right now CPUs and GPUs have diverged light years apart.
CPUs can handle floating point; they are excellent for scalar math, GPUs for vector math (3D stuff).
@@saricubra2867 It was an example from old CPUs; I suppose newer ones "have" what gave 90s-era GPUs their edge back then. The problem with this video is that he is saying something like "Wow, look what all those raytracing/CUDA cores do, why not run Windows on them?" It's a ridiculous premise; it's like asking a paintbrush to cut a ceramic tile.
CPU=Handyman+construction worker+electrician+...
GPU=Picasso (extreme example, but you can put "Professional house painter")
That would pretty much be impossible. It's possible to have a GPU emulate SOME of the work a CPU does. The CPU is what connects all the different buses (expansion systems like PCIe, USB, SATA, etc.) and allows them to communicate by acting like an interpreter. However, one then runs into the same problem as going from x86 to ARM: different architectures require different system languages, so any program would need to be effectively written for that CPU (the GPU) architecture. There are multiple different GPU architectures, just like there are different CPU types.
I'm looking forward to this already. The more competition, the more the CPUs have to get better, and the less old CPUs (that are still decent) may cost now.
I don't think it will be that easy to replace a cpu with gpu anytime in the future.
But this is NVIDIA, who knows maybe they finally can turn their gpu into an actual CPU and we never have to buy a new one
Nvidia rlly said "Oh yeah, intel? You wanna make gpu's? Watch this then 😈"
ARM has a way to make GPUs work; it's a standard called PCIe lol
It'd be rather simple to make GPUs work with ARM, but most companies aren't bothering yet, as ARM CPUs currently focus on performance per watt rather than performance in general.
12:14 Both processors use 8 cores for gaming, so it's misleading to say 8 cores is faster (like, wtf). Intel just has 16 extra E-cores that handle small background work, so not "gaming". If you wanted a comparison, you should have put an 8-core against a Threadripper, and with that you could argue what's better or worse.
That'd be nice, Vex. Imagine seeing a motherboard without a CPU socket or whatever, with only one or two PCIe slots for the GPU. Or, like I think I've seen suggested, the CPU and GPU would be one thing, like you're saying: one big-looking chip. Then you wouldn't have any PCIe slots for the GPU, since you wouldn't need them unless you needed a slot for something else, idk. I'm sure that as more technology goes into developing a combined CPU and GPU it would get nice. It might even work better or be more efficient.
ok, but why do you have so much bass on the mic
We have APUs already which are a CPU with a gpu strapped on, and sometimes the GPU portion is larger than the CPU.
I take issue with Cormack's wording, because if the "GPU" becomes the central processor, then by definition it would be a CPU, which would then have accelerators for tasks it's not very good at... just like current-day CPUs have GPUs to accelerate specific workloads, and even GPUs have separate accelerators attached to them (like hardware video encoding/decoding).
On the software side, it would be very difficult to parallelize current code bases to be able to take advantage of very weak but very high core count GPUs.
It's not a new idea but it's just been very impractical up until now, and would still probably be another 10 years+ away in a realistic setting, and even longer before it becomes mainstream. Just look at how long it's taken for ARM / RISC-V to really be put in the spotlight.
I think the whole point of the Jetson Orin Nano is that we have AI cores on an SoC, which is great for anyone looking to get into AI development on the cheap. This isn't going to revolutionize the CPU/GPU market, but it certainly makes AI development more accessible to those who are interested but would rather not drop a ton of money on the idea. If you just want an ARM-based SoC, there are plenty already on the market, but you really can't do AI development on something like a Raspberry Pi. Maybe in the future we'll see AI emulating x86 instruction sets on non-x86 platforms?
I've often wondered whether a small CPU on the GPU wouldn't be potentially useful. Add in just 4 decent cores and healthy local RAM and cache to get tighter integration with the GPU, to optimize and handle tasks better suited to CPU execution, not to mention forward flexibility.
In theory, a GPU could do stuff without a CPU. But it needs some way to know what to do. Either you need to run an OS on the GPU, essentially making it an APU. Or you need to have a CPU still and change how Operating Systems work to basically just be for selecting a program that the GPU then handles all the processing for.
The bigger hurdle is GPUs are fast because they are optimized for specific functions. CPUs are WAAAAY more general. Some of the stuff a CPU is fantastic at would be dog slow on a GPU and require a major infrastructure change to make viable
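To show what "the GPU needs something to tell it what to do" looks like in practice, here's a minimal sketch using Python with numba's CUDA bindings (my choice of library; it assumes numba is installed and a CUDA-capable GPU is present). Everything except the kernel body runs on the CPU: it allocates the data, copies it to the GPU, picks the launch dimensions, starts the kernel, and copies the result back.

```python
import numpy as np
from numba import cuda

@cuda.jit
def add_one(arr):
    i = cuda.grid(1)                 # this thread's global index
    if i < arr.size:
        arr[i] += 1.0                # the only part that runs on the GPU

data = np.zeros(1_000_000, dtype=np.float32)
d_data = cuda.to_device(data)                    # CPU orchestrates the copy to VRAM
threads = 256
blocks = (data.size + threads - 1) // threads
add_one[blocks, threads](d_data)                 # CPU launches the GPU kernel
result = d_data.copy_to_host()                   # CPU pulls the result back
```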
I think you got something wrong in the video.
CPUs are designed to run pretty much every task you can throw at them by using hardware that's designed to do every kind of mathematical or logic operation you can think of. They need a lot of different logic sub-circuits to do that, which is why each of the cores is larger than anything a GPU would have. You can make a CPU able to do more things (up to a point) by updating the microcode to "tell the hardware" how to do the new tasks. It might take longer, but if you can break the task down into something the CPU has built-in hardware for, it can do it.
GPUs, on the other hand, are designed to do specific tasks in hardware that cannot do anything besides those specific tasks, even if you updated the software. If you want the GPU to do a task it was not designed for, there's no way to make it possible. If you look at FSR, for example, it only works on older GPUs because the algorithm was designed to work with what the hardware of those GPUs could already do. So the software was written in a way that would work with the hardware.
If we come back to the video now, you can probably guess what I'm getting at. Nvidia will not be able to run something like Windows on a GPU without basically building an SoC (System on a Chip). Modern GPUs are getting more and more like those in-between chips that can basically do everything (essentially not GPUs anymore), which is why it's also possible to run Doom entirely on a GPU (there has been an article about it).
Whether this trend is a good direction or not is not for me to decide; I'll leave that to the people who design the chips. I'm just a random guy on the internet who knows how the hardware works on a roughly mid-level basis, because that's my daily job (VHDL coder). IMO, at some point we'll hit the physical constraints of the way we build PCs now, and we'll have to think about a workaround like making the chips even bigger, splitting the tasks across multiple chips, or something entirely different. This will create other issues like signal integrity, latency, or just keeping the thing from overheating (look at der8auer's video about Threadripper overclocking, for example). Anyways... enough rambling.
Another point, which was not wrong but might just be interesting for you, is that Nvidia has been working with ARM for well over 15 years now. They've been working on their Tegra lineup since 2000-ish (plus or minus a few years of development before showing it to the public).
At one point, during your x86 vs arm explanation. I saw a very young Louis Rossmann for a sec there...
Was literally having a shower thought about this like 2 days ago!!!