Why does this GPU have an SSD? - AMD Radeon Pro SSG

  • Published 16 Jul 2024
  • Get $25 off all pairs of Vessi Footwear with offer code LinusTechTips at www.Vessi.com/LinusTechTips
    SmartDeploy: Claim your FREE IT software (worth $580!) at lmg.gg/OTTP7
    AMD announced the Radeon Pro SSG in 2016, combining a GPU and an onboard SSD - but when it launched in 2017, practically nobody bought it. Was it simply ahead of its time, or was it truly a dud?
    Discuss on the forum: linustechtips.com/topic/14183...
    Check out the Radeon™ Pro SSG: geni.us/yQ43dMY
    Buy some LTT Store Dot Com Cable Ties: www.lttstore.com/products/cab...
    Buy an Intel Core i9-12900K: geni.us/hrzU
    Buy an ASUS TUF Z690-PLUS WIFI D4: geni.us/mgWYr2
    Buy a Noctua NH-D15: geni.us/vnuvpW
    Buy a Corsair Force MP600: geni.us/TkxIgO
    Purchases made through some store links may provide some compensation to Linus Media Group.
    ► GET MERCH: lttstore.com
    ► AFFILIATES, SPONSORS & REFERRALS: lmg.gg/sponsors
    ► PODCAST GEAR: lmg.gg/podcastgear
    ► SUPPORT US ON FLOATPLANE: www.floatplane.com/
    FOLLOW US ELSEWHERE
    ---------------------------------------------------
    Twitter: / linustech
    Facebook: / linustech
    Instagram: / linustech
    TikTok: / linustech
    Twitch: / linustech
    MUSIC CREDIT
    ---------------------------------------------------
    Intro: Laszlo - Supernova
    Video Link: • [Electro] - Laszlo - S...
    iTunes Download Link: itunes.apple.com/us/album/sup...
    Artist Link: / laszlomusic
    Outro: Approaching Nirvana - Sugar High
    Video Link: • Sugar High - Approachi...
    Listen on Spotify: spoti.fi/UxWkUw
    Artist Link: / approachingnirvana
    Intro animation by MBarek Abdelwassaa / mbarek_abdel
    Monitor And Keyboard by vadimmihalkevich / CC BY 4.0 geni.us/PgGWp
    Mechanical RGB Keyboard by BigBrotherECE / CC BY 4.0 geni.us/mj6pHk4
    Mouse Gamer free Model By Oscar Creativo / CC BY 4.0 geni.us/Ps3XfE
    CHAPTERS
    ---------------------------------------------------
    0:00 Intro
    0:53 What is an... SSG?
    1:27 SSD performance
    2:06 Is this like DirectStorage?
    2:55 The SSG API
    4:11 Enter Adobe
    5:00 But... Why?
    6:02 Can we... Upgrade it?
    7:04 Why is direct-to-GPU storage important?
    7:58 Conclusion - Why it won't come back
  • Science & Technology

COMMENTS • 2.2K

  • @Legatron17 · 2 years ago · +5871

    Truly opened my eyes as to why the GPU does

    • @comedy6631 · 2 years ago · +9

      This*

    • @justsam07 · 2 years ago · +4

      🌚👍

    • @AnxulJyoti · 2 years ago · +21

      Registering my comment before this blows up.

    • @deki9827 · 2 years ago · +78

      Exactly why I dislike the clickbait titles, they don't tell us why the gpu does!

    • @aumshumanmohapatra7567 · 2 years ago · +7

      But why?

  • @alphapuggle · 2 years ago · +4116

    I'm glad there's finally an answer to why that GPU does

    • @32bites · 2 years ago · +80

      Has Anyone Really Been Far Even as Decided to Use Even Go Want to do Look More Like?

    • @noamtsur · 2 years ago · +111

      but no one ever asks how is GPU

    • @brownie2648 · 2 years ago · +43

      @@32bites r/ihadastroke

    • @brownie2648 · 2 years ago · +3

      @@noamtsur the gpu iz brocken :((((

    • @tobiwonkanogy2975 · 2 years ago · +1

      this gpu does well. Gpu fricks.

  • @loicgregoire3058 · 2 years ago · +105

    Those would be crazy useful in AI applications; imagine loading your whole dataset onto the GPU once without having to reload it for each training iteration

    • @steevem4990 · 2 years ago · +2

      I actually thought that's where they were going when Anthony swapped the SSDs

    • @fluidthought42 · 1 year ago

      Especially now that ML has moved away from NVIDIA proprietary tech
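
    A minimal sketch of the pattern described in this thread: upload the training set to the GPU once and reuse it every epoch, instead of re-copying it each iteration. This is plain CUDA runtime code for illustration only (the SSG itself used AMD's proprietary API); the dataset size, epoch count, and buffer names are all hypothetical.

        // keep_resident.cu - dataset stays in VRAM across epochs (illustrative sketch)
        #include <cuda_runtime.h>
        #include <cstdio>
        #include <vector>

        int main() {
            const size_t n = 1 << 24;              // ~16M floats (~64 MB), stand-in dataset
            std::vector<float> host(n, 1.0f);

            float* dev = nullptr;
            if (cudaMalloc(&dev, n * sizeof(float)) != cudaSuccess) return 1;

            // One-time upload over PCIe; every epoch below reuses the same device buffer.
            cudaMemcpy(dev, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

            for (int epoch = 0; epoch < 10; ++epoch) {
                // launch training kernels that read `dev` here; no re-upload needed
            }

            cudaFree(dev);
            printf("done\n");
            return 0;
        }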

  • @joshuatyler4657 · 2 years ago · +382

    Imagine being the engineering team that made all of that work, only for the industry to say "ok. cool."

    • @mnomadvfx · 2 years ago · +15

      The people who actually buy them to work on appreciate it.
      The people who just commentate on it are not "the industry" at all, just glorified journalists.

    • @JonatasAdoM · 2 years ago · +2

      @@mnomadvfx If the industry needed it, it would exist. Just look at how much better enterprise tools and systems are.
      Also, a victory against ever more complex and complicated hardware.

    • @paulpoco22 · 2 years ago · +3

      Just like VESA Local Bus video cards

    • @lazertroll702 · 2 years ago

      @@JonatasAdoM aye! #KludgeDeath

  • @pezz1232 · 2 years ago · +496

    I wonder how long until the title changes lol

    • @mys31f70 · 2 years ago · +5

      probably within the first hour or 2

    • @christiannkulcsar7951 · 2 years ago · +2

      Same

    • @slug8039 · 2 years ago · +29

      Why does that GPU?

    • @shadihaddad7 · 2 years ago · +8

      I bet they leave it out of spite lol

    • @ritualduke8825 · 2 years ago · +8

      Why do they do that? Linus does seem to do that a lot and it’s confusing.

  • @mirandahw · 2 years ago · +578

    Ah, yes, "Why does this GPU"
    Gotta love the original titles...

  • @davidg5898 · 2 years ago · +389

    If the API was actually widely rolled out, something like this would be incredibly useful for science departments at universities (which is a niche market, but not an insubstantial one).

    • @jinxash3580 · 2 years ago · +23

      Would have been ideal for training deep learning models on this GPU

    • @davidg5898 · 2 years ago · +16

      @Sappho I was doing astronomical simulations (my work was more with globular cluster formation and evolution, tens of thousands to millions of stars, sometimes coupled with a molecular cloud during early formation), and there definitely would have been a performance boost if the read-out/save time of each time slice could have been sped up by having the GPU dump it straight to storage.
      Just as you also described, most of my work was done on a Beowulf cluster with a lot of high-powered off-the-shelf GPUs.

    • @TheCodyLaxton · 2 years ago · +7

      Niche but actually a large market, haha. I worked in university research at the National Weather Center. Basically any excuse to build something cool for research is the path most travelled by pre-doctoral and undergrad students.

    • @lazertroll702 · 2 years ago

      meh ... there's only so many times storage malware can give the feelz...
      although .. iffins the gpu executed from that storage on init ... 🤔

  • @tranquilitybase8100 · 2 years ago · +54

    This technology eventually found a home... AMD used it in the PS5 and Xbox Series. Both systems can load into RAM directly from the SSD, bypassing the CPU.

    • @Pacbandit13 · 2 years ago · +1

      Is that true?

    • @louism771 · 2 years ago · +6

      Well, this led to DirectStorage, which we have today on these consoles and will soon have on Windows 11. Pretty much the same idea, technologically a bit different I guess.

    • @derpythecate6842 · 2 years ago · +2

      I think bypassing the CPU is difficult/insecure, and I did some research and was right.
      A complete CPU bypass would mean being able to skip the kernel layer and all its security checks, which gives you arbitrary read/write to memory from disk; that should be minimized, as it provides a loophole.
      What DirectStorage does is simply use the GPU to decompress compressed data sent over the bus from the SSD, which it then hands to the CPU to decode and execute. This basically just speeds up the data retrieval pipeline, but doesn't expose any loopholes, as it is fundamentally the same underlying mechanism all computers use today to fetch data.
      The AMD SSG card in the video can do such caching because the GPU doesn't execute kernel code, which means that while you can still write malicious code targeting the GPU, it's far more self-contained than executing it directly on the CPU, which controls all your processes, including your OS.
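
    For comparison, the closest shipping analogue of "the GPU reads the SSD directly" today is NVIDIA's GPUDirect Storage (cuFile), not AMD's SSG API. A minimal sketch, assuming a Linux system with GDS installed and a hypothetical file data.bin; error handling is omitted:

        // gds_read.cu - DMA a file straight from NVMe into GPU memory (illustrative sketch)
        #include <cufile.h>
        #include <cuda_runtime.h>
        #include <fcntl.h>
        #include <unistd.h>
        #include <cstdio>

        int main() {
            cuFileDriverOpen();                              // bring up the GDS driver

            int fd = open("data.bin", O_RDONLY | O_DIRECT);  // O_DIRECT: bypass the page cache
            CUfileDescr_t descr = {};
            descr.handle.fd = fd;
            descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
            CUfileHandle_t fh;
            cuFileHandleRegister(&fh, &descr);

            void* dev = nullptr;
            cudaMalloc(&dev, 1 << 20);                       // 1 MiB device buffer

            // DMA from the SSD into VRAM; no bounce buffer in host RAM.
            ssize_t got = cuFileRead(fh, dev, 1 << 20, /*file_offset=*/0, /*devPtr_offset=*/0);
            printf("read %zd bytes into VRAM\n", got);

            cuFileHandleDeregister(fh);
            close(fd);
            cudaFree(dev);
            cuFileDriverClose();
            return 0;
        }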

  • @mini-_ · 2 years ago · +1385

    Everyone always ask "Why does this GPU?", but never asks "How is this GPU?" 😔

    • @alexdavis9324 · 2 years ago · +4

      Original

    • @_yuri · 2 years ago · +21

      how does this gpu? *

    • @Roriloty · 2 years ago · +11

      What da GPU doing?

    • @grimmel9894 · 2 years ago · +6

      What's tha gpu doing?

    • @writingpanda · 2 years ago

      Lol, I laughed out loud at this.

  • @Steamrick · 2 years ago · +229

    'Why does this' indeed. I think the question the manufacturer asked is 'why not?'.

    • @Worthy_Edge · 2 years ago · +9

      @Czarodziej stronger than my will to click this link and be greeted by cringe

    • @nikoheino3927 · 2 years ago

      @Czarodziej stronger than my will to live after reading your comment.

    • @nobody7817 · 2 years ago

      @@Worthy_Edge I think you meant to say "greeted", but still, that was a HILARIOUS retort!

  • @cleverclever2317 · 2 years ago · +2

    7:58 hello back mr editor

  • @Tsudico · 2 years ago · +74

    I wonder if something like a separate PCIe daughter card, linked with something similar to the SLI or Crossfire interfaces, would have worked better. It wouldn't have shared bandwidth across the PCIe bus but would still have allowed direct access to the installed SSDs.

    • @ryanchappell5962 · 2 years ago · +3

      Hey this seems like a really cool idea.

    • @jeromenancyfr · 2 years ago · +3

      I'm not sure I understand; aren't the SLI and Crossfire interfaces very slow by themselves? The data would move through PCIe... like any NVMe drive

    • @Rakkakaze · 2 years ago · +1

      @@jeromenancyfr I imagine the idea is: the GPU calls for data, it passes over the link, then out through PCIe... the CPU calls for data, it passes through PCIe.

    • @mnomadvfx · 2 years ago

      That eats into compute density though if you want to have several of them per node.

    • @legendp2011 · 2 years ago

      Well, my understanding is that's basically how NVMe DirectStorage drives are going to work (and they already work like that in the PS5)

  • @BurntFaceMan · 2 years ago · +312

    "This is hilarious" : "You can totally run your system without any additional storage as long as you are ok with the overhead of sharing bandwidth with a GPU." Anthony's sense of humour differs from most. Moss would be proud.

    • @saikanzen1762 · 2 years ago · +9

      I read that in Moss' voice and I cannot agree more.

    • @WarPigstheHun · 2 years ago · +5

      Goddamn Moss and his old Fetts..

    • @adriancoanda9227 · 1 year ago

      Ah, if you have PCIe 64x... most-used programs sit in RAM after loading, so the boot disk is barely used; you could use such cards in low-profile setups without chopping up the GPU cooling solution. If it also had a CPU socket to handle the graphics management, that would be something

  • @watercannonscollaboration2281 · 2 years ago · +148

    I learned that this was a thing a month ago when doing research on the WX 2100 and I’m surprised no major tech channel did something funny with it

    • @cool-soap · 2 years ago · +13

      @@joz534 no, run.

    • @cmd8086 · 2 years ago · +3

      Maybe because it is too expensive?

  • @hdrenginedevelopment7507 · 2 years ago · +3

    That kind of reminds me of the old Real3D Starfighter PCI Intel i740 gfx card from wayyyy back in the day. Intel had just released the AGP bus architecture and the i740 was their first foray into the discrete graphics space…probably to help support the launch of AGP 1X, because it wasn’t all that fast otherwise. For the majority of the non-AGP systems, Real3D built a card with an AGP-PCI bridge chip that basically had an AGP bus and dedicated SDRAM AGP texture memory on board, in addition to the i740’s local SGRAM framebuffer RAM like any other graphics card. It was pretty cool at the time. They were sold with 4-8 MB framebuffer plus 8-16 MB AGP texture memory for a max of whopping 24 MB total onboard. They weren’t very fast, but they supported 1024x1024 px texture tile resolution whereas the vast majority of the competition including 3DFX only supported 256x256 pixels max resolution texture tiles. It was slow, but it looked so much better than anything else on the market and helped milk some extra capability from old non-AGP slot systems…perfect tradeoff people like Nintendo 64 players were used to dealing with, lol. 3DFX Voodoo 2 cards had a similar structure with separate RAM for framebuffer and texturing. Ok, now I’m done dating myself 😂

  • @xaytana · 2 years ago · +3

    I'd be curious to see this concept again once m.2 key f finally sees some use; though if we never see a future where there's high bandwidth busses with tight memory timings, essentially combining what GPUs and CPUs like, this concept should be put off to key H, J, K, or L, to not confuse high bandwidth GPU memory with tight timing CPU memory on key f, assuming a future memory standard ever actually makes the switch. Though with how fast devices are becoming, it'd be cool to see a unified memory-storage platform where the only difference is if the chip itself is considered volatile or not, essentially the original concept of Optane on steroids; this would also be cool if there's semi-volatile chips where a sudden shutdown could retain otherwise volatile data.

  • @scorch855 · 2 years ago · +669

    If ML libraries targeted this platform, it seems like it could be a compelling option. Nowadays models are getting so large that even 24 GB of VRAM is not enough. Yes, the performance would undoubtedly be worse using SSDs, but the alternative is not being able to use the model at all.
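
    One present-day way past the VRAM ceiling without SSG-style hardware is CUDA unified-memory oversubscription, where the driver pages data between host RAM and VRAM on demand (Linux, Pascal or newer). A minimal sketch for illustration; the 1.5x factor is arbitrary, and this is not the SSG API:

        // oversubscribe.cu - allocate more managed memory than physical VRAM (illustrative sketch)
        #include <cuda_runtime.h>
        #include <cstdio>

        int main() {
            size_t free_b = 0, total_b = 0;
            cudaMemGetInfo(&free_b, &total_b);

            // Ask for more memory than the GPU physically has; the driver pages
            // it in and out of VRAM as kernels touch it (slow, but it runs).
            size_t want = total_b + (total_b / 2);   // 1.5x physical VRAM
            float* weights = nullptr;
            if (cudaMallocManaged(&weights, want) != cudaSuccess) {
                printf("allocation failed\n");
                return 1;
            }
            printf("oversubscribed: %zu bytes managed vs %zu bytes VRAM\n", want, total_b);

            // kernels can dereference `weights` directly; pages fault in as needed
            cudaFree(weights);
            return 0;
        }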

  • @Hobo_X · 2 years ago · +743

    I honestly wondered if this GPU was actually hiding a secret: that Microsoft had these to base DirectStorage work off of for all these years while they worked on it. Maybe now that it's finally public and AMD has actual tangible research into this, as the product actually exists... well, I don't know... imagine if RDNA3 had this as a surprise feature to work amazingly with DirectStorage?!

    • @nielsbishere · 2 years ago · +51

      This is exactly what's needed to push graphics into a new area; think of the huge scenes you could render

    • @epobirs · 2 years ago · +39

      The beta setup for DirectStorage used Nvidia RTX cards, as Nvidia was already doing work in the same direction for RTX I/O, aimed at the workstation market. Remember, they needed something that was going to work in the PCs people will own in the foreseeable future rather than create something requiring a costly niche hardware design. If Microsoft used them in R&D at all, it was more likely for the Series X/S Velocity Architecture, as a proposed console design was less sensitive to non-standard hardware if the cost was good. Even then, this wasn't very close, as the major component there (and in the PS5) is the controller functionality with the dedicated decompression block. Offloading those operations from the CPU and GPU is a big factor in letting the console perform optimally.
      I strongly suspect that Microsoft and AMD will try to push an open standard for a PC hardware spec that will bring a version of Velocity Architecture to the PC to give DirectStorage the full functionality it has on Xbox. This needs to be a vendor-independent spec to get Intel and Nvidia on board, otherwise it will remain a niche that game developers will be reluctant to use. A recent previous example would be DirectML, which is hardware-agnostic and relies on the drivers to bridge the gap between PCs and vendors of ML-focused hardware. Thus the ML hardware can live in the CPU, the GPU, or a separate device on the PCIe bus; the user doesn't need to know, so long as the driver tells the system what to look for and how to talk to it.

    • @kkon5ti · 2 years ago · +2

      This would be amazing

    • @erdem-- · 2 years ago · +7

      At this point, I think we don't need CPUs. It is cheaper and better (for gaming) to produce all-in-one APU designs, like how the PS5 and other game consoles are designed.

    • @zaidlacksalastname4905 · 2 years ago · +2

      According to some market analysts, top RDNA4 could come with 512 gigs of PCIe Gen 4 memory

  • @Michplay · 2 years ago · +4

    It just amazes me that DirectStorage / RTX IO is taking this long to get a demo to test with

  • @chadlumpkin2375 · 2 years ago · +2

    This reminds me of the Intel math coprocessors for the 286/386 CPUs, before floating-point unit (FPU) processing became the default for all x86 processors. With the 486, Intel introduced the 486DX with the FPU and the 486SX with the FPU disabled.

  • @EvanMorgoch · 2 years ago · +212

    With respect to the random read speeds (1:41): why not test the drives independently of the SSG, or use MP600 drives in the SSG, to get a proper apples-to-apples comparison? The drives' firmware may just be crap and account for why the random speeds don't scale nearly as well.

    • @ThranMaru · 2 years ago · +3

      Ain't nobody got time for that.

    • @flandrble · 2 years ago · +4

      Because driver overhead for RAID increases latency. On AM4 you're losing approx. 30% of your IOPS even if all your SSDs are connected to the CPU and not the chipset. Intel is nowhere near this bad (same with Windows), but it's still a loss.

    • @ayoubboulehfa3932 · 2 years ago · +4

      @@ThranMaru Well, they tested a GPU from 2017 that no one has, so yes, they have time.

    • @bigweeweehaver · 2 years ago · +2

      @@ayoubboulehfa3932 It has nothing to do with time and more to do with uniqueness, to interest the viewer into clicking on the video.

    • @I2obiNtube · 2 years ago · +1

      Because then you'd just be testing drive performance, which wouldn't make sense. It's end-to-end testing.

  • @seireiart · 2 years ago · +39

    "Why does this GPU?!!"
    Great question.

    • @Worthy_Edge · 2 years ago · +3

      Only 11 minutes and there’s already 2 bot replies

    • @seireiart · 2 years ago

      @@Worthy_Edge These bots can't just chill. Can they?!!

  • @user-rd3jw7pv7i · 2 years ago · +9

    I can see ONE specific use case for this. Instead of having a separate SSD enclosure and GPU taking up more than one or two PCIe slots, i.e. one LIQID Honey Badger and one GPU, just use this!
    This card actually makes sense, and I'm sad to see this tech not taking off, because if you know how and why to use it, it's revolutionary!

    • @Craft97pl · 2 years ago

      With DirectStorage, sharing bandwidth with the SSD is no problem. The problem is the GPU itself. In a few years it will suck.

    • @adriancoanda9227 · 1 year ago

      @GoSite Solder a better one, flash a matching firmware, done. Or you could adapt socket-like mountings and replace the GPU as fast as you want. Before the GPU is assembled... how do you think they test it?

  • @jet_aviation · 2 years ago · +2

    *Apple after soldering the RAM onto the CPU with integrated graphics:*
    _The time has come to integrate storage as well._

  • @zoey.steelimus · 2 years ago · +55

    LTT: "Why does this GPU?"
    Me: "Yes, but have you considered HOW the GPU does?"

  • @CoolJosh3k · 2 years ago · +293

    Actually handy for when your motherboard does not have enough M.2 slots.
    You can buy these as just RAID 0 cards that plug in and use a PCIe x4 slot.

    • @WayStedYou · 2 years ago · +30

      They could literally give you more M.2 slots by giving you a PCIe card with M.2 slots

    • @upperjohn117aka · 2 years ago · +35

      @@WayStedYou but those don't look cool

    • @Bobis32 · 2 years ago · +35

      @@WayStedYou As someone who uses an ITX system: when PCIe 5.0 comes out, I really hope something like this comes back, as even GPUs barely use the extra bandwidth of PCIe 4.0. Why not put some M.2 slots on GPUs, especially with MDA coming in the near future?

    • @somefish9147 · 2 years ago

      @@Bobis32 Power and bandwidth.

    • @virtualtools_3021 · 2 years ago · +11

      @@somefish9147 oh yeah bc SSD use sooooooooooooooooooooo much power

  • @keldwikchaldain9545 · 2 years ago · +2

    When I saw that board I thought they were going to have a complex memory controller that'd drive the NVMe drives with the normal DDR memory as a cache, not as literal storage devices sitting on the GPU for fast load times.

  • @beythastar · 2 years ago · +1

    I've been waiting for this video for such a long time! Finally, I can see how it performs!

  • @jseen9568 · 2 years ago · +216

    When PCIe Gen 4 first came out, everyone was saying it wasn't practical because it would never be used fully. I said then that it would be more interesting if you saw instances where multiple uses of a single PCIe x16 slot could take place without any hindrance to performance. This would be one of those scenarios. Not useful, but pretty cool.

    • @BrentLobegeier · 2 years ago · +29

      Couldn't agree more. When someone made a car, everyone said horses were better. Without manufacturers trying things outside of the box we would never progress, and I have no idea why everyone is so against innovation. No one is forcing anyone to become an early adopter of anything, and most things people were skeptical about soon became integral to everyday life. With progression come niche products like this, but at least we can say they are trying.

    • @bojinglebells · 2 years ago · +15

      And now we're up to PCIe 5.0 with Alder Lake... there's even consideration of adjusting NVMe storage standards from 4 lanes down to 2 because of how much bandwidth 4.0 and now 5.0 offer.
      I would love a product like this, if only to gain more NVMe storage without taking up extra slots

    • @CheapSushi · 2 years ago · +4

      @@bojinglebells Same, I love the dual functionality. I get a pretty decent GPU and 4 NVMe slots in two PCIe slots, instead of three if I had to get a separate add-on card. I personally love using up all my 7 slots with lots of cards.

    • @jseen9568 · 2 years ago · +2

      @@bojinglebells And I think about some more niche areas, like small-form-factor PCs and even the NUC Extreme. With the speed and bandwidth increases, these types of compute cards could make for near-instantaneous connections and make those types of products more viable

  • @GuusKlaas · 2 years ago · +48

    Man, from what I recall, this thing was baller for Revit/CAD work. Those workloads needed the entire model in VRAM, and it'd be a massive hurdle to do that over SSD > CPU > MEM > GPU. This was pre-host-bus-controller, which is the 'not as fancy' name for DirectStorage: allowing devices other than the main controller in a PCIe network to take control of another device. Like a GPU just... assuming direct control of an SSD (after some mediation, obviously) to load stuff off it without the big overhead. Obviously since then we also got (first on AMD, later on Intel) SSDs direct on the CPU, rather than a PCH in between (like Intel had until recently, when they figured out that just 16 lanes from the CPU was not enough).

    • @MrTrilbe · 2 years ago · +6

      I was kinda thinking the same, or using it for parallelised ML or big-data applications; it is a workstation card, after all. Running an OpenCL-coded ML algorithm directly from 2TB of fast storage on the GPU... that's a lot of test data.

    • @ProjectPhysX · 2 years ago · +7

      It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-3300 GB/s. So the SSG never really took off.

    • @Double_Vision · 2 years ago · +1

      I occasionally deal with massive scenes and mesh or particle caches in Redshift for Maya, and Redshift could use this for sure! The same goes for trying to use Redshift to render massive print images where Redshift's out-of-core technology could benefit from having all this storage connected directly to the GPU core. No more Out-Of-Memory failures!

  • @mrkezada5810 · 2 years ago · +3

    I think one of the most productive uses for this GPU is to enable fast unified-memory access when programming with OpenCL or something like that. Although that is a really niche and low-level use case, mostly research-focused.

  • @floogulinc · 2 years ago · +1

    Actually, this design is kinda awesome for mini-ITX machines, where storage expansion is very limited and you're already using your only PCIe slot for the GPU.

  • @DasFuechschen · 2 years ago · +212

    I remember the launch event for this at SIGGRAPH. AMD "gifted" some of those cards to RED, which then gave them to some Indian filmmakers who had previously beta-tested the card for animation and editing on one of their movies, if I remember correctly. But TBH, I have more memories of the after-party than the event itself.

  • @fuzzynine · 2 years ago · +43

    Boy, this is awesome. I wish you would show more obscure tech. I feel like I'm watching retro computer channels right now, only with new stuff. :D
    Thanks. This is really awesome!

  • @vectrobe · 2 years ago · +11

    There's also one thing that wasn't really mentioned: they launched EPYC and Threadripper around the same time, which effectively provided the same functionality. This card came from a timeframe when NVMe RAIDs were an amazing concept but the PCIe lanes needed for them were often hard to come by, even on the Xeon and Opteron series

  • @genesisphoenix00 · 2 years ago

    For someone who used to build SFF, this would have been a godsend in 2017; in fact, even now it's still good. I had a Dan Case A4-SFX, and most of its volume is dedicated to the GPU and CPU. Yes, you can cram in three 2.5" drives, but boy, you need custom cables for everything, including the motherboard, CPU, and GPU, to make space for the drives. Even the Lian Li TU105 only has 1-2 drive mounts, and ITX motherboards come with maybe one M.2 on the low end and two on the high end. Having this would have solved so many space issues for me; my Steam library is already 6TB

  • @jfolz · 2 years ago · +20

    Everyone asks "Why does GPU?"
    Nobody asks "How does GPU?"

  • @vgaggia · 2 years ago · +35

    I wonder how it'd work with deep learning stuff, if the memory capacity would outweigh the speed.

    • @ilyearer · 2 years ago · +16

      I was surprised there was no mention of that potential application as well.

    • @ZandarKoad · 2 years ago · +6

      @@ilyearer Same. Seriously looking hard at this card now, since memory size is an upper limit on the types of existing neural nets you can fine-tune. An RTX 3090 has only 24 gigs, compared to this card's 2048 gigs. Yikes.

  • @MrPruske · 2 years ago · +1

    I feel like the only people who could have made use of this were the Slow Mo Guys in 2017. I'd like to see them try to use it now

  • @Sencess · 2 years ago

    0:01 LOL whose idea was that, EPIC intro

  • @kaseyboles30 · 2 years ago · +20

    Adding modular storage to a GPU makes sense if it's directly usable by the GPU itself. A game could preload the textures and models to the storage and use them from there, similar to how DirectStorage works, but potentially faster and with lower latency.

    • @ProjectPhysX · 2 years ago · +4

      It's very interesting for computational fluid dynamics too. Although there are ways to make CFD codes require less VRAM (see the demos on my YT channel), you always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice it's unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.

    • @vamwolf · 2 years ago · +1

      Yes and no. Games still use SDR textures to the point that HD assets are not worth it atm.

    • @kaseyboles30 · 2 years ago · +4

      @@ProjectPhysX For applications like that, where data is rewritten constantly, I think just adding SO-DIMM slots for DDR5 would be ideal. With 4 slots you could add a ton of RAM. Not as fast as GDDR, but good enough to be worthwhile.

    • @ravenof1985 · 2 years ago · +1

      @@kaseyboles30 I feel this is the answer for a lot of GPU applications, from low-budget cards (4GB VRAM not enough anymore? Pop a desktop DIMM in the expansion slot) to the high end: populate all 16+ DIMM slots for maximum AI/machine learning/CFD performance.

    • @aravindpallippara1577 · 2 years ago · +1

      @@ravenof1985 Aye, it would be faster and cheaper in the long run, though you aren't breaking new ground in VRAM unless you're going for an HEDT platform with a Threadripper CPU or something

  • @benjaminlynch9958 · 2 years ago · +48

    Huge use case for AI training. Anything over 80GB of memory means training has to move from GPUs to CPUs today, and that means a slowdown of multiple orders of magnitude. Unfortunately, AMD has never had any real market share in the AI/ML world because their software support - even in 2020 - sucks.

    • @ManuSaraswat · 2 years ago · +5

      how about in 2022?

    • @WisestPongo · 2 years ago

      @@RyTrapp0 ye but intel bad

    • @fernbear3950 · 2 years ago

      Wear-out makes it a nonstarter for training; for inference, though, it could maybe be a monster in the right circumstances.

    • @ProjectPhysX · 2 years ago · +4

      AMD has introduced their new MI250X GPU with 128 GB memory.
      But still you can never have enough memory. I'm working with CFD (see my YT channel), and there it's the same problem: You always want maximum resolution possible. You could do colossal resolutions with TBs of storage. But in practice the SSG is unusable, because even the 14GB/s is extremely slow, and you would rewrite this storage 24/7 which would quickly degrade/break it. With VRAM directly, you can do 400-2000 GB/s. So the SSG never really took off.

    • @ZandarKoad · 2 years ago · +1

      @@ProjectPhysX Thanks, I figured as much. It's a shame. Memory in the TB range truly opens up new possibilities for deep learning.
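
    To put the bandwidth gap in that thread in perspective, on-card memory bandwidth is easy to measure yourself. A crude CUDA sketch (buffer size and iteration count are arbitrary); the device-to-device figure it prints will dwarf any NVMe link:

        // vram_bw.cu - rough VRAM bandwidth check (illustrative sketch)
        #include <cuda_runtime.h>
        #include <cstdio>

        int main() {
            const size_t bytes = 256ull << 20;       // 256 MiB per buffer
            float *a, *b;
            cudaMalloc(&a, bytes);
            cudaMalloc(&b, bytes);

            cudaEvent_t t0, t1;
            cudaEventCreate(&t0);
            cudaEventCreate(&t1);

            cudaEventRecord(t0);
            for (int i = 0; i < 20; ++i)
                cudaMemcpy(b, a, bytes, cudaMemcpyDeviceToDevice);  // traffic stays on the GPU
            cudaEventRecord(t1);
            cudaEventSynchronize(t1);

            float ms = 0.0f;
            cudaEventElapsedTime(&ms, t0, t1);
            // each copy reads and writes `bytes`, so total traffic is 2 * 20 * bytes
            double gbps = (2.0 * 20 * bytes) / (ms / 1000.0) / 1e9;
            printf("device-to-device: %.1f GB/s\n", gbps);
            return 0;
        }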

  • @NdxtremePro · 2 years ago · +7

    I wonder how hard implementing a DirectStorage layer over the API would be.

    • @mnomadvfx · 2 years ago

      Probably easier, because you are cutting out a middleman - though there might be some latency introduced as they communicate with each other.

  • @ShiroKage009 · 2 years ago · +1

    This would have been awesome for things like genomic alignment and similar applications that lost performance due to latency when attempting to utilize GPUs.

  • @abhivaryakumar3107 · 2 years ago · +6

    Ngl, Anthony is my favourite LTT member and it makes me so happy whenever I see his face in a thumbnail :))

    • @LordYamcha · 2 years ago · +5

      Same but these bots goddamnit

    • @abhivaryakumar3107 · 2 years ago · +4

      @@LordYamcha I stg what the absolute fuck is this, commented 2 minutes ago and there are already 2 bots

  • @00kidney · 2 years ago · +6

    Everyone is asking "Why does this GPU?" but I'm just glad to see an upload featuring Anthony.

  • @andrewbrooks2001 · 2 years ago

    Great video and information presentation! Thank you!

  • @MushroomKingdoom · 2 years ago

    Hey, I did not know about this as an option. Great idea!!
    Usually there is enough bandwidth for both components, even on Gen 3 PCIe.

  • @MerpSquirrel · 2 years ago · +10

    I could see this being used for machine learning or data analysis in Microsoft R. Good use case for DirectStorage.

    • @willgilliam9053 · 2 years ago

      train a model with very limited host CPU usage... ya that would be cool

  • @markk8104 · 2 years ago · +6

    Did you try seeing if one of the GPU-powered SQL implementations works well on this? The current issue with those is the data transfer speed with the CPU step involved, so it might be worthwhile trying that.

    • @cheeseisgud7311 · 2 years ago · +1

      It really wouldn't help unless the SQL server used the API this GPU needs for direct file access

  • @jmssun · 2 years ago

    It was used to accelerate large-scale industrial ray tracing and simulation. The industrial scene files (of factories with all their parts) are so large that they usually would not fit in regular RAM/VRAM; having them on an SSD within the GPU makes random lookups into such humongous scenes possible

  • @ianemery2925 · 2 years ago

    Pro tip for small, non-ferrous screws: use a tiny bit of Blu-Tack to stick the screwdriver to the screw head, then a larger blob to retrieve the screw if it stays in the threads and you want it back.

  • @0tool505 · 2 years ago · +5

    I think brands should be more transparent and start answering the consumers why does the GPU do

  • @kevinheimann7664 · 2 years ago · +7

    Would be interesting if such an idea were combined with Optane memory, with a driver using it as second-level RAM

    • @suhendiabdulah6061 · 2 years ago

      What do you mean with Optane? Can Optane store data? Sorry if I am wrong

  • @sexylexy22100 · 2 years ago · +1

    Would be great for CFD-on-GPU workflows. Generally, if you run out of space in RAM, it crashes your multi-day set of calculations and you're left with nothing, so you could do much larger computes with this card

  • @fat_pigeon · 2 years ago

    6:10 Probably the screws are ferrous, but they're stainless steel, which responds only weakly to a magnet. Try sticking a magnet right onto the screwdriver bit; the stronger magnetic field should pick them up.

  • @juniperburton7693 · 2 years ago · +3

    I do like this idea. Would be cool to see it come back. This would be really great, actually, for space confined builds. It seems... unique

  • @thatsgottahurt · 2 years ago · +3

    Hope to see some Direct Storage content soon.

  • @matikaevur6299 · 2 years ago

    I feel old ..
    I remember times when a discrete sound card was essential for gaming... soldering my own LPT DAC... and the bizarre (experimental) situation when my sound card (Sound Blaster AWE32) had more memory than the PC (28MB vs 16MB). The Gravis Ultrasound equivalent (don't remember the exact model) went only to 16MB

  • @JedismyPet · 2 years ago

    Whoever did the ad animation: I love you for adding Saitama

  • @lakituwick7002 · 2 years ago · +5

    After years of searching, I finally understood why this gpu does.

  • @Respectable_Username · 2 years ago · +3

    Interesting that there's no discussion of what benefit this could bring to ML on large datasets. Is it that the SSDs being that close doesn't provide enough of a benefit to data transfer speeds, or is it the price being too expensive for those doing ML research at places such as universities?

  • @thesix______ · 2 years ago

    7:55 thx editor

  • @SuperLarryJo · 2 years ago

    So good to see Anthony back on Anthony Tech Tips

  • @Owenzzz777 · 2 years ago · +3

    This was the first GPU with M.2 slots, but definitely not the only one today. NVIDIA EGX-A30/40/100 are newer ones designed for a completely different purpose, although technically they are NICs with a GPU, an ARM SoC, and an M.2 SSD slot.

  • @Sweenus987 · 2 years ago · +3

    They should add a 1TB SSD directly to the board and have it used as a longer-term cache that stores data from the multiple applications that load things into GPU memory, then load it from this storage into global memory when needed, instead of going through the CPU at all

    • @mnomadvfx · 2 years ago

      Once a new memory tech comes along that is less power/heat intensive they may just add it directly to the chip packaging ala HBM.
      In theory they could already just add it to that, but even the much higher endurance SLC NAND has wear limits.
      You don't want to bolt memory that can wear out directly onto the packaging of the processor.

  • @NikoNikolaev · 2 years ago

    The SSG can be used for rendering large scenes in 3ds Max + V-Ray. Some scenes can easily consume above 100GB of RAM if V-Ray is set to use the CPU; once set to GPU, those 100GB can be relocated to the VRAM/M.2s in the SSG's case.

    • @adriancoanda9227 · 1 year ago

      Ah, why would you buy such a graphics card then? 🤔 The AXAGON PCEM2-S PCIe NVMe M.2 adapter is a PCI-Express x16 adapter for connecting an NVMe M.2 SSD to your computer

  • @pkt1213 · 2 years ago

    Pretty cool. I wonder if one could use this to boost raster and lidar analysis. Load and process through the GPU. Hmmm

  • @JorgeMendoza-qx5bp · 2 years ago · +6

    Video Idea
    Could we get an updated video for 2022 of your
    "3D Modeling & Design - Do you REALLY need a Xeon and Quadro??" video.
    A cheap computer for 3D CAD modeling.

    • @commanderoof4578 · 2 years ago

      Blender + EEVEE = you need a potato and will still render multiple minutes of frames before something such as 3DS max even does a dozen

  • @CharcharoExplorer · 2 years ago · +4

    5:35 - That is not true. HBM2 is still connected by a 1024-bit memory bus per stack. It's just that 2 stacks of HBM2 = 2048-bit, while 2 stacks of HBM1... also mean a 2048-bit bus. They are exactly the same here. HBM2 brought much higher capacities, higher speeds, and lower latencies; it didn't change the connection it had. The Radeon VII and the R9 Fury, for example, are both 4096-bit machines; one just has 16GB of HBM2 while the other has 4GB of HBM1.

    • @bsadewitz · 2 years ago · +1

      Reading your post, for some reason I recalled this:
      en.m.wikipedia.org/wiki/Personal_Animation_Recorder
      I had the PC version. It used its own dedicated IDE bus, had its own framebuffer, etc. Upon its release, there was only one HDD that was capable of the sustained throughput required. The images also don't quite convey how huge these cards were. It is probably the heaviest PC expansion card I have ever handled.
      It did not compress the video whatsoever, and could not use the system's bus/IDE controller--too demanding. Furthermore, IIRC the video was stored as still images, one frame per file. I don't recall whether it used FAT or a proprietary filesystem. It was primarily intended for playing back 3d animation, but you could use it for whatever you wanted. I think it cost at least $1000US.

  • @PostalTwinkie · 2 years ago

    Reminds me of 3DFX's "upgradeable" GPU they were working on in the 90s.

  • @AlphaCygni · 2 years ago · +2

    Just like the old days, when my AWE32 (ISA) sound card had dedicated slots for RAM on it. Even before I watch this video, I would suggest it's because of DirectStorage.

  • @LycanWitch · 2 years ago · +2

    I'd imagine this is where PCIe Gen 4, or especially Gen 5, could have shined if this concept had kept going to the present. No worries about sharing bandwidth with the GPU, as there is plenty to go around; far more than the graphics card and M.2 drives combined could saturate.

  • @writingpanda · 2 years ago · +33

    Anthony is fantastic. Just wanted to say he's doing an excellent job with these videos. Kudos, Anthony!

    • @board2death · 2 years ago · +4

      Agreed Anthony is the man!

    • @TopGearScooter · 2 years ago · +3

      I subscribe because of Anthony

    • @z31Joshyman · 2 years ago · +2

      Anthony is the #techgod a fuckin goat.

  • @sebeth · 2 years ago

    Can you test if the storage will work on a hypervisor like ESX (as a datastore) and then pass the GPU to a VM? I have a very specific use case where I want both of these things, but I'm out of PCIe ports...

  • @npcwill283 · 2 years ago

    Always felt like you taught the master class on LTT! I watch Linus for novelty, but I feel like I have to study to absorb your information!

  • @frknaydn · 2 years ago · +4

    Main use case could be AI research. When we run our applications, it sometimes takes too much time to load files for training; this way it could be a lot faster. I wish you guys tested more than just games. Computers are not just game platforms. Please add some software development tests as well: compile a Node.js or Go program, run simple AI training.

  • @lukaaleksic9284 · 2 years ago · +3

    LTT always brings a smile to my face.

  • @nohomeforfreepeople2894 · 2 years ago

    Anthony, is there any speed difference from a PCIe x16 or x8 drive adapter? I've had luck on older systems using the OWC Accelsior S (intended for Macs) to get good load times, better than the onboard SATA. So, would these beat out an M.2 adapter card for the same niche cases? And I have to wonder, what would those do once the MS API is finally out?

  • @phillee2814 · 2 years ago

    If they'd rolled out proper driver support for that wee beastie on all OSes, it could have been awesome. With a Hyper M.2 card in another 16-lane slot and some nice big NVMes in both, you could have a mirrored pair of 32TB VDEVs, with maybe parity added by having a couple on the motherboard as well. Downsize each by a wee bit with partitions to allow for L2ARC, SLOG, and boot partitions to be split between lanes/devices, mirrored and striped for the bits you want (so not swap; that could be stripe-only). Stuff it with fast RAM and a decent processor and you have a heck of a graphics workstation or gaming rig. All for the lack of decent driver support, which, if it came with source code for the Linux drivers, would be easy for game or video software developers to hook into.

  • @IvanpilotNX1 · 2 years ago · +11

    AMD, when creating this GPU:
    AMD: Hmmm... we need a different GPU, something different.
    That one worker: Boss, what if we combine storage with a GPU?
    AMD: Hmmm... that idea is... PERFECT. A raise for you, James, good job 😀

  • @VoteOrDie99 · 2 years ago · +3

    Add a CPU and power supply to the card and it's essentially a gaming console/computer on a card. I wonder what potential this brings

  • @stefanhoffmann8417 · 2 years ago

    8:55 I once fractured my fingertip on one of these "pull tabs"

  • @Amazon_Finds_2k · 11 months ago

    I want to see gameplay footage of this, but I can't find it anywhere

  • @gertjanvandermeij4265 · 2 years ago · +38

    Would still love to see a GPU with some sort of GDDR slots, so everybody can choose their own amount of VRAM!

    • @coni7392 · 2 years ago

      It would be amazing to be able to add more VRAM to my card

    • @psycronizer · 2 years ago · +3

      @@coni7392 Why? Your GPU can only access and throw around so much data, and oddly enough GPUs are tailored exactly to how much RAM they have. It might be useful for static images at high res, but high frame rates at higher res? Not so much.

    • @oiytd5wugho · 2 years ago · +3

      The expense in no way justifies the benefit. The only thing you'd get is limited upgradability. GPUs have a highly specified memory controller, basically supporting a few variations in volume; like, a chip might support 4, 8, and 16 gigs of discrete memory ICs, each holding 512MB, and nothing else

    • @lazertroll702 · 2 years ago

      @@psycronizer assuming dynamic physical ram size and that firmware binds the addresses on init, would there really be no advantage in gaming, like having more loaded pre-render objs or prefetch code?
      it seems that the disadvantage is letting the os treat them as global instead of driver-exclusive/defined fs ..? 🤨

    • @psycronizer · 2 years ago

      @@lazertroll702 Not really; transfer speeds from non-display storage to the framebuffer are really a non-issue now, so at some point adding more RAM just makes for higher cost with no benefit

  • @490o · 2 years ago

    I like how you guys kept that title

  • @RevolverOcelotMGS2 · 2 years ago

    This is the first actual review of the SSG in the five years since it was announced; every other article about it just parrots the AMD PR sheet from the announcement. It uses a custom API to treat the NAND flash as an extension of the VRAM pool. The only adopter of that API was Adobe Premiere.
    In theory, it's a huge benefit for some workloads where larger VRAM pools are required, as long as the VRAM extension doesn't need to be as high-performance.
    In practice, it was only really practical back in 2017, when VRAM was both very expensive and small in capacity. The NAND flash was just a cheaper way to expand the memory pool, something that AMD was not able to do traditionally with HBM. There were many solutions, even back then, that could render this GPU unnecessary for most of the workloads it was targeted at.

  • @kandmkeane · 2 years ago · +23

    This GPU has always made me want to know if Intel's Optane M.2 drives could be used. Would they even work? Would there be any use cases for that? Any benefits?
    Probably not, but it's just such an interesting opportunity to experiment with mixing different computer technologies...

    • @christopherschmeltz3333 · 2 years ago · +1

      I haven't used Optane much, but the technology is fundamentally more like non-volatile RAM, with higher performance and endurance but a fraction of the capacity of a comparably priced NAND flash SSD. It's most commonly effective when utilized as a hybrid cache layer, like how it's built into the Intel H10 and H20. I don't think data-center-grade M.2 drives have reached 1.5TB yet; last time I noticed, that capacity was reserved for special-use server DIMMs!
      Therefore, I expect Optane would mostly function in this SSG, but the benefit of upgrading would probably just be how long the M.2 would last with medium-sized but frequently changing data sets before wearing out, and whether you're using its API so as not to get performance-bottlenecked elsewhere. Perhaps use the API to write your own storage subsystem using two Optane 64GB 80mm cache drives and two high-capacity 2TB 110mm storage drives... but I'm not aware of when an ordinary M.2 RAID card feeding multiple compute GPUs wouldn't be more practical.

    • @mnomadvfx · 2 years ago

      @@christopherschmeltz3333 Exactly.
      Optane phase-change memory tech hasn't even breached 4-layer device limitations, while the state of the art in 3D V-NAND is already up to 170 layers and counting.
      In short, Intel bet on the wrong tech foundations to build Optane upon; it simply isn't well suited to 3D scaling, which is a necessity for modern memory, as area scaling is already reaching problematic limits.

    • @christopherschmeltz3333 · 2 years ago

      @@mnomadvfx Intel fabrication definitely bet on the wrong 10nm tech, but Optane will probably hold onto a smaller niche than planned until hardware RAM disks make a comeback. You know, like the Gigabyte i-RAM back in the DDR1 era... there are newer and older examples, but Gigabyte's seems to have been noticed by the most PC enthusiasts and should be simple to research.

  • @undeadlolomat8335 · 2 years ago · +4

    Ah yes, ’Why does this GPU?’ 😂

  • @thirdpedalnirvana · 2 years ago

    The Quadro was a CAD card, set up for high-density CAD modeling without the shader hardware, since it's not for games. It would be cool to see the Radeon SSG's 3D CAD performance (or Revit).

  • @sirfer6969 · 2 years ago

    Love your work Anthony, keep it up =)

  • @o9mb · 2 years ago · +3

    Damn

  • @tvollogy · 2 years ago · +3

    "Why does this GPU?"

  • @SylpheedW · 2 years ago

    Are there any modern graphics cards that use additional SO-DIMM memory? I remember those Matrox AGP cards

  • @cestialfall84 · 2 years ago

    I love watching Linus Tech Tips while not understanding a single thing, yet enjoying it

  • @martinlagrange8821 · 2 years ago · +6

    Well... I would have a use for it. When running TensorFlow through a GPU as a coprocessor for neural networks, the SSG would result in supercomputer performance for complex multi-level networks. It's not for apps & games - it's for AI!

  • @chrcoluk · 2 years ago · +3

    From a flexibility standpoint this is amazing: GPUs integrating their own M.2 slots and sharing the 16 lanes is awesome. It also makes it easier to change SSDs, as you can just remove the GPU from the case to work on them easily.

    • @mnomadvfx · 2 years ago

      In the case of DirectStorage use it makes even more sense - not to mention you can cool the M.2 drives better with a full GPU cooler (albeit the SSG cards are passively cooled, server form factor).

  • @bertoonz · 2 years ago

    Hey, you're rocking this video, man! Nice hosting

  • @nohomeforfreepeople2894 · 2 years ago

    I want this for my vector editing programs. Loading large graphics files would make my print processing and RIP so much faster (if Corel, Adobe, and Roland would take advantage of it)

  • @TheOnlyTwitchR6 · 2 years ago · +7

    If I had to guess, this didn't take off because we didn't have OS-level direct access to GPU storage.
    I really want it to become normal in the future to throw M.2s onto the GPU