Cerebras @ Hot Chips 34 - Sean Lie's talk, "Cerebras Architecture Deep Dive"

  • Published 22 Dec 2024

COMMENTS • 22

  • @piscocuk2011
    @piscocuk2011 9 months ago +1

    00:04 Cerebras aims to revolutionize AI compute with a co-designed architecture
    02:06 Architecture focused on neural networks
    06:25 Memory bandwidth enables full performance in neural network computation.
    08:36 Cerebras core hardware architecture flexibility
    13:08 Cerebras chip has 84 die with 850,000 cores on a single 300mm wafer.
    15:27 Homogeneous array of cores across the wafer for unprecedented fabric performance
    19:21 Cerebras architecture utilizes dataflow mechanisms for weight computations
    21:12 Single chip enables high-performance neural networks
    25:02 Scalable clustering and wafer-scale chips enable large model access to everyone
    Crafted by Merlin AI.

  • @centuriomacro9787
    @centuriomacro9787 2 years ago +5

    Very interesting presentation, thx

  • @whyjay9959
    @whyjay9959 1 year ago +4

    Hi. There's something that a few people were wondering about: why is the Wafer-Scale Engine square, when it looks like there's room for ~28 more complete, attached tiles?

    • @CerebrasSystems
      @CerebrasSystems 1 year ago +7

      It's a good question! The answer is rather prosaic, we're afraid. If the WSE weren't rectangular, power delivery, I/O, mechanical integrity and cooling would become much more difficult, to the point of impracticality.
      Take a look at the virtual teardown on our website and you may get a feel for some of these challenges: www.cerebras.net/cs2virtualtour
      The upshot is that a mere 850,000 cores will just have to suffice. ;)

    • @whyjay9959
      @whyjay9959 1 year ago +2

      @@CerebrasSystems I think I get the idea, thanks.

    • @AbeDillon
      @AbeDillon 5 months ago +1

      ​@@CerebrasSystems Would it be possible to lop off some of those edge tiles to make mini engines?

  • @JoeLion55
    @JoeLion55 9 months ago

    Re: the die-to-die interface at about 15:15.
    You mentioned you use an upper metal layer to cross the scribe lines between the dies. What does the reticle look like for this? Is this a regular mask, but the alignment for the mask is just offset so it straddles the scribe lines for the rest of the wafer? Is this something TSMC does regularly for other products? Or is this a new process to have reticles on the same wafer that don't align on top of each other?

  • @christopherkeates4147
    @christopherkeates4147 3 months ago

    Incredible work. How do you scale a trained model down so that you can put it in something smaller and run inference in real time to control a system?

  • @CaseyKoh
    @CaseyKoh 3 months ago

    What is the yield of that wafer, sir? Thank you.

  • @RalphDratman
    @RalphDratman 1 year ago +2

    Is the CS-2 used only for training?
    Will a time come when, for massively concurrent inference, this architecture will be applicable?

    • @CerebrasSystems
      @CerebrasSystems 1 year ago +2

      Hi Ralph, good question. The vast bulk of our customers have used our systems for training LLMs or for HPC applications.
      We have had a couple of projects using it for inference, like one with Lawrence Livermore National Laboratory where they offloaded an unwieldy inference step from many nodes of their Lassen supercomputer to one of our systems. You can read the case study here: www.cerebras.net/cerebras-customer-spotlight-overview/spotlight-lawrence-livermore-national-laboratory/
      But in principle, our architecture should make a terrific concurrent inference platform because we can run many model instances (hundreds or even thousands, depending on the model) in parallel across our massive array of cores.

  • @xeusai
    @xeusai 8 months ago

    Was wondering if MemoryX is actually an independent device outside of the WSE-2 wafer? The fact that it has better sparse performance at the hardware level is very interesting.

  • @xeusai
    @xeusai 8 months ago

    I didn't catch much about the routing protocol or how the dies actually communicate on the WSE-2. You guys have a lot of things going on, congratulations 🎊 😊

  • @808bigisland
    @808bigisland 2 years ago +2

    Aloha and thanks! Way to go! Just imagine what you will be doing ten years from now! Do you have a public roadmap?

    • @CerebrasSystems
      @CerebrasSystems 2 years ago

      Thanks, 808 Big Island! Sadly, no public roadmap. You'll just have to keep watching!

  • @WoodyDataAI
    @WoodyDataAI 3 months ago

    Super fast, lightning-speed AI system. Great!

  • @billykotsos4642
    @billykotsos4642 2 years ago +3

    👀👀👀👀👀👀

  • @hg6996
    @hg6996 2 months ago +1

    If this WSE is really that good, why is nobody talking about Cerebras AI while Nvidia is still printing money?

    • @jhockey11liu91
      @jhockey11liu91 2 months ago

      Because they are f-u-c-k up

    • @Marqui17
      @Marqui17 2 months ago

      Because today's biggest models don't fit on one Cerebras chip

    • @hg6996
      @hg6996 2 months ago

      @@Marqui17 Hmm. So it's not possible to put together more of them in order to make the models fit on such a system?

    • @Marqui17
      @Marqui17 2 months ago

      @@hg6996 I guess you should be able to interconnect them and split the model across them, but then you are introducing the same complexities Nvidia has, taking away Cerebras' main advantage