The Next 100x - Gavin Uberti | Stanford MLSys #92

  • Published 27 Feb 2024
  • Episode 92 of the Stanford MLSys Seminar Series!
    The Next 100x - How the Physics of Chip Design Shapes the Future of Artificial Intelligence
    Speaker: Gavin Uberti
    Abstract:
    Moore's law is slowing down, but AI models are rapidly getting bigger. Why exactly is this happening? How did chip designers deal with it in the past? Why is it happening unevenly across transistors, wires, and memory? And how can AI designers avoid fighting these physical limitations and work with them instead?
    Bio:
    Gavin is the founder of Etched, a company making highly specialized AI chips for Transformers. Before founding Etched, Gavin studied math at Harvard and worked for Xnor and OctoML building AI compilers like Apache TVM. His interests lie in AI scaling laws, watermarking and watermark detection, and in the interaction of chip design with the above topics.
    --
    Stanford MLSys Seminar hosts: Avanika Narayan, Benjamin Spector, Michael Zhang
    Twitter:
    / avanika15
    / bfspector
    / mzhangio
    --
    Check out our website for the schedule: mlsys.stanford.edu
    Join our mailing list to get weekly updates: groups.google.com/forum/#!for...
    #machinelearning #ai #artificialintelligence #systems #mlsys #computerscience #stanford
  • Science & Technology

COMMENTS • 11

  • @jaytau
    @jaytau 4 months ago +7

    Would it be possible to use an external mic for the speaker and for the person who asks the question?
    It's quite challenging to hear.

  • @vicaya
    @vicaya 4 months ago +1

    37:40, as you already realized, LLMs (and the transformer architecture in general) are memory constrained, so the extra FLOPS are wasted until TSMC productizes SOT-MRAM. Groq with SRAM is a more realistic short-term approach for small models.
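
    A quick back-of-envelope roofline check makes this memory-bound claim concrete (editor's sketch; the model size and the H100-class chip figures are assumptions, not from the talk):

    ```python
    # During autoregressive decode, every generated token streams all the
    # weights from memory (~2 FLOPs per 8-bit weight read), so compare the
    # time to move the bytes against the time to do the math.

    weights_bytes   = 70e9        # assumed 70B-parameter model at 8 bits
    flops_per_token = 2 * 70e9    # ~2 FLOPs per parameter per token

    hbm_bandwidth = 3.35e12       # ~3.35 TB/s, roughly an H100 SXM
    peak_flops    = 1.98e15       # ~1979 TFLOPS dense FP8 on that part

    t_memory  = weights_bytes / hbm_bandwidth    # time to stream weights
    t_compute = flops_per_token / peak_flops     # time to do the FLOPs

    print(f"memory-bound floor:  {t_memory * 1e3:.2f} ms/token")   # ~20.9
    print(f"compute-bound floor: {t_compute * 1e3:.4f} ms/token")  # ~0.07
    # At batch size 1 the memory time is ~300x the compute time, so the
    # extra FLOPS sit idle until batching or faster memory closes the gap.
    ```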

  • @sucim
    @sucim 4 months ago +1

    Very interesting and well presented!

  • @LazyDotDev
    @LazyDotDev 7 days ago +2

    Great talk, but why didn't anyone ask questions about competition? What is to prevent Nvidia, AMD, or Intel from producing niche chips like this? With their R&D teams, quality-assurance systems, warranties, and supply chains, they have likely thought of this, and if not, they should be able to deploy a more competitive and reliable solution fast.
    That being said, I really appreciate Gavin breaking down the history here; I learned a lot of new things.

    • @manonamission2000
      @manonamission2000 15 hours ago

      Corporations tend to move slowly... it is less expensive (relatively, in money and time) for a nimble company to attempt to innovate like this... also, the gamble is that the Sohu platform becomes so appetizing that it ends up as an acquisition target... again, both are simply bets... not without risk

    • @LazyDotDev
      @LazyDotDev 14 hours ago

      @@manonamission2000 Sure, you could argue some leaders like Blockbuster moved slowly when the rising leader Netflix transitioned to online and on-demand content.
      However, unlike on-demand streaming services, gen AI is the most revolutionary technology of our time, and if this direction were so promising and yet as simple as creating a niche chip focused solely on transformers, then you'd think Intel and AMD, with their massive R&D teams, would already be doing it to get an edge on Nvidia.
      These serious business questions should have been asked. I'll do more research, but it's hard to take any of this seriously if such a basic question could not be asked/answered.

  • @georgehart5182
    @georgehart5182 4 months ago +3

    It's cool, but this is going to be a long road. The main problem is software at the IR level (e.g., CUDA), not necessarily hardware. There are many companies that can make interesting transistor permutations, and they have been doing it for a long time without magically "accelerating superintelligence". This is a software ecosystem problem more than anything else. Good luck.

    • @mainecoonandragdoll
      @mainecoonandragdoll 4 months ago +1

      The performance difference between GPGPU and a DSA is not easy to evaluate when you optimize the DSA for certain applications, because software remains the biggest obstacle. NVIDIA has emphasized its software stack since the acquisition of the Portland Group, which helped make CUDA friendly to software engineers. But the costs of operating may exceed the benefits of CUDA in the future. I still believe that hybrid-bonded LPDDR, together with the adoption of CXL, would be the more feasible path to eventually replace HBM.
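
      For rough scale on that HBM-vs-LPDDR tradeoff, here is an editor's sketch using approximate per-package figures from published specs (the exact numbers are assumptions, not vendor-verified):

      ```python
      # Compare how many memory packages are needed to hit an H100-class
      # aggregate bandwidth with HBM3 stacks versus LPDDR5X packages.

      hbm3_stack_gbs  = 819   # ~819 GB/s per HBM3 stack (1024-bit @ 6.4 Gb/s)
      lpddr5x_pkg_gbs = 68    # ~68 GB/s per x64 LPDDR5X package @ 8533 MT/s

      target_gbs = 3350       # ~3.35 TB/s, an H100-class aggregate

      print(f"HBM3 stacks needed:      {target_gbs / hbm3_stack_gbs:.1f}")   # ~4
      print(f"LPDDR5X packages needed: {target_gbs / lpddr5x_pkg_gbs:.0f}")  # ~49
      # LPDDR only competes if hybrid bonding / CXL can attach dozens of
      # cheap packages with enough links -- exactly the bet described above.
      ```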

  • @nauy
    @nauy 1 month ago

    Nice history lesson. Nothing about the ‘next 100x’ promised in the title.

  • @briancase6180
    @briancase6180 8 days ago

    Dude, you're at Stanford; I think students know what an inverter does. This was an ML seminar talk? How? And how did this have anything to do with the topics explicitly raised in the abstract? Just asking... And BTW, HBM isn't the only type of memory that's relevant, especially for inference, which is, after all, the focus of his company.

  • @mainecoonandragdoll
    @mainecoonandragdoll 4 months ago +1

    Following all these comments, I believe the AI bubble is destined to burst within a few years. The majority of DSAs/GPGPUs on the market today were developed for DNNs long before transformers became the norm. To reach optimal performance, the memory system has to be split between stationary weights and the hot data, the KV cache. It would be too optimistic to believe that hundreds of Groq chips, each with 230 MB of on-die memory, could match the throughput of a single 8-GPU DGX platform without serious consideration of the memory bound and of the interconnect bandwidth transformers require.
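
    The "hundreds of Groqs" figure follows from simple capacity arithmetic; a minimal sketch, assuming a 70B-parameter model at 8 bits and a rough KV-cache size (both are assumptions; the 230 MB per chip is from the comment above):

    ```python
    # How many 230 MB SRAM chips does it take just to hold the model state?

    sram_per_chip = 230e6      # 230 MB on-die SRAM per chip (from comment)
    weights       = 70e9       # assumed 70B parameters at 8 bits = 70 GB
    kv_cache      = 10e9       # assumed ~10 GB of KV cache under load

    chips = (weights + kv_cache) / sram_per_chip
    print(f"chips needed just for capacity: ~{chips:.0f}")  # ~348
    # Every token's activations must then cross chip boundaries, so the
    # interconnect, not FLOPS, sets the throughput ceiling.
    ```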