LPUs, NVIDIA Competition, Insane Inference Speeds, Going Viral (Interview with Lead Groq Engineers)

  • Published 26 Jun 2024
  • This is an interview with Andrew Ling (VP, Compiler Software) and Igor Arsovski (Chief Architect and Fellow) from Groq. We cover topics ranging from the founding story to chip design and manufacturing and so much more. Plus, they reveal how Groq's insane inference speed can generate much better quality from existing models!
    Check out Groq for Free: www.groq.com
    Join My Newsletter for Regular AI Updates 👇🏼
    www.matthewberman.com
    Need AI Consulting? ✅
    forwardfuture.ai/
    My Links 🔗
    👉🏻 Subscribe: / @matthew_berman
    👉🏻 Twitter: / matthewberman
    👉🏻 Discord: / discord
    👉🏻 Patreon: / matthewberman
    Rent a GPU (MassedCompute) 🚀
    bit.ly/matthew-berman-youtube
    USE CODE "MatthewBerman" for 50% discount
    Media/Sponsorship Inquiries 📈
    bit.ly/44TC45V
  • Science & Technology

COMMENTS • 274

  • @Batmancontingencyplans
    @Batmancontingencyplans 3 months ago +72

    Matt is flying high, kudos buddy for landing this interview!

  • @MrLargonaut
    @MrLargonaut 3 months ago +75

    Grats on landing the interview!

  • @kamelirzouni4730
    @kamelirzouni4730 3 months ago +26

    Matt, thank you so much for the interview. You addressed many questions I was eager to understand. The point that truly astounded me was how inference affects model behavior, significantly enhancing response quality. This is a game-changer. Groq has managed to combine speed and quality. I'm eager for it to become widely available and to have the opportunity to run it locally.

  • @maslaxali8826
    @maslaxali8826 3 months ago +22

    I was not expecting this... Wow bro, natural interviewer.

  • @andresprieto6554
    @andresprieto6554 3 months ago +10

    I am only 11 minutes in, but I love how passionate and knowledgeable Igor is about his industry.

    • @IGame4Fun2
      @IGame4Fun2 3 months ago

      He's as fast as Groq, saying "yea, yea, good..." before the question is finished 😂

  • @alelondon23
    @alelondon23 3 months ago +10

    Well done, Matthew! Great interview.
    These guys at Groq are crushing it! Great attitude, OOTB thinking, hard work, letting their delivery speak for itself. A very refreshing alternative to the typical over-hyped promises of vaporware. Thank you, Groq!

  • @GuidedBreathing
    @GuidedBreathing 3 months ago +10

    15:40 Holy grail of automated vectorizing compilers, threading, multi-core synchronization... peak performance, kicking the compiler to the side under the hood for finance applications... Great interview thus far ☺️ The repeating loop for reasoning is in hardware on the Groq chip; yep, that makes things a lot faster and very exciting, to repeat itself for the reasoning 👏👏👏 good job

  • @Alice8000
    @Alice8000 3 months ago +9

    Nice work Groq boys!

  • @74Gee
    @74Gee 3 months ago +9

    When asked about running locally on a cellphone they skillfully avoided the fact that you need a rack of chips for inference - although working as an integrated system, the 500+ tokens per second come from around 500+ chips.

    • @ritteradam
      @ritteradam 3 months ago

      Actually Igor answered honestly before Andrew took over: SRAM cells are much bigger than DRAM cells, so it's not a good idea for LLMs.

    • @74Gee
      @74Gee 3 months ago +1

      @ritteradam There are pros and cons. SRAM uses less power and produces less heat, so it's a good fit.
      The simple honest answer is you need hundreds of Groq chips, so it's not viable for personal computing. But that would be a hype-killer, wouldn't it?

    • @MDougiamas
      @MDougiamas 3 months ago

      Well, but remember what they have is on 14nm... new chips are being designed for 2nm... Groq 3 might be vastly more portable and powerful.

  • @justinIrv1
    @justinIrv1 3 months ago +3

    Incredible interview! Thank you all.

  • @aiAlchemyy
    @aiAlchemyy 3 months ago +12

    That's some amazing, valuable content.

  • @aaronpitters
    @aaronpitters 3 months ago +5

    Great interview! So the innovation to create a simpler design and faster chip came because they didn't have the money to hire people to create a traditional chip. Love that!

    • @albeit1
      @albeit1 3 months ago

      Constraints force people to innovate. The obstacle is often the way.

  • @torarinvik4920
    @torarinvik4920 3 months ago +9

    Awesome, please do more of these "expert interviews" if you can :D

  • @adtiamzon3663
    @adtiamzon3663 3 months ago +2

    😍 Matt, such an interesting and informative interview with the lead Groq engineers, Igor and Andrew! Their presentation was easy to comprehend, indeed. Thank you, guys. Keep it simple and relatable. Keep innovating. 🌞👏👏👍💐💞🕊

  • @NahFam13
    @NahFam13 3 months ago +2

    THIS IS THE CONTENT I WANTED TO SEE!!
    Dude I literally complained about a video you made and you have NO idea how happy it makes me to see you doing this interview and asking the types of questions I would ask.

  • @nicolashuray1356
    @nicolashuray1356 3 months ago +1

    Just wow! Thanks Matt, Andrew and Igor for that incredible interview about Groq architecture. I'm just fascinated by the beauty of that design and all the use cases it's gonna unlock!

  • @albeit1
    @albeit1 3 months ago +2

    The traffic scheduling analogy is interesting. Each vehicle in every moment occupies a particular space, and no other vehicle can occupy it. If you can schedule all of them and every pedestrian, you can maximize throughput.
    That also reminds me of one reason service-oriented architectures work. Small web requests and small vehicles both get out of the way a lot faster. Two herds of mopeds crossing paths can do that a lot faster than two trains.

  • @gynthos6368
    @gynthos6368 3 months ago +26

    I just realised, you look like Jon from Garfield

    • @RX-8GT
      @RX-8GT 3 months ago +2

      lol for real

    • @zallen05
      @zallen05 3 months ago +2

      GOAT comment

    • @howardelton6273
      @howardelton6273 3 months ago

      I can't unthink that now haha

  • @SirajFlorida
    @SirajFlorida 3 months ago +4

    Wow, great job on this interview. I've been really excited about Groq. Thumb clicked. LoL

  • @PLACEBOBECALP
    @PLACEBOBECALP 3 months ago +8

    I think Matt was having the best day of his life talking to these two guys; I don't think that smile left Matt's face for the entire interview. Great interview. About time someone asked some questions that matter, instead of the parroted repetition of "when will this and that be ready, is it AGI, will robots call me nasty names behind my back?"

    • @matthew_berman
      @matthew_berman 3 months ago +5

      Lol. Indeed I was having a blast!!

    • @PLACEBOBECALP
      @PLACEBOBECALP 3 months ago +2

      @matthew_berman Ha ha, me too man... well, until my jaw hit the floor when he described the architecture of the chip at the smallest scale: 10,000 transistors fit in a single blood cell, and they need to use extreme ultraviolet light... it truly blew my mind. Do you know if Moore's law allows for an additional reduction in scale, or is 4nm the limit? If so, I assume the technology to build chips atom by atom must have been going on in the background for years in preparation for this long-understood inevitability?

    • @Maelzelmusic
      @Maelzelmusic 3 months ago

      To my understanding, you can go smaller, down to 2 or 3 nm, but there's a point where the size gets so small that you enter the quantum realm of wave/particle nature, and then you get other problems, mainly related to cooling and interpretability of results. I'm just going by memory here, but you can research further in Perplexity or other types of search. It's a very interesting topic. PS: Marques Brownlee/MKBHD has a great video on quantum computers actually.
      Cheers.

  • @TheJohnTyra
    @TheJohnTyra 3 months ago +4

    This is fantastic Matt!! 🎉 Really enjoyed the technical deep dive on this hardware architecture. 🤓💯

  • @joe_limon
    @joe_limon 3 months ago +13

    This is the single greatest interview I have seen this year

    • @matthew_berman
      @matthew_berman 3 months ago +1

      Thank you Joe!

    • @joe_limon
      @joe_limon 3 months ago +1

      @matthew_berman I think the AI they described at the end could finally reliably answer the "how many words in your next response?" question.

  • @nuclear_AI
    @nuclear_AI 3 months ago +2

    In the context of computing and chips, when folks talk about a 7 nm or 5 nm process, they're referring to the size of the smallest feature that can be created on a chip. Smaller nanometer processes mean more transistors can be packed into the same space, leading to more powerful and efficient chips.
    Imagine you have a meter stick. It's about as long as a guitar, or a bit taller than a large bottle of soda. That's our starting point: one meter.
    - Meter (m): our starting point. Picture it as the height of a guitar.
    - Decimeter (dm): divide that meter stick into 10 equal parts, and each part is a decimeter. Think of it as the length of a large notebook, or a bit shorter than the width of your keyboard.
    - Centimeter (cm): take one of those decimeters and chop it into 10 smaller pieces. Each piece is now a centimeter, roughly the width of your fingernail or a large paperclip.
    - Millimeter (mm): slice a centimeter into 10 tiny slivers and you get millimeters. That's about the thickness of a credit card or a heavy piece of cardboard.
    Now, hold onto your hat, because we're about to shrink down into the world of the incredibly tiny:
    - Micrometer (µm): dive deeper and slice a millimeter into 1,000 pieces. Each piece is a micrometer, also known as a micron. You can't see these with your eyes alone; it's about the size of bacteria or a strand of spider silk.
    - Nanometer (nm): and now, the star of our journey! Cut one of those micrometers into 1,000 even tinier pieces. These are nanometers. A nanometer is so small that it's used to measure atoms, molecules, and the tiny features on computer chips. To put it in perspective, a human hair is about 80,000 to 100,000 nanometers wide. So we're talking seriously small scales here.
    I hope this helps visualize how incredibly small a nanometer is and the scale at which modern technology operates. It's like a magical journey from the world we see down to the realm of atoms and molecules, all packed into the tiny silicon chips powering the gadgets we use every day!
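The ladder above is just repeated division by 10 (then by 1,000 for the last two steps); a tiny script makes the scale concrete. The ~90,000 nm hair width is a mid-range ballpark of the 80,000-100,000 figure quoted above.

```python
# Each unit from the walkthrough, expressed in meters
scales = {
    "meter (m)": 1.0,
    "decimeter (dm)": 1e-1,
    "centimeter (cm)": 1e-2,
    "millimeter (mm)": 1e-3,
    "micrometer (um)": 1e-6,
    "nanometer (nm)": 1e-9,
}

# How many nanometers fit into each unit
for name, meters in scales.items():
    print(f"{name}: {meters / 1e-9:,.0f} nm")

# A human hair, ~90 micrometers wide, measured in nanometers
hair_nm = 90e-6 / 1e-9
print(f"human hair: ~{hair_nm:,.0f} nm wide")
```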

  • @autohmae
    @autohmae 3 months ago

    Thanks for this interview! Great to see you were able to get it. You can be proud. And even if there are things you don't know, this is often still very useful; asking simple questions lets them think and speak instead of answering short questions.
    Regardless of whether they are a big deal or not, it helped me better understand the inefficiencies in existing systems. There might be many questions I would have asked that Matt wouldn't know where to start with, especially if I had time to think about them... but these more surface-level questions are very useful. I knew parts were hand-written/tuned, and knew there is even a big research area just in networking things together, but I hadn't really got the big picture. Removing inefficiencies is a huge deal; removing a whole bunch of them at multiple levels is a game changer.
    It also shows that if an important part of CUDA is hand-written and took so many man-hours by really smart people, then AMD can't catch up as easily as many would like to see (their reasoning being: competition is good).

  • @gkennedy_aiforsocialbenefit
    @gkennedy_aiforsocialbenefit 3 months ago +1

    Truly incredible interview! wow! Andrew and Igor are brilliant, cool and humble...Just like you Matt. So refreshing. Really excited about the last question and answer concerning Agents. Deeply grateful to you and happy for you Matt. Have been following every video of yours from the onset.

  • @AIApplications-lg1ud
    @AIApplications-lg1ud 2 months ago

    Thank you! Awesome conversation! The idea that the Groq architecture would also yield better LLM answers and less hallucination is revolutionary.

  • @koen.mortier_fitchen
    @koen.mortier_fitchen 3 months ago +1

    So cool, this interview. I follow the Matts for all my AI news: Matt Wolfe, MattVidPro and Matthew 👌

  • @planetchubby
    @planetchubby 3 months ago +4

    this interview is awesome, really cool

  • @jessicas-discoveries-age-6-12
    @jessicas-discoveries-age-6-12 3 months ago +1

    Great interview Matt, really insightful. Being able to talk to LLMs in real time will actually make it feel like we are that much closer to AGI, even if there is still work to do to make it happen in reality.

  • @howardelton6273
    @howardelton6273 3 months ago +1

    Awesome interviewer achievement unlocked. This is a great format.

  • @seancriggs
    @seancriggs 3 months ago +1

    Outstanding content, Matt!
    Very well managed and explained.
    Thank you for doing this!

  • @JoseP-cw3je
    @JoseP-cw3je 3 months ago +10

    To run Llama 70B unquantized with Groq cards of 230MB, you'd need a staggering 1,246 of them at $20K each; that's $25 million total. Their crazy 80TB/s bandwidth would let you run the entire model stupidly fast on this setup. But good luck with the 249kW power draw! For comparison, with H100s for that same $25M you get 833 units at $30K per GPU. Each H100 has "only" 80GB VRAM, so the 280GB model would need to be split across 3-4 GPUs. But with 833 GPUs, you could run around 238 instances instead of just 1 with Groq. The H100 rig would still chug 583kW, so even if Groq cards were 80x the speed of an H100, they'd still be 3x behind the H100 in price per performance; to be competitive they would need to be close to $7K.
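The back-of-envelope comparison above can be reproduced in a few lines. Every input is the commenter's rough assumption, not an official Groq or NVIDIA figure, and rounding lands within a card or two of the quoted 1,246 cards and 833 GPUs:

```python
import math

# Commenter's assumed figures (not official specs)
MODEL_GB = 280            # unquantized Llama 70B weights
GROQ_CARD_MB = 230        # SRAM per Groq card
GROQ_CARD_USD = 20_000    # rumored card price
GROQ_CARD_W = 200         # implied by 249 kW across ~1,246 cards
H100_USD = 30_000
H100_VRAM_GB = 80
H100_W = 700

# Groq: enough cards to hold the whole model in SRAM
groq_cards = math.ceil(MODEL_GB * 1024 / GROQ_CARD_MB)
groq_cost = groq_cards * GROQ_CARD_USD
groq_kw = groq_cards * GROQ_CARD_W / 1000

# H100: spend the same budget, split each model copy across GPUs
h100_units = groq_cost // H100_USD
gpus_per_copy = MODEL_GB / H100_VRAM_GB     # 3.5 GPUs per model copy
h100_instances = int(h100_units / gpus_per_copy)
h100_kw = h100_units * H100_W / 1000

print(f"Groq: {groq_cards} cards, ${groq_cost / 1e6:.1f}M, {groq_kw:.0f} kW, 1 instance")
print(f"H100: {h100_units} GPUs, {h100_instances} instances, {h100_kw:.0f} kW")
```

With these inputs the script gives ~1,247 cards / ~$24.9M / ~249 kW for Groq, versus ~831 H100s running ~237 model instances at ~582 kW, matching the comment's numbers to within rounding.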

    • @diga4696
      @diga4696 3 months ago +1

      I would say close to $5K. Blackwell with its DGX stack is a ready-to-rack solution which will offer even better price per performance, and working with a familiar stack is huge for bigger clients.

    • @dewardsteward6818
      @dewardsteward6818 3 months ago +2

      Please provide a legitimate source for the $20k. The mouser thing people point at is a joke.

    • @actepukc
      @actepukc 3 months ago +1

      Haha, this breakdown does make you wonder what other burning questions Matt couldn't ask during the interview. Maybe Groq's pricing strategy will be revealed in the sequel, just like he hinted at follow-up questions?

  • @cablackmon
    @cablackmon 3 months ago

    This is SUPER interesting and enlightening, especially the part about how inference speed can affect the actual quality of the output. Thank you! Keep it up Matt!

  • @BradleyKieser
    @BradleyKieser 3 months ago

    Absolutely the best interview ever! WOW!

  • @markwaller650
    @markwaller650 3 months ago

    Amazing interview and insights. Really interesting - how you asked the questions to make this accessible to us. Thank you all!

  • @kumargaurav2170
    @kumargaurav2170 3 months ago

    The best video to date for providing insights about LPUs beyond just their faster inference speed. You should do more such videos, as they unlock so much behind the scenes for normal people. Outstanding video & outstanding company, Groq 🙏🏻🙏🏻

  • @jimg8296
    @jimg8296 2 months ago

    Fantastic interview. Learned so much. Thank you.

  • @instiinct_defi
    @instiinct_defi 3 months ago +2

    Amazing, this content is greatly appreciated! 🔥🔥

  • @JMeyer-qj1pv
    @JMeyer-qj1pv 3 months ago +6

    Nvidia announced that their upcoming Blackwell chip improves inference speed by 30x. I wonder if that will bring it close to Groq's inference speed or if Groq will still be faster. I'm also curious why the Groq architecture doesn't work for training LLMs.

    • @PaulStanish
      @PaulStanish 3 months ago +1

      To the best of my knowledge, the memory doesn't need to change as much for backpropagation so they don't need to be as conservative with timing assumptions etc.

    • @seanyiu
      @seanyiu 3 months ago

      The cost for GPU will always be much higher regardless of performance

  • @scotlandcorpnaics2385
    @scotlandcorpnaics2385 2 months ago

    Outstanding discussion!

  • @semeandovidaorg
    @semeandovidaorg 3 months ago

    Great interview!!! Thank you!

  • @user-eo1vg6oc3v
    @user-eo1vg6oc3v 3 months ago +1

    An interesting combo of ideas presented: one was using Claude 3 Opus to train the much smaller Claude 3 Haiku, which makes it quicker by being smaller, and prompting it step by step. Then it was suggested that adding Quiet-STaR to rethink before answering could make the answers 10-50% more accurate, especially for math or code. This architecture on Groq seems to simplify the traffic flow with 'one-way' timed traffic. The final suggestion about reiterating the question could also be solved by Quiet-STaR, which automates that by directing a review of the whole process before answering. So when will this be usable for the general public? A Groq cloud app?

  • @swamihuman9395
    @swamihuman9395 3 months ago

    - Fascinating.
    - Thx.

  • @Maelzelmusic
    @Maelzelmusic 3 months ago

    Lovely video, Matt. Huge props for your evolution :).

  • @rikhoffbauer
    @rikhoffbauer 3 months ago

    This is great! More like this! Very interesting and insightful

  • @kongchan437
    @kongchan437 3 months ago +1

    Great to hear more tech pioneers from U of T, starting with Dr. Hinton himself. I remember our big Lisp manual was not like a commercially published textbook, so maybe it was made by U of T researchers? I remember seeing some very long Lisp programs and wondering which grad student had that highly abstract recursive thinking ability.

  • @ZeroIQ2
    @ZeroIQ2 3 months ago

    That was a great interview, so much interesting information, good job Matthew!

  • @glennm7086
    @glennm7086 3 months ago

    Perfect level of detail. I wanted an LPU primer.

  • @831Miranda
    @831Miranda 3 months ago

    Great interview! Very accessible info! 🎉❤

  • @nvda2damoon
    @nvda2damoon 2 months ago

    fantastic interview!

  • @JariVasell
    @JariVasell 3 months ago +2

    Great interview! 🎉

  • @jonniedarko
    @jonniedarko 3 months ago

    By far my favorite video you have done! ❤

  • @charlestheodorezerner2365
    @charlestheodorezerner2365 3 months ago

    Love your content. Thank you for all you do. And I love Groq. This was a really fresh look into an area (namely, the inner workings of hardware) that is rarely covered. So this was great.
    One insane benefit to Groq that I wished you had asked about: energy consumption. I gather that Groq chips are not only vastly faster, they are also vastly more energy efficient, which is insane when you think about it. Typically, energy consumption increases significantly with increases in speed (compare a 4090 to a 4060). Not Groq. It's blazingly fast while using a small fraction of the energy of a traditional GPU. This is a HUGE deal to me, not only because it decreases the cost of inference, but for environmental reasons. When you scale up the compute necessary to power the world's inference needs, the energy impact is scary. I wouldn't be surprised if AI inference becomes a greater source of greenhouse gas emissions than automobile use in a few years. And if I understand it correctly, Groq chips are massively more ecologically friendly. Ultimately, that should be as big a deal as the speed itself. Would love to understand better why they are so much more efficient...

  • @AlexanderBukh
    @AlexanderBukh 3 months ago +2

    well spoken, aaight

  • @kingrara5758
    @kingrara5758 3 months ago

    great interview, so interesting. Loved seeing everyone's enthusiasm. Your videos are my favourite source of AI news. big thank you.

  • @savant_logics
    @savant_logics 3 months ago +1

    Thanks! Great interview.👍

  • @darwinboor1300
    @darwinboor1300 3 months ago

    Thanks gentlemen.
    The comparison seems to be between a momentum-bound industry, locked to existing architectures and looking for better ways to play musical chairs with its data, and a startup (Groq) practicing first principles to produce a new hardware model suited to the task at hand, one that moves data and results through memory and compute in multiple parallel queues.
    I look forward to seeing more from Groq.

  • @marktrued9497
    @marktrued9497 3 months ago

    Great interview!

  • @fpgamachine
    @fpgamachine 3 months ago

    Very interesting talk, thanks!

  • @manishpugalia8559
    @manishpugalia8559 3 months ago

    Too good, very good learning. Kudos.

  • @kostaspramatias320
    @kostaspramatias320 3 months ago +1

    Darn, that's gonna be epic!

  • @seamussmyth2312
    @seamussmyth2312 3 months ago +2

    Great interview 🎉

  • @AncientSlugThrower
    @AncientSlugThrower 3 months ago

    Great interview for a great channel.

  • @vinaynk
    @vinaynk 2 months ago

    Very informative. This thing will be the heart of skynet :)

  • @Raskoll
    @Raskoll 3 months ago +1

    These guys are actual geniuses

  • @RikHeijmen
    @RikHeijmen 3 months ago +2

    Matt! Wow! Did you find out more about the last thing they talked about, feeding the answer back multiple times and asking questions in a slightly different way? It seems like a new way of using the Groq chat rather than a new model, right?

    • @unom8
      @unom8 3 months ago

      It sounds like energy based modelling, no?

  • @issiewizzie
    @issiewizzie 3 months ago

    Great interview

  • @KitcloudkickerJr
    @KitcloudkickerJr 3 months ago +1

    wonderful interview

  • @albeit1
    @albeit1 3 months ago +1

    Creating hardware specifically designed to serve LLMs reminds me of why vertical integration works. Things get created or optimized to serve the mission. The company doesn’t have to adapt to how existing industries are doing things.

  • @scott701230
    @scott701230 3 months ago +1

    The Groq chip sounds amazing.

  • @NoCodeFilmmaker
    @NoCodeFilmmaker 3 months ago +2

    Their API is really competitive too

  • @frankjohannessen6383
    @frankjohannessen6383 3 months ago

    The fact that their chip is built on 14nm transistors is insane. That's what Nvidia used for the GTX 10-series back in 2017. Imagine how fast Groq would be with 4nm transistors.

  • @RonLWilson
    @RonLWilson 3 months ago +1

    Interesting!
    BTW, I spent my career with asynchronous software, and synchronous software was a big no-no in that it was too rigidly coupled; we needed to handle sloppy data flows over a distributed architecture.
    That said, we did write some of the drivers in hand-written assembly language that was synchronous, where we needed the speed.

  • @nicknick6464
    @nicknick6464 3 months ago +1

    Thanks for the great interview. I have a question: since their chip is quite old (14nm), they must be thinking about an updated version based on 5nm or below. When will it be available, and how much faster will it be?

  • @bladestarX
    @bladestarX 3 months ago +1

    Great interview, Matt; you are the best. I think Groq helped create awareness about the benefits of designing and optimizing a chip for inference. However, wasn't this already known by leading companies like NVIDIA? GPUs just happened to be the most appropriate existing architecture for AI training and inference. Remember, prior to ChatGPT, it was all about AI classification and training; inference was just not a thing. Without that focus on inference, something like an LPU would simply not be justified for mass production. So the reason the big players don't have LPUs is simply that the demand for them was not there before ChatGPT woke the world up to LLMs. LPUs actually have a simpler architecture and fewer components than a general-purpose GPU. I believe Groq will benefit from being first, but it will be very difficult to defend or keep up with the larger chip manufacturers, as they have the infrastructure to create LPUs that will probably perform 10x faster than Groq's 14nm.

    • @GavinS363
      @GavinS363 3 months ago +2

      This comment doesn't make any sense; what infrastructure is it that you speak of Navita having that gives them a huge advantage in designing chips? I think you mistakenly believe these companies such as Groq and Nivita are not only designing these chips but manufacturing them as well; this is incorrect.
      The only company that both designs and manufactures silicon is Intel; the rest all only design and then subcontract out to fabs. Usually it's TSCM, who only builds chips to spec and does not design them itself. That's how it is now and how it will remain for the foreseeable future. Trying to build a fab without access to nation-level money is basically impossible at this point.

    • @bladestarX
      @bladestarX 3 months ago

      @GavinS363 Everyone knows NVIDIA itself does not operate fabrication plants (fabs) for chip production but outsources the manufacturing to third-party foundries like TSMC and Samsung. They focus on design and development, but don't they have facilities for research and development, testing, and other purposes related to their products and technologies? Don't you consider these critical infrastructure? How about their 30,000 employees, including scientists, engineers and architects; do you think they give them an advantage when designing LPUs? Not sure why you thought I was explicitly talking about fabs, especially on a video about chip design and architecture. Maybe I should have said chip producer instead of manufacturer?

    • @user-cv2as4jo9l
      @user-cv2as4jo9l 3 months ago

      @GavinS363 Make sure your spelling is correct first. NVIDIA, not Nivita, and TSMC, not TSCM.

  • @janewairimu5625
    @janewairimu5625 3 months ago

    These Groq guys need funding in the billions to stop them giving in to large corporate bullying, as happened to Inflection and Stability.
    Their work is so precious... yet tantalising to the big corporations...

  • @coulterjb22
    @coulterjb22 3 months ago

    Great interview. I would have loved to hear how they are working on lowering manufacturing costs and when that might happen. My very limited knowledge is these chips are more expensive to make.

  • @testchannel7896
    @testchannel7896 3 months ago +1

    great interview

  • @elyakimlev
    @elyakimlev 3 months ago +1

    Good interview. I just wish you hadn't mentioned phones. I really wanted to know if they could create GPU-sized hardware for PCs that would outperform an RTX 3090 at inference while being able to run bigger models than the RTX can.

  • @ZychuPL100
    @ZychuPL100 3 months ago +2

    This sounds like the LPU is a neuron! They basically created an artificial neuron that can be connected to other neurons, so this is like an artificial brain. Awesome!

    • @executivelifehacks6747
      @executivelifehacks6747 3 months ago

      That is the sense I got too. Why is the human brain efficient? Lots of parallel computations, not overly fast. That being said, it's not working the whole time, at least not all of it, AFAIK.

  • @netsi1964
    @netsi1964 3 months ago

    ARM was originally created the same way: design the instructions first, then the hardware. It was also originally the Acorn RISC Machine, as it was to be used inside the Acorn BBC microcomputer.

  • @jpdominator
    @jpdominator 2 months ago

    Simplicity never wins. Simplicity is using someone else’s library. Complexity is writing your own to increase performance. Complexity is going down several layers and working there. Igor did something extremely complex to create something more simple than the conventional.

  • @arturoarturo2570
    @arturoarturo2570 3 months ago +1

    Super instructive.

  • @goodtothinkwith
    @goodtothinkwith 3 months ago +1

    Great job Matt! It sounded like it would scale, but might be limited by the die size in the fab? Is there a limit to how many chips can be chained together like one big chip, i.e., can many Groqs compete with Cerebras' massive chips? When can we get an agent-based Llama 2 (or 3!) with the kind of reflexive thinking that Andrew mentioned at the end? Good stuff!

    • @goodtothinkwith
      @goodtothinkwith 3 months ago +1

      Maybe even more provocatively, if a bunch of Groqs were chained together to be the size of Cerebras' chips, just how large an LLM could it run?

  • @shyama5612
    @shyama5612 3 months ago +1

    Would love a comparison between the Groq LPU and TPU v5p.

  • @wetcel1236
    @wetcel1236 3 months ago +2

    Awesome! Thanks Matt!

  • @ryzikx
    @ryzikx 3 months ago +2

    Very good, fantastic content 🤯🤯

  • @Artfully83
    @Artfully83 3 months ago

    Ty

  • @rbdvs67
    @rbdvs67 3 months ago

    I wonder what, if any, are the power requirement differences with the Groq architecture? Are they planning on making this on more current 4-5 nm silicon? Amazing interview and very exciting.

  • @1242elena
    @1242elena 3 months ago

    That's awesome 😎

  • @ArnoldJagt
    @ArnoldJagt 3 months ago

    I have such a huge project for Groq as soon as it can handle digesting a big chunk of software.

  • @transquantrademarkquantumf8894
    @transquantrademarkquantumf8894 3 months ago

    Nice Show

  • @skitzobunitostudios7427
    @skitzobunitostudios7427 3 months ago

    Matt, are you going to interview 'Cerebras' next? I would like you to maybe get two chaps from each company in a cast with you and have a little 'Shoot Out' of thoughts.

  • @jayconne2303
    @jayconne2303 3 months ago

    Very nice model of traffic at an intersection.

  • @CalinColdea
    @CalinColdea 3 months ago +2

    Thanks

  • @KCM25NJL
    @KCM25NJL 3 months ago

    Man, the Groq7B PCI-e accelerator card would be such an easy win..... guess we can keep dreaming :)

  • @rickevans7941
    @rickevans7941 21 days ago

    Really high-end graphics card, huh? My Vega 64 reference card with HBM is a beast that was ahead of its time.

  • @segelmark
    @segelmark 3 months ago

    Cool that they achieve this on a 16nm process; without knowing anything, it feels like they might be able to get ~2x more performance and ~4x less power usage and size just by moving to the leading edge.

  • @Alice8000
    @Alice8000 3 months ago

    very cool

  • @brycetidwell7193
    @brycetidwell7193 3 months ago

    Hope this goes public so I can buy stock sometime soon.