How CUDA Programming Works | GTC 2022

  • Published 3 Oct 2024

COMMENTS • 23

  • @sami9323
    @sami9323 4 months ago +6

    This is one of the clearest and most lucid presentations I have seen, on any topic.

  • @KingDestrukto
    @KingDestrukto 21 days ago

    Fantastic presentation, wow!

  • @dennisrkb
    @dennisrkb 2 years ago +7

    Great presentation on GPU architecture, performance tradeoffs and considerations.

  • @SrikarDurgi
    @SrikarDurgi 3 months ago +1

    Dan is definitely the MAN.
    Great talk!

  • @ypwangreg
    @ypwangreg 1 year ago +1

    I was always puzzled and fascinated by how grids, blocks and threads work in parallel on the GPU, and this video explains it all in one go. Very impressive and helpful!

  • @hadiorabi692
    @hadiorabi692 7 months ago +1

    Man this is amazing

  • @holeo196
    @holeo196 1 year ago

    Another great presentation by Stephen Jones, fascinating

  • @chamidou2023
    @chamidou2023 5 months ago

    Great presentation!

  • @purandharb
    @purandharb 1 year ago

    Thanks for the detailed explanation. Really enjoyed it.

  • @mugglepower
    @mugglepower 8 months ago +2

    Oh man, I wish my mum had fitted me with a better brain processing unit so I could understand this.

  • @kimoohuang
    @kimoohuang 2 months ago

    Great presentation! It is mentioned that 4 warps × 256 bytes per warp = 1024 bytes, which equals the 1024-byte memory page size. That only happens when the 4 warps are running adjacent threads. Are the 4 warps always running adjacent threads?

    • @perli216
      @perli216 2 months ago

      @@kimoohuang Not necessarily. Depends on the warp scheduler
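
    A minimal CUDA sketch of the arithmetic in this thread (an illustration, not code from the talk; the kernel name and sizes are assumptions chosen to make the numbers visible): each 32-thread warp reading adjacent 8-byte doubles requests 32 × 8 B = 256 B, so the 4 warps of a 128-thread block together cover 4 × 256 B = 1024 B, one memory page, when their threads touch adjacent elements.

    ```cuda
    // Sketch: adjacent threads read adjacent doubles, so each 32-thread warp
    // requests 32 * 8 B = 256 B, and the 4 warps of a 128-thread block cover
    // 4 * 256 B = 1024 B (one memory page) when the threads are adjacent.
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void copyKernel(const double* __restrict__ in,
                               double* __restrict__ out, int n)
    {
        // Thread t of block b touches element b*blockDim.x + t, so
        // consecutive threads touch consecutive 8-byte elements.
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        if (idx < n)
            out[idx] = in[idx];
    }

    int main()
    {
        const int n = 1 << 20;
        double *in = nullptr, *out = nullptr;
        cudaMalloc(&in,  n * sizeof(double));
        cudaMalloc(&out, n * sizeof(double));
        cudaMemset(in, 0, n * sizeof(double));

        // 128 threads per block = 4 warps; one block's reads span 1024 B.
        copyKernel<<<(n + 127) / 128, 128>>>(in, out, n);
        cudaDeviceSynchronize();

        cudaFree(in);
        cudaFree(out);
        printf("done\n");
        return 0;
    }
    ```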

  • @openroomxyz
    @openroomxyz 2 years ago +1

    Interesting!

  • @KalkiCharcha-hd5un
    @KalkiCharcha-hd5un 1 month ago

    @21:17 "It's exactly the right amount of data to hit the peak bandwidth of my memory system. Even if my program reads data from all over the place, each read is exactly ONE page of my memory." I didn't understand this statement at 21:17: "Even if my program reads data from all over the place". Does it mean even if the data is read from non-consecutive memory?

    • @perli216
      @perli216 1 month ago

      yes

    • @perli216
      @perli216 1 month ago +1

      You get the benefits of reading contiguous memory basically for free, even when doing random reads.

    • @KalkiCharcha-hd5un
      @KalkiCharcha-hd5un 1 month ago

      @@perli216 Ok cool, so basically we only get the advantage when memory access is contiguous, e.g. with i = tid + bid*bsize, and not with i = 2*(tid + bid*bsize)?

    • @perli216
      @perli216 1 month ago

      @@KalkiCharcha-hd5un I don't understand your question

    • @KalkiCharcha-hd5un
      @KalkiCharcha-hd5un 1 month ago +1

      @@perli216 "Even if my program reads data from all over the place": I think I got it. Initially I thought "all over the place" meant any random / non-consecutive memory. It means different threads reading from the same page, because a single thread will bring in data from the same page anyway.
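
    A minimal kernel-only CUDA sketch of the two index patterns discussed in this thread (an illustration, not code from the talk; kernel names are placeholders): with i = tid + bid*bsize a warp's 32 float reads fall in one contiguous 128 B span, while with i = 2*(tid + bid*bsize) the same 32 reads are spread over 256 B, so half of every fetched line or page is wasted.

    ```cuda
    // Unit-stride indexing: adjacent threads read adjacent floats,
    // so each warp's accesses coalesce into one contiguous span.
    __global__ void unitStride(const float* in, float* out, int n)
    {
        int i = threadIdx.x + blockIdx.x * blockDim.x;        // i = tid + bid*bsize
        if (i < n)
            out[i] = in[i];
    }

    // Stride-2 indexing: adjacent threads skip every other float,
    // so each warp touches twice the address range for the same amount
    // of data, and half of every fetched line or page goes unused.
    __global__ void strideTwo(const float* in, float* out, int n)
    {
        int i = 2 * (threadIdx.x + blockIdx.x * blockDim.x);  // i = 2*(tid + bid*bsize)
        if (i < n)
            out[i] = in[i];
    }
    ```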

  • @LetoTheSecond0
    @LetoTheSecond0 1 month ago

    Looks like the link in the description is broken/truncated?

    • @perli216
      @perli216 1 month ago

      @@LetoTheSecond0 Yes, YouTube did this. It's just the original source for the video.