CUDA Programming on Python

Share
Embed
  • Published Nov 29, 2024

COMMENTS • 769

  • @aishahoura2619
    @aishahoura2619 2 years ago +167

    Thank you so much for responding to my request for making a CUDA programming tutorial. I have donated 0.1 BTC to your account as a way to thank you. My professor has spent so many hours trying to explain CUDA and none of my classmates really understood. I just cannot believe that you do all this for free, and that is why my classmates and I have decided to collect some funds to donate to you.
    Thanks for all that you do and please keep going.

    • @AhmadBazzi
      @AhmadBazzi  2 years ago +86

      Thank you for the donation, it really means a lot !

    • @aishahoura2619
      @aishahoura2619 2 years ago +3

      @@AhmadBazzi No, thank you !

    • @btspower3844
      @btspower3844 2 years ago +1

      Wow amazing

  • @denizart2255
    @denizart2255 2 years ago +50

    You just opened my eyes to parallel programming. Thanks for the quick overview.

  • @Drex.Yt1
    @Drex.Yt1 2 years ago +80

    Too hard to find high-quality content like this these days. Thank you so much

  • @sksk-lo8kc
    @sksk-lo8kc 2 years ago +91

    That was very well explained. I have only taken one course, and you made it clearer than my professor or fellow students ever did.

  • @nilsu1941
    @nilsu1941 2 years ago +108

    12:36 This guy is a God !

  • @aoungamingyt3160
    @aoungamingyt3160 2 years ago +71

    Thank you so much. Probably the best introduction to CUDA with Python. The example you use, while very basic, touches on the usage of blocks, which is usually omitted in other introduction-level tutorials. Great stuff! Hope you return with some more videos. I have subscribed!

  • @leonelaguilera9059
    @leonelaguilera9059 2 years ago +73

    this was such an excellent video

  • @tatldunyas2471
    @tatldunyas2471 2 years ago +2

    Just did my research and this guy is at one of the most prestigious universities in the world ! No wonder his lectures come out so neat !

  • @apogeetheboss9999
    @apogeetheboss9999 2 years ago

    as a data scientist with 2+ years of experience, i ALWAYS learn something new with your content! please Nich, never stop doing these things, and also, never lose that smile on your face, even when you are hitting bugs!!

  • @pantherofficial5059
    @pantherofficial5059 2 years ago

    I have been looking into GPU programming using numba and python for a while, this seems to be the best tutorial I was able to find so far... thank you

  • @adeeshaamabidu9616
    @adeeshaamabidu9616 2 years ago

    Love the channel Nicholas, have recently graduated from an NLP Master's degree and seeing you explain stuff in a simpler way and your coding challenges is really helping me connect with the material I've learned! Keep it up and I'll keep watching!

  • @erenbasak7694
    @erenbasak7694 2 years ago

    Hey this is super useful! I elected the High Performance Computing and Microprocessors and Embedded Systems modules for my degree, and this channel has become my go-to guide.

  • @basslvers4501
    @basslvers4501 2 years ago

    wanted to comment that the information in this presentation is very well structured and the flow is excellent.

  • @todotasks7645
    @todotasks7645 2 years ago

    Too hard to find high-quality content like this these days. ⚡

  • @RAVIShankar-bm4ou
    @RAVIShankar-bm4ou 2 years ago +5

    Thank you so much for this series! It's so clear and easy to follow

  • @muradhesenov5245
    @muradhesenov5245 2 years ago

    the essence of Deep Learning in a few lines of code... awesome

  • @youtubemullim319
    @youtubemullim319 2 years ago

    I feel like CUDA has been demystified. Very glad I found your series.

  • @mustafasamet2783
    @mustafasamet2783 2 years ago

    Ayyyy, so glad you like it @Patrick. For the last two weeks I've just been making videos on stuff I find hard or want to get my head around. I figure it's not just me staring at some of these concepts like huh?!? Thanks for checking it out!!

  • @SiTacuissem
    @SiTacuissem 2 years ago +1

    Interesting, but two remarks:
    Example 1: on my setup (3080Ti, i7-8700K, running in WSL2 with Ubuntu 22.04) vector multiplication actually runs *faster* on the CPU (if you either use the vectorized formulation in MultiplyMyVectors with target "cpu" or, simply, a*b instead of the unnecessary for loop in the CPU code). IMO that is mostly due to the overhead of copying the data to GPU memory.
    Example 2: to get a fair comparison, you should also use the JIT for FillArrayWithouGPU, decorating with @jit(target_backend="cpu"). Then, GPU array filling is still faster, but only by a factor of 2.

  • @LouieVon
    @LouieVon 2 years ago

    This is the best introduction to CUDA I've seen, thanks a lot !

  • @Kvmizo
    @Kvmizo 2 years ago

    This was by far one of the most enlightening videos you have put up on your channel. Thanks and keep up the good work!!

  • @-oof1016
    @-oof1016 2 years ago

    Ahmad, thanks for taking the time to create these videos. It is unfortunate that people view your videos and then feel inspired to complain about a free gift. Folks could just keep it moving or add helpful insights.

  • @rithusvlogtime3342
    @rithusvlogtime3342 2 years ago

    Fantastic tutorials on CUDA. You deserve more followers.

  • @tajaochrisciamae4121
    @tajaochrisciamae4121 2 years ago

    what a passionate tutorial! I wish you were my professor for my parallel programming course. Well done!

  • @fatihcalidkan2254
    @fatihcalidkan2254 2 years ago

    holy shit, i was looking into this to speed up my mandelbrot zooms and they are what you use as an example! This is a dream come true!

  • @cavansirmahmudov217
    @cavansirmahmudov217 2 years ago

    You saved me, i had to read the PointNet2 implementation for my BCS thesis. This made the job much easier!

  • @excolabirbuyuyecek9438
    @excolabirbuyuyecek9438 2 years ago

    LOL. Loved the graphic at 6:23! Brought tears to my eyes.

  • @Fiekriekd
    @Fiekriekd 2 years ago

    and that's what I call a great tutorial. Thank you sir. I hope you make more tutorials.

  • @Animals-vi5wt
    @Animals-vi5wt 2 years ago

    Woah congrats @Ally 🎊 🎉 glad you’re enjoying the challenges, plenty more to come!!

  • @lixaxel6815
    @lixaxel6815 2 years ago

    Excellent example of vector addition using a for loop versus using CUDA

  • @astaadxofficials7813
    @astaadxofficials7813 2 years ago

    I have no idea what kind of videos i am watching ... but i sure will learn

  • @merthanozer2964
    @merthanozer2964 2 years ago

    Ohh, yes, thank you, and the documentation on the NVIDIA site about CUDA is very professionally written. Thank you.

  • @mrfcbs1251
    @mrfcbs1251 2 years ago

    Oh Ahmad, your tutorials are incredible and inspiring....

  • @yasincaferzade8069
    @yasincaferzade8069 2 years ago

    Great video, I like this kind of video where you code some AI task against the clock; you teach us the concepts and show us the reality of implementing it👏

  • @TheTurksxayers
    @TheTurksxayers 2 years ago

    Thank you for this great introduction to numba and, more specifically, numba+cuda.

  • @teamchanel4384
    @teamchanel4384 2 years ago

    I'm doing an internship in a research lab and I'll have to program some kernels to implement BLAS primitives, this video really helps :)

  • @hilalkoskli6266
    @hilalkoskli6266 2 years ago

    Wow, it is really awesome! It is much better than a tutorial from university! Thanks!

  • @totallycz6819
    @totallycz6819 2 years ago

    You are a lifesaver @Spencer, will do it next time i'm on the streaming rig!

  • @uniquevlogsbyadil
    @uniquevlogsbyadil 2 years ago

    This was oddly intense. Great job Nicholas! Even though you ran out of time, this video is still a win to me. 😉

  • @denizugurbiltekin622
    @denizugurbiltekin622 2 years ago

    Thank you so very much. This is the exact kind of material I was looking for on this very specific subject. Kudos.

  • @turkceraplyrics6701
    @turkceraplyrics6701 2 years ago

    Hey Ahmad, I love watching your videos because of the way you tell the story. Great graphics mate. Love the reference to Rocket Man too... lol keep up the good work.

  • @نٌے.نٌے.نٌےۦصہٰ̐كْٰٓاكہٰ̐ہٰ̐يي

    OHHHH MANNN, I thought about doing that but I was debating whether I'd hit the 15 minute deadline already. Good suggestion @Julian!

  • @crystalannringor1025
    @crystalannringor1025 2 years ago

    Thank you so much for this video. It has helped me massively to prepare for my computer science exam.

  • @kelechijames5577
    @kelechijames5577 2 years ago

    This is very helpful. Most people don't realize the overhead and code refactoring necessary to take advantage of GPUs. I am going to refactor a simple MNIST training program I have which currently uses only NumPy, and see if I can get meaningful improvements in training time.

  • @mipxello7749
    @mipxello7749 2 years ago

    Very well explained. The best CUDA explanation I have come across up until now 😊😊. Keep up the spirits sir.👍👍

  • @tugrasolak7852
    @tugrasolak7852 2 years ago

    this is extremely helpful. you did an amazing job explaining the foundations

  • @weds8296
    @weds8296 2 years ago

    Thanks for making all these topics very approachable!

  • @maher9422
    @maher9422 5 months ago +14

    May God give you strength, brother Ahmad.
    I'd like to ask you a small favor: make this same course in Arabic. I know it won't get as many views, but
    your brothers need you more than the foreigners do.
    I understand you fine, but there are others who love this field and would love to learn it in their own language.
    If you don't have the time, allow me to translate the video and explain it on my channel; a like from you would mean you agree.

  • @yigitboran5565
    @yigitboran5565 2 years ago +1

    That's mostly how it works. It's more like sorting stones by color and pattern and counting each variety. The CPU way, you would need to count each variety separately. If you have 100 different colors and patterns, that would take a long time to count (even if you could count extremely accurately and fast, similar to how the CPU makes up for its lack of parallelism). The GPU way lets many people count them. Given 100 people (like the GPU), each person would count one variety at the same time.

  • @likky2229
    @likky2229 2 years ago

    The video was very helpful for me. Many thanks to the author for developing his audience with interesting and useful content

  • @pandamusic_tz
    @pandamusic_tz 2 years ago

    The knowledge of Ahmad knows no bounds.

  • @notachannel2601
    @notachannel2601 2 years ago

    It is effectively a very easy approach to harnessing the power of CUDA in simple python scripts.

  • @hgmalani21
    @hgmalani21 2 years ago

    It's very informative and a good intro to CUDA programming. Thanks very much!

  • @imsoumyajitbag
    @imsoumyajitbag 2 years ago

    Awesome video !! It's pretty cool to see such theoretical concepts coded and explained like this. Keep going Nich !!

  • @cemiltuna
    @cemiltuna 2 years ago

    Thanks for the video, I found the first half and the wrap-up really excellent.

  • @mjmlangenihd8706
    @mjmlangenihd8706 2 years ago +1

    yes, you could do this by hand, which would be a great distributed-computing challenge to code yourself. Another option is to use a framework/platform like AWS SageMaker to do distributed k-means. Most organizations will do this.

  • @Voicemelod
    @Voicemelod 2 years ago

    Amazing! I'm learning so much watching you code. Thank you for sharing.

  • @bodyprodaction9718
    @bodyprodaction9718 2 years ago

    Well, just built a new rig with a 980 Ti and a 4790K, so I'm gonna put that to the test. Thank you for your wonderful explanation :D

  • @enescakmak6699
    @enescakmak6699 2 years ago

    It works on both AMD and NVIDIA. If you have CUDA code, you can convert it to HIP with their automated tool; there is very little CUDA-specific code that can't just be translated over.

  • @vanshd6884
    @vanshd6884 2 years ago

    PS. I was really moved by your stock price episode. thank you so sosososo much.

  • @a.s.m.rashedchowdhury784
    @a.s.m.rashedchowdhury784 2 years ago

    opened my eyes to parallel programming

  • @bombosbikanal3858
    @bombosbikanal3858 2 years ago

    Perfect video! It was revealing to me to understand how it works. Thank you! I am a new subscriber of your channel. Regards from Buenos Aires, Argentina

  • @peloizol8947
    @peloizol8947 2 years ago

    I like how you made the website documenting the video notes for later reference

  • @emircoltu875
    @emircoltu875 2 years ago

    Once you initialized lr to 0.0, I knew you were going to forget to change it lol. Love the challenges though, keep doing them. I think it would be cool to see how you implement a neural network from scratch

  • @mendes.02
    @mendes.02 2 years ago +1

    This is an academic example that shows the process of copying data to the GPU, doing a vectorized operation, then showing the results. What actually makes sense on the GPU vs the CPU is something I didn't cover, and I'm hoping others can figure out some cool ideas.

  • @theatlantisreport1595
    @theatlantisreport1595 2 years ago

    This reminds me a lot of the computer tutorial tapes from the 90s

  • @beratcansamur1517
    @beratcansamur1517 2 years ago

    An insanely underrated series!!!

  • @mehmetak4349
    @mehmetak4349 2 years ago

    What makes the CPU better than the GPU is that each core is clocked at a faster speed and has many built-in instruction sets like SSE, allowing data to be processed faster. This is a tremendous benefit for programs that only run on one core. In rendering, where multiple cores can be used, the CPU would need to process pixels about 5x or more faster to match the GPU's performance.

  • @sachinram3783
    @sachinram3783 2 years ago

    Sir, please make more detailed sessions on CUDA; your explanation is great

  • @besttwitcher4569
    @besttwitcher4569 2 years ago

    YESSSS, right?! Glad you liked it Miguel!

  • @kolaybreaworlds3178
    @kolaybreaworlds3178 2 years ago

    Also, the CT5 simulator from 1981 may not count as being from the '70s or '60s, but from what I understand, the CT5 was capable of realtime, rasterized, 3D polygonal rendering and cost $20 million at the time. It used Gouraud shading, if memory serves. There were several other CT (continuous tone) simulators developed by E&S in the '70s that did something similar, of much lower capability than the CT5 of '81. There were also the Digistar planetariums that date back to the early '80s, and the Picture System goes back to at least the early '80s. Those might be vector or raster, I'm not entirely sure myself.

  • @mobilerepairs3620
    @mobilerepairs3620 2 years ago

    Technically, yes. However, CUDA isn't designed to give you an extra processor to use; it just gives you the option of using a different type of processor to do your work. GPUs have lots of processing cores (100-1000+), which helps a lot with rendering: each core can process 1 pixel, allowing 100+ pixels to be processed at once. CPUs have a small number of cores (2-18 in the Xeons), so only 2-18 pixels can be processed at once. Hyper-Threading technology can double that number, but 36 is still small compared to 100.

  • @yusifhsnov1802
    @yusifhsnov1802 2 years ago

    So stoked you liked it 🙏

  • @gularif1
    @gularif1 2 years ago

    On the PC side, Matrox was the first company to introduce GPUs. This was followed by ATI. NVIDIA came onto the scene after the success of these 2 Canadian companies. Matrox's original 3D board was a 3-board set with custom ASICs. I believe AMD actually acquired ATI. So yes, NVIDIA was not the first, but they are the biggest in the space now. Matrox is still around but more involved in the industrial and niche markets.

  • @Cardexs
    @Cardexs 2 years ago

    This is really helpful for my computing. Thank you.

  • @MTHHC
    @MTHHC 2 years ago

    Hey, thanks for the explanation! Very well done 👍 I am downloading CUDA 💪

  • @coolboy_0459
    @coolboy_0459 2 years ago

    I was needing this!!! Thanks a lot, Sir!!!!

  • @Houseofcreate
    @Houseofcreate 2 years ago

    Many thanks for the lucid explanation.

  • @mehmetplgx8025
    @mehmetplgx8025 2 years ago

    Love your videos. Please don't stop!

  • @pusher7051
    @pusher7051 2 years ago

    glad to see you take it as feedback and not as a hate comment

  • @agusexclusife2577
    @agusexclusife2577 2 years ago

    Can't wait to see Juan's better tutorial that he's definitely going to release :') lmao. Great video Ahmad.

  • @prietjepruck
    @prietjepruck 1 year ago

    Thank you very much for this tutorial. I would love to have the code available, because typing it in myself from the video is a bit hard, especially with autocomplete on all the time. Keep up the good work.

  • @aliyensagaltc3121
    @aliyensagaltc3121 2 years ago

    You are bloody watching a master at work xD

  • @ricardomilos5889
    @ricardomilos5889 2 years ago +1

    This was a great video to me. I have very limited C++ experience and was looking for an explanation of CUDA. Another video like this could easily have been 70-80% over my head; this one was only about 15% whoosh. And now I actually find C++ interesting again!

  • @blackiselia8054
    @blackiselia8054 2 years ago

    i need to say this: you are the gamechanger here!!

  • @balli9849
    @balli9849 2 years ago

    I disagree, I think he did a great job explaining everything, especially the code.

  • @Muhannad_ALAZZO
    @Muhannad_ALAZZO 2 years ago

    would love to see a video on a few CUDA programming challenges

  • @TheGameboyTheDream123
    @TheGameboyTheDream123 2 years ago

    @nvidia I personally think the way you did the demonstration was perfectly sufficient. IMO, fancy graphics are unnecessary. Good job.

  • @alidoruk5588
    @alidoruk5588 2 years ago

    This guy is so underrated.

  • @wosbb9674
    @wosbb9674 2 years ago

    Nice demo - I am getting into CUDA GPU programming and have a workstation built with a 1950X 16-core CPU and two RTX 2080 Ti GPUs, and I would like to run this demo on the machine and observe the results without using Colab; I will definitely check this out today. By the way, in a notebook Python 3 environment, do I need to use pip to install the numba library as shown, or do I have to create a new virtual environment? I am curious about that. Thank you

  • @appnana4449
    @appnana4449 2 years ago

    CUDA also comes in the form of an API (i.e. using NVIDIA's CUDA library in C) to abstract parallel computation tasks onto the GPU - but yes, it's both: the API is the software side, but the GPU must be CUDA-compatible (have CUDA cores) to take advantage of it.

  • @gamingtouryoutubechannel1131
    @gamingtouryoutubechannel1131 2 years ago

    It's a Mandelbrot set explorer that uses both CUDA and C extensions to calculate the iterations. The multithreaded C implementation is definitely no slouch, but when you start doing over 10,000 iterations per pixel the CUDA implementation becomes significantly faster. In contrast, a pure Python implementation gets frustratingly slow already at around 1,000 iterations, so it wasn't even worth adding to the comparison.

  • @aliarda9719
    @aliarda9719 2 years ago

    Thanks a million @Lakshman!! I try to keep it pretty tight so it’s a good challenge, otherwise I know I’ll just talk for 22 minutes anyway😅

  • @vipyt8550
    @vipyt8550 2 years ago

    It can be found in O(1). As far as I remember, the formula is derived using LDU decomposition or by diagonalizing a matrix, for matrix exponentiation.
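
Assuming the parent comment (not shown here) was about computing Fibonacci numbers, the O(1) closed form obtained by diagonalizing the step matrix [[1, 1], [1, 0]] is Binet's formula, which can be checked against the iterative version:

```python
import math

def fib_binet(n):
    # Closed form from diagonalizing the Fibonacci step matrix
    phi = (1 + math.sqrt(5)) / 2
    psi = (1 - math.sqrt(5)) / 2
    return round((phi ** n - psi ** n) / math.sqrt(5))

def fib_iter(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

assert all(fib_binet(n) == fib_iter(n) for n in range(40))
```

In floating point the closed form loses exactness for large n, so "O(1)" holds only within double precision; exact arithmetic brings back a log or linear factor.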

  • @dragnn1430
    @dragnn1430 2 years ago +101

    Dear Ahmad, you are 30 years old and only doing a post-doc? I'm sorry, but to me this sounds very underrated. Postdocs are not always well compensated for their work but spend a lot of time working and doing research. If I were you, I'd invest more time in my YouTube channel rather than in something that does not compensate well.

  • @imperalzula7062
    @imperalzula7062 2 years ago

    Great talk, thank you ! Well structured and clear.

  • @streamersofficial3834
    @streamersofficial3834 2 years ago

    Great explanation! Fascinatingly clear

  • @keremkipri9436
    @keremkipri9436 2 years ago

    Thanks for the video, subscribed! A suggestion: this small change to your code would demonstrate a real-world gradient descent solution for linear regression with noisy data. E.g. :
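
The example snippet appears to have been cut off from this comment; a minimal sketch of what gradient descent on noisy linear data might look like (illustrative names and values, not the commenter's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples around a known line y = 2x + 1
X = rng.uniform(-1.0, 1.0, 200)
y = 2.0 * X + 1.0 + rng.normal(0.0, 0.1, 200)

w, b = 0.0, 0.0
lr = 0.1                              # nonzero learning rate, as the video's lr=0.0 mishap showed
for _ in range(500):
    err = (w * X + b) - y
    w -= lr * 2.0 * np.mean(err * X)  # d(MSE)/dw
    b -= lr * 2.0 * np.mean(err)      # d(MSE)/db
```

With the added noise, the fitted (w, b) lands close to the true (2, 1) rather than exactly on it, which is the "real-world" behavior the commenter is suggesting to demonstrate.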

  • @isimiztiktok
    @isimiztiktok 2 years ago

    This was really good. Thanks for posting this!