The Independent Code
Softmax Layer from Scratch | Mathematics & Python Code
In this video we go through the mathematics of the widely used Softmax layer. We then implement the layer on top of the code we wrote in previous videos (a rough sketch follows the chapter list below).
😺 GitHub: github.com/TheIndependentCode/Neural-Network
EDIT: I decided to deactivate my @CodeIndependent Twitter account. Instead I'll use my personal account to tweet about upcoming videos :)
🐦 Twitter: omar_aflak
Chapters:
00:00 Introduction
00:30 Forward
01:31 Forward Code
02:14 Backward
05:47 Backward Code
06:21 Conclusion
Views: 14,805
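For readers who want the gist before watching, here is a minimal sketch of such a Softmax layer: a hedged reconstruction assuming the series' conventions (column-vector inputs of shape (n, 1) and a forward/backward interface on every layer), not necessarily the exact code from the video.

```python
import numpy as np

# Minimal sketch of a Softmax layer, assuming the series' conventions:
# inputs are column vectors of shape (n, 1), and every layer exposes
# forward(input) and backward(output_gradient, learning_rate).
class Softmax:
    def forward(self, input):
        # Subtracting the max before exponentiating is a standard
        # numerical-stability trick (our addition, not necessarily the video's).
        tmp = np.exp(input - np.max(input))
        self.output = tmp / np.sum(tmp)
        return self.output

    def backward(self, output_gradient, learning_rate):
        # Softmax Jacobian: J[i, j] = s_i * (delta_ij - s_j).
        # (identity - s^T) broadcasts to an (n, n) matrix of delta_ij - s_j,
        # and the elementwise product with s scales row i by s_i.
        n = np.size(self.output)
        jacobian = (np.identity(n) - self.output.T) * self.output
        return np.dot(jacobian, output_gradient)
```

Unlike elementwise activations, the backward pass needs the full Jacobian because every output of softmax depends on every input.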

Videos

The unknown trick to solve these polynomials: symmetrical coefficients
Views: 2.8K · 2 years ago
In this video I show you a trick to halve the degree of a special class of polynomials: those with symmetric coefficients. 😺 GitHub: github.com/TheIndependentCode/Neural-Network 🐦 Twitter: omar_aflak Chapters: 00:00 The method 01:52 Degree 6 03:00 Generalisation 04:28 Proof 05:28 Summary
Convolutional Neural Network from Scratch | Mathematics & Python Code
Views: 161K · 3 years ago
In this video we'll create a Convolutional Neural Network (or CNN) from scratch in Python. We'll go fully through the mathematics of that layer and then implement it. We'll also implement the Reshape layer, the Binary Cross Entropy loss, and the Sigmoid activation. Finally, we'll use all these objects to make a neural network capable of classifying handwritten digits from the MNIST dataset. 😺...
Neural Network from Scratch | Mathematics & Python Code
Views: 120K · 3 years ago
In this video we'll see how to create our own machine learning library, like Keras, from scratch in Python. The goal is to be able to create various neural network architectures in a Lego-like fashion. We'll see how to architect the code so that we can create one class per layer. We will go through the mathematics of every layer that we implement, namely the Dense or Fully Connected la...
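To make the "one class per layer" idea concrete, here is a hedged sketch of the pattern: a common Layer interface plus a Dense layer, assuming column-vector inputs (the gradient formulas follow from Y = WX + B).

```python
import numpy as np

# Hedged sketch of the "one class per layer" pattern: a common interface,
# then Dense as one concrete layer. Assumes column-vector inputs.
class Layer:
    def forward(self, input):
        raise NotImplementedError

    def backward(self, output_gradient, learning_rate):
        raise NotImplementedError

class Dense(Layer):
    def __init__(self, input_size, output_size):
        self.weights = np.random.randn(output_size, input_size)
        self.bias = np.random.randn(output_size, 1)

    def forward(self, input):
        self.input = input
        return np.dot(self.weights, self.input) + self.bias

    def backward(self, output_gradient, learning_rate):
        # From Y = WX + B:
        #   dE/dW = dE/dY · X^T,  dE/dB = dE/dY,  dE/dX = W^T · dE/dY
        weights_gradient = np.dot(output_gradient, self.input.T)
        input_gradient = np.dot(self.weights.T, output_gradient)
        self.weights -= learning_rate * weights_gradient
        self.bias -= learning_rate * output_gradient
        return input_gradient
```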

COMMENTS

  • @no-gamedev8336
    @no-gamedev8336 15 hours ago

    In the backprop part it's really hard to understand the math without visualisations of example neurons, and little details go unexplained.

  • @The_Quaalude
    @The_Quaalude 6 days ago

    Bro stopped making videos at the worst time possible ‼️😭

  • @Artelion-pk2he
    @Artelion-pk2he 7 days ago

    Very good explanation video (I learned CNNs mainly from here), but I am a little bit confused why, when we classify only zeros and ones in MNIST, the accuracy of the CNN in your example is super high (about 99.7%), but when we classify all digits from 0 to 9 the accuracy is worse (about 73%). I read that a CNN can achieve 99%+ results even when it classifies all digits from MNIST. Can you explain why this happens, and how to improve the accuracy of the CNN?

    • @independentcode
      @independentcode 7 days ago

      Thanks. I've trained on 0s and 1s only for speed reasons, since the code runs on the CPU. But in theory, if you train on all digits, it should perform well too. In my code I truncate the dataset to several thousand examples; when you train on all digits, do you use all the examples in the dataset?

    • @Artelion-pk2he
      @Artelion-pk2he 7 days ago

      @@independentcode With the original hyperparameters I achieved 93.82% accuracy after training for 20 epochs, each epoch over the whole dataset. Not a terrible result, but it could be better. I had not yet changed the loss from binary cross-entropy to regular cross-entropy, though :D When I used regular cross-entropy with 2 convolutional layers, I achieved 98.83% accuracy in 10 epochs, which is enough for me. Thx. PS: After a lot more training (maybe about 20 epochs), I actually reached about 99.93% accuracy.

  • @barkmemes
    @barkmemes 10 days ago

    I'm 14, but damn, I understood half of that.

  • @kaycee3224
    @kaycee3224 15 days ago

    Legendary

  • @bossgd100
    @bossgd100 16 days ago

    Best video on CNNs ever.

  • @bossgd100
    @bossgd100 16 days ago

    Why did you stop making new videos? :/

  • @codybarton2090
    @codybarton2090 21 days ago

    Great video

  • @methsiri123
    @methsiri123 26 days ago

    One of the best tutorials I have gone through. Thank you so much.

  • @ChengZhang-py7br
    @ChengZhang-py7br 27 days ago

    I got a little confused about the biases. From my perspective, the bias is one number per output channel, but the video used a matrix with the same size as the output matrix.
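The two conventions are compatible: a one-scalar-per-channel bias is the special case of a full-size bias whose entries are equal within each channel. A small illustrative sketch (all shapes are made up for the example):

```python
import numpy as np

# Two bias conventions for a conv layer with `depth` output channels
# of spatial size (h, w). Shapes here are purely illustrative.
depth, h, w = 5, 26, 26
pre_activation = np.random.randn(depth, h, w)

# (a) One scalar per output channel, broadcast over the spatial dims.
b_scalar = np.random.randn(depth, 1, 1)
out_a = pre_activation + b_scalar

# (b) A full bias array the same shape as the output, as in the video.
# More parameters, but it contains (a) as the special case where all
# entries within a channel are equal.
b_full = np.random.randn(depth, h, w)
out_b = pre_activation + b_full
```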

  • @jvnganesh4943
    @jvnganesh4943 1 month ago

    Really found this video very interesting and informative. I appreciate it a lot. Thanks!

  • @black-sci
    @black-sci 1 month ago

    In TensorFlow they use a weight matrix W of dimensions i × j and then take the transpose in the calculation.
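The two layouts are just transposed bookkeeping; a small sketch with illustrative shapes:

```python
import numpy as np

# Two equivalent dense-layer conventions (shapes illustrative).
i, j = 3, 4                       # input size, output size
x_col = np.random.randn(i, 1)     # column vector, as in the series

# Series convention: W has shape (j, i) and y = W @ x.
W_series = np.random.randn(j, i)
y1 = W_series @ x_col

# TensorFlow-style convention: the kernel has shape (i, j) and acts on
# row vectors, y = x @ W. Same numbers, transposed bookkeeping.
W_tf = W_series.T
y2 = (x_col.T @ W_tf).T

assert np.allclose(y1, y2)
```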

  • @black-sci
    @black-sci 1 month ago

    Can you also make a video on cross-entropy loss?

  • @jamieabw4517
    @jamieabw4517 1 month ago

    Thank you so much for all of these videos. The explanations are genuinely incredible and make it so simple to understand. Are you ever going to return and continue making these videos?

  • @orilio3311
    @orilio3311 1 month ago

    I love the 3b1b style of animation and also the consistency with his notation; this allows people to learn the material from multiple explanations without losing track of the core ideas. Awesome work, man.

  • @wilfredomartel7781
    @wilfredomartel7781 1 month ago

    🎉

  • @s8x.
    @s8x. 1 month ago

    You can go a layer deeper and just use numpy to build it from scratch.

  • @doolfdrahcir5492
    @doolfdrahcir5492 1 month ago

    I hate it when I can't hear the lesson at full volume... did you drop the mic?

  • @naheedray
    @naheedray 1 month ago

    This is the best video I have seen so far ❤

  • @dcrespin
    @dcrespin 2 months ago

    Are you aware of the following articles by Daniel Crespin? There you will find explicit global formulas for backpropagation and some closely related topics: "Generalized Backpropagation", "Matrix Formulas for Semilinear Backpropagation", "Theory and Formulas for Backpropagation in Hilbert Spaces", "Neural Network Formalism", "A Primer on Perceptrons". They are easy to find: Google the author together with the article names. Cordial regards, Daniel Crespin

  • @jameshopkins3541
    @jameshopkins3541 2 months ago

    DO NOT DO THIS UNUSEFUL VIDS PLEASE THE WORD CROSS IS DISGUSTING.

  • @AynurErmis-vp9lq
    @AynurErmis-vp9lq 2 months ago

    BEST OF BEST THANK YOU

  • @macsiaproduction7823
    @macsiaproduction7823 2 months ago

    "encourage you to keep going" :)

  • @macsiaproduction7823
    @macsiaproduction7823 2 months ago

    Thank you for the really great explanation! I wish you would make even more 😉

  • @petevenuti7355
    @petevenuti7355 2 months ago

    Do you have a Discord?

  • @kalagecko77
    @kalagecko77 2 months ago

    Amazing video again! Unfortunately I am facing some problems with the code:

    File "../lib/python3.12/site-packages/keras/src/utils/tree.py", line 12, in <module>
    from tensorflow.python.trackable.data_structures import ListWrapper
    ModuleNotFoundError: No module named 'tensorflow'

    After installing tensorflow and running the script (Python 3.12.3 [GCC 11.4.0] on linux):

    Process finished with exit code 132 (interrupted by signal 4: SIGILL)

    I am running it on an old laptop: Linux kernel 5.15.0-102-generic, Intel© Pentium© CPU N3540 @ 2.16GHz × 4, 7.6 GiB memory. Do you have any advice on what could fix this issue? I am really looking forward to debugging the script to get a deeper understanding of it. Thanks a lot!

  • @Bapll
    @Bapll 2 months ago

    Thanks for making this. You're really good at doing what you do.

  • @kalagecko77
    @kalagecko77 2 months ago

    Thank you very much for your video. It is just amazing how simply you have explained every step! After grabbing the code from GitHub, I am not getting the right results: after plotting the graphic, I get all points with a z value of approximately 1. Do you have any hint? Additionally, the error does not converge; the minimum error I got was 0.5, and it oscillates all the time. Thank you in advance!

    • @kalagecko77
      @kalagecko77 2 months ago

      Thank you! I found the errors in my code. Now it works perfectly.

  • @chetanchoudhary485
    @chetanchoudhary485 2 months ago

    Why did you stop uploading videos? Isn't it your responsibility to complete what you've started? 🥲 Your content is what many of us are looking for, so keep it up, bro 🤜🤛

  • @babakshiri7270
    @babakshiri7270 2 months ago

    Thanks, this video was perfect, and it also showed some of the beauty of mathematics.

  • @samtetruashvili930
    @samtetruashvili930 2 months ago

    This video is very useful, but it would be even better if it motivated the CNN somehow. In particular, I was not previously aware of the depth in the input and the output. It would be great to motivate why both of these shouldn't just be set to 1.

    • @independentcode
      @independentcode 2 months ago

      It depends on your data. Colored images are arrays of RGB values, i.e. h × w × 3, so there's depth in your input.

    • @samtetruashvili930
      @samtetruashvili930 2 months ago

      @@independentcode Thanks! Very good point. Is there a similar intuition for why you'd want various depths in the hidden layers? Do those typically end up having a depth of 3 too? Also, can you motivate having multiple convolution kernels (besides achieving the desired depths in the hidden layers)?

  • @OmkarKulkarni-wf7ug
    @OmkarKulkarni-wf7ug 2 months ago

    How is the output gradient calculated and passed into the backward function?

  • @2255.
    @2255. 2 months ago

    Can someone explain why we use Euler's number and not something else, like phi?

  • @ramincybran
    @ramincybran 2 months ago

    Without any doubt the best explanation of NNs I've ever seen. Why did you stop producing, my friend?

  • @SoftwareDeveloper2217
    @SoftwareDeveloper2217 2 months ago

    It is the best. It is beautiful, because the explanation is great.

  • @xmlkb
    @xmlkb 3 months ago

    thanks man

  • @black-sci
    @black-sci 3 months ago

    Best video, very clear-cut. I finally got backpropagation and the derivatives.

  • @user-qs8dz2tr2d
    @user-qs8dz2tr2d 3 months ago

    Interesting content + sleepy background music... I fell asleep.

  • @xiangqi_in_russia
    @xiangqi_in_russia 3 months ago

    3b1b video format & an amazing, calming voice. OMG, you are a treasure.

  • @amjadiqbal478
    @amjadiqbal478 3 months ago

    Matrix multiplication is done rows × columns, but you have done rows × rows?

  • @noone7692
    @noone7692 3 months ago

    Can you create new videos?

  • @LoongBerries
    @LoongBerries 3 months ago

    Why didn't you inherit from the Activation class?

    • @independentcode
      @independentcode 3 months ago

      That's because the Activation class takes in a function that will be applied to each input element individually: y_i=f(x_i). In the case of Softmax, each output depends on all the inputs, so the backpropagation works out differently.
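For contrast, here is a hedged sketch of the elementwise pattern the Activation class relies on, which is exactly what Softmax cannot reuse:

```python
import numpy as np

# Hedged sketch of the elementwise Activation pattern from the series:
# y_i = f(x_i), so the backward pass is just f'(x) * output_gradient,
# elementwise. Softmax breaks this because each y_i depends on every x_j.
class Activation:
    def __init__(self, activation, activation_prime):
        self.activation = activation
        self.activation_prime = activation_prime

    def forward(self, input):
        self.input = input
        return self.activation(self.input)

    def backward(self, output_gradient, learning_rate):
        return np.multiply(output_gradient, self.activation_prime(self.input))

# Example: Tanh as an elementwise activation.
tanh = Activation(np.tanh, lambda x: 1 - np.tanh(x) ** 2)
```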

  • @magnusbrzenk447
    @magnusbrzenk447 3 months ago

    Just take my money!

  • @alfredovalerijlaino5284
    @alfredovalerijlaino5284 4 months ago

    In 1 word: "GOD"

    • @kornflakesss
      @kornflakesss 18 days ago

      the term God is way too watered down if you were to define it this way.

  • @khueminh6970
    @khueminh6970 4 months ago

    Can you make a video on Gen AI, please?

  • @vanshajchadha7612
    @vanshajchadha7612 4 months ago

    This is one of the best videos for really understanding the vectorized form of neural networks! I really appreciate the effort you've put into this. Just as a clarification: the video considers only 1 data point at a time, thereby performing SGD, so during the MSE calculation Y and Y* are in a way depicting multiple responses for 1 data point only, right? So should the MSE not actually be using np.mean to aggregate them?
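As the comment suspects, for a single data point the mean runs over the components of the output vector, not over samples. Here is a hedged sketch of that single-sample MSE; note that dividing by the output size only rescales the gradient by a constant, which the learning rate can absorb.

```python
import numpy as np

# Hedged sketch of a single-sample MSE in the series' style: the mean
# runs over the components of one output vector, not over data points.
def mse(y_true, y_pred):
    return np.mean(np.power(y_true - y_pred, 2))

def mse_prime(y_true, y_pred):
    # Gradient of the mean: the 1/n factor is a constant rescaling.
    return 2 * (y_pred - y_true) / np.size(y_true)
```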

  • @alexandrek.6024
    @alexandrek.6024 4 months ago

    Awesome video. I just have a question: why do we have a different kernel per input (why K11 != K1N)? It would seem more logical to have the same kernel for each input, because when we want to predict on an image, how do we know which kernel we should take?

    • @independentcode
      @independentcode 4 months ago

      Just to clarify, at 6:27, X1, X2, X3 are different channels in the input (r,g,b for example). So K11 to K1N is a single kernel, applied to the input.

    • @alexandrek.6024
      @alexandrek.6024 4 months ago

      Or can each channel of the input have a different kernel?
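Putting the reply above in code: a hedged sketch of a forward pass in which each output channel d owns one kernel with a slice per input channel, and the per-channel cross-correlations are summed into a single feature map (scipy's correlate2d stands in for whatever the video uses).

```python
import numpy as np
from scipy import signal

# Hedged sketch: each output channel d has ONE kernel with a slice per
# input channel; the per-channel cross-correlations are summed.
def conv_forward(input, kernels, biases):
    # input:   (in_channels, h, w)
    # kernels: (out_channels, in_channels, kh, kw)
    # biases:  (out_channels, h - kh + 1, w - kw + 1), full-size as in the video
    output = np.copy(biases)
    for d in range(kernels.shape[0]):
        for c in range(input.shape[0]):
            output[d] += signal.correlate2d(input[c], kernels[d, c], "valid")
    return output
```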

  • @vilmospalik1480
    @vilmospalik1480 4 months ago

    This is a great video, thank you so much.

  • @AB51002
    @AB51002 4 months ago

    This helped me a lot! Lots of projects out there just use the PyTorch API for the most common functionalities, which makes things harder for people who want to change the code to implement a research idea or just experiment in general. Thank you so much for your effort, and for taking the time to make these videos!

  • @mohammed_amine_7173
    @mohammed_amine_7173 4 months ago

    Just amazing work ❤