A Review of 10 Most Popular Activation Functions in Neural Networks

  • Published 19 Oct 2024

COMMENTS • 20

  • @tomorourke6301 • 1 year ago • +10

    Extremely helpful, thank you very much!

    • @PyMLstudio • 1 year ago • +1

      Thanks Tom for your nice words, that’s very encouraging!

  • @harriehausenman8623 • 1 year ago • +6

    Thanks for the video! Great overview & refresher 🤗
    I appreciate the calm, slow and clear voice.

    • @PyMLstudio • 1 year ago • +2

      Thanks for the comment. That’s very encouraging to hear!

  • @soyoltoi • 1 year ago • +6

    Great video! Some points about English to help you improve:
    You should say "how it looks" instead of "how does it look like". If you want to include the "like", you can say "what it looks like" instead.

    • @PyMLstudio • 1 year ago • +4

      Thanks, that’s a good point! I’ll try to remember that for my next video.

  • @dashnarayana • 1 year ago • +3

    Great video. I would recommend it to plus-two level students, who can see how basic calculus is used in AI at a later stage. Besides, these would serve as good exercises.

  • @phuonglethithanh8498 • 1 year ago • +1

    Thank you for this video ❤

  • @FelheartX • 1 year ago • +8

    ReLU, Leaky ReLU and Swish seem to be an evolution.
    The issue with ReLU was that it leaves dead neurons. Then the issue with Leaky ReLU was that its derivative is discontinuous at zero. And Swish finally fixed both.
    Are ReLU and Leaky ReLU still useful for anything?
    Also, why is GELU used for language models? Why does GELU work better there than other activation functions?

    • @PyMLstudio • 1 year ago • +5

      Thank you for your question. Indeed, ReLU, LeakyReLU and Swish form an evolution. It is true that ReLU suffers from dead neurons, but ReLU and variants such as LeakyReLU are still used in ANNs, especially in computer vision tasks like image segmentation. One advantage of ReLU is its simple and efficient computation.
      As for GELU, some of its properties make it suitable for more complex tasks like NLP. For example, its non-monotonic behavior allows the network to capture more complex patterns in text data.
      Having said that, the choice of activation function heavily depends on the data and the task, and one should experiment with different activation functions to find the best one for a given task.
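
      For readers who want to compare the functions mentioned in this thread directly, here is a minimal NumPy sketch. The Swish below uses beta = 1 (the SiLU form) and the GELU uses the common tanh approximation; both choices are assumptions for illustration, not details taken from the video.

      ```python
      import numpy as np

      def relu(x):
          # ReLU: max(0, x); the zero gradient for x < 0 is what causes "dead" neurons.
          return np.maximum(0.0, x)

      def leaky_relu(x, negative_slope=0.01):
          # Leaky ReLU: a small linear slope for x < 0 instead of a hard zero.
          return np.where(x >= 0.0, x, negative_slope * x)

      def swish(x):
          # Swish / SiLU with beta = 1: x * sigmoid(x), written via tanh for numerical stability.
          return x * 0.5 * (1.0 + np.tanh(0.5 * x))

      def gelu(x):
          # GELU, tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3))).
          return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

      if __name__ == "__main__":
          xs = np.linspace(-3.0, 3.0, 7)
          for name, fn in [("ReLU", relu), ("LeakyReLU", leaky_relu), ("Swish", swish), ("GELU", gelu)]:
              print(name, np.round(fn(xs), 3))
      ```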

  • @SolathPrime • 5 months ago

    I have my own activation function that I use; it's a Softplus-like function.
    It's the integral of (1 + tanh(x))/2, which looks like a Sigmoid, except it's faster in training.
    Its integral is the equation (x + ln(cosh(x)))/2, which I call "Rectified Integral Tangent Hyperbolic", or RITH for short.
    It's mostly linear for x ≥ 1, which makes it fast in training.
    I added the term 1/e to center it between 0 and positive infinity.
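
    As a reference, here is a minimal NumPy sketch of the formula as written, (x + ln(cosh(x)))/2. The names rith and rith_grad are made up for this sketch, and the 1/e centering term mentioned above is left out because its placement isn't specified; ln(cosh(x)) is rewritten via logaddexp purely to avoid overflow.

    ```python
    import numpy as np

    def rith(x):
        # "RITH" as described above: (x + ln(cosh(x))) / 2.
        # ln(cosh(x)) is computed as logaddexp(x, -x) - ln(2), the same quantity
        # without overflow for large |x|.
        x = np.asarray(x, dtype=float)
        return 0.5 * (x + np.logaddexp(x, -x) - np.log(2.0))

    def rith_grad(x):
        # Its derivative is the integrand the comment starts from: (1 + tanh(x)) / 2.
        return 0.5 * (1.0 + np.tanh(np.asarray(x, dtype=float)))

    if __name__ == "__main__":
        xs = np.linspace(-5.0, 5.0, 5)
        print(np.round(rith(xs), 4))       # tends to -ln(2)/2 as x -> -inf, ~ x - ln(2)/2 for large x
        print(np.round(rith_grad(xs), 4))  # sigmoid-shaped, between 0 and 1
    ```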

  • @Sathishreddy-fe3cp • 15 days ago

    I want to practice all the optimizers with different activation functions, with some math problems and in Python. Could you please suggest a good book?

  • @ThePhysicsTrain • 1 year ago • +5

    Animations are cool! Have you used ManimCE or ManimGL?

    • @PyMLstudio • 1 year ago • +1

      Thanks for the comment 😊 I have used ManimCE.
      So far I haven't played with ManimGL, but I will check it out and see if it's worth switching.
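
      For anyone curious about the tooling, here is a minimal ManimCE sketch of the kind of curve animation used in such videos; it is written against the current Community Edition API (Axes.plot, Create) and is only an illustration, not the channel's actual source.

      ```python
      from manim import Scene, Axes, Create, BLUE

      class ReLUPlot(Scene):
          def construct(self):
              # A coordinate system and a ReLU curve drawn on it.
              axes = Axes(x_range=[-4, 4, 1], y_range=[-1, 4, 1])
              graph = axes.plot(lambda x: max(0.0, x), color=BLUE)
              # Animate the axes first, then trace the curve.
              self.play(Create(axes))
              self.play(Create(graph))
      ```

      This would be rendered with something like `manim -pql relu_plot.py ReLUPlot`, where the file name is arbitrary.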

  • @jiahao2709 • 3 months ago

    May I know how you make these videos?

  • @Saurabhmaths1999 • 1 year ago • +2

    Love from India

  • @kies9416 • 1 year ago • +5

    Cool

  • @hipphipphurra77 • 1 year ago

    Wrong!
    The derivative of the ELU is a perfectly continuous function everywhere, even at 0.
    ua-cam.com/video/56ZxEmGRt2k/v-deo.html

    • @PyMLstudio • 1 year ago • +3

      Thanks for the comment, but that depends on the value of alpha.
      As I mentioned in the video, if alpha = 1, the derivative of ELU is continuous (the plotted curve also corresponds to alpha = 1).
      But if alpha != 1, the derivative is a discontinuous function.
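
      For reference, the standard ELU definition and its derivative make the alpha dependence explicit:

      ```latex
      % ELU and its derivative (standard definition)
      \mathrm{ELU}(x) =
      \begin{cases}
      x, & x > 0 \\
      \alpha \left(e^{x} - 1\right), & x \le 0
      \end{cases}
      \qquad
      \frac{d}{dx}\,\mathrm{ELU}(x) =
      \begin{cases}
      1, & x > 0 \\
      \alpha\, e^{x}, & x < 0
      \end{cases}

      % Approaching zero from each side:
      \lim_{x \to 0^{-}} \frac{d}{dx}\,\mathrm{ELU}(x) = \alpha,
      \qquad
      \lim_{x \to 0^{+}} \frac{d}{dx}\,\mathrm{ELU}(x) = 1
      ```

      So the derivative is continuous at 0 exactly when alpha = 1, which is the point made in the reply above.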