Why do we need Cross Entropy Loss? (Visualized)

  • Published 3 Feb 2025

COMMENTS • 136

  • @kvnptl4400
    @kvnptl4400 1 year ago +2

    This is exactly the video you need for the "Cross Entropy Loss" keyword. Straight to the point

  • @manfredmichael_3ia097
    @manfredmichael_3ia097 4 years ago +41

    Best explanation I could find! This channel is gonna be big.

  • @mazharmumbaiwala9244
    @mazharmumbaiwala9244 2 years ago +7

    It was really great pointing out that it's the gradient that matters more than the actual loss value.
    Great video, keep it up
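    To make the point about gradients concrete, here is a minimal NumPy sketch (the prediction value is hypothetical) comparing the corrective signal that binary cross-entropy and squared error send back for a confidently wrong prediction:

    ```python
    import numpy as np

    p = 1.0        # ground-truth probability
    p_hat = 0.01   # hypothetical, confidently wrong prediction

    # Gradient of BCE, -log(p_hat), w.r.t. p_hat: -1/p_hat (unbounded near 0)
    bce_grad = -1.0 / p_hat

    # Gradient of squared error, (p_hat - p)**2, w.r.t. p_hat: bounded
    mse_grad = 2.0 * (p_hat - p)

    print(bce_grad)  # -100.0 -> strong push away from the wrong answer
    print(mse_grad)  # -1.98  -> comparatively weak push
    ```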

  • @sunitgautam7547
    @sunitgautam7547 4 years ago +11

    The video production quality along with the explanation is really nice. Keep making such great content, your channel is bound to gain traction very rapidly. :)

    • @NormalizedNerd
      @NormalizedNerd  4 years ago

      Thanks man...Another interesting video is on the way!

  • @lelouchlamperouge5910
    @lelouchlamperouge5910 4 months ago

    Best video to understand cross entropy out there, I was struggling a bit until I found this one.

  • @mojtabazare3606
    @mojtabazare3606 2 years ago +1

    I love the math, calculus, and all the visualizations in this video. Great job

  • @eclypze_
    @eclypze_ 1 year ago

    Amazing explanation! just what I was looking for :)
    great job man!

  • @sorvex9
    @sorvex9 4 years ago +7

    I am a big fan of how you remove all the unnecessary steps in the formulas in order to explain it as simply as possible. Very nice!

    • @NormalizedNerd
      @NormalizedNerd  4 years ago

      Thanks! Yeah, this way you get to know the essence.

    • @_sonu_
      @_sonu_ 1 year ago +1

      No, those steps are too important

  • @TheQuantumPotato
    @TheQuantumPotato 4 years ago +4

    Your explanations are great! Thanks for the vids!

    • @NormalizedNerd
      @NormalizedNerd  4 years ago +1

      Thanks for watching!

    • @TheQuantumPotato
      @TheQuantumPotato 3 years ago

      @@NormalizedNerd I just wanted to come back and let you know - I got a distinction in my MSc (I did my thesis on GANs for tabular data) and your vids were a huge factor in helping me achieve this! So thank you!

    • @NormalizedNerd
      @NormalizedNerd  3 years ago +1

      @@TheQuantumPotato Your comment just made my day! Best wishes for your future endeavors 😊

    • @TheQuantumPotato
      @TheQuantumPotato 3 years ago

      @@NormalizedNerd I am currently working on a medical computer vision project - so it’s all going well! Thanks again, I look forward to watching more of your vids

  • @shaktijain8560
    @shaktijain8560 3 years ago

    Undoubtedly, the best explanation of cross-entropy loss I found on YouTube.

  • @yiqian2977
    @yiqian2977 4 years ago +2

    Thanks for this helpful video! This delivers a clear visual explanation that my professor didn't give

  • @angelinagokhale9309
    @angelinagokhale9309 3 years ago +1

    Very nicely explained! Your video helped me a lot in my classroom discussion today. Thank you very much.

    • @NormalizedNerd
      @NormalizedNerd  3 years ago +1

      Really glad to hear that!

    • @angelinagokhale9309
      @angelinagokhale9309 3 years ago

      @@NormalizedNerd And my students enjoyed that explanation. I'll surely share your channel link with them.

    • @NormalizedNerd
      @NormalizedNerd  3 years ago +2

      @@angelinagokhale9309 Omg! I thought you were attending the class as a student! Really happy to see other educators appreciating the videos :)

    • @angelinagokhale9309
      @angelinagokhale9309 3 years ago

      @@NormalizedNerd And well, as a teacher (or better still a facilitator) of the subject, I am a student first. There is just so much to keep learning! And I enjoy it :)

  • @UCSAdityaKiranPal
    @UCSAdityaKiranPal 1 year ago

    You took this explanation to the next level man! Great analysis

  • @mautkajuari
    @mautkajuari 1 year ago

    Amazing video, totally understood the concept

  • @augurelite
    @augurelite 4 months ago

    Extremely helpful. Thank you

  • @whenmathsmeetcoding1836
    @whenmathsmeetcoding1836 2 years ago

    Loved it, thanks for making this video

  • @majidlotfi4622
    @majidlotfi4622 3 years ago

    Best explanation I could find!

  • @ant1fact
    @ant1fact 3 years ago

    Even as a mathematically handicapped person I can understand it now fully. Bravo!

  • @taewookkang2034
    @taewookkang2034 4 years ago +2

    This is such an amazing video! Thanks!

  • @TejasPatil-fz6bo
    @TejasPatil-fz6bo 3 years ago

    Nicely explained... I was struggling to decode it

  • @ahmed0thman
    @ahmed0thman 3 years ago

    The neatest and clearest explanation I've ever found.
    Thanks a lot ❤.

  • @balltomessi8515
    @balltomessi8515 4 months ago

    Great explanation 🫡

  • @govindnarasimman6819
    @govindnarasimman6819 2 years ago

    In regression you can also use CE, if you can quantize the target into, say, n classes. Otherwise, if you are interested in MAPE, you can use MAPE loss or (1+log) compression.
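    A minimal sketch of that quantization idea, assuming NumPy (the targets and bin edges are hypothetical): once the continuous target is binned, the integer labels can be fed to any cross-entropy classifier:

    ```python
    import numpy as np

    y = np.array([0.13, 2.7, 5.4, 9.9])   # hypothetical continuous targets
    edges = np.linspace(0.0, 10.0, 11)    # 10 equal-width classes over [0, 10]

    # Quantize: each target becomes the index of the bin it falls into
    labels = np.digitize(y, edges[1:-1])  # integer class labels in {0, ..., 9}
    print(labels)                         # [0 2 5 9]
    ```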

  • @veganath
    @veganath 1 year ago

    Thank you, your explanation was perfect....

  • @TheRohit901
    @TheRohit901 2 years ago

    awesome video, thanks for explaining it so well! keep it up.

  • @lancelotdsouza4705
    @lancelotdsouza4705 3 years ago

    very good explanation

  • @pedramm.haqiqi1022
    @pedramm.haqiqi1022 2 years ago

    Amazing explanation

  • @wodkawhatelse8781
    @wodkawhatelse8781 3 years ago

    Very good explanation, you got a new subscriber

  • @hanchen2355
    @hanchen2355 4 years ago

    Your explanation helps me a lot!

  • @HeduAI
    @HeduAI 3 years ago

    Such an awesome explanation! Thanks!

  • @vikashbhagat6867
    @vikashbhagat6867 3 years ago

    Very beautiful video, I liked it a lot. Keep it up :)

  • @nimishamanjaly1048
    @nimishamanjaly1048 4 years ago +1

    great explanation!

  • @Eta_Carinae__
    @Eta_Carinae__ 1 year ago

    Another way of looking at it is that L2 is already way too punishing where outliers are concerned; hence we use L1, so cross-entropy is likely to exacerbate the issues already found in L2.
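    A tiny numerical illustration of that point, assuming NumPy (the residuals are hypothetical): the L2 gradient grows with the residual, so one outlier dominates the update, while the L1 gradient stays bounded:

    ```python
    import numpy as np

    residuals = np.array([0.1, 0.2, 8.0])  # hypothetical; the last one is an outlier

    l2_grads = 2 * residuals               # d/dr of r**2: scales with the outlier
    l1_grads = np.sign(residuals)          # d/dr of |r|: bounded at +/-1

    print(l2_grads)  # [ 0.2  0.4 16. ]
    print(l1_grads)  # [1. 1. 1.]
    ```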

  • @coralexbadea
    @coralexbadea 4 years ago

    Really good explanation, good job!

  • @juliokaro
    @juliokaro 11 months ago

    Great video. Thanks!

  • @wedenigt
    @wedenigt 4 years ago

    Awesome explanation - keep up the good work 👍

  • @somritasarkar6608
    @somritasarkar6608 3 years ago

    great explanation

  • @sujitha3335
    @sujitha3335 3 years ago

    wow, great explanation

  • @varunahlawat9013
    @varunahlawat9013 2 years ago

    There is a point that I felt was missing. I've read on websites that the cross-entropy function helps reach the global minimum quicker.

  • @christiaanpretorius05
    @christiaanpretorius05 4 years ago

    Nice video and visualisation

  • @mathimagik
    @mathimagik 4 years ago

    Wow very much helpful
    Thank you for your graphical presentation

  • @willliu1306
    @willliu1306 3 years ago

    Thanks for sharing your insights ~

  • @usmannadeem18
    @usmannadeem18 3 years ago

    Awesome explanation! Thank you!

  • @EduardoAdameSalles
    @EduardoAdameSalles 4 years ago +1

    Congrats from Brazil!

  • @studyaccount9662
    @studyaccount9662 3 years ago

    Brilliant insight thank you so so much!!!

  • @halihammer
    @halihammer 2 years ago

    Ok great, but what if the number in the log is zero? For example when the ground truth is 1 but my model predicts zero? I'm having trouble understanding this. I'm trying to make an XOR multilayer perceptron, but 1, 0 are not good inputs. If an input is zero, the weight update for the corresponding weight is impossible. I tried -1 and 1 as inputs and labels, but then the loss function is not working. I'm using the sigmoid activation function and have two hidden and one output neuron, but it does not work. Maaaaan, I'm going crazy with this ML stuff
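    The usual fix for the log(0) issue raised here is to clamp predictions away from 0 and 1 by a small epsilon before taking the log; a minimal sketch, assuming NumPy:

    ```python
    import numpy as np

    def bce(y, p_hat, eps=1e-7):
        # Clamp predictions into [eps, 1 - eps] so log never receives 0
        p_hat = np.clip(p_hat, eps, 1.0 - eps)
        return -np.mean(y * np.log(p_hat) + (1 - y) * np.log(1 - p_hat))

    # Ground truth 1, model predicts exactly 0: a large but finite loss
    print(bce(np.array([1.0]), np.array([0.0])))  # ~16.12 instead of inf
    ```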

  • @anirudhbhattacharya1749
    @anirudhbhattacharya1749 4 years ago +1

    Is there any difference between BCE & weighted cross entropy loss function?

    • @NormalizedNerd
      @NormalizedNerd  4 years ago

      Yes. In the second one there's an extra weight term; the value of the weight is different for each class.
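      A minimal sketch of that difference, assuming NumPy (the class weights are hypothetical); plain BCE is the special case where both weights are 1:

      ```python
      import numpy as np

      def weighted_bce(y, p_hat, w_pos=5.0, w_neg=1.0, eps=1e-7):
          # Each class term carries its own weight; w_pos = w_neg = 1 gives plain BCE
          p_hat = np.clip(p_hat, eps, 1.0 - eps)
          return -np.mean(w_pos * y * np.log(p_hat)
                          + w_neg * (1 - y) * np.log(1 - p_hat))
      ```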

  • @1paper1pen63
    @1paper1pen63 3 years ago

    How do we come up with this formula for binary cross-entropy loss? Is there any proof you can link to? It would be a great help

    • @NormalizedNerd
      @NormalizedNerd  3 years ago

      I did a video about it -> ua-cam.com/video/2PfGO753UHk/v-deo.html
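      For reference, the formula in question: for a ground-truth label $y \in \{0, 1\}$ and a predicted probability $\hat{y}$, the binary cross-entropy loss is

      $$L(y, \hat{y}) = -\left[\, y \log \hat{y} + (1 - y)\log(1 - \hat{y}) \,\right],$$

      which is the negative log-likelihood of a Bernoulli distribution; the linked video derives it from maximum likelihood.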

  • @irok1
    @irok1 3 years ago

    That's an amazing introduction.

  • @jdval3476
    @jdval3476 2 months ago

    I don't get why the growth rate of Cross Entropy is the "sweet spot". If in classification tasks having a very steep gradient is important even when the prediction is wrong by a small amount, why don't we just use a linear loss function with a very steep gradient (the gradient would be constant and high over the whole domain, not only when the prediction is far from the ground truth)? Otherwise, if what we want is a gradient that starts low and then increases fast (even faster than the parabolic MSE), why don't we use an exponential loss, or something that grows even faster than the n*log(n) of Cross Entropy?
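    One way to probe that question numerically (a sketch, assuming NumPy; the linear slope is a hypothetical choice): a steep linear loss keeps the same large gradient even when the prediction is already almost correct, so it overshoots near the optimum, whereas the cross-entropy gradient is gentle when the model is right and unbounded when it is badly wrong:

    ```python
    import numpy as np

    p_hat = np.array([0.99, 0.5, 0.01])       # almost right -> very wrong (label = 1)

    ce_grad = -1.0 / p_hat                    # cross-entropy: d/dp_hat of -log(p_hat)
    linear_grad = np.full_like(p_hat, -50.0)  # hypothetical steep linear loss: constant

    print(ce_grad)      # [  -1.01   -2.   -100.]  scales with how wrong the model is
    print(linear_grad)  # [-50. -50. -50.]          same push even when nearly correct
    ```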

  • @yawee12
    @yawee12 3 years ago

    Is the curvature of the gradient the only reason we prefer CEL over MSE? does this mean that MSE would work but just converge slower needing more data to train on?

    • @NormalizedNerd
      @NormalizedNerd  3 years ago

      Yes, the slope is an important point. There's another thing... CEL arises naturally if you solve the classification problem using the maximum likelihood method. More about that here: ua-cam.com/video/2PfGO753UHk/v-deo.html
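      A one-line sketch of that maximum-likelihood connection: for independent labels $y_i \in \{0,1\}$ with predicted probabilities $\hat{y}_i$, maximizing the Bernoulli likelihood is the same as minimizing its negative log,

      $$-\log \prod_i \hat{y}_i^{\,y_i}\,(1 - \hat{y}_i)^{\,1 - y_i} = -\sum_i \left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i) \,\right],$$

      which is exactly the summed binary cross-entropy loss.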

  • @mathewhobson
    @mathewhobson 5 months ago

    this made sense! thank you

  • @danielsvendsen8808
    @danielsvendsen8808 3 years ago

    Awesome video, thank you very much! :)

  • @yorranreurich596
    @yorranreurich596 4 years ago

    Great explanation, thank you :)

  • @PoJenLai
    @PoJenLai 4 years ago

    Well explained, nice!

  • @firstkaransingh
    @firstkaransingh 2 years ago

    Excellent 👍

  • @yashdeshmukh4404
    @yashdeshmukh4404 3 years ago

    How do you make such animations? What software do you use?

    • @lusvd
      @lusvd 2 years ago

      It's called Manim
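      Manim is the Python animation engine behind 3Blue1Brown's videos. A minimal sketch of a scene, assuming the community edition (`pip install manim`; the class name and plotted curve are hypothetical):

      ```python
      # Render with: manim -pql scene.py CrossEntropyCurve
      import numpy as np
      from manim import Axes, Create, Scene

      class CrossEntropyCurve(Scene):
          def construct(self):
              axes = Axes(x_range=[0.01, 1, 0.2], y_range=[0, 5, 1])
              # Plot -log(p), the per-sample cross-entropy for a positive label
              curve = axes.plot(lambda p: -np.log(p), x_range=[0.01, 1])
              self.play(Create(axes), Create(curve))
      ```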

  • @severlight
    @severlight 3 years ago

    Just on point! Thanks

  • @ionut.666
    @ionut.666 3 years ago

    Nice video! It is possible to train a neural network for a classification task using MSE. For binary classification, we can use two output neurons, one for each class, and train the network with the MSE loss. To compute the classification accuracy, you can do the same as in the cross-entropy case: the predicted class is the index of the largest logit in the output layer. Any idea why this works?
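    A minimal sketch of the setup described above, assuming NumPy (all array values are hypothetical): MSE against one-hot targets still orders the outputs, so taking the argmax recovers a sensible class:

    ```python
    import numpy as np

    targets = np.array([[1.0, 0.0],     # one-hot labels, one output neuron per class
                        [0.0, 1.0]])
    outputs = np.array([[0.8, 0.3],     # hypothetical network outputs
                        [0.2, 0.6]])

    mse = np.mean((outputs - targets) ** 2)  # trainable loss, as in regression
    preds = np.argmax(outputs, axis=1)       # predicted class = largest output

    print(mse)    # 0.0825
    print(preds)  # [0 1]
    ```

    Plausibly this works because MSE on one-hot targets still pushes the correct output toward 1 and the others toward 0; per the video's argument, it just supplies weaker gradients than cross-entropy when the model is confidently wrong.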

  • @tobiasweingarten2737
    @tobiasweingarten2737 3 years ago +1

    What a great explanation, thank you!
    One question: Don't we want the derivative to be zero when the model performs as well as it possibly could, and that is when p always equals p̂? Using binary cross-entropy, we have derivatives of +1 and -1 of the loss function at the intersections with the x-axis...

    • @NormalizedNerd
      @NormalizedNerd  3 years ago

      Ideally yes, but for functions with log terms, it's not possible to achieve a derivative of 0, right?
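      Spelling out the derivative behind this exchange: with $L(\hat{p}) = -[\,p \log \hat{p} + (1-p)\log(1-\hat{p})\,]$,

      $$\frac{\partial L}{\partial \hat{p}} = -\frac{p}{\hat{p}} + \frac{1-p}{1-\hat{p}},$$

      so for a hard label $p = 1$ the gradient is $-1/\hat{p}$, which tends to $-1$ as $\hat{p} \to 1$, and for $p = 0$ it is $1/(1-\hat{p})$, which tends to $+1$ as $\hat{p} \to 0$: exactly the nonvanishing $\pm 1$ derivatives the question points out.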

  • @jefferybenzos5879
    @jefferybenzos5879 3 years ago

    I thought this was a great video!
    Can you explain how this generalizes to multi-class classification problems or link me to a video where I can learn more?
    Thank you :)
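    For the multi-class case, the binary formula generalizes to a sum over classes; a minimal sketch, assuming NumPy and a hypothetical 3-class example:

    ```python
    import numpy as np

    y = np.array([0.0, 1.0, 0.0])      # one-hot ground truth: class 1
    p_hat = np.array([0.2, 0.7, 0.1])  # hypothetical predicted probabilities

    # Categorical cross-entropy: -sum_k y_k * log(p_hat_k)
    loss = -np.sum(y * np.log(p_hat))
    print(loss)  # ~0.357, i.e. -log(0.7)
    ```

    Only the true class's term survives, so the loss reduces to minus the log of the probability assigned to the correct class.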

  • @sENkyyy
    @sENkyyy 3 years ago

    very well done!

  • @Shaswatapal
    @Shaswatapal 7 months ago

    Excellent

  • @ИринаДьячкова-и5ф
    @ИринаДьячкова-и5ф 4 years ago +3

    Thanks, this was really helpful! Though I had to put it on 0.75 :D

  • @nidcxl4223
    @nidcxl4223 3 years ago

    great video man

  • @matteoocchini3119
    @matteoocchini3119 4 years ago +1

    Good video!

  • @SP-db6sh
    @SP-db6sh 3 years ago

    Speechless! Paid courses fail to deliver these concepts. Only an experienced data scientist can.

  • @anubhavsharma3642
    @anubhavsharma3642 3 years ago

    3Blue1Brown... but it's literally a brown guy in this case. Loved the videos man...

  • @cboniefbr
    @cboniefbr 4 years ago

    Great video!

  • @pumplove81
    @pumplove81 3 years ago

    Brilliant... also a delightful Bong accent :)

  • @patrickjane276
    @patrickjane276 2 years ago

    Subscribed just b/c of the YouTube channel name!

  • @TheFilipo2
    @TheFilipo2 4 years ago

    Superb!

  • @vinuvs4996
    @vinuvs4996 2 years ago

    very nice

  • @b97501063
    @b97501063 4 years ago

    Brilliant

  • @Niksonk
    @Niksonk 2 months ago

    Great!

  • @jamesang7861
    @jamesang7861 4 years ago

    Thank you!!!!

  • @basics7930
    @basics7930 4 years ago

    good

  • @sheldonsebastian7232
    @sheldonsebastian7232 3 years ago

    Noice Explanation

  • @erikpalacios7238
    @erikpalacios7238 4 years ago

    Thanks!!!!!!!

  • @popupexistence9253
    @popupexistence9253 4 years ago

    AMAZING!

  • @ArvindDevaraj1
    @ArvindDevaraj1 4 years ago

    nice

  • @earthpatel365
    @earthpatel365 4 years ago +1

    3Blue1Brown copy :/

  • @jocemarnicolodijunior2851
    @jocemarnicolodijunior2851 5 months ago

    Amazing explanation

  • @Ajeet-Yadav-IIITD
    @Ajeet-Yadav-IIITD 3 years ago

    Amazing explanation!! Thanks!

  • @rizwanhamidrandhawa8090
    @rizwanhamidrandhawa8090 4 years ago +1

    Awesome video!

  • @neoblackcyptron
    @neoblackcyptron 1 month ago

    Excellent