Gradient Descent From Scratch In Python

  • Published Oct 3, 2024

COMMENTS • 21

  • @vikasparuchuri
    @vikasparuchuri 1 year ago

    Hi everyone! The code and explanations behind this video are here - github.com/VikParuchuri/zero_to_gpt/blob/master/explanations/linreg.ipynb . You can also find all the lessons in this series here - github.com/VikParuchuri/zero_to_gpt .

  • @yujin8770
    @yujin8770 1 month ago

    I think the derivative of the loss function should be 160×(80w+11.99−y), instead of 2×(80w+11.99−y).
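
A quick numeric check of this claim (the 80 and 11.99 are the values quoted in the comment above; the target y below is an arbitrary placeholder, not from the video):

```python
x, b = 80.0, 11.99      # values quoted in the comment above
y = 100.0               # arbitrary target, just for the check
loss = lambda w: (x * w + b - y) ** 2

w = 0.5
eps = 1e-6
numeric_grad = (loss(w + eps) - loss(w - eps)) / (2 * eps)   # central-difference estimate
chain_rule_grad = 2 * x * (x * w + b - y)                    # = 160 * (80w + 11.99 - y)
print(numeric_grad, chain_rule_grad)  # the two values agree closely
```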

  • @fd2444
    @fd2444 1 year ago +2

    Is there a discord to discuss the projects on this channel?

  • @TheMISBlog
    @TheMISBlog 1 year ago

    Very useful, thanks!

  • @josuecurtonavarro8979
    @josuecurtonavarro8979 1 year ago

    Hi Vik! Thanks so much for the amazing work! Your content is always one of my best choices when it comes to learning data science and ML. I have a doubt though about the video at minute 40:56. You mention that in the init_params function, if we subtract 0.5 from the result of np.random.rand(), it would rescale the weights to the range -0.5 to 0.5. But wouldn't it just give us (randomly) some negative values (depending also on the chosen seed) whenever the values returned by np.random.rand() are less than 0.5? Thanks so much again and please keep on doing what you do! I've already come a long way thanks to all your work!

    • @Dataquestio
      @Dataquestio 1 year ago

      Thanks :) np.random.rand returns values from 0 to 1 by default, so subtracting .5 shifts that range to -.5 to .5.
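
A minimal sketch of what this reply describes:

```python
import numpy as np

np.random.seed(0)
weights = np.random.rand(5)   # uniform values in [0, 1)
shifted = weights - 0.5       # same values, now in [-0.5, 0.5)

print(weights)
print(shifted)   # roughly half of these end up negative, half positive
```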

  • @FootballIsLife00
    @FootballIsLife00 10 months ago

    Exactly at 19:44, you mention that the derivative of the loss function with respect to b is the same as the derivative with respect to w, but I don't think so, because:
    dL/db of ((wx+b) - y)^2 = 2((wx+b) - y)
    and
    dL/dw = 2x((wx+b) - y)
    Can anyone help me out?
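
For reference, a small sketch of both gradients being discussed, for the single-example squared error L = ((w*x + b) - y)^2 (variable names here are illustrative, not taken from the video's notebook):

```python
def gradients(w, b, x, y):
    """Gradients of L = ((w*x + b) - y)**2 for a single example."""
    error = (w * x + b) - y
    dL_dw = 2 * x * error   # chain rule brings down the inner derivative, x
    dL_db = 2 * error       # the inner derivative w.r.t. b is 1, so no x factor
    return dL_dw, dL_db

print(gradients(w=0.1, b=0.0, x=80.0, y=100.0))
```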

  • @hussainsalih3520
    @hussainsalih3520 1 year ago

    amazing

  • @hchattaway
    @hchattaway 1 year ago +1

    Please keep doing these, they are really excellent!

  • @rjchen83
    @rjchen83 1 year ago +1

    Thanks for the tutorial! Could you also add access to the data 'clean_weather.csv'?

    • @Dataquestio
      @Dataquestio 1 year ago +1

      You should be able to download the file here - drive.google.com/file/d/1O_uOTvMJb2FkUK7rB6lMqpPQqiAdLXNL/view?usp=share_link

    • @AI_BotBuilder
      @AI_BotBuilder 1 year ago +3

      @Seekersbay Learn to say that politely rather than as a command when the man's actually putting content out there for everyone. Replace your 'should' with 'could' and a 'please'; it changes the tone a lot, Ser…

  • @AvinashChandrashukla
    @AvinashChandrashukla 7 months ago

    What I need is a step-by-step business analyst path.

  • @arielgarcia3184
    @arielgarcia3184 2 months ago

    👌 excellent

  • @envision6556
    @envision6556 1 year ago

    love your work, so clear

  • @anfedoro
    @anfedoro 1 year ago

    Finally I have managed to implement gradient descent for linear regression myself :-).. almost without looking back at Vik's notebook. I can now consider that I understand how it works and all the underlying math. Just curious: why are my final weights and bias very different compared to what sklearn calculates? I plotted all three - the original test labels, the predictions calculated with my own procedure, and those calculated with sklearn - and I can see that mine is less accurate than sklearn's. Why could that be?

    • @Dataquestio
      @Dataquestio 1 year ago +1

      Congrats on implementing it yourself! Scikit-learn doesn't use gradient descent to calculate the coefficients (I believe they use analytical solutions in most cases). This would lead to a different solution.
      Even when using gradient descent, it is possible to use better initializations or optimizers (i.e., don't use SGD).
      I would only be concerned if your error is significantly higher (say more than 50% higher), or your gradient descent iterations aren't improving over time.
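
A rough way to see the difference described in this reply, on a made-up one-feature dataset (the data and hyperparameters below are illustrative, not from the video): scikit-learn's LinearRegression solves for the coefficients directly, while the loop runs plain gradient descent on the MSE.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * x[:, 0] + 2.0 + rng.normal(0, 1, size=200)

# Direct (non-iterative) fit via scikit-learn
model = LinearRegression().fit(x, y)

# Plain gradient descent on the MSE
w, b, lr = 0.0, 0.0, 1e-3
for _ in range(20_000):
    error = (w * x[:, 0] + b) - y
    w -= lr * 2 * np.mean(error * x[:, 0])
    b -= lr * 2 * np.mean(error)

print(model.coef_[0], model.intercept_)  # scikit-learn's solution
print(w, b)                              # gradient descent lands close to it
```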

    • @anfedoro
      @anfedoro 1 year ago

      @@Dataquestio Thanks.. I played further with more iterations and got a better MAE than sklearn's. But as I understand it, this doesn't matter much due to possible overfitting.. right?

    • @anfedoro
      @anfedoro 1 year ago

      @@Dataquestio Playing further, I implemented a floating learning rate and got faster convergence as well as a far better MSE :-)
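
One way to read "floating learning rate" is a step size that shrinks over iterations; the inverse-time schedule and decay constant below are just one illustrative choice, not necessarily what the commenter used.

```python
def decayed_lr(initial_lr, iteration, decay=1e-3):
    """Inverse-time decay: the step size shrinks as training progresses."""
    return initial_lr / (1.0 + decay * iteration)

# Example step sizes at a few points during training
for i in (0, 1_000, 10_000, 100_000):
    print(i, decayed_lr(1e-2, i))
```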

  • @kurtisunknown
    @kurtisunknown 1 year ago

    This is the easy form of the gradient; what about when we have a more difficult form of the cost function?