I'm currently reading Python Machine Learning by Sebastian V.
I wish that all the code in this book was as clear as it is in your video.
Thank you for posting it.
Thank you! I am glad you found this video clear and useful. Please let me know if there are other topics you would like to see videos on
excellent absolutely we need more implementations of these algorithms
Thank you for watching! I'm glad you found this useful
Best explanation EVER!
SUBSCRIBED!!
Thank you for watching Rohan! I am glad you found the video useful
How is h(x_i) = theta^T x̄? Please explain. Thanks in advance
The formulation of the model is derived in detail in my Linear Regression video; see the explanation here: ua-cam.com/video/fkS3FkVAPWU/v-deo.html
@7:11 what do you mean by params[:,iteration] = params
Please explain this line
Thank you
Hi Kabilan, thanks for your question.
This line stores the value of the parameters at every iteration. Since there is more than one parameter, I use [:, iteration] to store a column of parameters. You can take a look at numpy array slicing for more information on that.
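As a minimal sketch of that slicing pattern (the array names and sizes here are hypothetical, not the exact ones from the video), storing the parameter vector as one column per iteration looks like:

```python
import numpy as np

num_params, num_iterations = 2, 5  # hypothetical sizes
params_history = np.zeros((num_params, num_iterations))
params = np.array([0.5, -1.0])  # current parameter vector

for iteration in range(num_iterations):
    # [:, iteration] selects all rows of column `iteration`,
    # so the whole parameter vector is stored as one column
    params_history[:, iteration] = params
    params = params * 0.9  # placeholder update step

print(params_history[:, 0])  # first stored column: [ 0.5 -1. ]
```

After the loop, each column of `params_history` is a snapshot of the parameters at that iteration, which is handy for plotting convergence.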
@10:35 is it not "params = params - alpha * gradient / num_samples"?
Only if you make the gradient = np.array([1.0, X]) * (y_hat - y). But that is not how I have defined the problem; please see the start of the video for the mathematical description.
I love the format !
Glad you enjoyed it! Thanks for watching
Hey, this was extremely useful 😍😍😍 did you know??
No "while" loop in your stochastic gradient descent function??? What happened there?
Hi Ying, not sure I understand your question. There is a for loop in the stochastic gradient descent function.
@Nutty Jedi the function used to split the data is sklearn.model_selection.train_test_split; that function shuffles the data by default. See the link below for the function documentation on the sklearn page.
scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
Very well explained.
It would be much appreciated if you implemented the Structure from Motion algorithm,
or suggested a good resource to learn how to do that.
Thanks
Thank you for amazing explanation
Hi Rishabh,
You are most welcome! Thank you for watching!
Hey..... Thanks so much for this video...... But please, can you do the same with the Octave programming language?
Hey Kant, thanks for watching.
The reason I chose to go with Python is that it is used a lot in the data science / machine learning community. It is also in higher demand by companies compared to Octave. I will think about making some future videos with Octave, but for now I think it will mostly be Python.
Please zoom into the notebook for better visibility
Thank you for the feedback! Will do
How can I turn this into a polynomial model? I need to increase the complexity of the model and see the cost decrease as the complexity goes higher.
You can still use gradient descent with a general n-th-order polynomial model; you just have to collect your data in a form that fits y = theta^T X to do, for example, batch gradient descent. The Wikipedia page on polynomial regression has a good example: en.wikipedia.org/wiki/Polynomial_regression
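A hedged sketch of that idea (the data and learning rate below are made up for illustration): build a design matrix whose columns are [1, x, x², …], then run ordinary batch gradient descent on it exactly as in the linear case.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.normal(size=x.size)  # toy quadratic data

degree = 2
X = np.vander(x, degree + 1, increasing=True)  # columns are [1, x, x^2]

theta = np.zeros(degree + 1)
alpha = 0.1
for _ in range(5000):
    gradient = X.T @ (X @ theta - y) / len(y)  # batch gradient of the squared-error cost
    theta -= alpha * gradient

print(theta)  # should land near the true coefficients [1, 2, -3]
```

Increasing `degree` adds columns to the design matrix, which is exactly the "more complex model" the question asks about; the gradient-descent loop itself does not change.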
The stochastic gradient descent implementation is not correct. You are supposed to shuffle the data and, at every iteration, pick only 1 random data point (or multiple, in the case of mini-batch) for calculating the parameter values.
Hi, many thanks for providing these materials -- for free! However, I didn't really understand the implementation of the SGD, as it seems you didn't shuffle and randomly choose, but instead looped over the entire dataset. Kindly clarify.
Hi Obinna, thanks for your question.
I used the sklearn.model_selection.train_test_split function to split the data. That function has an option to shuffle the data, and it is true by default. See the documentation here: scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html
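On the shuffling point itself: a common SGD variant (sketched here with made-up data, not the exact code from the video) reshuffles the sample order every epoch and updates on one sample at a time.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 1, 200)
y = 1.0 + 2.0 * x + 0.05 * rng.normal(size=x.size)  # toy line y = 1 + 2x plus noise
X = np.column_stack([np.ones_like(x), x])           # i-th row is (1, x_i)

theta = np.zeros(2)
alpha = 0.1
for epoch in range(50):
    order = rng.permutation(len(y))  # reshuffle the sample order every epoch
    for i in order:
        y_hat = X[i] @ theta
        theta -= alpha * (y_hat - y[i]) * X[i]  # update on a single random sample

print(theta)  # should land near [1, 2]
```

Looping over a pre-shuffled dataset once per epoch and reshuffling every epoch are both accepted practice; the key property is that each update uses a single (or small batch of) randomly ordered samples rather than the full-dataset gradient.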
Hi,
Thanks for your nice illustration.
In your code, you assumed that we already know the derivative of the cost function. Could you please show an example of some complicated function, and then show us how we can write Python code to derive it?
Thanks
Hi Mohammed, thanks for your question.
Please check out my video Linear Regression with Gradient Descent + Least Squares for all the mathematical details.
Video here --> ua-cam.com/video/fkS3FkVAPWU/v-deo.html
Endless Engineering
Thanks for your response.
I already checked that video, and thank you again for the nice job!
My question is how to write Python code to derive the cost function!
@@mjar3799 I am not sure I understand your question. Is it that you want Python code to compute the derivative of any cost function? That is a little out of scope here, since for linear regression we assume the cost function has a certain structure, which is why the derivative is computed mathematically. If you want code that computes the derivative of any function, you can try something like SymPy: www.sympy.org/en/index.html
If you want numerical computation of the derivative, I would recommend just using the tools in scipy or numpy.
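For the symbolic route, a minimal SymPy sketch (written for the single-sample squared-error cost, not any code from the video) looks like:

```python
import sympy as sp

theta0, theta1, x, y = sp.symbols('theta0 theta1 x y')

# Squared-error cost for one sample under the model y_hat = theta0 + theta1*x
cost = (theta0 + theta1 * x - y)**2 / 2

# Partial derivatives with respect to each parameter
d_theta0 = sp.diff(cost, theta0)  # equals theta0 + theta1*x - y
d_theta1 = sp.diff(cost, theta1)  # equals x*(theta0 + theta1*x - y)

print(d_theta0, d_theta1)
```

These match the hand-derived gradient used in linear regression, which is a nice sanity check; for a genuinely complicated cost you would just swap in a different `cost` expression.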
Sir, Thanks a lot.
what about a vectorized cost function? :D
Rachel Newell yeah, you can do this and it increases efficiency a lot. In the video above, to calculate the cost and the gradient, he is looping over the data. The way you can vectorize it is as follows (pay attention to single quotes as the transpose operator!): cost = (y - X*theta)'*(y - X*theta)/(2N), where y is Nx1, X is Nx2 (the i-th row is (1, x_i)), and theta is 2x1. The gradient is then grad = (X'*X*theta - X'*y)/N, so the update rule is theta = theta - alpha*grad!
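In numpy terms, those formulas become the following sketch (the data and learning rate are made up; note the gradient carries the same 1/N factor as the cost so the step size stays independent of the dataset size):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100
x = rng.uniform(-1, 1, N)
y = 3.0 - 0.5 * x + 0.01 * rng.normal(size=N)  # toy line y = 3 - 0.5x

X = np.column_stack([np.ones(N), x])  # i-th row is (1, x_i)
theta = np.zeros(2)
alpha = 0.5

for _ in range(2000):
    residual = X @ theta - y
    cost = residual @ residual / (2 * N)  # vectorized cost: no Python loop over samples
    grad = X.T @ residual / N             # vectorized gradient
    theta -= alpha * grad

print(theta)  # should land near [3, -0.5]
```

The only loop left is over iterations; each cost and gradient evaluation is a single matrix operation, which is where the speedup for large datasets comes from.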
Thank you, man! Any chance you could implement fuzzy c-means (FCM)? I'm struggling to understand it and to implement kernel fuzzy c-means (KFCM). Nice project, thanks again!
Thanks! Glad you enjoyed the video. I do not have one planned for FCM soon, but I will put it on my list!
Also, I think gradient = np.array([1.0, X]) * (y_hat - y)
Can you please provide me your email? I have an error in my code and it's been killing me for days.
Hi Annie, you can send an email to endlessengineeringphd@gmail.com
what about a vectorized cost function? :D
Hey Rachel! Do you mean a cost function that generates a cost vector? That would certainly be mathematically possible, but the math would get a little messy. And I am not exactly sure what that would buy you
A vectorised cost function apparently does all the calculations in one go rather than iteratively, so it is much faster for larger amounts of data.
Also, I'm a noob in ML... just started studying, but I read this in my book.