Why was the corner case example regarded as not allowed under the assumption of linear separability with a gamma margin? The corner-case data set is clearly linearly separable with some gamma margin, but the way we chose the initial weight vector put points exactly on the decision boundary, and that caused the perceptron algorithm not to converge. Are you saying the perceptron converges only if no point lies on the decision boundary of any weight vector encountered during the updates, including the initial one?
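For anyone stuck on the same point, here is a minimal sketch of the mistake-driven perceptron update, assuming the common convention that a point with y * (w . x) <= 0, i.e. including points lying exactly on the boundary, counts as a mistake; the function and variable names are just illustrative, not from the lecture:

    import numpy as np

    def perceptron(X, y, max_epochs=100):
        # X: (n, d) array of points, y: labels in {-1, +1}
        w = np.zeros(X.shape[1])              # initial weight vector
        for _ in range(max_epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                # a point lying exactly on the boundary (w . xi == 0) still
                # triggers an update; this is why the convergence argument
                # needs every point strictly on the correct side, with margin gamma > 0
                if yi * np.dot(w, xi) <= 0:
                    w = w + yi * xi           # move w toward the misclassified point
                    mistakes += 1
            if mistakes == 0:
                return w                      # all points strictly classified, done
        return w

Under this convention the margin assumption is about an ideal separator w* (every point satisfies y * (w* . x) >= gamma > 0), not about the intermediate weight vectors; the standard bound then limits the number of mistakes to (R/gamma)^2, where R bounds the norm of the points.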
I am doing SDS offered by MIT on edx and this is very helpful.
Awesome lecture !
Great lecture 🙌🙌
Why do we have to adjust (and assume) so much to make this perceptron work? Can't we just reject it?
Fantastic video, thanks for the help!
At 30:06, the set {x : w^T x = -gamma} is drawn. So does it mean that w^T x = gamma and w^T x = -gamma are equidistant from the center?
Yes
obviously
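A quick way to see why, assuming w is the normal vector of the decision boundary w^T x = 0: the distance from any point x to that boundary is |w^T x| / ||w||, so both sets sit at the same distance gamma / ||w||, on opposite sides:

    \operatorname{dist}\left(x,\ \{z : w^{\top}z = 0\}\right) = \frac{|w^{\top}x|}{\lVert w\rVert},
    \qquad
    w^{\top}x = \gamma \ \Rightarrow\ \frac{\gamma}{\lVert w\rVert},
    \qquad
    w^{\top}x = -\gamma \ \Rightarrow\ \frac{\gamma}{\lVert w\rVert}.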
While solving for w1, shouldn't the label y3 be equal to +1? Why did he take y3 = -1?
Because the actual label for data point x3 is (-1).
ŷ3 = +1 is the mistake made by w0; the update uses the true label, not the prediction.
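Written out, assuming the usual mistake-driven update w_new = w_old + y * x with y the true label:

    w_1 = w_0 + y_3 x_3 = w_0 + (-1)\,x_3 = w_0 - x_3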