This is the most detailed explanation I’ve ever seen, reduced to the most intuitive parts.
Do read Haykin's book on neural networks.
Excellent explanation. Very clear.
Each seemingly identical neuron in a layer treats the same input differently; this is quite fascinating from a layman's view.
Interesting insight, I think it is analogous to how two people (or two brains) can receive the same input (like see the same image) and process/respond differently, and this is due to the differences in connections and connection strengths of neurons between the two brains.
Also, as discussed later in the course, training these networks comes down to fine-tuning those connections between neurons to achieve a goal. In neuroscience, one theory is that our brains also tune their neuron connections to change behavior, form memories, etc. (known as plasticity), yet each neuron by itself is, at its core, the same.
What are the other elements in your matrix W^T? You only wrote a row with w1[1]T, w2[1]T, ...; what comes before and after each w? Do those represent the w's for the other nodes?
All is good, but shouldn't the multiplication of the weight matrix (which contains the transpose of each weight vector of layer 1) be with a row vector of input features instead of a column vector of input features?
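For what it's worth, here is a small NumPy sketch of how I picture the notation being asked about (the 4 hidden units and 3 features are the dimensions from the video; the variable names are my own):

import numpy as np

# One weight vector per hidden unit, each of shape (3, 1) because there are 3 input features.
w1 = np.random.randn(3, 1)
w2 = np.random.randn(3, 1)
w3 = np.random.randn(3, 1)
w4 = np.random.randn(3, 1)

# Stacking the transposes w_i^T as rows gives the (4, 3) matrix W[1]:
# row i holds the weights of hidden unit i, so the entries around each w_i^T are just that row's components.
W1 = np.vstack([w1.T, w2.T, w3.T, w4.T])   # shape (4, 3)

x  = np.random.randn(3, 1)                 # one training example, kept as a column vector
b1 = np.zeros((4, 1))
z1 = W1 @ x + b1                           # (4, 3) @ (3, 1) -> (4, 1), so the column vector works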
I have difficulty understanding what exactly these W and b vectors are and how we determine them for each neuron in the hidden layer.
w is the weight and b is the bias. We use gradient descent to get these values.
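To make "gradient descent gives you w and b" a bit more concrete, here is a rough sketch for a single logistic-regression-style unit (the toy data and learning rate are made up):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = np.random.randn(3, 5)            # 3 features, m = 5 toy examples as columns
Y = np.array([[0, 1, 0, 1, 1]])      # made-up labels
w = np.zeros((3, 1))                 # weights for one unit
b = 0.0                              # bias
alpha = 0.1                          # learning rate, arbitrary choice
m = X.shape[1]

for _ in range(1000):
    A  = sigmoid(w.T @ X + b)        # forward pass, shape (1, m)
    dZ = A - Y                       # gradient of the cross-entropy loss w.r.t. z
    dw = (X @ dZ.T) / m              # gradient for w, shape (3, 1)
    db = np.sum(dZ) / m              # gradient for b
    w -= alpha * dw                  # this update is how w and b get "determined"
    b -= alpha * db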
Nice explanation.
I thought W was a vector that is updated constantly based on the training and learning. Why do we have w1, w2, etc. operating at the same time on the same m training examples?
In neural networks we have hidden layers; w is a vector only in logistic regression, because there is just one neuron/unit as the output. Here we have multiple units per layer. You can still use each w as a vector if you choose, but then you have to use a for-loop, which is more computationally expensive.
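Here is a quick sketch of that point with made-up shapes (4 hidden units, 3 features): looping over the units and treating each w as its own vector gives the same answer as one vectorised matrix product, just more slowly for large layers.

import numpy as np

n_x, n_h = 3, 4                          # 3 input features, 4 hidden units
W1 = np.random.randn(n_h, n_x)           # each row is one unit's weight vector w_i^T
b1 = np.random.randn(n_h, 1)
x  = np.random.randn(n_x, 1)

# Option 1: explicit for-loop, one weight vector per unit.
z_loop = np.zeros((n_h, 1))
for i in range(n_h):
    w_i = W1[i, :].reshape(n_x, 1)                  # the i-th unit's weight vector
    z_loop[i, 0] = (w_i.T @ x).item() + b1[i, 0]    # w_i^T x + b_i

# Option 2: one vectorised matrix product over the whole layer.
z_vec = W1 @ x + b1

print(np.allclose(z_loop, z_vec))        # True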
If W[1] is a (4, 3) matrix, then wouldn't W[1]i be a (1, 3) row vector, which is compatible with the (3, 1) column vector x or a[0]? Why do we still need to transpose W[1]i to become a column vector?
The W[1] weights will be (3, 4) and the inputs (3, 1), so you need to transpose W: w^T * x. Otherwise the multiplication will not be valid.
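A quick NumPy shape check of that answer (variable names are mine): the transpose is needed when each unit's weight vector is stored as a column; the W[1] written in the video already stores the transposed vectors as rows, so it multiplies the column input directly.

import numpy as np

x = np.random.randn(3, 1)                # (3, 1) input

W_cols = np.random.randn(3, 4)           # each column is one unit's weight vector
z = W_cols.T @ x                         # (4, 3) @ (3, 1) -> (4, 1), transpose required

W1 = W_cols.T                            # the (4, 3) layout from the video, rows = w_i^T
z_same = W1 @ x                          # (4, 1) directly, no further transpose needed
print(np.allclose(z, z_same))            # True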
Do W and b here represent the weight and bias in the equation wT*X + b?
yes
What is W[2] and how is it calculated?
It's the second weight vector, between layer [1] and layer [2].
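A minimal forward-pass sketch of where W[2] sits, using the shapes from this example (3 inputs, 4 hidden units, 1 output unit); like W[1], it is learned with gradient descent, and the sigmoid here is just an illustrative activation choice.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x  = np.random.randn(3, 1)       # input features, a[0]
W1 = np.random.randn(4, 3)       # layer 1 weights: 4 hidden units x 3 inputs
b1 = np.zeros((4, 1))
W2 = np.random.randn(1, 4)       # layer 2 weights: 1 output unit x 4 hidden units
b2 = np.zeros((1, 1))

a1 = sigmoid(W1 @ x + b1)        # hidden-layer activations, (4, 1)
a2 = sigmoid(W2 @ a1 + b2)       # output, (1, 1); W2 connects layer [1] to layer [2]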
Why is W[1] a 4×3 matrix instead of a 4×1 matrix?
Because there are 3 features (x1, x2, and x3); W[1] assigns a weight to each feature of the input, for each of the 4 hidden units.
Very difficult to understand.. :(
I thought it was just me :0)
It's not easy stuff, but this is probably the best explanation, by the best teacher, that you are ever likely to come across :) He certainly helped my understanding.
He explained well.
If you finish the original ML course and then move to this specialization, it will become much, much easier :-) (Sometimes the tortoise can beat the hare.)
Please take his original machine learning course first and then come back here.
Too easy to understand for my genius mind; now I need something much harder, like Transformers or CNNs.