Lovely presentation; states the problem clearly (class imbalance for dense boxes) and the solution just as clearly (modulating the cross-entropy loss towards the hard examples). Brilliant solution too!
Just want to clarify: hard examples here mean the FPs and FNs during training.
I wish I could work with these people someday.
note for myself
06:36 09:25 Imbalanced Loss
Wow, simple analysis leads to the best performance...
Think different.
It was really great work. I am very curious about the αt term and the α balance factor. Can you please help me get some clarity about α and αt? It would be a great help for my studies.
Hoping for your reply.
It is an idea from balanced cross-entropy; they just borrowed it. Datasets with two or more classes usually have class imbalance. This is a problem because networks tend to focus on the majority class and learn poorly on the minority ones. The idea of alpha is to weight the loss so that majority classes have less impact than minority classes. Alpha can be thought of as the "inverse frequency" of the class distribution in the dataset.
Example: if you have 100 dogs (class 0) and 900 cats (class 1), the distribution is 10% dogs and 90% cats. The inverse frequencies are 1 - 0.1 = 0.9 for dogs and 1 - 0.9 = 0.1 for cats, so alpha_dogs = 0.9 and alpha_cats = 0.1.
In binary classification, alpha is the weight for the positive class, so the weight for the negative class is 1 - alpha. For the example above, alpha = alpha_cats, since cats are the positive class. For multiclass classification, alpha is a vector whose length equals the number of classes.
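Here is a minimal sketch of that α-balanced binary cross-entropy in plain Python; the function name and the specific probabilities are just illustrative assumptions, not the paper's code:

```python
import math

def alpha_balanced_bce(p, y, alpha=0.1):
    """Alpha-balanced binary cross-entropy for one example.

    p     : predicted probability of the positive class (cats above)
    y     : ground-truth label, 1 = positive (cat), 0 = negative (dog)
    alpha : weight for the positive class; negatives get 1 - alpha
    """
    if y == 1:
        return -alpha * math.log(p)            # majority (positive) class, down-weighted
    else:
        return -(1 - alpha) * math.log(1 - p)  # minority (negative) class, up-weighted

# With the 100 dogs / 900 cats example: alpha = alpha_cats = 0.1
print(alpha_balanced_bce(p=0.8, y=1, alpha=0.1))  # cat example: ~0.022 (small weight)
print(alpha_balanced_bce(p=0.3, y=0, alpha=0.1))  # dog example: ~0.321 (large weight)
```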
5:35 For what reason did he point to 2.3 (left) and 0.1 (right)? Are those particular values meaningful, or just an example? If it is just an example, how can he claim, as a general case, that a hard example contributes a ~40x bigger loss than an easy one? It seems strange.
He is trying to say that a hard example's loss is only about 20 times bigger than an easy example's (2.3 vs. 0.1). In the setting of a dense detector, where hard examples : easy examples is roughly 1 : 1000, the total loss of hard examples : total loss of easy examples becomes 2.3 : 100. This means the loss is overwhelmed by the easy examples.
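A quick sanity check of those numbers, assuming the easy example has pt ≈ 0.9 and the hard one pt ≈ 0.1 (the exact probabilities are my assumption; the slide only shows the resulting loss values):

```python
import math

ce = lambda pt: -math.log(pt)  # standard cross-entropy on the true class

hard_loss = ce(0.1)   # ~2.30  (hard example: low probability on the true class)
easy_loss = ce(0.9)   # ~0.105 (easy example: high probability on the true class)
print(hard_loss / easy_loss)            # ~22x per example

# With ~1 hard example per 1000 easy ones (typical for dense detectors):
print(1 * hard_loss, 1000 * easy_loss)  # ~2.3 vs ~105 -> easy examples dominate the total loss
```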
Thanks for this
Is he talking about binary cross-entropy?
What is the definition of easy vs. hard training examples?
Easy examples are those the model quickly learns to predict correctly. In object detection, you can think of them as big objects, objects with a unique shape (low chance of confusing them with other classes), etc. Hard examples are those that look very similar to other classes or are too small in the image. Detecting an airplane and distinguishing it from a bottle (easy) is much easier than detecting and distinguishing a dog from a wolf (hard).
Given that, during training the model is expected to learn the easy examples quickly, meaning its probabilities for them will be close to 1 (for positive examples). The factor (1 - pt) modulates this: as pt gets close to 1, (1 - pt) gets close to zero, reducing the loss value. Semantically it can be read as "reducing the impact of easy examples". The exponent gamma just controls how strong this modulation is.
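A minimal sketch of that modulating factor, following the focal loss formula from the paper, FL(pt) = -(1 - pt)^γ · log(pt) (the α weighting discussed earlier in the thread can be multiplied in as well; the probabilities below are just example values):

```python
import math

def focal_loss(pt, gamma=2.0, alpha=1.0):
    """Focal loss for one example.

    pt    : model's probability for the ground-truth class
    gamma : focusing parameter; gamma = 0 recovers plain cross-entropy
    alpha : optional class-balancing weight (set to 1.0 to ignore it)
    """
    return -alpha * (1 - pt) ** gamma * math.log(pt)

# Easy example (pt = 0.9): the (1 - pt)^gamma factor shrinks its loss ~100x
print(focal_loss(0.9, gamma=2.0))   # ~0.001  vs ~0.105 for plain CE
# Hard example (pt = 0.1): the loss is barely reduced
print(focal_loss(0.1, gamma=2.0))   # ~1.87   vs ~2.30  for plain CE
```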
@luansouzasilva31 thanks
He is cuteeee!!!!!
Great report!
great work!
I feel like he had a small prank hidden in his talk. As a deep learning expert at Google Brain, the one word he should know better than any other would be "classify", yet he stumbles on it multiple times. But oddly enough, only that word. Clearly, those who work at Google Brain are some of the brightest, most talented people (I'm not trying to pick on him). That's why it must be a prank, right!? Or maybe he was just a bit nervous.
Maybe they spoke Chinese there.
Actually, he is Japanese 😊
Kl- before a vowel is really hard for Chinese speakers to say; they say kr- instead.
@hangfanliu3370 he is from P.R Taiwan
Paper: arxiv.org/abs/1708.02002
Bravo!
TY is cool
apparently he does not know what he's talking about
really bad presenter but great idea
this guy is reading...
You won't believe it, but every TV presenter does exactly this; nobody wants to slip up while presenting their thoughts to a large audience.
Modern cameras have a tricky mirror system that lets you read the text while looking straight at the camera, but apparently that's not the case here :)