FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

  • Published 2 Jun 2024
  • FixMatch is a simple, yet surprisingly effective approach to semi-supervised learning. It combines two previous methods in a clever way and achieves state-of-the-art results in regimes with few and very few labeled examples; the core update is sketched in code at the end of this description.
    Paper: arxiv.org/abs/2001.07685
    Code: github.com/google-research/fi...
    Abstract:
    Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. In this paper, we demonstrate the power of a simple combination of two common SSL methods: consistency regularization and pseudo-labeling. Our algorithm, FixMatch, first generates pseudo-labels using the model's predictions on weakly-augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed a strongly-augmented version of the same image. Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with 40 -- just 4 labels per class. Since FixMatch bears many similarities to existing SSL methods that achieve worse performance, we carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success. We make our code available at this https URL.
    Authors: Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel
    Links:
    YouTube: / yannickilcher
    Twitter: / ykilcher
    BitChute: www.bitchute.com/channel/yann...
    Minds: www.minds.com/ykilcher
  • Science & Technology
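
    The update described in the abstract is easy to sketch. Below is a minimal PyTorch-style rendering of the unlabeled-data loss; this is not the authors' code, and weak_augment / strong_augment are placeholders for the paper's flip-and-shift and RandAugment/CTAugment pipelines.

      import torch
      import torch.nn.functional as F

      def fixmatch_unlabeled_loss(model, images, weak_augment, strong_augment, tau=0.95):
          # Pseudo-label comes from the weakly-augmented view; no gradient flows here.
          with torch.no_grad():
              probs = F.softmax(model(weak_augment(images)), dim=-1)
              confidence, pseudo_labels = probs.max(dim=-1)
              mask = (confidence >= tau).float()  # keep only high-confidence predictions
          # Train the model to predict the pseudo-label on the strongly-augmented view.
          logits_strong = model(strong_augment(images))
          ce = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
          return (mask * ce).mean()

    The supervised part is ordinary cross-entropy on weakly-augmented labeled images; the two losses are simply added each step.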

COMMENTS • 21

  • @manuelpariente2288
    @manuelpariente2288 4 years ago +22

    Thanks again :-)
    Loved the critique at the end.
    Also, nice of them to report these results; lots of papers would suppress them to make it seem like the method brought all the gains!

  • @herp_derpingson
    @herp_derpingson 4 years ago +19

    78% accuracy from 1 image per class. This blew my mind.
    What a time to be alive.

    • @TeoZarkopafilis
      @TeoZarkopafilis 4 years ago +6

      HOLD ON TO YOUR PAPERS

    • @meudta293
      @meudta293 4 years ago +1

      my brain matter is all over the floor right now hhh

    • @matthewtang1489
      @matthewtang1489 4 years ago +1

      @@TeoZarkopafilis Woah! A fellow scholar here!

  • @shrinathdeshpande5004
    @shrinathdeshpande5004 4 years ago +8

    Definitely one of the best ways to explain a paper!! Kudos to you

  • @sora4222
    @sora4222 1 year ago

    I loved the critique at the end. Thanks.

  • @hihiendru
    @hihiendru 4 years ago +1

    Just like UDA, the emphasis is on the way you augment. And poor UDA got rejected. PS: LOVE your breakdowns, please keep them coming.

  • @AmitKumar-ts8br
    @AmitKumar-ts8br 3 years ago

    Really nice and concise explanation...

  • @jurischaber6935
    @jurischaber6935 1 year ago

    Thanks again... Great teacher for us students. 🙂

  • @vishalahuja2502
    @vishalahuja2502 3 years ago +1

    Yannic, nice coverage of the paper. I have one question: at 15:05, you explain that the pseudo-label is used only if the confidence is above a certain threshold (which is also a hyperparameter). Where is the confidence coming from? It is well known that the confidence score coming out of softmax is not reliable. Can you please explain?
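
    For what it's worth, in the paper the confidence is simply the largest softmax probability on the weakly-augmented view, compared against a fixed threshold; the calibration concern is fair, and the threshold is a heuristic hyperparameter. A sketch, with stand-in values for the logits:

      import torch
      import torch.nn.functional as F

      logits_weak = torch.randn(8, 10)  # stand-in for model outputs on a weakly-augmented batch
      probs = F.softmax(logits_weak, dim=-1)
      confidence, pseudo_label = probs.max(dim=-1)  # "confidence" = largest class probability
      keep = confidence >= 0.95                     # tau = 0.95 in the paper's experiments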

  • @hungdungnguyen8258
    @hungdungnguyen8258 27 days ago

    Well explained. Thank you

  • @christianleininger2954
    @christianleininger2954 4 years ago

    Really good job, please keep going

  • @ramonbullock6630
    @ramonbullock6630 4 years ago +1

    I love this content :D

  • @reginaldanderson7218
    @reginaldanderson7218 4 years ago +1

    Nice edit

  • @NooBiNAcTioN1334
    @NooBiNAcTioN1334 2 years ago

    Fantastic!

  • @tengotooborn
    @tengotooborn 3 years ago

    Something I find weird: isn't a constant pseudo-label always correct? There seem to be only positive examples in the scheme that uses the unlabeled data, so nothing in the loss forces the model not to output the same pseudo-label for everything.
    Yes, one can argue that this would fail the supervised loss, but then the question becomes "how is the supervised loss weighted w.r.t. the unsupervised loss?". In any case, it seems one would also want negative examples in the unsupervised case.
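
    For reference, the paper keeps the supervised term in every training step and adds the unlabeled term with a fixed weight, which is what rules out the constant-output solution; schematically (stand-in loss values, lambda_u = 1 as in the paper):

      import torch

      l_s = torch.tensor(2.3)   # stand-in: cross-entropy on the labelled batch
      l_u = torch.tensor(0.4)   # stand-in: confidence-masked cross-entropy on the unlabelled batch
      lambda_u = 1.0            # fixed; the paper uses 1 and does not anneal it
      total_loss = l_s + lambda_u * l_u
      # Early in training few predictions clear the confidence threshold, so l_u is
      # small and the supervised term dominates; an implicit curriculum.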

  • @abhishekmaiti8332
    @abhishekmaiti8332 4 years ago +1

    In what order do they train the model: do they feed the labelled images first and then the unlabelled ones? Also, can two unlabelled images of the same class have different pseudo-labels?

    • @YannicKilcher
      @YannicKilcher  4 years ago +4

      I think they do everything at the same time. I guess the labelled images can also go the unlabelled way, yes. But not the other way around, obviously :)

  • @Manu-lc4ob
    @Manu-lc4ob 4 years ago +1

    What is the software you are using to annotate papers, Yannic? I am using Margin notes but it does not seem as smooth.

  • @Dr.Z.Moravcik-inventor-of-AGI
    @Dr.Z.Moravcik-inventor-of-AGI 3 years ago

    Google again, wow! 😂