SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

  • Published May 3, 2024
  • Full paper:
    arxiv.org/abs/2002.05709?ref=...
    Presenter: Dan Fu
    Stanford University, USA
    Abstract:
    This paper presents SimCLR: a simple framework for
    contrastive learning of visual representations. We simplify
    recently proposed contrastive self-supervised learning
    algorithms without requiring specialized architectures or a
    memory bank. In order to understand what enables the
    contrastive prediction tasks to learn useful representations,
    we systematically study the major components of our
    framework. We show that (1) composition of data augmentations
    plays a critical role in defining effective predictive tasks,
    (2) introducing a learnable nonlinear transformation between
    the representation and the contrastive loss substantially
    improves the quality of the learned representations, and
    (3) contrastive learning benefits from larger batch sizes and
    more training steps compared to supervised learning. By
    combining these findings, we are able to considerably
    outperform previous methods for self-supervised and
    semi-supervised learning on ImageNet. A linear classifier
    trained on self-supervised representations learned by SimCLR
    achieves 76.5% top-1 accuracy, which is a 7% relative
    improvement over previous state-of-the-art, matching the
    performance of a supervised ResNet-50. When fine-tuned on
    only 1% of the labels, we achieve 85.8% top-5 accuracy,
    outperforming AlexNet with 100× fewer labels.
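The abstract centers on a contrastive loss computed over projected representations of two augmented views of each image. As a rough illustration only (not the authors' reference code), a minimal PyTorch-style sketch of the NT-Xent loss the paper describes could look like the following; `z1`, `z2`, and `temperature` are assumed names for the projection-head outputs of the two views and the temperature hyperparameter:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """Sketch of NT-Xent: z1 and z2 are [N, d] projection-head outputs
    for two augmented views of the same N images (illustrative)."""
    n = z1.shape[0]
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # [2N, d], unit-normalized
    sim = z @ z.t() / temperature                        # [2N, 2N] scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # drop each sample's similarity to itself
    # For row i (one view of image k), the positive is the other view of image k;
    # all remaining rows in the batch act as negatives.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```

In SimCLR terms, z1 and z2 would be g(f(x)) for the two augmented views of the same batch; this formulation is also why larger batch sizes help, since every other image in the batch contributes negatives.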

COMMENTS • 9

  • @autripat · 3 years ago · +45

    starts at 6:30
    contrastive loss 9:20
    self supervised contrastive loss 15:30
    key findings 20:07

  • @baqirhusain5652 · 1 year ago · +1

    The group is awesome... they are asking most of the questions that the audience might ask!

  • @maser9433 · 1 year ago

    It was a good summary indeed... seeing this after going through the paper actually makes more sense!

  • @sarangak.mahanta6168 · 6 months ago

    At 24:05, I think the green bar is only present for the 2048-dimensional size (on the y-axis) and not for the other dimensionalities because the representation 'h' is fixed (2048) according to the caption of Figure 8 in the paper. With a linear or non-linear layer, the output dimension may be altered (32, 64, 128, etc.), but with no projection, 'h' is used directly in the loss function. Since h is fixed at 2048, it is not compared for the other dimensional sizes.

  • @capocianni1043 · 4 months ago

    thanks for sharing guys

  • @shahzodamirov3762 · 10 months ago

    Really good explanation!
    You eased my burden!
    Can you share the paper presentation slides, Dan Fu?

  • @phuccoiinkorea3341 · 2 years ago · +1

    Why is it that 2 FC layers become non-linear at 25:00?

    • @kartofemshoyu · 2 years ago · +7

      I think for the 1 FC case, it's just a linear layer without an activation, and in the 2 FC case there is a ReLU activation between them that makes it nonlinear (btw, "ReLU" on that slide is written in a small font, so it's hard to see).
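To make the reply above concrete, here is a minimal PyTorch-style sketch of the two projection heads being discussed; the 2048-d input (the ResNet-50 feature h) and the 128-d output are illustrative choices matching the paper's setup:

```python
import torch.nn as nn

h_dim, z_dim = 2048, 128  # ResNet-50 feature dim and projection dim (illustrative)

# "1 FC": a single linear layer, so the whole map from h to z stays linear.
linear_head = nn.Linear(h_dim, z_dim)

# "2 FC": two linear layers with a ReLU between them; the ReLU (the small
# "relu" label on the slide) is what makes the projection g(h) nonlinear.
nonlinear_head = nn.Sequential(
    nn.Linear(h_dim, h_dim),
    nn.ReLU(),
    nn.Linear(h_dim, z_dim),
)

# The contrastive loss is computed on z = head(h); downstream tasks
# (e.g. linear evaluation) use the 2048-d representation h itself.
```

This also relates to the earlier Figure 8 comment: with no projection, h itself goes into the loss, so its output dimensionality is pinned at 2048 rather than varied like the projection output.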