Language Learning with BERT - TensorFlow and Deep Learning Singapore

  • Published 22 Aug 2024

COMMENTS • 19

  • @rohitdhankar360
    @rohitdhankar360 5 years ago +3

    @10:05 - Excellent explanation of Byte-Pair Encoding, thanks.
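
    For readers who want a concrete picture of the byte-pair-encoding idea praised here, a minimal sketch (illustrative only, not the talk's code; the toy vocabulary is made up): count adjacent symbol pairs across the vocabulary and repeatedly merge the most frequent one.

    from collections import Counter

    # toy vocabulary: word (as a tuple of symbols) -> frequency
    vocab = {("l", "o", "w"): 5,
             ("l", "o", "w", "e", "r"): 2,
             ("n", "e", "w", "e", "s", "t"): 6}

    def merge_most_frequent(vocab):
        # count every adjacent symbol pair, weighted by word frequency
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        best = pairs.most_common(1)[0][0]
        # rewrite every word with the chosen pair fused into a single symbol
        merged = {}
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            merged[tuple(out)] = freq
        return best, merged

    for _ in range(3):  # a few merges; real BPE runs until a target vocabulary size
        best, vocab = merge_most_frequent(vocab)
        print(best, vocab)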

  • @daewonyoon
    @daewonyoon 5 years ago +5

    Thank you. This summary/introduction is very very helpful.

  • @autripat
    @autripat 5 years ago +8

    The presenter says these models "do not use RNNs" (correct) and instead "use CNNs" (incorrect; no convolution kernels are used). They use simple linear transformations of the form XWᵀ + b.

    • @jaspreetsahota1840
      @jaspreetsahota1840 5 years ago +2

      You can model convolution operations with transformers.

    • @mouduge
      @mouduge 4 years ago +8

      IMHO, that's debatable. Indeed, think of what happens when you apply the same dense layer to each input in a sequence: you're effectively running a 1D convolutional layer with kernel size 1. If you're familiar with Keras, try building a model with:
      TimeDistributed(Dense(10, activation="relu"))
      then replace it with this:
      Conv1D(10, kernel_size=1, activation="relu")
      You'll see that it gives precisely the same result (assuming you use the same random seeds).
      Since the Transformer architecture applies the same dense layers across all time steps, you can think of the whole architecture as a stack of 1D-Convolutional layers with kernel size 1 (then of course there's the important Multihead attention part, which is a different beast altogether).
      Granted, it's not the most typical CNN architecture, which usually uses fairly few convolutional layers with kernel size 1, but still, it's not really an error to say the Transformer is based on convolutions (a runnable check is sketched below). I think Martin's goal was mostly to highlight the fact that, contrary to RNNs, every time step gets processed in parallel.
      Just my $.02! :))
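
      A runnable version of that check (a minimal sketch assuming TensorFlow 2.x; the shapes and layer sizes are arbitrary). Instead of relying on matching random seeds, it copies the Dense weights into the Conv1D kernel so the two outputs match exactly:

      import numpy as np
      import tensorflow as tf

      x = np.random.rand(2, 7, 16).astype("float32")        # (batch, time, features)

      dense = tf.keras.layers.TimeDistributed(
          tf.keras.layers.Dense(10, activation="relu"))
      conv = tf.keras.layers.Conv1D(10, kernel_size=1, activation="relu")

      dense(x); conv(x)                                      # build both layers so weights exist

      # Dense kernel is (16, 10); Conv1D kernel is (1, 16, 10) -- same weights, one extra axis.
      w, b = dense.get_weights()
      conv.set_weights([w[np.newaxis, ...], b])

      print(np.allclose(dense(x).numpy(), conv(x).numpy()))  # True: identical outputs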

  • @archywillhe1379
    @archywillhe1379 4 years ago

    wow, Engineers SG sure has come a long way ha! great talk

  • @chirpieful
    @chirpieful 5 years ago +2

    Very good updates for NLP enthusiasts

  • @MegaBlizzardman
    @MegaBlizzardman 4 years ago +1

    Very clear and helpful talk

  • @zingg7203
    @zingg7203 4 years ago

    BERT uses WordPiece; ALBERT uses SentencePiece.
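
    For anyone curious what that difference looks like, a minimal sketch using the Hugging Face transformers library (the checkpoint names are the standard published ones, not something from the talk; exact token splits may vary):

    from transformers import AutoTokenizer

    # BERT ships a WordPiece tokenizer: continuation pieces are prefixed with "##".
    bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    print(bert_tok.tokenize("unaffable"))    # e.g. ['un', '##aff', '##able']

    # ALBERT ships a SentencePiece tokenizer: pieces carry a leading "▁" word-boundary marker.
    albert_tok = AutoTokenizer.from_pretrained("albert-base-v2")
    print(albert_tok.tokenize("unaffable"))  # e.g. ['▁un', 'aff', 'able'] (model-dependent)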

  • @hiyassat
    @hiyassat 5 years ago +3

    Can we have a link to the slides, please?

  • @prakashsharma-uv4pj
    @prakashsharma-uv4pj 5 years ago +1

    Very Informative.

  • @monart4210
    @monart4210 4 years ago

    Could I extract word embeddings from BERT and use them for unsupervised learning, e.g. topic modeling? :)
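
    One common approach, as a minimal sketch with the Hugging Face transformers TensorFlow classes (whether the vectors work well for topic modeling depends on the corpus, and the pooling choice here is just one reasonable option): mean-pool BERT's last hidden states into one vector per document, then feed those vectors to any clustering or topic model.

    import tensorflow as tf
    from transformers import AutoTokenizer, TFAutoModel

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    bert = TFAutoModel.from_pretrained("bert-base-uncased")

    docs = ["transformers replace recurrence with attention",
            "topic models group documents by word co-occurrence"]

    enc = tokenizer(docs, padding=True, truncation=True, return_tensors="tf")
    out = bert(**enc)                          # last_hidden_state: (batch, seq_len, 768)

    # mean-pool over real (non-padding) tokens to get one 768-d vector per document
    mask = tf.cast(enc["attention_mask"][..., tf.newaxis], tf.float32)
    doc_vecs = tf.reduce_sum(out.last_hidden_state * mask, axis=1) / tf.reduce_sum(mask, axis=1)

    print(doc_vecs.shape)                      # (2, 768) -- ready for clustering, topic models, etc.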

  • @revolutionarybitnche
    @revolutionarybitnche 5 years ago

    thank you!

  • @mkpandey4909
    @mkpandey4909 4 years ago +1

    Where can I get this PPT? Please share the link.

  • @janekou2482
    @janekou2482 5 years ago

    Does BPE also work well for non-English languages like Chinese and French?

  • @xiaochengjin6478
    @xiaochengjin6478 5 years ago

    very nice speech!

  • @zingg7203
    @zingg7203 4 years ago +1

    How is it CNN-based?

  • @ishishir
    @ishishir 5 years ago

    Nice!

  • @chriscannon303
    @chriscannon303 4 years ago

    what in God's name are you talking about?? what is an LSTM chain?? I came here because I need to know I'm writing the correct content for my website, and I haven't a fucking clue what the hell you are on about.