Today Unsupervised Sentence Transformers, Tomorrow Skynet (how TSDAE works)

  • Published 1 Feb 2025

COMMENTS •

  • @charmz973 · 3 years ago · +1

    This is a masterpiece given my research hindrances in unsupervised training. Thanks bro

    • @jamesbriggs · 3 years ago · +2

      Haha I'm glad to hear it helps, definitely going to do more on unsupervised methods in the future

  • @liolaeezan5927 · 3 years ago · +2

    Thanks for the consistency and clear explanations!

  • @guanglilu2182 · 2 years ago

    Super clear as always! Thank you so much!

  • @ismailashraq9697 · 3 years ago

    Wow. Great explanation, going to attempt this soon. Thanks for the video😊

    • @jamesbriggs · 3 years ago

      Happy to see your comment, I hope you get it working!

  • @shanborwillwarjri · 1 year ago

    Thank you James, wonderfully explained. Can we use this on a new language if we have a pretrained BERT-based language model for it? Thanks in advance.
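
For context on the question above: TSDAE only needs unlabeled sentences in the target language plus a pretrained encoder, so a language-specific BERT checkpoint should slot into the same recipe. A minimal sketch using the classic sentence-transformers `model.fit` API; `your-language-bert-base` is a placeholder checkpoint name and the training sentences are stand-ins:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, datasets, losses

# Placeholder: swap in a real BERT checkpoint for the target language
model_name = "your-language-bert-base"

# Encoder = transformer + CLS pooling, as in the standard TSDAE recipe
word_embedding_model = models.Transformer(model_name)
pooling = models.Pooling(word_embedding_model.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[word_embedding_model, pooling])

# Unlabeled sentences in the target language (replace with real text)
train_sentences = [f"Example sentence number {i}." for i in range(100)]

# DenoisingAutoEncoderDataset applies the deletion noise TSDAE trains on
train_dataset = datasets.DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True, drop_last=True)

# The decoder reconstructs the original sentence from the noisy encoding
train_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path=model_name, tie_encoder_decoder=True
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    weight_decay=0,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    show_progress_bar=True,
)
```

The decoder is only used during training; afterwards the fitted `model` encodes sentences in the new language as usual.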

  • @malikrumi1206 · 2 years ago

    Around 16:48, looking at Table 5, there are fewer than 6 points between the top and the bottom. Is this difference actually meaningful or significant? Would a human domain expert, handed a sample result, be able to notice these differences?

  • @ax5344 · 2 years ago

    At 19:31 the training data is from "oscar" ("train" split); at 37:32 the evaluation set is "stsb" ("validation" split). Did you say you chose the "stsb" "validation" split because the model is trained on the "train" split? How so? Are the datasets "oscar" and "stsb" the same dataset?

    • @jamesbriggs · 2 years ago · +1

      It's just a precaution: I tend to avoid testing on training splits *just in case* there is any overlap between the two datasets, as some datasets have been built by merging several others. In this case there's no overlap (afaik).
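
For context: "oscar" and "stsb" are unrelated datasets, loaded separately. A rough sketch with the Hugging Face `datasets` library (the OSCAR config name here is an assumption and may differ from the one used in the video):

```python
from datasets import load_dataset

# Unlabeled training text: OSCAR only ships a "train" split
# (config name assumed; the video may use a different one)
oscar = load_dataset("oscar", "unshuffled_deduplicated_en", split="train", streaming=True)

# Evaluation: the STS-B validation split from GLUE, an entirely separate benchmark
stsb = load_dataset("glue", "stsb", split="validation")

print(next(iter(oscar))["text"][:100])  # raw web text used for TSDAE training
print(stsb[0])                          # a sentence pair with a similarity score
```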

  • @d_b_ · 1 year ago

    Have any other SOTA unsupervised fine-tuning techniques overtaken this in the past year? Is this the best choice for creating sentence embeddings on custom documents with rare words?

  • @elahehsalehirizi618 · 1 year ago

    Hi, I have a question please. Could we use a model fine-tuned on the source dataset (which is already trained on the target data) as the pretrained model for SetFit?

  • @brianhance4337 · 3 years ago

    Thanks for the vid! Do you think applying TSDAE to the raw transformer models behind the more advanced SBERT models will yield even better results compared to those advanced SBERT models without TSDAE training on their base model?

  • @DJNed12 · 3 years ago

    Brilliant video, thank you! I know it's possible to "continue" training an existing model from the sentence-transformers library on a different dataset, and the docs seem to suggest this continuation of training can take effectively any training objective/loss function, even if it differs from what was used in the initial pre-training or fine-tuning. Is it possible to take advantage of this to fine-tune an existing pre-trained sentence-transformers model with TSDAE or another unsupervised technique, rather than just a transformers model as you did in the video? If so, I'm wondering whether that has the potential to produce even better results on a particular dataset.

    • @jamesbriggs · 3 years ago · +1

      It's not something I've tried, but one of the primary use cases for TSDAE is 'domain adaptation', which I understood as taking an existing sentence transformer and fine-tuning it with TSDAE so that it can be applied to another domain.
      If the model has already been trained on that domain, though, I'm not sure TSDAE can be used to improve performance further; I haven't seen anyone try this, however.
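
A rough sketch of what that kind of domain adaptation could look like, i.e. continuing to train an already pre-trained sentence-transformers model with the TSDAE objective. This is not from the video: the model name is just an example, and tying the decoder to the encoder assumes the decoder checkpoint matches the encoder architecture:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, datasets, losses

# Start from an existing pre-trained sentence transformer (example model name)
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Unlabeled sentences from the new domain (replace with real in-domain text)
domain_sentences = [f"Example in-domain sentence number {i}." for i in range(100)]

# Same denoising dataset + loss as the standard TSDAE recipe
train_dataset = datasets.DenoisingAutoEncoderDataset(domain_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=8, shuffle=True, drop_last=True)

# Decoder initialised from the encoder's checkpoint so weights can be tied
train_loss = losses.DenoisingAutoEncoderLoss(
    model,
    decoder_name_or_path="sentence-transformers/all-MiniLM-L6-v2",
    tie_encoder_decoder=True,
)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    scheduler="constantlr",
    optimizer_params={"lr": 3e-5},
    show_progress_bar=True,
)
```

As in the standard TSDAE recipe, the decoder is discarded after training and only the adapted encoder is kept.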

  • @nicolaithomsen7005 · 2 years ago

    Hi James, thank you so much for this awesome guide. I'm trying to use this technique, almost 1:1, with 100K+ new training instances. However, this results in the following error during training:
    - RuntimeError: CUDA error: device-side assert triggered
    which, when explored a bit, seems to relate to
    - RuntimeError: CUDA out of memory. Tried to allocate 512 MiB (GPU 0; 2.00 GiB total capacity; 584.97 MiB already allocated; 13.81 MiB free; 590.00 MiB reserved in total by PyTorch) - with the actual numbers being specific to my setup, of course.
    Is this process just very memory intensive? And is there any way to get around this?
    Thanks again!

    • @jamesbriggs · 2 years ago · +1

      It's relatively memory intensive; a GPU with 2 GB capacity won't be enough to run this on any model I know of. I think you need something like 10-15 GB for BERT-base. You can reduce the training batch size to lower the memory needed, but even with a batch size of one you will unfortunately need a larger GPU. I think there is a paid version of Google Colab that is large enough.
      I hope you manage to find something!
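
For anyone hitting the same limit, the two easiest levers with this training loop are a smaller batch size and mixed precision (`use_amp`). A sketch under the same BERT-base TSDAE setup the video describes (the sentence list is a placeholder); as noted above, neither lever will fit BERT-base into 2 GB:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, datasets, losses

# Same TSDAE setup as before, repeated only so the memory settings have context
encoder = models.Transformer("bert-base-uncased")
pooling = models.Pooling(encoder.get_word_embedding_dimension(), "cls")
model = SentenceTransformer(modules=[encoder, pooling])

train_sentences = [f"Example sentence number {i}." for i in range(100)]  # your unlabeled text
train_dataset = datasets.DenoisingAutoEncoderDataset(train_sentences)
train_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path="bert-base-uncased", tie_encoder_decoder=True
)

# 1) Smaller batches -> lower peak GPU memory (at the cost of slower training)
train_dataloader = DataLoader(train_dataset, batch_size=4, shuffle=True, drop_last=True)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    use_amp=True,  # 2) mixed precision reduces activation memory on CUDA GPUs
    show_progress_bar=True,
)
```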