Analysis and Insights from Holistic Evaluation on Video Foundation Models | Multimodal Weekly 65
- Published Feb 9, 2025
- In the 65th session of Multimodal Weekly, we had Lucas Lee from the Twelve Labs Science team present our recent work on evaluating video foundation models.
Connect with Lucas: hyeongminlee.g...
Check out the following resources about TWLV-I:
- Blog Post: www.twelvelabs...
- arXiv: arxiv.org/abs/...
- HuggingFace: huggingface.co...
- GitHub: github.com/twe...
Timestamps:
00:15 Introduction
03:05 Lucas starts
03:22 What should we call video foundation models?
04:05 The most representative image feature extractors (VGGNet, CLIP)
05:10 Image retrieval is not equivalent to image embedding
06:13 Image representation
07:35 Image foundation models
07:50 DINO(v2)
09:28 MAE (Masked Autoencoder)
10:40 I-JEPA
11:40 How about videos?
12:36 CLIP4Clip
14:00 The video foundation model architecture that we want
15:10 A vision transformer that can capture motions
15:40 New structures and supervisions for videos
15:52 VideoMAE
16:45 UMT (Unmasked Teacher)
18:10 V-JEPA
19:15 Video is not just a sequence of images
19:50 Kinetics-400 vs Something-Something v2
21:53 Motion vs Appearance in V-JEPA and VideoGLUE
22:36 Motion vs Appearance in TWLV-I
23:05 TWLV-I is Twelve Labs' first technical report on video foundation models
24:10 TWLV-I proposes a better evaluation framework
25:00 Directional motion distinguishability
26:13 TWLV-I code is available on GitHub!
27:10 Q&A with Lucas
Join the Multimodal Minds community to receive invites to future webinars: / discord