In-Context Learning: A Case Study of Simple Function Classes
- Published 17 Aug 2023
- Gregory Valiant (Stanford University)
simons.berkeley.edu/talks/gre...
Large Language Models and Transformers
In-context learning refers to the ability of a model to learn new tasks from a sequence of input-output pairs given in a prompt. Crucially, this learning happens at inference time without any parameter updates to the model. I will discuss our empirical efforts that shed light on some basic aspects of in-context learning: To what extent can Transformers, or other models such as LSTMs, be efficiently trained to in-context learn fundamental function classes, such as linear models, sparse linear models, and small decision trees? How can one evaluate in-context learning algorithms? And what are the qualitative differences between these architectures with respect to their ability to be trained to perform in-context learning? I will also discuss subsequent work by other researchers that illuminates connections between language modeling and learning: must a good language model be able to perform in-context learning? Do large language models know how to perform regression? And are such primitives useful for language-centric tasks? This talk will be mostly based on joint work with Shivam Garg, Dimitris Tsipras, and Percy Liang.
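The following is a minimal sketch (not the authors' code) of the linear-regression setup the abstract describes: draw a fresh weight vector, build a prompt of (x, w·x) pairs, and compare an in-context prediction on a query point against the ordinary-least-squares baseline fit only on those prompt examples. The trained sequence model itself is assumed and not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_prompt = 20, 40                      # input dimension, number of in-context examples

w = rng.normal(size=d)                    # the task: a fresh linear function f(x) = w.x
xs = rng.normal(size=(n_prompt, d))       # prompt inputs
ys = xs @ w                               # prompt labels
x_query = rng.normal(size=d)              # the point the model must label in context

# OLS baseline: fit a linear model on the prompt examples only.
w_ols, *_ = np.linalg.lstsq(xs, ys, rcond=None)
y_ols = w_ols @ x_query

# A Transformer or LSTM trained on such prompts would consume the interleaved
# sequence (x_1, y_1, ..., x_n, y_n, x_query) and emit a prediction for y_query,
# with no weight updates; in-context learning is measured by how close that
# prediction is to the true value w.x_query (and to the OLS baseline).
print("true value   :", w @ x_query)
print("OLS baseline :", y_ols)
```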
49:11 I think this is the most striking part of this talk: the LSTM doesn't show numerical instability, meaning it never learns to "find the inverse matrix" the way OLS does, but the Transformer does learn it... Attention is all you need!
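One plausible reading of the "instability" in this comment, illustrated below with a small numpy sketch (my assumption, not from the talk): an OLS-style solve effectively inverts the matrix of prompt inputs, and that matrix is worst-conditioned when the number of in-context examples is close to the input dimension, so a model that has learned the inversion inherits that sensitivity while one that never learns it would not.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 20                                     # input dimension
for n in (10, 15, 20, 25, 40, 80):         # number of in-context examples
    # median condition number of the n-by-d prompt matrix over random draws;
    # it spikes when n is close to d, where the least-squares system is nearly singular
    conds = [np.linalg.cond(rng.normal(size=(n, d))) for _ in range(200)]
    print(f"n={n:3d}  median cond(X) = {np.median(conds):8.1f}")
```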
In my tests, ChatGPT, Bing Chat, and Bard cannot be made to output "4 - 1 = 5", each for different reasons. Does this mean they cannot perform in-context learning, or that context cannot override the weights?
What's the Abraham Lincoln joke?
Abraham Lincoln said that "if you call a 'tail' a 'leg,' a dog still has four legs." In context, the point was that even if you call slavery by a different name, it's still slavery.