Rich Sutton, Toward a better Deep Learning

  • Published Feb 2, 2025

COMMENTS • 10

  • @mavenlin
    @mavenlin 4 months ago +8

    But how are we going to prevent interference with old data when we change the backbone?
    IMO, the major issue with a dynamic architecture is that when the fringe joins the backbone, it not only provides the capacity to deal with new data, it also changes the function mapping for all the old data. This change can be catastrophic, especially over a very long temporal sequence, where some early data may take a long time to appear again.
    I guess some form of information about the old data is still needed, e.g. a replay buffer or a Bayesian posterior over the weights.

    • @christopherbentley6647
      @christopherbentley6647 2 months ago

      No idea, but it sounds like the backbone never changes, it only grows

    • @mavenlin
      @mavenlin 2 months ago +1

      @christopherbentley6647 But when you grow, the newly added part will interfere with the function mapping for the old data, unless you choose not to activate it for the old data. But then how do you decide what to activate? And how do you evolve that decision "continually"?

    • @stevenkao4800
      @stevenkao4800 2 months ago

      I think the shadow weights are meant to address this. The shadow weights were initialized in accordance with their master weights, and hence they provide activations in a similar direction to the backbone. In some sense, the shadow weights have inherited some of the knowledge learned from the old data.
      ---
      One goal of continual learning is to keep the network from re-learning previously learned knowledge, so a replay buffer seems to contradict this goal.

    • @mavenlin
      @mavenlin 2 months ago

      @@stevenkao4800 I don't like replay buffers either. But if growing/pruning the network is ever going to work without replaying, there needs to be some form of theory that guarantees the retention of old information. I wonder whether the "similar direction" argument can be formalized into such a guarantee.

    • @stevenkao4800
      @stevenkao4800 2 months ago

      I think that even with today's standard neural networks, it is hard to give any theoretical guarantees about their knowledge or abilities. The most we can say is that they seem to work well most of the time.
      This is actually a strength of neural networks: their approximate nature makes them extremely flexible.
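[Editor's note] The interference worry raised at the top of this thread can be made concrete with a toy linear model. This is a minimal sketch, not anything from the talk; all names are illustrative. Growing the network adds a new unit whose outgoing weight, if nonzero, shifts the outputs on old inputs; initializing that weight to zero preserves the old mapping exactly at the moment of growth, though later training can still disturb it:

```python
import numpy as np

rng = np.random.default_rng(0)

# Old task: a linear map already learned on 2 input features (the "backbone").
X_old = rng.normal(size=(5, 2))
w_backbone = np.array([1.0, -2.0])
y_before = X_old @ w_backbone

# Grow the network: a third unit joins (the "fringe" joining the backbone).
# Old inputs now also activate the new unit.
x_new_unit = rng.normal(size=(5, 1))
X_grown = np.hstack([X_old, x_new_unit])

# Random init of the new weight: outputs on OLD data shift (interference).
w_random = np.concatenate([w_backbone, rng.normal(size=1)])
print(np.max(np.abs(X_grown @ w_random - y_before)) > 0.0)  # True

# Zero init of the new weight: old outputs are preserved exactly at growth time.
w_zero = np.concatenate([w_backbone, [0.0]])
print(np.allclose(X_grown @ w_zero, y_before))  # True
```

Zero-initializing a new unit's outgoing weight is one common way to grow without instantly changing the function; it does not, by itself, stop subsequent gradient steps on new data from moving the backbone weights, which is the harder part of the question.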

  • @hemig
    @hemig 4 months ago +3

    Great thinking. But doesn't this reverse regularization and tend to overfit?

    • @andrewferguson6901
      @andrewferguson6901 4 months ago +6

      Overfit continuously and you might end up somewhere
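[Editor's note] The replay buffer debated in the first thread is often implemented as a uniform sample of the data stream. A minimal reservoir-sampling sketch (illustrative, not from the talk; class and method names are assumptions):

```python
import random

class ReplayBuffer:
    """Keeps a uniform random sample of every item seen so far (reservoir sampling)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.seen = 0

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / seen, so every
            # item in the stream ends up stored with equal probability.
            j = random.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item

    def sample(self, k):
        """Draw up to k stored items to mix into the current training batch."""
        return random.sample(self.items, min(k, len(self.items)))


buf = ReplayBuffer(capacity=10)
for t in range(1000):
    buf.add(t)
print(len(buf.items))      # 10
print(len(buf.sample(4)))  # 4
```

Mixing a few sampled old items into each new batch is the standard mitigation for the interference mavenlin describes; his objection stands, though, since it retains old information by storage rather than by any guarantee about the weights.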