o3 (Part 1): Generating data from multiple sampling for self-improvement + Path Ahead

  • Published 26 Dec 2024

COMMENTS • 7

  • @drhxa · 9 minutes ago · +1

    Where did OpenAI say they didn't use tree search? I think they do use tree search, specifically MCTS, to generate the synthetic data for o1; it's only at inference time that they don't use it. The magic is in creating the synthetic data: they take a variety of paths through the tree, including some wrong ones, and chain them together with connectors like "but wait, the above is getting me stuck. Let's try this instead" before jumping to another branch (one that frequently does lead to the correct answer).
    The key, in my opinion, is MCTS + "let's verify step by step": they linearize the MCTS thought chains and train on that. Somewhere in there they're also using RL as another key ingredient.
    Looking forward to hearing your thoughts.
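
    A minimal sketch of the linearization idea described above, purely as an illustration: the SearchStep and linearize_trace names are made up here, and nothing in this snippet is known to match OpenAI's actual pipeline.
    ```python
    # Hypothetical sketch: take a search trace containing dead-end branches plus
    # the branch that reached the answer, and stitch them into one linear
    # chain-of-thought training example using "but wait" style connectors.
    from dataclasses import dataclass

    @dataclass
    class SearchStep:
        thought: str        # reasoning text explored at this node
        is_dead_end: bool   # True if the search abandoned this branch

    def linearize_trace(steps: list[SearchStep], final_answer: str) -> str:
        """Flatten a tree-search trace into a single linear reasoning chain."""
        parts = []
        for step in steps:
            parts.append(step.thought)
            if step.is_dead_end:
                # Connector that mimics backtracking inside a linear transcript.
                parts.append("But wait, the above is getting me stuck. Let's try this instead.")
        parts.append(f"Final answer: {final_answer}")
        return "\n".join(parts)

    # Example: one wrong branch followed by the branch that worked.
    trace = [
        SearchStep("Try factoring x^2 + 5x - 3 directly... the roots are not integers.", True),
        SearchStep("Use the quadratic formula instead: x = (-5 ± sqrt(37)) / 2.", False),
    ]
    print(linearize_trace(trace, "x = (-5 + sqrt(37)) / 2 or x = (-5 - sqrt(37)) / 2"))
    ```
    The resulting string would then be one training example in the "stream of search" style the comment refers to.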

    • @drhxa · 4 minutes ago · +1

      One more thing: take a look at Sasha Rush's video "Speculations on o1", where he describes four possible approaches and explains the stream-of-search approach. There are a number of problems with that approach, such as collapse and loss of generality (which you noted experiencing), but their "secret sauce" could really just be a lot of hard work to overcome those issues and scale the technique.

  • @m_ke · 4 days ago

    Great content, thanks so much for sharing all of your videos!

  • @johntanchongmin · 3 days ago · +1

    Prompt that makes 4o behave like o1:
    ```
    [Problem]
    Do it out by the following format, taking care to reflect, verify, clarify all assumptions:
    ###Thoughts###
    ...
    ###Final Answer###
    ```
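
    A rough example of how one might send that prompt to gpt-4o with the OpenAI Python SDK (v1.x); the sample problem and model name here are placeholders, not part of the original comment.
    ```python
    # Illustrative only: wraps the commenter's prompt template around a problem
    # and sends it to gpt-4o via the OpenAI chat completions API.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    problem = "What is the sum of the first 100 positive integers?"
    prompt = (
        f"{problem}\n"
        "Do it out by the following format, taking care to reflect, verify, clarify all assumptions:\n"
        "###Thoughts###\n"
        "...\n"
        "###Final Answer###"
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)
    ```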

  • @_PranavDesai · 4 days ago

    What is the purpose of generating synthetic data from a model that will then be used to improve that same model? Wouldn't the synthetic data contain exactly the same biases as the model? How do you remove that inherent bias? More importantly, if the model can already produce expert-level data, why fine-tune it on that data again, given that it was already able to produce it?
    Does this feel like CoT or ReAct with extra steps?

    • @johntanchongmin · 4 days ago · +4

      @_PranavDesai You can use chain-of-thought prompting to get the model to output more detailed steps, which it may not do natively because web data is generally not written in that format.
      That understanding of reasoning steps can then be transferred across domains by fine-tuning on the traces, resulting in a model that does reasoning/chain of thought natively, without the prompt.
      In most cases you have a ground-truth dataset to check whether the answer obtained through the reasoning is correct, so you can be more assured (though not 100%) that the model is generating the right reasoning traces.
      Btw, I myself do not believe models can actually reason like humans, but this reasoning serves as a chain of thought that helps guide better generation, so it plays an important role.
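
      A minimal sketch of the generate-verify-keep loop described in this reply (similar in spirit to rejection sampling / STaR); generate_cot and extract_answer are placeholder callables, not a specific API.
      ```python
      # Illustrative only: sample chain-of-thought traces, keep the ones whose
      # final answer matches the ground truth, and use those pairs for fine-tuning.
      from typing import Callable

      def build_self_improvement_dataset(
          problems: list[dict],                 # each item: {"question": ..., "answer": ...}
          generate_cot: Callable[[str], str],   # returns a reasoning trace ending in an answer
          extract_answer: Callable[[str], str], # pulls the final answer out of a trace
          samples_per_problem: int = 8,
      ) -> list[dict]:
          kept = []
          for item in problems:
              for _ in range(samples_per_problem):
                  trace = generate_cot(item["question"])
                  if extract_answer(trace) == item["answer"]:
                      # Correct final answer -> treat the trace as good enough to
                      # train on (not a guarantee, as noted above).
                      kept.append({"prompt": item["question"], "completion": trace})
                      break  # one verified trace per problem is enough here
          return kept

      # The resulting prompt/completion pairs would then be used to fine-tune the
      # same model, so that it produces such reasoning natively without the prompt.
      ```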

    • @francisco444 · 1 day ago

      One important reason for producing synthetic data from the model itself is that it reflects the model's own knowledge; otherwise you would be feeding in knowledge from another source that the model knows nothing about. Since we want models to be honest, meaning they should learn what they do and don't know, this self-generated data is the best way to make them hallucinate less.