o3 Inference Time CoT Reasoning: How relevant is SFT and RL?

  • Published 22 Dec 2024

COMMENTS • 11

  • @GodbornNoven
    @GodbornNoven 12 hours ago +4

    Increasing inference-time compute just improves what is already there, multiplying the already present foundation. While interesting, it doesn't revolutionize the core concepts required for AGI: neuroplasticity, SNNs, hierarchical processing of concepts (words vs. sentences vs. abstract thoughts), better transfer learning, and methods to avoid catastrophic forgetting to create the potential for continuous learning.
    We don't want to just mindlessly throw compute at a problem; that's barbaric. Even though it works, there are naturally much better things we can work on.
    I think right now, instead of taking something we know works and expanding on it, it might be better to innovate. The improvements from increasing inference time won't really stop, but they follow a log scale. I think we ought to focus our efforts on finding new approaches instead, and on making all these new breakthroughs work together in an efficient and reliable way.

    • @mrd6869
      @mrd6869 9 hours ago +1

      Or you can avoid the comment section and build something better.
      I'll wait🤣

  • @luke.perkin.online
    @luke.perkin.online 14 hours ago +1

    There's test-time fine-tuning too, on nearest-neighbour examples. Very successful on ARC.
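
    A minimal sketch of what test-time fine-tuning on nearest-neighbour examples can look like. This is a generic illustration, not the specific ARC recipe: the `embed` function and the model's `train_step`/`predict` interface are assumed placeholders.

    ```python
    # Sketch: fine-tune a copy of the model on the k training examples nearest
    # to the test input, then predict. All interfaces here are assumptions.
    import copy

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def test_time_finetune(model, embed, train_inputs, train_targets,
                           test_input, k=8, steps=20):
        # Index the training examples in embedding space.
        train_vecs = np.stack([embed(x) for x in train_inputs])
        index = NearestNeighbors(n_neighbors=k).fit(train_vecs)

        # Retrieve the nearest neighbours of the test input.
        _, idx = index.kneighbors(embed(test_input).reshape(1, -1))
        neighbours = [(train_inputs[i], train_targets[i]) for i in idx[0]]

        # Fine-tune a throwaway copy so the base model stays untouched.
        local_model = copy.deepcopy(model)
        for _ in range(steps):
            for x, y in neighbours:
                local_model.train_step(x, y)   # assumed training API

        return local_model.predict(test_input)  # assumed inference API
    ```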

  • @SebastianFabricioMaidanaFariña
    @SebastianFabricioMaidanaFariña 10 hours ago

    I'm a researcher in Paraguay.
    Looking forward to meeting you.

  • @davidhurtado9922
    @davidhurtado9922 12 hours ago

    Just a random idea: would it be possible to train an AI in layers, where each layer trains using meta-analysis of the previous layers, so the system becomes "meta-trained"? I mean, similar to how humans use their brains: training useful neural pathways through repetition and meta-consciousness, allowing us to learn and refine each layer of consciousness.
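
    One way to read this is as greedy, stage-wise training, where each new stage is fit on top of the frozen stages before it and sees a summary of what they computed. A toy sketch under that reading; the staged loop, the regression task, and the choice of summary are purely illustrative assumptions.

    ```python
    # Toy stage-wise training: each stage is fit on the frozen output of the
    # stages before it. Purely illustrative, not a real meta-training method.
    import numpy as np

    rng = np.random.default_rng(0)

    # A small synthetic regression task.
    X = rng.normal(size=(200, 8))
    y = np.sin(X).sum(axis=1, keepdims=True)

    features = X
    for stage in range(3):
        # Train only the current stage (ordinary least squares); earlier stages
        # are frozen because their outputs are already baked into `features`.
        w, *_ = np.linalg.lstsq(features, y, rcond=None)
        prediction = features @ w
        print(f"stage {stage}: training MSE = {float(np.mean((y - prediction) ** 2)):.4f}")
        # Pass a nonlinear view of this stage's output forward as extra input
        # for the next stage to build on.
        features = np.concatenate([features, np.tanh(prediction)], axis=1)
    ```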

  • @CharlotteLopez-n3i
    @CharlotteLopez-n3i 12 hours ago

    Optimizing safety data across pre-training, fine-tuning, and reinforcement learning can reveal key dependencies and enhance o3's performance. Has anyone explored this in depth?

  • @msokokokokokok
    @msokokokokokok 11 hours ago

    I think what they do is a four-step process:
    1. Create synthetic data for a given task (say 1000 examples).
    2. Train a reward model to plan/generate synthetic outputs given the synthetic inputs.
    3. Train a policy to optimise its thinking on that task.
    4. Use the policy to generate thinking tokens to solve the task at test time.
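
    A toy, end-to-end version of those four steps on a throwaway arithmetic task. Everything here, from the task to the "reward model" and the "policy", is a crude stand-in for illustration, not anything OpenAI has described.

    ```python
    # Toy sketch of the four-step recipe: synthetic data -> reward model ->
    # policy that optimises its thinking -> thinking tokens at test time.
    import random

    random.seed(0)

    # Step 1: create synthetic data for a given task (here: small additions).
    def make_synthetic_data(n=1000):
        data = []
        for _ in range(n):
            a, b = random.randint(0, 99), random.randint(0, 99)
            data.append({"input": f"{a}+{b}=?", "target": str(a + b)})
        return data

    # Step 2: "train" a reward model on the synthetic pairs. Here it is just a
    # lookup that scores an output against the stored synthetic target.
    def train_reward_model(data):
        targets = {ex["input"]: ex["target"] for ex in data}
        def reward(task_input, output):
            return 1.0 if output.strip() == targets.get(task_input) else 0.0
        return reward

    # Step 3: "train" a policy to optimise its thinking: propose a few candidate
    # chains of thought per example and keep whichever the reward model scores
    # highest (a crude stand-in for actual RL).
    def train_policy(data, reward):
        def propose(task_input):
            a, b = task_input.rstrip("=?").split("+")
            return [
                f"think: {a} plus {b} is {int(a) + int(b)} -> answer {int(a) + int(b)}",
                f"think: guess -> answer {random.randint(0, 200)}",
            ]
        best = {}
        for ex in data:
            scored = [(reward(ex["input"], c.split("answer ")[-1]), c)
                      for c in propose(ex["input"])]
            best[ex["input"]] = max(scored)[1]
        return best

    # Step 4: use the policy's thinking tokens to solve a test-time task.
    data = make_synthetic_data()
    reward = train_reward_model(data)
    policy = train_policy(data, reward)
    print(policy[data[0]["input"]])  # selected thinking trace plus answer
    ```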

  • @zandrrlife
    @zandrrlife 13 hours ago +1

    First. A+ quality doc per usual. However, I'm still seeing cognitive dissonance around pretraining vs. SFT/RL. Pretrain for helpfulness? How is it helpful if the model learns the data instead of learning from it? Qwen's loss didn't really improve with QwQ post-training; the benefits only showed up at test time. Apparently OpenAI's o-series did, though. I also hear they had about 50 trillion tokens of synthetic data.
    Reasoning, and preference modeling in general: how is it not obvious that there is a significant KL-divergence gap between the two modes? It's insane to me that the solution is so obvious: hybrid/synthetic data with full pretraining coverage. That way we can decouple atoms, biases, etc. from the raw data, since procedural knowledge is what drives models. We know reasoning and values are encoded in the data; we can decouple this and make it more explicit.
    Especially for reasoning, the data scale you need to overcome pretraining inductive biases is, imo, GREAT. This has to be a pretraining thing. "Physics of Language Models" is simple and something I frequently refer to, but it highlights this intuition. We must close the distribution gap, otherwise all post-training is suboptimal. An anthropomorphic model trying to escape is a pretraining-data failure, not weird alien behavior from the model.
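
    To make the "KL-divergence gap" concrete, here is a toy calculation of the divergence between a base model's next-token distribution and a post-trained model's on the same context. The two distributions are invented purely to illustrate the quantity.

    ```python
    # Toy illustration of a distribution gap: KL divergence between a base
    # (pretrained) next-token distribution and a post-trained one.
    import numpy as np

    def kl_divergence(p, q, eps=1e-12):
        """KL(p || q) in nats for two discrete distributions."""
        p = np.asarray(p, dtype=float) + eps
        q = np.asarray(q, dtype=float) + eps
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    # Made-up next-token probabilities over a tiny 5-token vocabulary.
    base_model   = [0.40, 0.30, 0.15, 0.10, 0.05]  # pretrained behaviour
    post_trained = [0.05, 0.10, 0.15, 0.30, 0.40]  # after SFT/RL

    print(f"KL(post-trained || base) = {kl_divergence(post_trained, base_model):.3f} nats")
    # A large value means post-training pushed the model far from what
    # pretraining made likely, which is the gap the comment points at.
    ```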

  • @GoronCityOfficialBoneyard
    @GoronCityOfficialBoneyard 5 hours ago

    The funny thing is, I work in AI safety research (not professionally), but the group running this channel basically breaks it: always go with a logic loop, always go basic, then move into more complex checks, since the higher-order checks lack understanding. Even if you train under chain of thought or other reasoning models, they still have to check the outputs, meaning you need a higher-order checksum system, which in turn needs abstract reasoning over the outputs.

  • @vrc5674
    @vrc5674 8 hours ago

    Are you aware of any research where they train LLMs on tableau reasoning steps, or teach LLMs to shortcut logical reasoning by applying (complex) learned laws/rules inherent in logical reasoning or boolean algebra? Sorta like applying De Morgan's laws, etc. to boolean-algebra expressions to simplify them down. In theory an LLM may be able to discover more complex logical expressions that can be simplified in a single step by encoding the expression into a vector and, in essence, searching a vector space for the simplification, rather than going through the painfully slow process of testing the validity of each individual logical argument. In this way, I wonder if an LLM can sorta "feel" its way to an answer and then work backward from the answer to test whether the answer is sound given the initial assertions. I feel this might be closer to how humans actually reason: we employ a sort of pseudo-logical reasoning that is very close to formal logic but quicker and less rigorous.
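
    For reference, the purely symbolic (non-LLM) version of the De Morgan-style simplification described above looks like this with sympy; the particular expression is just an example.

    ```python
    # Rule-based boolean simplification (De Morgan, contradiction elimination)
    # done symbolically with sympy rather than by a learned model.
    from sympy import symbols
    from sympy.logic.boolalg import And, Not, Or, simplify_logic

    a, b, c = symbols("a b c")

    # ~(a | b) | (a & ~a & c): De Morgan collapses the first term, and the
    # contradiction a & ~a eliminates the second.
    expr = Or(Not(Or(a, b)), And(a, Not(a), c))
    print(simplify_logic(expr))  # -> ~a & ~b
    ```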

  • @richsoftwareguy
    @richsoftwareguy 1 hour ago

    Happy to see that annoying intro HELLLOOO is gone... might be able to watch some videos now 👌