Coconut: Latent Reasoning in Large Language Models
- Published Jan 17, 2025
- Ref: arxiv.org/pdf/...
This research introduces Coconut, a method for improving reasoning in Large Language Models (LLMs). Instead of relying solely on language-based chain-of-thought (CoT) reasoning, Coconut treats the LLM's hidden states as "continuous thoughts" and feeds them directly back into the model, so that reasoning proceeds iteratively in a continuous latent space. In experiments on tasks that require substantial planning and backtracking, Coconut reaches higher accuracy than traditional CoT methods while generating fewer tokens. The study explains this improvement through the lens of breadth-first search, showing how latent reasoning lets the model explore multiple reasoning paths in parallel. Finally, the research examines how a multi-stage training curriculum helps the LLM learn effective latent reasoning.
Fundamental Mechanism:
Language-based reasoning has the LLM generate a step-by-step solution in natural language tokens, as in chain-of-thought (CoT) prompting. [1, 2] This forces the reasoning process to stay within the "language space." [1, 3]
Latent reasoning, as proposed in the "Coconut" (Chain of Continuous Thought) paradigm, uses the last hidden state of the LLM as a representation of the reasoning state (a "continuous thought"). [3, 4] This continuous thought is fed directly back into the LLM as the next input embedding, so the model can reason in a latent space without being constrained by language. [3, 4]
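The difference between the two modes can be sketched with a toy model. This is a minimal illustration, not the paper's implementation: `forward`, `language_step`, `W`, and `E` are hypothetical names, the "model" is a single random matrix followed by tanh, and it only sees the previous position, whereas a real transformer attends to all earlier positions. The only point is where the loop does or does not pass through a discrete token.

```python
import math
import random

def forward(W, x):
    """Toy stand-in for one LLM forward pass: input embedding -> last hidden state."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]

def language_step(W, E, x):
    """Language mode: decode the hidden state to one token, then re-embed that token."""
    h = forward(W, x)
    logits = [sum(e_i * h_i for e_i, h_i in zip(e, h)) for e in E]
    token = logits.index(max(logits))  # commit to a single discrete token
    return token, E[token]             # next input is that token's embedding

def chain_of_continuous_thought(W, question_embedding, num_thoughts):
    """Latent mode: the last hidden state itself becomes the next input embedding."""
    x = question_embedding
    thoughts = []
    for _ in range(num_thoughts):
        h = forward(W, x)   # "continuous thought"
        thoughts.append(h)
        x = h               # fed back directly, with no decoding step
    return thoughts

rng = random.Random(0)
dim, vocab = 4, 6
W = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]  # toy weights
E = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab)]  # toy embedding table

question = [1.0, 0.0, 0.0, 0.0]
token, next_input = language_step(W, E, question)
thoughts = chain_of_continuous_thought(W, question, num_thoughts=3)
```

Note that feeding a hidden state back as an input only works when the hidden size equals the embedding size, which is why the toy uses a single `dim` for both.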
Efficiency and Computational Budget:
The paper argues that reasoning in language can be inefficient because generating each token costs essentially the same amount of computation, regardless of how much that token matters for the reasoning. [5] Many tokens exist mainly for textual coherence and contribute little to the reasoning itself. [5]
Latent reasoning, on the other hand, potentially lets the model allocate computation more effectively. [5, 6] A continuous thought can encode multiple candidate next steps at once, enabling an approach that resembles breadth-first search (BFS). [7] This contrasts with CoT, which commits to a single reasoning path each time it emits a token. [7]
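One way to picture "encoding multiple candidate next steps" is as a weighted mixture of embeddings rather than a single chosen one. The sketch below is only an intuition aid under that assumption; the candidate names, scores, and 2-d embeddings are made up, and the summary does not claim that continuous thoughts are literally computed this way.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate next steps: (made-up score, made-up 2-d embedding).
candidates = {
    "step_A": (2.0, [1.0, 0.0]),
    "step_B": (1.5, [0.0, 1.0]),
    "step_C": (-1.0, [-1.0, 0.0]),
}
names = list(candidates)
probs = softmax([candidates[n][0] for n in names])

# Language CoT: emitting a token commits to one candidate and drops the rest.
committed = names[probs.index(max(probs))]

# Latent sketch: a probability-weighted mixture keeps every candidate in play.
mixture = [
    sum(p * candidates[n][1][i] for p, n in zip(probs, names))
    for i in range(2)
]
```

Here `committed` is `step_A` alone, while `mixture` still carries a nonzero component from `step_B`, so a later step could in principle still follow that branch.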
Performance on Different Reasoning Tasks:
On math reasoning tasks such as GSM8k, Coconut shows promising results: chaining continuous thoughts improves reasoning accuracy, much as language-based CoT does. [8, 9]
On logical reasoning tasks that require significant planning and backtracking (ProntoQA and the authors' proposed ProsQA dataset), Coconut outperforms language-based CoT methods. [8] This suggests that latent reasoning is particularly advantageous for tasks that benefit from exploration and planning ahead. [10, 11]
Training:
While latent reasoning is promising, the paper acknowledges that LLMs still need guidance to learn to reason effectively in the latent space. [12] Simply training on questions and answers without any language reasoning data (as in the "Coconut w/o curriculum" variant) does not yield good results. [12]
The authors employ a multi-stage training curriculum where they initially train the model on language reasoning data and then gradually replace reasoning steps with continuous thoughts. [4, 13] This approach proves effective, suggesting that language data can be leveraged to bootstrap the learning of latent reasoning. [12]
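The curriculum can be sketched as a function that builds the training sequence for a given stage. This is a hedged illustration: the marker strings `<bot>`, `<eot>`, and `<latent>`, the `thoughts_per_step` parameter, and the function name are assumptions made for this example, and `<latent>` is only a placeholder for a position that would be filled by a hidden state during training.

```python
def build_stage_example(question, steps, answer, stage, thoughts_per_step=1,
                        bot="<bot>", eot="<eot>", latent="<latent>"):
    """Replace the first `stage` language reasoning steps with latent slots.

    Stage 0 is ordinary language CoT; each later stage removes one more
    language step and adds `thoughts_per_step` continuous-thought positions.
    """
    k = min(stage, len(steps))  # cannot replace more steps than exist
    if k == 0:
        return [question] + steps + [answer]
    latent_slots = [latent] * (k * thoughts_per_step)
    return [question, bot] + latent_slots + [eot] + steps[k:] + [answer]

steps = ["s1", "s2", "s3"]
stage0 = build_stage_example("q", steps, "a", stage=0)
stage2 = build_stage_example("q", steps, "a", stage=2, thoughts_per_step=2)
stage9 = build_stage_example("q", steps, "a", stage=9)
```

In this sketch `stage0` keeps all three language steps, `stage2` keeps only `s3` after four latent slots, and any stage past the number of steps leaves no language reasoning at all.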
Interpretation and Insights:
Though continuous thoughts aren't meant to be decoded into language, the paper includes a case study in which decoding them reveals meaningful insights into the reasoning process. [14] The decoded tokens correspond to intermediate variables in a math problem, showing that continuous thoughts can capture and encode information relevant to the reasoning. [14]
The authors also analyze Coconut variants that use different numbers of continuous thoughts, which gives insight into the model's planning behavior. The results suggest that latent reasoning supports a breadth-first search approach with the ability to prune unpromising paths. [7, 15-17]
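For readers unfamiliar with the analogy, breadth-first search with pruning looks roughly like the sketch below. It is a generic textbook-style search, not a description of what happens inside the model: the graph, the node names, the `scores` table, and the `beam_width` parameter are all made up for illustration.

```python
def bfs_with_pruning(graph, start, target, score, beam_width):
    """Expand paths level by level, keeping only the best-scoring ones per level."""
    frontier = [[start]]
    visited = {start}
    while frontier:
        next_frontier = []
        for path in frontier:
            node = path[-1]
            if node == target:
                return path
            for child in graph.get(node, []):
                if child not in visited:
                    visited.add(child)
                    next_frontier.append(path + [child])
        # Prune: keep only the `beam_width` most promising partial paths.
        next_frontier.sort(key=lambda p: score(p[-1]), reverse=True)
        frontier = next_frontier[:beam_width]
    return None

# Hypothetical graph and made-up "how promising is this node" scores.
graph = {"Alex": ["A", "B", "C"], "A": ["D"], "B": [], "C": ["E"], "D": ["target"], "E": []}
scores = {"A": 0.6, "B": 0.1, "C": 0.5, "D": 0.9, "E": 0.2, "target": 1.0}

path = bfs_with_pruning(graph, "Alex", "target", lambda n: scores[n], beam_width=2)
dead_end = bfs_with_pruning(graph, "B", "target", lambda n: scores[n], beam_width=2)
```

With a beam width of 2, the low-scoring branch `B` is dropped after the first level while `A` and `C` are both explored further, which mirrors the "explore several paths, prune the unpromising ones" behavior described above.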
Created with NotebookLM