Agustinus Kristiadi - Uncertainty-Guided Optimization on LLM Search Tree

  • Published 4 Oct 2024
  • Abstract: Beam search is the standard tree-search algorithm for finding maximum-likelihood sequences, for example, in the decoding process of large language models. However, it is myopic, since it does not take the whole path from the root to a leaf into account. Moreover, it is agnostic to prior knowledge about the process: for example, it does not exploit the fact that the objective being maximized is a likelihood and thus has specific properties, such as being bounded in the unit interval. Taking a probabilistic approach, we define a prior belief over the LLM's transition probabilities and obtain a posterior belief over the most promising paths at each iteration. These beliefs let us define a non-myopic, Bayesian-optimization-like acquisition function that yields a more data-efficient exploration scheme than standard beam search. We discuss how to select the prior and demonstrate in on- and off-model experiments with recent large language models, including Llama-2-7b, that our method is more efficient than beam search: it achieves the same or higher likelihood while expanding fewer nodes.
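    The baseline the talk improves on can be illustrated with a minimal sketch of standard beam search over a toy transition model. Everything here is illustrative and not from the talk: `TOY_LM` stands in for an LLM's next-token distribution, and `beam_search` is a plain log-likelihood beam, without the probabilistic acquisition the speaker proposes.

    ```python
    import heapq
    import math

    # Toy transition model: maps a prefix (tuple of tokens) to a
    # {next_token: probability} distribution. A real LLM would produce
    # these probabilities from a softmax over its vocabulary.
    TOY_LM = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"a": 0.5, "<eos>": 0.5},
        ("b",): {"a": 0.9, "<eos>": 0.1},
        ("a", "a"): {"<eos>": 1.0},
        ("b", "a"): {"<eos>": 1.0},
    }

    def beam_search(model, beam_width=2, max_len=3):
        """Standard beam search: at each depth, keep only the
        `beam_width` prefixes with the highest log-likelihood."""
        beams = [(0.0, ())]  # (log-likelihood, prefix)
        finished = []
        for _ in range(max_len):
            candidates = []
            for logp, prefix in beams:
                if prefix and prefix[-1] == "<eos>":
                    finished.append((logp, prefix))  # already terminated
                    continue
                for tok, p in model[prefix].items():
                    candidates.append((logp + math.log(p), prefix + (tok,)))
            if not candidates:
                break
            # Myopic pruning step: ranks by likelihood-so-far only,
            # with no lookahead and no uncertainty about the rest of the path.
            beams = heapq.nlargest(beam_width, candidates)
        finished.extend(b for b in beams if b[1] and b[1][-1] == "<eos>")
        return max(finished)  # best (log-likelihood, path) found

    logp, path = beam_search(TOY_LM)
    ```

    On this toy model the greedy first step ("a", probability 0.6) is not on the best full path ("b", "a", "<eos>", joint probability 0.36), which illustrates why a non-myopic, uncertainty-aware acquisition over whole root-to-leaf paths can expand fewer nodes than likelihood-only pruning.
    
    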
    Speaker's bio: Agustinus Kristiadi is a postdoctoral fellow at the Vector Institute, working primarily with Alán Aspuru-Guzik and Pascal Poupart. He obtained his PhD from the University of Tuebingen in Germany, advised by Philipp Hennig and Matthias Hein. His research interests are in probabilistic deep learning methods for uncertainty quantification, their Riemannian-geometric aspects, and their applications in the broader sciences, such as chemistry. His work has been recognized with a best-PhD-thesis award, multiple spotlight papers, and a best-reviewer award from top machine learning conferences. His contributions to the scientific community include mentoring underrepresented students in Canada under the IBET PhD Project and co-developing the Laplace-Torch open-source library, which democratizes Bayesian neural networks for general audiences.