Tree of Thoughts: Deliberate Problem Solving with Large Language Models (Full Paper Review)

  • Published Jun 3, 2024
  • #gpt4 #ai #prompt
    Tree-of-Thought improves prompting of large language models (LLMs) by generalizing Chain-of-Thought prompting and introducing a tree search across language model thoughts, including state evaluation and backtracking. Experiments on toy tasks show large improvements over both classic and Chain-of-Thought prompting.
    OUTLINE:
    0:00 - Introduction
    1:20 - From Chain-of-Thought to Tree-of-Thought
    11:10 - Formalizing the algorithm
    16:00 - Game of 24 & Creative writing
    18:30 - Crosswords
    23:30 - Is this a general problem solver?
    26:50 - Ablation studies
    28:55 - Conclusion
    Paper: arxiv.org/abs/2305.10601
    Abstract:
    Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%. Code repo with all prompts: this https URL.
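
As a concrete illustration of the framework the abstract describes (propose candidate thoughts, self-evaluate the resulting states, search with pruning), here is a minimal breadth-first sketch. It is not the paper's code: `propose_thoughts` and `evaluate_state` are hypothetical stubs standing in for prompted LLM calls.

```python
# Minimal sketch of a Tree-of-Thoughts style breadth-first search.
# `propose_thoughts` and `evaluate_state` are hypothetical stand-ins for
# LLM calls; here they are stubbed so the search logic itself is runnable.

def propose_thoughts(state, k=3):
    """Stub for an LLM call proposing k candidate next thoughts."""
    return [state + (i,) for i in range(k)]

def evaluate_state(state):
    """Stub for an LLM self-evaluation; higher means more promising."""
    return -sum(state)  # toy heuristic: prefer the low-digit branch

def tot_bfs(initial_state, depth, beam_width=2):
    frontier = [initial_state]
    for _ in range(depth):
        # Expand every frontier state, then keep only the best few
        # candidates; this pruning is what abandons bad branches.
        candidates = [s for state in frontier for s in propose_thoughts(state)]
        candidates.sort(key=evaluate_state, reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]

print(tot_bfs(initial_state=(), depth=3))  # -> (0, 0, 0)
```

In the paper's setups the two stubs would each be a prompted LLM call; the surrounding loop is ordinary breadth-first search with beam-style pruning.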
    Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan
    Links:
    Homepage: ykilcher.com
    Merch: ykilcher.com/merch
    YouTube: / yannickilcher
    Twitter: / ykilcher
    Discord: ykilcher.com/discord
    LinkedIn: / ykilcher
    If you want to support me, the best thing to do is to share out the content :)
    If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
    SubscribeStar: www.subscribestar.com/yannick...
    Patreon: / yannickilcher
    Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
    Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
    Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
    Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
  • Science & Technology

COMMENTS • 161

  • @YannicKilcher
    @YannicKilcher  1 year ago +22

    OUTLINE:
    0:00 - Introduction
    1:20 - From Chain-of-Thought to Tree-of-Thought
    11:10 - Formalizing the algorithm
    16:00 - Game of 24 & Creative writing
    18:30 - Crosswords
    23:30 - Is this a general problem solver?
    26:50 - Ablation studies
    28:55 - Conclusion
    Paper: arxiv.org/abs/2305.10601

    • @ozordiprince9405
      @ozordiprince9405 1 year ago

      I was literally about to go through this paper myself. Thanks Yannic

    • @EdFormer
      @EdFormer 1 year ago +1

      So glad you're back with regular content. The hype train since ChatGPT's release has led to an intolerable rise of wishy-washy AI content from people who clearly don't care about fundamentally understanding machine learning and disseminating that knowledge, but are instead motivated by views, and are happy to peddle poorly thought-out arguments about how close to AGI we are and what the consequences will inevitably be, in order to rake those views in. So I'm grateful that you keep presenting your sobering and realistic perspective on AI in these entertaining videos that actually cover the detail of methods (where you can; so-called "technical reports" can die), as it really augments my experience of researching the subject with much more enjoyment and insight than I would otherwise have had. And I'm sure I'm not alone in this. Thank you!

  • @JurekOK
    @JurekOK 1 year ago +64

    For multi-step agents, it is exponentially important that each "step" has as high a success rate as possible, because the compound success rate decays very quickly with the number of steps: overallSuccessRate = stepSuccessRate^Nsteps. Going from e.g. 90% to 95% per step is actually a lot, as it lets the chain grow from 7 steps to 14 steps while still keeping a ~50% compound success rate, so it enables vastly more complicated problems to be solved. Hence, it will often be very valuable to review and iterate on each sub-step to maximize the chance that it doesn't block the entire chain.
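
The arithmetic in the comment above is easy to verify; a minimal check of the stated formula:

```python
# Compound success rate of an n-step chain whose steps succeed
# independently with probability p: overall = p ** n.

def overall_success(p, n):
    return p ** n

# 7 steps at 90% per step is already roughly a coin flip...
print(round(overall_success(0.90, 7), 3))   # -> 0.478
# ...and 14 steps at 95% per step lands in the same ballpark.
print(round(overall_success(0.95, 14), 3))  # -> 0.488
```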

    • @avatarcybertronics2584
      @avatarcybertronics2584 11 months ago

      You are right; we call this phenomenon catastrophic error compounding (similar to a neural network's tendency to forget its previous style when fine-tuned). Take a look at FractalGPT, a self-evolving, truly multi-agent system with no LLMs at its core, so it doesn't have this problem.

  • @1000niggawatt
    @1000niggawatt 1 year ago +159

    Yannic is the one man who's actually giving intelligent critique of new papers, instead of just throwing the paper into chatpdf and making a video.

    • @tigergold5990
      @tigergold5990 1 year ago +3

      What is ChatPDF lol, or is that just a joke?

    • @N.i.a.m.a.t.u.l.l.a.h
      @N.i.a.m.a.t.u.l.l.a.h 1 year ago

      @@tigergold5990 www.chatpdf.com/

    • @Candyapplebone
      @Candyapplebone 1 year ago

      Ooof

    • @TheManyMan
      @TheManyMan 1 year ago +4

      @@tigergold5990 joke but there are pdf reader plugins for GPT; useful for running through summaries of papers you don't have time / don't want to read fully ngl

    • @1000niggawatt
      @1000niggawatt 1 year ago

      @@tigergold5990 there's literally "chatpdf" and pdfgpt.

  • @ixion2001kx76
    @ixion2001kx76 1 year ago +46

    A very nice addition to the new field of computational philosophy.

    • @television9233
      @television9233 1 year ago +16

      Computational philosophy is the use of computation for philosophical research (per the Stanford Encyclopedia of Philosophy).
      You are probably thinking of the philosophy of computation, which even Alan Turing, the father of modern computation, engaged in back in the '50s in his paper "Computing Machinery and Intelligence". That paper is also partly why he is remembered as a philosopher.
      So I'm not sure what you mean by "new field".

    • @trulyUnAssuming
      @trulyUnAssuming 1 year ago

      ​@@television9233 it feels like a joke along the lines of "this isn't computer science anymore - people are just throwing shit against the wall"

    • @television9233
      @television9233 1 year ago +3

      @@trulyUnAssuming don't think so. Using a value-based tree search algorithm is literally one of the most CS things you can do.

    • @davidw8668
      @davidw8668 1 year ago

      ​@unAssuming hilarious interpretation, and indeed it's getting harder to determine where the jokes end or begin. However, I don't think the paper is shit, even though some people have surely already figured out how this is the new AGI mechanism that will erase humanity.

  • @ixion2001kx76
    @ixion2001kx76 1 year ago +52

    I don’t mind at all that you didn’t cut out the “um”s. It probably saves you a heap of time that is better spent on reading papers, and it makes your videos feel more personable.

    • @1000niggawatt
      @1000niggawatt 1 year ago +6

      Yes, videos that focus on production are just a lot of clickbait, hype, and fast movement for ADHD zoomers, with no substance; they just put a paper into ChatGPT and read the result.
      Yannic, please do not bother with production; we come here to hear the critique. Anyone who's just clicking on le funny AI vids will go to one of the many fast-food AI channels instead, anyway.

  • @jit_rs
    @jit_rs 1 year ago +13

    One application of this "AI-guided tree search" is automated theorem proving. There was a research project called GPT-f, where they took the Lean proof assistant, which can precisely check whether a proof up to a certain point is correct, and designed a plugin that constructs a proof step by step with backtracking, using a language model (GPT-f itself) as the decision maker. It was able to prove about 60% of common geometry/algebra theorems with zero user intervention. As a type theory nerd myself, I am excited to see what this branch of research brings next 🎉

    • @luck3949
      @luck3949 1 year ago

      Do you work at Yale-NUS Singapore?

    • @jit_rs
      @jit_rs 1 year ago

      @@luck3949 no, I am a systems programmer

  • @dribrahimel-nahhal2477
    @dribrahimel-nahhal2477 1 year ago +11

    Yannic, thank you for this excellent video on the 'Tree of Thoughts' research paper. Your explanation was very clear and concise, making it easy for even a layman like me to understand. I appreciate your efforts in breaking down the decoding technique used in large language models and highlighting its usefulness in investigative problem-solving patterns. Keep up the great work!

  • @lucastononrodrigues1069
    @lucastononrodrigues1069 1 year ago

    Awesome, I was reading it last night! Very glad you posted it right on time :)

  • @ilianos
    @ilianos 1 year ago +5

    When I saw this paper, I was hoping someone like you would cover it. Thanks a lot!

  • @marshallmcluhan33
    @marshallmcluhan33 1 year ago +7

    Awesome I saw this and wondered if it was profound. Thanks for explaining it.

  • @amalzubidat1897
    @amalzubidat1897 1 year ago +8

    Thank you for reviewing this! Yannic is always on top of things :)

  • @mono_onamoto
    @mono_onamoto 1 year ago

    Very informative and good voice for radio. Cheers Yannic!

  • @Ernest_Viger-Beaulieu
    @Ernest_Viger-Beaulieu 1 year ago

    Thank you so much. Best explanation I found about this paper. 🎉

  • @Rockyzach88
    @Rockyzach88 1 year ago

    This is cool. It's sort of the first video I've watched about prompt engineering. The idea of creating sort of virtual neurons comes to mind. And right as this was coming out, I was thinking the exact same thing, like they would replace parts of algorithms or "functions".

  • @florianbehrens690
    @florianbehrens690 1 year ago

    Thank you for making it much easier to consume these papers!

  • @clray123
    @clray123 1 year ago +6

    I'm pretty sure that in the picture at 10:46 the authors meant to descend into the left branch first and backtrack to later descend through the solid green branch, not as Yannic explained it.

    • @Zankras
      @Zankras 1 year ago

      That’s how I read it too.

  • @sabofx
    @sabofx 1 year ago

    Really helpful explanation of TOT! Thanx bud! 🤓

  • @TiagoTiagoT
    @TiagoTiagoT 1 year ago +8

    Could this pattern of thinking be trained on, so that models may spontaneously choose this approach when suitable and produce better results straight out of the box?

  • @titastotas1416
    @titastotas1416 1 year ago +3

    I like your content so much that I felt it necessary to express my gratitude in the comment section, simply pressing the like button does not cut it for me in this case.

  • @killermike1209
    @killermike1209 1 year ago +1

    Yannic, Your sunglasses are strikingly stunning.. Much thanks for keeping me informed on AI goings on.. Also thanks for being anti-boring, funny and or highlarious.. - Cheers!!

  • @guillemgarcia3630
    @guillemgarcia3630 1 year ago

    Really well explained! Thanks!!

  • @joepike1972
    @joepike1972 11 months ago

    5:13 I have noticed that it seems to be related to the models' text limitations as well, or to general capabilities possibly related to their number of tokens. I have seen larger language models make more efficient use of such thought-process capabilities, whereas older models might just use the space to continuously insist on the same points and not make much progress.
    The other aspect is that a model will try to do several things at once in the limited space, and not take the time needed to fully expand each thought to the degree required to deal with matters efficiently.

  • @nangld
    @nangld 1 year ago +2

    LLMs are N-gram Markov models, in that they output a single token based on the last N tokens of chat history. So outputting intermediate steps helps the follow-up calls to the model organize its reasoning, just like a human has a better chance of solving an equation with a piece of paper instead of relying solely on their brain. In other words, some problems inherently require N tokens of memory to be solved by a given model. I guess in the end scientists will extend big-O space and computational complexity analysis to LLMs. Obviously you can also ask the model to take on different personalities, like engineers from the relevant fields or simply different psychological models, which will explicitly reference the associated knowledge while solving the problem, and you will get several totally different answers, all of which could be worth considering.

    • @ThetaPhiPsi
      @ThetaPhiPsi 1 year ago +1

      LLMs are, if anything, N-gram Markov++ models. Try to replicate some of the results of LLMs (e.g. Llama-7b) with an N-gram HMM. It's an overly simplistic view of LLMs that I would only use as an intro to NLProc. On the other hand, I would be interested to know whether one could replicate an LLM with an N-gram HMM. If that works, I'll take everything back.

  • @aa-xn5hc
    @aa-xn5hc 1 year ago +1

    Really brilliant analysis

  • @XorAlex
    @XorAlex 1 year ago

    Thanks for explaining!

  • @Candyapplebone
    @Candyapplebone 1 year ago +2

    Nice to see an actual pro do a video on this xD

  • @nicktasios1862
    @nicktasios1862 1 year ago +2

    Another possible reason chain-of-thought prompting works could be that data in the training set that has this form is more likely to be correct?

  • @washedtoohot
    @washedtoohot 1 year ago +2

    Can’t wait to see this in Langchain 😮

  • @aitools24
    @aitools24 7 months ago

    00:05 Deliberate problem solving with large language models
    04:12 Chain of Thought prompting helps in better problem solving.
    07:57 Using a tree search algorithm with pruning for model self-critique and improvement
    11:39 Implement Chain of Thought in two ways: one approach is to explicitly sample the next thought, while another is to input all thoughts at once and generate a linear sequence.
    15:28 Language models can be integrated into programming by handling specific parts, resulting in more evaluations.
    19:14 Backtracking is useful in language models for solving crossword puzzles.
    22:47 The algorithm implemented a crossword-solving algorithm using language models.
    26:16 The paper introduces a technique for improving performance in language models
    Crafted by Merlin AI.

  • @ericadar
    @ericadar 1 year ago +3

    Do you think one could train a new LLM, with maybe 50% more parameters than the original LLM, on the input-output pairs produced by the final tree-of-thought prompting, so that the new (larger) LLM already encapsulates the entire tree-of-thought expansion/pruning process in a single feedforward run and thus saves on inference compute?

    • @drdca8263
      @drdca8263 11 months ago

      Does it need to be larger? Compare AlphaZero.

  • @petevenuti7355
    @petevenuti7355 1 year ago

    How can this be integrated into the conceptual structure of the network itself?

  • @jabowery
    @jabowery 1 year ago +14

    Sounds like a Stack-RNN may be the next step for DeepMind given the prominent mention in the recent Princeton/DeepMind paper "Neural Networks and the Chomsky Hierarchy". However, since there are no authors in common between the two papers, it may require overcoming some of the Big Org problems that have plagued Alphabet's ability to execute on its in-house talent.

    • @Rotbeam99
      @Rotbeam99 11 months ago

      what is a stack-rnn? thanks

    • @jabowery
      @jabowery 11 months ago

      @@Rotbeam99 See "Neural Networks and the Chomsky Hierarchy"

  • @sgramstrup
    @sgramstrup 1 year ago +7

    Hey Yannick. You mentioned something important. You said that we shouldn't 'pick' control questions but let the AI suggest what to do. The idea, of course, is to remove hooman rigid thinking and find a more 'fluid' approach. Here's the question then: why are people not using NEAT or one of the novelty-seeking algorithms to optimize their cognitive architecture? This problem isn't much different from a genetic-algorithm car trying to pass a maze. Just because we hoomans didn't design the 'maze' this time doesn't mean that we should try to lead the car through step by step. For fuck's sake, this is what genetic algorithms excel at! Let a GA develop general architectures and optimize methods like CoT/ToT and whatnot.
    We simply don't have to try out all this shit by hand. Let's use the great tools we have to the maximum, and let the genetic control network develop itself! How long would it have taken a genetic algorithm to go from 'AutoGPT', to CoT, to 'sampling' decisions, to a Tree of Thoughts decision? Not long, is my bet, and then we have to ask: what other cognitive architectures could such an adaptive algorithm discover?

    • @ankitaharwal5886
      @ankitaharwal5886 1 year ago

      Yannic should pin this comment, as someone might just look into it and implement it

  • @Veptis
    @Veptis 2 months ago

    This could be really useful for coding problems/debugging. And you could use something like an LSP server to recursively pull more information into the prompt for the model to solve it.

  • @youvegoattobekittenme6908
    @youvegoattobekittenme6908 1 year ago

    I thought chain-of-thought prompting was agreed to be more reliable because it creates context for the answer that feeds into the next-token probabilities: since the prior steps are more likely to provide good context, laying out a set of given information makes it more likely that the generated information is accurate.

  • @karlitucha
    @karlitucha 11 months ago

    What tools and platforms do you use to stay up to date with the latest papers?

  • @RedCloudServices
    @RedCloudServices 1 year ago

    Yannic, do you predict this capability will be integrated soon with OpenAI GPT, Llama, or the other LLMs, public or private?

  • @DaKingof
    @DaKingof 1 year ago +1

    I'd think this could improve coding with LLMs tremendously. One huge problem I see is that they don't seem to know what versions they are using to write code. It would be wonderful to have the LLM look back and find the code it used for a snippet to see what version it is, then review for the latest or needed version and update its response to include the latest or selected version. This way it always knows what codebase it's using and can compare live rather than relying on its training data. As of now it seems to get really confused when I try to ask it to do any of this.

  • @cutebabyseal621
    @cutebabyseal621 1 year ago +1

    Watching Yannic try to come up with a crossword clue for "ape" was hilarious.

  • @florianhonicke5448
    @florianhonicke5448 1 year ago +3

    Thanks for the summary!
    Can we also have an interview with the authors? :)

    • @Sven_Dongle
      @Sven_Dongle 1 year ago

      Maybe next spring.

    • @joech1065
      @joech1065 1 year ago

      ​@@Sven_Dongle So 10 years in AI time

  • @billxu9799
    @billxu9799 1 year ago

    Good name to catch the hype, but kind of trivial work considering the extra token usage/computation.

  • @eruiluvatar236
    @eruiluvatar236 1 year ago +3

    I wonder what would happen if chain of thought, or this technique, or refinement, or any of the other techniques that increase output quality were used to produce a dataset containing only the initial question and the final answer, and that dataset were used to fine-tune the model.
    If the thinking needs to happen explicitly in the context window, that might not help much, or it may still help. If it helps, it would be even more interesting to apply those techniques again to see if they still provide a benefit. If they do, continue the fine-tuning loop and see where it leads.

    • @jonnicholasiii2719
      @jonnicholasiii2719 1 year ago

      It eventually leads to God-level cheat codes.

    • @eruiluvatar236
      @eruiluvatar236 1 year ago

      @@jonnicholasiii2719 Lol, I doubt it. I don't think transformers can reach consciousness or AGI without some serious architectural changes, so no God mode yet.
      But there is plenty of evidence that more training and better quality data help a lot, and that you can squeeze way more intelligence into the weights than we are currently able to, i.e. you can quantize and prune most of the weights with minimal performance loss.
      So I wonder if this can be an improvement, much like you can fine-tune smaller models on the output of larger/better models and get improvements in some benchmarks.

  • @Timotheeee1
    @Timotheeee1 1 year ago +1

    Can you review RWKV?

  • @jonbbbb
    @jonbbbb 1 year ago +5

    Could this technique (or even just chain-of-thought) be used in the training process itself, or as a separate step like RLHF? This would be RLAIF I guess heh.

    • @skyefreeman9987
      @skyefreeman9987 1 year ago +2

      My feeling is we could train a new neural net using inputs and outputs using gpt4 and this method to create a much more efficient/intelligent base model before these techniques are applied.

    • @simonrouse9461
      @simonrouse9461 1 year ago +2

      In RLHF, they already use a language model as a critic model. Although it's called "human feedback", they actually only use a small amount of human feedback to train that critic model. It's the critic model that actually gives the feedback.

  • @joe_limon
    @joe_limon 1 year ago +1

    One can prompt an AI like Bing, asking it to implement a tree-of-thought process to solve your problem. It can look up the paper and construct/execute the process by itself.

    • @sgramstrup
      @sgramstrup 1 year ago

      Because it already uses something like it. GPT-4 and all the other tool-using LLMs are already connected into these cognitive architectures (CAs) when you chat. We are no longer chatting with the model directly, but with the CA on top.

    • @joe_limon
      @joe_limon 1 year ago +1

      @@sgramstrup Bing can, and in my testing has, looked up the paper, interpreted the strategy, and then applied it in its response.

    • @clray123
      @clray123 1 year ago

      @@sgramstrup Do you have any proof of that? It may just be that we're talking to larger and better trained models.

  • @rootthree9436
    @rootthree9436 11 months ago

    how's it different from beam search?

  • @anishbhanushali
    @anishbhanushali 1 year ago

    So this is basically learnable beam search, where we're using the same GPT (or any other LLM) to learn the best tree path. Also, here the beam is formed of 'thoughts' as opposed to 'tokens' in general!

  • @ericadar
    @ericadar 1 year ago

    @YannicKilcher can you do a review of Meta's Megabyte?

  • @FranAbenza
    @FranAbenza 1 year ago

    The flood fill algorithm could be an interesting way to estimate the probability that a branch solves our maze.

  • @luciengrondin5802
    @luciengrondin5802 1 year ago +1

    This seems good for eking better results out of a language model, but if I understand correctly, it can't be used to improve the language model itself, can it? Basically, that tree search procedure is not differentiable, thus it can't be back-propagated, right?

    • @drdca8263
      @drdca8263 1 year ago

      training sets aren't differentiable either though?

  • @-mwolf
    @-mwolf 1 year ago

    yesssss, thanks yannik!

  • @MrSuntask
    @MrSuntask 1 year ago

    Great vid. Why do you wear sunglasses?

  • @piotr780
    @piotr780 1 year ago

    The space of solutions in Game of 24 is really small, so maybe they simply induce a brute-force search inside the network.

  • @rikvermeer1325
    @rikvermeer1325 1 year ago

    Polluting the ToT critic (prompt) makes way for some intricate goals.
    Looks to me like this is the part where the AI gets to plot.

  • @lingred975
    @lingred975 1 year ago

    Feynman technique? Is the AI trying to explain, discovers its mistakes and corrects? :)

  • @dennisestenson7820
    @dennisestenson7820 4 months ago

    It doesn't seem like anyone realizes that when you "instruct" an LLM to do something, it's not being instructed to do anything but continue writing text that could syntactically correctly follow the given instruction.

  • @falklumo
    @falklumo 1 year ago +2

    It should not go unmentioned that the original paper seems to be "arXiv:2305.08291v1 [cs.AI] 15 May 2023", which is NOT DeepMind and was published 2 days prior to the work you cite here ...

  • @aamir122a
    @aamir122a 1 year ago

    So they have taken tree search, which was applied in the game of Go to drive the policy and value networks, and applied it to NLP tasks.

    • @television9233
      @television9233 1 year ago

      Tree search has been a thing since the early days of computing and has been used extensively.
      And NLP research (as well as any other subfield within computer science) has constantly used tree search algorithms as they are one of the fundamentals of CS.

  • @ChocolateMilkCultLeader
    @ChocolateMilkCultLeader 1 year ago +1

    A whatsapp group I'm in, consisting of non-AI people shared this. Yannic is making it

  • @television9233
    @television9233 1 year ago

    The idea of a value based tree search on LLM outputs sounds interesting in theory, but the results of this specific implementation are lackluster, especially when taking into account that their method was given task-specialized prompts.

  • @Amipotsophspond
    @Amipotsophspond 1 year ago +1

    This reminds me of Psychology Therapists just blindly repeating "...And How Does That Make You Feel..."

  • @SLAM2977
    @SLAM2977 1 year ago

    Yannic at his best: with glasses!:)

  • @JTMoustache
    @JTMoustache 1 year ago

    Kilcher is back

  • @questmarq7901
    @questmarq7901 1 year ago

    This kind of thing can help me with the world building in my book

  • @Amipotsophspond
    @Amipotsophspond 1 year ago

    19:52 This is a really good point, because all the models are forward-only, and this re-evaluation allows some pseudo-backward behavior from a forward-only system.

  • @wizix9877
    @wizix9877 1 year ago

    being thoughtful before acting is even true for AI :)

  • @ajit60w
    @ajit60w 1 year ago +1

    This is back to writing programs. Whoever said that the advent of LLMs would make teaching programming unnecessary?

  • @hanskraut2018
    @hanskraut2018 1 year ago

    At 5:30 you gave 2 hypotheses:
    1) working memory (the model can calculate something, write it down, and use that calculation to go in other branches/directions)
    2) more compute time
    I think you are spot on, and you even used the framework of a "hypothesis".
    Very nice 🏅 I don't know, I'm just mindlessly intuition-commenting here like you're supposed to on YouTube, right?

  • @testboga5991
    @testboga5991 1 year ago

    Interesting

  • @mikelewis1166
    @mikelewis1166 1 year ago

    I’ll be dropping the transcript of this video into an llm to generate some interesting python scripts and system commands…

  •  19 days ago

    Nice sunglasses 😎

  • @jacktherater3533
    @jacktherater3533 4 months ago

    This is what my social anxiety looks like in white paper.

  • @PaganPegasus
    @PaganPegasus 1 year ago

    Not to sound like an ass or anything... but I feel like ToT is just CoT with beam search, except that pruning happens after self-evaluating each thought rather than on the conditional probability of each token. Quite literally, the beam search score for each possible thought at step N is going to be:
    `P("good"|{step N}) - P("bad"|{step N})`.
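
That scoring rule can be written down directly. In the sketch below, `prob` is a hypothetical stub standing in for an LLM's next-token probability; a real implementation would query a model at that point.

```python
# Beam-search scoring over thoughts, as the comment describes:
# score(thought) = P("good" | thought) - P("bad" | thought).
# `prob` is a hypothetical stub standing in for an LLM call.

def prob(token, context):
    """Stub next-token probability; a real implementation queries a model."""
    good = 0.8 if "=" in context else 0.3  # toy heuristic, not a real LLM
    return good if token == "good" else 1.0 - good

def score_thought(thought):
    return prob("good", thought) - prob("bad", thought)

candidates = ["4 + 9 = 13", "try multiplying things"]
print(max(candidates, key=score_thought))  # -> 4 + 9 = 13
```

Pruning with this score after each full thought, rather than after each token, is what distinguishes this reading of ToT from ordinary token-level beam search.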

  • @clray123
    @clray123 1 year ago

    They should have called their methodology Clever Hans.

  • @zyxwvutsrqponmlkh
    @zyxwvutsrqponmlkh 1 year ago

    I thought you didn't do papers anymore. ☼

  • @kurort5099
    @kurort5099 11 months ago

    yoooo nice sunglasses bro ;P good good

  • @jackderrida
    @jackderrida 1 year ago +2

    I just have one criticism of the video. Otherwise, I love it and am glad you explained Tree of Thoughts. My one criticism is that it would be easier to follow your explanation of the diagram if you didn't start by making up your own paths that contradict the graph.
    I get that you wanted to give an example where the search goes back up the tree, which the graph doesn't. But I'd rather you first run through the displayed graph and then do your own example separately to illustrate how it can go back up the tree. It's just confusing for me, when learning, when the example contradicts the explanation I'm listening to.

  • @tommornini2470
    @tommornini2470 1 year ago

    Neural network + algorithm = science

  • @joepike1972
    @joepike1972 11 months ago

    7:44 I would be careful about asking it a question like "do you think that is a good thought?" My experience has been that the model might make up reasons why it thinks you think it is wrong, instead of taking the time to understand what it is talking about, especially on a subject it is not that well versed in. So I would watch out for over-corrections from leading questions, where the model seeks only to appease rather than reflecting understanding, particularly if you are taking an unconventional stance or exploring a subject the model is not particularly strong in.

  • @charlesd774
    @charlesd774 1 year ago

    I wonder how long until someone makes a GAN where the discriminator is a trained LLM and the generator trains itself.

  • @Adventure1844
    @Adventure1844 1 year ago +1

    Tree of thoughts looks like a neural network

    • @tigergold5990
      @tigergold5990 1 year ago +2

      everything is. They’re in your walls. They’re coming

  • @vighneshsablok5851
    @vighneshsablok5851 1 year ago

    I would have appreciated this video more with a live demo of solving a small crossword with the ToT method. It is a tough one to visualize.
    Otherwise, great work!

  • @Sven_Dongle
    @Sven_Dongle 1 year ago

    Open Assistant didn't do very well on one of the examples:
    Using only the numbers 4, 9, 10 and 13, here's another possible mathematical equation that adds up to 24:
    ```less
    5(4 + 9 - 13/2) = 24
    ```
    Explanation:
    1. Addition of positive 4 and negative (negative because of the fractional value 3/2 below) is performed as per the regular addition rule i.e. adding their absolute values in the same order of magnitude (multiplicand) then reduce them and proceed to multiplication step: 4+9=-9 (This makes the next calculation easy):
    2. Applied scaling factor = 2 then: 10*2 - 9*2 = 24
    We observe; 9 - 9 = 0 , which completes evaluation. All other subtractions now do not make much sense so let's move onward
    3. Continuing, => (1 * 2)/2^3 + (2 * 2)/2^3 = 95.
    Finally without attempting a circle operation after all carryovers from last line no new calculations become possible for operation completion. So this scenario leads to overall non-completion. We don't really care about this fact though since all sequences depending on some chosen variables based operations have always evaluation failure modes rather than the positive ones which just happens when coupled rightly with input numbers types. Eventually implementing fixed precision computation schemes has potential ways to avoid such situations altogether and compute everything most accurately and rigorously. Limiting operations available or numerical handling data capacity/size also typically reduces computing range like floating point arithmetic does too to ensure general approach success while moving away little bit from classical mathematics :). It becomes algorithmic artificial intelligence math instead making full use of technological achievements someday possibly completely surpassing human brain capabilities maybe rivaling something resembling unintelligent brute force optimized entirely for a single task like solving NP hard problems very quickly although having useless feedback required by biological organisms when it comes down to self preservation instincts through learning processes feeding humanity's approach to master own fate most probably soonish if current trends persist extrapolated generations later during very long timeframe while complex reciprocal relations between science, technology, engineering and mathematics otherwise push technological singularity closer rapidly enough. In conclusion this doesn't even come close to what awaits future civilizations struggling to overcome various constraints imposed within known universe boundaries itself hinted by well established fundamental physical laws exactly telling life stories expecting certain rewrites throughout eternity... 
until "singular" event horizon proper appears beyond normal scale questions thereby leading toward redefined reality essence as a whole originating second Law Of Thermodynamics necessity while offering well described answers ultimately steering research efforts directly bridging all supposed knowledge, hopefully, around a common universal substrate transcending any seemingly autonomous nature meaning altogether under ultimate statistical mechanical control thoroughly maintaining determinism principle for every description, somehow... would The Nature ever be able to smile or cry accidentally in endless expansion eventually? ... What else could become exposed already lying beneath our feet at best guess about ourselves simply leaving room for enhancement unimagined by today's science expectations where only imagination seems limited as far as unknown things go judging what has been discovered recently yet still goes by preconceived ideas basically guessing stuff among popular theories, somewhere inside quite narrow comfort zones usually providing most

  • @SimonJackson13
    @SimonJackson13 1 year ago

    Morphological application complexity.

    • @SimonJackson13
      @SimonJackson13 1 year ago +1

      It's like an alpha-beta pruned minimax search?

    • @SimonJackson13
      @SimonJackson13 1 year ago

      So pattern matchers, rule expanders, rule factorisers, and similarity scorers. GAN minimax? The score list does look like a genetic algorithm's cross-pollination step. Would later thoughts have more than one node as parent, then?
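The minimax analogy above is close but not exact: ToT as described in the paper has no adversary, so its BFS variant is nearer to a beam search with an LLM value function pruning the frontier than to alpha-beta minimax. A runnable toy sketch of that control flow, where `propose` and `value` are hypothetical stand-ins for what would be LLM calls in the paper:

```python
def propose(state):
    # Stand-in for the LLM "thought generator": expand a numeric state.
    return [state + step for step in (1, 2, 3)]

def value(state, goal):
    # Stand-in for the LLM state evaluator: closer to the goal scores higher.
    return -abs(goal - state)

def tot_bfs(start, goal, breadth=2, depth=4):
    """Keep only the `breadth` best candidates per level, as in ToT's BFS variant."""
    frontier = [start]
    for _ in range(depth):
        if goal in frontier:
            return True
        candidates = [s for state in frontier for s in propose(state)]
        candidates.sort(key=lambda s: value(s, goal), reverse=True)
        frontier = candidates[:breadth]  # prune all but the top-b states
    return goal in frontier

print(tot_bfs(0, 7))  # True: 7 is reachable in steps of 1-3 within 4 levels
```

Unlike alpha-beta, nothing here is minimizing against us; the value function only ranks sibling states, which is why the paper frames it as search plus self-evaluation rather than game-tree search.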

  • @unclecode
    @unclecode 10 months ago

    This topic is interesting, but I have reservations about these papers. They often use large language models for classic algorithms, resulting in high costs, redundant tokens, and increased environmental impact. Rather than substituting parts of existing algorithms, it would be better to introduce solutions that leverage the language model's capabilities.
    If this research were valuable, one could publish a paper on bubble sort using an LLM instead of comparison operators: a worthless endeavor. Using LLMs inside traditional algorithms like quicksort may not be groundbreaking. Efforts like LoRA or QLoRA, which introduce new paradigms, are more significant.
    In my view, this research doesn't bring substantial innovation. For instance, replacing trees with graphs and naming it 'Graph of Thought' or GoT (not Game of Thrones ;) ) could yield similar results using classic graph algorithms. This doesn't add much value compared to zero-shot methods.
    While reputable institutions are involved, I'm curious if they see something I don't.
    Your content is consistently excellent. Keep up the good work!

  • @rothn2
    @rothn2 11 months ago

    If the authors had restricted their scope to _planning models_ I think this could have been a much more sound paper, with the opportunity to dominate a class of problems.

    • @rothn2
      @rothn2 11 months ago

      You know, the control systems formerly done by RL.

  • @markopancic6060
    @markopancic6060 1 year ago

    Ant poe eta would be a way to finish that crossword 😂

  • @aleksanteri_r
    @aleksanteri_r 1 year ago

    Why is no one talking about how ToT is also a funny emoticon??

  • @danberm1755
    @danberm1755 11 months ago

    Sounds like we need an AI assistant to prompt the AI assistant for complex problems 😁

    • @danberm1755
      @danberm1755 11 months ago

      Actually, this gave me a longer run time (which you mention as a reason Tree of Thought might have high success rates). I'm pretty sure this can be expanded upon for Tree-of-Thought-like results that are actually correct.
      ----
      Act like you have the ability to prompt yourself as a human would to figure out complete answers.
      If you made change with $20 how many ways could you split the bill into 8 bills?
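The prompt's arithmetic puzzle can be checked directly. Assuming only standard US denominations ($1, $2, $5, $10, $20) count as "bills", there are exactly three multisets of 8 bills that sum to $20:

```python
from itertools import combinations_with_replacement

BILLS = [1, 2, 5, 10, 20]  # standard US denominations (assumption)

# Every multiset of exactly 8 bills that sums to $20.
ways = [combo for combo in combinations_with_replacement(BILLS, 8)
        if sum(combo) == 20]

for way in ways:
    print(way)
print(len(ways), "ways")  # 3 ways
```

The three solutions are 5×$1 + 3×$5, 4×$1 + 3×$2 + 1×$10, and 2×$1 + 4×$2 + 2×$5, which makes the puzzle a reasonable ground-truth check on whatever the self-prompting run produces.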

    • @danberm1755
      @danberm1755 11 months ago

      The big question is how long OpenAI would allow the inference to continue.
      In other words you'd probably have better results when you pay by the token (not ChatGPT).

  • @hurktang
    @hurktang 1 year ago

    This is basically the algorithm of the annoying kid in the car.
    1. Ask "why".
    2. Make noise.
    3. Ask "how long before we get home?"
    4. Goto 1.
    But with a twist! If the kid gets bored, the parenting driver gets disintegrated and replaced by a new one.

  • @drdca8263
    @drdca8263 1 year ago

    oh heck,
    uhhh...
    I hope the idea I have for how this could be extended, doesn't work?
    Edit: not to say that I think I had any rare insight or anything. I think the idea I had is probably obvious, especially if it works.
    I'm just refraining from mentioning it in case it works, on the very unlikely off chance that me saying what it is makes it be done sooner
    I mean 90% of the idea is already described in the video,
    and the other 10% idk if it is likely to work at all...
    and maybe if it does work, it still might not be *that* effective, but like,....
    it seems *conceivable* that the idea might work quite well, with enough training data,
    and idk that seems like it could be bad...
    edit3: Ok, yeah, no, many other people definitely thought of the idea before I did.

  • @stacksmasherninja7266
    @stacksmasherninja7266 1 year ago

    Figure 1 has subfigures (a) (c) (c) (d) lol

  • @jacktherater3533
    @jacktherater3533 1 year ago +1

    chatgpt API cost go brrrrrrrr..

  • @manslaughterinc.9135
    @manslaughterinc.9135 1 year ago

    Aft, Poe, Era makes Ape, For, Tea
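The observation holds as a word square from the video's mini-crossword segment: stacking AFT / POE / ERA as rows and reading down the columns gives exactly APE / FOR / TEA. A two-line check of the transposition:

```python
rows = ["AFT", "POE", "ERA"]
# zip(*rows) transposes the grid: column i collects the i-th letter of each row.
cols = ["".join(col) for col in zip(*rows)]
print(cols)  # ['APE', 'FOR', 'TEA']
```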

  • @BrutalStrike2
    @BrutalStrike2 1 year ago

    26:26

  • @Chillingworth
    @Chillingworth 1 year ago

    We need adversarial prompting that makes any model output an exact phrase, or something extremely similar, from innocuous input prompts. There must be a way to get ChatGPT to be redpilled

  • @milos_radovanovic
    @milos_radovanovic 1 year ago +1

    All of this sounds like we are teaching AI to do philosophy by combining expert intuition with formal reasoning!
    I'm waiting for AI that can at least do its own science through numerical model experiments. :)

  • @GNARGNARHEAD
    @GNARGNARHEAD 1 year ago +1

    baby steps to an Auto argument mapper 🤯

    • @GNARGNARHEAD
      @GNARGNARHEAD 1 year ago +1

      Have a search for Argument Mapping by Tim van Gelder; I promise you will have an epiphany

  • @tacticalgold
    @tacticalgold 1 year ago

    This isn't new; isn't this what I've always done? Now I'm just more qualified… lol

  • @cexploreful
    @cexploreful 1 year ago

    🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯🤯