What Happens When You Use Prolog to Enhance LLMs?

  • Published 14 Dec 2024

COMMENTS • 36

  • @SteveRowe
    @SteveRowe 2 months ago +6

    I'm glad you did the experiments with Prolog. Good first-principles research. Publish and keep up the good work.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      I'm not in academia right now, my publishing days are long behind me, but thanks!

  • @juandesalgado
    @juandesalgado 2 months ago

    These are great ideas, I hope you can continue developing them further. Planning problems are a possible follow-up, though those tend to combinatorially explode when treated as search problems, which is what Prolog code would probably do.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      Oh, that's exactly what I have for one of my next videos. I use ASP to solve word problems (but it's not very exciting at the moment, so I'm not sure it will see the light of day). And yep, word problems (or something like Sudoku) are doable in Prolog, but they quickly explode.

  • @ioannischrysochos7737
    @ioannischrysochos7737 2 months ago +6

    LLMs are much better at generating error-free Prolog than other languages. The drive is to combine LLMs with symbolic logic. The chain of thought can use external symbolic logic. We should expect to see such things in the future.

    • @Salveenee
      @Salveenee 2 months ago

      100% agreed

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago

      @@ioannischrysochos7737 it does feel like the future you're talking about has already arrived in the form of o1; something like this might already be built into it.

    • @ubit123
      @ubit123 2 months ago

      @@FutureIsAmazing569 there is a difference between statistics and formal logic. In some cases you need to be sure that the answer is correct; in most cases, 99% correctness will suffice.

    • @adrianojordao4634
      @adrianojordao4634 2 months ago

      Prolog is more exciting than LLMs. But nobody knows Prolog, or logic. Wrong time. But definitely a part of AGI, whatever that is.

    • @vitalyl1327
      @vitalyl1327 2 months ago

      @@adrianojordao4634 Prolog works well with LLMs both ways - not just Prolog generated by LLMs, but Prolog execution traces explained to LLMs in a way they can understand. There are some potentially interesting "explainable Prolog" attempts out there; check out the pyexpert Python package, for example.

  • @MarcoServetto
    @MarcoServetto 2 months ago

    One approach I've found interesting when I ask it to write code is the following:
    Generate a version of the code.
    In separate chats, ask it to discuss:
    (a) why this code is right
    (b) we know as a fact that there is a mistake in this code on line 1. Explain why.
    (c) we know as a fact that there is a mistake in this code on line 2. Explain why.
    ...and so on (we can skip lines with no meaningful code).
    (AA) here is a bunch of discussions about this code. Rank them and list the ones that are most correct.
    (BB) here is some code and a discussion of why it is wrong. Fix the code.
    Rinse and repeat. Of course, if you use a language with a type system, you can also compile the code and provide the error messages in the mix.
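
The loop described above can be sketched as a small harness. This is only a sketch: `ask` is a hypothetical stand-in for whatever chat-API call you use, and each call here represents a fresh, separate chat.

```python
from typing import Callable, List

def critique_and_fix(code: str, ask: Callable[[str], str]) -> str:
    """Per-line critique protocol: one fresh chat per question,
    then rank the critiques and ask for a fixed version."""
    lines = code.splitlines()
    discussions: List[str] = [ask(f"Why is this code right?\n{code}")]
    for i, line in enumerate(lines, start=1):
        if not line.strip():  # skip lines with no meaningful code
            continue
        discussions.append(ask(
            f"We know as a fact that there is a mistake in this code "
            f"on line {i}. Explain why.\n{code}"))
    ranking = ask("Here is a bunch of discussions about this code. "
                  "Rank them and list the most correct ones:\n"
                  + "\n---\n".join(discussions))
    return ask("Here is some code and a discussion of why it is wrong. "
               f"Fix the code.\n{code}\n{ranking}")

# demo with a deterministic stub instead of a real model
calls = []
def stub(prompt: str) -> str:
    calls.append(prompt)
    return f"reply #{len(calls)}"

result = critique_and_fix("x = 1\n\ny = x + 2", stub)
```

In real use you would plug a chat client into `ask` and, as the comment notes, mix compiler error messages into the (BB) prompt.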

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago

      This works wonderfully when you can keep your finger on the pulse and have clear objectives. Doing this as part of an automation pipeline is a bit more challenging.
      So my intuition is to always try to get away with zero-shot or (if available) one-shot prompting, just so I can automate it more easily down the line.
      But you're right: especially for code generation, that might not be enough, and we may have to resort to more elaborate tactics like the one you've described.

  • @KCM25NJL
    @KCM25NJL 2 months ago

    I tried the problem you gave at the start of the video with 4o, o1-mini, and then o1-preview. The first two stated the exact same thing: essentially, they assumed Alice was already included in the count of sisters instead of adding 1. Preview, on the other hand, got the correct answer.
    When I asked o1-mini, in the same context as all three runs, why the first two answers were wrong, it suggested that it should have checked the problem statement for ambiguity before giving an unambiguous response. Only when I said that the ambiguity lay in its errant interpretation, and that the problem statement had no grammatical ambiguity, did o1-mini acquiesce and admit its fault.
    It would seem that even with CoT and reflection built in, the scaling laws still apply for accuracy.
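
The failure mode described above is just an off-by-one. With illustrative numbers (the exact counts in the video's puzzle may differ), the intended arithmetic is:

```python
# Alice has `brothers` brothers and `sisters` sisters (illustrative numbers).
brothers, sisters = 3, 6

# Each of Alice's brothers has all of Alice's sisters as sisters, plus Alice.
sisters_of_brother = sisters + 1

# The common wrong answer counts only Alice's sisters, forgetting Alice herself.
wrong_answer = sisters

print(sisters_of_brother)  # 7
```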

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago

      This is explored in a fairly old (January 2022) paper by Wei et al., "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" (arxiv.org/pdf/2201.11903):
      "That is, chain-of-thought prompting does not positively impact performance for small models, and only yields performance gains when used with models of ∼100B parameters."
      o1-mini is reportedly around 100B, while o1-preview is around 300B, so you're absolutely right: scaling laws do apply.

  • @Dron008
    @Dron008 2 months ago

    Wow, that is a really interesting idea. I think it can be used somehow.

  • @VictorGallagherCarvings
    @VictorGallagherCarvings 2 months ago

    What a great idea! Could this approach be used with smaller models?

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago

      I did not try Prolog with smaller models, but I suspect they should be good at it. Great idea to try later, thanks!

  • @franzwollang
    @franzwollang 2 months ago

    I've thought for many years now that the eventual *true* union between programming and AI will be reached when AI models are somehow built as complex sets of fuzzy predicates and can thus seamlessly merge their internal fuzzy logic representations with statements in a logical programming language (e.g. Prolog), creating a generally homoiconic system. This would give them a way to apply complex, fuzzy pattern matching where beneficial or efficient, strict pattern matching where beneficial. And best of all, everything the AI system would do or think is automatically interpretable because the fuzzy atoms could be mapped to specific localized regions (by definition of what an atom is) of the approximate data manifold the system learns when ingesting data, identifying the atoms, and distilling predicates.
    If we could then build the logical programming language as a layer on top of a functional programming language to implement any imperative logic required... and build the functional language on top of a low-level systems language to implement the abstract data types, mapping to various hardware idiosyncrasies, and hardware optimizations... and preserve the ability at each language layer to reach down to lower layers for more control when necessary --that would be even more elegant.
    And if we could build the functional and low-level languages incorporating techniques to expose facets of those languages in a form that can be transformed into fuzzy logic (i.e. vectorizing the call graph using graph sketches, exposing the mapping from the low-level language AST to assembly code such that the AI could execute a guided evolutionary optimization algorithm to adapt and optimize itself to new hardware automatically -- especially important as hardware becomes insanely complex with tons of non-linear self-interactions and/or incorporates biological elements), that would be even more elegant.
    Ok, sorry for the rant. I like your idea to mix Prolog with an LLM! It is a very good intuition.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      Thanks a lot for sharing your rant :)
      Blending AI models with fuzzy logic, plus integrating them with logical languages, is an amazing new area to study. Hey, if we sprinkle some qubits in there, consciousness is guaranteed!

    • @franzwollang
      @franzwollang 2 months ago

      @@FutureIsAmazing569 Never go full Deepak Chopra, my friend! Quantum computing can only (ever?) speed up specific algorithms by a quadratic factor. Quantum processing units (QPUs?) will be like GPUs or TPUs or hardware noise samplers in computers -- that is, task-specific accelerators.
      Thanks for your video!

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      But I'm not going full Deepak Chopra :) Just a bit of Sir Roger Penrose!

  • @johanndirry
    @johanndirry 2 months ago

    Not sure if Prolog is the best approach, since it is very limited in the kinds of problems it can solve. I was experimenting with GPT-4 restating the problem as a graph and solving it in Python using graph algorithms. However, o1-preview made that approach obsolete too.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago

      I agree - even simple word puzzles are quite difficult to do in Prolog (unless you're using a special library, like bibmm.pl). Something like MiniZinc is way better at it. I chose Prolog for this project because I found GPT-4o to be quite good at writing Prolog code.
      But yep, you're right, o1-preview makes almost every logic enhancement obsolete.

  • @andychristianson490
    @andychristianson490 2 months ago

    Can you do a video on doing something similar, but with SAT solvers? E.g. generate Alloy.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      Yep, I thought of doing a video specifically on logic puzzles with Z3. But from what I've already tried, LLMs are way worse at generating Z3 (I also tried ASP) than Prolog. I think that might be due to the sheer amount of training data in the wild that LLMs were exposed to.
      I did not try Alloy; maybe I'll try aggregating various reasoning systems in one video. I also have an idea to pick one and fine-tune Llama 3 on it to the max.

    • @vitalyl1327
      @vitalyl1327 2 months ago +1

      @@FutureIsAmazing569 they're ok at generating Z3 code if you do it step by step and via a code-analysis feedback loop - like you should with any other language.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      You're right, but anything beyond one-shot prompting would have introduced additional complications to an already complicated multi-step process, while Prolog seems quite fine even with one-shot.

    • @vitalyl1327
      @vitalyl1327 2 months ago

      @@FutureIsAmazing569 I'm mostly using small local models, so many-shot is the default even with Prolog and Datalog. It's not too hard, and having an unlimited feedback loop improves model performance many-fold, so it's a good idea in general with any tool.
      Another nice thing with local models is that you can do inference harnessing - nudge the model to select only tokens that form correct syntax, and provide a very tight feedback loop for tool usage.
      Even if you're getting ok Prolog most of the time with one-shot, it's never guaranteed to be ok in all cases, so a feedback loop is needed even for the very large and powerful models anyway.
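
The feedback loop described in this thread can be sketched generically. In this sketch, `generate` and `check` are hypothetical stand-ins for the LLM call and the Prolog toolchain (e.g. invoking swipl and capturing its errors); here they are stubbed so the loop itself can run.

```python
from typing import Callable, Optional

def feedback_loop(generate: Callable[[Optional[str]], str],
                  check: Callable[[str], Optional[str]],
                  max_rounds: int = 5) -> str:
    """Keep regenerating until the checker accepts the program.

    generate(feedback) asks the model for code, passing the previous
    error message (or None on the first attempt); check(code) returns
    None on success, or an error string to feed into the next attempt."""
    feedback = None
    for _ in range(max_rounds):
        code = generate(feedback)
        feedback = check(code)
        if feedback is None:
            return code
    raise RuntimeError("no accepted program within max_rounds")

# stubs standing in for a real model and a real Prolog syntax check
attempts = iter(["sister(X) :- brother(X",  # broken: unbalanced paren
                 "sister(X) :- sibling(X, alice), female(X)."])
generate = lambda fb: next(attempts)
check = lambda code: (None if code.count("(") == code.count(")")
                      and code.endswith(".") else "syntax error")

result = feedback_loop(generate, check)
```

The same shape works for the "inference harnessing" idea above: with a local model, `check` can run per-token instead of per-program, rejecting tokens that would break the syntax.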

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      @@vitalyl1327 thanks for the insight. The monster PC I use to run local models has been idle lately, since the weather has been quite hot. I should get back to running local models in a week! I agree that a feedback loop should be the default for such tasks.

  • @timseguine2
    @timseguine2 2 months ago

    I don't see a reason why you couldn't use o1 as the base model for this approach, considering the model is also apparently better at coding. It seems like it might then also be able to generate correct Prolog code for more complex problems.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +2

      o1 will certainly perform well with this approach. The only problem I had is that o1's chain-of-thought reasoning was already beating Alice in Wonderland+, so there was no point in improving on it.
      But you're absolutely right, the approach is still valid. As soon as the next paper comes out posing a problem o1 can't solve, I'll be back at it!

  • @szebike
    @szebike 2 months ago

    Maybe OpenAI were "inspired" by users like you. I assume they take a lot of liberty in interpreting "observing user chatlogs for safety".

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago

      I would not go that far :) I think they are pretty competent at what they do. But if they did benefit from it, I think I would be fine with that. Whatever it takes to advance this amazing new tech!

    • @szebike
      @szebike 2 months ago +1

      @@FutureIsAmazing569 Well, you said you hoped to make a buck or two with your approach, and you can't tell where they get their ideas from (in my experience, people from a high-academia background are smart but usually very uncreative). So if you want to make money, keep your important ideas to yourself until they are market-ready (you can use local models to help you). Given that S. Altman is their CEO, I would be more cautious, considering how he behaved towards very poor people with his cryptocurrency back then. (TLDR: he "bought" biometric eyeball scans from those people, without informed consent, for some cryptocurrency per scan, until the Kenyan government halted it. OpenAI also used very low-paid Kenyan workers to create training data not long ago.)

  • @vitalyl1327
    @vitalyl1327 2 months ago

    Now use Prolog and SMT solvers with o1 to enhance it further. On my tasks, llama3.1 with Prolog still outperforms o1 anyway.

    • @FutureIsAmazing569
      @FutureIsAmazing569  2 months ago +1

      Yes, I haven't been diligent enough in thinking about tasks that still can't be solved with o1. I'm sure there are plenty, though.