Using Prolog to enhance LLMs. Until OpenAI o1 derailed me.

  • Published 19 Sep 2024
  • I am using Prolog to help large language models with logic reasoning!
    o1-preview is just too good though.
  • Science & Technology

COMMENTS • 28

  • @SteveRowe • 1 day ago +3

    I'm glad you did the experiments with Prolog. Good first-principles research. Publish and keep up the good work.

    • @FutureIsAmazing569 • 1 day ago +1

      I'm not in academia right now, my publishing days are long forgotten, but thanks!

  • @Dron008 • 22 hours ago

    Wow, that is a really interesting idea; I think it could be put to use somehow.

  • @ioannischrysochos7737 • 2 days ago +2

    LLMs are much better at producing error-free Prolog than other languages. The drive is to combine LLMs with symbolic logic: the chain of thought can call out to an external symbolic-logic engine. We should expect to see such things in the future.
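The combination described above can be sketched end to end. This is an editor's illustration under stated assumptions: the LLM call is mocked with a canned Prolog translation of the video's sisters puzzle, and the SWI-Prolog binary `swipl` is an optional external dependency, only invoked if found on PATH.

```python
import os
import shutil
import subprocess
import tempfile

def mock_llm_to_prolog(question):
    """Stand-in for a real LLM call that translates a word puzzle to Prolog."""
    return (
        "sisters_of_alice(3).\n"
        "% A brother's sisters are Alice's sisters plus Alice herself.\n"
        "sisters_of_brother(N) :- sisters_of_alice(M), N is M + 1.\n"
    )

def run_prolog(source, goal):
    """Consult the generated program and run one goal; None if swipl is absent."""
    if shutil.which("swipl") is None:
        return None  # no solver installed; caller falls back to the raw LLM answer
    with tempfile.NamedTemporaryFile("w", suffix=".pl", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        proc = subprocess.run(
            ["swipl", "-q", "-l", path, "-g", goal, "-t", "halt"],
            capture_output=True, text=True,
        )
        return proc.stdout.strip()
    finally:
        os.unlink(path)

program = mock_llm_to_prolog(
    "Alice has 3 sisters. How many sisters does Alice's brother have?"
)
answer = run_prolog(program, "sisters_of_brother(N), write(N)")
print(answer)  # "4" when swipl is installed, None otherwise
```

The None fallback is the point of the design: when no solver is available, the caller can still fall back to the model's unchecked answer.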

    • @Salveenee • 2 days ago

      100% agreed

    • @FutureIsAmazing569 • 2 days ago

      @@ioannischrysochos7737 it does feel like the future you’re talking about has already arrived in the form of o1; something like this might already be built into it.

    • @ubit123 • 2 days ago

      @@FutureIsAmazing569 there is a difference between statistics and formal logic. In some cases you need to be sure the answer is correct; in most cases, 99% correctness will suffice.

    • @adrianojordao4634 • 1 day ago

      Prolog is more exciting than LLMs. But nobody knows Prolog, or logic. Wrong time. But it's definitely a part of AGI, whatever that is.

  • @KCM25NJL • 1 day ago

    I tried the initial problem you gave at the start of the video with 4o, o1-mini, and then o1-preview. The first two stated the exact same thing: essentially, they concluded that Alice was included in the number of sisters instead of adding 1. o1-preview, on the other hand, got the correct answer.
    When I asked in the same context why the first two answers were wrong, o1-mini suggested it should have checked the problem statement for any ambiguity before giving an unambiguous response. Only when I said that the ambiguity lay in its errant interpretation, and that the problem statement had no grammatical ambiguity, did o1-mini acquiesce and admit its fault.
    It would seem that even with CoT and reflection built in, the scaling laws still apply for accuracy.

    • @FutureIsAmazing569 • 1 day ago

      This is explored in quite an old (January 2022) paper by Wei et al.: "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models", arxiv.org/pdf/2201.11903
      "That is, chain-of-thought prompting does not positively impact performance for small models, and only yields performance gains when used with models of ∼100B parameters."
      o1-mini is reportedly around 100B parameters, while o1-preview is around 300B, so you're absolutely right, scaling laws do apply.

  • @franzwollang • 1 day ago

    I've thought for many years now that the eventual *true* union between programming and AI will be reached when AI models are somehow built as complex sets of fuzzy predicates and can thus seamlessly merge their internal fuzzy logic representations with statements in a logical programming language (e.g. Prolog), creating a generally homoiconic system. This would give them a way to apply complex, fuzzy pattern matching where beneficial or efficient, strict pattern matching where beneficial. And best of all, everything the AI system would do or think is automatically interpretable because the fuzzy atoms could be mapped to specific localized regions (by definition of what an atom is) of the approximate data manifold the system learns when ingesting data, identifying the atoms, and distilling predicates.
    If we could then build the logical programming language as a layer on top of a functional programming language to implement any imperative logic required... and build the functional language on top of a low-level systems language to implement the abstract data types, the mapping to various hardware idiosyncrasies, and hardware optimizations... and preserve the ability at each language layer to reach down to lower layers for more control when necessary, that would be even more elegant.
    And if we could build the functional and low-level languages to expose facets of themselves in a form that can be transformed into fuzzy logic (i.e. vectorizing the call graph using graph sketches, or exposing the mapping from the low-level language AST to assembly code so that the AI could run a guided evolutionary optimization algorithm to adapt and optimize itself to new hardware automatically, especially important as hardware becomes insanely complex with tons of non-linear self-interactions and/or incorporates biological elements), that would be more elegant still.
    Ok, sorry for the rant. I like your idea to mix Prolog with an LLM! It is a very good intuition.

    • @FutureIsAmazing569 • 23 hours ago +1

      Thanks a lot for sharing your rant :)
      Blending AI models with fuzzy logic, plus integrating them with logical languages, is an amazing new area to study. Hey, if we sprinkle some qubits in there, consciousness is guaranteed!

    • @franzwollang • 23 hours ago

      @@FutureIsAmazing569 Never go full Deepak Chopra, my friend! Quantum computing can only (ever?) speed up specific algorithms by a quadratic factor. Quantum processing units (QPUs?) will be like GPUs, TPUs, or hardware noise samplers in computers, that is, task-specific accelerators.
      Thanks for your video!

    • @FutureIsAmazing569 • 23 hours ago

      But I'm not going full Deepak Chopra :), just a bit of Sir Roger Penrose!

  • @johanndirry • 1 day ago

    Not sure if Prolog code is the best approach, since it is very limited in the kinds of problems it can solve. I was experimenting with GPT-4 restating the problem as a graph and solving it in Python with graph algorithms. However, o1-preview made that approach obsolete too.
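The graph restatement mentioned here can be sketched as follows. This is a minimal editor's illustration, not the commenter's actual code: the sibling-group encoding and all names are assumptions, and it uses only the standard library rather than a graph package.

```python
from collections import defaultdict

# Encode the sisters puzzle as a graph: people are nodes with a gender
# attribute, and an undirected "sibling" relation connects them. A
# brother's sister count is then just a scan over his neighbours.

gender = {"alice": "f"}
siblings = defaultdict(set)

def add_sibling(a, b, gender_a, gender_b):
    """Link a and b as siblings, merging their sibling groups (one family)."""
    gender[a], gender[b] = gender_a, gender_b
    group = siblings[a] | siblings[b] | {a, b}
    for person in group:
        siblings[person] = group - {person}

# Alice (female) has 2 brothers and 3 sisters.
for name, g in [("b1", "m"), ("b2", "m"), ("s1", "f"), ("s2", "f"), ("s3", "f")]:
    add_sibling("alice", name, "f", g)

def sisters_of(person):
    """Count female neighbours in the sibling graph."""
    return sum(1 for s in siblings[person] if gender[s] == "f")

print(sisters_of("alice"))  # 3
print(sisters_of("b1"))     # 4: the three sisters plus Alice herself
```

Restated this way, the "+1" that tripped up the weaker models falls out of the graph structure instead of having to be reasoned about in prose.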

    • @FutureIsAmazing569 • 1 day ago

      I agree - even simple word puzzles are quite difficult to do in Prolog (unless you're using a special library, like bibmm.pl). Something like MiniZinc is way better at them. I chose Prolog for this project since I found GPT-4o is quite good at writing Prolog code.
      But yep, you're right: o1-preview makes almost every logic enhancement obsolete.

  • @VictorGallagherCarvings • 16 hours ago

    What a great idea! Could this approach be used with smaller models?

    • @FutureIsAmazing569 • 16 hours ago

      I did not try Prolog with smaller models, but I suspect they would be good at it. Great idea to try later, thanks!

  • @rhym8882 • 1 day ago

    Sorry to go a bit off-topic, but I was curious-what mouse pointer highlighter are you using?

    • @FutureIsAmazing569 • 1 day ago

      Screenbrush apps.apple.com/us/app/screenbrush/id1233965871?mt=12
      I haven't configured it at all, which is why it's always multicolored :), but it's very easy to toggle on and off with a single shortcut.

  • @timseguine2 • 23 hours ago

    I don't see a reason why you can't use o1 as the base model for this approach. Considering the model is apparently also better at coding, it seems like it might then be able to generate correct Prolog code for more complex problems.

    • @FutureIsAmazing569 • 23 hours ago +1

      o1 would certainly work well as the base model for this approach. The only problem I had is that o1's chain-of-thought reasoning was already beating the Alice in Wonderland+ problems, so there was no point in improving on it.
      But you're absolutely right, the approach is still valid. As soon as the next paper comes out posing a problem o1 can't solve, I'll be back at it!

  • @andychristianson490 • 1 day ago

    Can you do a video on doing something similar, but with SAT solvers? E.g., generating Alloy.

    • @FutureIsAmazing569 • 1 day ago +1

      Yep, I thought of doing a video specifically on logic puzzles with Z3. But from what I've already tried, LLMs are way worse at generating Z3 (I also tried ASP) compared to Prolog. I think that might be due to the sheer amount of Prolog training data in the wild that LLMs were exposed to.
      I did not try Alloy; maybe I'll aggregate various reasoning systems in one video. I also have an idea to pick one and fine-tune Llama 3 on it to the max.

    • @vitalyl1327 • 16 hours ago +1

      @@FutureIsAmazing569 they're OK at generating Z3 code if you do it step by step and with a code-analysis feedback loop, like you should with any other language.
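The step-by-step, feedback-loop generation this reply recommends can be sketched generically. An editor's illustration with both pieces mocked: in practice `mock_llm` would be a real model call and `mock_check` would actually run the Z3/Prolog program and return its error output.

```python
def mock_llm(prompt):
    # Stand-in for a real model call; "fixes" the code once the prompt
    # contains feedback about an error.
    return "fixed code" if "error" in prompt.lower() else "buggy code"

def mock_check(code):
    # Stand-in for running the solver / static analysis; None means OK.
    return None if code == "fixed code" else "SyntaxError on line 3"

def generate_with_feedback(task, max_rounds=3):
    """Generate code, check it, and feed any error back into the next prompt."""
    prompt = task
    for _ in range(max_rounds):
        code = mock_llm(prompt)
        error = mock_check(code)
        if error is None:
            return code
        prompt = f"{task}\nPrevious attempt failed: {error}\nFix it."
    raise RuntimeError("no valid code after retries")

print(generate_with_feedback("Encode the puzzle for Z3"))  # "fixed code"
```

The loop structure is the same for any target language; only the checker changes, which is why the reply says this works "like you should with any other language".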

    • @FutureIsAmazing569 • 24 minutes ago

      You’re right, but any non-one-shot prompting would have introduced additional complications into an already complicated multi-step process, while Prolog seems quite fine even one-shot.

  • @vitalyl1327 • 16 hours ago

    Now use Prolog and SMT solvers with o1 to enhance it further. On my tasks, Llama 3.1 with Prolog still outperforms o1 anyway.

    • @FutureIsAmazing569 • 2 hours ago +1

      Yes, I haven't been diligent enough about thinking up tasks that still can't be solved with o1. I'm sure there are plenty, though.