John Tan Chong Min
Singapore
Joined 7 Feb 2012
AI and ML enthusiast. I like to think about the essence behind AI breakthroughs and explain it in a simple and relatable way. I am also an avid game creator.
Discord: discord.gg/bzp87AHJy5
LinkedIn: www.linkedin.com/in/chong-min-tan-94652288/
Online AI blog: delvingintotech.wordpress.com/.
Twitter: johntanchongmin
Try out my games here: simmer.io/@chongmin
o3 (Part 1): Generating data from multiple sampling for self-improvement + Path Ahead
75% on ARC-AGI semi-private dataset is insanely good!
o3 is indeed groundbreaking, and shows that we might be close to finding a general training procedure that can self-improve with fine-tuning.
Here are some slides I made to explain how o3 works, based on my own understanding of it (or more generally, of architectures that bootstrap learning via fine-tuning on correct trajectories)!
That said, I do not think o3-type architecture is the only way ahead for learning.
I believe that fine-tuning on a model's own trajectories is a slow way to learn, and that a procedure for learning with external memory is very important (and missing) right now!
Moreover, we should consider building in some inductive biases to reduce the number of samples needed for training, much as filters in Convolutional Neural Networks bias the model towards neighbouring pixels.
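As a rough illustration of how much a locality bias can cut what has to be learned, the sketch below compares parameter counts for a fully connected layer and a single convolution filter. The 28x28 image and 3x3 filter sizes are assumed for the example only; they are not taken from the slides.

```python
# Illustrative sizes only (assumed): a 28x28 greyscale image and a 3x3 filter.
HEIGHT, WIDTH = 28, 28
KERNEL = 3


def dense_params(height: int, width: int) -> int:
    """Parameters of a fully connected layer mapping the image to a same-sized output."""
    n = height * width
    return n * n + n  # one weight per (input, output) pair, plus one bias per output


def conv_params(kernel: int) -> int:
    """Parameters of one single-channel filter, shared across every image position."""
    return kernel * kernel + 1  # kernel weights plus one bias


if __name__ == "__main__":
    print("fully connected:", dense_params(HEIGHT, WIDTH))
    print("one conv filter:", conv_params(KERNEL))
```

The filter only ever sees neighbouring pixels and reuses the same weights everywhere, which is the kind of built-in bias that reduces the samples needed.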
Main article for discussion: arcprize.org/blog/oai-o3-pub-breakthrough
~~~
Slides: github.com/tanchongmin/agentjo/blob/main/paper_reviews/o3_discussion.pdf
o3 results: ua-cam.com/video/SKBG1sqdyIU/v-deo.html
o1 results: openai.com/index/learning-to-reason-with-llms/
Papers on fine-tuning:
STaR: Bootstrapping Reasoning With Reasoning
arxiv.org/abs/2203.14465
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
arxiv.org/abs/2408.03314
Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? arxiv.org/pdf/2405.05904
Journey: Surpassing o1 via fine-tuning
arxiv.org/pdf/2411.16489
~~~
0:00 Introduction
1:45 Primers about Deep Learning
3:50 Web Data is almost used up
7:31 Augmenting with Expert Data
10:00 Generating Data with Model
10:50 Self-Improvement with Generated Data
16:54 Expert Data Generation
24:20 Self-taught Reasoner (STaR)
36:05 Multiple Sampling and Filtering
43:13 Generating Diverse Samples
48:33 Improving chance of correct generation
57:49 Fine-tuning as curriculum learning
1:03:00 Chain of Thought via Subtasks
1:09:57 Prompting Chain of Thought (live)
1:13:35 Limitations of LLM-only reasoning
1:18:40 Fine-tuning on reasoning as tree search prioritization?
1:24:24 Primer for next session
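The multiple-sampling-and-filtering idea in the chapters above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not o3's actual procedure: `toy_model` is a hypothetical stand-in for an LLM that is right only some of the time, and the fine-tuning step itself is left out.

```python
import random


def toy_model(question, rng):
    """Hypothetical stand-in for an LLM: returns (reasoning, answer), often wrong."""
    a, b = question
    answer = a + b if rng.random() < 0.3 else a + b + rng.choice([-1, 1])
    return f"{a} + {b} = {answer}", answer


def collect_correct_trajectories(questions, ground_truth, samples_per_question=8, seed=0):
    """Sample several trajectories per question and keep one whose answer is verified."""
    rng = random.Random(seed)
    dataset = []
    for question in questions:
        for _ in range(samples_per_question):
            reasoning, answer = toy_model(question, rng)
            if answer == ground_truth[question]:  # filter against the ground truth
                dataset.append(
                    {"question": question, "reasoning": reasoning, "answer": answer}
                )
                break  # one correct trajectory per question is enough here
    return dataset


if __name__ == "__main__":
    questions = [(1, 2), (3, 4), (10, 5)]
    ground_truth = {q: sum(q) for q in questions}
    for row in collect_correct_trajectories(questions, ground_truth):
        print(row)
```

The rows that survive the filter are what would be used as fine-tuning data in the next round; questions with no correct sample are simply skipped.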
~~~
Views: 1,816
Videos
AgentJo CV Generator: Generate your CV by searching for your profile on the web!
109 views · 21 days ago
Demonstrates how to use an AgentJo agent, equipped with a Selenium web browsing function, to generate a CV!
Notebook: github.com/tanchongmin/agentjo/blob/main/contrib/Demo/AgentJo_CV_Generator.ipynb
0:00 LLM Setup
1:50 Custom Search API
3:33 Agent Definition
5:22 Customised Replies
AgentJo FinTech Demo: Extract Data from Web Pages, Answer with Citation, Multi-step Agentic RAG
137 views · 21 days ago
Check out my latest FinTech demo notebook. It has:
- strict_json to extract information from a webpage
- citation from PDF with sources cited
- multi-step agentic retrieval augmented generation (RAG)
Of course, more work needs to be done to make sure that LLMs are robust with different queries, but this is a baseline template you can freely use for your own use cases as well. Check out the notebo...
Can LLMs be used in self-driving? CoMAL: Collaborative Multi-Agent LLM for Mixed Autonomy Traffic
264 views · 1 month ago
CoMAL is a very interesting paper which uses multi-agent collaboration to define leader / follower roles between autonomous vehicles; each agent then individually plans its velocity, acceleration and spacing from the car in front. It has a memory to draw information from and to update dynamically according to environment experience. Overall, there is a lot of merit to the architecture cre...
From TaskGen to AgentJo: Creating My Life Dream of Fast Learning and Adaptable Agents
579 views · 2 months ago
TaskGen is a framework that is a culmination of 5 years of thoughts during my PhD to build fast learning and adaptable agents. It uses a task-directed, memory-based mechanism to focus on tasks and learn from the environment, with knowledge sharing on a need-to-know basis. AgentJo is the continuation of TaskGen, as we scale it to multiple memory abstraction spaces, multiple agent augmentations, ...
Tian Yu X John: Discussing Practical Gen AI Tips for Image Prompting
124 views · 2 months ago
Speaker Profile: Tianyu is a generative AI practitioner, speaker and author. He founded a company dedicated to upskilling professionals in generative AI. He also chairs the GenAI Risk Chapter at RIMAS. With over 1,000 hours dedicated to generative AI tools, Tianyu has generated over 30,000 images and 10,000 videos. He holds a best-selling DALL-E course and recently published the book "Will ChatGPT...
Jiafei Duan: Uncovering the 'Right' Representations for Multimodal LLMs for Robotics
199 views · 2 months ago
Speaker Profile: Jiafei Duan is a third-year PhD student in robotics at the University of Washington’s Paul G. Allen School of Computer Science & Engineering, where he is part of the Robotics and State Estimation Lab, co-advised by Professors Dieter Fox and Ranjay Krishna. His research focuses on robot learning, embodied AI, foundation models, and computer vision. He is currently funded by the ...
TaskGen Tutorial 6: Conversation Wrapper
84 views · 3 months ago
I talk about how to wrap an agent in a conversational interface, for chatbots and interaction with the environment. This conversation wrapper also has persistent memory that can keep track of states throughout the conversation.
TaskGen Repo: github.com/simbianai/taskgen
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo
TaskGen Tutorial 5: External Functions & CodeGen
88 views · 3 months ago
Here, I go through how to integrate functions from other agentic frameworks easily, and how to get TaskGen to generate and run code!
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo
TaskGen Repo: github.com/simbianai/taskgen
TaskGen Tutorial 4: Hierarchical Agents
162 views · 3 months ago
Here, we cover how to use hierarchical agents in TaskGen!
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo
TaskGen Repo: github.com/simbianai/taskgen
TaskGen Tutorial 3: Memory
173 views · 3 months ago
Memory is very important for learning. We need different abstraction spaces of memory, each of them consolidating experiences in a different form. Then, when we need to retrieve the memories, we should take those memories that are similar to what we are experiencing right now. AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo TaskGen Repo: github.com/sim...
TaskGen Tutorial 2: Shared Variables and Global Context
132 views · 3 months ago
Here, we go through Shared Variables, a way to store and retrieve important information in a dictionary. We also go through Global Context, a way to put these Shared Variables into the agent's prompt.
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo
TaskGen Repo: github.com/simbianai/taskgen
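The two ideas in this description can be pictured with plain Python. This is only a conceptual sketch: `shared_variables`, `add_to_cart` and `build_prompt` are hypothetical names chosen for illustration and are not the TaskGen API.

```python
# Shared Variables: a plain dictionary that functions can read from and write to.
shared_variables = {"user_name": "Alice", "cart": ["apple", "pear"]}


def add_to_cart(shared: dict, item: str) -> None:
    """A function that updates shared state instead of returning it through the LLM."""
    shared["cart"].append(item)


def build_prompt(task: str, shared: dict, keys: list) -> str:
    """Global Context: render selected shared variables into the agent's prompt."""
    context = "\n".join(f"{key}: {shared[key]}" for key in keys if key in shared)
    return f"Global Context:\n{context}\n\nTask: {task}"


if __name__ == "__main__":
    add_to_cart(shared_variables, "banana")
    print(build_prompt("Summarise the cart", shared_variables, ["user_name", "cart"]))
```

Because the prompt is rebuilt from the dictionary each time, any update made by a function shows up in the next prompt without being passed around by hand.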
Beyond Strawberry: gpt-o1 - Is LLM alone sufficient for reasoning?
1K views · 3 months ago
gpt-o1 has Chain of Thought (CoT) likely already built into the dataset, perhaps by using methods such as Self-Taught Reasoner (STaR) to augment the dataset with rationales, or getting PhD students to provide the rationale. The key takeaway is that inference at runtime helps significantly on traditional problem solving domains like math and code, and gpt-o1's way of doing this has significant p...
TaskGen Tutorial 1: Agents and Equipped Functions
226 views · 3 months ago
In this TaskGen Tutorial Series, I will be going through how to use TaskGen. In this tutorial, we will cover the basics of how to use Agents and Equipped Functions for agentic pipelines.
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo
TaskGen Repo: github.com/simbianai/taskgen
TaskGen Tutorial 0: StrictJSON
222 views · 3 months ago
In this TaskGen Tutorial Series, I will be going through how to use TaskGen. In this tutorial, we will cover the basics of StrictJSON, an LLM output JSON parser with type forcing and iterative correction, and how to use it.
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo
TaskGen Repo: github.com/simbianai/taskgen
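To give a feel for what "type forcing and iterative correction" can mean, here is a minimal sketch of the general pattern. It is not StrictJSON's implementation or API: `parse_with_retries`, the `schema` format and the fake LLM are all assumptions made for illustration.

```python
import json


def parse_with_retries(llm, prompt, schema, max_retries=3):
    """Ask `llm` (any callable str -> str) for JSON matching `schema` (field -> type)."""
    error_note = ""
    for _ in range(max_retries):
        raw = llm(prompt + error_note)
        try:
            data = json.loads(raw)
            if not isinstance(data, dict):
                raise ValueError("expected a JSON object")
            result = {}
            for key, expected_type in schema.items():
                if key not in data:
                    raise ValueError(f"missing field '{key}'")
                result[key] = expected_type(data[key])  # type forcing, e.g. "42" -> 42
            return result
        except (ValueError, TypeError) as exc:
            # Iterative correction: feed the error back into the next attempt.
            error_note = f"\nPrevious output was invalid ({exc}). Return valid JSON only."
    raise ValueError("no valid output after retries")


def make_fake_llm():
    """Hypothetical LLM that fails once, then returns usable JSON."""
    replies = iter(["not json", '{"name": "Bob", "age": "42"}'])
    return lambda prompt: next(replies)


if __name__ == "__main__":
    print(parse_with_retries(make_fake_llm(), "Describe Bob.", {"name": str, "age": int}))
```

The first reply fails to parse, the error is appended to the prompt, and the second reply is accepted with `age` converted from a string to an integer.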
LLM-Modulo: Using Critics and Verifiers to Improve Grounding of a Plan - Explanation + Improvements
507 views · 3 months ago
Agentic Systems for Production: Tips and Tricks
681 views · 4 months ago
alphaXiv - Share Ideas, Build Collective Understanding, Interact with ANY open sourced paper authors
243 views · 4 months ago
TaskGen Overview: Open-Sourced LLM Agentic Framework - Task-Based, Memory-Infused, StrictJSON
2.8K views · 4 months ago
AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents
1.9K views · 4 months ago
NeoPlanner - Continually Learning Planning Agent for Large Environments guided by LLMs
506 views · 5 months ago
Michael Hodel: Reverse Engineering the Abstraction and Reasoning Corpus
1.6K views · 5 months ago
TaskGen Conversational Class v2: JARVIS, Psychology Counsellor, Sherlock Holmes Shop Assistant
146 views · 5 months ago
CodeAct: Code As Action Space of LLM Agents - Pros and Cons
693 views · 6 months ago
TaskGen Conversation with Dynamic Memory - Math Quizbot, Escape Room Solver, Psychology Counsellor
177 views · 7 months ago
Integrate ANY Python Function, CodeGen, CrewAI tool, LangChain tool with TaskGen! - v2.3.0
387 views · 7 months ago
Empirical - Open Source LLM Evaluation UI
341 views · 7 months ago
StrictJSON (LLM Output Parser) Ask Me Anything #1
588 views · 7 months ago
Where did OpenAI say they didn't use tree search? I think they do use tree search, specifically MCTS, for generating the synthetic data for o1; then at inference time they don't use tree search. The magic is in creating the synthetic data: they take a variety of paths, including some wrong paths of the tree search, and chain those with keywords like "but wait, the above is getting me stuck. Let's try this instead", then jump to another branch of the tree (that branch frequently does lead to the correct answer). The key is MCTS + "let's verify step by step" in my opinion, so they linearize the MCTS thought chains and train on that. Somewhere in there they're also using RL as another key ingredient. Looking forward to hearing your thoughts.
One more thing: take a look at Sasha Rush's video "Speculations on o1", where he describes 4 possible approaches and explains the stream of search approach. There are a number of problems with this approach, such as collapse and loss of generality (as you noted experiencing). But their "secret sauce" could really just be a lot of hard work to overcome these issues and scale the techniques.
Thanks for the insightful comments. I think tree search may be possible but it is extremely hard to get the heuristic for the nodes right. For example, in AlphaZero the value network is very hard to train and often leads to system collapse if initialised wrongly (I've trained AlphaZero before). OpenAI members have repeatedly said the underlying algo is very simple. I think tree search is good but may be too complex for self-learning.
I tried o1 with moderately complex questions regarding solar astronomy and o1 completely fell apart. It was useless. I would point out the contradictions in the answers and it kept apologizing.
Prompt that makes 4o behave like o1:
```
[Problem]
Do it out by the following format, taking care to reflect, verify, clarify all assumptions:
###Thoughts###
<Subtask 1 Name>
<Subtask 1 Result>
<Subtask 2 Name>
<Subtask 2 Result>
...
###Final Answer###
<Final Result>
```
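If you want to use this prompt programmatically, a small helper along these lines could fill in the problem and pull out the final answer. The function names are hypothetical, and no particular LLM client is assumed; you would pass the built prompt to whichever model you use.

```python
# The template mirrors the prompt format quoted above.
TEMPLATE = (
    "{problem}\n"
    "Do it out by the following format, taking care to reflect, verify, "
    "clarify all assumptions:\n"
    "###Thoughts###\n"
    "<Subtask 1 Name>\n<Subtask 1 Result>\n"
    "<Subtask 2 Name>\n<Subtask 2 Result>\n"
    "...\n"
    "###Final Answer###\n"
    "<Final Result>"
)


def build_prompt(problem: str) -> str:
    """Insert the problem statement at the top of the format instructions."""
    return TEMPLATE.format(problem=problem)


def extract_final_answer(response: str):
    """Return the text after the final-answer marker, or None if it is missing."""
    marker = "###Final Answer###"
    if marker not in response:
        return None
    return response.split(marker, 1)[1].strip()


if __name__ == "__main__":
    print(build_prompt("What is 17 * 3?"))
    sample = "###Thoughts###\nMultiply\n17 * 3 = 51\n###Final Answer###\n51"
    print(extract_final_answer(sample))
```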
Great content, thanks so much for sharing all of your videos!
What is the purpose of generating synthetic data from the model which would be used to improve itself? Wouldn't the synthetic data it produced contain the exact same biases as the model? How do you remove the inherent bias? More importantly, if it can produce expert data, why would it be used to fine-tune itself over it again considering the model was already able to produce the very same data? Does this feel like CoT or ReAct with extra steps?
@_PranavDesai You can actually do chain of thought prompting to get the model to output more detailed steps, which it may not do natively because web data is not in that format. Such understanding of reasoning steps can be transferred across domains by fine-tuning, resulting in a model that can do reasoning/chain of thought natively without the prompt. In most cases, you have a ground truth dataset to check whether the answer obtained by reasoning is correct, so you are more assured (though not 100%) that the model is generating the right reasoning traces. Btw, I myself do not believe models can actually reason like humans, but these reasoning traces serve as a chain of thought that helps guide better generation, so they play an important role.
One important reason for producing synthetic data from the model is that it helps the model represent its own knowledge; otherwise you would be feeding it knowledge from another source that it doesn't know anything about. Since we want models to be honest, which means they should learn what they do and don't know, self-generated data is the best way to make them hallucinate less.
AgentJo GitHub repo here: github.com/tanchongmin/agentjo
Great work and very practical too, kudos 👍 How much more time for going from agentjo to agentjohn? LOL
haha it's agentjo as it sounds friendlier
Check out AgentJo GitHub here! github.com/tanchongmin/agentjo
I like this more. But how does the tech differ from hypothes.is ?
Thank you John for this explanation! Will try it out!
I am quite interested to know the distribution of XOR/XNOR gates here. They are very good at non-linear classification. After training is complete, is the share of XOR/XNOR gates higher than that of other gates?
Can't wait to see how this will revolutionize math
may you provide code pls?
Since the connections are randomly initialized and then fixed, is it impossible to guarantee 100% accuracy even when the data is guaranteed to be noise-free?
I hear that size is a concern, but I feel the comparison is a bit off. The number of neurons in the logic gate net scales as n^2 with the input size, whereas in a classic neural net it scales linearly. However, the number of weights in a classic FC-NN also scales quadratically with the number of neurons, so the learnable parameter count scales quadratically in both. The logic gate net may look larger when drawn on paper, but I think it won't require more memory. In the paper they mention that they just save the random connections in the form of a seed number, so that does not require any space.
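The point about storing the wiring as a seed can be illustrated with a toy example. This is an assumption-laden sketch, not the paper's code: `make_connections` is a hypothetical helper that picks two input indices per gate, and the paper's actual scheme may differ.

```python
import random


def make_connections(seed: int, num_inputs: int, num_gates: int):
    """Regenerate the fixed random wiring (two input indices per gate) from a seed."""
    rng = random.Random(seed)
    return [
        (rng.randrange(num_inputs), rng.randrange(num_inputs))
        for _ in range(num_gates)
    ]


if __name__ == "__main__":
    wiring_at_training = make_connections(seed=42, num_inputs=16, num_gates=8)
    wiring_at_inference = make_connections(seed=42, num_inputs=16, num_gates=8)
    # Same seed, same wiring: only the integer 42 needs to be saved.
    print(wiring_at_training == wiring_at_inference)
```

Since the wiring is never trained, rebuilding it from the seed at load time gives the same network without saving the connection list.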
There’s no choice, but it feels like too many interfaces have changed. 😭
It will not work well...
I'm happy to hear alternate views. How to improve it?
No, if you have a pulse signal, it will have an infinite representation in the latent (frequency) space. No advantage! Also, this JEPA is nothing new. In the past it was called PLS (Partial Least Squares and the likes -- Kernel PLS).
The idea of a suitable abstraction space is still a good one. It is normally good to represent a signal in a space where its representation is simple. An infinite representation in frequency space just means that frequency may not be the right space for processing that signal. Maybe we should work in the time domain instead in that case.
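A small worked example of this exchange, using a discrete Fourier transform written out by hand: a single pulse is one non-zero sample in time but spreads across every frequency bin, while a pure tone fills every time sample but needs only two bins. The signal length of 8 is an arbitrary choice for illustration.

```python
import cmath
import math


def dft(signal):
    """Plain O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


N = 8
pulse = [1.0] + [0.0] * (N - 1)                          # compact in time
tone = [math.cos(2 * math.pi * t / N) for t in range(N)]  # compact in frequency

pulse_magnitudes = [abs(c) for c in dft(pulse)]
tone_magnitudes = [abs(c) for c in dft(tone)]

# Count the frequency bins that carry energy for each signal.
pulse_active_bins = sum(m > 1e-6 for m in pulse_magnitudes)
tone_active_bins = sum(m > 1e-6 for m in tone_magnitudes)

if __name__ == "__main__":
    print("pulse: active frequency bins =", pulse_active_bins)
    print("tone : active frequency bins =", tone_active_bins)
```

Which space gives the simpler representation depends on the signal, which is the sense in which the abstraction space has to be chosen to suit the data.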
Related video (TaskGen Paper): ua-cam.com/video/F3usuxs2p1Y/v-deo.html
Here's my discord group for the AgentJo discussion + Logo Design competition: discord.gg/bzp87AHJy5
Looking forward to following the development of AgentJo (very cool name), keep up the excellent work John 💪
AgentJo Repo (building on TaskGen for multi-agent systems): github.com/tanchongmin/agentjo TaskGen Repo: github.com/simbianai/taskgen
Book Link: www.amazon.com/Will-ChatGPT-Take-Job-Strategies-ebook/dp/B0D6Y8ZX5Y DALL-E course: www.udemy.com/course/the-dall-e-master-course/ Upcoming Video Generation Course in Singapore (Thursday 24 Oct 2024 4pm): lu.ma/p0g5ksm9
Thank you for this tutorial. Would love to see integrating LanceDB (multi-modal vector db) to TaskGen.
cool wrappers
The design philosophy and whole architecture of TaskGen was a pleasure to watch. I got to learn so much from it. I would love to contribute to the project. Do we have any next ideas/features that we want to develop for it? I have joined your Discord channel as well.
Thanks for the affirmation! I am intending to build a lot of different memory structures, you can help with that!
Pretty powerful. Thanks!!
Hi John. I was chugging along fine with the Ollama mistral-nemo model. Everything worked well right up to Tutorial 2 (Shared Variables). I improvised another function with an SQLite database, and that works fine as well. Then in Tutorial 3 on Memory I got: OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable. Does your memory method strictly require an OpenAI API key to continue? Looking forward to your reply. Cheers, and I appreciate your work very much.
The default memory uses OpenAI embeddings; let me do another with sentence transformers.
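One way to picture a memory that does not depend on a particular embedding provider is to make the embedding function a swappable argument. The sketch below is not TaskGen's memory class: `Memory`, `bag_of_words` and `cosine` are hypothetical names, and the bag-of-words embedding is only a toy stand-in for OpenAI or sentence-transformers embeddings.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Toy embedding: word counts. Swap in any other text -> vector function."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class Memory:
    """Stores texts with their embeddings and retrieves the most similar ones."""

    def __init__(self, embed=bag_of_words):
        self.embed = embed  # no API key needed unless the chosen embed function needs one
        self.items = []

    def add(self, text: str) -> None:
        self.items.append((text, self.embed(text)))

    def retrieve(self, query: str, top_k: int = 1):
        query_vector = self.embed(query)
        ranked = sorted(
            self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True
        )
        return [text for text, _ in ranked[:top_k]]


if __name__ == "__main__":
    memory = Memory()
    memory.add("the cat sat on the mat")
    memory.add("stock prices rose sharply today")
    print(memory.retrieve("where is the cat"))
```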
Update: Proudly rejected at EMNLP Industry Track 2024 for not having enough experiments!
Cognitive Overload is the key motivation to design multiple agents.
loved it. knowledge graph support would be pretty awesome.
Thank you very much for your work! I already tried it with Ollama and your examples did work from the first try, really good job!
Firstly, this explains the inner workings well (I had jumped directly to the code and could only understand about 70-80% of it, but watching this video made more sense of what's happening inside overall). Is there any way to track the number of API calls made for a given task, and even the total number of input tokens sent and output tokens generated? I looked briefly through the code and could not find it. I would be happy to contribute this if such functionality is required.
Indeed, we could have a counter in the llm function to track this. You could implement this yourself too in your own custom llm function. If you would like to contribute this functionality for the default llm, you can directly modify the chat function in base.py. Thanks!
@johntanchongmin That's exactly what I thought after a while. One method is to use a counter in the llm function directly, which will not alter the actual codebase.
Loved it. Eagerly waiting for the planning implementation in TaskGen.
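A counter around a custom llm function might look roughly like this. It is a sketch under assumptions: `CountingLLM` is a hypothetical wrapper, the wrapped function is assumed to take a system prompt and a user prompt and return a string, and whitespace-separated words are used as a crude proxy for tokens rather than a real tokenizer.

```python
class CountingLLM:
    """Wraps any llm(system_prompt, user_prompt) -> str callable and keeps usage totals."""

    def __init__(self, llm):
        self.llm = llm
        self.calls = 0
        self.input_tokens = 0   # approximate: whitespace-separated words
        self.output_tokens = 0  # approximate: whitespace-separated words

    def __call__(self, system_prompt: str, user_prompt: str) -> str:
        response = self.llm(system_prompt, user_prompt)
        self.calls += 1
        self.input_tokens += len((system_prompt + " " + user_prompt).split())
        self.output_tokens += len(response.split())
        return response


if __name__ == "__main__":
    def fake_llm(system_prompt, user_prompt):
        """Hypothetical stand-in for a real model call."""
        return "ok done"

    counted = CountingLLM(fake_llm)
    counted("You are helpful", "Say ok")
    counted("You are helpful", "Say ok again")
    print(counted.calls, counted.input_tokens, counted.output_tokens)
```

Because the wrapper is itself a callable with the same signature, it can be passed wherever the custom llm function is expected, leaving the framework code untouched.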
Bookmarking this one! Question: have you tried using OCR-capable LLMs like Gemini as an alternative to CLIP embeddings?
Have not yet, but will be very interested to see if there are good alternatives for CLIP embeddings. Multimodal LLMs and applications will be on the rise in the near future.
Where are these live streams conducted?
Usually on zoom. For details, go to my discord.
@johntanchongmin Thanks. Joined the Discord channel.
Love this on all levels. I have created multiple agent flows already which had impressive performance but I can’t wait to delve into hierarchical agents!
Very nice! Do share what you've created if you can!
The multi agent concept really brings up many ideas.