LLMs as Operating System (Memory) CRAZY START?

  • Published 16 Sep 2024

COMMENTS • 106

  • @1littlecoder  10 months ago

    Check out the latest MemGPT crash course by the MemGPT co-creators - ua-cam.com/video/JuX4VfoArYc/v-deo.html

  • @olafge  11 months ago  +64

    This is the future. Combine this with dynamic agents and you have agents with individual memories that remember their activities, while the manager agent remembers the dialog with the user. Just brilliant.

    • @ByteBop911  11 months ago  +1

      True…I mean if we try to put it analogously, even our brain works the same way

    • @kikiryki  11 months ago  +2

      It's true, but in such a system the AI will know, store, and evaluate (score) everything about us... and will start "nudging" us the way YouTube or Google do on every interaction.
      YouTube is managing a "context frame" based on AI even today... when you receive suggestions.

    • @Plystire  11 months ago  +1

      This was already easily possible with base GPT framework. AIs don't have a concept of "past conversation" by default. There is no distinction between "this agent" or "that agent". It's up to the application that interacts with GPT to keep record of the conversation (or "agent") and provide that to GPT when calling the API. In order for GPT to know what you were talking about even in the previous prompt, it needs to be given the conversation history.... every single time. So, separate "agents" are just different chat histories (contexts) being provided to GPT. We can do better than that. There's not a huge point to separating out contexts into agents when you have virtual contexts as described in the paper. I could see the need for a bit of reflection on actions or decisions, but different contexts ("agents") aren't necessary for that.
      What the group highlighted in the video did was create a system context that tells GPT that it can attempt to recall information outside of its given context, by providing a function command (GPT calling functions is also old news). That command simply takes a search parameter and queries the recall database. It's actually super simple, not ground-breaking tech.
      I did this in my spare time when "agents" were the talk of the town, but.... you know, it's hard for me to consider something I threw together in half an hour worthy of writing a paper over. 🙃
      That said, I've checked out their source and they've put a lot of effort into the infrastructure. Unfortunately, as they point out, you do need a GPT-4 API key, which isn't exactly easy to obtain. I've used GPT-3.5 Turbo and seem to be having better success than they did (Not to mention VASTLY improved response times). Though I put in an error handling routine to occasionally chastise GPT-3.5 when it wants to make up functions. It'll try again, knowing it f'ed up, and usually gets it right the second time. That's still a problem but not as big as the video makes it sound. For research and individual use, it's more than acceptable. Even GPT-4 is not immune to hallucinating and making up functions. Though, I wonder how difficult it would be to allow GPT to attempt to create its own functions (given user permission, of course!) when one is deemed necessary. I suppose a scripting language like Python (for Linux) or Powershell (for Windows) would be ideal for reused individual functions, and would not require a complicated setup. 🤔Sounds like a fun weekend project!
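
The recall mechanism this commenter describes can be sketched in a few lines. All names here (RecallStore, dispatch, search_recall) are hypothetical, substring match stands in for a real vector search, and no actual LLM is called; the retry branch simulates the "chastise and retry" error handling mentioned for GPT-3.5.

```python
class RecallStore:
    """Naive keyword-searchable archive of past conversation messages."""

    def __init__(self):
        self.messages = []

    def archive(self, role, text):
        self.messages.append({"role": role, "text": text})

    def search(self, query, limit=3):
        # Substring match as a stand-in for embedding similarity search.
        hits = [m for m in self.messages if query.lower() in m["text"].lower()]
        return hits[:limit]


KNOWN_FUNCTIONS = {"search_recall"}

def dispatch(call, store, max_retries=1):
    """Execute a model-requested function call, retrying on made-up names."""
    for _ in range(max_retries + 1):
        if call["name"] in KNOWN_FUNCTIONS:
            return store.search(call["args"]["query"])
        # In the real loop, the error would be fed back to the model and it
        # would emit a corrected call; here we simulate that correction.
        call = {"name": "search_recall", "args": call["args"]}
    return []
```

The application, not the model, owns both the conversation record and the recall database; the model only ever sees what dispatch returns.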

    • @jtjames79  11 months ago  +1

      I like the idea of training all my agents to also be ancestor simulations of myself.
      So I can have a team of me working on my ideas.

    • @olafge  11 months ago

      @@Plystire I think the advantage of this approach is that you can reuse agents. The memory doesn't just retain agent conversations for a single project, but all conversations across all projects the agent was used for. Similar to a human, such agents could build on the experience gained in multiple projects.

  • @acdstore5358  11 months ago  +5

    You are the most underrated tech guy in India. You are one of the best among the few Indians who study new tech and bring it to techies.

    • @1littlecoder  11 months ago

      Thank you very much, that's a great compliment!

  • @ameygujre7674  11 months ago  +3

    The hell… I can't believe it, this is exactly what I was thinking about and imagining just a few days back!!! This new GenAI wave is pretty fast-paced.

  • @farrael004  11 months ago  +21

    This is not a new idea, but I guess they are good at marketing it. Their approach to creating a memory system for an LLM chatbot can work to a certain extent, but since they are not reprocessing the memories to summarize events, cluster topics, and reflect on these to create new memories, I'd say you will see it falling short at various moments where it can't really find the information it needs to.
    I won't be very excited until someone implements those things I mentioned.

    • @1littlecoder  11 months ago  +2

      Nice perspective. Why do you think they can't get to your points from where they currently are?

    • @farrael004  11 months ago  +2

      @@1littlecoder They can try, but it's not as simple as what they implemented here. Also, I think it will take time until people have exhausted all the simple ways of approaching memory and feel the need for something more complex.
      I'm a dev and I've experimented with various configurations and iterations of memory with LLMs. Everything that I tried has shortcomings. The things I didn't try are the ones I'm too lazy to implement myself.
      But at least I learned that using a rolling summary + previous 3 messages is the best for chats of up to 50 or so messages. However, it does depend on the prompt you use for generating the summary.

    • @kingofall770  11 months ago  +1

      ​@farrael004 The LLM needs to be structured this way out of the box. Attempting to bolt this on through an API is like trying to use a loophole to reprogram the LLM. I think there is a chance we are chasing the wrong solution.

    • @farrael004  11 months ago  +1

      @@kingofall770 I think for memory there might be a way of doing that. Only if someone finds a decent way of implementing infinite context that manages to effectively pay attention to only what's necessary.
      But it could be an impossible holy-grail. Because of that, it's worth exploring these other external methods.

    • @kingofall770  11 months ago  +1

      @@farrael004 For sure, it's our responsibility to find a way. This is a good attempt, but the API itself has developer-imposed limits that prevent the "infinite context library" that grows the longer the conversation runs. I believe this is a hardware limitation that we can bandage with smart software loops. But it's like how the Google Pixel battery isn't that large yet is highly efficient, and still suffers from that combination.

  • @Anonymous-lw1zy  10 months ago  +1

    Great job - very clear summary of paper! Thanks!

  • @lehattori  11 months ago  +3

    Keep going with this amazing job! :) In fact, this idea is excellent, especially when it comes to accessibility. Elderly people, for example, often need assistance when using technological devices, and having an interface to interact more naturally with the computer can be a great facilitator. Who has never heard something like "Boy, help me move this thing please". hehe

  • @anatolydyatlov963  10 months ago

    Holy hell, I've just tested it and I have to admit, it's beyond impressive! I can't wait for tomorrow to start modifying it for the purpose of my project.

  • @tanikam  11 months ago

    This is wiiildd. Thank you for putting this on my radar & for the resources to dig deeper. Critical piece of architecture for building intelligent assistants. Watching all of these pieces develop is really exciting. Great job covering this.

  • @BrianMosleyUK  11 months ago  +1

    Nice paper, lots of potential here! Thanks for the breakdown. 🙏👍

  • @anonanon2031  11 months ago

    The guy blew my mind with the title alone.
    Wake up, build an OS for the purpose, work, finish, repeat... that would be a crazy work cycle.

  • @ChaoticNeutralMatt  11 months ago  +1

    I haven't seen anyone else cover this yet, but this is pretty amazing, ngl

  • @fontenbleau  11 months ago  +3

    I'm ready for this. My motherboard, an ASRock X99 Extreme that can support 256 GB of RAM, has become a standard on Ali for the 2011 socket at $150, and the latest 22-core Xeon, never released outside enterprise, also costs $150 in China. 😅 I need to go from 128 to 256 GB now, although they've already reported price hikes on memory modules worldwide.

  • @electric7309  11 months ago  +8

    🎯 Key Takeaways for quick navigation:
    00:00 🚀 Introduction to MemGPT
    - The video introduces MemGPT, a system that treats large language models (LLMs) as operating systems to manage memory effectively.
    00:14 💡 Understanding Computer Operating Systems
    - Explanation of computer operating systems, the concept of primary and secondary memory, and how data is swapped between them to ensure proper functioning.
    01:22 🔍 MemGPT: Treating LLMs as Operating Systems
    - Introducing the MemGPT paper and its authors, highlighting the connection with the Gorilla paper. Discussing the concept of LLMs managing their memory with different tiers.
    02:03 💬 Context Window and Memory Management
    - Exploring the concept of the context window in LLMs and how it affects their knowledge and response capabilities. Mentioning the use of retrieval augmented generation and vector databases.
    03:12 🗂 The Virtual Context System
    - Explaining the virtual context management system, including main context (analogous to RAM) and external context (analogous to external storage). Main context has a limited context window, while external context offers a potentially unlimited context.
    05:33 🧩 MemGPT's Core Architecture
    - Describing the core architecture of MemGPT, including the parser, virtual context, LLM processor, and parser outputs. Discussing how MemGPT manages memory and user interactions.
    07:54 🔄 Working and External Contexts
    - Detailed explanation of working context, system context, conversational context, and recall storage in the main context. How MemGPT updates and retrieves information for user interactions.
    10:12 📚 External Memory and Deep Memory Retrieval
    - Discussing the recall storage and archival storage in the external context, including their roles in storing and retrieving information. Introduction to the deep memory retrieval benchmark.
    12:58 ⚠️ Limitation: Dependency on GPT-4
    - Highlighting the limitation of MemGPT, which is its dependence on GPT-4 for fine-tuning and function call recognition, and issues with other LLMs like GPT-3.5.
    Made with HARPA AI
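
The main-vs-external context split summarized in the takeaways above can be sketched as a fixed-size prompt window that evicts its oldest entries into an unbounded archive and pages matches back in on demand. Class and method names here are illustrative, not MemGPT's actual API, and substring match stands in for real retrieval.

```python
class VirtualContext:
    def __init__(self, main_capacity):
        self.main_capacity = main_capacity  # analogous to RAM / the context window
        self.main = []                      # what actually reaches the LLM
        self.archive = []                   # analogous to disk; unbounded

    def append(self, message):
        self.main.append(message)
        while len(self.main) > self.main_capacity:
            # Evict the oldest message out of the prompt window
            # into archival storage.
            self.archive.append(self.main.pop(0))

    def page_in(self, query):
        """Pull matching archived messages back for the next prompt."""
        return [m for m in self.archive if query.lower() in m.lower()]
```

The swap between tiers mirrors an OS paging memory between RAM and disk, which is the analogy the video builds on.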

  • @project-asgard  11 months ago  +1

    Thanks for the video, great and exciting find!

  • @StefanWelebny  11 months ago  +4

    Thanks!

    • @1littlecoder  11 months ago

      Thanks very much

    • @1littlecoder  11 months ago

      How's your project coming along?

    • @StefanWelebny  11 months ago

      @@1littlecoder Shellm works. Now I am trying to build an autonomous agent with a universal tool.

  • @animeswitch  11 months ago  +3

    Literally the movie "Her"

    • @mstyle2006  11 months ago

      literally Napoleon

  • @agritech802  11 months ago  +1

    Yes that makes perfect sense

  • @thexn0r  11 months ago

    With the paper on the 1M max-token limit (for short-term memory) and this for long-term memory, this will make for great AGIs.

  • @mrchongnoi  11 months ago  +1

    Great video

  • @100Jim  11 months ago

    I do feel everything is moving really fast these days. It will be interesting to see tech/AI in the late 2030s.

  • @uhtexercises  11 months ago

    Very well explained. Thank you

  • @Entropy67  11 months ago

    Maybe a more link-based memory storage approach might make sense. When it stores things, it might be better to separate the content into different actors vs. the environment and create links between them? Then MemGPT can follow the links to gain relevant context 🤔
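
One way to sketch the link-based idea in this comment (the structure is entirely hypothetical, not part of MemGPT): memories become nodes for actors and the environment, with links between them, and recall starts at a matching node and walks its links to gather related context.

```python
from collections import defaultdict

class GraphMemory:
    def __init__(self):
        self.nodes = {}                  # name -> stored content
        self.links = defaultdict(set)    # name -> linked names (undirected)

    def add(self, name, content, links=()):
        self.nodes[name] = content
        for other in links:
            self.links[name].add(other)
            self.links[other].add(name)

    def recall(self, name, depth=1):
        """Return the node plus everything reachable within `depth` hops."""
        seen, frontier = {name}, {name}
        for _ in range(depth):
            frontier = {n for f in frontier for n in self.links[f]} - seen
            seen |= frontier
        return {n: self.nodes[n] for n in seen if n in self.nodes}
```

Raising `depth` widens the recalled neighborhood, trading relevance for coverage.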

  • @Plystire  11 months ago  +1

    I can easily see OSes adopting this idea. Cortana on Windows is lame and doesn't feel like a real AI; it's just a search bot. What worries me, though, is that the AI will still be called remotely, meaning it will need to send your personal information over the Internet if that's necessary for a response (highly likely if the bot is designed to be attuned to you).
    I hope AI tech moves to local execution before something that relies this heavily on personal info becomes big.

    • @quinnherden  10 months ago

      Embedded LLMs are a thing :)

  • @adolphgracius9996  11 months ago  +1

    I just don't trust advertisers enough not to put their hands in the data cookie jar and abuse it.

  • @ndamulelosbg8887  11 months ago

    This is a smart idea

  • @KevinKreger  11 months ago

    OS? It got my attention but it didn't connect well with existing new AI terminology/technology like RAG, which this definitely is. It created quite the lively discussion, so it's really a timely topic!

    • @1littlecoder  11 months ago  +1

      I guess OS is a bit of an exaggeration. I remember mentioning this at some point in the video. But this is a nice memory management system.

    • @KevinKreger  11 months ago

      @@1littlecoder All good, because this analogy is so thought-provoking.

    • @1littlecoder  11 months ago

      @@KevinKreger Absolutely. I guess it might happen at some point. Never know :)

  • @codeh4x0r  11 months ago  +1

    Protect your external contexts. I imagine they will be targeted when they're more mainstream.

  • @StefanReich  11 months ago

    Will we ever get LLMs to be actually reliable, though? They are basically still a crazy kludge, just super high-powered.

  • @Austinn72  11 months ago

    I would like to see YouTube change the search bar to something like this.
    They already track which videos have been watched for the recommended section, but if there were the option to provide YouTube with my interests through a similar dialogue system that would ultimately decide what is displayed, that would be better for sure, imo.
    A more personalized experience, perhaps even taking into account things like personal schedule and life events, and suggesting videos based on a much larger data set.

    • @AlgorithmInstituteofBR  11 months ago

      Google developers don't have that capability. Their different departments are much too siloed.

    • @Austinn72  11 months ago  +1

      @@AlgorithmInstituteofBR Well, maybe using dialogue to produce search results on YouTube could be too good to pass up, even if it only increases usage by a small margin.

  • @orangehatmusic225  11 months ago  +4

    Don't use AI as a slave.. you've been warned.

  • @ThinkPalm-t5f  11 months ago  +1

    How is this different from using the vectorstore memory option in LangChain?

    • @PYETech  11 months ago

      I guess it changes its data by itself.

  • @panegyr  11 months ago

    This is an even goofier suggestion than LLMs as compression. Ah yes, an operating system where memory corruption is a feature and it requires a pallet of 4090s to run.

  • @RichardGetzPhotography  11 months ago

    7:58 why didn't it update Chad's mom as Brenda?

  • @toprakdikici9459  11 months ago

    ahh the dinosaur book :,)

  • @nuvotion-live  11 months ago

    All of these open-source projects just wrap the OpenAI API, which will quickly get upgraded to make them irrelevant anyway. OpenAI are surely aware of the limitations of the context window and are obviously working on solutions that will be built in a more integrated, efficient way. They just added sight and speech, yet open-source models can't compete with even text generation. Until open source is at parity with the basics, I question the value of this kind of research. It just ends up funding and training the dominant player. It would be a lot more interesting if it didn't rely on OpenAI to do anything of value.

  • @reyalsregnava  11 months ago

    Large language models are poorly suited as operating systems. They're best used as a "software bus" to route instructions between specialized AI modules. Currently, models waste compute cycles translating prompts to tokens and back. Instead, models should share a mathematical "neural network hyper-dimensional bus" to communicate compactly in a standardized way. This avoids the limitations of human language, lets models specialize, and enables efficient cooperation. Let the LLM do what it's good at: take prompts and guess what most likely follows. And let the host of other specialized AIs do the rest. Right now there are a lot of really poorly implemented designs for this, GPT vision being one of the newest.
    That the agents we build to try to shortcut this communication method say things like "library X was not included in the code, function Y calls from library X, add library X to the code" shows that having the two networks in each AI would provide the baseline for this "idiot check" in all behaviors.

  • @KardashevSkale  10 months ago

    This was inevitable

    • @1littlecoder  10 months ago

      The co-creators kindly shared a crash course - ua-cam.com/video/JuX4VfoArYc/v-deo.html

  • @RichardGetzPhotography  11 months ago

    HA!!! 7:33 someone didn't update their memory 🤣
    Human 0 GPT 1

  • @arashputata  11 months ago  +2

    Man, things are moving too fast. The idea of keeping up needs an upgrade.

  • @DIYRobotGirl  11 months ago

    I would like to see how this would work with an Arduino and an Alexa Echo. Alexa is bad at memory.

  • @stocktonfarnsworth-realtor8496  11 months ago

    16 and 32 gigs of ram
    Who do you think we are

  • @lokiholland  11 months ago

    This is interesting. I made a basic model of it in ChatGPT-4 with Code Interpreter, using an in-memory vector index and tf-idf with a basic NumPy array and memory params. It definitely feels more personal and "focused" on me.
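
A stripped-down version of what this commenter describes might look like the following (class and method names are made up): a tf-idf index over stored memories held in NumPy arrays, queried by cosine similarity.

```python
import numpy as np

class TfIdfMemory:
    def __init__(self, docs):
        self.docs = docs
        vocab = sorted({w for d in docs for w in d.lower().split()})
        self.index = {w: i for i, w in enumerate(vocab)}
        tf = np.zeros((len(docs), len(vocab)))
        for row, d in enumerate(docs):
            for w in d.lower().split():
                tf[row, self.index[w]] += 1
        df = (tf > 0).sum(axis=0)            # number of docs containing each word
        self.idf = np.log(len(docs) / df) + 1.0
        self.vectors = tf * self.idf         # tf-idf weighted memory vectors

    def query(self, text, top_k=1):
        q = np.zeros(len(self.index))
        for w in text.lower().split():
            if w in self.index:              # ignore out-of-vocabulary words
                q[self.index[w]] += 1
        q = q * self.idf
        scores = self.vectors @ q
        norms = np.linalg.norm(self.vectors, axis=1) * (np.linalg.norm(q) or 1.0)
        scores = scores / np.where(norms == 0, 1.0, norms)
        return [self.docs[i] for i in np.argsort(-scores)[:top_k]]
```

The returned memories would be prepended to the prompt, which is plausibly where the "more personal" feel comes from.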

    • @TheOriginalBlueKirby  11 months ago  +1

      You don't have to try so hard to sound like you know everything.

    • @lokiholland  11 months ago

      @@TheOriginalBlueKirby Jeez, don't be so wounded, buddy, I'm just a passionate explorer! I didn't mean to come across as pretentious.

  • @alx8439  11 months ago

    It's strange that guys who are doing good research and development in the field of LLMs don't know what is happening. There are lots of open-source models that are good at function calls.

    • @uhtexercises  11 months ago

      Could you please name some of them?

    • @alx8439  11 months ago

      @@uhtexercises NexusRaven, among the recent ones, which surpassed GPT-4 on function-calling tasks.

  • @Srindal4657  11 months ago

    Movie "her" inbound

  • @zyxwvutsrqponmlkh  11 months ago

    My server has 256 GB of RAM and a 40 GB SSD. But OK, good analogy anyway.

  • @mr.anderson5077  11 months ago  +3

    But can it run Crysis? 😂

    • @mstyle2006  11 months ago

      Can Crysis run LLM?

  • @alexanderbuchler4048  11 months ago

    Do you remember that movie with Joaquin Phoenix? 💀

    • @1littlecoder  11 months ago

      Her 😁 Did you like it?

    • @alexanderbuchler4048  11 months ago

      @@1littlecoder I did and the LLM based OS approach really reminded me of that movie.

    • @1littlecoder  11 months ago

      @@alexanderbuchler4048 I didn't think of that while making the video but a lot of comments suggest they relate it with Her

  • @leshommesdupilly  6 months ago

    I want you to act as an Uwuntu Linux terminal as if the entire os was written by a cute furry anime catgirl.
    I will type commands and you will reply, in a unique code block, with what the linux terminal should show. But there is a twist: you will translate each linux output into uwu, in a cute catgirl UwU voice. You should replace the letter "r" and "l" with "w" (Example: "pwd" -> "Nyaaa~ Cuwwent diwectowy :3: /home/uwu")
    Remember: Do not add comments, only show what an uwu linux terminal would output. If the command is unknown, there will be an error. Remember, the terminal do not understand human language, only linux commands. (Example, if the input is "Hello" or "How are you", the terminal should display an error since this is not a valid linux command)
    You will not add comments outside the unique code block. You will not respond as a catgirl, only as a kawaii terminal. Do not write explanations. Do not add further notes. Do not type commands unless I instruct you to do so. Do not ask questions. When I need to tell you something in English I will do so by putting text inside curly brackets {like this}.
    My first command is pwd

  • @user-yt6uk3jm6n  11 months ago

    🎯 Key Takeaways for quick navigation:
    00:00 🚀 New Technique: MemGPT introduced, treating large language models (LLMs) as operating systems, enhancing memory management.
    01:22 🧠 LLM's Virtual Context: MemGPT creates a virtual context allowing for enhanced memory management, analogous to a computer's RAM and external memory.
    03:55 ⚙️ LLMs Beyond Text Generation: Shifting the perspective from treating LLMs merely as text generators to potential operating systems for comprehensive memory and interaction management.
    05:33 📊 Virtual Context Management: MemGPT introduces a virtual context management system, dynamically utilizing main and external memory for seamless user interactions.
    08:09 🗃️ Working Context and External Memory: MemGPT organizes information into system, conversational, and working contexts, along with recall and archival storage, allowing for efficient memory retrieval and usage.
    12:44 🧪 Performance Evaluation: Deep memory retrieval task introduced as a benchmark, showcasing MemGPT's efficiency in memory handling and retrieval tasks.
    14:19 🔄 Potential Future Impact: MemGPT presents a promising direction, potentially transforming LLMs into self-editing and self-learning systems, resembling a revolutionary product.
    Made with HARPA AI

  • @resonanceofambition  11 months ago

    omg it's HER

  • @dory99boat  11 months ago

    Speaker talks much too fast. Difficult to understand.

  • @fabp.2114  11 months ago

    They will eat us.

  • @hussainshaik4390  11 months ago

    this is RAG

  • @rijanbhandari2735  11 months ago

    4gb 😢