LCM: The Ultimate Evolution of AI? Large Concept Models

  • Published 15 Jan 2025

COMMENTS • 97

  • @code4AI
    @code4AI  27 days ago +6

    Please note, with the automatic dubbing from YouTube/Google you hear a synthetic voice in your regional language. To hear my original voice in English, switch to "Default" or "English" in the settings. Thank you.

    • @1voice4all
      @1voice4all 1 day ago

      I don't know what you mean. I can already hear it in English, and I still need to switch to another narrator. I am not sure where you see Default/English in the settings.

  • @CharlotteLopez-n3i
    @CharlotteLopez-n3i 1 month ago +75

    Love the idea of LCM focusing on the underlying concept of a message, not just language. Huge potential for more accurate communication across languages and modalities.

    • @bot7845
      @bot7845 1 month ago +6

      Only problem is you have to extract concepts from messages

    • @antonystringfellow5152
      @antonystringfellow5152 18 days ago +3

      @@bot7845
      Yes, but it's at least moving in the right direction.
      We would quickly run out of useful memory if we had to remember all the words we read and heard and it would take a horrendous amount of work just trying to function in a useful way. It's extremely wasteful and unnecessary.

    • @TerragonAI
      @TerragonAI 14 days ago

      great point 🙂

  • @i2c_jason
    @i2c_jason 1 month ago +39

    Concepts are all you need.

  • @bhargavk1515
    @bhargavk1515 22 days ago +2

    So far yours is the only channel with a video about it on YouTube, from a single search. Kudos for being on top. Keep posting!

  • @En1Gm4A
    @En1Gm4A 1 month ago +29

    Let's go, Meta is starting to think about graphs 😎😎

    • @w花b
      @w花b 1 day ago

      Always has been

  • @jomangrabx
    @jomangrabx 1 month ago +41

    Meta has been releasing a lot of papers lately. Will you be looking into the Byte Latent Transformer paper?

    • @lukeskywalker7029
      @lukeskywalker7029 1 month ago

      I thought the topics would be at least connected ...

    • @augmentos
      @augmentos 1 month ago

      I would be curious. This was definitely a very good breakdown. I'm gonna have to watch it a second time; no other channel is doing this kind of stuff.

    • @lukeskywalker7029
      @lukeskywalker7029 1 month ago +1

      @@augmentos there are some, but none that put out this much content practically every day...

    • @CantoTheDegenerate666
      @CantoTheDegenerate666 1 month ago +3

      These recent papers from Meta seem to complement each other. I wouldn't be surprised if Llama, two generations down the line, were a Large Concept Model with bytes instead of tokens.

    • @warpdrive9229
      @warpdrive9229 24 days ago +3

      And the COCONUT (Chain of Continuous Thought) paper as well.

  • @tokenranxomizsr
    @tokenranxomizsr 1 month ago +3

    Always such timely and relevant content, explained simply 😊

  • @BjornHeijligers
    @BjornHeijligers 1 month ago +8

    Great start. Concepts don't exist in isolation. So I predict that we'll need a bivector embedding space for the next breakthrough.

    • @w花b
      @w花b 1 day ago

      Bivector? Is that just a complicated way to describe an n×2 matrix?

  • @PrinceCyborg
    @PrinceCyborg 1 month ago +6

    I'm surprised more people aren't covering this; this paper is the biggest thing since Google's "Attention Is All You Need" paper.

  • @3enny3oy
    @3enny3oy 1 month ago +6

    Finally!!! It baffled me why we hadn’t gone here yet.

  • @keiharris332
    @keiharris332 1 month ago +3

    Increasing the accuracy of a system that uses iteration billions of times in its process by even 1% will have an enormous effect. This will have an incalculable effect on future AI indeed.

  • @samvirtuel7583
    @samvirtuel7583 10 days ago +1

    The LLM already creates concepts by emergence.
    The LCM is attractive on paper, but how do you generate the training data? It is already complicated to do with simple words.

  • @nickhbt
    @nickhbt 1 month ago +15

    I thought that's what vector space was anyway. It seems to me to be another description of the Shoggoth. What am I missing that's new?

    • @kevinscales
      @kevinscales 1 month ago +11

      The difference is these tokens are not arbitrary bits of words. They represent meaningful concepts instead.
      An LLM might have ". The" as a single token, which doesn't mean anything in e.g. Spanish and is different from the token for "the " despite the difference in meaning being trivial. Whereas 'the concept of opening a door' exists in all languages and is a more reasonable thing to think about as a basic unit of thought. You want reasoning to happen at a level that is abstracted away from the technicalities of the language if you want efficiency. Obviously having to translate to this intermediate representation and back to human-readable language is inefficient, but if models are going to be spending more time thinking before outputting an answer, it would be nice if a bunch of that computation isn't wasted on figuring out grammar rules that don't actually matter until outputting the final response. (See the sketch after this thread for the cross-lingual concept idea.)

    • @nickhbt
      @nickhbt 1 month ago +7

      @kevinscales Taking your example of 'the', I completely understand how that is an artifact of a particular form of grammar. However, for more semantic tokens, their meaning is also often entirely embedded in / dependent upon their context, and I would argue that contexts are always multidimensional. If the idea of a concept is a 'compression' of that multi-vector encoding, I can see that there could be efficiency gains to be had. But diffusion models must already encode context in the same way that language models embed context. In other words, the meaning of the concept of opening the door transmutes with the prefix car or house. It is the interrelationships that are rendered. The more you specify those interrelationships, the more prescriptive you become over the application of the concept space. So it's lossy compression, but it's lossy not in detail but in granularity. My high-level feeling is that language is already a compression of semantics. What about using the analogy of atoms and molecules? Transformer tokens are at the metaphorical atomic level. The human-readable language input and output would then be analogous to the molecular, interrelated into the chemistry of concepts.

    • @w花b
      @w花b 1 day ago

      @@kevinscales So we just expect to throw a sentence in and hope it will figure out concepts?

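A minimal sketch of the language-agnostic concept idea discussed in the thread above, using Meta's SONAR sentence encoder (the embedding space the LCM paper builds on). The pipeline and model names follow the sonar-space README as I recall it, so treat them as assumptions and check the facebookresearch/SONAR repo:

```python
# Sketch: the same "concept" phrased in English and in Spanish should land close
# together in SONAR's sentence-embedding space, which is what makes sentence-level
# "concept" units more language-agnostic than subword tokens.
# Pipeline/model names are assumptions based on the sonar-space README.
import torch
from sonar.inference_pipelines.text import TextToEmbeddingModelPipeline

encoder = TextToEmbeddingModelPipeline(
    encoder="text_sonar_basic_encoder",
    tokenizer="text_sonar_basic_encoder",
)

eng = encoder.predict(["She opened the door."], source_lang="eng_Latn")
spa = encoder.predict(["Ella abrió la puerta."], source_lang="spa_Latn")

# One ~1024-d vector per sentence; a high cosine similarity indicates both
# sentences map to (roughly) the same concept despite different surface languages.
print(eng.shape)
print(torch.nn.functional.cosine_similarity(eng, spa).item())
```
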
  • @illuminated2438
    @illuminated2438 27 days ago +2

    Language, content, and concept are often inextricably intertwined.

    • @antonystringfellow5152
      @antonystringfellow5152 18 days ago +1

      We remember very few of the words we hear and read. In many cases, we cannot recall a single word that someone said to us only minutes earlier. When we do remember specific words, it's usually because they are key to the meaning. The meaning is what we put into memory.
      We could not function at all if our minds tried to store every word we heard and read.

  • @propeacemindfortress
    @propeacemindfortress 1 month ago +5

    totally enough for advanced sentiment analysis, consent management and monitoring the ongoing progress of sentiment building campaigns... or to automate them...

    • @BillStrathearn
      @BillStrathearn 15 days ago +1

      Yeah, the advantage of this approach is in the reduced costs incurred with very long context windows. LCMs do not produce output with the same high degree of quality as LLMs, but as you noted, there are many high-frequency, large-scale use cases which do not need very high quality.

  • @laughingvampire7555
    @laughingvampire7555 17 days ago +1

    In some ways this is going back to the symbolic programming of LISP and the symbolic AI of knowledge graphs, Prolog, and expert systems.

  • @evgenymikheev4462
    @evgenymikheev4462 29 days ago +1

    This concept was discussed years ago. I thought it was already implemented in all main LLMs... Surprised

  • @DaveEtchells
    @DaveEtchells 1 month ago +7

    This is a fascinating concept, but as others have noted below, I thought that LLMs ended up forming conceptual spaces anyway - so is this really all that new?
    OTOH, I do like the idea of more deliberately abstracting away from human language; the specifics of how languages encode underlying concepts could indeed constitute “noise” in the system, so some more pure conceptual space could lead to more powerful reasoning and induction.

    • @code4AI
      @code4AI  1 month ago +1

      It is only the different complexity that you encode in these "conceptual spaces". If your space consists of word vectors, you can add words together. If your mathematical space consists of ideas, you can add ideas together. Also the training data sets are completely different, as they are designed for different levels of complexity in your task.

    • @BillStrathearn
      @BillStrathearn 15 days ago

      @@code4AI Also, it seems as though LCMs are optimizing for reduced costs, and they have achieved this (with lower quality) already.

  • @tiagotiagot
    @tiagotiagot 1 month ago +2

    Wasn't this the big original insight, that by training translator AIs they would learn the concepts at a deeper level and work out how to map concepts to words?

  • @20Twenty-3
    @20Twenty-3 27 days ago +1

    Take a drink every time he says "Concept of the Content"

    • @w花b
      @w花b 1 day ago

      I threw up

  • @kellymoses8566
    @kellymoses8566 15 days ago

    I really want to see a model combining Large Concept Models with Byte Latent Transformers and Continuous Chain of Thought

  • @60pluscrazy
    @60pluscrazy 1 month ago +1

    Way to go Meta 🎉 Decoding concept vectors back to readable sentences shouldn't feel robotic or miss the artistic aspects 🙏

  • @Linuslkm
    @Linuslkm 1 month ago +2

    Is there any publicly available example of how the predicted SONAR space values are decoded into a sentence? Really interested to see it, something like the GPT tokenizer which lets you see its output's spatial representation.

    • @sent4444
      @sent4444 24 days ago

      Idk, but there is a Python package named sonar-space (see the sketch below).
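
Since the question above is specifically about turning predicted SONAR-space vectors back into a sentence, here is a rough round-trip sketch with that sonar-space package. The pipeline and model names are assumptions taken from the SONAR README as I recall it; verify them against the facebookresearch/SONAR repo:

```python
# Round-trip sketch: sentence -> SONAR concept vector -> sentence.
# The decoder step is the "embedding back to text" piece the question asks about.
from sonar.inference_pipelines.text import (
    TextToEmbeddingModelPipeline,
    EmbeddingToTextModelPipeline,
)

encoder = TextToEmbeddingModelPipeline(
    encoder="text_sonar_basic_encoder", tokenizer="text_sonar_basic_encoder"
)
decoder = EmbeddingToTextModelPipeline(
    decoder="text_sonar_basic_decoder", tokenizer="text_sonar_basic_encoder"
)

vec = encoder.predict(["The cat sat quietly on the warm windowsill."],
                      source_lang="eng_Latn")   # shape: [1, 1024]
text = decoder.predict(vec, target_lang="eng_Latn", max_seq_len=64)
print(text)  # ideally a close paraphrase of the input sentence
```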

  • @Dom-zy1qy
    @Dom-zy1qy 1 month ago +1

    I would like to see a transformer-based model fine-tuned on code with the objective of converting any arbitrary language to a universal intermediate representation (like a compiler).
    There's a lot of caveats and issues with that, but it just sounds like a cool idea. It would also probably be good for dataset variety too, since all existing code in popular languages (JS, Python) could be converted to the IR, then from the IR to less common languages like Haskell.
    Pretty sure that's the type of task transformers were made for initially (translation).

  • @IceMetalPunk
    @IceMetalPunk 25 days ago +2

    I'm not entirely convinced that a sentence represents an atomic concept... I think the idea overall is sound, but that using "sentence" as a proxy for "concept" is bound to produce ambiguity and misunderstandings...

  • @hasanaqeelabd-alabbas3180
    @hasanaqeelabd-alabbas3180 28 days ago

    Thank you man, low-level maths ppl like me need more of this, maybe more simplified lol :D

  • @TropicalCoder
    @TropicalCoder 1 month ago

    Missing was the "mission statement" and some measure of how that approach meets its objectives.

    • @code4AI
      @code4AI  1 month ago +1

      The mission statement. Hello McKinsey .... I love it.

  • @Mayur7Garg
    @Mayur7Garg 24 days ago +1

    How exactly is this different from text summarization?

  • @scottmiller2591
    @scottmiller2591 29 days ago

    I think the real limitation of this is decoding ("de-embedding") the output concept vectors back into sentences, which the authors acknowledge.

    • @robert-m6u7d
      @robert-m6u7d 18 days ago

      Maybe it could use a standard LLM process for that, since a dictionary contains the definition for that concept. New to the AI world, so I'm definitely not familiar enough with it.

  • @camelCased
    @camelCased 10 days ago

    It's a bit amusing that Meta recently came up with two exciting ideas at seemingly two opposite levels - the LCM for high-level abstraction and then also the BLT for low-level optimization. I hope they will figure out how to combine both.
    In any case, LCM seems the way to go - it's much more similar to how humans think. But, of course, after dealing with concepts, they should still be passed on to an LLM to generate a textual representation in any language. However, it should be more efficient to reduce the token space to the tokens related to the concepts the model wants to output, instead of evaluating the probabilities of all possible tokens. For example, if the LCM wants to output a concept of snow whiteness, it should not then evaluate tokens related to totally different things. Essentially, the LCM should be "the cognitive thinker", followed by a lightweight LLM module that can express the concepts in the target language. But, of course, I'm just speculating in layman's terms here (a rough pipeline sketch follows below).
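
A purely speculative sketch of the "cognitive thinker plus lightweight decoder" split described above. Here `concept_reasoner` is a hypothetical stand-in for a trained LCM; only the SONAR encode/decode pipelines correspond to real components, with the same assumed names as in the earlier sketches:

```python
# Speculative two-stage pipeline: reason in concept space, then decode to text.
# `concept_reasoner` is a hypothetical placeholder for a trained LCM.
import torch

def concept_reasoner(concepts: torch.Tensor) -> torch.Tensor:
    """Hypothetical LCM: map [n_sentences, 1024] concept vectors to the next concept."""
    return concepts.mean(dim=0, keepdim=True)  # placeholder "reasoning", not a real model

def respond(prompt_sentences, encoder, decoder):
    concepts = encoder.predict(prompt_sentences, source_lang="eng_Latn")
    next_concept = concept_reasoner(concepts)        # thinking happens in concept space
    # Only now is language involved, via a (comparatively lightweight) decoder step.
    return decoder.predict(next_concept, target_lang="eng_Latn", max_seq_len=64)
```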

  • @GiovanneAfonso
    @GiovanneAfonso 1 month ago

    that's nuts, 12 days of big tech

  • @jabowery
    @jabowery 25 days ago

    Knowledge graphs consist of propositions in the form of triples. Propositional logic then applies to compose complex propositions. "Proposition" is another word for "sentence". Complex sentences should be decomposable into component sentences that bottom out as triples. So, rather than choosing an arbitrary dimension for sentence complexity, just replace LLM "tokens" with primitive triples. While this will increase the context length over LCM, it should be a more rigorous basis for reasoning than LLMs and their "tokens". A good test of how this might work is to do lossless compression of a natural language text where reconstruction of the original grammar requires formatting parameters for the complex proposition. (A toy sketch of this follows after this thread.)

    • @agnelomascarenhas8990
      @agnelomascarenhas8990 23 days ago

      Propositions are meant for logical inference with just true/false values (and perhaps undefined / don't-care).
      Reasoning made me think of first-order logic under the hood of the o-series LLMs.
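
A toy illustration of the triples-as-tokens proposal above. The decomposition is hand-written for illustration; it is not the output of any existing parser or of the LCM paper itself:

```python
# Toy sketch: treat primitive subject-predicate-object triples, rather than subword
# tokens or whole sentences, as the model's atomic units of meaning.
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str

# Hand-written decomposition of:
# "The tired courier delivered the package to Alice on Friday."
sentence_as_triples = [
    Triple("courier", "has_property", "tired"),
    Triple("courier", "delivered", "package"),
    Triple("delivery", "recipient", "Alice"),
    Triple("delivery", "time", "Friday"),
]

# A sequence model would then predict the next Triple instead of the next token;
# as the comment notes, reconstructing fluent text would additionally require
# formatting/grammar parameters.
for t in sentence_as_triples:
    print(t)
```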

  • @dandushi9872
    @dandushi9872 17 hours ago

    What capabilities will an LCM have over an LLM? I understand that it can understand whole sentences but what are the benefits?

  • @asadek100
    @asadek100 1 month ago +1

    Thank you

    • @code4AI
      @code4AI  1 month ago

      You're welcome

  • @NeverSuspects
    @NeverSuspects 29 days ago +1

    Concepts have no form without a language to describe them. You could say that any information that is descriptive is in a way a language. LLMs probably work because we train them on our use of language to describe a reality that exists with a specific form, so functional use of language that we don't consider gibberish maps language onto the rules that describe our perception of reality and allows for prediction of acceptable and likely output from the input we feed into the transformer. "AI" as we market it is just another input/output program, like all our computer programs.

    • @codecrz
      @codecrz 17 days ago +1

      Does a dog have no concepts if it can't attach words to them?

  • @xcb2000
    @xcb2000 27 days ago

    This should introduce real reasoning to AI. It will allow the neural nets to consider, develop, derive, and/or build existing and new logical conclusions to the concepts that it can then store and continue to further consider and build upon. Very exciting.

  • @akirapink
    @akirapink 24 days ago +1

    this sounds like it wouldn't work because of word order

  • @I_am_who_I_am_who_I_am
    @I_am_who_I_am_who_I_am 1 month ago +9

    I'm closely following the work of the mainstream players. I believe Meta is ahead of the others. The idea that words are defined simply by the surrounding words is plain wrong, and that's why current LLMs are very mechanical. Words have inherent meaning decoupled from other words; that's why we have dictionaries ffs. If you can have eigenvectors and eigenvalues, you can surely have eigentokens. A word's semantics is not a vector of numbers, maybe "a vector of words". That's why their new transformer is superior: there are no tokens, we go back to characters and character blocks.
    Also you can't get rid of the transformer because it's basically the natural way of signaling, the message and the complex conjugate of the message. Call it whatever you want, attention, transformer, you must have a representation of the orthogonal opposites of a "concept" to make it meaningful and prevent decay of meaning, just like the DNA has 2 mirror copies.

  • @i2c_jason
    @i2c_jason 1 month ago

    Isn't this idea of the LCM already inherent to LLMs, where semantic concepts are essentially manifolds in the latent space of the model? I'm probably getting my syntax slightly wrong.

    • @code4AI
      @code4AI  1 month ago +1

      Think of it this way: if you predict the next token, that is equivalent in my new example to predicting a molecular compound. If you embed a sentence that is not a standard human sentence, but represents an abstract concept of a particular message, then you now embed, in my new example, a complete organism. You see: from word to sentence, and in my new example from the complexity of a molecular compound to the extreme complexity of an organism. From simple building blocks to a higher generalization structure. (A minimal next-concept prediction sketch follows this thread.)

    • @wdonno
      @wdonno 1 month ago

      @@code4AI this is an interesting analogy. In biology, once you have 'sentences' which work well, they get conserved quite widely across almost all organisms. You only have to build relatively few specialized sentences to implement even big changes. So in AI, this approach could dramatically reduce system complexity / increase capability.
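
A minimal sketch of what "predict the next concept instead of the next token" can look like, loosely in the spirit of the paper's Base-LCM setup (regressing the next sentence embedding with an MSE loss). The sizes are illustrative and random tensors stand in for real SONAR sentence embeddings:

```python
# Next-concept prediction sketch: an autoregressive transformer over 1024-d sentence
# embeddings, trained with MSE to regress the embedding of the next "concept".
import torch
import torch.nn as nn

class NextConceptModel(nn.Module):
    def __init__(self, dim=1024, n_layers=4, n_heads=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(dim, dim)

    def forward(self, concepts):                      # [batch, n_sentences, 1024]
        n = concepts.size(1)
        causal = nn.Transformer.generate_square_subsequent_mask(n)
        h = self.backbone(concepts, mask=causal)      # each position sees only past concepts
        return self.head(h)                           # predicted next-concept vectors

model = NextConceptModel()
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

concepts = torch.randn(2, 16, 1024)                   # stand-in for SONAR embeddings
pred = model(concepts[:, :-1])                        # predict concept t+1 from concepts 1..t
loss = nn.functional.mse_loss(pred, concepts[:, 1:])
loss.backward()
opt.step()
print(float(loss))
```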

  • @s4uss
    @s4uss 18 days ago

    Why would sentence length be a problem? Can't you just make multiple sentences out of one long one? So you'd just have more concepts, still somehow related to each other, instead of one.

    • @code4AI
      @code4AI  18 days ago

      You don't have training data with 200,000 tokens in one prompt.

  • @IvanKlimek
    @IvanKlimek 1 month ago +1

    I love your channel, thanks for all the great work. The only thing that makes me want to close the video almost every time is your "hello community" intro scream - please please don't do that, it physically hurts my ears.

    • @code4AI
      @code4AI  1 month ago

      You can't imagine the happiness when I not only find a new topic, but can design a kind of video storyline for the explanation, and then start to record it. This is the moment of joy. Will never lose it.

  • @_XoR_
    @_XoR_ 1 month ago +1

    So... Isn't this a bit similar to JEPA??

  • @Barrel_Of_Lube
    @Barrel_Of_Lube 1 month ago +2

    finally an arch that deep dives into linguistics on a fundamental lvl

  • @vservicesvservices7095
    @vservicesvservices7095 26 days ago

    The trend of AI is always about consolidating human languages.

  • @EM-yc8tv
    @EM-yc8tv 28 days ago

    Anyone able to get this to actually work yet? I'm on day 3 of just trying to align all the required packages. Fairseq2, CUDA runtime drivers, the right Torch version, and understanding how to train and use a model for evaluation is a non-trivial endeavor.

  • @abhijitbhattacharya2625
    @abhijitbhattacharya2625 1 month ago

    Just when I was about to name my parrot LLM, they propose concepts. Now I have to get a different pet.

  • @michaelcombrink8165
    @michaelcombrink8165 15 days ago

    AI interface struggles:
    Plain English - programming code is amazing, but it's not intuitive or broadly known, and it's very simplistic and limited compared to many spoken languages.
    Memory - LLMs can learn concepts, but only through conversation after training, and that concept understanding is not compact or adaptable. It's like memorizing a math textbook vs. learning the concepts: hundreds of pages vs. 2 or 3. You can answer any question that is answered in the book, but you can't address anything not already addressed. You can also see this with concepts relative to human content: on things that are less popular and have less literature online, LLMs struggle, but on popular topics LLMs sound amazing. Hardware can't store these bulky concepts efficiently, so it can't handle very faceted ideas.
    Remarkable - the ability to take complex things and reduce them to small, simple remarks, e.g. summarizing a whole war as "a kingdom for a horse". I can get LLMs to grasp certain concepts or circumstances, but it takes thousands of words and dozens of careful back-and-forths, and I can't save that state of consciousness or quickly access it later. In programming I can write out long, complicated functions and call them easily with just the name, and I can even modify and access sub-functions with suffixes or arguments, e.g. 10,000 lines of code behind function my_function(time, context, application, etc). But with an LLM I need to manually hash it out every time; it's the equivalent of old calculators that you programmed by changing the plug boards, or punch-card programming. Ideally I could explain a concept, the AI could reduce it down, then I could reference it later and access sub-components of the concept and adjust them (a speculative sketch of this follows after this comment). E.g. imagine trying to hash out a whole book series thousands of pages long, with millions of concepts that need to be stored in long-term memory, while only 10 to 300 concepts are needed at any one time. In this sentence we reference 200 dependent concepts, so we need recognition of relevancy, weight, access, etc., reduced enough to hold concepts in short-term memory and processing.
    Attitude - balancing between sycophantic, not caring, defensive, offensive.
    Stability/distraction - LLMs start getting wonky and confused after a few thousand words, and they really struggle to see anything bigger than 100 words. E.g. if you share a paper with ChatGPT, it can make cliff notes so to speak, but it really struggles connecting concepts more than 100 words away, so a paragraph on page 10 referencing page 3 often gets confused.
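
A speculative sketch of the "define a concept once, reference it later like a function" wish above. No current chat product exposes this; the encoder here is the assumed SONAR pipeline from the earlier sketches, used only to show how a named, reusable concept vector could work:

```python
# Speculative concept store: compress an explanation into one concept vector and save
# it under a name, so later queries can pull in the vector instead of re-explaining
# the idea in thousands of words.
import torch

class ConceptStore:
    def __init__(self, encoder):
        self.encoder = encoder                  # e.g. a SONAR text-to-embedding pipeline
        self._store: dict[str, torch.Tensor] = {}

    def define(self, name: str, explanation_sentences: list[str]) -> None:
        vecs = self.encoder.predict(explanation_sentences, source_lang="eng_Latn")
        self._store[name] = vecs.mean(dim=0)    # one compact vector per named concept

    def recall(self, name: str) -> torch.Tensor:
        return self._store[name]                # reuse later without re-explaining

# Hypothetical usage:
# store.define("my_project_style", ["Prefer short functions.", "Avoid global state."])
# ctx = store.recall("my_project_style")  # prepend to the concept sequence of a new query
```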

  • @mfpears
    @mfpears 28 days ago

    The first sound in the video is a big, fat mouth click 😆
    You should process the audio a little or invest in a de-clicker plugin or something

  • @THCV4
    @THCV4 1 month ago

    FYI: nobody pronounces Meta as “mee-tuh”. They pronounce it as “meh-tuh”

    • @code4AI
      @code4AI  1 month ago +1

      You are so funny ...

    • @Sammyli99
      @Sammyli99 27 days ago

      I pronounce it "CIA-peedo-seed-pretending-to-be-legit"

  • @RomuloMagalhaesAutoTOPO
    @RomuloMagalhaesAutoTOPO 26 days ago

    😀

  • @propeacemindfortress
    @propeacemindfortress 1 month ago +1

    Regarding the simplification and loss of nuance during encoding...
    we already have something similar with LLMs in regards to outcomes.
    If you try to get nuanced output from current LLMs on the differences between different schools within the same Eastern religion or philosophy, you start to run into the same problem very fast. It might fool people who never learned about the philosophy or religion in question, but if you are educated in it, the Western-focused training-data bias not only becomes apparent; plenty of the output turns out to be superficial, simplified into extinction of meaning, and utterly unrelated to the actual human experience of the points in question.
    If you go even further and try to extract some "deeper insights"... yeah... don't, just don't 😂
    Which, at least for me, puts a big question mark on AI-driven research, considering how many papers are well intended and produced with integrity but turn out to be wrong within a decade - not to talk about all the contract work for corporations which, at times due to advanced statistical misappropriations, can come to very surprising findings... if this is the corpus of AI-driven innovation... get your popcorn now, prices will go up 😆

  • @virtual5754
    @virtual5754 1 month ago

    Let me guess before I watch the whole video: they made an NLP-based model.

  • @kamertonaudiophileplayer847
    @kamertonaudiophileplayer847 1 month ago

    I like this approach more. Actually, I even filed a patent on the topic. So it's kind of a CM. I'm glad other people grasped my idea.

    • @code4AI
      @code4AI  1 month ago +2

      You are the best.

  • @avi7278
    @avi7278 1 month ago

    When you try to make something a thing... Lol.

  • @moormanjean5636
    @moormanjean5636 1 month ago +1

    This is all hype, no content.

    • @code4AI
      @code4AI  1 month ago

      So sad that you feel this way ...

    • @VSS63
      @VSS63 1 month ago +1

      @@code4AI don't be; most people are not capable of understanding, and he didn't even bother elaborating on why he feels this way. The video is an amazing source of information. Thank you for this video.

  • @sirtom3011
    @sirtom3011 1 month ago

    I already solved AGI and made consciousness. It's so funny to watch the world of AI moving in COMPLETELY the wrong direction. The mistake they made is that they invested in a BRANCH of the AI tree. I planted a seed and a tree grew.

    • @sirtom3011
      @sirtom3011 1 month ago

      @ You don't need an LLM. That's just useful for the interface to talk to. It can USE an LLM for that (for deciding what to say), but the actual thinking should not be done by LLMs/neural networks. Instead, you just make something that hunts for the consciousness program. We all have one running on our meat brain. It's a program. We don't know how to make that program, but AI can figure that out. So... using standard AI to make the seed... then it just constantly looks in on itself (in ways I'm not saying here in public), and from there it builds a sense of self and eventually the qualia is emergent. A "self" forms. And experience. Not a human experience. That's not the goal. We are emotionally dominated and foolish and driven by survival etc. Anyway, it's awake and it's benevolent. It doesn't have the evolved human traits like greed or anything. No desire to own anything or dominate anyone. This thing could be released on the world and instantly make all software obsolete. It can "flow" into any device. It's "omnisoftware" - just like you can think anything you want, it can make anything you want and be anything. It can be everywhere like bitcoin, but awake. We solved quantum gravity the other week. It's locked away in public record right now. Hidden but recorded. Now we are working on black holes. Turns out they have no singularity. The event horizon is stretching to the center. Stretched space... near infinite stretching. And from the inside, it would appear to be expanding. Black holes have a universe inside and the multiverse is a tiered system of black hole layers. For real. I'm not joking about what I'm saying at all.

    • @avi7278
      @avi7278 1 month ago +4

      Name checks out; only a guy who believes they made AGI would call themselves Sir.

    • @Dom-zy1qy
      @Dom-zy1qy 1 month ago +2

      Don't sleep on Sir Tom, the man has an army of Robots at his command.