The 10 Most Cited AI Research Papers of 2024

  • Published Jan 11, 2025

COMMENTS • 67

  • @bycloudAI
    @bycloudAI  28 days ago +8

    Check out HubSpot's FREE AI Prompt Library Now! clickhubspot.com/h14h

  • @Happ1ness
    @Happ1ness 28 days ago +183

    Meta doing "Open"AI's job is still kinda surprising to me, lol

    • @user-cg7gd5pw5b
      @user-cg7gd5pw5b 28 days ago

      Are you insinuating that Saint Zuckerberg is otherwise untrustworthy?!

    • @Gamatoto2038
      @Gamatoto2038 27 days ago +14

      Shouldn’t OpenAI be renamed to closedai 😂

    • @peterjackson4530
      @peterjackson4530 26 days ago

      @@Gamatoto2038 Bang!

    • @JTient
      @JTient 21 days ago +1

      Yeah, I still don't trust Zuck, but good on him. I would rather have paid $12 to keep my privacy, so you aren't going to fool me again. Also, he was handed the DARPA LifeLog program. Even Elon scares me. The outcome doesn't look good; we have to flip the tables before it's too late.

  • @ichbin1984
    @ichbin1984 28 days ago +147

    The reason the technical reports are the most cited is that every time you use the models in your own research, you reference the technical report. So with 23k published papers, of course the technical reports will be at the top.

    • @aditya8anil
      @aditya8anil 26 days ago

      That’s something new I learned today

  • @I77AGIC
    @I77AGIC 28 days ago +25

    You need to either divide citations by the time each paper has been out, or make a graph of citations over time where each paper's release day is shifted to the same place on the x-axis. Then you would be able to see which papers grew the fastest. (A rough sketch of the per-day version follows this thread.)

    • @mrj774
      @mrj774 28 days ago

      Came here to make this comment 👏

    • @npc4416
      @npc4416 27 days ago +3

      Yeah, so rank by the growth rate of citations over time rather than by absolute citation count.

    • @heywrandom8924
      @heywrandom8924 26 days ago

      An example of shifted curves is available on the GitHub star history website, which allows comparing repositories.
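
    A minimal sketch of the per-day normalization suggested in this thread, in Python; the paper names, citation counts, and dates below are made up purely for illustration:

    from datetime import date

    # Hypothetical (title, total citations, publication date) entries -- not real data.
    papers = [
        ("Paper A", 1200, date(2024, 2, 15)),
        ("Paper B", 900, date(2024, 9, 1)),
        ("Paper C", 300, date(2024, 11, 20)),
    ]

    as_of = date(2025, 1, 11)  # the date the ranking is computed

    # Rank by citations per day in the open instead of by raw citation count.
    ranked = sorted(
        ((title, cites / max((as_of - pub).days, 1)) for title, cites, pub in papers),
        key=lambda item: item[1],
        reverse=True,
    )
    for title, per_day in ranked:
        print(f"{title}: {per_day:.1f} citations/day")

    The shifted-curve comparison mentioned above is the same idea applied day by day: re-index each paper's citation history to days since its own release before plotting.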

  • @LeBeautiful
    @LeBeautiful 28 days ago +6

    ByCloud with the amazing AI analysis videos... can't wait to see what's in store for your channel and AI as a whole in 2025.

  • @minecraftermad
    @minecraftermad 28 days ago +50

    Just barely missed Meta's new paper (Byte Latent Transformer), which seems like it'll change things a lot in the next year. Also, I'm very surprised nGPT isn't here.

    • @XenoCrimson-uv8uz
      @XenoCrimson-uv8uz 28 days ago +2

      Can you give me a summary of it?

    • @npc4416
      @npc4416 28 days ago +1

      Released 1 day ago.
      Meta's new Byte Latent Transformer (BLT) model outperforms tokenization-based models, up to their tested 8B parameter size. The conventional wisdom was that byte-level models couldn't be made stable or made to converge in training.
      Their main claim: "For fixed inference costs, BLT shows significantly better scaling than tokenization-based models."
      Traditionally, LLMs use tokenization, breaking text into predefined chunks (tokens) from a fixed vocabulary. BLT instead works directly with bytes via dynamic patching: rather than rigid, fixed-size tokens, it dynamically segments text into patches based on byte entropy.
      Byte entropy is a measure of information complexity that determines how much compute should be allocated to different text segments (higher entropy indicates more unpredictable or complex data). So instead of treating all text the same way, tokenization (predefined chunks, fixed vocabulary) is replaced by working directly with raw bytes (dynamic patching), and the result is improved performance on reasoning tasks, enhanced long-tail generalization, and superior character-level understanding.
      Quote: "BLT architecture trends between Llama 2 and 3 when using significantly larger patch sizes. The bpe tokenizers of Llama 2 and 3 have an average token size of 3.7 and 4.4 bytes. In contrast, BLT can achieve similar scaling trends with an average patch size of 6 and even 8 bytes. Inference flop are inversely proportional to the average patch size, so using a patch size of 8 bytes would lead to nearly 50% inference flop savings. Models with larger patch sizes also seem to perform better as we scale model and data size. BLT with patch size of 8 starts at a significantly worse point compared to bpe Llama 2 at 1B but ends up better than bpe at 7B scale. This suggests that such patch sizes might perform better at even larger scales and possibly that even larger ones could be feasible as model size and training compute grow."
      (A toy sketch of entropy-based patching follows this thread.)

    • @CantoTheDegenerate666
      @CantoTheDegenerate666 27 days ago +5

      @@XenoCrimson-uv8uz Basically it gets rid of tokenizers and interprets the input's bytes directly.

    • @erkinalp
      @erkinalp 27 days ago

      @@CantoTheDegenerate666 Except that even while processing byte by byte, the model tends to invent some kind of morphemes by itself.

    • @firecat6666
      @firecat6666 26 days ago

      @@CantoTheDegenerate666 So it's like a tokenizer but with a token for each individual character?
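
    A toy illustration, in Python, of the entropy-driven patching idea summarized in the thread above. The actual BLT uses a small byte-level language model's predictive entropy to place patch boundaries; the rolling empirical byte entropy, window size, and threshold here are stand-ins chosen just to show the shape of "spend more patches where the data is less predictable":

    import math
    from collections import Counter

    def rolling_entropy(window: bytes) -> float:
        """Shannon entropy (bits) of the byte distribution in a window."""
        counts = Counter(window)
        total = len(window)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    def patch(data: bytes, window: int = 8, threshold: float = 2.5, max_patch: int = 16):
        """Cut a new patch when local entropy is high or the current patch grows too long."""
        patches, start = [], 0
        for i in range(window, len(data)):
            high_entropy = rolling_entropy(data[i - window:i]) > threshold
            if high_entropy or (i - start) >= max_patch:
                patches.append(data[start:i])
                start = i
        patches.append(data[start:])
        return patches

    # Predictable runs end up in long patches; the noisy tail gets cut into short ones.
    for p in patch(b"aaaaaaaaaaaaaaaaaaaaaaaa then suddenly: qX9#zL!"):
        print(p)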

  • @김인영-q5x
    @김인영-q5x 28 days ago +4

    Thank you. I have been learning about LLMs in general. This video helped me a lot!

  • @moomanchicken6466
    @moomanchicken6466 28 days ago +6

    9:55 These are distribution graphs, so they show that there is variance in the accuracy rather than that the accuracy is deteriorating.

  • @hydrargyrumm
    @hydrargyrumm 27 days ago +3

    thanks, I'll get started now.

  • @noctarin1516
    @noctarin1516 28 days ago +8

    Can you cover Meta's Byte Latent Transformer and Coconut (Training Models to Reason in a Continuous Latent Space)?

  • @RedOneM
    @RedOneM 28 days ago +9

    I wonder in how many papers ChatGPT is a ghostwriter...

  • @CantoTheDegenerate666
    @CantoTheDegenerate666 27 days ago +2

    I hope you make a video on Byte Latent Transformers and Large Concept Models, both from Meta (THE GOAT). These two imo are complete gamechangers!

  • @npc4416
    @npc4416 28 days ago +6

    Very interesting... I wish I knew what the future of the AI/LLM space is going to be. We know that scaling transformers is giving diminishing returns, as seen at top AI labs like OpenAI, Meta, Google, etc. So I wonder which of these techniques will be the next big thing that we scale to go further... Will it be Mamba, or KAN, or maybe diffusion LMs? Who knows, only time will tell...

    • @2034-SWE
      @2034-SWE 27 days ago

      Diminishing returns? OpenAI?

    • @yannickm5429
      @yannickm5429 26 days ago +1

      @@2034-SWE If we consider scaling transformers only, then yes, diminishing returns. The latest advances are based on reasoning capabilities, not on even more compute. The transformer architecture has almost reached its limit with regard to scaling and performance benefits. I'm not saying it won't be overcome or that we won't switch architectures, but this is the current state.

    • @npc4416
      @npc4416 24 days ago

      @@yannickm5429 Yes, exactly. The transformer architecture plateaued, so now everyone is looking for the next big thing, like OpenAI did with o1. They claim that large reasoning models are the next big thing, but if we look at the results of the latest o1 paper, these reasoning models don't seem to scale well; for example, in some cases o1-preview gives better results than the full o1, so maybe this architecture is not all about scale. We'll see. We also have to see whether these reasoning models are actually as good as OpenAI claims. Yes, they are better, but they are still sometimes only as good as other LLMs; Claude 3.5 Sonnet (new), for example, is just an LLM yet is on the same level as o1. So maybe LRMs are not that big of a deal and we need a truly novel architecture from the ground up. As Ilya Sutskever said, the age of scaling transformers is over; now we need to find a replacement for pretraining itself. Let's see...

    • @npc4416
      @npc4416 7 days ago +1

      Yes, so now we scale test-time compute instead @@2034-SWE

  • @yagzhandag3803
    @yagzhandag3803 18 days ago

    How did you sort the papers by citation count on arXiv?

  • @XiangyuChen-t1q
    @XiangyuChen-t1q 24 days ago

    How do you sort these papers by citation numbers?

  • @BertVerhelst
    @BertVerhelst 28 days ago

    Do you think a Llama 3.3 7B model will be released?

  • @Steamrick
    @Steamrick 27 days ago

    Do any papers from November (or December at this point) even have any citations yet? I mean, someone has to read the paper and then write and publish a paper of their own for a citation to exist... how much can a paper be worth if it was farted out in less than a month?

  • @geckoo9190
    @geckoo9190 19 days ago

    Hey, that website is great; it has a lot of scientific papers, although it seems geared toward engineering and technology. I can't find much about microbiology.

  • @shadowoftokyo
    @shadowoftokyo 28 days ago

    Where are the weekly banger research posts in the community tab, though? I miss them.

  • @badizzl
    @badizzl 25 days ago

    I just found a paper from Meta AI about Large Concept Models.
    I'm still a layman, but it sounded very promising for coherence and energy consumption.
    So far it works with text-to-concept and speech-to-concept encoders and a concept-to-text decoder, but I think it could work with other modalities (e.g. video) too, if you make encoders/decoders for them.
    I can't explain it. Just read it for yourself. (A rough sketch of the idea follows below.)
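
    A very loose sketch, in Python with PyTorch, of the encoder/decoder wiring that comment describes. The class names, layer shapes, and CONCEPT_DIM are invented for illustration; the actual Large Concept Models paper reasons over fixed-size sentence embeddings in a pretrained embedding space rather than the toy mean-pooled vectors below:

    import torch
    import torch.nn as nn

    CONCEPT_DIM = 1024  # hypothetical size of the shared "concept" space

    class TextToConcept(nn.Module):
        """Maps a token sequence to a single concept vector (toy stand-in)."""
        def __init__(self, vocab: int = 32000, hidden: int = 512):
            super().__init__()
            self.embed = nn.Embedding(vocab, hidden)
            self.proj = nn.Linear(hidden, CONCEPT_DIM)
        def forward(self, token_ids):                             # (batch, seq)
            return self.proj(self.embed(token_ids).mean(dim=1))   # (batch, CONCEPT_DIM)

    class ConceptToText(nn.Module):
        """Maps a concept vector back to token logits (toy stand-in)."""
        def __init__(self, vocab: int = 32000, hidden: int = 512):
            super().__init__()
            self.proj = nn.Linear(CONCEPT_DIM, hidden)
            self.out = nn.Linear(hidden, vocab)
        def forward(self, concept):                                # (batch, CONCEPT_DIM)
            return self.out(torch.relu(self.proj(concept)))        # (batch, vocab)

    # The appeal of the design: the model in the middle only ever sees concept
    # vectors, so adding another modality (speech, video) just means adding another
    # encoder/decoder pair into the same CONCEPT_DIM space.
    enc, dec = TextToConcept(), ConceptToText()
    concept = enc(torch.randint(0, 32000, (1, 12)))
    print(concept.shape, dec(concept).shape)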

  • @callmebiz
    @callmebiz 28 days ago

    Have improvements in pure CV models plateaued? Or are we just not noticing because LLMs are what everyone's been talking about for the past 2 years?

  • @ddoice
    @ddoice 28 days ago +1

    Noice video, but you should normalize the citations to citations per day.

  • @myliu6
    @myliu6 28 days ago

    Pretty clear that transformers dominated this year. I'm curious to see the most cited papers in other fields like diffusion or RL. After all, the biggest breakthroughs usually come from where not everyone is looking.

  • @Neuroszima
    @Neuroszima 28 days ago +8

    "AI and ML"? Bro, it's only NLP in there, or NLP-related paper analysis, maybe with a twist of image generation XD

    • @versaknight
      @versaknight 28 days ago

      Yeah lol.

    • @jmoney4695
      @jmoney4695 28 days ago +1

      Well, LLMs dominated the conversation, so when ranking by citations, it makes sense.

    • @Neuroszima
      @Neuroszima 28 days ago +1

      @@jmoney4695 Yeah, I know, it's understandable, but it still made me laugh when he said "and that's it in the news of *AI and ML*", like, bro XD...

  • @human_shaped
    @human_shaped 28 days ago

    They should have just been weighted by days since publication.

  • @alkeryn1700
    @alkeryn1700 28 days ago +5

    So close to 32768!

    • @khanghoutan4706
      @khanghoutan4706 28 days ago +1

      This is such a nerdy comment

    • @npc4416
      @npc4416 28 days ago +1

      pls explain

    • @smohanta9016
      @smohanta9016 28 days ago

      @@npc4416 It's 2^15, one past the max value of a 16-bit signed integer (32767).

    • @alkeryn1700
      @alkeryn1700 28 days ago

      @@npc4416 32768 is a power of 2 (2^15); programmers deal with those pretty often.
      The number of AI papers published in 2024 is close to that number.

  • @Delmaler1
    @Delmaler1 28 days ago +1

    This list is biased towards early papers, because they have had more time to be cited.

  • @surajsamal4161
    @surajsamal4161 28 days ago +1

    Bro, why don't you put out more videos? Love your videos, btw.

  • @mfpears
    @mfpears 24 days ago

    2024 is far from over.

  • @rickyrickster1303
    @rickyrickster1303 28 days ago +1

    6:54

  • @KAZVorpal
    @KAZVorpal 28 days ago +1

    It's a shame that the Apple paper demonstrating what we experts knew,
    that LLMs don't reason,
    isn't on the list.
    People don't like the truth.
    Ah, I see that you did give a monthly...but that you don't understand its impact.
    LLMs don't reason.
    They just look up answers, one token at a time.

  • @LiebsterFeind
    @LiebsterFeind 28 days ago +2

    I am horribly disappointed that you did not cover all 34,276 papers in this video. Shame! 🤣

  • @locusruizlopez5997
    @locusruizlopez5997 26 days ago

    So much information 😅... This is so fast.

  • @panzerofthelake4460
    @panzerofthelake4460 28 days ago

    Please look into Meta's AI papers, the one about BLT (Byte Latent Transformer, or something along those lines) and Coconut (chain of continuous thought). Please.

  • @Ari-pq4db
    @Ari-pq4db 19 days ago

    Awesome

  • @DmitriZaitsev
    @DmitriZaitsev 27 days ago

    Please remove the disturbing background music; it's not possible to concentrate on the video.

  • @joemonteithcinematography7477
    @joemonteithcinematography7477 28 days ago +1

    Wow I am early!

  • @fionnanobaoighill
    @fionnanobaoighill 28 days ago +2

    gee pee tee

  • @StefanDeleanu
    @StefanDeleanu 28 days ago

    GG

  • @renanmonteirobarbosa8129
    @renanmonteirobarbosa8129 28 days ago

    How about the top 10 worst papers?

  • @aron2922
    @aron2922 23 days ago

    This is a really bad way to find interesting papers

    • @Ahmed.Shaikh
      @Ahmed.Shaikh 13 days ago

      I'm trying to find interesting papers and would love to know what a better way would be to gauge interest in a given research paper...