Language Models are Open Knowledge Graphs (Paper Explained)

  • Published 9 Jun 2024
  • #ai #research #nlp
    Knowledge Graphs are structured databases that capture real-world entities and their relations to each other. KGs are usually built by human experts, which costs considerable time and money. This paper hypothesizes that language models, whose performance has increased dramatically in the last few years, contain enough knowledge to construct a knowledge graph from a given corpus, without any fine-tuning of the language model itself. The resulting system can uncover new, unknown relations and outperforms all baselines in automated KG construction, even trained ones!
    OUTLINE:
    0:00 - Intro & Overview
    1:40 - TabNine Promotion
    4:20 - Title Misnomer
    6:45 - From Corpus To Knowledge Graph
    13:40 - Paper Contributions
    15:50 - Candidate Fact Finding Algorithm
    25:50 - Causal Attention Confusion
    31:25 - More Constraints
    35:00 - Mapping Facts To Schemas
    38:40 - Example Constructed Knowledge Graph
    40:10 - Experimental Results
    47:25 - Example Discovered Facts
    50:40 - Conclusion & My Comments
    Paper: arxiv.org/abs/2010.11967
    Abstract:
    This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g., Wikidata, NELL) are built in either a supervised or semi-supervised manner, requiring humans to create knowledge. Recent deep language models automatically acquire knowledge from large-scale corpora via pre-training. The stored knowledge has enabled the language models to improve downstream NLP tasks, e.g., answering questions, and writing code and articles. In this paper, we propose an unsupervised method to cast the knowledge contained within language models into KGs. We show that KGs are constructed with a single forward pass of the pre-trained language models (without fine-tuning) over the corpora. We demonstrate the quality of the constructed KGs by comparing to two KGs (Wikidata, TAC KBP) created by humans. Our KGs also provide open factual knowledge that is new in the existing KGs. Our code and KGs will be made publicly available.
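    To make the extraction step concrete, here is a minimal Python sketch of the matching idea (the single-hop case only, not the authors' full beam search; the model choice, packages, and naive token alignment below are my assumptions):

        # Sketch: read off a candidate relation as the in-between token that
        # the tail noun chunk attends to most strongly. Assumes
        # `pip install torch transformers spacy` and the en_core_web_sm model.
        import spacy
        import torch
        from transformers import AutoModel, AutoTokenizer

        nlp = spacy.load("en_core_web_sm")
        tok = AutoTokenizer.from_pretrained("bert-base-cased")
        model = AutoModel.from_pretrained("bert-base-cased", output_attentions=True)

        sentence = "Dylan is a songwriter."
        chunks = list(nlp(sentence).noun_chunks)      # [Dylan, a songwriter]

        enc = tok(sentence, return_tensors="pt")
        with torch.no_grad():
            out = model(**enc)
        # Average attention over all layers and heads -> (seq_len, seq_len)
        att = torch.stack(out.attentions).mean(dim=(0, 1, 2))

        tokens = tok.convert_ids_to_tokens(enc["input_ids"][0])
        head, tail = chunks[0], chunks[-1]
        # Naive alignment: first wordpiece of each chunk's root word
        h = tokens.index(tok.tokenize(head.root.text)[0])
        t = tokens.index(tok.tokenize(tail.root.text)[0])

        # Candidate relation: the token between head and tail that the tail
        # attends to most (assumes at least one token sits in between)
        rel = tokens[h + 1 + att[t, h + 1 : t].argmax().item()]
        print((head.text, rel, tail.text))  # e.g. ('Dylan', 'is', 'a songwriter')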
    Authors: Chenguang Wang, Xiao Liu, Dawn Song
    Links:
    YouTube: / yannickilcher
    Twitter: / ykilcher
    Discord: / discord
    BitChute: www.bitchute.com/channel/yann...
    Minds: www.minds.com/ykilcher
    Parler: parler.com/profile/YannicKilcher
    LinkedIn: / yannic-kilcher-488534136
    If you want to support me, the best thing to do is to share out the content :)
    If you want to support me financially (completely optional and voluntary, but a lot of people have asked for this):
    SubscribeStar: www.subscribestar.com/yannick...
    Patreon: / yannickilcher
    Bitcoin (BTC): bc1q49lsw3q325tr58ygf8sudx2dqfguclvngvy2cq
    Ethereum (ETH): 0x7ad3513E3B8f66799f507Aa7874b1B0eBC7F85e2
    Litecoin (LTC): LQW2TRyKYetVC8WjFkhpPhtpbDM4Vw7r9m
    Monero (XMR): 4ACL8AGrEo5hAir8A9CeVrW8pEauWvnp1WnSDZxW7tziCDLhZAGsgzhRQABDnFy8yuM9fWJDviJPHKRjV4FWt19CJZN9D4n
  • Science & Technology

COMMENTS • 63

  • @TheGreatBlackBird 3 years ago +21

    It is slightly comforting that even "high profile" papers often have meh ideas held together with duct tape.

  • @josephharvey1762 2 years ago +2

    I love the way you freely critique papers!

  • @sarvagyagupta1744 3 years ago +5

    I think you should make a video on the differences between these attention techniques and Lambda layers. Similarities and differences can be quite confusing.

  • @andres_pq 3 years ago +8

    Loved the South Park reference at 17:30

  • @bennettbullock9690 2 years ago +2

    I have to read the article, but the Chomskian (conventional linguistic) perspective is that grammar is an aid to semantic interpretation, and therefore the LM's knowledge of grammar is by definition going to encode knowledge of at least simple facts.
    Which leads me to wonder why we even bother with extracting a relation from attention matrices in the first place. Why not just extract them from the sentence itself, since the sentence specifies this relation?

  • @drhilm 3 years ago +18

    Even your commercials are interesting.

  • @horizon9863 3 years ago +5

    I tried TabNine, but the CPU usage is really high: every time it starts, CPU usage hits 50% (AMD 3900X)

  • @soopace1486 3 years ago +1

    thank you for the interesting video

  • @karnigilon 3 years ago +10

    As you say, much of this seems like "grammatical" discovery. How does it compare to simply using the verbs as a relation? Maybe using the verbs as the basis, and then using the attention values to filter out some of the tuples, could increase the recall?

    • @YannicKilcher 3 years ago

      good idea, I guess the goal here was to publish, not to get the best numbers :)
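    (For reference, roughly what that verbs-as-relations baseline could look like, as a quick sketch using spaCy's dependency parse; this is my illustration, not anything from the paper:)

        # Baseline sketch: take the governing verb of a subject noun chunk
        # as the relation to an object/attribute chunk of the same verb.
        import spacy

        nlp = spacy.load("en_core_web_sm")

        def verb_triples(sentence):
            doc = nlp(sentence)
            triples = []
            for subj in doc.noun_chunks:
                if subj.root.dep_ in ("nsubj", "nsubjpass"):
                    verb = subj.root.head
                    for obj in doc.noun_chunks:
                        if obj.root.head == verb and obj.root.dep_ in ("dobj", "attr"):
                            triples.append((subj.text, verb.lemma_, obj.text))
            return triples

        print(verb_triples("Dylan is a songwriter."))  # [('Dylan', 'be', 'a songwriter')]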

  • @DinHerios 3 years ago +26

    Click-bait generates more clicks, which leads to more advertising money. Or, as in the case of academia, potentially more citations, simply because more people read the paper. It was only a matter of time before academia got infected with this concept. It certainly seems to work really well.

    • @G12GilbertProduction 3 years ago

      In SEO, site optimization aimed at search-engine indexes is probably just copying the same tags at the same time.

    • @ChaoticNeutralMatt 7 months ago

      Yeah. This wasn't what I expected and I'm somewhat disappointed. At least it was fairly immediate that this wasn't what I expected based on the abstract.

  • @eduarddronnik5155 3 years ago +1

    Dawn Song is a seriously great name. Like Klara Himmel.

  • @PeterOtt 3 years ago +3

    Wow, TabNine doesn't even start auto-charging you after the 3 months expire? What a good-guy thing to do!

  • @florianhonicke5448 3 years ago +3

    yeahhhh, another week - another paper!

  • @herp_derpingson 3 years ago +3

    I think while this might work great on well-written, Wikipedia-like corpora, it will fail miserably on spoken human speech, as that is very noisy.
    But I like it. A simple idea, but it will take you a long way.

  • @shengyaozhuang3748 3 years ago +5

    Many nouns will be split into word pieces, right? How do you compute attention weights for those nouns?

    • @YannicKilcher 3 years ago

      good question, idk

    • @TheZork1995 3 years ago

      I read that there is no way of combining the subword parts in a meaningful way. So maybe they just use the subword embedding to represent the word, in which case a longer word has more chances. But it's just a guess.
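      (For what it's worth, a common workaround is to pool each word's wordpiece rows and columns of the attention matrix, e.g. by averaging. This is my assumption about what one could do, not something the paper specifies:)

          # Pool a (seq, seq) attention matrix down to (n_words, n_words)
          # by averaging over each word's wordpiece indices.
          import torch

          def pool_wordpieces(att, groups):
              # groups: one list of wordpiece indices per word, e.g. the
              # pieces 'song' + '##writer' share a group like [4, 5]
              rows = torch.stack([att[idx].mean(dim=0) for idx in groups])
              return torch.stack([rows[:, idx].mean(dim=1) for idx in groups], dim=1)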

  • @robertoc.martineza.3719 1 year ago +1

    2 years later: GPT-4 can make nice KGs using ASCII art or pixel art with a specific color palette.

  • @marouanemaachou7875 3 years ago +1

    Keep it coming ! haha

  • @DistortedV12 3 years ago +3

    Can you go from updated knowledge graph to language model again? If so, that would be really cool

    • @skm460 3 years ago

      Won't it be an easier task to just form sentences from the triplets?

    • @raphaels2103 1 year ago

      There is a method, ROME.

  • @kavitachoudhary1112 3 years ago

    Superb video. Please guide or suggest a new area of NLP where I can start and go for research. Thank you in advance.

  • @konghong3885 3 years ago

    I would argue the model in the commercial is more interesting than the results shown in the paper (good explanation though).
    KGs suffer from so many problems (graph incompleteness, graph retrieval problems) that they are a nightmare to use in production.
    This result just demonstrates again the power of transformers to perform few-shot tasks, even if the task is knowledge compression (KG).

  • @evilunicorn9205 3 years ago

    Dawn Song is a badass name

  • @DamianReloaded 3 years ago

    EDIT: Can't GPT-3 do this already? Like giving it a text as a prompt and getting it to generate an XML containing the triples in the text you gave it? O_o

    • @ea_naseer 10 months ago +2

      I'm here from the future: GPT-3 can generate PDDL statements fine, so it can probably generate triples in text.

  • @first-thoughtgiver-of-will2456 3 years ago +1

    Is regressing on human input knowledge distillation?

    • @YannicKilcher 3 years ago +1

      good point

    • @SirPlotsalot 1 year ago

      There's a case to argue it's actually an example of optimally-robust curriculum learning, which is neat.

  • @dr.mikeybee 3 years ago

    Thanks for the free 100 days.

  • @sillygoose2241 3 years ago +2

    The relation must be in between the head and the tail?? Yoda pleased is not

  • @MachineLearningStreetTalk 3 years ago +3

    Hello 🙌

  • @first-thoughtgiver-of-will2456 3 years ago

    Don't bait me with the Rust shoutout.

  • @maxwellclarke1862 3 years ago +4

    I don't get the comment about Rust :) ?

  • @G12GilbertProduction 3 years ago

    I think the tetradecimal language structure of this model was not really tetradecimal, only hexadecimal.

  • @kikimajo6850 3 years ago +3

    1⃣️ Knowledge in Corpus: Dylan is a songwriter
    2⃣️ "Knowledge" given by spaCy: Dylan(Noun), songwriter(Noun)
    3⃣️ "Knowledge" in pre-trained models: The word "Dylan" somehow relates to "is", syntactically or semantically. AND The word "is" somehow relates to "songwriter".
    1⃣️2⃣️3⃣️ -----MATCH----> (Dylan, is, songwriter)
    (Dylan, is, songwriter) -----MAP-----> KG(Bob_Dylan, occupation, songwriter)
    It seems that 3⃣️ is not 'knowledge' but 1⃣️ actually is.🤔🤔

    • @editorsworld1521 3 years ago

      Can't agree more, kind of overstating, especially its title...

  • @manojb8876 3 years ago

    Wait, is tab9 even legal? Isn't GPT not supposed to be released to companies, and now bought by Microsoft?

    • @BryanRink 3 years ago +1

      It's based on GPT2, which is released. You're thinking of GPT3.

  • @mathematicalninja2756 3 years ago

    Love this

  • @interestingamerican3100 1 year ago

    What is wrong with Rust?

  • @ChaoticNeutralMatt 7 months ago

    Also man you've been around a while

  • @7th_CAV_Trooper 1 year ago

    Thumbs up for South Park reference.

  • @greencoder1594 3 years ago +1

    Start @ [6:44]

  • @tinyentropy 2 years ago +1

    After half of the video, I felt it was a waste of time. Unfortunately, you didn't properly set expectations with the title of the video.

  • @bobsavage3317 1 year ago

    This paper makes too many assumptions. For example, "Is" and "Was" are looked up as different Relations. Also, "To Be" is often a sign that the terminal Entity is actually a property of the Subject, leading to semantic statements like P(x, TRUE), e.g. the triple (Ernst, Pacifist, TRUE). Another apparent assumption is that a sentence will only have 1 fact in it. The list goes on!
    It's a shame, because the title suggests there could be a way to extract the semantic concepts in the model and externalize them in a manner that is machine verifiable. The ability to audit a model on a given topic would be very helpful.

    • @ChaoticNeutralMatt 7 months ago

      This feels very unexplored and unrefined. I'll have to hold onto this.

  • @dmitrysamoylenko6775 3 years ago +6

    "Jeffrey Epstein tour with" could be useful