Step-by-step Code for Knowledge Graph Construction


COMMENTS • 32

  • @rowdyjuneja
@rowdyjuneja 3 months ago +6

Thanks for the informative video. Other than LLMs, could you suggest some approaches or models to try for relationship extraction?

    • @TwoSetAI
@TwoSetAI 3 months ago +3

There are traditional NER methods. We will share more in new videos!

    • @zoroyee276
@zoroyee276 3 months ago +2

Thanks for your video! I am also looking forward to the new video on relationship extraction using traditional methods!

  • @artur50
@artur50 3 months ago +5

Thanks! Will you publish the code on GitHub?

    • @MehdiAllahyari
@MehdiAllahyari 3 months ago +3

      Here's the code: github.com/mallahyari/twosetai

    • @artur50
@artur50 3 months ago +2

@@MehdiAllahyari thanks!!!

    • @TwoSetAI
@TwoSetAI 3 months ago +1

      Yes.

  • @artur50
@artur50 3 months ago +4

Mehdi, if LLMs are not that excellent at making KGs and you prefer other libraries that are more practical for making them, could you say which libraries you mean?

    • @TwoSetAI
@TwoSetAI 3 months ago +2

Yes, we will share more.

    • @MehdiAllahyari
@MehdiAllahyari 3 months ago +5

There are many, depending on the domain. But here are some that tend to work very well across many domains:
      - github.com/urchade/GLiNER
      - github.com/universal-ner/universal-ner
      - github.com/kamalkraj/BERT-NER
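
Libraries like the ones above are typically wired into a graph-building pipeline along these lines. This is a minimal pure-Python sketch: the `extract_entities` stub, its hard-coded output, and the grouping function are illustrative assumptions standing in for a real model call (e.g., GLiNER's `predict_entities`), not code from the video.

```python
# Sketch: feeding NER output into a simple knowledge-graph entity index.
# extract_entities is a stub standing in for a real NER model call.

def extract_entities(text):
    """Stub: a real model would return labeled spans found in `text`."""
    return [
        {"text": "Fiat", "label": "Organization"},
        {"text": "Chrysler", "label": "Organization"},
    ]

def build_entity_index(docs):
    """Group extracted entity surface forms by label across documents."""
    index = {}
    for doc in docs:
        for ent in extract_entities(doc):
            index.setdefault(ent["label"], set()).add(ent["text"])
    return index

index = build_entity_index(["Fiat acquired Chrysler in 2014."])
print(index)
```

Swapping the stub for a real extractor leaves the rest of the pipeline unchanged, which is what makes these NER libraries easy to try against each other.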

    • @artur50
@artur50 3 months ago +1

@@MehdiAllahyari thanks, it looks awesome. I will test it for sure…

  • @aGianOstaLgia
@aGianOstaLgia 3 months ago +5

What was wrong with the previous video? As always, thank you!

    • @MehdiAllahyari
@MehdiAllahyari 3 months ago +1

Because the subtitles were distracting, we had to re-upload a new one. Unfortunately, the comments from the last video cannot be displayed on this one!

    • @TwoSetAI
@TwoSetAI 3 months ago +3

I removed the subtitles. Hopefully this is easier to watch! Thanks!

  • @arjungoalset8442
@arjungoalset8442 2 months ago +1

Can you please share the link to the notebook that you went through in the video?

    • @MehdiAllahyari
@MehdiAllahyari 1 month ago

      Sure. Here's the code: github.com/mallahyari/twosetai/blob/main/02_kg_construction.ipynb

  • @apulacheyt
@apulacheyt 2 months ago +1

Very interesting review. Any chance you could share the code so I can try it myself?
Thanks in advance.
BTW, I'm reading your RAG book.

    • @MehdiAllahyari
@MehdiAllahyari 1 month ago

      Awesome! Here's the code: github.com/mallahyari/twosetai/blob/main/02_kg_construction.ipynb

  • @myfolder4561
@myfolder4561 2 months ago +1

Would you recommend using the SLIM local models you introduced earlier in this series for NER, intent classification, etc., to construct knowledge graphs? It looks like it could be a cost-saver, and it offers structured, consistent inputs for graph construction, although I'm not sure whether any of the existing SLIMs are well trained enough for this purpose.

    • @TwoSetAI
@TwoSetAI 2 months ago

@@myfolder4561 That's potentially a good idea. We haven't tried it ourselves. Let us know if you try this approach!

    • @TwoSetAI
@TwoSetAI 2 months ago

@@myfolder4561 It's indeed possible you will need to train your own SLIM model for this.

  • @neatpaul
@neatpaul 2 months ago

What if we have no information about the entities? Suppose it's an application that takes arbitrary documents as input; in that case we have no idea in advance what the entities will be. How will it work then?

    • @karthickdurai2157
@karthickdurai2157 2 months ago +1

Then you can let a general NER model parse it, depending on your use case; or, if you do know the domain (e.g., PII data or a finance dataset), you can run it through a pretrained NER model for that particular domain.
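
The routing described in this reply can be sketched in a few lines. The model identifiers below are illustrative placeholders, not real checkpoints; the point is only the fallback logic from known domains to a general-purpose model.

```python
# Sketch: route a document to a domain-specific NER model when the domain
# is known, otherwise fall back to a general-purpose model.
# All model names are illustrative placeholders.

DOMAIN_MODELS = {
    "finance": "finance-ner-model",  # placeholder
    "pii": "pii-ner-model",          # placeholder
}
GENERAL_MODEL = "general-ner-model"  # placeholder

def pick_ner_model(domain=None):
    """Return the model identifier to use for a document's domain."""
    if domain is None:
        return GENERAL_MODEL
    return DOMAIN_MODELS.get(domain.lower(), GENERAL_MODEL)

print(pick_ner_model("Finance"))  # finance-ner-model
print(pick_ner_model())           # general-ner-model
```

An unrecognized domain also falls through to the general model, which covers the "no information about the entities" case raised above.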

    • @neatpaul
@neatpaul 2 months ago

@@karthickdurai2157 The type of application I'm developing is intended to work on all types of documents, irrespective of the domain.

  • @truliapro7112
@truliapro7112 1 month ago

How do we handle documents that contain images, like some product manual PDF files? How can we use GraphRAG for this problem?

  • @mulderbm
@mulderbm 3 months ago +1

Thanks for using the right tools for the purpose. I'm looking at a tabular dataset that I want to use as material for an LLM to generate synthetic sample graphs from. So instead of extracting the graph from the Wikipedia page, it has to write the page given the base knowledge graph. And I believe an LLM is very useful for that.

    • @MehdiAllahyari
@MehdiAllahyari 3 months ago +2

Yes, for your use case an LLM is actually the best tool, since you want to convert structured data into natural-language form.
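
That graph-to-text direction usually starts by serializing the graph into a prompt. A minimal sketch, where the triples and the prompt wording are illustrative assumptions:

```python
# Sketch: serialize a small knowledge graph into an LLM prompt so the
# model can write prose from it (the reverse of extraction).

triples = [
    ("Fiat", "acquired", "Chrysler"),
    ("Chrysler", "headquartered_in", "Auburn Hills"),
]

def triples_to_prompt(triples):
    """Render (subject, predicate, object) triples as a bulleted fact list."""
    facts = "\n".join(f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in triples)
    return (
        "Write a short encyclopedic paragraph using only these facts:\n"
        f"{facts}"
    )

print(triples_to_prompt(triples))
```

Keeping the fact list explicit ("using only these facts") is one common way to discourage the model from inventing details the graph doesn't contain.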

  • @sgwbutcher
@sgwbutcher 3 months ago +2

    The challenge for me is that LLMs are not consistent within or between documents. In the example, you see "us" and "u.s.". I'm also concerned that Fiat is an Organization but Chrysler is a Company. And in the LLM example of triples, many of the objects are just, well, sentence fragments. The killer feature of KGs is that you can make connections...but the overspecificity would seem to prevent this. For example, I cannot connect Tom Hanks to any other "fourth highest grossing actor"...he's the only one! There seems to be no good way to create a prompt where the LLM generates entities and relationships at a consistent and appropriate level of hyper/hypo-nymy. This is perhaps not surprising given that LLMs don't think, reason, whatever. And therein lies the trap in getting LLMs to lift themselves up by their own bootstraps.

    • @MehdiAllahyari
@MehdiAllahyari 3 months ago +2

That's exactly my point in the video too. Many people are hyped/over-excited about using LLMs for extracting named entities and relations, especially when you don't define your schema first. However, there is no guarantee that you get consistent results. Plus, the cost is prohibitive!
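
Defining the schema first can be enforced mechanically: fix the allowed relations up front and discard anything the extractor returns outside them. A minimal sketch with illustrative relation names and example triples:

```python
# Sketch: enforce a fixed schema on extracted triples -- only predicates
# declared up front are allowed into the graph; free-form fragments
# (like "fourth highest grossing actor") are dropped.

SCHEMA_RELATIONS = {"acquired", "founded_by", "headquartered_in"}

def filter_triples(triples):
    """Keep only triples whose predicate is in the agreed schema."""
    return [t for t in triples if t[1] in SCHEMA_RELATIONS]

raw = [
    ("Fiat", "acquired", "Chrysler"),
    ("Tom Hanks", "is", "fourth highest grossing actor"),  # too free-form
]
print(filter_triples(raw))  # [('Fiat', 'acquired', 'Chrysler')]
```

This doesn't make the LLM consistent, but it bounds the damage: everything in the graph is at least expressed in a vocabulary you chose.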

    • @awakenwithoutcoffee
@awakenwithoutcoffee 2 months ago +1

I think adding an extra layer of metadata (e.g., parent documents) could solve this issue: you can have an LLM with semantic ability go over each chunk and add metadata related to that chunk, so that the LLM can understand the context of each word, e.g., "u.s" -> "United States, country".
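
The normalization step this reply suggests can sit as a thin layer between extraction and the graph. A minimal sketch; the alias table here is hand-written for illustration, though in practice it could be produced by the LLM metadata pass described above:

```python
# Sketch: canonicalize entity surface forms ("us", "u.s.") before adding
# them as graph nodes, so duplicate spellings merge into one node.
# The alias table is an illustrative assumption.

ALIASES = {
    "us": "United States",
    "u.s.": "United States",
    "u.s": "United States",
}

def canonicalize(entity):
    """Map a raw entity string to its canonical form, if one is known."""
    key = entity.strip().lower()
    return ALIASES.get(key, entity.strip())

nodes = {canonicalize(e) for e in ["US", "u.s.", "Fiat"]}
print(sorted(nodes))  # ['Fiat', 'United States']
```

Unknown entities pass through unchanged, so the layer is safe to apply everywhere and grows as more aliases are learned.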

  • @jackbauer322
@jackbauer322 2 months ago

spacy-llm can help you do few-shot NER, and its performance is almost 99% of the traditional approach.

    • @karthickdurai2157
@karthickdurai2157 2 months ago +1

I think spacy-llm also uses an LLM behind the scenes, so it may not be as fast as this.