Very informative video with excellent production. Looking forward to more content.
Thanks for the informative video. Other than LLMs, could you suggest some approaches or models to try for relationship extraction?
There are traditional NER methods. We will share more in new videos!
Thanks for your video. I'm also looking forward to a new video on relationship extraction using traditional methods!
What if we have no information about the entities? Suppose it's an application that takes arbitrary documents as input; in that case we have no idea in advance what the entities will be. How would it work then?
Then you can let a general NER model parse the text, depending on your use case. Or, if you do know the domain, e.g. PII data or a finance dataset, you can run it through a pretrained NER model for that particular domain.
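A minimal sketch of what that could look like with a general pretrained NER model via the Hugging Face transformers pipeline; the checkpoint name is just one commonly used example and would be swapped for a domain-specific one (PII, finance, ...) if the domain is known:

```python
from transformers import pipeline

# General-purpose pretrained NER model; replace with a domain-specific
# checkpoint if the domain (PII, finance, ...) is known in advance.
ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")

text = "Fiat acquired Chrysler and moved its headquarters to London."
for ent in ner(text):
    print(ent["entity_group"], "->", ent["word"], round(float(ent["score"]), 2))
```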
@karthickdurai2157 The type of application I'm developing is intended to work on all types of documents, irrespective of the domain.
Building a good app with a knowledge graph without understanding input data would be challenging. A practical first step would be to generate a summary or extract key topic(s) from the document to understand its content before constructing the graph.
I have a huge company policy document that I want to create a knowledge graph for. How do I define labels for it? Or is it better to do without them? If so, can you please guide me on how to go about it without defining labels?
By labels I assume you're talking about entity names. Those are things that you should already know or have some common sense about. So you can start there or manually create a few and use LLMs or some other model to extract/generate additional ones based on them.
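As a rough illustration of the "seed a few labels, let a model propose more" idea, here is a hedged sketch that prompts an LLM with a handful of manually chosen labels plus a document excerpt; the OpenAI client, model name, and file name are illustrative assumptions, and any chat-capable model would do:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

seed_labels = ["Policy", "Department", "Role", "Deadline"]   # manually chosen examples
excerpt = open("company_policy.txt").read()[:4000]           # hypothetical document excerpt

prompt = (
    f"Here are entity labels we already use: {', '.join(seed_labels)}.\n"
    "Read the following company policy excerpt and propose up to 10 additional "
    "entity labels (one per line) that would be useful for a knowledge graph:\n\n"
    f"{excerpt}"
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # any capable chat model
    messages=[{"role": "user", "content": prompt}],
)
print(resp.choices[0].message.content)
```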
Can you please share the link to the notebook you went through in the video?
Sure. Here's the code: github.com/mallahyari/twosetai/blob/main/02_kg_construction.ipynb
Would you recommend using the SLIM local models you introduced earlier in this series for NER, intent classification, etc. to construct knowledge graphs? It looks like it could be a cost-saver, and it offers structured and consistent inputs for graph construction, although I'm not sure whether any of the existing SLIMs is well trained enough for this purpose.
@myfolder4561 That's potentially a good idea. We haven't tried it ourselves. Let us know if you try this approach!
@myfolder4561 It's indeed possible that you will need to train your own SLIM model for this.
How do we use documents that have images, like product manual PDF files? How can we use GraphRAG for this problem?
If you need to use images as well, you're going to need some libraries to identify and extract them. There are a few, like github.com/ai8hyf/TF-ID or PyMuPDF4LLM.
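For the PyMuPDF route, a minimal sketch of pulling embedded images out of a PDF (the file name is hypothetical); TF-ID or PyMuPDF4LLM would add layout-aware detection on top of something like this:

```python
import fitz  # PyMuPDF

doc = fitz.open("product_manual.pdf")  # hypothetical input file
for page_index, page in enumerate(doc):
    for img in page.get_images(full=True):
        xref = img[0]                   # cross-reference number of the embedded image
        info = doc.extract_image(xref)  # raw image bytes plus file extension
        with open(f"page{page_index}_img{xref}.{info['ext']}", "wb") as f:
            f.write(info["image"])
```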
It would help if the code was also available. Could you post the link to the code shown in your Jupyter notebook?
@Ash2Tutorial github.com/mallahyari/twosetai/blob/main/02_kg_construction.ipynb. We will share more code in our course. Stay tuned!
Thanks! Will you publish the code on GitHub?
Here's the code: github.com/mallahyari/twosetai
@MehdiAllahyari Thanks!!!
Yes.
Thanks for using the right tools for the purpose. I am looking at a tabular dataset that I want to use as material for an LLM to generate synthetic sample graphs from, so instead of extracting a graph from the Wikipedia page, it has to write the page given the base knowledge graph. And I believe an LLM is very useful for that.
Yes, for your use case an LLM is actually the best tool, as you want to convert structured data into natural-language form.
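A rough sketch of that direction, i.e. serializing one row (or a small subgraph) and asking an LLM to write prose from it; the OpenAI client and model name are assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# One row of the tabular data / base knowledge graph, expressed as facts.
row = {"subject": "Fiat", "relation": "acquired", "object": "Chrysler", "year": 2014}
facts = "; ".join(f"{k}: {v}" for k, v in row.items())

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # any capable chat model
    messages=[{
        "role": "user",
        "content": f"Write a short, Wikipedia-style paragraph stating only these facts:\n{facts}",
    }],
)
print(resp.choices[0].message.content)
```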
Mehdi, if you say that LLMs aren't that great at making KGs and you prefer other libraries that are more practical for making KGs, could you say which libraries you mean?
Yes, we will share more.
There are many, depending on the domain, but here are a few that tend to work very well across domains (a quick GLiNER sketch follows the list):
- github.com/urchade/GLiNER
- github.com/universal-ner/universal-ner
- github.com/kamalkraj/BERT-NER
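A minimal sketch of the zero-shot route with GLiNER from the list above; the checkpoint name is one of the publicly released ones, and the labels are free-form strings you choose yourself:

```python
from gliner import GLiNER

model = GLiNER.from_pretrained("urchade/gliner_base")

text = "Fiat acquired Chrysler in 2014 and is headquartered in London."
labels = ["organization", "location", "date"]  # arbitrary labels, no retraining needed

for ent in model.predict_entities(text, labels, threshold=0.5):
    print(ent["label"], "->", ent["text"])
```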
@MehdiAllahyari Thanks, it looks awesome. I will test it for sure…
You should include in your new course how to build a knowledge graph from unstructured data using LLMs with advanced techniques.
@thanartchamnanyantarakij9950 We can include some of this information! We will host a free pre-session for folks to get more info about the course. You're welcome to join us!
Very interesting review. Any chance you could share the code so I can try it myself?
Thanks in advance.
BTW I'm reading your RAG book.
Awesome! Here's the code: github.com/mallahyari/twosetai/blob/main/02_kg_construction.ipynb
@apulacheyt Thank you! Some materials might be outdated due to changes in the libraries. Check out our course for the latest updates!
What was wrong with the previous video? As always, thank you!
Because the subtitles were distracting, we had to re-upload a new one. Unfortunately, the comments from the last video cannot be displayed on this one!
I removed the subtitles. Hopefully this is easier to watch! Thanks!
The challenge for me is that LLMs are not consistent within or between documents. In the example, you see "us" and "u.s.". I'm also concerned that Fiat is an Organization but Chrysler is a Company. And in the LLM example of triples, many of the objects are just, well, sentence fragments. The killer feature of KGs is that you can make connections...but the overspecificity would seem to prevent this. For example, I cannot connect Tom Hanks to any other "fourth highest grossing actor"...he's the only one! There seems to be no good way to create a prompt where the LLM generates entities and relationships at a consistent and appropriate level of hyper/hypo-nymy. This is perhaps not surprising given that LLMs don't think, reason, whatever. And therein lies the trap in getting LLMs to lift themselves up by their own bootstraps.
That's exactly my point in the video too. Many people are hyped/overexcited about using LLMs for extracting named entities and relations, especially when you don't define your schema first. However, there is no guarantee that you'll get consistent results. Plus, the cost is prohibitive!
I think adding an extra metadata layer (e.g. parent documents) could solve this issue. For example, you can have an LLM with semantic understanding go over each chunk and add metadata related to that chunk, so that the LLM can understand the context of each mention, e.g. "u.s." -> "United States, country".
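A hedged sketch of that metadata/normalization layer: per chunk, ask an LLM to map raw mentions to canonical names and types before they enter the graph. The client, model name, and helper function are all illustrative assumptions:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set


def canonicalize_mentions(chunk: str, mentions: list[str], model: str = "gpt-4o-mini") -> str:
    """Hypothetical helper: map surface forms found in a chunk (e.g. 'u.s.')
    to canonical entity names and types (e.g. 'United States, country')."""
    prompt = (
        f"Text chunk:\n{chunk}\n\n"
        f"Mentions: {mentions}\n"
        "For each mention, return JSON mapping it to a canonical entity name and type."
    )
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content


print(canonicalize_mentions("The u.s. bailed out Chrysler during the crisis.", ["u.s.", "Chrysler"]))
```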
spacy-llm can help you do few-shot NER, and the performance is almost 99% of the traditional approach.
I think spacy-llm also uses an LLM behind the scenes, so it may not be as fast.