Exciting and inspiring! Thanks for sharing.
Thanks for your support :)
I have been working on a big research project for over two months, and I finally found the perfect video for my graph! It turned out amazing!!
Thank you.
If you found this content helpful, please consider sharing it with others who might benefit. Your support is greatly appreciated :)
You did a great job of explaining the LLM Graph Transformer clearly.
Thanks for your support :)
Fantastic explanation. Thank you!
If you found this content helpful, please consider liking, subscribing, and sharing it with others who might benefit. Your support is greatly appreciated :)
Amazing work, thank you so much!
Thanks for your support. Please consider sharing it with communities who might benefit. Your support is greatly appreciated :)
Very useful! Thanks so much!
Thanks for your support :)
Hello, the link in the description is not working anymore.
The main challenge I see with knowledge graphs is on the retriever side. Unless we (or the LLM) generate a Cypher query using exactly the same nodes and relationships that are present in the graph database, we are not going to get a proper response to the question we asked.
Yes, these tools and technologies are still improving every day. Sometime in the not-too-distant future, we will reach the minimum acceptable accuracy required...
Hi... Do you provide any courses on machine learning or AI?
I don't have any structured course on ML yet, but I'm planning to do one.
Did you share the notebook? I can't find it. It would be helpful to be able to run it on our own.
Pls check the link in the description; it's a link to the tutorial.
If you found this content helpful, please consider sharing it with others who might benefit. Your support is greatly appreciated :)
With structured data, as in your previous pgsql examples, can we still apply graph DB concepts to structured data for optimal RAG retrieval?
Structured data (a SQL DB) is usually easy to query; if we have a good text-to-SQL model, then we don't need any additional algorithmic techniques. Querying unstructured data is complex, and hence we have RAG, knowledge graphs, and combinations of them...
@SridharKumarKannam What would be your recommendation for writing optimal few-shot query examples to further optimise the SQL LLM agent, and also to reduce LLM token overhead and context size from bloated info?
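A minimal sketch of few-shot prompting for text-to-SQL, assuming LangChain's FewShotPromptTemplate; the employees table and the example questions are hypothetical. Short, schema-specific examples keep token overhead low.
----
# Sketch: few-shot examples for a text-to-SQL prompt. The `employees`
# table and the questions are hypothetical placeholders.
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

examples = [
    {"question": "How many employees are there?",
     "query": "SELECT COUNT(*) FROM employees;"},
    {"question": "Who is the highest paid employee?",
     "query": "SELECT name FROM employees ORDER BY salary DESC LIMIT 1;"},
]

example_prompt = PromptTemplate.from_template("Question: {question}\nSQL: {query}")

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    prefix="Given this schema, write one SQL query for the question.\n{schema}",
    suffix="Question: {input}\nSQL:",
    input_variables=["schema", "input"],
)

print(prompt.format(schema="employees(name TEXT, salary INT)",
                    input="What is the average salary?"))
----
To trim the context further, an example selector (for instance LangChain's SemanticSimilarityExampleSelector) can pick only the most relevant examples per question instead of sending all of them.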
When do you expect GraphRAG to be production-ready?
I would think in the next few months, definitely within 6 months. There is a lot of work going on in converting free text to graphs.
@SridharKumarKannam Nice, thank you for sharing your knowledge. One of the challenges with GraphRAG seems to be that it requires a lot more tokens/time (from testing, it took about 40-60 seconds for one answer compared to 1-4 seconds for regular RAG, but the answer was 40% better). Do you think these challenges can be solved with the rise of LPUs / increasing inference speed?
Sridhar, Great job! Thanks
thank you :)
You can view your graphs with Neo4j Bloom also.
That's right. Thanks for your support :)
How do I use multiple Neo4j databases at the same time? Is this possible?
You can use only one DB in the free version.
I was so confused earlier. I am a student and recently got a job where I have to derive insights from unstructured raw text, and I was so confused since I didn't know Neo4j, and all the other videos were so confusing.
Thanks.
Also, what do you suggest for keeping updated with new topics or updates like these?
Thank you.
Pls follow the Neo4j blog and Medium.
Ollama-based LLMs don't work with the LLMGraphTransformer; do you know why?
I've not tested that. What error are you getting?
@SridharKumarKannam Thanks for your response!
----> 9 llm_transformer = LLMGraphTransformer(llm=llm)
215 schema = create_simple_model(allowed_nodes, allowed_relationships)
--> 216 structured_llm = llm.with_structured_output(schema)
217 self.chain = prompt | structured_llm
108 warned = True
109 emit_warning()
--> 110 return wrapped(*args, **kwargs)
199 @beta()
200 def with_structured_output(
201 self, schema: Union[Dict, Type[BaseModel]], **kwargs: Any
202 ) -> Runnable[LanguageModelInput, Union[Dict, BaseModel]]:
203 """Implement this if there is a way of steering the model to generate responses that match a given schema.""" # noqa: E501
--> 204 raise NotImplementedError()
Apparently Ollama doesn't produce structured output compatible with LLMGraphTransformer. I couldn't find a way around it.
@SridharKumarKannam AttributeError on calling LLMGraphTransformer.convert_to_graph_documents; I am having this error.
It runs the Ollama server but ends with this error:
@SridharKumarKannam Traceback (most recent call last):
File "/home/jelcke/dev/test/txt2graph/openai-graph.py", line 38, in
llm_transformer_filtered = LLMGraphTransformer(
File "/home/jelcke/dev/test/txt2graph/venv/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 216, in __init__
structured_llm = llm.with_structured_output(schema)
File "/home/jelcke/dev/test/txt2graph/venv/lib/python3.10/site-packages/langchain_core/language_models/base.py", line 208, in with_structured_output
raise NotImplementedError()
NotImplementedError
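One workaround that has been reported for this NotImplementedError (untested here, so treat it as an assumption): wrap the model in OllamaFunctions from langchain_experimental, which emulates function calling on top of Ollama models and, in recent versions, implements with_structured_output, which LLMGraphTransformer needs. A minimal sketch:
----
# Untested sketch: OllamaFunctions emulates OpenAI-style function
# calling for Ollama models; recent versions of langchain_experimental
# implement with_structured_output on it, unlike the plain ChatOllama.
from langchain_experimental.llms.ollama_functions import OllamaFunctions
from langchain_experimental.graph_transformers import LLMGraphTransformer

llm = OllamaFunctions(model="llama3.1", temperature=0, format="json")
llm_transformer = LLMGraphTransformer(llm=llm)
----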
Congratulations on the video. I'm facing some issues that look like a package version problem. Could you provide the requirements for this experiment, please?
The LangChain library is being updated very frequently; pls check the latest docs/APIs. Did you resolve the issue? What's the error, pls...
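Until a pinned list is shared, here is a hedged sketch of the packages the code in this thread needs; the exact versions are deliberately left unpinned since these APIs change quickly, so pin your own working set (e.g. with pip freeze > requirements.txt):
----
# Sketch of a requirements.txt; versions intentionally unpinned.
langchain
langchain-community
langchain-experimental
langchain-openai
neo4j
python-dotenv
----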
Instead of text, can we do it from 2 tables? Maybe two columns in two tables?
You can if you have text in those columns. The source can be anything, as long as you format it as text...
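A minimal sketch of that formatting step, using two hypothetical tables loaded with pandas; each row is flattened into a sentence and the result is wrapped in a Document:
----
# Sketch: flatten rows from two hypothetical tables into sentences and
# wrap them as a Document; the real source could be any SQL result.
import pandas as pd
from langchain_core.documents import Document

people = pd.DataFrame({"name": ["Marie Curie"], "field": ["physics"]})
awards = pd.DataFrame({"name": ["Marie Curie"], "award": ["Nobel Prize"]})

sentences = []
for row in people.to_dict("records"):
    sentences.append(f"{row['name']} works in the field of {row['field']}.")
for row in awards.to_dict("records"):
    sentences.append(f"{row['name']} received the {row['award']}.")

documents = [Document(page_content=" ".join(sentences))]
# documents can now be passed to llm_transformer.convert_to_graph_documents
----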
Will it be possible using a Gemini model?
Yes, it should work with Gemini and most LLMs as long as their output format is as expected.
If you found this content helpful, please consider sharing it with others who might benefit. Your support is greatly appreciated :)
How can we process large documents? It takes so much time.
And how can we use the aconvert_to_graph_documents function?
What is the document size? It is expected to take a long time for large documents. For example, if it's a PDF book, then all the pages need to be converted to text; a typical book can have thousands of chunks, and for each chunk an embedding needs to be created and then stored in the index. Anyway, it's mostly a one-off task. First check the end-to-end workflow with a short text, then load the entire docs for usage.
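A minimal sketch of the splitting step, plus the async variant asked about above; it assumes `llm_transformer` was created as in the video, and "book.txt" stands in for your large source document:
----
# Sketch: split a large text into chunks, then use the async variant
# aconvert_to_graph_documents so the chunks are processed concurrently.
import asyncio
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter

long_text = open("book.txt").read()  # hypothetical large source document

splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents([Document(page_content=long_text)])

graph_documents = asyncio.run(
    llm_transformer.aconvert_to_graph_documents(chunks)
)
----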
after the code:
llm = ChatOpenAI(temperature=0, model_name="gpt-4-0125-preview")
I receive the error:
ValidationError: 1 validation error for ChatOpenAI
__root__
Did not find openai_api_key, please add an environment variable `OPENAI_API_KEY` which contains it, or pass `openai_api_key` as a named parameter. (type=value_error)
But I cannot find where you add such variables.
You can create a .env file and paste your OpenAI key into it: OPENAI_API_KEY=your_api_key
You can also add the key directly in the code with os.environ["OPENAI_API_KEY"] = "YOUR_KEY"
I've set up the key in my environment variables in the bashrc file.
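Both approaches from the replies above, in one short sketch (the .env route needs the python-dotenv package):
----
# Sketch of the two approaches mentioned above.
import os

# Option 1: set the key directly in code (quick experiments only).
os.environ["OPENAI_API_KEY"] = "YOUR_KEY"

# Option 2: put the line  OPENAI_API_KEY=your_api_key  in a .env file
# next to the script and load it (requires: pip install python-dotenv).
from dotenv import load_dotenv
load_dotenv()
----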
Hello, thank you for this video, it was very helpful. How do I link the project to the Neo4j Desktop client?
This should do
----
import os
from langchain_community.graphs import Neo4jGraph
os.environ["NEO4J_URI"] = "bolt://localhost:7687"
os.environ["NEO4J_USERNAME"] = "neo4j"
os.environ["NEO4J_PASSWORD"] = "password"
graph = Neo4jGraph()
Using the same code... my llm_transformer.convert_to_graph part is not working.
It gives an error that the list index should be an integer and not a str.
Using a Hugging Face LLM.
The output format of the LLMs can be the issue. Pls test using OpenAI with a small text; if it's working, then the issue is the HF LLM output format, so choose a different model.
Hi sir, thank you for your great video. Two questions for you (or for anyone else):
1) What exactly does langchain Document do? Why not just feed raw text straight into the LLM Graph Transformer and let it extract relationships/identify entities?
2) What are some resources I can use to learn specifically about how the LLM Graph Transformer works?
Thank you kindly.
(1) That's the format llm_transformer expects; your text content is still the same. (2) llm_transformer as shown in the video is specific to LangChain, so refer to the documentation. There is a ton of info on converting text to KGs; my channel also has a number of videos. All the best...
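A minimal sketch of point (1): the transformer takes a list of Document objects, so the raw text just gets wrapped first (assumes `llm_transformer` exists as in the video):
----
# Sketch: wrap raw text in the Document format the transformer expects.
from langchain_core.documents import Document

text = "Marie Curie won the Nobel Prize in Physics in 1903."
documents = [Document(page_content=text)]

graph_documents = llm_transformer.convert_to_graph_documents(documents)
print(graph_documents[0].nodes)
print(graph_documents[0].relationships)
----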
Very impressive output on this content volume.
I do have a question, though. While this is great for building knowledge graphs, I'm curious how complicated these relationships can get in these knowledge graphs.
I have more experience with formal logic, and more sophisticated logical frameworks like temporal logic, fuzzy logic, etc., try to explain relationships with more accuracy.
It would be great if these more complicated relationships could be represented in these knowledge graphs. It would help so much in discovery and improving logical accuracy/soundness in written text.
It's better to define the schema explicitly for production use cases, and to do some post-processing of the LLM's output for sanity checks. I have several videos on these concepts. I'll add more content on this topic...
Amazing video! It clearly explains the concept. I have one question: The nodes and relationships are created by the LLM, so every time the code is run, it generates a different output. How do we handle that? Additionally, you spoke about allowed nodes and relationships. In a real-time scenario, when we don't have much knowledge about the input file, how can we extract all the entities and relationships from the document so that we don't miss any information? Your suggestions on this would be very helpful. Thank you!
1. First run without any schema; this will result in a lot of node and relationship types.
2. Analyse the nodes and relationships to find out which ones are important for your use case.
3. Now run it with a fixed schema (a sketch of this step is below).
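A minimal sketch of step 3, using the allowed_nodes/allowed_relationships parameters of LLMGraphTransformer; the node and relationship types listed here are only examples, to be replaced with the ones found in steps 1 and 2:
----
# Sketch of step 3: rerun with a fixed schema. The types below are
# illustrative placeholders, not a recommended schema.
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(temperature=0, model_name="gpt-4-0125-preview")
llm_transformer = LLMGraphTransformer(
    llm=llm,
    allowed_nodes=["Person", "Organization", "Award"],
    allowed_relationships=["SPOUSE", "WORKED_AT", "WON"],
)
----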
If you found this content helpful, please consider sharing it with others who might benefit. Your support is greatly appreciated :)
Great! Appreciate this!
thank you very much for your support :)
I am writing this query to get the output:
response = chain.invoke({"query": "what was the name of SPOUSE of Marie Curie?"})
but in the output it is giving 'result': "I don't know the answer.", although in Neo4j it is showing the relationship.
Did you run the query multiple times and get the same output? All LLMs are stochastic (random) in nature; sometimes strange results are expected.
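One way to debug this, as a hedged sketch: rerun with verbose=True and inspect the Cypher the LLM generates. If its node/relationship names don't match what's actually in the DB, that mismatch (rather than randomness) is often why "I don't know" comes back. Depending on your LangChain version, GraphCypherQAChain may live under a different import path.
----
# Sketch: inspect the generated Cypher. Import path and required
# arguments may differ by LangChain version; newer releases may also
# require allow_dangerous_requests=True.
from langchain.chains import GraphCypherQAChain

chain = GraphCypherQAChain.from_llm(llm, graph=graph, verbose=True)
response = chain.invoke({"query": "what was the name of SPOUSE of Marie Curie?"})
print(response["result"])
----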
Hi, what happens if I run this code again for a second text? Does it add it to the same KG database? If not, what should I do to add a new text to the same database?
It will overwrite the DB. I'm sure Neo4j has the capability to add new information, but I'm not sure if that's implemented yet in the LangChain wrapper. If your database is not too large, you can re-create it by adding the new information to the raw data before extracting nodes/relationships...
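A minimal sketch of that re-create route; first_text and second_text are hypothetical placeholders for your original and new raw texts, and llm_transformer/graph are assumed to exist as earlier in the thread:
----
# Sketch: rebuild the graph from the combined raw texts rather than
# appending to the existing database.
from langchain_core.documents import Document

texts = [first_text, second_text]  # hypothetical: old and new raw texts
documents = [Document(page_content=t) for t in texts]

graph_documents = llm_transformer.convert_to_graph_documents(documents)
graph.add_graph_documents(graph_documents)
----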
What software is he using?
Neo4j
Thanks for your support :)
Yes it is.
Good stuff! Thank you!!! Has anybody tried with Ollama open-source models? Consistently I am getting nodes, but no relationships (other than MENTIONS from document to an entity). llm_transformer = LLMGraphTransformer(llm=llm, node_properties=True, relationship_properties=True, strict_mode=False), and we define llm = ChatOllama(model="llama3.1", temperature=0, format="json"). I even increased temperature to >0, but that does not help either???
There are function-calling issues with Ollama models. Try the solution suggested here; I've not tested it though:
github.com/langchain-ai/langchainjs/issues/6051
Am I the only one who met the error at step 6? My Python told me that there is an error in LLMGraphTransformer.process_response(self, document):
593 nodes_set = set()
594 relationships = []
--> 595 parsed_json = self.json_repair.loads(raw_schema.content)
AttributeError: 'str' object has no attribute 'content'
The output format from LLM models is important for these things.
stackoverflow.com/questions/78521181/llmgraphtransformer-convert-to-graph-documentsdocuments-attributeerror-str
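For what it's worth, a commonly reported resolution in that direction (hedged, untested here): pass a chat model, whose responses are message objects with a .content attribute, rather than a plain completion LLM that returns bare strings. A minimal sketch, assuming langchain_openai:
----
# Sketch: use a chat model (message objects with .content) instead of
# a plain completion LLM (bare strings), which triggers this error.
from langchain_openai import ChatOpenAI   # chat model: has .content
# from langchain_openai import OpenAI     # completion LLM: bare strings
from langchain_experimental.graph_transformers import LLMGraphTransformer

llm = ChatOpenAI(temperature=0, model_name="gpt-4-0125-preview")
llm_transformer = LLMGraphTransformer(llm=llm)
----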
Sir, can you please explain the basics too? I mean how to install Neo4j correctly on your PC. I tried setting up the system env for Neo4j on macOS Big Sur and struggled a lot with connecting it to my Jupyter Notebook in Python 3.11... and I see no improvement on my side. If you could enlighten me on how to establish a similar environment to run the app on my Mac/Win system, I would be grateful. I am watching all of your vids on Knowledge Graphs and LLMs and I have learnt a lot.
Is it resolved? What error are you getting? After installing Neo4j Desktop, you also need to install a couple of plugins from Neo4j Desktop.
@SridharKumarKannam The error is that the bolt://localhost:7687 URL is not found or not running. I want to connect my Neo4j Desktop to my Python script/notebook and try connecting both, so that I can use knowledge graphs to visualize information from given text.
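A minimal connectivity check with the official neo4j Python driver, independent of LangChain; the credentials below are Neo4j defaults and may differ on your setup, and the database has to be started in Neo4j Desktop first:
----
# Sketch: verify the bolt endpoint is reachable before involving
# LangChain. Adjust the credentials to your own setup.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687",
                              auth=("neo4j", "password"))
driver.verify_connectivity()  # raises if the server is unreachable
print("Connected to Neo4j")
driver.close()
----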
Bro, you're the best; you have saved my thesis. My only question is how to retrieve.
thanks for your support :)
You explained it well but the results are always inconsistent.
With LLMs the results are not always the same; they are stochastic. Set the temperature to a very low value. You can add some post-processing to the output of LLMs.
@SridharKumarKannam I did, but still no consistent results 😢
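A minimal sketch of the post-processing idea mentioned above: keep only whitelisted node and relationship types so repeated runs stay comparable. It assumes `graph_documents` came from convert_to_graph_documents and that the GraphDocument fields are mutable in your version:
----
# Sketch: drop anything outside a fixed whitelist before loading the
# results into the database. The type names are illustrative.
ALLOWED_NODES = {"Person", "Organization", "Award"}
ALLOWED_RELATIONSHIPS = {"SPOUSE", "WORKED_AT", "WON"}

for gd in graph_documents:
    gd.nodes = [n for n in gd.nodes if n.type in ALLOWED_NODES]
    gd.relationships = [r for r in gd.relationships
                        if r.type in ALLOWED_RELATIONSHIPS]
----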
Can you provide the code?
The link is in the description...
What API key are you using, and where do you get it from?
Many frameworks use OpenAI by default if we don't specify a model explicitly. The key is in my config files.