This was EXACTLY what I've been looking for. Thank you for the great walkthrough, you just got yourself a new subscriber. Cheers
You are just brilliant! This is exactly what I've been looking for: great examples and very nicely illustrated. Thank you!
This is also EXACTLY what I was looking for. Please build more upon this, maybe for an anti-fraud use case.
Anti-fraud works pretty differently, at least in my space :)
Smart video! It could be a good way to easily access relationships between stakeholders in any kind of business.
wow, great overview! Thanks!!
Could you also create sample code for instances where multiple documents are stored in a directory in either .txt or a compressed format like Parquet?
How would it work then? Would you transform each document separately and combine them as a graph?
I would like to get your insight.
Thank you for the video!
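For reference, a rough sketch of one way this could look with LangChain loaders (DirectoryLoader for .txt files, pandas plus DataFrameLoader for Parquet); the folder, the 'text' column and the text_splitter/llm_transformer/graph objects are assumptions based on the notebook, not the author's code:
import pandas as pd
from langchain_community.document_loaders import DirectoryLoader, TextLoader, DataFrameLoader
# load every .txt file in a folder
txt_docs = DirectoryLoader("./docs", glob="**/*.txt", loader_cls=TextLoader).load()
# load a Parquet file, treating one column as the document text
df = pd.read_parquet("./docs/articles.parquet")
parquet_docs = DataFrameLoader(df, page_content_column="text").load()
# split and transform everything in one pass; Neo4j merges nodes with the same id,
# so all documents end up in one combined graph
all_docs = text_splitter.split_documents(txt_docs + parquet_docs)
graph_documents = llm_transformer.convert_to_graph_documents(all_docs)
graph.add_graph_documents(graph_documents, baseEntityLabel=True, include_source=True)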
Has anyone tried this code on a large amount of documents?
How does it scale?
great question!
Excellent video (as always). I have a question about the .jar dependency. Are you able to please provide instructions on where to get updated Neo4j releases? I see that they're onto 5.25.1 now, but I don't see where I can simply download a .jar. Do we need to compile that somehow? Thanks again for the fantastic video!
Aha, I think I found it (the Neo4j APOC GitHub repo). Thanks again!
Congratulations, great approach! I was wondering if you could consider using a different embedding method instead of OpenAI's, as it requires payment. What do you think?
Ollama also offers an embedding model, and to be honest, I think you should probably use it. You can take the same approach: use an env variable to select the Ollama embeddings class, like I did for the model.
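A minimal sketch of that idea (the EMBEDDING_TYPE variable and the embedding model name are assumptions, not what the repo actually uses):
import os
from langchain_openai import OpenAIEmbeddings
from langchain_community.embeddings import OllamaEmbeddings
# pick the embedding class from an env variable, mirroring the llm_type switch
if os.getenv("EMBEDDING_TYPE", "openai") == "ollama":
    embeddings = OllamaEmbeddings(model="nomic-embed-text")
else:
    embeddings = OpenAIEmbeddings()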
Can you provide some information on the motivation for changing from ChatOllama in the video to OllamaFunctions in the repo, please?
Hello, I had some trouble with:
graph.add_graph_documents(
graph_documents,
baseEntityLabel=True,
include_source=True
)
Apparently some labels and relationships were empty, which is a problem for Neo4j. I handled it by making sure that every node and relationship had a label. Why do I get this issue when you didn't?
@@Weotcs Some people had this with Llama. Did it also happen with GPT-4o mini?
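For anyone hitting the same thing, a rough sketch of the guard described above (dropping nodes and relationships that came back without an id or type before writing to Neo4j); this is an illustration, not the author's code:
cleaned = []
for gd in graph_documents:
    # keep only nodes that have both an id and a type (label)
    gd.nodes = [n for n in gd.nodes if n.id and n.type]
    valid_ids = {n.id for n in gd.nodes}
    # keep only relationships with a type whose endpoints survived the filter
    gd.relationships = [
        r for r in gd.relationships
        if r.type and r.source.id in valid_ids and r.target.id in valid_ids
    ]
    cleaned.append(gd)
graph.add_graph_documents(cleaned, baseEntityLabel=True, include_source=True)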
Is this GraphRAG as in Microsoft's approach of that name? Because I don't see anything about creating communities for search.
Really informative and to the point tutorial! I do have 2 questions though.
Q1. Did this approach store two sets of embeddings for the documents? One in the graph database and the other in a normal vector database?
Q2. If I want to add more data to this graph, do I have to recompute the nodes every time or is there a way to just extend the original graphs with new nodes from the newly added content?
Thanks!
1. Yes, that approach is called a hybrid approach.
2. Yes, as far as I know there is no way around that.
@@codingcrashcourses8533 thanks 🙏
Great video! One thing I would be interested to know: were you able to compare the results between the two approaches, graph or vector search? Which one retrieves the most relevant results?
It depends on the question :). You probably want to use a Hybrid approach
Is it possible to use other models than llama 3.1? E.g GGUF models?
@@jordybrakie yes, normally that's possible. But some models struggle to create the graph.
Are the graph visualizations mostly used for development, to double check what the system is 'thinking' and look for valuable clustering?
The visualization will probably only be used for development; the LLM won't use it (a vision model wouldn't be able to deal with it, if you think about it).
@@codingcrashcourses8533 I may have missed it from the video, but is Neo4j open source / free to use always? I looked at their site and was confused. Or are you using free-tier for the demo? Would love to use this for development. Thanks!
@@i2c_jason It's free to use with Docker, and you can also download a Desktop version. If you want to use it for your enterprise app, it might be different.
@@codingcrashcourses8533 Can you demonstrate it with the Desktop version of Neo4j?
@@i2c_jason It's source available, and equal to open source as long as you don't sell it as a service. So you can make a company around a service that uses it for free, but not offer it on a SaaS.
Great approach. As far as I understood, you still need to build a vector store for the queries, so it is not possible to get an answer with only the Neo4j graph database?
You do! It's normal to use a hybrid approach and use results from both Neo4j and similarity search.
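Roughly, the hybrid idea looks like this (graph_retriever and vector_index as set up in the notebook; this function is a sketch, not the author's code):
def full_retriever(question: str) -> str:
    # structured context from the knowledge graph
    graph_context = graph_retriever(question)
    # unstructured context from plain similarity search
    vector_docs = vector_index.similarity_search(question, k=3)
    vector_context = "\n".join(d.page_content for d in vector_docs)
    # the LLM gets both and can answer from either
    return f"Graph data:\n{graph_context}\n\nVector data:\n{vector_context}"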
I have created a Neo4j graph, but querying is getting difficult. The retrieval function varies for different inputs (it works for this example only; if I change the input, it may not give me proper results). How do I overcome this problem? Can you give us a solution for querying the Neo4j graph with different LLMs (like Gemini, OpenAI, Ollama, etc.)?
Thank you, hoping for a proper response.
I tried to git clone your repo.. it does not work.
Why? Cloning should always work if you don't have an error in your git configuration.
Maybe you can help: when I execute entity_chain = llm.with_structured_output(Entities) I get this error in BaseLanguageModel.with_structured_output(self, schema, **kwargs):
238 """Not implemented on this class."""
239 # Implement this on child class if there is a way of steering the model to
240 # generate responses that match a given schema.
--> 241 raise NotImplementedError
I'm using llama3.1:8b. Maybe it doesn't support the structured output method.
@@aoliveira_ yep, that's the issue
I am having this error "no validator found for , see `arbitrary_types_allowed` in Config"
@@bhaibhai-qe8tt what LangChain version and Pydantic version do you use?
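For the with_structured_output issue above, one workaround is the OllamaFunctions wrapper from langchain_experimental, which does implement structured output (whether this is also why the repo switched away from ChatOllama is a guess); a minimal sketch:
from langchain_experimental.llms.ollama_functions import OllamaFunctions
llm = OllamaFunctions(model="llama3.1", temperature=0, format="json")
# Entities is the Pydantic schema defined in the notebook
entity_chain = llm.with_structured_output(Entities)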
When I use the convert_to_graph_documents function, for some reason it's not creating any nodes. Am I missing a step? From my understanding, it is the LLM that decides the nodes and the relationships between them, right?
@@yashbajpai231 hm, what errors do you see? Normally you are correct
Nice video. 🎉 Thanks
Very nice!
You mentioned you received a key error at one point. I am getting that key error as well. Please share how you worked around it. Following your notebook, my key error is: KeyError: 'head'.
@@Transforming-AI I had the same issue a few times. I think the 8B model produced this, but not always. Or did you try the 70B or 405B model?
@@codingcrashcourses8533 I have the 8b installed, will try the 70b. Thanks.
@@codingcrashcourses8533 I had the same; it seems to be caused by the double quotes in dummytext.txt. If you replace them (say, with single quotes), the JSON will not be broken. The key error was due to the missing 'head', which is one of the node keys.
@@codingcrashcourses8533 I tried it with llama3.1:70b and it took 13 minutes on my Mac Studio.
@@codingcrashcourses8533 I'm also running into this issue when running with the 8B model.
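A minimal sketch of the quote-replacement workaround mentioned above (the file name is the one used in the video; the splitting step is assumed to match the notebook):
with open("dummytext.txt", encoding="utf-8") as f:
    # replace double quotes so the model's JSON output doesn't break
    raw_text = f.read().replace('"', "'")
docs = text_splitter.create_documents([raw_text])
graph_documents = llm_transformer.convert_to_graph_documents(docs)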
Awesome video! The notebook is slightly different from the video. Could you please make some comments here to reflect the departure away from the video?
I incorporated fixes for issues people had with the notebook. Logically, nothing should have changed.
Thank you for this video! 👏 I've been eager to see someone cover this topic. I have a couple of questions I hope you can help me with.
In this approach, is there a way to rerank the retrieved documents?
Did I understand correctly that you first query the graph database, and based on the retrieved entities and relationships, it then retrieves pieces of documents from the vector store?
Regarding reranking: I don't know, to be honest. I could not think of any metric to do that.
Regarding retrieval: No! I used a hybrid approach there, but they are independent approaches. The LLM will use docs from both approaches.
@@codingcrashcourses8533 Honestly, not sure if reranking is even needed for Graph based RAGs, as they are very accurate. 😄
Thanks! 👊
Does anyone know if there is a graphics package for Python that works the way this one does, but not for Jupyter notebooks?
Looking to build an app and want to be able to show the same kind of graph viz in a web browser.
I can do a Video on that if you want to
@@codingcrashcourses8533 that would be great!
Cool. Thanks a lot
Docker compose is not working.
Any suggestions?
What are the errors?
Nice. Can I use only the Ollama llama3.1 LLM, without OpenAI? Thanks
@@giantworks1366 yes, I wrote the code in a way that you can switch from one to the other via the llm_type variable. But the 8B-parameter model sometimes seems to have issues creating the docs in the required way.
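Roughly what that switch looks like (the env variable and model names here are assumptions; check the repo for the exact values):
import os
from langchain_openai import ChatOpenAI
from langchain_community.chat_models import ChatOllama
llm_type = os.getenv("LLM_TYPE", "openai")
if llm_type == "ollama":
    llm = ChatOllama(model="llama3.1", temperature=0)
else:
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)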
Please help me: I start to run the second command line (from langchain_core.runnables import RunnablePassthrough...) and the result is false; no .env file appears. Please help me to fix this. Thank you.
@@QuocNguyen-se7vi you have to create it yourself first ;).
@@codingcrashcourses8533 thank you. I see LightRAG constructs a better knowledge graph than GraphRAG; would you take a look at it?
I tried this approach with Ollama, but it seems that when executing the line llm_transformer.convert_to_graph_documents it runs for a very long time and I need to stop the run. Can you help me? What is the problem? The Neo4j graph DB connection?
@@Silvietta13 any errors?
@@codingcrashcourses8533 No errors!
@@codingcrashcourses8533 After 118 minutes with Ollama it ended with this error: ValidationError: 2 validation errors for Node
id
none is not an allowed value (type=type_error.none.not_allowed)
type
none is not an allowed value (type=type_error.none.not_allowed). Error is generated from the line: graph_documents = llm_transformer.convert_to_graph_documents(documents)
@@Silvietta13 so what did you do?
Interesting! Is Neo4j language dependent? So is it capable of finding relations in multilingual data, or does Neo4j not care about the language and only the LLM that creates the knowledge graph needs to be multilingual?
The LLM is responsible for creating the entities and documents; you just save them in the required format in the database. Neo4j has libraries for many programming languages. The format of the documents matters.
LangChain --version ?
I just finished learning ChromaDB. Can GraphRAG work with ChromaDB instead of Neo4j?
No, chroma is not able to store graphs
What about Surrealdb?
I am encountering an error when running the command
print(graph_retriever("Who is Nonna Lucia?"))
Generated Query: Nonna~2 AND Lucia~2
ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure db.index.fulltext.queryNodes: Caused by: java.lang.IllegalArgumentException: There is no such fulltext schema index: entity}"
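If anyone else hits this: the error usually means the 'entity' fulltext index was never created in Neo4j. A rough sketch of creating it once up front (the __Entity__ label is what baseEntityLabel=True adds; the index and property names here are assumptions based on the usual setup, so adjust them to the notebook):
graph.query(
    "CREATE FULLTEXT INDEX entity IF NOT EXISTS "
    "FOR (e:__Entity__) ON EACH [e.id]"
)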
How can I use this for my code project files? I need to convert a whole PHP project with Python.
The code is in the description; it's free, do whatever you want with it :).
Running into this error over and over in the "graph_documents = llm_transformer.convert_to_graph_documents(documents)" cell:
731 parsed_json = self.json_repair.loads(raw_schema)
732 for rel in parsed_json:
733 # Nodes need to be deduplicated using a set
--> 734 nodes_set.add((rel["head"], rel["head_type"]))
735 nodes_set.add((rel["tail"], rel["tail_type"]))
737 source_node = Node(id=rel["head"], type=rel["head_type"])
TypeError: string indices must be integers
Hmhmh, some people seem to get this unfortunately. Did you use the 8B model? That could explain it, since it might create empty documents or so.
@@codingcrashcourses8533 getting the same error as well
@@Wingly113 also with 8b? I did not run in any errors with 70b of gpt4-small yet
@@codingcrashcourses8533 Yup, 8b. I can't handle 70b tho...
Llama 3.1 8B can also handle it: instead of Ollama, use llama.cpp, download a Llama 3.1 8B .gguf model, and customise the prompt template, and you are good to go. For reference you can use this repo:
github.com/s3dhanth/GraphRAG-with-Hermes-2.5-Pro-LLM-using-neo4j-database
How much VRAM and RAM do you think it takes to run the 70b model at minimum?
The amount of VRAM you would need is roughly 70B parameters x 2 bytes per parameter (fp16) = 140 GB.
Is this really an RDF knowledge graph, or just a Neo4j property graph? That doesn't look like a SPARQL query in your code.
Hm, I am not a graph expert to be honest. I had to go through some tutorials myself to learn how to construct a query first, since I have mainly worked on the RAG part so far. What would it look like in your opinion?
Cool 🤙
I tried following the tutorial and encountered some issues (I suspect the issue stems from the OpenAI fallback?). Anyway, I have created a pull request on your GitHub. Hope that helps anyone who wants to use only local models and encountered a similar problem.
@@Yes-lm9dq I will have a look at it and merge it if it addresses the issue.
where is the .env file in the repository?
@@federicosalvati2454 it's not there. You need to use your own API keys and passwords, of course.
@@codingcrashcourses8533 thank you so much. I am approaching coding with Python and RAGs for the first time, and all of this helps a lot.
Hi, amazing content as usual, man. I am trying to implement this Neo4j graph on my PC after downloading your code. I went to Neo4j Aura and downloaded a .txt file containing the database credentials, but whenever I try it from your code I get the error message below. Please help:
" Could not connect to Neo4j database. Please ensure that the url is correct"
It is exactly the same thing I downloaded from their website.
.env file is not in his code, so you'd have to create your own OPENAI_API_KEY and the Neo4j username and password. Remember to set NEO4J_URI to bolt://localhost:7687
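A minimal sketch of what that .env and its loading could look like (values are placeholders, variable names are assumptions apart from the ones mentioned above):
# .env (placeholders):
#   OPENAI_API_KEY=sk-...
#   NEO4J_URI=bolt://localhost:7687
#   NEO4J_USERNAME=neo4j
#   NEO4J_PASSWORD=your-password
from dotenv import load_dotenv
load_dotenv()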
Thank you for the great video.
Is it possible to still use Ollama via LangChain? I got an error when I use "if llm_type == "ollama""
-- error message : Received unsupported message type for Ollama.
(my_path) : LangChainDeprecationWarning: The method `BaseChatModel.__call__` was deprecated in langchain-core 0.1.7 and will be removed in 0.3.0. Use invoke instead.
warn_deprecated(
Did you change something in the codebase? Which line exactly produces this error? I used the invoke method to perform calls to the LLMs, so this kind of warning should not appear.
It took me 100 minutes to compute the graph on my 32GB i7 machine
@@rorycawley yeah, I said it's expensive :D
Looks like Llama can't generate low-resource languages like Vietnamese.
No, but to be honest most models have issues with that. You probably have to create the English documents first and then try to translate them.
@@codingcrashcourses8533 when I first tried it after release, it failed to generate the low-resource language, but I just tested it in Poe and it seems to work well now. Btw thank you, your content really helps.
@@codingcrashcourses8533 I really appreciate your content. When I first tested with TogetherAI, it only responded in English; it seems to work well now when I tested again.
Thank you for this video, I have been looking for a way to use local graphrag until I see this. Thank you so much.
But I encountered an issue while using the Neo4j browser. It keeps saying "Cannot load from URL 'file:///import/test_container.csv': Couldn't load the external resource at: file:///import/test_container.csv ()"
I have tried all means to resolve this issue, but the error keeps persisting. Please, I need your help.
You got this issue when running my code? I don't use a test_container.csv file
@@codingcrashcourses8533 No, I did not get the issue while running your code. I already solved it. Thank you
for entity in entities.names:
response = graph.query(
"""CALL db.index.fulltext.queryNodes('entity', $query, {limit:2})
YIELD node,score
CALL {
WITH node
MATCH (node)-[r:!MENTIONS]->(neighbor)
RETURN node.id + ' - ' + type(r) + ' -> ' + neighbor.id AS output
UNION ALL
WITH node
MATCH (node)
@@kevli6373 did you try OpenAI models too? Or just the small 8B Llama model?
@@codingcrashcourses8533
I used only the OpenAI model.
I got a message:
ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure db.index.fulltext.queryNodes: Caused by: java.lang.IllegalArgumentException: There is no such fulltext schema index: entity}
Output is truncated. View as a scrollable element or open in a text editor. Adjust cell output settings...
😂
@@codingcrashcourses8533 thank you for the reply. I used only the OpenAI model. I didn't try Llama yet.
@@codingcrashcourses8533 I used only OpenAI GPT-4o mini, not Llama. In your case, does it work well for both?
@@kevli6373 I had issues with Llama 8B. Everything else worked fine :/. The LLM seems unable to create/extract entities.