Recently, I've observed widespread use of the pipe operator, and this video provided a clear explanation of its purpose and functionality. Once again, it was incredibly helpful!
glad to hear it!
Pipes are a powerful tool for clearly expressing a sequence of multiple operations. They have been used for a long time in R. People coming from R to Python will be happy with the pipe operator.
People coming from a Linux background will also find this intuitive.
James, thank you so much, you've made this "scary" topic digestible by converting it into a step-by-step, in-depth practical guide. Thanks a lot for your work!
I bought a course on Udemy, which claimed to be the complete guide to Langchain. I understood more about langchain from this video than the entire course. Thank you so much for the content. You really are a good teacher :)
Better explanation than the video from Langchain! Awesome work!
Thanks for this more in depth explanation. Coming from Scala, LCEL makes sense for me.
The problem is not that it is "not Pythonic", the problem is that it is not READABLE. We are sacrificing readability, which is always important, and especially important when building something as conceptually challenging as an LLM pipeline.
I am still not sure if I like this or not. It's the official go-to way now, but I still prefer the Chain interface. For RetrievalQA you had this nice Chain, now you have to do everything on your own. I still miss some easy-to-use and clean examples with LCEL for more advanced use cases.
Old chains like RetrievalQA are compatible with the pipe stuff.
Both kinds of chains implement the same Runnable interface (I guess).
You can do:
my_pipe_chain = blah | blah
qa_chain = RetrievalQA(…)
awesome_chain = my_pipe_chain | qa_chain
I have to implement new features in a project and we were using class-based chains and were doing some crazy shit. You don't want to see that stuff.
If we had used pipe chains, it would be much easier to refactor this.
And if you can just use the chains as is, good.
But we needed to change lots of stuff and it was confusing, all those classes which inherit from and delegate to a dozen others…
A big problem is that we didn't really know everything those chains were doing, so now I'm converting them to pipe chains and I have to read a lot of code, jumping around.
More precisely, one of my biggest problems is reproducing everything the ConversationalRetrievalQA chain does.
Great video! I’d love a follow up with custom data formats or output formats
This is great actually. If you are familiar with the R statistical language you will easily grasp the pipe approach.
People just need to get used to this new paradigm.
yeah I think it's pretty interesting, it's weird when you're coming from Python but once you start using it it seems nice - for now I think it mainly needs better integration with agents, idk if that is planned or not but most of the time I'm using LangChain for agents
I absolutely LOVE lcel
More and more of this content, please!
I recently spent a whole day trying to implement self-querying with Pinecone and LangChain, but it was a failure... It was impossible to filter on metadata, and the Pinecone docs on the Index are not super easy. If you wanted to do more very pedagogical videos like this one on the notion of the Index (and thus self-querying), so that we can avoid using LangChain to push to Pinecone, it would be incredible!
I gave up on LangChain's SelfQueryRetriever a few months ago and basically use three patterns: 1) a postgres database with a class that does `user utterance`->`natural language query`->`sql query` translation to select a subset of documents, 2) an on-the-fly embedding + vector similarity estimator on the subset of documents, and 3) control-flow operators where the LLM chooses from a set of enums (for categorical matching) or estimates a coefficient (for thresholding).
More importantly, with the first pattern, you can now answer questions like "What are the top 5 complaints from customers that scored us with low NPS?" (and then generate summary statistics by passing the results to pandas), which is completely impossible to answer via RAG or any other form of embedding-based search.
This guy has a series worth checking out on how to build a Postgres Data Analytics AI Agent, starting with this video: ua-cam.com/video/jmDMusirPKA/v-deo.html
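As a rough illustration of why the first pattern handles aggregate questions like the NPS one, here is a minimal sketch using an in-memory SQLite database in place of Postgres. The `feedback` table, its columns, the sample rows, and the "low NPS means 6 or below" cutoff are all assumptions made up for the example; in the pattern described above the SQL would be generated by the LLM rather than written by hand.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (customer_id INTEGER, nps INTEGER, complaint TEXT)"
)
rows = [
    (1, 2, "slow support"),
    (2, 3, "slow support"),
    (3, 9, "pricing"),
    (4, 1, "billing errors"),
    (5, 4, "slow support"),
    (6, 2, "billing errors"),
    (7, 10, "slow support"),
]
conn.executemany("INSERT INTO feedback VALUES (?, ?, ?)", rows)

# The kind of query an LLM might produce for "top 5 complaints from
# customers that scored us with low NPS" (assumed cutoff: nps <= 6).
query = """
    SELECT complaint, COUNT(*) AS n
    FROM feedback
    WHERE nps <= 6
    GROUP BY complaint
    ORDER BY n DESC, complaint ASC
    LIMIT 5
"""
top_complaints = conn.execute(query).fetchall()
print(top_complaints)
```

The counting and ranking happen in the database, which is the part a pure similarity search over embeddings does not give you.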
@leobeeson1
Let me tell you that you made my day!
I will check that out with great interest!!
How would you use conversation and LCEL together? Cheers
Great video and explanation. I think the pipe operator is a solution in search of a problem. Might be useful in some contexts.
Happy birthday, bro! 🎂
appreciate it!
Great video James. Thanks
Thank you James, very clear, looking to experiment with Remote Runnables etc.
The pipe operator and RunnableParallel remind me of the Apache Beam Python API.
- good to see they are not putting "Zen of Python" dogmatism before AI innovation!
LangChain docs have completely changed since a year ago - this field is moving so fast!
Thanks a ton James!!
How do you add a variable for metadata filtering when using LCEL? I just can't figure it out. And how do you get verbose mode so that you can see all the prompts and steps taking place?
Same issue here. Did you find a solution?
So the pipeline operator can't forward unconsumed inputs automatically further downstream?
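A minimal plain-Python sketch of the idea behind this question: in a simple pipeline each step only sees what the previous step returned, so anything needed later has to be carried along explicitly. The `pipe`, `retrieve`, and `build_prompt` names below are hypothetical helpers, not LangChain APIs; LangChain's own tool for this is `RunnablePassthrough`, which is not used here so the example runs without the library.

```python
def retrieve(inputs: dict) -> dict:
    # Hypothetical retriever: adds "context" while keeping the incoming keys,
    # so the original question is still available downstream.
    context = f"docs about {inputs['question']}"
    return {**inputs, "context": context}


def build_prompt(inputs: dict) -> str:
    # Needs both the retrieved context and the original question.
    return f"Context: {inputs['context']}\nQuestion: {inputs['question']}"


def pipe(*steps):
    # Hypothetical helper: feed each step's output into the next step.
    def run(value):
        for step in steps:
            value = step(value)
        return value

    return run


chain = pipe(retrieve, build_prompt)
prompt = chain({"question": "What is LCEL?"})
print(prompt)
```

If `retrieve` returned only the context string, `build_prompt` would have no way to see the question, which is the situation the comment above is asking about.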
Great content. It works with agents; in my case I need to use AgentExecutor instead of Agent: agent_executor = AgentExecutor(agent=agent_chain, memory=memory, verbose=True, tools=tool_list, return_intermediate_steps=False)
Looks like AgentExecutor is not streaming with LCEL. Any ideas?
@jamesbriggs Great tute! James, what's the right way of using an LCEL chain as a Tool inside an Agent?
I don't mind the syntax too much... It's not Pythonic but I'm used to it because it reminds me of R tidyverse pipelines. What I can't really find or understand is how to incorporate chat history and implement callbacks using LCEL. I haven't found understandable documentation for that. It appears that, in the process of making a clean implementation for chains, they have made adding history (like ConversationBufferMemory) pretty complicated, and it wasn't previously. Oh, and thanks for the in-depth walkthrough of how they overload that pipe operator to use it in Python, very good walkthrough on that!
I included chat history with LCEL (pretty manual) in this walkthrough if it helps :) ua-cam.com/video/rbzYZLfQbAM/v-deo.html
One day this might catch up to the usability of chatsnack, but hard to see if this is the right direction?
Confused by pipes! Oh the kids these days 😉 another great vid, thanks dude (you should try some awk and sed one day for fun)
Thank you. Can this be done using the PaLM API?
That's a great topic! Thanks James :)
Hey James, nice video. I recently refactored some code to use LCEL and I enjoyed the process. Any chance you have a video lined up around setting up benchmarks for a system in production. And then evaluating potential changes before they go live? Especially for one that uses RAG, thanks.
Many many happy returns of the day
Bro, how do you come up with these techniques? Do you follow some specific blog or anything? Could you please tell me?
If you ask langchain chat anything slightly more advanced than the most basic LCEL questions, it spits out incorrect code. If you neatly organize all the LCEL examples for context, GPT4 produces worthless code responses. If you add in all of the runnable source code and give that a whirl, GPT4 is still unable to build simple prompt templates, let alone work output parsers into the mix, forget tools.
Nothing in recent memory has made me as angry as the Python implementation of LCEL, and I tried so hard to like it. The focus on LCEL seems to have made the entire library suffer, as the general langchain examples are only sporadically functional and it's difficult to work out the recommended approach for certain functionality across their updates of the past 6 months.
Hey James, new to the AI creation space and my goal is to try my hand at creating an AI VTuber with personality. Could you please point me in the best direction to follow out of all the videos you have to help me reach this goal? Your videos are great but it's hard to pick one to follow for it. Thanks!
thanks! amazing xD
i was wondering where these pipes suddenly came from...
Sounds like a lesser (since it's langchain) ripoff of LMQL. Langchain can do a lot but working on its codebase is one of the worst experiences working on a repo I've ever had. It's unmanageable for a production environment. LMQL is awesome btw.
nice I haven't dived into LMQL yet - will check it out soon :)
The LangChain team answered this in the negative (ua-cam.com/video/9M8x485j_lU/v-deo.htmlsi=kkHWKA9VcClwXfxU&t=2396). According to them, LMQL is focused on instructions to the LLM, whereas LCEL is for composing the processing outside the LLM.
I agree, langchain was very helpful early on when it was just a simple wrapper for common LLM functions. However, it has just grown more and more complex without a clear underlying principle.
Have you tried Marvin or Instructor? They both do what LMQL does but they integrate with Python much better. Neither has that LMQL playground but I'm not sure of the benefit of that over using Marvin/Instructor in a Jupyter notebook.
@zacboyles1396 Instructor is on my list of things to add to my system. It seems odd to hear it integrates with Python better, since LMQL is scope-aware when you include it in a docstring with the LMQL decorator on a function.
Apache Beam had the same thing...
The pipes are the most confusing thing ever created. Thanks for the video - it explains them in a great way, but it doesn't make them more pleasant. This is totally unneeded and it is not even syntactic sugar, because it doesn't make the syntax easier, it doesn't make the code more understandable, and it doesn't reduce the boilerplate. Exactly the opposite - you have to learn a totally new concept with no need for it.
Thank you. The langchain documentation sucks. I was really confused until I watched this video.
LangChain was already a questionable abstraction. Now they've built an abstraction on top of the abstraction. 🤔
If you're okay at Python and understand anything about LLMs, I think this library is best avoided, especially its core abstractions like chains, agents, etcetera. If you want some convenient interfaces to components like LLMs or vector databases then go ahead and use what you need. But beyond that, it causes much more work than it avoids. Especially as the documentation is pretty poor. They haven't even bothered with docstrings for the main classes.
I really dislike langchain because I think it's just a massive unreliable regex hack. Does this make langchain more usable/reliable?
It's not a regex hack. It just overloads the `|` operator via the `__or__` method. James even explains it.
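For anyone who has not seen operator overloading before, here is a minimal sketch of how `|` can be made to chain steps in plain Python. The `Step` class is a hypothetical stand-in written for this example and is not LangChain's actual implementation; it only shows the `__or__` mechanism the reply above refers to.

```python
class Step:
    """Hypothetical stand-in for a runnable; not LangChain's actual class."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` calls a.__or__(b); return a new Step that runs a, then b.
        return Step(lambda x: other.func(self.func(x)))

    def invoke(self, x):
        return self.func(x)


add_one = Step(lambda x: x + 1)
double = Step(lambda x: x * 2)

chain = add_one | double
result = chain.invoke(3)  # computes (3 + 1) * 2
print(result)
```

No string matching is involved: the pipe is ordinary Python method dispatch, and the order of the operands decides the order in which the steps run.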