The demo is impressive!
Patrick has one of the best presentations on agents on the internet. Well explained, straight to the point and intuitive.
Generative AI agents are truly impressive; however, to fully understand them, we also need great presentations like this one. Thank you, Patrick.
what a presentation by Patrick, thank you so much
Amazing, simple, realistic .. well done Patrick
Heck yeah. Stoked to play around with it
Excellent session!👏
@Patrick Marlow this is the best intro to Agents I've had! Fantastic job!
But hey, what's the thing with Tesla you guys have at Google? Are you merging both companies? 😂😅
10:47 - "Models: No native tool implementation. Tools can be implemented via custom integrations." "Agents: Tools are natively implemented in Agent architecture"
15:00 - "the point is you're building essentially a shim between the agent and the API interface" (speaking on the "Extension" construct)
So how is building a shim to an API different than building an integration?
Yep great catch! For me, this really boils down to your role in interacting with these systems.
Consider there are 2 roles:
[1] Consumer
[2] Developer
If you are the "Consumer" of a Gemini Model (i.e. the API endpoint for `gemini-1.5-flash-001`) then there are no tools involved. Imagine someone created a basic UI with an empty text box, and when you hit "enter" it simply sends 1 API call directly to the Model. No tools, just "generate some text".
If you are the "Consumer" of Gemini (the web based Agent Application) then tools are already built into that application (e.g. Google Search, Image Generation, Flights Search, etc.)
The difference shows up when you are the "Developer" working with either of the above systems.
As a "Developer" working with a Model or Agent, your goal is to build something for a "Consumer".
So in both of these cases, from your perspective as a Developer, these things become the same to some degree.
Your Model + Integrations becomes an "app" for a Consumer.
Your Agent + Tools (Extensions) becomes an "app" for a Consumer.
The difference for Agents is that they include one last piece: the Reasoning component.
There are various camps that argue Model + Tools could also be an "agent" to some degree, and I won't particularly argue against that.
So for simplicity's sake, we try to keep the definition of "agent" to be Model + Tools + Reasoning, as I laid out in the first slide @ 2:00 and 5:30
TLDR - If you're the developer of these systems, building a shim to an API and building an Integration are largely the same. The difference is in the end application architecture.
Hope this provides some clarity, and thanks for watching! 🙏
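A minimal sketch of the distinction drawn in the reply above, under loud assumptions: `fake_model`, `flights_tool`, the `PLAN:` prompt prefix, and the `CALL <tool>` convention are all hypothetical stand-ins invented for illustration, not any Gemini or Vertex AI API. It only tries to show where the "Reasoning" piece sits: in the first two apps the developer decides whether a tool runs, while in the third the model's own plan decides.

```python
from typing import Callable, Dict


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a model endpoint: one prompt in, one string out."""
    if prompt.startswith("PLAN:"):
        # Pretend the model reasons about whether a tool is needed.
        question = prompt[len("PLAN:"):].strip()
        return "CALL flights" if "flight" in question.lower() else "ANSWER"
    return f"generated text for: {prompt}"


def flights_tool(query: str) -> str:
    """Hypothetical tool: a thin shim between the app and some external API."""
    return f"[flight results for '{query}']"


# Registry of tools the agent is allowed to pick from (illustrative only).
TOOLS: Dict[str, Callable[[str], str]] = {"flights": flights_tool}


def bare_model_app(user_input: str) -> str:
    """Bare model: a single call, no tools involved."""
    return fake_model(user_input)


def model_plus_integration_app(user_input: str) -> str:
    """Model + Integration: the developer hard-codes that the tool always runs."""
    context = flights_tool(user_input)
    return fake_model(f"{user_input} {context}")


def agent_app(user_input: str) -> str:
    """Model + Tools + Reasoning: the model's plan decides whether a tool runs."""
    decision = fake_model(f"PLAN: {user_input}")
    if decision.startswith("CALL "):
        tool_name = decision.split(" ", 1)[1]
        observation = TOOLS[tool_name](user_input)
        return fake_model(f"{user_input} {observation}")
    return fake_model(user_input)
```

In this toy version the shim (`flights_tool`) is identical in both the integration app and the agent app, which matches the TLDR: the difference is only in who decides when it gets called.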
Thanks for the response 👍
After reading your explanation a few times, rewatching a bit, and letting the little grey cells work, it looks like you're simply defining a tiny slice of structure on top of a bare model, i.e. a RAG approach specific to your take on the (as you say) overloaded term Agent. The two pieces I quoted can be just two perspectives on this framing, with a model + tools being basically the same as an agent without the explicit reasoning part.
Interesting talk! How do you propose to handle agent "locality" and movement? Also related, how do you plan on versioning agents, and are you and your incubator interested in discussing a distributed protocol similar to a git-like semantic vcs as a possible solution to this locality/versioning problem? It just happens to solve numerous other issues by addressing git's technical debt! (It is 20 years old after all...)
Thanks for the response, @pmarlow-google! We need to hear and read from you more often! IMO this conversation is way too important to sit here, somewhat lost in YT comments. If you have a moment, it would be awesome to read a larger blog post that goes deeper into these questions (assuming there is none yet?). Say a developer is working with Model + Integration (tools + function calling) via the official vertexai/gemini SDK (any supported language): does that mean they will be missing out on Reasoning Engine (and is it going to stay part of LangChain on Vertex AI)? Is it assumed that if you write your code using the vertexai/gemini SDK, you don't need Reasoning Engine functionality, since you'll have to write your own logic around function calling? Also, IMO it is not very clear what will happen with Vertex AI Extensions functionality in the vertexai/gemini SDK... I take it this functionality is too new and hasn't been included in the vertexai/gemini SDK yet?
Beautiful ❤️ thanks
Very good!
I'm excited to see how this technology evolves and transforms the way we work alongside AI. Have you had any experiences with Generative AI Agents yourself?
Excellent on every level
How can we start building with these right now?
Thanks for the nice presentation 👍 Here are a few issues with Generative Playbooks:
1. When you navigate to a Flow, it will always detect the Default Welcome Intent (Agent is DFCX).
2. There is no way to send parameters generated in a playbook back to the flow.
3. If you use a complex object in a tool, your Model does not recognize it at all.
4. It's not developer friendly: you don't have enough logging information, and many more 😂😂😂😂
Really solid
I want to be an Agent..❤
You are a mushy agent
It doesn't work with complex APIs
Then that’s where you write in a CrewAI function call as a part of your flow