If you want external-facing conversational agents that can integrate with human support teams and existing telephony and communication platforms, choose the Customer Engagement Suite and its Conversational Agents.
If you want internal search to accelerate knowledge exchange throughout your organization, across your drives, chat, mail, ticketing platforms, databases, and more, including AI assistant support, choose Agentspace.
If you want to build something custom, you can use basic building blocks and start building from the ground up, for example with the Google Gen AI SDK or LangChain, where you will also need to make decisions about infrastructure and hosting.
If you want the freedom of custom development with support for communication between agents through conversation history and shared state, choose the Agent Development Kit. It makes it easier to build multi-agent systems while handling the challenges of agent communication for you. It also frees you from infrastructure decisions through deployment to Agent Engine, a fully managed runtime, so you can focus on building the logic and interactions between agents while resources are allocated and autoscaled for you.
The Google Agent Development Kit (or Google ADK) is designed to empower developers to build, manage, evaluate and deploy AI-powered agents.
The Agent Development Kit is a client-side Python SDK, enabling developers to quickly build and customize multi-agent systems.
While providing core tools, it also lets developers easily integrate and reuse tools from other popular agent frameworks (such as LangChain and CrewAI), leveraging existing investments and community contributions.
ADK also makes evaluation easier, and it provides a convenient, user-friendly local development UI with tools to help you debug your agents and multi-agent systems.
Google ADK provides callbacks that can be used to invoke functions at various stages of a flow, as well as session memory for stateful conversations, which enables agents to recall information about a user across multiple sessions, providing long-term context in addition to short-term session State.
It integrates artifact storage to facilitate agent collaboration on documents.
Google ADK can also be deployed to Agent Engine for fully managed agent infrastructure.
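To make the callback idea above concrete, here is a minimal sketch of wiring a before-agent callback into an agent with the ADK Python SDK. The agent name, model string, and instruction are illustrative placeholders, not something from this post.

```python
# A minimal sketch of a before-agent callback, assuming the google-adk
# Python package; names, model, and instruction are illustrative.
from google.adk.agents import LlmAgent
from google.adk.agents.callback_context import CallbackContext


def log_invocation(callback_context: CallbackContext) -> None:
    # Returning None lets the agent run normally; returning content
    # instead would skip the agent for this turn.
    print(f"About to run agent: {callback_context.agent_name}")


root_agent = LlmAgent(
    name="support_agent",
    model="gemini-2.0-flash",
    instruction="Answer customer questions politely.",
    before_agent_callback=log_invocation,
)
```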
Google ADK is built around a few key primitives and concepts that make it powerful and flexible: The Agent is the fundamental worker unit designed for specific tasks.
Agents can use language models for complex reasoning, or to act as controllers to manage workflows.
Agents can coordinate complex tasks and delegate sub-tasks using LLM-driven transfer or explicit AgentTool invocation, enabling modular and scalable solutions.
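As a hedged sketch of what that composition can look like with the ADK Python SDK (the agent names and instructions are invented for illustration): a coordinator can hand control to a sub-agent via LLM-driven transfer, or call another agent explicitly as a tool.

```python
# Illustrative multi-agent composition with google-adk; all names and
# instructions are placeholders.
from google.adk.agents import LlmAgent
from google.adk.tools.agent_tool import AgentTool

billing_agent = LlmAgent(
    name="billing_agent",
    model="gemini-2.0-flash",
    instruction="Answer questions about invoices and payments.",
)

summarizer = LlmAgent(
    name="summarizer",
    model="gemini-2.0-flash",
    instruction="Summarize the text you are given in two sentences.",
)

# The coordinator can delegate to billing_agent via LLM-driven transfer
# (sub_agents) or invoke the summarizer explicitly as a tool (AgentTool).
coordinator = LlmAgent(
    name="coordinator",
    model="gemini-2.0-flash",
    instruction="Route billing questions to billing_agent; "
                "use the summarizer tool when asked for summaries.",
    sub_agents=[billing_agent],
    tools=[AgentTool(agent=summarizer)],
)
```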
With native streaming support, you can build real-time, interactive experiences using bi-directional streaming of text and audio.
This integrates seamlessly with underlying capabilities like the Gemini Live API, often enabled with simple configuration changes.
Artifact Management allows agents to save, load, and manage versioned artifacts (files or binary data such as images, documents, or generated reports) associated with a session or user during execution.
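A hedged sketch of what saving an artifact from inside a tool might look like; it assumes an artifact service is configured on the runner and that save_artifact is awaitable, as it is in recent ADK versions. The tool name and file name are placeholders.

```python
# Illustrative only: store a generated report as a versioned artifact.
from google.adk.tools.tool_context import ToolContext
from google.genai import types


async def save_report(report_text: str, tool_context: ToolContext) -> dict:
    # The artifact is tied to the current session/user and versioned.
    part = types.Part.from_text(text=report_text)
    version = await tool_context.save_artifact(filename="report.txt", artifact=part)
    return {"status": "saved", "filename": "report.txt", "version": version}
```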
Google ADK provides a rich tool ecosystem, which equips agents with diverse capabilities.
It supports integrating custom functions, using other agents as tools, leveraging built-in functionalities like code execution, and interacting with external data sources and APIs.
Support for long-running tools allows handling asynchronous operations effectively.
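The simplest way to give an agent a new capability is a plain Python function tool. The sketch below is illustrative: the function is a stub (a real tool would call an external API), and the agent name, model, and instruction are placeholders.

```python
# A minimal custom function tool registered on an agent with google-adk.
from google.adk.agents import Agent


def get_exchange_rate(currency_from: str, currency_to: str) -> dict:
    """Illustrative stub: a real tool would call an external rates API."""
    rates = {("USD", "EUR"): 0.92}
    rate = rates.get((currency_from, currency_to))
    if rate is None:
        return {"status": "error", "message": "unknown currency pair"}
    return {"status": "ok", "rate": rate}


root_agent = Agent(
    name="currency_agent",
    model="gemini-2.0-flash",
    instruction="Use the get_exchange_rate tool to answer currency questions.",
    tools=[get_exchange_rate],
)
```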
There is also integrated developer tooling, so that you can develop and iterate locally with ease.
Google ADK includes tools like a command-line interface (CLI) and a Web UI for running agents, inspecting execution steps, debugging interactions, and visualizing agent definitions.
Session Management handles the context of a single conversation (the Session), including its history (as Events) and the agent's working memory for that conversation (the State).
An Event is the basic unit of communication representing things that happen during a session (such as user message, agent reply, and tool use), forming the conversation history.
Memory enables agents to recall information about a user across multiple sessions, providing long-term context that is distinct from short-term session State.
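A small sketch of session and state handling with the in-memory session service; the app name, user ID, and state values are placeholders, and create_session is assumed to be awaitable as in recent ADK versions.

```python
# Illustrative session/state usage with google-adk's in-memory service.
import asyncio

from google.adk.sessions import InMemorySessionService


async def main() -> None:
    session_service = InMemorySessionService()
    session = await session_service.create_session(
        app_name="support_app",
        user_id="user_123",
        state={"preferred_language": "en"},  # short-term working memory (State)
    )
    # The conversation history accumulates as Events on the session.
    print(session.id, session.state, len(session.events))


asyncio.run(main())
```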
Google ADK provides flexible orchestration that enables you to define complex agent workflows using built-in workflow agents alongside LLM-driven dynamic routing.
This allows for both predictable pipelines and adaptive agent behavior.
As part of this orchestration, Google ADK uses a Runner, the engine that manages the execution flow, orchestrates agent interactions based on Events, and coordinates with backend services.
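As a hedged example of the predictable-pipeline side of orchestration, here is a two-step sequential workflow; the agent names, instructions, and the state key are invented for illustration.

```python
# A fixed-order pipeline built from ADK workflow agents.
from google.adk.agents import LlmAgent, SequentialAgent

researcher = LlmAgent(
    name="researcher",
    model="gemini-2.0-flash",
    instruction="Gather key facts about the user's topic.",
    output_key="facts",  # store the reply in session state under 'facts'
)

writer = LlmAgent(
    name="writer",
    model="gemini-2.0-flash",
    instruction="Write a short article using the facts in {facts}.",
)

# The SequentialAgent always runs researcher first, then writer.
pipeline = SequentialAgent(name="research_pipeline", sub_agents=[researcher, writer])
```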
Google ADK has built-in Agent evaluation, which means you can assess agent performance systematically.
The framework includes tools to create multi-turn evaluation datasets, and run evaluations locally, through the CLI or UI, to measure quality and guide improvements.
Code Execution gives agents (usually via Tools) the ability to generate and execute code to perform complex calculations or actions.
Callbacks are custom code snippets you provide to run at specific points in the agent's process, allowing for checks, logging, or behavior modifications.
Google ADK deploys to Agent Engine, a fully managed Google Cloud service enabling developers to deploy, manage, and scale AI agents in production.
Agent Engine handles the infrastructure to scale agents in production, so you can focus on creating intelligent and impactful applications.
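As a rough sketch of that deployment path: the project, location, bucket, and requirements below are placeholders, root_agent is the ADK agent you want to deploy, and the exact API surface may differ across vertexai / google-cloud-aiplatform versions.

```python
# Hedged sketch of deploying an ADK agent to Agent Engine.
import vertexai
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

vertexai.init(
    project="my-project",                      # placeholder
    location="us-central1",                    # placeholder
    staging_bucket="gs://my-staging-bucket",   # placeholder
)

# Wrap the ADK agent (root_agent, defined elsewhere) for Agent Engine;
# enable_tracing sends traces to Cloud Trace.
app = reasoning_engines.AdkApp(agent=root_agent, enable_tracing=True)

remote_app = agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
)
```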
Planning is an advanced capability where agents break down complex goals into smaller steps and plan how to achieve them, for example with a ReAct-style planner, as sketched below.
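A hedged sketch of attaching a plan/ReAct-style planner to an agent; it assumes the google.adk.planners module and the planner parameter on LlmAgent, and the class name may differ across ADK versions.

```python
# Illustrative only: give an agent an explicit planning step.
from google.adk.agents import LlmAgent
from google.adk.planners import PlanReActPlanner

planning_agent = LlmAgent(
    name="planning_agent",
    model="gemini-2.0-flash",
    instruction="Break the user's request into steps, then carry them out.",
    planner=PlanReActPlanner(),
)
```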
As part of the interactive developer tooling, Google ADK provides tools to help you debug your agents, their interactions, and your multi-agent systems.
Your application traces will be collected by Cloud Trace, a tracing system that collects latency data from your distributed applications and displays it in the Google Cloud console.
Cloud Trace can capture traces from applications deployed on Agent Engine, and it can help you debug the different calls performed between your LLM agent and its tools before a response is returned to the user.
Finally, models are the underlying large language models, like Gemini or Claude, that power ADK's LLM agents, enabling their reasoning and language-understanding abilities.
While optimized for Google’s Gemini models, the framework is designed for flexibility, allowing integration with various LLMs, potentially including open-source or fine-tuned models, through its Base LLM interface.
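As a hedged illustration of that flexibility, the sketch below swaps the model string for a non-Gemini model via ADK's optional LiteLLM wrapper; the model identifier is a placeholder, and the relevant API keys are assumed to be configured in the environment.

```python
# Illustrative only: using a Gemini model vs. another LLM via LiteLlm.
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

gemini_agent = LlmAgent(
    name="gemini_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant.",
)

claude_agent = LlmAgent(
    name="claude_agent",
    model=LiteLlm(model="anthropic/claude-3-5-sonnet-20240620"),  # placeholder ID
    instruction="You are a helpful assistant.",
)
```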
An agent can execute the steps of a certain workflow to accomplish a goal, and can access any required external systems and tools to do so.
There are four main components for an agent: The models are used to reason over goals, determine the plan and generate a response.
An agent can use multiple models.
Tools are used to fetch data, perform actions or transactions by calling other APIs or services.
Orchestration is the mechanism for configuring the steps required to complete a task, the logic for working through those steps, and access to the required tools. It maintains memory and state, including the planning approach, any data provided or fetched, and the necessary tools.
The runtime is used to execute the system when it is invoked by a query from an end user.
Core Concepts of Agent Development Kit
Google ADK is built around a few core concepts that make it powerful and flexible:
Agent: Agents are core building blocks designed to accomplish specific tasks. They can be powered by LLMs to reason, plan, and utilize tools to achieve goals, and can even collaborate on complex projects.
Tools: Tools give agents abilities beyond conversation, letting them interact with external APIs, search information, run code, or call other services.
Session Services: Session services handle the context of a single conversation (Session), including its history (Events) and the agent's working memory for that conversation (State).
Callbacks: Custom code snippets you provide to run at specific points in the agent's process, allowing for checks, logging, or behavior modifications.
Artifact Management: Artifacts allow agents to save, load, and manage files or binary data (like images or PDFs) associated with a session or user.
Runner: The engine that manages the execution flow, orchestrates agent interactions based on Events, and coordinates with backend services.
InMemoryRunner()
The Runner is the code responsible for receiving the user's query, passing it to the appropriate agent, receiving the agent's response events and passing them back to the calling application or UI for rendering, and then triggering the next event.
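A minimal sketch of creating an in-memory runner; the agent definition and app name are placeholders standing in for your own root agent.

```python
# Illustrative runner setup with google-adk.
from google.adk.agents import Agent
from google.adk.runners import InMemoryRunner

# A placeholder agent; in the walkthrough this would be your own root agent.
root_agent = Agent(
    name="helpful_agent",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant.",
)

runner = InMemoryRunner(agent=root_agent, app_name="my_agent_app")
```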
runner.session_service.create_session()
Sessions allow an agent to preserve state, remembering a list of items, the current status of a task, or other 'current' information. This class creates a local session service for simplicity, but in production this could be handled by a database.
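A sketch of creating a session through the runner's bundled session service; the user ID is a placeholder, and create_session is assumed to be a coroutine as in recent ADK versions (older releases expose it synchronously).

```python
# Illustrative session creation via the runner's in-memory session service.
import asyncio


async def make_session():
    return await runner.session_service.create_session(
        app_name="my_agent_app",
        user_id="user_123",
    )


session = asyncio.run(make_session())
```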
types.Content() and types.Part()
Instead of a simple string, the agent is passed a Content object which can consist of multiple Parts. This allows for complex messages, including text and multimodal content to be passed to the agent in a specific order.
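Putting the pieces together, a hedged sketch of building a Content message and running one turn; it assumes the runner and session from the previous sketches, and the query text is a placeholder.

```python
# Illustrative single-turn run: structured input, streamed-back events.
from google.genai import types

content = types.Content(
    role="user",
    parts=[types.Part.from_text(text="Summarize my last support ticket.")],
)

for event in runner.run(
    user_id="user_123",
    session_id=session.id,
    new_message=content,
):
    # Each event carries the author (agent or user) and its content parts.
    if event.content and event.content.parts and event.content.parts[0].text:
        print(f"[{event.author}]: {event.content.parts[0].text}")
```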
When you ran the agent in the dev UI, it created a session service, artifact service, and runner for you. When you write your own agents to deploy programmatically, it is recommended that you provide these components as external services rather than relying on in-memory versions.