Architecture
Understanding Pygent's architecture helps in effectively customizing and extending the project. The system is composed of a few main components that work together.
Core Components
-
Agent
: TheAgent
is the central orchestrator. It maintains the conversation history, interacts with the language model to decide the next step, and dispatches calls to tools. Each agent has its own state, including its persona and enabled tools. -
Runtime
: TheRuntime
represents the isolated execution environment. It is responsible for executing commands (bash
), interacting with the file system (write_file
,read_file
), and managing the environment's lifecycle (e.g., a Docker container). If Docker is unavailable, theruntime
executes commands locally. Each agent has its ownruntime
instance, ensuring isolation between tasks. -
Model
: TheModel
is an interface (protocol) that abstracts communication with a language model (LLM). The default implementation,OpenAIModel
, interacts with OpenAI-compatible APIs. You can provide your own implementation to connect to different model backends. -
TaskManager
: TheTaskManager
manages the execution of background tasks. When you use thedelegate_task
tool, theTaskManager
creates a newAgent
with its ownRuntime
to execute the task asynchronously. This allows the main agent to continue its work or monitor the subtask's progress.
Request Flow
- The user sends a message to the
Agent
via the CLI or API. - The
Agent
adds the user's message to the conversation history. - The
Agent
sends the complete history to theModel
. - The
Model
returns a response, which can be a text message or a request to call one or more tools (tool_calls
). - If it's a text message, the
Agent
displays it to the user. - If it's a tool call, the
Agent
invokes the corresponding function (e.g.,tools._bash
), passing the necessary arguments to theRuntime
. - The
Runtime
executes the action (e.g., anls
command in the Docker container). - The result of the execution is returned to the
Agent
. - The
Agent
adds the tool's result to the history and typically calls theModel
again so it can process the result and decide the next step, continuing the cycle until the task is completed (signaled by thestop
tool).