Architecture
Understanding Pygent's architecture helps in effectively customizing and extending the project. The system is composed of a few main components that work together.
Core Components
-
Agent: TheAgentis the central orchestrator. It maintains the conversation history, interacts with the language model to decide the next step, and dispatches calls to tools. Each agent has its own state, including its persona and enabled tools. -
Runtime: TheRuntimerepresents the isolated execution environment. It is responsible for executing commands (bash), interacting with the file system (write_file,read_file), and managing the environment's lifecycle (e.g., a Docker container). If Docker is unavailable, theruntimeexecutes commands locally. Each agent has its ownruntimeinstance, ensuring isolation between tasks. -
Model: TheModelis an interface (protocol) that abstracts communication with a language model (LLM). The default implementation,OpenAIModel, interacts with OpenAI-compatible APIs. You can provide your own implementation to connect to different model backends. -
TaskManager: TheTaskManagermanages the execution of background tasks. When you use thedelegate_tasktool, theTaskManagercreates a newAgentwith its ownRuntimeto execute the task asynchronously. This allows the main agent to continue its work or monitor the subtask's progress.
Request Flow
- The user sends a message to the
Agentvia the CLI or API. - The
Agentadds the user's message to the conversation history. - The
Agentsends the complete history to theModel. - The
Modelreturns a response, which can be a text message or a request to call one or more tools (tool_calls). - If it's a text message, the
Agentdisplays it to the user. - If it's a tool call, the
Agentinvokes the corresponding function (e.g.,tools._bash), passing the necessary arguments to theRuntime. - The
Runtimeexecutes the action (e.g., anlscommand in the Docker container). - The result of the execution is returned to the
Agent. - The
Agentadds the tool's result to the history and typically calls theModelagain so it can process the result and decide the next step, continuing the cycle until the task is completed (signaled by thestoptool).