OpenManus Project Structure

Core Architecture

OpenManus uses an agent-based architecture in which each agent class inherits from, and extends, the one before it:

  1. BaseAgent (app/agent/base.py): The foundation class that provides:
    • Memory management
    • State transitions
    • Execution loop control
    • Stuck state detection
  2. ReActAgent (app/agent/react.py): Extends BaseAgent with:
    • Think-Act cycle pattern
    • Abstract methods for thinking and acting
  3. ToolCallAgent (app/agent/toolcall.py): Extends ReActAgent with:
    • Tool/function calling capabilities
    • Tool execution handling
    • Response processing
  4. BrowserAgent (app/agent/browser.py): Extends ToolCallAgent with:
    • Browser control capabilities
    • Screenshot handling
    • Browser state management
  5. Manus (app/agent/manus.py): The main agent class that extends BrowserAgent with:
    • Additional tools (Python execution, browser use, string replacement)
    • Context-aware thinking process
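The first three layers of this chain can be sketched in a few lines of Python. This is a simplified illustration, not the OpenManus source: the method names (run, step, think, act) and the "tool_name: argument" parsing are assumptions made for the example, and a real ToolCallAgent would ask the LLM which tool to call instead of parsing the request.

```python
from abc import ABC, abstractmethod
from typing import Callable, Optional


class BaseAgent(ABC):
    """Simplified stand-in: owns the memory and drives execution."""

    def __init__(self) -> None:
        self.memory: list[str] = []

    @abstractmethod
    def step(self) -> str:
        """Perform one unit of work and return its result."""

    def run(self, request: str) -> str:
        # Record the request, then delegate to the subclass-defined step.
        self.memory.append(request)
        return self.step()


class ReActAgent(BaseAgent):
    """Splits each step into think() followed by act()."""

    @abstractmethod
    def think(self) -> bool:
        """Decide whether an action is needed."""

    @abstractmethod
    def act(self) -> str:
        """Carry out the decided action."""

    def step(self) -> str:
        if not self.think():
            return "no action needed"
        return self.act()


class ToolCallAgent(ReActAgent):
    """Implements think/act by selecting and calling a named tool."""

    def __init__(self, tools: dict[str, Callable[[str], str]]) -> None:
        super().__init__()
        self.tools = tools
        self.pending: Optional[tuple[str, str]] = None

    def think(self) -> bool:
        # Toy decision rule (assumption): the request looks like "tool_name: argument".
        name, _, arg = self.memory[-1].partition(":")
        name = name.strip()
        self.pending = (name, arg.strip()) if name in self.tools else None
        return self.pending is not None

    def act(self) -> str:
        name, arg = self.pending
        return self.tools[name](arg)
```

Calling ToolCallAgent({"upper": str.upper}).run("upper: hello") walks through all three layers: BaseAgent.run stores the request, ReActAgent.step sequences think and act, and ToolCallAgent supplies the tool-specific behavior.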

Main Components

  1. Entry Points:
    • main.py: Simple CLI interface to run the Manus agent
    • run_mcp.py: Alternative entry point that runs the agent with MCP (Model Context Protocol) tools
    • run_flow.py: Experimental multi-agent version
  2. Tools System (app/tool/):
    • browser_use_tool.py: Web browser automation
    • python_execute.py: Safe Python code execution
    • str_replace_editor.py: Text editing through string replacement
  3. LLM Integration (app/llm.py):
    • Handles communication with language models
    • Manages tool calling format
  4. Memory and State Management (app/schema.py):
    • Defines message structure
    • Maintains agent state
    • Handles tool calls
  5. Configuration (app/config.py):
    • Manages system settings
    • Browser configuration
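To make the memory and state component more concrete, here is a rough sketch of the kind of data structures such a schema module might define. The field names, the state values, and the use of dataclasses are all assumptions for illustration; the actual definitions in app/schema.py may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class AgentState(str, Enum):
    """Lifecycle states an agent can be in (assumed set of values)."""
    IDLE = "IDLE"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    ERROR = "ERROR"


@dataclass
class ToolCall:
    """A request from the model to invoke one tool."""
    id: str
    name: str
    arguments: str  # JSON-encoded arguments, kept as a string in this sketch


@dataclass
class Message:
    """One entry in the conversation."""
    role: str  # "system", "user", "assistant", or "tool"
    content: Optional[str] = None
    tool_calls: list[ToolCall] = field(default_factory=list)


@dataclass
class Memory:
    """Bounded conversation history."""
    messages: list[Message] = field(default_factory=list)
    max_messages: int = 100

    def add(self, message: Message) -> None:
        self.messages.append(message)
        # Keep only the most recent messages so the context stays bounded.
        self.messages = self.messages[-self.max_messages:]
```

Capping the history in Memory.add is one simple way to keep the prompt from growing without limit over a long run.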

Execution Flow

When you run python main.py:

  1. The script creates a Manus agent instance
  2. It prompts the user for input
  3. The agent processes the input through its run method
  4. The agent executes a series of think-act cycles:
    • Think: Processes the current state and decides what to do
    • Act: Executes the decided actions using tools
  5. The agent continues until it reaches a conclusion or hits the maximum step limit
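The think-act loop described in steps 4 and 5 can be sketched as a plain function. Everything here is hypothetical: the run_loop name, its parameters, and the stuck-detection heuristic (the same decision repeated several times in a row) are assumptions chosen to illustrate the control flow, not the logic OpenManus actually uses.

```python
from typing import Callable, Optional


def run_loop(
    think: Callable[[list[str]], Optional[str]],
    act: Callable[[str], str],
    request: str,
    max_steps: int = 10,
    duplicate_threshold: int = 2,
) -> list[str]:
    """Alternate think and act until think returns None or max_steps is hit."""
    history = [f"user: {request}"]
    recent_decisions: list[str] = []
    for step in range(1, max_steps + 1):
        decision = think(history)
        if decision is None:  # the agent reached a conclusion
            history.append(f"finished after {step - 1} step(s)")
            return history
        # Stuck detection (assumed heuristic): the same decision made
        # more than duplicate_threshold times in a row.
        recent_decisions.append(decision)
        tail = recent_decisions[-(duplicate_threshold + 1):]
        if len(tail) > duplicate_threshold and len(set(tail)) == 1:
            history.append("system: repeated decision detected, try a new strategy")
            recent_decisions.clear()
            continue
        history.append(f"tool: {act(decision)}")
    history.append(f"terminated: reached max steps ({max_steps})")
    return history
```

Passing think and act in as callables keeps the loop itself independent of any particular model or tool set, which makes the two exit conditions (conclusion reached, step limit hit) easy to see and test.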

Key Features

  1. Browser Automation: Can navigate websites, interact with elements, and extract content
  2. Python Code Execution: Can run Python code in a sandboxed environment
  3. Planning Capabilities: Uses the LLM for planning and reasoning
  4. Error Handling: Detects stuck states and tries to recover from them
  5. Extensibility: Modular design allows for adding new tools and capabilities
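The extensibility point is easier to picture with a small example. The sketch below shows one common way a modular tool system can be organized: a shared base class plus a registry keyed by tool name. The class and method names are illustrative and may not match the ones in app/tool/, and WordCount is a made-up tool.

```python
from abc import ABC, abstractmethod
from typing import Any


class BaseTool(ABC):
    """Common interface every tool implements (illustrative)."""

    name: str
    description: str

    @abstractmethod
    def execute(self, **kwargs: Any) -> str:
        """Run the tool and return its output as text."""


class WordCount(BaseTool):
    """Made-up example tool: counts whitespace-separated words."""

    name = "word_count"
    description = "Count the whitespace-separated words in a text."

    def execute(self, **kwargs: Any) -> str:
        return str(len(str(kwargs.get("text", "")).split()))


class ToolCollection:
    """Registry that looks tools up by name and dispatches calls to them."""

    def __init__(self, *tools: BaseTool) -> None:
        self._tools = {tool.name: tool for tool in tools}

    def add(self, tool: BaseTool) -> None:
        self._tools[tool.name] = tool

    def execute(self, name: str, **kwargs: Any) -> str:
        tool = self._tools.get(name)
        if tool is None:
            # Return the error as text so the caller can feed it back to the model.
            return f"Error: unknown tool '{name}'"
        return tool.execute(**kwargs)
```

With this layout, adding a capability means writing one new subclass and registering it; the agent loop that dispatches tool calls does not need to change.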

How to Use OpenManus

As described in README.md, you can run OpenManus in several ways:

  1. Basic usage: python main.py
  2. MCP tool version: python run_mcp.py
  3. Multi-agent version: python run_flow.py

The agent is designed to handle a wide range of tasks through natural language instructions, leveraging its tools to accomplish complex goals.
