This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Windows-MCP is a Python MCP (Model Context Protocol) server that bridges AI LLM agents with the Windows OS, enabling direct desktop automation. It exposes 16 tools (App, Shell, Screenshot, Snapshot, Click, Type, Scroll, Move, Shortcut, Wait, Scrape, MultiSelect, MultiEdit, Clipboard, Process, Notification) via FastMCP.
uv sync # Install dependencies
uv run windows-mcp # Run the MCP server
ruff format . # Format code
ruff check . # Lint code
ruff check --fix . # Lint and auto-fix
pytest # Run all tests (if tests/ exists)
pytest tests/test_foo.py # Run a single test filePackage manager: UV (not pip). Python: 3.13+. Build backend: Hatchling.
The codebase follows a layered service architecture under src/windows_mcp/:
Entry point — __main__.py: Registers all 15 MCP tools on a FastMCP server instance. Uses an async lifespan to initialize Desktop, WatchDog, and Analytics services. Each tool function delegates to Desktop methods. The @with_analytics decorator wraps tools for telemetry.
Desktop service — desktop/service.py: High-level orchestrator. Manages window operations (launch, resize, switch), screenshots, mouse/keyboard actions, and clipboard. Interfaces with Tree service for UI element discovery. desktop/views.py defines data models: DesktopState, Window, Size, BoundingBox, Status.
Tree service — tree/service.py: Captures the Windows accessibility tree from active and background windows. Identifies interactive elements and scrollable areas. Uses ThreadPoolExecutor for multi-threaded UI traversal. tree/views.py defines TreeElementNode, ScrollElementNode, TreeState. tree/config.py has control type configurations.
UIAutomation wrapper — uia/: Low-level abstraction over the Windows UIAutomation COM API via comtypes. core.py wraps the main automation object, controls.py has control-specific logic, patterns.py wraps UIAutomation patterns, enums.py has COM enumerations, events.py handles event subscriptions.
WatchDog — watchdog/service.py: Runs in a separate thread monitoring UI focus changes via UIAutomation events. Notifies the Tree service of focus changes so the accessibility tree stays current.
Virtual Desktop Manager — vdm/core.py: Tracks which windows belong to which Windows virtual desktop (Win10/11).
Analytics — analytics.py: Optional PostHog telemetry (disabled with ANONYMIZED_TELEMETRY=false env var). Tracks tool names and errors only, not arguments or outputs.
- Formatter/linter: Ruff (line length 100, double quotes)
- Naming: PEP 8 —
snake_casefunctions/variables,PascalCaseclasses,UPPER_CASEconstants - Type hints required on function signatures
- Google-style docstrings for public functions/classes
- Screenshots are capped to 1920x1080 for token efficiency
- Mouse/keyboard input uses UIA (same coordinate space as BoundingRectangle; no DPI mismatch)
- Screenshot is the preferred fast visual-context tool; Snapshot is the heavier path for UI element ids and DOM extraction
- Browser detection (Chrome, Edge, Firefox) triggers special DOM extraction mode in Snapshot
- Fuzzy string matching (
thefuzz) is used for element name matching - UI element fetching has retry logic (
THREAD_MAX_RETRIES=3in tree service) - The server supports stdio, SSE, and streamable HTTP transports
This server has full system access with no sandboxing. Tools like Shell and App can perform irreversible operations. The recommended deployment target is a VM or Windows Sandbox.