An autonomous QA agent that lives in Slack. Send a message like "Test the login flow" and a multi-agent pipeline explores your codebase, generates Playwright tests, runs them in Docker, auto-heals failures, and posts the results, with video, back to your Slack thread.
User message in Slack
→ Planner (explores repo via GitHub MCP, outputs test plan)
→ Generator (writes .spec.ts files, verifies selectors via Playwright MCP)
→ Test Runner (executes in Docker, records video)
→ Healer (patches failures, re-runs, classifies bugs)
→ Reporter (posts results + video to Slack thread)
View full architecture diagram on Excalidraw
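The flow above can be read as a chain of stages that pass a shared context along. The sketch below is only an illustration of that shape, not the project's actual code: `RunContext`, `Stage`, `runPipeline`, and `stubStage` are hypothetical names, and the stub stages just record that they ran instead of calling an LLM or Docker.

```typescript
// Hypothetical shared state handed from stage to stage; field names are illustrative.
interface RunContext {
  request: string;
  log: string[];
}

type Stage = {
  name: string;
  run: (ctx: RunContext) => Promise<RunContext>;
};

// Run stages strictly in order; each stage receives the previous stage's output.
async function runPipeline(stages: Stage[], initial: RunContext): Promise<RunContext> {
  let ctx = initial;
  for (const stage of stages) {
    ctx = await stage.run(ctx);
  }
  return ctx;
}

// Placeholder stage that only appends its name to the log.
function stubStage(name: string): Stage {
  return {
    name,
    run: async (ctx) => ({ ...ctx, log: [...ctx.log, name] }),
  };
}

const pipelineStages: Stage[] = [
  "Planner",
  "Generator",
  "Test Runner",
  "Healer",
  "Reporter",
].map(stubStage);
```

Calling `runPipeline(pipelineStages, { request: "Test the login flow", log: [] })` resolves to a context whose `log` lists the five stage names in order.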
- Node.js 20+
- Docker running locally
- A Slack app with Socket Mode enabled
- GitHub PAT with repo read access
- OpenRouter or Anthropic API key
# Install dependencies
npm install
# Build the Docker image for the test runner
docker build -t qa-agent-playwright docker/playwright/
# Copy and fill in environment variables
cp .env.example .env

| Variable | Description |
|---|---|
| SLACK_BOT_TOKEN | xoxb-... from your Slack app |
| SLACK_SIGNING_SECRET | Slack app signing secret |
| SLACK_APP_TOKEN | xapp-... for Socket Mode |
| SLACK_TEAM_ID | Your Slack workspace ID |
| ANTHROPIC_API_KEY | Claude API key |
| OPENROUTER_API_KEY | OpenRouter API key (alternative to ANTHROPIC_API_KEY) |
| GITHUB_TOKEN | GitHub PAT (repo or contents:read scope) |
| GITHUB_REPO | Target repo, e.g. yourorg/yourapp |
| APP_BASE_URL | URL of the app to test, e.g. https://staging.yourapp.com |
Optional variables include PLAYWRIGHT_STORAGE_STATE (path to a pre-authenticated session), MAX_HEAL_ATTEMPTS (default 3), and TEST_TIMEOUT_MS (default 300000). See config.ts for the rest.
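To make the defaults concrete, here is a minimal sketch of how such variables could be read and validated. It is not the contents of config.ts: `AgentConfig`, `intFromEnv`, and `loadConfig` are hypothetical names, and the validation rules (positive integers only, APP_BASE_URL required) are assumptions.

```typescript
// Hypothetical config shape; the real config.ts may differ.
interface AgentConfig {
  appBaseUrl: string;
  maxHealAttempts: number;
  testTimeoutMs: number;
  storageStatePath: string | undefined;
}

type Env = Record<string, string | undefined>;

// Parse an optional positive integer, falling back to a default when the variable is unset or blank.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function loadConfig(env: Env): AgentConfig {
  const appBaseUrl = env["APP_BASE_URL"];
  if (!appBaseUrl) throw new Error("APP_BASE_URL is required");
  return {
    appBaseUrl,
    maxHealAttempts: intFromEnv(env, "MAX_HEAL_ATTEMPTS", 3), // default from the README
    testTimeoutMs: intFromEnv(env, "TEST_TIMEOUT_MS", 300000), // default from the README
    storageStatePath: env["PLAYWRIGHT_STORAGE_STATE"] || undefined,
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.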
Your Slack app needs these bot token scopes: chat:write, files:write, channels:history, app_mentions:read.
# Development (hot reload)
npm run dev
# Production
npm run build && npm start

Send a message in your Slack channel:
@QA_AGENT Test the login flow
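When a bot is mentioned, Slack delivers the message text with the mention encoded as `<@USERID>` rather than the display name. The sketch below shows one way the request text could be separated from the mention. `extractRequest` is a hypothetical helper, not a function from this project.

```typescript
// Remove Slack user mentions such as "<@U0123ABCD>" and collapse leftover whitespace.
function extractRequest(eventText: string): string {
  return eventText
    .replace(/<@[A-Z0-9]+(\|[^>]*)?>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}
```

For example, `extractRequest("<@U0123ABCD> Test the login flow")` returns `"Test the login flow"`.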
The agent replies in a thread with progress updates:
- HEALTH_CHECK — verifies the target site is reachable
- PLANNING — explores the repo, posts the Markdown test plan
- GENERATING — creates .spec.ts files with verified selectors
- RUNNING — executes tests in Docker with video recording
- HEALING — auto-patches failures (wrong selectors, missing waits, etc.)
- REPORTING — posts pass/fail summary, healed tests, app bugs, and video
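The HEALING step is essentially a bounded patch-and-re-run loop. The following is a minimal sketch of that idea under stated assumptions: `runWithHealing`, `TestResult`, and `HealOutcome` are hypothetical names, the test runner and patcher are passed in as callbacks, and the default of 3 attempts mirrors MAX_HEAL_ATTEMPTS. How the real Healer decides that a failure is an app bug is not shown here.

```typescript
type TestResult = { passed: boolean; error?: string };

interface HealOutcome {
  status: "passed" | "healed" | "failed";
  attempts: number; // number of patch attempts that were made
}

// Re-run a test after each patch until it passes or the attempt budget is spent.
async function runWithHealing(
  runTest: () => Promise<TestResult>,
  patchTest: (error: string) => Promise<void>,
  maxHealAttempts = 3,
): Promise<HealOutcome> {
  let result = await runTest();
  if (result.passed) return { status: "passed", attempts: 0 };

  for (let attempt = 1; attempt <= maxHealAttempts; attempt++) {
    await patchTest(result.error ?? "unknown failure");
    result = await runTest();
    if (result.passed) return { status: "healed", attempts: attempt };
  }
  // Still failing after every attempt: a candidate for manual review or an app bug report.
  return { status: "failed", attempts: maxHealAttempts };
}
```

Separating "passed" from "healed" lets a report list which tests needed patching, in line with the summary described above.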
To cancel a run, reply cancel or stop in the thread.
To resume a run that failed partway through, reply resume, retry, or continue in the thread.
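The thread keywords could be recognized with a small lookup like the one below. This is a sketch, not the agent's actual parser: `parseThreadCommand` is a hypothetical name, and matching the whole reply case-insensitively after trimming is an assumption about how strict the agent is.

```typescript
type ThreadCommand = "cancel" | "resume" | null;

const CANCEL_WORDS = new Set(["cancel", "stop"]);
const RESUME_WORDS = new Set(["resume", "retry", "continue"]);

// Map a thread reply to a control command, or null when it is ordinary conversation.
function parseThreadCommand(text: string): ThreadCommand {
  const word = text.trim().toLowerCase();
  if (CANCEL_WORDS.has(word)) return "cancel";
  if (RESUME_WORDS.has(word)) return "resume";
  return null;
}
```

Requiring the reply to consist of only the keyword avoids cancelling a run when someone writes a longer sentence that happens to contain "stop".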