This AWS Labs Model Context Protocol (MCP) server for CloudWatch lets your troubleshooting agents use CloudWatch data for AI-powered root cause analysis and recommendations. It offers observability tools that simplify monitoring and help teams quickly diagnose and resolve service issues. The server gives AI agents access to CloudWatch telemetry data through standardized MCP interfaces, eliminating the need for custom API integrations and reducing context switching during troubleshooting workflows. By consolidating access to CloudWatch capabilities, it enables cross-service correlations and insights that accelerate incident resolution and improve operational visibility.
The CloudWatch MCP Server provides specialized tools for common operational scenarios, including alarm troubleshooting, metric definition lookup, alarm recommendations, and log analysis. Each tool wraps one or more CloudWatch APIs in a task-oriented operation.
Alarm Based Troubleshooting - Identifies active alarms, retrieves related metrics and logs, and analyzes historical alarm patterns to determine root causes of triggered alerts. Provides context-aware recommendations for remediation.
Log Analyzer - Analyzes a CloudWatch log group for anomalies, message patterns, and error patterns within a specified time window.
Metric Definition Analyzer - Provides comprehensive descriptions of what metrics represent, how they're calculated, and which statistics are recommended for metric data retrieval.
Alarm Recommendations - Suggests recommended alarm configurations for CloudWatch metrics, including thresholds, evaluation periods, and other alarm settings.
- An AWS account with CloudWatch Telemetry
- This MCP server can only be run locally on the same host as your LLM client.
- Set up AWS credentials with access to AWS services
- You need an AWS account with appropriate permissions (See required permissions below)
- Configure AWS credentials with `aws configure` or environment variables
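As a quick pre-flight check before launching the server, the sketch below reports which credential source the environment appears to provide. It looks only at the standard `AWS_PROFILE`, `AWS_ACCESS_KEY_ID`, and `AWS_SECRET_ACCESS_KEY` environment variables and the default `~/.aws/credentials` file. The function name `detect_credential_source` is hypothetical and not part of this server, and the AWS SDK resolves credentials from more sources than shown here.

```python
import os
from pathlib import Path


def detect_credential_source(env=None, credentials_file=None):
    """Return a short label for the first credential source found, or None.

    Illustrative only: the AWS SDK checks additional sources (SSO, instance
    roles, container credentials) that this sketch does not cover.
    """
    env = os.environ if env is None else env
    if credentials_file is None:
        credentials_file = Path.home() / ".aws" / "credentials"

    # Static keys in the environment are checked first in this sketch.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment-keys"
    # A named profile, as passed through the "env" block of an MCP config.
    if env.get("AWS_PROFILE"):
        return f"profile:{env['AWS_PROFILE']}"
    # Fall back to the shared credentials file written by `aws configure`.
    if Path(credentials_file).is_file():
        return "shared-credentials-file"
    return None


if __name__ == "__main__":
    print(detect_credential_source() or "no credentials detected")
```

A `None` result suggests running `aws configure` or exporting the environment variables before starting the LLM client.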
- `get_metric_data` - Retrieves detailed CloudWatch metric data for any CloudWatch metric. Use this for general CloudWatch metrics that aren't specific to Application Signals. Provides ability to query any metric namespace, dimension, and statistic
- `get_metric_metadata` - Retrieves comprehensive metadata about a specific CloudWatch metric
- `get_recommended_metric_alarms` - Gets recommended alarms for a CloudWatch metric based on best practice, and trend, seasonality and statistical analysis.
- `analyze_metric` - Analyzes CloudWatch metric data to determine trend, seasonality, and statistical properties

- `get_active_alarms` - Identifies currently active CloudWatch alarms across the account
- `get_alarm_history` - Retrieves historical state changes and patterns for a given CloudWatch alarm

- `describe_log_groups` - Finds metadata about CloudWatch log groups
- `analyze_log_group` - Analyzes CloudWatch logs for anomalies, message patterns, and error patterns
- `execute_log_insights_query` - Executes CloudWatch Logs insights query on CloudWatch log group(s) with specified time range and query syntax, returns a unique ID used to retrieve results
- `execute_cwl_insights_batch` - Runs a Logs Insights query across multiple log groups and regions in a single call, automatically chunking log groups (max 50 per query), throttling concurrency (max 7 per region), polling for completion, retrying failures, and splitting time ranges when hitting the 10,000-record or timeout limits. Returns one merged result set annotated with region, log group, and optional account labels. See the `execute_cwl_insights_batch` examples below.
- `get_logs_insight_query_results` - Retrieves the results of an executed CloudWatch insights query using the query ID. It is used after `execute_log_insights_query` has been called
- `cancel_logs_insight_query` - Cancels in progress CloudWatch logs insights query
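The chunking and throttling behavior described for `execute_cwl_insights_batch` can be illustrated with a small asyncio sketch. It is not the server's implementation: `chunk`, `run_region`, and `fake_query` are hypothetical names, and only the two limits (50 log groups per query, 7 concurrent queries per region) come from the description above.

```python
import asyncio

MAX_LOG_GROUPS_PER_QUERY = 50   # limit stated in the tool description
MAX_CONCURRENT_PER_REGION = 7   # limit stated in the tool description


def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


async def run_region(region, log_groups, run_query):
    """Run one query per chunk in a region, with at most 7 in flight at once."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_PER_REGION)

    async def guarded(batch):
        # The semaphore blocks here once 7 queries are already running.
        async with semaphore:
            return await run_query(region, batch)

    batches = chunk(log_groups, MAX_LOG_GROUPS_PER_QUERY)
    return await asyncio.gather(*(guarded(batch) for batch in batches))


async def fake_query(region, batch):
    """Hypothetical stand-in for a real Logs Insights call."""
    await asyncio.sleep(0)
    return {"region": region, "log_groups": len(batch)}


if __name__ == "__main__":
    groups = [f"/aws/lambda/fn-{i}" for i in range(120)]
    print(asyncio.run(run_region("us-east-1", groups, fake_query)))
```

With 120 log groups, the sketch issues three queries covering 50, 50, and 20 groups.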
Basic usage:
result = await execute_cwl_insights_batch(
    ctx,
    log_group_names=['/aws/lambda/my-app'],  # Log group names (or ARNs for cross-account/region)
    regions=['us-east-1', 'us-west-2', 'eu-west-1'],  # Regions to query
    start_time='2025-04-19T20:00:00+00:00',  # ISO 8601 start time with timezone
    end_time='2025-04-19T21:00:00+00:00',  # ISO 8601 end time with timezone
    query_string='fields @timestamp, @message | filter @message like /ERROR/ | limit 100'  # Logs Insights query
)
print(f"Found {result.summary.total_records_returned} errors across {result.summary.total_regions} regions")
for warning in result.summary.warnings:
    print(f"Warning: {warning}")

Cross-account/cross-region query using log group ARNs:
# When querying log groups in different accounts or regions, use ARN format:
# arn:aws:logs:<region>:<account-id>:log-group:<log-group-name>
result = await execute_cwl_insights_batch(
    ctx,
    log_group_names=[
        'arn:aws:logs:us-east-1:123456789012:log-group:/aws/ecs/my-service',  # Source account log group ARN
        'arn:aws:logs:eu-west-1:123456789012:log-group:/aws/ecs/my-service'  # Different region
    ],
    regions=['us-east-1'],  # Monitoring account region
    start_time='2025-04-19T00:00:00+00:00',
    end_time='2025-04-19T23:59:59+00:00',
    query_string='fields @timestamp, @message | filter level = "ERROR" | stats count() by bin(5m)',
    account_label='prod-123456789012',  # Optional label for result annotation
    profile_name='prod-readonly'  # AWS profile with cross-account access
)

Performance tips:
- Use the `limit` parameter or `| limit N` in the query to control result size
- Narrow time ranges for faster queries
- The tool automatically splits time ranges when hitting the 10,000-record limit
- Monitor `summary.warnings` for optimization suggestions
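The automatic splitting mentioned in the tips can be pictured as recursive bisection: when a window would reach the record cap, it is halved and each half is queried again. The sketch below is an assumption about the general technique, not the server's actual code. `split_until_under_limit` and `uniform_count` are hypothetical names, and `count_records` stands in for a real query.

```python
from datetime import datetime, timedelta, timezone

RECORD_LIMIT = 10_000  # per-query record cap stated in the tool description


def split_until_under_limit(start, end, count_records, min_window=timedelta(seconds=1)):
    """Return (start, end) windows whose record counts fall below the cap.

    `count_records(start, end)` is a hypothetical callback returning how many
    records a query over that window would produce.
    """
    if count_records(start, end) < RECORD_LIMIT or end - start <= min_window:
        return [(start, end)]
    midpoint = start + (end - start) / 2
    return (
        split_until_under_limit(start, midpoint, count_records, min_window)
        + split_until_under_limit(midpoint, end, count_records, min_window)
    )


def uniform_count(start, end):
    """Toy data source: a steady 5 records per second."""
    return int((end - start).total_seconds() * 5)


if __name__ == "__main__":
    t0 = datetime(2025, 4, 19, 20, 0, tzinfo=timezone.utc)
    t1 = t0 + timedelta(hours=1)
    for window_start, window_end in split_until_under_limit(t0, t1, uniform_count):
        print(window_start.isoformat(), "->", window_end.isoformat())
```

Narrowing the requested time range up front avoids these extra round trips, which is why the tips recommend it.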
Common errors and solutions:
- `Invalid ISO 8601 timestamp`: Ensure timestamps include timezone (e.g., `+00:00`)
- `start_time must be before end_time`: Check time range order
- `Query failed... bad query syntax`: Verify query syntax at AWS Logs Insights docs
- Large result warnings: Add `| limit N` to query or use smaller time ranges
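The first two errors can be caught before calling the tool. The following is a minimal client-side sketch using only the standard library; `validate_time_range` is a hypothetical helper, and its messages are modeled on the errors listed above rather than copied from the server.

```python
from datetime import datetime


def validate_time_range(start_time, end_time):
    """Parse two ISO 8601 strings and enforce timezone and ordering rules.

    Raises ValueError with messages modeled on the errors above; the exact
    wording used by the server may differ.
    """
    parsed = []
    for label, value in (("start_time", start_time), ("end_time", end_time)):
        try:
            moment = datetime.fromisoformat(value)
        except ValueError as exc:
            raise ValueError(f"Invalid ISO 8601 timestamp for {label}: {value!r}") from exc
        # A naive datetime means the string carried no offset such as +00:00.
        if moment.tzinfo is None:
            raise ValueError(f"Invalid ISO 8601 timestamp for {label}: missing timezone offset")
        parsed.append(moment)
    start, end = parsed
    if start >= end:
        raise ValueError("start_time must be before end_time")
    return start, end


if __name__ == "__main__":
    print(validate_time_range("2025-04-19T20:00:00+00:00", "2025-04-19T21:00:00+00:00"))
```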
- `cloudwatch:DescribeAlarms`
- `cloudwatch:DescribeAlarmHistory`
- `cloudwatch:GetMetricData`
- `cloudwatch:ListMetrics`
- `logs:DescribeLogGroups`
- `logs:DescribeQueryDefinitions`
- `logs:ListLogAnomalyDetectors`
- `logs:ListAnomalies`
- `logs:StartQuery`
- `logs:GetQueryResults`
- `logs:StopQuery`
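One way to turn the permission list into an IAM policy document is sketched below. The policy structure follows the standard IAM JSON format. The `"Resource": "*"` value is an assumption made for brevity, and `build_policy` is a hypothetical helper, not something shipped with this server.

```python
import json

# Actions copied from the permission list above.
REQUIRED_ACTIONS = [
    "cloudwatch:DescribeAlarms",
    "cloudwatch:DescribeAlarmHistory",
    "cloudwatch:GetMetricData",
    "cloudwatch:ListMetrics",
    "logs:DescribeLogGroups",
    "logs:DescribeQueryDefinitions",
    "logs:ListLogAnomalyDetectors",
    "logs:ListAnomalies",
    "logs:StartQuery",
    "logs:GetQueryResults",
    "logs:StopQuery",
]


def build_policy(actions=None):
    """Return an IAM policy document that allows the listed actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(REQUIRED_ACTIONS if actions is None else actions),
                # Assumption: "*" keeps the sketch short; scope it down where possible.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=2))
```

The printed JSON can be pasted into the IAM console or saved to a file for review before it is attached to a role or user.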
- Install `uv` from Astral or the GitHub README
- Install Python using `uv python install 3.10`
- For Kiro, update MCP Config (~/.kiro/settings/mcp.json)
- For Cline, click the "Configure MCP Servers" option in the MCP tab
{
  "mcpServers": {
    "awslabs.cloudwatch-mcp-server": {
      "autoApprove": [],
      "disabled": false,
      "command": "uvx",
      "args": [
        "awslabs.cloudwatch-mcp-server@latest"
      ],
      "env": {
        "AWS_PROFILE": "[The AWS Profile Name to use for AWS access]",
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "transportType": "stdio"
    }
  }
}

For Windows users, the MCP server configuration format is slightly different:
{
  "mcpServers": {
    "awslabs.cloudwatch-mcp-server": {
      "disabled": false,
      "timeout": 60,
      "type": "stdio",
      "command": "uv",
      "args": [
        "tool",
        "run",
        "--from",
        "awslabs.cloudwatch-mcp-server@latest",
        "awslabs.cloudwatch-mcp-server.exe"
      ],
      "env": {
        "FASTMCP_LOG_LEVEL": "ERROR",
        "AWS_PROFILE": "your-aws-profile",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

Please reference AWS documentation to create and manage your credentials profile.
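If the client's MCP config file already lists other servers, the entry can be merged in rather than pasted over the whole file. The sketch below mirrors the two configurations shown above. `server_entry` and `merge_into_config` are hypothetical helpers, and the target path (for example `~/.kiro/settings/mcp.json`) depends on the client in use.

```python
import json
from pathlib import Path

SERVER_NAME = "awslabs.cloudwatch-mcp-server"


def server_entry(profile, windows=False):
    """Build the server entry, mirroring the two configurations shown above."""
    if windows:
        return {
            "disabled": False,
            "timeout": 60,
            "type": "stdio",
            "command": "uv",
            "args": ["tool", "run", "--from", f"{SERVER_NAME}@latest", f"{SERVER_NAME}.exe"],
            "env": {"FASTMCP_LOG_LEVEL": "ERROR", "AWS_PROFILE": profile, "AWS_REGION": "us-east-1"},
        }
    return {
        "autoApprove": [],
        "disabled": False,
        "command": "uvx",
        "args": [f"{SERVER_NAME}@latest"],
        "env": {"AWS_PROFILE": profile, "FASTMCP_LOG_LEVEL": "ERROR"},
        "transportType": "stdio",
    }


def merge_into_config(config_path, profile, windows=False):
    """Add or replace the CloudWatch entry while keeping other servers intact."""
    path = Path(config_path)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("mcpServers", {})[SERVER_NAME] = server_entry(profile, windows)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For example, `merge_into_config(Path.home() / ".kiro" / "settings" / "mcp.json", "my-profile")` would update a Kiro config in place, where `my-profile` is a placeholder profile name.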
Build and install the Docker image locally on the same host as your LLM client.
- Install Docker
- `git clone https://github.com/awslabs/mcp.git`
- Go to sub-directory `cd src/cloudwatch-mcp-server/`
- Run `docker build -t awslabs/cloudwatch-mcp-server:latest .`
{
  "mcpServers": {
    "awslabs.cloudwatch-mcp-server": {
      "command": "docker",
      "args": [
        "run",
        "--rm",
        "--interactive",
        "-v",
        "~/.aws:/root/.aws",
        "-e",
        "AWS_PROFILE=[The AWS Profile Name to use for AWS access]",
        "awslabs/cloudwatch-mcp-server:latest"
      ],
      "env": {},
      "disabled": false,
      "autoApprove": []
    }
  }
}

Please reference AWS documentation to create and manage your credentials profile.
This MCP server includes reusable investigation skills that encode domain expertise into structured workflows for AI agents.
| Skill | Description | Setup Guide |
|---|---|---|
| AgentCore Investigation | Investigate Bedrock AgentCore runtime sessions — resolve session/trace IDs, query OTEL spans, filter noise, build timelines | Kiro CLI setup |
Skills provide pre-built investigation pipelines that agents can follow. They include the skill definition (SKILL.md), reference documentation, and MCP server configuration.
See the skills directory for details.
Contributions are welcome! Please see the CONTRIBUTING.md in the monorepo root for guidelines.
We value your feedback! Submit feedback, feature requests, and bug reports through GitHub issues, with the prefix cloudwatch-mcp-server in the title.