A simplified, production-ready OpenTelemetry demo with Azure Data Explorer as the sole telemetry backend
Quick Start • Dashboards • Architecture • Alerting • Workload Identity • Deployment
This is a fork of the OpenTelemetry Demo that replaces multiple observability backends (Jaeger, Prometheus, OpenSearch) with a single, unified backend: Azure Data Explorer (ADX).
| Challenge with Original Demo | Solution with ADX |
|---|---|
| 3 separate backends (Jaeger, Prometheus, OpenSearch) | 1 unified backend for traces, metrics, and logs |
| Complex deployment with many components | Simplified architecture with fewer moving parts |
| No native Azure integration | Native Azure services with Workload Identity |
| Data scattered across systems | All telemetry in one place with powerful KQL queries |
| SaaS observability costs ($$$) | Pay only for Azure resources (~$150/month for dev) |
All telemetry data flows to Azure Data Explorer and is visualized through Grafana dashboards using KQL queries.
```mermaid
flowchart LR
    subgraph Original["Original OpenTelemetry Demo"]
        direction TB
        O_SVC[17 Microservices]
        O_COL[OTel Collector]
        O_J[Jaeger<br/>Traces]
        O_P[Prometheus<br/>Metrics]
        O_OS[OpenSearch<br/>Logs]
        O_G[Grafana]
        O_SVC --> O_COL
        O_COL --> O_J
        O_COL --> O_P
        O_COL --> O_OS
        O_J --> O_G
        O_P --> O_G
        O_OS --> O_G
    end
    subgraph ThisFork["This Fork (ADX)"]
        direction TB
        T_SVC[17 Microservices]
        T_COL[OTel Collector]
        T_ADX[(Azure Data Explorer<br/>Traces + Metrics + Logs)]
        T_G[Grafana]
        T_SVC --> T_COL
        T_COL --> T_ADX
        T_ADX --> T_G
    end
    Original -.->|Simplified| ThisFork
```
| Aspect | Original Demo | This Fork |
|---|---|---|
| Backends | Jaeger + Prometheus + OpenSearch | Azure Data Explorer only |
| Infrastructure | Docker Compose / Generic K8s | Terraform + AKS + Helm |
| Authentication | N/A | Azure Workload Identity (no secrets!) |
| Query Language | PromQL + Jaeger UI + OpenSearch | KQL (Kusto Query Language) |
| Deployment | `docker compose up` | `terraform apply` + `helm install` |
| Data Retention | Limited by local storage | 1 year (configurable) |
| Cost | Free (local) / Variable (cloud) | ~$150-500/month on Azure |
```mermaid
flowchart TB
    subgraph Internet
        User[User Browser]
    end
    subgraph AKS["Azure Kubernetes Service"]
        subgraph Services["Microservices"]
            FE[Frontend]
            Cart[Cart Service]
            Checkout[Checkout]
            Payment[Payment]
            Product[Product Catalog]
            Currency[Currency]
            Shipping[Shipping]
            Email[Email]
            Recommend[Recommendation]
            Ad[Ad Service]
            Other[+ 7 more...]
        end
        FP[Frontend Proxy<br/>Envoy]
        OTel[OTel Collector]
        Grafana[Grafana]
        LG[Load Generator]
    end
    subgraph Azure["Azure Cloud Services"]
        ADX[(Azure Data Explorer<br/>─────────────────<br/>OTelTraces<br/>OTelMetrics<br/>OTelLogs)]
        MI[Managed Identity]
    end
    User --> FP
    FP --> FE
    FE --> Cart & Checkout & Product & Currency
    Checkout --> Payment & Shipping & Email
    Product --> Recommend
    FE --> Ad
    Services -->|OTLP| OTel
    LG -->|Traffic| FP
    OTel -->|Workload Identity| MI
    MI -->|Authenticated| ADX
    ADX -->|KQL Queries| Grafana
```
```mermaid
flowchart LR
    subgraph Receivers["Receivers"]
        OTLP[OTLP<br/>gRPC :4317<br/>HTTP :4318]
        PROM[Prometheus<br/>Self-metrics]
        HC[HTTP Check<br/>Frontend Proxy]
    end
    subgraph Processors["Processors"]
        K8S[K8s Attributes<br/>Pod metadata]
        RD[Resource Detection<br/>Host info]
        ML[Memory Limiter<br/>80% limit]
        RES[Resource<br/>service.instance.id]
        TF[Transform<br/>URL cleanup]
        BT[Batch<br/>1000-2000 records]
    end
    subgraph Exporters["Exporters"]
        ADX[Azure Data Explorer<br/>───────────────<br/>use_azure_auth: true<br/>Workload Identity]
        DBG[Debug<br/>Logging]
    end
    subgraph Connector["Connector"]
        SM[Span Metrics<br/>───────────────<br/>Latency histograms<br/>Request counts]
    end
    OTLP --> K8S
    PROM --> K8S
    HC --> K8S
    K8S --> RD --> ML --> RES --> TF --> BT
    BT --> ADX
    BT --> DBG
    BT --> SM
    SM -->|Generated Metrics| BT
```
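The pipeline above corresponds to a collector configuration roughly like the following sketch. The `azuredataexplorer` exporter and `use_azure_auth: true` come from the upstream collector-contrib exporter; the exact processor list and table key names here are illustrative and may differ from this fork's generated config:

```yaml
exporters:
  azuredataexplorer:
    cluster_uri: "https://<cluster>.eastus.kusto.windows.net"
    db_name: "otel_demo"
    traces_table_name: "OTelTraces"
    metrics_table_name: "OTelMetrics"
    logs_table_name: "OTelLogs"
    use_azure_auth: true   # token comes from Workload Identity, not a stored secret

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [k8sattributes, resourcedetection, memory_limiter, resource, transform, batch]
      exporters: [azuredataexplorer]
```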
```mermaid
flowchart TB
    subgraph Terraform["1. Terraform (terraform/)"]
        TF_ADX[ADX Cluster<br/>+ Database<br/>+ Tables]
        TF_AKS[AKS Cluster<br/>+ OIDC Issuer<br/>+ Workload Identity]
        TF_MI[Managed Identity<br/>+ Federated Credential<br/>+ ADX Permissions]
        TF_VAL[values-generated.yaml]
        TF_ADX --> TF_VAL
        TF_AKS --> TF_MI
        TF_MI --> TF_VAL
    end
    subgraph Helm["2. Helm Chart (kubernetes/opentelemetry-demo-chart/)"]
        H_COL[OTel Collector<br/>Deployment + ConfigMap]
        H_SA[Service Account<br/>with WI annotation]
        H_SVC[17 Microservices]
        H_GF[Grafana]
        H_SUP[Supporting Services<br/>Kafka, Valkey, PostgreSQL]
    end
    TF_VAL -->|"helm install -f values-generated.yaml"| Helm
```
| Tool | Version | Installation |
|---|---|---|
| Azure CLI | Latest | brew install azure-cli or Install Guide |
| Terraform | >= 1.5.0 | brew install terraform or Install Guide |
| kubectl | Latest | brew install kubectl or Install Guide |
| Helm | >= 3.0 | brew install helm or Install Guide |
You also need an Azure Subscription with Owner or Contributor access.
```bash
git clone https://github.com/roy2392/adx-opentelemetry-demo.git
cd adx-opentelemetry-demo
```

```bash
az login
az account set --subscription "<your-subscription-id>"
```

```bash
cd terraform

# Copy example variables file
cp terraform.tfvars.example terraform.tfvars

# Edit terraform.tfvars with your preferences:
# - project_name = "otel-demo"
# - environment  = "dev"
# - location     = "eastus"

# Initialize and deploy
terraform init
terraform apply
```

Terraform creates:
- Azure Data Explorer cluster with `otel_demo` database and tables
- AKS cluster with OIDC issuer and Workload Identity enabled
- User-Assigned Managed Identity with ADX Ingestor + Viewer permissions
- Federated Identity Credential linking the K8s service account to the identity
- `values-generated.yaml` with all configuration (no secrets!)
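The federation piece can be sketched in Terraform. The resource type `azurerm_federated_identity_credential` and its arguments are the real azurerm provider API; the resource names and references below are illustrative, not this fork's actual module code:

```hcl
# Sketch: link the AKS OIDC issuer + K8s service account to the managed identity.
resource "azurerm_federated_identity_credential" "otel_collector" {
  name                = "otel-collector-federation"
  resource_group_name = azurerm_resource_group.main.name
  parent_id           = azurerm_user_assigned_identity.otel.id
  audience            = ["api://AzureADTokenExchange"]
  issuer              = azurerm_kubernetes_cluster.main.oidc_issuer_url
  subject             = "system:serviceaccount:otel-demo:otel-collector-sa"
}
```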
```bash
# Get AKS credentials (command shown in Terraform output)
az aks get-credentials --resource-group <rg-name> --name <aks-name>

# Return to project root
cd ..

# Deploy using Terraform-generated values
helm install otel-demo ./kubernetes/opentelemetry-demo-chart \
  -f ./kubernetes/opentelemetry-demo-chart/values-generated.yaml \
  -n otel-demo --create-namespace

# Watch pods come up (takes 2-3 minutes)
kubectl get pods -n otel-demo -w
```

```bash
# Frontend (Web Store) - http://localhost:8080
kubectl port-forward -n otel-demo svc/frontend-proxy 8080:8080

# Grafana (Dashboards) - http://localhost:3000
# Login: admin / admin
kubectl port-forward -n otel-demo svc/grafana 3000:80
```

```bash
# Check OTel Collector is sending data
kubectl logs -n otel-demo deployment/otel-collector | grep azuredataexplorer

# You should see: "Flushing X metrics to sink"
```

This fork uses Azure Workload Identity for secure, secret-less authentication to ADX.
```mermaid
sequenceDiagram
    participant Pod as OTel Collector Pod
    participant SA as Service Account
    participant AKS as AKS OIDC Issuer
    participant AAD as Azure AD
    participant MI as Managed Identity
    participant ADX as Azure Data Explorer
    Note over Pod,SA: Pod starts with label:<br/>azure.workload.identity/use: "true"
    Pod->>SA: Mount projected token
    SA->>AKS: Request signed JWT
    AKS-->>SA: JWT (1 hour expiry)
    Pod->>AAD: Exchange JWT for Azure token
    Note over AAD: Validates:<br/>1. OIDC issuer URL<br/>2. Service account subject<br/>3. Audience
    AAD->>MI: Lookup Federated Credential
    MI-->>AAD: Managed Identity confirmed
    AAD-->>Pod: Azure AD access token
    Pod->>ADX: Ingest telemetry (with token)
    ADX-->>Pod: Success (Ingestor role)
    Note over Pod,ADX: No secrets stored anywhere!<br/>Tokens auto-refresh every hour
```
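To make the validation step concrete, here is a small self-contained Python sketch that decodes the claims segment of a JWT like the one AKS projects into the pod. The issuer URL and token contents below are fabricated for illustration; real tokens are cryptographically signed and verified by Azure AD, never trusted unverified like this:

```python
import base64
import json

def decode_jwt_claims(token: str) -> dict:
    """Decode the (unverified) claims segment of a JWT.
    Azure AD checks these claims against the federated credential:
    issuer URL, service account subject, and audience."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))

# Fabricated claims mirroring what AKS would issue for the collector pod
claims = {
    "iss": "https://eastus.oic.prod-aks.azure.com/<tenant>/<uuid>/",   # OIDC issuer URL
    "sub": "system:serviceaccount:otel-demo:otel-collector-sa",        # K8s SA subject
    "aud": ["api://AzureADTokenExchange"],                             # expected audience
    "exp": 1700003600,
}
segment = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode().rstrip("=")
token = f"header.{segment}.signature"

decoded = decode_jwt_claims(token)
print(decoded["sub"])  # system:serviceaccount:otel-demo:otel-collector-sa
```

The `sub` claim is what ties a specific namespace/service-account pair to the managed identity, which is why the tokens are pod-scoped.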
| Aspect | Service Principal (Secrets) | Workload Identity |
|---|---|---|
| Secrets | Client secret stored in K8s | No secrets anywhere |
| Rotation | Manual (every 6-12 months) | Automatic (~1 hour) |
| Blast Radius | Full access if secret leaked | Pod-scoped tokens only |
| Audit | Limited visibility | Full Azure AD logs |
| Setup | Create SP, manage secret | Terraform handles everything |
```bash
# Check pod has Workload Identity environment variables
kubectl get pod -n otel-demo -l app.kubernetes.io/name=opentelemetry-collector \
  -o jsonpath='{.items[0].spec.containers[0].env[*].name}' | tr ' ' '\n' | grep AZURE

# Expected output:
# AZURE_TENANT_ID
# AZURE_CLIENT_ID
# AZURE_FEDERATED_TOKEN_FILE

# Check projected token volume exists (injected by AKS)
kubectl get pod -n otel-demo -l app.kubernetes.io/name=opentelemetry-collector \
  -o jsonpath='{.items[0].spec.volumes[?(@.name=="azure-identity-token")].name}'

# Expected output: azure-identity-token
```

Edit terraform/terraform.tfvars:
```hcl
# Project settings
project_name = "otel-demo"
environment  = "dev"
location     = "eastus"

# ADX settings
adx_sku_name       = "Dev(No SLA)_Standard_D11_v2"  # Use Standard_D11_v2 for prod
adx_hot_cache_days = 30
adx_retention_days = 365

# AKS settings
aks_node_count   = 3
aks_node_vm_size = "Standard_DS2_v2"
```

After terraform apply, you'll see:

```
aks_get_credentials_command = "az aks get-credentials --resource-group otel-demo-dev-rg --name otel-demo-dev-aks"
helm_install_command        = "helm install otel-demo ./kubernetes/opentelemetry-demo-chart -f ./kubernetes/opentelemetry-demo-chart/values-generated.yaml -n otel-demo --create-namespace"
workload_identity_client_id = "70de6cd3-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
adx_cluster_uri             = "https://oteldemodevadx.eastus.kusto.windows.net"
```
The `values-generated.yaml` file is automatically created by Terraform:

```yaml
# Azure Workload Identity (no secrets!)
azure:
  workloadIdentity:
    enabled: true
    clientId: "<managed-identity-client-id>"
    tenantId: "<azure-tenant-id>"

# ADX Configuration
adx:
  enabled: true
  clusterUri: "https://<cluster>.eastus.kusto.windows.net"
  database: "otel_demo"
  tables:
    traces: "OTelTraces"
    metrics: "OTelMetrics"
    logs: "OTelLogs"

# OTel Collector
otelCollector:
  serviceAccount:
    name: "otel-collector-sa"
    annotations:
      azure.workload.identity/client-id: "<managed-identity-client-id>"
```

If you need to customize values:
```bash
# Install with custom values
helm install otel-demo ./kubernetes/opentelemetry-demo-chart \
  -f ./kubernetes/opentelemetry-demo-chart/values-generated.yaml \
  -f ./my-custom-values.yaml \
  -n otel-demo --create-namespace

# Upgrade existing installation
helm upgrade otel-demo ./kubernetes/opentelemetry-demo-chart \
  -f ./kubernetes/opentelemetry-demo-chart/values-generated.yaml \
  -n otel-demo
```

| Table | Contents |
|---|---|
| `OTelTraces` | Distributed traces (spans) |
| `OTelMetrics` | Metrics (counters, gauges, histograms) |
| `OTelLogs` | Application logs |
```kql
// Service latency P95 over time
OTelTraces
| where Timestamp > ago(1h)
| summarize P95_ms = percentile(Duration / 1000000, 95) by ServiceName, bin(Timestamp, 5m)
| render timechart
```

```kql
// Error rate by service
OTelLogs
| where Timestamp > ago(1h) and SeverityText == "ERROR"
| summarize Errors = count() by ServiceName, bin(Timestamp, 5m)
| render columnchart
```

```kql
// Top 10 slowest endpoints
OTelTraces
| where Timestamp > ago(1h) and ParentSpanId == ""
| summarize AvgDuration = avg(Duration / 1000000) by Name
| top 10 by AvgDuration desc
```
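The latency queries divide `Duration` by 1,000,000 because spans record duration in nanoseconds. As a sanity check on that unit conversion, here is the same aggregation in plain Python using a nearest-rank P95 (the sample durations are made up, and KQL's `percentile()` is an approximation, so exact values can differ):

```python
def percentile(values, p):
    """Nearest-rank percentile, similar in spirit to KQL's percentile()."""
    ordered = sorted(values)
    k = max(0, int(round(p / 100 * len(ordered) + 0.5)) - 1)
    return ordered[min(k, len(ordered) - 1)]

# Span durations as stored: nanoseconds
durations_ns = [120_000_000, 95_000_000, 310_000_000, 80_000_000, 450_000_000]

# Divide by 1_000_000 to get milliseconds, matching "Duration / 1000000" in KQL
p95_ms = percentile([d / 1_000_000 for d in durations_ns], 95)
print(p95_ms)  # 450.0
```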
```kql
// Request count by service
OTelMetrics
| where Timestamp > ago(1h) and Name == "http.server.request.duration"
| summarize Requests = count() by ServiceName, bin(Timestamp, 1m)
| render timechart
```

This fork includes Grafana-based alerting with KQL queries against Azure Data Explorer and email notifications via Azure Communication Services.
The following alert rules are pre-configured:
| Alert | Description | Severity | Threshold |
|---|---|---|---|
| Service Not Reporting | Detects when services stop sending telemetry | Critical | < expected services in 5min |
| High Error Rate | Triggers when error rate exceeds threshold | Warning | > 5% errors |
| High Latency (P95) | Triggers when P95 latency exceeds threshold | Warning | > 500ms |
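The "High Error Rate" condition reduces to a simple threshold check. A minimal sketch of the evaluation logic (the function name and the 5% default mirror the rule above but are illustrative, not Grafana's internal implementation):

```python
def error_rate_alert(total_requests: int, errors: int, threshold_pct: float = 5.0) -> bool:
    """Return True when the error rate exceeds the threshold (> 5% by default)."""
    if total_requests == 0:
        return False  # no traffic in the window: nothing to alert on
    return (errors / total_requests) * 100 > threshold_pct

print(error_rate_alert(1000, 60))  # True  (6% > 5%)
print(error_rate_alert(1000, 40))  # False (4% <= 5%)
```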
Email alerts are sent via Azure Communication Services SMTP. Terraform provisions the infrastructure automatically.
```mermaid
flowchart LR
    GF[Grafana Alert] --> SMTP[Azure Comm Services SMTP]
    SMTP --> EMAIL[Email Notification]
```
1. Enable in Terraform (terraform/terraform.tfvars):

```hcl
# Enable Azure Communication Services for email alerts
enable_email_alerts = true

# Email recipients (comma-separated)
alert_recipients = "team@example.com,oncall@example.com"

# Create Entra ID app for SMTP auth (requires Application Administrator role)
# Set to false if you lack permissions - use az CLI workaround instead
create_smtp_entra_app = false
```

2. Run Terraform:
```bash
cd terraform
terraform apply
```

This creates:

- Azure Communication Service
- Email Communication Service with Azure-managed domain
- (Optional) Entra ID app for SMTP authentication
3. Create Service Principal for SMTP (if create_smtp_entra_app = false):

```bash
# Create service principal for SMTP authentication
az ad sp create-for-rbac --name "otel-demo-smtp-auth" --skip-assignment

# Output:
# {
#   "appId": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
#   "password": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
#   "tenant": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
# }
```

4. Configure SMTP in Helm values:
Create a file `smtp-values.yaml`:

```yaml
grafana:
  alerting:
    smtp:
      enabled: true
      host: "smtp.azurecomm.net"
      port: 587
      # Format: <COMM_SERVICE_NAME>.<APP_ID>.<TENANT_ID>
      user: "otel-demo-dev-comm.xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
      password: "your-service-principal-password"
      fromAddress: "DoNotReply@xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.azurecomm.net"
      fromName: "OTel Demo Alerts"
      toAddresses: "your-email@example.com"
      skipVerify: false
```

Note: Get the `fromAddress` domain from the Azure Portal: Communication Services → Your Service → Email → Domains → Azure Managed Domain
5. Deploy with SMTP configuration:

```bash
helm upgrade otel-demo ./kubernetes/opentelemetry-demo-chart \
  -f ./kubernetes/opentelemetry-demo-chart/values-generated.yaml \
  -f smtp-values.yaml \
  -n otel-demo
```

After deploying, configure the email contact point in Grafana:
1. Access Grafana:

   ```bash
   kubectl port-forward -n otel-demo svc/grafana 3000:80
   ```

2. Navigate to: Alerting → Contact Points → Add contact point

3. Configure the email contact point:
   - Name: `alerts`
   - Type: `Email`
   - Addresses: `your-email@example.com`
   - Click Test to verify, then Save

4. Configure the notification policy:
   - Go to: Alerting → Notification policies
   - Edit the default policy → Set receiver to your contact point
   - Save
Option 1: Test Contact Point

In the Grafana UI: Alerting → Contact Points → Click "Test" on your contact point

Option 2: Trigger a Real Alert with Feature Flags

The demo includes feature flags to simulate failures:

```bash
# Enable payment service failure (triggers errors)
kubectl patch configmap flagd-config -n otel-demo --type merge -p '{
  "data": {
    "demo.flagd.json": "{\"flags\":{\"paymentFailure\":{\"state\":\"ENABLED\",\"defaultVariant\":\"on\",\"variants\":{\"on\":1,\"off\":0}}}}"
  }
}'

# Restart flagd to pick up changes
kubectl rollout restart deployment/flagd -n otel-demo
```

This will cause payment failures, triggering error rate alerts.
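The patch embeds flagd's JSON configuration as an escaped string value inside the ConfigMap, which makes the quoting easy to get wrong. A quick Python check that the payload parses to the intended flag state:

```python
import json

# The escaped string stored under the "demo.flagd.json" ConfigMap key
raw = '{"flags":{"paymentFailure":{"state":"ENABLED","defaultVariant":"on","variants":{"on":1,"off":0}}}}'

config = json.loads(raw)
flag = config["flags"]["paymentFailure"]
print(flag["state"], flag["defaultVariant"])  # ENABLED on
```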
Option 3: Via Grafana API

```bash
# Send test notification via API
kubectl exec -n otel-demo deployment/grafana -c grafana -- \
  curl -s -X POST 'http://localhost:3000/api/alertmanager/grafana/config/api/v1/receivers/test' \
  -H 'Content-Type: application/json' \
  -d '{
    "receivers": [{"name": "your-contact-point-name", "grafana_managed_receiver_configs": [{"uid": "your-uid", "name": "your-contact-point-name", "type": "email", "settings": {"addresses": "your-email@example.com"}}]}],
    "alert": {"labels": {"alertname": "Test"}, "annotations": {"summary": "Test alert"}}
  }'
```

| Setting | Location | Description |
|---|---|---|
| Alert rules | Grafana UI or `values.yaml` | KQL-based alert conditions |
| Contact points | Grafana UI | Email addresses, Slack webhooks, etc. |
| Notification policies | Grafana UI | Routing rules for alerts |
| SMTP settings | `values.yaml` or `smtp-values.yaml` | Azure Communication Services config |
| Thresholds | `values.yaml` → `grafana.alerting.thresholds` | Error rate %, latency ms |
Check SMTP is configured:

```bash
kubectl exec -n otel-demo deployment/grafana -c grafana -- \
  cat /etc/grafana/grafana.ini | grep -A 10 "\[smtp\]"
```

Check Grafana logs for email errors:

```bash
kubectl logs -n otel-demo deployment/grafana -c grafana | grep -i "smtp\|email\|mail"
```

Common issues:

| Error | Cause | Fix |
|---|---|---|
| "SMTP not configured" | Missing `[smtp]` section | Helm upgrade with `smtp-values.yaml` |
| "Authentication failed" | Wrong SMTP credentials | Verify service principal password |
| "Connection refused" | Wrong host/port | Use `smtp.azurecomm.net:587` |
| No email received | Spam filter | Check spam/junk folder |
```
adx-opentelemetry-demo/
├── terraform/                       # Infrastructure as Code
│   ├── main.tf                      # Root module - orchestrates everything
│   ├── variables.tf                 # Input variables
│   ├── outputs.tf                   # Output values (commands, URLs)
│   ├── terraform.tfvars.example     # Example configuration
│   └── modules/
│       ├── adx/                     # ADX cluster, database, tables
│       ├── aks/                     # AKS with OIDC + Workload Identity
│       ├── identity/                # Managed Identity + Federation
│       └── communication/           # Azure Communication Services (email alerts)
│
├── kubernetes/
│   └── opentelemetry-demo-chart/    # Helm chart
│       ├── Chart.yaml
│       ├── values.yaml              # Default values
│       ├── values-generated.yaml    # Generated by Terraform (gitignored)
│       └── templates/
│           ├── otel-collector-deployment.yaml
│           ├── otel-collector-config.yaml
│           ├── otel-collector-sa.yaml   # Service account with WI
│           ├── grafana.yaml
│           └── services/            # 17 microservices
│
├── src/                             # Microservice source code (from upstream)
│   ├── frontend/
│   ├── cartservice/
│   ├── checkoutservice/
│   └── ...
│
├── docs/
│   ├── AZURE_DEPLOYMENT.md          # Detailed deployment guide
│   ├── ALERTING.md                  # Alerting configuration guide
│   └── INTEGRATE_YOUR_SERVICES.md   # Add your own services
│
└── adx/
    ├── schema.kql                   # Table definitions
    └── example-queries.kql          # Sample KQL queries
```
| Resource | SKU | Monthly Cost (Est.) |
|---|---|---|
| ADX Cluster | Dev(No SLA)_Standard_D11_v2 | ~$150 |
| AKS Cluster | 3x Standard_DS2_v2 | ~$300 |
| Storage | Varies by data volume | ~$20-50 |
| **Total** | | **~$470-500/month** |
- Use ADX Auto-Stop for dev environments (stops cluster when idle)
- Scale down AKS nodes when not in use
- Reduce hot cache period (default 30 days) for cost savings
- Use spot instances for AKS nodes in non-production
```bash
# Check pod status
kubectl get pods -n otel-demo -l app.kubernetes.io/name=opentelemetry-collector

# Check logs for errors
kubectl logs -n otel-demo deployment/otel-collector

# Check events
kubectl describe pod -n otel-demo -l app.kubernetes.io/name=opentelemetry-collector
```

```bash
# Verify collector is exporting to ADX
kubectl logs -n otel-demo deployment/otel-collector | grep -i "azuredataexplorer\|flush"

# Check ADX ingestion metrics
az monitor metrics list \
  --resource $(az kusto cluster show -n <cluster-name> -g <rg> --query id -o tsv) \
  --metric "IngestionResult" --interval PT5M \
  --query "value[0].timeseries[0].data[-5:].{Time:timeStamp, Count:total}" -o table
```

```bash
# Verify federated credential exists
az identity federated-credential list \
  --identity-name <identity-name> \
  --resource-group <rg> -o table

# Check service account has correct annotation
kubectl get sa otel-collector-sa -n otel-demo -o yaml | grep "azure.workload.identity"

# Verify ADX permissions for the managed identity
az kusto database-principal-assignment list \
  --cluster-name <cluster-name> \
  --database-name otel_demo \
  --resource-group <rg> -o table
```

To remove all resources:

```bash
# Delete Helm release
helm uninstall otel-demo -n otel-demo
kubectl delete namespace otel-demo

# Destroy Terraform resources
cd terraform
terraform destroy
```

This project is a fork of the OpenTelemetry Demo maintained by the OpenTelemetry community.
Fork maintained by: Roy Zalta
Key additions in this fork:
- Azure Data Explorer as sole telemetry backend
- Terraform infrastructure automation
- Helm chart with Azure Workload Identity
- Simplified single-backend architecture
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.