Modular ML Experimentation • Automated Decision-Making • Observability-First
This repository contains a complete agent-powered A/B experimentation platform, including automated traffic allocation, real-time metric aggregation, statistical evaluation, a model inference service, a retraining workflow, observability, and human-readable reporting.
The system is designed as a modular, extensible, production-oriented experimentation engine that can be embedded into any product or ML workflow.
📄 The full low-level architecture and agent specifications are provided in the included Technical Report (PDF).
📄 A high-level architecture diagram is provided as a separate PDF illustration.
- Event ingestion & normalization through n8n
- Dynamic experiment routing (A/B or multi-arm)
- FastAPI-based ML service:
  - `/predict`
  - `/stat_test`
  - `/retrain`
  - `/nlq` (natural-language queries)
- Intelligent Agents:
  - Experiment Agent (traffic allocation)
  - Metrics Agent (aggregation)
  - Evaluator Agent (statistical tests)
  - Trainer Agent (model retraining)
- PostgreSQL storage with audit logs, metrics tables, and model registry
- Observability & alerts (low-confidence predictions, pipeline issues)
- PDF reporting for experiment summaries
- Plug-and-play integration with external systems or applications
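As a rough sketch of how an external system might call the ML service, here is a minimal client using only the Python standard library. The base URL and the payload fields (`user_id`, `features`, `experiment`, `window_days`) are illustrative assumptions, not the documented API contract:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumed local deployment


def post_json(path: str, payload: dict) -> dict:
    """POST a JSON payload to the ML service and decode the JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical payloads -- field names are assumptions for illustration.
predict_payload = {"user_id": "u-123", "features": [0.4, 1.7, 0.0]}
stat_test_payload = {"experiment": "exp-1", "window_days": 7}

# Example call (requires a running service):
# result = post_json("/predict", predict_payload)
```

The same helper would serve `/stat_test`, `/retrain`, and `/nlq`, since all endpoints accept JSON bodies.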
```
+--------------------+
|    Data Sources    |
|  (Events, Labels)  |
+---------+----------+
          |
          v
+-----------+-------------+
|     n8n Orchestrator    |
|  - ingestion            |
|  - routing (A/B)        |
|  - triggers & alerts    |
+-----------+-------------+
          |
          v
+-------------------+      FastAPI      +------------------+
| Experiment Agent  | <-------------->  |  Model Registry  |
| - traffic control |                   |   (artifacts)    |
+-------------------+                   +------------------+
          |
          v
+---------+----------+
|   FastAPI ML API   |
|  /predict          |
|  /stat_test        |
|  /retrain          |
+---------+----------+
          |
          v
+---------------------------------------+
|              PostgreSQL               |
| events | labels | ab_metrics | audit  |
+---------------------------------------+
          |
          v
+---------+---------+
|  Reporting Layer  |
|  PDF / dashboards |
+---------+---------+
          |
          v
       End Users
```
This repository includes two primary documents:

- **Technical Report (PDF)**: the full detailed architecture, covering agents, models, decision logic, metrics, constraints, and design rationale.
- **High-Level Architecture Diagram (PDF)**: a clean visualization for presentations and system overviews.
| Layer | Tool | Purpose |
|---|---|---|
| Orchestration | n8n | Event routing, experiment workflows, automation |
| Model Serving | FastAPI | Prediction, evaluation, retraining, NLQ |
| Storage | PostgreSQL | Events, labels, metrics, model registry |
| Backend Logic | Python 3.10+ | ML models, agents, data processing |
| CI/CD | GitHub Actions | Automated build & deploy |
| Containerization | Docker Compose | Local and production-ready deployments |
| Observability | (Optional) Prometheus, Grafana, Sentry | Metrics & alerts |
Each agent is fully described in the Technical Report. Below is a high-level overview:
**Experiment Agent**
- Applies statistical results.
- Selects the winning model.
- Adjusts A/B traffic values in real time.
- Writes decisions to the audit log.
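Real-time traffic adjustment can be sketched as deterministic hash-based bucketing, so the same user always lands in the same variant and a reallocation only moves users at the shifting boundary. The function name and split format below are illustrative assumptions; the actual allocation logic is specified in the Technical Report:

```python
import hashlib


def assign_variant(user_id: str, split: dict[str, float]) -> str:
    """Deterministically map a user to a variant according to the traffic split."""
    # Hash the user ID to a stable number in [0, 1).
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF
    cumulative = 0.0
    for variant, share in split.items():
        cumulative += share
        if bucket < cumulative:
            return variant
    return variant  # guard against floating-point rounding at the top edge


# Example: a 30/70 split between models A and B.
split = {"A": 0.3, "B": 0.7}
```

Because assignment depends only on the user ID and the split, the agent can change `split` in the database and new requests immediately follow the new allocation.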
**Metrics Agent**
- Periodically aggregates prediction events.
- Computes recall, precision, FPR, and accuracy.
- Stores results in `ab_metrics`.
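The metric computations can be sketched from confusion-matrix counts. This is a minimal illustration assuming binary labels; the real agent aggregates these counts from the events and labels tables:

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute the per-variant metrics stored in ab_metrics."""
    def safe_div(a: float, b: float) -> float:
        return a / b if b else 0.0

    return {
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),
        "fpr": safe_div(fp, fp + tn),  # false-positive rate
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
    }


m = classification_metrics(tp=80, fp=20, fn=10, tn=90)
```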
**Evaluator Agent**
- Runs statistical tests via `/stat_test`.
- Generates insights (p-value, confidence interval).
- Produces human-readable conclusions.
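For intuition, the kind of test `/stat_test` might run on conversion-style metrics can be sketched as a two-proportion z-test using only the standard library. This is one plausible test among several; the tests actually used are documented in the Technical Report:

```python
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> dict:
    """Two-sided z-test for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = (pooled * (1 - pooled) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"z": z, "p_value": p_value, "effect": p_b - p_a}


# Hypothetical counts: 120/1000 conversions for A vs 150/1000 for B.
result = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
```

A p-value below the configured significance threshold would let the Experiment Agent declare a winner and shift traffic accordingly.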
**Trainer Agent**
- Retrains on a Kaggle or internal dataset.
- Applies preprocessing and versioning.
- Updates the Model Registry.
**Observability & Auditing**
- Low-confidence prediction alerts
- Model drift signals
- API latency & error rate monitoring
- Experiment health checks
- Full audit trail for decisions
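A low-confidence alert check can be sketched as a simple threshold filter over prediction records. The threshold value and record shape below are illustrative assumptions; in the platform the threshold is part of the stored configuration:

```python
CONFIDENCE_THRESHOLD = 0.6  # illustrative value; assumed configurable in the DB


def low_confidence_alerts(predictions: list[dict]) -> list[dict]:
    """Return the prediction records whose confidence falls below the threshold."""
    return [p for p in predictions if p["confidence"] < CONFIDENCE_THRESHOLD]


alerts = low_confidence_alerts([
    {"id": 1, "confidence": 0.91},
    {"id": 2, "confidence": 0.42},  # below threshold, should be flagged
])
```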
Configuration is stored in PostgreSQL:
- experiment configuration
- traffic split ratios
- active model version
- metrics history
- audit logs
This makes the system fully dynamic and adjustable without redeployment.
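The "adjustable without redeployment" property follows from agents re-reading configuration from the database at decision time, so an `UPDATE` takes effect immediately. The sketch below uses in-memory SQLite as a stand-in for PostgreSQL, and the `experiment_config` schema is invented for illustration:

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE experiment_config (
        experiment    TEXT PRIMARY KEY,
        traffic_split TEXT,   -- JSON, e.g. {"A": 0.5, "B": 0.5}
        active_model  TEXT
    )
    """
)
conn.execute(
    "INSERT INTO experiment_config VALUES (?, ?, ?)",
    ("exp-1", '{"A": 0.5, "B": 0.5}', "model-a-v1"),
)


def current_split(experiment: str) -> str:
    """Agents re-read the split at decision time, so updates apply instantly."""
    row = conn.execute(
        "SELECT traffic_split FROM experiment_config WHERE experiment = ?",
        (experiment,),
    ).fetchone()
    return row[0]


# A live reallocation: no redeploy, just an UPDATE.
conn.execute(
    "UPDATE experiment_config SET traffic_split = ? WHERE experiment = ?",
    ('{"A": 0.3, "B": 0.7}', "exp-1"),
)
```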
User:

"Compare Model A and Model B over the last 7 days and recommend a rollout strategy."

System Response:

```json
{
  "winner": "B",
  "p_value": 0.008,
  "effect_size": 0.12,
  "recommendation": "Increase B to 70% traffic; monitor FPR for 24h."
}
```

| Version | Feature |
|---|---|
| 1.0 | Initial release (A/B testing, agents, metrics) |
| 1.1 | Auto-rollout strategies |
| 1.2 | PDF reports + dashboards |
| 1.3 | Multi-model (A/B/C/D) support |
| 1.4 | Canary deployments & rollback logic |
| 2.0 | Real-time drift detection |
| 2.1 | Feature store integration |
| 3.0 | Full experimentation as a service (EaaS) |
We welcome contributions! Please follow the steps below:
1. Fork the repository
2. Create a feature branch
3. Ensure code follows our style guide
4. Add or update tests
5. Submit a pull request describing your change
6. For major changes, please open an issue beforehand.

For questions or interest in integrating the platform into your product:
- 📧 [email protected]
- 🔗 https://github.com/BorDch
This project was built as part of an engineering challenge and later evolved into a production-capable experimentation platform.
