🚀 ModelCub

The Local-First Alternative to Roboflow

Your data. Your GPU. Your rules. Zero cloud lock-in.

Python 3.9+ · License: MIT · Code style: black

ModelCub is a complete, privacy-first MLOps platform for computer vision. Think Roboflow, but 100% local, 100% free, and 100% yours.

Perfect for medical imaging, defense applications, indie developers, or anyone who values data privacy and wants to avoid monthly SaaS fees.


✨ Why ModelCub?

The Problem

  • Roboflow: $500-8k/month, cloud lock-in, data privacy concerns
  • Label Studio + Ultralytics: Separate tools, manual integration, no versioning
  • DIY Solutions: Time-consuming, error-prone, not reproducible

The Solution

ModelCub gives you a professional, integrated platform that runs entirely on your infrastructure:

# Install
pip install modelcub

# Initialize project
modelcub project init my-cv-project

# Import dataset
modelcub dataset add --source ./yolo-data --name production-v1

# Launch UI
modelcub ui

# Start training (coming soon)
modelcub train production-v1 --model yolov11n --epochs 100

🎯 Key Features

1. Privacy-First Architecture

Your data never leaves your machine. Perfect for:

  • 🏥 Medical imaging (data stays on infrastructure you control, which supports HIPAA-compliant workflows)
  • 🔬 Pharmaceutical research
  • 🛡️ Defense applications
  • 🏢 Enterprise on-premise deployments

2. Complete MLOps Pipeline

  • Dataset Management: Import, validate, version control
  • Auto-Fix: Automatically detect and fix data quality issues
  • Annotation: Built-in labeling tool (in development)
  • Training: YOLO integration with auto-configuration (planned, see Roadmap)
  • Evaluation: Comprehensive metrics and visualizations
  • Export: ONNX, TensorRT, CoreML
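To make the validation and auto-fix idea concrete, here is a minimal sketch of the kind of per-line check a YOLO label validator might run. It is not ModelCub's implementation: the function name `validate_yolo_label` and the specific rules are assumptions chosen for illustration. The only thing taken as given is the standard YOLO label layout (`class_id cx cy w h`, with coordinates normalized to the range 0 to 1).

```python
from typing import List

EPS = 1e-9  # tolerance for floating-point edge cases at the image border


def validate_yolo_label(line: str, num_classes: int) -> List[str]:
    """Return the problems found in one YOLO label line (empty list if clean).

    Hypothetical helper for illustration; not part of the ModelCub API.
    """
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_id = int(parts[0])
        cx, cy, w, h = (float(p) for p in parts[1:])
    except ValueError:
        return ["field is not a valid number (class id must be an integer)"]

    problems = []
    if not 0 <= class_id < num_classes:
        problems.append(f"class id {class_id} out of range")
    if not all(0.0 <= v <= 1.0 for v in (cx, cy, w, h)):
        problems.append("coordinate outside [0, 1]")
    if w <= 0 or h <= 0:
        problems.append("non-positive width or height")
    # The box is centered on (cx, cy), so half the size extends in each direction.
    if (cx - w / 2 < -EPS or cx + w / 2 > 1 + EPS
            or cy - h / 2 < -EPS or cy + h / 2 > 1 + EPS):
        problems.append("box extends past image border")
    return problems
```

An auto-fix step could build on a check like this, for example by clipping boxes that cross the border and dropping lines that cannot be parsed, after first writing a backup of the original label file.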

3. Git for Computer Vision

# Commit dataset changes
modelcub commit "Fixed bounding box issues in batch-42"

# View history
modelcub history

# Visual diff
modelcub diff v1 v2 --report
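One common way to implement commit and diff for a dataset is to record a manifest of content hashes per version and compare manifests. The sketch below illustrates that general technique using only the Python standard library. It is an assumption about how such a feature could work, not a description of ModelCub internals, and `build_manifest` and `diff_manifests` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def diff_manifests(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify paths as added, removed, or modified between two manifests."""
    old_paths, new_paths = set(old), set(new)
    return {
        "added": sorted(new_paths - old_paths),
        "removed": sorted(old_paths - new_paths),
        "modified": sorted(p for p in old_paths & new_paths if old[p] != new[p]),
    }
```

Because a manifest is a small text-friendly mapping, it can be stored next to the project's other state and compared cheaply without re-reading every image.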

4. Beautiful Web UI

  • 🎨 Modern, dark-mode interface
  • ⚡ Real-time updates via WebSocket
  • 🖼️ Image browser with lazy loading
  • 📊 Analytics dashboard
  • 🏷️ Class management
  • 🔧 Configuration editor

5. Developer-Friendly

Three ways to use ModelCub:

CLI (for scripting):

modelcub dataset add --source ./data --name v1
modelcub dataset validate v1
modelcub train v1 --model yolov11n  # coming soon

Python SDK (for notebooks):

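from modelcub import Project, Dataset

# Create the project, then import a YOLO-format dataset
project = Project.init("my-project")
dataset = Dataset.from_yolo("./data", name="v1")

# Print summary statistics and a few headline properties
stats = dataset.stats()
print(f"Stats: {stats}")
print(f"Classes: {dataset.classes}")
print(f"Images: {dataset.num_images}")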

Web UI (for visual work):

modelcub ui  # Opens at localhost:8000

📦 Installation

Requirements

  • Python 3.9 or higher
  • pip or uv package manager

Quick Install

# Using pip
pip install modelcub

# Using uv (recommended for speed)
uv pip install modelcub

# Verify installation
modelcub --version

Development Install

# Clone repository
git clone https://github.com/yourusername/modelcub.git
cd modelcub

# Install in editable mode
pip install -e .

# Or with uv
uv pip install -e .

🚀 Quick Start

1. Initialize a Project

# Create new project
modelcub project init my-cv-project
cd my-cv-project

# Directory structure created:
# my-cv-project/
# ├── .modelcub/         # Config, registries, history
# ├── data/datasets/     # Your datasets
# ├── runs/              # Training outputs
# ├── reports/           # Generated reports
# └── modelcub.yaml      # Project marker
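Since `modelcub.yaml` serves as the project marker, a tool can locate the project root from any subdirectory by walking upward until it finds that file, much as Git looks for `.git`. The following is a minimal sketch of that lookup. The function name `find_project_root` is hypothetical and the behavior is an assumption, not ModelCub's documented logic.

```python
from pathlib import Path
from typing import Optional

MARKER = "modelcub.yaml"  # project marker file from the layout above


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` and return the first directory holding the marker.

    Returns None when no enclosing directory contains the marker file.
    """
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / MARKER).is_file():
            return candidate
    return None
```

Calling `find_project_root(Path.cwd())` from inside `my-cv-project/data/datasets/` would, under these assumptions, return the `my-cv-project` directory.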

2. Import a Dataset

# From YOLO format
modelcub dataset add --source ./yolo-data --name production-v1

# From Roboflow export
modelcub dataset add --source ./roboflow-export.zip --name roboflow-v1

# Unlabeled images (for annotation)
modelcub dataset add --source ./images/ --name unlabeled-v1

# Custom configuration
modelcub dataset add \
  --source ./data \
  --name custom-v1 \
  --n 500 \
  --train-frac 0.85 \
  --classes "cat,dog,bird"
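The `--train-frac` option implies a fractional train/validation split. How ModelCub assigns images to splits is not documented here, so the sketch below shows one reasonable approach under stated assumptions: a seeded shuffle followed by a two-way cut. The function `split_dataset` and its `seed` parameter are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(items: Sequence[str], train_frac: float, seed: int = 0) -> Dict[str, List[str]]:
    """Split items into train and val lists using a reproducible shuffle.

    Sorting first makes the result independent of the input order, and the
    fixed seed makes repeated runs produce the same split.
    """
    if not 0.0 < train_frac < 1.0:
        raise ValueError("train_frac must be strictly between 0 and 1")
    shuffled = sorted(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    return {"train": shuffled[:n_train], "val": shuffled[n_train:]}
```

A reproducible split matters for versioning: if the same inputs always land in the same split, re-importing a dataset does not leak validation images into training.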

3. Inspect Your Dataset

# List all datasets
modelcub dataset list

# Detailed information
modelcub dataset info production-v1

# Output:
# 📦 Dataset: production-v1
#
# Classes: pill, bottle, box (3 classes)
# Images: 847 total
#   • train: 677 (80%)
#   • val: 119 (14%)
#   • test: 51 (6%)
#
# Created: 2025-01-26 14:30:22
# Path: data/datasets/production-v1

4. Manage Classes

# List classes
modelcub dataset classes list production-v1

# Add new class
modelcub dataset classes add production-v1 "capsule"

# Rename class
modelcub dataset classes rename production-v1 "pill" "tablet"

# Remove class (with confirmation)
modelcub dataset classes remove production-v1 "box" --yes
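Renaming a class only changes the name list, but removing one also affects label files, because YOLO labels refer to classes by index. The sketch below shows one way the remapping could work: boxes of the removed class are dropped and higher ids shift down by one. This is an assumption for illustration (a tool could equally keep ids stable), and both function names are hypothetical.

```python
from typing import Dict, List, Optional


def build_id_remap(classes: List[str], removed: str) -> Dict[int, Optional[int]]:
    """Map old class ids to new ids after removing one class (None = drop the box)."""
    removed_id = classes.index(removed)  # raises ValueError for an unknown class
    remap: Dict[int, Optional[int]] = {}
    for old_id in range(len(classes)):
        if old_id == removed_id:
            remap[old_id] = None
        elif old_id > removed_id:
            remap[old_id] = old_id - 1  # close the gap left by the removed class
        else:
            remap[old_id] = old_id
    return remap


def remap_label_lines(lines: List[str], remap: Dict[int, Optional[int]]) -> List[str]:
    """Rewrite YOLO label lines with new class ids, dropping removed classes."""
    result = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        new_id = remap[int(parts[0])]
        if new_id is not None:
            result.append(" ".join([str(new_id), *parts[1:]]))
    return result
```

Because this rewrite is destructive, it is the kind of operation that benefits from a confirmation prompt and a backup, which is consistent with the `--yes` flag shown above.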

5. Launch the Web UI

# Start the web interface (opens at http://localhost:8000 by default)
modelcub ui

# Development mode (with hot reload)
modelcub ui --dev

# Custom port (opens at http://localhost:3000)
modelcub ui --port 3000

🎓 Use Cases

Medical Imaging Research

from modelcub import Project, Dataset

# Initialize project
project = Project.init("tumor-detection")

# Import DICOM images
dataset = Dataset.from_images(
    source="./dicom-exports/",
    name="patient-cohort-2025",
    classes=["benign", "malignant"]
)

# All data stays on your HIPAA-compliant infrastructure
# No cloud uploads, no third-party access

Manufacturing Quality Control

# Import production line images
modelcub dataset add \
  --source ./production-line/ \
  --name defect-detection-v1 \
  --classes "scratch,dent,discoloration,ok"

# Validate dataset quality
modelcub dataset validate defect-detection-v1

# Auto-fix common issues
modelcub dataset fix defect-detection-v1 --auto

# Train model (coming soon)
modelcub train defect-detection-v1 --model yolov11n

Academic Research

# Reproducible experiments
from modelcub import Project

project = Project.init("research-experiment-1")

# Version control for datasets
dataset = project.import_dataset("./data", name="baseline-v1")
project.commit("Initial dataset import")

# Modify dataset
dataset.augment(rotation=15, brightness=0.2)
project.commit("Added augmentation pipeline")

# Generate diff report (assumes the augmented version is registered as "augmented-v1")
project.diff("baseline-v1", "augmented-v1", output="paper-figures/")

💻 Architecture

ModelCub is built on a clean, layered architecture:

┌─────────────────────────────────────────────┐
│           User Interfaces                    │
├──────────────┬──────────────┬───────────────┤
│     CLI      │  Python SDK  │    Web UI     │
│   (Click)    │   (Public)   │  (React+TS)   │
└──────┬───────┴──────┬───────┴───────┬───────┘
       │              │               │
       └──────────────┼───────────────┘
                      │
       ┌──────────────▼──────────────┐
       │      FastAPI Backend        │
       │   (REST + WebSocket)        │
       └──────────────┬──────────────┘
                      │
       ┌──────────────▼──────────────┐
       │       Core Services         │
       │  (Business Logic Layer)     │
       └──────────────┬──────────────┘
                      │
       ┌──────────────▼──────────────┐
       │      File System State      │
       │    (.modelcub directory)    │
       └─────────────────────────────┘

Key Design Principles

1. API-First: Everything is composable. Use the CLI, SDK, or UI interchangeably.

2. Stateless Backend: No hidden databases. All state lives in human-readable YAML files.

3. Format-Agnostic: YOLO internally, with import/export for common formats (COCO, VOC, TensorFlow).

4. Git-Friendly: Version everything like code, with diffs, commits, and rollbacks.
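Format conversion is mostly coordinate arithmetic. As an illustration of the format-agnostic principle, the sketch below converts a bounding box between the YOLO convention (normalized center x, center y, width, height) and the COCO convention (pixel top-left x, top-left y, width, height). The function names are hypothetical and not part of the ModelCub SDK; the two box conventions themselves are standard.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def yolo_to_coco_bbox(cx: float, cy: float, w: float, h: float,
                      img_w: int, img_h: int) -> Box:
    """Convert a normalized YOLO box to a COCO pixel box (x_min, y_min, w, h)."""
    box_w = w * img_w
    box_h = h * img_h
    x_min = cx * img_w - box_w / 2
    y_min = cy * img_h - box_h / 2
    return (x_min, y_min, box_w, box_h)


def coco_to_yolo_bbox(x_min: float, y_min: float, box_w: float, box_h: float,
                      img_w: int, img_h: int) -> Box:
    """Convert a COCO pixel box to a normalized YOLO box (cx, cy, w, h)."""
    cx = (x_min + box_w / 2) / img_w
    cy = (y_min + box_h / 2) / img_h
    return (cx, cy, box_w / img_w, box_h / img_h)
```

For a 640x480 image, the YOLO box `(0.5, 0.5, 0.5, 0.25)` corresponds to the COCO box `(160.0, 180.0, 320.0, 120.0)`, and converting back recovers the original values.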


📊 Comparison

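| Feature         | ModelCub | Roboflow   | Label Studio + Ultralytics |
|-----------------|----------|------------|----------------------------|
| Annotation      |          |            |                            |
| Training        |          |            |                            |
| Local-First     |          |            |                            |
| Auto-Fix        |          |            |                            |
| Version Control |          | Basic      | Manual                     |
| Visual Diff     |          |            |                            |
| Integrated      |          |            |                            |
| Pricing         | Free     | $500-8k/mo | Free                       |
| Setup Time      | 2 min    | 5 min      | 30+ min                    |
| Data Privacy    | 100%     | Cloud      | 100%                       |
| Vendor Lock-in  | None     | High       | None                       |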

🗺️ Roadmap

✅ Phase 1: Foundation (Complete)

  • Project management (init, config, delete)
  • Dataset import (YOLO, Roboflow, COCO, images)
  • Class management (add, rename, remove)
  • CLI with all core commands
  • Python SDK
  • FastAPI backend
  • React frontend with routing

🚧 Phase 2: Dataset Operations (In Progress)

  • Dataset validation with health scoring
  • Auto-fix system with backups
  • Version control (commit, diff, history)
  • Visual diff UI
  • Export to multiple formats (ONNX, TensorRT, CoreML)

📅 Phase 3: Annotation (Q1 2026)

  • Canvas-based annotation tool
  • Rectangle and polygon drawing
  • Keyboard shortcuts (vim-style)
  • Auto-save and undo/redo
  • Review and consensus mode

📅 Phase 4: Training (Q2 2026)

  • YOLO training integration (v8, v11)
  • Auto-configuration and hyperparameter tuning
  • Real-time progress (WebSocket)
  • Model evaluation and comparison
  • Multi-GPU support

🔮 Future Features

  • Active learning pipeline
  • Multi-annotator consensus
  • Custom augmentation plugins
  • Team collaboration (local network)
  • Optional cloud sync (S3, MinIO)

📚 Documentation


🤝 Contributing

We welcome contributions! ModelCub is built in the open, for the community.

Ways to Contribute

  • 🐛 Report bugs via GitHub Issues
  • 💡 Suggest features or improvements
  • 📝 Improve documentation
  • 🧪 Add test coverage
  • 🎨 Improve UI/UX
  • 🔧 Fix bugs and add features

Development Setup

# Clone repository
git clone https://github.com/yourusername/modelcub.git
cd modelcub

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format and lint
black src/ tests/
ruff check src/ tests/

# Start development UI
cd src/modelcub/ui/frontend
npm install
npm run dev

Code Guidelines

  • Follow the PEP 8 style guide
  • Write tests for new features
  • Update documentation
  • Keep commits atomic and well-described

🙏 Acknowledgments

ModelCub is inspired by and builds upon:

  • Roboflow - for proving product-market fit
  • Ultralytics - for excellent YOLO implementations
  • Label Studio - for annotation UX patterns
  • The entire computer vision open-source community

Special thanks to everyone who contributed feedback, code, and ideas!


📜 License

MIT License - see LICENSE file for details.


🎯 Mission

Make computer vision development accessible, private, and reproducible for everyone.

Whether you're a solo developer, a research lab, or an enterprise team - you deserve tools that respect your data privacy and don't lock you into expensive subscriptions.

ModelCub is our answer. Own your tools. Own your data. Own your future.


Built with ❤️ by developers who felt the pain

Quick Start · Documentation · Contributing · Report Bug