AI Firewall and guardrails for LLM-based Elixir applications
Experimental research framework for running AI benchmarks at scale
Fairness and bias detection library for Elixir AI/ML systems
Interactive Phoenix LiveView demonstrations of the Crucible Framework - showcasing ensemble voting, request hedging, statistical analysis, and more with mock LLMs
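To give a flavor of the ensemble voting those demos showcase, here is a minimal, self-contained Elixir sketch. The module and function names are hypothetical, not the Crucible Framework API: each "model" is just a function, the models are queried concurrently, and the most frequent answer wins.

```elixir
defmodule EnsembleVoteSketch do
  # Query every mock model concurrently, tally the answers, and
  # return the most frequent one along with its vote count.
  def vote(prompt, models) when is_list(models) do
    models
    |> Task.async_stream(fn model -> model.(prompt) end, timeout: 5_000)
    |> Enum.map(fn {:ok, answer} -> answer end)
    |> Enum.frequencies()
    |> Enum.max_by(fn {_answer, count} -> count end)
  end
end

# Three mock LLMs, one of which disagrees:
models = [fn _ -> "4" end, fn _ -> "4" end, fn _ -> "5" end]
EnsembleVoteSketch.vote("What is 2 + 2?", models)
#=> {"4", 2}
```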
Request hedging for tail latency reduction in distributed systems
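The idea behind request hedging is small enough to sketch; the code below uses invented names (HedgeSketch is not the library's API). Issue the request, and if no reply arrives within a short deadline, issue a duplicate and take whichever finishes first, trading a little extra load for a shorter latency tail.

```elixir
defmodule HedgeSketch do
  # Fire the primary request; if it misses the hedge deadline,
  # fire a backup and return whichever task replies first.
  def hedged(request_fun, hedge_after_ms \\ 50, timeout \\ 5_000) do
    primary = Task.async(request_fun)

    case Task.yield(primary, hedge_after_ms) do
      {:ok, result} ->
        result

      nil ->
        backup = Task.async(request_fun)
        await_first(%{primary.ref => primary, backup.ref => backup}, timeout)
    end
  end

  # Take the first task reply, then shut down the straggler.
  defp await_first(by_ref, timeout) do
    receive do
      {ref, result} when is_map_key(by_ref, ref) ->
        Process.demonitor(ref, [:flush])
        for {other, task} <- by_ref, other != ref do
          Task.shutdown(task, :brutal_kill)
        end

        result
    after
      timeout -> exit(:timeout)
    end
  end
end

# A backend that is fast 90% of the time and very slow otherwise:
request = fn ->
  if :rand.uniform() < 0.1, do: Process.sleep(500)
  :ok
end

HedgeSketch.hedged(request)
```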
Phoenix LiveView dashboard for the Crucible ML reliability stack
Explainable AI (XAI) tools for the Crucible framework
ML model deployment for the Crucible ecosystem. vLLM and Ollama integration, canary deployments, A/B testing, traffic routing, health checks, rollback strategies, and inference serving for Elixir-based ML workflows.
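To make the canary idea from that entry concrete, a weighted router can be as small as the hypothetical sketch below (not the library's routing API): each request lands on the candidate model with probability `weight`, so a rollout can start at a few percent and ramp up.

```elixir
defmodule CanaryRouteSketch do
  # Send a request to :canary with probability `weight`, else :stable.
  def route(weight) when weight >= 0.0 and weight <= 1.0 do
    if :rand.uniform() < weight, do: :canary, else: :stable
  end
end

# A 5% canary: roughly 1 request in 20 hits the new model.
1..10_000
|> Enum.map(fn _ -> CanaryRouteSketch.route(0.05) end)
|> Enum.frequencies()
#=> e.g. %{canary: 493, stable: 9507}
```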
Intermediate Representation for the Crucible ML reliability ecosystem
Metrics aggregation and alerting for ML experiments—multi-backend export (Prometheus, InfluxDB, Datadog, OpenTelemetry), advanced aggregations (percentiles, histograms, moving averages), threshold-based alerting with anomaly detection (z-score, IQR), and time-series storage. Research-grade observability for the NSAI ecosystem.
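As an illustration of the z-score rule named in that entry, the sketch below (hypothetical names, not the library's API) flags a sample that sits more than `threshold` standard deviations from the mean of a window of recent samples.

```elixir
defmodule ZScoreSketch do
  # Flag `value` when it lies more than `threshold` standard
  # deviations from the mean of the recent `samples`.
  def anomaly?(samples, value, threshold \\ 3.0) do
    n = length(samples)
    mean = Enum.sum(samples) / n

    variance =
      samples
      |> Enum.map(fn x -> (x - mean) * (x - mean) end)
      |> Enum.sum()
      |> Kernel./(n)

    std = :math.sqrt(variance)
    # A constant series has zero spread; nothing counts as an anomaly.
    std > 0.0 and abs(value - mean) / std > threshold
  end
end

# Latencies hover around 100 ms; 210 ms is far outside three sigma.
ZScoreSketch.anomaly?([98, 101, 99, 100, 102, 97, 103], 210)
#=> true
```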
ML model registry for the Crucible ecosystem. Artifact storage, model versioning, lineage tracking, metadata management, model comparison, reproducibility, and integration with training pipelines for Elixir-based ML workflows.
Statistical testing and analysis framework for AI research
Industrial ML training orchestration - backend-agnostic workflow engine for supervised, reinforcement, and preference learning. Provides composable workflows, declarative stage DSL, comprehensive telemetry, and port/adapter patterns for any ML backend. The missing orchestration layer that makes ML cookbooks trivially thin.
Model evaluation harness for standardized benchmarking—comprehensive metrics (F1, BLEU, ROUGE, METEOR, BERTScore, pass@k), statistical analysis (confidence intervals, effect size, bootstrap CI, ANOVA), multi-model comparison, and report generation. Research-grade evaluation for LLM and ML experiments.
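Of the metrics listed, pass@k has the least obvious formula, so here is a small Elixir sketch of the standard unbiased estimator from the HumanEval paper (Chen et al., 2021): pass@k = 1 - C(n-c, k)/C(n, k) for n samples per problem with c passes. The module name is made up for illustration.

```elixir
defmodule PassAtKSketch do
  # No sample passed, so no batch of k can contain a pass.
  def pass_at_k(_n, 0, _k), do: 0.0

  # Fewer failing samples than k: every batch of k contains a pass.
  def pass_at_k(n, c, k) when n - c < k, do: 1.0

  # Numerically stable product form of 1 - C(n - c, k) / C(n, k).
  def pass_at_k(n, c, k) do
    1.0 - Enum.reduce((n - c + 1)..n, 1.0, fn i, acc -> acc * (1.0 - k / i) end)
  end
end

# 200 completions per problem, 12 of them passing:
PassAtKSketch.pass_at_k(200, 12, 10)
#=> roughly 0.47
```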
Data validation and quality library for ML pipelines in Elixir
Dataset management library for ML experiments—loaders for SciFact, FEVER, GSM8K, HumanEval, MMLU, TruthfulQA, HellaSwag; git-like versioning with lineage tracking; transformation pipelines; quality validation with schema checks and duplicate detection; GenStage streaming for large datasets. Built for reproducible AI research.
Structured causal reasoning chain logging for LLM transparency
ML training orchestration for the Crucible ecosystem. Distributed training, hyperparameter optimization, checkpointing, model versioning, metrics collection, early stopping, LR scheduling, gradient accumulation, and mixed precision training with Nx/Scholar integration.
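As one example from that feature list, patience-based early stopping fits in a few lines. The sketch below uses invented names rather than the library's real interface: it halts once the validation loss has stopped improving for `patience` evaluations in a row.

```elixir
defmodule EarlyStopSketch do
  defstruct best: nil, waited: 0, patience: 3, min_delta: 1.0e-4

  # The first evaluation just records the baseline loss.
  def step(%__MODULE__{best: nil} = state, val_loss),
    do: {:cont, %{state | best: val_loss}}

  def step(%__MODULE__{} = state, val_loss) do
    if val_loss < state.best - state.min_delta do
      # Improvement: remember it and reset the patience counter.
      {:cont, %{state | best: val_loss, waited: 0}}
    else
      waited = state.waited + 1

      if waited >= state.patience do
        {:halt, state}
      else
        {:cont, %{state | waited: waited}}
      end
    end
  end
end

# The loss plateaus after the third epoch, so training halts three
# evaluations later:
losses = [0.9, 0.7, 0.6, 0.61, 0.60, 0.62, 0.63]

Enum.reduce_while(losses, %EarlyStopSketch{}, fn loss, state ->
  EarlyStopSketch.step(state, loss)
end)
```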
CrucibleFramework: A scientific platform for LLM reliability research on the BEAM
Adversarial testing and robustness evaluation for the Crucible framework