Skip to content

stewartpark/private-llm

Repository files navigation

Private LLM

Private LLM

Your AI. Your infrastructure. Zero middlemen. Security built for zero trust.

Download

macOS App (signed) β€’ CLI β€’ Linux systemd


License Platform GCP

Stop trusting third parties with your data

Every prompt you send to cloud AI providers is logged. Stored. Possibly used to train their models.

Private LLM changes the game:

  • πŸ”’ 4096-bit RSA, TLS 1.3 β€” Exceeds typical enterprise standards
  • πŸ›οΈ HSM-backed key management β€” Hardware security module, 90-day auto-rotation
  • πŸ”„ Aggressive key rotation β€” Fresh certs every VM boot
  • πŸ›‘οΈ Zero-trust architecture β€” CA key never leaves your machine
# macOS: Download, sign GCP once, run `private-llm up`
# CLI: one-liner install, interactive setup, done

# Your tools think it's local Ollama
$ ollama run stewartpark/qwen3.5

Private LLM vs. the Cloud

Cloud AI Providers Private LLM
Your prompts Logged, stored, possibly trained on Never leave your infrastructure, certs auto-rotate
Cost Per token, opaque pricing GPU hourly, scales to zero
Control Their rates, their uptime, their rules You own the VM, you set idle timeout
Compliance Their SOC 2, their BAA Your GCP project, your KMS keys

Install

macOS App (Recommended)

  1. Download latest release
  2. Sign into GCP (one-time):
    gcloud auth application-default login
  3. Run up from the menu bar β†’ follow interactive prompts

Done. Menu bar icon shows status. No terminal needed.

CLI (All Platforms)

curl -fsSSL https://raw.githubusercontent.com/stewartpark/private-llm/main/misc/install.sh | sh

Then:

$ gcloud auth application-default login  # one-time
$ private-llm up                         # interactive setup
$ private-llm                            # start dashboard

Total time: ~5 min (first boot: 30 min for package installs; subsequent: 3-5 min)


How It Works

flowchart LR
    subgraph "Your Machine"
        A[Your Tools<br/>ollama CLI, Cursor, etc.]
        B[private-llm CLI<br/>Proxy daemon]
    end
    
    subgraph GCP[GCP Cloud]
        C{VM Running?}
        D[Compute API<br/>Start VM]
        E[Secret Manager<br/>Server certs + token]
        F[GPU VM<br/>Ollama]
    end
    
    A -->|localhost:11434| B
    B -->|request| C
    C -->|No| B
    B -->|1. Detect IP<br/>2. Open firewall<br/>3. Rotate certs<br/>4. Upload to SM| E
    B -->|5. Start VM| D
    D --> F
    F -->|6. Fetch certs at boot| E
    F -->|7. Boot Ollama| B
    C -->|Yes| F
    F -->|response| B
    B -->|SSE stream| A
    
    style A fill:#22c55e,stroke:#166534
    style B fill:#3b82f6,stroke:#1e40af,color:white
    style F fill:#8b5cf6,stroke:#6b21a8,color:white
    style E fill:#16a34a,stroke:#14532d,color:white
Loading
  1. Install (app or CLI) β€” CA private key stays on your machine
  2. Provision β€” private-llm up creates VPC, KMS HSM key, shielded VM
  3. Run β€” private-llm starts proxy with live TUI dashboard
  4. Use β€” Any Ollama tool works (localhost:11434)
  5. Scale to zero β€” VM auto-stops after 5 min idle ($0 when not in use)

Security Architecture

graph TB
    subgraph "Your Machine"
        A[CA Private Key<br/>~/.config/private-llm/certs/ca.key]
        B[Client Cert + Key<br/>~/.config/private-llm/certs/]
        P[private-llm Proxy<br/>localhost:11434]
    end
    
    subgraph GCP[GCP Cloud]
        subgraph "Key Management"
            C[KMS HSM Key<br/>Auto-rotate 90 days]
            D[Secret Manager<br/>Server certs + bearer token]
        end
        
        subgraph "Compute"
            E[Shielded VM<br/>Secure Boot + vTPM]
        end
    end
    
    subgraph "Defense Layers"
        F[mTLS Validation<br/>4096-bit RSA, TLS 1.3]
        G[Fingerprint Pinning<br/>SHA-256 in memory]
        H[Dynamic Firewall<br/>Your IP only]
    end
    
    A -.->|never leaves your machine| B
    B -.->|loads | P
    P ==>|mTLS request | E
    C -->|encrypts| D
    D -->|boot retrieval| E
    E -->|every request| F
    F -->|verifies| G
    H -->|IP-locked access| E
    
    style A fill:#dc2626,stroke:#991b1b,color:white
    style B fill:#ef4444,stroke:#991b1b
    style P fill:#3b82f6,stroke:#1e40af,color:white
    style C fill:#16a34a,stroke:#14532d,color:white
    style D fill:#16a34a,stroke:#14532d,color:white
    style F fill:#f59e0b,stroke:#92400e
    style G fill:#f59e0b,stroke:#92400e
    style H fill:#f59e0b,stroke:#92400e
Loading

Zero-trust model: CA key isolation means GCP cannot forge certificates or intercept traffic (only your machine can sign certs). Fingerprint pinning detects MITM attacks. Firewall rule deleted when you quit.


GPU Options

Type GPU VRAM Best For ~$/hr
g2-standard-4 L4 24GB 7B-13B models 0.25
g4-standard-48 RTX 6000 96GB 70B+ models (default) 1.80
a2-standard-12 A100 40GB Legacy 0.50
a3-standard-8 H100 80GB Cutting-edge 2.50

Monthly cost (g2-standard-4): $18 (always off) β†’ $28 (40 hrs) β†’ $58 (160 hrs) β†’ $200 (24/7)


Dashboard

Running private-llm opens a live TUI with real-time stats:

Private LLM Dashboard

Works With

Any Ollama-compatible tool:

  • CLI: ollama run llama3.2
  • Agents: opencode, Aider, Codex CLI, Claude Code (via ollama launch)
  • IDEs: Cursor, VS Code + Ollama extensions
  • Custom: OpenAI API compatible (just change base_url to http://localhost:11434)

Quick Reference

private-llm up                    # Provision infrastructure
private-llm down                  # Destroy infrastructure
private-llm                       # Start dashboard (proxy runs here)
private-llm rotate-mtls-ca        # Emergency: rotate all certs

TUI Controls: q quit | r restart | R reset (recreate) | S toggle VM

Config: ~/.config/private-llm/agent.json (see CONFIG.md for all options)

Docs: AGENTS.md β€” architecture & design | SECURITY.md β€” threat model & controls | Linux packaging


License

PolyForm Noncommercial 1.0.0 β€” Free for personal/internal use. Not for SaaS or resale.


Your infrastructure. Your control. No middlemen. Ever.

Releases β€’ Docs β€’ Issues

About

πŸ”’ Your own private LLM provider with enterprise-grade privacy

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Contributors