Transform AbstractCore into an OpenAI-compatible API server. One server, all models, any client.
If you want a dedicated single-model /v1 server (one provider/model per worker), see Endpoint.
While the server is running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
Swagger UI exposes an Authorize button. When ABSTRACTCORE_SERVER_API_KEY is set,
enter that value there; requests executed from the docs page will send it as
Authorization: Bearer <token>. The docs and OpenAPI schema are public by default so
the UI can load before authentication, but API operations remain protected. Set
ABSTRACTCORE_SERVER_PROTECT_DOCS=1 if you also want /docs, /redoc, and
/openapi.json behind server auth.
# Install
pip install "abstractcore[server]"
# Configure server auth and provider keys
export ABSTRACTCORE_SERVER_API_KEY="acore-server-secret"
export OPENAI_API_KEY="sk-..."
# Start server
python -m abstractcore.server.app
# Or with uvicorn directly
uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000
# Test
curl http://localhost:8000/health
# Response: {"status":"healthy"}curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ABSTRACTCORE_SERVER_API_KEY" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'

Or with Python:
import os
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_SERVER_API_KEY"])
response = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{"role": "user", "content": "Explain quantum computing"}]
)
print(response.choices[0].message.content)

You can configure the server through environment variables or through AbstractCore's centralized config. Environment variables always take precedence over config-persisted values.
# Persisted local/server config
abstractcore --set-server-api-key acore-server-secret
abstractcore --set-api-key openai sk-...
abstractcore --set-api-key anthropic sk-ant-...
abstractcore --set-api-key openrouter sk-or-...
abstractcore --set-api-key portkey pk_...
# Optional hardening/defaults
abstractcore --set-server-base-url-allowlist "https://example.com/v1"
abstractcore --set-server-url-fetch-allowlist "https://files.example.com"
abstractcore --set-server-media-root /srv/abstractcore-media
abstractcore --set-server-host 127.0.0.1
abstractcore --set-server-port 8000

# Provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENROUTER_API_KEY="sk-or-..."
export PORTKEY_API_KEY="pk_..." # optional (Portkey)
export PORTKEY_CONFIG="pcfg_..." # required for Portkey routing
# Server master key. Authenticated clients can use all server-configured providers.
export ABSTRACTCORE_SERVER_API_KEY="acore-server-secret"
# Optional: also protect /docs, /redoc, and /openapi.json.
export ABSTRACTCORE_SERVER_PROTECT_DOCS=1
# Local providers
export OLLAMA_BASE_URL="http://localhost:11434" # (or legacy: OLLAMA_HOST)
export LMSTUDIO_BASE_URL="http://localhost:1234/v1"
export VLLM_BASE_URL="http://localhost:8000/v1"
export OPENAI_COMPATIBLE_BASE_URL="http://localhost:1234/v1"
export OPENAI_COMPATIBLE_API_KEY="your-endpoint-key" # optional, if the endpoint requires auth
# Server bind (only used by `python -m abstractcore.server.app`)
export HOST="0.0.0.0"
export PORT="8000"
# Debug mode
export ABSTRACTCORE_DEBUG=true
# Dangerous (multi-tenant hazard): allow unload_after for providers that can unload shared server state (e.g. Ollama)
export ABSTRACTCORE_ALLOW_UNSAFE_UNLOAD_AFTER=1
# Server security controls (recommended)
#
# - Request-level base_url overrides are loopback-only by default.
# URL entries match scheme + exact host + default/explicit port + path-segment prefix.
# Bare entries match hostname globs, e.g. "*.example.com".
export ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST="https://api.openai.com,https://example.com/v1"
#
# - Remote URL fetches for attachments are blocked for private/loopback/link-local targets by default (SSRF protection).
# To allow specific hosts/prefixes, use the same structured allowlist syntax:
export ABSTRACTCORE_SERVER_URL_FETCH_ALLOWLIST="https://www.berkshirehathaway.com"
#
# - Local file paths in HTTP requests are disabled by default (including @/path/to/file in message strings).
# To allow local file paths safely, restrict them under a single directory:
export ABSTRACTCORE_SERVER_MEDIA_ROOT="/srv/abstractcore-media"
#
# - Unsafe escape hatch: allow arbitrary local file paths from HTTP requests (not recommended)
export ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1

# Using AbstractCore's built-in CLI
python -m abstractcore.server.app --help # View all options
python -m abstractcore.server.app --debug # Debug mode
python -m abstractcore.server.app --host 127.0.0.1 --port 8080 # Custom host/port
python -m abstractcore.server.app --debug --port 8001 # Debug on custom port
# Using uvicorn directly
uvicorn abstractcore.server.app:app --reload # Development with auto-reload
uvicorn abstractcore.server.app:app --workers 4 # Production with multiple workers
uvicorn abstractcore.server.app:app --port 3000 # Custom port

Endpoint: POST /v1/chat/completions
Standard OpenAI-compatible endpoint. Works with all providers.
Server auth:
- If ABSTRACTCORE_SERVER_API_KEY is configured, every non-health endpoint requires Authorization: Bearer $ABSTRACTCORE_SERVER_API_KEY. Authenticated clients can use all provider keys/endpoints configured on the server.
- If ABSTRACTCORE_SERVER_API_KEY is not configured, Authorization: Bearer <provider-key> may be used as a bring-your-own upstream provider key. That key is forwarded only to the requested provider and never unlocks server-configured provider keys.
- Health checks (GET /health) are always unauthenticated.
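As a quick illustration of the second rule, a minimal Python sketch (assuming server auth is disabled and an upstream OpenAI key in your environment; the key is forwarded only to the openai provider):

import os
from openai import OpenAI

# The bearer token here is an upstream provider key, not a server key.
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["OPENAI_API_KEY"])
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response.choices[0].message.content)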
Request:
{
"model": "provider/model-name",
"messages": [
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 1000,
"stream": false
}

Key Parameters:
- model (required): Prefer "provider/model-name" (e.g., "openai/gpt-4o-mini"). If you pass a bare model name (no /), the server will best-effort auto-detect a provider.
- messages (required): Array of message objects
- stream (optional): Enable streaming responses
- tools (optional): Tools for function calling
- agent_format (optional, AbstractCore extension): Tool-call syntax output format for agentic clients ("auto" | "openai" | "codex" | "qwen3" | "llama3" | "gemma" | "xml" | "passthrough"). When omitted, the server auto-detects from user-agent + model heuristics.
- api_key (deprecated/disabled, AbstractCore extension): Provider API keys are no longer accepted in request bodies or query strings. Configure provider keys on the server, use X-AbstractCore-Provider-API-Key for a per-request provider override, or use Authorization as a provider key only when ABSTRACTCORE_SERVER_API_KEY is not configured.
- base_url (optional, AbstractCore extension): Override the provider endpoint (include /v1 for OpenAI-compatible servers like LM Studio / vLLM / OpenRouter)
- unload_after (optional, AbstractCore extension): If true, calls llm.unload_model(model) after the request completes. Disabled for ollama/* unless ABSTRACTCORE_ALLOW_UNSAFE_UNLOAD_AFTER=1.
- prompt_cache_key (optional, AbstractCore extension): Best-effort prompt caching key (semantics depend on provider/backend). See docs/prompt-caching.md.
- prompt_cache_retention (optional, AbstractCore extension): Prompt cache retention policy (OpenAI: "in_memory" or "24h"; ignored by other providers). See docs/prompt-caching.md.
- thinking (optional, AbstractCore extension): Unified thinking/reasoning control (null | "auto" | "on" | "off" | "none", or "low" | "medium" | "high" | "xhigh" when supported). Note: "none" is treated as an alias for "off".
- temperature, max_tokens, top_p: Standard LLM parameters
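For example, a request combining several of these extensions (a sketch; the model name and cache key are illustrative):

import os
import requests

requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['ABSTRACTCORE_SERVER_API_KEY']}"},
    json={
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello!"}],
        "agent_format": "openai",            # explicit tool-call syntax
        "prompt_cache_key": "docs-demo-v1",  # best-effort prompt caching
        "thinking": "off",                   # unified thinking control
        "max_tokens": 200
    }
)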
The server forwards thinking to the underlying provider using AbstractCore’s unified thinking mapping (see Generation Parameters).
Example (route to LM Studio + Qwen3.5, disable thinking):
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "lmstudio/qwen3.5-27b@q4_k_m",
"base_url": "http://localhost:1234/v1",
"messages": [{"role": "user", "content": "Compute 17*23 - 19*11. Reply with the integer only."}],
"thinking": "none",
"max_tokens": 64
}'

Notes:
- For Qwen3 / Qwen3.5 on LM Studio, thinking="none" maps to LM Studio's template variables (enable_thinking / enableThinking) plus a Qwen template "hard switch" fallback (empty <think></think>) when needed. This avoids injecting "reasoning effort" instructions into the system prompt.
- Not every backend supports per-effort budgets for low|medium|high; when unavailable, levels degrade to "thinking enabled".
Example with streaming:
import os
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_SERVER_API_KEY"])
stream = client.chat.completions.create(
model="ollama/qwen3-coder:30b",
messages=[{"role": "user", "content": "Write a story"}],
stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Route a provider to a specific endpoint (useful for remote OpenAI-compatible servers):
Security notes:
- Request-level base_url overrides are loopback-only by default. To allow additional origins or host globs, set ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST. URL entries are parsed and matched on scheme, exact host, effective port, and path-segment prefix.
- If the server has an environment provider key set (e.g. OPENAI_API_KEY) and you route to a non-loopback base_url, the request is refused unless the provider key was supplied explicitly with X-AbstractCore-Provider-API-Key, or with Authorization when server auth is disabled.
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "lmstudio/qwen/qwen3-4b-2507",
"base_url": "http://localhost:1234/v1",
"messages": [{"role": "user", "content": "Hello from a remote LM Studio endpoint"}]
}'

Do not put provider keys in request bodies or query strings. Those fields are disabled because they leak through logs, shell history, browser history, and reverse proxies.
# Preferred: configure provider keys on the server and authenticate to AbstractCore.
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ABSTRACTCORE_SERVER_API_KEY" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'

When ABSTRACTCORE_SERVER_API_KEY is not configured, Authorization: Bearer <provider-key> may
be used as an upstream provider key. Once server auth is enabled, Authorization is reserved for
the AbstractCore server key and is never forwarded upstream.
To override a single upstream provider while still using the server master key, send the provider
key in X-AbstractCore-Provider-API-Key. The override applies only to the requested provider:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ABSTRACTCORE_SERVER_API_KEY" \
-H "X-AbstractCore-Provider-API-Key: $ANTHROPIC_API_KEY" \
-d '{
"model": "anthropic/claude-haiku-4-5",
"messages": [{"role": "user", "content": "Hello!"}]
}'

AbstractCore Server can optionally expose OpenAI-compatible image generation and audio endpoints.
Important notes:
- These are interoperability-first endpoints (they return b64_json or raw bytes), not an artifact-first durability contract.
- If the required plugin/backend is not available, the server returns 501 with actionable messaging.
Endpoints:
- POST /v1/images/generations
- POST /v1/images/edits
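For example, a request to the generations endpoint (a sketch assuming an OpenAI-style b64_json response body; the model name is illustrative, and availability depends on the configured backend/upstream):

import base64
import os
import requests

resp = requests.post(
    "http://localhost:8000/v1/images/generations",
    headers={"Authorization": f"Bearer {os.environ['ABSTRACTCORE_SERVER_API_KEY']}"},
    json={"model": "openai/dall-e-3", "prompt": "A lighthouse at dawn", "n": 1}
)
image_b64 = resp.json()["data"][0]["b64_json"]
with open("lighthouse.png", "wb") as f:
    f.write(base64.b64decode(image_b64))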
Remote OpenAI-compatible image proxying is included in abstractcore[server]
and is enabled by setting ABSTRACTCORE_VISION_UPSTREAM_BASE_URL.
Install for remote image proxying:
pip install "abstractcore[server]"Install local image backends only when you want the server to load Diffusers or stable-diffusion.cpp models itself:
pip install "abstractcore[server,vision]"Endpoints:
- POST /v1/audio/transcriptions (multipart; file=...)
- POST /v1/audio/speech (json; input=..., optional voice, optional format)
Remote provider routing is enabled when model is supplied in provider/model format:
- openai/gpt-4o-mini-transcribe, openai/whisper-1
- openai/gpt-4o-mini-tts, openai/tts-1
- openrouter/... for OpenRouter STT/TTS models
- portkey/... for Portkey-routed OpenAI-compatible audio models
- openai-compatible/... for endpoints that implement OpenAI-compatible audio routes
If model is omitted, the endpoint delegates to local capability plugins
(typically abstractvoice) and returns 501 when no suitable plugin is installed.
Install for remote audio:
pip install "abstractcore[server,remote]"Install for local plugin fallback:
pip install "abstractcore[server]"
pip install abstractvoice

Notes:
- /v1/audio/transcriptions requires python-multipart for form parsing (included in the server extra).
- Uploaded audio is limited by ABSTRACTCORE_SERVER_AUDIO_MAX_BYTES (default: 25 MB).
Examples:
# Remote speech-to-text (STT)
curl -X POST http://localhost:8000/v1/audio/transcriptions \
-H "Authorization: Bearer $ABSTRACTCORE_SERVER_API_KEY" \
-F "file=@speech.wav" \
-F "model=openai/gpt-4o-mini-transcribe" \
-F "language=en"
# Remote text-to-speech (TTS)
curl -X POST http://localhost:8000/v1/audio/speech \
-H "Authorization: Bearer $ABSTRACTCORE_SERVER_API_KEY" \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-4o-mini-tts","input":"Hello!","voice":"alloy","response_format":"mp3"}' \
--output hello.mp3

If you want to "ask a model about an audio file", prefer one of the following (a sketch of the first approach follows below):
- Run STT first (/v1/audio/transcriptions), then send the transcript to POST /v1/chat/completions, or
- Configure the server's default audio strategy (config.audio.strategy) to enable STT fallback for audio attachments, then attach audio in chat requests.
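A minimal Python sketch of the first approach (assumptions: the server runs locally, an OpenAI key is configured on the server, and the transcription endpoint returns an OpenAI-style {"text": ...} body):

import os
import requests

BASE = "http://localhost:8000"
HEADERS = {"Authorization": f"Bearer {os.environ['ABSTRACTCORE_SERVER_API_KEY']}"}

# 1) Transcribe the audio file.
with open("speech.wav", "rb") as f:
    stt = requests.post(
        f"{BASE}/v1/audio/transcriptions",
        headers=HEADERS,
        files={"file": f},
        data={"model": "openai/gpt-4o-mini-transcribe"}
    )
transcript = stt.json()["text"]

# 2) Ask a chat model about the transcript.
chat = requests.post(
    f"{BASE}/v1/chat/completions",
    headers=HEADERS,
    json={
        "model": "openai/gpt-4o-mini",
        "messages": [{"role": "user", "content": f"Summarize this recording:\n\n{transcript}"}]
    }
)
print(chat.json()["choices"][0]["message"]["content"])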
AbstractCore server supports comprehensive file attachments using OpenAI-compatible multimodal message format, plus AbstractCore's convenient @filename syntax.
Security note (HTTP server): local file paths are disabled by default (including @/path/to/file and {"url": "/path/to/file"}).
Use http(s) URLs or data: base64, or enable local paths via ABSTRACTCORE_SERVER_MEDIA_ROOT (safe) / ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1 (unsafe).
- Images: PNG, JPEG, GIF, WEBP, BMP, TIFF
- Documents: PDF, DOCX, XLSX, PPTX
- Data/Text: CSV, TSV, TXT, MD, JSON, XML
- Size Limits: 10MB per file, 32MB total per request
Simple syntax that works with all providers (requires local paths enabled via ABSTRACTCORE_SERVER_MEDIA_ROOT or ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1):
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{"role": "user", "content": "What is in this document? @/path/to/report.pdf"}
]
}'

Standard OpenAI format for images:
{
"model": "anthropic/claude-haiku-4-5",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{
"type": "image_url",
"image_url": {
"url": "https://example.com/image.jpg"
}
}
]
}
]
}

Base64 Images:
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD..."
}
}

AbstractCore supports OpenAI's planned file format with simplified structure (consistent with image_url):
File URL Format (Recommended - Same Pattern as image_url):
{
"model": "ollama/qwen3:4b",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Analyze this document"},
{
"type": "file",
"file_url": {
"url": "https://example.com/documents/report.pdf"
}
}
]
}
]
}

Local File Path:
{
"type": "file",
"file_url": {
"url": "/Users/username/documents/data.csv"
}
}

Note: local file paths require ABSTRACTCORE_SERVER_MEDIA_ROOT (safe) or ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1 (unsafe) on the server.
Base64 Data URL:
{
"type": "file",
"file_url": {
"url": "data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iago<PAovVHlwZS..."
}
}

Filename Extraction:
- URLs/Paths: extracted automatically (/path/file.pdf → file.pdf)
- Base64: generated from the MIME type (data:application/pdf;base64,... → document.pdf)
Combine text, images, and documents in a single request:
{
"model": "openai/gpt-4o",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Compare this chart with the data in the spreadsheet"},
{
"type": "image_url",
"image_url": {"url": "data:image/png;base64,iVBORw0KGgoAAAANS..."}
},
{
"type": "file",
"file_url": {
"url": "https://example.com/data/sales_data.xlsx"
}
}
]
}
]
}

Using OpenAI Client:
import os
from openai import OpenAI
import base64
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_SERVER_API_KEY"])
# Method 1: @filename syntax
response = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{"role": "user", "content": "Summarize @document.pdf"}]
)
# Method 2: File URL (HTTP/HTTPS)
response = client.chat.completions.create(
model="openai/gpt-4o",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What are the key findings?"},
{
"type": "file",
"file_url": {
"url": "https://example.com/documents/report.pdf"
}
}
]
}]
)
# Method 3: Local file path
response = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "Analyze this local document"},
{
"type": "file",
"file_url": {
"url": "/Users/username/documents/report.pdf"
}
}
]
}]
)
# Method 4: Base64 data URL
with open("report.pdf", "rb") as f:
file_data = base64.b64encode(f.read()).decode()
response = client.chat.completions.create(
model="lmstudio/qwen/qwen3-next-80b",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What are the key findings?"},
{
"type": "file",
"file_url": {
"url": f"data:application/pdf;base64,{file_data}"
}
}
]
}]
)

Universal Provider Support:
# Same syntax works across all providers
providers_models = [
"openai/gpt-4o",
"anthropic/claude-haiku-4-5",
"ollama/qwen2.5vl:7b",
"lmstudio/qwen/qwen2.5-vl-7b"
]
for model in providers_models:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Analyze @data.csv and @chart.png"}]
    )
    print(f"{model}: {response.choices[0].message.content[:100]}...")

Endpoint: POST /v1/responses
AbstractCore implements an OpenAI-compatible Responses-style API, including input_file support.
- OpenAI Compatible: Drop-in replacement for OpenAI's Responses API
- Native File Support: input_file type designed specifically for document attachments
- Cleaner API: Explicit separation between text (input_text) and files (input_file)
- Backward Compatible: The existing messages format still works alongside the new input format
- Optional Streaming: Streaming is opt-in with "stream": true (defaults to false)
OpenAI Responses API Format (Recommended):
{
"model": "gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Analyze this document"},
{"type": "input_file", "file_url": "https://example.com/report.pdf"}
]
}
],
"stream": false,
"max_tokens": 2000,
"temperature": 0.7
}

Legacy Format (Still Supported):
{
"model": "openai/gpt-4",
"messages": [
{"role": "user", "content": "Tell me a story"}
],
"stream": false
}

The server automatically detects which format you're using:
- OpenAI Format: presence of the input field → converted to the internal format
- Legacy Format: presence of the messages field → processed directly
- Error: missing both fields → returns a 400 error with a clear message
Simple Text Request:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "lmstudio/qwen/qwen3-next-80b",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "What is Python?"}
]
}
]
}'

File Analysis:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Analyze the letter and summarize key points"},
{"type": "input_file", "file_url": "https://www.berkshirehathaway.com/letters/2024ltr.pdf"}
]
}
]
}'

Multiple Files:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-haiku-4-5",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Compare these documents"},
{"type": "input_file", "file_url": "https://example.com/report1.pdf"},
{"type": "input_file", "file_url": "https://example.com/report2.pdf"},
{"type": "input_file", "file_url": "https://example.com/chart.png"}
]
}
],
"max_tokens": 2000
}'

Streaming Response:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Summarize this document"},
{"type": "input_file", "file_url": "https://example.com/document.pdf"}
]
}
],
"stream": true
}' --no-buffer

All file types supported via URL, local path, or base64:
- Documents: PDF, DOCX, XLSX, PPTX
- Data Files: CSV, TSV, JSON, XML
- Text Files: TXT, MD
- Images: PNG, JPEG, GIF, WEBP, BMP, TIFF
- Size Limits: 10MB per file, 32MB total per request
Source Options:
// HTTP/HTTPS URL
{"type": "input_file", "file_url": "https://example.com/report.pdf"}
// Local file path
{"type": "input_file", "file_url": "/path/to/document.xlsx"}
// Base64 data URL
{"type": "input_file", "file_url": "data:application/pdf;base64,JVBERi0x..."}import os
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_SERVER_API_KEY"])
# Direct request to /v1/responses endpoint
import requests
response = requests.post(
"http://localhost:8000/v1/responses",
json={
"model": "gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Analyze this document"},
{"type": "input_file", "file_url": "https://example.com/report.pdf"}
]
}
]
}
)
result = response.json()
print(result["choices"][0]["message"]["content"])Endpoint: POST /v1/embeddings
Generate embedding vectors for semantic search, RAG, and similarity analysis.
Request:
{
"input": "Text to embed",
"model": "huggingface/sentence-transformers/all-MiniLM-L6-v2"
}

Supported Providers:
- HuggingFace: Local models with ONNX acceleration
- Ollama: ollama/granite-embedding:278m, etc.
- LMStudio: Any loaded embedding model
- OpenAI: openai/text-embedding-3-small, openai/text-embedding-3-large
- OpenRouter: openrouter/openai/text-embedding-3-small, etc.
- Portkey: portkey/... with your Portkey routing configuration
- OpenAI-compatible: openai-compatible/... against configured or local /v1/embeddings endpoints
Anthropic does not expose a native embeddings API. Use OpenAI, OpenRouter, Portkey, an OpenAI-compatible endpoint, or a local embedding provider.
OpenAI-compatible request fields are forwarded where supported:
- dimensions
- encoding_format
- user
- base_url (AbstractCore extension; loopback-only by default, allowlist required for non-loopback)
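A sketch with the OpenAI SDK (the model name and dimensions value are illustrative; dimensions is forwarded only where the upstream provider supports it):

import os
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_SERVER_API_KEY"])
emb = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input="Text to embed",
    dimensions=256
)
print(len(emb.data[0].embedding))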
Batch Embedding:
curl -X POST http://localhost:8000/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": ["text 1", "text 2", "text 3"],
"model": "ollama/granite-embedding:278m"
}'

Endpoint: GET /v1/models
List all available models from configured providers.
Query Parameters:
- provider: Filter by provider (e.g., ollama, openai)
- type: Filter by type (text-generation or text-embedding)
Examples:
# All models
curl http://localhost:8000/v1/models
# Ollama models only
curl "http://localhost:8000/v1/models?provider=ollama"
# Embedding models only
curl "http://localhost:8000/v1/models?type=text-embedding"
# Ollama embeddings
curl "http://localhost:8000/v1/models?provider=ollama&type=text-embedding"

Endpoint: GET /providers
List all available providers and their status.
Response:
{
"providers": [
{
"name": "ollama",
"type": "llm",
"model_count": 15,
"status": "available"
}
]
}

Endpoint: GET /health
Server health check for monitoring.
Response: {"status": "healthy"}
AbstractCore Server is OpenAI-compatible. Most OpenAI-compatible CLIs/SDKs can be pointed at it by setting:
OPENAI_BASE_URL="http://localhost:8000/v1"(or an equivalent flag)OPENAI_API_KEY="unused"(many clients require a non-empty key even for local servers)
- The server does not execute tools (it always returns tool calls; your host/runtime executes them).
- It can emit tool calls either as structured tool_calls (OpenAI/Codex style) or as tagged content for clients that parse tool calls from assistant text.
- Control the output format with agent_format (request body, AbstractCore extension), or rely on auto-detection (user-agent + model heuristics).
Supported agent_format values: auto, openai, codex, qwen3, llama3, gemma, xml, passthrough.
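Since the server only returns tool calls, the host-side loop looks roughly like this (a sketch with the OpenAI SDK; get_weather and its fake result are illustrative):

import json
import os
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_SERVER_API_KEY"])
tools = [{"type": "function", "function": {
    "name": "get_weather",
    "description": "Get weather by city",
    "parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}
}}]
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

resp = client.chat.completions.create(model="openai/gpt-4o-mini", messages=messages, tools=tools)
msg = resp.choices[0].message
if msg.tool_calls:
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        result = f"Sunny in {args['city']}"  # your runtime executes the tool here
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    final = client.chat.completions.create(model="openai/gpt-4o-mini", messages=messages, tools=tools)
    print(final.choices[0].message.content)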
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="unused"
codex --model "ollama/qwen3-coder:30b" "Write a factorial function"curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ollama/qwen3:4b-instruct-2507-q4_K_M",
"messages": [{"role": "user", "content": "Use the tool."}],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather by city",
"parameters": {
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}
}
],
"agent_format": "llama3"
}'

Release images are published to GitHub Container Registry after the matching PyPI release succeeds:
ghcr.io/lpalbou/abstractcore-server:<version>

The image is built from PyPI, not from the repository checkout, and installs:
abstractcore[server,remote,media,tokens,compression]==<version>

It includes remote chat/responses, remote embeddings, remote STT/TTS routing,
remote OpenAI-compatible image proxying, server dependencies, media parsing,
token counting, and compression helpers. It intentionally does not include
local model runtimes (vllm, mlx, huggingface, local Diffusers/sdcpp
vision backends) or local embedding dependencies (sentence-transformers).
Run:
docker pull ghcr.io/lpalbou/abstractcore-server:2.13.4

For local development, keep secrets in an uncommitted .env file:
ABSTRACTCORE_SERVER_API_KEY=replace-with-a-server-token
OPENAI_API_KEY=sk-...
OPENROUTER_API_KEY=sk-or-...
ANTHROPIC_API_KEY=sk-ant-...
PORTKEY_API_KEY=pk_...
PORTKEY_CONFIG=pcfg_...
OPENAI_COMPATIBLE_BASE_URL=http://host.docker.internal:1234/v1
OPENAI_COMPATIBLE_API_KEY=optional

Then run the image with that environment file:
docker run --rm --name abstractcore-server \
-p 127.0.0.1:8000:8000 \
--env-file .env \
  ghcr.io/lpalbou/abstractcore-server:2.13.4

ABSTRACTCORE_SERVER_API_KEY is the AbstractCore server auth token. Clients
send it as Authorization: Bearer <token>, including from Swagger UI's
Authorize button. Provider keys such as OPENAI_API_KEY, OPENROUTER_API_KEY,
ANTHROPIC_API_KEY, and PORTKEY_API_KEY stay inside the server container.
Set ABSTRACTCORE_SERVER_PROTECT_DOCS=1 if /docs, /redoc, and
/openapi.json should require the same server token.
For local OpenAI-compatible endpoints such as LM Studio or Ollama's /v1
server, point the container at a URL reachable from Docker:
docker run --rm --name abstractcore-server \
-p 127.0.0.1:8000:8000 \
-e ABSTRACTCORE_SERVER_API_KEY="$ABSTRACTCORE_SERVER_API_KEY" \
-e OPENAI_COMPATIBLE_BASE_URL="http://host.docker.internal:1234/v1" \
-e OPENAI_COMPATIBLE_API_KEY="$OPENAI_COMPATIBLE_API_KEY" \
  ghcr.io/lpalbou/abstractcore-server:2.13.4

version: '3.8'
services:
  abstractcore:
    image: ghcr.io/lpalbou/abstractcore-server:2.13.4
    ports:
      - "8000:8000"
    environment:
      - ABSTRACTCORE_SERVER_API_KEY=${ABSTRACTCORE_SERVER_API_KEY}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - PORTKEY_API_KEY=${PORTKEY_API_KEY}
      - PORTKEY_CONFIG=${PORTKEY_CONFIG}
      - OPENAI_COMPATIBLE_BASE_URL=${OPENAI_COMPATIBLE_BASE_URL}
      - OPENAI_COMPATIBLE_API_KEY=${OPENAI_COMPATIBLE_API_KEY}
    restart: unless-stopped

pip install gunicorn
gunicorn abstractcore.server.app:app \
--worker-class uvicorn.workers.UvicornWorker \
--workers 4 \
  --bind 0.0.0.0:8000

Debug mode provides comprehensive logging and detailed error reporting for troubleshooting API issues.
# Method 1: Using command line flag (recommended)
python -m abstractcore.server.app --debug
# Method 2: Using environment variable
export ABSTRACTCORE_DEBUG=true
python -m abstractcore.server.app
# Method 3: With uvicorn directly
export ABSTRACTCORE_DEBUG=true
uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000

Enhanced Error Reporting:
- Before: Uninformative "422 Unprocessable Entity" messages
- After: Detailed field validation errors with request body capture
Example Debug Output:
🔴 Request Validation Error (422) | method=POST | error_count=2 | errors=[
{"field": "body -> model", "message": "Field required", "type": "missing"},
{"field": "body -> messages", "message": "Field required", "type": "missing"}
] | client=127.0.0.1
📋 Request Body (Validation Error) | body={"invalid": "data"}

Request/Response Tracking:
- Full HTTP request details (method, URL, headers, client IP)
- Response status codes and processing times
- Structured JSON logging for machine processing
Log Files:
- logs/abstractcore_TIMESTAMP.log - Structured events
- logs/YYYYMMDD-payloads.jsonl - Full request bodies
- logs/verbatim_TIMESTAMP.jsonl - Complete I/O
Useful Commands:
# Find errors
grep '"level": "error"' logs/abstractcore_*.log
# Track token usage
cat logs/verbatim_*.jsonl | jq '.metadata.tokens | .input + .output' | \
awk '{sum+=$1} END {print "Total:", sum}'
# Monitor specific model
grep '"model": "qwen3-coder:30b"' logs/verbatim_*.jsonlimport requests
providers = [
"ollama/qwen3-coder:30b",
"openai/gpt-4o-mini",
"anthropic/claude-haiku-4-5"
]
def generate_with_fallback(prompt):
    for model in providers:
        try:
            response = requests.post(
                "http://localhost:8000/v1/chat/completions",
                json={"model": model, "messages": [{"role": "user", "content": prompt}]},
                timeout=30
            )
            if response.status_code == 200:
                return response.json()
        except Exception:
            continue
    raise Exception("All providers failed")

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen3-coder:30b
# Use via AbstractCore server
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ollama/qwen3-coder:30b",
"messages": [{"role": "user", "content": "Write a Python function"}]
}'

# Check port availability
lsof -i :8000
# Use different port
uvicorn abstractcore.server.app:app --port 3000

# Check providers
curl http://localhost:8000/providers
# Check API keys
echo $OPENAI_API_KEY
# Start Ollama
ollama serve
ollama list

# Set API keys
export ABSTRACTCORE_SERVER_API_KEY="acore-server-secret"
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
# Restart server after setting keys

- Universal: One API for all providers
- OpenAI Compatible: Drop-in replacement
- Simple: Clean, focused endpoints
- Fast: Lightweight, high-performance
- Debuggable: Comprehensive logging
- CLI Ready: Codex, Gemini CLI, Crush support
- Production Ready: Docker, multi-worker, health checks
- Getting Started - Core library quick start
- Architecture - System architecture including server
- Python API Reference - Core library API
- Embeddings Guide - Embeddings deep dive
- Troubleshooting - Common issues and solutions
AbstractCore Server - One server, all models, any client.