Smart AIPI provides access to the full OpenAI model family at 75% less than OpenAI's direct pricing (you pay 25% of list rates). All models are accessed through standard OpenAI-compatible endpoints with no code changes required.
- API base URL: https://api.smartaipi.com/v1
- Anthropic-compatible base: https://api.smartaipi.com
- WebSocket: wss://api.smartaipi.com/v1/realtime
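Because the endpoints are OpenAI-compatible, any OpenAI client works once you point it at the base URL above. A minimal sketch using only the Python standard library (the API key is a placeholder):

```python
import json
import urllib.request

BASE_URL = "https://api.smartaipi.com/v1"
API_KEY = "YOUR_SMARTAIPI_KEY"  # placeholder

# Standard OpenAI-style Chat Completions payload; nothing beyond the
# base URL changes compared to calling OpenAI directly.
payload = {
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
}
req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)
# resp = urllib.request.urlopen(req)  # uncomment with a real key
```

The official `openai` SDK works the same way: pass `base_url="https://api.smartaipi.com/v1"` when constructing the client.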
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-5.4 | $0.625 | $0.0625 | $3.75 | State of the art: SWE-Bench Pro 57.7%, FrontierMath 47.6% |
| gpt-5.4-mini | $0.1875 | $0.01875 | $1.125 | 60% Terminal-Bench 2.0; best price-performance small model |
| gpt-5.4-nano | $0.05 | $0.005 | $0.3125 | Ultra-cheap routing, classification, summarization |
GPT-5.4 is OpenAI's most capable model. Benchmark highlights vs previous generation:
- OSWorld (computer use): 75.0% (up from 74.0%)
- SWE-Bench Pro (software engineering): 57.7% (up from 56.8%)
- GPQA Diamond (expert science reasoning): 92.8%
- FrontierMath (advanced math): 47.6%
- GDPval (knowledge work): 83.0% (up from 70.9%)
- Toolathlon (agentic tool use): 54.6%
GPT-5.4 also supports a 1M-token context window (up from 272K); tokens beyond 272K are billed at 2x. Priority processing is available via `service_tier: "priority"` at 1.5x standard rates.
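The long-context billing rule above reduces to simple arithmetic. A hypothetical helper (the 2x multiplier past 272K tokens and the $0.625/1M input rate come from this page; applying the rule uniformly per token type is an assumption):

```python
def gpt54_token_cost(tokens: int, rate_per_m: float, threshold: int = 272_000) -> float:
    """Dollar cost for one token type, assuming tokens past the 272K
    context threshold are billed at 2x the listed per-million rate."""
    base = min(tokens, threshold)
    extra = max(tokens - threshold, 0)
    return (base + 2 * extra) * rate_per_m / 1_000_000

# 1M input tokens at gpt-5.4's $0.625/1M rate:
# 272K billed at 1x, the remaining 728K at 2x.
cost = gpt54_token_cost(1_000_000, 0.625)
```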
GPT-5.4 Mini scores 60% on Terminal-Bench 2.0 — outperforming Gemini 3 Flash (47.7%) while costing less. Ideal for coding subagents and parallelized development workflows.
GPT-5.4 Nano supports Chat Completions only (not Responses API). Use for lightweight tasks: classification, summarization, routing, content moderation.
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-5.3-codex | $0.4375 | $0.045 | $3.50 | Previous frontier; excellent for coding agents |
| gpt-5.3-codex-mini | $0.0375 | $0.005 | $0.15 | Fast, cheap coding subagent |
| gpt-5.3-mini | $0.0375 | $0.005 | $0.15 | Fast small model |
GPT-5.3 Codex is OpenAI's previous frontier model and remains a top choice for coding agents (Codex CLI, OpenCode, Claude Code via Anthropic endpoint). Strong on SWE-Bench (56.8%) and all agentic benchmarks.
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-5.2-codex | $0.4375 | $0.045 | $3.50 | Solid coding frontier |
| gpt-5.2-codex-mini | $0.0375 | $0.005 | $0.15 | Fast coding model |
| gpt-5.2 | $0.4375 | $0.045 | $3.50 | General purpose |
| gpt-5.2-mini | $0.0375 | $0.005 | $0.15 | Lightweight tasks |
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-5.1 | $0.375 | $0.0375 | $3.00 | Balanced capability and cost |
| gpt-5.1-codex-mini | $0.0375 | $0.005 | $0.15 | |
| gpt-5.1-mini | $0.0375 | $0.005 | $0.15 | |
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-5 | $0.3125 | $0.03125 | $2.50 | Solid general-purpose model |
| gpt-5-mini | $0.0625 | $0.0075 | $0.50 | Low-cost, fast |
| gpt-5-nano | $0.0125 | $0.0025 | $0.10 | Ultra-cheap for high-throughput |
| gpt-5-codex-mini | $0.0375 | $0.005 | $0.15 | Coding-optimized small model |
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-4.1 | $0.50 | $0.125 | $2.00 | Reliable, well-tested |
| gpt-4.1-mini | $0.10 | $0.025 | $0.40 | Efficient mid-tier |
| gpt-4.1-nano | $0.025 | $0.0075 | $0.10 | Near-free for simple tasks |
| Model | Input / 1M | Cached Input / 1M | Output / 1M | Notes |
|---|---|---|---|---|
| gpt-4o | $0.625 | $0.3125 | $2.50 | Multimodal, broad support |
| gpt-4o-mini | $0.0375 | $0.02 | $0.15 | Low-cost multimodal |
Image generation is billed per image based on resolution and quality, not per token.
| Model | Notes |
|---|---|
| gpt-image-1.5 | Frontier image generation and editing |
| gpt-image-latest | Always resolves to the latest image model |
| gpt-image-1-mini | Faster, lower-cost image generation |
Endpoint: `POST /v1/images/generations`
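A request sketch for the endpoint above, again using only the standard library (the `size` value is an assumption; per-image billing depends on resolution and quality as noted above):

```python
import json
import urllib.request

payload = {
    "model": "gpt-image-1.5",
    "prompt": "A watercolor lighthouse at dusk",
    "size": "1024x1024",  # assumed value; billing varies by resolution/quality
}
req = urllib.request.Request(
    "https://api.smartaipi.com/v1/images/generations",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_SMARTAIPI_KEY",  # placeholder
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req)  # uncomment with a real key
```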
| Model | Notes |
|---|---|
| sora-2 | Up to 20 seconds of video from a text prompt or image |
| sora-2-pro | Higher-quality video, longer duration |
Endpoint: `POST /v1/videos`
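A sketch for the video endpoint. The `seconds` parameter name is an assumption not confirmed by this page; check the API reference before relying on it:

```python
import json
import urllib.request

payload = {
    "model": "sora-2",
    "prompt": "A paper boat drifting down a rainy street",
    "seconds": "8",  # assumed parameter; up to 20s per the table above
}
req = urllib.request.Request(
    "https://api.smartaipi.com/v1/videos",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_SMARTAIPI_KEY",  # placeholder
        "Content-Type": "application/json",
    },
)
# Video generation is typically asynchronous: expect the response to
# contain a job id to poll rather than the finished video itself.
```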
When using the Anthropic-compatible endpoint (`/v1/messages`), Claude model names are automatically routed to the closest GPT equivalent:
| You Request | Routes To | Output / 1M |
|---|---|---|
| claude-opus-4-6 | gpt-5.3-codex | $3.50 |
| claude-sonnet-4-5 | gpt-5.3-codex | $3.50 |
| claude-haiku-4-5 | gpt-5.3-codex-mini | $0.15 |
| Any other claude-* model | gpt-5.3-codex | $3.50 |
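A request sketch for the Anthropic-compatible endpoint, following the Anthropic Messages API shape (`max_tokens` is required there; the `anthropic-version` header value shown is an assumption):

```python
import json
import urllib.request

payload = {
    "model": "claude-sonnet-4-5",  # routed to gpt-5.3-codex per the table
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Refactor this function."}],
}
req = urllib.request.Request(
    "https://api.smartaipi.com/v1/messages",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "x-api-key": "YOUR_SMARTAIPI_KEY",        # placeholder
        "anthropic-version": "2023-06-01",        # assumed version string
        "content-type": "application/json",
    },
)
# urllib.request.urlopen(req)  # uncomment with a real key
```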
Prompt caching is automatic — no configuration needed. Requests with long, repeated system prompts (common in agentic workflows) benefit automatically. Cache hit rates of 30–50% are typical for agent loops with consistent system prompts.
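One way to confirm caching is kicking in is to inspect the `usage` object returned with each response. A sketch assuming the standard OpenAI `prompt_tokens_details.cached_tokens` field is passed through:

```python
def cached_fraction(usage: dict) -> float:
    """Share of prompt tokens served from the cache, given a
    Chat Completions usage object."""
    details = usage.get("prompt_tokens_details") or {}
    total = usage.get("prompt_tokens", 0)
    return details.get("cached_tokens", 0) / total if total else 0.0

# Example usage object from one agent-loop turn:
usage = {
    "prompt_tokens": 12_000,
    "prompt_tokens_details": {"cached_tokens": 4_800},
    "completion_tokens": 300,
}
share = cached_fraction(usage)  # 0.4, within the typical 30-50% range
```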
| Use Case | Recommended Model | Why |
|---|---|---|
| Best quality coding | gpt-5.4 | State-of-the-art SWE-Bench, complex multi-file tasks |
| Coding subagents | gpt-5.4-mini | 60% Terminal-Bench at 70% less cost than gpt-5.4 |
| Routing, classification, summaries | gpt-5.4-nano | Near-free at $0.05/1M input |
| Coding agents (proven) | gpt-5.3-codex | Excellent track record, slightly lower cost than 5.4 |
| Fast lightweight tasks | gpt-5-mini | Good balance of speed and capability |
| High-volume pipelines | gpt-5-nano / gpt-5.4-nano | Lowest cost per token |
| Image generation | gpt-image-1.5 | Frontier image quality |
| Video generation | sora-2 | Text-to-video up to 20s |
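The recommendations above fold into a trivial routing helper. The task labels are illustrative only, not an API feature:

```python
# Illustrative mapping of the use-case table above; not an API feature.
MODEL_FOR_TASK = {
    "best-coding": "gpt-5.4",
    "coding-subagent": "gpt-5.4-mini",
    "classification": "gpt-5.4-nano",
    "coding-agent-proven": "gpt-5.3-codex",
    "fast-lightweight": "gpt-5-mini",
    "high-volume": "gpt-5-nano",
    "image": "gpt-image-1.5",
    "video": "sora-2",
}

def pick_model(task: str, default: str = "gpt-5.4-mini") -> str:
    """Return the recommended model for a task label, falling back to a
    cheap general-purpose default for unrecognized tasks."""
    return MODEL_FOR_TASK.get(task, default)
```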