Last updated: 2026-03-16
- GenerateText / StreamText - Text generation with streaming (token-by-token, text-only, or blocking)
- GenerateObject[T] / StreamObject[T] - Type-safe structured output with auto JSON Schema from Go structs
- Embed / EmbedMany - Single and batch embeddings with auto-chunking + parallel execution
- GenerateImage - Text-to-image generation (OpenAI DALL-E, Google Imagen)
| Category | Providers |
|---|---|
| Flagship | OpenAI, Anthropic, Google (Gemini + Imagen) |
| Cloud platforms | AWS Bedrock (SigV4), Azure OpenAI, Google Vertex AI |
| Fast inference | Groq, Cerebras, Fireworks, Together, DeepInfra |
| Specialized | Mistral, xAI, DeepSeek, Cohere, Perplexity |
| Aggregators | OpenRouter |
| Local | Ollama, vLLM |
| Bring your own | compat.New() for any OpenAI-compatible endpoint |
- Tool system - Define tools with JSON Schema, auto tool loop with WithMaxSteps
- TokenSource - Static keys, OAuth-refreshed, cached credentials (lock-free network fetch, TTL-based)
- WithHTTPClient - Custom transport for proxies, auth middleware, Codex/Copilot patterns
- Prompt caching - WithPromptCaching() adds automatic cache_control on system messages (immutable, no input mutation)
- Retry/backoff - Exponential backoff on 429/5xx, retry-on-401 with token refresh
- Thread-safe - All providers safe for concurrent use; Bedrock fallback uses RWMutex for cross-region retry
- Telemetry hooks - WithOnRequest, WithOnResponse, WithOnToolCall, WithOnStepFinish
- SchemaFrom[T] - Reflection-based JSON Schema generation, OpenAI strict mode compatible
- Azure multi-model - Auto-routing: OpenAI models use Responses API, Claude uses Anthropic endpoint, others use Chat Completions
- Array content - Handles response content as either a string or [{type:"text",text:"..."}] (Mistral magistral models)
- Provider-defined tools - 20 tools across 5 providers: Anthropic (10), OpenAI (4), Google (3), xAI (2), Groq (1). E2E validated: 18 PASS, 12 SKIP (no credits/blocked), 0 FAIL.
- E2E validated - 96 models across 8 providers tested with real API calls (94 generate PASS, 96 stream PASS, 0 FAIL)
- Benchmarks - Go wins 5/6 categories vs Vercel AI SDK: streaming 1.13x, TTFC 1.26x, cold start 24.5x, memory 3.5x, GenerateText 1.41x
- Documentation - Full docs site, 20 provider pages, 16 runnable examples, API reference
| Feature | Description |
|---|---|
| Output.array | Stream validated array elements incrementally |
| Output.choice | Convenience enum selection wrapper |
| goai/otel | Pre-built OpenTelemetry integration (optional import) |
| xAI Responses API | Enable xAI provider-defined tools (web_search, x_search) via /v1/responses endpoint |
| New providers | Based on community requests |
GoAI reaches v1.0 when the API is complete enough that most Go+AI applications can be built without workarounds:
- Stable interfaces - LanguageModel, EmbeddingModel, ImageModel finalized with no planned breaking changes
- Full provider coverage - Every major AI provider works out of the box, including auth flows and regional endpoints
- Production observability - First-class OpenTelemetry integration, structured logging hooks, usage tracking
- Comprehensive documentation - Every exported type and function documented with examples, migration guides for common patterns
| Feature | Description |
|---|---|
| Agent | Multi-step agent abstraction with built-in tool loop and memory |
| MCP client | Connect to MCP servers, auto-convert tools to GoAI tools |
| Reranking | goai.Rerank() for search and retrieval pipelines |
| Speech | Server-side audio generation and transcription |
- Go-native API - Functional options, interfaces, composition. No TypeScript transliterations.
- Zero required dependencies - Core GoAI depends only on stdlib. Providers import only what they need.
- Provider-agnostic - Same code works across all providers. Switch models by changing one line.
- Consumer flexibility - WithHTTPClient + TokenSource let consumers handle auth, proxies, and custom endpoints without GoAI needing to know
- No middleware, no registry - Go's interface composition is sufficient. We don't add abstractions until proven necessary.
Have a feature request? Open an issue on GitHub. PRs are welcome; see CONTRIBUTING.md.