CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Development Commands

The project uses uv for package management and Make for common tasks:

  • Setup environment: make setup (installs all dependencies with uv sync --all-extras)
  • Run tests: make check-tests or uv run pytest (uses VCR cassettes by default)
  • Type checking: make check-types or uv run pyright
  • Linting/formatting: make check-format (check) or make format (fix)
  • Full checks: make check (runs format, type, and test checks)
  • Build package: make build (creates dist/ with built package)
  • Run single test: uv run pytest tests/test_specific_file.py::TestClass::test_method -v
  • Update snapshots: make update-snaps (for syrupy snapshot tests)
  • Update VCR cassettes: make update-snaps-vcr (re-records HTTP interactions, requires API keys)
  • Check VCR secrets: make check-vcr-secrets (scans cassettes for leaked credentials)
  • Documentation: make docs (build) or make docs-preview (serve locally)

Project Architecture

Core Components

Chat System: The main Chat class in _chat.py manages conversation state and provider interactions. It's a generic class that works with different providers through the Provider abstract base class.
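
As a quick orientation, here is a minimal usage sketch (ChatOpenAI is part of the public API; the exact constructor arguments vary by provider):

from chatlas import ChatOpenAI

# Construct a Chat backed by the OpenAI provider
chat = ChatOpenAI(system_prompt="You are a terse assistant.")

# Submit a user turn and get the assistant's reply
chat.chat("What is the capital of France?")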

Provider Pattern: All LLM providers (OpenAI, Anthropic, Google, etc.) inherit from Provider in _provider.py. Each provider (e.g., _provider_openai.py) implements:

  • Model-specific parameter handling
  • API client configuration
  • Request/response transformation
  • Tool calling integration

Content System: The _content.py module defines structured content types:

  • ContentText: Plain text messages
  • ContentImage: Image content (inline, remote, or file-based)
  • ContentToolRequest/ContentToolResult: Tool interaction messages
  • ContentJson: Structured data responses
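
A rough sketch of how these types compose (field names are assumptions, not verified against _content.py):

from chatlas._content import ContentText  # internal module described above

# A turn's content is a sequence of Content objects rather than a raw string;
# image parts would use one of the ContentImage variants (inline/remote/file)
parts = [ContentText(text="Describe the attached image.")]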

Tool System: Tools are defined in _tools.py and allow LLMs to call Python functions. The system supports:

  • Function registration with automatic schema generation
  • Tool approval workflows
  • MCP (Model Context Protocol) server integration via _mcp_manager.py
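
For example, registering a plain function as a tool (a sketch assuming Chat.register_tool() and schema generation from type hints, per the description above):

from chatlas import ChatOpenAI

def get_weather(city: str) -> str:
    """Return a short weather summary for a city."""
    return f"It is sunny in {city}."

chat = ChatOpenAI()
# The tool's JSON schema is generated from get_weather's signature and type hints
chat.register_tool(get_weather)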

Turn Management: Turn objects in _turn.py represent individual conversation exchanges, containing sequences of Content objects.
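
A sketch of inspecting the conversation history (accessor names assumed; see _turn.py):

from chatlas import ChatOpenAI

chat = ChatOpenAI()
chat.chat("Hello!")

# Each completed exchange is recorded as a Turn holding Content objects
for turn in chat.get_turns():
    print(turn.role, turn.contents)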

Key Patterns

  1. Provider Abstraction: All providers implement the same interface but handle model-specific details internally
  2. Generic Typing: Heavy use of TypeVars and generics for type safety across providers
  3. Streaming Support: Both sync and async streaming responses via ChatResponse/ChatResponseAsync (see the streaming sketch after this list)
  4. Content-Based Messaging: All communication uses structured Content objects rather than raw strings
  5. Tool Integration: Seamless function calling with automatic JSON schema generation from Python type hints
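
A minimal streaming sketch (assuming Chat.stream() yields text chunks via the ChatResponse types named above):

from chatlas import ChatOpenAI

chat = ChatOpenAI()

# Iterate the response as it arrives instead of waiting for the full reply
for chunk in chat.stream("Tell me a limerick about type hints"):
    print(chunk, end="")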

Typing Best Practices

This project prioritizes strong typing that leverages provider SDK types directly:

  • Use provider SDK types: Import and use types from openai.types, anthropic.types, google.genai.types, etc. rather than creating custom TypedDicts or dataclasses that mirror them. This ensures compatibility with SDK updates and provides better IDE support.
  • Use @overload for provider-specific returns: When a method returns different types based on a provider argument, use @overload with Literal types to give callers precise return type information (see the sketch after this list).
  • Explore SDK types interactively: Use python -c "from <sdk>.types import <Type>; print(<Type>.__annotations__)" to inspect available fields and nested types when implementing provider-specific features.
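
For example, the @overload pattern might look like this (a generic sketch, not code from the repository):

from typing import Literal, overload

from anthropic.types import Message
from openai.types.chat import ChatCompletion

@overload
def perform(provider: Literal["openai"]) -> ChatCompletion: ...
@overload
def perform(provider: Literal["anthropic"]) -> Message: ...
def perform(provider: str) -> ChatCompletion | Message:
    ...  # dispatch to the matching provider implementation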

Testing Structure

  • Tests are organized by component (e.g., test_provider_openai.py, test_tools.py)
  • Snapshot testing with syrupy for response validation
  • MCP server tests use local test servers in tests/mcp_servers/
  • Async tests configured via pytest.ini with asyncio_mode=strict

VCR Testing (HTTP Recording/Replay)

Tests use pytest-recording (wrapping vcrpy) to record and replay HTTP interactions:

  • Cassettes: YAML files stored in tests/_vcr/ organized by test module
  • Default mode: Tests replay cassettes without making live API calls
  • Recording: Use make update-snaps-vcr or uv run pytest --record-mode=rewrite (requires real API keys)
  • Dummy credentials: Auto-set by conftest.py when env vars are missing, enabling VCR replay without secrets

Adding VCR to tests:

import pytest

from .conftest import make_vcr_config, VCR_MATCH_ON_WITHOUT_BODY

# Most tests use the default config (matches on request body)
@pytest.mark.vcr
def test_provider_simple():
    ...

# For tests with dynamic request bodies (temp files, generated IDs)
@pytest.fixture(scope="module")
def vcr_config():
    return make_vcr_config(match_on=VCR_MATCH_ON_WITHOUT_BODY)

Tests requiring live API (skip in VCR mode):

import pytest

from .conftest import is_dummy_credential

@pytest.mark.skipif(
    is_dummy_credential("ANTHROPIC_API_KEY"),
    reason="This test requires live API calls",
)
def test_multi_sample():
    ...

Providers incompatible with VCR: Bedrock and Snowflake require live API tests due to auth mechanisms.

See docs/dev/vcr-tests.md for comprehensive documentation.

Documentation

Documentation is built with Quarto and quartodoc:

  • API reference generated from docstrings in chatlas/ modules
  • Guides and examples in docs/ as .qmd files
  • Type definitions in chatlas/types/ provide provider-specific parameter types

Adding New Providers

When implementing a new LLM provider, follow this systematic approach:

1. Research Phase

  • Check ellmer first: Look in ../ellmer/R/provider-*.R for existing implementations
  • Identify base provider: Most providers inherit from either OpenAIProvider (for OpenAI-compatible APIs) or implement Provider directly
  • Check existing patterns: Review similar providers in chatlas/_provider_*.py

2. Implementation Steps

  1. Create provider file: chatlas/_provider_[name].py

    • Use PascalCase for class names (e.g., MistralProvider)
    • Name the user-facing chat function with the Chat prefix in PascalCase (e.g., ChatMistral); use snake_case for internal helper functions
    • Follow existing docstring patterns with Prerequisites, Examples, Parameters, Returns sections
  2. Provider class structure:

    class [Name]Provider(OpenAIProvider):  # or Provider if custom
        def __init__(self, ...):
            super().__init__(...)
            # Provider-specific initialization
        
        def _chat_perform_args(self, ...):
            # Customize request parameters if needed
            kwargs = super()._chat_perform_args(...)
            # Apply provider-specific modifications
            return kwargs
  3. Chat function signature:

    def Chat[Name](
        *,
        system_prompt: Optional[str] = None,
        model: Optional[str] = None,
        api_key: Optional[str] = None,
        base_url: str = "https://...",
        seed: int | None | MISSING_TYPE = MISSING,
        kwargs: Optional["ChatClientArgs"] = None,
    ) -> Chat["SubmitInputArgs", ChatCompletion]:

3. Testing Setup

  1. Create test file: tests/test_provider_[name].py
  2. Add environment variable skip pattern:
    import os
    import pytest
    
    do_test = os.getenv("TEST_[NAME]", "true")
    if do_test.lower() == "false":
        pytest.skip("Skipping [Name] tests", allow_module_level=True)
  3. Add VCR support (for most providers):
    @pytest.mark.vcr
    def test_[name]_simple_request():
        ...
    For async tests, put @pytest.mark.vcr before @pytest.mark.asyncio.
  4. Use standard test patterns:
    • test_[name]_simple_request()
    • test_[name]_simple_streaming_request()
    • test_[name]_respects_turns_interface()
    • test_[name]_tool_variations() (if supported)
    • test_data_extraction()
    • test_[name]_images() (if vision supported)
  5. Record VCR cassettes:
    # Set real API key, then record
    export [PROVIDER]_API_KEY="..."
    uv run pytest tests/test_provider_[name].py -v --record-mode=rewrite

4. Package Integration

  1. Update chatlas/__init__.py:

    • Add import: from ._provider_[name] import Chat[Name]
    • Add to __all__ tuple: "Chat[Name]"
  2. Run validation:

    uv run pyright chatlas/_provider_[name].py
    uv run pytest tests/test_provider_[name].py -v  # Replays VCR cassettes
    uv run python -c "from chatlas import Chat[Name]; print('Import successful')"
    make check-vcr-secrets  # Ensure no secrets leaked in cassettes

5. Provider-Specific Customizations

OpenAI-Compatible Providers:

  • Inherit from OpenAIProvider
  • Override _chat_perform_args() for API differences
  • Common customizations: remove stream_options, adjust parameter names, modify headers (see the sketch below)
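
A sketch of such an override (the kwargs-dict shape follows the template in section 2; treat the details as assumptions):

from chatlas._provider_openai import OpenAIProvider  # path per the file naming above

class ExampleProvider(OpenAIProvider):  # hypothetical provider
    def _chat_perform_args(self, *args, **kwargs):
        args_ = super()._chat_perform_args(*args, **kwargs)
        # This API rejects OpenAI's stream_options parameter, so drop it
        args_.pop("stream_options", None)
        return args_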

Custom API Providers:

  • Inherit from Provider directly
  • Implement all abstract methods: chat_perform(), chat_perform_async(), stream_text(), etc.
  • Handle model-specific response formats

6. Common Patterns

  • Environment variables: Use [PROVIDER]_API_KEY format
  • Default models: Use provider's recommended general-purpose model
  • Seed handling: when seed is MISSING, resolve it as seed = 1014 if is_testing() else None (see the sketch after this list)
  • Error handling: Provider APIs often return different error formats
  • Rate limiting: Consider implementing client-side throttling for providers that need it
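
The seed-resolution idiom as a sketch (MISSING, MISSING_TYPE, and is_testing are assumed to live in chatlas's internal utilities):

from chatlas._utils import MISSING, MISSING_TYPE, is_testing  # import path assumed

def resolve_seed(seed: int | None | MISSING_TYPE = MISSING) -> int | None:
    # Deterministic completions under test; leave seeding to the API otherwise
    if isinstance(seed, MISSING_TYPE):
        return 1014 if is_testing() else None
    return seed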

7. Documentation Requirements

  • Include provider description and prerequisites
  • Document known limitations (tool calling, vision support, etc.)
  • Provide working examples with environment variable usage
  • Note any special model requirements (e.g., vision models for images)

Connections to ellmer

This project is the Python equivalent of the R package ellmer. The source code for ellmer is available in a sibling directory to this project. Before implementing new features or bug fixes in chatlas, it may be useful to consult the ellmer codebase to: (1) check whether the feature/fix already exists on the R side and (2) make sure the projects are aligned in terms of stylistic approaches. Note also that ellmer itself has a CLAUDE.md file which has a useful overview of the project.

Differences from ellmer

One important difference to note is that ellmer::chat_openai() uses the completions API, while chatlas.ChatOpenAI() uses the responses API. Look to ellmer::chat_openai_responses() and chatlas.ChatOpenAICompletions() for the "non-default" API.

Access to gh CLI

If ellmer or chatlas issues or PRs are referenced, try using the gh CLI tool to gain the necessary context. The repositories live at tidyverse/ellmer and posit-dev/chatlas.
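
For example (standard gh subcommands; the issue/PR numbers are placeholders):

gh issue view 123 --repo posit-dev/chatlas
gh pr view 456 --repo tidyverse/ellmer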