
Contributing to EXO

Thank you for your interest in contributing to EXO!

Getting Started

To run EXO from source:

Prerequisites:

  • uv (for Python dependency management)
    brew install uv
  • rust (to build Rust bindings, nightly for now)
    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    rustup toolchain install nightly
  • macmon (for hardware monitoring on Apple Silicon) - install the fork revision pinned by this repo rather than the Homebrew macmon package:
    cargo install --git https://github.com/swiftraccoon/macmon \
      --rev 9154d234f763fbeffdcb4135d0bbbaf80609699b \
      macmon \
      --force
Then clone the repository, build the dashboard, and run EXO:

git clone https://github.com/exo-explore/exo.git
cd exo/dashboard
npm install && npm run build && cd ..
uv run exo

Development

EXO is built with a mix of Rust, Python, and TypeScript (Svelte for the dashboard), and the codebase is actively evolving. Before starting work:

  • Pull the latest source to ensure you're working with the most recent code
  • Keep your changes focused - implement one feature or fix per pull request
  • Avoid combining unrelated changes, even if they seem small

This makes reviews faster and helps us maintain code quality as the project evolves.

Code Style

Write pure functions where possible. When adding new code, prefer Rust unless there's a good reason otherwise. Leverage the type systems available to you - Rust's type system, Python type hints, and TypeScript types. Comments should explain why you're doing something, not what the code does - especially for non-obvious decisions.

Run nix fmt to auto-format your code before submitting.

Model Cards

EXO uses TOML-based model cards to define model metadata and capabilities. Model cards are stored in:

  • resources/inference_model_cards/ for text generation models
  • resources/image_model_cards/ for image generation models
  • ~/.exo/custom_model_cards/ for user-added custom models

Adding a Model Card

To add a new model, create a TOML file with the following structure:

model_id = "mlx-community/Llama-3.2-1B-Instruct-4bit"
n_layers = 16
hidden_size = 2048
supports_tensor = true
tasks = ["TextGeneration"]
family = "llama"
quantization = "4bit"
base_model = "Llama 3.2 1B"
capabilities = ["text"]

[storage_size]
in_bytes = 729808896

Required Fields

  • model_id: Hugging Face model identifier
  • n_layers: Number of transformer layers
  • hidden_size: Hidden dimension size
  • supports_tensor: Whether the model supports tensor parallelism
  • tasks: List of supported tasks (TextGeneration, TextToImage, ImageToImage)
  • family: Model family (e.g., "llama", "deepseek", "qwen")
  • quantization: Quantization level (e.g., "4bit", "8bit", "bf16")
  • base_model: Human-readable base model name
  • capabilities: List of capabilities (e.g., ["text"], ["text", "thinking"])

Optional Fields

  • components: For multi-component models (like image models with separate text encoders and transformers)
  • uses_cfg: Whether the model uses classifier-free guidance (for image models)
  • trust_remote_code: Whether to allow remote code execution (defaults to false for security)

Capabilities

The capabilities field defines what the model can do:

  • text: Standard text generation
  • thinking: Model supports chain-of-thought reasoning
  • thinking_toggle: Thinking can be enabled/disabled via enable_thinking parameter
  • image_edit: Model supports image-to-image editing (FLUX.1-Kontext)
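
For example, a card for a reasoning model whose thinking can be switched off would combine these values (capability names from the list above; the fragment is illustrative):

```toml
capabilities = ["text", "thinking", "thinking_toggle"]
```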

Security Note

By default, trust_remote_code is set to false for security. Only enable it if the model explicitly requires remote code execution from the Hugging Face hub.

API Adapters

EXO supports multiple API formats through an adapter pattern. Adapters convert API-specific request formats to the internal TextGenerationTaskParams format and convert internal token chunks back to API-specific responses.

Adapter Architecture

All adapters live in src/exo/master/adapters/ and follow the same pattern:

  1. Convert API-specific requests to TextGenerationTaskParams
  2. Handle both streaming and non-streaming response generation
  3. Convert internal TokenChunk objects to API-specific formats
  4. Manage error handling and edge cases

Existing Adapters

  • chat_completions.py: OpenAI Chat Completions API
  • claude.py: Anthropic Claude Messages API
  • responses.py: OpenAI Responses API
  • ollama.py: Ollama API (for OpenWebUI compatibility)

Adding a New API Adapter

To add support for a new API format:

  1. Create a new adapter file in src/exo/master/adapters/
  2. Implement a request conversion function:
    def your_api_request_to_text_generation(
        request: YourAPIRequest,
    ) -> TextGenerationTaskParams:
        # Convert API request to internal format
        pass
  3. Implement streaming response generation:
    async def generate_your_api_stream(
        command_id: CommandId,
        chunk_stream: AsyncGenerator[TokenChunk | ErrorChunk | ToolCallChunk, None],
    ) -> AsyncGenerator[str, None]:
        # Convert internal chunks to API-specific streaming format
        pass
  4. Implement non-streaming response collection:
    async def collect_your_api_response(
        command_id: CommandId,
        chunk_stream: AsyncGenerator[TokenChunk | ErrorChunk | ToolCallChunk, None],
    ) -> AsyncGenerator[str, None]:
        # Collect all chunks and return single response
        pass
  5. Register the adapter endpoints in src/exo/master/api.py

The adapter pattern keeps API-specific logic isolated from core inference systems. Internal systems (worker, runner, event sourcing) only see TextGenerationTaskParams and TokenChunk objects - no API-specific types cross the adapter boundary.

For detailed API documentation, see docs/api.md.

Testing

EXO relies heavily on manual testing at this point in the project, but this is evolving. Before submitting a change, test the behavior both before and after your change to demonstrate the improvement. Do the best you can with the hardware you have available - if you need help testing, ask and we'll do our best to assist. Add automated tests where possible - we're actively working to substantially improve our automated testing story.
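
Where a pure function is involved, even a tiny automated test adds value. The helper below is hypothetical, defined inline purely to illustrate the shape of such a test:

```python
# Hypothetical example: exercise a pure helper directly, no hardware needed.
def normalize_model_id(model_id: str) -> str:
    """Lowercase and strip surrounding whitespace from a model identifier."""
    return model_id.strip().lower()

def test_normalize_model_id():
    assert normalize_model_id("  MLX-Community/Model ") == "mlx-community/model"
    assert normalize_model_id("qwen") == "qwen"
```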

Submitting Changes

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/your-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin feature/your-feature)
  5. Open a Pull Request and follow the PR template

Reporting Issues

If you find a bug or have a feature request, please open an issue on GitHub with:

  • A clear description of the problem or feature
  • Steps to reproduce (for bugs)
  • Expected vs actual behavior
  • Your environment (macOS version, hardware, etc.)

Questions?

Join our community: