docs: add Text Generation Inference (TGI) section to using_different_models by santoshray02 · Pull Request #2209 · huggingface/smolagents

santoshray02 · 2026-04-19T17:43:29Z

Adds a "Using Text Generation Inference (TGI) Models" section to the "Using different models" guide, following the same pattern as the existing Gemini / OpenRouter / Grok sections.

The section covers three common scenarios:

- Pointing LiteLLMModel at a Hugging Face Inference Endpoint
- Authenticating with HF_TOKEN for private endpoints
- Running TGI locally via Docker and connecting to http://localhost:8080/v1/

Details verified against:

- LiteLLM's huggingface provider docs (model_id prefix huggingface/tgi; api_base with a /v1/ suffix for the OpenAI-compatible Messages API)
- TGI's quicktour (Docker image tag 3.3.5, port mapping 8080:80)
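
For readers of this PR, the configuration described above can be sketched roughly as follows. This is a minimal illustration, not the text added by the PR: the helper name `tgi_model_kwargs` is hypothetical, and the default URL assumes a local TGI container published on port 8080 as in the quicktour. For a Hugging Face Inference Endpoint, pass that endpoint's own URL with the /v1/ suffix instead.

```python
import os


def tgi_model_kwargs(api_base="http://localhost:8080/v1/", token=None):
    """Build keyword arguments for smolagents' LiteLLMModel (hypothetical helper).

    api_base: the TGI server's OpenAI-compatible base URL, ending in /v1/.
    token:    optional Hugging Face token; falls back to the HF_TOKEN env var.
    """
    kwargs = {
        "model_id": "huggingface/tgi",  # LiteLLM provider prefix named in the PR
        "api_base": api_base,
    }
    token = token or os.environ.get("HF_TOKEN")
    if token:
        # Only private endpoints need a key; omit it when no token is available.
        kwargs["api_key"] = token
    return kwargs


if __name__ == "__main__":
    # Requires smolagents with the litellm extra installed and a reachable TGI server.
    from smolagents import CodeAgent, LiteLLMModel

    model = LiteLLMModel(**tgi_model_kwargs())
    agent = CodeAgent(tools=[], model=model)
    agent.run("What is the 10th Fibonacci number?")
```

Keeping the token out of the source and reading it from HF_TOKEN matches the second scenario in the list; when no token is set, the key is simply left out.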

Docs-only change; no code touched.

…models Document how to point LiteLLMModel at a TGI endpoint — covering Hugging Face Inference Endpoints, authenticated private endpoints, and running TGI locally via Docker. Verified against LiteLLM's huggingface provider docs and TGI's quicktour (image tag 3.3.5, /v1/ Messages API path). Closes huggingface#1567

santoshray02 wants to merge 1 commit into huggingface:main from santoshray02:docs/add-tgi-section
