scribe-worker

Worker for the Sunet transcription service (Sunet Scribe).

Author

This project is developed by Sunet. Contributor: Kristofer Hallin.

License

This project is licensed under the Apache License, Version 2.0. See LICENSE for details.

Contributing

Contributions are welcome! Please feel free to open issues or submit pull requests.

Features

Transcription Processing: Processes audio/video transcription jobs from the backend queue
Whisper.cpp Integration: Uses whisper.cpp for efficient local transcription
Multiple Output Formats: Generates JSON and SRT transcription outputs
Multi-worker Support: Run multiple workers in parallel for increased throughput
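To make the SRT output format concrete, here is a minimal sketch of how timed transcript segments map onto SRT text. The segment shape (dicts with `start`, `end` in seconds, and `text`) and both function names are assumptions made for illustration; how the worker actually produces its SRT files is not shown in this README.

```python
def format_srt_timestamp(seconds: float) -> str:
    """Format an offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Render segments as SRT cues: index, time range, text, blank line.

    The segment dict layout is hypothetical, not the worker's data model.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_timestamp(seg["start"])
        end = format_srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Joining with "\n" leaves one blank line between consecutive cues.
    return "\n".join(blocks)
```

SRT uses a comma, not a period, before the milliseconds, which is the detail most often gotten wrong when generating the format by hand.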

Requirements

Python 3.13+
uv (recommended package manager)
whisper.cpp (must be built separately)
FFmpeg (for audio/video processing)
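A quick way to confirm the requirements above before setting up the environment is a small preflight check. This is a sketch, not part of the project: the helper names are hypothetical, and it only checks the Python version and the tools expected on PATH (`uv`, `ffmpeg`). whisper.cpp is left out because it is located through WHISPER_CPP_PATH rather than PATH.

```python
import shutil
import sys


def missing_tools(tools: list[str]) -> list[str]:
    """Return the subset of command names that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_is_supported(minimum: tuple[int, int] = (3, 13)) -> bool:
    """Check that the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.13 or newer is required")
    for tool in missing_tools(["uv", "ffmpeg"]):
        print(f"Not found on PATH: {tool}")
```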

Development Environment Setup

1. Clone and Install Dependencies

git clone <repository-url>
cd scribe-worker
uv sync

2. Build whisper.cpp

Build and install whisper.cpp from source. See https://github.com/ggml-org/whisper.cpp for detailed instructions.

3. Download Whisper Models

./download_models.sh

4. Configure Environment Variables

Create a .env file in the project root with the following settings:

# Debug mode
DEBUG=True

# Backend API configuration
API_BACKEND_URL="http://localhost:8000"
API_VERSION="v1"

# Worker configuration
WORKERS=2
WHISPER_CPP_PATH=<Path to whisper.cpp>
FILE_STORAGE_DIR=<Your file storage directory>

5. Run the Worker

uv run main.py --foreground --debug

Docker

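Build and run with Docker. Note that docker run --env-file passes values verbatim, so any quotes around values in .env become part of the value; remove them if you run the worker this way: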

docker build -t scribe-worker .
docker run --env-file .env scribe-worker

Project Structure

scribe-worker/
├── main.py              # Worker entry point
├── utils/               # Utility modules
├── models/              # Whisper model files
└── downloaded/          # Downloaded files for processing
