Image Text Extractor

An OCR application that extracts text from images.

Features

Extract text from uploaded images
Process multiple image formats (PNG, JPG, JPEG, GIF, WEBP)
User-friendly Streamlit interface
RESTful API endpoints
Integration with LangChain for advanced text processing
Together AI Vision model integration

Prerequisites

Python 3.12 or higher
Poetry package manager
Together AI API key

Installation

1. Clone the repository:

   git clone https://github.com/yourusername/ImageTextExtractor.git
   cd ImageTextExtractor

2. Install dependencies using Poetry:

   poetry install

Usage

Streamlit UI

1. Start the FastAPI backend:

   poetry run python main.py

2. In a new terminal, launch the Streamlit interface:

   poetry run streamlit run ui.py

3. Open your browser and navigate to http://localhost:8501

4. Enter your Together AI API key

5. Upload an image and wait for the extracted text to appear

REST API

The application exposes a REST API endpoint for OCR processing.

Endpoint: POST /ocr

Request:

URL: http://localhost:8000/ocr
Method: POST
Content-Type: multipart/form-data

Parameters:

file: Image file (supported formats: PNG, JPG, JPEG, GIF, WEBP)
api_key: Together AI API key
system_prompt: (Optional) Custom prompt for the vision model

Example using curl:

   curl -X POST http://localhost:8000/ocr \
     -F "file=@/path/to/your/image.jpg" \
     -F "api_key=your_together_ai_api_key" \
     -F "system_prompt=Convert the provided image into text"
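
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not the project's own client: the helper names build_multipart and post_ocr are hypothetical, and because the response schema is not documented here, the body is returned as raw text rather than parsed.

```python
import mimetypes
import os
import urllib.request
import uuid

OCR_URL = "http://localhost:8000/ocr"  # URL from the Request section


def build_multipart(fields, filename, file_bytes):
    """Encode text fields plus one file part named 'file' as multipart/form-data.

    Hypothetical helper for illustration; returns (body, content_type_header).
    """
    boundary = uuid.uuid4().hex
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    parts = []
    # Plain text fields such as api_key and system_prompt.
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The image itself, sent under the "file" parameter.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f'Content-Type: {file_type}\r\n\r\n'.encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def post_ocr(image_path, api_key, system_prompt=None, url=OCR_URL):
    """POST an image to /ocr and return the raw response body as text."""
    with open(image_path, "rb") as fh:
        file_bytes = fh.read()
    fields = {"api_key": api_key}
    if system_prompt is not None:  # system_prompt is optional
        fields["system_prompt"] = system_prompt
    body, content_type = build_multipart(
        fields, os.path.basename(image_path), file_bytes
    )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Example call (requires the backend to be running on port 8000):
# print(post_ocr("/path/to/your/image.jpg", "your_together_ai_api_key"))
```

Keeping the encoding step separate from the network call makes it possible to inspect the request body without a running server.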

Response:

To run the test suite:

   poetry run pytest

Environment Variables

The application uses the following settings, defined in config.py:

LOGGING_LEVEL: Default is "INFO"
SUPPORTED_IMAGE_TYPES: [".png", ".jpg", ".jpeg", ".gif", ".webp"]
TOGETHER_MODEL_NAME: "meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo"
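
As a rough illustration, these settings could be expressed and used as follows. This is a sketch under assumptions, not the contents of the actual config.py: the helpers load_logging_level and is_supported_image are hypothetical, and reading LOGGING_LEVEL from an environment variable of the same name is an assumption based only on this section's title.

```python
import os
from pathlib import Path

# Values copied from the list above.
SUPPORTED_IMAGE_TYPES = [".png", ".jpg", ".jpeg", ".gif", ".webp"]
TOGETHER_MODEL_NAME = "meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo"


def load_logging_level(env=os.environ):
    """Return the logging level, falling back to "INFO".

    Assumption: the level can be overridden through a LOGGING_LEVEL
    environment variable; the source does not confirm this.
    """
    return env.get("LOGGING_LEVEL", "INFO").upper()


def is_supported_image(filename):
    """Check a file name against SUPPORTED_IMAGE_TYPES, ignoring case."""
    return Path(filename).suffix.lower() in SUPPORTED_IMAGE_TYPES
```

A check like is_supported_image("scan.PNG") lets an upload be rejected before any call is made to the vision model.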

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Together AI for providing the vision model
LangChain for the AI integration framework
Streamlit for the user interface
FastAPI for the REST API implementation
