This project demonstrates how to build a multimodal RAG (Retrieval-Augmented Generation) pipeline for PDF documents. It uses the Unstructured library to extract text, tables, and images, and LangChain to process and retrieve that content.
- Extract text, tables, and images from PDFs using `Unstructured`.
- Retrieval-augmented generation (RAG) pipeline for multimodal content.
- LangChain integration for document processing and retrieval.
- Question answering over the processed documents: retrieve the relevant content and answer questions about it.
- Prerequisites
  - Python 3.10+
- Clone repository
  - `git clone https://github.com/Mercytopsy/Multimodal-RAG-Pipeline.git`
  - `cd Multimodal-RAG-Pipeline`
- Install dependencies
  - `pip install -r requirements.txt`
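
Once the dependencies are installed, the extraction step listed in the features could look roughly like the sketch below. It is not taken from the repository's code: the function names `load_elements` and `group_by_modality`, the file name `example.pdf`, and the chosen `partition_pdf` options are assumptions for illustration. The Unstructured import is deferred into `load_elements` so that the grouping logic can be tried with stand-in objects before any PDF tooling is set up.

```python
from types import SimpleNamespace


def load_elements(pdf_path):
    """Partition a PDF into Unstructured elements (hypothetical helper).

    The import is deferred so the rest of this sketch runs without
    Unstructured installed.
    """
    from unstructured.partition.pdf import partition_pdf

    return partition_pdf(
        filename=pdf_path,
        strategy="hi_res",                    # layout-aware parsing for tables and images
        infer_table_structure=True,           # keep an HTML rendering of each table in metadata
        extract_image_block_types=["Image"],  # which block types to extract as images
        extract_image_block_to_payload=True,  # store images as base64 in element metadata
    )


def group_by_modality(elements):
    """Split elements into text, tables, and images using their `category`."""
    groups = {"text": [], "tables": [], "images": []}
    for element in elements:
        category = getattr(element, "category", "")
        if category == "Table":
            groups["tables"].append(element)
        elif category == "Image":
            groups["images"].append(element)
        else:
            # Titles, narrative text, list items, and so on are treated as text.
            groups["text"].append(element)
    return groups


# Stand-in elements (assumed shape: objects exposing `category` and `text`),
# used here instead of a real PDF so the sketch runs on its own.
sample_elements = [
    SimpleNamespace(category="Title", text="Quarterly report"),
    SimpleNamespace(category="NarrativeText", text="Revenue grew this quarter."),
    SimpleNamespace(category="Table", text="Q1 Q2 Q3"),
    SimpleNamespace(category="Image", text=""),
]

groups = group_by_modality(sample_elements)
counts = {name: len(items) for name, items in groups.items()}
print(counts)
```

With a real document, `group_by_modality(load_elements("example.pdf"))` would be the equivalent call. Keeping the three modalities separate is one possible design choice: it lets each group be summarized or embedded in its own way before being indexed for retrieval.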
