qax is a small proof-of-concept implementation of retrieval question answering (Q&A) that stores embeddings for code repositories or any folder full of text files. The code is a simple adaptation of the langchain examples, packaged in Docker containers.
With qax, you can create embeddings for folder structures containing text files, such as code repositories, using OpenAI's text-embedding-ada-002 model, and store them in a pgvector container.
After the index has been built, the app uses langchain's RetrievalQA chain with the gpt-3.5-turbo model (or gpt-4 if available) to perform Q&A on the indexed data
(see the langchain QA docs).
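The retrieval step can be illustrated without any external service. The following is a minimal sketch, not qax's actual code: the three-dimensional vectors stand in for real text-embedding-ada-002 embeddings, the names `cosine_similarity`, `top_k`, and `toy_index` are hypothetical, and cosine similarity is assumed as the ranking metric because it is a common choice (the metric qax uses is not stated here). In the real app, the best-matching chunks are then passed to the chat model as context.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, treat it as unrelated
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k sources whose vectors are most similar to query_vec.

    index is a list of (source, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), source) for source, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [source for _, source in scored[:k]]


# Toy vectors (assumption): real embeddings have far more dimensions.
toy_index = [
    ("requirements.txt", [0.9, 0.1, 0.0]),
    ("app.py", [0.1, 0.9, 0.2]),
    ("README.md", [0.5, 0.5, 0.1]),
]
query = [1.0, 0.2, 0.0]

print(top_k(query, toy_index))  # prints ['requirements.txt', 'README.md']
```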
Build a local Docker image to run the app:
docker build -t qax .
To get started, clone any repository and navigate to its root directory:
git clone https://github.com/your_username/example.git && cd example
Create two env files, .env and db.env, in the root directory of the repository with the following contents:
.env:
OPENAI_API_KEY=your_openai_api_key_here
# Database Connection
PGVECTOR_HOST=IP_OF_HOST
PGVECTOR_PORT=5432
PGVECTOR_COLLECTION=qax
db.env:
POSTGRES_USER=victor
POSTGRES_PASSWORD=vector
POSTGRES_DB=vectordb
PGDATA=/.vectordb
Start the pgvector database container (Linux/macOS shell):
docker run -d \
--name pgvector \
--env-file=db.env \
-p 5432:5432 \
-v $PWD/.vectordb:/var/lib/postgresql \
ankane/pgvector
On Windows (PowerShell), use backticks for line continuation:
docker run -d `
--name pgvector `
--env-file=db.env `
-p 5432:5432 `
-v ${PWD}/.vectordb:/var/lib/postgresql `
ankane/pgvector
Copy load-ext.sh into the running container and execute it:
docker cp ./load-ext.sh pgvector:/load-ext.sh && \
docker exec pgvector /load-ext.sh
Run the following command to create embeddings for the files in the repository:
docker run --rm -it \
--env-file=.env \
--env-file=db.env \
-v ${PWD}:/repository \
qax --index
This will, by default, create a .vectordb folder in your repository to store the index.
To change the name, adjust the PGDATA variable in db.env.
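Because the .vectordb folder lives inside the repository, an indexer has to avoid reading its own database files. The sketch below shows one way the indexing pass could collect and chunk text files using only the standard library. It is an illustration under stated assumptions, not qax's implementation: the helper names `iter_text_files` and `split_text`, the skip list, and the chunk sizes are all hypothetical.

```python
import os


def iter_text_files(root, skip_dirs=(".git", ".vectordb")):
    """Yield (path, text) for every UTF-8 readable file under root.

    Directories named in skip_dirs are pruned so the walk never descends
    into version-control data or the index folder itself.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk which subfolders to skip.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as handle:
                    yield path, handle.read()
            except UnicodeDecodeError:
                continue  # most likely a binary file, so ignore it


def split_text(text, chunk_size=1000, overlap=100):
    """Split text into fixed-size character chunks that overlap slightly."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


print(split_text("abcdefghij", chunk_size=4, overlap=1))
# prints ['abcd', 'defg', 'ghij', 'j']
```

Each chunk would then be embedded and stored together with its source path, which is what lets the Q&A step report the files an answer was drawn from.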
To ask questions about the indexed data, use the command below:
docker run --rm -it \
--env-file=.env \
--env-file=db.env \
-v ${PWD}:/repository \
qax [QUERY]
Example 1:
docker run --rm -it \
--env-file=.env \
--env-file=db.env \
-v ${PWD}:/repository \
qax "Which libraries are needed to build the app?"
> Entering new RetrievalQA chain...
> Finished chain.
The libraries needed to build the app are langchain, openai, tiktoken, pathspec, python-dotenv, psycopg2-binary, and pgvector.
--------------------------------------------------------------------------------
Sources:
- /repository/requirements.txt
- /repository/Dockerfile
- /repository/load-ext.sh
- /repository/README.md
Example 2:
docker run --rm -it \
--env-file=.env \
--env-file=db.env \
-v ${PWD}:/repository \
qax "Create inline documentation for the main function"
> Entering new RetrievalQA chain...
> Finished chain.
"""The main function of the document indexing and similarity search program.
Args:
index (bool): Whether to perform document indexing.
"""
embeddings = OpenAIEmbeddings()
--------------------------------------------------------------------------------
Sources:
- /repository/app.py
- /repository/requirements.txt
- /repository/README.md
- /repository/app.py
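The Sources list in Example 2 names /repository/app.py twice, presumably because more than one retrieved chunk came from that file (an assumption; the output itself does not say). If a de-duplicated list is preferred, a small post-processing step could look like the following sketch. The helper `unique_sources` is hypothetical and not part of qax.

```python
def unique_sources(paths):
    """Drop repeated paths while keeping the order of first appearance."""
    # dict preserves insertion order, so fromkeys acts as an ordered set.
    return list(dict.fromkeys(paths))


sources = [
    "/repository/app.py",
    "/repository/requirements.txt",
    "/repository/README.md",
    "/repository/app.py",
]

for path in unique_sources(sources):
    print(f"- {path}")
```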
This project is licensed under the MIT License.
We welcome contributions from the community! Please follow our contribution guidelines for more details.
For any inquiries or support, feel free to join our community on Discord.