Commit cbfe4a4

Merge pull request #65 from raptorsun/hermetic
LCORE-791: konflux hermetic build

2 parents 0611dd0 + 310b4d1

16 files changed: +3673 −65 lines

.tekton/rag-tool-pull-request.yaml

Lines changed: 34 additions & 0 deletions

@@ -30,8 +30,42 @@ spec:
     - name: build-platforms
       value:
       - linux/x86_64
+      - linux-c6gd2xlarge/arm64
     - name: dockerfile
       value: Containerfile
+    - name: build-source-image
+      value: 'true'
+    - name: prefetch-input
+      # no source available: torch, faiss-cpu
+      # hermeto prefetch problems: uv, pip, jiter, tiktoken
+      # those need cmake to build: pyarrow
+      # those need cargo to build: jiter, tiktoken, cryptography, fastuuid, hf_xet, maturin, pydantic_core, rpds_py, safetensors, tokenizers
+      # to accelerate build: numpy, scipy, pandas, pillow, scikit_learn
+      value: |
+        [
+          {
+            "type": "rpm",
+            "path": "."
+          },
+          {
+            "type": "pip",
+            "path": ".",
+            "requirements_files": [
+              "requirements.hashes.wheel.txt",
+              "requirements.hashes.source.txt",
+              "requirements.hermetic.txt"
+            ],
+            "requirements_build_files": ["requirements-build.txt"],
+            "binary": {
+              "packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pycparser,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,sqlalchemy,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wcwidth,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
+              "os": "linux",
+              "arch": "x86_64,aarch64",
+              "py_version": "312"
+            }
+          }
+        ]
+    - name: hermetic
+      value: 'true'
   pipelineSpec:
     description: |
       This pipeline is ideal for building multi-arch container images from a Containerfile while maintaining trust after pipeline customization.
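The prefetch-input parameter added above is a JSON document consumed by the hermetic prefetch step. A quick way to sanity-check such a value before committing it is to parse it in Python; a minimal sketch, with the binary package list abbreviated to three entries:

```python
import json

# Abbreviated copy of the "prefetch-input" value from the pipeline above;
# the real value enumerates every binary-only package.
prefetch_input = """
[
  {"type": "rpm", "path": "."},
  {
    "type": "pip",
    "path": ".",
    "requirements_files": [
      "requirements.hashes.wheel.txt",
      "requirements.hashes.source.txt",
      "requirements.hermetic.txt"
    ],
    "requirements_build_files": ["requirements-build.txt"],
    "binary": {
      "packages": "torch,faiss-cpu,tiktoken",
      "os": "linux",
      "arch": "x86_64,aarch64",
      "py_version": "312"
    }
  }
]
"""

items = json.loads(prefetch_input)
print([item["type"] for item in items])               # ecosystems to prefetch
pip_item = next(i for i in items if i["type"] == "pip")
print(sorted(pip_item["binary"]["arch"].split(",")))  # target architectures
```

Running a parse like this locally catches malformed JSON before the PipelineRun fails at prefetch time.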

.tekton/rag-tool-push.yaml

Lines changed: 33 additions & 0 deletions

@@ -29,6 +29,39 @@ spec:
       - linux/x86_64
     - name: dockerfile
       value: Containerfile
+    - name: build-source-image
+      value: 'true'
+    - name: prefetch-input
+      # no source available: torch, faiss-cpu
+      # hermeto prefetch problems: uv, pip, jiter, tiktoken
+      # those need cmake to build: pyarrow
+      # those need cargo to build: jiter, tiktoken, cryptography, fastuuid, hf_xet, maturin, pydantic_core, rpds_py, safetensors, tokenizers
+      # to accelerate build: numpy, scipy, pandas, pillow, scikit_learn
+      value: |
+        [
+          {
+            "type": "rpm",
+            "path": "."
+          },
+          {
+            "type": "pip",
+            "path": ".",
+            "requirements_files": [
+              "requirements.hashes.wheel.txt",
+              "requirements.hashes.source.txt",
+              "requirements.hermetic.txt"
+            ],
+            "requirements_build_files": ["requirements-build.txt"],
+            "binary": {
+              "packages": "accelerate,aiohappyeyeballs,aiohttp,aiosignal,aiosqlite,annotated-doc,annotated-types,anyio,asyncpg,attrs,beautifulsoup4,cffi,chardet,charset-normalizer,click,colorama,cryptography,dataclasses-json,defusedxml,distro,docling-ibm-models,einops,et-xmlfile,faiss-cpu,filetype,fire,frozenlist,googleapis-common-protos,greenlet,h11,hf-xet,httpcore,httpx,huggingface-hub,idna,jinja2,jiter,joblib,jsonlines,jsonref,jsonschema-specifications,latex2mathml,llama-stack-client,lxml,markdown-it-py,markupsafe,mdurl,mpire,mpmath,multidict,mypy-extensions,nest-asyncio,networkx,nltk,openpyxl,opentelemetry-api,opentelemetry-exporter-otlp-proto-common,opentelemetry-exporter-otlp-proto-http,opentelemetry-proto,opentelemetry-sdk,opentelemetry-semantic-conventions,packaging,pandas,pillow,platformdirs,pluggy,prompt-toolkit,propcache,psycopg2-binary,pyaml,pycparser,pydantic,pydantic-core,pydantic-settings,pygments,pyjwt,pylatexenc,python-dateutil,python-docx,python-dotenv,python-multipart,python-pptx,pytz,pyyaml,referencing,requests,rich,rpds-py,rtree,safetensors,scikit-learn,scipy,semchunk,sentence-transformers,shapely,shellingham,six,sniffio,sqlalchemy,starlette,sympy,tabulate,tenacity,threadpoolctl,tiktoken,tokenizers,torch,torchvision,tqdm,transformers,triton,typer,typing-extensions,typing-inspect,typing-inspection,tzdata,wcwidth,wrapt,xlsxwriter,yarl,zipp,uv-build,uv,pip,maturin,opencv-python,rapidocr,sqlite-vec",
+              "os": "linux",
+              "arch": "x86_64,aarch64",
+              "py_version": "312"
+            }
+          }
+        ]
+    - name: hermetic
+      value: 'true'
   pipelineSpec:
     description: |
       This pipeline is ideal for building multi-arch container images from a Containerfile while maintaining trust after pipeline customization.

Containerfile

Lines changed: 15 additions & 3 deletions

@@ -9,8 +9,9 @@ RUN microdnf install -y --nodocs --setopt=keepcache=0 --setopt=tsflags=nodocs \
 RUN microdnf install -y rubygems && \
     microdnf clean all && \
     gem install asciidoctor
+
 # Install uv package manager
-RUN pip3.12 install uv==0.7.20
+RUN pip3.12 install uv>=0.7.20
 
 WORKDIR /rag-content
 
@@ -21,8 +22,11 @@ COPY scripts ./scripts
 
 # Configure UV environment variables for optimal performance
 # Pytorch backend - cpu. `uv` contains convenient way to specify the backend.
+# MATURIN_NO_INSTALL_RUST=1 : Disable installation of Rust dependencies by Maturin.
 ENV UV_COMPILE_BYTECODE=0 \
-    UV_PYTHON_DOWNLOADS=0
+    UV_LINK_MODE=copy \
+    UV_PYTHON_DOWNLOADS=0 \
+    MATURIN_NO_INSTALL_RUST=1
 
 # Install Python dependencies
 RUN uv sync --locked --no-install-project
@@ -43,4 +47,12 @@ RUN python ./scripts/download_embeddings_model.py \
 # Reset the entrypoint.
 ENTRYPOINT []
 
-LABEL description="Contains embedding model and dependencies needed to generate a vector database"
+LABEL vendor="Red Hat, Inc." \
+      name="lightspeed-core/rag-tool-rhel9" \
+      com.redhat.component="lightspeed-core/rag-tool" \
+      cpe="cpe:/a:redhat:lightspeed_core:0.4::el9" \
+      io.k8s.display-name="Lightspeed RAG Tool" \
+      summary="RAG tool containing embedding model and dependencies needed to generate a vector database." \
+      description="RAG Tool provides a shared codebase for generating vector databases. It serves as the core framework for Lightspeed-related projects (e.g., OpenShift Lightspeed, OpenStack Lightspeed, etc.) to generate their own vector databases that can be used for RAG." \
+      io.k8s.description="RAG Tool provides a shared codebase for generating vector databases. It serves as the core framework for Lightspeed-related projects (e.g., OpenShift Lightspeed, OpenStack Lightspeed, etc.) to generate their own vector databases that can be used for RAG." \
+      io.openshift.tags="lightspeed-core,lightspeed-rag-tool,lightspeed"

Makefile

Lines changed: 3 additions & 0 deletions

@@ -101,6 +101,9 @@ start-postgres-debug: ## Start postgresql from the pgvector container image with
 		-v ./postgresql/data:/var/lib/postgresql/data:Z pgvector/pgvector:pg16 \
 		postgres -c log_statement=all -c log_destination=stderr
 
+konflux-requirements: ## generate hermetic requirements.*.txt file for konflux build
+	./scripts/konflux_requirements.sh
+
 .PHONY: help
 help: ## Show this help screen
 	@echo 'Usage: make <OPTIONS> ... <TARGETS>'
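The konflux-requirements target above regenerates the requirements.*.txt inputs that the hermetic prefetch consumes. Hermetic pip prefetching generally requires every pinned entry to carry a --hash; a minimal sketch of checking that invariant (the file contents and hash values here are hypothetical):

```python
# Representative (hypothetical) lines from a hashed requirements file,
# using pip's backslash line-continuation style.
sample = """\
torch==2.8.0 \\
    --hash=sha256:aaaa
faiss-cpu==1.8.0 \\
    --hash=sha256:bbbb
"""

# Join continuation lines into logical requirement lines, then verify
# that each one carries a sha256 hash.
logical = sample.replace("\\\n", " ").splitlines()
unhashed = [l for l in logical if l.strip() and "--hash=sha256:" not in l]
print(unhashed)  # an empty list means every pin is hashed
```

A check like this can run in CI before the Konflux pipeline, so a missing hash fails fast instead of at prefetch time.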

pyproject.toml

Lines changed: 4 additions & 2 deletions

@@ -47,8 +47,8 @@ dependencies = [
     "llama-index-vector-stores-postgres>=0.5.4",
     # Pin torch/torchvision to versions available as CPU wheels
     # torch 2.5.x pairs with torchvision 0.20.x
-    "torch>=2.5.0,<2.6.0",
-    "torchvision>=0.20.0,<0.21.0",
+    "torch>=2.8.0,<2.9.0",
+    "torchvision>=0.23.0,<0.24.0",
     "llama-stack==0.3.5",
     "llama-stack-client==0.3.5",
     "aiosqlite>=0.21.0",
@@ -87,6 +87,8 @@ dev = [
     "pytest>=8.3.4",
     "pytest-cov>=6.0.0",
     "pytest-mock>=3.15.1",
+    "pybuild-deps>=0.5.0",
+    "pip==24.3.1",
 ]

[tool.pylint."MESSAGES CONTROL"]
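The new torch pin is a half-open range, >=2.8.0,<2.9.0 (inclusive floor, exclusive ceiling). For simple X.Y.Z versions this can be illustrated with plain tuple comparison (real resolvers use full PEP 440 rules, which agree on these examples):

```python
def as_tuple(version: str) -> tuple[int, ...]:
    """Parse a simple X.Y.Z version string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))

def in_range(candidate: str, floor: str, ceiling: str) -> bool:
    # >= floor (inclusive), < ceiling (exclusive)
    return as_tuple(floor) <= as_tuple(candidate) < as_tuple(ceiling)

# 2.8.x releases satisfy the pin; the old 2.5.x line and 2.9.0 do not.
for v in ["2.5.1", "2.8.0", "2.8.10", "2.9.0"]:
    print(v, in_range(v, "2.8.0", "2.9.0"))
```

The exclusive ceiling is what keeps a future torch 2.9.0 from being pulled in before the paired torchvision range is updated to match.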

requirements-build.txt

Lines changed: 214 additions & 0 deletions (new file; all lines added)

#
# This file is autogenerated by pip-compile with Python 3.12
# by the following command:
#
#    pybuild-deps compile --output-file=requirements-build.txt requirements.source.txt
#
altgraph==0.17.5
    # via macholib
bashlex==0.18
    # via cibuildwheel
bracex==2.6
    # via cibuildwheel
build==1.4.0
    # via cibuildwheel
calver==2025.10.20
    # via trove-classifiers
certifi==2026.1.4
    # via cibuildwheel
cibuildwheel==3.3.1
    # via docling-parse
cmake==3.31.10
    # via docling-parse
coherent-licensed==0.5.2
    # via importlib-metadata
ctypesgen @ git+https://github.com/pypdfium2-team/ctypesgen@b561360fad763b4a64e2d8ef8f7ddf354670dbb7
    # via pypdfium2
cython==3.2.4
    # via
    #   numpy
    #   pyclipper
delocate==0.13.0
    # via docling-parse
dependency-groups==1.3.1
    # via cibuildwheel
filelock==3.20.3
    # via cibuildwheel
flit-core==3.12.0
    # via
    #   build
    #   coherent-licensed
    #   dependency-groups
    #   marshmallow
    #   packaging
    #   pathspec
    #   pypdf
    #   pyproject-hooks
    #   pyproject-metadata
    #   typing-extensions
    #   wheel
hatch-fancy-pypi-readme==25.1.0
    # via
    #   jsonschema
    #   openai
hatch-vcs==0.5.0
    # via
    #   filelock
    #   fsspec
    #   humanize
    #   jsonschema
    #   platformdirs
    #   scikit-build-core
    #   termcolor
    #   urllib3
hatchling==1.26.3
    # via
    #   hatch-fancy-pypi-readme
    #   openai
hatchling==1.28.0
    # via
    #   banks
    #   bracex
    #   cibuildwheel
    #   filelock
    #   fsspec
    #   hatch-fancy-pypi-readme
    #   hatch-vcs
    #   humanize
    #   jsonschema
    #   llama-cloud-services
    #   llama-index
    #   llama-index-cli
    #   llama-index-core
    #   llama-index-embeddings-huggingface
    #   llama-index-embeddings-openai
    #   llama-index-indices-managed-llama-cloud
    #   llama-index-instrumentation
    #   llama-index-llms-openai
    #   llama-index-readers-file
    #   llama-index-readers-llama-parse
    #   llama-index-vector-stores-faiss
    #   llama-index-vector-stores-postgres
    #   llama-parse
    #   platformdirs
    #   polyfactory
    #   scikit-build-core
    #   soupsieve
    #   termcolor
    #   urllib3
    #   uvicorn
humanize==4.15.0
    # via cibuildwheel
macholib==1.16.4
    # via delocate
maturin==1.10.2
    # via uv-build
meson-python==0.19.0
    # via numpy
meson==1.10.1
    # via meson-python
packaging==25.0
    # via
    #   cibuildwheel
    #   hatchling
    #   meson-python
    #   pypdfium2
    #   scikit-build-core
    #   setuptools-scm
patchelf==0.17.2.4
    # via cibuildwheel
pathspec==1.0.3
    # via
    #   hatchling
    #   scikit-build-core
pdm-backend==2.4.6
    # via
    #   fastapi
    #   griffe
    #   marko
platformdirs==4.5.1
    # via cibuildwheel
pluggy==1.6.0
    # via hatchling
poetry-core==2.3.0
    # via
    #   llama-cloud
    #   tomlkit
pybind11==3.0.1
    # via docling-parse
pyelftools==0.32
    # via cibuildwheel
pyproject-hooks==1.2.0
    # via build
pyproject-metadata==0.10.0
    # via meson-python
scikit-build-core==0.11.6
    # via
    #   cmake
    #   patchelf
    #   pybind11
semantic-version==2.10.0
    # via setuptools-rust
setuptools-rust==1.12.0
    # via maturin
setuptools-scm==9.2.2
    # via
    #   ctypesgen
    #   delocate
    #   hatch-vcs
    #   importlib-metadata
    #   pluggy
    #   pyclipper
    #   setuptools-rust
    #   urllib3
trove-classifiers==2026.1.14.14
    # via hatchling
typing-extensions==4.15.0
    # via delocate
uv-build==0.9.26
    # via llama-index-workflows
wheel==0.45.1
    # via
    #   bashlex
    #   cibuildwheel
    #   docling-parse
    #   meson
    #   pyclipper
    #   pypdfium2
    #   tree-sitter-c
    #   tree-sitter-javascript
    #   tree-sitter-python
    #   tree-sitter-typescript

# The following packages are considered to be unsafe in a requirements file:
setuptools==80.10.1
    # via
    #   bashlex
    #   calver
    #   certifi
    #   colorlog
    #   ctypesgen
    #   delocate
    #   dill
    #   docling-parse
    #   importlib-metadata
    #   llama-stack
    #   maturin
    #   meson
    #   multiprocess
    #   pathspec
    #   pgvector
    #   pluggy
    #   psutil
    #   pyclipper
    #   pyelftools
    #   pypdfium2
    #   regex
    #   setuptools-rust
    #   setuptools-scm
    #   tree-sitter
    #   tree-sitter-c
    #   tree-sitter-javascript
    #   tree-sitter-python
    #   tree-sitter-typescript
    #   trove-classifiers
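Each entry in the generated file pins a build backend as name==version, with "# via" comments naming the packages that need it. A small sketch of how such pins can be read back, e.g. for auditing (it skips the comment lines and the one VCS entry, ctypesgen @ git+...):

```python
# A few representative lines from requirements-build.txt.
sample = """\
altgraph==0.17.5
    # via macholib
bashlex==0.18
    # via cibuildwheel
ctypesgen @ git+https://github.com/pypdfium2-team/ctypesgen@b561360fad763b4a64e2d8ef8f7ddf354670dbb7
    # via pypdfium2
maturin==1.10.2
    # via uv-build
"""

pins = {}
for line in sample.splitlines():
    line = line.strip()
    # Skip blank lines, "# via" annotations, and the VCS (git+...) entry,
    # which has no == pin.
    if not line or line.startswith("#") or "==" not in line:
        continue
    name, _, version = line.partition("==")
    pins[name] = version

print(pins)
```

Note that the file can legitimately contain the same backend at two versions (hatchling appears as both 1.26.3 and 1.28.0 above), so a dict like this keeps only the last pin seen; a real audit would collect a list per name.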
