📝 Walkthrough
Renaming/rebranding from DataFlow/DataPipeline to RAG across docs, UI strings, proxies, nginx templates, Helm chart names, and CI pipelines; CI deploy adds a new cleanOldDataflow() helper to remove legacy dataflow Helm releases before rag deployment.
Sequence Diagram(s)
sequenceDiagram
participant Jenkins as Jenkins
participant Pipeline as CI_Script
participant Deployer as Deployer_Job(unifai-app-deployer-rag)
participant Helm as Helm_Client
participant K8s as Kubernetes
Jenkins->>Pipeline: start build/deploy (params incl. build_rag_backend, MODULES)
Pipeline->>Pipeline: if deploying rag -> cleanOldDataflow()
Pipeline->>Deployer: trigger deploy job (RAG_VERSION, modules)
Deployer->>Helm: helm upgrade/install (helm/rag charts)
Helm->>K8s: apply manifests
K8s-->>Helm: report release/status
Helm-->>Deployer: deployment result
Deployer-->>Jenkins: job completion/status
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
🔍 Summary
This is a comprehensive refactor renaming the "Dataflow" component to "RAG" (Retrieval-Augmented Generation) across the entire stack. The changes span Helm charts, CI/CD pipelines, Backend configurations, Multi-Agent providers, and UI integration.
Overall Assessment:
🧩 File-by-file feedback
[DOMAIN: CI/CD] ci/deploy-bu.groovy
[DOMAIN: HELM] helm/ui/values.yaml
[DOMAIN: HELM] helm/rag.yaml.gotmpl (formerly dataflow.yaml.gotmpl)
[DOMAIN: BACKEND] multi-agent/elements/providers/rag_client/config.py
[DOMAIN: UI] ui/vite.config.ts
🛠 Suggested Improvements
1. Fix Syntax Error in Pipeline
File:
// Current (Broken):
deployModules('rag
')
// Fixed:
deployModules('rag')
2. Add Migration Cleanup Logic (Optional but Recommended)
File:
def deleteRunningApplication(){
    echo("Removing running UnifAI application")
    // Attempt to clean up legacy dataflow release if it exists
    try {
        sh("podman exec -t helmfile bash -c 'helmfile destroy -f dataflow.yaml.gotmpl --deleteWait || true'")
    } catch (Exception e) {
        echo "No legacy dataflow deployment found or destroy failed: ${e.message}"
    }
    // Standard destroy for new structure
    sh("podman exec -t helmfile bash -c 'helmfile destroy -f rag.yaml.gotmpl --deleteWait'")
    // ... rest of function
}
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
ui/vite.config.ts (1)
31-38: ⚠️ Potential issue | 🟠 Major
Environment variable rename requires documentation update and developer action.
The proxy target now uses `RAG_HOST` instead of `DATAPIPELINEHUB_HOST`. Deployment configurations have been updated (helm/values/global-config.yaml correctly uses `RAG_HOST`), but documentation remains inconsistent: `helm/ARCHITECTURE.md` line 593 still references the old `DATAPIPELINEHUB_HOST` variable in its environment configuration example, while line 761 correctly uses `RAG_HOST`. Update the documentation to use `RAG_HOST` throughout to prevent developers following outdated setup instructions from encountering configuration errors.
backend/ARCHITECTURE.md (1)
1287-1295: ⚠️ Potential issue | 🟡 Minor
Correct the Kubernetes resource names in the Helm deployment section.
The documented service name should be `unifai-rag-server` (not `unifai-rag-service`), and the workers deployment should be `unifai-rag-celery` (not `unifai-rag-celery-workers`). These names are derived from the Helm release names and confirmed in the actual templates:
- `unifai-rag-server/templates/service.yaml` names the service as `{{ include "rag.fullname" . }}`
- `unifai-rag-celery/templates/deployment.yaml` and `service.yaml` both use `{{ include "unifai-rag-celery.fullname" . }}`
Update lines 1292-1294 to reflect the correct resource names.
yhabushi79
left a comment
Looks OK.
I think it would be good to delete old unused files such as deploy-bu.groovy.
In ci/README.md you reference the old DF_VERSION.
I guess that when you deploy it you have to wipe the entire UnifAI application, with all its PV and PVC resources, etc.
@sfiresht I see you're changing the name from dataflow to rag in multi-agent, but you're also changing the type of the element. Existing entries in the database have type dataflow, so this will cause a bug. I suggest not changing this in multi-agent, since it's not related to the RAG code; the only thing you need to change in multi-agent is the URL that the dataflow/rag client points to.
But then reading the code will confuse people. I guess it's possible to go through all entries in the DB and update their element type; it's not something that hasn't been done before, even in UnifAI.
You're right, I'll remove the whole BU file. I thought I had deleted all DF_VERSION occurrences; I will double-check.
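The element-type migration suggested in this thread could look roughly like the following Python sketch. It assumes each stored element is a dict-like record with a `type` field whose legacy value is `dataflow`; the field name, both values, and the `migrate_element_types` helper are hypothetical and not taken from the UnifAI schema.

```python
LEGACY_TYPE = "dataflow"  # assumed legacy element type value
NEW_TYPE = "rag"          # assumed new element type value


def migrate_element_types(records, field="type"):
    """Rewrite legacy element types in place and return how many records changed.

    `records` is any iterable of dict-like objects; records without the field,
    or with a different type, are left untouched.
    """
    changed = 0
    for record in records:
        if record.get(field) == LEGACY_TYPE:
            record[field] = NEW_TYPE
            changed += 1
    return changed
```

Running the helper a second time changes nothing, so a migration along these lines would be safe to re-run if a deployment is retried.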
Nirsisr
left a comment
@sfiresht I went over all the GUI changes and they look good. Don't forget to align with Maya on the DB script.
Regarding the backend, you made some changes, but as we discussed they need to be removed (whether in this PR or another one). Just make sure RAG appears in the README in the rag/ folder. Thanks, Saar
Actionable comments posted: 1
⚠️ Outside diff range comments (1)
ci/pipeline-deploy.groovy (1)
300-305: ⚠️ Potential issue | 🔴 Critical
Bug: `version` is undefined in the `shared-resources` case.
Every other module case (lines 308, 315, 322, 329) declares `def version = params.XXX_VERSION?.trim() ?: params.VERSION?.trim()` before use. The `shared-resources` case on line 303 references `version` without defining it, which will cause a runtime `MissingPropertyException`.
🐛 Proposed fix
  case 'shared-resources':
+     def version = params.VERSION?.trim()
      updateValuesYaml("${buildParams.DevRoot}/${params.BRANCH}/helm/values/shared-resource-values.yaml", version)
      deployModules('shared-resources')
      break
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci/pipeline-deploy.groovy` around lines 300 - 305, The shared-resources switch case references the variable version but never defines it; add a definition for version in the 'shared-resources' case consistent with the other cases (e.g., def version = params.SHARED_RESOURCES_VERSION?.trim() ?: params.VERSION?.trim()) before calling updateValuesYaml and deployModules so updateValuesYaml("${buildParams.DevRoot}/${params.BRANCH}/helm/values/shared-resource-values.yaml", version) uses a defined value.
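The fallback pattern the other module cases use (a module-specific version, falling back to the global `VERSION`) can be sketched in Python as below. This is an illustration only: `resolve_version` is a hypothetical helper, `params` is modelled as a plain dict, and unlike the Groovy expression it returns `None` when both values are blank.

```python
def resolve_version(params, module_key, default_key="VERSION"):
    """Return the first non-blank, trimmed value among module_key and default_key.

    Mirrors `params.XXX_VERSION?.trim() ?: params.VERSION?.trim()`: a missing
    or whitespace-only module version falls through to the global one.
    """
    for key in (module_key, default_key):
        value = (params.get(key) or "").strip()
        if value:
            return value
    return None  # assumption: callers treat None as "no version supplied"
```

For example, `resolve_version({"RAG_VERSION": "  ", "VERSION": "1.0"}, "RAG_VERSION")` falls back to the global value because the module-specific one is blank after trimming.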
// temporary fix for dataflow deployment deletion, after we move completely to the new rag naming this function is obsolete
def cleanOldDataflow(){
    echo("Removing old dataflow application")

    // Capture the output properly
    def result = sh(
        script: "helm ls | grep 'dataflow' || true",
        returnStdout: true
    ).trim()

    if(result.length() > 0) {
        // Split by newlines to get all releases, not just the first one
        def releases = result.split('\n')

        releases.each { release ->
            // Extract the release name (first column in helm ls output)
            def releaseName = release.split(/\s+/)[0]
            echo("Deleting helm release: ${releaseName}")
            sh("helm delete ${releaseName}")
        }

        // Wait for all dataflow resources to be deleted
        sh("""
            until ! oc get deployment,statefulset,svc 2>/dev/null | grep 'dataflow'; do
                echo 'Waiting for dataflow deployment deletion...'
                sleep 5
            done
        """)
        echo("All dataflow applications successfully deleted")
        sh("sleep 5")
    } else {
        echo("No dataflow releases found")
    }
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find all Groovy files in the ci directory
fd -e groovy ci/ | head -20
Repository: redhat-community-ai-tools/UnifAI
Length of output: 393
🏁 Script executed:
#!/bin/bash
# Check how helm/oc commands are invoked across the pipeline files
rg -n 'helm |oc ' --type groovy -C 2
Repository: redhat-community-ai-tools/UnifAI
Length of output: 2341
🏁 Script executed:
#!/bin/bash
# Search for deployModules function and podman exec patterns
rg -n 'deployModules|podman exec' --type groovy -A 5 -B 2
Repository: redhat-community-ai-tools/UnifAI
Length of output: 4873
🏁 Script executed:
#!/bin/bash
# Get more context around the pipeline functions
cat -n ci/pipeline-deploy.groovy | head -250 | tail -100
Repository: redhat-community-ai-tools/UnifAI
Length of output: 4983
Fix helm/oc commands to run consistently inside the helmfile container — inconsistent execution contexts detected in cleanup functions.
cleanOldDataflow() runs helm ls (line 176) and helm delete (line 188) directly on the Jenkins agent, but all other helm operations in this pipeline run inside the helmfile podman container via podman exec -t helmfile bash -lc '...' (e.g., deployModules() at line 145, deleteRunningApplication() at line 156). Additionally, deleteRunningApplication() shows this same pattern: it uses podman exec for helmfile destroy (line 156) but then runs oc get directly on the agent (line 161). This mixed execution context is problematic — the agent likely doesn't have the correct kubeconfig/context, so these commands will either fail or target the wrong cluster.
Both functions need to run helm and oc commands inside the helmfile container to match the rest of the pipeline's patterns.
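The release-name extraction that must survive the move into the container can be sketched in Python as follows. This is a minimal illustration, not the pipeline's code: `matching_release_names` is a hypothetical helper, and it assumes the `helm ls` table layout in which the release name is the first whitespace-separated column. Because `podman exec -t` allocates a TTY, captured output may use CRLF line endings; `splitlines()` handles both forms.

```python
def matching_release_names(helm_ls_output, needle="dataflow"):
    """Return the first column of every `helm ls` row that contains `needle`.

    Like `helm ls | grep 'dataflow'`, the match is against the whole row,
    so a row matches if the needle appears in any column.
    """
    names = []
    for line in helm_ls_output.splitlines():  # tolerates "\n" and "\r\n"
        if needle not in line:
            continue
        columns = line.split()
        if columns:
            names.append(columns[0])
    return names
```

Matching on the whole row keeps the behaviour of the original `grep`; matching only on the first column would be stricter, and whether that is preferable depends on how the legacy charts were named.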
🐛 Proposed fixes
In cleanOldDataflow():
  def result = sh(
-     script: "helm ls | grep 'dataflow' || true",
+     script: "podman exec -t helmfile bash -c \"helm ls | grep 'dataflow' || true\"",
      returnStdout: true
  ).trim()

  echo("Deleting helm release: ${releaseName}")
- sh("helm delete ${releaseName}")
+ sh("podman exec -t helmfile bash -c 'helm delete ${releaseName}'")

- sh("""
-     until ! oc get deployment,statefulset,svc 2>/dev/null | grep 'dataflow'; do
+ sh("""
+     until ! podman exec -t helmfile bash -c "oc get deployment,statefulset,svc 2>/dev/null | grep 'dataflow'" ; do
          echo 'Waiting for dataflow deployment deletion...'
          sleep 5
      done
  """)

In deleteRunningApplication(), also wrap the oc get command at line 161 inside the helmfile container:
  echo("Wait for resource deletion...")
  sh("""
-     until ! oc get deployment,statefulset,svc | grep 'unifai\\|qdrant\\|mongo\\|rabbitmq'; do
+     until ! podman exec -t helmfile bash -c "oc get deployment,statefulset,svc | grep 'unifai\\|qdrant\\|mongo\\|rabbitmq'"; do
          echo 'Waiting for deployment deletion...'
          sleep 5
      done
  """)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ci/pipeline-deploy.groovy` around lines 170 - 203, cleanOldDataflow() is
running helm and oc commands on the Jenkins agent (helm ls, helm delete, oc get
...) while the pipeline standard is to run these inside the helmfile container;
update cleanOldDataflow() so every helm and oc invocation is executed via podman
exec -t helmfile bash -lc '...'(e.g., run "helm ls", "helm delete <releaseName>"
and the until loop that calls oc get inside podman exec), and similarly modify
deleteRunningApplication() so its oc get call is executed inside the helmfile
container as well; ensure you preserve the existing stdout capture/trim behavior
and the release-name extraction logic (result.split, release.split(/\s+/)[0])
when moving the commands into the podman exec calls.
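The `until ... sleep 5` wait loops discussed in this comment poll indefinitely. A bounded variant can be sketched in Python as below. The timeout is an addition for illustration and is not part of the PR; `wait_until_gone` and its parameters are hypothetical names, and `still_present` stands in for the `oc get ... | grep` check.

```python
import time


def wait_until_gone(still_present, interval=5.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll `still_present()` until it returns False or `timeout` seconds pass.

    Returns True if the resources disappeared, False if the deadline was hit.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout
    while still_present():
        if clock() >= deadline:
            return False
        sleep(interval)
    return True
```

Returning a boolean instead of looping forever would let the calling stage fail with a clear message when legacy resources never disappear, rather than hanging until the job itself is aborted.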
⚠️ Outside diff range comments (1)
ui/ARCHITECTURE.md (1)
26-26: ⚠️ Potential issue | 🟡 Minor
Stale "Data Pipeline Hub / Data Pipeline" references remain after the rename.
Several sections in this document were not updated alongside line 1413 and still reference the old DataFlow/Data Pipeline terminology, creating a confusing internal inconsistency for anyone using this doc as a reference:
- Line 26 – Core Features bullet: "Data Pipeline Hub, Multi-Agent System, SSO" → should read "RAG Backend, Multi-Agent System, SSO".
- Lines 195–198 – Multi-Backend API Architecture table: `/api1` row still labelled "Data Pipeline Hub" → should be "RAG Backend".
- Lines 959–962 – HTTP Client Selection table: `/api1 (Data Pipeline)` → should be `/api1 (RAG)`.
- Lines 1363–1365 – Dev server Vite proxy rewrites: trailing comment (Data Pipeline Hub) → should say (RAG Backend).
📝 Proposed documentation fixes
- - Multi-backend architecture (Data Pipeline Hub, Multi-Agent System, SSO)
+ - Multi-backend architecture (RAG Backend, Multi-Agent System, SSO)

-| `/api1` | Data Pipeline Hub | Document/Slack pipelines, embeddings | Port 13457 |
+| `/api1` | RAG Backend | Document/Slack pipelines, embeddings | Port 13457 |

-| `api` | `@/http/queryClient` | `/api1` (Data Pipeline) | Document/Slack pipelines |
+| `api` | `@/http/queryClient` | `/api1` (RAG) | Document/Slack pipelines |

-- `/api1` → `http://127.0.0.1:13457/api` (Data Pipeline Hub)
+- `/api1` → `http://127.0.0.1:13457/api` (RAG Backend)
Also applies to: 194-198, 958-962, 1363-1365
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/ARCHITECTURE.md` at line 26, Update all stale "Data Pipeline"/"Data Pipeline Hub" occurrences to "RAG Backend" in the ARCHITECTURE document: replace the Core Features bullet "Data Pipeline Hub, Multi-Agent System, SSO" with "RAG Backend, Multi-Agent System, SSO"; change the Multi-Backend API Architecture table entry `/api1` label from "Data Pipeline Hub" to "RAG Backend"; update the HTTP Client Selection table row `/api1 (Data Pipeline)` to `/api1 (RAG)`; and change the dev server Vite proxy rewrite trailing comment "(Data Pipeline Hub)" to "(RAG Backend)". Search the document for any other exact matches of "Data Pipeline" or "Data Pipeline Hub" and make the same substitutions to ensure terminology consistency.
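The "search the document for any other exact matches" step can be sketched in Python as below. This is a hedged illustration: `find_stale_terms` is a hypothetical helper, and the default term list is assembled from the legacy names mentioned in this review rather than from any project tooling.

```python
import re

# Legacy names mentioned in this review; extend as needed.
DEFAULT_STALE_TERMS = ("Data Pipeline Hub", "Data Pipeline", "DATAPIPELINEHUB_HOST")


def find_stale_terms(text, terms=DEFAULT_STALE_TERMS):
    """Return (line_number, matched_term) pairs for every stale term in `text`.

    Terms are tried longest first so "Data Pipeline Hub" is reported as such
    instead of as its prefix "Data Pipeline".
    """
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in ordered))
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((line_no, match.group(0)))
    return hits
```

Reading a file such as `ui/ARCHITECTURE.md` and passing its contents to the helper would list each remaining occurrence with its line number, which makes it straightforward to confirm that a rename is complete.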
🧹 Nitpick comments (1)
ui/ARCHITECTURE.md (1)
1723-1724: `Last Updated` date is stale.
The document was modified by this PR but the version footer still reads November 23, 2025.
📝 Proposed fix
-**Last Updated:** November 23, 2025
+**Last Updated:** February 2026
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ui/ARCHITECTURE.md` around lines 1723 - 1724, Update the stale footer date by replacing the "Last Updated: November 23, 2025" line with the current PR's update date (or today's date) so the document footer reflects the recent modification; locate the literal "Last Updated" entry in ARCHITECTURE.md and update its value accordingly.