Skip to content

add drs server chart#87

Open
matthewpeterkort wants to merge 7 commits intoohsu-developfrom
feature/drs-server
Open

add drs server chart#87
matthewpeterkort wants to merge 7 commits intoohsu-developfrom
feature/drs-server

Conversation

@matthewpeterkort
Copy link
Copy Markdown
Collaborator

@matthewpeterkort matthewpeterkort commented Mar 4, 2026

@bwalsh
Copy link
Copy Markdown
Collaborator

bwalsh commented Mar 31, 2026

ADR: Introduce drs-server and re-route DRS/index workflows in feature/drs-server

  • Status: Accepted
  • Date: 2026-03-31
  • Branch: feature/drs-server
  • Compared Against: ohsu-develop
  • Scope: All effective branch changes

Context

This branch introduces a dedicated DRS service and updates ingress and Fence startup behavior to support DRS/data workflows.

Effective branch diff summary:

  • 20 files changed
  • 735 insertions / 87 deletions
  • Primary areas:
    • New chart: helm/drs-server/*
    • Revproxy routing updates: helm/revproxy/gen3.nginx.conf/*
    • Fence startup/volume mount updates: helm/fence/*
    • Gen3 dependency wiring: helm/gen3/Chart.yaml
    • Gecko nav update: helm/gecko/files/init-data/nav.json

Decision

Adopt a standalone drs-server Helm chart and route DRS/index/data endpoints through this service, while simplifying Fence responsibility boundaries.

1) Add a new drs-server Helm chart

Files added:

  • helm/drs-server/Chart.yaml
  • helm/drs-server/README.md
  • helm/drs-server/values.yaml
  • helm/drs-server/templates/_helpers.tpl
  • helm/drs-server/templates/configmap.yaml
  • helm/drs-server/templates/deployment.yaml
  • helm/drs-server/templates/service.yaml
  • helm/drs-server/templates/secret-app-db.yaml
  • helm/drs-server/templates/secret-admin-db.yaml
  • helm/drs-server/templates/postgres-schema-configmap.yaml
  • helm/drs-server/templates/postgres-init-job.yaml

Implementation decisions:

  • Deploy drs-server as its own service (port 8080) with readiness/liveness probes on /healthz.
  • Configure runtime behavior via mounted config.yaml generated from chart values/global overrides.
  • Inject DB settings via secret env vars (DRS_DB_*).
  • Add optional pre-install/pre-upgrade PostgreSQL init hooks to:
    • ensure app role/database exist,
    • apply DRS schema,
    • support global Postgres defaults (global.postgres.master.*) with per-chart overrides.

2) Wire drs-server into umbrella chart

Modified:

  • helm/gen3/Chart.yaml

Decision:

  • Add drs-server as a local dependency with condition drs-server.enabled.

3) Move DRS/index/data routing to drs-server

Added:

  • helm/revproxy/gen3.nginx.conf/drs-server-service.conf

Modified:

  • helm/revproxy/gen3.nginx.conf/indexd-service.conf
  • helm/revproxy/gen3.nginx.conf/fence-service-ga4gh.conf
  • helm/revproxy/gen3.nginx.conf/fence-service.conf

Routing decisions:

  • Introduce dedicated upstream/rules for:
    • /ga4gh/
    • /index and /index/
    • /upload
    • /download/
    • /info/lfs/
    • /data/
    • /healthz
  • Disable legacy indexd GA4GH/index route blocks in indexd-service.conf.
  • Disable legacy Fence GA4GH DRS access proxy block in fence-service-ga4gh.conf.
  • Simplify fence-service.conf so Fence owns /user/* and /.well-known/ flows; remove special /user/data/download handoff to presigned-url-fence.

4) Harden Fence startup config/key handling

Modified:

  • helm/fence/templates/fence-deployment.yaml
  • helm/fence/templates/presigned-url-fence.yaml
  • helm/fence/values.yaml

Decisions:

  • Update startup shell to set -euo pipefail.
  • Read secret config from /var/run/fence-secrets/fence-config-secret.yaml and merge when yaml_merge.py is present.
  • Add fallback path that copies secret config directly if merge script is unavailable.
  • Stage JWT private key from /var/run/fence-secrets/jwt_private_key.pem into /fence/keys/key/ and enforce file mode 600 before generating public key.
  • Shift volume mounts to new /var/run/fence-secrets/* paths (and comment legacy direct mounts).

Rationale

  • A dedicated DRS service isolates DRS-specific behavior, making routing and ownership clearer than splitting behavior across Fence and indexd.
  • Routing /ga4gh, /index, and data transfer endpoints through one backend reduces ambiguity and service coupling.
  • Fence startup changes improve resilience when merge helper scripts or key paths differ across environments.
  • Umbrella chart dependency makes DRS deployment explicit and toggleable.

Consequences

  • Deployments can enable and run a standalone drs-server with its own DB schema/bootstrap path.
  • Revproxy traffic for DRS/index/data workflows shifts to drs-server; indexd/Fence no longer serve those legacy route blocks.
  • Fence configuration/key file handling now depends on /var/run/fence-secrets/* mount paths.
  • Operationally, PostgreSQL privileges and schema setup are managed by hook jobs when enabled.

Alternatives Considered

  • Keep DRS functionality distributed across Fence/indexd: rejected due to mixed ownership and route complexity.
  • Avoid DB init hooks and rely on manual DB prep: rejected due to repeatability and install/upgrade friction.
  • Keep legacy Fence/indexd revproxy blocks active: rejected to avoid ambiguous endpoint ownership.

@bwalsh
Copy link
Copy Markdown
Collaborator

bwalsh commented Mar 31, 2026

PR Review: feature/drs-server vs ohsu-develop

Scope

  • Reviewed all effective branch changes (20 files, 735 insertions, 87 deletions) with focus on testability, readability, and maintenance.

Findings (ordered by severity)

High

  1. /index route bypasses CSRF check while /index/ is protected
    • Files/lines: helm/revproxy/gen3.nginx.conf/drs-server-service.conf:16-20, helm/revproxy/gen3.nginx.conf/drs-server-service.conf:22-30
    • Why this matters: The new exact-match location /index block proxies requests without the CSRF gate used on location ^~ /index/. This creates inconsistent protection on closely related endpoints and can open an unintended unauthenticated path for non-GET usage, depending on backend behavior.
    • Recommendation: Apply the same CSRF guard to location /index (or explicitly restrict it to safe methods) so both routes have equivalent security semantics.

Medium

  1. Legacy /user/data/download/<guid> path no longer has dedicated routing, with no compatibility note/tests

    • Files/lines: helm/revproxy/gen3.nginx.conf/fence-service.conf:33-41, docs/CONFIGURATION.md:358
    • Why this matters: The previous explicit /user/data/download handoff to presigned-url-fence has been removed. Existing clients, scripts, or runbooks still referencing /user/data/download/<GUID> may break or receive different behavior.
    • Recommendation: Either (a) preserve a compatibility route/redirect, or (b) document a clear migration to new download paths and add an integration test covering old and new route behavior.
  2. No automated route contract tests for a large ingress ownership shift

    • Files/lines: helm/revproxy/gen3.nginx.conf/drs-server-service.conf:5-94, helm/revproxy/gen3.nginx.conf/indexd-service.conf:1-25, helm/revproxy/gen3.nginx.conf/fence-service-ga4gh.conf:1-10, helm/revproxy/gen3.nginx.conf/fence-service.conf:20-49
    • Why this matters: This PR reassigns multiple high-traffic endpoints (/ga4gh, /index, /upload, /download, /info/lfs, /data) across services, but there is no test harness to assert expected ownership/auth behavior per path.
    • Recommendation: Add a lightweight route matrix test in CI (curl-based smoke tests against a templated local stack is enough) validating route owner, expected status code, and CSRF behavior for each endpoint.
  3. Mutable default image tag reduces reproducibility across environments

    • Files/lines: helm/drs-server/values.yaml:7
    • Why this matters: Defaulting to latest makes deployments non-deterministic and harder to debug/rollback.
    • Recommendation: Pin a released image tag by default (or digest), and document upgrade guidance in helm/drs-server/README.md.

Low

  1. Large commented-out NGINX blocks increase maintenance burden
    • Files/lines: helm/revproxy/gen3.nginx.conf/indexd-service.conf:1-25, helm/revproxy/gen3.nginx.conf/fence-service-ga4gh.conf:1-10
    • Why this matters: Carrying inactive config in place makes ownership less clear and increases cognitive load during future route changes.
    • Recommendation: Remove dead blocks and capture rationale in ADR/changelog; rely on git history for rollback context.

Readability Notes

  • New drs-server chart layout is clear and follows existing chart patterns (templates, helper macros, values split).
  • helm/drs-server/README.md provides good operator guidance, especially around DB secret handling and probe tuning.

Testability Notes

  • Template-level checks passed (helm lint, helm template) but these do not validate runtime route/auth behavior.
  • Highest-value missing tests are endpoint ownership and auth regression checks at revproxy boundaries.

Validation Performed

  • git diff --name-status ohsu-develop...feature/drs-server
  • git diff --stat ohsu-develop...feature/drs-server
  • helm lint helm/drs-server
  • helm template test helm/drs-server > /tmp/drs-rendered.yaml

@bwalsh
Copy link
Copy Markdown
Collaborator

bwalsh commented Mar 31, 2026

yamllint reports:

helm/drs-server/Chart.yaml
7:1 [empty-lines] too many blank lines (1 > 0)
This means there is an extra blank line where the linter configuration allows zero empty lines.

bwalsh
bwalsh previously approved these changes Mar 31, 2026
Copy link
Copy Markdown
Collaborator

@bwalsh bwalsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See failing lint job

lbeckman314 and others added 3 commits April 1, 2026 15:25
Signed-off-by: Liam Beckman <lbeckman314@gmail.com>
Signed-off-by: Liam Beckman <lbeckman314@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants