containers: prepare environmentd and clusterd for distroless migration #35859
Draft
jasonhernandez wants to merge 11 commits into main
63a74f4 to ef9e7db
Move bash entrypoint logic into Rust binaries so environmentd and clusterd can run in distroless containers without a shell:

clusterd:
- Auto-detect Kubernetes FQDN from /etc/hostname (replaces `hostname --fqdn`)
- Auto-detect StatefulSet ordinal from HOSTNAME env var
- Configure LD_PRELOAD for eatmydata (CI only, no-op in distroless)

environmentd:
- Configure LD_PRELOAD for eatmydata
- Sleep forever after graceful exit (keeps container alive for debugging)

Also add Dockerfile.distroless variants for both services that use the distroless-prod-base image and expect a static `ssh` binary to be copied in for SSH tunnel support.

Part of SEC-236.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
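To illustrate the clusterd detection steps listed above, here is a minimal Rust sketch. It is not the code from this PR: the function names `statefulset_ordinal` and `hostname_from_file_contents` are hypothetical, and the only convention it relies on is that Kubernetes names StatefulSet pods `<statefulset-name>-<ordinal>`. How the real binary handles a missing or malformed value is not shown in this conversation.

```rust
/// Hypothetical helper: extracts the StatefulSet ordinal from a pod
/// hostname of the form `<statefulset-name>-<ordinal>`.
/// Returns `None` when the hostname has no purely numeric suffix.
fn statefulset_ordinal(hostname: &str) -> Option<u64> {
    // Split on the last '-' so that dashes inside the StatefulSet
    // name itself are ignored.
    let (_, suffix) = hostname.rsplit_once('-')?;
    // Reject empty suffixes and anything that is not plain ASCII digits
    // (`str::parse` alone would also accept a leading '+').
    if suffix.is_empty() || !suffix.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    suffix.parse().ok()
}

/// Hypothetical helper: normalizes the contents of a hostname file such
/// as /etc/hostname by trimming surrounding whitespace.
/// Returns `None` when the file is effectively empty.
fn hostname_from_file_contents(contents: &str) -> Option<&str> {
    let trimmed = contents.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed)
    }
}
```

A caller would typically feed `std::env::var("HOSTNAME")` into the first helper and `std::fs::read_to_string("/etc/hostname")` into the second, then fall back to explicit command-line values when either returns `None`.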
ef9e7db to 748f09e
Replace the Ubuntu-based Dockerfiles with distroless variants directly, delete the now-unnecessary bash entrypoint scripts, and remove the explicit LD_PRELOAD=libeatmydata.so from the mzcompose clusterd service (the MZ_EAT_MY_DATA env var triggers the Rust-side LD_PRELOAD logic which is harmless when libeatmydata.so is absent in distroless). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy libeatmydata.so from a Debian image into the distroless base so that CI tests using MZ_EAT_MY_DATA=1 continue to benefit from fsync elision. The library is inert in production (MZ_EAT_MY_DATA is unset). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
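The `MZ_EAT_MY_DATA` toggle described in these commits could be sketched as a pure function that decides which `LD_PRELOAD` value to use. This is an assumption-laden illustration, not the PR's implementation: the name `ld_preload_with_eatmydata` is hypothetical, and the rule that any non-empty value other than `0` enables the toggle is a guess. The dynamic loader accepts `LD_PRELOAD` entries separated by colons or spaces.

```rust
/// Library name taken from the commit messages above.
const EATMYDATA_LIB: &str = "libeatmydata.so";

/// Hypothetical helper: returns the new `LD_PRELOAD` value to use, or
/// `None` when the environment should be left unchanged.
///
/// Assumption: any non-empty `MZ_EAT_MY_DATA` value other than "0"
/// enables the toggle. The real binaries may parse it differently.
fn ld_preload_with_eatmydata(
    mz_eat_my_data: Option<&str>,
    current: Option<&str>,
) -> Option<String> {
    let enabled = matches!(mz_eat_my_data, Some(v) if !v.is_empty() && v != "0");
    if !enabled {
        return None;
    }
    match current {
        // Already preloaded: nothing to change.
        Some(cur)
            if cur
                .split(|c: char| c == ':' || c == ' ')
                .any(|entry| entry == EATMYDATA_LIB) =>
        {
            None
        }
        // Preserve existing entries and append the library.
        Some(cur) if !cur.is_empty() => Some(format!("{}:{}", cur, EATMYDATA_LIB)),
        // No existing value: preload only libeatmydata.so.
        _ => Some(EATMYDATA_LIB.to_string()),
    }
}
```

Because the loader reads `LD_PRELOAD` only when a process starts, a value computed this way takes effect for child processes or after a re-exec, not for the process that is already running.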
The jobs image only contains Rust binaries (persistcli, mz-catalog-debug) with no shell or tool dependencies. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The mzbuild system expects a Dockerfile next to every mzbuild.yml. Include the static OpenSSH build Dockerfile so the pipeline can resolve the openssh-static image dependency. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The git clone of aws-lc from GitHub fails with "server certificate verification failed" because the ubuntu:noble base image doesn't include CA certificates by default. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The jobs image is used in CI tests with mzcompose's idle feature which overrides the entrypoint to ["sleep", "infinity"]. Distroless images don't have the sleep binary, so keep this CI-only image on Ubuntu. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Change the static OpenSSH build to use plain AWS-LC by default (faster, no Go dependency) with FIPS mode available via --build-arg AWS_LC_FIPS=1. AWS-LC is a drop-in replacement for OpenSSL that's faster and smaller. FIPS 140-3 validation is an additional layer only needed for compliance builds, not for all builds. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
zlib.net is unreliable in CI — the download has failed twice. Use the GitHub releases mirror which is more stable. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
AWS-LC (like BoringSSL) doesn't define BN_FLG_CONSTTIME. OpenSSH V_9_9_P2 uses it in ssh-rsa.c. Define it to 0 via CFLAGS — the constant is only used with BN_set_flags which AWS-LC already shims to a no-op. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Summary
Move bash entrypoint logic into Rust binaries so environmentd and clusterd can run in distroless container images:
- clusterd: auto-detect Kubernetes FQDN from `/etc/hostname` (replaces `hostname --fqdn`, which isn't available in distroless), auto-detect StatefulSet ordinal from `HOSTNAME` env var, LD_PRELOAD eatmydata toggle
- Dockerfile variants based on `distroless-prod-base`

Motivation
environmentd and clusterd are the last major services still on Ubuntu-based containers. The blockers were:
- `ssh` binary for tunnels (solved by SEC-236 static OpenSSH PR)
- `tini` for PID 1 (Kubernetes `shareProcessNamespace` or container runtime `--init` handles this)

Distroless images are ~60% smaller, have no shell for attackers to exploit, and are required for FIPS compliance (no uncontrolled system crypto libraries).
What's NOT in this PR
- `ssh` binary build (separate PR: ssh-util: add static OpenSSH build and FIPS algorithm enforcement #35858)

Part of SEC-236.
Test plan
- `cargo check -p mz-clusterd -p mz-environmentd` passes
- `cargo fmt` clean

🤖 Generated with Claude Code