refactor: simplify launcher image validation by accepting manifest digest directly#2785
refactor: simplify launcher image validation by accepting manifest digest directly#2785barakeinav1 wants to merge 3 commits intomainfrom
Conversation
… manifest digest Remove the Docker registry API interaction from the launcher. The approved hashes file now contains manifest digests (instead of config digests), so the launcher can pull directly with `docker pull image@sha256:<digest>` without querying the registry to resolve config digest -> manifest digest. Removed: - registry.rs (~430 lines) -- Docker registry auth, manifest fetching, retry logic, multi-platform resolution, config digest matching - oci-client, backon, rustls, rstest dependencies - Registry-related error variants - rustls crypto provider setup in main.rs - external-services-tests feature flag Simplified: - validation.rs: 79 lines -> 33 lines (just docker pull + verify) - error.rs: 90 lines -> 65 lines - lib.rs: emits manifest digest (not config digest) to RTMR3 Closes #2627
Code ReviewClean simplification. The removal of the OCI registry client in favor of direct Minor observations (non-blocking):
✅ Approved |
There was a problem hiding this comment.
Pull request overview
This PR refactors tee-launcher image validation to accept and operate directly on manifest digests, removing the OCI/Docker registry API client and associated retry/auth logic so the launcher can docker pull <image>@sha256:<digest> without registry lookups.
Changes:
- Replace registry-based digest resolution with a direct
docker pullby manifest digest (pull_and_verify). - Remove the
registry.rsmodule and drop related dependencies (oci-client,backon,rustls,rstest) and error variants. - Emit the selected manifest digest to dstack (RTMR3) and propagate it through tee config / compose launch.
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
crates/tee-launcher/src/validation.rs |
Replaces async registry+inspect validation with a minimal pull-by-digest check. |
crates/tee-launcher/src/registry.rs |
Deletes the registry client, digest resolution logic, and related tests. |
crates/tee-launcher/src/main.rs |
Removes rustls crypto provider initialization that was required for prior TLS usage. |
crates/tee-launcher/src/lib.rs |
Switches selection/validation flow to manifest digests and emits manifest digest to RTMR3. |
crates/tee-launcher/src/error.rs |
Simplifies error enums by removing registry-related variants and tightening run/pull errors. |
crates/tee-launcher/Cargo.toml |
Removes registry/test-related features and dependencies. |
Cargo.lock |
Removes transitive packages no longer needed after dropping the registry client stack. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| [[bin]] | ||
| name = "tee-launcher" | ||
| path = "src/main.rs" | ||
|
|
||
| [features] | ||
| external-services-tests = [] | ||
|
|
||
| [dependencies] | ||
| backon = { workspace = true } | ||
| clap = { workspace = true } | ||
| dstack-sdk = { workspace = true } |
There was a problem hiding this comment.
Removing the external-services-tests feature flag will break existing CI/docs that still run cargo nextest ... -p tee-launcher --features external-services-tests (e.g. .github/workflows/external-services-tests.yml). Either keep this feature as a no-op for compatibility, or update the workflow/docs to stop passing the feature for tee-launcher.
| // The approved hashes file now contains manifest digests. | ||
| // We can pull directly by digest without querying the Docker registry API. | ||
| let manifest_digest = select_image_hash( | ||
| approved_hashes_on_disk.as_ref(), | ||
| &args.default_image_digest, | ||
| config.launcher_config.mpc_hash_override.as_ref(), | ||
| )?; | ||
|
|
||
| let manifest_digest = validate_image_hash(&config.launcher_config, image_hash.clone()).await?; | ||
| pull_and_verify(&config.launcher_config.image_name, &manifest_digest)?; | ||
|
|
There was a problem hiding this comment.
pull_and_verify is invoked with only launcher_config.image_name, and launcher_config.registry is no longer used anywhere. Configs/docs currently model non-DockerHub registries as registry=<host> + image_name=<repo>, which would now attempt to pull from the default registry instead of the configured one. Consider constructing a fully-qualified image reference (or explicitly requiring image_name to include the registry host) and using that consistently for both pull and compose launch.
| .map_err(|e| ImageDigestValidationFailed::DockerPullFailed(e.to_string()))?; | ||
|
|
||
| let pull_failed = !pull.status.success(); | ||
| if pull_failed { | ||
| return Err(ImageDigestValidationFailed::DockerPullFailed( | ||
| "docker pull terminated with unsuccessful status".to_string(), | ||
| )); | ||
| } | ||
|
|
||
| // Verify that the pulled image ID matches the expected config digest. | ||
| // `docker inspect .ID` returns the image ID, which equals the config digest | ||
| // (i.e. the sha256 of the image config blob). | ||
| let inspect = Command::new("docker") | ||
| .args([ | ||
| "image", | ||
| "inspect", | ||
| "--format", | ||
| "{{index .ID}}", | ||
| &name_and_digest, | ||
| ]) | ||
| .output() | ||
| .map_err(|e| ImageDigestValidationFailed::DockerInspectFailed(e.to_string()))?; | ||
|
|
||
| let docker_inspect_failed = !inspect.status.success(); | ||
| if docker_inspect_failed { | ||
| return Err(ImageDigestValidationFailed::DockerInspectFailed( | ||
| "docker inspect terminated with unsuccessful status".to_string(), | ||
| )); | ||
| } | ||
|
|
||
| let pulled_image_id: DockerSha256Digest = String::from_utf8_lossy(&inspect.stdout) | ||
| .trim() | ||
| .to_string() | ||
| .parse() | ||
| .map_err(|e| { | ||
| ImageDigestValidationFailed::DockerInspectFailed(format!( | ||
| "docker inspect returned invalid image ID: {e}" | ||
| )) | ||
| })?; | ||
|
|
||
| if pulled_image_id != image_hash { | ||
| return Err( | ||
| ImageDigestValidationFailed::PulledImageHasMismatchedDigest { | ||
| pulled_image_id, | ||
| expected_image_id: image_hash, | ||
| }, | ||
| ); | ||
| if !pull.status.success() { | ||
| let stderr = String::from_utf8_lossy(&pull.stderr); | ||
| return Err(ImageDigestValidationFailed::DockerPullFailed(format!( | ||
| "docker pull {reference} failed: {stderr}" | ||
| ))); |
There was a problem hiding this comment.
The current error formatting will duplicate text ("docker pull failed: docker pull ... failed: ...") and only includes stderr, which can hide useful diagnostics if Docker writes details to stdout (or if stderr includes trailing newlines). Consider adjusting ImageDigestValidationFailed::DockerPullFailed to carry structured info (reference, exit status, stdout/stderr) and format it once, with trimmed output.
Simplify LauncherConfig to use a single 'image' field that is the full Docker image reference (e.g., 'nearone/mpc-node' for Docker Hub or 'ghcr.io/nearone/mpc-node' for other registries). Removed fields: image_tags, image_name, registry, rpc_request_timeout_secs, rpc_request_interval_secs, rpc_max_attempts (all were only used by the now-deleted registry API code). Updated README and tests to reflect the new config format.
| | `rpc_request_timeout_secs` | Yes | Per-request timeout for registry API calls (seconds) | | ||
| | `rpc_request_interval_secs` | Yes | Initial retry interval for registry API calls (seconds) | | ||
| | `rpc_max_attempts` | Yes | Maximum registry API retry attempts | | ||
| | `image` | Yes | Full Docker image reference. Include registry prefix for non-Docker Hub registries. Examples: `"nearone/mpc-node"` (Docker Hub), `"ghcr.io/nearone/mpc-node"` | |
There was a problem hiding this comment.
The registry, image_name, and image_tags fields were replaced with a single image field. These fields existed
because the old code needed to construct Docker registry API URLs separately
(https://{registry}/v2/{image_name}/manifests/{tag}). Now that we pull directly with docker pull
image@sha256:, Docker handles registry resolution natively from the image reference. A single image field
(e.g., nearone/mpc-node or ghcr.io/nearone/mpc-node) is all that's needed and matches standard Docker conventions.
The rpc_* fields (timeout, interval, max_attempts) were also removed as they only controlled the now-deleted
registry retry logic.
There was a problem hiding this comment.
I am not 100% about this, but of course could be certainly the case
| ## Supported Registries | ||
|
|
||
| The launcher uses the [OCI Distribution Specification](https://github.com/opencontainers/distribution-spec) for registry communication. Auth endpoints are discovered automatically via the `WWW-Authenticate` challenge on `/v2/`, so there is no hard-coded auth URL. | ||
| The launcher pulls images using `docker pull <image>@sha256:<digest>`. Any registry that Docker supports works out of the box. Set the `image` field to include the registry prefix: |
There was a problem hiding this comment.
Note on removing the OCI Distribution Spec client (oci-client):
The old code used oci-client to interact with registries directly via the OCI Distribution Spec. This gave us: (1)
automatic auth challenge discovery via WWW-Authenticate, (2) multi-platform manifest resolution to amd64/linux, and
(3) config digest verification inside the manifest before pulling.
With the new docker pull image@sha256:<manifest_digest> approach, all of this is handled by Docker itself:
- Auth: Docker uses its credential helpers (~/.docker/config.json), which actually supports private registries —
the old code only supported anonymous/public images - Multi-platform: Not needed — we pull by exact manifest digest, which already points to a single-platform manifest
(platform was resolved at build time) - Digest verification: Docker verifies the pulled content matches the digest — if it doesn't, the pull fails
Net result: same guarantees, 430 fewer lines, and we gained private registry support.
… digests Update image hash inspection, voting, and verification sections to reflect the switch from config digests (docker inspect .ID) to manifest digests (skopeo inspect .Digest / build script output).
839a7d3 to
3d61145
Compare
Summary
Remove the Docker registry API from the tee-launcher. The approved hashes file now contains manifest digests, so the launcher can
docker pull image@sha256:<digest>directly without querying the registry.Removed (~630 lines):
registry.rsentirely — Docker registry auth, manifest fetching, retry/backoff, multi-platform resolution, config digest matchingoci-client,backon,rustls,rstestdependenciesmain.rsexternal-services-testsfeature flagSimplified:
validation.rs: 79 → 33 lines (justdocker pull+ verify exit code)error.rs: 90 → 65 lineslib.rs: now emits manifest digest to RTMR3 (not config digest)Note:
LauncherConfigstill acceptsrpc_*fields for backward compat with existing TOML configs. They are unused — removal is a follow-up.Closes #2627. Plan and research in the issue comment.
Requires coordination
Test plan
cargo check -p tee-launcherpassescargo test -p tee-launcherpasses (30 tests)