Skip to content

refactor: simplify launcher image validation by accepting manifest digest directly#2785

Draft
barakeinav1 wants to merge 3 commits intomainfrom
barak/simplify-image-validation
Draft

refactor: simplify launcher image validation by accepting manifest digest directly#2785
barakeinav1 wants to merge 3 commits intomainfrom
barak/simplify-image-validation

Conversation

@barakeinav1
Copy link
Copy Markdown
Contributor

Summary

Remove the Docker registry API from the tee-launcher. The approved hashes file now contains manifest digests, so the launcher can docker pull image@sha256:<digest> directly without querying the registry.

Removed (~630 lines):

  • registry.rs entirely — Docker registry auth, manifest fetching, retry/backoff, multi-platform resolution, config digest matching
  • oci-client, backon, rustls, rstest dependencies
  • All registry-related error variants
  • rustls crypto provider setup in main.rs
  • external-services-tests feature flag

Simplified:

  • validation.rs: 79 → 33 lines (just docker pull + verify exit code)
  • error.rs: 90 → 65 lines
  • lib.rs: now emits manifest digest to RTMR3 (not config digest)

Note: LauncherConfig still accepts rpc_* fields for backward compat with existing TOML configs. They are unused — removal is a follow-up.

Closes #2627. Plan and research in the issue comment.

Requires coordination

Test plan

  • cargo check -p tee-launcher passes
  • cargo test -p tee-launcher passes (30 tests)

… manifest digest

Remove the Docker registry API interaction from the launcher. The approved
hashes file now contains manifest digests (instead of config digests), so
the launcher can pull directly with `docker pull image@sha256:<digest>`
without querying the registry to resolve config digest -> manifest digest.

Removed:
- registry.rs (~430 lines) -- Docker registry auth, manifest fetching,
  retry logic, multi-platform resolution, config digest matching
- oci-client, backon, rustls, rstest dependencies
- Registry-related error variants
- rustls crypto provider setup in main.rs
- external-services-tests feature flag

Simplified:
- validation.rs: 79 lines -> 33 lines (just docker pull + verify)
- error.rs: 90 lines -> 65 lines
- lib.rs: emits manifest digest (not config digest) to RTMR3

Closes #2627
@barakeinav1 barakeinav1 requested a review from Copilot April 9, 2026 14:25
@claude
Copy link
Copy Markdown

claude bot commented Apr 9, 2026

Code Review

Clean simplification. The removal of the OCI registry client in favor of direct docker pull image@sha256:<digest> is straightforward and correct. No critical issues found.

Minor observations (non-blocking):

  • The sync pull_and_verify call inside the async run() is fine here since this is a sequential launcher binary (and other calls like launch_mpc_container are already sync in the same context).
  • The PR description notes that LauncherConfig still has unused rpc_* fields for backward compat — the follow-up cleanup is tracked.

✅ Approved

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors tee-launcher image validation to accept and operate directly on manifest digests, removing the OCI/Docker registry API client and associated retry/auth logic so the launcher can docker pull <image>@sha256:<digest> without registry lookups.

Changes:

  • Replace registry-based digest resolution with a direct docker pull by manifest digest (pull_and_verify).
  • Remove the registry.rs module and drop related dependencies (oci-client, backon, rustls, rstest) and error variants.
  • Emit the selected manifest digest to dstack (RTMR3) and propagate it through tee config / compose launch.

Reviewed changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
crates/tee-launcher/src/validation.rs Replaces async registry+inspect validation with a minimal pull-by-digest check.
crates/tee-launcher/src/registry.rs Deletes the registry client, digest resolution logic, and related tests.
crates/tee-launcher/src/main.rs Removes rustls crypto provider initialization that was required for prior TLS usage.
crates/tee-launcher/src/lib.rs Switches selection/validation flow to manifest digests and emits manifest digest to RTMR3.
crates/tee-launcher/src/error.rs Simplifies error enums by removing registry-related variants and tightening run/pull errors.
crates/tee-launcher/Cargo.toml Removes registry/test-related features and dependencies.
Cargo.lock Removes transitive packages no longer needed after dropping the registry client stack.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 12 to 18
[[bin]]
name = "tee-launcher"
path = "src/main.rs"

[features]
external-services-tests = []

[dependencies]
backon = { workspace = true }
clap = { workspace = true }
dstack-sdk = { workspace = true }
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the external-services-tests feature flag will break existing CI/docs that still run cargo nextest ... -p tee-launcher --features external-services-tests (e.g. .github/workflows/external-services-tests.yml). Either keep this feature as a no-op for compatibility, or update the workflow/docs to stop passing the feature for tee-launcher.

Copilot uses AI. Check for mistakes.
Comment on lines +39 to 48
// The approved hashes file now contains manifest digests.
// We can pull directly by digest without querying the Docker registry API.
let manifest_digest = select_image_hash(
approved_hashes_on_disk.as_ref(),
&args.default_image_digest,
config.launcher_config.mpc_hash_override.as_ref(),
)?;

let manifest_digest = validate_image_hash(&config.launcher_config, image_hash.clone()).await?;
pull_and_verify(&config.launcher_config.image_name, &manifest_digest)?;

Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pull_and_verify is invoked with only launcher_config.image_name, and launcher_config.registry is no longer used anywhere. Configs/docs currently model non-DockerHub registries as registry=<host> + image_name=<repo>, which would now attempt to pull from the default registry instead of the configured one. Consider constructing a fully-qualified image reference (or explicitly requiring image_name to include the registry host) and using that consistently for both pull and compose launch.

Copilot uses AI. Check for mistakes.
Comment on lines 20 to +26
.map_err(|e| ImageDigestValidationFailed::DockerPullFailed(e.to_string()))?;

let pull_failed = !pull.status.success();
if pull_failed {
return Err(ImageDigestValidationFailed::DockerPullFailed(
"docker pull terminated with unsuccessful status".to_string(),
));
}

// Verify that the pulled image ID matches the expected config digest.
// `docker inspect .ID` returns the image ID, which equals the config digest
// (i.e. the sha256 of the image config blob).
let inspect = Command::new("docker")
.args([
"image",
"inspect",
"--format",
"{{index .ID}}",
&name_and_digest,
])
.output()
.map_err(|e| ImageDigestValidationFailed::DockerInspectFailed(e.to_string()))?;

let docker_inspect_failed = !inspect.status.success();
if docker_inspect_failed {
return Err(ImageDigestValidationFailed::DockerInspectFailed(
"docker inspect terminated with unsuccessful status".to_string(),
));
}

let pulled_image_id: DockerSha256Digest = String::from_utf8_lossy(&inspect.stdout)
.trim()
.to_string()
.parse()
.map_err(|e| {
ImageDigestValidationFailed::DockerInspectFailed(format!(
"docker inspect returned invalid image ID: {e}"
))
})?;

if pulled_image_id != image_hash {
return Err(
ImageDigestValidationFailed::PulledImageHasMismatchedDigest {
pulled_image_id,
expected_image_id: image_hash,
},
);
if !pull.status.success() {
let stderr = String::from_utf8_lossy(&pull.stderr);
return Err(ImageDigestValidationFailed::DockerPullFailed(format!(
"docker pull {reference} failed: {stderr}"
)));
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current error formatting will duplicate text ("docker pull failed: docker pull ... failed: ...") and only includes stderr, which can hide useful diagnostics if Docker writes details to stdout (or if stderr includes trailing newlines). Consider adjusting ImageDigestValidationFailed::DockerPullFailed to carry structured info (reference, exit status, stdout/stderr) and format it once, with trimmed output.

Copilot uses AI. Check for mistakes.
Simplify LauncherConfig to use a single 'image' field that is the full
Docker image reference (e.g., 'nearone/mpc-node' for Docker Hub or
'ghcr.io/nearone/mpc-node' for other registries).

Removed fields: image_tags, image_name, registry, rpc_request_timeout_secs,
rpc_request_interval_secs, rpc_max_attempts (all were only used by the
now-deleted registry API code).

Updated README and tests to reflect the new config format.
| `rpc_request_timeout_secs` | Yes | Per-request timeout for registry API calls (seconds) |
| `rpc_request_interval_secs` | Yes | Initial retry interval for registry API calls (seconds) |
| `rpc_max_attempts` | Yes | Maximum registry API retry attempts |
| `image` | Yes | Full Docker image reference. Include registry prefix for non-Docker Hub registries. Examples: `"nearone/mpc-node"` (Docker Hub), `"ghcr.io/nearone/mpc-node"` |
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registry, image_name, and image_tags fields were replaced with a single image field. These fields existed
because the old code needed to construct Docker registry API URLs separately
(https://{registry}/v2/{image_name}/manifests/{tag}). Now that we pull directly with docker pull
image@sha256:, Docker handles registry resolution natively from the image reference. A single image field
(e.g., nearone/mpc-node or ghcr.io/nearone/mpc-node) is all that's needed and matches standard Docker conventions.
The rpc_* fields (timeout, interval, max_attempts) were also removed as they only controlled the now-deleted
registry retry logic.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not 100% about this, but of course could be certainly the case

## Supported Registries

The launcher uses the [OCI Distribution Specification](https://github.com/opencontainers/distribution-spec) for registry communication. Auth endpoints are discovered automatically via the `WWW-Authenticate` challenge on `/v2/`, so there is no hard-coded auth URL.
The launcher pulls images using `docker pull <image>@sha256:<digest>`. Any registry that Docker supports works out of the box. Set the `image` field to include the registry prefix:
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note on removing the OCI Distribution Spec client (oci-client):

The old code used oci-client to interact with registries directly via the OCI Distribution Spec. This gave us: (1)
automatic auth challenge discovery via WWW-Authenticate, (2) multi-platform manifest resolution to amd64/linux, and
(3) config digest verification inside the manifest before pulling.

With the new docker pull image@sha256:<manifest_digest> approach, all of this is handled by Docker itself:

  • Auth: Docker uses its credential helpers (~/.docker/config.json), which actually supports private registries —
    the old code only supported anonymous/public images
  • Multi-platform: Not needed — we pull by exact manifest digest, which already points to a single-platform manifest
    (platform was resolved at build time)
  • Digest verification: Docker verifies the pulled content matches the digest — if it doesn't, the pull fails

Net result: same guarantees, 430 fewer lines, and we gained private registry support.

@barakeinav1 barakeinav1 marked this pull request as draft April 9, 2026 14:42
… digests

Update image hash inspection, voting, and verification sections to reflect
the switch from config digests (docker inspect .ID) to manifest digests
(skopeo inspect .Digest / build script output).
@barakeinav1 barakeinav1 force-pushed the barak/simplify-image-validation branch from 839a7d3 to 3d61145 Compare April 10, 2026 07:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[rust-launcher] Simplify image validation by accepting manifest hash directly

3 participants