codemode: fix Start() fail-fast and use tools.As for wrapper unwrapping by dgageot · Pull Request #2007 · docker/docker-agent

dgageot · 2026-03-09T16:28:45Z

Problem

codeModeTool.Start() had two issues:

Fail-fast behavior — it returned on the first error, so one broken toolset (e.g. an unreachable MCP server) prevented all remaining toolsets from starting.
Direct type assertion — both Start() and Stop() used t.(tools.Startable) instead of tools.As, which cannot discover Startable through wrapper chains (e.g. StartableToolSet).

Fix

Collect all errors with errors.Join in Start(), matching the existing Stop() behavior.
Switch to tools.As[tools.Startable](t) so the capability is found at any depth in the wrapper chain.
Extract the now-identical Start/Stop loops into a shared forEachStartable helper to remove duplication.

docker-agent

Review Summary

Assessment: 🔴 CRITICAL

This refactoring introduces a significant semantic change to Start() behavior. The method now attempts to start all toolsets even after failures (collect-all-errors), whereas the original implementation failed-fast on the first error.

Key concern: Without state tracking for which toolsets successfully started, the system can be left in a partially-started state with no rollback mechanism. If toolset 1 starts successfully but toolset 2 fails, there's no way to clean up toolset 1.

Findings

🔴 1 CRITICAL issue
🟡 1 NOTABLE issue

Note: This review uses COMMENT mode only. Human review is required for merge decisions.

docker-agent · 2026-03-09T16:34:08Z

pkg/tools/codemode/codemode.go

-	}
-
-	return nil
+	return c.forEachStartable(func(s tools.Startable) error { return s.Start(ctx) })


🔴 CRITICAL: Semantic breaking change — fail-fast → collect-all-errors

The original Start() implementation returned immediately on the first error:

if err := startable.Start(ctx); err != nil { return err // fail-fast }

The new implementation attempts all toolsets regardless of failures:

return c.forEachStartable(func(s tools.Startable) error { return s.Start(ctx) }) // forEachStartable collects all errors with errors.Join

Problem: If toolset A starts successfully but toolset B fails, the system is left in a partially-started state. There's no tracking of which toolsets succeeded, so:

The caller cannot determine which toolsets are running

No rollback mechanism exists to stop successfully-started toolsets

Stop() will attempt to stop all toolsets regardless of whether they started

Impact: This changes startup semantics from "abort on first failure" to "try everything and report all failures". While the PR description mentions this is intentional ("one broken toolset shouldn't prevent others"), this could leave the system in an inconsistent state.

Recommendation: Consider adding state tracking (e.g., a started []tools.Startable field) to enable cleanup/rollback when Start() returns an error, or document that toolset Stop() implementations must handle being stopped without being started.

docker-agent · 2026-03-09T16:34:09Z

pkg/tools/codemode/codemode.go

-	}
-
-	return nil
+	return c.forEachStartable(func(s tools.Startable) error { return s.Start(ctx) })


🟡 NOTABLE: No state tracking for cleanup after partial startup failures

When Start() returns an error after attempting all toolsets, the caller has no visibility into which toolsets successfully started and which failed. This makes cleanup/rollback impossible.

The codeModeTool maintains no internal state about which toolsets are running, so if Start() fails:

Successfully-started toolsets remain running

No mechanism exists to selectively stop only the started ones

Calling Stop() will attempt to stop all toolsets, including those that never started (which may or may not handle this gracefully)

Recommendation: Either:

Add internal state tracking: started []tools.Startable

Implement automatic rollback in Start() on any error

Document that toolset Stop() implementations must be idempotent and safe to call on non-started toolsets

Start() returned on the first error, so one broken toolset (e.g. an unreachable MCP server) prevented all remaining toolsets from starting. Collect all errors with errors.Join, matching the existing Stop() behavior. Both Start() and Stop() used a direct type assertion to find Startable, which cannot see through wrapper chains (e.g. StartableToolSet). Switch to tools.As so the capability is discovered at any depth. Extract the now-identical loops into a shared forEachStartable helper to remove the duplication. Assisted-By: docker-agent

dgageot requested a review from a team as a code owner March 9, 2026 16:28

docker-agent bot reviewed Mar 9, 2026

View reviewed changes

dgageot force-pushed the board/fix-code-mode-71471ec1 branch from dbf90e1 to a6e51ce Compare March 9, 2026 16:58

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

codemode: fix Start() fail-fast and use tools.As for wrapper unwrapping#2007

codemode: fix Start() fail-fast and use tools.As for wrapper unwrapping#2007
dgageot wants to merge 1 commit intodocker:mainfrom
dgageot:board/fix-code-mode-71471ec1

dgageot commented Mar 9, 2026

Uh oh!

docker-agent bot left a comment

Uh oh!

docker-agent bot Mar 9, 2026

Uh oh!

docker-agent bot Mar 9, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dgageot commented Mar 9, 2026

Problem

Fix

Uh oh!

docker-agent bot left a comment

Choose a reason for hiding this comment

Review Summary

Findings

Uh oh!

docker-agent bot Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

docker-agent bot Mar 9, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant