feat: align workflow spec and runtime with dapr durable control flow #6

Open
vpittamp wants to merge 14 commits into main from 102-dsl

vpittamp commented Feb 16, 2026

Summary

  • upgrade workflow spec to strict workflow-spec/v2 and tighten lint/decompile/compile behavior
  • add explicit workflow-control step type (schema, linting, UI node/config, orchestrator execution)
  • remove MCP side-channel control (respond / __workflow_builder_control) and use explicit durable control flow
  • align AI workflow generation guidance with Dapr-compatible control flow

Validation

  • pnpm type-check
  • pnpm fix
  • python -m py_compile services/workflow-orchestrator/workflows/dynamic_workflow.py services/workflow-orchestrator/core/types.py

vpittamp and others added 14 commits February 14, 2026 12:36
…ol call serialization

- Add execute-by-id endpoint to workflow-orchestrator for service-to-service workflow execution
- Add run_workflow, get_workflow_execution_status, approve_workflow MCP tools to mastra-agent-tanstack
- Add RunWorkflow UI component with form, status polling, and approval gate UI
- Fix AI SDK tool call serialization (getter properties don't survive JSON.stringify)
- Fix MCP client pool session management to prevent OOM from session churn
- Add LogsViewer component, k8s sandbox config, and Dapr pub/sub event publishing
- Update function-router registry and credential handling for mastra/execute action type

Co-Authored-By: Claude Opus 4.6 <[email protected]>
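
The serialization fix mentioned above ("getter properties don't survive JSON.stringify") can be reproduced in isolation. The sketch below assumes the tool call exposes its fields through prototype getters; `ToolCallLike` and `toPlainToolCall` are hypothetical names for illustration, not types from the AI SDK or from this PR.

```typescript
// Hypothetical stand-in for an SDK object whose fields are prototype getters.
class ToolCallLike {
  get toolCallId(): string {
    return "call_1";
  }
  get toolName(): string {
    return "run_workflow";
  }
}

interface PlainToolCall {
  toolCallId: string;
  toolName: string;
}

// Copy the getter values onto a plain object so they become own,
// enumerable properties that JSON.stringify will include.
function toPlainToolCall(tc: ToolCallLike): PlainToolCall {
  return { toolCallId: tc.toolCallId, toolName: tc.toolName };
}

const call = new ToolCallLike();

// Prototype getters are not own properties, so they are silently dropped.
const lossy = JSON.stringify(call);

// The plain copy serializes with both fields intact.
const safe = JSON.stringify(toPlainToolCall(call));

console.log(lossy);
console.log(safe);
```

Copying fields explicitly is only one possible fix; the commit does not say which approach it took.
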
Mastra wraps AI SDK tool calls in an envelope:
  { type: "tool-call", payload: { toolCallId, toolName, args } }
The previous fix tried accessing tc.toolName directly, which was undefined.
Now unwraps from tc.payload first. Also removes the otel deps (from another
process) that caused a CJS/ESM import crash on startup.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
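
A minimal sketch of the unwrapping described in this commit, using only the envelope shape quoted above. The interface and function names are hypothetical, and accepting a flat (already unwrapped) object as a fallback is an assumption, not something the commit states.

```typescript
// Fields carried by a tool call, as listed in the commit message.
interface ToolCallFields {
  toolCallId: string;
  toolName: string;
  args: unknown;
}

// The envelope shape quoted in the commit message.
interface ToolCallEnvelope {
  type: "tool-call";
  payload: ToolCallFields;
}

// Read fields from tc.payload when present; otherwise treat tc as flat.
function unwrapToolCall(tc: ToolCallEnvelope | ToolCallFields): ToolCallFields {
  const inner = "payload" in tc ? tc.payload : tc;
  return {
    toolCallId: inner.toolCallId,
    toolName: inner.toolName,
    args: inner.args,
  };
}

const wrapped = unwrapToolCall({
  type: "tool-call",
  payload: { toolCallId: "c1", toolName: "run_workflow", args: { id: "wf-1" } },
});

const flat = unwrapToolCall({ toolCallId: "c2", toolName: "approve_workflow", args: {} });

console.log(wrapped.toolName, flat.toolName);
```
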
…legacy services

Remove planner-dapr-agent service and all planner/* action types. All agent
actions now route through mastra-agent-tanstack exclusively.

Fix orchestrator routing: only `mastra/execute` (async, fire-and-forget) goes
through `process_agent_child_workflow`. Sync actions like `mastra/clone` and
`mastra/plan` now correctly route through `execute_action` → function-router →
mastra-agent-tanstack's proper endpoints.

Fix workflow-mcp-server's execute_workflow tool to use the orchestrator's
`execute-by-id` endpoint which properly serializes React Flow nodes, fixing
approval gate timeouts (was 60s default instead of configured 300s) and
missing node labels.

Also removes workflow-orchestrator-ts-archived, legacy dapr workflow API
routes, planner plugin, and workflow-executor. Updates CLAUDE.md to reflect
the current architecture.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
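
The routing rule in this commit can be summarized as a single branch. The real orchestrator is a Python service, so the TypeScript below is only an illustration of the decision; `routeAction` is a hypothetical name, and the return values are simply the two handler names mentioned in the commit message.

```typescript
type Route = "process_agent_child_workflow" | "execute_action";

// Only the async, fire-and-forget action goes to the child workflow.
// Every other action type (including mastra/clone and mastra/plan)
// takes the synchronous execute_action path through function-router.
function routeAction(actionType: string): Route {
  return actionType === "mastra/execute"
    ? "process_agent_child_workflow"
    : "execute_action";
}

const routes = ["mastra/execute", "mastra/clone", "mastra/plan"].map(routeAction);
console.log(routes);
```
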
Replace mastra-agent-tanstack's in-memory agent loop with a durable
Dapr Workflow implementation that survives restarts and has built-in
retries and deterministic replay. Deployed alongside the existing
service (Phase 1) so both agent/* and durable/* action types work.

- Copy @dapr-agents/durable-agent library (workflow activities, state
  management, orchestration, pub/sub, registry, observability)
- Port K8s sandbox, remote filesystem, and sandbox config from
  mastra-agent-tanstack (standalone, no @mastra/core dependency)
- Create workspace tools as DurableAgentTool objects (read/write/edit
  files, execute commands, clone repos)
- Add Express server with same API surface (run, plan, execute-plan,
  tools, health, dapr subscriptions)
- Integrate with orchestrator (durable/* action routing) and
  function-router (durable-agent service resolution)
- Register durable-agent plugin with durable/run action type

Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Upgrade ai to ^6.0.0 and @ai-sdk/openai to ^3.0.0
- Use LanguageModel type (replaces LanguageModelV1)
- Use openai.chat() to force Chat Completions API (not Responses API)
- Fix tool-call content parts: use `input` field (not `args`)
- Fix tool-result output: use discriminated union { type: "json", value }
- Build zodToJsonSchema converter for tool input schemas
- Replace pub/sub completion with direct Dapr service invocation to
  orchestrator raise_event API (bypasses component scoping mismatch)
- Use DaprWorkflowClient.waitForWorkflowCompletion() for reliable polling
- Handle WorkflowRuntimeStatus as numeric enum (COMPLETED=1)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
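
The two message-shape fixes in the list above (`input` instead of `args`, and a `{ type: "json", value }` tool-result output) might look like the sketch below. The interfaces are simplified local mirrors written for this example, not imports from the `ai` package, and only the `json` output variant named in the commit is modeled. The `read_file` tool name and both helper functions are hypothetical.

```typescript
// Older shape that carried arguments under `args`.
interface LegacyToolCall {
  toolCallId: string;
  toolName: string;
  args: unknown;
}

// Tool-call content part using `input`, as the commit describes.
interface ToolCallPart {
  type: "tool-call";
  toolCallId: string;
  toolName: string;
  input: unknown;
}

// Tool-result content part with a tagged output object.
interface ToolResultPart {
  type: "tool-result";
  toolCallId: string;
  toolName: string;
  output: { type: "json"; value: unknown };
}

function toToolCallPart(legacy: LegacyToolCall): ToolCallPart {
  return {
    type: "tool-call",
    toolCallId: legacy.toolCallId,
    toolName: legacy.toolName,
    input: legacy.args, // renamed field: args -> input
  };
}

function toToolResultPart(call: ToolCallPart, result: unknown): ToolResultPart {
  return {
    type: "tool-result",
    toolCallId: call.toolCallId,
    toolName: call.toolName,
    output: { type: "json", value: result }, // wrap the raw result in the tagged form
  };
}

const callPart = toToolCallPart({
  toolCallId: "call_1",
  toolName: "read_file",
  args: { path: "README.md" },
});
const resultPart = toToolResultPart(callPart, { ok: true });

console.log(JSON.stringify(resultPart.output));
```
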
…retention

The Runs tab lost per-node output data when navigating away because the
status polling endpoint updated status/phase/progress but never wrote
output to workflow_executions. This adds two complementary persistence
paths:

1. Status polling endpoint now writes output on terminal status
2. New persist_results_to_db orchestrator activity writes output directly
   at workflow completion (belt-and-suspenders for closed-browser case)

Co-Authored-By: Claude Opus 4.6 <[email protected]>
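
A rough sketch of the first persistence path (write output once the polled status is terminal). An in-memory `Map` stands in for the `workflow_executions` table, and the status names, the `applyPolledStatus` helper, and the first-write-wins guard are all assumptions made for illustration. The guard is one way to let both paths write without clobbering each other.

```typescript
interface ExecutionRow {
  status: string;
  output?: unknown;
}

// Assumed terminal status names; the PR does not list them.
const TERMINAL = new Set<string>(["success", "error", "cancelled"]);

// In-memory stand-in for the workflow_executions table.
const workflowExecutions = new Map<string, ExecutionRow>();

function applyPolledStatus(id: string, status: string, output: unknown): ExecutionRow {
  const row: ExecutionRow = workflowExecutions.get(id) ?? { status };
  row.status = status;
  // Persist output only on terminal status, and only if nothing was
  // written yet, so a second writer cannot overwrite the stored result.
  if (TERMINAL.has(status) && row.output === undefined) {
    row.output = output;
  }
  workflowExecutions.set(id, row);
  return row;
}

const whileRunning = applyPolledStatus("exec-1", "running", { partial: true }).output;
const onFinish = applyPolledStatus("exec-1", "success", { nodes: 3 }).output;
const onRepeat = applyPolledStatus("exec-1", "success", { nodes: 99 }).output;

console.log(whileRunning, onFinish, onRepeat);
```
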
- Add maxTurns as UI-editable field on mastra-run, mastra/execute, and
  durable/run actions, flowing through orchestrator to durable-agent
- Increase default maxIterations from 10 to 50 for durable-agent
- Accumulate all tool calls across ReAct turns in AgentWorkflowResult
- Strengthen execution prompt to ensure agent completes all plan steps
- Update CLAUDE.md with durable-agent, node-sandbox, and current architecture

Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Add automated kill-and-restart durability test (scripts/test-dapr-durability.ts)
  that proves durable-agent workflows survive pod kills and resume to completion
- Fix crash recovery gap: pass previous tool results through generator to repair
  Redis conversation state after crashes between saveToolResults and next callLlm
- Add safety check that detects broken assistant messages (tool_calls without
  matching tool results) and truncates conversation to last clean state
- Add eager WorkflowRuntime initialization at startup so pending workflows replay
  immediately on pod restart instead of waiting for first /api/run request
- Add user message deduplication guard in callLlm activity

Co-Authored-By: Claude Opus 4.6 <[email protected]>
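
The safety check described above (detect an assistant message whose tool_calls lack matching tool results, then truncate to the last clean state) could be sketched as follows. The `Message` shape follows the common chat-completions layout and is an assumption, as is the `truncateToLastCleanState` name; the durable-agent code may store conversation state differently.

```typescript
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_calls?: { id: string }[];
  tool_call_id?: string;
}

// Return the longest prefix in which every assistant tool call is
// answered by a tool message that directly follows it.
function truncateToLastCleanState(messages: Message[]): Message[] {
  for (let i = 0; i < messages.length; i++) {
    const msg = messages[i];
    const calls = msg?.tool_calls ?? [];
    if (msg?.role !== "assistant" || calls.length === 0) continue;

    // Collect the ids answered by the run of tool messages after index i.
    const answered = new Set<string>();
    for (let j = i + 1; messages[j]?.role === "tool"; j++) {
      const id = messages[j]?.tool_call_id;
      if (id !== undefined) answered.add(id);
    }

    // A missing result means the crash happened mid-turn: cut here.
    if (!calls.every((c) => answered.has(c.id))) return messages.slice(0, i);
  }
  return messages;
}

const broken: Message[] = [
  { role: "user", content: "clone the repo and run tests" },
  { role: "assistant", content: "", tool_calls: [{ id: "c1" }] },
  { role: "tool", content: "cloned", tool_call_id: "c1" },
  { role: "assistant", content: "", tool_calls: [{ id: "c2" }, { id: "c3" }] },
  { role: "tool", content: "tests passed", tool_call_id: "c2" },
];

const repaired = truncateToLastCleanState(broken);
console.log(repaired.length);
```

Truncating rather than patching in placeholder results keeps the stored conversation valid for the next LLM call, at the cost of re-running the interrupted turn.
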
vercel bot commented Feb 16, 2026

Deployment status: v0-workflow-builder is Ready (updated Feb 16, 2026 5:18pm UTC).
