Description
The wgc router compose command hangs indefinitely when federation composition produces errors and the compose config references 4 or more subgraphs. The process prints:
We found composition errors, while composing. Please check the errors below:
…and then deadlocks. The error table is never rendered. All threads block on futex waits.
With 3 subgraphs and the same composition errors, the table renders correctly and the process exits with code 1.
Versions affected
- wgc@0.102.4
- wgc@0.108.0
(Likely all recent versions — the table rendering code path hasn't changed between these releases.)
Minimal reproduction
1. Create subgraph schemas
subgraph-a.graphql
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.5", import: ["@key"])

type Query {
  foos: [Foo!]!
}

type Foo @key(fields: "id") {
  id: ID!
  name: String!
}

subgraph-b.graphql
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.5", import: ["@key", "@external", "@override"])

type Query {
  bars: [Bar!]!
}

type Foo @key(fields: "id") {
  id: ID!
  description: String! @override(from: "subgraph-c")
}

type Bar @key(fields: "id") {
  id: ID!
  title: String!
}

subgraph-c.graphql
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.5", import: ["@key", "@external", "@override"])

type Query {
  baz: String
}

type Foo @key(fields: "id") {
  id: ID!
  description: String! @override(from: "subgraph-b")
}

subgraph-d.graphql
extend schema
  @link(url: "https://specs.apollo.dev/federation/v2.5", import: ["@key"])

type Query {
  qux: [Qux!]!
}

type Qux @key(fields: "id") {
  id: ID!
  value: Int!
}

Note: subgraph-b overrides `description` from subgraph-c, while subgraph-c overrides `description` from subgraph-b. This is a circular override that produces a composition error.
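The mutual override in the note can be spotted mechanically before running compose. The following is a rough heuristic sketch, not part of wgc: it assumes each subgraph name equals its schema file name, that fields are written one per line without arguments, and that types are declared as plain `type X` (no `extend type`). The helper name `find_override_cycles` is hypothetical.

```shell
#!/usr/bin/env bash
# Report fields that two subgraphs override from each other.
# Usage: find_override_cycles FILE.graphql [FILE.graphql ...]
find_override_cycles() {
  awk '
    FNR == 1 {
      # Assumption: subgraph name = file name without directory and extension.
      sub_name = FILENAME
      sub(/^.*\//, "", sub_name)
      sub(/\.graphql$/, "", sub_name)
    }
    # Remember which type the following field lines belong to.
    $1 == "type" { type_name = $2 }
    match($0, /@override\(from: "[^"]+"\)/) {
      # The quoted subgraph name starts 17 characters into the match
      # and is followed by the two closing characters.
      from = substr($0, RSTART + 17, RLENGTH - 19)
      field = $1
      sub(/:$/, "", field)
      seen[sub_name SUBSEP type_name "." field SUBSEP from] = 1
    }
    END {
      for (k in seen) {
        split(k, p, SUBSEP)
        # Print each mutual pair once (ordered by subgraph name).
        if (p[1] < p[3] && ((p[3] SUBSEP p[2] SUBSEP p[1]) in seen))
          printf "%s: %s <-> %s\n", p[2], p[1], p[3]
      }
    }
  ' "$@"
}
```

Called as `find_override_cycles subgraph-*.graphql` on the four schemas above, this should flag `Foo.description` between subgraph-b and subgraph-c.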
2. Create compose config
compose.yaml
version: 1
subgraphs:
  - name: subgraph-a
    routing_url: http://localhost:4001/graphql
    schema:
      file: ./subgraph-a.graphql
  - name: subgraph-b
    routing_url: http://localhost:4002/graphql
    schema:
      file: ./subgraph-b.graphql
  - name: subgraph-c
    routing_url: http://localhost:4003/graphql
    schema:
      file: ./subgraph-c.graphql
  - name: subgraph-d
    routing_url: http://localhost:4004/graphql
    schema:
      file: ./subgraph-d.graphql

3. Run compose
npx wgc@0.108.0 router compose -i compose.yaml

Note on minimal repro: The deadlock may depend on the total size/complexity of the error messages rendered into the table, not purely the subgraph count. If the minimal schemas above don't trigger it, the real-world scenario below reliably does. The threshold appears to be 4+ subgraphs producing errors with enough combined text to overflow an internal buffer in the table renderer.
Real-world scenario that reliably reproduces
In a real project with 4 subgraphs (e.g. repos, issues, nodes, monolith) where composition errors arise from field conflicts or override cycles, the hang occurs 100% of the time. Removing any one subgraph from the config (reducing to 3) causes the error table to render normally.
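To probe the 3-versus-4 threshold systematically, one option is to generate a config for every leave-one-out subset and run each one under a deadline. This is a minimal sketch, assuming schema files are named after their subgraph (NAME.graphql) and follow the compose.yaml layout above. The helper names `write_compose_config` and `write_leave_one_out_configs` are hypothetical, and ports are simply renumbered from 4001, which should not matter for composition.

```shell
#!/usr/bin/env bash
# Print a compose config for the given subgraph names on stdout.
# Runs in a subshell so the loop variables do not leak.
write_compose_config() (
  port=4001
  echo "version: 1"
  echo "subgraphs:"
  for name in "$@"; do
    echo "  - name: $name"
    echo "    routing_url: http://localhost:$port/graphql"
    echo "    schema:"
    echo "      file: ./$name.graphql"
    port=$((port + 1))
  done
)

# For each given name, write compose-without-NAME.yaml containing all
# the other subgraphs.
write_leave_one_out_configs() {
  for skip in "$@"; do
    (
      # Rotate through the positional parameters, dropping $skip.
      for name in "$@"; do
        shift
        [ "$name" = "$skip" ] || set -- "$@" "$name"
      done
      write_compose_config "$@" > "compose-without-$skip.yaml"
    )
  done
}

# Example:
#   write_leave_one_out_configs subgraph-a subgraph-b subgraph-c subgraph-d
```

Each generated file can then be passed to `npx wgc@0.108.0 router compose -i` to see which subsets hang and which render the table. Note that leaving out subgraph-b or subgraph-c also changes which composition errors are produced.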
Expected behavior
The composition error table is printed to stderr and the process exits with code 1, e.g.:
We found composition errors, while composing. Please check the errors below:
┌─────────────────────────────────────────────────────┐
│ COMPOSITION ERRORS │
├─────────────────────────────────────────────────────┤
│ <error details> │
└─────────────────────────────────────────────────────┘
This is exactly what happens with 3 subgraphs.
Actual behavior
The process prints the header message and then hangs forever:
We found composition errors, while composing. Please check the errors below:
█ (cursor blinks here indefinitely — no table, no exit)
The process must be killed with SIGKILL; SIGTERM and SIGINT are ignored once the deadlock occurs.
Debugging: strace output
Running under strace -f shows all threads blocked on futex waits — a classic deadlock:
6434 futex(0x67e4a20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6433 futex(0x67e4a20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6432 futex(0x67e4a20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6420 futex(0x67d2a00, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 0, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6419 futex(0x447749a8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1587, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6418 futex(0x447749a8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1587, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6417 futex(0x447749a8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1587, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6416 futex(0x447749a8, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1587, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
6435 futex(0x67e4a20, FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 16, NULL, FUTEX_BITSET_MATCH_ANY <unfinished ...>
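Three distinct futex addresses appear in this excerpt (0x67e4a20, 0x67d2a00, and 0x447749a8). Two of them (0x67e4a20 and 0x447749a8) each have four threads waiting, suggesting at least two different locks are part of the deadlock cycle.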
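Grouping the waiting threads by futex address makes the picture easier to read than the raw trace. This is a small sketch, assuming the trace lines start with the thread ID as in the excerpt above (the format `strace -f -o FILE` writes). The helper name `summarize_futex_waits` is hypothetical.

```shell
#!/usr/bin/env bash
# Read strace output on stdin and print, per futex address:
#   ADDRESS  NUMBER_OF_WAIT_CALLS  THREAD_IDS...
summarize_futex_waits() {
  awk '
    /futex\(0x[0-9a-f]+, FUTEX_WAIT/ {
      # $1 is the thread ID, $2 looks like "futex(0x67e4a20,".
      addr = $2
      sub(/^futex\(/, "", addr)
      sub(/,$/, "", addr)
      count[addr]++
      tids[addr] = tids[addr] " " $1
    }
    END {
      for (addr in count) printf "%s %d%s\n", addr, count[addr], tids[addr]
    }
  ' | sort
}

# Example:
#   summarize_futex_waits < trace.txt
```

On the nine lines quoted above this reports four waiters on 0x447749a8, one on 0x67d2a00, and four on 0x67e4a20. It counts every matching wait call, so it is only meaningful on an excerpt captured after the hang.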
Attempted workarounds (none work)
| Workaround | Result |
|---|---|
| Pipe through cat (`wgc ... \| cat`) | Still hangs |
| Redirect to file (`wgc ... > out.txt 2>&1`) | Still hangs |
| Set `COLUMNS=300` | Still hangs |
| Use `script` for PTY allocation | Still hangs |
| Set `NO_UPDATE_NOTIFIER=1` | Still hangs |
| Reduce to 3 subgraphs | ✅ Works: error table renders and process exits |
Workaround
The only reliable workaround is to wrap the command with timeout:
timeout 30 npx wgc router compose -i compose.yaml

This will kill the process after 30 seconds. The composition errors are lost (since the table never renders), but at least CI pipelines won't hang indefinitely.
Alternatively, temporarily remove one subgraph from the config to get the error output, fix the errors, then re-add it.
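One caveat: GNU `timeout` sends SIGTERM by default, and the hung process is reported above to ignore SIGTERM, so a follow-up SIGKILL may be needed for CI. This is a sketch of a wrapper using `timeout -k`, which also keeps whatever output was produced. The function name `run_with_deadline` and the 5-second grace period are assumptions, not anything wgc provides.

```shell
#!/usr/bin/env bash
# Run a command with a hard deadline, saving stdout and stderr to a log.
# Usage: run_with_deadline SECONDS LOGFILE COMMAND [ARGS...]
run_with_deadline() {
  local seconds="$1" logfile="$2" rc
  shift 2
  # -k 5: if the command is still alive 5 s after the initial SIGTERM,
  # send SIGKILL, which cannot be ignored.
  timeout -k 5 "$seconds" "$@" > "$logfile" 2>&1
  rc=$?
  # GNU timeout exits with 124 when the command timed out, or 137
  # (128 + 9) when SIGKILL had to be used.
  if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
    echo "timed out after ${seconds}s; partial output in $logfile" >&2
  fi
  return "$rc"
}

# Example:
#   run_with_deadline 30 compose.log npx wgc router compose -i compose.yaml
```

A CI step can treat exit codes 124 and 137 as "compose hung" and any other non-zero code as an ordinary composition failure.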
Environment
- OS: Linux (tested on Debian/Ubuntu in Codespaces)
- Node.js: v18+ (bundled with wgc)
- wgc versions: 0.102.4, 0.108.0
- Terminal: various (bash, zsh; xterm-256color, dumb)