Summary
We have a complete reference for pg_cluster_ic_peers
(docs/reference/system-views.md) and a working multi-node walkthrough
(docs/user-guide/bootstrap.md), but no task-oriented troubleshooting guide
for when the tier1 (TCP) interconnect doesn't come up. This issue adds one.
Why this is a good first issue
- Pure documentation — no C, no cluster internals.
- Everything you need is already in the tree: the pg_cluster_ic_peers column
reference and the cluster.interconnect_* GUC docs.
- You can reproduce every failure state locally with two nodes on loopback
(see the multi-node section of bootstrap.md).
What to do
Add docs/user-guide/troubleshooting-interconnect.md and link it from
bootstrap.md. For each common failure give the symptom as seen in
pg_cluster_ic_peers, the likely cause, and the fix. At minimum cover:
1. pg_cluster_ic_peers returns zero rows → cluster.interconnect_tier is
stub, not tier1.
2. Peer stuck in state = down, with connect_error_count climbing,
last_error = "connect SO_ERROR: Connection refused" (errno 61), and
last_error_code = 08001 → peer not started, or wrong interconnect_addr /
port unreachable.
3. Peer flapping → non-zero reconnect_count; how to read last_connect_at
vs last_recv_at.
4. Heartbeats not advancing → compare heartbeat_send_count /
heartbeat_recv_count across two queries a few seconds apart.
5. state = rejected → what rejects a peer (membership/handshake) and where to
look next.
Tip: to reproduce (2), start only node 0 from the bootstrap walkthrough and
watch node 0's view of node 1 before node 1 is up.
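As a starting point for the guide, the symptom checks above could be sketched as a small Python helper. This is only an illustration, not part of the tree: the function names `heartbeat_progress` and `symptom` are hypothetical, rows are assumed to arrive as dicts keyed by the column names listed in this issue, and how the rows are fetched from pg_cluster_ic_peers is left out on purpose.

```python
from typing import Mapping, Sequence


def heartbeat_progress(before: Mapping[str, int], after: Mapping[str, int]) -> str:
    """Compare two samples of the same peer row taken a few seconds apart.

    Both arguments are assumed to hold the heartbeat_send_count and
    heartbeat_recv_count columns of one pg_cluster_ic_peers row.
    """
    sent = after["heartbeat_send_count"] - before["heartbeat_send_count"]
    received = after["heartbeat_recv_count"] - before["heartbeat_recv_count"]
    if sent > 0 and received > 0:
        return "advancing"
    if sent > 0:
        return "send-only"   # we send heartbeats but receive none
    if received > 0:
        return "recv-only"   # we receive heartbeats but send none
    return "stalled"         # neither counter moved (or a counter went backwards)


def symptom(rows: Sequence[Mapping[str, object]]) -> list:
    """Map a pg_cluster_ic_peers result set to the symptom labels used in this issue.

    The labels are illustrative; the real guide should explain cause and fix.
    """
    if not rows:
        # Zero rows: the issue text points at cluster.interconnect_tier.
        return ["zero rows"]
    labels = []
    for row in rows:
        if row["state"] == "rejected":
            labels.append("rejected")
        elif row["state"] == "down" and row["connect_error_count"] > 0:
            labels.append("down")
        elif row["reconnect_count"] > 0:
            labels.append("flapping")
        else:
            labels.append("no symptom matched")
    return labels
```

For example, `symptom([])` returns `["zero rows"]`, and two samples whose heartbeat_send_count grows while heartbeat_recv_count stays flat give `"send-only"`. The guide itself would still need to describe each case in prose as symptom → cause → fix.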
Definition of done
- New docs/user-guide/troubleshooting-interconnect.md, each scenario as
symptom → cause → fix.
- Linked from docs/user-guide/bootstrap.md.
- Column/GUC names match the existing docs (no invented fields).
Pointers
- docs/reference/system-views.md → ## pg_cluster_ic_peers
- docs/user-guide/configuration.md → cluster.interconnect_*
- docs/user-guide/bootstrap.md → "Multi-node cluster (tier1 TCP interconnect)"