Fork/spawn hang after creating zeromq Socket on x64 Linux under Rosetta 2 (OrbStack) #732

@bytemain

Description

I've encountered a consistent deadlock when using child_process.spawn() in Node.js after initializing any ZeroMQ socket. The issue is 100% reproducible on OrbStack running x64 Linux on Apple Silicon (M1/M2/M3) via Rosetta 2 emulation.

This appears to be a classic fork-safety issue where ZeroMQ's background I/O threads hold mutex locks during fork(), causing the child process to inherit locked mutexes with no owner thread—resulting in deadlock.

Environment

  • Host OS: macOS (Apple Silicon / ARM64)
  • OrbStack Version: 2.0.5 (19905)
  • Guest Linux: Debian/Ubuntu x64 (Running via Rosetta 2 emulation)
  • Node.js Version: 22
  • zeromq.js Version: latest

Minimal Reproduction

const { spawn } = require("child_process");
const { Router } = require("zeromq");

// 1. Initialize ZeroMQ (starts background I/O threads)
const server = new Router();
console.log("ZeroMQ Router created");

// 2. Attempt to spawn a child process
// This hangs indefinitely on OrbStack (x64 on ARM64)
const child = spawn("echo", ["hello"], {
    stdio: ["ignore", "pipe", "pipe"],
});

// This line is never reached
console.log("child.pid:", child.pid);

child.stdout.on("data", (data) => console.log(`stdout: ${data}`));
child.on("close", (code) => console.log(`exited with code ${code}`));
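Because the hang blocks the process that calls spawn(), it can help to run the reproduction from a separate harness that enforces a timeout. The following is a minimal sketch, not part of the original report: `runWithTimeout` is a hypothetical helper name, the 5 second default is arbitrary, and it assumes `zeromq` is resolvable from the working directory.

```javascript
const { spawnSync } = require("child_process");

// Runs `node -e <source>` in a separate process and reports whether it
// had to be killed because it exceeded `timeoutMs`.
// `runWithTimeout` is a hypothetical helper, not part of zeromq.js.
function runWithTimeout(source, timeoutMs = 5000) {
  const result = spawnSync(process.execPath, ["-e", source], {
    timeout: timeoutMs,
    killSignal: "SIGKILL",
    encoding: "utf8",
  });
  const timedOut = Boolean(result.error && result.error.code === "ETIMEDOUT");
  return { timedOut, status: result.status, stdout: result.stdout };
}

// The reproduction, reduced to the two calls that matter.
const repro = `
  const { spawn } = require("child_process");
  const { Router } = require("zeromq");
  new Router();
  const child = spawn("echo", ["hello"], { stdio: "ignore" });
  console.log("child.pid:", child.pid);
  process.exit(0);
`;

// Example:
//   const { timedOut } = runWithTimeout(repro);
//   console.log(timedOut ? "spawn() hung" : "spawn() returned");
```

A harness like this makes it possible to script the check across environments (for example native ARM64 versus x64 under Rosetta 2) without a terminal getting stuck.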

What I've Tried (All Failed)

1. Setting ioThreads: 0, blocky: false, linger: 0, threadPriority, and threadSchedulingPolicy

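const { Context, Router } = require("zeromq");

const context = new Context({
  ioThreads: 0,
  blocky: false,
  threadPriority: 0,
  threadSchedulingPolicy: 0,
});
// linger is a socket option, so it is passed to the socket, not the context
const server = new Router({ context, linger: 0 });
// Still deadlocks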

2. Using worker_threads to isolate ZeroMQ

zmq_worker.js:

const { parentPort } = require("worker_threads");
const { Router } = require("zeromq");

const server = new Router();
console.log("Worker: ZeroMQ Router created");
parentPort.postMessage("ready");

main.js:

const { spawn } = require("child_process");
const { Worker } = require("worker_threads");

const worker = new Worker("./zmq_worker.js");

worker.on("message", (msg) => {
    if (msg === "ready") {
        // Still deadlocks!
        const child = spawn("echo", ["hello"]);
    }
});

This also deadlocks because worker_threads share the same process address space, and fork() copies the entire process memory including ZeroMQ's internal state from the worker thread.

Technical Analysis

The root cause is the well-known fork-after-pthread problem:

  1. ZeroMQ creates background threads (even with ioThreads: 0, libzmq appears to still start internal threads, such as its reaper thread)
  2. These threads may hold mutexes (malloc locks, internal ZMQ locks, etc.)
  3. When fork() is called, only the calling thread is copied to the child process
  4. The child process inherits locked mutexes, but the threads that held them don't exist
  5. Any operation requiring those locks (like malloc) will deadlock
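Step 1 can be checked directly on Linux by listing the threads of the Node.js process before and after the socket is created. This is a rough diagnostic sketch under stated assumptions: `listThreads` is a hypothetical helper, it relies on `/proc/self/task` being available, and it has not been verified under Rosetta 2.

```javascript
const fs = require("fs");

// Returns [{ tid, name }] for every thread in the current process.
// Linux-only: relies on /proc/self/task; returns [] where it is missing.
function listThreads() {
  const taskDir = "/proc/self/task";
  if (!fs.existsSync(taskDir)) return [];
  return fs.readdirSync(taskDir).map((tid) => {
    let name = "";
    try {
      name = fs.readFileSync(`${taskDir}/${tid}/comm`, "utf8").trim();
    } catch {
      // The thread may have exited between readdir and read.
    }
    return { tid: Number(tid), name };
  });
}

// Usage (requires zeromq to be installed):
//   const before = new Set(listThreads().map((t) => t.tid));
//   const { Context, Router } = require("zeromq");
//   const server = new Router({ context: new Context({ ioThreads: 0 }) });
//   console.log(listThreads().filter((t) => !before.has(t.tid)));
```

Any entries printed by the usage snippet are threads that appeared after the socket was created, which is the precondition for the fork-after-pthread problem described above.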

Rosetta 2's x64→ARM64 translation layer appears to significantly widen the race condition window, making this issue 100% reproducible instead of sporadic.

strace Evidence

6886  brk(0x2 <unfinished ...>
6886  <... brk resumed>)                = 0x8000001a1c30
6884  syscall_0x6aad140(0, 0, 0x6, 0, 0x1, 0x62) = 0x3fbfe
6884  syscall_0x6aad140(0, 0, 0x6, 0, 0x1, 0x62 <unfinished ...>
// Child process hangs here indefinitely (likely futex wait)
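Since the syscall names in this trace are not decoded, a complementary check is to read the kernel's view of the hung process from `/proc`. The sketch below is an assumption-laden diagnostic, not something from the original investigation: `readProcStatus` is a hypothetical helper, and how accurately `/proc` reflects a Rosetta-translated process is not confirmed here.

```javascript
const fs = require("fs");

// Parses /proc/<pid>/status into an object such as { Name, State, Threads, ... }.
// Linux-only; returns null if the file cannot be read.
function readProcStatus(pid) {
  let text;
  try {
    text = fs.readFileSync(`/proc/${pid}/status`, "utf8");
  } catch {
    return null;
  }
  const fields = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx)] = line.slice(idx + 1).trim();
  }
  return fields;
}

// Usage, with the PID of the hung child taken from strace or ps:
//   const status = readProcStatus(Number(process.argv[2]));
//   console.log(status && { State: status.State, Threads: status.Threads });
```

A `State` of `S (sleeping)` combined with `Threads: 1` would be consistent with a forked child blocked on a lock whose owner thread was not copied, although it does not prove which lock is involved.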

Potential Solutions (Discussion)

Option A: pthread_atfork() handlers in zeromq.js

Register fork handlers to lock/unlock known mutexes around fork:

pthread_atfork(
    []() { /* prepare: acquire locks */ },
    []() { /* parent: release locks */ },
    []() { /* child: release/reinit locks */ }
);

Problem: The critical locks are in libzmq and glibc, not in zeromq.js itself.

Option B: Upstream fix in libzmq

Ask the libzmq maintainers to implement pthread_atfork handlers for its internal mutexes.

Thank you for your time!
