Rename "dummy" backend to "fake" and add init_process_group test #818

@d4l3k

Summary

Rename the "dummy" backend to "fake" (matching PyTorch's FakeProcessGroup convention) and add a test that creates a process group using torch.distributed.init_process_group with the TORCH_DISTRIBUTED_USE_TORCHCOMMS=1 env var and the fake backend.

Good intro task — touches the build system, backend registration, and the PyTorch integration path.

Part 1: Rename "dummy" → "fake"

Rename TorchCommDummy → TorchCommFake and update all references:

  • comms/torchcomms/TorchCommDummy.{hpp,cpp} — class, window, static registration name
  • comms/torchcomms/tests/unit/cpp/DummyTorchCommBackend.cpp + CMakeLists.txt + TorchCommFactoryTest.cpp — dynamic loading test library and references
  • comms/torchcomms/tests/unit/py/test_hooks.py, test_copy.py — new_comm("dummy", ...) calls
  • setup.py — entry_points registration
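
Once the rename is done, a quick scan for leftover "dummy" references can catch files the list above misses. This is a minimal sketch, not part of the repo: the helper name find_leftovers, the suffix set, and the case-insensitive pattern are all assumptions chosen for illustration. Hits need manual review, since some uses of "dummy" may be unrelated to the backend.

```python
import re
from pathlib import Path

# Hypothetical pattern: matches "dummy" in any casing, e.g. TorchCommDummy,
# DummyTorchCommBackend, or the "dummy" backend string.
PATTERN = re.compile(r"dummy", re.IGNORECASE)

# Assumed set of file types worth scanning (CMakeLists.txt ends in .txt).
SUFFIXES = {".py", ".cpp", ".hpp", ".txt", ".cmake"}


def find_leftovers(root):
    """Return (path, line number, stripped line) for each line matching PATTERN."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Calling find_leftovers("comms/torchcomms") from the repo root would cover the C++ and Python paths listed above; setup.py lives outside that directory and has to be checked separately.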

Part 2: Add init_process_group test

Add a test that sets TORCH_DISTRIBUTED_USE_TORCHCOMMS=1 and calls torch.distributed.init_process_group with the fake backend. This exercises the upstream PyTorch integration path (distributed_c10d.py) where PyTorch itself creates the torchcomms communicator via new_comm(). Run a basic collective and tear down.

Build & test

USE_NCCLX=OFF USE_NCCL=OFF USE_TRANSPORT=OFF pip install --no-build-isolation -v .
ctest --test-dir build --output-on-failure   # C++
pytest comms/torchcomms/tests/unit/py/        # Python
lintrunner -a                                 # Lint
