Lightning sets shuffle=True for custom samplers when wrapping them for distributed sampling #21447

@mojtababahrami

Bug description

I am currently using a distributed sampler with use_distributed_sampler=False, but I wish I didn't have to. Let me explain:
Before moving to multi-GPU training, I had a simple sampler that worked fine on a single GPU without DDP. After switching to DDP, when I pass that sampler to Lightning, it gets wrapped in DistributedSamplerWrapper, and my sampler's ordering is ignored because shuffle is set to True. It took me a while to track this down. Once I did, I replaced my sampler with a distributed version of it, which worked. But the whole detour was unnecessary and time-consuming. The expected behaviour is that Lightning converts a sampler to a distributed one while respecting the original sampler's order.

From the Trainer API documentation for the use_distributed_sampler argument:
By default, it will add shuffle=True for the train sampler and shuffle=False for validation/test/predict samplers.

Setting shuffle=True for a custom input sampler makes no sense and ignores the whole idea of a custom sampler.

This was already reported in a closed issue (#21131), but it was never resolved.
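
For illustration, here is a minimal sketch of the kind of workaround described above: a sampler that shards a fixed, user-defined index order across ranks without shuffling. The class name OrderedShardSampler and its arguments are hypothetical and not part of Lightning or PyTorch, and this is not the sampler from the original project. It pads the order by repeating indices from the start so that every rank sees the same number of samples, which is an assumption about what a DDP run needs.

```python
import math


class OrderedShardSampler:
    """Hypothetical sampler: splits a fixed index order across ranks, never shuffling."""

    def __init__(self, base_order, num_replicas, rank):
        self.base_order = list(base_order)
        if not self.base_order:
            raise ValueError("base_order must not be empty")
        if not 0 <= rank < num_replicas:
            raise ValueError("rank must be in [0, num_replicas)")
        self.num_replicas = num_replicas
        self.rank = rank
        # Every rank yields the same number of samples.
        self.num_samples = math.ceil(len(self.base_order) / num_replicas)

    def __iter__(self):
        total = self.num_samples * self.num_replicas
        # Pad by repeating the order from its start until it divides evenly.
        repeats = math.ceil(total / len(self.base_order))
        padded = (self.base_order * repeats)[:total]
        # Rank r takes positions r, r + N, r + 2N, ... in the original order.
        return iter(padded[self.rank::self.num_replicas])

    def __len__(self):
        return self.num_samples


# Example: a custom order over five samples, split across two ranks.
custom_order = [5, 3, 9, 1, 7]
rank0 = list(OrderedShardSampler(custom_order, num_replicas=2, rank=0))
rank1 = list(OrderedShardSampler(custom_order, num_replicas=2, rank=1))
```

In an actual run, an object like this would be passed as sampler= to the DataLoader, with rank and world size taken from torch.distributed, and the Trainer created with use_distributed_sampler=False so that Lightning does not wrap it.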

What version are you seeing the problem on?

v2.5

Reproduced in studio

No response

How to reproduce the bug

Error messages and logs

# Error messages and logs here please

Environment

Current environment
#- PyTorch Lightning Version (e.g., 2.5.0):
#- PyTorch Version (e.g., 2.5):
#- Python version (e.g., 3.12):
#- OS (e.g., Linux):
#- CUDA/cuDNN version:
#- GPU models and configuration:
#- How you installed Lightning(`conda`, `pip`, source):

More info

No response

cc @ethanwharris @justusschock

Labels

bug, distributed, ver: 2.5.x
