ActsExamples::RandomNumbers is not random for continuous run seed #5321

@Corentin-Allaire

Hello everybody!

As some of you might know, I have been generating ttbar events using the ODD for a while now to train the Seeding Transformer I have been developing.
Today I realised that the ActsExamples::RandomNumbers we use to generate event-based seeds is not random at all...

It uses hash_combine (https://www.boost.org/doc/libs/1_36_0/doc/html/hash/reference.html#boost.hash_combine) to combine the global run seed with the event number into an event seed that is supposed to be unique. Unfortunately, it is not unique at all if you perform multiple simulation runs with adjacent seeds. So what is the issue with the hash_combine implementation?

seed ^= hash_value(v) + 0x9e3779b9 + (seed << 6) + (seed >> 2);

If v (for us, the event number) is an int, then hash_value is the identity. This means the new seed can be written as follows (for seed>0, with cte = 0x9e3779b9):

seed ^= event_id + cte + seed*64 + int(seed / 4);

This leads to many collisions. For example, the following three (seed, event) pairs all produce the event seed 2655077414: (10110, 64), (10111, 1), (10112, 13)
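The collision is easy to check by transcribing the formula above into a few lines of Python. This is only a minimal sketch: the function name `hash_combine_int` and the 64-bit wrap-around are assumptions made for illustration (the width of the seed type is not stated here), and it is not the attached test_seed.py.

```python
# Sketch of the hash_combine formula quoted above, with hash_value(v) == v
# for integers. The 64-bit mask is an assumption about the seed type.
GOLDEN = 0x9E3779B9
MASK64 = (1 << 64) - 1


def hash_combine_int(seed: int, value: int) -> int:
    """Return seed ^ (value + GOLDEN + (seed << 6) + (seed >> 2))."""
    mixed = (value + GOLDEN + (seed << 6) + (seed >> 2)) & MASK64
    return seed ^ mixed


# The three (run seed, event number) pairs quoted in the text.
pairs = [(10110, 64), (10111, 1), (10112, 13)]
event_seeds = [hash_combine_int(seed, event) for seed, event in pairs]

for (seed, event), event_seed in zip(pairs, event_seeds):
    print(f"seed={seed} event={event} -> {event_seed}")
# All three lines print 2655077414.
```

The reason is visible in the formula: neighbouring run seeds change the sum by only a small amount, and that change can be cancelled exactly by a small shift in the event number.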

Right now, to avoid the issue, I try to make sure my seeds are not consecutive (using seed^2 works nicely). But I think we should either use something other than hash_combine (I have not checked which alternatives exist) or use a type for which hash_value is not the identity.
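To make "something other than hash_combine" a bit more concrete, here is one possible direction, sketched in Python. It is an illustration only, not a proposed patch and not an existing ACTS API: the names `splitmix64` and `event_seed` are hypothetical, and it assumes that both the run seed and the event number fit in 32 bits. The idea is to pack the two numbers into one 64-bit word (which cannot collide) and then scramble that word with the SplitMix64 finalizer (which is a bijection on 64-bit values, so it cannot introduce collisions either).

```python
MASK64 = (1 << 64) - 1


def splitmix64(x: int) -> int:
    """SplitMix64 output function; every step is invertible modulo 2**64."""
    z = (x + 0x9E3779B97F4A7C15) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def event_seed(run_seed: int, event_id: int) -> int:
    """Hypothetical combiner: pack (run_seed, event_id) into 64 bits, then mix."""
    if not (0 <= run_seed < 2**32 and 0 <= event_id < 2**32):
        raise ValueError("this sketch assumes 32-bit run seeds and event numbers")
    return splitmix64((run_seed << 32) | event_id)


# The three pairs that collide with hash_combine now get distinct seeds.
pairs = [(10110, 64), (10111, 1), (10112, 13)]
seeds = [event_seed(run, event) for run, event in pairs]
print(len(set(seeds)) == len(pairs))
```

Under the 32-bit assumption, distinct (seed, event) pairs are guaranteed to give distinct event seeds. Whether the statistical quality is good enough for the simulation would still have to be checked.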

To loop back to what I am doing right now: I simulated 50k ttbar events (500 runs of 100 events) with seeds from 1000 to 1500. Result: 37% seed collisions and only 30k unique events. More problematic, my training and testing events are all mixed :(
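A scan of this kind can be sketched as follows. Several things here are assumptions and not taken from the actual setup: event numbers are taken to run from 0 to 99 within each run, the run seeds are taken as 1000 to 1499, the seed is treated as a 64-bit value, and the helper names are made up. Because the exact counts depend on those choices, no expected figure is quoted.

```python
GOLDEN = 0x9E3779B9
MASK64 = (1 << 64) - 1


def hash_combine_int(seed: int, value: int) -> int:
    """hash_combine formula with hash_value(v) == v for integers."""
    return seed ^ ((value + GOLDEN + (seed << 6) + (seed >> 2)) & MASK64)


def count_event_seeds(run_seeds, events_per_run):
    """Return (total, unique) numbers of event seeds over all runs."""
    all_seeds = [
        hash_combine_int(run_seed, event)
        for run_seed in run_seeds
        for event in range(events_per_run)  # assumed: events numbered from 0
    ]
    return len(all_seeds), len(set(all_seeds))


total, unique = count_event_seeds(range(1000, 1500), 100)
print(f"{total} events, {unique} unique event seeds, "
      f"{100.0 * (total - unique) / total:.1f}% duplicates")
```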

If people want to reproduce the issue, I am attaching a small Python script that shows this exact effect.

test_seed.py
