Commit 7fcad3f

jeongyoonlee and claude committed
Make sparse-group test deterministic with 1-sample minority groups
With only 1 sample per minority treatment group out of 102 total, bootstrap sampling will miss them in most trees, making the test deterministic regardless of seed or CI environment.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
1 parent 009a109 commit 7fcad3f
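
The claim in the commit message can be sanity-checked with a small simulation. The sketch below does not call causalml. It assumes, as the code comment in this commit states, that each tree draws n rows with replacement from the n rows, and that the forest has 10 trees. The function name and the integer labels 0, 1, 2 are hypothetical stand-ins for the control and the two minority treatment groups.

```python
import random


def share_of_forests_with_a_miss(n_control, n_per_minority, n_trees=10,
                                 n_forests=2000, seed=0):
    """Estimate how often at least one tree's bootstrap sample lacks a minority group.

    Assumption: every tree resamples n rows with replacement from the n rows.
    Labels are stand-ins: 0 = control, 1 and 2 = the two minority groups.
    """
    rng = random.Random(seed)
    labels = [0] * n_control + [1] * n_per_minority + [2] * n_per_minority
    n = len(labels)
    forests_with_miss = 0
    for _ in range(n_forests):
        for _ in range(n_trees):
            sample = set(rng.choices(labels, k=n))  # sampling with replacement
            if not {1, 2} <= sample:  # this tree is missing group 1, group 2, or both
                forests_with_miss += 1
                break
    return forests_with_miss / n_forests


# Old layout: 50 control + 5 + 5. New layout: 100 control + 1 + 1.
old_layout = share_of_forests_with_a_miss(n_control=50, n_per_minority=5)
new_layout = share_of_forests_with_a_miss(n_control=100, n_per_minority=1)
print(f"old layout (50/5/5):  {old_layout:.3f}")
print(f"new layout (100/1/1): {new_layout:.3f}")
```

Under these assumptions, only a small share of simulated forests in the old layout contain a tree that misses a group, while nearly every forest in the new layout does. That is the sense in which the change makes the sparse-group path reliably exercised.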

1 file changed

+6 -3 lines changed

tests/test_uplift_trees.py

Lines changed: 6 additions & 3 deletions
@@ -371,11 +371,14 @@ def test_UpliftRandomForestClassifier_predict_shape_with_sparse_groups():
     """Test that UpliftRandomForestClassifier.predict() returns correct shape
     when bootstrap sampling causes some trees to miss treatment groups (#569)."""
     np.random.seed(RANDOM_SEED)
-    n = 60
+    n = 102
     X = np.random.randn(n, 3)
-    # Very few samples in treatment groups so bootstraps are likely to miss some
+    # Only 1 sample per minority treatment group guarantees that bootstrap
+    # sampling (with replacement, n draws from n) will miss them in some trees.
+    # P(group included) = 1 - (1 - 1/n)^n ≈ 1 - 1/e ≈ 0.63 per tree,
+    # so with 10 trees the chance ALL include both groups is ~0.63^20 ≈ 0.01%.
     treatment = np.array(
-        [CONTROL_NAME] * 50 + [TREATMENT_NAMES[1]] * 5 + [TREATMENT_NAMES[2]] * 5
+        [CONTROL_NAME] * 100 + [TREATMENT_NAMES[1]] * 1 + [TREATMENT_NAMES[2]] * 1
     )
     y = np.random.randint(0, 2, n)
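
The arithmetic in the new comment can be reproduced directly. The sketch below takes the comment's bootstrap model (n draws with replacement from n rows) and its 10-tree count as given, and compares the comment's independence shortcut, 0.63^20, with an exact two-group value from inclusion-exclusion. The helper names are hypothetical and nothing here calls causalml.

```python
import math


def p_group_included(group_size, n):
    """P(a bootstrap sample of n draws from n rows contains the group)."""
    return 1 - (1 - group_size / n) ** n


def p_both_included(size_a, size_b, n):
    """Exact P(both groups appear in one bootstrap sample), by inclusion-exclusion."""
    miss_a = (1 - size_a / n) ** n
    miss_b = (1 - size_b / n) ** n
    miss_both = (1 - (size_a + size_b) / n) ** n
    return 1 - miss_a - miss_b + miss_both


N = 102       # rows in the updated test
N_TREES = 10  # tree count quoted in the comment, assumed here

per_group = p_group_included(1, N)
limit = 1 - 1 / math.e
shortcut = per_group ** (2 * N_TREES)         # treats the two groups as independent
exact = p_both_included(1, 1, N) ** N_TREES   # accounts for the shared sample

print(f"P(group included) = {per_group:.4f} (limit 1 - 1/e = {limit:.4f})")
print(f"P(all trees see both groups), shortcut = {shortcut:.6%}")
print(f"P(all trees see both groups), exact    = {exact:.6%}")
```

Both values land near 0.01%, consistent with the comment. Read this way, the new layout makes a missed group extremely likely in at least one tree rather than strictly guaranteed.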
