Add Rank-weighted Average Treatment Effect (RATE) metric #887
aman-coder03 wants to merge 2 commits into uber:master
hey @jeongyoonlee can you please have a look at this PR?
jeongyoonlee left a comment
Thanks for adding the RATE metric! The implementation follows the existing `get_qini`/`qini_score`/`plot_qini` API pattern well. A few items to address:
Blocking
- `normalize` division by zero: At `q=1`, `TOC = 0` by definition (the subset ATE equals the overall ATE when the subset is the entire population), so `toc.div(np.abs(toc.iloc[-1, :]), axis=1)` will divide by zero and produce `inf`/`NaN`. Needs a guard or a different normalization reference point (e.g., max absolute value).
- Unused `random_seed` parameter: All three functions accept `random_seed=42` but never use it. The docstring says "deprecated", but this is brand-new code with no backward-compatibility obligation. Please remove it, or if kept for API consistency with `get_qini`, document why.
- Missing test for `normalize=True`: Given the division-by-zero issue above, this path needs coverage.
- Hardcoded seeds: Per project conventions, please use `RANDOM_SEED` from `tests/const.py` instead of hardcoded `42`/`0`. Same for `CONTROL_NAME` and `TREATMENT_NAMES` if applicable.
- Test that TOC ends at zero: At `q=1`, TOC should be 0 by definition. There's a test for TOC starting at zero but not ending at zero.
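One possible guard for the normalization issue, sketched under assumptions: `normalize_toc` is a hypothetical helper (not the PR's code), the TOC curves are assumed to be stored one per column in a pandas DataFrame, and the reference point is the maximum absolute value suggested above.

```python
import numpy as np
import pandas as pd


def normalize_toc(toc: pd.DataFrame) -> pd.DataFrame:
    """Scale each TOC column by its maximum absolute value.

    Hypothetical helper for illustration. Columns that are identically
    zero are left unchanged instead of being divided by zero.
    """
    scale = toc.abs().max(axis=0)
    # A zero scale means the curve is flat; dividing by 1 keeps it as-is.
    scale = scale.replace(0, 1.0)
    return toc.div(scale, axis=1)


# Toy curves that end at exactly 0, as TOC must at q = 1.
toc = pd.DataFrame(
    {
        "model_a": [0.0, 0.6, 0.3, 0.0],
        "flat": [0.0, 0.0, 0.0, 0.0],
    }
)
normalized = normalize_toc(toc)
```

Dividing by the last row would fail on both toy columns, while the max-absolute reference stays finite and keeps the end point at zero. A test for `normalize=True` could assert exactly those two properties.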
Non-blocking suggestions
- O(n²) complexity in `get_toc`: The loop over every data point computes `sorted_df.iloc[:top_k].mean()` for each `k`. For large datasets this will be slow. Consider using cumulative sums (like `get_qini` does) for O(n) performance:

      cumsum_tau = np.cumsum(sorted_tau)
      subset_ate = cumsum_tau / np.arange(1, n_total + 1)

- Integration formula: The weight normalization (`weights / weights.sum()`) computes a weighted mean rather than a true integral. This preserves model rankings (which is the primary use case, similar to Qini/AUUC), but the absolute values won't exactly match the paper's definition. Worth a brief note in the docstring.
- Module-level `plt.style.use("fivethirtyeight")`: This is a side effect at import time that affects global matplotlib state. It is consistent with `visualize.py` but worth noting.
- `pytest.raises(Exception)` in `test_get_toc_errors_on_nan`: Use a more specific exception type (the code raises `AssertionError`, so use `pytest.raises(AssertionError)`).
- Observed-outcome fallback: When `t_mask.sum() == 0` or `c_mask.sum() == 0` at a quantile, the code silently falls back to `overall_ate`, making `TOC(q) = 0`. This is reasonable but worth documenting.
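The cumulative-sum suggestion can be sketched as follows for oracle mode, where the true per-unit effect is known. This is a minimal illustration, not the PR's implementation: `toc_oracle` and `toc_oracle_naive` are hypothetical names, and the inputs are assumed to be 1-D NumPy arrays. After the sort, which is itself O(n log n), the remaining work is linear.

```python
import numpy as np


def toc_oracle(tau: np.ndarray, score: np.ndarray) -> np.ndarray:
    """TOC from known effects `tau`, ranked by `score` in descending order."""
    order = np.argsort(-score, kind="stable")
    sorted_tau = tau[order]
    n_total = sorted_tau.shape[0]
    # Running mean of the top-k effects for every k, in one pass.
    cumsum_tau = np.cumsum(sorted_tau)
    subset_ate = cumsum_tau / np.arange(1, n_total + 1)
    return subset_ate - sorted_tau.mean()


def toc_oracle_naive(tau: np.ndarray, score: np.ndarray) -> np.ndarray:
    """Same curve, recomputing the mean of each prefix (quadratic time)."""
    order = np.argsort(-score, kind="stable")
    sorted_tau = tau[order]
    overall_ate = sorted_tau.mean()
    return np.array(
        [sorted_tau[:k].mean() - overall_ate for k in range(1, len(sorted_tau) + 1)]
    )


# Synthetic data: a noisy score that is correlated with the true effect.
rng = np.random.default_rng(0)
tau = rng.normal(size=200)
score = tau + rng.normal(scale=0.5, size=200)

fast = toc_oracle(tau, score)
slow = toc_oracle_naive(tau, score)
```

Both versions should agree up to floating-point error, and the last entry should be numerically zero because the top-n subset is the whole population.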
Hi @jeongyoonlee, thanks for the thorough review! I've already addressed all of these in the latest commit...
Proposed changes
This PR implements the RATE metric proposed by Yadlowsky et al. (2021), as requested in #540.
RATE evaluates how well a treatment prioritization rule (e.g. a CATE estimator) identifies units with above-average treatment benefit. It does this by computing the weighted area under the Targeting Operator Characteristic (TOC) curve, which compares the ATE among the top-q fraction of prioritized units to the overall ATE.
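To make the idea concrete, here is a minimal oracle-mode sketch. The names `toc_curve` and `rate_from_toc` are hypothetical and are not the functions added by this PR. The 1/q and q weightings follow this PR's description, and the score is taken as a normalized weighted mean of the TOC values, which is an assumption that may differ in absolute value from the paper's integral.

```python
from typing import Callable

import numpy as np


def toc_curve(tau: np.ndarray, score: np.ndarray) -> np.ndarray:
    """ATE of the top-q units (ranked by `score`) minus the overall ATE."""
    order = np.argsort(-score, kind="stable")
    sorted_tau = tau[order]
    subset_ate = np.cumsum(sorted_tau) / np.arange(1, sorted_tau.shape[0] + 1)
    return subset_ate - tau.mean()


def rate_from_toc(
    toc: np.ndarray, weight_fn: Callable[[np.ndarray], np.ndarray]
) -> float:
    """Weighted mean of TOC on the grid q = 1/n, 2/n, ..., 1."""
    n = toc.shape[0]
    q = np.arange(1, n + 1) / n
    weights = weight_fn(q)
    return float(np.sum(weights * toc) / np.sum(weights))


# Known effects spread evenly between -1 and 1 (overall ATE is about 0).
tau = np.linspace(-1.0, 1.0, 101)

# A perfect prioritization rule ranks by tau; the worst one ranks by -tau.
good_toc = toc_curve(tau, score=tau)
bad_toc = toc_curve(tau, score=-tau)

autoc_good = rate_from_toc(good_toc, lambda q: 1.0 / q)
autoc_bad = rate_from_toc(bad_toc, lambda q: 1.0 / q)
qini_good = rate_from_toc(good_toc, lambda q: q)
qini_bad = rate_from_toc(bad_toc, lambda q: q)
```

Under either weighting, the rule that puts high-benefit units first gets a positive score and the reversed rule gets a negative one, which is the ranking behavior the metric is meant to capture.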
Three functions are added to `causalml/metrics/rate.py`, following the same API conventions as the existing `qini_score`/`get_qini`/`plot_qini`:
- `get_toc()` computes the TOC curve
- `rate_score()` computes the RATE scalar with either AUTOC (1/q) or Qini (q) weighting
- `plot_toc()` visualizes the TOC curve

Both oracle mode (simulated `tau`) and observed RCT mode (`y` + `w`) are supported. 16 tests are included in `tests/test_rate.py`.

Closes #540