QUESTION:
Which patients with condition D1 would you recommend be given treatment W to reduce their risk of outcome Y1?
ANSWER:
Running 'tx_effect.py' in Python 3.13 (with the dependencies in requirements.txt) generates the file 'docs/results/tx_recommendations.csv' with a treatment recommendation for every patient with diagnosis D1. Patients are further grouped by similar treatment-effect scores for easier outreach/follow-up assignment and prioritization.
For a light review of the methodology used, please see the example whitepaper: "/docs/results/PrecisionMedicine_miniWhitepaper.pdf"
Project Structure
```
precision-medicine/
├── data/
│   └── data.csv                     ← source patient data flat file
├── docs/
│   ├── assets/                      ← figures, images, tables
│   └── results/                     ← deliverables
│       ├── tx_recommendations.csv   ← treatment recommendations flat file artifact (deliverable)
│       └── uplift_distribution.png  ← plot of the treatment-effect (CATE/uplift) distribution
├── utils/
│   ├── meta_author.json             ← corresponding-author metadata
│   ├── git_init.py                  ← Python script to initialize the git repo
│   └── logging_config.py            ← custom logging config to track runtime kickoffs
├── utils/
│   ├── 1_descriptive_stats.py       ← descriptive statistics of the source dataframe and its characteristics
│   └── 2_treatment_effect.py        ← derive ATE/CATE and cluster results for patient-engagement prioritization
├── README.md                        ← this file / describes the data-analytic procedure
├── SECURITY.md                      ← example infosec policy to report CVEs or other exposures
└── requirements.txt                 ← Python 3.13 library dependencies
```
Here is my procedure:
- Download the patient data file
- Initialize this git repository (the repository itself is the response)
- Generate descriptive stats to learn attribute dtypes (viewed manually in the console):
```python
import pandas as pd

def describe_csv(csv_file_path: str, max_cats: int = 10) -> dict:
    """
    Purpose: manually review dummy-variable data types
      - specifically 1/0 integers/floats versus True/False booleans/strings
    Method: ingest a CSV and compute descriptive statistics and frequency counts.
    Args:
        csv_file_path (str): Path to the CSV file.
        max_cats (int): Max number of value counts to display per column.
    Returns: dictionary summary
    """
    df = pd.read_csv(csv_file_path)
    # Basic descriptive stats for numeric columns
    numeric_summary = df.describe(include=[float, int]).to_dict()
    # Column data types
    dtypes_summary = df.dtypes.apply(lambda x: str(x)).to_dict()
    # Frequency counts for each column
    freq_summary = {}
    for col in df.columns:
        freq_summary[col] = (
            df[col]
            .value_counts(dropna=False)
            .head(max_cats)
            .to_dict()
        )
    summary = {
        "shape": df.shape,
        "columns": df.columns.tolist(),
        "dtypes": dtypes_summary,
        "numeric_summary": numeric_summary,
        "frequency_summary": freq_summary,
    }
    return summary

# call it and scroll through the console output, manually perusing dtypes
if __name__ == "__main__":
    result = describe_csv("data/data.csv")
    print("Dataset shape:", result["shape"])
    print("\nColumn data types:")
    for col, dtype in result["dtypes"].items():
        print(f"  {col}: {dtype}")
    print("\n--- Numeric Summary ---")
    print(pd.DataFrame(result["numeric_summary"]))
    print("\n--- Frequency Counts ---")
    for col, freqs in result["frequency_summary"].items():
        print(f"\n{col}:")
        for val, count in freqs.items():
            print(f"  {val}: {count}")
```

| Category | Details |
|---|---|
| Target Cohort | Patients with disease of interest where D1 == 1 |
| Treatment (Tx) | W (1 / TRUE = received treatment) |
| Outcome to Reduce | Y1 (1 = bad outcome) |
| Goal | Part A: For each patient in the D1 cohort, estimate the individual treatment effect (uplift) on outcome Y1. Part B: Recommend treatment W for those with predicted risk reduction (> 0). |
| Assumptions | 1) Consistency: each observation’s outcome corresponds to the treatment actually received. 2) Ignorability: given X1–X80 (+ DDx, site), treatment assignment is as-if random (no unmeasured confounders). 3) Positivity: every profile had a non-zero chance of being both treated and untreated. |
| Deliverables | 1) Baseline for the D1 cohort (treated/untreated rates, outcome rates) 2) Average Treatment Effect (ATE/CATE) estimates (propensity weighting + doubly-robust checks) 3) Uplift (individual Tx effect) score per patient 4) Recommended-treatment list of patient IDs with positive estimated benefit |
Compute basic rates for D1 == 1 to get a naive risk comparison from the data.
```python
import pandas as pd
import numpy as np
from prettyprinter import pprint
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

# adjust path/file
precision_df = pd.read_csv("data/data.csv")

# normalize booleans to 0/1 if needed
# if we do this, we can just use sum() and mean() for descriptive stats when the boolean flag was text 'TRUE/FALSE'
bool_map = {"TRUE": 1, "True": 1, True: 1, "FALSE": 0, "False": 0, False: 0}
if precision_df['W'].dtype == object:
    precision_df['W'] = precision_df['W'].map(bool_map).astype(int)
if precision_df['Y1'].dtype == object:
    precision_df['Y1'] = precision_df['Y1'].map(bool_map).astype(int)

# cohort: D1 == 1
cohort = precision_df[precision_df['D1'] == 1].copy()

baseline_summary = {
    "n_total_D1": len(cohort),
    "treated_n": int(cohort['W'].sum()),
    "untreated_n": int((1 - cohort['W']).sum()),
    "treated_rate": float(cohort['W'].mean()),
    "Y1_rate_overall": float(cohort['Y1'].mean()),
    "Y1_rate_treated": float(cohort.loc[cohort['W'] == 1, 'Y1'].mean()),
    "Y1_rate_untreated": float(cohort.loc[cohort['W'] == 0, 'Y1'].mean()),
}
baseline_summary["naive_risk_diff_treated_minus_untreated"] = (
    baseline_summary["Y1_rate_treated"] - baseline_summary["Y1_rate_untreated"]
)

pd.DataFrame([baseline_summary])
pprint(baseline_summary)
```

- We can now calculate the propensity score and a doubly-robust estimator so that the result is less biased by baseline differences.
The propensity model estimates the probability that a patient receives treatment (W == 1) given their features (X):

$p(X) = P(W = 1 \mid X)$

Using logistic regression, this is:

$p(X) = \frac{1}{1 + e^{-(\beta_0 + \beta^\top X)}}$

Where:
- $X = (X_1, X_2, \ldots, X_{80}, \text{site\_id})$
- $\beta$ is the coefficient vector and $\beta_0$ is the intercept

- The propensity model predicts the likelihood that each patient would receive the treatment (W) given all baseline features (X1–X80 and site).
- The outcome model predicts the likelihood of a bad outcome for Y1 given treatment assignment and features.
- Combining both, the doubly-robust estimator corrects some bias even if one of the models is a bit wonky.
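For intuition, here is a minimal sketch of the logistic propensity formula with made-up coefficients (the `beta_0`, `beta`, and `x` values below are illustrative, not fitted values from the project):

```python
import numpy as np

# Hypothetical coefficients for illustration only (not fitted values)
beta_0 = -1.2                       # intercept
beta = np.array([0.8, -0.5, 0.3])   # weights for three example features
x = np.array([1.0, 0.0, 2.0])       # one patient's feature vector

# logistic (sigmoid) propensity: p(X) = P(W = 1 | X)
p_hat = 1.0 / (1.0 + np.exp(-(beta_0 + beta @ x)))
print(round(p_hat, 4))  # -> 0.5498
```

The fitted `LogisticRegression` in the pipeline below does exactly this computation, just with 81 learned coefficients instead of three hand-picked ones.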
- Here is an explanation of Average Treatment Effect:

| Term | Meaning | Formula |
|---|---|---|
| ATE | Average Treatment Effect: overall average effect of treatment in a population | $E[Y(1) - Y(0)]$ |
| CATE | Conditional Average Treatment Effect: average effect conditional on some covariates (e.g., age, comorbidities, site, severity) | $E[Y(1) - Y(0) \mid X = x]$ |
- ATE tells you “on average, what’s the treatment’s effect overall.”
- CATE tells you “on average, what’s the treatment’s effect for patients like this one (given X ).”
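A toy numeric example of the distinction, using an illustrative six-row frame (not the project dataset) with one made-up covariate `age_group`:

```python
import pandas as pd

# Illustrative data only: outcome Y1 (1 = bad) under treatment W
toy = pd.DataFrame({
    "W":  [1, 1, 0, 0, 1, 0],
    "Y1": [0, 1, 1, 1, 0, 1],
    "age_group": ["old", "old", "old", "young", "young", "young"],
})

# Naive ATE: difference in mean bad-outcome rate, treated minus untreated
ate = toy.loc[toy.W == 1, "Y1"].mean() - toy.loc[toy.W == 0, "Y1"].mean()

# Naive CATE for one subgroup: the same contrast, conditional on age_group == "old"
old = toy[toy.age_group == "old"]
cate_old = old.loc[old.W == 1, "Y1"].mean() - old.loc[old.W == 0, "Y1"].mean()

print(round(ate, 3))       # -> -0.667 (overall)
print(round(cate_old, 3))  # -> -0.5   (for "old" patients specifically)
```

These are raw contrasts with no confounding adjustment; the pipeline below replaces them with model-based estimates.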
```python
# filter cohort for treated pats
precision_df = pd.read_csv("data/data.csv")
cohort = precision_df[precision_df['D1'] == 1].copy()

# 1) prep feature matrix
feature_cols = [f"X{i}" for i in range(1, 81)] + ["site_id"]
X = cohort[feature_cols]
W = cohort["W"].astype(int)
Y = cohort["Y1"].astype(int)

# 2a) propensity model: P(W=1|X)
ps_model = Pipeline([
    ("scaler", StandardScaler()),
    ("logit", LogisticRegression(max_iter=500))
])
ps_model.fit(X, W)
cohort["p_hat"] = ps_model.predict_proba(X)[:, 1]

# 2b) clip extreme probabilities to avoid huge weights
cohort["p_hat"] = cohort["p_hat"].clip(0.01, 0.99)

# 3a) outcome model: E[Y|W,X] - fit separately for treated and untreated (t: with Tx W; c: without)
outcome_model_t = Pipeline([
    ("scaler", StandardScaler()),
    ("logit", LogisticRegression(max_iter=500))
])
outcome_model_c = Pipeline([
    ("scaler", StandardScaler()),
    ("logit", LogisticRegression(max_iter=500))
])
outcome_model_t.fit(X[W == 1], Y[W == 1])
outcome_model_c.fit(X[W == 0], Y[W == 0])

# 3b) predicted outcomes under each condition
mu1 = outcome_model_t.predict_proba(X)[:, 1]  # if treated
mu0 = outcome_model_c.predict_proba(X)[:, 1]  # if untreated

# 4a) create the doubly robust ATE composite estimate (aka ate_dr)
cohort["dr_term"] = (
    (W * (Y - mu1) / cohort["p_hat"])
    - ((1 - W) * (Y - mu0) / (1 - cohort["p_hat"]))
    + (mu1 - mu0)
)
ate_dr = cohort["dr_term"].mean()  # just take the mean

print(f"Estimated ATE (Y1 bad outcome difference, treated - untreated): {ate_dr:.4f}")
print(f"Estimated risk reduction by treatment: {-ate_dr:.4f}")
```

Tx Risk MGMT
| ATE Sign | Interpretation |
|---|---|
| Negative | Treatment reduces bad outcomes → good news |
| Positive | Treatment worsens outcomes → bad news |
- We can convert it to a “% reduction in bad-outcome risk.”
- `ate_dr` is the doubly-robust average treatment effect on Y1.
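A quick sketch of that conversion, using placeholder numbers rather than actual outputs of the analysis above:

```python
# Placeholder values for illustration, not results from the pipeline
y1_rate_untreated = 0.30   # baseline bad-outcome rate without treatment
ate_dr = -0.06             # doubly-robust ATE (negative = treatment helps)

# absolute risk reduction: 6 percentage points
abs_risk_reduction = -ate_dr

# relative ("% reduction in bad-outcome risk"): 0.06 / 0.30 = 20%
rel_risk_reduction = abs_risk_reduction / y1_rate_untreated
print(f"{rel_risk_reduction:.0%}")  # -> 20%
```

The relative figure is usually easier to communicate to clinicians than the raw risk difference.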
- The goal was to generate individual-level treatment recommendations: the CATE/uplift for each patient with D1.
- So far, we have already fit these models:
  - Propensity model: $\hat p(X) = P(W=1 \mid X)$
  - Outcome models: $\mu_1(X) = E[Y \mid W=1, X]$ for treated and $\mu_0(X) = E[Y \mid W=0, X]$ for untreated patients
- I want to keep it simple, stable, and interpretable.

Now CATE is:

$\widehat{CATE}(X) = \hat\mu_1(X) - \hat\mu_0(X)$

Since Y1 == 1 means “bad outcome,” we want:
- Negative CATE → treatment reduces risk → recommend treatment
- Positive CATE → treatment increases/worsens risk → do NOT treat
- Near-zero CATE → no strong effect → probably do NOT treat
```python
# 4b) calculate individual-level CATE (uplift)
cohort["cate"] = mu1 - mu0  # predicted bad-outcome difference

# risk reduction is the negative of that (because Y1=1 means bad)
cohort["risk_reduction"] = -cohort["cate"]

# generate the Tx recommendation: treat if predicted risk reduction > 0
cohort["recommend_treatment"] = cohort["risk_reduction"] > 0

# recommendation table
recommend_df = cohort[
    ["patient_id", "cate", "risk_reduction", "recommend_treatment"]
].sort_values("risk_reduction", ascending=False)

# export to csv
recommend_df.to_csv("docs/results/tx_recommendations.csv", index=False)

print()
print()
pprint(recommend_df)  # .head(10)
```

| Column | Meaning |
|---|---|
| cate | Predicted change in risk if treated vs untreated |
| risk_reduction | Negative of CATE (lower Y1 = better) |
| recommend_treatment | Boolean: True = predicted benefit |
- `risk_reduction = 0.12` → treatment lowers bad-outcome risk by 12 percentage points.
- `risk_reduction = -0.05` → treatment increases risk by 5 points (do not treat).
- Sorting by risk reduction gives the highest-priority patients to treat.
- We can organize them into treatment-priority clusters in step 5.
- Direct clinicians to contact and follow up with prioritized treatment-recommended individuals or groups first when working on a clinical schedule and timeline.
| Scenario | Interpretation | Recommendation |
|---|---|---|
| CATE < 0 | Treatment lowers risk of Y1 (good) | Treat |
| CATE > 0 | Treatment increases risk of Y1 (bad) | Do not treat |
| CATE ≈ 0 | No meaningful difference | Neutral / optional |
| Risk_Reduction = -CATE > 0 | Predicts fewer bad outcomes with treatment | Treat |
| Risk_Reduction = -CATE < 0 | Predicts more bad outcomes with treatment | Avoid |
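The decision table above can be sketched as a small helper function; `tx_decision` and its `eps` "near zero" band are illustrative, not part of the pipeline or a tuned threshold:

```python
def tx_decision(cate: float, eps: float = 0.01) -> str:
    """Map a patient's CATE to the recommendation table above.

    eps is an illustrative 'near zero' band, not a tuned threshold.
    """
    if cate < -eps:
        return "Treat"           # treatment lowers risk of Y1 (good)
    if cate > eps:
        return "Do not treat"    # treatment increases risk of Y1 (bad)
    return "Neutral / optional"  # no meaningful difference

print(tx_decision(-0.12))  # -> Treat
print(tx_decision(0.05))   # -> Do not treat
print(tx_decision(0.0))    # -> Neutral / optional
```

In practice a clinical deployment would pick `eps` (and possibly a stricter treat threshold) from the uncertainty of the CATE estimates rather than a fixed constant.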
We can see that the distribution of good/bad treatment effects is roughly normal:
```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

print()
print('--------------------------------------------------------------------------------------------')
print('PLOT UPLIFT DISTRIBUTION:')
print()
print('please see docs/results/tx_recommendations.csv')

# Histogram of risk reduction
plt.figure(figsize=(10, 6))
plt.hist(cohort["risk_reduction"], bins=50)
plt.title("Distribution of Predicted Risk Reduction (-CATE)")
plt.xlabel("Risk Reduction (negative = harmful treatment; positive = beneficial)")
plt.ylabel("Number of Patients")
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('docs/results/uplift_distribution.png')
plt.show()

# reshape the risk-reduction column for clustering
risk_vals = cohort["risk_reduction"].values.reshape(-1, 1)

# k-means: three uplift groups
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
cohort["uplift_cluster"] = kmeans.fit_predict(risk_vals)

# sort the clusters by mean risk reduction for consistent labels
cluster_means = cohort.groupby("uplift_cluster")["risk_reduction"].mean().sort_values()

# map the clusters -> ordered labels (0 = worst, 2 = best)
cluster_order_map = {old: new for new, old in enumerate(cluster_means.index)}
cohort["uplift_cluster"] = cohort["uplift_cluster"].map(cluster_order_map)

pprint(cohort[["patient_id", "risk_reduction", "uplift_cluster"]].head(10))

uplift_clusters_summary = cohort.groupby("uplift_cluster")["risk_reduction"].agg(["count", "mean", "min", "max"])
print(uplift_clusters_summary)
```

Cluster Number Meaning Map:
| uplift_cluster | Interpretation | Meaning for Treatment |
|---|---|---|
| 0 | treatment harmful | negative risk_reduction → avoid |
| 1 | neutral or weak effect | little/no benefit |
| 2 | strong positive effect | major benefit → treat |
Why use uplift / doubly-robust causal-inference estimation instead of a T-learner with XGBoost? (The average treatment effect (ATE/CATE) is also known as 'uplift.')
- A T-learner with XGBoost is not a causal model.
- It is a predictive model trained separately on treated and untreated outcomes.
Because of that, it suffers from three fundamental weaknesses and would be misapplied in our 'Tx Recommendations' use case:
A T-learner trains:
- $f_1(X)$ on treated patients
- $f_0(X)$ on untreated patients

But those two groups almost always have different $X$ distributions (treated patients != untreated patients), so XGBoost ends up learning:
- differences in who gets treated
- NOT differences in treatment effect

This produces biased uplift.
ATE/CATE/doubly-robust estimation handles this using:
- propensity scores: $\hat{p}(X)$ (the estimated treatment probability as a function of X)
- outcome models: $\hat{\mu}_1(X)$, $\hat{\mu}_0(X)$
- inverse probability weighting (IPW: a causal method that reweighs each observation by the inverse of its probability of receiving the treatment it actually received, making the treated and untreated groups comparable as if randomized)
- the doubly-robust (DR) correction, written as `ate_dr` in my Python script, which combines propensity weighting and outcome modeling so the treatment-effect estimate remains unbiased if either model is correctly specified

This makes treatment comparisons as-if randomized; T-learning cannot do that.
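A minimal sketch of the IPW step described above; the `p_hat` values are made up for illustration, not outputs of the propensity model:

```python
import numpy as np

# Illustrative only: four patients with made-up propensity scores
W = np.array([1, 1, 0, 0])              # treatment actually received
p_hat = np.array([0.8, 0.5, 0.5, 0.2])  # estimated P(W=1 | X)

# IPW weight: 1/p_hat for treated, 1/(1 - p_hat) for untreated,
# so under-represented profiles count more in their group
weights = np.where(W == 1, 1.0 / p_hat, 1.0 / (1.0 - p_hat))
print(weights)  # values 1.25, 2.0, 2.0, 1.25
```

Note how the treated patient who was unlikely to be treated (p_hat = 0.5 vs 0.8) gets the larger weight; this is what pushes the weighted groups toward comparability, and why the pipeline clips p_hat to [0.01, 0.99] to keep weights bounded.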
The DR estimator is called "doubly robust" because $\widehat{ATE_{DR}}$ is unbiased if either $\hat{p}(X)$ or $\hat{\mu}(X)$ is correct. T-learning has zero robustness:
- if either XGBoost model is even slightly mis-specified
- we get biased estimates

For example, if treatment W=1 is rare (common in medicine), then:
- the $f_1(X)$ treated model is weak
- the $f_0(X)$ untreated model is strong
- the uplift is dominated by noise from the treated model
ATE/DR uses:
- weights
- correction terms
- variance stabilization

So it handles imbalance safely. This is the most important reason not to use a T-learner in this use case.
A T-learner models:

$E[Y \mid X, W]$

But we want:

$E[Y(1) - Y(0) \mid X]$

The T-learner predicts based on observed outcomes only, not counterfactuals. ATE/DR explicitly reconstructs counterfactuals using:
- propensity adjustment
- residual correction
- cross-world estimates (i.e., a hypothetical comparison of Tx and no Tx)

This is known as causal inference (CI), not standard prediction.
To summarize the above:
| Method | Purpose | Problem | Causal? |
|---|---|---|---|
| XGBoost T-Learner | Predicts Y for treated & untreated separately | Learns treatment bias; no confounding correction | no |
| ATE/Doubly-Robust | Estimates effect of treatment holding confounding constant | Requires assumptions but statistically robust | yes |
The ATE/doubly-robust method is better because it removes confounding and identifies true causal effects, whereas an XGBoost T-learner only builds two prediction models and cannot produce unbiased Tx effects.
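To see the failure mode concretely, here is a toy simulation (synthetic data, not the project dataset) where a single confounder, `severe`, drives both treatment assignment and the bad outcome while the true treatment effect is exactly zero. The naive treated-vs-untreated contrast, which is what an unadjusted T-learner style comparison reflects, is badly biased; even a crude severity-stratified adjustment recovers roughly zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounded toy world: severity drives both treatment and the bad outcome,
# and the true treatment effect is exactly zero by construction.
severe = rng.random(n) < 0.5
p_treat = np.where(severe, 0.8, 0.2)   # sicker patients get treated more often
W = rng.random(n) < p_treat
p_bad = np.where(severe, 0.6, 0.1)     # outcome depends on severity only, not W
Y = rng.random(n) < p_bad

# Naive contrast: looks like treatment *causes* bad outcomes (~ +0.30)
naive = Y[W].mean() - Y[~W].mean()

# Severity-stratified contrast (crude confounding adjustment): ~ 0
adjusted = np.mean([
    Y[W & severe].mean() - Y[~W & severe].mean(),
    Y[W & ~severe].mean() - Y[~W & ~severe].mean(),
])
print(round(naive, 3), round(adjusted, 3))
```

The propensity/DR machinery above is a model-based generalization of this stratification idea to 80+ covariates, where explicit stratification is infeasible.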
| Variable | Meaning | Model Role |
|---|---|---|
| `patient_id` | Unique patient identifier | 🟦 ID only (not used in models) |
| `site_id` | Hospital/site ID | 🟩 Propensity + Outcome |
| `D1–D4` | Diagnoses | 🟩 Propensity + Outcome |
| `X1–X80` | Patient covariates | 🟩 Propensity + Outcome |
| `W` | Treatment indicator | 🟥 Target (Propensity), Feature (Outcome) |
| `Y1` | Main clinical outcome | 🟥 Target (Outcome model) |
| `Y2–Y5` | Other outcomes | 🟧 Optional Outcome targets |
| `p̂(X)` | Propensity score | 🟪 Propensity model output |
| `μ̂₁(X)` | Predicted outcome if treated | 🟪 Outcome model output |
| `μ̂₀(X)` | Predicted outcome if untreated | 🟪 Outcome model output |
| `CATE_i` | Individual treatment effect | 🟨 Uplift/Causal Model |
| `risk_reduction_i` | −CATEᵢ | 🟨 Decision Rule |
| `uplift_cluster` | K-means uplift group | 🟫 Clustering Model |
| `ATE_DR` | Doubly-robust ATE | 🟫 Causal Estimator |

Legend: 🟦 ID 🟩 Features 🟥 Targets 🟪 Model Outputs 🟨 Causal/Uplift 🟫 Meta-analysis
```
┌───────────────────────────────────────────┐
│ DATA                                      │
│ patient_id, site_id, D1–D4, X1–X80, W, Y1 │
└───────────────────────────────────────────┘
                    |
                    v
┌─────────────────────────────────────────────────────────┐
│ PROPENSITY MODEL p̂(X)                                   │
│ Inputs: site_id, D1–D4, X1–X80                          │
│ Output: p̂(X) = P(W=1 | X)                               │
└─────────────────────────────────────────────────────────┘
                    |
                    v
┌─────────────────────────────────────────────────────────┐
│ OUTCOME MODELS μ̂₁(X), μ̂₀(X)                             │
│ Inputs: W, Y1, site_id, D1–D4, X1–X80                   │
│ Outputs: μ̂₁(X), μ̂₀(X)                                   │
└─────────────────────────────────────────────────────────┘
                    |
                    v
┌──────────────────────────────────────────────────┐
│ DOUBLY-ROBUST ATE_DR                             │
│ Uses: Y1, W, p̂(X), μ̂₁(X), μ̂₀(X)                  │
└──────────────────────────────────────────────────┘
                    |
                    v
┌──────────────────────────────────────────────────┐
│ CATE / UPLIFT MODEL                              │
│ CATE = μ̂₁(X) − μ̂₀(X)                             │
│ risk_reduction = −CATE                           │
│ uplift_cluster (K-means on risk_reduction)       │
└──────────────────────────────────────────────────┘
```