Feature/indiv markov graphs#66

Open
AdaraPutri wants to merge 53 commits into dev from feature/indiv-markov-graphs

Conversation

@AdaraPutri
Collaborator

@AdaraPutri AdaraPutri commented Feb 2, 2026

This PR adds an individual-level version of the transition-edges + Markov graph pipeline, while keeping the same PR-ID sessioning behavior as the team-level flow. It introduces two new scripts (transition_edges_individual.py and graphing_individual.py) that generate per-person CSV outputs and graphs, with the logic split depending on the data source:

  • Branching/structure labels (data/graph_labels/*_labels_branching_and_structure.csv): the pipeline splits rows by pr_author, so each person’s transitions/graphs only use their own rows (still grouped by PR session).
  • PR labels (raw, non-clean) (data/csv/pr_labels_year-long-project-team-*.csv): since the raw file has two possible “author” columns, the pipeline first derives a single user field per row using: if source is empty or "pr" → use pr_author, otherwise if source is "review" → use author. Also, because the raw PR label CSV doesn’t have a timestamp column, this flow generates one using the same rules as the clean-CSV helper: default created_at, use merged_at for merge events (reviewed_merge / self_merge), and use updated_at for no_merge (fallback to created_at if missing).
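The user and timestamp rules described for the raw PR label CSV can be sketched roughly as follows. This is only an illustration based on the description above, not the code in this PR: the function names `derive_user` and `derive_timestamp` are hypothetical, the event label is passed in as a parameter because the description does not say which column holds it, and the handling of any `source` value other than empty, "pr", or "review" is an assumption.

```python
from typing import Mapping, Optional

# Merge events named in the PR description.
MERGE_EVENTS = {"reviewed_merge", "self_merge"}


def _clean(value: Optional[str]) -> str:
    """Normalize a possibly missing CSV cell to a stripped string."""
    return (value or "").strip()


def derive_user(row: Mapping[str, Optional[str]]) -> Optional[str]:
    """Pick a single user for a raw PR-label row (hypothetical helper)."""
    source = _clean(row.get("source")).lower()
    if source in ("", "pr"):
        return _clean(row.get("pr_author")) or None
    if source == "review":
        return _clean(row.get("author")) or None
    # Assumption: other source values are not covered by the description.
    return None


def derive_timestamp(row: Mapping[str, Optional[str]], event: str) -> str:
    """Pick a timestamp for one event of a row (hypothetical helper)."""
    created = _clean(row.get("created_at"))
    if event in MERGE_EVENTS:
        # The description states no fallback for a missing merged_at,
        # so the value is returned as-is.
        return _clean(row.get("merged_at"))
    if event == "no_merge":
        # Fall back to created_at when updated_at is missing.
        return _clean(row.get("updated_at")) or created
    return created
```

Passing the event explicitly keeps the sketch independent of how the real scripts parse and explode event labels.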

To avoid copy/pasting the same Markov helpers across 4 scripts, common pieces (event parsing/explode + edge computation + graph rendering helpers) were moved into process_model/_markov_common.py, and the team + individual scripts import the functions they need directly.
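For readers unfamiliar with the edge computation, here is a minimal sketch of how per-person transition edges might be counted within PR sessions and normalized into Markov transition probabilities. It is not the contents of process_model/_markov_common.py: the function names and the row keys (`user`, `pr_id`, `timestamp`, `event`) are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Mapping, Tuple

Edge = Tuple[str, str]


def compute_edges(rows: Iterable[Mapping[str, str]]) -> Dict[str, Counter]:
    """Count event-to-event transitions per user, within each PR session."""
    # Group rows into sessions keyed by (user, PR ID) so that transitions
    # never cross from one PR into another.
    sessions = defaultdict(list)
    for row in rows:
        sessions[(row["user"], row["pr_id"])].append(row)

    counts: Dict[str, Counter] = defaultdict(Counter)
    for (user, _pr_id), events in sessions.items():
        # ISO 8601 timestamps in the same format sort chronologically as strings.
        events.sort(key=lambda r: r["timestamp"])
        labels = [r["event"] for r in events]
        for src, dst in zip(labels, labels[1:]):
            counts[user][(src, dst)] += 1
    return counts


def to_probabilities(edge_counts: Mapping[Edge, int]) -> Dict[Edge, float]:
    """Normalize edge counts so outgoing edges of each source sum to 1."""
    totals: Counter = Counter()
    for (src, _dst), n in edge_counts.items():
        totals[src] += n
    return {(src, dst): n / totals[src] for (src, dst), n in edge_counts.items()}
```

Keying sessions by both user and PR ID mirrors the description above: each person's graph uses only their own rows, still grouped by PR session.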

Testing

Testing will be done in a separate test script PR since this change is already pretty big.

Expected Result

Here is a sample output for one student in Team 2 from the PR-label flow (available at data/outputs/pr_individual/users/year-long-project-team-2/indigoalex-771a/individual_avg_session/indigoalex-771a_avg_session.png):
[Image: indigoalex-771a_avg_session]

Closes: #60

@AdaraPutri AdaraPutri requested review from Mahatav and d2r3v February 2, 2026 17:12
@AdaraPutri AdaraPutri self-assigned this Feb 2, 2026
Collaborator

@Mahatav Mahatav left a comment

It looks good; however, I would pull from dev and fix the toggle thing.

Comment thread process_model/graphing_individual.py Outdated
CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
ROOT = os.path.abspath(os.path.join(CURRENT_DIR, "../"))

# ============================================================
Collaborator

I would pull from dev and rework it so everything works without a toggle.

@AdaraPutri
Collaborator Author

@Mahatav Thanks for the review, Manu! I've pulled from dev now, so feel free to re-review.

@AdaraPutri AdaraPutri requested a review from Mahatav March 9, 2026 17:37
@d2r3v d2r3v force-pushed the dev branch 2 times, most recently from a8e6d00 to d14e5b5 March 16, 2026 08:26
Collaborator

@Mahatav Mahatav left a comment

Looks good to merge

3 participants