Reference implementation 4 by Sindhuja217 · Pull Request #13 · VectorInstitute/interpretability-llms-agents

Sindhuja217 · 2026-02-09T04:14:41Z

This reference implementation includes all core helper utilities, end-to-end notebooks, and documentation required to run a Direct Preference Optimization (DPO) pipeline with an LLM-as-a-Judge setup. The implementation covers dataset construction, judge-based inference, preference pair generation, DPO training, and evaluation, and is structured for modularity and reproducibility.

This is an initial version of the reference implementation. While the codebase is complete and internally consistent, it has not yet been executed or validated on Google Colab. Minor environment- or runtime-specific adjustments may be required when running in Colab.
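For readers new to the training step named above, the standard per-example DPO objective can be sketched in plain Python. This is a minimal illustration of the published loss, not the code in this PR; the function name, argument names, and the default `beta` are assumptions made for the example.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed sequence log-probabilities.

    All argument names and the default beta are illustrative assumptions.
    """
    # How much more the policy favours the chosen response over the
    # rejected one, measured relative to the frozen reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically
    # stable way so large |x| does not overflow.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))
```

When the policy and reference agree exactly, the margin is zero and the loss equals log 2; it falls as the policy shifts probability toward the chosen response.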

aravind-3105

Adding comments here as I go through them

implementations/implementation_d/dpo_req.txt

implementations/implementation_d/README.md

implementations/implementation_d/01_dataset_construction.ipynb

aravind-3105 · 2026-02-10T20:05:09Z

implementations/implementation_d/01_dataset_construction.ipynb

+   },
+   "outputs": [],
+   "source": [
+    "raw_dataset = load_parquet_dataset(PARQUET_PATH)\n",


PARQUET_PATH will be a PosixPath object, so the call needs to be raw_dataset = load_parquet_dataset(str(PARQUET_PATH)).

Also, the .parquet file won't be in the repo, so steps for downloading the data should be added to this notebook.
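The conversion suggested above can be sketched as follows. The path value and the `to_path_str` helper are hypothetical; `load_parquet_dataset` is the helper from this PR, and its signature is not shown here, so the call is left commented out.

```python
from pathlib import Path

# Hypothetical location; the real PARQUET_PATH is defined in the notebook.
PARQUET_PATH = Path("data") / "train.parquet"

def to_path_str(path):
    """Accept a str or Path and return a plain string path."""
    return str(Path(path))

parquet_path_str = to_path_str(PARQUET_PATH)
# raw_dataset = load_parquet_dataset(parquet_path_str)  # helper from this PR
```

Normalising at the call site keeps the notebook free to build paths with pathlib while still passing a string to helpers that expect one.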

Sindhuja217 · Feb 12, 2026 (edited)

The .parquet files are in this folder:
"/projects/aieng/interp_agents_bootcamp/reference_implementation_4"
Will the participants have access to this folder? The files are not downloaded directly from Hugging Face; I filtered the Hugging Face data to produce those parquet files.

Also, should I add the code for the data filtering step anywhere?

aravind-3105 · Feb 13, 2026

Keep the data filtering script and add instructions in the README on how to create the filtered data. Ideally, the filtered data will be stored in a GCP bucket, which participants can access to download the files and place them in the reference_implementation_4 folder. They won't have access to the cluster.
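Since participants will have to fetch the filtered files themselves, the notebook could fail early with a pointer to the README when the data is missing. A minimal sketch, assuming a hypothetical `require_dataset` helper; the error wording is illustrative and no bucket name or download command is assumed.

```python
from pathlib import Path

def require_dataset(path):
    """Return the path as a string, or raise with setup guidance if it is missing.

    Hypothetical helper: not part of this PR.
    """
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(
            f"Expected a filtered parquet file at '{p}'. "
            "Follow the README data setup steps to download or regenerate it."
        )
    return str(p)
```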

implementations/implementation_d/01_dataset_construction.ipynb

shainarazavi

There are some icons in these; do you want to keep them? Ideally it would be good to have a reference for the judge model, although it's your prompt, @Sindhuja217.

shainarazavi

I suggest adding a little context before each step.

shainarazavi · 2026-02-11T16:45:52Z

implementations/implementation_d/01_dataset_construction.ipynb

It would be nice to add a bit about why the seed is needed and why we perform each step, with a short note above each line of code.
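On the seed question, the usual reason is reproducibility: with a fixed seed, any random subsampling or shuffling selects the same examples on every run. A minimal sketch using only the standard library; `sample_indices` and its arguments are hypothetical and do not come from the notebook.

```python
import random

def sample_indices(n_items, k, seed):
    """Pick k distinct indices from range(n_items), reproducibly for a given seed."""
    # A dedicated Random instance avoids changing global random state.
    rng = random.Random(seed)
    return rng.sample(range(n_items), k)

first = sample_indices(100, 5, seed=42)
second = sample_indices(100, 5, seed=42)
```

Here `first` and `second` are identical, so two participants running the notebook with the same seed work on the same subset.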

shainarazavi · 2026-02-11T16:47:04Z

implementations/implementation_d/02_inference_runner.ipynb

@Sindhuja217 can we add some context before each line of code, explaining a bit of what is happening? There are many LLM-judge papers; it would be good to add references to 1-2 strong ones.

shainarazavi · 2026-02-11T16:48:18Z

implementations/implementation_d/03_dpo_pair_construction.ipynb

@Sindhuja217 I would prefer some context before each line of code and some references at the end.
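As an example of the kind of context this notebook could carry, turning judge scores into a preference pair might look like the sketch below. It is an illustration only: the function name, the `min_margin` threshold, and the `prompt`/`chosen`/`rejected` keys are assumptions, and the notebook's actual logic may differ.

```python
def build_preference_pair(prompt, responses, scores, min_margin=1.0):
    """Return a {prompt, chosen, rejected} dict, or None if the scores are too close.

    Hypothetical helper: names and threshold are assumptions.
    """
    if len(responses) != len(scores) or len(responses) < 2:
        raise ValueError("need at least two responses, each with one score")
    # Indices of the highest- and lowest-scored responses.
    best = max(range(len(scores)), key=scores.__getitem__)
    worst = min(range(len(scores)), key=scores.__getitem__)
    # Skip near-ties: they give a weak or noisy preference signal.
    if scores[best] - scores[worst] < min_margin:
        return None
    return {"prompt": prompt, "chosen": responses[best], "rejected": responses[worst]}
```

Dropping near-ties is a design choice rather than a requirement; it trades dataset size for cleaner preference labels.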

shainarazavi · 2026-02-11T16:49:36Z

implementations/implementation_d/05_evaluation.ipynb

What am I missing, @Sindhuja217 @aravind-3105? Could we add a bit of context before each line and a list of related-work references at the end? (I know we have one reference we are following, but look at it from a more academic view.)

Sindhuja217 · 2026-02-13T04:38:40Z

@shainarazavi I addressed all your comments and added context for the important cells. I'm planning to add some more once I run the code on a GPU. I also included the respective references.

aravind-3105 · 2026-02-13T20:39:48Z

I’ve gone through all the notebooks (except the 5th, which needs an API key, I will try it once approval comes through) and everything is working well. One suggestion for the first notebook: instead of jumping straight into the "Dataset Construction for Preference Alignment (DPO)" section, it might be helpful to start with a main title, #Preference Alignment (DPO), that explains why we follow the four steps and includes a brief description (maybe even 1-2 images) about what preference alignment is. This, along with the slides, would make it easier to explain and for participants to grasp. The same content could also be added to the readme so both look complete.

Another addition, based on Shaina’s feedback for other notebooks, is to include 3-4 questions, answers, or discussion points on the topic. Since the notebooks are divided into sections, these could be added to the readme instead of any particular notebook. Once these two additions are in place, it’s good to merge. Thanks for addressing the comments so promptly, really appreciate it.

aravind-3105

Everything looks good now to merge.

sindchad added 3 commits February 8, 2026 22:57

Reference implementation 4

2e0f9e2

remove duplicate files

1d22437

modification in readme.md

c66e133

Sindhuja217 self-assigned this Feb 9, 2026

Sindhuja217 requested review from aravind-3105 and shainarazavi February 9, 2026 14:03

aravind-3105 added the enhancement New feature or request label Feb 9, 2026

aravind-3105 requested changes Feb 10, 2026


shainarazavi reviewed Feb 11, 2026


shainarazavi self-requested a review February 11, 2026 16:44

shainarazavi reviewed Feb 11, 2026


sindchad added 2 commits February 12, 2026 23:30

context in notebooks

6d965e8

equation change

f8a1e14

sindchad added 5 commits February 14, 2026 17:24

readme modified

35536ba

readme modified

a57f6e6

readme modified

b902d55

readme modified

3569277

readme modified

1ce892c

aravind-3105 approved these changes Feb 17, 2026


Sindhuja217 merged commit 7ac52b4 into main Feb 17, 2026
1 of 2 checks passed

aravind-3105 deleted the ref-impl-4 branch March 10, 2026 19:54

3 participants