
Analyse Measured vs Reported Runtime (v2) – Issue #396 (#432)

Open

Vamsipriya22 wants to merge 6 commits into main from priya/run-v2-runtimes

Conversation

@Vamsipriya22 (Contributor)

This PR relates to issue #396.

A new notebook has been added:

notebooks/analyze_runtime_discrepancies_v2.ipynb

The notebook analyses the difference between:

  • Runtime (s) (measured)
  • Reported Runtime (s) (solver-reported)

for all successful, non-reference benchmarks in the v2 results.


What was done

  • Filtered results with Status == "ok"
  • Computed:
    • runtime-difference
    • runtime-difference-% (absolute difference divided by the maximum of the two runtimes)
  • Merged benchmark metadata:
    • bench-size
    • solver-version
    • Num. variables
    • Num. constraints
    • size category (S / M / L)
  • Re-ran selected benchmarks (S, M, L) using run_solver.py, to check whether the discrepancies can be reproduced, with:
    • the same conda environment
    • the same solver version
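The filtering and difference computation described above can be sketched in pandas. The column names Status, Runtime (s), and Reported Runtime (s) come from the results table; the sample rows below are illustrative, not real v2 data:

```python
import pandas as pd

# Illustrative sample; the real notebook loads the v2 results file.
df = pd.DataFrame({
    "Benchmark": ["a", "b", "c"],
    "Status": ["ok", "ok", "TO"],
    "Runtime (s)": [0.010, 120.0, 3600.0],
    "Reported Runtime (s)": [0.002, 110.0, 3600.0],
})

# Keep only successful, non-timed-out runs.
ok = df[df["Status"] == "ok"].copy()

# Absolute difference between measured and solver-reported runtime.
ok["runtime-difference"] = (
    ok["Runtime (s)"] - ok["Reported Runtime (s)"]
).abs()

# Percentage difference: absolute difference divided by the
# maximum of the two runtimes, as described in the PR.
ok["runtime-difference-%"] = 100 * ok["runtime-difference"] / ok[
    ["Runtime (s)", "Reported Runtime (s)"]
].max(axis=1)

print(ok[["Benchmark", "runtime-difference", "runtime-difference-%"]])
```

Dividing by the larger of the two runtimes keeps the percentage bounded at 100% and symmetric in the two measurements.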

Key Findings

Small (S) Benchmarks

  • Absolute runtime difference is extremely small (often a few milliseconds).
  • Percentage difference appears very large (sometimes ~90%+), because the runtime itself is very small.
  • Likely causes:
    • Solver startup overhead
    • Measurement noise dominating short runtimes
[Screenshot from 2026-02-18 18-38-52]

Medium (M) Benchmarks

  • Moderate runtime differences (~5–10%).
  • Absolute differences typically around ~0.5–1 second.
  • Likely causes:
    • LP/MPS file parsing time
    • Preprocessing outside solver internal timing
[Screenshot from 2026-02-18 18-39-55]

Large (L) Benchmarks

  • Large absolute runtime differences (tens to hundreds of seconds).
  • Percentage differences are generally small (~1–17%).
  • Likely causes:
    • Slow parsing of large LP/MPS files
    • Solver initialization overhead
    • License validation time (especially Gurobi)
    • Additional setup not included in solver-reported runtime
[Screenshot from 2026-02-18 18-39-29]

Observations

  • Many benchmarks (including large ones) show zero difference, especially for highs-hipo-1.12.0-hipo.
[Screenshot from 2026-02-18 19-16-00]
  • No systematic inconsistency between measured and reported runtimes was observed.
  • Differences are primarily due to overhead outside the solver’s internal timing.

@vercel bot commented Feb 18, 2026

The latest updates on your projects:

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| solver-benchmark | Ready | Preview, Comment | Mar 5, 2026 3:11pm |

@eantonini (Member) left a comment:

Everything looks good, thanks for performing this analysis and adding the key conclusions to the Jupyter notebook.

@siddharth-krishna (Member) left a comment:

Thanks, Priya, for the analysis and detailed notes. I notice that in the notebook many of the bench-sizes with large runtime-difference-% have really small runtime. Can we:

  • Filter out the instances with runtime < 1 min: even if there is a big discrepancy here, people don't really care about such quick solution times.
  • Show the top 5 bench-sizes sorted by runtime-difference-%, in each category (S/M/L). I see this in the PR description but not in the Jupyter notebook. Also, the screenshot in the PR description for Medium instances includes S and L instances too, and the screenshot for Large instances isn't sorted by runtime-difference-%.
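Both requested changes could be implemented in the notebook along these lines. The column names bench-size, size, Runtime (s), and runtime-difference-% are assumptions matching the PR description; the sample rows are made up:

```python
import pandas as pd

# Illustrative frame standing in for the notebook's merged results.
ok = pd.DataFrame({
    "bench-size": ["a", "b", "c", "d", "e", "f"],
    "size": ["S", "M", "M", "L", "L", "L"],
    "Runtime (s)": [0.5, 90.0, 300.0, 45.0, 800.0, 7200.0],
    "runtime-difference-%": [90.0, 8.0, 5.0, 50.0, 17.0, 1.0],
})

# 1. Drop instances faster than one minute: large relative
#    discrepancies on sub-minute runs are not interesting.
slow = ok[ok["Runtime (s)"] >= 60]

# 2. Top 5 rows by runtime-difference-% within each size category.
top5 = (
    slow.sort_values("runtime-difference-%", ascending=False)
        .groupby("size", sort=False)
        .head(5)
)
print(top5)
```

Sorting first and then taking `groupby(...).head(5)` keeps each category's rows in descending discrepancy order without needing a per-group sort.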

"cell_type": "markdown",
"metadata": {},
"source": [
"# 4. Identify Benchmark-Solver Pairs with Largest Runtime Discrepancies"
A Member commented on this notebook cell:

All the items in this table have runtime ≤ 1 s. Can we first filter to those with runtime > 1 min and then check the largest runtime-difference-%?


Successfully merging this pull request may close these issues.

Analyze measured vs reported runtime in v2

3 participants