Analyse Measured vs Reported Runtime (v2) – Issue #396 (#432)
Open
Vamsipriya22 wants to merge 6 commits into main from
Conversation
eantonini (Member) approved these changes on Mar 9, 2026:
Everything looks good, thanks for performing this analysis and adding the key conclusions to the Jupyter notebook.
siddharth-krishna (Member) requested changes on Mar 10, 2026:
Thanks, Priya, for the analysis and detailed notes. I notice that in the notebook many of the bench-sizes with a large runtime-difference-% have a really small runtime. Can we:
- Filter out the instances with runtime < 1 min; even if there is a big discrepancy here, people don't really care about such quick solution times.
- Show the top 5 bench-sizes sorted by runtime-difference-% in each category (S/M/L). I see this in the PR description but not in the Jupyter notebook. Also, the screenshot in the PR description for Medium instances has S and L instances too, and the screenshot for Large instances isn't sorted by runtime-difference-%.
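The two requested changes could be sketched in pandas roughly as below. The column names (`bench-size`, `Runtime (s)`, `runtime-difference-%`) follow the PR description; the sample rows are made up for illustration, and the real notebook would load them from the v2 results instead.

```python
import pandas as pd

# Hypothetical subset of the notebook's results table (made-up values).
df = pd.DataFrame({
    "bench-size": ["S", "S", "M", "L", "L"],
    "Runtime (s)": [0.5, 90.0, 300.0, 30.0, 4000.0],
    "runtime-difference-%": [95.0, 40.0, 25.0, 80.0, 12.0],
})

# 1. Drop instances whose measured runtime is under one minute.
slow = df[df["Runtime (s)"] >= 60]

# 2. Top 5 rows by runtime-difference-% within each S/M/L category.
top5 = (
    slow.sort_values("runtime-difference-%", ascending=False)
    .groupby("bench-size", sort=False)
    .head(5)
)
print(top5)
```

With the sample data, the sub-minute S and L rows are dropped before ranking, so short runs can no longer dominate the "largest discrepancy" tables.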
| "cell_type": "markdown", | ||
| "metadata": {}, | ||
| "source": [ | ||
| "# 4. Identify Benchmark-Solver Pairs with Largest Runtime Discrepancies" |
All the items on this table have runtime <= 1s. Can we first filter to those with runtime > 1min and then check the largest runtime-difference-%?
This PR is related to issue #396.
A new notebook has been added: notebooks/analyze_runtime_discrepancies_v2.ipynb

The notebook analyses the difference between:
- Runtime (s) (measured)
- Reported Runtime (s) (solver-reported)

for all successful, non-reference benchmarks in the v2 results.
What was done
- Filtered the results to runs with Status == "ok".
- Computed runtime-difference and runtime-difference-% (absolute difference divided by the maximum of the two runtimes).
- Broke the discrepancies down by bench-size, solver-version, Num. variables, and Num. constraints.
- Re-ran selected benchmark-solver pairs with run_solver.py to check whether discrepancies can be reproduced.
Key Findings
Small (S) Benchmarks
Medium (M) Benchmarks
Large (L) Benchmarks
(Gurobi)

Observations
highs-hipo-1.12.0-hipo.