Add paper asset generation tooling #31
## Reviewer's Guide

Adds a new `tools.paper_assets` module to generate NeurIPS paper tables and figures from benchmark JSON outputs, wires it into CONTRIBUTING, extends result loading with skip metrics, and updates pre-commit tooling versions and a GCP comment.

### Sequence diagram for CLI-driven paper asset generation

```mermaid
sequenceDiagram
    actor Dev
    participant Python as python_module_runner
    participant Main as tools_paper_assets_main
    participant Results as tools_results
    participant Matplotlib as matplotlib_seaborn
    participant FS as filesystem
    Dev->>Python: python -m tools.paper_assets --all
    Python->>Main: main()
    Main->>Main: argparse.parse_args()
    Main->>Main: resolve repo_root, input_dir, paper_dir
    Main->>Main: fmt = _parse_formats("png,pdf")
    Main->>Main: determine do_tables, do_figures
    alt tables and/or figures selected
        opt generate tables
            Main->>Results: load_results(input_dir)
            Results-->>Main: df_1t
            Main->>Results: load_dataloader_results(input_dir)
            Results-->>Main: dl
            Main->>Main: generate_tables(input_dir, paper_dir)
            Main->>FS: write generated/table*.md
            FS-->>Main: tables persisted
        end
        opt generate figures
            Main->>Matplotlib: sns.set_theme(...)
            Main->>Results: load_dataloader_results(input_dir)
            Results-->>Main: dl
            Main->>Main: generate_figures(input_dir, paper_dir, fmt)
            Main->>FS: save fig01..fig04 in figures/*.{png,pdf}
            FS-->>Main: figures persisted
        end
    else no flags
        Main->>Main: parser.error("Specify --tables, --figures, or --all")
    end
    Main-->>Python: exit
    Python-->>Dev: CLI returns
```
### Class diagram for tools.paper_assets and related utilities

```mermaid
classDiagram
    class ToolsPaperAssetsModule {
        +generate_tables(input_dir: Path, paper_dir: Path) void
        +generate_figures(input_dir: Path, paper_dir: Path, formats: tuple) void
        +plot_fig01_dataloader_scaling(dl: DataFrame, out_dir: Path, formats: tuple) void
        +plot_fig02_amd_w4_w8(dl: DataFrame, out_dir: Path, formats: tuple) void
        +plot_fig03_library_efficiency_heatmap(dl: DataFrame, out_dir: Path, formats: tuple) void
        +plot_fig04_cross_platform_recommendation(dl: DataFrame, out_dir: Path, formats: tuple) void
        +generate_hardware_table(df_1t: DataFrame, dest: Path) void
        +generate_single_thread_table(df_1t: DataFrame, dest: Path) void
        +generate_peak_dataloader_table(dl: DataFrame, dest: Path) void
        +generate_amd_w4_w8_table(dl: DataFrame, dest: Path) void
        +generate_recommendation_table(dl: DataFrame, dest: Path) void
        +main() void
        -_paper_scope(df: DataFrame) DataFrame
        -_platform_labels(cpu_by_platform: dict) dict
        -_cpu_by_platform(df: DataFrame) dict
        -_ordered_libs_present(df: DataFrame) list
        -_md_table(headers: list, rows: list) str
        -_fmt_mean_std(mean: float, std: float) str
        -_fmt_ips(value: float) str
        -_peak_dataloader_rows(dl: DataFrame) DataFrame
        -_ips_at_workers(dl: DataFrame, platform: str, lib: str, workers: int) float
        -_peak_pct_of_platform_winner(dl: DataFrame) DataFrame
        -_lib_colors(libs: list) dict
        -_parse_formats(value: str) tuple
    }
    class ToolsResultsModule {
        +LIBRARY_ORDER: list
        +load_results(input_dir: Path) DataFrame
        +load_dataloader_results(input_dir: Path) DataFrame
        +short_platform(platform: str, cpu_brand: str) str
    }
    class MatplotlibSeaborn {
        +pyplot
        +sns_set_theme(style: str, context: str, font_scale: float) void
    }
    class PathsAndIO {
        +Path
        +mkdir(parents: bool, exist_ok: bool) void
        +write_text(data: str, encoding: str) void
        +resolve() Path
    }
    ToolsPaperAssetsModule ..> ToolsResultsModule : uses_loaders_and_order
    ToolsPaperAssetsModule ..> MatplotlibSeaborn : uses_plotting
    ToolsPaperAssetsModule ..> PathsAndIO : writes_tables_and_figures
    class Constants {
        +PAPER_PLATFORMS: tuple
        +PLATFORM_MACHINE: dict
        +PLATFORM_MICROARCH: dict
        +PLATFORM_SMT: dict
        +ZEN4_PLATFORM: str
        +ZEN5_PLATFORM: str
        +WORKERS_ORDER: tuple
        +FIG_DPI: int
    }
    ToolsPaperAssetsModule ..> Constants : reads_configuration
```
### Flow diagram for paper asset generation pipeline

```mermaid
flowchart LR
    subgraph OutputDir["output/ (benchmark JSON)"]
        JSONResults["*_results.json"]
        JSONDataloader["*_dataloader_results.json"]
    end
    Dev["Developer runs\npython -m tools.paper_assets"] --> CLIArgs["argparse parses flags\n--tables / --figures / --all"]
    CLIArgs -->|resolve paths| RepoRoot["repo_root"]
    RepoRoot --> InputDir["input_dir = repo_root/output"]
    RepoRoot --> PaperDir["paper_dir = repo_root/_internal/papers"]
    subgraph ResultsLoaders["tools._results"]
        LoadResults["load_results(input_dir)"]
        LoadDLResults["load_dataloader_results(input_dir)"]
    end
    InputDir --> LoadResults
    InputDir --> LoadDLResults
    subgraph PaperAssets["tools.paper_assets"]
        GenerateTables["generate_tables(input_dir, paper_dir)"]
        GenerateFigures["generate_figures(input_dir, paper_dir, formats)"]
        HardwareTable["generate_hardware_table"]
        SingleThreadTable["generate_single_thread_table"]
        PeakDLTable["generate_peak_dataloader_table"]
        AmdW4W8Table["generate_amd_w4_w8_table"]
        RecommendationTable["generate_recommendation_table"]
        Fig01["plot_fig01_dataloader_scaling"]
        Fig02["plot_fig02_amd_w4_w8"]
        Fig03["plot_fig03_library_efficiency_heatmap"]
        Fig04["plot_fig04_cross_platform_recommendation"]
    end
    LoadResults -->|df_1t| GenerateTables
    LoadDLResults -->|dl| GenerateTables
    LoadDLResults -->|dl| GenerateFigures
    GenerateTables --> HardwareTable
    GenerateTables --> SingleThreadTable
    GenerateTables --> PeakDLTable
    GenerateTables --> AmdW4W8Table
    GenerateTables --> RecommendationTable
    GenerateFigures --> Fig01
    GenerateFigures --> Fig02
    GenerateFigures --> Fig03
    GenerateFigures --> Fig04
    subgraph PaperOutput["_internal/papers"]
        GenDir["generated/*.md tables"]
        FigDir["figures/*.{png,pdf}"]
    end
    HardwareTable --> GenDir
    SingleThreadTable --> GenDir
    PeakDLTable --> GenDir
    AmdW4W8Table --> GenDir
    RecommendationTable --> GenDir
    Fig01 --> FigDir
    Fig02 --> FigDir
    Fig03 --> FigDir
    Fig04 --> FigDir
```
## File-Level Changes
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="tools/paper_assets.py" line_range="102-107" />
<code_context>
+ return "\n".join(lines) + "\n"
+
+
+def _fmt_mean_std(mean: float | None, std: float | None) -> str:
+ if mean is None or (isinstance(mean, float) and math.isnan(mean)):
+ return "—"
+ if std is None or (isinstance(std, float) and math.isnan(std)):
+ return f"{mean:.0f}"
+ return f"{mean:.0f} ± {std:.0f}"
+
+
</code_context>
<issue_to_address>
**issue (bug_risk):** Handle NumPy NaNs correctly in _fmt_mean_std and _fmt_ips.
Because these helpers are often passed NumPy scalars (e.g., np.float64) from DataFrames, the `isinstance(x, float)` check fails and the NaN branch is skipped, so NaNs are rendered as `'nan'` instead of an em dash. The same applies to `_fmt_ips`. Consider dropping the `isinstance` guard and using a NaN-aware check like `pd.isna` (or `np.isnan`) directly so that any NaN-like value returns the em dash and avoids leaking `'nan'` into the output.
</issue_to_address>
### Comment 2
<location path="tools/paper_assets.py" line_range="388-396" />
<code_context>
+ sub = peak[peak["platform"] == plat]
+ if sub.empty:
+ continue
+ winner_ips = float(sub["peak_ips"].max())
+ rows.extend(
+ [
+ {
+ "platform": plat,
+ "library": row["library"],
+ "peak_ips": float(row["peak_ips"]),
+ "peak_workers": int(row["peak_workers"]),
+ "pct_of_winner": 100.0 * float(row["peak_ips"]) / winner_ips,
+ }
+ for row in sub.to_dict("records")
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against division by zero when normalizing by the platform winner.
In `_peak_pct_of_platform_winner`, if all `peak_ips` for a platform are zero (or effectively zero), `winner_ips` becomes 0 and `pct_of_winner` does `100.0 * ... / 0.0`, producing `inf`/`nan` and breaking downstream consumers. Please guard against this, e.g.:
```python
winner_ips = float(sub["peak_ips"].max())
if not winner_ips:
# skip this platform or treat everything as 0%
continue
```
Alternatively, clamp `winner_ips` to a small epsilon before dividing.
</issue_to_address>
Pull request overview
Adds a new tools.paper_assets module intended to generate NeurIPS paper-ready Markdown tables and Matplotlib/Seaborn figures directly from the existing output/*/*.json benchmark artifacts, keeping numbers aligned with the shared tools._results loaders used elsewhere in the repo.
Changes:
- Add `tools/paper_assets.py` CLI to write paper tables (`_internal/papers/generated/*.md`) and figures (`_internal/papers/figures/*`) from `output/`.
- Extend `tools._results.load_results()` to expose skip-related fields (`skip_rate`, `num_images_skipped`) in the DataFrame.
- Minor documentation / tooling updates (CONTRIBUTING example, pre-commit hook rev bumps, GCP script comment update).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tools/paper_assets.py | New CLI + helpers to generate paper tables/figures from benchmark JSON using shared loaders. |
| tools/_results.py | Adds skip-related columns to load_results() output schema. |
| gcp/run-many.sh | Updates a quota-related comment (32 → 64 vCPU). |
| CONTRIBUTING.md | Documents an example internal wrapper for generating paper assets (under gitignored _internal/). |
| .pre-commit-config.yaml | Bumps pre-commit hook revisions (ruff/uv/mypy). |
```python
row.append(
    _fmt_mean_std(
        float(sub["images_per_second"].iloc[0]),
        float(sub["std_dev"].iloc[0]) if pd.notna(sub["std_dev"].iloc[0]) else None,
    ),
)
```
```python
"p90": results.get("images_per_second_p90"),
"p99": results.get("images_per_second_p99"),
"us_per_image": results.get("us_per_image_mean"),
"skip_rate": results.get("skip_rate"),
"num_images_skipped": results.get("num_images_skipped"),
"cpu_brand": cpu.get("brand_raw", "Unknown CPU"),
"os_name": sysinfo.get("OS", platform.split("_")[0].title()),
},
```
```python
def _peak_dataloader_rows(dl: pd.DataFrame) -> pd.DataFrame:
    rows: list[dict] = []
    for plat in PAPER_PLATFORMS:
        libs = dl[dl["platform"] == plat]["library"].unique()
        for lib in libs:
            sub = dl[(dl["platform"] == plat) & (dl["library"] == lib)]
            if sub.empty:
                continue
            idx = sub["images_per_second"].idxmax()
            peak = sub.loc[idx]
            rows.append(
                {
                    "platform": plat,
                    "library": lib,
                    "peak_ips": float(peak["images_per_second"]),
                    "peak_workers": int(peak["num_workers"]),
                },
            )
    return pd.DataFrame(rows)


def generate_peak_dataloader_table(dl: pd.DataFrame, dest: Path) -> None:
    dl = _paper_scope(dl)
    peak = _peak_dataloader_rows(dl)
    plat_labels = _platform_labels(_cpu_by_platform(dl))
    libs = _ordered_libs_present(dl)
    headers = ["Decoder"] + [plat_labels[p] for p in PAPER_PLATFORMS]
    rows: list[list[str]] = []
    for lib in libs:
        row = [f"`{lib}`"]
        for plat in PAPER_PLATFORMS:
            sub = peak[(peak["platform"] == plat) & (peak["library"] == lib)]
            if sub.empty:
                row.append("—")
            else:
                w = int(sub["peak_workers"].iloc[0])
                ips = float(sub["peak_ips"].iloc[0])
                row.append(f"{ips:.0f} (w={w})")
        rows.append(row)
    dest.write_text(
        "# Table 3 — Peak DataLoader throughput (images/s) and best worker count\n\n"
        "_Max over workers ∈ {0, 2, 4, 8}. Decoders without DataLoader JSON omitted._\n\n" + _md_table(headers, rows),
        encoding="utf-8",
    )
```
## Summary by Sourcery
Add a new tooling module to generate paper-ready benchmark tables and figures from existing JSON results, and wire it into contributor documentation and supporting scripts.
New Features:
Enhancements:
Build:
Documentation: