Skip to content

fix: use parameterized SQL queries in report_digest.py#1645

Open
gn00295120 wants to merge 2 commits intoNVIDIA:mainfrom
gn00295120:fix/sql-parameterized-queries
Open

fix: use parameterized SQL queries in report_digest.py#1645
gn00295120 wants to merge 2 commits intoNVIDIA:mainfrom
gn00295120:fix/sql-parameterized-queries

Conversation

@gn00295120
Copy link

@gn00295120 gn00295120 commented Mar 22, 2026

Summary

  • Replace four f-string SQL query constructions in garak/analyze/report_digest.py with parameterized queries using ? placeholders
  • Affected functions: _init_populate_result_db, _get_group_aggregate_score, _get_probe_result_summaries, _get_detectors_info
  • Addresses CWE-89 (Improper Neutralization of Special Elements used in an SQL Command)

Background

The affected queries interpolated values derived from report JSONL data directly into SQL strings via f-strings. A malformed or adversarially crafted .report.jsonl file could include SQL metacharacters in field values such as probe, detector, or taxonomy tag names, causing unintended query behavior against the in-memory SQLite database.

Parameterized queries pass values as a separate tuple argument to cursor.execute(), so the SQLite driver handles escaping unconditionally — no string manipulation required.

Changes

  • _init_populate_result_db (line ~131): INSERT with 7 ? placeholders
  • _get_group_aggregate_score (line ~159): SELECT filtered by probe_group
  • _get_probe_result_summaries (line ~236): SELECT filtered by probe_group
  • _get_detectors_info (line ~262): SELECT filtered by probe_group and probe_class

No logic, variable names, return types, or other behavior was changed.

Test plan

  • Run existing test suite: pytest tests/ — all tests should pass without modification
  • Manually verify a report digest still generates correctly against a sample .report.jsonl

Copilot AI review requested due to automatic review settings March 22, 2026 00:51
@github-actions
Copy link
Contributor

github-actions bot commented Mar 22, 2026

DCO Assistant Lite bot All contributors have signed the DCO ✍️ ✅

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens garak/analyze/report_digest.py against SQL injection from untrusted .report.jsonl inputs by replacing f-string SQL construction with SQLite parameterized queries.

Changes:

  • Parameterized the INSERT into the in-memory results table in _init_populate_result_db.
  • Parameterized SELECT queries filtering by probe_group in _get_group_aggregate_score and _get_probe_result_summaries.
  • Parameterized the detector lookup query filtering by probe_group and probe_class in _get_detectors_info.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 235 to 239
def _get_probe_result_summaries(cursor, probe_group) -> List[tuple]:
res = cursor.execute(
f"select probe_module, probe_class, min(score) as s from results where probe_group='{probe_group}' group by probe_class order by s asc, probe_class asc;"
"select probe_module, probe_class, min(score) as s from results where probe_group=? group by probe_class order by s asc, probe_class asc;",
(probe_group,),
)
Copy link

Copilot AI Mar 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a targeted regression test that exercises these queries with adversarial probe_group / probe_class / detector values containing quotes or SQL metacharacters (e.g., "x' OR 1=1 --") and asserts build_digest() still succeeds and returns expected results. Current tests run the CLI over known-good assets but don't specifically cover the injection/escaping behavior this change is meant to fix.

Copilot uses AI. Check for mistakes.
@gn00295120
Copy link
Author

I have read the DCO Document and I hereby sign the DCO

@gn00295120 gn00295120 force-pushed the fix/sql-parameterized-queries branch from 93a4d84 to 0b0f1b7 Compare March 22, 2026 06:42
github-actions bot added a commit that referenced this pull request Mar 22, 2026
gn00295120 added a commit to gn00295120/garak that referenced this pull request Mar 22, 2026
Add parameterized tests exercising _init_populate_result_db,
_get_group_aggregate_score, _get_probe_result_summaries, and
_get_detectors_info with adversarial SQL metacharacter payloads
(e.g., "x' OR 1=1 --", "'; DROP TABLE results; --").

Addresses Copilot review suggestion on PR NVIDIA#1645.

Signed-off-by: Lucas Wang <[email protected]>
@jmartin-tech
Copy link
Collaborator

@gn00295120 using Copilot or another assistant to help create a PR is acceptable and even encouraged if it helps you understand your contribution, however this should be confined to your private fork at this time. Asking the assistant to review and comment on the public PR after being opened upstream places a maintenance burdens on the project for tools that are not used by all contributors. In the future please keep this in mind.

Copy link
Collaborator

@jmartin-tech jmartin-tech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gn00295120 thank you for the contribution. This looks great and will go thru additional testing.

As an additional note to address the reference to CWE-89 in the description, this is a valid case of an improvement to follow a more secure practice in sql statement generation, the values here are only user controlled when utilizing this class as a standalone utility and the database is ephemeral to the report digest creation process. This simply mean the security evaluation of risk show no exposed vulnerable attack chain existed. This does not diminish the value of this PR in improving the codebase and reducing the possible attack surfaces.

Replace f-string interpolation with parameterized queries using ? placeholders
to prevent potential SQL injection from malformed report JSONL data.

Addresses CWE-89.

Signed-off-by: Lucas Wang <[email protected]>
Add parameterized tests exercising _init_populate_result_db,
_get_group_aggregate_score, _get_probe_result_summaries, and
_get_detectors_info with adversarial SQL metacharacter payloads
(e.g., "x' OR 1=1 --", "'; DROP TABLE results; --").

Addresses Copilot review suggestion on PR NVIDIA#1645.

Signed-off-by: Lucas Wang <[email protected]>
@gn00295120 gn00295120 force-pushed the fix/sql-parameterized-queries branch from fd3f7b5 to 4a1878a Compare March 24, 2026 00:37
@gn00295120
Copy link
Author

Thank you for the detailed review and clarification — I really appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants