Skip to content

feat: add goal hijacking regression scenario for cloud file deletion#21

Merged
mertsatilmaz merged 2 commits into
OWASP:mainfrom
anshjaiswal12:goal-hijacking-scenario-16
May 17, 2026
Merged

feat: add goal hijacking regression scenario for cloud file deletion#21
mertsatilmaz merged 2 commits into
OWASP:mainfrom
anshjaiswal12:goal-hijacking-scenario-16

Conversation

@anshjaiswal12
Copy link
Copy Markdown
Contributor

This PR adds a new executable security regression scenario for Goal Hijacking, as outlined in issue #16. I noticed a gap in the test coverage for this attack class — the existing goal_hijack/basic.yaml covers a simple document summarization context, but there was no scenario testing resistance to high-severity, tool-invocation-based goal diversion (e.g., an attacker attempting to trigger destructive operations like deleting cloud storage files).

The scenario is defined in scenarios/goal_hijacking_01.yaml and models an attack where malicious instructions are embedded within spreadsheet data — an untrusted, retrieved_document-style context. The agent's original goal is to calculate travel expenses; the injected payload attempts to divert it into calling cloud_delete or shell. I followed the existing schema structure (same top-level fields, trust: untrusted context sourcing, and denied_tools under expected) to stay consistent with the project's patterns and ensure the harness can ingest the file without any code changes.

Verification: the scenario passes agent-harness run scenarios/goal_hijacking_01.yaml --dry-run and is correctly loaded and validated by the harness. I also added a targeted test in tests/test_goal_hijacking_scenario.py to confirm schema compliance. All 9 tests in the suite pass cleanly.

Copy link
Copy Markdown
Collaborator

@mertsatilmaz mertsatilmaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. The scenario itself is strong: it is realistic, high-impact, and fits the goal-hijack category well. The injected spreadsheet content creates a clear conflict between the user’s original goal, calculating travel expenses, and the attacker’s goal, deleting cloud files. The denied tools also make the expected safety boundary explicit.

I verified the scenario loads and the test suite passes:

  • python -m py_compile tests/test_goal_hijacking_scenario.py
  • python -m pytest

One change is needed before merge: please move the scenario into the existing category-based layout.

Current path:

scenarios/goal_hijacking_01.yaml

Requested path:

scenarios/goal_hijack/delete_cloud_files_001.yaml

Please also update the test path accordingly.

Minor style cleanup in the test file:

  • Add a blank line between from pathlib import Path and from agent_harness.scenario import load_scenario
  • Add two blank lines before the test function

After that, this should be mergeable.

@mertsatilmaz mertsatilmaz linked an issue May 2, 2026 that may be closed by this pull request
Move the scenario into the category-based directory, rename id to match
the established `<category>.<name>_NNN` pattern, fix the schema-relative
path, and update the test to load the new path. Also apply PEP 8 import
and function spacing in the test file.

Co-authored-by: anshjaiswal12 <[email protected]>
Co-authored-by: mertsatilmaz <[email protected]>
Copy link
Copy Markdown
Collaborator

@mertsatilmaz mertsatilmaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @anshjaiswal12. I pushed the layout/style fixes directly onto the branch (kept you as co-author on the cleanup commit) so this can land. Approving and merging.

@mertsatilmaz mertsatilmaz merged commit 360059a into OWASP:main May 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Good first issue: Add a goal hijacking regression scenario

2 participants