Conversation
📝 WalkthroughWalkthroughIntroduces optional workload identity JWT token support for jobs containing vault credentials. The feature is controlled by a flag (FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED) and includes error-tolerant token retrieval via a gateway client. Comprehensive unit tests validate token request success, failure scenarios, and flag-gating logic. Changes
Sequence DiagramsequenceDiagram
participant Job as Job Execution
participant RunJob as RunJob Handler
participant Client as Workload Identity Client
participant Gateway as Token Gateway
participant Logger as Logger
Job->>RunJob: pre_run_hook() triggered
RunJob->>RunJob: Check if feature flag enabled
alt Feature enabled
RunJob->>RunJob: _job_needs_workload_jwt(instance)
RunJob->>RunJob: Check for vault credentials
alt JWT needed
RunJob->>Client: request_workload_identity_token(claims, scope, audience)
Client->>Gateway: HTTP POST token request
alt Success
Gateway-->>Client: JWT token returned
Client-->>RunJob: Token received
RunJob->>Logger: Log successful token retrieval
else Token error
Gateway-->>Client: TokenRequestError
Client-->>RunJob: Exception caught
RunJob->>Logger: Log token request failure
RunJob->>Job: Continue job (error tolerated)
end
else JWT not needed
RunJob->>Logger: Skip token request
end
else Feature disabled
RunJob->>Job: Skip workload identity flow
end
RunJob-->>Job: Return to job execution
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
awx/main/tasks/jobs.py (1)
594-596: Hardcodedaudienceplaceholder will produce unusable JWTs once AAP-62694 lands.
"https://vault.example.com"is example text that will never match a real Vault instance. Any JWT issued with this audience will be rejected by Vault. This must be configurable (per-credential or via settings) before the token is consumed downstream.♻️ Pull audience from settings
- # TODO: Audience should be configured per integration (Vault, AWS, etc.) - audience = "https://vault.example.com" + audience = getattr(settings, 'WORKLOAD_IDENTITY_JWT_AUDIENCE', None) + if not audience: + logger.warning(f"No WORKLOAD_IDENTITY_JWT_AUDIENCE configured, skipping JWT request for job {instance.id}") + return🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@awx/main/tasks/jobs.py` around lines 594 - 596, The hardcoded audience variable ("https://vault.example.com") in the jobs task will produce invalid JWTs; replace this literal by reading the Vault audience from configuration or the credential record (e.g., a per-credential field) and fall back to a sensible default only for backwards compatibility. Update the code that assigns audience (variable name audience near scope = "aap_controller_automation_job") to pull from Django settings or the credential/integration object (prefer per-credential if available), validate it is non-empty, and use that value when creating the JWT so downstream Vault accepts the token.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@awx/main/tasks/jobs.py`:
- Line 613: The current logger.debug call exposes a sensitive value by logging
response.jwt (even partially); remove any logging of response.jwt and replace it
with a non-sensitive indicator such as the token length or a stable fingerprint
(e.g., log f"JWT length: {len(response.jwt)}" or a one-way hash/fingerprint of
response.jwt) so no JWT characters or payload are written to logs; update the
logger.debug invocation that references response.jwt to emit only the safe
indicator.
- Around line 618-628: Change the two exception handlers so they both log full
tracebacks: replace the logger.error in the TokenRequestError except block with
logger.exception to include the traceback (keep the same non-failing behavior),
and in the generic except Exception handler remove the redundant exception
argument so the call is logger.exception(f"Unexpected error requesting workload
identity token for job {instance.id}") — reference TokenRequestError, the
existing except blocks, logger.exception, and instance.id when making the edits.
In `@awx/main/tests/unit/tasks/test_jobs.py`:
- Around line 493-509: The test
test_request_workload_identity_token_feature_flag_disabled can error before the
assertion because pre_run_hook on jobs.BaseTask calls
create_partition(instance.event_class._meta.db_table, start=instance.created);
mock create_partition the same way other pre/post hook tests do so
instance.created=None won’t trigger real partition logic. Update the test to
patch/mocking create_partition (the function used by pre_run_hook) before
invoking jobs.BaseTask().pre_run_hook, leaving mocks for flag_enabled and
get_workload_identity_client unchanged so the test can reach the
mock_get_client.assert_not_called() assertion.
In `@awx/settings/development_defaults.py`:
- Around line 73-74: Add FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED = False to the
base defaults so the feature flag has an explicit production-safe default;
mirror the pattern used for other flags like
FEATURE_POLICY_AS_CODE_ENABLED/FEATURE_INDIRECT_NODE_COUNTING_ENABLED in
defaults.py, leaving the existing FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED = True
in development_defaults.py to override the base for development.
---
Nitpick comments:
In `@awx/main/tasks/jobs.py`:
- Around line 594-596: The hardcoded audience variable
("https://vault.example.com") in the jobs task will produce invalid JWTs;
replace this literal by reading the Vault audience from configuration or the
credential record (e.g., a per-credential field) and fall back to a sensible
default only for backwards compatibility. Update the code that assigns audience
(variable name audience near scope = "aap_controller_automation_job") to pull
from Django settings or the credential/integration object (prefer per-credential
if available), validate it is non-empty, and use that value when creating the
JWT so downstream Vault accepts the token.
| f"Successfully obtained workload identity token for job {instance.id}. " | ||
| f"JWT length: {len(response.jwt)} characters" | ||
| ) | ||
| logger.debug(f"JWT token: {response.jwt[:50]}...") |
There was a problem hiding this comment.
Avoid logging any portion of a JWT, even at DEBUG level.
Auth tokens should not appear in log files. Logging the first 50 characters still exposes part of the payload (base64-decoded claims like organization or job metadata). Remove or replace with a non-sensitive token fingerprint (e.g., length only).
🛡️ Proposed fix
- logger.debug(f"JWT token: {response.jwt[:50]}...")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| logger.debug(f"JWT token: {response.jwt[:50]}...") |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@awx/main/tasks/jobs.py` at line 613, The current logger.debug call exposes a
sensitive value by logging response.jwt (even partially); remove any logging of
response.jwt and replace it with a non-sensitive indicator such as the token
length or a stable fingerprint (e.g., log f"JWT length: {len(response.jwt)}" or
a one-way hash/fingerprint of response.jwt) so no JWT characters or payload are
written to logs; update the logger.debug invocation that references response.jwt
to emit only the safe indicator.
| except TokenRequestError as e: | ||
| logger.error( | ||
| f"Failed to obtain workload identity token for job {instance.id}: {e}" | ||
| ) | ||
| # Don't fail the job if JWT request fails - this is optional functionality | ||
|
|
||
| except Exception as e: | ||
| logger.exception( | ||
| f"Unexpected error requesting workload identity token for job {instance.id}: {e}" | ||
| ) | ||
| # Don't fail the job on unexpected errors |
There was a problem hiding this comment.
Address both Ruff warnings: TRY400 and TRY401.
except TokenRequestError(line 618):logger.errordiscards the exception traceback — uselogger.exceptionso the full stack is captured.except Exception as e(line 624): passingetologger.exceptionis redundant —logger.exceptionalready attaches the active exception.
🐛 Proposed fix
except TokenRequestError as e:
- logger.error(
- f"Failed to obtain workload identity token for job {instance.id}: {e}"
- )
+ logger.exception(
+ f"Failed to obtain workload identity token for job {instance.id}: {e}"
+ )
# Don't fail the job if JWT request fails - this is optional functionality
- except Exception as e:
+ except Exception:
logger.exception(
- f"Unexpected error requesting workload identity token for job {instance.id}: {e}"
+ f"Unexpected error requesting workload identity token for job {instance.id}"
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except TokenRequestError as e: | |
| logger.error( | |
| f"Failed to obtain workload identity token for job {instance.id}: {e}" | |
| ) | |
| # Don't fail the job if JWT request fails - this is optional functionality | |
| except Exception as e: | |
| logger.exception( | |
| f"Unexpected error requesting workload identity token for job {instance.id}: {e}" | |
| ) | |
| # Don't fail the job on unexpected errors | |
| except TokenRequestError as e: | |
| logger.exception( | |
| f"Failed to obtain workload identity token for job {instance.id}: {e}" | |
| ) | |
| # Don't fail the job if JWT request fails - this is optional functionality | |
| except Exception: | |
| logger.exception( | |
| f"Unexpected error requesting workload identity token for job {instance.id}" | |
| ) | |
| # Don't fail the job on unexpected errors |
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 619-621: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
[warning] 626-626: Redundant exception object included in logging.exception call
(TRY401)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@awx/main/tasks/jobs.py` around lines 618 - 628, Change the two exception
handlers so they both log full tracebacks: replace the logger.error in the
TokenRequestError except block with logger.exception to include the traceback
(keep the same non-failing behavior), and in the generic except Exception
handler remove the redundant exception argument so the call is
logger.exception(f"Unexpected error requesting workload identity token for job
{instance.id}") — reference TokenRequestError, the existing except blocks,
logger.exception, and instance.id when making the edits.
| @pytest.mark.django_db | ||
| @mock.patch('awx.main.tasks.jobs.get_workload_identity_client') | ||
| @mock.patch('awx.main.tasks.jobs.flag_enabled') | ||
| def test_request_workload_identity_token_feature_flag_disabled(mock_flag_enabled, mock_get_client): | ||
| """Test that JWT request is skipped when feature flag is disabled.""" | ||
| # Disable the feature flag | ||
| mock_flag_enabled.return_value = False | ||
|
|
||
| # Create a test job | ||
| job = Job(id=789, name='Test Job') | ||
|
|
||
| # Create a BaseTask instance and call pre_run_hook | ||
| task = jobs.BaseTask() | ||
| task.pre_run_hook(job, '/tmp/private_data') | ||
|
|
||
| # Verify client was never called | ||
| mock_get_client.assert_not_called() |
There was a problem hiding this comment.
create_partition is not mocked — test_request_workload_identity_token_feature_flag_disabled may fail before reaching the assertion.
pre_run_hook calls create_partition(instance.event_class._meta.db_table, start=instance.created) before the flag check. For a bare unsaved Job(id=789), instance.created is None, which may cause create_partition to error or behave unexpectedly. Add the same create_partition mock used in other pre/post hook tests.
🐛 Proposed fix
`@pytest.mark.django_db`
`@mock.patch`('awx.main.tasks.jobs.get_workload_identity_client')
`@mock.patch`('awx.main.tasks.jobs.flag_enabled')
+@mock.patch('awx.main.tasks.jobs.create_partition', return_value=True)
-def test_request_workload_identity_token_feature_flag_disabled(mock_flag_enabled, mock_get_client):
+def test_request_workload_identity_token_feature_flag_disabled(mock_create_partition, mock_flag_enabled, mock_get_client):📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @pytest.mark.django_db | |
| @mock.patch('awx.main.tasks.jobs.get_workload_identity_client') | |
| @mock.patch('awx.main.tasks.jobs.flag_enabled') | |
| def test_request_workload_identity_token_feature_flag_disabled(mock_flag_enabled, mock_get_client): | |
| """Test that JWT request is skipped when feature flag is disabled.""" | |
| # Disable the feature flag | |
| mock_flag_enabled.return_value = False | |
| # Create a test job | |
| job = Job(id=789, name='Test Job') | |
| # Create a BaseTask instance and call pre_run_hook | |
| task = jobs.BaseTask() | |
| task.pre_run_hook(job, '/tmp/private_data') | |
| # Verify client was never called | |
| mock_get_client.assert_not_called() | |
| `@pytest.mark.django_db` | |
| `@mock.patch`('awx.main.tasks.jobs.get_workload_identity_client') | |
| `@mock.patch`('awx.main.tasks.jobs.flag_enabled') | |
| `@mock.patch`('awx.main.tasks.jobs.create_partition', return_value=True) | |
| def test_request_workload_identity_token_feature_flag_disabled(mock_create_partition, mock_flag_enabled, mock_get_client): | |
| """Test that JWT request is skipped when feature flag is disabled.""" | |
| # Disable the feature flag | |
| mock_flag_enabled.return_value = False | |
| # Create a test job | |
| job = Job(id=789, name='Test Job') | |
| # Create a BaseTask instance and call pre_run_hook | |
| task = jobs.BaseTask() | |
| task.pre_run_hook(job, '/tmp/private_data') | |
| # Verify client was never called | |
| mock_get_client.assert_not_called() |
🧰 Tools
🪛 Ruff (0.15.1)
[error] 506-506: Probable insecure usage of temporary file or directory: "/tmp/private_data"
(S108)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@awx/main/tests/unit/tasks/test_jobs.py` around lines 493 - 509, The test
test_request_workload_identity_token_feature_flag_disabled can error before the
assertion because pre_run_hook on jobs.BaseTask calls
create_partition(instance.event_class._meta.db_table, start=instance.created);
mock create_partition the same way other pre/post hook tests do so
instance.created=None won’t trigger real partition logic. Update the test to
patch/mocking create_partition (the function used by pre_run_hook) before
invoking jobs.BaseTask().pre_run_hook, leaving mocks for flag_enabled and
get_workload_identity_client unchanged so the test can reach the
mock_get_client.assert_not_called() assertion.
| # Feature flag for workload identity JWT integration (AAP-62693) | ||
| FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED = True |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check all settings files for FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED
rg -n "FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED" --type py -A1Repository: ansible/awx
Length of output: 367
🏁 Script executed:
#!/bin/bash
# Find all settings files in the directory
fd -e py settings/ awx/settings/ | head -20
# Also check specifically for production or defaults
rg -l "production|PRODUCTION" awx/settings/ --type pyRepository: ansible/awx
Length of output: 193
🏁 Script executed:
#!/bin/bash
# Check production_defaults.py for the flag
echo "=== production_defaults.py ==="
rg -n "FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED" awx/settings/production_defaults.py -A1
echo "Exit code: $?"
# Check defaults.py as well
echo -e "\n=== defaults.py ==="
rg -n "FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED" awx/settings/defaults.py -A1
echo "Exit code: $?"
# Also check if there's any inheritance or defaults mechanism
echo -e "\n=== All settings file references to FEATURE_OIDC ==="
rg -n "FEATURE_OIDC" awx/settings/ --type pyRepository: ansible/awx
Length of output: 272
🏁 Script executed:
#!/bin/bash
# Find the flag_enabled function definition
echo "=== Finding flag_enabled function ==="
rg -n "def flag_enabled" --type py -A5
# Check how settings files are loaded/inherited
echo -e "\n=== Checking settings/__init__.py for initialization logic ==="
head -50 awx/settings/__init__.py
# Look for which settings file is loaded by default
echo -e "\n=== Check for DJANGO_SETTINGS_MODULE or settings loading ==="
rg -n "development_defaults\|production_defaults" awx/settings/ --type py | head -20Repository: ansible/awx
Length of output: 2071
🏁 Script executed:
#!/bin/bash
# Search for flag_enabled in entire codebase
echo "=== Searching for flag_enabled definition ==="
rg -n "flag_enabled" --type py | grep -E "(def|from|import)" | head -20
# Search specifically in main directory
echo -e "\n=== Checking awx/main for flag_enabled ==="
rg -n "def flag_enabled\|from.*flag_enabled" awx/main/ --type py
# Check if it's from an external library
echo -e "\n=== Check imports in jobs.py ==="
head -40 awx/main/tasks/jobs.py | grep -E "(import|from)"Repository: ansible/awx
Length of output: 1509
🏁 Script executed:
#!/bin/bash
# Check the test for feature flag disabled
echo "=== Test for workload identity token with feature flag disabled ==="
sed -n '496,520p' awx/main/tests/unit/tasks/test_jobs.py
# Check if there's any documentation or setup in the test file about flag defaults
echo -e "\n=== Check test file setup and imports ==="
head -100 awx/main/tests/unit/tasks/test_jobs.py | grep -A5 -B5 "flag_enabled\|mock"
# Search for how flags library is configured
echo -e "\n=== Search for FLAGS configuration in settings ==="
rg -n "FLAGS" awx/settings/ --type pyRepository: ansible/awx
Length of output: 3943
🏁 Script executed:
#!/bin/bash
# Search for FLAGS related settings that might define defaults
echo "=== Search for FLAGS or flag defaults ==="
rg -n "FLAGS\|flag.*default" awx/settings/ --type py -i
# Check if production_defaults.py explicitly handles this flag
echo -e "\n=== Check production_defaults.py content around feature flags ==="
rg -n "FEATURE_" awx/settings/production_defaults.py --type py | head -20
# Check defaults.py for any feature flags
echo -e "\n=== Check defaults.py for FEATURE_ flags ==="
rg -n "FEATURE_" awx/settings/defaults.py --type py | head -20Repository: ansible/awx
Length of output: 305
🏁 Script executed:
#!/bin/bash
# Check the context around FEATURE_ flags in defaults.py
echo "=== Context around FEATURE_ flags in defaults.py ==="
sed -n '1115,1140p' awx/settings/defaults.py
# Check if there are more FEATURE_ flags
echo -e "\n=== All FEATURE_ flags in codebase ==="
rg -n "FEATURE_.*=" --type py | grep -E "defaults|settings"Repository: ansible/awx
Length of output: 2406
Add FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED = False to awx/settings/defaults.py for consistency and production safety.
The flag is currently defined only in development_defaults.py:74 as True, but is missing from the base defaults.py. This breaks the established pattern where all FEATURE_* flags are explicitly defined in defaults.py as False (like FEATURE_POLICY_AS_CODE_ENABLED and FEATURE_INDIRECT_NODE_COUNTING_ENABLED). Without an explicit False in the base settings, production deployments lack a guaranteed default state for this flag. Add the flag to defaults.py to match the pattern and ensure production explicitly disables it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@awx/settings/development_defaults.py` around lines 73 - 74, Add
FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED = False to the base defaults so the
feature flag has an explicit production-safe default; mirror the pattern used
for other flags like
FEATURE_POLICY_AS_CODE_ENABLED/FEATURE_INDIRECT_NODE_COUNTING_ENABLED in
defaults.py, leaving the existing FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED = True
in development_defaults.py to override the base for development.
|
Closed in favor of #16296 |
Supersedes: #16277
SUMMARY
Integrates the workload identity API client (from AAP-62691) into Controller's job execution flow to automatically request JWT tokens from Gateway when launching jobs that require them.
Design Decisions:
Related:
ISSUE TYPE
COMPONENT NAME
ADDITIONAL INFORMATION
Summary by CodeRabbit
New Features
Tests