fix(source-salesforce): Refresh Bulk API access token to prevent INVALID_SESSION_ID errors (AI-Triage PR)#75201
…LID_SESSION_ID errors

Replace static InterpolatedStringTokenProvider with a new SalesforceTokenProvider that proactively refreshes the access token every 20 minutes (well before the default 2-hour Salesforce session timeout). This prevents INVALID_SESSION_ID errors during long-running Bulk API syncs.

Root cause: The Bulk API declarative streams captured the access token as a static string at initialization time via InterpolatedStringTokenProvider, which has no refresh mechanism. For syncs exceeding the session timeout, requests would fail with INVALID_SESSION_ID.

The fix wraps the Salesforce API object in a custom TokenProvider that calls login() to obtain a fresh token when the refresh interval has elapsed.

Co-Authored-By: bot_apk <apk@cognition.ai>
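The refresh behaviour described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the class name `RefreshingTokenProvider`, the `clock` parameter, and the `FakeSalesforce` test double are hypothetical, and the sketch assumes only what the description states, namely that `login()` updates `access_token` in place and that the provider exposes `get_token()`.

```python
import time
from typing import Callable


class RefreshingTokenProvider:
    """Hypothetical stand-in for the PR's SalesforceTokenProvider (not the actual code)."""

    def __init__(
        self,
        sf_api,
        refresh_interval_seconds: float = 20 * 60,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests (assumption)
    ) -> None:
        self._sf_api = sf_api
        self._refresh_interval = refresh_interval_seconds
        self._clock = clock
        # The token is assumed to be fresh when the provider is constructed.
        self._last_refresh = clock()

    def get_token(self) -> str:
        # Refresh only when the interval has elapsed since the last refresh.
        if self._clock() - self._last_refresh >= self._refresh_interval:
            self._sf_api.login()  # assumed to update access_token in place
            self._last_refresh = self._clock()
        return self._sf_api.access_token


class FakeSalesforce:
    """Test double; the real API object would perform a network login."""

    def __init__(self) -> None:
        self.login_calls = 0
        self.access_token = "token-0"

    def login(self) -> None:
        self.login_calls += 1
        self.access_token = f"token-{self.login_calls}"
```

Because the provider reads `access_token` from the API object on every call rather than copying it once, a later `login()` is visible to every request that asks the provider for a token.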
↪️ Triggering Reason: Draft PR with CI fully green (36 checks passed, 107/107 tests pass). Fixes Salesforce Bulk API token refresh — straightforward |
Fix Validation Evidence
Outcome: Could not Run Live Tests. Regression tests passed; live connection testing blocked pending human approval for version pinning.
Evidence Summary
Regression tests completed successfully with no regressions detected. The pre-release version
Next Steps
Connector & PR Details
Connector:
Evidence Plan
Proving Criteria
A Bulk API sync that previously failed with
Disproving Criteria
The same
Cases Attempted
Pre-flight Checks
Detailed Evidence Log
Note: Connection IDs and detailed logs are recorded in the linked private issue.
What
Resolves https://github.com/airbytehq/oncall/issues/11700:
Long-running Salesforce Bulk API syncs fail with INVALID_SESSION_ID because the access token is captured as a static string at stream initialization and never refreshed. When syncs exceed the Salesforce session timeout (default 2 hours), all subsequent API requests fail.

How
Introduces a new SalesforceTokenProvider class that replaces the CDK's InterpolatedStringTokenProvider. Instead of storing a static token string, the new provider:
- Holds a reference to the Salesforce API object
- On each get_token() call, checks whether 20 minutes have elapsed since the last refresh (well before the default 2-hour session timeout)
- If so, calls sf_api.login() to obtain a fresh access token
- Returns sf_api.access_token (which login() updates in-place)

The previous code at streams.py:533 captured self.sf_api.access_token as a static string at construction time, so it could never change.

Review guide
- source_salesforce/streams.py: The only substantive change. Review the new SalesforceTokenProvider class (lines 72–93) and its usage (line 562).

Human review checklist (key risk areas):
- TokenProvider ABC contract: is get_token() -> str the only required method? (No __post_init__ or dataclass concerns?)
- login() is called without a lock. If multiple threads share this provider, concurrent refreshes could race. Is this a concern given how Bulk API streams use it?
- If login() fails (network error, invalid refresh token), the exception will propagate up to the caller. Is that acceptable, or should it be caught and retried?
- [TBD] is used as the PR number placeholder and needs updating before merge.

User Impact
Salesforce Bulk API syncs that run longer than the session timeout (default 2 hours) will no longer fail with INVALID_SESSION_ID. The token is proactively refreshed every 20 minutes. No user-facing configuration changes.

Can this PR be safely reverted and rolled back?
Link to Devin session: https://app.devin.ai/sessions/c3b2e52fb6dd49939e30b5da28fa5487
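Two open questions in the review checklist, the missing lock around login() and the unhandled login() failure, could be addressed along the lines of the sketch below. It is an illustration under stated assumptions, not part of the PR: `LockedRefreshingTokenProvider`, `max_login_attempts`, the `clock` parameter, and the `FlakySalesforce` test double are hypothetical names, and whether Bulk API streams actually share the provider across threads is not established here.

```python
import threading
import time
from typing import Callable


class LockedRefreshingTokenProvider:
    """Hypothetical variant: serialises refreshes and retries login() a bounded number of times."""

    def __init__(
        self,
        sf_api,
        refresh_interval_seconds: float = 20 * 60,
        max_login_attempts: int = 3,  # assumption; the PR does not define a retry policy
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sf_api = sf_api
        self._refresh_interval = refresh_interval_seconds
        self._max_login_attempts = max_login_attempts
        self._clock = clock
        self._last_refresh = clock()
        self._lock = threading.Lock()

    def get_token(self) -> str:
        # The lock ensures only one thread refreshes; the others then see the
        # updated timestamp and skip the login call.
        with self._lock:
            if self._clock() - self._last_refresh >= self._refresh_interval:
                self._login_with_retry()
                self._last_refresh = self._clock()
            return self._sf_api.access_token

    def _login_with_retry(self) -> None:
        for attempt in range(1, self._max_login_attempts + 1):
            try:
                self._sf_api.login()
                return
            except Exception:
                # Re-raise once the attempts are exhausted so the caller still sees the failure.
                if attempt == self._max_login_attempts:
                    raise


class FlakySalesforce:
    """Test double whose login() fails a fixed number of times before succeeding."""

    def __init__(self, failures: int) -> None:
        self._failures_left = failures
        self.successful_logins = 0
        self.access_token = "token-0"

    def login(self) -> None:
        if self._failures_left > 0:
            self._failures_left -= 1
            raise ConnectionError("simulated login failure")
        self.successful_logins += 1
        self.access_token = f"token-{self.successful_logins}"
```

A production version would likely add a delay between attempts and catch only the exception types that login() is known to raise; both are left out here to keep the sketch short.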