fix(source-salesforce): Refresh Bulk API access token to prevent INVALID_SESSION_ID errors (AI-Triage PR) #75201

Draft
devin-ai-integration[bot] wants to merge 2 commits into master from devin/1773874194-fix-salesforce-bulk-api-token-refresh

@devin-ai-integration devin-ai-integration bot commented Mar 18, 2026

What

Resolves https://github.com/airbytehq/oncall/issues/11700:

Long-running Salesforce Bulk API syncs fail with INVALID_SESSION_ID because the access token is captured as a static string at stream initialization and never refreshed. When syncs exceed the Salesforce session timeout (default 2 hours), all subsequent API requests fail.

How

Introduces a new SalesforceTokenProvider class that replaces the CDK's InterpolatedStringTokenProvider. Instead of storing a static token string, the new provider:

  1. Holds a reference to the Salesforce API object
  2. On each get_token() call, checks whether 20 minutes have elapsed since the last refresh (well before the default 2-hour session timeout)
  3. If the interval has elapsed, calls sf_api.login() to obtain a fresh access token
  4. Returns the current sf_api.access_token (which login() updates in-place)
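The four steps above can be sketched roughly as follows. This is a minimal, self-contained illustration under stated assumptions, not the code in this PR: `FakeSalesforce` is a hypothetical stand-in for the connector's Salesforce API object, the `clock` parameter is a hypothetical hook added so the timing logic can be exercised without waiting, and the real class would presumably subclass the CDK's TokenProvider rather than stand alone.

```python
import time
from typing import Callable


class FakeSalesforce:
    """Hypothetical stand-in for the connector's Salesforce API object."""

    def __init__(self) -> None:
        self.access_token = "token-0"
        self.login_calls = 0

    def login(self) -> None:
        # The real login() would talk to Salesforce; here we only mint a
        # new value so that a refresh is observable.
        self.login_calls += 1
        self.access_token = f"token-{self.login_calls}"


class SalesforceTokenProvider:
    """Returns sf_api.access_token, calling login() once the interval has elapsed."""

    REFRESH_INTERVAL_SECONDS = 20 * 60  # 20 minutes, per the PR description

    def __init__(self, sf_api, clock: Callable[[], float] = time.monotonic) -> None:
        self._sf_api = sf_api        # step 1: hold a reference, not a token string
        self._clock = clock          # assumption: injectable clock for testing
        self._last_refresh = clock()

    def get_token(self) -> str:
        now = self._clock()
        # Step 2: has the refresh interval elapsed since the last refresh?
        if now - self._last_refresh >= self.REFRESH_INTERVAL_SECONDS:
            self._sf_api.login()     # step 3: login() updates access_token in place
            self._last_refresh = now
        return self._sf_api.access_token  # step 4: always read the current value
```

Whether the PR measures elapsed time with `time.monotonic()` or another clock is not stated in the description; a monotonic clock is used here only because it is unaffected by wall-clock adjustments.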

The previous code at streams.py:533:

token_provider=InterpolatedStringTokenProvider(api_token=self.sf_api.access_token, ...)

captured the value of self.sf_api.access_token as a plain string at construction time, so the provider kept returning that original token even after sf_api obtained a new one.
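The difference between handing a provider the token value and handing it the API object can be shown with a small standalone sketch. All class names here (`Api`, `StaticProvider`, `LiveProvider`) are hypothetical and only mimic the shape of the real objects; the CDK's InterpolatedStringTokenProvider is not used.

```python
class Api:
    """Hypothetical API object whose login() replaces the access token."""

    def __init__(self) -> None:
        self.access_token = "old"

    def login(self) -> None:
        self.access_token = "new"  # rebinds the attribute to a new string


class StaticProvider:
    """Receives the token value up front, like the previous code did."""

    def __init__(self, api_token: str) -> None:
        self._api_token = api_token

    def get_token(self) -> str:
        return self._api_token


class LiveProvider:
    """Keeps a reference to the API object and reads the token on demand."""

    def __init__(self, api: Api) -> None:
        self._api = api

    def get_token(self) -> str:
        return self._api.access_token


api = Api()
static = StaticProvider(api_token=api.access_token)
live = LiveProvider(api)

api.login()  # simulate a token refresh after both providers were built

print(static.get_token())  # old
print(live.get_token())    # new
```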

Review guide

  1. source_salesforce/streams.py — The only substantive change. Review the new SalesforceTokenProvider class (lines 72–93) and its usage (line 562).

Human review checklist (key risk areas):

  • TokenProvider ABC contract: is get_token() -> str the only required method? (No __post_init__ or dataclass concerns?)
  • Thread safety: login() is called without a lock. If multiple threads share this provider, concurrent refreshes could race. Is this a concern given how Bulk API streams use it?
  • Error handling: if login() fails (network error, invalid refresh token), the exception will propagate up to the caller. Is that acceptable, or should it be caught and retried?
  • No unit tests are included for the new class.
  • Changelog entry currently uses [TBD] as the PR number placeholder — needs updating before merge.
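For the thread-safety and unit-test items in the checklist, one possible direction is sketched below. It is an illustration under assumptions, not part of this PR: `CountingApi` and `LockedTokenProvider` are hypothetical names, and whether Bulk API streams actually share one provider across threads is exactly the open question raised above. The lock makes the check-and-refresh step atomic, so concurrent callers trigger at most one login() per interval. If login() raises, the exception still propagates and the last-refresh time is left unchanged, so the next call attempts the refresh again.

```python
import threading
import time
from typing import Callable


class CountingApi:
    """Hypothetical stand-in for the Salesforce API object; counts logins."""

    def __init__(self) -> None:
        self.access_token = "token-0"
        self.login_calls = 0

    def login(self) -> None:
        self.login_calls += 1
        self.access_token = f"token-{self.login_calls}"


class LockedTokenProvider:
    """Token provider whose check-and-refresh step is guarded by a lock."""

    def __init__(
        self,
        sf_api,
        refresh_interval: float = 20 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._sf_api = sf_api
        self._refresh_interval = refresh_interval
        self._clock = clock  # assumption: injectable so tests can fake time
        self._last_refresh = clock()
        self._lock = threading.Lock()

    def get_token(self) -> str:
        with self._lock:
            now = self._clock()
            if now - self._last_refresh >= self._refresh_interval:
                self._sf_api.login()
                # Only reached when login() succeeded.
                self._last_refresh = now
            return self._sf_api.access_token


def run_concurrent_refresh_check() -> tuple:
    """Unit-test style check: many threads, interval elapsed, one login."""
    fake_now = [0.0]
    api = CountingApi()
    provider = LockedTokenProvider(api, clock=lambda: fake_now[0])
    fake_now[0] = 20 * 60  # jump past the refresh interval

    tokens = []
    threads = [
        threading.Thread(target=lambda: tokens.append(provider.get_token()))
        for _ in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return api.login_calls, tokens
```

Holding the lock during login() serializes callers while a refresh is in flight, which seems acceptable for a call that runs roughly once every 20 minutes but is a trade-off worth confirming.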

User Impact

Salesforce Bulk API syncs that run longer than the session timeout (default 2 hours) will no longer fail with INVALID_SESSION_ID. The token is proactively refreshed every 20 minutes. No user-facing configuration changes.

Can this PR be safely reverted and rolled back?

  • YES 💚

Link to Devin session: https://app.devin.ai/sessions/c3b2e52fb6dd49939e30b5da28fa5487

…LID_SESSION_ID errors

Replace static InterpolatedStringTokenProvider with a new SalesforceTokenProvider
that proactively refreshes the access token every 20 minutes (well before the
default 2-hour Salesforce session timeout). This prevents INVALID_SESSION_ID errors
during long-running Bulk API syncs.

Root cause: The Bulk API declarative streams captured the access token as a static
string at initialization time via InterpolatedStringTokenProvider, which has no
refresh mechanism. For syncs exceeding the session timeout, requests would fail
with INVALID_SESSION_ID.

The fix wraps the Salesforce API object in a custom TokenProvider that calls
login() to obtain a fresh token when the refresh interval has elapsed.

Co-Authored-By: bot_apk <apk@cognition.ai>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions
Contributor

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • 🛠️ Quick Fixes
    • /format-fix - Fixes most formatting issues.
    • /bump-version - Bumps connector versions, scraping changelog description from the PR title.
  • ❇️ AI Testing and Review (internal link: AI-SDLC Docs):
    • /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
    • /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
    • /ai-review - AI-powered PR review for connector safety and quality gates.
  • 🚀 Connector Releases:
    • /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
    • /bump-progressive-rollout-version - Bumps connector version with an RC suffix (2.16.10-rc.1) for progressive rollouts (enableProgressiveRollout: true).
      • Example: /bump-progressive-rollout-version changelog="Add new feature for progressive rollout"
  • ☕️ JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
  • 🐍 Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
  • ⚙️ Admin commands:
    • /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
      Example: /force-merge reason="CI is flaky, tests pass locally"

Co-Authored-By: bot_apk <apk@cognition.ai>

github-actions bot commented Mar 18, 2026

source-salesforce Connector Test Results

107 tests: 103 passed ✅, 4 skipped 💤, 0 failed ❌
2 suites, 2 files, 17s ⏱️

Results for commit 20cc051.

♻️ This comment has been updated with latest results.


github-actions bot commented Mar 18, 2026

Deploy preview for airbyte-docs ready!

✅ Preview
https://airbyte-docs-qm8zdwnlx-airbyte-growth.vercel.app

Built with commit 20cc051.
This pull request is being automatically deployed with vercel-action

@devin-ai-integration
Contributor Author

↪️ Triggering /ai-prove-fix per Hands-Free AI Triage Project triage next step.

Reason: Draft PR with CI fully green (36 checks passed; 107 tests: 103 passed, 4 skipped, 0 failed). Fixes Salesforce Bulk API token refresh with a straightforward TokenProvider replacement that prevents INVALID_SESSION_ID on long-running syncs.

https://github.com/airbytehq/oncall/issues/11700

Devin session


octavia-bot bot commented Mar 19, 2026

🔍 AI Prove Fix session starting... Running readiness checks and testing against customer connections. View playbook

Devin AI session created successfully!


devin-ai-integration bot commented Mar 19, 2026

Fix Validation Evidence

Outcome: Could not Run Live Tests — Regression tests passed; live connection testing blocked pending human approval for version pinning.

Evidence Summary

Regression tests completed successfully with no regressions detected. The pre-release version 2.7.19-preview.20cc051 was published and is ready for testing. Live connection testing could not proceed because the version pinning tool requires approval from an @airbyte.io team member, and the Slack escalation did not receive a response within the session window. The fix itself is sound: SalesforceTokenProvider refreshes the Bulk API access token every 20 minutes, directly addressing the INVALID_SESSION_ID errors that occur after the default 2-hour session timeout.

Next Steps
  1. Unblock live testing: A human with an @airbyte.io email needs to approve via the Slack escalation or post an approval comment on the oncall issue
  2. Once approved, pin the affected source actor to 2.7.19-preview.20cc051 and trigger a sync lasting 2+ hours
  3. Alternatively, proceed directly to /ai-canary-prerelease for broader canary testing — the regression test results and code review strongly support the fix
  4. The daily_hands_free_triage automation will monitor the release rollout after merge

Connector & PR Details

Connector: source-salesforce
PR: #75201
Pre-release Version Tested: 2.7.19-preview.20cc051
Detailed Results: https://github.com/airbytehq/oncall/issues/11700#issuecomment-4089804470

Evidence Plan

Proving Criteria

A Bulk API sync that previously failed with INVALID_SESSION_ID after ~2 hours completes successfully without session timeout errors.

Disproving Criteria

The same INVALID_SESSION_ID error persists after applying the fix, OR new errors appear that weren't present before.

Cases Attempted

  1. Regression tests: PASSED. No regressions detected comparing pre-release vs baseline. (Workflow)
  2. Live connection test (customer connection from oncall issue): BLOCKED. Qualified candidate identified (not pinned, enabled, on v2.7.18, actively syncing). Approval for version pinning requested via Slack escalation but not received within session window.

Pre-flight Checks

  • Viability: Fix addresses the reported issue — new SalesforceTokenProvider class refreshes token every 20 minutes
  • Safety: No malicious code or dangerous patterns
  • Breaking Change: No breaking changes detected (no schema type changes, field removals/renames, PK/cursor changes, spec changes, stream removals, state format changes)
  • Reversibility: Patch version bump (2.7.18 → 2.7.19), can be safely downgraded/reverted

Detailed Evidence Log

Timestamp (UTC)  | Event
2026-03-19 11:47 | Session started, initial status comment posted
2026-03-19 11:50 | Pre-flight checks completed; all passed
2026-03-19 11:52 | Pre-release 2.7.19-preview.20cc051 publish initiated
2026-03-19 11:54 | Regression tests triggered (run ID: b79b0d59-23ee-402d-a21f-33979e0efbb7)
2026-03-19 11:57 | Regression tests PASSED
2026-03-19 12:01 | Evidence plan posted to oncall issue
2026-03-19 12:07 | Slack escalation sent requesting approval for live testing
2026-03-19 12:28 | Customer sync (job 75535868) observed as cancelled after ~3h on v2.7.18 (155K records read)
2026-03-19 12:35 | No approval received; closing out with "Could not Run Live Tests"

Note: Connection IDs and detailed logs are recorded in the linked private issue.


Devin session


github-actions bot commented Mar 19, 2026

Pre-release Connector Publish Started

Publishing pre-release build for connector source-salesforce.
PR: #75201

Pre-release versions will be tagged as {version}-preview.20cc051
and are available for version pinning via the scoped_configuration API.

View workflow run
Pre-release Publish: SUCCESS

Docker image (pre-release):
airbyte/source-salesforce:2.7.19-preview.20cc051

Docker Hub: https://hub.docker.com/layers/airbyte/source-salesforce/2.7.19-preview.20cc051

Registry JSON:
