ci: Connector Registry 2.0 Production Launch (replaces legacy metadata_service)#75224

Merged
Aaron ("AJ") Steers (aaronsteers) merged 13 commits into master from
devin/1773952030-productionalize-ops-cli-registry
Mar 19, 2026

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented Mar 19, 2026

What

Promotes the ops CLI registry pipeline from soft-launch (parallel validation) to production, fully replacing the legacy Poetry-based metadata_service registry generation and the legacy run-airbyte-ci RC promote/rollback steps.

This is the cutover step following successful soft-launch validation where the ops CLI ran alongside the legacy pipeline with continue-on-error: true targeting coral:dev/soft-launch-trial.

Tracking: https://github.com/airbytehq/airbyte-ops-mcp/issues/504
Post-launch test plan: https://github.com/airbytehq/airbyte-ops-mcp/issues/560
Operations docs: https://github.com/airbytehq/airbyte-ops-mcp/issues/561

How

publish_connectors.yml (per-connector artifact generation + publish):

  • Remove legacy poetry run metadata_service generate-registry-entry steps (both OSS and Cloud)
  • Remove Poetry and metadata_service install from the publish_connector_registry_entries job
  • Promote ops CLI generate + publish steps: remove continue-on-error: true, remove soft-launch naming
  • Switch REGISTRY_STORE from coral:dev/soft-launch-trial to coral:prod
  • Artifact generation runs unconditionally (local-only, no reason to skip on dry-run); only the publish step retains a dry-run guard
  • CI artifacts upload with if: always() so artifacts are available for debugging even on validation failure
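
For readers who have not opened the diff, the generate/publish/upload split described above could look roughly like this. It is a minimal sketch, not the actual workflow: the step names, the `inputs.dry_run` input, the `registry-artifacts/` path, and the placeholder `echo` commands are assumptions for illustration. Only the unconditional generation, the dry-run guard on publish, and `if: always()` on the upload come from the description.

```yaml
# Sketch only: step names, the dry_run input, and paths are hypothetical.
steps:
  # Generation writes local files only, so it runs even on a dry-run.
  - name: Generate registry artifacts
    shell: bash
    run: echo "placeholder for the ops CLI artifact generation command"

  # Publishing writes to the registry store, so it keeps the dry-run guard.
  - name: Publish registry artifacts
    if: ${{ inputs.dry_run != true }}
    shell: bash
    env:
      REGISTRY_STORE: "coral:prod"
    run: echo "placeholder for the ops CLI publish command"

  # always() keeps this step running after an earlier step fails,
  # so the generated files stay available for debugging.
  - name: Upload registry artifacts for debugging
    if: always()
    uses: actions/upload-artifact@v4
    with:
      name: registry-artifacts
      path: registry-artifacts/
```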

generate-connector-registries.yml (full registry compilation):

  • Remove legacy jobs: generate-cloud-registry, generate-oss-registry, post-registry-generation (4 jobs → 1 job)
  • Promote ops CLI compile job: remove continue-on-error: true, remove soft-launch naming
  • Switch REGISTRY_STORE from coral:dev/soft-launch-trial to coral:prod
  • Add --with-legacy-migration v1 to clean up legacy cloud.json/oss.json for disabled connectors

finalize_rollout.yml (RC promote/rollback):

  • Replace legacy run-airbyte-ci promote + rollback steps with a single ops CLI registry rc command
  • Simplify checkout (remove GitHub App auth; keep lightweight actions/checkout@v4 with fetch-depth: 1)
  • Remove all legacy secrets (Dagger, Docker Hub, Sentry, Slack, spec cache) — the ops CLI only needs GCS_CREDENTIALS
  • Downgrade runner from connector-publish-large to ubuntu-24.04 (ops CLI is lightweight)
  • Add CONNECTOR_NAME job-level env var to handle both workflow_dispatch and repository_dispatch event types
  • Pass --with-pr and --with-store-cleanup flags to the RC command
  • Add descriptive header comments documenting the Temporal workflow "finalizeRollout" that calls this workflow
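
Put together, the job-level changes listed above might be laid out as follows. This is a sketch assembled from the bullets, not a copy of the merged file: the job id is hypothetical, the ops CLI install step is omitted, and how `ACTION` gets set is not described in this PR text, so it is left out.

```yaml
# Sketch assembled from the PR description; the job id is hypothetical.
jobs:
  finalize_rollout:
    # Standard hosted runner instead of connector-publish-large.
    runs-on: ubuntu-24.04
    env:
      # Manual runs read the dispatch input; platform-triggered runs read
      # the repository_dispatch payload.
      CONNECTOR_NAME: ${{ github.event_name == 'workflow_dispatch' && github.event.inputs.connector_name || github.event.client_payload.connector_name }}
    steps:
      # Lightweight checkout: no GitHub App token, latest commit only.
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 1

      # The ops CLI install step and the registry rc step would follow here.
```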

Review guide

  1. publish_connectors.yml — Focus on the removal of legacy steps and the promotion of ops CLI steps. Verify that the publish step's dry-run guard is correct, and that artifact generation intentionally runs unconditionally (it's local-only).
  2. generate-connector-registries.yml — Verify the single ops CLI compile job covers what the 4 legacy jobs did (cloud registry, oss registry, secrets mask, registry report).
  3. finalize_rollout.yml: This workflow was NOT soft-launched (parallel execution was skipped due to state mutation concerns). Verify the ops CLI registry rc command correctly handles both promote and rollback actions via the ACTION env var, and that the --with-pr and --with-store-cleanup flags are correct for production use.

⚠️ Items for reviewer attention:

  • finalize_rollout.yml has no soft-launch history — unlike the other two workflows, this is going directly from legacy to ops CLI. The registry rc command has been tested manually but not via this workflow path in production.
  • New --with-pr and --with-store-cleanup flags — These were added to the RC command. Confirm these flags are production-ready and correctly implement the workflow's documented behavior (creating an auto-merge PR and cleaning up the release_candidate directory).
  • The CONNECTOR_NAME expression (${{ github.event_name == 'workflow_dispatch' && github.event.inputs.connector_name || github.event.client_payload.connector_name }}) handles both event types — confirm this is correct for repository_dispatch payloads from the platform.
  • The checkout step was re-added (simplified from the legacy version) — presumably needed for --with-pr to create the version-bump PR. Confirm fetch-depth: 1 is sufficient.
  • The legacy generate-registry-entry accepted a --pre-release / --main-release flag. The ops CLI artifacts generate does not receive this flag — please confirm it handles RC/preview releases correctly without it.
  • The legacy post-registry-generation job ran generate-registry-report — this has no ops CLI equivalent in this PR. Confirm this report is either handled by the ops CLI compile or is acceptable to drop.
  • The legacy steps passed SLACK_TOKEN, SENTRY_DSN, etc. for error reporting. The ops CLI steps do not — confirm error observability is acceptable.
  • Enterprise repo (airbyte-enterprise) calls publish_connectors.yml@master — once this merges, enterprise publishes immediately use the new pipeline.
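
One property of the CONNECTOR_NAME expression that may help when checking it: in GitHub Actions expressions, `A && B || C` falls through to `C` whenever `B` is falsy, including an empty string. A workflow_dispatch run with an empty connector_name input would therefore resolve to the (absent) client_payload value and end up empty. A guard step along these lines would fail fast in that case. The step is hypothetical and not part of this PR; it assumes CONNECTOR_NAME is set at job level as described.

```yaml
# Hypothetical guard step, not part of this PR.
- name: Verify CONNECTOR_NAME resolved
  shell: bash
  env:
    EVENT_NAME: ${{ github.event_name }}
  run: |
    # CONNECTOR_NAME is inherited from the job-level env expression.
    if [ -z "${CONNECTOR_NAME:-}" ]; then
      echo "::error::CONNECTOR_NAME is empty for event '${EVENT_NAME}'"
      exit 1
    fi
    echo "Resolved connector '${CONNECTOR_NAME}' from ${EVENT_NAME}"
```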

Human review checklist

  • Confirm finalize_rollout.yml CONNECTOR_NAME expression works for both workflow_dispatch and repository_dispatch
  • Verify registry rc command handles both promote and rollback actions correctly
  • Confirm --with-pr and --with-store-cleanup flags are production-ready and implement the documented workflow behavior
  • Verify checkout with fetch-depth: 1 is sufficient for the --with-pr flag to create version-bump PRs
  • Confirm runner downgrade to ubuntu-24.04 is acceptable for the ops CLI workload
  • Verify REGISTRY_STORE is set to coral:prod in all three workflows

User Impact

No direct user-facing impact. Connector publishing, registry compilation, and RC promote/rollback will now use the ops CLI as the sole path instead of the legacy metadata_service and run-airbyte-ci. If the ops CLI encounters issues, these operations will now fail; during soft-launch, ops CLI failures did not fail the job because the steps ran with continue-on-error: true.
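
The behavioral difference comes down to one key on the step. The contrast below is illustrative only: the step names and the `exit 1` stand-in are hypothetical, not taken from the workflows.

```yaml
# Illustrative only: step names and the failing command are hypothetical.

# Soft-launch shape: the failure is recorded on the step, but the job
# continues and can still finish green.
- name: "[soft-launch] Publish registry artifacts"
  continue-on-error: true
  run: exit 1  # stand-in for a failing ops CLI call

# Production shape: without continue-on-error, the same failure fails the job.
- name: Publish registry artifacts
  run: exit 1  # stand-in for a failing ops CLI call
```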

Can this PR be safely reverted and rolled back?

  • YES 💚

Reverting this PR restores the legacy metadata_service and run-airbyte-ci pipelines. However, if connectors have been published to coral:prod via the ops CLI after this merges, reverting may cause registry inconsistencies unless the legacy pipeline can handle the ops CLI-generated artifacts.


Link to Devin run: https://app.devin.ai/sessions/f900274cb0884bf99e399b1c40c48067
Requested by: Aaron ("AJ") Steers (@aaronsteers)

…ta_service

- Remove legacy Poetry-based registry entry generation (OSS + Cloud) from publish_connectors.yml
- Promote ops CLI generate + publish steps as the primary registry pipeline
- Remove continue-on-error: true from ops CLI steps (now required, not soft-launch)
- Change REGISTRY_STORE from coral:dev/soft-launch-trial to coral:prod
- Add dry-run skip steps for registry artifact generation and publish
- Remove Poetry and metadata_service install from publish_connector_registry_entries job
- Replace legacy generate-cloud-registry, generate-oss-registry, and post-registry-generation
  jobs with single ops CLI compile job in generate-connector-registries.yml
- Remove soft-launch naming prefixes from all step names

Co-Authored-By: AJ Steers <aj@airbyte.io>

octavia-bot bot commented Mar 19, 2026

Note

📝 PR Converted to Draft

Thank you for creating this PR. As a policy to protect our engineers' time, Airbyte requires all PRs to be created first in draft status. Your PR has been automatically converted to draft status in respect for this policy.

As soon as your PR is ready for formal review, you can proceed to convert the PR to "ready for review" status by clicking the "Ready for review" button at the bottom of the PR page.

To skip draft status in future PRs, please include [ready] in your PR title or add the skip-draft-status label when creating your PR.

@octavia-bot octavia-bot bot marked this pull request as draft March 19, 2026 20:30
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

  • 🛠️ Quick Fixes
    • /format-fix - Fixes most formatting issues.
    • /bump-version - Bumps connector versions, scraping changelog description from the PR title.
  • ❇️ AI Testing and Review (internal link: AI-SDLC Docs):
    • /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
    • /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
    • /ai-review - AI-powered PR review for connector safety and quality gates.
  • 🚀 Connector Releases:
    • /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
    • /bump-progressive-rollout-version - Bumps connector version with an RC suffix (2.16.10-rc.1) for progressive rollouts (enableProgressiveRollout: true).
      • Example: /bump-progressive-rollout-version changelog="Add new feature for progressive rollout"
  • ☕️ JVM connectors:
    • /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
      Example: /update-connector-cdk-version connector=destination-bigquery
  • 🐍 Python connectors:
    • /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
    • /poe source example lock - Alias for /poe connector source-example lock.
    • /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
    • /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
  • ⚙️ Admin commands:
    • /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
      Example: /force-merge reason="CI is flaky, tests pass locally"

@devin-ai-integration devin-ai-integration bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 3 additional findings.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) marked this pull request as ready for review March 19, 2026 20:39
@aaronsteers Aaron ("AJ") Steers (aaronsteers) changed the title from "ci: productionalize ops CLI registry pipeline - replace legacy metadata_service" to "ci: Connector Registry 2.0 Production Launch (replaces legacy metadata_service)" Mar 19, 2026
Artifact generation is local-only, so there's no reason to skip it during
dry-run. Only the publish step (which writes to GCS) needs the dry-run guard.
Also removes the now-unnecessary [DRY-RUN] skip step for generation.

Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
devin-ai-integration bot and others added 2 commits March 19, 2026 21:17
Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>
devin-ai-integration bot and others added 2 commits March 19, 2026 21:30
devin-ai-integration bot and others added 2 commits March 19, 2026 22:25
Co-Authored-By: AJ Steers <aj@airbyte.io>
Co-Authored-By: AJ Steers <aj@airbyte.io>

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View 10 additional findings in Devin Review.

Comment on lines +42 to +53
- name: "Promote or Rollback RC: ${{ env.ACTION }} ${{ env.CONNECTOR_NAME }}"
  id: finalize-release-candidate
  shell: bash
  env:
    GCS_CREDENTIALS: ${{ secrets.METADATA_SERVICE_PROD_GCS_CREDENTIALS }}
    REGISTRY_STORE: "coral:prod"
  run: >
    airbyte-ops registry rc "${ACTION}"
    --name "${CONNECTOR_NAME}"
    --store "${REGISTRY_STORE}"
    --with-pr
    --with-store-cleanup

🔴 Missing GitHub App authentication for PR creation in finalize_rollout.yml

The old workflow explicitly authenticated as the OCTAVIA_BOT GitHub App with the comment: "Authenticate as the GitHub App to ensure CI can run. This is necessary because commits created with the built-in GitHub token will not trigger workflows." The token was passed via github_token: ${{ steps.get-app-token.outputs.token }}. The new workflow uses --with-pr (which creates a PR) but removes all GitHub App authentication entirely — no GITHUB_TOKEN or GH_TOKEN env var is passed to the ops CLI step, only GCS_CREDENTIALS and REGISTRY_STORE. This means PRs created by the workflow will either fail (if the CLI requires explicit auth) or use the default GITHUB_TOKEN, which GitHub explicitly does not allow to trigger downstream workflows. This same pattern is confirmed in bump-progressive-rollout-version-command.yml:73-74 which comments: "Important that token is a PAT so that CI checks are triggered again. Without this we would be forever waiting on required checks to pass."

Prompt for agents
In .github/workflows/finalize_rollout.yml, the GitHub App authentication step that existed in the old workflow was removed. The old workflow used actions/create-github-app-token with OCTAVIA_BOT_APP_ID and OCTAVIA_BOT_PRIVATE_KEY secrets to create a token, specifically because PRs/commits created with the default GITHUB_TOKEN do not trigger CI workflows.

To fix this:
1. Add back the GitHub App authentication step before the 'Promote or Rollback RC' step (around line 36, after the 'Install Ops CLI' step):
   - name: Authenticate as GitHub App
     uses: actions/create-github-app-token@d72941d797fd3113feb6b93fd0dec494b13a2547
     id: get-app-token
     with:
       owner: airbytehq
       repositories: airbyte
       app-id: ${{ secrets.OCTAVIA_BOT_APP_ID }}
       private-key: ${{ secrets.OCTAVIA_BOT_PRIVATE_KEY }}

2. Then pass the token to the ops CLI step as an env var (e.g., GITHUB_TOKEN or GH_TOKEN, depending on what the airbyte-ops CLI expects):
   env:
     GCS_CREDENTIALS: ${{ secrets.METADATA_SERVICE_PROD_GCS_CREDENTIALS }}
     REGISTRY_STORE: coral:prod
     GITHUB_TOKEN: ${{ steps.get-app-token.outputs.token }}

This ensures PRs created by --with-pr will trigger CI workflows.

@aaronsteers Aaron ("AJ") Steers (aaronsteers) merged commit d33a0f0 into master Mar 19, 2026
44 checks passed
@aaronsteers Aaron ("AJ") Steers (aaronsteers) deleted the devin/1773952030-productionalize-ops-cli-registry branch March 19, 2026 23:34