refactor(ingestion): rename capability summary to connector registry and split into package-based files #16106

Merged
shirshanka merged 1 commit into master from refactor-capability-summary
Feb 6, 2026

@shirshanka

Summary

This PR refactors the capability summary feature into a more scalable connector registry system with package-based organization:

Key Changes

  • Renamed terminology: capability_summary → connector_registry across the codebase
  • File reorganization: Single JSON file → directory structure with package-based files
    • connector_registry/datahub.json - connectors from datahub.* package
    • connector_registry/manifest.json - lists all available packages for discovery
  • Script refactoring:
    • scripts/capability_summary.py → scripts/connector_registry.py
    • Groups plugins by top-level package name from classname
    • Each package generates its own JSON file
  • Field changes: removed the generated_at timestamp, kept generated_by, added a package field
  • Frontend updates:
    • Fetches manifest first, then loads all package files in parallel
    • Merges all packages into single registry for use
  • Build system: Updated gradle tasks and CI workflows to use new directory structure
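
The grouping and per-package output described above can be sketched roughly as follows. This is a minimal illustration, not the code in scripts/connector_registry.py: the function names, the "plugin_details" key, and the "packages" key in the manifest are assumptions made for the example. Only the classname, generated_by, and package fields and the manifest.json / <package>.json file names come from this PR.

```python
import json
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def top_level_package(classname: str) -> str:
    """Return the first dotted component of a fully qualified class name."""
    return classname.split(".", 1)[0]


def group_by_package(plugins: Dict[str, dict]) -> Dict[str, Dict[str, dict]]:
    """Group plugin entries by the top-level package of their classname."""
    grouped: Dict[str, Dict[str, dict]] = defaultdict(dict)
    for plugin_id, details in plugins.items():
        grouped[top_level_package(details["classname"])][plugin_id] = details
    return dict(grouped)


def write_registry(
    plugins: Dict[str, dict], output_dir: Path, generated_by: str
) -> List[str]:
    """Write one JSON file per package plus a manifest; return package names."""
    output_dir.mkdir(parents=True, exist_ok=True)
    grouped = group_by_package(plugins)
    for package, package_plugins in grouped.items():
        payload = {
            "generated_by": generated_by,
            "package": package,
            # Assumed key name for the per-plugin entries.
            "plugin_details": package_plugins,
        }
        (output_dir / f"{package}.json").write_text(
            json.dumps(payload, indent=2, sort_keys=True) + "\n"
        )
    packages = sorted(grouped)
    # Assumed manifest shape: a sorted list of package names.
    manifest = {"generated_by": generated_by, "packages": packages}
    (output_dir / "manifest.json").write_text(
        json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    )
    return packages
```

Sorting keys and package names keeps the generated files stable between runs, which fits the removal of the generated_at timestamp: regenerating without real changes should produce no diff.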

Files Changed

  • .github/workflows/metadata-ingestion.yml - Updated CI task name
  • datahub-web-react/build.gradle - Updated copy task to handle directory
  • datahub-web-react/src/app/ingestV2/shared/capabilitySummary.ts → connectorRegistry.ts
  • datahub-web-react/src/app/ingestV2/shared/hooks/useCapabilitySummary.ts - Parallel loading logic
  • metadata-ingestion/build.gradle - Updated gradle task configuration
  • metadata-ingestion/scripts/capability_summary.py → connector_registry.py
  • metadata-ingestion/scripts/docgen.py - Loads from directory instead of single file
  • metadata-ingestion/src/datahub/ingestion/autogenerated/connector_registry/ - New directory structure
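
For the docgen.py change, loading from a directory instead of a single file might look like the sketch below. It is an illustration under assumptions rather than the actual implementation: the function name load_connector_registry, the "plugin_details" key, and the choice to fail on duplicate plugin ids are all hypothetical.

```python
import json
from pathlib import Path
from typing import Dict


def load_connector_registry(registry_dir: Path) -> Dict[str, dict]:
    """Merge plugin entries from every package file in registry_dir."""
    merged: Dict[str, dict] = {}
    for path in sorted(registry_dir.glob("*.json")):
        if path.name == "manifest.json":
            # The manifest only lists packages; it carries no plugin entries.
            continue
        data = json.loads(path.read_text())
        # "plugin_details" is an assumed key name for the per-plugin entries.
        for plugin_id, details in data.get("plugin_details", {}).items():
            if plugin_id in merged:
                raise ValueError(
                    f"Plugin {plugin_id!r} is defined in more than one package file"
                )
            merged[plugin_id] = details
    return merged
```

Globbing the directory instead of reading manifest.json keeps the loader working even if the manifest is stale. Reading the manifest first, as the frontend does, is the alternative when the directory cannot be listed.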

Benefits

  • Scalability: Package-based organization prepares for future multi-package support
  • Performance: Frontend loads packages in parallel
  • Maintainability: Clearer separation of concerns by package
  • Extensibility: Easy to add new packages in the future

Test plan

  • Verify CI passes (especially metadata-ingestion workflow)
  • Run ./gradlew :metadata-ingestion:connectorRegistry to generate new files
  • Verify connector_registry/ directory contains manifest.json and datahub.json
  • Run ./gradlew :metadata-ingestion:docGen to verify documentation generation works
  • Frontend: Verify connector list loads correctly in ingestion UI
  • Check browser console for proper parallel loading logs
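
The directory check in the test plan can be scripted. A minimal sketch, assuming it is run from the repository root after the connectorRegistry task; the helper name missing_registry_files is hypothetical:

```python
from pathlib import Path
from typing import Iterable, List


def missing_registry_files(
    registry_dir: Path,
    expected: Iterable[str] = ("manifest.json", "datahub.json"),
) -> List[str]:
    """Return the expected file names that are absent from registry_dir."""
    return [name for name in expected if not (registry_dir / name).is_file()]


if __name__ == "__main__":
    # Directory path as listed under "Files Changed".
    registry = Path(
        "metadata-ingestion/src/datahub/ingestion/autogenerated/connector_registry"
    )
    missing = missing_registry_files(registry)
    if missing:
        raise SystemExit(f"Missing registry files: {', '.join(missing)}")
    print("connector_registry/ contains all expected files")
```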

🤖 Generated with Claude Code

…and split into package-based files

Renames the capability summary feature to connector registry across the codebase
and refactors the implementation to use package-based organization:

- Renamed capability_summary.py → connector_registry.py
- Renamed capabilitySummary.ts → connectorRegistry.ts
- Renamed gradle task capabilitySummary → connectorRegistry
- Generation script now groups plugins by top-level package name from classname
- Each package generates its own JSON file (e.g., datahub.json) in connector_registry/
- Added manifest.json listing all available packages for frontend discovery
- Removed generated_at timestamp field (keeping generated_by, added package field)
- Documentation generator loads and merges all package files from directory
- Frontend fetches manifest and loads all packages in parallel
- Updated build tasks and CI workflows to work with directory structure
@github-actions github-actions bot added ingestion PR or Issue related to the ingestion of metadata product PR or Issue related to the DataHub UI/UX devops PR or Issue related to DataHub backend & deployment labels Feb 5, 2026

github-actions bot commented Feb 5, 2026

Linear: ING-1518

📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:11:32: Double quote to prevent globbing and word splitting [shellcheck]

📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:17:31: Double quote to prevent globbing and word splitting [shellcheck]

📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:20:32: Double quote to prevent globbing and word splitting [shellcheck]

📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:2:29: Double quote to prevent globbing and word splitting [shellcheck]

📝 [actionlint] reported by reviewdog 🐶
shellcheck reported issue in this script: SC2086:info:8:31: Double quote to prevent globbing and word splitting [shellcheck]

@datahub-cyborg datahub-cyborg bot added the needs-review Label for PRs that need review from a maintainer. label Feb 5, 2026

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 35.00000% with 26 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
.../app/ingestV2/shared/hooks/useCapabilitySummary.ts | 35.00% | 26 Missing ⚠️

❌ Your patch status has failed because the patch coverage (35.00%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

alwaysmeticulous bot commented Feb 5, 2026

✅ Meticulous spotted 0 visual differences across 969 screens tested: view results.

Meticulous evaluated ~9 hours of user flows against your PR.

Last updated for commit d51790c.

codecov bot commented Feb 5, 2026

Bundle Report

Changes will increase total bundle size by 576 bytes (0.0%) ⬆️. This is within the configured threshold ✅

Detailed changes
Bundle name | Size | Change
datahub-react-web-esm | 29.56MB | 576 bytes (0.0%) ⬆️

Affected Assets, Files, and Routes:

view changes for bundle: datahub-react-web-esm

Assets Changed:

Asset Name | Size Change | Total Size | Change (%)
assets/index-*.js | 576 bytes | 19.39MB | 0.0%

Files in assets/index-*.js:

  • ./src/app/ingestV2/shared/hooks/useCapabilitySummary.ts → Total Size: 3.98kB

@datahub-cyborg datahub-cyborg bot added pending-submitter-merge and removed needs-review Label for PRs that need review from a maintainer. labels Feb 6, 2026
@shirshanka shirshanka merged commit 350328c into master Feb 6, 2026
95 of 97 checks passed
@shirshanka shirshanka deleted the refactor-capability-summary branch February 6, 2026 14:47
neildsouth pushed a commit to National-Digital-Twin/ndt-data-catalogue that referenced this pull request Feb 13, 2026
neildsouth pushed a commit to National-Digital-Twin/ndt-data-catalogue that referenced this pull request Feb 16, 2026
Labels

devops PR or Issue related to DataHub backend & deployment ingestion PR or Issue related to the ingestion of metadata pending-submitter-merge product PR or Issue related to the DataHub UI/UX
