
[Draft] Create a new Iceberg Catalog Plugin for exporting data to S3 tables #21284

Open
sachin-27 wants to merge 5 commits into opensearch-project:feature/datafusion from sachin-27:feature/datafusion-iceberg-integration


@sachin-27

Description

Creates a new plugin that enables OpenSearch to integrate with Iceberg metadata. Includes the files and dependencies required to connect to S3.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

…iceberg catalog

Signed-off-by: Sachin Sriramagiri <srirasac@amazon.com>
@sachin-27 sachin-27 requested a review from a team as a code owner April 20, 2026 03:30
@github-actions
Contributor

github-actions Bot commented Apr 20, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit 890194a.

| Path | Line | Severity | Description |
| --- | --- | --- | --- |
| plugins/iceberg-metadata-catalog/build.gradle | 18 | high | New dependency added: org.apache.iceberg:iceberg-api:1.6.1. Per mandatory rule, all dependency additions must be flagged for maintainer verification regardless of apparent legitimacy. |
| plugins/iceberg-metadata-catalog/build.gradle | 19 | high | New dependency added: org.apache.iceberg:iceberg-core:1.6.1. Per mandatory rule, all dependency additions must be flagged for maintainer verification regardless of apparent legitimacy. |
| plugins/iceberg-metadata-catalog/build.gradle | 20 | high | New dependency added: org.apache.iceberg:iceberg-common:1.6.1. Per mandatory rule, all dependency additions must be flagged for maintainer verification regardless of apparent legitimacy. |
| plugins/iceberg-metadata-catalog/build.gradle | 21 | high | New dependency added: org.apache.iceberg:iceberg-bundled-guava:1.6.1. Per mandatory rule, all dependency additions must be flagged for maintainer verification regardless of apparent legitimacy. |
| plugins/iceberg-metadata-catalog/build.gradle | 32 | medium | thirdPartyAudit task is disabled. This suppresses the security audit that would validate the four newly introduced Iceberg dependencies, removing a layer of supply-chain verification precisely where it is most needed. |
| plugins/iceberg-metadata-catalog/build.gradle | 29 | low | dependencyLicenses check is disabled. While framed as a scaffold convenience, disabling license tracking for new dependencies obscures their provenance and can mask substituted or malicious artifacts. |

The table above displays the top 10 most important findings.

Total: 6 | Critical: 0 | High: 4 | Medium: 1 | Low: 1


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

Signed-off-by: Sachin Sriramagiri <srirasac@amazon.com>
Contributor

@rajiv-kv rajiv-kv left a comment


Can you check the failing build? Looks like diff_analyzer is flagging the checked-in jar.

* compatible open source license.
*/

package org.opensearch.plugin.iceberg.catalog;
Contributor

Can the package be named org.opensearch.plugin.catalog.iceberg?

Author

renamed

* OpenSearch index data to S3 Tables. It copies S3 client code from repository-s3
* for plugin isolation (plugins cannot depend on other plugins).
*/
public class IcebergMetadataCatalogPlugin extends Plugin {}
Contributor

  • Let's name it IcebergCatalogPlugin.
  • I believe you will define the CatalogPlugin interface in core?

Author

Done. Yes, I will raise that in a follow-up PR.

Member

I'm all for starting small, but I'd recommend starting with the new catalog plugin abstraction and not creating a plugin until there is something from it to use. No one can provide a meaningful review of this code because it doesn't do anything, and there's no description of what it is intended to do beyond "integrate with iceberg metadata".
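
For illustration only, the abstraction-first approach suggested in this thread might look roughly like the sketch below. Everything in it is an assumption: this PR does not define a CatalogPlugin interface, and the method names (catalogType, registerTable, listTables) and the InMemoryIcebergCatalog class are hypothetical stand-ins that use no OpenSearch or Iceberg APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical core-side abstraction; NOT defined in this PR or in OpenSearch.
interface CatalogPlugin {
    /** Short identifier for the catalog type, for example "iceberg". */
    String catalogType();

    /** Registers a table for an exported index and returns its identifier. */
    String registerTable(String namespace, String indexName);

    /** Returns the identifiers registered so far, in registration order. */
    List<String> listTables();
}

// Hypothetical in-memory stand-in for a real Iceberg-backed implementation.
final class InMemoryIcebergCatalog implements CatalogPlugin {
    private final List<String> tables = new ArrayList<>();

    @Override
    public String catalogType() {
        return "iceberg";
    }

    @Override
    public String registerTable(String namespace, String indexName) {
        if (namespace == null || namespace.isEmpty()
            || indexName == null || indexName.isEmpty()) {
            throw new IllegalArgumentException("namespace and indexName must be non-empty");
        }
        String id = namespace + "." + indexName;
        // Registering the same table twice is treated as a no-op.
        if (!tables.contains(id)) {
            tables.add(id);
        }
        return id;
    }

    @Override
    public List<String> listTables() {
        return Collections.unmodifiableList(tables);
    }
}

final class CatalogPluginDemo {
    public static void main(String[] args) {
        CatalogPlugin catalog = new InMemoryIcebergCatalog();
        catalog.registerTable("logs", "web-2026");
        System.out.println(catalog.catalogType() + " -> " + catalog.listTables());
    }
}
```

With an interface like this reviewed first, a concrete plugin would have a contract to implement and reviewers would have behavior to evaluate.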

Signed-off-by: Sachin Sriramagiri <srirasac@amazon.com>
@sachin-27
Author

> Can you check the failing build? Looks like diff_analyzer is flagging the checked-in jar.

I enabled the third-party audit, and the severity went from critical to high. I believe the action is configured to flag any jar that is checked in to the code.

Signed-off-by: Sachin Sriramagiri <srirasac@amazon.com>
Signed-off-by: Sachin Sriramagiri <srirasac@amazon.com>
}
}

// Disable checks that are not relevant for this initial scaffold.
Member

Why are these not relevant? This is bypassing the checks intended to verify that the new dependencies you're bringing in are of a compatible license.

@sachin-27 sachin-27 changed the title from "Create a new Iceberg Catalog Plugin for exporting data to S3 tables" to "[Draft] Create a new Iceberg Catalog Plugin for exporting data to S3 tables" Apr 22, 2026