Skip to content

Support Iceberg REST catalog storage credentials in ForwardingFileIo#28425

Open
kaveti wants to merge 2 commits intotrinodb:masterfrom
kaveti:storage-credentials-support
Open

Support Iceberg REST catalog storage credentials in ForwardingFileIo#28425
kaveti wants to merge 2 commits intotrinodb:masterfrom
kaveti:storage-credentials-support

Conversation

@kaveti
Copy link
Copy Markdown
Contributor

@kaveti kaveti commented Feb 23, 2026

Make ForwardingFileIo implement SupportsStorageCredentials so that storage credentials from the Iceberg REST catalog LoadTableResponse are captured and their config is merged into properties(). Storage credential keys take precedence over base config properties.

This allows storage credential keys (e.g. s3.access-key-id, gcs.oauth2.token) to flow through io().properties() into fileIoProperties, where downstream consumers like
IcebergRestCatalogFileSystemFactory can use them.

Description

Support Iceberg REST catalog storage credentials in ForwardingFileIo

Why this change

Iceberg REST catalogs can return short-lived, scoped credentials alongside table metadata so that
the query engine can access the underlying storage (S3, GCS, ADLS, etc.) without needing long-lived
static credentials. There are two mechanisms for this in the Iceberg REST spec:

  1. Config-based vended credentials — credential keys are embedded directly in the config map
    of the loadTable response. Trino already supports this path today.

  2. Storage credentials — a dedicated storage-credentials field in the loadTable response
    that carries a list of StorageCredential objects, each scoped to a storage prefix with its
    own config map. This was introduced as part of the Iceberg REST catalog spec and is used by
    catalogs like Unity Catalog, Polaris, and Gravitino.

Trino currently ignores storage credentials entirely. The Iceberg library's RESTSessionCatalog
calls FileIO.setCredentials() after loading a table, but Trino's ForwardingFileIo does not
implement the SupportsStorageCredentials interface — so those credentials are silently dropped.

This means that any catalog server returning credentials via the storage-credentials field
(instead of or in addition to the config map) will fail with access denied errors when Trino
tries to read the table data.

What this change does

ForwardingFileIo now implements SupportsStorageCredentials. When the Iceberg library calls
setCredentials(), the credential configs are captured and merged into properties().

This is the same io().properties() map that Trino already reads as fileIoProperties throughout
the Iceberg connector (splits, page sources, metadata operations). So all existing downstream
credential handling (e.g. IcebergRestCatalogFileSystemFactory for S3) works without any
additional changes.

Storage credential keys take precedence over base config properties when there is a conflict.
For example, if the base config has s3.access-key-id=static-key and a storage credential
has s3.access-key-id=vended-key, the vended key wins.

Example

A REST catalog loadTable response might include:

{
  "metadata": { ... },
  "config": {
    "s3.access-key-id": "fallback-key"
  },
  "storage-credentials": [
    {
      "prefix": "s3://my-bucket/warehouse/",
      "config": {
        "s3.access-key-id": "AKIA...",
        "s3.secret-access-key": "...",
        "s3.session-token": "..."
      }
    }
  ]
}

Before this change, Trino would only see the config map (fallback-key). After this change,
Trino sees the merged result where storage credentials take precedence (AKIA...).

This works the same way for GCS (gcs.oauth2.token), Azure (adls.sas-token.*), or any other
provider — the change is cloud-agnostic.

Spec references

Additional context

#When storage-credentials is used vs config properties:

The Iceberg REST catalog has two mechanisms for vending credentials:

  1. config map (older, ad-hoc) The REST server puts credential keys like s3.access-key-id, s3.secret-access-key directly into the config map of LoadTableResponse. This is what TestIcebergVendingRestCatalogConnectorSmokeTest exercises — the tabulario/iceberg-rest backend only supports this approach.

  2. storage-credentials list (Iceberg REST Spec) The REST server returns a structured list of StorageCredential objects in LoadTableResponse.credentials(). Each credential has:

A prefix (e.g., s3://my-bucket/warehouse/) scoping the credential to a specific storage location
A config map with the actual credential key-value pairs
This enables:

Prefix-scoped credentials — least-privilege access per storage location
Multi-cloud support — different credentials for S3, GCS, ADLS in the same response
Short-lived, table-specific tokens — the catalog can vend narrowly-scoped STS tokens per table

#Which catalogs use storage-credentials?
Apache Polaris (Snowflake Open Catalog)
Unity Catalog (Databricks)
AWS Glue Data Catalog (REST endpoint)
Gravitino
The tabulario/iceberg-rest test backend predates this spec addition and only returns credentials via config, which is why you see properties being filled but not storageCredentials.

I've added TestIcebergStorageCredentialsRestCatalog which uses a mock REST catalog that injects storage-credentials into LoadTableResponse to validate this flow end-to-end. I've also updated the issue description with this clarification.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Feb 23, 2026
@github-actions github-actions bot added the iceberg Iceberg connector label Feb 23, 2026
@kaveti
Copy link
Copy Markdown
Contributor Author

kaveti commented Feb 23, 2026

#Note: This change is the Trino-side prerequisite for Iceberg REST catalog storage credentials support.

@kaveti kaveti marked this pull request as draft February 23, 2026 16:48
@findinpath findinpath self-requested a review February 24, 2026 16:44
@kaveti kaveti marked this pull request as ready for review February 28, 2026 17:01
@kaveti
Copy link
Copy Markdown
Contributor Author

kaveti commented Feb 28, 2026

@findinpath i have addressed your review comments

@kaveti kaveti force-pushed the storage-credentials-support branch 2 times, most recently from 0a7c730 to 882184c Compare March 1, 2026 07:29
@kaveti kaveti force-pushed the storage-credentials-support branch 2 times, most recently from f6a6771 to 1e7f22e Compare March 6, 2026 17:05
@kaveti
Copy link
Copy Markdown
Contributor Author

kaveti commented Mar 8, 2026

@findinpath could you please review again

1 similar comment
@kaveti
Copy link
Copy Markdown
Contributor Author

kaveti commented Mar 19, 2026

@findinpath could you please review again

for (StorageCredential credential : credentials) {
merged.putAll(credential.config());
}
return merged.buildKeepingLast();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic makes not much sense to me.
We have the storageCredentials additionally to the fileIo.properties() (considered as a fallback) - meaning that they're separate, not mixed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree. we need to retain prefix scoped creds

RESTSessionCatalog icebergCatalogInstance = new RESTSessionCatalog(
config -> HTTPClient.builder(config)
.uri(config.get(CatalogProperties.URI))
.withHeaders(RESTUtil.configHeaders(config))
.build(),
(context, config) -> {
ConnectorIdentity currentIdentity = (context.wrappedIdentity() != null)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the changes from apache/iceberg#12591 disregarded the fact that the ioBuilder is not receiving now the storageCredentials.

@nastra was it intentional to skip it during the implementation phase?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@findinpath this was not intentional. I believe Trino is the only engine that goes through that path. Can you please open a PR with a fix?

Copy link
Copy Markdown
Contributor Author

@kaveti kaveti Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opened PR: apache/iceberg#15752 , cc @nastra

: ConnectorIdentity.ofUser("fake");
return fileIoFactory.create(fileSystemFactory.create(currentIdentity, config), true, config);
});
null);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a preparatory commit doing this change to see better whether it makes any sense to have it.

@@ -45,15 +52,23 @@
import static java.util.stream.Collectors.joining;

public class ForwardingFileIo
implements SupportsBulkOperations
implements SupportsBulkOperations, SupportsStorageCredentials
Copy link
Copy Markdown
Contributor

@findinpath findinpath Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking at S3FileIO

https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java#L114

https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIO.java#L102

and am rather thinking that we should have a prefixed map of TrinoFileSystem actually (and probably the default one we used to have as fallback).

Copy link
Copy Markdown
Contributor

@findinpath findinpath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the PR needs to be intensively reshaped to have it moving forward.

warehouseLocation = Files.createTempDirectory(null);
closeAfterClass(() -> deleteRecursively(warehouseLocation, ALLOW_INSECURE));

DelegatingRestSessionCatalog delegatingCatalog = DelegatingRestSessionCatalog.builder()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was introduced as part of the Iceberg REST catalog spec and is used by catalogs like Unity Catalog, Polaris, and Gravitino.

Isn't there an actual implementation we can test against instead of relying on mocked code?

private final String staticRegion;
private final String staticEndpoint;
private final boolean staticCrossRegionAccessEnabled;
private final Map<ClientKey, S3Client> dynamicClients;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the right design. S3FileSystemLoader is already responsible for creating and managing clients. We shouldn't have a second system here for doing that.

@findinpath findinpath added the stale-ignore Use this label on PRs that should be ignored by the stale bot so they are not flagged or closed. label Mar 25, 2026
@findinpath
Copy link
Copy Markdown
Contributor

Waiting on apache/iceberg#15752

Consider rebasing in the meantime on #26640

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed iceberg Iceberg connector stale-ignore Use this label on PRs that should be ignored by the stale bot so they are not flagged or closed.

Development

Successfully merging this pull request may close these issues.

4 participants