Skip to content

Fix query-scope cache invalidation for Glue catalog#28266

Open
piotrrzysko wants to merge 3 commits intotrinodb:masterfrom
piotrrzysko:iceberg_glue_cache_fix
Open

Fix query-scope cache invalidation for Glue catalog#28266
piotrrzysko wants to merge 3 commits intotrinodb:masterfrom
piotrrzysko:iceberg_glue_cache_fix

Conversation

@piotrrzysko
Copy link
Member

@piotrrzysko piotrrzysko commented Feb 12, 2026

Description

This PR fixes a bug where GlueIcebergTableOperations#commitTableUpdate was not invalidating the query-scope table metadata cache after a successful commit. As a result, Iceberg observed stale metadata when determining committed files. This caused uncommitted manifest files from failed retry attempts to remain as orphans instead of being cleaned up.

Automatic cleanup is performed in BaseTransaction#commitSimpleTransaction However, it was never triggered because committedFiles was null due to stale information returned by TableOperations::current.

The fix refactors the code to wrap the Glue client in a caching client implementation, centralizing cache management and preventing similar bugs in the future.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

## Section
* Fix orphaned manifest files not being cleaned up after commit retries when using the Glue catalog. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Feb 12, 2026
@github-actions github-actions bot added the iceberg Iceberg connector label Feb 12, 2026
@piotrrzysko piotrrzysko force-pushed the iceberg_glue_cache_fix branch from e796ef4 to 05b11d6 Compare February 12, 2026 09:34
@piotrrzysko piotrrzysko marked this pull request as ready for review February 12, 2026 10:41
Copy link
Member

@findepi findepi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Add test for uncommitted manifest cleanup"

Comment on lines +260 to +261
//TODO: Fix https://github.com/trinodb/trino/issues/23941
abort("Skipped for now due to #23941");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test is self-maintained when it follows the pattern

assertThatThrownBy(super::test_name)
  .hasMessage...(...);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure yet about this. I think it fails nondeterministically, and I couldn’t find the error message with which it failed for me. Locally, I couldn’t reproduce failures for either this test or the one above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right. this pattern is not applicable for tests which fail nondeterministically

public void testDeleteRowsConcurrently() {}

@Test
@Disabled("Moto is not concurrency safe")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😮

Does it make sense to continue to test with Moto? What value does it add?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to continue to test with Moto?

That’s a good question, but I think it's outside the scope of this PR. I followed the existing pattern (see the test above).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s a good question, but I think it's outside the scope of this PR.

of course!

@@ -449,7 +443,7 @@ private void getColumnsFromIcebergMetadata(
for (SchemaTableName tableName : relationFilter.apply(glueTables.keySet())) {
Table table = glueTables.get(tableName);
// potentially racy with invalidation, but TrinoGlueCatalog is session-scoped
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was already here. Now we're adding QueryScopedCachingTrinoGlueClient.
Do we really need two levels of caching & invalidation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By two levels of caching & invalidation do you mean TrinoGlueCatalog (level 1) and QueryScopedCachingTrinoGlueClient (level 2)?

If so, I introduced QueryScopedCachingTrinoGlueClient to keep cache invalidation close to the state-modifying operations. Previously, they were decoupled, which in my opinion led to the bug I’m trying to fix.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get there is a subtle bug manifesting in partial cleanup.

Is the bug fixiable without introducing QueryScopedCachingTrinoGlueClient? what would the fix look like, and what would be the cons of doing it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m fixing a missing invalidation in GlueIcebergTableOperations#commitTableUpdate, where the raw GlueClient was used. I introduced QueryScopedCachingTrinoGlueClient to share invalidation logic between TrinoGlueCatalog and GlueIcebergTableOperations.

I didn’t intend to address this pre-existing code comment.

@piotrrzysko piotrrzysko force-pushed the iceberg_glue_cache_fix branch 2 times, most recently from d56bb4c to 985fca7 Compare February 16, 2026 08:32
Verify that uncommitted manifest files are properly cleaned up. The test
uses concurrent INSERT operations to create conflicts, then checks if
any orphaned manifests remain.
GlueIcebergTableOperations#commitTableUpdate was not invalidating the
query-scope table metadata cache after a successful commit. This caused
Iceberg to see stale metadata when determining which files were committed,
resulting in uncommitted manifest files from failed retry attempts being
left as orphans instead of being cleaned up automatically.

Changes in testInformationSchemaTableAndColumns were required because
now INSERTs in the test setup populate the filesystem cache so SELECTs
hit the cache.
@piotrrzysko piotrrzysko force-pushed the iceberg_glue_cache_fix branch from 985fca7 to 5cefeca Compare February 17, 2026 10:40

public void populateCache(SchemaTableName tableName, Table table)
{
uncheckedCacheGet(tableCache, tableName, () -> table);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment why it's safe

}
}

private Set<String> listManifestFiles(String tableName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use computeActual("SELECT path FROM \"table$manifests\"").getOnlyColumnAsSet() ?

assertGlueMetastoreApiInvocations("REFRESH MATERIALIZED VIEW test_refresh_mview_view",
ImmutableMultiset.<GlueMetastoreMethod>builder()
.addCopies(GET_TABLE, 6)
.addCopies(GET_TABLE, 7)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😞

Comment on lines -576 to -577
.add(new FileOperation(METADATA_JSON, "InputFile.length"))
.add(new FileOperation(METADATA_JSON, "InputFile.newInput"))
Copy link
Member

@findepi findepi Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from commit message

Changes in testInformationSchemaTableAndColumns were required because
now INSERTs in the test setup populate the filesystem cache so SELECTs
hit the cache.

this means the test is no longer testing what it appears it tries to
DESCRIBE still can result in InputFile.newInput, just that it's hidden from the test,
The test should be running with iceberg.metadata-cache.enabled=false, or flush FS cache in assertInvocations

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test class is about testing Glue catalog accesses, and that's what it should focus on.
There are other tests like TestIcebergFileOperations which focus on iceberg metadata accesses on filesystem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The filesystem access counts depend on catalog in use. In fact, they are here for metadata queries, which have have quite important catalog-specific component. Here we're asserting a metadata query. TestIcebergFileOperations is awesome for regular query access patterns and catalog-agnostic components. It doesn't run with Glue and doesn't, therefore, guard against potential metadata-related regressions in Glue catalog.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought any catalog would need to look-up table entry in metastore and then read metadata.json from FS to show output of DESCRIBE. Not sure why the FS access would depend on catalog.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@github-actions
Copy link

This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.

@github-actions github-actions bot added the stale label Mar 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Development

Successfully merging this pull request may close these issues.

3 participants