Skip to content

[CTM-385] Create default firestore database if does not exist and Update Integration Test#2076

Open
snf2ye wants to merge 23 commits intodevelopfrom
sh/test-if-connected-tet-time-decreases
Open

[CTM-385] Create default firestore database if does not exist and Update Integration Test#2076
snf2ye wants to merge 23 commits intodevelopfrom
sh/test-if-connected-tet-time-decreases

Conversation

@snf2ye
Copy link
Contributor

@snf2ye snf2ye commented Feb 26, 2026

Jira ticket: https://broadworkbench.atlassian.net/browse/CTM-385

Addresses

There are two issues with TDR tests. This PR aims to get tests back into a good state.

Issue 1 - Connected Test timeout

A couple of weeks ago, we saw a jump in connected test timeouts. Many of the tests were failing with the following error:

NOT_FOUND: The database (default) does not exist for project datarepo-55b38b1e

We have actually seen this error sporadically for quite a while (See ticket from 2024) .
A Puzzle
It is still not clear to me when a default database gets created and when it does not. It doesn't seem to be tied to when we use Firestore (i.e. when we try to write an entry to firstore on ingesting a file). It appears to only happen for datasets (but we call this same setup method for snapshots). And it doesn't appear to be a timing issue: for the instances that do not have firestore, you can check again hours later and it still does not have the default firestore database created.
What is clear is that adding this call to create the default database drastically reduces the connected test run time (by 50%+)

Issue 2 - SnapshotAcl Integration test consistency failing

The SnapshotPermissionsIntegrationTest.snapshotAclTest() has started failing pretty consistently on the bigQuery.getDataset check. It can take a minute for the ACL (access control list) on the big query dataset to update. So, if we wrap this in a retry, the test can then pass. We use a retry in the production code, it was just not added to the test code.

Ideally, we would just call the bigQueryProject.getBQDataset(bqDatasetName) method, but this test is trying to specifically test that a big query instance with a given token has access to BQ.

Summary of changes

  • On setting up a new google project for a dataset or snapshot, enabling firestore and then check if the default database exists in Firestore. If it does not, create the default database.
  • Add retry for SnapshotAcl test

Testing Strategy

  • New connected test to test firestore changes

@snf2ye snf2ye force-pushed the sh/test-if-connected-tet-time-decreases branch 2 times, most recently from 12563c9 to 2f27a25 Compare March 3, 2026 21:36
@snf2ye snf2ye changed the title [WIP - do not merge] Test if connected test time decreases [WIP - do not merge] Investigate long TDR connected test run time Mar 4, 2026
@snf2ye snf2ye force-pushed the sh/test-if-connected-tet-time-decreases branch from b7f8ce9 to db32e06 Compare March 4, 2026 20:06
@snf2ye snf2ye changed the title [WIP - do not merge] Investigate long TDR connected test run time [CTM-385] Create default firestore database if does not exist Mar 6, 2026
.setProjectId(googleProjectId)
.build()
.getService();
return FirestoreOptions.newBuilder().setProjectId(googleProjectId).build().getService();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a change to use the same logic as we have in FirestoreProject

@snf2ye snf2ye force-pushed the sh/test-if-connected-tet-time-decreases branch from 0d94d26 to 680a221 Compare March 9, 2026 01:26
// Fetch BQ Dataset
// Fetch BQ Dataset with retry logic
String bqDatasetName = BigQueryPdao.prefixName(datasetName);
Dataset bqDataset = bigQuery.getDataset(bqDatasetName);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SnapshotPermissionsIntegrationTest.snapshotAclTest() is consistently failing on this check. It can take a minute for the ACL (access control list) on the big query dataset to update. So, if we wrap this in a retry, the test can then pass. We use a retry in the production code, it was just not added to the test code.

Ideally, we would just call the bigQueryProject.getBQDataset(bqDatasetName) method, but this test is trying to specifically test that a big query instance with a given token has access to BQ.

@snf2ye snf2ye changed the title [CTM-385] Create default firestore database if does not exist [CTM-385] Create default firestore database if does not exist and Update Integration Test Mar 9, 2026
@snf2ye snf2ye marked this pull request as ready for review March 9, 2026 14:07
@snf2ye snf2ye requested a review from a team as a code owner March 9, 2026 14:07
@snf2ye snf2ye requested review from calypsomatic and rjohanek and removed request for a team March 9, 2026 14:07
@snf2ye
Copy link
Contributor Author

snf2ye commented Mar 9, 2026

@claude please review for critical issues

snf2ye added 8 commits March 10, 2026 15:55
Update GoogleProjectServiceFirestoreConnectedTest.java

Address sonarcloud

remove unneeded call

Does this make sonarcloud happy?
The SnapshotPermissionsIntegrationTest.snapshotAclTest() is consistently failing on this bigQuery.getDataset check. It can take a minute for the ACL (access control list) on the big query dataset to update. So, if we wrap this in a retry, the test can then pass. We use a retry in the production code, it was just not added to the test code.

Ideally, we would just call the bigQueryProject.getBQDataset(bqDatasetName) method, but this test is trying to specifically test that a big query instance with a given token has access to BQ.

So, we've created a helper method here.
… in our connected test

 Excellent! I've successfully refactored the code to reuse the Firestore setup logic from GoogleProjectService. Here's a summary of the changes:

  Changes Made:

  In GoogleProjectService.java:

  1. Extracted createAppengineClient() as a static method marked @VisibleForTesting:
    - Takes applicationName as a parameter
    - Creates and returns an Appengine client
    - Made the instance method appengine() call this new static method
  2. Extracted createAppEngineApplication() as a static method marked @VisibleForTesting:
    - Creates an App Engine application in Firestore native mode
    - This is the prerequisite for creating Firestore databases
    - Handles the operation completion wait
  3. Refactored enableFirestore():
    - Now calls createAppEngineApplication()
    - Then calls createFirestoreDefaultDatabase()
    - Cleaner separation of concerns

  In GoogleProjectServiceFirestoreConnectedTest.java:

  1. Simplified the setup() method:
    - Now calls GoogleProjectService.createAppengineClient()
    - Then calls GoogleProjectService.createAppEngineApplication()
    - Much more concise (4 lines instead of ~90 lines)
  2. Removed duplicate code:
    - Deleted enableFirestoreApp() method (26 lines)
    - Deleted createAppengineClient() method (14 lines)
    - Deleted waitForAppengineOperation() method (22 lines)
  3. Cleaned up unused imports:
    - Removed 8 unused import statements

  Benefits:

  - ✅ No code duplication - test reuses production logic
  - ✅ Easier maintenance - changes to the setup logic only need to be made in one place
  - ✅ Shorter test file - removed ~70 lines of code
  - ✅ Better alignment - test setup matches production setup exactly

  The tests should work exactly the same as before, but now they're using the shared implementation from GoogleProjectService.
…ogle project will get deleted by janitor service
@snf2ye snf2ye force-pushed the sh/test-if-connected-tet-time-decreases branch from 76244bf to 9ec24bb Compare March 10, 2026 19:55
@sonarqubecloud
Copy link

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant