feat: add schema hash functionality and caching#1598
Conversation
- Introduced `getSchemaHash` method across various data access objects to compute and cache schema hashes. - Added `DEFAULT_SCHEMA_HASH_CACHE_OPTIONS` in caching constants for schema hash caching. - Implemented schema hash validation in `LRUStorage` to manage schema changes effectively. - Updated `OperationTypeEnum` to include `getSchemaHash` operation type. - Enhanced error handling and logging for schema hash retrieval. - Created `validateSchemaCache` utility function to validate schema cache across data access objects. - Updated message texts to reflect new functionality and improve user guidance.
There was a problem hiding this comment.
Pull request overview
This pull request adds schema hash functionality across the codebase to detect and handle database schema changes effectively. The implementation computes and caches schema hashes for each database connection, automatically invalidating table metadata caches when schema changes are detected.
Changes:
- Added
getSchemaHashmethod to all data access objects (Postgres, MySQL, MSSQL, Oracle, IBM DB2, ClickHouse, Agent) - Implemented schema hash caching and automatic cache invalidation in LRUStorage when schemas change
- Integrated schema validation into key use cases (get table structure, get table rows, get row by primary key, find tables)
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 17 comments.
Show a summary per file
| File | Description |
|---|---|
| shared-code/src/shared/interfaces/data-access-object.interface.ts | Added optional getSchemaHash method to the IDataAccessObject interface |
| shared-code/src/shared/interfaces/data-access-object-agent.interface.ts | Added required getSchemaHash method to the IDataAccessObjectAgent interface |
| shared-code/src/shared/enums/data-access-object-commands.enum.ts | Added getSchemaHash operation to enum for agent communication |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-postgres.ts | Implemented getSchemaHash using MD5 hashing of table structures, columns, and indexes |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-oracle.ts | Implemented getSchemaHash using STANDARD_HASH function with table/column/index counts and checksums |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-mysql.ts | Implemented getSchemaHash using MD5 hashing with CTEs for table metadata aggregation |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-mssql.ts | Implemented getSchemaHash using HASHBYTES function with table structure and index information |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-ibmdb2.ts | Implemented getSchemaHash using concatenated counts and checksums of tables/columns/indexes |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-clickhouse.ts | Implemented getSchemaHash using MD5 hashing of table metadata excluding views |
| shared-code/src/data-access-layer/data-access-objects/data-access-object-agent.ts | Implemented getSchemaHash to proxy requests to agent server with retry logic |
| shared-code/src/caching/schema-cache-validator.ts | Created utility function to validate schema cache across different DAO types |
| shared-code/src/caching/lru-storage.ts | Added schema hash cache management and invalidation logic for table metadata |
| shared-code/src/caching/caching-constants.ts | Added DEFAULT_SCHEMA_HASH_CACHE_OPTIONS with 30-second TTL and increased table structure cache TTL to 15 minutes |
| rocketadmin-agent/src/text/messages.ts | Added FAILED_GET_SCHEMA_HASH error message and reformatted file with tabs |
| rocketadmin-agent/src/enums/operation-type.enum.ts | Added getSchemaHash operation type and reformatted with tabs |
| rocketadmin-agent/src/command/command-executor.ts | Added getSchemaHash case to command executor switch statement |
| backend/src/entities/table/use-cases/get-table-structure.use.case.ts | Integrated validateSchemaCache call before fetching table structure |
| backend/src/entities/table/use-cases/get-table-rows.use.case.ts | Integrated validateSchemaCache call before fetching table rows |
| backend/src/entities/table/use-cases/get-row-by-primary-key.use.case.ts | Integrated validateSchemaCache call before fetching single row |
| backend/src/entities/table/use-cases/find-tables-in-connection.use.case.ts | Integrated validateSchemaCache call before listing tables |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| WHERE c.database = '${database}' AND c.table = t.name | ||
| ), '') | ||
| )))) AS table_hash | ||
| FROM system.tables t | ||
| WHERE t.database = '${database}' |
There was a problem hiding this comment.
SQL injection vulnerability: The database variable is directly interpolated into the SQL query using string template literals without proper escaping. Unlike other methods in this class that use the escapeValue helper, this getSchemaHash method injects the database name unsafely. This creates a security risk if the database name contains malicious SQL code.
| WHERE c.database = '${database}' AND c.table = t.name | |
| ), '') | |
| )))) AS table_hash | |
| FROM system.tables t | |
| WHERE t.database = '${database}' | |
| WHERE c.database = ${this.escapeValue(database)} AND c.table = t.name | |
| ), '') | |
| )))) AS table_hash | |
| FROM system.tables t | |
| WHERE t.database = ${this.escapeValue(database)} |
| } | ||
|
|
||
| function isAgentDao(dao: IDataAccessObject | IDataAccessObjectAgent): dao is IDataAccessObjectAgent { | ||
| return dao.getTableStructure.length >= 2; |
There was a problem hiding this comment.
Fragile type detection using function parameter length: The isAgentDao function uses the length of the getTableStructure method parameters to distinguish between IDataAccessObject and IDataAccessObjectAgent. This is a brittle approach that will break if the method signatures change. Consider using a more robust type detection method, such as checking for the presence of specific properties that uniquely identify agent DAOs, or using a discriminator property.
| return dao.getTableStructure.length >= 2; | |
| return 'getAgents' in dao && typeof (dao as any).getAgents === 'function'; |
| const daoWithSchemaHash = dao as typeof dao & { getSchemaHash?: () => Promise<string> }; | ||
| if (daoWithSchemaHash.getSchemaHash) { | ||
| return await daoWithSchemaHash.getSchemaHash(); | ||
| } |
There was a problem hiding this comment.
Silent failure when getSchemaHash is not implemented: When the getSchemaHash method is not available on the DAO, the function silently returns an empty string without any logging or notification. This makes it difficult to debug situations where schema hash validation isn't working as expected. Consider adding a console log or debug message when getSchemaHash is not available to aid in troubleshooting.
| } | |
| } | |
| console.log('Debug: getSchemaHash method is not implemented on the current DAO. Returning empty schema hash.'); |
| for (const key of tableStructureCache.keys()) { | ||
| if (typeof key === 'string' && key.includes(connectionStr.slice(0, -1))) { | ||
| tableStructureCache.delete(key); | ||
| } | ||
| } | ||
| for (const key of tableForeignKeysCache.keys()) { | ||
| if (typeof key === 'string' && key.includes(connectionStr.slice(0, -1))) { | ||
| tableForeignKeysCache.delete(key); | ||
| } | ||
| } | ||
| for (const key of tablePrimaryKeysCache.keys()) { | ||
| if (typeof key === 'string' && key.includes(connectionStr.slice(0, -1))) { |
There was a problem hiding this comment.
Potential cache invalidation false positives: The cache invalidation logic uses connectionStr.slice(0, -1) to match cache keys. This substring matching approach could lead to false positives where unrelated connection cache entries are incorrectly invalidated if their serialized JSON happens to contain the same substring. Consider using exact matching based on the connection identifier to prevent unintended cache invalidations.
| for (const key of tableStructureCache.keys()) { | |
| if (typeof key === 'string' && key.includes(connectionStr.slice(0, -1))) { | |
| tableStructureCache.delete(key); | |
| } | |
| } | |
| for (const key of tableForeignKeysCache.keys()) { | |
| if (typeof key === 'string' && key.includes(connectionStr.slice(0, -1))) { | |
| tableForeignKeysCache.delete(key); | |
| } | |
| } | |
| for (const key of tablePrimaryKeysCache.keys()) { | |
| if (typeof key === 'string' && key.includes(connectionStr.slice(0, -1))) { | |
| const connectionPrefix = connectionStr.slice(0, -1); | |
| for (const key of tableStructureCache.keys()) { | |
| if (typeof key === 'string' && key.startsWith(connectionPrefix)) { | |
| tableStructureCache.delete(key); | |
| } | |
| } | |
| for (const key of tableForeignKeysCache.keys()) { | |
| if (typeof key === 'string' && key.startsWith(connectionPrefix)) { | |
| tableForeignKeysCache.delete(key); | |
| } | |
| } | |
| for (const key of tablePrimaryKeysCache.keys()) { | |
| if (typeof key === 'string' && key.startsWith(connectionPrefix)) { |
| @@ -1,3 +1,4 @@ | |||
| /** biome-ignore-all lint/complexity/noStaticOnlyClass: <explanation> */ | |||
There was a problem hiding this comment.
Incomplete biome-ignore comment: The biome-ignore directive has a placeholder explanation text. This should be replaced with a meaningful explanation of why the noStaticOnlyClass rule is being ignored for this file, such as explaining that this is a utility class that serves as a namespace for cache-related operations.
| /** biome-ignore-all lint/complexity/noStaticOnlyClass: <explanation> */ | |
| /** biome-ignore-all lint/complexity/noStaticOnlyClass: This static-only utility class intentionally serves as a namespace for cache-related LRU operations, so instance-based state or behavior is not required. */ |
| operationStatusResult = OperationResultStatusEnum.successfully; | ||
| return await dao.getIdentityColumns(tableName, referencedFieldName, identityColumnName, fieldValues); | ||
| } catch (e) { | ||
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; |
There was a problem hiding this comment.
The value assigned to operationStatusResult here is unused.
| operationStatusResult = OperationResultStatusEnum.successfully; | |
| return await dao.getIdentityColumns(tableName, referencedFieldName, identityColumnName, fieldValues); | |
| } catch (e) { | |
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; | |
| return await dao.getIdentityColumns(tableName, referencedFieldName, identityColumnName, fieldValues); | |
| } catch (e) { |
| operationStatusResult = OperationResultStatusEnum.successfully; | ||
| return await dao.getReferencedTableNamesAndColumns(tableName); | ||
| } catch (e) { | ||
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; |
There was a problem hiding this comment.
The value assigned to operationStatusResult here is unused.
| operationStatusResult = OperationResultStatusEnum.successfully; | |
| return await dao.getReferencedTableNamesAndColumns(tableName); | |
| } catch (e) { | |
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; | |
| return await dao.getReferencedTableNamesAndColumns(tableName); | |
| } catch (e) { |
| operationStatusResult = OperationResultStatusEnum.successfully; | ||
| return await dao.getReferencedTableNamesAndColumns(tableName); | ||
| } catch (e) { | ||
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; |
There was a problem hiding this comment.
The value assigned to operationStatusResult here is unused.
| operationStatusResult = OperationResultStatusEnum.successfully; | |
| return await dao.getReferencedTableNamesAndColumns(tableName); | |
| } catch (e) { | |
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; | |
| return await dao.getReferencedTableNamesAndColumns(tableName); | |
| } catch (e) { |
| operationStatusResult = OperationResultStatusEnum.successfully; | ||
| return await dao.isView(tableName); | ||
| } catch (e) { | ||
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; |
There was a problem hiding this comment.
The value assigned to operationStatusResult here is unused.
| operationStatusResult = OperationResultStatusEnum.successfully; | |
| return await dao.isView(tableName); | |
| } catch (e) { | |
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; | |
| return await dao.isView(tableName); | |
| } catch (e) { |
| operationStatusResult = OperationResultStatusEnum.successfully; | ||
| return await dao.isView(tableName); | ||
| } catch (e) { | ||
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; |
There was a problem hiding this comment.
The value assigned to operationStatusResult here is unused.
| operationStatusResult = OperationResultStatusEnum.successfully; | |
| return await dao.isView(tableName); | |
| } catch (e) { | |
| operationStatusResult = OperationResultStatusEnum.unsuccessfully; | |
| return await dao.isView(tableName); | |
| } catch (e) { |
… data access object
getSchemaHashmethod across various data access objects to compute and cache schema hashes.DEFAULT_SCHEMA_HASH_CACHE_OPTIONSin caching constants for schema hash caching.LRUStorageto manage schema changes effectively.OperationTypeEnumto includegetSchemaHashoperation type.validateSchemaCacheutility function to validate schema cache across data access objects.