Conversation
Previously the test mocks presented commit IDs in chronological order. But GC code actually receives commit IDs in lexicographic order -- which is indistinguishable from random.
There was a problem hiding this comment.
Pull request overview
This PR aims to clean up GC (Garbage Collection) tests by improving resource management and test data randomization. However, there are two critical bugs that prevent the code from working correctly.
Key Changes:
- Moved iterator cleanup to use defer pattern in active_commits_test.go
- Added ID scrambling functionality to avoid alphabetical ordering bias in tests
- Attempted to reposition Close() call in starting_point_iterator_test.go
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/graveler/retention/starting_point_iterator_test.go | Attempted to move Close() call but positioned it incorrectly, breaking the test |
| pkg/graveler/retention/active_commits_test.go | Added test data scrambling with fingerprint function and improved resource cleanup with defer, but fingerprint implementation has a critical hash computation bug |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
The iterator is being closed before it's used. Moving it.Close() to line 74 (before the for it.Next() loop on line 76) will cause the test to fail because the iterator will be closed before any data is read from it. The Close() call should remain after the loop completes (after line 92).
| it.Close() | |
| defer it.Close() |
There was a problem hiding this comment.
The fingerprint function incorrectly uses h.Sum([]byte(s)) on line 177. The Sum method appends the current hash to the provided byte slice; it doesn't hash the data. To hash the string, use h.Write([]byte(s)) instead. The current implementation will produce incorrect fingerprints as it's not actually hashing the input string.
| h.Sum([]byte(s)) | |
| h.Write([]byte(s)) |
Neither breaks anything... but both definitely improve tests and/or readability.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| headsRetentionDays map[string]int32 | ||
| expectedActiveIDs []string | ||
| } | ||
|
|
There was a problem hiding this comment.
[nitpick] The fingerprint function lacks documentation explaining its purpose and behavior. Consider adding a comment describing that it creates a deterministic hash-prefixed string to scramble commit IDs for testing purposes.
Suggested addition:
// fingerprint creates a hash-prefixed version of the input string to scramble
// commit IDs in tests, ensuring they are not alphabetically ordered in a way
// that could accidentally match their time ordering.
func fingerprint[T ~string](s T) T {| // fingerprint creates a hash-prefixed version of the input string to scramble | |
| // commit IDs in tests, ensuring they are not alphabetically ordered in a way | |
| // that could accidentally match their time ordering. |
| fingerprint := h.Sum64() | ||
| return T(fmt.Sprintf("%016x-%s", fingerprint, s)) | ||
| } | ||
|
|
There was a problem hiding this comment.
[nitpick] The scrambleIDs function lacks documentation explaining its purpose. Consider adding a comment describing that it applies fingerprinting to all IDs in the test data structure.
Suggested addition:
// scrambleIDs applies fingerprinting to all commit IDs, head IDs, and expected
// active IDs in the test data to ensure they are not alphabetically ordered.
func scrambleIDs(tst testRepoCommits) testRepoCommits {| // scrambleIDs applies fingerprinting to all commit IDs, head IDs, and expected | |
| // active IDs in the test data to ensure they are not alphabetically ordered. |
| for idx, parent := range commit.parents { | ||
| commit.parents[idx] = fingerprint(parent) | ||
| } |
There was a problem hiding this comment.
The commit.parents slice is being modified, but since commit is a value copy from the range loop, these modifications won't be reflected in the commit value stored in the commits map on line 188. The parents should be modified before storing the commit.
Suggested fix:
for id, commit := range tst.commits {
parents := make([]graveler.CommitID, len(commit.parents))
for idx, parent := range commit.parents {
parents[idx] = fingerprint(parent)
}
commit.parents = parents
commits[fingerprint(id)] = commit
}| for idx, parent := range commit.parents { | |
| commit.parents[idx] = fingerprint(parent) | |
| } | |
| parents := make([]graveler.CommitID, len(commit.parents)) | |
| for idx, parent := range commit.parents { | |
| parents[idx] = fingerprint(parent) | |
| } | |
| commit.parents = parents |
| fingerprint := h.Sum64() | ||
| return T(fmt.Sprintf("%016x-%s", fingerprint, s)) |
There was a problem hiding this comment.
The local variable fingerprint shadows the function name, which can be confusing. Consider renaming the variable to something like hash or fpHash.
Suggested fix:
func fingerprint[T ~string](s T) T {
h := fnv.New64a()
h.Write([]byte(s)) // Write cannot fail for a hash func.
hash := h.Sum64()
return T(fmt.Sprintf("%016x-%s", hash, s))
}| fingerprint := h.Sum64() | |
| return T(fmt.Sprintf("%016x-%s", fingerprint, s)) | |
| hash := h.Sum64() | |
| return T(fmt.Sprintf("%016x-%s", hash, s)) |
| func fingerprint[T ~string](s T) T { | ||
| h := fnv.New64a() | ||
| h.Write([]byte(s)) // Write cannot fail for a hash func. | ||
| fingerprint := h.Sum64() | ||
| return T(fmt.Sprintf("%016x-%s", fingerprint, s)) | ||
| } |
There was a problem hiding this comment.
nit; can keep it simple while accepting strings
No description provided.