implement timestamp correction utilities and tests for git commit data #3518

Open
shlokgilda wants to merge 2 commits into chaoss:main from shlokgilda:fix/issue-3472-validate-committer-timestamps

@shlokgilda
Collaborator

Description

Fixes #3472 - PostgreSQL insertion failures due to invalid committer timestamps.

The Bug:
facade_bulk_insert_commits() validated only cmt_author_timestamp when handling DB insertion failures. If the author timestamp was valid but the committer timestamp had an invalid timezone offset (e.g., -13068837 from git corruption), the insert would fail permanently.

The Fix:

  • Created augur/tasks/git/correction.py with timezone validation functions built on simple string operations
  • Modified augur/application/db/lib.py to validate both the author and committer timestamps once the binary search isolates a problematic commit
  • Falls back to the author timestamp when the committer timestamp is invalid, to minimize data loss
  • Validates only on insertion failure, which keeps the existing performance optimization intact
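
For readers without the diff open, here is a minimal sketch of what string-based timezone validation could look like. It is not the code in correction.py: the function names are hypothetical, the timestamp layout `YYYY-MM-DD HH:MM:SS ±HHMM` is an assumption, and a range check stands in for the PR's POSTGRES_VALID_TIMEZONES collection, whose contents are not shown here.

```python
# Hypothetical helpers; names and timestamp layout are assumptions, not the PR's code.

def split_offset(timestamp: str) -> tuple[str, str]:
    """Split 'YYYY-MM-DD HH:MM:SS +HHMM' into (date-time part, offset)."""
    base, _, offset = timestamp.strip().rpartition(" ")
    return base, offset


def is_valid_offset(offset: str) -> bool:
    """Return True when offset looks like +HHMM or -HHMM within -12:00..+14:00."""
    if len(offset) != 5 or offset[0] not in "+-":
        return False
    digits = offset[1:]
    if not (digits.isascii() and digits.isdigit()):
        return False
    hours, minutes = int(digits[:2]), int(digits[2:])
    if minutes > 59:
        return False
    # Assumed bounds, taken from the -12:00 to +14:00 range quoted in this PR.
    limit = 12 * 60 if offset[0] == "-" else 14 * 60
    return hours * 60 + minutes <= limit


def has_valid_timezone(timestamp: str) -> bool:
    """Check only the trailing offset; the date-time part is not parsed."""
    base, offset = split_offset(timestamp)
    return bool(base) and is_valid_offset(offset)
```

With these assumptions, a corrupted offset such as -13068837 is rejected on length alone, before any numeric comparison happens.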

Changes:

  1. NEW: augur/tasks/git/correction.py - Timestamp validation utilities
  2. MODIFIED: augur/application/db/lib.py - Now validates both timestamps on failure
  3. NEW: tests/test_tasks/test_git/test_correction.py - 12 unit tests (all passing)
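
The "validate only on failure" flow in lib.py can be pictured roughly as follows. This is a sketch under stated assumptions rather than the actual facade_bulk_insert_commits(): `insert_batch` and `repair` are hypothetical callables, and `ValueError` stands in for whatever database error the real insert raises.

```python
from typing import Callable

Record = dict


def insert_with_bisect(
    records: list[Record],
    insert_batch: Callable[[list[Record]], None],  # hypothetical: inserts all or raises
    repair: Callable[[Record], Record],            # hypothetical: fixes one bad record
) -> None:
    """Try a bulk insert; on failure, halve the batch until the bad record is isolated."""
    if not records:
        return
    try:
        insert_batch(records)
    except ValueError:  # stand-in for the database error
        if len(records) > 1:
            mid = len(records) // 2
            insert_with_bisect(records[:mid], insert_batch, repair)
            insert_with_bisect(records[mid:], insert_batch, repair)
        else:
            # A single failing record: repair it, then retry once.
            insert_batch([repair(records[0])])
```

The happy path costs one bulk insert and no validation at all; the repair step only runs for records that the database has already rejected.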

Notes for Reviewers

  • The fix maintains the existing binary search optimization - we only validate timestamps when PostgreSQL rejects an insert, not proactively
  • POSTGRES_VALID_TIMEZONES now includes all real-world timezone offsets (-12:00 to +14:00); some offsets were previously missing. Reference: https://docs.oracle.com/cd/E19563-01/819-4437/anovd/index.html
  • Fallback chain: invalid committer → use author timestamp → UTC (minimizes data loss per issue discussion)
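
One way to read the fallback chain above, as a sketch: if the committer offset is invalid, copy the author timestamp; if the author offset is invalid too, rewrite it to UTC first. The helper names, the `±HHMM` layout, and the range check are assumptions for illustration and may differ from what correction.py does.

```python
# Hypothetical helpers illustrating the fallback order; not the PR's implementation.

def _offset_ok(timestamp: str) -> bool:
    """True when the trailing offset is +HHMM/-HHMM within -12:00..+14:00 (assumed bounds)."""
    offset = timestamp.strip().rpartition(" ")[2]
    if len(offset) != 5 or offset[0] not in "+-":
        return False
    digits = offset[1:]
    if not (digits.isascii() and digits.isdigit()):
        return False
    hours, minutes = int(digits[:2]), int(digits[2:])
    limit = 12 * 60 if offset[0] == "-" else 14 * 60
    return minutes < 60 and hours * 60 + minutes <= limit


def _force_utc(timestamp: str) -> str:
    """Keep the date-time part and replace the offset with +0000.

    Assumes the 'YYYY-MM-DD HH:MM:SS <offset>' layout.
    """
    base = timestamp.strip().rpartition(" ")[0]
    return f"{base} +0000"


def correct_commit_timestamps(record: dict) -> dict:
    """Return a copy of the record with both timestamps made insertable."""
    fixed = dict(record)
    author = fixed["cmt_author_timestamp"]
    if not _offset_ok(author):
        author = _force_utc(author)      # last resort: UTC
    committer = fixed["cmt_committer_timestamp"]
    if not _offset_ok(committer):
        committer = author               # prefer the author timestamp
    fixed["cmt_author_timestamp"] = author
    fixed["cmt_committer_timestamp"] = committer
    return fixed
```

A record whose timestamps are both valid passes through unchanged, so applying the correction to an isolated record is safe even when the failure had another cause.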

Signed commits

  • Yes, I signed my commits.

AI Disclosure: I used Claude Code to assist with writing docstrings, code review, writing test cases, and drafting this PR description. All test cases were manually verified and are passing locally.

@shlokgilda requested a review from sgoggins as a code owner on January 7, 2026 07:46
@shlokgilda force-pushed the fix/issue-3472-validate-committer-timestamps branch from a354293 to 62f1c29 on January 7, 2026 16:48

Development

Successfully merging this pull request may close these issues.

facade_bulk_insert_commits should also correct cmt_committer_timestamp
