feat: vibes branch consolidation (48 commits) #26
Conversation
- Add configuration limits and validation system with automatic enforcement
- Implement database schema versioning and migration system (v0 -> v2)
- Add improved action tracking with display IDs (P1a2b3c format) and metadata
- Update wiki table format with new ID column for better content tracking
- Add error handling improvements for continuous mode with exponential backoff
- Add new CLI options: --show-config-limits, --force-migrate
- Update configuration template with new defaults and additional options
- Update README with comprehensive documentation for all new features
- Replace OOP architecture with simpler functional approach
- Add comprehensive database queries for content lifecycle tracking

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Auto-updates config file when new defaults are available
- Creates backup before making changes (.backup extension)
- Preserves existing user settings
- Add --no-auto-update-config flag to disable if needed
- Update documentation with auto-update information
- Handles errors gracefully with fallback to in-memory defaults
- Fix timestamp handling for both datetime objects and Unix timestamps
- Fix moderator name handling for both string and object types
- Fix target attribute handling to prevent 'str' object has no attribute 'name' errors
- Add moderator name censoring: AutoMod, Reddit, HumanModerator
- Add configurable wiki_actions filter (default: removals and removal reasons only)
- Preserve original wiki content formatting
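A minimal sketch of the type-flexible handling this commit describes; the helper names are illustrative, not the actual functions in `modlog_wiki_publisher.py`:

```python
from datetime import datetime, timezone

def normalize_timestamp(value):
    """Accept either a datetime object or a Unix timestamp (int/float)."""
    if isinstance(value, datetime):
        return value.timestamp()
    return float(value)

def normalize_moderator(mod):
    """Accept either a plain string or a PRAW Redditor-like object with a .name attribute."""
    if isinstance(mod, str):
        return mod
    # Falling back to str() avoids the 'str' object has no attribute 'name' class of error
    return getattr(mod, 'name', str(mod))
```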
…anonymization
- Convert inquiry URLs to markdown format ([Inquire](url) instead of plain URLs)
- Add configurable moderator anonymization (anonymize_moderators setting)
- Store removal reasons from Reddit API in database (new removal_reason column)
- Update database schema to version 3 with proper migration
- Update CLAUDE.md and README.md with new features and configuration options
- Process removal reasons properly by stripping whitespace before storage
- Update database query in get_recent_actions_from_db to include removal_reason column
- Use actual removal reason content in MockAction instead of hardcoded text
- Add test script to verify removal reason processing without Reddit API calls
- Fix database schema query to use created_at instead of timestamp for consistency
- Cache SHA-256 hashes of wiki content per subreddit/page in database
- Skip Reddit API calls when content hasn't changed
- Add --force option to bypass hash check when needed
- Add v4 database migration for wiki_hash_cache table
- Update all wiki update calls to use hash caching
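The hash-cache flow might look like the sketch below. The `wiki_hash_cache` table name comes from this commit, but the column names and function signatures here are assumptions:

```python
import hashlib
import sqlite3

def content_hash(text):
    """SHA-256 of the rendered wiki markdown."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def should_update_wiki(conn, subreddit, page, new_content, force=False):
    """Return True when the wiki edit API call is needed; skip it when the cached hash matches."""
    new_hash = content_hash(new_content)
    row = conn.execute(
        "SELECT content_hash FROM wiki_hash_cache WHERE subreddit = ? AND page = ?",
        (subreddit, page),
    ).fetchone()
    if row and row[0] == new_hash and not force:
        return False  # content unchanged -> no Reddit API call
    conn.execute(
        "INSERT OR REPLACE INTO wiki_hash_cache (subreddit, page, content_hash) VALUES (?, ?, ?)",
        (subreddit, page, new_hash),
    )
    return True
```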
- Add subreddit column to database to prevent mixing modlogs from different subreddits
- Fix ID generation to never use user IDs, always use post/comment/action IDs
- Add email address censoring in removal reasons
- Update database schema to version 5
- Fix get_recent_actions_from_db to filter by subreddit
- Update store_processed_action to include subreddit context
…safety
- Fixed removal reasons showing as numbers by checking mod_note field
- Fixed display IDs to use permalinks instead of user IDs
- Added strict subreddit validation to prevent mixed subreddit wiki corruption
- Enhanced target ID extraction to get actual post/comment IDs
- Added bot attribution footer to wiki pages
- Improved permalink generation for better content linking
- Fix case-insensitive subreddit matching in database queries
- Add subreddit extraction from permalinks
- Simplify content link formatting to match main branch approach
- Use actual Reddit permalinks for removed content
- Add database update function for missing subreddit entries
- Improve safety checks to prevent mixed subreddit data
- Force-refresh now fetches all modlog actions from Reddit to populate database
- Store ALL action types in database but only show removal actions in wiki
- Separate database population from wiki display logic
- Ensure comprehensive modlog data collection while maintaining wiki focus

Testing with Usenet subreddit
- Remove user profile fallback links - NEVER link to user profiles
- Prioritize actual content permalinks (posts/comments) over user profiles
- Implement proper content ID extraction for markdown ID field
- Align modmail inquiry format with main branch (detailed prefilled message)
- Add content link guidelines to CLAUDE.md
- Fix case sensitivity in subreddit validation
…profile links
- ID field now shows actual Reddit content IDs (t3_abc123, t1_def456) instead of user-based IDs
- Content links only point to actual removed posts/comments, never user profiles
- Show '-' for ID field when no content ID available (unlinkable actions)
- Remove action ID fallback - only use actual content IDs for linking removals
- Enable proper linking of removal actions with their removal reasons via content ID
- Use main branch logic for comment/post detection (permalink slash count)
- Show actual titles when available
- Show 'Comment by u/username' for comments without titles
- Show 'Post by u/username' for posts without titles
- Only create links for actual content URLs, not user profiles
- Maintain main branch content field behavior while adding ID column
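The slash-count heuristic mentioned above can be sketched as follows; the exact threshold and helper name are assumptions for illustration:

```python
def describe_target(permalink, title, author):
    """Classify a modlog target as post or comment by permalink depth (slash-count heuristic)."""
    if title:
        return title
    # /r/<sub>/comments/<post_id>/<slug>/              -> post (5 slashes after rstrip)
    # /r/<sub>/comments/<post_id>/<slug>/<comment_id>/ -> comment (6 slashes after rstrip)
    depth = permalink.rstrip("/").count("/")
    if depth >= 6:
        return f"Comment by u/{author}"
    return f"Post by u/{author}"
```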
- Always prioritize mod_note (actual removal reason text) over numeric details
- Show 'Removal reason applied' instead of 'Removal reason #7' for numeric details
- Improve ID extraction to handle more cases (though many will still be '-' due to Reddit API limitations)
- Apply consistent removal reason handling in both storage and display functions
- Add ID column back for tracking actions across the table
- Fix removal reason prioritization to show text first, not numbers
- For addremovalreason actions, use mod_note instead of details
- Extract short content IDs from permalinks for table tracking
- Ensure AutoModerator shows correctly (not HumanModerator)
- Content links point to actual posts/comments, not user profiles
- Update README.md with current wiki output format examples
- Add section on recent improvements including content linking fixes
- Update database query examples to include subreddit column
- Document intelligent removal reason handling (text over numbers)
- Add multi-subreddit support documentation
- Update CLAUDE.md with new database operations and schema info
- Document content ID extraction and tracking features
- Include content ID in modmail subject line: [ID: 1mkz4jm]
- Add Content ID field to modmail body for moderator reference
- Enables moderators to quickly cross-reference inquiries with modlog entries
- Improves modmail workflow and response efficiency
- Replace --force with more intuitive options:
  - --force-modlog: Fetch ALL from Reddit API and rebuild
  - --force-wiki: Force wiki update (bypass cache)
  - --force-all: Do both (replaces old --force)
- Update documentation with clear explanations and examples
- Add usage guidance for when to use each force option
- Maintain backwards compatibility during transition
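A rough sketch of how the three flags could be wired up with `argparse`. The flag names come from the commit; the implication logic (`--force-all` turning on both narrower modes) is an assumption based on the description:

```python
import argparse

def build_parser():
    """Minimal parser exposing the three force modes described above."""
    p = argparse.ArgumentParser(prog="modlog_wiki_publisher")
    p.add_argument("--force-modlog", action="store_true",
                   help="Fetch ALL modlog actions from the Reddit API and rebuild the database")
    p.add_argument("--force-wiki", action="store_true",
                   help="Force a wiki update, bypassing the content-hash cache")
    p.add_argument("--force-all", action="store_true",
                   help="Equivalent to --force-modlog plus --force-wiki (replaces the old --force)")
    return p

def resolve_force_flags(args):
    """--force-all implies both narrower modes."""
    force_modlog = args.force_modlog or args.force_all
    force_wiki = args.force_wiki or args.force_all
    return force_modlog, force_wiki
```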
Removes the erroneous error condition that prevented the application from working with databases containing multiple subreddits. The application now properly supports multi-subreddit databases by filtering results by the requested subreddit while providing informational logging about available subreddits.
Updates the database query to include target_author column and properly populates MockAction objects with actual usernames from the database, preventing the fallback to [deleted] placeholder text.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…sage

Changes action.description to action.details (Reddit API field) and updates logic to only show generic message for removal reason template numbers 1-15, allowing actual removal reason text to display properly.
Removes logic that hides removal reason template numbers (1-15) behind generic "Removal reason applied" message. Now shows all available removal reason data including template numbers for full transparency.
- Reorder regex patterns to extract comment IDs before post IDs, fixing duplicate ID issue
- Fix display logic to use action.details consistently with storage logic
- Ensures removal reasons show properly in wiki output
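The ordering issue can be illustrated with simplified patterns (these are not necessarily the project's exact regexes). A comment permalink contains the post ID too, so if the post pattern runs first, every comment inherits its post's ID and you get duplicates:

```python
import re

def extract_content_id(permalink):
    """Extract a t1_/t3_ style ID from a permalink; the comment pattern must run first."""
    if not permalink:
        return None
    # Comment: /comments/<post_id>/<slug>/<comment_id>
    m = re.search(r'/comments/[a-z0-9]+/[^/]+/([a-z0-9]+)', permalink)
    if m:
        return f"t1_{m.group(1)}"
    # Post: /comments/<post_id>
    m = re.search(r'/comments/([a-z0-9]+)', permalink)
    if m:
        return f"t3_{m.group(1)}"
    return None
```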
- Update README.md with latest improvements and multi-subreddit support
- Update CLAUDE.md with v2.1 improvements summary
- Fix --force-wiki to rebuild from database without API calls
- Ensure --force-wiki always recreates wiki using existing data
- Reorganized sections for clarity and logical flow
- Added prominent venv path requirement
- Grouped configuration options by category
- Enhanced development guidelines section
- Improved formatting and structure
…tead of template numbers
- For addremovalreason actions, use description field which contains actual text like 'Invites - No asking'
- Fixes template numbers (6, 9, etc) showing instead of meaningful removal reasons
- Maintains existing logic for other action types using mod_note and details fields
- Improves removal reason transparency for manual moderator actions
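The field-priority logic accumulated across these commits reads roughly like the sketch below; the helper name and any fallback order beyond what the commits state are assumptions:

```python
def resolve_removal_reason(action):
    """Pick the most meaningful removal reason text from a modlog action.

    `action` is any object with optional action / description / mod_note / details attributes.
    """
    if getattr(action, "action", "") == "addremovalreason":
        # description holds the actual reason text, e.g. 'Invites - No asking'
        desc = (getattr(action, "description", "") or "").strip()
        if desc:
            return desc
    # Other action types: prefer mod_note text over (possibly numeric) details
    for field in ("mod_note", "details"):
        value = (getattr(action, field, "") or "").strip()
        if value:
            return value
    return None
```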
- Updated sample wiki output to show actual removal reasons
- Enhanced database schema documentation with transparency details
- Updated features to highlight complete removal reason transparency
- Shows AutoModerator rule text and addremovalreason descriptions
- Documents unique content ID tracking improvements
- Add missing BASE_BACKOFF_WAIT and MAX_BACKOFF_WAIT constants for exponential backoff
- Initialize first_run_force variable in continuous mode function
- Add sanitize_for_markdown() helper function to eliminate code duplication
- Replace all instances of pipe character replacement with consistent helper function
- Improves code maintainability and prevents NameError exceptions
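Illustrative versions of the helpers named in this commit; the constant values and the particular pipe-escape sequence are assumptions, not the project's actual choices:

```python
BASE_BACKOFF_WAIT = 30    # seconds; illustrative value
MAX_BACKOFF_WAIT = 1800   # cap the wait at 30 minutes (assumption)

def backoff_wait(failure_count):
    """Exponential backoff: doubles per consecutive failure, capped at the maximum."""
    return min(BASE_BACKOFF_WAIT * (2 ** failure_count), MAX_BACKOFF_WAIT)

def sanitize_for_markdown(text):
    """Escape pipe characters so user-supplied text cannot break wiki table rows."""
    return text.replace("|", "&#124;") if text else text
```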
- Add get_config_with_default() helper to eliminate code duplication
- Replace 7 instances of config.get(key, CONFIG_LIMITS[key]['default']) pattern
- Improve code maintainability and consistency
- Add validation for unknown config keys
- Reduces repetitive nested dictionary access throughout codebase
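A sketch of the helper pattern; the two limit entries shown are examples using the defaults discussed later in this review (retention_days=90, batch_size=50), and the table shape is an assumption:

```python
# Hypothetical limits table mirroring the CONFIG_LIMITS pattern described above
CONFIG_LIMITS = {
    "retention_days": {"default": 90, "min": 1, "max": 365},
    "batch_size": {"default": 50, "min": 1, "max": 500},
}

def get_config_with_default(config, key):
    """Single access point replacing repeated config.get(key, CONFIG_LIMITS[key]['default'])."""
    if key not in CONFIG_LIMITS:
        raise KeyError(f"Unknown config key: {key}")
    return config.get(key, CONFIG_LIMITS[key]["default"])
```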
Fixed incorrect variable name 'timestamp' to 'created_at' to match the database schema. This was causing force-wiki operations to fail with "table processed_actions has no column named timestamp" error.
Added TRACE level (level 5) logging for prawcore and urllib3 when debug mode is enabled, providing detailed Reddit API request/response information for troubleshooting authentication and API issues.
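Python's `logging` module supports registering custom levels below DEBUG, so the change likely looks something like this sketch (function name is illustrative):

```python
import logging

TRACE = 5  # numerically below DEBUG (10)

def enable_trace_logging():
    """Register a TRACE level and apply it to the HTTP-layer loggers named above."""
    logging.addLevelName(TRACE, "TRACE")
    for name in ("prawcore", "urllib3"):
        logging.getLogger(name).setLevel(TRACE)
```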
…ew ones

Modified regular operation logic to rebuild wiki from ALL relevant actions in database (within retention period) rather than only new actions, matching the behavior of force operations for consistent wiki display.
Walkthrough

This PR refactors the modlog publisher from object-oriented classes to standalone functions, introducing database schema versioning (v5), wiki content hash caching to skip redundant edits, removal reason transparency with email censoring, multi-subreddit support, enhanced CLI force modes, continuous daemon operation, and expanded documentation with practical examples.

Changes: v2.1 Modlog Publisher Refactor
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks: ✅ 3 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
README.md (1)
37-40: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Wrong template filename — `config.template.json` does not exist; it's `config_template.json`.

The repo ships `config_template.json` (underscore), as referenced in `CLAUDE.md` line 25. The dotted form here will give first-time users a `cp: cannot stat 'config.template.json'` error.

📝 Proposed fix

```diff
-cp config.template.json config.json
+cp config_template.json config.json
```

As per coding guidelines: "Always update commands and flags in documentation".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 37 - 40, The README has the wrong template filename; replace the cp command that references "config.template.json" with the correct "config_template.json" (underscore) so the copy command succeeds and remains consistent with CLAUDE.md; search for any other occurrences of "config.template.json" and update them to "config_template.json" to keep documentation consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CLAUDE.md`:
- Around line 65-71: Update the documented default flag values in the Core
Options section to match the runtime defaults defined by CONFIG_LIMITS in
modlog_wiki_publisher.py: change `--retention-days` default from 30 to 90,
`--batch-size` from 100 to 50, and `--interval` from 300 to 600 (leave
`--wiki-page` default "modlog" and other flags as-is); ensure the CLI docs in
this section now align with the values used by CONFIG_LIMITS and the README
table.
In `@modlog_wiki_publisher.py`:
- Around line 935-948: When cached_hash equals content_hash in update_wiki_page,
change the force=True path to use a warning-level log and, if running
interactively (check sys.stdin.isatty()), prompt the user to confirm the forced
update before proceeding; if the prompt is declined, return False. Keep the
non-force path unchanged (skip update). Refer to update_wiki_page,
get_content_hash, get_cached_wiki_hash and logger when making the change and
gate interactive confirmation on stdin being a TTY.
- Around line 1240-1250: CLI overrides currently write args
(args.retention_days, args.batch_size, args.interval) directly into the config
dict, bypassing CONFIG_LIMITS validation; after applying the CLI overrides to
config, call apply_config_defaults_and_limits(config) (or individually run
validate_config_value for each overridden key) so the values are
clamped/validated per CONFIG_LIMITS before the config is used by
load_config/SQL/PRAW.
- Around line 306-312: In censor_email_addresses, remove the redundant local
"import re" and correct the regex so the TLD character class doesn't include a
literal pipe; replace the pattern
r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b' with a proper
case-insensitive / letter-only TLD class (e.g. use [A-Za-z]{2,} or apply
re.IGNORECASE) in the re.sub call so the function only matches valid alphabetic
TLDs.
- Around line 414-448: The function get_target_permalink currently uses a bare
except: pass which swallows KeyboardInterrupt/SystemExit and programming errors;
replace the bare except with a specific Exception handler (e.g. except Exception
as e) and log the error (use your module logger or logging.exception) including
context (e.g. action id/attributes) before returning None; ensure you do not
catch SystemExit/KeyboardInterrupt (either re-raise them or only catch
Exception) and remove the silent pass so issues in building permalinks from
action.target_submission, action.target_comment or action.target_permalink are
visible in logs.
- Around line 478-543: The code lazily adds the target_author column on every
write (store_processed_action) which causes schema races and breaks fresh DB
reads; instead, bump CURRENT_DB_VERSION to 6 and add a migration in
migrate_database() that runs "ALTER TABLE processed_actions ADD COLUMN
target_author TEXT" for v5→v6 upgrades so the schema is deterministic. Remove
the runtime PRAGMA/ALTER logic in store_processed_action (the
cursor.execute("PRAGMA table_info(processed_actions)") and the conditional block
that adds target_author) and assume the column exists when inserting; keep the
extraction logic that sets the target_author value and write it as before.
- Around line 577-600: In cleanup_old_entries replace the convoluted timestamp
calculation int((datetime.now() - datetime.fromtimestamp(0)).total_seconds())
with int(time.time()) and remove the dead branch that checks if retention_days
<= 0 (since validate_config_value already guarantees retention_days >= 1); also
make the identical change in get_recent_actions_from_db so both use
int(time.time()) when computing cutoff_timestamp, and keep references to
DB_PATH, processed_actions, and created_at unchanged.
In `@README.md`:
- Around line 200-202: The README's example SQL uses LIKE '%[0-9]%' which treats
'[0-9]' as a literal string in SQLite; change the filter to use GLOB (or a
regexp if available) so it truly checks for digits. Replace the condition
"removal_reason NOT LIKE '%[0-9]%'" with "removal_reason NOT GLOB '*[0-9]*'"
(keeping the other clause removal_reason != 'remove' and the same SELECT/WHERE/
LIMIT structure on processed_actions.removal_reason) so the query excludes any
removal_reason containing digits as intended.
In `@test_removal_reasons.py`:
- Around line 14-57: The test currently imports DB_PATH by value via "from
modlog_wiki_publisher import *" so reassigning local global DB_PATH doesn't
affect the publisher functions; change the import to import the module (e.g.,
import modlog_wiki_publisher) and set modlog_wiki_publisher.DB_PATH = test_db
before calling setup_database(), store_processed_action(),
get_recent_actions_from_db(), etc., and update all calls to use
modlog_wiki_publisher.setup_database,
modlog_wiki_publisher.store_processed_action,
modlog_wiki_publisher.get_recent_actions_from_db (or ensure those functions
still reference the module-level DB_PATH) so the test operates against the test
database rather than the production DB.
---
Outside diff comments:
In `@README.md`:
- Around line 37-40: The README has the wrong template filename; replace the cp
command that references "config.template.json" with the correct
"config_template.json" (underscore) so the copy command succeeds and remains
consistent with CLAUDE.md; search for any other occurrences of
"config.template.json" and update them to "config_template.json" to keep
documentation consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2f48b77a-32cf-4bfd-b626-894f6921bd03
📒 Files selected for processing (5)
CLAUDE.md, README.md, config_template.json, modlog_wiki_publisher.py, test_removal_reasons.py
```markdown
### Core Options
- `--source-subreddit`: Target subreddit for reading/writing logs
- `--wiki-page`: Wiki page name (default: "modlog")
- `--retention-days`: Database cleanup period (default: 30)
- `--batch-size`: Entries fetched per run (default: 100)
- `--interval`: Seconds between updates in daemon mode (default: 300)
- `--debug`: Enable verbose logging
```
Documented defaults are stale and disagree with code/README.
CONFIG_LIMITS in modlog_wiki_publisher.py sets retention_days=90, batch_size=50, update_interval=600, and the table in README.md (lines 86–88) reflects those. The values listed here (30, 100, 300) match the pre-refactor defaults and will mislead operators.
📝 Proposed fix
```diff
 - `--source-subreddit`: Target subreddit for reading/writing logs
 - `--wiki-page`: Wiki page name (default: "modlog")
-- `--retention-days`: Database cleanup period (default: 30)
-- `--batch-size`: Entries fetched per run (default: 100)
-- `--interval`: Seconds between updates in daemon mode (default: 300)
+- `--retention-days`: Database cleanup period (default: 90)
+- `--batch-size`: Entries fetched per run (default: 50)
+- `--interval`: Seconds between updates in daemon mode (default: 600)
 - `--debug`: Enable verbose logging
```

As per coding guidelines: "Always update commands and flags in documentation".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```markdown
### Core Options
- `--source-subreddit`: Target subreddit for reading/writing logs
- `--wiki-page`: Wiki page name (default: "modlog")
- `--retention-days`: Database cleanup period (default: 90)
- `--batch-size`: Entries fetched per run (default: 50)
- `--interval`: Seconds between updates in daemon mode (default: 600)
- `--debug`: Enable verbose logging
```
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 65-65: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@CLAUDE.md` around lines 65 - 71, Update the documented default flag values in
the Core Options section to match the runtime defaults defined by CONFIG_LIMITS
in modlog_wiki_publisher.py: change `--retention-days` default from 30 to 90,
`--batch-size` from 100 to 50, and `--interval` from 300 to 600 (leave
`--wiki-page` default "modlog" and other flags as-is); ensure the CLI docs in
this section now align with the values used by CONFIG_LIMITS and the README
table.
```python
def censor_email_addresses(text):
    """Censor email addresses in removal reasons"""
    if not text:
        return text
    import re
    # Replace email addresses with [EMAIL]
    return re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b', '[EMAIL]', text)
```
Email regex includes a literal `|` in the TLD class.

`[A-Z|a-z]{2,}` is a character class that allows A-Z, the literal `|`, and a-z. It happens to still match real emails (because real TLDs are letters), but the intent is `[A-Za-z]`, and the current form will also accept noise like `foo@bar.||a` as a TLD tail. Also, `re` is already imported at module level, so the `import re` inside the function is redundant.
🛡️ Proposed fix
```diff
-def censor_email_addresses(text):
-    """Censor email addresses in removal reasons"""
-    if not text:
-        return text
-    import re
-    # Replace email addresses with [EMAIL]
-    return re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b', '[EMAIL]', text)
+def censor_email_addresses(text):
+    """Censor email addresses in removal reasons"""
+    if not text:
+        return text
+    return re.sub(r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b', '[EMAIL]', text)
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modlog_wiki_publisher.py` around lines 306 - 312, In censor_email_addresses,
remove the redundant local "import re" and correct the regex so the TLD
character class doesn't include a literal pipe; replace the pattern
r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b' with a proper
case-insensitive / letter-only TLD class (e.g. use [A-Za-z]{2,} or apply
re.IGNORECASE) in the re.sub call so the function only matches valid alphabetic
TLDs.
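The flaw described in this comment can be demonstrated directly. The two patterns below are copied from the diff above; `FLAWED`, `FIXED`, and `censor` are names used here only for illustration:

```python
import re

FLAWED = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b'  # literal | inside the TLD class
FIXED = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'    # letters-only TLD class

def censor(text, pattern):
    """Replace anything matching the given email pattern with [EMAIL]."""
    return re.sub(pattern, '[EMAIL]', text)
```

Both patterns censor a normal address, but only the flawed one also swallows pipe-laden noise after the final dot.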
```python
def get_target_permalink(action):
    """Get permalink for the target content - prioritize actual content over user profiles"""
    # Check if we have a cached permalink from database
    if hasattr(action, 'target_permalink_cached') and action.target_permalink_cached:
        return action.target_permalink_cached

    try:
        # Priority 1: get actual post/comment permalinks from Reddit API
        if hasattr(action, 'target_submission') and action.target_submission:
            if hasattr(action.target_submission, 'permalink'):
                return f"https://reddit.com{action.target_submission.permalink}"
            elif hasattr(action.target_submission, 'id'):
                # Construct permalink from submission ID
                return f"https://reddit.com/comments/{action.target_submission.id}/"
        elif hasattr(action, 'target_comment') and action.target_comment:
            if hasattr(action.target_comment, 'permalink'):
                return f"https://reddit.com{action.target_comment.permalink}"
            elif hasattr(action.target_comment, 'id') and hasattr(action.target_comment, 'submission'):
                # For comments, construct proper permalink with submission ID
                return f"https://reddit.com/comments/{action.target_comment.submission.id}/_/{action.target_comment.id}/"
            elif hasattr(action.target_comment, 'id'):
                # Fallback for comment without submission info
                return f"https://reddit.com/comments/{action.target_comment.id}/"

        return result

    def fetch_modlog_entries(self, limit: int = 100) -> List[Dict]:
        """Fetch and process modlog entries with rate limit handling"""
        subreddit = self.reddit.subreddit(self.config['source_subreddit'])
        sub_logger = self.get_subreddit_logger(self.config['source_subreddit'])
        entries = []

        sub_logger.info("Starting to fetch modlog entries, limit: %s", limit)
        try:
            for entry in subreddit.mod.log(limit=limit):
                try:
                    processed = self._process_modlog_entry(entry)
                    if processed:
                        processed['subreddit'] = subreddit.display_name
                        entries.append(processed)
                        sub_logger.debug("Processed entry: %s [%s] by %s",
                                         processed['id'], processed['action_type'], processed['moderator'])
                        # Mark as processed
                        self.db.mark_processed(
                            processed['id'],
                            processed['action_type'],
                            processed['timestamp']
                        )
                        self.db.store_entry(processed)
                except praw.exceptions.APIException as e:
                    if e.error_type == "RATELIMIT":
                        # Extract wait time from message
                        import re
                        match = re.search(r'(\d+) minute', str(e))
                        wait_time = int(match.group(1)) * 60 if match else 60
                        sub_logger.warning("Rate limited, waiting %s seconds", wait_time)
                        time.sleep(wait_time)
                    else:
                        raise

        # Sort by timestamp (newest first)
        entries.sort(key=lambda x: x['timestamp'], reverse=True)
        sub_logger.info("Successfully fetched %s modlog entries", len(entries))
        # Priority 2: Try to get content permalink from action.target_permalink if it's not a user profile
        if hasattr(action, 'target_permalink') and action.target_permalink:
            permalink = action.target_permalink
            # Only use if it's actual content (contains /comments/) not user profile (/u/)
            if '/comments/' in permalink and '/u/' not in permalink:
                return f"https://reddit.com{permalink}" if not permalink.startswith('http') else permalink

        # NEVER fall back to user profiles - only link to actual content
    except:
        pass
    return None
```
Bare `except:` plus try/except/pass silently swallows everything (including KeyboardInterrupt).

`get_target_permalink` wraps the whole permalink-building block in a bare `except:` with a `pass`, which:

- Catches `KeyboardInterrupt`/`SystemExit` and hides them.
- Eats programming errors (e.g., a typo in an attribute name) so you'll get silent `None` permalinks instead of a fixable stack trace.

Narrow the catch and at least log it, especially given the coding guideline to "Check for proper error handling and logging".
🛡️ Proposed fix
```diff
-        # NEVER fall back to user profiles - only link to actual content
-    except:
-        pass
-    return None
+        # NEVER fall back to user profiles - only link to actual content
+    except AttributeError as e:
+        logger.debug("Could not resolve permalink for action %s: %s", getattr(action, 'id', '?'), e)
+    return None
```

As per coding guidelines: "Check for proper error handling and logging."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def get_target_permalink(action):
    """Get permalink for the target content - prioritize actual content over user profiles"""
    # Check if we have a cached permalink from database
    if hasattr(action, 'target_permalink_cached') and action.target_permalink_cached:
        return action.target_permalink_cached
    try:
        # Priority 1: get actual post/comment permalinks from Reddit API
        if hasattr(action, 'target_submission') and action.target_submission:
            if hasattr(action.target_submission, 'permalink'):
                return f"https://reddit.com{action.target_submission.permalink}"
            elif hasattr(action.target_submission, 'id'):
                # Construct permalink from submission ID
                return f"https://reddit.com/comments/{action.target_submission.id}/"
        elif hasattr(action, 'target_comment') and action.target_comment:
            if hasattr(action.target_comment, 'permalink'):
                return f"https://reddit.com{action.target_comment.permalink}"
            elif hasattr(action.target_comment, 'id') and hasattr(action.target_comment, 'submission'):
                # For comments, construct proper permalink with submission ID
                return f"https://reddit.com/comments/{action.target_comment.submission.id}/_/{action.target_comment.id}/"
            elif hasattr(action.target_comment, 'id'):
                # Fallback for comment without submission info
                return f"https://reddit.com/comments/{action.target_comment.id}/"

        return result

def fetch_modlog_entries(self, limit: int = 100) -> List[Dict]:
    """Fetch and process modlog entries with rate limit handling"""
    subreddit = self.reddit.subreddit(self.config['source_subreddit'])
    sub_logger = self.get_subreddit_logger(self.config['source_subreddit'])
    entries = []
    sub_logger.info("Starting to fetch modlog entries, limit: %s", limit)
    try:
        for entry in subreddit.mod.log(limit=limit):
            try:
                processed = self._process_modlog_entry(entry)
                if processed:
                    processed['subreddit'] = subreddit.display_name
                    entries.append(processed)
                    sub_logger.debug("Processed entry: %s [%s] by %s",
                                     processed['id'], processed['action_type'], processed['moderator'])
                    # Mark as processed
                    self.db.mark_processed(
                        processed['id'],
                        processed['action_type'],
                        processed['timestamp']
                    )
                    self.db.store_entry(processed)
            except praw.exceptions.APIException as e:
                if e.error_type == "RATELIMIT":
                    # Extract wait time from message
                    import re
                    match = re.search(r'(\d+) minute', str(e))
                    wait_time = int(match.group(1)) * 60 if match else 60
                    sub_logger.warning("Rate limited, waiting %s seconds", wait_time)
                    time.sleep(wait_time)
                else:
                    raise
        # Sort by timestamp (newest first)
        entries.sort(key=lambda x: x['timestamp'], reverse=True)
        sub_logger.info("Successfully fetched %s modlog entries", len(entries))

        # Priority 2: Try to get content permalink from action.target_permalink if it's not a user profile
        if hasattr(action, 'target_permalink') and action.target_permalink:
            permalink = action.target_permalink
            # Only use if it's actual content (contains /comments/) not user profile (/u/)
            if '/comments/' in permalink and '/u/' not in permalink:
                return f"https://reddit.com{permalink}" if not permalink.startswith('http') else permalink
        # NEVER fall back to user profiles - only link to actual content
    except:
        pass
    return None

def get_target_permalink(action):
    """Get permalink for the target content - prioritize actual content over user profiles"""
    # Check if we have a cached permalink from database
    if hasattr(action, 'target_permalink_cached') and action.target_permalink_cached:
        return action.target_permalink_cached
    try:
        # Priority 1: get actual post/comment permalinks from Reddit API
        if hasattr(action, 'target_submission') and action.target_submission:
            if hasattr(action.target_submission, 'permalink'):
                return f"https://reddit.com{action.target_submission.permalink}"
            elif hasattr(action.target_submission, 'id'):
                # Construct permalink from submission ID
                return f"https://reddit.com/comments/{action.target_submission.id}/"
        elif hasattr(action, 'target_comment') and action.target_comment:
            if hasattr(action.target_comment, 'permalink'):
                return f"https://reddit.com{action.target_comment.permalink}"
            elif hasattr(action.target_comment, 'id') and hasattr(action.target_comment, 'submission'):
                # For comments, construct proper permalink with submission ID
                return f"https://reddit.com/comments/{action.target_comment.submission.id}/_/{action.target_comment.id}/"
            elif hasattr(action.target_comment, 'id'):
                # Fallback for comment without submission info
                return f"https://reddit.com/comments/{action.target_comment.id}/"
        # Priority 2: Try to get content permalink from action.target_permalink if it's not a user profile
        if hasattr(action, 'target_permalink') and action.target_permalink:
            permalink = action.target_permalink
            # Only use if it's actual content (contains /comments/) not user profile (/u/)
            if '/comments/' in permalink and '/u/' not in permalink:
                return f"https://reddit.com{permalink}" if not permalink.startswith('http') else permalink
        # NEVER fall back to user profiles - only link to actual content
    except AttributeError as e:
        logger.debug("Could not resolve permalink for action %s: %s", getattr(action, 'id', '?'), e)
    return None
```
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 414-414: Too many return statements (8 > 6)
(PLR0911)
[warning] 425-425: Unnecessary elif after return statement
Remove unnecessary elif
(RET505)
[warning] 431-431: Unnecessary elif after return statement
Remove unnecessary elif
(RET505)
[error] 446-446: Do not use bare except
(E722)
[error] 446-447: try-except-pass detected, consider logging the exception
(S110)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modlog_wiki_publisher.py` around lines 414 - 448, The function
get_target_permalink currently uses a bare except: pass which swallows
KeyboardInterrupt/SystemExit and programming errors; replace the bare except
with a specific Exception handler (e.g. except Exception as e) and log the error
(use your module logger or logging.exception) including context (e.g. action
id/attributes) before returning None; ensure you do not catch
SystemExit/KeyboardInterrupt (either re-raise them or only catch Exception) and
remove the silent pass so issues in building permalinks from
action.target_submission, action.target_comment or action.target_permalink are
visible in logs.
```python
def store_processed_action(action, subreddit_name=None):
    """Store processed action to prevent duplicates"""
    try:
        conn = sqlite3.connect(DB_PATH)
        cursor = conn.cursor()

        # Format title with URL
        if entry['url']:
            title = f"[{entry['title']}]({entry['url']})"
        else:
            title = f"{entry['title']}"

        # Format removal reason
        reason = entry['removal_reason'] or entry['note'] or '-'
        # Process removal reason properly - ALWAYS prefer actual text over numeric details
        removal_reason = None

        # Format inquire link
        if entry['modmail_url']:
            inquire = f"[Contact Mods]({entry['modmail_url']})"
        else:
            inquire = '-'

        # Format time
        time_str = self._format_timestamp(entry['timestamp'])
        return f"| {time_str} | {action} | {moderator} | {title} | {reason} | {inquire} |"

        # For addremovalreason actions, use description field (contains actual text)
        if action.action == 'addremovalreason' and hasattr(action, 'description') and action.description:
            removal_reason = censor_email_addresses(str(action.description).strip())
        # First priority: mod_note (actual removal reason text)
        elif hasattr(action, 'mod_note') and action.mod_note:
            removal_reason = censor_email_addresses(str(action.mod_note).strip())
        # Second priority: details (accept ALL details text, including numbers)
        elif hasattr(action, 'details') and action.details:
            details_str = str(action.details).strip()
            removal_reason = censor_email_addresses(details_str)

        # Extract subreddit from URL if not provided
        target_permalink = get_target_permalink(action)
        if not subreddit_name and target_permalink:
            subreddit_name = extract_subreddit_from_permalink(target_permalink)

        # Add subreddit column if it doesn't exist
        cursor.execute("PRAGMA table_info(processed_actions)")
        columns = [row[1] for row in cursor.fetchall()]
        if 'subreddit' not in columns:
            cursor.execute("ALTER TABLE processed_actions ADD COLUMN subreddit TEXT")

        # Add target_author column if it doesn't exist
        if 'target_author' not in columns:
            cursor.execute("ALTER TABLE processed_actions ADD COLUMN target_author TEXT")

        # Extract target author
        target_author = None
        if hasattr(action, 'target_author') and action.target_author:
            if hasattr(action.target_author, 'name'):
                target_author = action.target_author.name
            else:
                target_author = str(action.target_author)

        cursor.execute("""
            INSERT OR REPLACE INTO processed_actions
            (action_id, action_type, moderator, target_id, target_type,
             display_id, target_permalink, removal_reason, target_author, created_at, subreddit) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            action.id,
            action.action,
            get_moderator_name(action, False),  # Store actual name in database
            extract_target_id(action),
            get_target_type(action),
            generate_display_id(action),
            target_permalink,
            sanitize_for_markdown(removal_reason),  # Store properly processed removal reason
            target_author,
            int(action.created_utc) if isinstance(action.created_utc, (int, float)) else int(action.created_utc.timestamp()),
            subreddit_name or 'unknown'
        ))

        conn.commit()
        conn.close()
    except Exception as e:
        logger.error(f"Error storing processed action: {e}")
        raise
```
target_author is added via lazy ALTER TABLE on every write — move it into migrate_database.
store_processed_action does a PRAGMA table_info + conditional ALTER TABLE ... ADD COLUMN on every call (lines 504–511). Two concrete problems:
- `target_author` is not present in any of the v0→v5 migrations in `migrate_database()`. New deployments on a fresh DB only get this column because the first call to `store_processed_action` adds it. Anything that reads the schema before the first write (e.g., a query tool, or `get_recent_actions_from_db` if no actions have been stored yet) will see a column-less schema and fail. This also means `update_missing_subreddits()` (called from `setup_database`) is the only deterministic path establishing the schema, and it doesn't add `target_author`.
- Doing PRAGMA + ALTER per row is needless overhead on the hot path and creates a schema-mutation race when multiple processes write concurrently.
Bump CURRENT_DB_VERSION to 6 and add a real migration that adds target_author (and removes the runtime ALTER from store_processed_action).
♻️ Proposed fix (sketch)

```diff
-CURRENT_DB_VERSION = 5
+CURRENT_DB_VERSION = 6
@@
+    # Migration from version 5 to 6: Add target_author column
+    if current_version < 6:
+        logger.info("Applying migration: Add target_author column (v5 -> v6)")
+        cursor.execute("PRAGMA table_info(processed_actions)")
+        existing_columns = [row[1] for row in cursor.fetchall()]
+        if 'target_author' not in existing_columns:
+            cursor.execute("ALTER TABLE processed_actions ADD COLUMN target_author TEXT")
+            logger.info("Added column: target_author")
+        set_db_version(6)
@@
-        # Add subreddit column if it doesn't exist
-        cursor.execute("PRAGMA table_info(processed_actions)")
-        columns = [row[1] for row in cursor.fetchall()]
-        if 'subreddit' not in columns:
-            cursor.execute("ALTER TABLE processed_actions ADD COLUMN subreddit TEXT")
-
-        # Add target_author column if it doesn't exist
-        if 'target_author' not in columns:
-            cursor.execute("ALTER TABLE processed_actions ADD COLUMN target_author TEXT")
-
         # Extract target author
```

🧰 Tools
🪛 Ruff (0.15.12)
[warning] 528-528: Boolean positional value in function call
(FBT003)
[warning] 542-542: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
[warning] 542-542: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modlog_wiki_publisher.py` around lines 478 - 543, The code lazily adds the
target_author column on every write (store_processed_action) which causes schema
races and breaks fresh DB reads; instead, bump CURRENT_DB_VERSION to 6 and add a
migration in migrate_database() that runs "ALTER TABLE processed_actions ADD
COLUMN target_author TEXT" for v5→v6 upgrades so the schema is deterministic.
Remove the runtime PRAGMA/ALTER logic in store_processed_action (the
cursor.execute("PRAGMA table_info(processed_actions)") and the conditional block
that adds target_author) and assume the column exists when inserting; keep the
extraction logic that sets the target_author value and write it as before.
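The deterministic-migration approach can be sketched generically. The publisher's real `migrate_database`/`set_db_version` internals aren't shown in this diff, so this is only an assumed shape that uses SQLite's built-in `PRAGMA user_version` as the version store; the table and column names mirror the review, everything else is illustrative.

```python
import sqlite3

# Each entry migrates the schema from (version) to (version + 1).
MIGRATIONS = {
    5: ["ALTER TABLE processed_actions ADD COLUMN target_author TEXT"],
}
CURRENT_DB_VERSION = 6

def migrate(conn: sqlite3.Connection) -> None:
    """Run pending migrations once, at startup, instead of per-row ALTERs."""
    version = conn.execute("PRAGMA user_version").fetchone()[0]
    while version < CURRENT_DB_VERSION:
        for stmt in MIGRATIONS.get(version, []):
            conn.execute(stmt)
        version += 1
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_actions (action_id TEXT PRIMARY KEY)")
conn.execute("PRAGMA user_version = 5")  # simulate an existing v5 database
migrate(conn)
cols = [row[1] for row in conn.execute("PRAGMA table_info(processed_actions)")]
print(cols)  # → ['action_id', 'target_author']
```

Because the version check happens once before any writes, readers never see a half-established schema and concurrent writers never race on `ALTER TABLE`.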
```python
def cleanup_old_entries(retention_days: int):
    """Remove entries older than retention_days"""
    if retention_days <= 0:
        retention_days = CONFIG_LIMITS['retention_days']['default']  # No config object available here

    try:
        conn = sqlite3.connect(DB_PATH)
        cursor = conn.cursor()

        try:
            subreddit = self.reddit.subreddit(target_sub)
            wiki_page = self.config.get('wiki_page', 'modlog')

            sub_logger.info("Updating wiki page: /r/%s/wiki/%s", target_sub, wiki_page)

        cutoff_timestamp = int((datetime.now() - datetime.fromtimestamp(0)).total_seconds()) - (retention_days * 86400)

        cursor.execute(
            "DELETE FROM processed_actions WHERE created_at < ?",
            (cutoff_timestamp,)
        )

        deleted_count = cursor.rowcount
        conn.commit()
        conn.close()

        if deleted_count > 0:
            logger.info(f"Cleaned up {deleted_count} old entries")
    except Exception as e:
        logger.error(f"Error during cleanup: {e}")
```
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Convoluted "now as Unix timestamp" expression and dead branch.
int((datetime.now() - datetime.fromtimestamp(0)).total_seconds()) is just int(time.time()) written the long way around (it relies on the local-timezone offset cancelling between both naive datetimes, which is fragile and trips DTZ005/DTZ006 in static analysis). Also, validate_config_value already enforces retention_days >= 1, so the if retention_days <= 0 branch is dead code in normal flow. Same expression is duplicated at line 626 in get_recent_actions_from_db.
♻️ Proposed simplification

```diff
 def cleanup_old_entries(retention_days: int):
     """Remove entries older than retention_days"""
-    if retention_days <= 0:
-        retention_days = CONFIG_LIMITS['retention_days']['default']  # No config object available here
+    retention_days = validate_config_value('retention_days', retention_days, CONFIG_LIMITS)

     try:
         conn = sqlite3.connect(DB_PATH)
         cursor = conn.cursor()

-        cutoff_timestamp = int((datetime.now() - datetime.fromtimestamp(0)).total_seconds()) - (retention_days * 86400)
+        cutoff_timestamp = int(time.time()) - (retention_days * 86400)
```

(Apply the same `int(time.time())` change at line 626.)
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 586-586: datetime.datetime.now() called without a tz argument
(DTZ005)
[warning] 586-586: datetime.datetime.fromtimestamp() called without a tz argument
(DTZ006)
[warning] 598-598: Logging statement uses f-string
(G004)
[warning] 599-599: Do not catch blind exception: Exception
(BLE001)
[warning] 600-600: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
[warning] 600-600: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modlog_wiki_publisher.py` around lines 577 - 600, In cleanup_old_entries
replace the convoluted timestamp calculation int((datetime.now() -
datetime.fromtimestamp(0)).total_seconds()) with int(time.time()) and remove the
dead branch that checks if retention_days <= 0 (since validate_config_value
already guarantees retention_days >= 1); also make the identical change in
get_recent_actions_from_db so both use int(time.time()) when computing
cutoff_timestamp, and keep references to DB_PATH, processed_actions, and
created_at unchanged.
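The claimed equivalence is easy to check directly (all names below are local to the demo). The naive-datetime subtraction tracks `time.time()` only because the local UTC offset appears on both sides; if the zone's offset at the epoch differs from the offset now (DST, or a historical base-offset change), the two drift apart, which is exactly why the aware or `time.time()` forms are preferable.

```python
import time
from datetime import datetime, timezone

# The roundabout expression from the review comment:
roundabout = int((datetime.now() - datetime.fromtimestamp(0)).total_seconds())

# The direct equivalent:
direct = int(time.time())

# A timezone-aware variant that is exact by construction:
aware = int((datetime.now(timezone.utc)
             - datetime.fromtimestamp(0, timezone.utc)).total_seconds())

# In most zones the naive version matches to within DST skew;
# the aware version matches time.time() to the second.
print(roundabout - direct, aware - direct)
```

Note there is no expected value for `roundabout - direct`: it depends on how the local zone's offset has changed since 1970, which is the fragility the comment points out.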
```python
def update_wiki_page(reddit, subreddit_name: str, wiki_page: str, content: str, force: bool = False):
    """Update wiki page with content, using hash caching to avoid unnecessary updates"""
    try:
        # Calculate content hash
        content_hash = get_content_hash(content)

        # Check if content has changed (unless forced)
        cached_hash = get_cached_wiki_hash(subreddit_name, wiki_page)
        if cached_hash == content_hash:
            if force:
                logger.info(f"Wiki content unchanged, but you selected force for /r/{subreddit_name}/wiki/{wiki_page}, forcing update")
            else:
                logger.info(f"Wiki content unchanged for /r/{subreddit_name}/wiki/{wiki_page}, skipping update")
                return False
```
force=True flow doesn't match the coding guideline ("warn if same, interactively ask to force refresh").
The repository guideline for this file is: "Confirm cache file of wiki page and warn if same, interactively ask to force refresh." Today, when force=True and the cached hash equals the new hash, the code only logs an info line (line 945) and proceeds to push the same content to Reddit unconditionally. There is no interactive confirmation prompt and no warning level log; CLI flags like --force-wiki go straight to a (rate-limit-costing) wiki edit even when the content is byte-identical.
If non-interactive use is required (daemon mode), gate the prompt on stdin being a TTY and at minimum upgrade the message to logger.warning(...).
As per coding guidelines: "Confirm cache file of wiki page and warn if same, interactively ask to force refresh."
🧰 Tools
🪛 Ruff (0.15.12)
[warning] 935-935: Boolean-typed positional argument in function definition
(FBT001)
[warning] 935-935: Boolean default positional argument in function definition
(FBT002)
[warning] 945-945: Logging statement uses f-string
(G004)
[warning] 947-947: Logging statement uses f-string
(G004)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modlog_wiki_publisher.py` around lines 935 - 948, When cached_hash equals
content_hash in update_wiki_page, change the force=True path to use a
warning-level log and, if running interactively (check sys.stdin.isatty()),
prompt the user to confirm the forced update before proceeding; if the prompt is
declined, return False. Keep the non-force path unchanged (skip update). Refer
to update_wiki_page, get_content_hash, get_cached_wiki_hash and logger when
making the change and gate interactive confirmation on stdin being a TTY.
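One possible shape for the TTY-gated confirmation (a sketch, not the project's actual implementation; `confirm_forced_update` is a hypothetical helper name): warn at warning level, prompt only when stdin is a terminal, and fall through without blocking in daemon/cron mode.

```python
import io
import sys
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

def confirm_forced_update(subreddit_name: str, wiki_page: str) -> bool:
    """Warn that content is unchanged; in a terminal, ask before pushing anyway.

    In non-interactive contexts (daemon/cron/CI) there is no one to answer,
    so honor --force and rely on the warning already emitted.
    """
    logger.warning("Wiki content unchanged for /r/%s/wiki/%s but --force was given",
                   subreddit_name, wiki_page)
    if sys.stdin.isatty():
        answer = input("Push identical content anyway? [y/N] ").strip().lower()
        return answer in ("y", "yes")
    return True  # non-interactive: proceed, warning already logged

# Simulate a non-interactive run: StringIO.isatty() is False, so no prompt.
sys.stdin = io.StringIO("")
print(confirm_forced_update("test", "modlog"))  # → True
```

The TTY gate keeps continuous mode from hanging on `input()` while still giving interactive `--force-wiki` users a chance to back out of a rate-limit-costing no-op edit.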
```python
    # Override config with CLI args
    if args.source_subreddit:
        config['source_subreddit'] = args.source_subreddit
    if args.wiki_page:
        config['wiki_page'] = args.wiki_page
    if args.retention_days is not None:
        config['retention_days'] = args.retention_days
    if args.batch_size is not None:
        config['batch_size'] = args.batch_size
    if args.interval is not None:
        config['update_interval'] = args.interval
```
CLI overrides bypass CONFIG_LIMITS validation.
load_config runs values through apply_config_defaults_and_limits / validate_config_value, but the CLI overrides directly assign user-supplied values (--retention-days, --batch-size, --interval) into config afterwards. A user passing --retention-days 999999 or --batch-size 0 will silently bypass the published min/max and reach SQL/PRAW with out-of-range inputs.
Run the overrides through validate_config_value (or call apply_config_defaults_and_limits(config) once after the overrides):
🛡️ Proposed fix

```diff
 # Override config with CLI args
 if args.source_subreddit:
     config['source_subreddit'] = args.source_subreddit
 if args.wiki_page:
     config['wiki_page'] = args.wiki_page
 if args.retention_days is not None:
-    config['retention_days'] = args.retention_days
+    config['retention_days'] = validate_config_value('retention_days', args.retention_days, CONFIG_LIMITS)
 if args.batch_size is not None:
-    config['batch_size'] = args.batch_size
+    config['batch_size'] = validate_config_value('batch_size', args.batch_size, CONFIG_LIMITS)
 if args.interval is not None:
-    config['update_interval'] = args.interval
+    config['update_interval'] = validate_config_value('update_interval', args.interval, CONFIG_LIMITS)
```

As per coding guidelines: "Review for security issues — validate all user inputs."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@modlog_wiki_publisher.py` around lines 1240 - 1250, CLI overrides currently
write args (args.retention_days, args.batch_size, args.interval) directly into
the config dict, bypassing CONFIG_LIMITS validation; after applying the CLI
overrides to config, call apply_config_defaults_and_limits(config) (or
individually run validate_config_value for each overridden key) so the values
are clamped/validated per CONFIG_LIMITS before the config is used by
load_config/SQL/PRAW.
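The clamping path can be sketched under assumptions: the real `CONFIG_LIMITS` bounds and `validate_config_value` signature live in `modlog_wiki_publisher.py` and may differ; every value below is invented for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Hypothetical limits table mirroring the shape the review implies.
CONFIG_LIMITS = {
    'retention_days': {'min': 1, 'max': 365, 'default': 30},
    'batch_size': {'min': 1, 'max': 500, 'default': 100},
    'update_interval': {'min': 60, 'max': 86400, 'default': 3600},
}

def validate_config_value(key, value, limits):
    """Clamp a user-supplied value into the published [min, max] range."""
    bounds = limits[key]
    if value is None:
        return bounds['default']
    clamped = max(bounds['min'], min(bounds['max'], value))
    if clamped != value:
        logger.warning("%s=%s out of range, clamped to %s", key, value, clamped)
    return clamped

print(validate_config_value('retention_days', 999999, CONFIG_LIMITS))  # → 365
print(validate_config_value('batch_size', 0, CONFIG_LIMITS))           # → 1
```

Routing every CLI override through the same clamp means `--retention-days 999999` and `--batch-size 0` can never reach the SQL/PRAW layers out of range.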
```shell
# View removal reasons that are text (not numbers)
sqlite3 modlog.db "SELECT action_type, removal_reason FROM processed_actions WHERE removal_reason NOT LIKE '%[0-9]%' AND removal_reason != 'remove' LIMIT 5;"
```
LIKE '%[0-9]%' doesn't do what the comment implies in SQLite.
SQLite's LIKE only supports % and _ wildcards — [0-9] is treated as the literal four-character string [0-9], not "any digit". So the documented "View removal reasons that are text (not numbers)" query won't filter out numeric template IDs the way the section heading suggests; it will only exclude rows whose removal_reason literally contains [0-9].
If the intent is "no digits", use a GLOB pattern (which supports character classes) or a regexp:
📝 Proposed fix

```diff
-# View removal reasons that are text (not numbers)
-sqlite3 modlog.db "SELECT action_type, removal_reason FROM processed_actions WHERE removal_reason NOT LIKE '%[0-9]%' AND removal_reason != 'remove' LIMIT 5;"
+# View removal reasons that contain no digits (text-only reasons)
+sqlite3 modlog.db "SELECT action_type, removal_reason FROM processed_actions WHERE removal_reason NOT GLOB '*[0-9]*' AND removal_reason != 'remove' LIMIT 5;"
```

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```shell
# View removal reasons that contain no digits (text-only reasons)
sqlite3 modlog.db "SELECT action_type, removal_reason FROM processed_actions WHERE removal_reason NOT GLOB '*[0-9]*' AND removal_reason != 'remove' LIMIT 5;"
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@README.md` around lines 200 - 202, The README's example SQL uses LIKE
'%[0-9]%' which treats '[0-9]' as a literal string in SQLite; change the filter
to use GLOB (or a regexp if available) so it truly checks for digits. Replace
the condition "removal_reason NOT LIKE '%[0-9]%'" with "removal_reason NOT GLOB
'*[0-9]*'" (keeping the other clause removal_reason != 'remove' and the same
SELECT/WHERE/ LIMIT structure on processed_actions.removal_reason) so the query
excludes any removal_reason containing digits as intended.
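The behavioural difference is easy to verify against an in-memory SQLite database (table and sample values below are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_actions (removal_reason TEXT)")
conn.executemany(
    "INSERT INTO processed_actions VALUES (?)",
    [("Rule 3 violation",), ("spam",), ("see rule [0-9]",)],
)

# LIKE treats [0-9] as the literal 5-character substring — only the row
# that literally contains "[0-9]" matches.
like_rows = conn.execute(
    "SELECT removal_reason FROM processed_actions WHERE removal_reason LIKE '%[0-9]%'"
).fetchall()

# GLOB supports character classes — any row containing a digit matches.
glob_rows = conn.execute(
    "SELECT removal_reason FROM processed_actions WHERE removal_reason GLOB '*[0-9]*'"
).fetchall()

print(like_rows)  # [('see rule [0-9]',)]
print(glob_rows)  # [('Rule 3 violation',), ('see rule [0-9]',)]
```

Note that GLOB, unlike LIKE, is case-sensitive, which does not matter for a digit class but would for letter ranges.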
```python
from modlog_wiki_publisher import *

# Mock Reddit action objects for testing
class MockRedditAction:
    def __init__(self, action_id, action_type, details, mod_name, target_type='post', target_id='abc123'):
        self.id = action_id
        self.action = action_type
        self.details = details
        self.created_utc = int(datetime.now().timestamp())

        # Mock moderator
        class MockMod:
            def __init__(self, name):
                self.name = name
        self.mod = MockMod(mod_name)

        # Mock targets based on type
        if target_type == 'post':
            self.target_submission = target_id
            self.target_comment = None
            self.target_author = 'testuser'
            self.target_title = 'Test Post Title'
            self.target_permalink = f'/r/test/comments/{target_id}/test_post/'
        elif target_type == 'comment':
            self.target_submission = None
            self.target_comment = target_id
            self.target_author = 'testuser'
            self.target_title = None
            self.target_permalink = f'/r/test/comments/parent123/test_post/{target_id}/'

def test_removal_reasons():
    """Test removal reason processing and storage"""
    print("Testing Removal Reason Processing")
    print("=" * 50)

    # Clean up any existing test database
    test_db = "test_modlog.db"
    if os.path.exists(test_db):
        os.remove(test_db)

    # Override the global DB_PATH for testing
    global DB_PATH
    original_db_path = DB_PATH
    DB_PATH = test_db
```
Critical: global DB_PATH does not actually point the publisher's functions at the test DB — this test will write to production modlog.db.
from modlog_wiki_publisher import * imports the value of DB_PATH into this module's namespace. The global DB_PATH on line 55 rebinds only the test module's DB_PATH. The imported functions (setup_database, store_processed_action, get_recent_actions_from_db, …) were defined inside modlog_wiki_publisher.py and resolve DB_PATH in that module — so every sqlite3.connect(DB_PATH) inside those functions still hits ./modlog.db.
Concrete consequences in this test:

- `setup_database()` (line 62) runs migrations against your real `modlog.db`.
- `store_processed_action(action)` (line 101) inserts the mock test rows into the real DB.
- The verification `SELECT` on line 106 connects via the test module's local `DB_PATH = "test_modlog.db"`, so it (correctly) reads an empty/different DB and the test silently fails to verify what it claims.
- `os.remove(test_db)` in the `finally` removes only the empty `test_modlog.db`; the polluting rows persist in production.
Mutate the attribute on the publisher module instead of using import * + global:
🛡️ Proposed fix

```diff
-from modlog_wiki_publisher import *
+import modlog_wiki_publisher
+from modlog_wiki_publisher import (
+    setup_database,
+    store_processed_action,
+    get_recent_actions_from_db,
+    build_wiki_content,
+)
@@
-    # Override the global DB_PATH for testing
-    global DB_PATH
-    original_db_path = DB_PATH
-    DB_PATH = test_db
+    # Override the publisher module's DB_PATH for testing
+    original_db_path = modlog_wiki_publisher.DB_PATH
+    modlog_wiki_publisher.DB_PATH = test_db
@@
-    conn = sqlite3.connect(DB_PATH)
+    conn = sqlite3.connect(modlog_wiki_publisher.DB_PATH)
@@
     finally:
-        # Restore original DB path
-        DB_PATH = original_db_path
+        # Restore original DB path
+        modlog_wiki_publisher.DB_PATH = original_db_path
```

(Apply the same `modlog_wiki_publisher.DB_PATH` reference everywhere `DB_PATH` is used in this file.)
🧰 Tools
🪛 Ruff (0.15.12)
[error] 14-14: from modlog_wiki_publisher import * used; unable to detect undefined names
(F403)
[warning] 18-18: Missing return type annotation for special method __init__
Add return type annotation: None
(ANN204)
[warning] 22-22: datetime.datetime.now() called without a tz argument
(DTZ005)
[warning] 26-26: Missing return type annotation for special method __init__
Add return type annotation: None
(ANN204)
[warning] 44-44: Too many branches (16 > 12)
(PLR0912)
[warning] 44-44: Too many statements (70 > 50)
(PLR0915)
[warning] 55-55: Using the global statement to update DB_PATH is discouraged
(PLW0603)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test_removal_reasons.py` around lines 14 - 57, The test currently imports
DB_PATH by value via "from modlog_wiki_publisher import *" so reassigning local
global DB_PATH doesn't affect the publisher functions; change the import to
import the module (e.g., import modlog_wiki_publisher) and set
modlog_wiki_publisher.DB_PATH = test_db before calling setup_database(),
store_processed_action(), get_recent_actions_from_db(), etc., and update all
calls to use modlog_wiki_publisher.setup_database,
modlog_wiki_publisher.store_processed_action,
modlog_wiki_publisher.get_recent_actions_from_db (or ensure those functions
still reference the module-level DB_PATH) so the test operates against the test
database rather than the production DB.
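The binding rule behind this bug can be demonstrated without the real project module. `fake_publisher` below is a throwaway stand-in built at runtime: `from m import *` copies the current *value* of `DB_PATH` into the importer's namespace, while functions defined in `m` keep resolving `DB_PATH` through `m`'s own globals.

```python
import sys
import types

# Build a throwaway stand-in for modlog_wiki_publisher.
mod = types.ModuleType("fake_publisher")
exec(
    "DB_PATH = 'modlog.db'\n"
    "def where_do_i_write():\n"
    "    return DB_PATH\n",
    mod.__dict__,
)
sys.modules["fake_publisher"] = mod

# `from fake_publisher import *` would copy the value, like this:
DB_PATH = mod.DB_PATH
# Rebinding the copy affects only this namespace, not the function's globals.
DB_PATH = "test_modlog.db"
print(mod.where_do_i_write())  # → modlog.db (still the production path!)

# Mutating the attribute on the module itself is what actually redirects it.
mod.DB_PATH = "test_modlog.db"
print(mod.where_do_i_write())  # → test_modlog.db
```

This is why the test must set `modlog_wiki_publisher.DB_PATH` (or use `monkeypatch.setattr`) rather than rebinding its own imported copy.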
Summary
Consolidation PR for the long-running `vibes` branch (48 commits ahead of `main`). The branch was created as an experimental playground; the latest commits look like real fixes that should land.

Why this PR exists
The branch never had a PR opened. Reviewer should:

Test plan
`git log --oneline main..vibes`

🤖 Auto-opened during cross-repo branch consolidation sweep.
Summary by CodeRabbit
New Features
Documentation
Tests