Commit 9fe9cef
Add FAQ automation system with GitHub Actions integration (#14)
* Add FAQ automation system with GitHub Actions integration
This commit introduces a comprehensive FAQ automation system that uses RAG
and LLM-based triage to intelligently process new FAQ proposals.
Features:
- AI-powered FAQ proposal analysis (NEW/UPDATE/DUPLICATE decisions)
- Automated PR creation for approved changes
- GitHub issue template for structured FAQ proposals
- Complete test suite with unit and integration tests
- Comprehensive documentation (README, CONTRIBUTING)
Components:
- faq_automation/: Python module with core logic
  - core.py: FAQ processing utilities
  - rag_agent.py: LLM-based decision agent using OpenAI
  - actions.py: GitHub Actions integration helpers
  - cli.py: Command-line interface for workflow
- .github/workflows/faq-automation.yml: GitHub Actions workflow
- .github/ISSUE_TEMPLATE/faq-proposal.yml: Structured issue template
- tests/: Comprehensive test coverage
- CONTRIBUTING.md: Contributor guidelines
- README.md: Updated with full documentation
Dependencies added:
- minsearch: Lightweight text search for FAQ retrieval
- openai: LLM integration for decision making
- pydantic: Structured output validation
The system processes FAQ proposals through:
1. Issue submission via GitHub template
2. Retrieval of similar existing FAQs
3. LLM analysis and decision (NEW/UPDATE/DUPLICATE)
4. Automated PR creation or issue closure with feedback
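The four steps above end in a three-way decision that selects the follow-up action. A minimal sketch of that dispatch, using only the standard library; the `Action`, `Decision`, and `next_step` names and the returned strings are illustrative assumptions, not the code in faq_automation/:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(str, Enum):
    """The three outcomes named in the commit message."""
    NEW = "NEW"
    UPDATE = "UPDATE"
    DUPLICATE = "DUPLICATE"


@dataclass
class Decision:
    """Hypothetical shape of a triage result (field names are assumptions)."""
    action: Action
    rationale: str
    document_id: Optional[str] = None  # existing FAQ affected, if any


def next_step(decision: Decision, issue_number: int) -> str:
    """Describe the follow-up for a decision: close the issue or open a PR."""
    if decision.action is Action.DUPLICATE:
        return f"close issue #{issue_number} with feedback: {decision.rationale}"
    if decision.action is Action.NEW:
        change = "add a new FAQ"
    else:
        change = f"update FAQ {decision.document_id}"
    return f"open PR to {change} for issue #{issue_number}"
```

In the real system the decision object is validated with pydantic; a dataclass stands in here to keep the sketch dependency-free.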
Supports: machine-learning-zoomcamp (initial course)
Can be extended to support all courses in the future
* Fix minsearch version requirement (0.0.7 instead of 0.4.1)
* Configure setuptools to only package faq_automation
* Change default model to gpt-5-nano for structured output support
- Update CLI default model from gpt-4 to gpt-5-nano
- Update RAG agent default model to gpt-5-nano
- Update GitHub Actions workflow to use gpt-5-nano
- Fix setuptools package configuration
- Fix minsearch version requirement (0.0.7)
* Remove test artifacts and add *.egg-info/ to gitignore
* Add *.egg-info/ to gitignore
* Add contribution banner to course pages
Added a simple text banner with link to CONTRIBUTING.md at the top
of each course page to encourage user contributions.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Increase contribution banner font size to match questions
Set font size to 1.17em to match the question heading size for
better visibility and consistency.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update issue template with dropdown for course selection
- Change course field from input to dropdown menu
- Add all 4 available courses as options:
  - machine-learning-zoomcamp
  - data-engineering-zoomcamp
  - llm-zoomcamp
  - mlops-zoomcamp
- Update uv.lock to sync with FAQ automation dependencies
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add manual trigger for testing FAQ automation workflow
- Add workflow_dispatch trigger with issue_number input
- Support both automatic (issue opened) and manual execution
- Fetch issue data when manually triggered
- Update error handler to use correct issue number
This allows testing the workflow on feature branches without merging.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix issue number handling for manual workflow triggers
* Fix: Create FAQ bot branches from main instead of current branch
* docs: Update Development section with correct uv commands and API key setup
* fix: Correct uv commands in Makefile and improve issue body parsing
- Remove incorrect 'python -m' prefix from uv commands in Makefile
- Update parse_issue_body to stop collecting content at any ### section
- Ensures Checklist and other sections are excluded from parsed answer
- All tests now passing
* docs: Update README - remove Quick Start and correct LLM to GPT-5
- Remove Quick Start section (missing necessary environment setup)
- Update all GPT-4 references to GPT-5
- Development section now contains all necessary setup instructions
* docs: Update issue links to DataTalksClub/faq repository
- Replace relative issue links with absolute URLs
- Use direct link to FAQ proposal template
- Simplify CONTRIBUTING instructions (remove redundant steps)
- Point to https://github.com/DataTalksClub/faq/issues
* docs: Remove GitHub Discussions reference from Support section
* docs: Fix escaped backticks in CONTRIBUTING.md example code blocks
* docs: Fix nested code blocks using 4 backticks for outer block
* docs: Update testing documentation links for consistency
* docs: Remove outer code block from example to render link correctly
* docs: Add FAQ automation tests to tests/README.md
- Document test_faq_automation.py (core functions)
- Document test_cli_parsing.py (issue body parsing)
- Document test_faq_actions.py (GitHub Actions integration)
- Update test coverage section with test counts
- Add example commands for running FAQ automation tests
* refactor: Organize FAQ automation tests into classes
- Reorganize test_faq_automation.py into classes (TestParseFrontmatter, TestWriteFrontmatter, TestGenerateDocumentId, TestKeepRelevant)
- Reorganize test_cli_parsing.py into TestParseIssueBody class
- Reorganize test_faq_actions.py into classes (TestGeneratePRBody, TestGenerateDuplicateComment)
- Follow the established test structure pattern from test_sorting.py
- All 102 unit tests passing
* docs: Add FAQ automation examples to test method examples
- Add example for running specific FAQ automation test method
- Add example for running specific CLI parsing test method
- Show proper class-based test structure in examples
* docs: Update test commands to match Makefile and remove --extra dev
- Use 'make test' commands as primary examples
- Show 'uv run pytest' commands as alternatives
- Remove '--extra dev' flag for consistency with Makefile
- All test commands now consistent across documentation
* Fix OpenAI API syntax to match notebook implementation
Update faq_automation/rag_agent.py to use the correct OpenAI API syntax
from the notebook prototype:
- Changed beta.chat.completions.parse to responses.parse
- Changed messages parameter to input
- Changed response_format parameter to text_format
- Updated response parsing to extract from response.output
All 116 tests pass.
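A hedged sketch of the call shape after this change. It assumes `client` is an `openai.OpenAI()` instance and `schema` is a pydantic model; the `parse_decision` name and the loop over `response.output` are illustrative guesses at "extract from response.output", not the code in rag_agent.py:

```python
def parse_decision(client, model, messages, schema):
    """Request structured output and return the first parsed object found."""
    response = client.responses.parse(
        model=model,
        input=messages,      # previously passed as `messages=`
        text_format=schema,  # previously passed as `response_format=`
    )
    # The output list can hold non-message items (for example reasoning
    # items) before the message, so scan it instead of indexing a fixed slot.
    for item in response.output:
        if getattr(item, "type", None) != "message":
            continue
        for part in item.content:
            parsed = getattr(part, "parsed", None)
            if parsed is not None:
                return parsed
    raise ValueError("response contained no parsed structured output")
```

Passing the client in as an argument keeps the function testable with a stub, which matches the mocked-API approach the test suite describes.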
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Replace JS parsing logic with Python in FAQ automation workflow
Eliminates code duplication between JavaScript and Python parsing by
using a single Python implementation for all issue body parsing.
Changes:
- Add parse_full_issue_body() to extract course, question, and answer
- Create scripts/extract_issue_fields.py for GitHub Actions integration
- Simplify workflow to use Python parsing instead of JS
- Add 6 comprehensive tests for parse_full_issue_body()
- Update tests/README.md with new test documentation
Benefits:
- Single source of truth for parsing logic
- All parsing code is testable
- Easier maintenance (one language, one implementation)
- No duplication between JS and Python
Test results: All 122 tests pass (was 116, added 6 new tests)
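For illustration, a minimal sketch of parsing an issue-form body in Python. It assumes the body uses `### <Label>` headings with the labels Course, Question, and Answer; the function names are hypothetical and this is not the actual parse_full_issue_body() implementation:

```python
import re

# Matches a level-3 markdown heading such as "### Course".
HEADING = re.compile(r"^###\s+(.+?)\s*$")


def split_issue_sections(body: str) -> dict:
    """Group body lines under the most recent '### ' heading (lowercased)."""
    sections = {}
    current = None
    for line in body.splitlines():
        match = HEADING.match(line)
        if match:
            current = match.group(1).lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


def parse_fields(body: str) -> tuple:
    """Return (course, question, answer); missing sections become ''."""
    sections = split_issue_sections(body)
    return tuple(sections.get(key, "") for key in ("course", "question", "answer"))
```

Because every `###` heading starts a new section, trailing sections such as a checklist never leak into the answer.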
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Replace bash scripting with Python for GitHub Actions outputs
Creates a shared GitHub Actions helper module to eliminate bash
scripting for writing to GITHUB_OUTPUT environment variable.
Changes:
- Create faq_automation/github_actions.py with write_github_output()
  - Supports both multiline (heredoc) and single-line formats
  - Handles local testing mode (prints to stdout)
  - Environment detection helpers (is_github_actions, get_github_output_path)
- Update scripts/extract_issue_fields.py to use shared function
  - Import write_github_output from github_actions module
  - Remove duplicate implementation
- Create scripts/write_faq_decision_output.py
  - Reads faq_decision.json
  - Writes to GITHUB_OUTPUT using Python
  - Replaces bash: echo "decision=$(jq -c .)" >> $GITHUB_OUTPUT
- Update .github/workflows/faq-automation.yml
  - Replace bash output logic with Python script call
  - Cleaner, more maintainable workflow
- Add comprehensive tests (10 new tests)
  - test_github_actions.py with 3 test classes
  - Multiline and single-line output formats
  - Local testing mode behavior
  - Environment detection
- Update tests/README.md documentation
Benefits:
- All GitHub Actions integration uses Python
- Single source of truth for output writing
- Fully testable (no bash to test)
- Consistent approach across all scripts
- Better error handling and validation
Test results: All 132 tests pass (was 122, added 10 new tests)
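A sketch of how such a helper could write step outputs, assuming the standard GITHUB_OUTPUT file format (`name=value` for one line, `name<<DELIMITER` ... `DELIMITER` for multiline values). The `write_output` name and the random delimiter are illustrative choices, not the write_github_output() implementation:

```python
import os
import uuid


def write_output(name: str, value: str) -> str:
    """Append one step output to $GITHUB_OUTPUT, or print it when unset."""
    if "\n" in value:
        # Multiline values use heredoc syntax; a random delimiter avoids
        # colliding with text inside the value.
        delimiter = f"EOF_{uuid.uuid4().hex}"
        entry = f"{name}<<{delimiter}\n{value}\n{delimiter}\n"
    else:
        entry = f"{name}={value}\n"

    path = os.environ.get("GITHUB_OUTPUT")
    if path:
        with open(path, "a", encoding="utf-8") as handle:
            handle.write(entry)
    else:
        # Local testing mode: no runner-provided file, so show the entry.
        print(entry, end="")
    return entry
```

Returning the written entry is only a convenience for tests; the workflow itself reads the values from the file.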
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update workflow to use modern uv sync command
Replace 'uv pip install --system -e .' with 'uv sync --no-dev' for:
- Consistency with local development workflow
- Modern uv best practices
- Declarative dependency management
- Faster installation (skips dev dependencies not needed in automation)
Benefits:
- Uses same uv sync approach as README recommends locally
- Only installs production dependencies needed for FAQ automation
- Automatic virtual environment management
- Better reproducibility
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Simplify FAQ automation workflow by removing --course argument
Removes code duplication and simplifies workflow by having CLI parse
course from issue body instead of extracting it separately.
Changes:
- Remove --course argument from CLI (faq_automation/cli.py)
  - Always use parse_full_issue_body() to extract course, question, answer
  - Update all references from args.course to parsed course variable
- Simplify workflow (.github/workflows/faq-automation.yml)
  - Keep "Fetch issue body" step for clean separation
  - Remove "Extract issue fields with Python" step entirely (~26 lines removed)
  - Simplify "Process FAQ with AI" step (~14 lines removed)
  - Pass full issue body directly to CLI without reconstruction
- Delete scripts/extract_issue_fields.py (no longer needed)
- Update README.md example
  - Add course field to test_issue.txt example
  - Remove --course argument from CLI command
Benefits:
- Workflow reduced from 3 steps to 2 steps
- Removed ~40 lines from workflow file
- Deleted 1 script file (67 lines)
- Simpler CLI interface (one less argument)
- Single parsing path (no conditionals)
- Easier to maintain and understand
All 132 tests pass ✅
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Fix workflow to use uv run for Python commands
After switching to 'uv sync --no-dev', Python commands need to run
within the uv-managed virtual environment using 'uv run'.
Changes:
- Use 'uv run python -m faq_automation.cli' to run CLI module
- Use 'uv run scripts/write_faq_decision_output.py' for script
  (leverages shebang line for cleaner syntax)
This ensures Python commands execute with the correct dependencies
installed by uv sync.
Fixes: ModuleNotFoundError: No module named 'yaml'
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Remove dead code: parse_issue_body() and its tests
Removed parse_issue_body() function from CLI as it is no longer used in production.
After removing the --course argument, the CLI always uses parse_full_issue_body()
which extracts course, question, and answer from the full issue body.
Changes:
- Removed parse_issue_body() function from faq_automation/cli.py (58 lines)
- Removed TestParseIssueBody class from tests/unit/test_cli_parsing.py (79 lines)
- Updated import statement in test file
- Updated tests/README.md to reflect new test count (132 → 127 tests)
All 127 tests pass.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add answer content example to FAQ file format section
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add course field to FAQ example in contributing guide
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Update test documentation to reflect full test suite scope
- Updated description to clarify tests cover both site generator and FAQ automation
- Added guidance for adding tests to FAQ automation system
- Specified which test files to use for different FAQ automation components
- Added notes about testing with real issue bodies and mocking external dependencies
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add comprehensive integration tests for FAQ automation workflow
Added 26 integration tests covering the complete end-to-end FAQ automation
workflow, bringing total test count from 127 to 153.
Test coverage includes:
- FAQ agent integration (5 tests): initialization, search, and proposal processing
- File creation and updates (3 tests): creating new FAQs and updating existing ones
- PR and comment generation (5 tests): generating outputs for all decision types
- CLI integration (2 tests): parsing issue bodies and full CLI execution
- Error handling and edge cases (4 tests): empty sections, non-existent docs, etc.
- Site generator integration (3 tests): verifying created files work with generate_website.py
- End-to-end workflows (3 tests): complete NEW/UPDATE/DUPLICATE flows
All tests use mocked OpenAI API responses for consistency and speed, while
performing real file I/O to verify format compatibility with the site generator.
Updated tests/README.md with comprehensive documentation of the new test suite.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* Add auto-close for FAQ issues when PR is merged
Modified FAQ automation workflow to include "Closes #<issue>" in PR body.
This uses GitHub's native auto-close feature to automatically close the
originating issue when the FAQ bot's PR is merged.
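As a small illustration, the closing keyword only needs to appear in the PR description. The `build_pr_body` helper below is hypothetical and simply appends it to an existing summary:

```python
def build_pr_body(summary: str, issue_number: int) -> str:
    """Append a GitHub closing keyword so merging the PR closes the issue."""
    return f"{summary.rstrip()}\n\nCloses #{issue_number}\n"
```

GitHub acts on the keyword when the PR merges into the repository's default branch.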
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* docs: Document auto-close behavior for FAQ issues when PRs merge
Updated README.md and CONTRIBUTING.md to clarify that issues with
NEW or UPDATE actions are automatically closed when their associated
PRs are merged, using GitHub's native "Closes #issue" feature.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* refactor: Remove manual workflow trigger to simplify FAQ automation
Removed workflow_dispatch trigger and all related logic for manual workflow
execution. The workflow now only triggers automatically on issue creation
with the faq-proposal label, which simplifies the codebase and reduces
maintenance burden.
Changes:
- Removed workflow_dispatch trigger and inputs section
- Simplified job condition to only check for faq-proposal label
- Removed fallback logic for manual issue number input
- Streamlined issue body fetching (always uses context.payload.issue)
- Cleaned up error handler to assume issue context
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
1 parent ec9d92d, commit 9fe9cef
File tree
23 files changed, +3576 -25 lines changed
- .github
  - ISSUE_TEMPLATE
  - workflows
- _layouts
- assets/css
- faq_automation
- scripts
- tests
  - integration
  - unit