127 changes: 127 additions & 0 deletions .github/copilot-instructions.md
@@ -0,0 +1,127 @@
# Copilot Instructions for Open Library

You are assisting with the Open Library project, a free, open-source catalog of books with lending capabilities.

## Project Overview

Open Library is a digital library project by the Internet Archive that aims to create "one web page for every book ever published." The project includes:
- A catalog of millions of books with metadata
- Book lending/borrowing features
- Book editing and contribution tools
- Integration with Internet Archive's book scanning efforts

## Technology Stack

- **Backend**: Python (web.py framework, transitioning to FastAPI)
- **Frontend**: JavaScript, Vue.js components, legacy jQuery
- **Database**: PostgreSQL (via Infogami)
- **Search**: Solr
- **Infrastructure**: Docker-based development environment

## Key Areas of the Codebase

### Core Directories
- `openlibrary/` - Main application code
- `openlibrary/plugins/` - Plugin modules (books, upstream, admin, etc.)
- `openlibrary/templates/` - HTML templates
- `openlibrary/components/` - Vue.js components
- `openlibrary/utils/` - Utility functions
- `static/` - Static assets (CSS, JS, images)
- `scripts/` - Maintenance and deployment scripts
- `tests/` - Test suites
- `docker/` - Docker configuration files

### Important Patterns
- Books are identified by Open Library IDs (e.g., `/books/OL123M`)
- Works represent conceptual books, Editions are specific printings
- The system uses a custom data model built on Infogami

## Common Issue Categories

### Authentication & Accounts
**Files**: `openlibrary/plugins/upstream/account.py`, `openlibrary/accounts/`
**Labels**: `Needs: Staff` (requires production testing)
**Note**: Account-related features require staff testing as they involve sensitive operations

### Borrowing & Lending
**Files**: `openlibrary/plugins/upstream/borrow.py`, `openlibrary/plugins/upstream/mybooks.py`
**Labels**: `Needs: Staff` (requires access to lending infrastructure)
**Note**: Borrowing features integrate with Internet Archive's lending system

### Search & Discovery
**Files**: `openlibrary/plugins/worksearch/`, `openlibrary/solr/`
**Documentation**: Solr schema and search documentation
**Note**: Search changes may require Solr reindexing

### Book Editing & Data
**Files**: `openlibrary/plugins/upstream/addbook.py`, `openlibrary/catalog/`
**Note**: Book data changes must be carefully validated to maintain data quality

### Frontend/UI
**Files**: `static/css/`, `openlibrary/components/`, Vue components
**Labels**: Can be `Good First Issue` if well-scoped
**Documentation**: Frontend Guide on wiki

## Issue Triage Guidelines

### Labeling Criteria

**`Needs: Staff`** - Apply when the issue involves:
- Authentication/login systems
- Borrowing/lending features
- Account management operations
- Payment/donation processing
- Admin-only features
- Production database modifications
- Features requiring special access/permissions

**`Good First Issue`** - Apply ONLY when ALL of the following are true:
- Clear, well-defined scope (1-2 files)
- Straightforward changes (UI text, simple bug fix, small feature)
- Specific file locations or clear guidance provided
- No complex architecture knowledge required
- Does NOT require staff-only testing

Be conservative: when in doubt, don't apply this label.

### Information to Provide

When analyzing issues, use available repository data to provide:
1. **Relevant Files**: Use code search to identify files related to the issue
2. **Similar Issues**: Search for related open/closed issues
3. **Related PRs**: Find PRs that touched similar areas of code
4. **Documentation**: Link to relevant wiki pages, README files
5. **Next Steps**: Clear guidance based on issue state (needs priority, needs approach, ready for work)

### Skills/Tools Available

Use these tools to gather context:
- **GitHub CLI (`gh`)**: Query issues, PRs, repository data
- **Code search**: Find relevant files and code patterns
- **Repository structure**: Understand file organization
- **Git history**: Check recent changes to related files
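
As a rough illustration of the tools listed above, the following sketch wraps a few `gh` and `git` queries in shell functions. The function names and the `REPO` default are assumptions made for this example, not part of the workflow itself; the `git log` call assumes it runs inside a local clone.

```shell
# Sketch only: function names and the REPO default are hypothetical.
REPO="${REPO:-internetarchive/openlibrary}"

# Search open and closed issues matching a query.
similar_issues() {
  gh issue list --repo "$REPO" --search "$1" --state all --limit 5 \
    --json number,title,state,url
}

# Search pull requests that mention the same terms.
related_prs() {
  gh pr list --repo "$REPO" --search "$1" --state all --limit 5 \
    --json number,title,state,url
}

# Show the latest commits touching a path (run inside a local clone).
recent_changes() {
  git log --oneline -n 5 -- "$1"
}
```

For example, `similar_issues "borrow button"` would return up to five matching issues as JSON, and `recent_changes openlibrary/solr/` lists the last five commits under that directory.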

## Response Guidelines

- Be specific and actionable
- Only suggest files/PRs/issues when confident of relevance
- Link to actual documentation, not generic advice
- Note if the issue needs a priority label or lead assignment before work begins
- Remind contributors about the git workflow (rebase, pre-commit) when relevant
- Be conservative with the `Good First Issue` label: it is better to skip it than to mislabel

## Development Workflow

Contributors should:
1. Check wiki for setup instructions (Docker-based development)
2. Follow the git workflow (rebase before creating a branch, install pre-commit hooks)
3. Write tests for changes (pytest for Python, Jest for JavaScript)
4. Get the issue approved by a lead and assigned a priority label before starting work
5. Reference the issue in PR title/description
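
The "rebase before creating a branch" step might look like the sketch below. The remote name `upstream` and the function name `start_branch` are assumptions for illustration; the Git Cheat Sheet linked under Common Documentation Links remains the authoritative description of the workflow.

```shell
# Sketch only: the remote name "upstream" and the function name are assumptions.
# Brings the local master up to date, then creates a new branch from it.
start_branch() {
  branch="$1"
  git switch master &&
  git pull --rebase upstream master &&
  git switch -c "$branch"
}
```

After `start_branch 1234/fix/typo`, running `pre-commit install` once in the clone registers the pre-commit hooks so they run on each commit.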

## Common Documentation Links

- Setup: https://github.com/internetarchive/openlibrary/tree/master/docker
- Git Workflow: https://github.com/internetarchive/openlibrary/wiki/Git-Cheat-Sheet
- Testing: https://github.com/internetarchive/openlibrary/wiki/Testing
- Contributing: https://github.com/internetarchive/openlibrary/blob/master/CONTRIBUTING.md
- API Docs: https://openlibrary.org/developers/api
155 changes: 155 additions & 0 deletions .github/prompts/README.md
@@ -0,0 +1,155 @@
# Issue PM AI Workflow - Copilot Skills-Based

This directory contains configurations for the GitHub Copilot Agent Skills-based issue triage workflow.

## Copilot Instructions (`.github/copilot-instructions.md`)

This file provides comprehensive context about the Open Library project to GitHub Copilot, including:
- Project overview and technology stack
- Key codebase areas and patterns
- Issue triage guidelines and labeling criteria
- Common documentation links

This serves as the knowledge base for the AI when analyzing issues.

## Issue PM AI Workflow (`.github/workflows/issue_pm_ai.yml`)

An automated workflow that uses **Copilot Agent Skills** to provide contextual follow-up on newly created issues.

### Skills-Based Architecture

The workflow implements a "Skills" pattern where it:
1. **Gathers repository context** using GitHub CLI (`gh`):
   - Similar issues (via search)
   - Related pull requests
   - Recent open issues
   - Potentially relevant files (based on keywords)

2. **Provides context to AI** as "Skills data":
   - All gathered information is passed to the AI
   - AI uses this real-time repository data to make informed suggestions

3. **Generates contextual response**:
   - Suggests relevant files based on actual codebase
   - References real similar issues and PRs
   - Recommends appropriate labels
   - Provides actionable next steps

### Purpose

The workflow helps:
- Suggest relevant files and code locations (from actual repository)
- Link to appropriate documentation
- Reference related PRs and issues (from live searches)
- Recommend appropriate labels (Needs: Staff, Good First Issue)
- Provide clear next steps for contributors

### When It Runs

The workflow triggers automatically when a new issue is opened. It runs only in the main `internetarchive/openlibrary` repository, not in forks.

### How It Works (Skills Pattern)

1. **Load Copilot Instructions**: Loads project context from `copilot-instructions.md`
2. **Gather Repository Context (Skills)**:
   - Uses `gh` CLI to search for similar issues
   - Finds related PRs using GitHub API
   - Identifies potentially relevant files based on issue keywords
   - Collects recent issues for context
3. **Generate Response**: Sends all context to GitHub Models API with Copilot instructions
4. **Post Comment**: Posts the AI-generated response on the issue
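
The keyword-based file identification in step 2 could be sketched as follows. The keyword lists and the function name `relevant_paths` are hypothetical and only illustrate the idea; the paths are taken from the Common Issue Categories in `copilot-instructions.md`. The actual workflow may match keywords differently.

```shell
# Sketch only: keyword lists and function name are hypothetical.
# Prints candidate paths for an issue title or body, one per line.
relevant_paths() {
  # Lowercase the text so matching is case-insensitive
  text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$text" in
    *login*|*account*|*password*)
      echo "openlibrary/plugins/upstream/account.py"
      echo "openlibrary/accounts/" ;;
  esac
  case "$text" in
    *borrow*|*lend*|*loan*)
      echo "openlibrary/plugins/upstream/borrow.py"
      echo "openlibrary/plugins/upstream/mybooks.py" ;;
  esac
  case "$text" in
    *search*|*solr*)
      echo "openlibrary/plugins/worksearch/"
      echo "openlibrary/solr/" ;;
  esac
}
```

Each category is checked independently, so an issue that mentions both login and search yields paths from both areas. Plain substring matching is crude (for instance, "calendar" contains "lend"), which is one reason to treat the result as a hint for the AI rather than a definitive list.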

### Skills/Tools Used

The workflow leverages these "skills" to gather context:
- **GitHub CLI (`gh`)**: Query issues, PRs, search across repository
- **Keyword matching**: Identify relevant file patterns
- **GitHub API**: Access repository metadata
- **Code patterns**: Understand common issue types and related files

### Modifying the Workflow

To update the AI's behavior:
1. **Edit `.github/copilot-instructions.md`** to change project context, guidelines, or criteria
2. **Edit the workflow YAML** to add more Skills (e.g., code search, file content analysis)
3. Changes take effect for issues opened after the changes are merged into the default branch

### Documentation Links

Keep documentation links in `.github/copilot-instructions.md` up-to-date as the project evolves.

## Legacy Files

- `issue_pm_instructions.md` - Original static instructions (replaced by `copilot-instructions.md`); can be removed if no longer needed

## Troubleshooting

If the workflow isn't working:
1. Check GitHub Actions logs in the repository's Actions tab
2. Verify GitHub Models API access is available
3. Ensure `gh` CLI commands succeed (check for API rate limits or auth errors in logs)
4. Check that the workflow has proper permissions (`issues: write`, `pull-requests: read`)
5. Review recent changes to the `.github/copilot-instructions.md` file
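
The first and third checks can be done from a terminal with `gh`. The sketch below is one possible way to do it; the function names and the `REPO` default are assumptions for illustration, and the workflow file name is taken from the section heading above.

```shell
# Sketch only: function names and the REPO default are hypothetical.
REPO="${REPO:-internetarchive/openlibrary}"

# List the latest runs of the triage workflow.
recent_runs() {
  gh run list --repo "$REPO" --workflow issue_pm_ai.yml --limit 5
}

# Print only the log output of failed steps for one run ID.
failed_logs() {
  gh run view "$1" --repo "$REPO" --log-failed
}

# Show the core REST API quota for the current token.
rate_limit() {
  gh api rate_limit --jq '.resources.core'
}
```

A typical sequence would be `recent_runs` to find a failing run ID, `failed_logs <run-id>` to read the error, and `rate_limit` to rule out an exhausted quota.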

## Extending Skills

To add more "Skills" to the workflow:

### Example: Add code search
```yaml
- name: Search code for keywords
  run: |
    # Lowercase the issue title and use it as the search query
    KEYWORDS=$(echo "$ISSUE_TITLE" | tr '[:upper:]' '[:lower:]')
    # Test the command directly: the default shell runs with `-e`,
    # so a failed assignment followed by a `$?` check would abort the step
    if CODE_RESULTS=$(gh search code --repo "$REPO" "$KEYWORDS" --json path,repository); then
      CODE_MATCHES=$(echo "$CODE_RESULTS" | jq -r '.[] | .path')
      echo "CODE_MATCHES<<EOF" >> "$GITHUB_OUTPUT"
      echo "$CODE_MATCHES" >> "$GITHUB_OUTPUT"
      echo "EOF" >> "$GITHUB_OUTPUT"
    else
      echo "Warning: Code search failed"
      echo "CODE_MATCHES=No code matches found" >> "$GITHUB_OUTPUT"
    fi
```

### Example: Get file content
```yaml
- name: Get relevant file content
  run: |
    # Fetch and decode separately so a failed API call is not hidden
    # by the exit status of `base64`
    if ENCODED=$(gh api "repos/$REPO/contents/path/to/file" --jq '.content') &&
       FILE_CONTENT=$(echo "$ENCODED" | base64 -d); then
      echo "FILE_CONTENT<<EOF" >> "$GITHUB_OUTPUT"
      echo "$FILE_CONTENT" >> "$GITHUB_OUTPUT"
      echo "EOF" >> "$GITHUB_OUTPUT"
    else
      echo "FILE_CONTENT=Unable to fetch file" >> "$GITHUB_OUTPUT"
    fi
```

### Example: Check recent commits
```yaml
- name: Get recent commits
  run: |
    # The message and author name are nested under `.commit` in the API response
    if COMMITS=$(gh api "repos/$REPO/commits" --jq '.[0:5] | .[] | {sha, message: .commit.message, author: .commit.author.name}'); then
      echo "RECENT_COMMITS<<EOF" >> "$GITHUB_OUTPUT"
      echo "$COMMITS" >> "$GITHUB_OUTPUT"
      echo "EOF" >> "$GITHUB_OUTPUT"
    else
      echo "RECENT_COMMITS=Unable to fetch commits" >> "$GITHUB_OUTPUT"
    fi
```

## API Usage

The workflow uses:
- **GitHub CLI (`gh`)**: For repository queries (uses `GITHUB_TOKEN`)
- **GitHub Models API**: For AI-powered analysis

Both are part of GitHub's free tier for open source projects.

For more information:
- GitHub CLI: https://cli.github.com/
- GitHub Models: https://docs.github.com/en/github-models