ci: add weekly AI model update workflow #10

Open
el-schneider wants to merge 3 commits into main from ci/add-model-update-workflow

Conversation

@el-schneider
Owner

Adds a scheduled GitHub Action that runs weekly to scan the codebase for outdated AI model references and opens a PR to update them. Uses Claude CLI + OpenRouter model data.

Requires two repo secrets:

  • ANTHROPIC_API_KEY
  • PAT (fine-grained token with contents, pull-requests, and workflows write access)

@greptile-apps

greptile-apps bot commented Mar 8, 2026

Greptile Summary

This PR adds a weekly scheduled GitHub Actions workflow (.github/workflows/update-ai-models.yml) that fetches the latest AI model list from OpenRouter, feeds it to a Claude Code agent, and opens an automated PR to update any stale model references found in the codebase.

The concept is sound and the overall structure is clean, but there are a few issues worth addressing before enabling this in production:

  • Security: The agent is launched with the Bash tool enabled while ANTHROPIC_API_KEY is present in the environment. Since the task only requires reading and editing files, Bash access is unnecessary and enlarges the attack surface (e.g., prompt injection → secret exfiltration). Removing Bash from --allowedTools is a straightforward fix.
  • Change detection gap: git diff --name-only silently ignores untracked (new) files. Switching to git status --porcelain makes the check complete.
  • Missing summary file guard: body-path: /tmp/update-summary.md will cause the create-pull-request step to fail if the agent exits before writing that file. A one-line fallback ensures the step is always safe to run.
  • Self-modification risk: The prompt tells the agent to scan .github/workflows/ (including itself) and contains a concrete versioned model ID as an example. If that example is ever deemed outdated, the agent will rewrite the workflow prompt, silently altering future behaviour. Excluding .github/workflows/ from the scan scope, or replacing the example with a non-versioned placeholder, removes this risk.
  • Unpinned CLI dependency: Installing @anthropic-ai/claude-code without a version pin means a breaking release could silently break weekly runs.

Confidence Score: 2/5

  • Not safe to merge as-is — the unrestricted Bash tool combined with a live API key in the environment poses a real security risk, and two logic bugs could cause silent failures or workflow corruption.
  • The Bash-tool-plus-secret combination is a concrete security concern that should be resolved before the workflow goes live. Additionally, the change-detection bug and missing summary-file guard could produce confusing silent failures, and the self-referential scan could corrupt the workflow on future runs. These are all straightforward to fix but together lower confidence significantly.
  • .github/workflows/update-ai-models.yml requires attention for the security and logic issues described above.

Comments Outside Diff (1)

  1. .github/workflows/update-ai-models.yml, lines 66-76

    body-path will fail if /tmp/update-summary.md was never created

    If the claude step exits with an error (API failure, timeout, etc.) before writing /tmp/update-summary.md, but git status still shows changes (e.g., a partial edit was written), peter-evans/create-pull-request will fail trying to read a nonexistent file.

    Add a fallback that ensures the file always exists before the PR step runs:

          - name: Ensure summary file exists
            if: steps.changes.outputs.has_changes == 'true'
            run: |
              if [ ! -f /tmp/update-summary.md ]; then
                echo "Automated AI model reference update." > /tmp/update-summary.md
              fi

    Place this step between "Check for changes" and "Create pull request".

Last reviewed commit: 204f19a

Comment on lines +38 to +57
cat <<'PROMPT' | claude -p --model sonnet --allowedTools "Read,Write,Edit,Bash"
Read /tmp/latest-models.json — it contains the 5 most recent models per
provider, sorted by release date.

Scan this entire codebase (excluding .github/workflows/) for ALL
mentions of specific AI model names or IDs (e.g. claude-sonnet-4-20250514,
gpt-4o, gemini-2.0-flash). This includes illustrative examples, prompts,
comments, and documentation — not just functional references.
Do NOT touch model aliases like "haiku", "sonnet", "opus" — those resolve
automatically and are intentional.

Compare found references against the latest models. The most recently
created non-preview model in each family is the current stable version.
Update any outdated ones to their current equivalents. Preserve intent — if a
small/fast model was used, pick the current small/fast model, not the largest.

If nothing is outdated, change nothing.

Write a summary of changes (with old → new table) to /tmp/update-summary.md.
PROMPT

Unrestricted Bash tool grants the agent arbitrary code execution

The agent is launched with --allowedTools "Read,Write,Edit,Bash", giving it full shell access in a CI environment where ANTHROPIC_API_KEY is injected via env. This means the agent could — intentionally or through a prompt injection attack — exfiltrate the secret, make arbitrary network requests, or run destructive commands on the runner.

Since the task is purely "read files, compare model names, edit files", the Bash tool is not needed. Removing it limits the agent's blast radius to file read/write operations only:

Suggested change
- cat <<'PROMPT' | claude -p --model sonnet --allowedTools "Read,Write,Edit,Bash"
- Read /tmp/latest-models.json — it contains the 5 most recent models per
- provider, sorted by release date.
- Scan this entire codebase (excluding .github/workflows/) for ALL
- mentions of specific AI model names or IDs (e.g. claude-sonnet-4-20250514,
- gpt-4o, gemini-2.0-flash). This includes illustrative examples, prompts,
- comments, and documentation — not just functional references.
- Do NOT touch model aliases like "haiku", "sonnet", "opus" — those resolve
- automatically and are intentional.
- Compare found references against the latest models. The most recently
- created non-preview model in each family is the current stable version.
- Update any outdated ones to their current equivalents. Preserve intent — if a
- small/fast model was used, pick the current small/fast model, not the largest.
- If nothing is outdated, change nothing.
- Write a summary of changes (with old → new table) to /tmp/update-summary.md.
- PROMPT
+ cat <<'PROMPT' | claude -p --model sonnet --allowedTools "Read,Write,Edit"

Comment on lines +59 to +64
      - name: Check for changes
        id: changes
        run: |
          if [ -n "$(git diff --name-only)" ]; then
            echo "has_changes=true" >> "$GITHUB_OUTPUT"
          fi

New (untracked) files are invisible to git diff --name-only

git diff --name-only reports only unstaged changes to files that git already tracks. If the agent's only change is a brand-new file (e.g., it writes a note or creates a file while using Bash), that file will not appear in the diff output, has_changes will not be set, and the PR creation step will be silently skipped.

Use git status --porcelain to capture both modified tracked files and new untracked files:

Suggested change
-      - name: Check for changes
-        id: changes
-        run: |
-          if [ -n "$(git diff --name-only)" ]; then
-            echo "has_changes=true" >> "$GITHUB_OUTPUT"
-          fi
+      - name: Check for changes
+        id: changes
+        run: |
+          if [ -n "$(git status --porcelain)" ]; then
+            echo "has_changes=true" >> "$GITHUB_OUTPUT"
+          fi
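
The difference between the two checks can be reproduced in a throwaway repository. The following is a minimal sketch, assuming git is on the PATH; the scratch directory and the file name new-file.md are illustrative and not taken from the workflow.

```shell
#!/bin/sh
# Sketch: compare the two change checks in a scratch repository.
# Nothing here is part of the PR; names are placeholders.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .

# A brand-new file that has never been added is untracked.
echo "note" > new-file.md

# Reports only unstaged changes to tracked files, so this stays empty.
diff_out=$(git diff --name-only)

# Lists untracked files as well, marked with "??".
status_out=$(git status --porcelain)

echo "git diff --name-only:   '${diff_out}'"
echo "git status --porcelain: '${status_out}'"
```

With only an untracked file present, the first check yields an empty string while the second yields a non-empty one, which is the case the suggested change is meant to cover.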

Comment on lines +42 to +46
Scan this entire codebase (excluding .github/workflows/) for ALL
mentions of specific AI model names or IDs (e.g. claude-sonnet-4-20250514,
gpt-4o, gemini-2.0-flash). This includes illustrative examples, prompts,
comments, and documentation — not just functional references.
Do NOT touch model aliases like "haiku", "sonnet", "opus" — those resolve

Agent instructed to scan its own workflow file, risking self-modification

The prompt tells the agent to scan the codebase, including .github/workflows/, which contains this very file. The prompt itself contains a full model ID as an illustrative example (claude-sonnet-4-20250514). If that example becomes "outdated" by the agent's criteria, the agent will rewrite the prompt embedded in the workflow, silently changing its behaviour or breaking the heredoc on the next run.

Consider either:

  1. Excluding .github/workflows/ from the scan scope, or
  2. Removing the concrete model-ID example from the prompt (replacing it with a placeholder like <model-id>) so there is nothing for the agent to match.
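
Alongside either option, a step after the agent run could refuse to continue when anything under .github/workflows/ was touched. This is a hedged sketch rather than part of the PR; the helper name workflows_touched is hypothetical, and it assumes the step runs from the repository root.

```shell
# Sketch of a guard (hypothetical helper, not part of this PR).
# Succeeds (exit 0) if any tracked or untracked path under
# .github/workflows/ differs from the last commit.
workflows_touched() {
  [ -n "$(git status --porcelain -- .github/workflows)" ]
}

# Intended use in a workflow step, run from the repository root:
#   if workflows_touched; then
#     echo "agent modified workflow files; aborting" >&2
#     exit 1
#   fi
```

The guard limits the check to one path with a pathspec, so edits elsewhere in the repository still flow into the PR as before.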

      - uses: actions/checkout@v6

      - name: Install Claude CLI
        run: npm install -g @anthropic-ai/claude-code

Unpinned @anthropic-ai/claude-code version

Installing the package without a version pin means a breaking release could silently break the weekly workflow. Consider pinning to a known-good version:

Suggested change
-        run: npm install -g @anthropic-ai/claude-code
+        run: npm install -g @anthropic-ai/claude-code@latest

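Note that @latest resolves to the newest published release at install time, so it behaves the same as the unpinned install. Pin a specific semver (e.g. @anthropic-ai/claude-code@1.x.x) to get reproducible runs and only update intentionally.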
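
One way to keep a pin honest is to check that the configured version is an exact MAJOR.MINOR.PATCH string before installing. The sketch below is an assumption-laden illustration: the helper is_exact_pin and the variable CLAUDE_CODE_VERSION are hypothetical names, and no particular version is being recommended.

```shell
# Sketch (hypothetical helper, not part of this PR).
# Succeeds only for an exact MAJOR.MINOR.PATCH version; rejects
# dist-tags such as "latest" and ranges such as "1.x.x" or "^1.2.3".
is_exact_pin() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}

# Intended use in the install step, with CLAUDE_CODE_VERSION set by
# the workflow to a version that has been tested:
#   is_exact_pin "$CLAUDE_CODE_VERSION" || exit 1
#   npm install -g "@anthropic-ai/claude-code@${CLAUDE_CODE_VERSION}"
```

Keeping the version in one variable means an upgrade is a one-line, reviewable change instead of something that happens implicitly on the next scheduled run.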
