Automated issue-to-deploy pipeline
End-to-end flow from a user clicking Report Issue in the portal to a fix running in production:
Portal header (bug icon) [App/components/layout/report-issue-button.tsx]
│ server action → Forgejo API
▼
Forgejo issue (label: portal) [git.pelagiamarine.com/shad0w/pelagia-portal]
│ polled every 10 min by cron on pms1 (claude-issue-watcher.sh)
▼
TRIAGE (watcher phase 1) [pms1, headless Claude Code, analysis only]
│ Claude reads the issue + repo, posts a requirements-breakdown comment,
│ and routes it: adds `claude-queue` (auto-fixable) or `interactive` (human)
▼
FIX (watcher phase 2, only for claude-queue) [headless Claude Code in ~/pelagia-autofix]
│ Claude implements + verifies fix; watcher pushes branch claude/issue-N
│ and opens a PR (label: claude-pr)
▼
Human review: merge the PR, then create a release tag vX.Y.Z
│ tag push triggers .forgejo/workflows/deploy.yml
▼
forgejo-runner on pms1 (pm2: forgejo-runner, label "host")
│ checks out the tag in ~/pms, pnpm install + build + prisma migrate deploy
▼
pm2 restart ppms → live at pms.pelagiamarine.com
interactive-routed issues stop after triage for a human to pick up (run with
Claude in a steered session). The triage breakdown comment is plain (no bot
marker) so, for claude-queue issues, the fix stage reads it back as refined
requirements.
Contribution policy (all changes via PR)
Every change lands through a pull request — no direct pushes to master. This applies
to humans and to the automated pipeline alike (the watcher already opens PRs).
Each PR must include:
- Tests for any code change. Model: the integration test on `claude/issue-12`. It targets the prod-mirror test DB, anchors on existing rows, inserts fixtures via raw SQL (schema-tolerant), isolates them with a unique prefix, and cleans up in `afterEach`. Docs/config/automation-only PRs are exempt.
- Docs updates where relevant (`App/README.md`, `App/CLAUDE.md`, `Docs/`, this file, `CHANGELOG.md`).
Enforcement — .forgejo/workflows/pr-checks.yml
runs on every PR into master:
- Test-presence gate: a PR touching `App/app|lib|components|hooks` with no test change fails. Justify genuine exceptions in the PR body for a reviewer to override.
- Type-check: `pnpm type-check` must be clean across the whole project (tests included). The test suite's old type baseline was repaired when this gate landed.
- Unit tests: `pnpm test` must pass.
All three are hard gates. pnpm lint is intentionally not run — it currently
requires an interactive ESLint migration (a follow-up). Integration tests are
type-checked here but executed against the pelagia_test DB by the autofix / locally
(not in this shared CI, to avoid prod-mirror schema drift).
A PULL_REQUEST_TEMPLATE.md carries the checklist.
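The test-presence gate can be pictured as a filter over the PR's changed paths. The sketch below is illustrative only and is not the contents of `pr-checks.yml`: the function name `test_presence_gate` and the patterns used to recognise a test file (`.test.` or `.spec.` in the name, or a `__tests__/` or `tests/` directory) are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a test-presence gate; NOT the real pr-checks.yml logic.
# Reads changed file paths on stdin, one per line.
# Returns 0 (pass) or 1 (guarded code changed with no test change).
test_presence_gate() {
  local touches_code touches_tests path
  touches_code=0
  touches_tests=0
  while IFS= read -r path || [ -n "$path" ]; do
    case $path in
      # Assumed test-file patterns; adjust to the project's real conventions.
      *.test.*|*.spec.*|*__tests__/*|tests/*|*/tests/*) touches_tests=1 ;;
      # Guarded source directories named in the policy above.
      App/app/*|App/lib/*|App/components/*|App/hooks/*) touches_code=1 ;;
    esac
  done
  if [ "$touches_code" -eq 1 ] && [ "$touches_tests" -eq 0 ]; then
    echo "FAIL: guarded code changed with no accompanying test change" >&2
    return 1
  fi
  return 0
}
```

It could be fed from something like `git diff --name-only <base>...HEAD | test_presence_gate`, where `<base>` is whatever ref the workflow compares against.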
Components
| Piece | Where | Notes |
|---|---|---|
| Report Issue button | `App/components/layout/report-issue-button.tsx` + `report-issue-actions.ts` | Any signed-in user; files issue with only the `portal` label (triage routes it) |
| Forgejo helper | `App/lib/forgejo.ts` | Needs `FORGEJO_URL`, `FORGEJO_REPO`, `FORGEJO_TOKEN` env (token scope: `write:issue`) |
| Issue watcher (active) | `automation/claude-issue-watcher.sh` on pms1 | Bash port; runs 24/7 via cron. Config + logs under `~/issue-watcher/` |
| Issue watcher (Windows, disabled) | `automation/claude-issue-watcher.ps1` | PowerShell original. `PelagiaClaudeIssueWatcher` task is disabled (pms1 is the sole worker; two pollers would race) |
| PR review-comment watcher | `automation/claude-pr-review-watcher.sh` on pms1 | Addresses `claude-review:` comments on Claude-raised PRs. Own cron entry, own clone (`~/pelagia-pr-review`), own config + lock. See below |
| Deploy workflow | `.forgejo/workflows/deploy.yml` | Triggers on `v*` tags; runs on the `host` runner |
| Runner | pms1 `~/forgejo-runner`, pm2 process `forgejo-runner` | Registered as `pms1-host` with labels `host`, `docker` |
Where the watcher runs (pms1)
The watcher runs on pms1 under cron (every 10 min), polling Forgejo over the
local loopback (http://127.0.0.1:3001).
- Script: `~/issue-watcher/claude-issue-watcher.sh` (source: `automation/claude-issue-watcher.sh`)
- Config: `~/issue-watcher/watcher.config.json` (gitignored; holds the token + `claudeExe` = the nvm `claude` path)
- Work clone: `~/pelagia-autofix` (separate from the deployed `~/pms`)
- Logs: `~/issue-watcher/logs/` (`watcher-<date>.log`, per-issue `claude-*.log`, `cron.log`)
- Crontab: `*/10 * * * * PATH=<nvm bin>:... ~/issue-watcher/claude-issue-watcher.sh >> ~/issue-watcher/logs/cron.log 2>&1`
Auth: Claude Code must be signed in on pms1 (ssh in, run claude, complete
the login → writes ~/.claude/.credentials.json). The watcher has a preflight that
no-ops until those credentials exist, so cron can be enabled before sign-in and
activates automatically once signed in. (An ANTHROPIC_API_KEY env var also satisfies it.)
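A minimal sketch of such a preflight, assuming it only needs to test for the credentials file or the environment variable named above. The function name `claude_auth_ready` is hypothetical, and the real watcher may check more than this.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the auth preflight; the real watcher may differ.
# Succeeds when Claude Code credentials exist on disk or an API key is set.
claude_auth_ready() {
  [ -f "$HOME/.claude/.credentials.json" ] || [ -n "${ANTHROPIC_API_KEY:-}" ]
}
```

A cron-driven script could call `claude_auth_ready || exit 0` near the top, so a run before sign-in ends quietly instead of failing.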
The Windows variant (.ps1 + register-watcher-task.ps1) is the portable fallback;
re-enable its task only if pms1 is unavailable, and disable one before enabling the other.
PR review-comment watcher
Where the issue watcher turns issues into PRs, the PR review-comment watcher
(automation/claude-pr-review-watcher.sh) closes the
loop on the other side: it addresses review comments left on the PRs Claude already
raised. This is how you iterate on an automated PR without dropping into an
interactive session — leave a comment, Claude pushes a follow-up commit.
How to use it (as a reviewer): on any open Claude-raised PR, leave a comment that
starts with the marker claude-review: — the text after the marker is the
instruction. It works in three places:
- the PR conversation (a normal PR comment),
- a review summary (the overall body of a submitted review),
- an inline / on-file comment (Claude is given the file, line, and diff hunk).
Example inline comment on App/lib/foo.ts:
claude-review: this should null-check `order.vendor` before dereferencing it, and add a test for the null case.
What the watcher does each run (every 10 min via cron):
- Lists open PRs Claude raised: the head branch starts with `prBranchPrefix` (`claude/`) or the PR is labelled `claude-pr`.
- Collects every `claude-review:` comment from repo collaborators only (write access; the repo owner is always included). Comments from anyone else, and the bot's own comments, are ignored. This is the safety gate: only trusted users can make Claude push code.
- Skips comments already handled in a previous run (tracked by a hidden `<!-- ppms-review-bot handled: … -->` marker the bot stamps on its acknowledgements, so a 10-minute poll never redoes the same comment).
- Checks out the PR's own branch in `~/pelagia-pr-review`, runs headless Claude Code with the collected instructions (+ the same `pelagia_test` / port-3100 test environment the fixer uses), then pushes the new commit(s) to the same branch, updating the open PR in place.
- Acknowledges: posts a reply listing what it addressed (with the handled marker) and adds a 🚀 reaction to each handled PR-conversation comment.
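The comment-selection steps above can be sketched as two small filters. This is an illustration under assumptions, not code from `claude-pr-review-watcher.sh`: the function names are hypothetical, the marker is matched strictly at the start of the body, and the list of trusted logins is assumed to have been fetched already (how the watcher obtains it is not shown here).

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not taken from claude-pr-review-watcher.sh.
MARKER='claude-review:'

# Prints the instruction and returns 0 if the comment body starts with the
# marker; returns 1 for any other comment.
review_instruction() {
  local body instruction
  body=$1
  case $body in
    "$MARKER"*) ;;
    *) return 1 ;;
  esac
  instruction=${body#"$MARKER"}
  # Drop spaces between the marker and the instruction text.
  while [ "${instruction# }" != "$instruction" ]; do
    instruction=${instruction# }
  done
  printf '%s\n' "$instruction"
}

# Returns 0 if the author appears in the trusted list
# (one login per line on stdin, exact match).
is_trusted_author() {
  grep -Fxq -- "$1"
}
```

Exact-line matching (`grep -Fx`) is used so that a login that merely contains a trusted name does not pass the gate.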
If Claude judges a comment unclear, out of scope, or too risky to do unattended
(migrations, payments, permissions), it makes no commit for it and the watcher posts a
"produced no change — a human may need to take these" reply. The comments are still
marked handled so the poll doesn't loop on them; re-comment with a clearer
claude-review: instruction to retry.
Deploy on pms1 (mirrors the issue watcher):
# 1. Place the script + config alongside the issue watcher
mkdir -p ~/pr-review-watcher/logs   # cp and the cron redirect below need these directories to exist
cp automation/claude-pr-review-watcher.sh ~/pr-review-watcher/
cp automation/pr-review-watcher.config.example.json ~/pr-review-watcher/pr-review-watcher.config.json
# 2. Edit the config: real token (scope write:repository,write:issue), claudeExe = `which claude`
# 3. Add a crontab entry, OFFSET from the issue watcher so the two don't run at the same minute:
# 5,15,25,35,45,55 * * * * PATH=<nvm bin>:$PATH ~/pr-review-watcher/claude-pr-review-watcher.sh >> ~/pr-review-watcher/logs/cron.log 2>&1
- Token scope: needs `write:repository` (push to the PR branch) plus `write:issue` (post comments + reactions), one scope more than the issue watcher.
- Own everything: separate clone (`~/pelagia-pr-review`), config (`pr-review-watcher.config.json`), and lock (`.pr-review-watcher.lock`) so it never races the issue watcher. Logs land in the same `logs/` dir (`pr-review-<date>.log`, per-PR `claude-pr-<n>-*.log`).
- Same auth preflight as the issue watcher: no-ops until Claude Code is signed in on pms1 (or `ANTHROPIC_API_KEY` is set).
- A Windows `.ps1` port is not provided yet (pms1 is the sole worker); port it from `claude-issue-watcher.ps1` only if you need a failover.
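How the lock is implemented is not described here. One common way to get a single-run lock file in shell is `noclobber`, sketched below under that assumption; `acquire_lock` and `release_lock` are hypothetical names, and stale-lock recovery is deliberately left out.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a single-run lock; the real watcher's mechanism may differ.
# With noclobber set, '>' fails if the file already exists, so only one
# caller can create the lock file.
acquire_lock() {
  ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null
}

# Removes the lock file so the next run can proceed.
release_lock() {
  rm -f -- "$1"
}
```

A script would typically pair this with `trap 'release_lock "$LOCK"' EXIT` right after a successful `acquire_lock "$LOCK"`, so the lock is released even when the run fails.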
Test database (for autofix verification)
So the fix stage can verify against realistic data without touching production:
- `pelagia_test`: a PostgreSQL database on pms1, owned by `pelagia_user`, that is a daily mirror of production (`pelagia`). Created once as superuser; refreshed by `automation/refresh-test-db.sh` via cron at 03:30 (`pg_dump pelagia | psql pelagia_test`).
- The autofix clone's `~/pelagia-autofix/App/.env` points `DATABASE_URL` at `pelagia_test` and runs in safe dev mode: no Resend/SSO secrets, so email is console-logged and storage is local. `NEXTAUTH_URL` / `PORT` are set to 3100 (production app is on 3000).
- The fix prompt tells Claude it may run integration tests against this DB (`set -a; . ./.env; set +a; pnpm test:integration`) and may start a dev server on port 3100 only, stopping it by port (`fuser -k 3100/tcp`), never with a broad `pkill next`, which would take down production (it also runs a `next-server`).
Because the test DB is refreshed daily, anything the autofix writes to it (test data,
schema experiments) is disposable. Schema-migration issues are routed to interactive
by triage, so the unattended fixer should not be altering the schema anyway.
Staging (smoke test before deploy)
automation/staging-up.sh (deployed to ~/issue-watcher/ on pms1) brings up a
staging instance of the latest master so changes can be clicked through
before a release tag deploys them to prod.
- Checkout: `~/pelagia-staging` (separate from `~/pms` and `~/pelagia-autofix`)
- Process: pm2 `ppms-staging` on port 3200, against the prod-mirror test DB (`pelagia_test`), safe dev mode (console email, local storage, SSO disabled).
- Auto-refresh: `.forgejo/workflows/staging.yml` rebuilds staging on every push to `master` (i.e. every merged PR) on the host runner, so staging always tracks the trunk. It runs `~/issue-watcher/staging-up.sh`; concurrent runs are coalesced (newest master wins). Also triggerable on demand (`workflow_dispatch`).
- Manual refresh / restart: re-run `~/issue-watcher/staging-up.sh`.
- Stop: `pm2 delete ppms-staging`.
- Access is SSH-tunnel only: the dev server binds to `127.0.0.1:3200`, so it is not reachable from the public internet. Open a tunnel and browse `http://localhost:3200`: `ssh -L 3200:localhost:3200 shad0w@<pms1>`. On Windows, the desktop shortcut "Pelagia Staging (tunnel)" (`automation/staging-tunnel.cmd`) opens the tunnel and the browser in one click.
- A fixed banner "INTERNAL DEV / STAGING - NOT PRODUCTION" is shown (driven by `NEXT_PUBLIC_ENV_LABEL` in the staging `.env`; the `EnvBanner` component renders nothing when the var is unset, so production is unaffected).
- Log in with a password user (SSO is off here), e.g. `admin@pelagiamarine.com`.
Issue label lifecycle
portal ──(triage)──▶ triaged + claude-queue ─▶ claude-working ─▶ claude-pr | claude-failed
└─────▶ triaged + interactive (stops here — handle with Claude interactively)
- Triage owns routing for every `portal` issue. Each untriaged portal issue is triaged once (`maxTriagePerRun` per run); triage adds `triaged`, a routing label (`claude-queue` or `interactive`), a type label (`bug` or `feature`), and posts a breakdown. Triage skips an issue only once it carries `triaged`, `interactive`, `claude-working`, `claude-pr`, or `claude-failed`.
- `claude-queue` alone does NOT skip triage on a portal issue. The Report Issue button may stamp `claude-queue` at creation; triage still claims the issue and decides routing (stripping the stray `claude-queue` if it routes to `interactive`). This is why triage works even if an older button build is deployed.
- `claude-queue` → `claude-working` → `claude-pr` (PR opened) or `claude-failed`.
- To retry a failed issue, re-add `claude-queue` (and remove `claude-failed`).
- To queue a non-portal issue for Claude (skipping triage), add `claude-queue` directly; triage never claims issues without the `portal` label.
- To force a portal issue straight to fix, add `triaged` + `claude-queue` yourself.
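The triage claim rule above reduces to a small predicate over an issue's labels. The sketch below is a reading of that rule, not the watcher's actual code; `needs_triage` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the triage skip rule; not the watcher's real code.
# Usage: needs_triage label1 label2 ...
# Returns 0 if triage should claim the issue, 1 if it should skip it.
needs_triage() {
  local has_portal label
  has_portal=0
  for label in "$@"; do
    case $label in
      portal) has_portal=1 ;;
      # Any of these means triage already handled the issue or it moved on.
      triaged|interactive|claude-working|claude-pr|claude-failed) return 1 ;;
    esac
  done
  # claude-queue is deliberately absent from the skip list above, and
  # issues without the portal label are never claimed.
  [ "$has_portal" -eq 1 ]
}
```

For example, `needs_triage portal claude-queue` succeeds (a stray `claude-queue` does not block triage), while `needs_triage claude-queue` fails because the issue has no `portal` label.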
Releasing
⚠️ Release tags MUST be `v`-prefixed (e.g. `v0.2.2`). `deploy.yml` triggers only on `v*` tags, so a bare tag like `0.2.2` will NOT deploy (the runner ignores it and prod stays on the previous version). Push the tag specifically; pushing `master` alone never deploys.
After merging PR(s) on master:
git pull
git tag v0.2.2 # MUST start with "v"; semver: patch = fixes, minor = features
git push pms1 v0.2.2 # pushing the v* tag is what triggers the deploy
The runner checks out the tag in ~/pms, runs pnpm install + build +
prisma migrate deploy, pm2 restart ppms, and verifies /login returns 200. Watch
progress under Actions on the Forgejo repo, or pm2 logs forgejo-runner on pms1.
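Because a bare tag silently skips the deploy, a small guard before pushing can catch the mistake. This is a hypothetical helper, not part of the repo. It is also stricter than `deploy.yml`, which triggers on any `v*` tag: the sketch accepts only `vMAJOR.MINOR.PATCH`.

```shell
#!/usr/bin/env bash
# Hypothetical pre-push helper; not part of the repo.
# Accepts only vMAJOR.MINOR.PATCH (stricter than the v* trigger in deploy.yml).
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}
```

Usage could look like `is_release_tag "$tag" && git push pms1 "$tag"`, which refuses to push `0.2.2` but lets `v0.2.2` through.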
Microservices (GstService / EpfoService / PdfService)
The standalone Playwright services are deployed by the same v* tag as the app.
~/pms was historically a sparse checkout limited to App/, so the service
folders never landed on disk; the deploy now disables sparse-checkout (idempotent)
to materialise the whole tree before managing the services. After the app restart,
deploy.yml:
- expands the working tree (`sparse-checkout disable`) and exports the few secrets the services need out of `~/pms/App/.env` (`PDF_SERVICE_TOKEN`, `ALLOWED_ORIGIN`, `EPFO_LIVE`), never `PORT` or the runner's ephemeral `FORGEJO_TOKEN`;
- for each service folder present, runs `npm install` + `npx playwright install chromium` + `npm run build`;
- runs `pm2 startOrReload ecosystem.config.js --update-env`, which creates the pm2 processes on the first release and reloads them on every release after, then `pm2 save`;
- health-checks `:3003` / `:3004` / `:3005` (`/health` → 200).
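The final health-check step can be reproduced by hand on pms1 when a release looks unhealthy. The sketch below is an assumption about how such a check might be written, not the contents of `deploy.yml`: the function names are hypothetical, and it assumes the services listen on `127.0.0.1`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the post-deploy health check; deploy.yml may differ.
# Prints the HTTP status code for a URL (curl reports 000 if it cannot connect).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

# Checks /health on each given port; returns 1 if any port is not 200.
check_services() {
  local port code failed
  failed=0
  for port in "$@"; do
    # Assumed host: the services are taken to listen on loopback.
    code=$(http_status "http://127.0.0.1:${port}/health") || true
    if [ "$code" = 200 ]; then
      echo "port ${port}: ok"
    else
      echo "port ${port}: unhealthy (HTTP ${code})" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

Running `check_services 3003 3004 3005` reports each port and exits non-zero if any service is down, which makes it usable as a one-line check after a deploy.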
ecosystem.config.js (repo root) is the source of truth: canonical pm2 names
gst-service / epfo-service / pdf-service, fixed ports, and it registers
only services whose folder is checked out (so a not-yet-merged service is
skipped, and adopted automatically once its PR lands).
One-time alignment: if a service is already running on pms1 under a different
pm2 name, delete it once (pm2 delete <old-name> && pm2 save) so the canonical
process can bind its port — otherwise the new one fails on a port clash. PdfService
additionally needs Chromium system libs the first time (npx playwright install --with-deps chromium, which needs sudo); the deploy's plain playwright install chromium only fetches the browser binary.
Operational notes
- The watcher runs Claude Code with `--dangerously-skip-permissions` inside the dedicated `pelagia-autofix` clone. Never point `workDir` at your main checkout.
- (Windows fallback only) The watcher only works issues while that PC is on; queued issues are picked up on the next run after boot (`-StartWhenAvailable`).
- Tokens: `portal-report-issue` (write:issue, used by the app) and `claude-watcher` (write:issue + write:repository, used by the watcher). Both belong to the `shad0w` Forgejo account. Rotate via `docker exec -u 1000 forgejo forgejo admin user generate-access-token ...`.
- Server-side env for the button lives in `~/pms/App/.env` on pms1 (`FORGEJO_URL=http://127.0.0.1:3001` so it does not depend on the tunnel).
- Known Forgejo 10 bug: clicking Update branch on a PR (or pushing to its head branch) can make the page show "This pull request is broken due to missing fork information" even though the PR is fine (API still reports `mergeable: true`). Fix: close and reopen the PR, via the UI or:

      $h = @{ Authorization = "token <claude-watcher token>" }
      Invoke-RestMethod -Method Patch -Headers $h -ContentType application/json `
        -Uri https://git.pelagiamarine.com/api/v1/repos/shad0w/pelagia-portal/pulls/<N> -Body '{"state":"closed"}'
      Invoke-RestMethod -Method Patch -Headers $h -ContentType application/json `
        -Uri https://git.pelagiamarine.com/api/v1/repos/shad0w/pelagia-portal/pulls/<N> -Body '{"state":"open"}'

  Fixed upstream in newer Gitea/Forgejo, so it resolves itself if Forgejo is upgraded past v10.