Hardik edited this page 2026-06-19 14:06:07 +05:30

Issue-to-Deploy Pipeline

A self-hosted pipeline takes a user-reported bug from a click in the portal all the way to a production fix, with a human gate only at PR-merge / release. It runs on pms1 (Forgejo + headless Claude Code). Full runbook: automation/README.md.

End-to-end flow

Portal header "Report Issue"        [components/layout/report-issue-button.tsx]
        │  server action → Forgejo API (label: portal)
        ▼
Forgejo issue                       [git.pelagiamarine.com/shad0w/pelagia-portal]
        │  polled every 10 min (cron on pms1)
        ▼
TRIAGE  (watcher phase 1)           [headless Claude Code, analysis only]
        │  posts a requirements breakdown; routes the issue:
        │  → claude-queue (auto-fixable)   or   → interactive (human)
        ▼
FIX  (watcher phase 2, claude-queue only)   [in ~/pelagia-autofix clone]
        │  implements + verifies; pushes branch claude/issue-N; opens PR (claude-pr)
        ▼
Human: review + merge PR, then push a release tag vX.Y.Z
        │  tag push triggers .forgejo/workflows/deploy.yml
        ▼
forgejo-runner on pms1 (label "host")
        │  checkout tag in ~/pms → pnpm install + build + migrate deploy
        ▼
pm2 restart ppms  →  live at pms.pelagiamarine.com

Issues routed to interactive stop after triage and wait for a human to pick them up. The triage breakdown comment is posted as a plain comment (no bot marker), so for claude-queue issues the fix stage reads it back as refined requirements.
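
The release step at the end of the flow only deploys if the pushed tag matches the workflow's v* trigger. As a minimal sketch, a guard like the one below could catch mistyped tags before they are pushed. The function names is_release_tag and push_release_tag, the strict vX.Y.Z pattern, and the remote name origin are assumptions for illustration; none of them come from the repository.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only for tags of the form vX.Y.Z (digits only).
# The deploy workflow triggers on any v* tag; this stricter check is an assumption.
is_release_tag() {
  printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Hypothetical wrapper: tag the current commit and push the tag, which is the
# event that triggers .forgejo/workflows/deploy.yml. Assumes a remote named origin.
push_release_tag() {
  if ! is_release_tag "$1"; then
    echo "refusing to push '$1': expected vX.Y.Z" >&2
    return 1
  fi
  git tag "$1" && git push origin "$1"
}
```

Calling push_release_tag with a value such as 1.4.0 (missing the leading v) returns non-zero without touching git, so a tag that would not trigger the deploy is never created.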

Components

| Piece | Where | Notes |
| --- | --- | --- |
| Report Issue button | App/components/layout/report-issue-button.tsx + report-issue-actions.ts | Any signed-in user; files an issue with only the portal label |
| Forgejo helper | App/lib/forgejo.ts | Needs FORGEJO_URL, FORGEJO_REPO, FORGEJO_TOKEN (scope write:issue) |
| Issue watcher (active) | automation/claude-issue-watcher.sh on pms1 | Bash; 24/7 via cron; config + logs under ~/issue-watcher/ |
| Issue watcher (Windows, disabled) | automation/claude-issue-watcher.ps1 | PowerShell original; PelagiaClaudeIssueWatcher task disabled (one worker only) |
| Deploy workflow | .forgejo/workflows/deploy.yml | Triggers on v* tags; runs on the host runner |
| Runner | pms1 ~/forgejo-runner, pm2 forgejo-runner | Registered pms1-host, labels host, docker |
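
The Forgejo helper depends on three environment variables, and a missing one only surfaces when a user clicks the button. A small preflight along the following lines could report the gap at start-up instead. This is a sketch only: the variable names are taken from the table above, while the function name missing_forgejo_env and the idea of checking from a shell script are assumptions.

```shell
#!/bin/sh
# Hypothetical preflight: print the name of every required Forgejo variable
# that is unset or empty, one per line. Prints nothing when all are present.
missing_forgejo_env() {
  for name in FORGEJO_URL FORGEJO_REPO FORGEJO_TOKEN; do
    # Indirect lookup of the variable whose name is held in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || printf '%s\n' "$name"
  done
}
```

A start script could capture the output and refuse to launch when it is non-empty, for example `[ -z "$(missing_forgejo_env)" ] || exit 1`.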

Where the watcher runs

The watcher runs on pms1 under cron (every 10 min) and polls Forgejo over loopback (http://127.0.0.1:3001):

  • Script: ~/issue-watcher/claude-issue-watcher.sh
  • Config: ~/issue-watcher/watcher.config.json (gitignored; token + claudeExe path)
  • Work clone: ~/pelagia-autofix (separate from the deployed ~/pms)
  • Logs: ~/issue-watcher/logs/ (watcher-<date>.log, per-issue claude-*.log, cron.log)
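
To make the polling step concrete, the sketch below builds the kind of request the watcher might send on each run. The /api/v1/repos/{owner}/{repo}/issues endpoint with its state, type and labels query parameters is part of the Forgejo (Gitea-compatible) REST API; the function names, the FORGEJO_TOKEN variable, and the choice to filter by label in the query are assumptions, since the actual script is not reproduced on this page.

```shell
#!/bin/sh
# Build the issue-list URL for one label.
# $1 = base URL, $2 = owner/repo, $3 = label name
issues_url() {
  printf '%s/api/v1/repos/%s/issues?state=open&type=issues&labels=%s\n' "$1" "$2" "$3"
}

# Hypothetical fetch of open issues carrying the given label over loopback.
# Assumes the token is in FORGEJO_TOKEN; the watcher itself keeps its token
# in ~/issue-watcher/watcher.config.json.
list_issues() {
  curl -fsS -H "Authorization: token ${FORGEJO_TOKEN}" \
    "$(issues_url "http://127.0.0.1:3001" "shad0w/pelagia-portal" "$1")"
}
```

Running `list_issues portal` on pms1 would return the open portal issues as JSON, which is a quick way to check what the next cron run will see.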

Auth: Claude Code must be signed in on pms1 (~/.claude/.credentials.json), or an ANTHROPIC_API_KEY env var must be present. The watcher's preflight check no-ops until credentials exist, so cron can be enabled before sign-in and the watcher activates automatically once signed in. The watcher runs Claude with --dangerously-skip-permissions only inside the dedicated pelagia-autofix clone, never in the main checkout.
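
The preflight behaviour described above can be sketched as follows. This is an illustration, not the watcher's actual code: the function names and the log messages are hypothetical, and only the credentials path and the ANTHROPIC_API_KEY variable come from this page.

```shell
#!/bin/sh
# Succeed only when Claude Code credentials are available, either as a
# non-empty credentials file or as an ANTHROPIC_API_KEY variable.
have_claude_credentials() {
  [ -s "$HOME/.claude/.credentials.json" ] || [ -n "${ANTHROPIC_API_KEY:-}" ]
}

# One cron tick. Without credentials it reports a skip and still returns 0,
# so enabling cron before sign-in produces harmless no-op runs.
watcher_tick() {
  if ! have_claude_credentials; then
    echo "skip: no Claude credentials yet"
    return 0
  fi
  echo "run: credentials found"
  # Triage and fix phases would start here.
}

# Assumed crontab entry (the exact line on pms1 is not shown on this page):
# */10 * * * * $HOME/issue-watcher/claude-issue-watcher.sh >> $HOME/issue-watcher/logs/cron.log 2>&1
```

Returning 0 on the skip path is the design point: cron sees a successful run either way, and the first tick after sign-in proceeds without any configuration change.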

Issue label lifecycle

portal ──(triage)──▶ claude-queue ─▶ claude-working ─▶ claude-pr | claude-failed
              └────▶ interactive  (stops here — handle interactively)
  • A portal issue with no decision label is triaged once per run; triage adds claude-queue or interactive and posts a breakdown.
  • claude-queue → claude-working → claude-pr (PR opened) or claude-failed.
  • Retry a failed issue by re-adding claude-queue. Queue a manual issue (skipping triage) by adding claude-queue directly; force human handling with interactive. Triage is skipped for issues that already carry a decision label.
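
The lifecycle rules above amount to a small decision function over an issue's labels. The sketch below is one way to express it. The label names come from this page, but the function name next_action, the returned action words, and the precedence between labels that appear together (for example claude-queue re-added while claude-failed is still present) are assumptions.

```shell
#!/bin/sh
# Decide what the watcher should do with an issue.
# $1 = the issue's labels as a space-separated string.
# Prints one of: triage, fix, none.
next_action() {
  case " $1 " in
    *" interactive "*)                    echo none ;;   # human handling wins
    *" claude-working "*|*" claude-pr "*) echo none ;;   # in progress or PR open
    *" claude-queue "*)                   echo fix ;;    # queued, retried, or manual
    *" claude-failed "*)                  echo none ;;   # waits for a re-queue
    *" portal "*)                         echo triage ;; # no decision label yet
    *)                                    echo none ;;
  esac
}
```

For example, `next_action "portal"` prints triage, while `next_action "claude-queue"` prints fix because an issue queued by hand skips triage.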

Autofix verification against the test DB

The fix stage verifies against realistic data without touching production:

  • The autofix clone's env file (~/pelagia-autofix/App/.env) points DATABASE_URL at pelagia_test (the daily prod-mirror) and puts the app in safe dev mode (no Resend/SSO secrets → console email, local storage). NEXTAUTH_URL and PORT both use port 3100 (production uses 3000).
  • The fix prompt allows running integration tests against this DB (set -a; . ./.env; set +a; pnpm test:integration) and starting a dev server on port 3100 only. The dev server must be stopped by port (fuser -k 3100/tcp), never with a broad pkill next, which would also take down production.
  • Schema-migration issues are routed to interactive, so the unattended fixer should not be altering the schema.
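
The two safety rules in the list above (load the test env before running anything, and stop the dev server by port only) can be wrapped in helpers like these. This is a sketch under stated assumptions: the function names load_env and stop_dev_server are hypothetical, and the hard refusal of any port other than 3100 is an added guard rather than something the fix prompt is known to enforce.

```shell
#!/bin/sh
# Export every variable defined in an env file, using the same
# set -a / . file / set +a idiom that the fix prompt allows.
# $1 = path to the env file (include a directory, e.g. ./.env).
load_env() {
  set -a
  . "$1"
  set +a
}

# Stop a dev server by port, but only on the autofix port.
# Port 3000 belongs to production, so anything other than 3100 is refused.
stop_dev_server() {
  if [ "$1" != "3100" ]; then
    echo "refusing to touch port $1" >&2
    return 1
  fi
  # fuser exits non-zero when nothing is listening; treat that as already stopped.
  fuser -k "$1/tcp" >/dev/null 2>&1 || true
}
```

A verification run in the autofix clone might then be `load_env ./.env && pnpm test:integration`, followed by `stop_dev_server 3100` once any dev server is no longer needed.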

See Deployment and Operations for the deploy workflow and staging, and automation/README.md for the authoritative runbook.