The Problem

Slow, unreliable pipelines are a silent productivity killer. A 15-minute CI build means developers context-switch while waiting, then forget to check the result. Flaky tests that fail 10% of the time train the team to ignore failures and hit “re-run.” Without dependency caching, every run re-downloads 500MB of packages. Without a rollback path, a bad release means a frantic hotfix at midnight. The pipeline should be the team’s safety net; instead, it is often the bottleneck.

The Prompt

Review the following CI/CD pipeline configuration. Act as a DevOps architect optimizing for speed, reliability, and safety.

PLATFORM: [GitHub Actions / GitLab CI / CircleCI / Jenkins / Vercel]
PROJECT TYPE: [e.g., Node.js monorepo, Python API, Astro static site]
DEPLOY TARGET: [e.g., Vercel, AWS, Netlify, Docker/Kubernetes]

PIPELINE CONFIG:
[paste .github/workflows/*.yml, .gitlab-ci.yml, Jenkinsfile, etc.]

Evaluate across these dimensions:

1. **Build Speed**
   - Are dependencies cached between runs (node_modules, pip cache)?
   - Are jobs parallelized where possible (lint + test + build in parallel)?
   - Is there unnecessary work (rebuilding unchanged packages in monorepo)?
   - Could incremental builds replace full rebuilds?

2. **Caching Strategy**
   - Are cache keys based on lock files (package-lock.json hash)?
   - Are build artifacts cached for downstream jobs?
   - Is Docker layer caching enabled for container builds?
   - Are caches invalidated when they should be (dependency updates)?

3. **Test Reliability**
   - Are tests isolated (no shared state between test jobs)?
   - Are flaky tests identified and quarantined?
   - Is there a retry strategy for infrastructure failures (network, npm registry)?
   - Are test results reported as PR checks (not just logs)?

4. **Security**
   - Are secrets stored in CI platform secrets (not in config files)?
   - Are third-party actions pinned to SHA (not tag)?
   - Is GITHUB_TOKEN scoped with minimum permissions?
   - Are pull requests and artifacts from forks handled safely (no secret access)?

5. **Deployment Safety**
   - Is there a staging → production pipeline (not direct to prod)?
   - Are deployments gated by test passage?
   - Is there an automated rollback mechanism?
   - Are deployment notifications sent (Slack, email)?
   - Are database migrations run as a separate step?

6. **Maintenance**
   - Is the pipeline DRY (reusable workflows, composite actions)?
   - Are runner versions pinned (ubuntu-22.04, not ubuntu-latest)?
   - Are unused workflows and jobs cleaned up?
   - Is pipeline duration tracked and alerted on regression?

For each issue, provide:
- **Location**: Workflow file and job/step
- **Impact**: Time wasted per run × runs per day = daily cost
- **Severity**: slow (wastes time) / unreliable (flaky) / unsafe (security) / missing (needed)
- **Fix**: Corrected pipeline configuration

Example Output

## Pipeline Review: 4 issues found

### Slow: No Dependency Caching (saves ~3 min/run)
Location: .github/workflows/ci.yml — install step
Impact: 3 min × 20 runs/day = 60 min/day of pure download time.
Fix:
  # pnpm must already be installed (e.g. via pnpm/action-setup) for cache: 'pnpm' to work
  - uses: actions/setup-node@v4
    with: { node-version: 20, cache: 'pnpm' }

### Slow: Sequential Jobs That Could Run in Parallel
Location: ci.yml — lint waits for build, but they are independent.
Impact: 4 min lint + 6 min build = 10 min sequential. Parallel = 6 min.
Fix:
  jobs:
    lint:    { runs-on: ubuntu-22.04, steps: [...] }
    test:    { runs-on: ubuntu-22.04, steps: [...] }
    build:   { runs-on: ubuntu-22.04, steps: [...] }
    deploy:  { needs: [lint, test, build], ... }

### Unsafe: Actions on Floating Tags
Location: ci.yml uses `actions/checkout@v4`, a mutable tag. It should be pinned to a full commit SHA.
Risk: A tag can be moved to a different commit, so a compromised action repository could inject malicious code into your build.
Fix: `uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1`

### Missing: No Deployment Rollback
Location: deploy.yml — deploys and hopes.
Fix: Add health check + automatic rollback:
  - name: Health check
    run: curl --fail "$DEPLOY_URL/health" || { gh workflow run rollback.yml; exit 1; }
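
Taken together, the four fixes might look something like the sketch below. It is an illustration under stated assumptions rather than a drop-in file: it assumes a pnpm project whose `package.json` defines `lint`, `test`, and `build` scripts, a hypothetical `./scripts/deploy.sh`, a hypothetical `DEPLOY_URL` repository variable, and a `rollback.yml` workflow that accepts the `workflow_dispatch` trigger. `actions/setup-node` and `pnpm/action-setup` are left on tags only for brevity; the pinning advice applies to them as well.

```yaml
# .github/workflows/ci.yml (illustrative sketch, not a drop-in file)
name: ci
on:
  push:
    branches: [main]
  pull_request:

permissions:
  contents: read                    # GITHUB_TOKEN may only read repository contents

jobs:
  checks:
    runs-on: ubuntu-22.04
    strategy:
      fail-fast: false              # let lint, test, and build each report their own result
      matrix:
        task: [lint, test, build]   # assumed package.json script names; legs run in parallel
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - uses: pnpm/action-setup@v4  # pin to a commit SHA in real use
        with:
          version: 9
      - uses: actions/setup-node@v4 # pin to a commit SHA in real use
        with:
          node-version: 20
          cache: pnpm               # caches the pnpm store, keyed on pnpm-lock.yaml
      - run: pnpm install --frozen-lockfile
      - run: pnpm run ${{ matrix.task }}

  deploy:
    needs: checks                   # runs only if every matrix leg passed
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-22.04
    permissions:
      contents: read
      actions: write                # needed so `gh workflow run` can dispatch rollback.yml
    env:
      GH_TOKEN: ${{ github.token }}
      DEPLOY_URL: ${{ vars.DEPLOY_URL }}   # hypothetical repository variable
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - run: ./scripts/deploy.sh    # hypothetical deploy script
      - name: Health check, roll back on failure
        run: |
          if ! curl --fail --silent --show-error --retry 3 "$DEPLOY_URL/health"; then
            gh workflow run rollback.yml
            exit 1                  # keep the job red even though rollback was triggered
          fi
```

Using a matrix keeps the three checks parallel without repeating the setup steps three times; separate named jobs, as in the finding above, work equally well when the checks need different setup.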

When to Use

Run this when setting up a new project’s CI/CD, when builds exceed 10 minutes, or when flaky tests undermine team confidence in the pipeline. Essential before adding deployment automation and after any incident where “the pipeline passed but production broke.”

Pro Tips

Measure before optimizing: prefix key steps with the shell’s `time` command, or use the CI platform’s timing visualization, to find the actual bottlenecks.
Pin everything: runner versions, action versions (by SHA), Node versions. “Latest” in CI is a ticking time bomb.
Ask for a pipeline diagram: follow up with “Draw an ASCII diagram showing job dependencies, parallel stages, and expected total runtime.”
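
As a rough illustration of the first two tips, the fragment below pins the runner image, the checkout action, and an exact Node version, and wraps the slow steps in `time` so their durations appear in the job log. The job name, the 15-minute budget, and the npm commands are assumptions chosen for the example; it presumes a `package-lock.json` and a `build` script.

```yaml
jobs:
  build:
    runs-on: ubuntu-22.04           # pinned image rather than ubuntu-latest
    timeout-minutes: 15             # assumed budget; a badly regressed job fails instead of hanging
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - uses: actions/setup-node@v4 # pin to a commit SHA in real use
        with:
          node-version: '20.11.1'   # exact version, not a floating major
      - name: Install (timed)
        run: time npm ci            # bash prints real/user/sys timings after the command
      - name: Build (timed)
        run: time npm run build
```

GitHub Actions also shows a duration next to every step in the run view, so `time` is mainly useful for timing individual commands inside a multi-command step.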

Related Skills

Configuration Review

Dependency Review