Pre-commit hooks, review agents, and CI that catch LLM mistakes
eval() to parse user-provided configuration
.env file with production database credentials
LLMs optimize for task completion, not operational safety.
The solution isn't to stop using them — it's to build defensive layers that catch dangerous code before it ships.
| Layer | What It Catches | When | Bypassable? |
|---|---|---|---|
| Pre-commit hooks | Patterns, secrets, type errors | Before commit | --no-verify |
| Review agents | Logic errors, context mismatches | On PR creation | Ignore comments |
| CI workflows | Everything above + integration | Before merge | No ✓ |
The first line of defense — block problems before they reach git
Blocks commits containing credential patterns. Catches API keys LLMs paste directly into code.
Enforces strict typing. LLMs are sloppy with types — this catches inconsistencies early.
Flags insecure crypto, SQL injection risks, hardcoded passwords, eval() usage.
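To make the credential-pattern idea concrete, here is a minimal sketch of line-by-line secret scanning. It is not how detect-secrets works internally (that tool has its own plugins and heuristics); the `SECRET_PATTERNS` table and `scan_text` helper are hypothetical names used only for illustration.

```python
import re

# Hypothetical, deliberately small pattern set for illustration only.
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase alphanumerics.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # A credential-looking name assigned a quoted literal of 8+ characters.
    "hardcoded_credential": re.compile(
        r"(?i)(api_?key|secret|token|password)\w*\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_text(text: str) -> list:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings
```

A hook built on something like this would exit non-zero when `scan_text` returns any findings, which is what blocks the commit.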
Commit a file with a hardcoded API key
→ detect-secrets blocks the commit
Commit code using eval() for config parsing
→ bandit flags B307: eval() usage
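For the eval() case, a possible fix is sketched below, assuming the configuration is either a Python literal or JSON. The function names `parse_config_value` and `load_config` are hypothetical; the point is that neither path executes user-supplied code.

```python
import ast
import json


def parse_config_value(raw: str):
    """Parse a Python literal (dict, list, str, number, bool) without running code."""
    # ast.literal_eval rejects calls, attribute access, and other expressions.
    return ast.literal_eval(raw)


def load_config(text: str) -> dict:
    """Parse JSON configuration and require a top-level object."""
    config = json.loads(text)
    if not isinstance(config, dict):
        raise ValueError("configuration must be a JSON object")
    return config
```

Input such as `__import__('os').system(...)` raises `ValueError` in `parse_config_value` instead of executing.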
Logic analysis that static tools can't do
Performance bottlenecks, unnecessary complexity
Missing test cases, edge case gaps
Outdated packages, license issues
Auth gaps, injection risks, data exposure
LLM performance degrades as context gets polluted with unrelated concerns. Each agent has a focused context window.
All agents are read-only — they report findings, humans decide what to fix.
~250k tokens per run • ~$4-5 cost • Run weekly
General review — code quality, bugs, tests
Security review — auth, injection, data exposure
The CI agent regularly finds files that Claude truncated or emptied during local editing sessions. Full rewrites from memory, output token limits, context pollution — all cause data loss.
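A cheap static heuristic can complement the agent here. The sketch below is an assumption-laden illustration, not the author's CI agent: it parses the output of `git diff --numstat` and flags files that lost many lines while gaining almost none. The function name and thresholds are hypothetical, and legitimate deletions will also be flagged, so treat the result as a prompt for human review.

```python
def find_suspicious_shrinkage(
    numstat: str, min_deleted: int = 50, max_ratio: float = 0.1
) -> list:
    """Flag paths whose diff deletes many lines and adds very few.

    `numstat` is the text produced by `git diff --numstat`, where each
    line is "<added>\t<deleted>\t<path>".
    """
    flagged = []
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # binary files report "-" instead of counts
        if int(deleted) >= min_deleted and int(added) <= int(deleted) * max_ratio:
            flagged.append(path)
    return flagged
```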
Redirect validation using startswith() is bypassable: //evil.com and localhost.evil.com both pass
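The redirect bypass can be reproduced in a few lines. This is a sketch under assumptions: the naive check is a guess at the general shape of the vulnerable code, and `ALLOWED_HOSTS` plus both function names are hypothetical. The safer variant parses the URL and compares the hostname exactly rather than matching a string prefix.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"localhost"}  # hypothetical allowlist


def is_safe_redirect_naive(target: str) -> bool:
    """Prefix matching: accepts //evil.com and http://localhost.evil.com."""
    return target.startswith("/") or target.startswith("http://localhost")


def is_safe_redirect(target: str) -> bool:
    """Allow single-slash relative paths or absolute URLs on allowlisted hosts."""
    try:
        parsed = urlparse(target)
    except ValueError:
        return False
    if not parsed.scheme and not parsed.netloc:
        # Relative path: reject protocol-relative "//host" and backslash tricks.
        return (
            target.startswith("/")
            and not target.startswith("//")
            and "\\" not in target
        )
    return parsed.scheme in {"http", "https"} and parsed.hostname in ALLOWED_HOSTS
```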
TOCTOU in single-use registration codes:
10 concurrent requests → 10 users on a max_uses=1 code
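The race is easy to demonstrate in memory. In this sketch the `RegistrationCode` class, the `pause` hook, and `run_concurrently` are all hypothetical; the hook stands in for request latency between the check and the write. In a real application the check and the increment would more likely be made atomic at the database level (for example a conditional update), which this example only approximates with a lock.

```python
import threading


class RegistrationCode:
    def __init__(self, max_uses: int = 1) -> None:
        self.max_uses = max_uses
        self.uses = 0
        self._lock = threading.Lock()

    def redeem_racy(self, pause=lambda: None) -> bool:
        """Check, then use, with a gap in between (time of check vs. time of use)."""
        if self.uses >= self.max_uses:
            return False
        pause()  # other requests can pass the check during this window
        self.uses += 1
        return True

    def redeem_atomic(self) -> bool:
        """Check and use under one lock so only one caller can win."""
        with self._lock:
            if self.uses >= self.max_uses:
                return False
            self.uses += 1
            return True


def run_concurrently(fn, n: int = 10) -> int:
    """Call fn from n threads and return how many calls succeeded."""
    results = []
    results_lock = threading.Lock()

    def worker() -> None:
        ok = fn()
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)
```

Passing `threading.Barrier(10).wait` as `pause` forces all ten threads through the check before any of them writes, which makes the racy outcome reproducible.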
IDOR-adjacent: updating todo with unauthorized project_id:
User associates todo with unauthorized project
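A sketch of the missing check, using in-memory stand-ins rather than the real application's models: `Todo`, `Project`, `Forbidden`, and `update_todo_project` are hypothetical names. The key point is that ownership is verified for the target project as well as for the todo being edited.

```python
from dataclasses import dataclass
from typing import Dict, Optional


class Forbidden(Exception):
    """Raised when a user touches a resource they do not own."""


@dataclass
class Project:
    id: int
    owner_id: int


@dataclass
class Todo:
    id: int
    owner_id: int
    project_id: Optional[int] = None


def update_todo_project(
    user_id: int, todo: Todo, new_project_id: int, projects: Dict[int, Project]
) -> Todo:
    """Move a todo to another project only if the user owns both."""
    if todo.owner_id != user_id:
        raise Forbidden("todo belongs to another user")
    project = projects.get(new_project_id)
    # Same error for "missing" and "not yours" to avoid leaking which IDs exist.
    if project is None or project.owner_id != user_id:
        raise Forbidden("project not found or not owned by user")
    todo.project_id = new_project_id
    return todo
```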
The enforced gate that can't be bypassed
| Tool | What It Catches | Example |
|---|---|---|
| bandit | Python-specific anti-patterns | eval(), weak crypto, hardcoded passwords |
| safety / pip-audit | Known CVEs in dependencies | Vulnerable Flask versions |
| semgrep | Semantic patterns | SQL injection via f-strings |
| CodeQL | Advanced taint analysis | User input → os.system() |
| trivy | Container/filesystem vulns | Secrets in Docker layers |
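As an illustration of the "SQL injection via f-strings" row, the sketch below contrasts an interpolated query with a parameterized one using the standard-library `sqlite3` module. The table, data, and function names are hypothetical.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)]
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Interpolates input into the SQL text: the pattern a scanner should flag."""
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return [row[0] for row in conn.execute(query)]


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    """Passes input as a bound parameter, so it is never parsed as SQL."""
    rows = conn.execute("SELECT name FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]
```

With the input `x' OR '1'='1`, the unsafe version returns every user while the safe version returns none.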
Lives in the repo root. Teaches LLMs your patterns before they generate code.
Easiest win. 15 min setup. Catches secrets and anti-patterns immediately.
Catches logic flaws static analysis can't. claude-code-action in GitHub Actions.
Multi-tool scanning. Reusable workflows. The enforced gate.
All patterns are running in my open-source repos:
github.com/brooksmcmillin/taskmanager · agents · workflows · claude-code-agents
Brooks McMillin · Infrastructure Security