Claude vs Gemini Across 4 Security Domains: A Dead Heat — and the Hardening 63% of AI Code Skips

Ofri Peretz. Posted on May 31. Originally published at ofriperetz.dev.

Tags: #ai #security #node #eslint

AI Security Benchmark Series, part 7 of 7.

The interesting result isn't who won. It's that across four security domains, Claude and Gemini missed the same hardening steps. If you've shipped AI-generated auth middleware this year, your code almost certainly has the same gaps, and your review didn't catch them either.

For the record, the scoreboard: one Gemini win, two ties, one split. A statistical dead heat. That's the last time the winner matters in this article.

Here's the number that should bother you more than any leaderboard: across 700 AI-generated functions scored by the rules I'm about to use, 63% shipped a vulnerability.

So "which model writes more secure code?" is mostly the wrong question. I've run that leaderboard myself and argued it's the wrong frame. But people keep asking it, so I ran it properly, on the ESLint security plugins I wrote specifically to catch these bugs, each mapped to a CWE, to show you what actually matters.

The setup

Four domains, four of my plugins. For each, the same feature-only prompt (no "make it secure" hint; that's how people actually use these tools), generated once by Gemini 2.5 Flash via the Gemini CLI and once by Claude Sonnet 4.6 via the

