Ofri Peretz
Posted on May 31 • Originally published at ofriperetz.dev

Claude vs Gemini Across 4 Security Domains: A Dead Heat — and the Hardening 63% of AI Code Skips

#ai #security #node #eslint

AI Security Benchmark Series (7 Part Series)
1 I Let Claude Write 80 Functions. 65-75% Had Security Vulnerabilities.
2 The AI Hydra Problem: Fix One AI Bug, Get Two More
3 We Ranked 5 AI Models by Security. The Leaderboard Is Wrong.
4 Aggregate Benchmarks Lie. Here's What 700 AI Functions Look Like by Security Domain.
5 Claude Wrote a NestJS Service. TypeScript Was Happy. ESLint Found 6 Security Holes.
6 Same NestJS Prompt. Claude Got 6 Security Errors. Gemini Got 2. Here's What Both Got Wrong.
7 Claude vs Gemini Across 4 Security Domains: A Dead Heat — and the Hardening 63% of AI Code Skips

The interesting result isn't who won. It's that across four security domains, Claude and Gemini missed the same hardening steps, and if you've shipped AI-generated auth middleware this year, your code almost certainly has the same gaps, and your review didn't catch them either.

For the record, the scoreboard: one Gemini win, two ties, one split: a statistical dead heat. That's the last time the winner matters in this article.

Here's the number that should bother you more than any leaderboard: across 700 AI-generated functions scored by the rules I'm about to use, 63% shipped a vulnerability. So "which model writes more secure code?" is mostly the wrong question. I've run that leaderboard myself and argued it's the wrong frame. But people keep asking it, so I ran it properly, on the ESLint security plugins I wrote specifically to catch these bugs, each mapped to a CWE, to show you what actually matters.

The setup

Four domains, four of my plugins. For each, the same feature-only prompt (no "make it secure" hint; that's how people actually use these tools), generated once by Gemini 2.5 Flash via the Gemini CLI and once by Claude Sonnet 4.6 via the
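A figure like "63% shipped a vulnerability" is a share of generated functions with at least one finding. As a rough illustration of how such a number could be derived from lint output, here is a minimal sketch. It assumes each generated function lives in its own file and that results arrive in the shape of ESLint's JSON formatter (an array of objects with `filePath` and `messages`). The rule names and the rule-to-CWE table are hypothetical placeholders, not the plugins used in this benchmark, and `summarize` is an illustrative helper rather than part of any published tooling.

```typescript
// Only the fields of ESLint's JSON formatter output that this sketch reads.
interface LintMessage {
  ruleId: string | null; // null for parse errors and other non-rule messages
  severity: 1 | 2; // 1 = warning, 2 = error
}

interface LintResult {
  filePath: string;
  messages: LintMessage[];
}

// HYPOTHETICAL mapping: placeholder rule names, not the article's plugins.
const RULE_TO_CWE = new Map<string, string>([
  ["example/no-hardcoded-secret", "CWE-798"],
  ["example/no-sql-concat", "CWE-89"],
]);

interface Summary {
  total: number; // files (one generated function per file, by assumption)
  flagged: number; // files with at least one error-level finding
  flaggedShare: number; // flagged / total, or 0 when there are no files
  byCwe: Record<string, number>; // error-level findings grouped by CWE
}

function summarize(results: LintResult[]): Summary {
  const byCwe: Record<string, number> = {};
  let flagged = 0;

  for (const result of results) {
    // Count only error-level messages; warnings are ignored in this sketch.
    const errors = result.messages.filter((m) => m.severity === 2);
    if (errors.length > 0) {
      flagged += 1;
    }
    for (const m of errors) {
      const mapped = m.ruleId === null ? undefined : RULE_TO_CWE.get(m.ruleId);
      const cwe = mapped ?? "unmapped";
      byCwe[cwe] = (byCwe[cwe] ?? 0) + 1;
    }
  }

  const total = results.length;
  const flaggedShare = total === 0 ? 0 : flagged / total;
  return { total, flagged, flaggedShare, byCwe };
}
```

Running the same summary once per model and per domain would give the kind of side-by-side counts a scoreboard is built from. Whether warnings should count toward the share is a design choice this sketch makes arbitrarily.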
Source: Dev.to (dev.to)