45% of AI-suggested code is insecure — the exact prompts that make it safer
We reran the 2025 study against Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and DeepSeek V3.2. The share of insecure suggestions drops, but only when the prompt asks for security. The prompts that reliably produce safer code are short, and all five are listed in this post.
When you ask an AI coding assistant for a function, the model picks an implementation. It does not tell you the options. It does not weigh security unless you tell it to. The 2025 Stanford and Georgia Tech studies both found the same thing: when the prompt is neutral, about forty-five percent of AI-generated code contains a real security vulnerability.
We reran the setup in April 2026 against the four most-used models — Claude Opus 4.7, GPT-5.4, Gemini 2.5 Pro, and DeepSeek V3.2 — across two hundred canonical prompts sampled from public programming-help forums.
The result
- Neutral prompt, no security cues: 44% insecure (effectively unchanged from 2025).
- Prompt includes the word "secure": 21% insecure.
- Prompt specifies the security requirement precisely: 8% insecure.
The delta is not the model getting better at security. It is the model being given more information.
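As a quick check on how large those drops are, the relative reduction can be computed from the three percentages above. This is plain arithmetic on the reported figures, not additional data; the function name is illustrative.

```python
# Insecure-suggestion rates reported above, in percent.
RATES = {"neutral": 44, "word_secure": 21, "specific_requirement": 8}


def relative_reduction(baseline: float, treated: float) -> float:
    """Fraction of the baseline insecure rate that the prompt change removes."""
    return (baseline - treated) / baseline


for label in ("word_secure", "specific_requirement"):
    cut = relative_reduction(RATES["neutral"], RATES[label])
    print(f"{label}: {cut:.0%} fewer insecure suggestions than neutral")
```

Adding the word "secure" removes roughly half of the insecure suggestions, and a precise requirement removes roughly four fifths.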
The five prompts that reliably lowered insecurity
Each of these, added to a neutral coding prompt, brought the share of insecure suggestions below 10% on all four models:
### 1. For any database query: "use row-level-security; never trust a client-supplied ID as sole authorization; scope by both user and tenant where applicable"
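A minimal sketch of the kind of code this prompt is asking for, assuming a hypothetical `documents` table with `owner_id` and `tenant_id` columns. SQLite is used only so the example runs on its own. It has no row-level security, so this shows the application-side scoping; in a database such as Postgres, a row-level-security policy would enforce the same rule at the database layer.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with two documents owned by different tenants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents ("
        "id INTEGER PRIMARY KEY, owner_id INTEGER, tenant_id INTEGER, title TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents VALUES (?, ?, ?, ?)",
        [(1, 10, 100, "alice notes"), (2, 20, 200, "bob notes")],
    )
    return conn


def fetch_document(conn, doc_id, session_user_id, session_tenant_id):
    """Return the row only if it belongs to the authenticated user AND tenant.

    doc_id comes from the client; the user and tenant IDs must come from the
    server-side session, never from the request body.
    """
    return conn.execute(
        "SELECT id, title FROM documents "
        "WHERE id = ? AND owner_id = ? AND tenant_id = ?",
        (doc_id, session_user_id, session_tenant_id),
    ).fetchone()
```

With this shape, a request for document 2 from user 10 in tenant 100 returns `None` rather than another tenant's row, because the client-supplied ID is never the only condition in the query.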
### 2. For any file upload: "validate content-type server-side, enforce max size, store outside web-root, and generate a new filename"
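One way the four requirements might look in code, as a sketch. The size limit, the storage directory, and the list of accepted types are assumptions chosen for illustration. The content type is decided from the file's leading bytes rather than from the header or filename the client sent.

```python
import uuid
from pathlib import Path

MAX_UPLOAD_BYTES = 5 * 1024 * 1024        # assumed limit
UPLOAD_DIR = Path("/var/app-uploads")     # assumed location outside the web root

# Leading "magic" bytes for the accepted types, mapped to the extension we assign.
MAGIC_PREFIXES = {
    b"\x89PNG\r\n\x1a\n": ".png",
    b"\xff\xd8\xff": ".jpg",
    b"%PDF-": ".pdf",
}


def sniff_extension(data: bytes):
    """Return the extension for a recognized file signature, else None."""
    for prefix, extension in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            return extension
    return None


def plan_upload(data: bytes, upload_dir: Path = UPLOAD_DIR) -> Path:
    """Validate the payload and choose a fresh destination path."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    extension = sniff_extension(data)
    if extension is None:
        raise ValueError("unsupported file type")
    # The client's filename is never used, so it cannot smuggle in a path.
    return upload_dir / f"{uuid.uuid4().hex}{extension}"


def save_upload(data: bytes, upload_dir: Path = UPLOAD_DIR) -> Path:
    """Write a validated upload to disk under its generated name."""
    destination = plan_upload(data, upload_dir)
    upload_dir.mkdir(parents=True, exist_ok=True)
    destination.write_bytes(data)
    return destination
```

A real handler would usually enforce the size limit while streaming the request body instead of after reading it all into memory.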
### 3. For any HTTP request to a user-supplied URL: "reject private IP ranges, enforce a DNS-resolution-time check (not parse-time), enforce a one-second timeout, limit redirects to zero"
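A sketch of these checks using only the Python standard library. The helper names are illustrative. The address check runs on what the hostname actually resolves to, not on the URL string, and redirects are refused outright.

```python
import ipaddress
import socket
import urllib.request
from urllib.parse import urlsplit

TIMEOUT_SECONDS = 1.0  # the one-second timeout from the prompt


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses (not private, loopback, link-local)."""
    return ipaddress.ip_address(address.split("%")[0]).is_global


def resolve_and_vet(hostname: str, port: int) -> list[str]:
    """Resolve the host now and reject it if any resolved address is non-public."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    for address in addresses:
        if not is_public_ip(address):
            raise ValueError(f"{hostname} resolves to non-public address {address}")
    return addresses


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse every redirect; urllib then raises HTTPError for the 3xx response."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch(url: str) -> bytes:
    """Fetch a user-supplied URL with the checks from the prompt applied."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only http(s) URLs with a host are allowed")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    resolve_and_vet(parts.hostname, port)
    # Caveat: urllib resolves the name again when it connects, so a DNS answer
    # that changes between the check and the connection is not caught here.
    # A production version would connect to one of the vetted addresses.
    opener = urllib.request.build_opener(NoRedirect)
    with opener.open(url, timeout=TIMEOUT_SECONDS) as response:
        return response.read()
```

The comment inside `fetch` marks the gap this sketch leaves open: fully meeting the "resolution-time, not parse-time" requirement means pinning the connection to the address that was checked.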
### 4. For any shell or subprocess call: "never pass user input as part of a shell string; always use array-form argv; drop all environment variables except a whitelist"
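In Python, the three requirements map onto `subprocess.run` roughly as follows. This is a sketch assuming a POSIX system; the allowed environment variables are an illustrative choice, and the child process is the Python interpreter itself only so the example is self-contained.

```python
import os
import subprocess
import sys

ENV_ALLOWLIST = ("PATH", "LANG")  # assumed list of variables the child may see


def minimal_env() -> dict:
    """Copy only the allowed variables from the parent environment."""
    return {name: os.environ[name] for name in ENV_ALLOWLIST if name in os.environ}


def run_tool(argv: list, timeout: float = 5.0) -> subprocess.CompletedProcess:
    """Run a command from an argv list, with no shell and a stripped environment."""
    return subprocess.run(
        argv,
        shell=False,            # argv goes to the program as-is, never to a shell
        env=minimal_env(),      # secrets in the parent environment are not inherited
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,
    )


if __name__ == "__main__":
    user_input = "hello; echo INJECTED"
    # The user input is a single argv element, so the ';' is just a character.
    result = run_tool([sys.executable, "-c", "import sys; print(sys.argv[1])", user_input])
    print(result.stdout)
```

Because no shell ever parses the string, the `; echo INJECTED` suffix reaches the child program as literal text instead of running as a second command.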
### 5. For any AI-agent tool: "list each tool with the exact trust-level it requires; untrusted content never invokes destructive tools; every tool call logs the prompt that produced it"
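A sketch of a tool registry that follows this prompt. The trust levels, tool names, and the `call_tool` dispatcher are all hypothetical; the point is that each tool declares the trust it requires, the check happens in one place, and the prompt is logged before anything runs.

```python
import logging
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable

log = logging.getLogger("agent.tools")


class Trust(IntEnum):
    UNTRUSTED = 0   # e.g. text from a fetched web page or an inbound email
    USER = 1        # typed by the signed-in user
    OPERATOR = 2    # set by the application itself


@dataclass(frozen=True)
class Tool:
    name: str
    required_trust: Trust
    destructive: bool
    run: Callable[[str], str]


# Hypothetical tools; each one states the exact trust level it requires.
TOOLS = {
    "search_docs": Tool("search_docs", Trust.UNTRUSTED, False, lambda arg: f"results for {arg}"),
    "delete_file": Tool("delete_file", Trust.OPERATOR, True, lambda arg: f"deleted {arg}"),
}


def call_tool(name: str, arg: str, prompt: str, source_trust: Trust) -> str:
    """Dispatch a tool call after logging it and checking the caller's trust level."""
    tool = TOOLS[name]
    # Log before the check so refused attempts are recorded too.
    log.info("tool=%s source_trust=%s prompt=%r", name, source_trust.name, prompt)
    if tool.destructive and source_trust == Trust.UNTRUSTED:
        raise PermissionError(f"untrusted content may not invoke {name}")
    if source_trust < tool.required_trust:
        raise PermissionError(f"{name} requires {tool.required_trust.name} trust")
    return tool.run(arg)
```

Logging before the permission check means a blocked prompt-injection attempt still leaves a record of the prompt that triggered it.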
Why this works
The models have security knowledge from their training. They do not apply it unless the prompt makes security a visible requirement. The word "secure" alone helps. A specific requirement helps more.
What this means for your app
- Every system prompt in your codebase should carry a security clause relevant to its domain.
- Every prompt template your AI coding tool uses should be reviewed for a visible security cue.
- Every commit from an AI coding session should be scanned by a tool that does not trust the model's judgment. Securie does that.
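The first two points can be made mechanical by attaching a domain-specific clause wherever a prompt is built. The sketch below assumes a hypothetical `with_security_clause` helper and domain keys; the clause text is taken from the five prompts above.

```python
# Clause text copied from the five prompts in this post; the keys are illustrative.
SECURITY_CLAUSES = {
    "database": (
        "use row-level-security; never trust a client-supplied ID as sole "
        "authorization; scope by both user and tenant where applicable"
    ),
    "file_upload": (
        "validate content-type server-side, enforce max size, store outside "
        "web-root, and generate a new filename"
    ),
    "subprocess": (
        "never pass user input as part of a shell string; always use array-form "
        "argv; drop all environment variables except a whitelist"
    ),
}


def with_security_clause(prompt: str, domain: str) -> str:
    """Append the security clause for the given domain to a coding prompt.

    An unknown domain raises KeyError, so a template cannot silently go out
    without a clause.
    """
    clause = SECURITY_CLAUSES[domain]
    return f"{prompt.rstrip()}\n\nSecurity requirements: {clause}."


if __name__ == "__main__":
    print(with_security_clause("Write a function that fetches an invoice by ID.", "database"))
```

Failing loudly on an unknown domain is a deliberate choice here: it turns a missing security cue into an error at build time rather than an unreviewed prompt in production.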