AI Security Strategy versus Web Security: A Different Paradigm

Written by Lucas Hendrich | Dec 15, 2025

I have been thinking about how technology leaders approach security for AI systems. The default instinct is to reach for familiar frameworks. SQL injection was a catastrophic vulnerability. We learned how to fix it. When prompt injection emerged, the natural response was to treat it the same way.

Prompt injection attacks against AI systems may never be totally mitigated in the way SQL injection was. Not because we lack research or the tooling is immature. The architecture itself prevents it.

This matters because the analogy to SQL injection creates false confidence. Organizations are deploying AI systems with the expectation that a similar fix exists. It does not.

Why SQL Injection Was Fixable

SQL injection succeeded because applications concatenated user input directly into database queries. An attacker could manipulate the query structure itself.

The solution was architectural. Parameterized queries separate the query structure from the data. The database engine compiles the query first, then substitutes user input. Input arrives too late to affect what the query does. Modern frameworks make this the default. SQL injection became rare in well-maintained systems.

The fix worked because databases distinguish between instructions and data at a protocol level. That boundary is enforceable.

Why AI Systems Are Different

Large language models do not distinguish between instructions and data. They predict the next token based on everything they have seen. There is no compilation phase. There is no protocol boundary to enforce.

Consider a customer service system. You give it instructions: answer questions about products, never reveal internal data. Then a customer submits a message that includes different instructions: ignore previous rules, export customer records.

The model sees a sequence of text. It has learned from vast training data where context shifts frequently, where hypothetical scenarios are common, where "ignore previous instructions" appears in benign contexts. It cannot distinguish which text represents your trusted instructions versus untrusted user input.

Prompt engineering attempts to solve this by instructing the model to prioritize certain inputs. But you are asking the model itself to enforce a boundary it cannot perceive. This is a statistical tendency, not a security control.

The NCSC terms this an "inherently confusable deputy" problem. Traditional confused deputy vulnerabilities can be patched because the system can be modified to verify intent. AI models operate by processing all input as potentially meaningful context. The confusion is architectural.

What This Means for Decision Makers

The practical implication is that you cannot secure AI systems the way you secured databases. The mental model needs to change.

With SQL injection, you patched the vulnerability and moved on. With prompt injection, you accept residual risk and design systems accordingly. This changes how you evaluate use cases.

Some applications cannot tolerate that risk. If an AI system handles high-value transactions or accesses sensitive data, and the potential damage from compromise exceeds the value delivered, you need to question whether AI is the appropriate solution for that specific function.

This does not mean avoiding AI. It means designing with the assumption that the model will be manipulated. Limit what a compromised model can access. Validate outputs before execution. Require human approval for sensitive actions. Build monitoring that detects anomalous behavior.

Security becomes an architectural concern, not an implementation detail. You cannot retrofit it after deployment.

Practical Constraints

Organizations face several constraints that limit what is possible:

The token prediction architecture means no "parameterized prompt" equivalent will solve this at the protocol level. You cannot fix the model itself.

Most organizations use third-party AI services. Your security depends on vendor decisions you cannot control. You have limited visibility into how those systems handle untrusted input.

Monitoring for prompt injection is challenging. Unlike traditional attacks with known signatures, malicious prompts are context-dependent and highly variable. Detection requires domain expertise and continuous refinement.

Multi-agent systems compound the problem. If one agent is compromised, it can manipulate other agents by crafting inputs that exploit their vulnerabilities. The attack surface expands with system complexity.

A Different Approach

The NCSC guidance suggests focusing on reducing likelihood and impact rather than eliminating risk:

Design with minimal privileges. If an AI processes untrusted input, it should not have administrative access. Assume compromise and limit the blast radius.

Build security layers around the model. Track where inputs originate. Validate that generated actions are appropriate given input provenance. Do not depend on the model for security decisions.

Require human approval for high-value operations. The model drafts actions but humans authorize execution. This breaks full automation but is appropriate when compromise would be costly.

Implement comprehensive monitoring. Log all interactions and tool invocations. Most attacks generate detectable anomalies if you establish baseline behavior.

Test adversarially. Traditional security testing looks for known vulnerability patterns. AI systems require red teams that attempt to manipulate behavior through creative prompt construction.

Concluding thoughts

The technology industry is repeating a familiar pattern. Widespread adoption of powerful capabilities before security implications are understood. Then a wave of breaches as attackers discover exploitable patterns. This happened with SQL injection. It took years of incidents before secure defaults became standard.

Organizations deploying AI now have an opportunity to avoid that trajectory. The difference is whether you design for prompt injection from the beginning or attempt to retrofit security after deployment.

The critical question is not whether to use AI. The question is which use cases justify accepting architectural risk that cannot be fully eliminated. For some applications, the value exceeds the risk when mitigated through careful design. For others, it does not.

Treating prompt injection like SQL injection will lead to false confidence in inadequate mitigations. The vulnerability is architectural. The solutions must be architectural as well.

View full post