Executive summary

Applications with language models create a security surface that is different from traditional web flows. The risk is not only in the user's message. It appears in the combination of system instructions, retrieved context, tools, memory, permissions, sensitive data and generated output.

The model should be treated as a non-deterministic component surrounded by deterministic controls. Guardrails are not just text filters; they are decision layers around input, context, tool calls and output.

Trust boundaries

A model flow mixes sources with different trust levels. System instructions are controlled by the application. User input is untrusted. Retrieved documents have variable trust and must be treated as data, not commands. The model output is derived and must be validated before sensitive effects occur.

  • Do not allow retrieved content to redefine policy.
  • Do not allow user text to override system rules.
  • Do not send data to the model before authorization.
  • Do not execute tool calls without deterministic policy checks.

Testing LLM flows

Testing an LLM application requires looking beyond the final answer. The flow must be validated from input to tool execution, including retrieved context, memory, authorization and structured output.

  • Insert adversarial instructions into retrieved documents and verify isolation.
  • Request data from another tenant and confirm the block happens before the model.
  • Propose a tool call outside the user's scope.
  • Generate invalid structured output and verify rejection.
  • Confirm memory scope and expiration.
  • Review logs for auditability without exposing sensitive data.

Tools need their own policy

When a model calls a tool, conversation becomes execution. Authorization must happen outside the model with deterministic rules. The model may propose an action; the application decides whether the user, tenant, resource and state allow that action.

This separation prevents well-written instructions from becoming operational privilege.

Memory and retention

Memory improves experience but increases risk. It must have scope, retention and sensitivity classification. Data from one tenant cannot influence another. Sensitive information should not be persisted unless necessary and governed.

Conclusion

Security for LLMs is architecture around the model. Input, context, tools, memory and output need boundaries, authorization, validation and audit. The model can interpret language, but the application must remain responsible for data and actions.