Guard checks every call at the gateway, both ways. It can scan the user’s input before the model sees it, and the model’s response before it reaches your code.
A typical use: redact credit card numbers from incoming messages, or block responses that look like a leaked API key.
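The two-way check can be pictured as a wrapper around the model call. The following is a minimal local sketch, not Opper's implementation: `guarded_call`, `fake_model`, and `mask_digits` are hypothetical names made up for illustration, and in practice Guard runs at the gateway without any code on your side.

```python
import re
from typing import Callable

Check = Callable[[str], str]

def guarded_call(model: Callable[[str], str], prompt: str,
                 check_input: Check, check_output: Check) -> str:
    """Check the prompt, call the model, then check the response."""
    safe_prompt = check_input(prompt)   # runs before the model sees the input
    response = model(safe_prompt)
    return check_output(response)       # runs before the caller sees the output

# Hypothetical stand-ins so the sketch runs without a real model.
def fake_model(prompt: str) -> str:
    return f"You said: {prompt}"

def mask_digits(text: str) -> str:
    return re.sub(r"\d", "#", text)

result = guarded_call(fake_model, "pin 1234", mask_digits, str.upper)
```

Here the input check masks digits before the stand-in model runs, and the output check (simply `str.upper`) transforms the response before it is returned.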
Two kinds of rule
Open Controls → Guard in platform.opper.ai and add a rule.
LLM Guard
A custom prompt that asks a model to classify content. Use it for fuzzy judgments such as tone, intent, or content categories.
| Field | What it does |
|---|---|
| Prompt | Classifier instructions. Example: “Flag messages that include a plaintext password, API key, or access token.” |
| Model | Auto, Small, Medium, or Large. Auto picks the cheapest model that supports structured output. |
| Action | Flag, Block, or Redact (see Actions). |
| Where | Input only, output only, or both. |
Templates: PII, Toxicity, Secrets.
Regex Guard
Pattern matching. Use it for known formats like credit cards, emails, SSNs, and API keys.
| Field | What it does |
|---|---|
| Patterns | One or more regex patterns. Each has an optional name, the pattern itself, and standard flags: i (case-insensitive), m (multiline), s (dotall). The editor has a tester and an AI helper that drafts patterns from a description. |
| Replacement | What to substitute when redacting. Default: ***. |
| Action | Flag, Block, or Redact. |
| Where | Input only, output only, or both. |
Templates: Email, Phone, Credit Card, SSN, API Key.
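To get a feel for how named patterns, flags, and a replacement string combine, here is a small local sketch using Python's `re` module. The patterns are illustrative assumptions, not the built-in templates, and the `redact` helper is hypothetical; it only approximates what a Regex Guard with the Redact action might do.

```python
import re

# Hypothetical patterns for illustration only (not Opper's templates).
# Each entry: (name, pattern, flags).
PATTERNS = [
    ("credit_card", r"\b\d(?:[ -]?\d){12,15}\b", 0),
    ("api_key", r"\bsk-[a-z0-9]{16,}\b", re.IGNORECASE),  # "i" flag
]

def redact(text, patterns=PATTERNS, replacement="***"):
    """Replace every match of every pattern; return (new_text, matched_names)."""
    matched = []
    for name, pattern, flags in patterns:
        text, count = re.subn(pattern, replacement, text, flags=flags)
        if count:
            matched.append(name)
    return text, matched

clean, names = redact("Card 4111 1111 1111 1111, key SK-abcdef0123456789abcd")
```

The card pattern starts and ends on a digit so that trailing spaces are not swallowed by the replacement. The case-insensitive flag lets the key pattern match the uppercase `SK-` prefix.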
Actions
| Action | What happens |
|---|---|
| Flag | The match is logged on the trace. The call proceeds. |
| Block | The call fails. Input-side: before the model runs. Output-side: before the response returns. |
| Redact | The matched text is replaced with your replacement text (default: ***). |
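The three actions can be summarized in a few lines of code. This is a sketch under assumptions: `Rule`, `apply_rule`, and `GuardBlocked` are hypothetical names, the rule is regex-based for simplicity, and the `"redacted"` status string is made up here (the statuses the trace actually shows are described under "Where you see the result").

```python
import re
from dataclasses import dataclass

class GuardBlocked(Exception):
    """Raised when a rule whose action is "block" matches."""

@dataclass
class Rule:
    name: str
    pattern: str
    action: str              # "flag", "block", or "redact"
    replacement: str = "***"

def apply_rule(rule: Rule, text: str) -> tuple[str, str]:
    """Return (text_to_forward, status), or raise GuardBlocked."""
    if re.search(rule.pattern, text) is None:
        return text, "passed"
    if rule.action == "flag":
        return text, "flagged"       # logged only; the call proceeds unchanged
    if rule.action == "redact":
        # "redacted" is an assumed label, not a documented status.
        return re.sub(rule.pattern, rule.replacement, text), "redacted"
    if rule.action == "block":
        raise GuardBlocked(rule.name)  # the call fails
    raise ValueError(f"unknown action: {rule.action}")
```

Flag leaves the text untouched, Redact forwards a modified copy, and Block stops the call by raising.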
Where it applies
| Scope | Applies to |
|---|---|
| Organization | Every call in your org. |
| Project | Calls in one or more projects. |
You can layer rules. For example, run an org-wide secrets check alongside a stricter toxicity check on a customer-facing project. Each rule fires independently.
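Layering can be thought of as selecting every rule whose scope covers the current call. The sketch below is only a mental model, assuming a hypothetical `ScopedRule` type and made-up project names; it does not reflect how Opper stores or resolves rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ScopedRule:
    name: str
    scope: str                      # "organization" or "project"
    projects: tuple[str, ...] = ()  # used only when scope == "project"

def rules_for_call(rules, project):
    """Return every rule that applies to a call made in the given project."""
    return [
        r for r in rules
        if r.scope == "organization" or project in r.projects
    ]

rules = [
    ScopedRule("secrets", "organization"),
    ScopedRule("toxicity", "project", ("support-chat",)),
]
customer_facing = [r.name for r in rules_for_call(rules, "support-chat")]
internal = [r.name for r in rules_for_call(rules, "internal-tools")]
```

Both rules apply to the customer-facing project, while the other project only gets the org-wide secrets check.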
Where you see the result
A Guardrail event appears on the trace span with a Shield icon and status:
- Passed: the content was fine.
- Flagged: the content matched, but the rule's action is Flag, so the call proceeded.
- Blocked: the rule rejected the call.
Each event carries the rule name, a family label (LLM guard or Regex guard), and a scope badge. Redacted content appears already replaced in the span’s input or output.
In the playground, Guard runs when Project controls is on. Open the trace link to see the event.
Start with Flag while validating a new rule. Once you trust it, switch to Block or Redact. Combine an LLM Guard (fuzzy: toxicity, intent) with a Regex Guard (exact: card numbers, API keys).