Every request through ACAI is protected by content safety, PII detection, and prompt injection prevention. All three are included in every tier at no extra cost.
Requests and responses are evaluated against content safety policies. Content that violates policies is blocked with a structured error response.
When content is blocked, you receive a 400 response with a content_policy_violation error code and details about which category was triggered.
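A client may want to tell a policy block apart from other 400 responses. The sketch below is one way to do that in Python. The body layout is not specified in this guide, so the nested `error` object and its `code` and `category` field names are assumptions for illustration only.

```python
from typing import Optional


def blocked_category(status: int, body: dict) -> Optional[str]:
    """Return the triggered category if the response is a content-policy block.

    Assumes a hypothetical body shaped like:
    {"error": {"code": "content_policy_violation", "category": "..."}}
    """
    if status != 400:
        return None
    error = body.get("error")
    if not isinstance(error, dict):
        return None
    if error.get("code") != "content_policy_violation":
        return None
    # Fall back to a generic label when no category detail is present.
    return error.get("category", "unknown")
```

Returning `None` for anything that is not a policy block lets the caller fall through to its normal error handling.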
ACAI scans requests for personally identifiable information and can redact or flag it before it reaches the model.
PII detection modes:
| Mode | Behavior |
|---|---|
| block | Reject the request with a 400 error |
| redact | Replace PII with placeholder tokens (e.g., [SSN_REDACTED]) |
| flag | Allow the request but log a warning in audit logs |
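To make the difference between the three modes concrete, here is a toy Python model of the table above. It is not ACAI's detector: the single SSN regex, the function name `apply_pii_mode`, and the shape of the returned dictionary are all assumptions made for this sketch.

```python
import re

# Toy detector: matches only US SSNs written as 123-45-6789.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

MODES = {"block", "redact", "flag"}


def apply_pii_mode(text: str, mode: str) -> dict:
    """Mimic the three PII modes on a single string (illustrative only)."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not SSN.search(text):
        # No PII found: the request passes through unchanged in every mode.
        return {"status": 200, "text": text, "warning": False}
    if mode == "block":
        return {"status": 400, "text": None, "warning": False}
    if mode == "redact":
        return {"status": 200, "text": SSN.sub("[SSN_REDACTED]", text), "warning": False}
    # mode == "flag": forward the text as-is but record a warning.
    return {"status": 200, "text": text, "warning": True}
```

For example, `apply_pii_mode("SSN 123-45-6789", "redact")` forwards `"SSN [SSN_REDACTED]"`, while the same input in `flag` mode is forwarded untouched with `warning` set.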
Multi-layer injection detection protects against adversarial inputs that attempt to override system instructions.
Injection attempts are blocked with a 400 response and logged to the audit trail.
When PII detection mode is set to redact, sensitive data is replaced with typed placeholders before reaching the model. Here's a before/after example:
Original request (what your app sends)
{
  "model": "gpt-4o-mini",
  "messages": [{
    "role": "user",
    "content": "Patient Jane Doe (SSN 123-45-6789) called from 555-867-5309. Her email is jane.doe@example.com and she paid with card 4111-1111-1111-1111. Please summarize her visit notes."
  }]
}

After redaction (what the model sees)
{
  "model": "gpt-4o-mini",
  "messages": [{
    "role": "user",
    "content": "Patient [NAME_REDACTED] (SSN [SSN_REDACTED]) called from [PHONE_REDACTED]. Her email is [EMAIL_REDACTED] and she paid with card [CREDIT_CARD_REDACTED]. Please summarize her visit notes."
  }]
}

Redaction metadata is included in the X-DirectAI-Redactions response header and logged to the audit trail. Each redaction records the entity type, character offset, and a one-way hash of the original value for correlation.
| Placeholder | Detected Entity |
|---|---|
| [NAME_REDACTED] | Person name |
| [SSN_REDACTED] | Social Security Number |
| [EMAIL_REDACTED] | Email address |
| [PHONE_REDACTED] | Phone number |
| [CREDIT_CARD_REDACTED] | Credit / debit card number |
| [IP_REDACTED] | IP address |
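The following Python sketch illustrates how typed placeholders and per-redaction records (entity type, character offset, one-way hash) could fit together. It is a simplified model, not ACAI's implementation: the regexes cover only three of the entity types in the table above, the record field names are hypothetical, offsets are assumed to refer to the original text, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import re

# Simplified patterns for three of the entity types in the table above.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "IP": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def redact(text: str):
    """Return (redacted_text, records) for the toy patterns above."""
    matches = []
    for entity, pattern in PATTERNS.items():
        for m in pattern.finditer(text):
            matches.append((m.start(), m.end(), entity))
    matches.sort()  # process matches left to right

    out, records, cursor = [], [], 0
    for start, end, entity in matches:
        if start < cursor:  # skip a match that overlaps an earlier one
            continue
        out.append(text[cursor:start])
        out.append(f"[{entity}_REDACTED]")
        records.append({
            "entity": entity,
            "offset": start,  # assumed: offset into the original text
            "hash": hashlib.sha256(text[start:end].encode("utf-8")).hexdigest(),
        })
        cursor = end
    out.append(text[cursor:])
    return "".join(out), records
```

Because equal inputs produce equal hashes, two records can be correlated without storing the original value. Note that an unsalted hash of a low-entropy value such as an SSN can be brute-forced, so a keyed hash (for example HMAC) is the safer choice in practice.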
Define custom guardrail rules via the dashboard or the Rules API. Rules can match on request content using regex patterns, keyword lists, or string matching.
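Before registering a regex rule, it can help to try the pattern locally. The Python sketch below does that with the standard `re` module. Whether ACAI's matcher follows the same regex dialect as Python is not stated in this guide, so treat the result as an approximation; the `rule` dictionary and the `rule_matches` helper are hypothetical.

```python
import re

# Hypothetical rule, using only field names that appear in this guide.
rule = {
    "name": "block-competitor-names",
    "pattern": r"\b(CompetitorA|CompetitorB)\b",
    "action": "block",
}


def rule_matches(rule: dict, content: str) -> bool:
    """True if the rule's regex pattern is found anywhere in the content."""
    return re.search(rule["pattern"], content) is not None
```

With this pattern, `"How does this compare to CompetitorA?"` matches, while `"CompetitorAlpha"` does not because of the `\b` word boundary, and `"competitora"` does not because the match is case-sensitive. When the pattern is written inside a JSON body, each backslash is escaped, so `\b` appears as `\\b`.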
POST /api/v1/guardrails/rules
{
  "name": "block-competitor-names",
  "description": "Block requests mentioning competitor products",
  "pattern": "\\b(CompetitorA|CompetitorB)\\b",
  "action": "block",
  "scope": "input"
}

Configure guardrail behavior per-user in the Dashboard → Guardrails → Configuration. Settings include:
All guardrail violations are tracked and viewable in Dashboard → Guardrails → Violations. Each violation records: