ACAI

Continuous compliance for AI. Every call scanned, classified, audit-logged, and evidence-ready.

Service-Disabled Veteran-Owned Small Business
© 2026 Agile Cloud & AI LLC. All rights reserved.

Guardrails

Every request through ACAI is protected by content safety, PII detection, and prompt injection prevention — included in every tier at no extra cost.

Content Safety

Requests and responses are evaluated against content safety policies. Content that violates policies is blocked with a structured error response.

  • Hate speech and harassment detection
  • Violence and self-harm content filtering
  • Sexual content filtering
  • Configurable severity thresholds per category

When content is blocked, you receive a 400 response with a content_policy_violation error code and details about which category was triggered.
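A minimal Python sketch of handling such a block client-side. Only the 400 status and the content_policy_violation error code are documented above; the message and category fields in the body below are assumptions for illustration:

```python
import json

# Hypothetical error body; only the "content_policy_violation" code is
# documented, the other fields are illustrative assumptions.
error_body = json.loads("""
{
  "error": {
    "code": "content_policy_violation",
    "message": "Request blocked by content safety policy",
    "category": "hate_speech"
  }
}
""")

def guardrail_block_category(status_code, body):
    """Return the triggered category if the response is a guardrail block, else None."""
    error = body.get("error", {})
    if status_code == 400 and error.get("code") == "content_policy_violation":
        return error.get("category")
    return None
```

A caller can branch on the returned category to decide whether to retry with sanitized input or surface the error to the user.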

PII Detection & Redaction

ACAI scans requests for personally identifiable information and can redact or flag it before it reaches the model.

  • Social Security Numbers (SSN)
  • Email addresses
  • Phone numbers
  • Credit card numbers
  • IP addresses
  • Custom patterns via guardrail rules

PII detection modes:

| Mode | Behavior |
|------|----------|
| block | Reject the request with a 400 error |
| redact | Replace PII with placeholder tokens (e.g., [SSN_REDACTED]) |
| flag | Allow the request but log a warning in audit logs |
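The three modes can be illustrated with a small Python sketch. The regex patterns here are simplified stand-ins for this example only; ACAI's actual detectors cover far more formats than plain regex:

```python
import re

# Simplified stand-in patterns; real detectors handle many more formats.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def apply_pii_mode(text: str, mode: str):
    """Apply block / redact / flag semantics to a single string."""
    findings = [(name, m.group())
                for name, pat in PII_PATTERNS.items()
                for m in pat.finditer(text)]
    if mode == "block" and findings:
        raise ValueError("400: request contains PII")
    if mode == "redact":
        for name, pat in PII_PATTERNS.items():
            text = pat.sub(f"[{name}_REDACTED]", text)
    # In flag mode the text passes through unchanged; findings go to the audit log.
    return text, findings
```

In redact mode each match is replaced with its typed placeholder, while the findings list carries what was detected and where, mirroring what ends up in the audit trail.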

Prompt Injection Prevention

Multi-layer injection detection protects against adversarial inputs that attempt to override system instructions.

  • Heuristic detection — pattern matching for common injection techniques
  • Token analysis — detects encoding-based evasion (base64, unicode, homoglyphs)
  • Regex patterns — matches known injection signatures
  • Prompt Shield — optional integration with cloud-native content safety APIs

Injection attempts are blocked with a 400 response and logged to the audit trail.
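As a rough illustration of the heuristic and token-analysis layers, here is a toy Python detector. The phrase patterns and the base64 rescan are deliberately simplistic assumptions for this sketch, not ACAI's actual rules:

```python
import base64
import re

# Toy heuristics; the production layers are far more thorough.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"system prompt", re.I),
]

def looks_like_injection(text: str) -> bool:
    # Heuristic layer: pattern matching on the raw text.
    if any(p.search(text) for p in INJECTION_PATTERNS):
        return True
    # Token-analysis layer (simplified): decode base64-looking runs and rescan.
    for blob in re.findall(r"[A-Za-z0-9+/=]{16,}", text):
        try:
            decoded = base64.b64decode(blob, validate=True).decode("utf-8", "ignore")
        except Exception:
            continue
        if any(p.search(decoded) for p in INJECTION_PATTERNS):
            return True
    return False
```

The second pass shows why encoding-based evasion is checked separately: a payload that is harmless as raw text can decode into an instruction-override attempt.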

What Does Redaction Look Like?

When PII detection mode is set to redact, sensitive data is replaced with typed placeholders before reaching the model. Here's a before/after example:

Original request (what your app sends)

{
  "model": "gpt-4o-mini",
  "messages": [{
    "role": "user",
    "content": "Patient Jane Doe (SSN 123-45-6789) called from
      555-867-5309. Her email is jane.doe@example.com and she
      paid with card 4111-1111-1111-1111. Please summarize
      her visit notes."
  }]
}

After redaction (what the model sees)

{
  "model": "gpt-4o-mini",
  "messages": [{
    "role": "user",
    "content": "Patient [NAME_REDACTED] (SSN [SSN_REDACTED])
      called from [PHONE_REDACTED]. Her email is
      [EMAIL_REDACTED] and she paid with card
      [CREDIT_CARD_REDACTED]. Please summarize her visit
      notes."
  }]
}

Redaction metadata is included in the X-DirectAI-Redactions response header and logged to the audit trail. Each redaction records the entity type, character offset, and a one-way hash of the original value for correlation.

| Placeholder | Detected Entity |
|-------------|-----------------|
| [NAME_REDACTED] | Person name |
| [SSN_REDACTED] | Social Security Number |
| [EMAIL_REDACTED] | Email address |
| [PHONE_REDACTED] | Phone number |
| [CREDIT_CARD_REDACTED] | Credit / debit card number |
| [IP_REDACTED] | IP address |
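The one-way hash mentioned above makes correlation possible without storing the raw value. A Python sketch of such a redaction record; the exact wire format of the X-DirectAI-Redactions header is not shown in these docs, so this dict shape is an assumption for illustration:

```python
import hashlib

def redaction_record(entity_type: str, offset: int, original_value: str) -> dict:
    """Build a redaction record: entity type, character offset, one-way hash."""
    return {
        "entity": entity_type,
        "offset": offset,
        # SHA-256 is an assumed choice of one-way hash for this sketch.
        "hash": hashlib.sha256(original_value.encode()).hexdigest(),
    }
```

Because the same value always hashes to the same digest, two requests that leak the same SSN can be correlated in the audit trail without the SSN itself ever being stored.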

Custom Rules

Define custom guardrail rules via the dashboard or the Rules API. Rules can match on request content using regex patterns, keyword lists, or string matching.

POST /api/v1/guardrails/rules
{
  "name": "block-competitor-names",
  "description": "Block requests mentioning competitor products",
  "pattern": "\\b(CompetitorA|CompetitorB)\\b",
  "action": "block",
  "scope": "input"
}
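To make the rule semantics concrete, here is a Python sketch of how a regex rule like the one above could be evaluated against request input. Only the rule fields come from the example; the evaluation logic itself is an assumption:

```python
import re

# The rule payload from the example above.
rule = {
    "name": "block-competitor-names",
    "pattern": r"\b(CompetitorA|CompetitorB)\b",
    "action": "block",
    "scope": "input",
}

def evaluate_rule(rule: dict, request_text: str):
    """Return the rule's action if its pattern matches the text, else None."""
    if re.search(rule["pattern"], request_text):
        return rule["action"]
    return None
```

A rule with "action": "block" would translate a match into the same 400 error path as the built-in guardrails.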

Configuration

Configure guardrail behavior per user under Dashboard → Guardrails → Configuration. Settings include:

  • Enable/disable individual guardrail types
  • Set severity thresholds for content safety categories
  • Choose PII detection mode (block, redact, flag)
  • Configure injection detection sensitivity

Violation Tracking

All guardrail violations are tracked and viewable in Dashboard → Guardrails → Violations. Each violation records:

  • Violation type (content safety, PII, injection)
  • Request ID for correlation
  • Timestamp and API key
  • Category and severity