Guardrails + Audit Included in Every Plan

Compliance-First AI Inference

Content safety, PII detection, and audit logging come standard. Start free, scale with managed models. No add-on subscriptions, no surprises.

Free

Free$5 credit included

Try the API with $5 of free credit on our shared cluster. No credit card required. Or self-host with our open-source stack.

  • Guardrails + PII detection + audit logging
  • OpenAI-compatible API endpoint
  • LLMs, embeddings, transcription
  • $5 one-time API credit (Pro rates)
  • 20 RPM / 40K TPM rate limits
  • 7-day audit log retention
  • Community support
  • Self-host option (Apache 2.0)
Most Popular

Pro

$99/month + per-token usage

Managed model catalog with full compliance stack. Per-token billing — pay only for what you use.

  • Everything in Free
  • Prompt injection prevention
  • HIPAA / SOC 2 compliance exports
  • Semantic cache + batch API
  • Full managed model catalog
  • 300 RPM / 500K TPM rate limits
  • 30-day audit log retention
  • Per-token usage billing via Stripe
  • Dashboard + API key management
  • Email support (48hr SLA)
  • 99.5% uptime SLA

Business

$499/month + per-token usage

BYOB (Bring Your Own Backend) support, policy templates, and BAA available. Full compliance reporting for auditors.

  • Everything in Pro
  • BYOB — bring your own provider keys
  • Network policy enforcement
  • RAG, routing, and A/B testing
  • Pre-built compliance policy templates
  • 1-year audit log retention + exports
  • BAA available (HIPAA)
  • 1,000 RPM / 5M TPM rate limits
  • Email support (24hr SLA)
  • 99.9% uptime SLA

Enterprise

Customcommitted monthly spend

Customer-deployed inference project, private endpoints, custom models. Full compliance documentation and support.

  • Everything in Business
  • Customer-deployed inference project
  • Private endpoints in your VNet
  • Custom models + fine-tuning support
  • Unlimited audit retention + legal holds
  • Custom guardrail rules + policies
  • HIPAA / SOC 2 compliance documentation
  • Dedicated solutions engineer
  • Slack + phone support (1hr SLA)
  • 99.99% uptime SLA

Compute Pricing

All tiers pay per token. Enterprise gets committed capacity at a flat monthly rate.

Per-Model Rates

Free + Pro + Business
ModelInputOutputTiers
GPT-4o$3.50 / 1M tokens$14.00 / 1M tokensPro+
GPT-4o mini$0.20 / 1M tokens$0.80 / 1M tokensFree+
o3-mini$1.50 / 1M tokens$6.00 / 1M tokensBusiness+
Phi-4$0.14 / 1M tokens$0.28 / 1M tokensFree+
Llama 3.1 8B$0.06 / 1M tokens$0.12 / 1M tokensFree+
Llama 3.3 70B$0.45 / 1M tokens$1.35 / 1M tokensPro+
Mistral Large$2.60 / 1M tokens$7.80 / 1M tokensBusiness+
text-embedding-3-large$0.17 / 1M tokensAll
Cohere Embed v3$0.13 / 1M tokensPro+
Whisper large-v3$0.13/minPro+

Free tier $5 credit burns at these rates. Volume discounts available on Business+.

Included in Every Plan

Compliance Built In, Not Bolted On

Every DirectAI deployment includes production-grade guardrails and audit logging at no extra cost. Because compliance shouldn't be a premium feature.

Content Safety

Content safety scoring for hate, violence, self-harm, and sexual content on every request.

Included free

PII Detection & Redaction

10 built-in patterns (SSN, credit cards, PHI, MRN, DOB) with detect, redact-logs, or redact-all modes.

Included free

Prompt Injection Prevention

4-layer detection pipeline: encoding analysis, pattern matching, heuristic scoring, and Prompt Shield API.

Included free

Audit Logging & Compliance

Every request logged with full audit trail. HIPAA and SOC 2 export formats. Legal holds and retention policies.

Included free

Feature Comparison

Everything scales with your tier. No add-on subscriptions.

FeatureFreeProBusinessEnterprise
Content safety + PII detection
Audit logging7 days30 days1 yearUnlimited
Prompt injection prevention
HIPAA / SOC 2 exports
Semantic cache
Batch API
RAG (knowledge bases)5 GBUnlimited
Routing + A/B testing
Custom guardrail rules
Custom models + fine-tuning
Legal holds
BAA (HIPAA)
Compute isolationSharedSharedBYOBVNet
SLA99.5%99.9%99.99%
SupportCommunityEmail 48hrEmail 24hrSlack 1hr

Frequently Asked Questions

What compliance features are included?

Every plan — including Free — includes content safety filtering, PII detection and redaction, prompt injection prevention, and full audit logging. HIPAA/SOC 2 compliance exports are available on Pro and above. BAA execution is available on Business and above.

How does per-token billing work?

Free and Pro tiers bill per token processed. Rates vary by model — premium models like GPT-4o cost more per token than lightweight models like Llama 3.1 8B or Phi-4. Your Free $5 credit burns at the same per-model rates. See the pricing table above for exact rates.

What's the difference between Pro and Business?

Pro uses our managed model catalog — great for development and moderate production workloads. Business adds BYOB (Bring Your Own Backend) support so you can route through your own provider keys while keeping the full compliance layer. Business also unlocks BAA execution for HIPAA compliance.

Can I start with Pro and upgrade later?

Yes. Most customers start on Pro to validate their use case, then upgrade to Business when they need dedicated compute or BAA. The API is identical — same endpoints, same SDKs, zero migration effort.

What about self-hosting?

Our entire stack is open source (Apache 2.0). You can deploy it yourself on your own infrastructure. Self-hosting is always free — community support is available.

What models can I run?

All tiers access a managed model catalog (GPT-4o, Llama 3.1, Phi-4, Mistral, and more). Business adds BYOB — bring your own OpenAI, Anthropic, or any provider keys. Enterprise adds custom fine-tuned models.

What regions do you support?

Free and Pro run in US East and US South Central. Business can deploy to additional regions. Enterprise supports sovereign and government cloud regions.

Not sure which tier is right? Talk to an engineer or email us at support@agilecloud.ai