ACAI
Continuous compliance for AI. Every call scanned, classified, audit-logged, and evidence-ready.

Service-Disabled Veteran-Owned Small Business
© 2026 Agile Cloud & AI LLC. All rights reserved.

Text to Speech

Generate spoken audio from text using the tts and tts-hd models. The endpoint is OpenAI-compatible.

Endpoint

POST https://api.agilecloud.ai/v1/audio/speech

Request Parameters

Send as application/json:

Parameter        Type    Required  Description
model            string  No        tts or tts-hd. Default: tts
input            string  Yes       Text to synthesize (max 4096 characters)
voice            string  No        alloy, echo, fable, onyx, nova, shimmer. Default: alloy
response_format  string  No        mp3, opus, aac, flac, wav, pcm. Default: mp3
speed            number  No        Speed multiplier (0.25–4.0). Default: 1.0
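
The constraints in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the ACAI SDK: build_speech_payload is a hypothetical helper name, and it assumes the service counts input length in characters the same way Python's len() does.

```python
import json

# Allowed values copied from the parameter table above.
MODELS = {"tts", "tts-hd"}
VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_INPUT_CHARS = 4096
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_speech_payload(text, model="tts", voice="alloy",
                         response_format="mp3", speed=1.0):
    """Hypothetical helper: validate arguments and return the JSON body as a dict."""
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        # Assumption: the server limit matches Python's character count.
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice}")
    if response_format not in FORMATS:
        raise ValueError(f"unknown response_format: {response_format}")
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to /v1/audio/speech.
    print(json.dumps(build_speech_payload("Hello", voice="nova", speed=1.25), indent=2))
```

The returned dict can be serialized with json.dumps and sent as the application/json request body. The server remains the authority on validation, so a request that passes these checks may still be rejected.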

cURL Example

curl https://api.agilecloud.ai/v1/audio/speech \
  -H "Authorization: Bearer $DIRECTAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts",
    "input": "ACAI gives you AI inference with built-in compliance.",
    "voice": "alloy"
  }' \
  --output speech.mp3

Python SDK Example

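from openai import OpenAI
from pathlib import Path

# Point the OpenAI client at the ACAI endpoint.
client = OpenAI(
    base_url="https://api.agilecloud.ai/v1",
    api_key="YOUR_API_KEY",  # replace with your API key
)

# Request speech synthesis; response_format defaults to mp3.
response = client.audio.speech.create(
    model="tts",
    voice="alloy",
    input="ACAI gives you AI inference with built-in compliance.",
)

# Write the returned audio bytes to disk.
Path("speech.mp3").write_bytes(response.content)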

Models

Model   Quality                                             Tiers
tts     Standard: fast, low latency                         Pro+
tts-hd  High definition: higher quality, slightly slower    Pro+

Notes

  • The response is streamed as raw audio bytes; write it to a file or pipe it to an audio player.
  • All 6 voices are available on both tts and tts-hd.
  • Maximum input is 4,096 characters per request.
  • Compliance guardrails (content safety, audit logging) apply to the input text.
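
Because each request accepts at most 4,096 characters, longer text has to be split across several requests. The following is one possible approach, sketched with a hypothetical helper named split_for_tts. It breaks at spaces where it can, falls back to a hard cut otherwise, and assumes the limit is measured in Python characters.

```python
MAX_INPUT_CHARS = 4096  # per-request limit stated in the notes above


def split_for_tts(text, limit=MAX_INPUT_CHARS):
    """Hypothetical helper: split text into chunks of at most `limit` characters."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Find the last space at or before index `limit`.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            # No usable space in range: fall back to a hard cut.
            cut = limit
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks


if __name__ == "__main__":
    long_text = "ACAI gives you AI inference with built-in compliance. " * 200
    for i, chunk in enumerate(split_for_tts(long_text)):
        print(i, len(chunk))
```

Each chunk would then be sent as its own request. How the resulting audio segments are joined depends on the chosen response_format and is not covered here.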