ACAI
Continuous compliance for AI. Every call scanned, classified, audit-logged, and evidence-ready.

Service-Disabled Veteran-Owned Small Business
© 2026 Agile Cloud & AI LLC. All rights reserved.

Embeddings

Generate vector embeddings for semantic search, RAG, clustering, and classification. The endpoint is OpenAI-compatible, so existing OpenAI client libraries work once they are pointed at the base URL.

Endpoint

POST https://api.agilecloud.ai/v1/embeddings

Request Body

Parameter  Type            Required  Description
model      string          Yes       Model name or alias (e.g., text-embedding-3-small)
input      string | array  Yes       Text string or array of strings to embed

Example

curl https://api.agilecloud.ai/v1/embeddings \
  -H "Authorization: Bearer $DIRECTAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": "ACAI provides continuous compliance AI inference."
  }'

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, 0.0789, ...]
    }
  ],
  "model": "text-embedding-3-small",
  "usage": {
    "prompt_tokens": 8,
    "total_tokens": 8
  }
}
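
As a rough illustration of reading the response shape above, the following sketch parses a payload and pulls out the vector and the token usage. The three-value vector is a made-up stand-in for the elided embedding, and the variable names are illustrative only.

```python
import json

# Sample payload shaped like the response above. The 3-value vector is a
# made-up placeholder; a real response carries the full embedding.
raw = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.0123, -0.0456, 0.0789]}
  ],
  "model": "text-embedding-3-small",
  "usage": {"prompt_tokens": 8, "total_tokens": 8}
}
"""

payload = json.loads(raw)

# Each entry in "data" holds one vector plus the index of the input it belongs to.
first = payload["data"][0]
vector = first["embedding"]
dimensions = len(vector)

# Token accounting for the request is reported under "usage".
tokens_used = payload["usage"]["total_tokens"]

print(f"model={payload['model']} dims={dimensions} tokens={tokens_used}")
```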

Batch Embedding

Pass an array of strings to embed multiple texts in a single request. The server uses dynamic batching to maximize throughput.

curl https://api.agilecloud.ai/v1/embeddings \
  -H "Authorization: Bearer $DIRECTAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "text-embedding-3-small",
    "input": [
      "First document text",
      "Second document text",
      "Third document text"
    ]
  }'

The response's data array contains one embedding per input, in the same order. Each item's index field gives the position of the input it corresponds to.
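
One way to line batch results up with their source texts is to key on each item's index field rather than relying on list position alone. This is a minimal sketch: pair_inputs_with_embeddings is a hypothetical helper, and the two-value vectors are invented and deliberately shuffled to show the reordering.

```python
def pair_inputs_with_embeddings(inputs, data):
    """Return (text, vector) pairs, matching each vector to its input by index."""
    if len(inputs) != len(data):
        raise ValueError("expected one embedding per input")
    # Sort defensively by the index field so pairing does not depend on list order.
    ordered = sorted(data, key=lambda item: item["index"])
    return [(text, item["embedding"]) for text, item in zip(inputs, ordered)]


inputs = ["First document text", "Second document text", "Third document text"]

# Made-up 2-value vectors standing in for real embeddings, out of order on purpose.
data = [
    {"object": "embedding", "index": 2, "embedding": [0.3, 0.3]},
    {"object": "embedding", "index": 0, "embedding": [0.1, 0.1]},
    {"object": "embedding", "index": 1, "embedding": [0.2, 0.2]},
]

pairs = pair_inputs_with_embeddings(inputs, data)
for text, vector in pairs:
    print(text, vector)
```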

Python SDK Example

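import os

from openai import OpenAI

# Read the key from the environment (the same variable the curl examples use)
# rather than hardcoding it in source.
client = OpenAI(
    base_url="https://api.agilecloud.ai/v1",
    api_key=os.environ["DIRECTAI_API_KEY"],
)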

response = client.embeddings.create(
    model="text-embedding-3-small",
    input=["Hello, world!", "Another sentence"],
)

for item in response.data:
    print(f"Index {item.index}: {len(item.embedding)} dimensions")
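
For the semantic search use case mentioned at the top of this page, a common approach is to rank documents by cosine similarity against a query vector. The sketch below uses only the standard library and toy 3-dimensional vectors; cosine_similarity and rank_by_similarity are illustrative helpers, not part of the API. In practice the vectors would come from item.embedding in the response.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors for illustration only; real embeddings have far more dimensions.
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 0.0],  # identical direction
    [1.0, 1.0, 0.0],  # 45 degrees from the query
]

ranking = rank_by_similarity(query, docs)
for doc_index, score in ranking:
    print(f"doc {doc_index}: {score:.4f}")
```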

Performance Notes

  • text-embedding-3-small produces 1536-dimensional embeddings
  • Maximum sequence length: 8191 tokens
  • Dynamic batching groups up to 256 inputs per batch for maximum throughput
  • Powered by serverless inference endpoints
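
When embedding a large corpus, a client can split its inputs into several requests. The sketch below assumes a chunk size of 256 to mirror the server-side batch size noted above; this page does not state a per-request input limit, so treat that number as an assumption. chunked, embed_all, and fake_embed are hypothetical names, and fake_embed stands in for a real call such as client.embeddings.create so the example runs offline.

```python
from typing import Callable, Iterator, List, Sequence

# Assumption: mirror the server-side batch size from the notes above.
BATCH_SIZE = 256


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
    batch_size: int = BATCH_SIZE,
) -> List[List[float]]:
    """Embed `texts` chunk by chunk, preserving input order in the result."""
    vectors: List[List[float]] = []
    for batch in chunked(texts, batch_size):
        vectors.extend(embed_fn(list(batch)))
    return vectors


def fake_embed(batch: List[str]) -> List[List[float]]:
    """Offline stand-in for an API call: returns a 1-value 'vector' per text."""
    return [[float(len(text))] for text in batch]


texts = [f"doc {i}" for i in range(600)]
vectors = embed_all(texts, fake_embed)
print(len(vectors), [len(chunk) for chunk in chunked(texts, BATCH_SIZE)])
```

To use this against the live endpoint, embed_fn could wrap the SDK call and return [item.embedding for item in response.data].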