Smart Routing

Route requests across models and providers with A/B testing, budget controls, and automatic fallback chains. Optimize for cost, latency, or quality without changing client code.

How It Works

Route configurations define rules for directing traffic. When a request arrives, the routing engine evaluates rules in priority order to select the target model and provider. Routes can split traffic (A/B), enforce budgets, and fall back to alternates on failure.

Request → Routing Engine
  ├── A/B Test match?   → Split to variant A or B
  ├── Budget exceeded?  → Route to cheaper fallback
  ├── Primary healthy?  → Route to primary
  └── Primary down?     → Fallback chain
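
The evaluation order in the diagram can be sketched as a chain of checks. This is a minimal Python illustration, not the engine's implementation; `RequestContext`, `decide`, and the branch labels are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class RequestContext:
    # Hypothetical flags standing in for the engine's internal state.
    ab_test_match: bool
    budget_exceeded: bool
    primary_healthy: bool


def decide(ctx: RequestContext) -> str:
    """Return the branch taken, checking conditions in the diagram's order."""
    if ctx.ab_test_match:
        return "ab_split"
    if ctx.budget_exceeded:
        return "cheaper_fallback"
    if ctx.primary_healthy:
        return "primary"
    return "fallback_chain"
```

Because the checks run top to bottom, a request that matches an A/B test is split even when the budget is exceeded, at least under the ordering assumed here.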

Create a Route

curl https://api.agilecloud.ai/api/v1/routes \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "production-chat",
    "match_model": "default-chat",
    "strategy": "fallback",
    "targets": [
      {"model": "qwen-2.5-3b", "weight": 1.0},
      {"model": "gpt-4o-mini", "weight": 1.0, "fallback": true}
    ]
  }'

A/B Testing

Split traffic between models to compare latency, quality, and cost. Results are tracked per variant and reported with statistical significance.

curl https://api.agilecloud.ai/api/v1/routes \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "model-comparison",
    "match_model": "chat-model",
    "strategy": "ab_test",
    "targets": [
      {"model": "qwen-2.5-3b", "weight": 0.5},
      {"model": "gpt-4o-mini", "weight": 0.5}
    ]
  }'
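
To make the weights concrete, here is one way a weighted split could be implemented. It is a sketch under assumptions: the source does not say how requests are assigned to variants, so the hash-based, per-customer sticky assignment below and the `pick_variant` helper are illustrative only.

```python
import hashlib


def pick_variant(targets, key):
    """Map a stable key (for example a customer ID) to a target model.

    Each target owns a slice of [0, total_weight) proportional to its weight,
    and the key's hash picks a point in that range.
    """
    total = sum(t["weight"] for t in targets)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # 48 bits convert to a float exactly, so the point stays below `total`.
    point = int.from_bytes(digest[:6], "big") / 2**48 * total
    cumulative = 0.0
    for target in targets:
        cumulative += target["weight"]
        if point < cumulative:
            return target["model"]
    return targets[-1]["model"]  # guard against floating-point edge cases


targets = [
    {"model": "qwen-2.5-3b", "weight": 0.5},
    {"model": "gpt-4o-mini", "weight": 0.5},
]
variant = pick_variant(targets, "cust_abc")
```

Hashing a stable key keeps each customer on the same variant across requests, which tends to make per-variant comparisons cleaner than a fresh random draw per request.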

Budget Controls

Set spending limits to prevent runaway costs. When the budget is exhausted, requests are either routed to a cheaper model or rejected, depending on the configured overage_action.

# Get budget status
curl https://api.agilecloud.ai/api/v1/budget/status \
  -H "Authorization: Bearer YOUR_API_KEY"

# Response
{
  "budget_usd": 500.00,
  "spent_usd": 142.30,
  "remaining_usd": 357.70,
  "forecast_end_of_month_usd": 284.60,
  "period": "2025-01"
}

# Update budget config
curl -X PATCH https://api.agilecloud.ai/api/v1/budget/config \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "monthly_budget_usd": 500.00,
    "alert_threshold_pct": 80,
    "overage_action": "downgrade_model"
  }'
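
The figures in the status response relate to the config fields by simple arithmetic. The sketch below reproduces that arithmetic in Python; the `budget_state` helper and its output keys are hypothetical, and the forecast is left out because the source does not say how it is computed.

```python
def budget_state(budget_usd, spent_usd, alert_threshold_pct):
    """Derive remaining budget and alert status from the status fields."""
    used_pct = spent_usd / budget_usd * 100
    return {
        "remaining_usd": round(budget_usd - spent_usd, 2),
        "used_pct": round(used_pct, 2),
        # Assumed semantics: alert once usage reaches the threshold.
        "alert": used_pct >= alert_threshold_pct,
        "exhausted": spent_usd >= budget_usd,
    }


# Values taken from the example response and config above.
state = budget_state(budget_usd=500.00, spent_usd=142.30, alert_threshold_pct=80)
```

With the example values, 28.46% of the budget is used, so neither the 80% alert nor the overage action would apply yet.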

Fallback Chains

When a primary backend is unhealthy or overloaded, the routing engine automatically falls through to the next target in the chain. Health is monitored continuously, and traffic returns to the primary automatically once it recovers.
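
The fall-through behavior amounts to picking the first healthy target in chain order. A minimal Python sketch follows; the `select_target` helper and the `health` mapping are assumptions for illustration, since the source does not describe how health state is stored.

```python
def select_target(chain, health):
    """Return the first healthy model in chain order, or None if all are down."""
    for model in chain:
        if health.get(model, False):  # unknown models count as unhealthy
            return model
    return None


chain = ["qwen-2.5-3b", "gpt-4o-mini"]

# Primary is down: traffic falls through to the next target.
during_outage = select_target(chain, {"qwen-2.5-3b": False, "gpt-4o-mini": True})

# Primary has recovered: the same call returns to it without further changes.
after_recovery = select_target(chain, {"qwen-2.5-3b": True, "gpt-4o-mini": True})
```

Because selection is re-evaluated against current health on every request, no explicit "fail back" step is needed in this model.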

Dry-Run Evaluation

Test routing rules without sending actual requests:

curl https://api.agilecloud.ai/api/v1/routes/evaluate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "default-chat",
    "customer_id": "cust_abc"
  }'

# Response shows which route and target would be selected
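
The same dry-run call can be made from Python's standard library. The sketch below only builds the request, mirroring the curl command above; `build_evaluate_request` is a hypothetical helper, and the response is not parsed because its fields are not documented here.

```python
import json
import urllib.request

API_BASE = "https://api.agilecloud.ai/api/v1"


def build_evaluate_request(api_key, model, customer_id):
    """Build (but do not send) the POST request for the dry-run endpoint."""
    body = json.dumps({"model": model, "customer_id": customer_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/routes/evaluate",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_evaluate_request("YOUR_API_KEY", "default-chat", "cust_abc")
# To send it: urllib.request.urlopen(req), then json.load() the response body.
```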

Endpoints

Method  Path                        Description
POST    /api/v1/routes              Create route
GET     /api/v1/routes              List routes
GET     /api/v1/routes/{route_id}   Get route
PATCH   /api/v1/routes/{route_id}   Update route
DELETE  /api/v1/routes/{route_id}   Delete route
POST    /api/v1/routes/evaluate     Dry-run evaluate
GET     /api/v1/budget/status       Budget status
GET     /api/v1/budget/config       Get budget config
PATCH   /api/v1/budget/config       Update budget config

Tier Availability

Smart routing, A/B testing, and budget controls are available on the Business tier and above.