Self-Hosting
DirectAI is Apache 2.0 licensed. Deploy the entire stack in your own infrastructure — data never leaves your boundary.
Prerequisites
- Cloud subscription with Kubernetes quota (AKS, EKS, GKE, or self-managed)
- Cloud CLI installed and authenticated
- Helm 3.x and kubectl configured
- Docker for building custom images (optional)
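Before deploying, it can help to confirm the required CLIs are on your PATH. A minimal sketch (the `require` helper is hypothetical, and the tool list assumes Azure; swap `az` for `aws` or `gcloud` as appropriate):

```shell
#!/bin/sh
# Report which prerequisite CLI tools are installed.
require() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    MISSING=1
  fi
}

for tool in kubectl helm docker az; do
  require "$tool"
done

[ -z "$MISSING" ] || echo "Install the missing tools before continuing."
```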
1. Deploy Infrastructure
Use the Bicep templates to deploy AKS, Storage, Key Vault, and networking:
```bash
# Clone the repository
git clone https://github.com/TheManInTheBox/DirectAI.git
cd DirectAI

# Create a parameter file for your environment
cp infra/environments/internal.prod.scus.bicepparam \
   infra/environments/mycompany.prod.eus2.bicepparam

# Edit the parameter file with your subscription, region, etc.

# Deploy the stamp
az deployment sub create \
  --location eastus2 \
  --template-file infra/main.bicep \
  --parameters infra/environments/mycompany.prod.eus2.bicepparam
```
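The parameter file uses standard Bicep `.bicepparam` syntax. A hedged sketch, assuming illustrative parameter names (the actual names are declared in infra/main.bicep):

```bicep
// Illustrative parameters -- the real names are declared in infra/main.bicep
using '../main.bicep'

param environmentName = 'prod'
param location = 'eastus2'
param organizationName = 'mycompany'
```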
2. Configure AKS
```bash
# Get AKS credentials
az aks get-credentials --resource-group rg-dai-mycompany-prod-eus2 \
  --name aks-dai-mycompany-prod-eus2

# Install NGINX Ingress Controller
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace

# Install cert-manager for TLS
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true

# Apply ClusterIssuers
kubectl apply -f deploy/cluster-issuers.yaml
```
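deploy/cluster-issuers.yaml is expected to contain cert-manager ClusterIssuer resources. A typical Let's Encrypt issuer looks like the following sketch (the issuer name and email are placeholders; match them to whatever your ingress annotations reference):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```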
3. Deploy DirectAI
```bash
# Create a custom values file
cp deploy/helm/directai/values-dev.yaml \
   deploy/helm/directai/values-mycompany.yaml

# Edit values-mycompany.yaml:
# - Set your ACR login server
# - Configure model backends
# - Set API server environment variables
# - Configure ingress hostname

# Deploy with Helm
helm upgrade --install directai deploy/helm/directai \
  --namespace directai --create-namespace \
  -f deploy/helm/directai/values.yaml \
  -f deploy/helm/directai/values-mycompany.yaml
```
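As a rough illustration, a values-mycompany.yaml override might look like this; these keys are assumptions, so check the chart's values.yaml for the actual schema:

```yaml
# Hypothetical override keys -- consult deploy/helm/directai/values.yaml for the real schema
image:
  registry: mycompanyacr.azurecr.io
ingress:
  host: directai.mycompany.com
  tls:
    issuer: letsencrypt-prod
apiServer:
  env:
    DIRECTAI_MODEL_CONFIG_DIR: /etc/directai/models
```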
4. Deploy Models
Models are declared as YAML configs in deploy/models/. Each model config specifies the engine type, backend URL, and tier availability.
```yaml
# Example: deploy/models/managed/gpt-4o.yaml
apiVersion: directai/v1
kind: ModelDeployment
metadata:
  name: gpt-4o
spec:
  displayName: "GPT-4o"
  ownedBy: OpenAI
  modality: chat
  engine:
    type: managed-serverless
    backendUrl: "https://<inference-endpoint>/v1"
    apiKeySecret: "gpt4o-key"
  managed:
    tiers:
      - Pro
      - Business
      - Enterprise
  api:
    aliases:
      - gpt-4o
```
The Helm chart reads these configs and creates Kubernetes Services and routing rules for each model backend.
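For an external backend like the one above, the routing resources the chart emits might include an ExternalName Service. A hedged sketch, not the chart's actual output (the name, namespace, and hostname are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: model-gpt-4o
  namespace: directai
spec:
  type: ExternalName
  externalName: inference-endpoint.example.com
```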
Local Development
Run the full stack locally with Docker Compose and Ollama:
```bash
# Start all services
docker compose up

# API server: http://localhost:8000
# Web app:    http://localhost:3000

# Or run the API server directly
cd src/api-server
pip install -e ".[dev]"
DIRECTAI_MODEL_CONFIG_DIR=../../deploy/models \
  python -m uvicorn app.main:app --reload
```
Local model configs in deploy/models/local/ use Ollama as the backend, so no special hardware is needed for development.
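A local config mirrors the managed example but points the engine at Ollama. A sketch under assumed field names (the exact engine type, model name, and URL depend on what's actually in deploy/models/local/):

```yaml
# Hypothetical local model config -- see deploy/models/local/ for real examples
apiVersion: directai/v1
kind: ModelDeployment
metadata:
  name: llama3
spec:
  displayName: "Llama 3 (local)"
  modality: chat
  engine:
    type: ollama
    backendUrl: "http://localhost:11434/v1"
```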