Self-Hosting

DirectAI is Apache 2.0 licensed. Deploy the entire stack in your own infrastructure — data never leaves your boundary.

Prerequisites

  • Cloud subscription with Kubernetes quota (AKS, EKS, GKE, or self-managed)
  • Cloud CLI installed and authenticated
  • Helm 3.x and kubectl configured
  • Docker for building custom images (optional)
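A quick way to confirm the tooling above is on your PATH (the loop only checks that the binaries exist; version requirements such as Helm 3.x still need a manual check, and `az` is shown here because the examples below target Azure — substitute `aws` or `gcloud` as appropriate):

```shell
# Check that each required CLI is available; Docker is only needed
# if you plan to build custom images.
for tool in az kubectl helm docker; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "ok: $tool"
  else
    echo "missing: $tool"
  fi
done
```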

1. Deploy Infrastructure

Use the Bicep templates to deploy AKS, Storage, Key Vault, and networking. The templates target Azure; on EKS, GKE, or self-managed Kubernetes, provision equivalent resources with your own tooling:

# Clone the repository
git clone https://github.com/TheManInTheBox/DirectAI.git
cd DirectAI

# Create a parameter file for your environment
cp infra/environments/internal.prod.scus.bicepparam \
   infra/environments/mycompany.prod.eus2.bicepparam

# Edit the parameter file with your subscription, region, etc.

# Deploy the stamp
az deployment sub create \
  --location eastus2 \
  --template-file infra/main.bicep \
  --parameters infra/environments/mycompany.prod.eus2.bicepparam
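The .bicepparam file uses the standard Bicep parameters-file syntax. As a rough sketch only (the parameter names below are hypothetical; use whichever parameters infra/main.bicep actually declares):

```bicep
// Illustrative sketch of a parameters file; the parameter names are
// assumptions and must match what infra/main.bicep declares.
using '../main.bicep'

param environmentName = 'prod'
param location = 'eastus2'
param organizationPrefix = 'mycompany'
```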

2. Configure AKS

# Get AKS credentials
az aks get-credentials --resource-group rg-dai-mycompany-prod-eus2 \
  --name aks-dai-mycompany-prod-eus2

# Install NGINX Ingress Controller
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace

# Install cert-manager for TLS
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true

# Apply ClusterIssuers
kubectl apply -f deploy/cluster-issuers.yaml

3. Deploy DirectAI

# Create a custom values file
cp deploy/helm/directai/values-dev.yaml \
   deploy/helm/directai/values-mycompany.yaml

# Edit values-mycompany.yaml:
#   - Set your ACR login server
#   - Configure model backends
#   - Set API server environment variables
#   - Configure ingress hostname

# Deploy with Helm
helm upgrade --install directai deploy/helm/directai \
  --namespace directai --create-namespace \
  -f deploy/helm/directai/values.yaml \
  -f deploy/helm/directai/values-mycompany.yaml
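As a sketch of the overrides called out in the comments above (all key names are illustrative; the chart's actual value names are defined in deploy/helm/directai/values.yaml):

```yaml
# Hypothetical values-mycompany.yaml; key names are assumptions and
# should be checked against the chart's values.yaml.
image:
  registry: mycompanyacr.azurecr.io

apiServer:
  env:
    DIRECTAI_MODEL_CONFIG_DIR: /etc/directai/models

ingress:
  enabled: true
  hostname: directai.mycompany.com
  className: nginx
  tls: true
```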

4. Deploy Models

Models are declared as YAML configs in deploy/models/. Each model config specifies the engine type, backend URL, and tier availability.

# Example: deploy/models/managed/gpt-4o.yaml
apiVersion: directai/v1
kind: ModelDeployment
metadata:
  name: gpt-4o
spec:
  displayName: "GPT-4o"
  ownedBy: OpenAI
  modality: chat
  engine:
    type: managed-serverless
    backendUrl: "https://<inference-endpoint>/v1"
    apiKeySecret: "gpt4o-key"
  managed:
    tiers:
      - Pro
      - Business
      - Enterprise
  api:
    aliases:
      - gpt-4o

The Helm chart reads these configs and creates Kubernetes Services and routing rules for each model backend.
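For an external backend like the managed-serverless example above, the rendered per-model Service could plausibly take the shape of a standard ExternalName Service (this is an illustrative guess at the output, not taken from the chart; names, labels, and the hostname are hypothetical):

```yaml
# Illustrative only: one possible shape of a per-model Service the chart
# could render for an external backend. All names here are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: model-gpt-4o
  namespace: directai
  labels:
    directai/model: gpt-4o
spec:
  type: ExternalName
  externalName: inference-endpoint.example.com
  ports:
    - port: 443
```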

Local Development

Run the full stack locally with Docker Compose and Ollama:

# Start all services
docker compose up

# API server: http://localhost:8000
# Web app: http://localhost:3000

# Or run the API server directly
cd src/api-server
pip install -e ".[dev]"
DIRECTAI_MODEL_CONFIG_DIR=../../deploy/models \
  python -m uvicorn app.main:app --reload

Local model configs in deploy/models/local/ use Ollama as the backend, so no special hardware is needed for development.
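A local model config might mirror the managed example above but point at Ollama's OpenAI-compatible endpoint on port 11434 (the engine type and field values below are assumptions; check an existing file in deploy/models/local/ for the real schema):

```yaml
# Hypothetical deploy/models/local/llama3.yaml; the engine type and
# backend URL are assumptions, not taken from the repository.
apiVersion: directai/v1
kind: ModelDeployment
metadata:
  name: llama3
spec:
  displayName: "Llama 3 (local)"
  ownedBy: Meta
  modality: chat
  engine:
    type: managed-serverless
    backendUrl: "http://localhost:11434/v1"
  api:
    aliases:
      - llama3
```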