Self-Hosting
DirectAI is Apache 2.0 licensed. Deploy the entire stack in your own infrastructure — data never leaves your boundary.
Prerequisites
- Cloud subscription with Kubernetes quota (AKS, EKS, GKE, or self-managed)
- Cloud CLI installed and authenticated
- Helm 3.x and kubectl configured
- Docker for building custom images (optional)
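Before deploying, it can help to confirm the required CLIs are on your PATH. A minimal sketch (the `require` helper is hypothetical, and the tool list assumes Azure; swap `az` for `aws` or `gcloud` as appropriate):

```shell
#!/bin/sh
# Report which prerequisite CLI tools are installed.
require() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    MISSING=1
  fi
}

for tool in kubectl helm docker az; do
  require "$tool"
done

[ -z "$MISSING" ] || echo "Install the missing tools before continuing."
```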
1. Deploy Infrastructure
Use the Bicep templates to deploy AKS, Storage, Key Vault, and networking:
```bash
# Clone the repository
git clone https://github.com/TheManInTheBox/DirectAI.git
cd DirectAI

# Create a parameter file for your environment
cp infra/environments/internal.prod.scus.bicepparam \
   infra/environments/mycompany.prod.eus2.bicepparam

# Edit the parameter file with your subscription, region, etc.

# Deploy the stamp
az deployment sub create \
  --location eastus2 \
  --template-file infra/main.bicep \
  --parameters infra/environments/mycompany.prod.eus2.bicepparam
```
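The parameter file uses standard Bicep `.bicepparam` syntax. A hedged sketch, assuming illustrative parameter names (the actual names are declared in infra/main.bicep):

```bicep
// Illustrative parameters -- the real names are declared in infra/main.bicep
using '../main.bicep'

param environmentName = 'prod'
param location = 'eastus2'
param organizationName = 'mycompany'
```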
2. Configure AKS
```bash
# Get AKS credentials
az aks get-credentials --resource-group rg-dai-mycompany-prod-eus2 \
  --name aks-dai-mycompany-prod-eus2

# Install NGINX Ingress Controller
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install ingress-nginx ingress-nginx/ingress-nginx \
  --namespace ingress-nginx --create-namespace

# Install cert-manager for TLS
helm repo add jetstack https://charts.jetstack.io
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager --create-namespace \
  --set crds.enabled=true

# Apply ClusterIssuers
kubectl apply -f deploy/cluster-issuers.yaml
```
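deploy/cluster-issuers.yaml is expected to contain cert-manager ClusterIssuer resources. A typical Let's Encrypt issuer looks like the following sketch (the issuer name and email are placeholders; match them to whatever your ingress annotations reference):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ops@example.com
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```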
3. Deploy DirectAI
```bash
# Create a custom values file
cp deploy/helm/directai/values-dev.yaml \
   deploy/helm/directai/values-mycompany.yaml

# Edit values-mycompany.yaml:
# - Set your ACR login server
# - Configure model backends
# - Set API server environment variables
# - Configure ingress hostname

# Deploy with Helm
helm upgrade --install directai deploy/helm/directai \
  --namespace directai --create-namespace \
  -f deploy/helm/directai/values.yaml \
  -f deploy/helm/directai/values-mycompany.yaml
```
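As a rough illustration, a values-mycompany.yaml override might look like this; these keys are assumptions, so check the chart's values.yaml for the actual schema:

```yaml
# Hypothetical override keys -- consult deploy/helm/directai/values.yaml for the real schema
image:
  registry: mycompanyacr.azurecr.io
ingress:
  host: directai.mycompany.com
  tls:
    issuer: letsencrypt-prod
apiServer:
  env:
    DIRECTAI_MODEL_CONFIG_DIR: /etc/directai/models
```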
4. Deploy Models
Models are declared as YAML configs in deploy/models/. Each model config specifies the engine type, backend URL, and tier availability.
```yaml
# Example: deploy/models/managed/gpt-4o.yaml
apiVersion: directai/v1
kind: ModelDeployment
metadata:
  name: gpt-4o
spec:
  displayName: "GPT-4o"
  ownedBy: OpenAI
  modality: chat
  engine:
    type: managed-serverless
    backendUrl: "https://<inference-endpoint>/v1"
    apiKeySecret: "gpt4o-key"
  managed:
    tiers:
      - Pro
      - Business
      - Enterprise
  api:
    aliases:
      - gpt-4o
```
The Helm chart reads these configs and creates Kubernetes Services and routing rules for each model backend.
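For an external backend like the one above, the routing resources the chart emits might include an ExternalName Service. A hedged sketch, not the chart's actual output (the name, namespace, and hostname are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: model-gpt-4o
  namespace: directai
spec:
  type: ExternalName
  externalName: inference-endpoint.example.com
```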
Local Development
Run the full stack locally with Docker Compose and Ollama:
```bash
# Start all services
docker compose up

# API server: http://localhost:8000
# Web app:    http://localhost:3000

# Or run the API server directly
cd src/api-server
pip install -e ".[dev]"
DIRECTAI_MODEL_CONFIG_DIR=../../deploy/models \
  python -m uvicorn app.main:app --reload
```
Local model configs in deploy/models/local/ use Ollama as the backend, so no special hardware is needed for development.
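A local config mirrors the managed example but points the engine at Ollama. A sketch under assumed field names (the exact engine type, model name, and URL depend on what's actually in deploy/models/local/):

```yaml
# Hypothetical local model config -- see deploy/models/local/ for real examples
apiVersion: directai/v1
kind: ModelDeployment
metadata:
  name: llama3
spec:
  displayName: "Llama 3 (local)"
  modality: chat
  engine:
    type: ollama
    backendUrl: "http://localhost:11434/v1"
```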