llm-d Infrastructure on xKS: Architecture Overview
Context
Goal: Deploy Red Hat AI Inference Server (LLMInferenceService) on xKS platforms (AKS, CoreWeave) for EA1 delivery.
Challenge: LLMInferenceService requires Red Hat-supported operators (cert-manager, sail-operator, lws-operator) that are normally deployed via OLM (Operator Lifecycle Manager) on OpenShift. OLM is not available on vanilla Kubernetes.
Solution: Extract the operator manifests from their OLM bundles and deploy them with Helm/Helmfile.
Components
| Component | Version | Purpose | OLM Bundle Source |
|---|---|---|---|
| cert-manager-operator | v1.15.2 | TLS certificate management | registry.redhat.io/cert-manager/cert-manager-operator-bundle |
| sail-operator (Istio) | 3.2.1 | Gateway API for inference routing | registry.redhat.io/openshift-service-mesh/istio-sail-operator-bundle |
| lws-operator | 1.0 | LeaderWorkerSet for distributed inference | registry.redhat.io/leader-worker-set/lws-operator-bundle |
Note: We use Istio only for Gateway API (inference routing), not as a service mesh.
Why Helm Charts?
| On OpenShift | On xKS (AKS/CKS) |
|---|---|
| OLM manages operator lifecycle | No OLM available |
| Subscription CR triggers install | Need alternative deployment method |
| Automatic upgrades via OLM | Manual upgrades via Helm |
Helm provides:
- Declarative deployment
- Version control
- Rollback capability
- Integration with GitOps (ArgoCD, Flux)
How We Extract OLM Bundles
Tool: olm-extractor
We use olm-extractor to convert OLM bundles to Kubernetes manifests.
# Example: Extract sail-operator bundle
podman run --rm \
  -v ~/.config/containers/auth.json:/root/.docker/config.json:z \
  quay.io/lburgazzoli/olm-extractor:main \
  run "registry.redhat.io/openshift-service-mesh/istio-sail-operator-bundle:3.2.1" \
  -n istio-system \
  --exclude '.kind == "ConsoleCLIDownload"'
What olm-extractor does:
1. Pulls the OLM bundle image
2. Reads the ClusterServiceVersion (CSV)
3. Extracts deployment, RBAC, CRDs
4. Outputs Kubernetes YAML manifests
Post-Processing
After extraction, we:
1. Split into CRDs and templates
2. Templatize namespace references
3. Add imagePullSecrets for Red Hat registry
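4. Apply CRDs with --server-side (some are 700KB+; client-side apply would copy the whole object into the kubectl.kubernetes.io/last-applied-configuration annotation, which exceeds the 256 KiB annotation size limit)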
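The post-processing steps above could be scripted roughly as follows. This is a minimal sketch, not the repository's actual tooling: the function names, the doc-NNN.yaml output naming, and the use of `{{ .Release.Namespace }}` as the template expression are all assumptions made for illustration. The size check is only a rough heuristic based on file size.

```shell
# Sketch only: function names, file layout, and the Helm expression below are
# assumptions, not taken from the repository.

# split_manifests INPUT OUTDIR
# Write each YAML document in INPUT to OUTDIR/crds/ (CRDs) or
# OUTDIR/templates/ (everything else), one numbered file per document.
split_manifests() {
  mkdir -p "$2/crds" "$2/templates"
  awk -v outdir="$2" '
    function emit(    dest, file) {
      if (doc == "") return
      n++
      dest = is_crd ? "crds" : "templates"
      file = sprintf("%s/%s/doc-%03d.yaml", outdir, dest, n)
      printf "%s", doc > file
      close(file)
      doc = ""
      is_crd = 0
    }
    /^---[ \t]*$/ { emit(); next }
    /^kind:[ \t]*CustomResourceDefinition[ \t]*$/ { is_crd = 1 }
    { doc = doc $0 "\n" }
    END { emit() }
  ' "$1"
}

# templatize_namespace FILE NAMESPACE
# Print FILE with literal "namespace: NAMESPACE" values replaced by a Helm
# expression (hypothetical choice of expression).
templatize_namespace() {
  sed "s/^\([ ]*namespace:[ ]*\)$2[ ]*\$/\1{{ .Release.Namespace }}/" "$1"
}

# needs_server_side FILE
# Succeed when FILE is larger than 262144 bytes (256 KiB), a rough signal that
# client-side apply would hit the annotation size limit.
needs_server_side() {
  [ "$(($(wc -c < "$1")))" -gt 262144 ]
}
```

A typical call sequence would be `split_manifests extracted.yaml charts/sail-operator`, then `templatize_namespace` on each file under templates/, where extracted.yaml is a hypothetical name for the olm-extractor output.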
Why This Approach?
Red Hat Supported Components
We use Red Hat-supported operator bundles because:
- Support: Covered under Red Hat subscription
- Tested: Validated on OpenShift, compatible with Kubernetes
- Security: Regular CVE patches
- Compliance: Required for enterprise customers
Implementation Details
Repository Structure
rhaii-on-xks/                          # Monorepo
├── helmfile.yaml.gotmpl               # Imports operator charts (local paths)
├── values.yaml                        # Configuration
├── Makefile                           # make deploy, make status
└── charts/
    ├── cert-manager-operator/         # Extracted operator
    │   ├── crds/                      # CRDs (installed by Helm with SSA)
    │   ├── templates/                 # Operator deployment, RBAC
    │   ├── scripts/update-bundle.sh   # Re-extract from newer bundle
    │   └── helmfile.yaml.gotmpl
    ├── sail-operator/                 # Extracted operator
    │   ├── crds/                      # 19 Istio CRDs
    │   ├── manifests-presync/         # Namespace, ServiceAccounts
    │   ├── templates/                 # Operator deployment, Istio CR
    │   ├── scripts/update-bundle.sh
    │   └── helmfile.yaml.gotmpl
    └── lws-operator/                  # Extracted operator
        ├── crds/
        ├── templates/
        ├── scripts/update-bundle.sh
        └── helmfile.yaml.gotmpl
Deployment Flow
User runs: make deploy
│
├── helmfile apply (rhaii-on-xks)
│   │
│   ├── Import charts/cert-manager-operator
│   │   ├── presync: Create operand namespace
│   │   └── install: Deploy operator + CRDs (Helm SSA)
│   │
│   ├── Import charts/sail-operator
│   │   ├── presync: Apply Gateway API CRDs (kustomize)
│   │   ├── presync: Apply istiod ServiceAccount
│   │   ├── install: Deploy operator + Istio CRDs (Helm SSA)
│   │   └── postsync: Apply webhook loop workaround
│   │
│   └── Import charts/lws-operator
│       ├── presync: Create namespace + ServiceAccount
│       └── install: Deploy operator + CRDs (Helm SSA)
│
└── Operators reconcile and deploy operands
    ├── cert-manager controller
    ├── istiod (Gateway API controller)
    └── lws controller
Authentication
The Red Hat registry (registry.redhat.io) requires authentication:
# values.yaml
useSystemPodmanAuth: true # Uses ~/.config/containers/auth.json
A pull secret is created in each operator namespace and attached to the ServiceAccounts in that namespace.
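Done by hand, the same effect could look roughly like the sketch below. It is not the charts' actual implementation: the secret name redhat-pull-secret, the function name, and the KUBECTL override are assumptions for illustration. Note that the local auth file may also hold credentials for registries other than Red Hat's.

```shell
# Sketch only. KUBECTL can be overridden (for example KUBECTL=echo) to print
# the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"
AUTH_FILE="${AUTH_FILE:-$HOME/.config/containers/auth.json}"
PULL_SECRET_NAME="${PULL_SECRET_NAME:-redhat-pull-secret}"  # hypothetical name

# attach_pull_secret NAMESPACE SERVICEACCOUNT
attach_pull_secret() {
  # Create a dockerconfigjson secret from the local Podman auth file.
  # This fails if the secret already exists.
  "$KUBECTL" create secret generic "$PULL_SECRET_NAME" \
    --namespace "$1" \
    --type=kubernetes.io/dockerconfigjson \
    --from-file=.dockerconfigjson="$AUTH_FILE" || return 1
  # Reference the secret from the ServiceAccount so its pods can pull images.
  "$KUBECTL" patch serviceaccount "$2" \
    --namespace "$1" \
    -p "{\"imagePullSecrets\":[{\"name\":\"$PULL_SECRET_NAME\"}]}"
}
```

For example, `attach_pull_secret istio-system istiod` would cover the istiod ServiceAccount in the namespace used by the extraction example.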
Known Issues & Workarounds
Sail-Operator Reconciliation Loop
Issue: On vanilla Kubernetes, sail-operator enters an infinite reconciliation loop triggered by caBundle updates to the MutatingWebhookConfiguration.
Workaround: Applied automatically via postsync hook:
kubectl annotate mutatingwebhookconfiguration istio-sidecar-injector \
sailoperator.io/ignore=true
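If the workaround ever needs to be applied by hand, a re-runnable variant might look like the sketch below. The retry loop assumes the webhook configuration may not exist yet when the command runs, and `--overwrite` makes repeated runs safe. The function name, the defaults, and the KUBECTL override are assumptions, not the hook's actual contents.

```shell
# Sketch only. KUBECTL can be overridden (for example KUBECTL=echo) to print
# the commands instead of running them.
KUBECTL="${KUBECTL:-kubectl}"

# annotate_webhook NAME [ATTEMPTS] [DELAY_SECONDS]
# Wait until the MutatingWebhookConfiguration exists, then annotate it.
annotate_webhook() {
  local name="$1" attempts="${2:-30}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    if "$KUBECTL" get mutatingwebhookconfiguration "$name" >/dev/null 2>&1; then
      # --overwrite keeps the command idempotent across redeploys.
      "$KUBECTL" annotate --overwrite mutatingwebhookconfiguration "$name" \
        sailoperator.io/ignore=true
      return
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "webhook $name not found after $attempts attempts" >&2
  return 1
}
```

It would be called as `annotate_webhook istio-sidecar-injector`.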
EA1 Usage
For EA1 Delivery
- Customer provisions xKS cluster (AKS or CoreWeave)
- Customer obtains Red Hat pull secret
- Deploy infrastructure:
  git clone https://github.com/opendatahub-io/rhaii-on-xks
  cd rhaii-on-xks
  make deploy-all
- Deploy KServe controller:
  make deploy-kserve
- Set up Gateway and deploy LLMInferenceService (see Deployment Guide)
Upgrade Path
When new operator versions are released:
# Update each chart
cd charts/cert-manager-operator
./scripts/update-bundle.sh v1.16.0
cd ../sail-operator
./scripts/update-bundle.sh 3.3.0
# Redeploy from repo root
cd ../..
make deploy
Summary
- What: Deploy Red Hat operators on xKS without OLM
- How: Extract OLM bundles → Helm charts via olm-extractor
- Why: Red Hat support + no OLM on vanilla K8s
- Result: LLMInferenceService runs on AKS and CoreWeave with supported components