RBAC for GPU platform operators
Scope: the ServiceAccounts, ClusterRoles and ClusterRoleBindings that the GPU Operator, Network Operator, DRA driver and Kueue install and run as. What each is allowed to do, how to keep it least-privilege, and how to read an authorization denial in the logs. Verify with kubectl auth can-i --as=system:serviceaccount:.... The access-control companion to Security & multi-tenancy and the GPU platform hub.
Reference templates from upstream charts. The operators ship their own RBAC; do not hand-write the controller roles. Treat the YAML below as what to audit and as the boundary for tenant-facing bindings, not as a replacement for the chart-generated roles. Apply via GitOps.
flowchart TB
  subgraph CTRL["Controller identities (chart-installed)"]
    SAGPU["sa/gpu-operator"]
    SANET["sa/network-operator"]
    SADRA["sa/nvidia-dra-driver-gpu"]
    SAKUE["sa/kueue-controller-manager"]
  end
  subgraph CR["ClusterRoles"]
    CRGPU["gpu-operator"]
    CRDRA["dra-driver role: resourceslices/resourceclaims"]
    CRKUEM["kueue-manager-role (controller)"]
    CRKUE["kueue-batch-admin-role / kueue-batch-user-role"]
  end
  SAGPU --> CRGPU
  SADRA --> CRDRA
  SAKUE --> CRKUEM
  HUMAN["Humans / CI"] -->|"bind, do not impersonate SA"| CRKUE
What it is
Each operator installs a namespaced ServiceAccount (the identity its controller pods run as) and one or more cluster-scoped ClusterRoles granting the verbs that controller needs, joined by a ClusterRoleBinding. GPU-platform controllers are privileged by nature: they label nodes, evict pods, manage node-level DaemonSets, and create cluster-scoped CRs (ClusterPolicy, NicClusterPolicy, ResourceSlice). The job here is not to write those roles (the charts do, and overriding them breaks upgrades) but to know exactly what each grants, scope the privileged ones, and bind humans and CI to the narrow user-facing roles instead of letting them impersonate a controller ServiceAccount.
Two distinct RBAC surfaces:
- Controller RBAC: installed by the chart, consumed by the operator's own pods. Audit it; do not rewrite it.
- Tenant/operator RBAC: what you grant humans and pipelines (submit jobs, read queues, edit a ClusterQueue). This is where least-privilege is yours to enforce.
Prerequisites
- Kubernetes 1.29+ with the RBAC authorizer enabled (default). DRA roles below assume 1.34+ with the resource.k8s.io API served (v1 is stable on 1.34/1.35; older clusters serve v1beta1).
- The four operators installed per the hub: GPU Operator in gpu-operator, Network Operator in nvidia-network-operator, DRA driver in nvidia-dra-driver-gpu, Kueue in kueue-system.
- kubectl auth can-i available (in-tree since 1.6) and cluster-admin to create the bindings.
- Decision made on namespace Pod Security Admission level: the Network Operator (MOFED) and GPU Operator driver/toolkit pods require privileged on their namespaces.
The manifest
The controller identities are created by the charts. You can inspect them rather than declare them:
# Controller ServiceAccounts the charts install
kubectl -n gpu-operator get sa gpu-operator
kubectl -n nvidia-dra-driver-gpu get sa # nvidia-dra-driver-gpu-*
kubectl -n kueue-system get sa kueue-controller-manager
kubectl -n nvidia-network-operator get sa # network-operator + sub-charts
# The ClusterRoles those identities use
kubectl get clusterrole gpu-operator -o yaml
kubectl get clusterrole -l rbac.kueue.x-k8s.io/role # Kueue's generated roles
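Once a role has been dumped (for example with `kubectl get clusterrole gpu-operator -o json`), a first-pass audit can be scripted. The sketch below is only an illustration: the `SENSITIVE` set is a hypothetical policy seeded from the two negative checks used later on this page (delete nodes, create clusterrolebindings), and the function names are not part of any chart or tool. It flags wildcards and those sensitive grants; it does not evaluate resourceNames, aggregation, or nonResourceURLs.

```python
import json

# Hypothetical audit policy: (apiGroup, resource, verb) grants that deserve a second look.
SENSITIVE = {
    ("", "nodes", "delete"),
    ("rbac.authorization.k8s.io", "clusterrolebindings", "create"),
}


def audit_cluster_role(role: dict) -> list:
    """Return human-readable findings for one ClusterRole given as a parsed dict."""
    findings = []
    for i, rule in enumerate(role.get("rules") or []):
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = rule.get("verbs", [])
        # A "*" in any of the three fields widens the rule beyond what can be audited by name.
        for field, values in (("apiGroups", groups), ("resources", resources), ("verbs", verbs)):
            if "*" in values:
                findings.append(f"rule {i}: wildcard in {field}")
        # Expand the rule into concrete triples and compare against the policy.
        for g in groups:
            for r in resources:
                for v in verbs:
                    if (g, r, v) in SENSITIVE:
                        findings.append(f"rule {i}: {v} on {r} in group {g!r}")
    return findings


def audit_cluster_role_json(text: str) -> list:
    """Same audit, starting from the JSON text that kubectl prints with -o json."""
    return audit_cluster_role(json.loads(text))
```

An empty result means only that nothing in this small policy matched; the rules still need to be read against what the controller is documented to do.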
1. GPU Operator. The chart creates ServiceAccount gpu-operator and ClusterRole gpu-operator (release namespace via {{ .Release.Namespace }}). Its verified grants include cluster-scoped node and pod management plus its CRDs:
# Reference shape of the chart-installed ClusterRole (do not hand-apply; audit only).
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gpu-operator
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "update", "patch"]   # node labels for MIG/strategy
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]   # drains nodes for driver upgrades
  - apiGroups: ["nvidia.com"]
    resources: ["clusterpolicies", "clusterpolicies/status", "clusterpolicies/finalizers",
                "nvidiadrivers", "nvidiadrivers/status", "nvidiadrivers/finalizers"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "deletecollection"]
  - apiGroups: ["security.openshift.io"]
    resources: ["securitycontextconstraints"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete", "use"]   # OpenShift only
2. DRA driver. A DRA driver controller needs a tight, well-known set: read resourceclaims and nodes, full lifecycle on resourceslices it publishes. The canonical upstream shape below is for audit only; the NVIDIA chart installs the real ServiceAccount/ClusterRole/ClusterRoleBinding and owns their names across upgrades:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: nvidia-dra-driver-gpu
rules:
  - apiGroups: ["resource.k8s.io"]
    resources: ["resourceclaims"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
  - apiGroups: ["resource.k8s.io"]
    resources: ["resourceslices"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: nvidia-dra-driver-gpu
subjects:
  - kind: ServiceAccount
    name: nvidia-dra-driver-gpu-service-account   # verify exact name: kubectl -n nvidia-dra-driver-gpu get sa
    namespace: nvidia-dra-driver-gpu
roleRef:
  kind: ClusterRole
  name: nvidia-dra-driver-gpu
  apiGroup: rbac.authorization.k8s.io
3. Kueue. Ships ServiceAccount kueue-controller-manager in kueue-system plus two persona ClusterRoles you bind to humans (never to the controller SA): kueue-batch-admin-role (manage ClusterQueues, LocalQueues, Workloads, ResourceFlavors) and kueue-batch-user-role (submit/manage Jobs, view queues and workloads). Bind a platform team admin cluster-wide, scope users per namespace:
# Platform admins manage quota cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kueue-platform-admins
subjects:
  - kind: Group
    name: "<batch-admins-group>"   # placeholder: the group name your identity provider asserts
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: kueue-batch-admin-role
  apiGroup: rbac.authorization.k8s.io
---
# Tenant team submits jobs in its own namespace only.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-batch-users
  namespace: team-a
subjects:
  - kind: Group
    name: "<team-a-group>"   # placeholder: the group name your identity provider asserts
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole   # ClusterRole referenced from a namespaced RoleBinding scopes it to team-a
  name: kueue-batch-user-role
  apiGroup: rbac.authorization.k8s.io
4. Network Operator. Installs its own controller ServiceAccount and roles via the chart's rbac.create=true. There is no tenant-facing role to add; the work is the PSA label its privileged DaemonSets (MOFED, RDMA shared device plugin) require:
kubectl label --overwrite ns nvidia-network-operator \
  pod-security.kubernetes.io/enforce=privileged
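The tenant binding in step 3 depends on two matching rules that are easy to misjudge: a RoleBinding applies a ClusterRole's rules only inside the binding's namespace, and a rule matches only when its apiGroups entry names the request's API group exactly. The toy model below illustrates both. It is a deliberately simplified sketch, not the real authorizer: it ignores resourceNames, subresources, aggregation and users. The rule contents and the group names `team-a-group` and `batch-admins-group` are hypothetical and do not reproduce what the Kueue chart ships.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    api_groups: Tuple[str, ...]
    resources: Tuple[str, ...]
    verbs: Tuple[str, ...]


@dataclass(frozen=True)
class Binding:
    role: str                 # name of the referenced ClusterRole
    group: str                # subject group
    namespace: Optional[str]  # None models a ClusterRoleBinding


def rule_matches(rule: Rule, api_group: str, resource: str, verb: str) -> bool:
    def hit(values, wanted):
        return "*" in values or wanted in values
    return hit(rule.api_groups, api_group) and hit(rule.resources, resource) and hit(rule.verbs, verb)


def allowed(roles, bindings, group, api_group, resource, verb, namespace=None) -> bool:
    """namespace=None models a cluster-scoped request."""
    for b in bindings:
        if b.group != group:
            continue
        # A RoleBinding never applies outside its own namespace.
        if b.namespace is not None and b.namespace != namespace:
            continue
        if any(rule_matches(r, api_group, resource, verb) for r in roles[b.role]):
            return True
    return False


# Illustrative rule contents only.
ROLES = {
    "kueue-batch-user-role": (Rule(("batch",), ("jobs",), ("create", "get", "list")),),
    "kueue-batch-admin-role": (Rule(("kueue.x-k8s.io",), ("clusterqueues",), ("get", "update")),),
}
BINDINGS = [
    Binding("kueue-batch-admin-role", "batch-admins-group", None),
    Binding("kueue-batch-user-role", "team-a-group", "team-a"),
]
```

In this model `allowed(ROLES, BINDINGS, "team-a-group", "batch", "jobs", "create", "team-a")` is true while the same call with `"team-b"` is false, which mirrors the yes/no pair asserted with `kubectl auth can-i` on this page.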
Configuration
| Key / field | Where | Default / value | Notes |
|---|---|---|---|
| sa/gpu-operator | gpu-operator ns | created by chart | Controller identity; ClusterRole gpu-operator. |
| ClusterRole gpu-operator verbs | cluster | nodes update/patch, pods/eviction create | Node labelling + drain for driver upgrades. |
| securitycontextconstraints rule | cluster | OpenShift only | No-op on upstream K8s; SCC is OpenShift's PSP equivalent. |
| rbac.create | Network Operator values | true | Set false only to supply pre-made roles; then set serviceAccount.name. |
| serviceAccount.name | Network Operator values | chart default | Override only with rbac.create=false. |
| PSA enforce label | operator namespaces | privileged (GPU Op, Network Op) | MOFED/driver/toolkit pods need host namespaces + hostPath. |
| sa/nvidia-dra-driver-gpu-* | nvidia-dra-driver-gpu ns | created by chart | Needs resourceslices CRUD, resourceclaims/nodes get. |
| resource.k8s.io apiVersion | DRA roles | v1 (1.34+/1.35) | v1beta1 on older clusters; match served version. |
| sa/kueue-controller-manager | kueue-system ns | created by chart | Do not bind admin/user persona roles to it. |
| kueue-batch-admin-role | cluster | persona role | Bind to platform admins (group), cluster-wide. |
| kueue-batch-user-role | cluster | persona role | Bind via namespaced RoleBinding to scope per tenant. |
| rbac.kueue.x-k8s.io/role label | Kueue roles | selector | kubectl get clusterrole -l rbac.kueue.x-k8s.io/role lists generated roles. |
Apply & verify
Apply tenant/admin bindings, then assert with impersonation. Do not apply the chart-managed GPU Operator, Network Operator, or DRA controller roles by hand; inspect them from the live release. auth can-i answers from the live RBAC graph without mutating anything.
kubectl apply -f kueue-rbac.yaml
# Controllers: confirm they HAVE what they need (expect: yes)
kubectl auth can-i create pods/eviction \
  --as=system:serviceaccount:gpu-operator:gpu-operator   # yes
kubectl auth can-i patch nodes \
  --as=system:serviceaccount:gpu-operator:gpu-operator   # yes
kubectl auth can-i create resourceslices.resource.k8s.io \
  --as=system:serviceaccount:nvidia-dra-driver-gpu:nvidia-dra-driver-gpu-service-account   # yes
# Controllers: confirm they LACK what they shouldn't have (expect: no)
kubectl auth can-i delete nodes \
  --as=system:serviceaccount:gpu-operator:gpu-operator   # no
kubectl auth can-i create clusterrolebindings \
  --as=system:serviceaccount:nvidia-dra-driver-gpu:nvidia-dra-driver-gpu-service-account   # no
# Tenants: scoped to their namespace only
# '<team-a-group>' is a placeholder for the group bound by team-a-batch-users
kubectl auth can-i create jobs --as-group='<team-a-group>' \
  --as=alice -n team-a   # yes
kubectl auth can-i create jobs --as-group='<team-a-group>' \
  --as=alice -n team-b   # no
kubectl auth can-i update clusterqueues.kueue.x-k8s.io --as-group='<team-a-group>' \
  --as=alice   # no (only batch-admins)
# Dump everything an identity can do (triage)
kubectl auth can-i --list \
  --as=system:serviceaccount:kueue-system:kueue-controller-manager
Expected signal: controller SAs return yes for their listed verbs and no for cluster-admin-shaped verbs (delete nodes, create clusterrolebindings); tenant identities return yes only inside their namespace. A controller pod in CrashLoopBackOff with forbidden in its log almost always means the chart RBAC was overridden or only partially applied. Re-run helm upgrade rather than patching the role by hand.
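The yes/no expectations above lend themselves to a table-driven check that can run in CI. The following is a minimal sketch, assuming `kubectl` is on PATH and the caller is allowed to impersonate; `Check`, `can_i_argv` and `failed_checks` are hypothetical names introduced here. It relies only on `kubectl auth can-i` printing `yes` or `no` on stdout, and the runner is injectable so the logic can be exercised without a cluster.

```python
import subprocess
from typing import NamedTuple, Optional, Sequence


class Check(NamedTuple):
    verb: str
    resource: str
    as_user: str
    expect: bool
    namespace: Optional[str] = None
    as_groups: Sequence[str] = ()


def can_i_argv(check: Check) -> list:
    """Build the kubectl auth can-i command line for one expectation."""
    argv = ["kubectl", "auth", "can-i", check.verb, check.resource, f"--as={check.as_user}"]
    argv += [f"--as-group={g}" for g in check.as_groups]
    if check.namespace:
        argv += ["-n", check.namespace]
    return argv


def kubectl_says_yes(argv: Sequence[str]) -> bool:
    """Run kubectl and interpret its answer; anything other than 'yes' counts as no."""
    result = subprocess.run(list(argv), capture_output=True, text=True)
    return result.stdout.strip().startswith("yes")


def failed_checks(checks, run=kubectl_says_yes) -> list:
    """Return the checks whose live answer differs from the expectation."""
    return [c for c in checks if run(can_i_argv(c)) != c.expect]


GPU_SA = "system:serviceaccount:gpu-operator:gpu-operator"
CHECKS = [
    Check("create", "pods/eviction", GPU_SA, True),
    Check("patch", "nodes", GPU_SA, True),
    Check("delete", "nodes", GPU_SA, False),
]
```

Calling `failed_checks(CHECKS)` against a live cluster returns an empty list when every expectation holds; a non-empty list names the exact grants that drifted.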
Failure modes
- Reading a denial. API-server and controller logs spell out the missing grant verbatim. Parse the four fields: verb, resource, ServiceAccount, and (for namespaced resources) namespace:
resourceslices.resource.k8s.io is forbidden:
User "system:serviceaccount:nvidia-dra-driver-gpu:nvidia-dra-driver-gpu-service-account"
cannot create resource "resourceslices" in API group "resource.k8s.io" at the cluster scope
Reproduce non-destructively: kubectl auth can-i create resourceslices.resource.k8s.io --as=system:serviceaccount:<ns>:<sa>. If no, the ClusterRole lacks the verb or the ClusterRoleBinding does not reference that SA.
- apiGroup/resourceNames mismatch. A rule for resourceslices (no group) does not grant resourceslices.resource.k8s.io. Resources outside the core group, whether CRD-backed (clusterqueues, clusterpolicies) or built in (resourceslices), must name the exact apiGroups. The bare-name vs FQDN confusion is the most common silent denial.
- Impersonating the controller SA as your normal grant. Operators run as their SA; humans must bind to kueue-batch-admin-role/kueue-batch-user-role, not --as the controller. Granting people the controller SA's token erases the audit trail and hands out node-eviction rights.
- ClusterRole from a namespaced RoleBinding is silently scoped. Referencing kueue-batch-user-role from a RoleBinding in team-a grants it only in team-a. That is intended here, but it is a frequent surprise when an admin expects cluster-wide effect. Use a ClusterRoleBinding for cluster-wide.
- PSA, not RBAC, blocks the privileged DaemonSets. If the namespace enforces baseline/restricted, the MOFED and driver/toolkit pods are rejected at admission: the DaemonSet reports FailedCreate events whose message says the pod violates PodSecurity, rather than an RBAC-style "cannot <verb> resource" denial. The fix is the pod-security.kubernetes.io/enforce=privileged label, not a role edit (security & multi-tenancy).
- Overriding chart RBAC. Hand-editing clusterrole/gpu-operator is reverted on the next helm upgrade (or causes drift if reconciliation is off). Customize through chart values; never edit the generated role in place.
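Reading a denial, as described in the first item above, can be mechanised. This sketch assumes the message follows the API server wording shown in that item (`User "..." cannot <verb> resource "..." in API group "..."` followed by either `at the cluster scope` or `in the namespace "..."`); `parse_denial` and `reproduce_command` are hypothetical helper names. It extracts the fields and builds the matching non-destructive `kubectl auth can-i` command.

```python
import re
from typing import Optional

_DENIAL = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\S+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" '
    r'(?:at the cluster scope|in the namespace "(?P<namespace>[^"]+)")'
)


def parse_denial(message: str) -> Optional[dict]:
    """Extract user, verb, resource, group and namespace from a forbidden message.

    Returns None when the text does not look like an RBAC denial.
    """
    flat = " ".join(message.split())  # log lines are often wrapped; collapse whitespace
    match = _DENIAL.search(flat)
    return match.groupdict() if match else None


def reproduce_command(denial: dict) -> str:
    """Build the kubectl auth can-i command that re-asks the same question."""
    target = denial["resource"]
    if denial["group"]:
        target += "." + denial["group"]  # group-qualified form, e.g. resourceslices.resource.k8s.io
    cmd = f'kubectl auth can-i {denial["verb"]} {target} --as={denial["user"]}'
    if denial["namespace"]:
        cmd += f' -n {denial["namespace"]}'
    return cmd
```

A `namespace` of `None` in the parsed result means the denial was at cluster scope, so the fix belongs in a ClusterRole and ClusterRoleBinding rather than a namespaced binding.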
References
- GPU Operator ClusterRole (source): https://github.com/NVIDIA/gpu-operator/blob/main/deployments/gpu-operator/templates/clusterrole.yaml
- GPU Operator ServiceAccount (source): https://github.com/NVIDIA/gpu-operator/blob/main/deployments/gpu-operator/templates/serviceaccount.yaml
- NVIDIA DRA Driver for GPUs (install): https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/dra-intro-install.html
- DRA driver repo: https://github.com/NVIDIA/k8s-dra-driver-gpu
- Kubernetes DRA driver RBAC (ServiceAccount/ClusterRole/ClusterRoleBinding): https://kubernetes.io/docs/tutorials/cluster-management/install-use-dra/
- Kueue RBAC (personas, roles, bindings): https://kueue.sigs.k8s.io/docs/tasks/manage/rbac/
- Network Operator (PSA privileged, RBAC values): https://docs.nvidia.com/networking/display/kubernetes2570/deployment-guide-kubernetes.html · https://github.com/Mellanox/network-operator
- Kubernetes RBAC reference: https://kubernetes.io/docs/reference/access-authn-authz/rbac/
- kubectl auth can-i: https://kubernetes.io/docs/reference/access-authn-authz/authorization/#checking-api-access
Related: GPU platform hub · Security & multi-tenancy · Kubernetes for GPUs · Telemetry · Glossary