TLDR:
When starting multiple pods that each use a MIG device, the devices are not packed onto one GPU before moving on to the next; instead they are placed randomly across the GPUs.
Full story:
I have a Deployment that requests 7 x 1g.10gb MIG devices on a node with NVIDIA H100 GPUs:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: abstract-mig-claim
  namespace: nvidia-dra-driver-gpu
  labels:
    app: abstract-mig-claim
spec:
  replicas: 7
  selector:
    matchLabels:
      app: abstract-mig-claim
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: abstract-mig-claim
    spec:
      restartPolicy: Always
      containers:
        - name: abstract-mig-claiming-pod
          image: docker.ops.iszn.cz/ftxt-gpu/cuda:13.1.1-runtime-ubuntu24.04
          command: ["sleep", "6000"]
          resources:
            claims:
              - name: mig-device
                request: mig-10gb
      resourceClaims:
        - name: mig-device
          resourceClaimTemplateName: at-least-10gb-mig-template
with this ResourceClaimTemplate:
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: at-least-10gb-mig-template
spec:
  spec:
    devices:
      requests:
        - name: mig-10gb
          exactly:
            deviceClassName: mig.nvidia.com
            selectors:
              - cel:
                  expression: |
                    device.capacity['gpu.nvidia.com'].multiprocessors.isGreaterThan(quantity("10"))
                    &&
                    device.capacity['gpu.nvidia.com'].memory.isGreaterThan(quantity("9Gi"))
      constraints:
        - requests: []
          matchAttribute: "gpu.nvidia.com/parentUUID"
When I apply the Deployment, the MIG devices are assigned randomly instead of filling up one GPU before moving on to the next:
GPU 0: NVIDIA H100 PCIe (UUID: GPU-c5dc08af-14bd-8444-4b8c-807a2b927bfc)
  MIG 1g.10gb Device 0: (UUID: MIG-7bfd1ff3-36c9-5943-801e-ee320b69b080)
  MIG 1g.10gb Device 1: (UUID: MIG-289084bd-9d24-58a1-b108-bd165d812ba0)
  MIG 1g.10gb Device 2: (UUID: MIG-d5da8058-8391-52ad-9a82-4d71b789ff88)
GPU 1: NVIDIA H100 PCIe (UUID: GPU-26feb2ed-98d7-c1b0-85cc-9742bde90813)
GPU 2: NVIDIA H100 PCIe (UUID: GPU-406c8304-24fb-4e0f-ac82-8cc39e5deabe)
  MIG 1g.10gb Device 0: (UUID: MIG-0eb84170-f5b5-5480-bb0b-1a68ec838b22)
GPU 3: NVIDIA H100 PCIe (UUID: GPU-75578d92-6462-98be-dc5e-9d90ec4d5fed)
  MIG 1g.10gb Device 0: (UUID: MIG-ca1bcc65-fefc-5bb1-9009-76b161bc870b)
  MIG 1g.10gb Device 1: (UUID: MIG-bc74463a-73bc-5c66-bb5e-78d95757f444)
  MIG 1g.10gb Device 2: (UUID: MIG-83a994de-db16-5943-9cc4-92eb9951048e)
This effectively makes most of the GPUs unusable by pods that require a full GPU (only 1 full GPU is left free when it could have been 3), so you would need to separate "MIG-able" pods from full-GPU pods.
I was thinking the allocation logic could be:
A new MIG device request comes in -> Is there a GPU that already has MIG devices? If YES -> Can it fit the requested device? If YES -> create the device on that GPU.
If the answer at any point is NO, move on to the next GPU, checking every already-partitioned GPU in the cluster before creating a MIG device on a GPU that has no MIG devices yet.
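The packing policy described above can be sketched roughly as follows. This is only an illustration of the proposed idea, not the driver's actual allocation code; the capacity constant and function name are made up for the example:

```python
# Hypothetical sketch of the proposed placement policy: pack a new MIG
# device onto a GPU that already hosts MIG devices before touching an
# empty one, so empty GPUs stay free for full-GPU pods.

GPU_SLICE_CAPACITY = 7  # an H100 fits up to 7 x 1g.10gb MIG devices


def place_mig_device(gpus: dict) -> str:
    """Pick a GPU UUID for one new 1g.10gb MIG device.

    gpus maps GPU UUID -> number of MIG devices already on it.
    Prefer the GPU with the most existing MIG devices that still has
    room; GPUs with no MIG devices (count 0) are chosen only when no
    partially filled GPU can fit the request.
    """
    candidates = [(count, uuid) for uuid, count in gpus.items()
                  if count < GPU_SLICE_CAPACITY]
    if not candidates:
        raise RuntimeError("no GPU can fit another 1g.10gb device")
    # Highest existing count wins, so placements pile onto one GPU
    # until it is full instead of spreading out.
    count, uuid = max(candidates)
    gpus[uuid] = count + 1
    return uuid
```

With this policy, placing the 7 replicas from the Deployment above onto a fresh 4-GPU node would fill a single GPU completely and leave the other three untouched for full-GPU pods.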