
example lvm deployment fails with storage class definition "volumeBindingMode: WaitForFirstConsumer" set #265

Closed
marblerun opened this issue Oct 12, 2023 · 9 comments
Labels: question (Further information is requested)

@marblerun

What steps did you take and what happened:
As part of an exercise to test online volume expansion, the current version of lvm-localpv was installed using Helm.

The worked example in the README was then followed, but it failed.

After working through the issue with online support, the line

volumeBindingMode: WaitForFirstConsumer

was removed from the StorageClass definition, after which the example worked as expected.

This parameter had been carried over from a previous StorageClass definition, where we had used OpenEBS LocalPV to create storage requests for a MongoDB cluster.
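For reference, a minimal sketch of what the StorageClass likely looked like after the fix; the metadata name is an assumption taken from the README example, while the volgroup matches the lvmvol output further down:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvmpv              # assumed name, per the README example
parameters:
  storage: lvm
  volgroup: lvmvg                  # matches volGroup in the lvmvol output below
provisioner: local.csi.openebs.io
# volumeBindingMode: WaitForFirstConsumer   <- the removed line; omitting it
# leaves the default binding mode, Immediate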

What did you expect to happen:

For the LVM volume to be created.

The output of the following commands will help us better understand what's going on:
(Pasting long output into a GitHub gist or other Pastebin is fine.)

From the failing pod, before the StorageClass change:

root@kube-1:~# kubectl -n openebs describe pod fio
Name: fio
Namespace: openebs
Priority: 0
Service Account: default
Node:
Labels:
Annotations:
Status: Pending
IP:
IPs:
Containers:
perfrunner:
Image: openebs/tests-fio
Port:
Host Port:
Command:
/bin/bash
Args:
-c
while true ;do sleep 50; done
Environment:
Mounts:
/mnt/datadir from fio-vol (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-95h98 (ro)
Conditions:
Type Status
PodScheduled False
Volumes:
fio-vol:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: csi-lvmpv
ReadOnly: false
kube-api-access-95h98:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional:
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message


Warning FailedScheduling 93s default-scheduler 0/4 nodes are available: 1 node(s) didn't find available persistent volumes to bind, 3 node(s) did not have enough free storage. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.

At this stage, all 4 nodes had an unused 20GB volume group defined and ready.
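For example, this could be confirmed on each node host with vgs; the VG name lvmvg is taken from the lvmvol output further down:

sudo vgs lvmvg    # VFree should show close to the full 20GB if the VG is unused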

  • kubectl logs -f openebs-lvm-controller-0 -n kube-system -c openebs-lvm-plugin
  • kubectl logs -f openebs-lvm-node-[xxxx] -n kube-system -c openebs-lvm-plugin

kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-75748cc9fd-9bv7n 1/1 Running 0 84d
calico-node-5gz47 1/1 Running 1 (84d ago) 177d
calico-node-ggcfz 1/1 Running 1 (84d ago) 177d
calico-node-ljgvl 1/1 Running 1 (84d ago) 120d
calico-node-s8pk6 1/1 Running 2 (84d ago) 177d
coredns-588bb58b94-45drn 1/1 Running 0 84d
coredns-588bb58b94-fkr4t 1/1 Running 0 84d
dns-autoscaler-5b9959d7fc-9dmlp 1/1 Running 0 84d
kube-apiserver-kube-1 1/1 Running 2 (84d ago) 177d
kube-apiserver-kube-2 1/1 Running 2 (84d ago) 177d
kube-apiserver-kube-3 1/1 Running 2 (84d ago) 177d
kube-controller-manager-kube-1 1/1 Running 2 (84d ago) 177d
kube-controller-manager-kube-2 1/1 Running 2 (84d ago) 177d
kube-controller-manager-kube-3 1/1 Running 3 (84d ago) 177d
kube-proxy-8wz2g 1/1 Running 1 (84d ago) 120d
kube-proxy-klt5x 1/1 Running 1 (84d ago) 120d
kube-proxy-qm52f 1/1 Running 1 (84d ago) 120d
kube-proxy-s7c42 1/1 Running 1 (84d ago) 120d
kube-scheduler-kube-1 1/1 Running 2 (84d ago) 177d
kube-scheduler-kube-2 1/1 Running 3 (84d ago) 177d
kube-scheduler-kube-3 1/1 Running 2 (84d ago) 177d
kubernetes-dashboard-74cc7bdb6d-52n49 1/1 Running 0 84d
kubernetes-metrics-scraper-75666d949b-fmxdp 1/1 Running 0 84d
local-volume-provisioner-h2gvj 1/1 Running 1 (84d ago) 177d
local-volume-provisioner-h5js2 1/1 Running 1 (84d ago) 120d
local-volume-provisioner-m8xwj 1/1 Running 1 (84d ago) 177d
local-volume-provisioner-v7gc6 1/1 Running 1 (84d ago) 177d
metrics-server-5dc9f5cf76-spdcq 1/1 Running 0 84d
nginx-proxy-kube-4 1/1 Running 1 (84d ago) 120d
nodelocaldns-88rcq 1/1 Running 1 (84d ago) 140d
nodelocaldns-dp55h 1/1 Running 2 (49d ago) 140d
nodelocaldns-n67jz 1/1 Running 1 (84d ago) 140d
nodelocaldns-rql67 1/1 Running 0 84d

kubectl get pods -n openebs
NAME READY STATUS RESTARTS AGE
fio 1/1 Running 0 92s
openebs-localpv-provisioner-68cb6c95f5-xvtpj 1/1 Running 0 28h
openebs-lvmlocalpv-lvm-localpv-controller-0 5/5 Running 0 28h
openebs-lvmlocalpv-lvm-localpv-node-2tpph 2/2 Running 0 28h
openebs-lvmlocalpv-lvm-localpv-node-lxmmq 2/2 Running 0 28h
openebs-lvmlocalpv-lvm-localpv-node-m5mfr 2/2 Running 0 28h
openebs-lvmlocalpv-lvm-localpv-node-prbd7 2/2 Running 0 28h
openebs-ndm-7b994 1/1 Running 0 28h
openebs-ndm-9xz7r 1/1 Running 0 28h
openebs-ndm-bzl6w 1/1 Running 0 28h
openebs-ndm-nbg6f 1/1 Running 0 28h
openebs-ndm-operator-54478658f7-btwmj 1/1 Running 0 28h

Working version, after removing volumeBindingMode:

root@kube-1:~# kubectl get lvmvol -nopenebs -o yaml
apiVersion: v1
items:
- apiVersion: local.openebs.io/v1alpha1
  kind: LVMVolume
  metadata:
    creationTimestamp: "2023-10-12T16:11:48Z"
    finalizers:
    - lvm.openebs.io/finalizer
    generation: 3
    labels:
      kubernetes.io/nodename: kube-1
    name: pvc-aad9bd20-32d9-4043-a426-932dc796ca57
    namespace: openebs
    resourceVersion: "48444508"
    uid: 9f04d2a8-0723-4b95-931d-1bc4a280e75d
  spec:
    capacity: "1073741824"
    ownerNodeID: kube-1
    shared: "no"
    thinProvision: "no"
    vgPattern: ^lvmvg$
    volGroup: lvmvg
  status:
    state: Ready
kind: List
metadata:
  resourceVersion: ""


Environment:

  • LVM Driver version
    root@kube-1:~# lvm version
    LVM version: 2.03.16(2) (2022-05-18)
    Library version: 1.02.185 (2022-05-18)
    Driver version: 4.47.0

  • Kubernetes version 1.25.5

  • Kubernetes installer & version: Kubespray 1:20

  • Cloud provider or hardware configuration: Hetzner cloud instance

  • OS (e.g. from /etc/os-release): Debian 12 bookworm

@abhilashshetty04
Contributor

Hi @marblerun, I remember you also had Mayastor in the cluster. Does WaitForFirstConsumer work with the Mayastor provisioner?

@marblerun
Author

Hi Abhilash,

We are a little delayed in our Mayastor work, so I currently can't say, but I hope to get around to it soon.

@dsharma-dc
Contributor

@marblerun
If you still face this issue, could you please provide the output of the vgs command from your node hosts, along with the PVC and StorageClass specs? The binding mode WaitForFirstConsumer normally works without any issues. In your case I see "not enough free storage" errors, so I would like to look for configuration issues.
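For reference, a sketch of how that information could be gathered; the PVC name matches the fio example above, and the StorageClass name is a placeholder:

sudo vgs                                        # on each node host: VG size and free space
kubectl -n openebs get pvc csi-lvmpv -o yaml    # the claim used by the fio pod
kubectl get sc <storageclass-name> -o yaml      # the StorageClass referenced by the claim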

@dsharma-dc dsharma-dc added the question Further information is requested label Jun 4, 2024
@ToroNZ

ToroNZ commented Jun 15, 2024

I experienced a similar issue today when setting WaitForFirstConsumer on the StorageClass:

$ k describe pod/opensearch-cluster-0
[...]
Volumes:
  data-volume:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  opensearch-data-volume-0
    ReadOnly:   false
[...]
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  3m2s  default-scheduler  0/4 nodes are available: 1 node(s) did not have enough free storage, 1 node(s) had untolerated taint {k3s-controlplane: true}, 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/4 nodes are available: 4 Preemption is not helpful for scheduling.

The PVC is for 20Gi.
The "1 node(s) did not have enough free storage" message from the default-scheduler is not right, as the node had plenty of space left:

worker01:~$ sudo vgs
  VG        #PV #LV #SN Attr   VSize   VFree  
  openebsvg   1   1   0 wz--n- <80.00g <60.00g

As soon as I changed the binding mode to Immediate the volume was created:

# kubectl
$ k get lvmvolumes -A
NAMESPACE   NAME                                       VOLGROUP    NODE       SIZE          STATUS   AGE
openebs     pvc-fa694269-6310-47ae-b85d-affe29e1827f   openebsvg   worker01   21474836480   Ready    7m54s

# worker bash
worker01:~$ sudo vgs
  VG        #PV #LV #SN Attr   VSize   VFree  
  openebsvg   1   2   0 wz--n- <80.00g <40.00g
worker01:~$ sudo lvs
  LV                                       VG        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  datadir-cockroachdb                      openebsvg -wi-ao---- 20.00g                                                    
  pvc-fa694269-6310-47ae-b85d-affe29e1827f openebsvg -wi-a----- 20.00g            

PVC yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: local.csi.openebs.io
    volume.kubernetes.io/storage-provisioner: local.csi.openebs.io
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    pvcFor: opensearch
  name: opensearch-data-volume-0
  namespace: monitoring
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: openebs-lvm-worker01
  volumeMode: Filesystem
  volumeName: pvc-fa694269-6310-47ae-b85d-affe29e1827f
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 20Gi
  phase: Bound

SC yaml:

allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: kubernetes.io/hostname
    values:
    - worker01
  - key: node-role.kubernetes.io/worker
    values:
    - worker
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2024-06-15T22:38:35Z"
  name: openebs-lvm-worker01
  resourceVersion: "700293"
  uid: 76cc72fc-7609-476c-bd41-d1f09c1a49ee
parameters:
  fsType: xfs
  storage: lvm
  volgroup: openebsvg
provisioner: local.csi.openebs.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

@ToroNZ

ToroNZ commented Jun 15, 2024

BTW - This testing is done using FCOS 40:

$ cat /etc/os-release 
NAME="Fedora Linux"
VERSION="40.20240519.3.0 (CoreOS)"
ID=fedora
VERSION_ID=40
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora CoreOS 40.20240519.3.0"
SUPPORT_END=2025-05-13
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='40.20240519.3.0'

It looks like the earlier "1 node(s) did not have enough free storage" error when using WaitForFirstConsumer occurred because SELinux was blocking access to the CSI socket. With that temporarily out of the way, trying WaitForFirstConsumer again gets this:

# kubectl
$ k describe pvc -n monitoring opensearch-data-volume-0
Name:          opensearch-data-volume-0
StorageClass:  openebs-lvm-worker01
Status:        Pending
[...]
Capacity:      
Access Modes:  
VolumeMode:    Filesystem
Used By:       opensearch-cluster-0
Events:
  Type     Reason                Age                     From                                                                                        Message
  ----     ------                ----                    ----                                                                                        -------
  Normal   WaitForFirstConsumer  9m53s (x2 over 9m53s)   persistentvolume-controller                                                                 waiting for first consumer to be created before binding
  Normal   ExternalProvisioning  4m23s (x24 over 9m48s)  persistentvolume-controller                                                                 Waiting for a volume to be created either by the external provisioner 'local.csi.openebs.io' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
  Normal   Provisioning          77s (x11 over 9m48s)    local.csi.openebs.io_openebs-lvm-localpv-controller-0_f6bc82bf-5b1f-4318-841c-1653544c7f49  External provisioner is provisioning volume for claim "monitoring/opensearch-data-volume-0"
  Warning  ProvisioningFailed    77s (x11 over 9m48s)    local.csi.openebs.io_openebs-lvm-localpv-controller-0_f6bc82bf-5b1f-4318-841c-1653544c7f49  failed to provision volume with StorageClass "openebs-lvm-worker01": error generating accessibility requirements: selected node '"worker01"' topology 'map[kubernetes.io/hostname:worker01 openebs.io/nodename:worker01]' is not in allowed topologies: [map[kubernetes.io/hostname:worker01 node-role.kubernetes.io/worker:worker]]

The following line is interesting: "topology 'map[kubernetes.io/hostname:worker01 openebs.io/nodename:worker01]' is not in allowed topologies"
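A plausible reading, not confirmed here by the maintainers: under WaitForFirstConsumer the external provisioner checks the selected node's CSI topology segments, which per the error message are only kubernetes.io/hostname and openebs.io/nodename, against allowedTopologies, so a key like node-role.kubernetes.io/worker can never match. A sketch of an allowedTopologies block that should pass that check:

allowedTopologies:
- matchLabelExpressions:
  - key: kubernetes.io/hostname   # a key the node actually reports in its topology
    values:
    - worker01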

@abhilashshetty04
Contributor

abhilashshetty04 commented Jun 17, 2024

Hi @marblerun ,

1 node(s) did not have enough free storage

Can you share the CSI provisioner log for this instance? It would be interesting to see what happened there.

kubectl logs -f openebs-lvm-localpv-controller-xxxxx-xxxx -n openebs -c openebs-lvm-plugin

Are you using the same StorageClass manifest that you shared below for WaitForFirstConsumer? If yes, why have you used key: node-role.kubernetes.io/worker?

allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: kubernetes.io/hostname
    values:
    - worker01
  - key: node-role.kubernetes.io/worker
    values:
    - worker

@ToroNZ

ToroNZ commented Jun 18, 2024

Hi @marblerun ,

1 node(s) did not have enough free storage

Can you share the CSI provisioner log for this instance? It would be interesting to see what happened there.

kubectl logs -f openebs-lvm-localpv-controller-xxxxx-xxxx -n openebs -c openebs-lvm-plugin

Are you using the same StorageClass manifest that you shared below for WaitForFirstConsumer? If yes, why have you used key: node-role.kubernetes.io/worker?

allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
  - key: kubernetes.io/hostname
    values:
    - worker01
  - key: node-role.kubernetes.io/worker
    values:
    - worker

FYI - You're quoting me, not the OP.

I just tried replicating this and I couldn't. Unfortunately I would have to re-create the nodes in order to replicate it, and I don't have time until next weekend :(

This is what SELinux reported at the time:

type=AVC msg=audit(1718432537.208:215475): avc:  denied  { connectto } for  pid=2115 comm="csi-node-driver" path="/plugin/csi.sock" scontext=system_u:system_r:container_t:s0:c56,c810 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=unix_stream_socket permissive=0

Once I created a policy for that, I ended up with the topology error described above.
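For anyone hitting the same denial, a minimal sketch of generating such a local policy module from the audit log; this assumes the audit2allow tooling is installed, and the module name openebs-csi is arbitrary:

sudo ausearch -m avc -ts recent | audit2allow -M openebs-csi   # build a module from recent AVC denials
sudo semodule -i openebs-csi.pp                                # load the generated policy module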

I use key: node-role.kubernetes.io/worker to avoid any chance of workloads/volumes landing on nodes that perform different roles.

@dsharma-dc
Contributor

Is this still an issue?

@avishnu
Member

avishnu commented Sep 19, 2024

Closing now. Feel free to re-open if the issue occurs again.

@avishnu avishnu closed this as completed Sep 19, 2024