
Fix vsphere detach bug #69691

Closed

Conversation

gnufied
Member

@gnufied gnufied commented Oct 11, 2018

Fixes Kubernetes not detaching volumes from shutdown nodes even after the pods that were using the volume have been deleted. Fixes #67900

This is a variant of #63413

/sig storage

Fix volume detaches from powered-off vSphere VMs

@k8s-ci-robot k8s-ci-robot added the do-not-merge/cherry-pick-not-approved label Oct 11, 2018
@k8s-ci-robot
Contributor

@gnufied: This PR is not for the master branch but does not have the cherry-pick-approved label. Adding the do-not-merge/cherry-pick-not-approved label.

To approve the cherry-pick, please assign the patch release manager for the release branch by writing /assign @username in a comment when ready.

The list of patch release managers for each release can be found here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the sig/storage, size/L, do-not-merge/release-note-label-needed, cncf-cla: yes, and needs-kind labels Oct 11, 2018
@gnufied
Member Author

gnufied commented Oct 11, 2018

/assign @SandeepPissay

@gnufied
Member Author

gnufied commented Oct 11, 2018

/sig vmware

@k8s-ci-robot k8s-ci-robot added the area/provider/vmware label Oct 11, 2018
@gnufied
Member Author

gnufied commented Oct 11, 2018

/test pull-kubernetes-verify

@gnufied
Member Author

gnufied commented Oct 11, 2018

/test pull-kubernetes-integration

Contributor

@abrarshivani abrarshivani left a comment


Deletion of entries from the configmap is missing. Do you plan to handle it in this PR?
Rest looks good to me.
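As an aside on that configmap cleanup: a minimal sketch of what removing a node's entry could look like, using the client-go API of that era (pre-context signatures). The configmap name, namespace, and key layout here are assumptions for illustration, not this PR's actual code.

```go
package example

import (
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

const (
	// Assumed namespace and name; the real configmap used by the PR may differ.
	vcpConfigMapNamespace = "kube-system"
	vcpConfigMapName      = "vsphere-cloud-provider-node-uuids"
)

// removeNodeEntry deletes the entry for nodeName from the configmap, if present.
func removeNodeEntry(client kubernetes.Interface, nodeName string) error {
	cm, err := client.CoreV1().ConfigMaps(vcpConfigMapNamespace).Get(vcpConfigMapName, metav1.GetOptions{})
	if err != nil {
		return fmt.Errorf("fetching configmap: %v", err)
	}
	if _, ok := cm.Data[nodeName]; !ok {
		return nil // nothing to clean up
	}
	delete(cm.Data, nodeName)
	_, err = client.CoreV1().ConfigMaps(vcpConfigMapNamespace).Update(cm)
	return err
}
```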

@abrarshivani
Contributor

@gnufied Can you please specify how you have tested this change?

@roberthbailey roberthbailey removed their request for review October 12, 2018 08:34
@gnufied
Member Author

gnufied commented Oct 12, 2018

@abrarshivani I am still testing it with ESXi 6.7. I will update this PR with confirmation.

@gnufied
Member Author

gnufied commented Oct 16, 2018

/test pull-kubernetes-integration

@gnufied
Member Author

gnufied commented Oct 16, 2018

@abrarshivani alright, so I have tested this with a multi-node cluster deployed using plain k8s on ESXi 6.7, and it does indeed appear to work: volumes are being detached without problems.

I am not handling removal of entries from the configmap, though. I think that is something that should be done when the node is terminated and removed from the vCenter inventory as well.

@nikopen
Contributor

nikopen commented Oct 16, 2018

@gnufied is this fix only needed in 1.11? Is any other issue linked with this PR (apart from #67825)?

@gnufied
Member Author

gnufied commented Oct 16, 2018

@nikopen this fix is needed for 1.9, 1.10, and 1.11 (or whichever k8s versions are still considered supported, except master). The PR you linked (#67825) just disables the multi-attach behaviour for vSphere; the underlying problem remains: in these versions we are NOT detaching volumes from shutdown nodes at all, even after the pods that were running on those nodes have been deleted.

@nikopen
Contributor

nikopen commented Oct 16, 2018

Indeed, #67825 simply worked around the issue.

Did something else fix this in 1.12 onwards?

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Oct 16, 2018
@abrarshivani
Contributor

@gnufied Thanks for testing.
/lgtm
/approve

@abrarshivani
Contributor

/approve

@saad-ali
Member

/approve

I'm ok with this. Changing RBAC rules in a minor release might be problematic, but I'm not the expert in that area.

/assign @liggitt

Member

@liggitt liggitt left a comment


If you want to specify the role/binding in a YAML file for vSphere deployers to install, that seems more reasonable than setting this up on all clusters, even non-vSphere ones.

@@ -489,6 +489,13 @@ func ClusterRoles() []rbacv1.ClusterRole {
eventsRule(),
},
},
{
ObjectMeta: metav1.ObjectMeta{Name: "system:vsphere-cloud-provider"},
Member


This is a really broad permission (global read/write of all configmaps), which doesn't seem necessary. Also, we are actively working to remove provider-specific roles from the default policy set up on all clusters (aws was the only one added before this was well-considered, and is being removed in
#66635)

Member Author


Reduced the scope to just the namespace. It is no longer created as a ClusterRole.
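For illustration, a namespaced Role along those lines might look like the sketch below, using the same rbacv1 types as the diff above; the role name, namespace, and verb list are assumptions, not necessarily the exact contents of this PR.

```go
package example

import (
	rbacv1 "k8s.io/api/rbac/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// vsphereCloudProviderRole returns a Role scoped to a single namespace,
// instead of a cluster-wide ClusterRole granting access to all configmaps.
func vsphereCloudProviderRole() rbacv1.Role {
	return rbacv1.Role{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "system:vsphere-cloud-provider", // assumed name
			Namespace: metav1.NamespaceSystem,          // i.e. kube-system
		},
		Rules: []rbacv1.PolicyRule{
			{
				APIGroups: []string{""},
				Resources: []string{"configmaps"},
				Verbs:     []string{"get", "list", "watch", "create", "update"},
			},
		},
	}
}
```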

@gnufied
Member Author

gnufied commented Oct 17, 2018

@liggitt just so I understand this correctly: your recommendation is to provide a YAML file with the necessary roles and bindings, and then it is the job of the installation tool to actually apply that YAML. Is there any other example of this?

@liggitt
Member

liggitt commented Oct 17, 2018

Similar to the YAML manifests for the default storage class when running in a particular cloud provider env:

https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/storage-class/aws/default.yaml

@liggitt
Member

liggitt commented Oct 18, 2018

@nikopen we don't need this in 1.12 because we are not deleting Node API objects when a node is shut down, and hence we don't need to save the node's UUID in a configmap.

Backporting that change to 1.11 seems like a more reasonable approach than this.

@gnufied
Member Author

gnufied commented Oct 18, 2018

@nikopen we don't need this in 1.12 because we are not deleting Node API objects when a node is shut down, and hence we don't need to save the node's UUID in a configmap.

Backporting that change to 1.11 seems like a more reasonable approach than this.

I am not comfortable with changing the lifecycle of Node API objects in a minor release, with a backport all the way back to 1.9. We may have customer scripts/utilities etc. that depend on it. Also, we don't know what breaks in Kubernetes itself if we do that. Having said that, I am not super familiar with the vSphere code and I will have to rely on @SandeepPissay @abrarshivani to verify whether that is something worth considering.

VCP fails to detach disks from a powered-off node that is still in the
vCenter inventory. That happens because VCP tries to detach disks after
the node manager has already unregistered the node's information.

Use configmap as a fallback when node is not available
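A rough sketch of the control flow this commit describes: try the live node manager first and fall back to the persisted configmap entry. The type and method names here are hypothetical, chosen only to illustrate the idea, not taken from the PR's actual code.

```go
package example

import "fmt"

// NodeManager knows about nodes that are currently registered with the
// vSphere cloud provider.
type NodeManager interface {
	// GetNodeUUID returns the vSphere VM UUID for a registered node.
	GetNodeUUID(nodeName string) (string, error)
}

// ConfigMapStore is the persisted node-name -> UUID mapping.
type ConfigMapStore interface {
	// LookupUUID returns the UUID saved for a node, if any.
	LookupUUID(nodeName string) (string, bool)
}

// nodeUUIDForDetach prefers the live node manager, but falls back to the
// configmap cache when the node has already been unregistered (e.g. the
// Node API object was deleted after the VM was powered off).
func nodeUUIDForDetach(nm NodeManager, cm ConfigMapStore, nodeName string) (string, error) {
	if uuid, err := nm.GetNodeUUID(nodeName); err == nil {
		return uuid, nil
	}
	if uuid, ok := cm.LookupUUID(nodeName); ok {
		return uuid, nil
	}
	return "", fmt.Errorf("no UUID known for node %q; cannot detach its volumes", nodeName)
}
```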
@k8s-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@k8s-ci-robot k8s-ci-robot removed the lgtm label Oct 18, 2018
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: abrarshivani, gnufied, saad-ali
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: eparis

If they are not already assigned, you can assign the PR to them by writing /assign @eparis in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@gnufied
Member Author

gnufied commented Oct 18, 2018

@liggitt moved the RBAC policy installation to a cluster add-on. It is currently only installed via ./hack/local-up-cluster.sh since ./kube-up.sh does not support vSphere. PTAL.

Install cloud-provider-specific RBAC policies
@gnufied
Member Author

gnufied commented Oct 18, 2018

/test pull-kubernetes-e2e-gce

@gnufied
Member Author

gnufied commented Oct 18, 2018

/test pull-kubernetes-e2e-kops-aws

@SandeepPissay
Contributor

@nikopen we don't need this in 1.12 because we are not deleting Node API objects when a node is shut down, and hence we don't need to save the node's UUID in a configmap.

Backporting that change to 1.11 seems like a more reasonable approach than this.

I am not comfortable with changing the lifecycle of Node API objects in a minor release, with a backport all the way back to 1.9. We may have customer scripts/utilities etc. that depend on it. Also, we don't know what breaks in Kubernetes itself if we do that. Having said that, I am not super familiar with the vSphere code and I will have to rely on @SandeepPissay @abrarshivani to verify whether that is something worth considering.

@gnufied If the node object is not deleted on node shutdown, then there is no need for this change. Are you sure that the node object is not getting deleted? Is this the behavior in earlier releases of k8s? My understanding is that it is not the case. So, how do you plan to fix this issue for earlier k8s releases?

@gnufied
Member Author

gnufied commented Oct 18, 2018

@SandeepPissay yes, in releases prior to 1.12 the node object is deleted when a node is shut down, and that is why I proposed this PR only for the 1.9, 1.10, and 1.11 releases.

What @liggitt is asking is: instead of this PR (which uses a configmap cache to store the node name/UUID mapping), can we backport the PR that changed the node deletion behavior on shutdown?

@nikopen
Contributor

nikopen commented Oct 19, 2018

FYI 1.9 is not active anymore - it's just 1.10 and 1.11, per the three actively supported versions.

@nikopen
Contributor

nikopen commented Oct 19, 2018

IMO, since at this point this is about bug fixing for previous releases, if backporting the 1.12 PR has the same effect (fixes the bug) then it's a viable option. Do you expect any different behaviour with that, or would extra PRs be needed as well?

@SandeepPissay
Contributor

Backporting the 1.12 PR is a better solution. Do you know which PR fixes the node removal bug?

@gnufied
Member Author

gnufied commented Oct 19, 2018

The PR that changed the lifecycle of node objects for vSphere nodes is #66619, but that PR alone isn't enough to fix the bug in question because it relies on a new cloud provider interface function, InstanceShutdownByProviderID, which was introduced by a different PR - #59323 (which exists in 1.11 but not in 1.10).

It is possible to rework a patch that just changes the shutdown behaviour for VMs and isn't a backport of the 1.12 PR. Another thing to keep in mind: changing the lifecycle of node objects (i.e. keeping them around when a node is shut down) won't fix the detach issue by itself. Volumes still will not be detached, but the user will then have the option of force-deleting pods to force detach of volumes from powered-down nodes. So vSphere behaviour basically becomes the same as GCE's.
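For reference, a paraphrased sketch of the cloud provider hook mentioned above, roughly as it appears in the in-tree cloudprovider package around 1.11/1.12 (not a verbatim copy of the interface):

```go
package example

import "context"

// Instances is a small subset of the cloudprovider.Instances interface.
type Instances interface {
	// InstanceShutdownByProviderID returns true if the instance is shut down
	// (powered off) but still exists in the cloud provider's inventory. The
	// node lifecycle logic can use this to keep the Node API object around
	// instead of deleting it, which is what makes force-deleting pods enough
	// to trigger volume detach.
	InstanceShutdownByProviderID(ctx context.Context, providerID string) (bool, error)
}
```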

@nikopen
Contributor

nikopen commented Oct 24, 2018

@gnufied so if #66619 is backported to 1.10 + 1.11 and #59323 to 1.10, it should be good to go?

@gnufied
Member Author

gnufied commented Oct 26, 2018

@nikopen I have opened a new, smaller PR - #70291 - which just changes the behaviour of shutdown nodes. If we backport it, then at least when a user force-deletes their pods, the volumes will be detached.

@gnufied
Member Author

gnufied commented Nov 10, 2018

Since we went with a different solution, closing.
/close

Labels
area/provider/vmware - Issues or PRs related to vmware provider
cncf-cla: yes - Indicates the PR's author has signed the CNCF CLA.
do-not-merge/cherry-pick-not-approved - Indicates that a PR is not yet approved to merge into a release branch.
kind/bug - Categorizes issue or PR as related to a bug.
release-note - Denotes a PR that will be considered when it comes time to generate release notes.
sig/storage - Categorizes an issue or PR as relevant to SIG Storage.
size/XL - Denotes a PR that changes 500-999 lines, ignoring generated files.