Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached #54826

rohitagarwal003 · 2017-10-30T23:19:27Z

- Instead of the old Accelerators feature that added the alpha.kubernetes.io/nvidia-gpu resource, use the new DevicePlugins feature that adds vendor-specific resources. (In the case of nvidia GPUs, it will add the nvidia.com/gpu resource.)
- Add a node label to GCE nodes with accelerators attached. This node label is the same as the one GKE attaches to node pools with accelerators attached. (For example, for the nvidia-tesla-p100 GPU, the label would be cloud.google.com/gke-accelerator=nvidia-tesla-p100.) This will help us target accelerator-specific daemonsets etc. to these nodes.
- Run the nvidia-gpu device-plugin daemonset as an addon on GCE nodes that have nvidia GPUs attached.
- Some minor documentation improvements in addon manager.

Release note:

GCE nodes with NVIDIA GPUs attached now expose `nvidia.com/gpu` as a resource instead of `alpha.kubernetes.io/nvidia-gpu`.

/sig cluster-lifecycle
/sig scheduling
/area hw-accelerators

kubernetes/enhancements#368

rohitagarwal003 · 2017-10-30T23:19:45Z

/assign @vishh @jiayingz @roberthbailey

roberthbailey · 2017-10-30T23:21:47Z

I'll review for approval once this has an lgtm from @vishh

jiayingz · 2017-10-31T00:18:15Z

cluster/addons/addon-manager/README.md

@@ -1,25 +1,25 @@
 ### Addon-manager

-addon-manager manages two classes of addons with given template files.
+addon-manager manages two classes of addons with given template files in `$ADDON_PATH`.


If we are going to introduce $ADDON_PATH earlier here, maybe we should also move its default setting note "(default /etc/kubernetes/addons/)" here?

jiayingz

lgtm. Just a small nit.

vishh · 2017-11-01T00:04:26Z

/lgtm
/approve

MrHohn · 2017-11-01T00:28:27Z

cluster/addons/device-plugins/nvidia-gpu/daemonset.yaml

+          requiredDuringSchedulingIgnoredDuringExecution:
+            nodeSelectorTerms:
+            - matchExpressions:
+              - key: cloud.google.com/gke-accelerator


Curious: how is this different from using the nodeSelector field?

nodeSelector can only do key=value checks. I wanted a key-exists check, because I want to run it on nodes that have nvidia-tesla-k80 or nvidia-tesla-p100 as the value, or any value we may add later.

Thanks, good to know :)
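
To make the distinction in this thread concrete, here is a minimal bash sketch of the two matching rules applied to a comma-separated label string. It only mimics the semantics; it is not how the scheduler evaluates nodeSelector or node affinity, and the function names `label_equals` and `label_exists` are hypothetical.

```shell
#!/usr/bin/env bash
# Illustrative only: mimics the two matching rules on a comma-separated
# "key=value" label string. Function names are hypothetical.

# nodeSelector-style check: the key must be present with exactly this value.
label_equals() {
  local labels=",$1," key="$2" value="$3"
  [[ "${labels}" == *",${key}=${value},"* ]]
}

# Exists-style check: the key must be present, whatever its value.
label_exists() {
  local labels=",$1," key="$2"
  [[ "${labels}" == *",${key}="* ]]
}

k80_node="kubernetes.io/hostname=node-1,cloud.google.com/gke-accelerator=nvidia-tesla-k80"

# An equality check pinned to p100 misses the k80 node...
label_equals "${k80_node}" cloud.google.com/gke-accelerator nvidia-tesla-p100 \
  || echo "equality check on p100: no match"
# ...while the exists check matches any accelerator type.
label_exists "${k80_node}" cloud.google.com/gke-accelerator \
  && echo "exists check: match"
```

With an equality rule, the daemonset would need one entry per accelerator type; the exists rule covers types added later without a manifest change.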

rohitagarwal003 · 2017-11-01T01:18:14Z

^ Fixed a typo.

vishh · 2017-11-01T17:50:13Z

/lgtm

roberthbailey · 2017-11-02T18:34:15Z

cluster/addons/addon-manager/README.md


 Notes:
 - Label `kubernetes.io/cluster-service=true` is deprecated (only for Addon Manager).
 In future release (after one year), Addon Manager may not respect it anymore. Addons
 have this label but without `addonmanager.kubernetes.io/mode=EnsureExists` will be
 treated as "reconcile class addons" for now.
- Resources under $ADDON_PATH (default `/etc/kubernetes/addons/`) needs to have either one
-of these two labels. Meanwhile namespaced resources need to be in `kube-system` namespace.
+- Resources under `$ADDON_PATH` needs to have either one of these two labels.


nit: s/needs/need

Thanks. Fixed.

roberthbailey · 2017-11-02T18:37:48Z

cluster/gce/config-default.sh

@@ -182,7 +182,10 @@ RUNTIME_CONFIG="${KUBE_RUNTIME_CONFIG:-}"
 FEATURE_GATES="${KUBE_FEATURE_GATES:-ExperimentalCriticalPodAnnotation=true}"

 if [[ ! -z "${NODE_ACCELERATORS}" ]]; then
-    FEATURE_GATES="${FEATURE_GATES},Accelerators=true"
+    FEATURE_GATES="${FEATURE_GATES},DevicePlugins=true"


How does switching this work across upgrades? If it doesn't, that should be mentioned in the release note for this PR.

Accelerators is an alpha feature. DevicePlugins is also an alpha feature (and the replacement for Accelerators).

I had added the following release note to the PR:

GCE nodes with NVIDIA GPUs attached now expose `nvidia.com/gpu` as a resource instead of `alpha.kubernetes.io/nvidia-gpu`.

which captures the difference between clusters created using the old script and clusters created using the new script. But I haven't actually tried upgrading a cluster that had the old flag to a cluster with the new flag. In GKE, we don't have to worry about this because we don't allow alpha cluster upgrade.

What should be the right release note here?
cc - @vishh

Upgrades are not really supported when alpha features are turned on, so I don't see much value in thinking about the upgrade/downgrade scenario. We recommend that users stick to specific versions, and our workflow has changed considerably across releases while in alpha.

I think the release note as it is LGTM

That makes sense; I wasn't sure where this was on the alpha -> ga slider. Even with alpha features it's nice to have a release note so that people using them will know how we've changed things and I agree that the release note looks good.

It would be pretty easy to run ./cluster/upgrade.sh to upgrade a GCE cluster and see what happens to the nodes w.r.t. labels. I'm guessing that they would change to the new label, but not entirely sure.

@roberthbailey , I tried the following:

    # Last commit from master on this branch
    git checkout 55e216f56eac0082acc6be655d9ae09cf9ba38a8
    go run hack/e2e.go -- -v --build
    export NODE_ACCELERATORS=type=nvidia-tesla-k80,count=2
    export KUBE_NODE_OS_DISTRIBUTION=gci
    export KUBE_GCE_ZONE=us-west1-b
    export KUBE_GCE_NODE_IMAGE=gke-1-8-2-gke-0-cos-stable-60-9592-90-0-v171103-pre-nvda-gpu
    export KUBE_GCE_NODE_PROJECT=gke-node-images
    cluster/kube-up.sh

    # Latest commit on this branch
    git checkout cf292754ba423aa6782564ea83fe48cc1ed677d4
    go run hack/e2e.go -- -v --build
    export NODE_ACCELERATORS=type=nvidia-tesla-k80,count=2
    export KUBE_NODE_OS_DISTRIBUTION=gci
    export KUBE_GCE_ZONE=us-west1-b
    export KUBE_GCE_NODE_IMAGE=gke-1-8-2-gke-0-cos-stable-60-9592-90-0-v171103-pre-nvda-gpu
    export KUBE_GCE_NODE_PROJECT=gke-node-images
    cluster/gce/upgrade.sh -l

Master got the new label cloud.google.com/gke-accelerator=nvidia-tesla-k80. Master also had the correct feature gate, DevicePlugins=true, set. However, I'm not sure how to check the node upgrade, because upgrade.sh doesn't support upgrading nodes to local binaries.

roberthbailey · 2017-11-02T18:38:19Z

cluster/gce/config-default.sh

-    FEATURE_GATES="${FEATURE_GATES},Accelerators=true"
+    FEATURE_GATES="${FEATURE_GATES},DevicePlugins=true"
+    if [[ "${NODE_ACCELERATORS}" =~ .*type=([a-zA-Z0-9-]+).* ]]; then
+        NODE_LABELS="${NODE_LABELS},cloud.google.com/gke-accelerator=${BASH_REMATCH[1]}"


what are the cases where we want to enable deviceplugins but not set node labels?

Eventually, we would enable device plugins by default (there would be no feature gate), so line 185 would go away.

Ideally, we would also like each node that has a special device to get a node label. Lines 186-188 do that: they see a special device (an accelerator) and add a label to the node for it. As long as GCE APIs follow the convention of specifying accelerators as type=TYPE,count=COUNT, this line will continue to work. Once GCE adds devices that are not accelerators, we will have to add more logic here.

That didn't really answer my question but looking at the code again, is the reason for the conditional here to capture the device type?
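
The behavior discussed in this thread can be sketched as a small standalone script. This is an illustration under stated assumptions, not the contents of config-default.sh: the function name `apply_accelerator_config` and the starting value of `NODE_LABELS` are hypothetical, and the pattern is stored in a variable rather than written inline.

```shell
#!/usr/bin/env bash
# Sketch of the logic discussed above. The function name and the starting
# values below are hypothetical, chosen only for illustration.

apply_accelerator_config() {
  local node_accelerators="$1"
  # Keeping the pattern in a variable avoids regex-quoting differences
  # between bash versions.
  local type_re='type=([a-zA-Z0-9-]+)'

  if [[ -n "${node_accelerators}" ]]; then
    # Any accelerator at all switches the (alpha) feature gate on.
    FEATURE_GATES="${FEATURE_GATES},DevicePlugins=true"
    # The inner test doubles as the capture: the label is added only when
    # a type can be extracted from the type=TYPE,count=COUNT string.
    if [[ "${node_accelerators}" =~ ${type_re} ]]; then
      NODE_LABELS="${NODE_LABELS},cloud.google.com/gke-accelerator=${BASH_REMATCH[1]}"
    fi
  fi
}

FEATURE_GATES="ExperimentalCriticalPodAnnotation=true"
NODE_LABELS="example.com/placeholder=true"   # hypothetical pre-existing label
apply_accelerator_config "type=nvidia-tesla-k80,count=2"
echo "${FEATURE_GATES}"
echo "${NODE_LABELS}"
```

In this sketch, an empty accelerator string leaves both variables untouched, and a non-empty string that does not contain `type=...` enables the gate without adding a label.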

roberthbailey · 2017-11-02T18:38:57Z

cluster/gce/config-default.sh

-    FEATURE_GATES="${FEATURE_GATES},Accelerators=true"
+    FEATURE_GATES="${FEATURE_GATES},DevicePlugins=true"
+    if [[ "${NODE_ACCELERATORS}" =~ .*type=([a-zA-Z0-9-]+).* ]]; then
+        NODE_LABELS="${NODE_LABELS},cloud.google.com/gke-accelerator=${BASH_REMATCH[1]}"


is BASH_REMATCH portable across at least linux & mac (and ideally cygwin)? This runs client side so it needs to work on places where we run kube-up.sh.

According to http://git.savannah.gnu.org/cgit/bash.git/tree/CHANGES?h=bash-4.4#n4839, BASH_REMATCH was added in bash-3.0 which was released in 2004

I tested it on my mac and it works there. I don't have access to a Windows machine but will ask someone with windows to test it on cygwin/mingw. Thanks for pointing this out.

Okay, I ran a Windows VM on GCP, installed cygwin and tested this if block. It works as expected. Fun experience! :D
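
For anyone who wants to repeat this kind of portability check on another platform, a self-check along the following lines might be enough. It is a sketch, not the test that was actually run here; the function name `bash_rematch_works` is hypothetical.

```shell
#!/usr/bin/env bash
# Self-check sketch; the function name is hypothetical.

bash_rematch_works() {
  local re='type=([a-zA-Z0-9-]+)'
  # Per the bash CHANGES file cited in this thread, BASH_REMATCH was added
  # in bash 3.0, so older major versions are rejected up front.
  (( ${BASH_VERSINFO[0]:-0} >= 3 )) || return 1
  # Run the same kind of match the script relies on and verify the capture.
  [[ "type=nvidia-tesla-p100,count=1" =~ ${re} ]] || return 1
  [[ "${BASH_REMATCH[1]}" == "nvidia-tesla-p100" ]]
}

if bash_rematch_works; then
  echo "BASH_REMATCH capture works in bash ${BASH_VERSION}"
else
  echo "BASH_REMATCH capture is not usable in this shell" >&2
fi
```

Running it under the bash shipped with the target environment (macOS, cygwin, mingw) exercises the same capture that kube-up.sh depends on client side.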

roberthbailey · 2017-11-02T18:39:53Z

Also please squash your commits.

rohitagarwal003

Also please squash your commits.

The four commits are doing four different things:

- improving documentation of addon-manager.
- enabling DevicePlugins instead of Accelerators for GPU nodes.
- adding labels to nodes that have accelerators attached.
- making nvidia-gpu-device-plugin an addon.

I would prefer to keep them separate unless you feel strongly otherwise.

roberthbailey · 2017-11-06T21:25:48Z

Are the commits really separable? Would we roll one back but not the others? I can see an argument for the addon manager documentation, but the other three seem like they are the same change and need to be applied or rolled back atomically.

rohitagarwal003 · 2017-11-06T21:31:38Z

They are separable: we can roll back just the last one, which adds the add-on; or the last two, which add the add-on and the node labels; or all three. However, the last one depends on the other two, so if we roll back either of them, we should roll the last one back as well.

rohitagarwal003 · 2017-11-10T19:42:36Z

/cc @mikedanese

roberthbailey · 2017-11-13T19:20:47Z

/lgtm
/approve

k8s-github-robot · 2017-11-13T19:21:29Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mindprince, roberthbailey, vishh

Associated issue: 368

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

~~cluster/OWNERS~~ [roberthbailey]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

k8s-github-robot · 2017-11-13T19:21:31Z

/test all

Tests are more than 96 hours old. Re-running tests.

k8s-github-robot · 2017-11-13T22:46:54Z

Automatic merge from submit-queue (batch tested with PRs 54826, 53576, 55591, 54946, 54825). If you want to cherry-pick this change to another branch, please follow the instructions here.

Update URLs for nvidia gpu device plugin and nvidia driver installer.

Device plugin is now an addon and its manifest is now in kubernetes/kubernetes. The manifest on GoogleCloudPlatform/container-engine-accelerators no longer contains device plugin.

This is needed after #54826 and GoogleCloudPlatform/container-engine-accelerators#25

**Release note**:

```release-note
NONE
```

/sig scheduling

k8s-ci-robot assigned jiayingz, roberthbailey and vishh Oct 30, 2017

jiayingz reviewed Oct 31, 2017

rohitagarwal003 force-pushed the addon-manager branch from 8f0e152 to 1663212 Compare October 31, 2017 23:46

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 1, 2017

MrHohn reviewed Nov 1, 2017

rohitagarwal003 force-pushed the addon-manager branch from 1663212 to 05b5b04 Compare November 1, 2017 01:17

k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 1, 2017

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 1, 2017

roberthbailey reviewed Nov 2, 2017

rohitagarwal003 added 4 commits November 2, 2017 12:58

Remove redundant comment and improve documentation.

3de7e5a

The comment is also present in lines 143-145 where it makes more sense.

Add node label to GCE nodes with accelerators attached.

9c7baf9

This node label is the same as what GKE attaches to node pools with accelerators attached. This will help us target accelerator specific daemonsets etc. to these nodes.

Run nvidia-gpu device-plugin daemonset as an addon on GCE nodes that …

cf29275

…have nvidia GPUs attached.

rohitagarwal003 force-pushed the addon-manager branch from 05b5b04 to cf29275 Compare November 2, 2017 19:58

k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 2, 2017

rohitagarwal003 commented Nov 2, 2017

k8s-ci-robot requested a review from mikedanese November 10, 2017 19:42

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 13, 2017

k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 13, 2017

k8s-github-robot merged commit 4f91113 into kubernetes:master Nov 13, 2017

rohitagarwal003 mentioned this pull request Nov 14, 2017

Update URLs for nvidia gpu device plugin and nvidia driver installer. #55737

Merged
