Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use COS for nodes in testing clusters by default, and bump COS. #52120

Merged

Conversation

abgworrall
Copy link
Contributor

@abgworrall abgworrall commented Sep 7, 2017

Addresses part of issue #51487. May assist with #51961 and #50695.

CVM is being deprecated, and falls out of support on 2017/10/01. We shouldn't run test jobs on it. So start using COS for all test jobs.

The default value of KUBE_NODE_OS_DISTRIBUTION for clusters created for testing will now be gci. Testjobs that do not specify this value will now run on clusters using COS (aka GCI) as the node OS, instead of CVM, the previous default.

This change only affects testing; non-testing clusters already use COS by default.

In addition, bump the version of COS from cos-stable-60-9592-84-0 to cos-stable-60-9592-90-0.

NONE

/cc @yujuhong, @mtaufen, @fejta, @krzyzacy

Addresses part of issue kubernetes#51487.

This is a big change for testing; any testjobs that do not
set an explicit KUBE_NODE_OS_DISTRIBUTION will have been running
on CVM, but after this PR will start running COS.

CVM is being deprecated, and falls out of support on 2018/10/01.

In addition, bump the patch version of COS from
cos-stable-60-9592-84-0 to cos-stable-60-9592-90-0.
@krzyzacy
Copy link
Member

krzyzacy commented Sep 7, 2017

ah nice, which means we don't have to set --gcp-node-image=gci anymore?

@abgworrall
Copy link
Contributor Author

@krzyzacy - to be honest, not sure; I don't know which layer of the config stack --gcp-node-image kicks in. There are a surprising number of layers ...

@krzyzacy
Copy link
Member

krzyzacy commented Sep 7, 2017

@abgworrall that's equivalent to KUBE_NODE_OS_DISTRIBUTION, once this is in we can kill that env :-)
do we plan to backport it to previous branches as well?

@k8s-github-robot k8s-github-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Sep 7, 2017
@yujuhong
Copy link
Contributor

yujuhong commented Sep 7, 2017

Should we also convert existing CVM jobs to use COS? If not, we'll not have any CVM coverage. The worst case would be that we are forced support 1.8+CVM without any test coverage.

@shyamjvs
Copy link
Member

shyamjvs commented Sep 7, 2017

/approve

@krzyzacy
Copy link
Member

krzyzacy commented Sep 7, 2017

So most of jobs has both cvm + gci coverage now, for master.

For 1.8, I'm only creating gci jobs, which should be fine if we are deprecating it before 1.8 is out?

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 7, 2017
@mtaufen
Copy link
Contributor

mtaufen commented Sep 7, 2017

@yujuhong we should maintain the tests that validate upgrade paths from CVM to other images. CVM won't be supported for 1.8.

@yujuhong
Copy link
Contributor

yujuhong commented Sep 7, 2017

@yujuhong we should maintain the tests that validate upgrade paths from CVM to other images. CVM won't be supported for 1.8.

@dchen1107, you didn't want to remove the CVM test coverage for 1.8 last time we spoke. Have you changed your mind? :-)

@abgworrall
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-etcd3

@abgworrall
Copy link
Contributor Author

I also recall a firm decision that 1.8 won't be expected to run on CVM, and so 1.8 will not be tested to run on CVM. This PR shouldn't go to 1.7; we still expect 1.7 to work on CVM.

@krzyzacy
Copy link
Member

krzyzacy commented Sep 7, 2017

sgtm :-)

@abgworrall
Copy link
Contributor Author

/unassign @bowei
/unassign @shyamjvs
/assign @mtaufen

@k8s-ci-robot k8s-ci-robot assigned mtaufen and unassigned bowei and shyamjvs Sep 8, 2017
@abgworrall
Copy link
Contributor Author

/sig node

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Sep 8, 2017
@abgworrall
Copy link
Contributor Author

/kind cleanup

@k8s-ci-robot k8s-ci-robot added the kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. label Sep 8, 2017
@abgworrall abgworrall added the kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. label Sep 8, 2017
@abgworrall
Copy link
Contributor Author

@dchen1107 - Dawn, do you want to veto / defer this ? I'd like to get it approved into 1.8 in the next day or two ...

@mtaufen
Copy link
Contributor

mtaufen commented Sep 11, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 11, 2017
@yujuhong
Copy link
Contributor

What's the plan for the current -gci- jobs? I assume they'll be removed?

@abgworrall
Copy link
Contributor Author

@yujuhong , I think that makes a ton of sense (with the exception of upgrade tests), and probs needs some manual verification that the non -gci- test doesn't hardcode CVM somehow.

@dchen1107
Copy link
Member

CVM is deprecated for 1.8 for sure, and I am ok to remove cvm test coverage now.
We should remove those dup gci test suites too, but that can come in a separate pr.

@dchen1107
Copy link
Member

/lgtm

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abgworrall, dchen1107, mtaufen, shyamjvs

Associated issue: 51487

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 11, 2017
@abgworrall
Copy link
Contributor Author

@kubernetes/kubernetes-release-managers - can you consider this for a 1.8 milestone label ? It should increase the reliability of some test jobs.

@dchen1107 dchen1107 added this to the v1.8 milestone Sep 11, 2017
@krzyzacy
Copy link
Member

/retest

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Sep 11, 2017

@abgworrall: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-e2e-gce-bazel d32b9a6 link /test pull-kubernetes-e2e-gce-bazel

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 52227, 52120)

@k8s-github-robot k8s-github-robot merged commit 01a4a60 into kubernetes:master Sep 12, 2017
@yujuhong
Copy link
Contributor

This is not that effective as most of the jobs have KUBE_NODE_OS_DISTRIBUTION=debian explicitly set in their test env file.

@krzyzacy
Copy link
Member

@yujuhong we want to delete those jobs right :-)

@yujuhong
Copy link
Contributor

@krzyzacy I thought we wanted to switch most jobs to using COS by default with this PR, and delete the COS-specific (-gci-) jobs?

@krzyzacy
Copy link
Member

@yujuhong either way works :-)

@yujuhong
Copy link
Contributor

The former seems easier to me, but yes, either way works. This PR itself didn't change anything though without the test-infra side change :-|

@xiangpengzhao
Copy link
Contributor

/release-note-none

This PR doesn't need a release note. A release-note label will lead to noise in CHANGELOG.

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Nov 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.