Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option for turning on/off compaction from apiserver in etcd3 mode #51765

Merged
merged 1 commit into from
Oct 3, 2017

Conversation

mitake
Copy link
Contributor

@mitake mitake commented Sep 1, 2017

…erver

What this PR does / why we need it:

This commit adds an option for controlling request of compaction to
etcd3 from apiserver. There is a situation that apiserver cannot fully
own its etcd cluster (e.g. sharing it with canal). In such a case,
apiserver should have limited access in terms of etcd's auth
functionality so it don't have a privilege to issue compaction
requests. It means that the compaction requests should be issued by
other component and apiserver's compaction requests are needless.

For such use cases, this commit adds a new flag
storagebackend.Config.DoCompaction. If the flag is true (default),
apiserver issues the compaction requests like current behaviour. If it
is false, apiserver doesn't issue the requests.

Related issue (etcd)
etcd-io/etcd#8458
/cc @xiang90 @struz

Release note:

Add --etcd-compaction-interval to apiserver for controlling request of compaction to etcd3 from apiserver.

@k8s-ci-robot
Copy link
Contributor

@mitake: GitHub didn't allow me to request PR reviews from the following users: struz.

Note that only kubernetes members can review this PR, and authors cannot review their own PRs.

In response to this:

…erver

What this PR does / why we need it:

This commit adds an option for controlling request of compaction to
etcd3 from apiserver. There is a situation that apiserver cannot fully
own its etcd cluster (e.g. sharing it with canal). In such a case,
apiserver should have limited access in terms of etcd's auth
functionality so it don't have a privilege to issue compaction
requests. It means that the compaction requests should be issued by
other component and apiserver's compaction requests are needless.

For such use cases, this commit adds a new flag
storagebackend.Config.DoCompaction. If the flag is true (default),
apiserver issues the compaction requests like current behaviour. If it
is false, apiserver doesn't issue the requests.

Notes
This PR is very experimental, actually it doesn't provide a way of controlling the flag for users (I'm currently trying to test it with minikube. The flag isn't tested yet at all!). I'd like to have feedback about this idea is acceptable by k8s or not.

Related issue (etcd)
etcd-io/etcd#8458
/cc @xiang90 @struz

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 1, 2017
@k8s-ci-robot
Copy link
Contributor

Hi @mitake. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 1, 2017
@k8s-github-robot
Copy link

Adding do-not-merge/release-note-label-needed because the release note process has not been followed.
One of the following labels is required "release-note", "release-note-action-required", "release-note-experimental" or "release-note-none".
Please see: https://github.com/kubernetes/community/blob/master/contributors/devel/pull-requests.md#write-release-notes-if-needed.

@k8s-github-robot k8s-github-robot added the do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. label Sep 1, 2017
@mitake mitake changed the title storage, etcd3: add an option for turning on/off compaction from apis… WIP, RFC: storage, etcd3: add an option for turning on/off compaction from apis… Sep 1, 2017
@@ -50,6 +50,9 @@ type Config struct {
Copier runtime.ObjectCopier
// Transformer allows the value to be transformed prior to persisting into etcd.
Transformer value.Transformer

// DoCompaction is a flag to turn on or off requesting compaction from apiserver.
DoCompaction bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with exposing this as an option - I've already heard about customers who would like to control it themselves too.

I'm not sure if DoCompaction is the best name though.
I think that "auto-compaction" should be default and you should be able to opt-out. That would mean I would suggest changing to:

DisableCompaction

[with default false].

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment. I'll rename the flag to DisableCompaction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtek-t BTW can I add DisableCompaction to EtcdOptions instead of storagebackend.Config? It seems that we don't have a way of configuring storagebackend.Config by command line option or other ways.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - that is fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtek-t sorry I found a very simple way to do this... please ignore the above comment

@mitake
Copy link
Contributor Author

mitake commented Sep 7, 2017

@wojtek-t I renamed the field name to DisableCompaction and added an option of apiserver (--etcd-disable-compaction) for controlling the field. I'll test it on a real cluster tomorrow. Should I add some tests for the new option and field?

@@ -127,6 +127,10 @@ func (s *EtcdOptions) AddFlags(fs *pflag.FlagSet) {

fs.StringVar(&s.EncryptionProviderConfigFilepath, "experimental-encryption-provider-config", s.EncryptionProviderConfigFilepath,
"The file containing configuration for encryption providers to be used for storing secrets in etcd")

fs.BoolVar(&s.StorageConfig.DisableCompaction, "etcd-disable-compaction", s.StorageConfig.DisableCompaction,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably we should make it clear that if this option is enabled, user MUST compact the underlying etcd3 storage themselves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense. I'll update the description later. Thanks.

@mitake
Copy link
Contributor Author

mitake commented Sep 8, 2017

@xiang90 @wojtek-t I updated the description and briefly tested it and at least it wouldn't be breaking existing things. Could you trigger the tests? Or should I add new tests or a release note (sorry I'm not familiar with PR merging process of k8s...)?

@smarterclayton
Copy link
Contributor

smarterclayton commented Sep 8, 2017 via email

@mitake
Copy link
Contributor Author

mitake commented Sep 11, 2017

@smarterclayton what does "apichunking" mean (sorry I'm not familiar with k8s terminologies)?
Do you mean that the option should be able to specify the interval of compaction (etcd3.compactInterval)? If this understanding is correct, I agree with your opinion so will change the flag from bool to duration.

@wojtek-t
Copy link
Member

In this context apichunking means "pagination of the LIST responses".
I agree with Clayton`s suggestion to make it a duration.

@xiang90
Copy link
Contributor

xiang90 commented Sep 11, 2017

@mitake

The API server specifies etcd revision to get a consistent snapshot for pagination. It can achieve isolation with it.

However, if compaction compacts a revision that a client is ranging over, the client has to abort the range and start over.

Controlling the compaction interval can hint the client how fast it should finish the range without getting a abort.

@mitake
Copy link
Contributor Author

mitake commented Sep 13, 2017

@wojtek-t @xiang90 I see, thanks a lot for your explanation! I'll update this PR later.

@mitake mitake force-pushed the etcd3-compaction branch 2 times, most recently from 5dfa49b to 3894f6d Compare September 13, 2017 07:32
@mitake
Copy link
Contributor Author

mitake commented Sep 13, 2017

@xiang90 @wojtek-t @smarterclayton I changed the type of option from bool to duration and the name to --etd-compaction-interval. In my brief test, the option seems to be working well. Could you take a look?

@smarterclayton
Copy link
Contributor

Thank you for changing - high level this looks ok to me.

@wojtek-t wojtek-t changed the title WIP, RFC: storage, etcd3: add an option for turning on/off compaction from apis… Add an option for turning on/off compaction from apiserver in etcd3 mode Sep 13, 2017
@wojtek-t
Copy link
Member

/lgtm

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 13, 2017
@mitake
Copy link
Contributor Author

mitake commented Sep 14, 2017

It seems that pull-kubernetes-kubemark-e2e-gce is failing because of disk full. I'm not sure this should be categorized as fleaky test. Should I open an issue for this?

@wojtek-t
Copy link
Member

@mitake - the problem with full disk is not job specific - it's about test infrastructure. We are seeing it in many different jobs on many different PRs.

@wojtek-t
Copy link
Member

/lgtm
/approve no-issue

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 14, 2017
@wojtek-t wojtek-t added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Sep 14, 2017
@mitake
Copy link
Contributor Author

mitake commented Sep 14, 2017

@wojtek-t ah sorry I added a release note to the PR description. Should I remove it (you added release-note-none tag)?

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mitake, wojtek-t

Associated issue requirement bypassed by: wojtek-t

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@mitake
Copy link
Contributor Author

mitake commented Sep 14, 2017

@wojtek-t and I understand the situation about the disk full issue, thanks.

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 14, 2017
@wojtek-t wojtek-t added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Sep 14, 2017
@wojtek-t
Copy link
Member

Actually, I think this release note makes sense - changed the label.

@mitake
Copy link
Contributor Author

mitake commented Sep 14, 2017

I see, thanks for updating the label.

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@mitake
Copy link
Contributor Author

mitake commented Sep 15, 2017

/retest

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@k8s-github-robot
Copy link

/test all

Tests are more than 96 hours old. Re-running tests.

@mitake
Copy link
Contributor Author

mitake commented Sep 30, 2017

/retest (the unit test isn't triggered yet)

BTW, can this be pushed to the submit queue because v1.8 is now released?

@wojtek-t
Copy link
Member

wojtek-t commented Oct 2, 2017

/retest

@wojtek-t
Copy link
Member

wojtek-t commented Oct 3, 2017

/test pull-kubernetes-unit

@k8s-github-robot
Copy link

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 51765, 53053, 52771, 52860, 53284). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 5dfea9e into kubernetes:master Oct 3, 2017
@mitake mitake deleted the etcd3-compaction branch October 4, 2017 00:30
k8s-github-robot pushed a commit that referenced this pull request Feb 2, 2018
#59106-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #51765 #59106 upstream release 1.8  

Cherry pick of #51765 #59106 on release-1.8.

#51765 : Add an option for turning on/off compaction from apiserver in etcd3 mode
#59106 : Expose etcd compaction time via environmental variable in GCE
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants