Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial basic bootstrap-checkpoint support #50984

Merged
merged 2 commits into from
Nov 22, 2017

Conversation

timothysc
Copy link
Member

@timothysc timothysc commented Aug 20, 2017

What this PR does / why we need it:
Adds initial support for Pod checkpointing to allow for controlled recovery of the control plane during self host failure conditions.

fixes #49236
xref kubernetes/enhancements#378

Special notes for your reviewer:

Proposal is here: https://docs.google.com/document/d/1hhrCa_nv0Sg4O_zJYOnelE8a5ClieyewEsQM6c7-5-o/edit?ts=5988fba8#

  1. Controlled tests work, but I have not tested the self hosted api-server recovery, that requires validation and logs. /cc @luxas
  2. In adding hooks for checkpoint manager much of the tests around basicpodmanager appears to be stub'd. This has become an anti-pattern in the code and should be avoided.
  3. I need a node-e2e to ensure consistency of behavior.

Release note:

Add basic bootstrap checkpointing support to the kubelet for control plane recovery

/cc @kubernetes/sig-cluster-lifecycle-misc @kubernetes/sig-node-pr-reviews

@timothysc timothysc added do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. kind/enhancement release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Aug 20, 2017
@timothysc timothysc added this to the v1.8 milestone Aug 20, 2017
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 20, 2017
@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 20, 2017
@timothysc
Copy link
Member Author

I will rebase post review and conversation. I will also post the design proposal writeup from https://docs.google.com/document/d/1hhrCa_nv0Sg4O_zJYOnelE8a5ClieyewEsQM6c7-5-o/edit?ts=5988fba8# today.

Copy link
Member

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a quick initial pass.
As we said earlier, the naming of this feature will be crucial to not cause unneeded confusion.

I didn't see anything that'd remove the Static Pod once the pivot is done; WIP, right?
Also I didn't review the overall flow in detail, but am starting the conversation with these basic comments.

@@ -232,6 +232,8 @@ func AddKubeletConfigFlags(fs *pflag.FlagSet, c *componentconfig.KubeletConfigur
fs.MarkDeprecated("experimental-fail-swap-on", "This flag is deprecated and will be removed in future releases. please use --fail-swap-on instead.")

fs.StringVar(&c.PodManifestPath, "pod-manifest-path", c.PodManifestPath, "Path to to the directory containing pod manifest files to run, or the path to a single pod manifest file. Files starting with dots will be ignored.")
fs.StringVar(&c.CheckpointPath, "checkpoint-path", c.CheckpointPath, "<Warning: Alpha feature> Path to to the directory where the checkpoints are stored")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--experimental-BOOTSTRAPPING-checkpoint-path 😉?
We talked about scoping this down a lot and being clear about the functionality at this time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with @smarterclayton's suggestion for the flag experimental-pod-preservation-path

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm changing to --bootstrap-checkpoint-path per: ' https://github.com/kubernetes/community/pull/1241/files '

Putting 'experimental' or 'alpha' into names has been known to cause issues during promotion and we've moved to updating descriptions and documentation. The name change should suffice.

@@ -72,6 +72,10 @@ const (
// This annotation can be attached to node.
ObjectTTLAnnotationKey string = "node.alpha.kubernetes.io/ttl"

// CheckpointAnnotationKey represents a Resource (Pod) that should be checkpointed by
// the kubelet prior to running
CheckpointAnnotationKey string = "node.kubernetes.io/checkpoint"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alpha.node.kubernetes.io/bootstrapping-checkpointing?
node.alpha.kubernetes.io/bootstrapping-checkpointing?

I guess that when this gets graduated we'll use an alpha/beta/whatever field using the grand new proposal we have for that, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope alpha has been removed from the naming conventions per api-machinery folks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! That's a great step forwards 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm keeping the annotation name the same for the purpose of potential re-use. The documentation update will call out the details and limitations.

@@ -9,6 +9,8 @@ openapi_library(
name = "go_default_library",
srcs = ["doc.go"],
openapi_targets = [
"_output/dockerized/go/src/github.com/miekg/coredns/vendor/k8s.io/client-go/1.5/pkg/runtime",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error?

@@ -271,6 +271,9 @@ type KubeletConfiguration struct {
// podManifestPath is the path to the directory containing pod manifests to
// run, or the path to a single manifest file
PodManifestPath string `json:"podManifestPath"`
// checkpointPath is the path to the directory where the kubelet will checkpoint
// annotated pod manifests and configmaps before they are run
CheckpointPath string `json:"checkpointPath"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BootstrappingCheckpointPath?

)

const (
delimiter = "_"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment

}

// loadCheckpoint will load 'Resource_Name' Checkpoint yaml file.
func (fcp *fileCheckPointManager) loadPod(file string) (*v1.Pod, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use "k8s.io/kubernetes/pkg/volume/util".LoadPodFromFile(filename string) for this instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohh hey, that is handy, I'll update.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

return pod, nil
}

// checkAnnotations will validate the annotations are there.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: checkAnnotations will check whether the bootstrapping checkpointing annotation is set on the pod object


// NewCheckpointManager will create a NewCheckpointManager that points to the following path
func NewCheckpointManager(path string) Manager {
return &fileCheckPointManager{path: path}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes something like var _ Manager = &fileCheckPointManager{} not needed, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, and it's consistent with the rest of the Kubelet.

@@ -195,6 +195,9 @@ type KubeletConfiguration struct {
// podManifestPath is the path to the directory containing pod manifests to
// run, or the path to a single manifest file
PodManifestPath string
// checkpointPath is the path to the directory where the kubelet will checkpoint
// annotated pod manifests and configmaps before they are run
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not implemented yet, right?
Should we consider adding this to the comment once we actually add this feature?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll nix the configmaps bit there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@luxas luxas self-assigned this Aug 20, 2017
@@ -646,6 +646,7 @@ function start_kubelet {
sudo -E "${GO_OUT}/hyperkube" kubelet ${priv_arg}\
--v=${LOG_LEVEL} \
--vmodule="${LOG_SPEC}" \
--fail-swap-on=false \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit. spacing

@timothysc
Copy link
Member Author

@yujuhong - I'm pending on your feedback before I rebase and address other comments.

/cc @roberthbailey

@timothysc
Copy link
Member Author

/cc @liggitt - Did you have opinions on the startup ordering. This is a PoC and comments are solicited.

@timothysc timothysc requested a review from liggitt August 24, 2017 15:54
return nil, nil, err
}
if len(pods) > 0 {
c <- kubetypes.PodUpdate{Pods: pods, Op: kubetypes.SET, Source: kubetypes.ApiserverSource}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

won't this make the kubelet kill any currently running containers not in the checkpointed list?

Copy link
Member Author

@timothysc timothysc Aug 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope, it only seeds the cache which will need to run through the checks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you sure? this will add the ApiserverSource to the sourcesReady list before we've actually gotten a full list from the API server. That allows AllReady() to return true, which allows housekeeping to run HandlePodCleanups() prior to us knowing what we should be running from the API server. Won't that delete running containers?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt is right. Kubelet assumes the first update from a source includes the complete list of pods. Once all sources are ready, GC will kick in to delete containers that do not belong to a known pod.

Wiring this so that kubelet considers the source is not ready is required.

@smarterclayton
Copy link
Contributor

smarterclayton commented Aug 24, 2017

Can we rename the flag and pr so it's clear that this isn't actually checkpointing (as discussed on the meeting)? I haven't seen a checkpointing design, I've seen something much more limited with no design consideration given to the long term problems.

// Writes a checkpoint to a file on disk if annotation is present
func (fcp *fileCheckPointManager) WritePod(pod *v1.Pod) error {
var err error
if fcp.checkAnnotations(pod) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this means that removing the checkpoint annotation from a pod will never clean it up (here or in DeletePod)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll create a UID cache so updates that remove cause a checkpoint delete.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@timothysc
Copy link
Member Author

timothysc commented Aug 24, 2017

Can we rename the flag and pr so it's clear that this isn't actually checkpointing (as discussed on the meeting)?

Gotta name?

I haven't seen a checkpointing design, I've seen something much more limited with no design consideration given to the long term problems.

Per the meeting, it was explicitly stated we would defer future use of such facilities would require their own proposals, but I'll distill the 2 docs down into concise language stating it explicitly and push today.

@smarterclayton
Copy link
Contributor

Maybe experimental-pod-preservation-path for the flag and call it Pod Preservation or Kubelet Pod Memory or something that implies recall without the more precise term.

Copy link
Contributor

@yujuhong yujuhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some tricky cases to consider for implementation. Will need to think about it more...

@@ -646,6 +646,7 @@ function start_kubelet {
sudo -E "${GO_OUT}/hyperkube" kubelet ${priv_arg}\
--v=${LOG_LEVEL} \
--vmodule="${LOG_SPEC}" \
--fail-swap-on=false \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed in this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

return nil, nil, err
}
if len(pods) > 0 {
c <- kubetypes.PodUpdate{Pods: pods, Op: kubetypes.SET, Source: kubetypes.ApiserverSource}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liggitt is right. Kubelet assumes the first update from a source includes the complete list of pods. Once all sources are ready, GC will kick in to delete containers that do not belong to a known pod.

Wiring this so that kubelet considers the source is not ready is required.

@@ -154,6 +159,11 @@ func (pm *basicManager) UpdatePod(pod *v1.Pod) {
pm.lock.Lock()
defer pm.lock.Unlock()
pm.updatePodsInternal(pod)
if pm.checkpointManager != nil {
if err := pm.checkpointManager.WritePod(pod); err != nil {
glog.Errorf("Error writing checkpoint for pod: %v", pod.GetName())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the expected behavior if this happens? kubelet simply won't restore the pod at startup?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, at some point intervention will be required, and given this is for bootstrapping this is ok.

for _, pod := range pods {
glog.Infof("Starting Checkpoint %v - Name (%v)", pod.GetUID(), pod.GetName())
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is that if the pod gets deleted from the apiserver while kubelet is down, you would NOT receive a DELETE event from pkg/kubelet/config/apiserver.go through the regular channel because the cache in the pkg/kubelet/config would be cold. That means you will keep the pods restored from the checkpoint forever.

That's one of the reasons why I suggested also looking into storing the checkpoints there (pkg/kubelet/config).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed following your example.

var err error
if fcp.checkAnnotations(pod) {
if blob, err := yaml.Marshal(pod); err == nil {
err = ioutil.WriteFile(fcp.getPodPath(pod), blob, 0644)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ioutil.WriteFile doesn't call f.Sync(). You'll likely end up with corrupted files, which we may not be able to detect due to the lack of checksum?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is the write atomic? Do we need to write to a temp file and move it here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. we need to write a temp file, fsync it and rename.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, using safefile instead.

@@ -232,6 +232,8 @@ func AddKubeletConfigFlags(fs *pflag.FlagSet, c *componentconfig.KubeletConfigur
fs.MarkDeprecated("experimental-fail-swap-on", "This flag is deprecated and will be removed in future releases. please use --fail-swap-on instead.")

fs.StringVar(&c.PodManifestPath, "pod-manifest-path", c.PodManifestPath, "Path to to the directory containing pod manifest files to run, or the path to a single pod manifest file. Files starting with dots will be ignored.")
fs.StringVar(&c.CheckpointPath, "checkpoint-path", c.CheckpointPath, "<Warning: Alpha feature> Path to to the directory where the checkpoints are stored")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with @smarterclayton's suggestion for the flag experimental-pod-preservation-path

@timothysc
Copy link
Member Author

/test pull-kubernetes-unit

@timothysc timothysc changed the title Initial basic checkpoint support Initial basic bootstrap-checkpoint support Nov 15, 2017
Copy link
Contributor

@yujuhong yujuhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR mostly looks good.

Besides the basic checkpoint write/restore testing (which should be a node e2e test suite), I'd like to see testing for the following cases:

  1. Update the pods in the apiserver while kubelet is down. Ensure kubelet restores the pods from the checkpoints, AND gets the updates from the apiserver correctly (i.e., eventually gets the updated pods).
  2. Delete the pods from the apiserver while kubelet is down. Ensure kubelet properly deletes pods (and the checkpoints).
  3. Change annotation of a pod and observe checkpoints are handled properly.

As we agreed in our last meeting, manual testing is fine for these corner cases for now. Eventually, it'd be nice to have some test coverage in our CI.

@@ -54,6 +54,9 @@ type KubeletConfiguration struct {
// podManifestPath is the path to the directory containing pod manifests to
// run, or the path to a single manifest file
PodManifestPath string `json:"podManifestPath"`
// bootstrapCheckpointPath is the path to the directory containing pod checkpoints to
// run on restore
BootstrapCheckpointPath string `json:"checkpointPath"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/json:"checkpointPath/json:"bootstrapCheckpointPath

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -58,6 +58,9 @@ type KubeletConfiguration struct {
// podManifestPath is the path to the directory containing pod manifests to
// run, or the path to a single manifest file
PodManifestPath string
// bootstrapCheckpointPath is the path to the directory containing pod checkpoints to
// run on restore
BootstrapCheckpointPath string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke to @mtaufen offline. He has removed all Alpha features from the kubeletconfig, unless they are feature-gated.
To make it consistent with other fields, the suggestion is either

  1. Remove this from the configuration, but keep it as a flag, or
  2. Feature gate this properly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, removing this from the configuration may make node e2e testing harder, just FYI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, and just to add some context:

  1. Removing it from the configuration for now doesn't prevent you from adding it back later, once you're more confident in the stability of your API and behavior (e.g. no longer an alpha feature).
  2. If you do want an alpha feature in the API, these are the guidelines regarding feature gating it properly: https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#alpha-field-in-existing-api-version

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He has removed all Alpha features from the kubeletconfig, unless they are feature-gated.

@yujuhong This is patently untrue.

NodeLabels, VolumePluginDir, CPUManagerPolicy, LockFilePath are all alpha labeled features that exist in the kubeletconfig. The usage of componentconfig is also alpha, from what I've seen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do whatever, but your statement doesn't reconcile with what is there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timothysc CPUManagerPolicy is feature gated, which is why it remains.
NodeLabels, VolumePluginDir, and LockFilePath were all moved from KubeletConfiguration to KubeletFlags in #55562, because they are alpha but lack feature gates.
This is in preparation for moving KubeletConfiguration to beta.
Please feature gate your field if you want to leave it in KubeletConfiguration or otherwise move it to KubeletFlags.
If you have more questions, feel free to ping me on Slack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, following #55562 as reference.


// NewCheckpointManager will create a Manager that points to the following path
func NewCheckpointManager(path string) Manager {
mutex.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: add a comment to indicate that this is just for precaution; current implementation do/should not run multiple checkpoint managers.

// Get just the Resource from "Resource_Name"
fnfields := strings.Split(fname, delimiter)
switch fnfields[0] {
case "Pod":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for making it a constant.

fnfields := strings.Split(fname, delimiter)
switch fnfields[0] {
case "Pod":
pod, err := fcp.loadPod(fmt.Sprintf("%v/%v", fcp.path, f.Name()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd prefer %s too since it's more clear.

func (fcp *fileCheckPointManager) DeletePod(pod *v1.Pod) error {
var err error
podPath := fcp.getPodPath(pod)
if _, err := os.Stat(podPath); !os.IsNotExist(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think you need to Stat first.

	if err := os.Remove(podPath); !os.IsNotExist(err) {
         return err
    }
    return nil

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@@ -233,7 +233,7 @@ func LoadPodFromFile(filePath string) (*v1.Pod, error) {
}
pod := &v1.Pod{}

codec := legacyscheme.Codecs.LegacyCodec(legacyscheme.Registry.GroupOrDie(v1.GroupName).GroupVersion)
codec := legacyscheme.Codecs.UniversalDecoder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why changing this? I'm not familiar with the codecs, so just a question.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In testing that f(n) it was a bug.

@@ -293,9 +293,20 @@ func makePodSourceConfig(kubeCfg *kubeletconfiginternal.KubeletConfiguration, ku
glog.Infof("Adding manifest url %q with HTTP header %v", kubeCfg.ManifestURL, manifestURLHeader)
config.NewSourceURL(kubeCfg.ManifestURL, manifestURLHeader, nodeName, kubeCfg.HTTPCheckFrequency.Duration, cfg.Channel(kubetypes.HTTPSource))
}

// Restore from the checkpoint path
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment warning that this must happen before creating the apiserver source below, or the checkpoint would override the source of truth (apiserver).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

// the pod checkpoints and may be incomplete. Do not mark the
// source as ready.

// Mark the source ready after receiving at least one update from the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the original (now duplicated) comment below (line 1880)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

} else {
// This is to handle an edge where a pod update could remove
// an annotation and the checkpoint should then be removed.
err = fcp.DeletePod(pod)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, every update of a regular pod without the checkpoint annotation will trigger a deletion, which either calls os.Stat or os.Remove. That doesn't seem good. If you plan to optimize that in the future, please add a TODO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The stat precedes any delete, and again this is an opt-in behavior on master nodes for control plane bootstrapping, not general purpose checkpointing, but I can add a TODO to optimize if more generally used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated below.

@k8s-github-robot
Copy link

[MILESTONENOTIFIER] Milestone Pull Request Current

@luxas @timothysc @yujuhong

Note: If this pull request is not resolved or labeled as priority/critical-urgent by Wed, Nov 22 it will be moved out of the v1.9 milestone.

Pull Request Labels
  • sig/cluster-lifecycle sig/node: Pull Request will be escalated to these SIGs if needed.
  • priority/important-soon: Escalate to the pull request owners and SIG owner; move out of milestone after several unsuccessful escalation attempts.
  • kind/feature: New functionality.
Help

@timothysc
Copy link
Member Author

/test pull-kubernetes-e2e-gce-device-plugin-gpu

if err != nil {
return nil, err
}
for _, f := range files {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this possibly contain temporary files written by safefile.WriteFile? Could you check and ignore them?

Copy link
Contributor

@yujuhong yujuhong Nov 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code would print an warning, but it's quite annoying. Okay for an alpha feature though, as long as those temporary files are properly ignored. Please do check if that's the case.

@yujuhong
Copy link
Contributor

Looks good overall. Look forward to seeing the followup test PRs.

@yujuhong
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 21, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: timothysc, yujuhong
We suggest the following additional approver: thockin

Assign the PR to them by writing /assign @thockin in a comment when ready.

Associated issue: 49236

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@timothysc timothysc added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 21, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 55812, 55752, 55447, 55848, 50984). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 277d866 into kubernetes:master Nov 22, 2017
@aoxn
Copy link
Contributor

aoxn commented Dec 13, 2017

Is that mean that i have to power up the right machine which has apiserver.pod on to survive cluster power failure? I cant bring up my failed cluster back after power up the right node. Am i right? @timothysc

@luxas
Copy link
Member

luxas commented Dec 13, 2017

@spacexnice Please discuss that in an issue or on Slack. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RFE: Bootstrap Checkpointing - Modify manifest behavior slightly for self hosting.