Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubeadm: introduce the UnversionedKubeletConfigMap feature gate #105741

Conversation

neolit123
Copy link
Member

@neolit123 neolit123 commented Oct 18, 2021

What this PR does / why we need it:

introduce a new kubeadm specific FG called UnversionedKubeletConfigMap.
this feature controls the old vs new formatting of the kubelet config map used for storing the KubeletConfiguration.

also:

  • adapt unit tests and e2e tests
  • remove unused function DownloadConfig in kubelet.go

Which issue(s) this PR fixes:

xref kubernetes/kubeadm#1582

Special notes for your reviewer:

Does this PR introduce a user-facing change?

kubeadm: add the kubeadm specific, Alpha (disabled by default) feature gate UnversionedKubeletConfigMap. When this feature is enabled kubeadm will start using a new naming format for the ConfigMap where it stores the KubeletConfiguration structure. The old format included the Kubernetes version - "kube-system/kubelet-config-1.22", while the new format does not - "kube-system/kubelet-config". A similar formatting change is done for the related RBAC rules. The old format is now DEPRECATED and will be removed after the feature graduates to GA. When writing the ConfigMap kubeadm (init, upgrade apply) will respect the value of UnversionedKubeletConfigMap, while when reading it (join, reset, upgrade), it would attempt to use new format first and fallback to the legacy format if needed.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-cluster-lifecycle/kubeadm/2915-kubeadm-replace-kubelet-config-x.y
- [Enhancement tracking issue]: https://github.com/kubernetes/enhancements/issues/2915

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 18, 2021
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 18, 2021
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 18, 2021
@neolit123 neolit123 changed the title kubeaadm: introduce the UnversionedKubeletConfigMap feature gate kubeadm: introduce the UnversionedKubeletConfigMap feature gate Oct 19, 2021
@neolit123 neolit123 force-pushed the 1.23-kubeadm-kubelet-config-map-change branch from a660e96 to 569e63f Compare October 21, 2021 17:33
@neolit123 neolit123 marked this pull request as ready for review October 21, 2021 17:33
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 21, 2021
@neolit123
Copy link
Member Author

/triage accepted
/kind feature
/priority important-soon

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 21, 2021
@neolit123
Copy link
Member Author

/milestone v1.23

@neolit123 neolit123 force-pushed the 1.23-kubeadm-kubelet-config-map-change branch 3 times, most recently from 9a00126 to 608d165 Compare October 25, 2021 19:42
@pacoxu
Copy link
Member

pacoxu commented Nov 2, 2021

The legacy bool works well but looks a little odd to me. The PR LGTM.
Let me re-think the implementation if there are some other ways and get back to review this PR tommorrow. 😄

// During the first "kubeadm upgrade apply" when the feature gate goes "true" by default and
// a preferred user value is missing in the ClusterConfiguration, "kubeadm upgrade apply" will try
// to fetch using the new format and that CM will not exist yet.
// Tollerate both the old a new format until UnversionedKubeletConfigMap goes GA and is locked.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: tollerate -> tolerate

Copy link
Member

@pacoxu pacoxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123, pacoxu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link
Member

@fabriziopandini fabriziopandini left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions about the upgrade path for existing clusters.

  • users not expressing any preference on ConfigMap naming; legacy name will be converted to new unified name during the first upgrade after the default value for the feature flag has been switched from false to true. Is that right? Is the old configMap/RBAC rule being deleted or not?
  • users that wants to opt in in the new naming schema early; they have to force the feature gate to true in the kubeadm-config ConfigMap before an upgrade, no matter of the the default value for the feature flag. Is that right?
  • users that wants to stick on the legacy schema naming ConfigMap; in oder to prevent kubeadm moving to the new naming schema automatically, they have to force the feature gate to false in the kubeadm-config ConfigMap before the first upgrade after the default value for the feature flag has been switched from false to true. Is that right?

Can we have tests for those scenarios? currently all the upgrade logic is implicit in kubeletConfigFromCluster, which is really nested in the call of chains and it is kind of hard to validate at first sight..

cmd/kubeadm/app/componentconfigs/configset_test.go Outdated Show resolved Hide resolved
cmd/kubeadm/app/componentconfigs/configset_test.go Outdated Show resolved Hide resolved
cmd/kubeadm/app/componentconfigs/configset_test.go Outdated Show resolved Hide resolved
@neolit123
Copy link
Member Author

neolit123 commented Nov 3, 2021

the summary for the current state of the patch:

  • during writes (init, post-upgrade) kubeadm will choose the CM/RBAC names based on the FG.
  • during reads (join, reset, pre-upgrade) kubeadm will first try the new CM naming format and if it fails it will fallback to the old naming independent of FG
  • kubeadm does not read the RBACs, but if users read them they need to adapt manually.

users not expressing any preference on ConfigMap naming; legacy name will be converted to new unified name during the first upgrade after the default value for the feature flag has been switched from false to true. Is that right? Is the old configMap/RBAC rule being deleted or not?

currently kubeadm does not remove old CM/RBACs during upgrade.
so an old cluster will have old the ...x.yy objects.
cleanup of the old names was stated as "maybe" in the KEP.

users that wants to opt in in the new naming schema early; they have to force the feature gate to true in the kubeadm-config ConfigMap before an upgrade, no matter of the the default value for the feature flag. Is that right?

if the user sets the FG to true in the ClusterConfiguration before upgrade, kubeadm will try to read the new naming and fallback to the old naming (this is the current state of the patch). after upgrade kubeadm will write the relevant CM in the new location.

users that wants to stick on the legacy schema naming ConfigMap; in oder to prevent kubeadm moving to the new naming schema automatically, they have to force the feature gate to false in the kubeadm-config ConfigMap before the first upgrade after the default value for the feature flag has been switched from false to true. Is that right?

yes, if the FG is forced to false, during upgrade kubeadm will write to the old location, during join, reset, etc kubeadm will try to use the new name and if it fails it will fallback to the old name.

Can we have tests for those scenarios? currently all the upgrade logic is implicit in kubeletConfigFromCluster, which is really nested in the call of chains and it is kind of hard to validate at first sight..

my eval is that testing the behavior in unit tests with the current state of the upgrade code is difficult.
i'm looking at testing the following with e2e tests during alpha (1.23):

  • set the FG to true during init and check if the right CM is created
  • create a cluster without the FG, patch the FG to true, upgrade to the same version, make sure that new CM is created

@fabriziopandini
Copy link
Member

kubeadm will first try the new CM naming format and if it fails it will fallback to the old naming independent of FG

@neolit123 I'm ok with this approach given that, as you confirmed above, we are covering all the use cases.
I also understand that alternatives to this will require very invasive changes we don't want to do given that this is a temporary and hopefully short transition.

my eval is that testing the behavior in unit tests with the current state of the upgrade code is difficult.
i'm looking at testing the following with e2e tests during alpha (1.23)

how dow you think to proceed about implementing E2E testing? merge this PR and the implements E2E? does this means that we are considering a rollback if E2E does not provide the expected signal before release?

@neolit123
Copy link
Member Author

neolit123 commented Nov 8, 2021 via email

Add the UnversionedKubeletConfigMap feature gate that can
be used to control legacy vs new behavior for naming the
default configmap used to store the KubeletConfiguration.

Update related unit tests.
The function has been marked as deprecated for a long time
and has been unused in the code base.

Remove it.
Add means to parse the value of UnversionedKubletConfigMap
feature gate if present and based on that decide what
configmap to look for.
@neolit123 neolit123 force-pushed the 1.23-kubeadm-kubelet-config-map-change branch from 608d165 to 17cc064 Compare November 8, 2021 15:31
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2021
@neolit123
Copy link
Member Author

made the changes requested in #105741 (review)

@fabriziopandini
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2021
@neolit123
Copy link
Member Author

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 9, 2021
@neolit123
Copy link
Member Author

/retest

1 similar comment
@neolit123
Copy link
Member Author

/retest

@k8s-triage-robot
Copy link

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubeadm area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/deprecation Categorizes issue or PR as related to a feature/enhancement marked for deprecation. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants