Allow the actual inhibit delay to be greater than the expected inhibit delay #103137

Merged

Conversation

wzshiming
Member

@wzshiming wzshiming commented Jun 24, 2021

What type of PR is this?

/kind bug
/sig node

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Graceful node shutdown, allow the actual inhibit delay to be greater than the expected inhibit delay

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

NONE

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. sig/node Categorizes an issue or PR as relevant to SIG Node. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jun 24, 2021
@wzshiming
Member Author

/cc @bobbypage

@ehashman ehashman added this to Triage in SIG Node PR Triage Jun 24, 2021
@bobbypage
Member

bobbypage commented Jun 29, 2021

This change makes sense, but I don't think it will fix #102818

If I understood the issue in #102818 correctly, it was that Ubuntu installs a /usr/lib/systemd/logind.conf.d/unattended-upgrades-logind-maxdelay.conf config with InhibitDelayMaxSec=30.

Kubelet also writes a config to /etc/systemd/logind.conf.d/99-kubelet.conf, but it looks like that file is not taking precedence.

The systemd docs (https://www.freedesktop.org/software/systemd/man/logind.conf.html) note:

"In addition to the "main" configuration file, drop-in configuration snippets are read from /usr/lib/systemd/*.conf.d/, /usr/local/lib/systemd/*.conf.d/, and /etc/systemd/*.conf.d/. Those drop-ins have higher precedence and override the main configuration file. Files in the *.conf.d/ configuration subdirectories are sorted by their filename in lexicographic order, regardless of in which of the subdirectories they reside."

So based on that, it looks like the kubelet file should take precedence? Perhaps it might be an issue with the lexicographic ordering...

@wzshiming
Member Author

So based on that, it looks like the kubelet file should take precedence? Perhaps it might be an issue with the lexicographic ordering...

Yes, I also think this is a lexicographic ordering issue. But it does not make sense for the kubelet to preempt and overwrite other configurations here. If other applications rely on inhibitor locks and need more time, I think their settings should be allowed to stand.

@bobbypage
Member

If other applications rely on inhibitor locks and need more time, I think their settings should be allowed to stand.

I agree, that makes sense. However, in the Ubuntu case, if the config file for unattended-upgrades sets 30 seconds and the kubelet requests something greater than 30 seconds, then I think the kubelet's inhibit delay should take precedence.

In other words, I don't think kubelet should ever decrease the overall InhibitDelayMaxSec, but I don't see a problem with it overriding it with a greater value. What do you think?

@wzshiming
Member Author

In other words, I don't think kubelet should ever decrease the overall InhibitDelayMaxSec, but I don't see a problem with it overriding it with a greater value. What do you think?

I agree. This PR is implemented that way.

@bobbypage
Member

bobbypage commented Jul 1, 2021

In other words, I don't think kubelet should ever decrease the overall InhibitDelayMaxSec, but I don't see a problem with it overriding it with a greater value. What do you think?

I agree. This PR is implemented that way.

Yup, agree! The only thing I'm not clear on is this part of the PR description:

Which issue(s) this PR fixes:
Fixes #102818

As we discussed, it seems like #102818 is an unrelated issue (i.e. possibly due to lexicographic sorting of logind override files). Is that correct?

@wzshiming
Member Author

wzshiming commented Jul 1, 2021

As we discussed, it seems like #102818 is an unrelated issue (i.e. possibly due to lexicographic sorting of logind override files). Is that correct?

Yes, I removed the link.

I looked through logind's documentation and couldn't think of an elegant solution for #102818.

@bobbypage
Member

As we discussed, it seems like #102818 is an unrelated issue (i.e. possibly due to lexicographic sorting of logind override files). Is that correct?

Yes, I removed the link.

I looked through logind's documentation and couldn't think of an elegant solution for #102818.

Thanks!

LGTM for this change
/lgtm

Can we discuss the logind issue further on the corresponding issue, #102818? I'm still unclear whether the problem is the lexicographic sorting of files or something else...

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 1, 2021
@wzshiming wzshiming moved this from Triage to Needs Approver in SIG Node PR Triage Jul 5, 2021
@wzshiming
Member Author

/assign @mrunalp

@ehashman
Member

ehashman commented Jul 7, 2021

/priority important-soon
/triage accepted

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Jul 7, 2021
@k8s-ci-robot k8s-ci-robot removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jul 7, 2021
@mrunalp mrunalp moved this from Needs Approver to Done in SIG Node PR Triage Aug 17, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, wzshiming


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 17, 2021
@k8s-ci-robot k8s-ci-robot merged commit d7c1663 into kubernetes:master Aug 17, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Aug 17, 2021
@wzshiming wzshiming deleted the fix/expected_inhibit_delay branch August 18, 2021 08:52
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
5 participants