Avoid spurious calls to update/delete validation #104182

Merged: 1 commit, Aug 6, 2021

Member

@liggitt liggitt commented Aug 5, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

The cached etcd storage uses in-memory cached versions of objects as a hint for the existing object when handling update/delete API calls.

If the update/delete fails for any reason, it does a live lookup of the existing object and retries. While correct, this means that when the failure was not due to the in-memory object being stale (for example, the object fails validation or is rejected by admission), the retry is done unnecessarily.

Because the retry involves admission, this means an admission webhook that rejects an update or delete request can get immediately reinvoked a second time with the same data.

With this change, if the live lookup reveals that the cached object was actually up to date, the retry is skipped.

This retry has existed since 1.12, and the spurious retry only affects requests that were destined to fail anyway, but this will give a nice performance bump to update/delete requests rejected by admission webhooks.
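The behavior described above can be illustrated with a minimal Go sketch. This is not the kube-apiserver implementation: the `object`, `store`, `tryUpdate`, and `update` names are hypothetical, and admission is reduced to a counter and a reject flag. It only shows the decision "retry after a live lookup only if the cached hint was stale".

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical stand-in for a stored API object.
type object struct {
	ResourceVersion string
	Spec            string
}

var (
	errRejected = errors.New("rejected by admission")
	errConflict = errors.New("resourceVersion conflict")
)

// store is a hypothetical stand-in for the etcd-backed storage layer.
type store struct {
	live           object // what a live lookup would return
	rejectAll      bool   // simulates a webhook that rejects every request
	admissionCalls int    // how many times admission was invoked
}

// tryUpdate runs admission against the assumed existing object, then
// writes only if that object still matches the live one.
func (s *store) tryUpdate(existing object, spec string) error {
	s.admissionCalls++
	if s.rejectAll {
		return errRejected
	}
	if existing.ResourceVersion != s.live.ResourceVersion {
		return errConflict
	}
	s.live = object{ResourceVersion: existing.ResourceVersion + "+1", Spec: spec}
	return nil
}

// update first tries with the cached object as a hint. On failure it does a
// live lookup and retries only if the hint turned out to be stale.
func (s *store) update(cached object, spec string) error {
	err := s.tryUpdate(cached, spec)
	if err == nil {
		return nil
	}
	current := s.live // live lookup
	if current.ResourceVersion == cached.ResourceVersion {
		// The hint was up to date, so staleness did not cause the failure.
		return err
	}
	return s.tryUpdate(current, spec)
}

func main() {
	// Fresh hint, rejecting webhook: admission runs once.
	fresh := &store{live: object{ResourceVersion: "5"}, rejectAll: true}
	fmt.Println(fresh.update(object{ResourceVersion: "5"}, "new"), fresh.admissionCalls)

	// Stale hint, rejecting webhook: the retry is still needed, so admission runs twice.
	stale := &store{live: object{ResourceVersion: "5"}, rejectAll: true}
	fmt.Println(stale.update(object{ResourceVersion: "4"}, "new"), stale.admissionCalls)
}
```

In this sketch the retry is kept for the stale-hint case, where the first failure may have been caused by outdated data, and dropped only when the live lookup shows the hint was already current.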

kube-apiserver: Avoids unnecessary repeated calls to admission webhooks that reject an update or delete request.

/cc @wojtek-t @rngy @lavalamp

k8s-ci-robot added the release-note and kind/bug labels Aug 5, 2021
@k8s-ci-robot
Contributor

@liggitt: GitHub didn't allow me to request PR reviews from the following users: rngy.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.


k8s-ci-robot added the size/M, cncf-cla: yes, do-not-merge/needs-sig, needs-triage, and needs-priority labels Aug 5, 2021
@liggitt
Member Author

liggitt commented Aug 5, 2021

/sig api-machinery
/priority important-longterm

k8s-ci-robot added the sig/api-machinery, priority/important-longterm, area/apiserver, and approved labels and removed the do-not-merge/needs-sig and needs-priority labels Aug 5, 2021
@liggitt
Member Author

liggitt commented Aug 5, 2021

Unit test failure looks related, will look

@liggitt
Member Author

liggitt commented Aug 5, 2021

fun... the failing unit test was trying to exercise storage hook behavior on retry, but wasn't actually setting up the conditions to trigger a retry correctly... instead of providing a cachedObject with a mismatched resourceVersion, it was getting retried only because of the spurious retry-on-any-failure behavior this PR is fixing. Updated the unit test that wanted to test behavior on retry to set up the conditions to trigger retry correctly.
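As a rough illustration of the test setup issue described in this comment, the following Go sketch counts how often a failing hook runs depending on whether the cached resourceVersion matches the live one. The `updateWithHint` and `countHookCalls` helpers are hypothetical and are not the real unit test or storage code. They only show that a test wanting to exercise retry behavior has to supply a mismatched resourceVersion.

```go
package main

import (
	"errors"
	"fmt"
)

// updateWithHint is a hypothetical miniature of a cache-hinted update:
// the hook runs once against the cached hint and is retried only when
// the hint's resourceVersion differs from the live one.
func updateWithHint(cachedRV, liveRV string, hook func() error) error {
	err := hook() // first attempt, using the cached object as the hint
	if err == nil || cachedRV == liveRV {
		return err // success, or the hint was current: no retry
	}
	return hook() // the hint was stale: retry against the live object
}

// countHookCalls runs one always-failing update and reports how often the hook ran.
func countHookCalls(cachedRV, liveRV string) int {
	calls := 0
	hook := func() error {
		calls++
		return errors.New("hook failed")
	}
	_ = updateWithHint(cachedRV, liveRV, hook)
	return calls
}

func main() {
	fmt.Println(countHookCalls("7", "7")) // matching resourceVersion: no retry
	fmt.Println(countHookCalls("6", "7")) // mismatched resourceVersion: one retry
}
```

Under these assumptions, a test that passes a matching resourceVersion and still expects two hook calls would only pass with the old retry-on-any-failure behavior.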

@liggitt
Member Author

liggitt commented Aug 6, 2021

/retest

@wojtek-t
Member

wojtek-t commented Aug 6, 2021

Thanks for addressing it!

/lgtm
/approve

k8s-ci-robot added the lgtm label Aug 6, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, wojtek-t


@k8s-triage-robot

/retest

@dims
Member

dims commented Aug 6, 2021

/retest

@k8s-ci-robot k8s-ci-robot merged commit 7f6d463 into kubernetes:master Aug 6, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Aug 6, 2021
@liggitt liggitt deleted the suggestion-double-call branch August 6, 2021 18:22
k8s-ci-robot added a commit that referenced this pull request Aug 10, 2021
…182-upstream-release-1.20

Automated cherry pick of #104182: Avoid spurious calls to update/delete validation
k8s-ci-robot added a commit that referenced this pull request Aug 10, 2021
…182-upstream-release-1.19

Automated cherry pick of #104182: Avoid spurious calls to update/delete validation
k8s-ci-robot added a commit that referenced this pull request Aug 10, 2021
…182-upstream-release-1.22

Automated cherry pick of #104182: Avoid spurious calls to update/delete validation
k8s-ci-robot added a commit that referenced this pull request Aug 10, 2021
…182-upstream-release-1.21

Automated cherry pick of #104182: Avoid spurious calls to update/delete validation
@fedebongio
Contributor

/triage accepted

k8s-ci-robot added the triage/accepted label and removed the needs-triage label Aug 10, 2021
Labels
approved, area/apiserver, cncf-cla: yes, kind/bug, lgtm, priority/important-longterm, release-note, sig/api-machinery, size/M, triage/accepted