New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
apiserver: graceful termination with new option shutdown-send-retry-after #101257
Conversation
Skipping CI for Draft Pull Request. |
/test all |
/assign @smarterclayton |
Hrm, I'd prefer 429 - try again (every request after shutdown we reject is " come back later) and connection close ensures that. |
/test pull-kubernetes-e2e-gce-ubuntu-containerd |
/triage accepted |
c8f9918
to
6aff959
Compare
6aff959
to
a08c822
Compare
ffc30cb
to
f8c0a38
Compare
|
||
func (f retryAfterConditionFunc) ShouldRespondWithRetryAfter() (*retryAfterParams, bool) { | ||
return f() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
odd construct. Just use retryAfterConditionFunc()
directly. No need for a method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
func withRetryAfter(handler http.Handler, excludeFn excludeFunc, retryAfterFn retryAfterConditionFunc) http.Handler { | ||
return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) { | ||
retryAfterParams, sendRetryAfter := retryAfterFn.ShouldRespondWithRetryAfter() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
retryAfterFn()
a735bb2
to
15f66e4
Compare
/retest |
962e00d
to
e305345
Compare
// net/http waits for 1s for the peer to respond to a GO_AWAY frame, so | ||
// we should wait for a minimum of 2s | ||
stopHttpServerCh = drainedCh.Signaled() | ||
shutdownTimeout = 2 * time.Second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the long-running connections should have a chance to terminate gracefully.
with shutdownTimeout = 2 * time.Second
this is impossible.
e305345
to
082e0e5
Compare
// NOTE: both WithRetryAfter and WithWaitGroup must use the same exact isRequestExemptFunc 'isRequestExemptFromRetryAfter, | ||
// otherwise SafeWaitGroup might wait indefinitely and will prevent the server from shutting down gracefully. | ||
func WithRetryAfter(handler http.Handler, shutdownDelayDurationElapsedCh <-chan struct{}) http.Handler { | ||
return withRetryAfter(handler, isRequestExemptFromRetryAfter, newRetryAfterFunc(shutdownDelayDurationElapsedCh)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: move the isRequestExemptFromRetryAfter comment above this line, out of the func doc
|
||
func withRetryAfter(handler http.Handler, isRequestExemptFn isRequestExemptFunc, shouldRespondWithRetryAfterFn shouldRespondWithRetryAfterFunc) http.Handler { | ||
return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) { | ||
retryAfterParams, sendRetryAfter := shouldRespondWithRetryAfterFn() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
params, send := ...
. The func is about retryAfter. No need for fully qualified var names.
// newRetryAfterFunc builds a shouldRespondWithRetryAfterFunc that starts | ||
// sending Retry-After response to all incoming requests once the shut down | ||
// delay duration (ShutdownDelayDuration) has elapsed. | ||
func newRetryAfterFunc(shutdownDelayDurationElapsedCh <-chan struct{}) shouldRespondWithRetryAfterFunc { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't checked where this is used outside of WithRetryAfter. Just inline this there. This abstraction does not help, but rather distributes complexity.
// NOTE: both 'WithRetryAfter' and 'WithWaitGroup' filters should use this function | ||
// to exempt the set of requests from being rejected or tracked. | ||
func isRequestExemptFromRetryAfter(r *http.Request) bool { | ||
if isKubeApiserverUserAgent(r) || hasExemptPathPrefix(r) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return isKubeApiserverUserAgent(r) || hasExemptPathPrefix(r)
fdd44c8
to
a8a90f9
Compare
add a new mode for graceful termination with the new server run option 'shutdown-send-retry-after' - shutdown-send-retry-after=true: we initiate shutdown of the HTTP Server when all in-flight request(s) have been drained. during this window all incoming requests are rejected with status code 429 and the following response headers: - 'Retry-After: N' - client should retry after N seconds - 'Connection: close' - tear down the TCP connection - shutdown-send-retry-after=false: we initiate shutdown of the HTTP Server as soon as shutdown-delay-duration has elapsed. This is in keeping with the current behavior.
a8a90f9
to
3182b69
Compare
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sttts, tkashem The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Introduce a new server run option
shutdown-send-retry-after
The default value is set to
false
to maintain current behavior.With this option, we now have two modes of graceful termination:
shutdown-send-retry-after=true
: we initiate shutdown of the HTTP Server when all nonlongrunning
requests in-flight have been drained.shutdown-send-retry-after=false
: we initiate shutdown of the HTTP Server as soon asshutdown-delay-duration
has elapsed. (This is the current behavior)The following happens with
shutdown-send-retry-after=true
:Server.Shutdown
immediately afterShutdownDelayDuration
elapses. Rather, wait for all nonlongrunning
requests to drain. (the requests that are being counted byWithWaitGroup
filter)ShutdownDelayDuration
elapses, any new incoming request should be answered with a429
and the following headers:Connection: close
andRetry-After: N
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: