
fix goroutine leak in the DeleteCollection #105606

Merged
merged 1 commit into from Oct 21, 2021

Conversation

sxllwx
Member

@sxllwx sxllwx commented Oct 11, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

In our k8s cluster, three kube-apiservers are running and consuming a lot of memory. Analyzing the heap and goroutine profiles shows that a lot of memory is allocated by DeleteCollection and never released. After further analysis, I believe there is a goroutine leak in DeleteCollection.

Sorry, company regulations prevent me from sharing the complete profiles, but here are some screenshots.

[Screenshots: heap and goroutine profiles showing memory retained by DeleteCollection]

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

If all workers exit early, the task producer blocks permanently on the send in the following code block:

https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/registry/generic/registry/store.go#L1131-L1139
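The failure mode described above can be reproduced in a stand-alone sketch (toy names, not the actual store.go code): every worker exits on its first error, and once all workers are gone nobody receives from toProcess, so the producer's send blocks forever.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// leakyDispatch mimics the old DeleteCollection fan-out: each worker
// exits on its first (simulated) delete error, leaving nobody to
// receive from toProcess, so the producer's send blocks forever.
func leakyDispatch(items, workersNumber int) bool {
	toProcess := make(chan int, 2*workersNumber)
	errs := make(chan error, workersNumber+1)
	wg := sync.WaitGroup{}

	for w := 0; w < workersNumber; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range toProcess {
				// Simulate etcd failing the delete (e.g. a leader switch):
				// the worker records the error and exits immediately.
				errs <- errors.New("etcdserver: leader changed")
				return
			}
		}()
	}

	finished := make(chan struct{})
	go func() { // the task producer
		defer close(finished)
		for i := 0; i < items; i++ {
			toProcess <- i // blocks permanently once all workers returned
		}
		close(toProcess)
	}()

	select {
	case <-finished:
		return true
	case <-time.After(200 * time.Millisecond):
		return false // the producer goroutine is stuck: it leaks
	}
}

func main() {
	fmt.Println("producer finished:", leakyDispatch(100, 2))
}
```

With 100 items and 2 workers, at most 2 receives plus the channel buffer of 4 can ever complete, so the producer can never finish and the program prints `producer finished: false`.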

I think this bug is triggered when, during a DeleteCollection request to kube-apiserver, etcd goes through a leader switch or returns some other error.

Does this PR introduce a user-facing change?

kube-apiserver: fix a memory leak when deleting multiple objects with a deletecollection.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 11, 2021
@neolit123
Member

/release-note-edit

kube-apiserver: fix a memory leak when deleting multiple objects with a deletecollection.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 11, 2021
@liggitt
Member

liggitt commented Oct 11, 2021

from what I can see, this should only be possible in unexpected panic cases... do you see panics logged?

@sxllwx
Member Author

sxllwx commented Oct 11, 2021

from what I can see, this should only be possible in unexpected panic cases... do you see panics logged?

I only found panic logs related to the timeout filter. I don't think the goroutine leak here triggers a panic. Please correct me if I am wrong.

@fedebongio
Contributor

/assign @CatherineF-dev @caesarxuchao
/triage accepted

@k8s-ci-robot k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Oct 12, 2021
@k8s-ci-robot
Contributor

@fedebongio: GitHub didn't allow me to assign the following users: CatherineF-dev.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @CatherineF-dev @caesarxuchao
/triage accepted

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 12, 2021
@sxllwx
Member Author

sxllwx commented Oct 14, 2021

/ping @liggitt

@liggitt
Member

liggitt commented Oct 14, 2021

yeah, it looks like passing deleteoptions with a precondition (like a uid precondition) that causes the delete to return an error will exit workers early

this looks reasonable, but needs to be exercised via a test

@k8s-ci-robot k8s-ci-robot removed the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Oct 15, 2021
case toProcess <- i:
case <-workersExited:
klog.V(4).InfoS("workers already exited, and there are some items waiting to be processed", "finished", i, "total", len(items))
return
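The select above is the heart of the fix. A self-contained sketch of the same pattern (toy names, not the real store.go) shows how closing a workersExited channel after wg.Wait() lets the producer abandon a blocked send:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fixedDispatch mirrors the patched pattern: a workersExited channel,
// closed once all workers return, gives the producer a second select
// case so it can stop instead of blocking forever.
func fixedDispatch(items, workersNumber int) bool {
	toProcess := make(chan int, 2*workersNumber)
	wg := sync.WaitGroup{}

	for w := 0; w < workersNumber; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range toProcess {
				return // simulate every worker failing on its first item
			}
		}()
	}

	workersExited := make(chan struct{})
	go func() {
		wg.Wait()
		close(workersExited)
	}()

	finished := make(chan struct{})
	go func() { // the task producer
		defer close(finished)
		for i := 0; i < items; i++ {
			select {
			case toProcess <- i:
			case <-workersExited:
				return // all workers are gone; no need to close toProcess
			}
		}
		close(toProcess)
	}()

	select {
	case <-finished:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("producer finished:", fixedDispatch(100, 2))
}
```

Unlike the leaky version, this prints `producer finished: true`: the producer bails out as soon as all workers have returned.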
Member

or just break instead of return?

Member Author

I think the results of these two operations are the same here. What do you think?

Member

Maybe I was overthinking it: with this return instead of break you don't close(toProcess)...
... but if you return before distributing all items, that means all workers have already finished, so there is no need to close toProcess.

@wojtek-t
Member

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 21, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sxllwx, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 21, 2021
@@ -1126,14 +1126,22 @@ func (e *Store) DeleteCollection(ctx context.Context, deleteValidation rest.Vali
 	}
 	wg := sync.WaitGroup{}
 	toProcess := make(chan int, 2*workersNumber)
-	errs := make(chan error, workersNumber+1)
+	errs := make(chan error, workersNumber)
Member

I think I understand this +1 now: it is the number of worker goroutines plus the distributor goroutine, all of which can send errors to this channel... interestingly, we only report the first one... should we drain the channel, aggregate, and report all errors?

Member Author

@sxllwx sxllwx Oct 22, 2021

Very interesting... 😄 ...

I don't think it is necessary to aggregate all errors here. If an operation fails, it should be reported directly, to ensure the fastest response. But I think draining the err chan is necessary here; logging the drained errors can help debugging.

Member

we need to add the +1 back... there are workersNumber + 1 things that can send errors to the errs channel... if they all send an error, the channel will block, wg.Wait() won't return, and we'll hang
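The sizing rule in this comment is easy to demonstrate in isolation: with k goroutines that each send one error into the channel before calling wg.Done(), a buffer smaller than k lets a send block, and wg.Wait() never returns (an illustrative toy, not the PR's code):

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// allSendersFinish reports whether `senders` goroutines, each sending one
// error into a channel with the given buffer, all reach wg.Done() before
// anyone receives. If buffer < senders, one send blocks forever, its
// goroutine leaks, and wg.Wait() hangs.
func allSendersFinish(senders, buffer int) bool {
	errs := make(chan error, buffer)
	var wg sync.WaitGroup
	for i := 0; i < senders; i++ {
		wg.Add(1)
		go func() {
			errs <- errors.New("boom") // blocks if the buffer is already full
			wg.Done()
		}()
	}
	done := make(chan struct{})
	go func() { wg.Wait(); close(done) }()
	select {
	case <-done:
		return true
	case <-time.After(200 * time.Millisecond):
		return false // wg.Wait() is hung on the blocked sender
	}
}

func main() {
	fmt.Println("buffer == senders:", allSendersFinish(3, 3))
	fmt.Println("buffer  < senders:", allSendersFinish(3, 2))
}
```

This is exactly why the buffer has to be workersNumber+1 when workersNumber workers plus the distributor may each send an error.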

Member

We were discussing it here: #105606 (comment)

The only way the dispatcher function can emit an error is when it crashes.
But I can't imagine any reason why it would crash.

Thoughts?

Member

I haven't dug in to figure out the detailed reasons it might crash. If we have panic-handler code and workersNumber+1 things that could send errors, just size the channel buffer to match.

Member

I opened #105872 for this.
Though - I think that the HandleCrash here is more for "being on the safe side": I don't know how I could make it crash even with any conspiracy theory (which is also why I don't see how we could even test that).

Comment on lines 1181 to 1182
case err := <-errs:
return nil, err

@wojtek-t
Member

/lgtm cancel

@k8s-ci-robot k8s-ci-robot merged commit 2dede1d into kubernetes:master Oct 21, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Oct 21, 2021
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 21, 2021
@wojtek-t
Member

Heh - I didn't cancel in time.

@aojea - #105606 (comment) is a valid comment
We should send a follow up to it.

@wojtek-t
Member

@aojea - #105606 (comment) is a valid comment
We should send a follow up to it.

Though - I don't see any reason why that goroutine may crash - so maybe not super critical...

@aojea
Member

aojea commented Oct 21, 2021

yeah, me neither, I'm fine with it,

@aojea
Member

aojea commented Oct 21, 2021

hmm @sxllwx this test passes with and without this patch 🤔

@wojtek-t
Member

hmm @sxllwx this test passes with and without this patch

Yeah - it's somewhat expected, because the change actually does two things:

  • change the blocking mechanism
  • fix the issue

This test would fail only if we had a PR with the first change but not the second.

@sxllwx
Member Author

sxllwx commented Oct 22, 2021

hmm @sxllwx this test passes with and without this patch 🤔

Without this patch, the new UT TestStoreDeleteCollectionWorkDistributorExited cannot pass.

@aojea
Member

aojea commented Oct 22, 2021

hmm @sxllwx this test passes with and without this patch 🤔

Without this patch, the new UT TestStoreDeleteCollectionWorkDistributorExited cannot pass.

it does; I ran only the new test, without the patch, and it passes

@sxllwx
Member Author

sxllwx commented Oct 22, 2021

hmm @sxllwx this test passes with and without this patch 🤔

Without this patch, the new UT TestStoreDeleteCollectionWorkDistributorExited cannot pass.

it does; I ran only the new test, without the patch, and it passes

Hmm... ignore my answer. Although the unit test can pass without this patch, there is still a goroutine leak (caused by the concurrency mechanism of the previous version).

@microyahoo

microyahoo commented Oct 22, 2021

I think the code can be simplified as follows.

tokenBucket := make(chan struct{}, workersNumber)
errs := make(chan error, workersNumber)
wg := &sync.WaitGroup{}

for _, item := range items {
	wg.Add(1)
	tokenBucket <- struct{}{}

	go func(i runtime.Object) {
		// panics don't cross goroutine boundaries
		defer utilruntime.HandleCrash(func(panicReason interface{}) {
			errs <- fmt.Errorf("DeleteCollection goroutine panicked: %v", panicReason)
		})
		defer wg.Done()
		defer func() {
			<-tokenBucket
		}()
		accessor, err := meta.Accessor(i)
		if err != nil {
			errs <- err
			return
		}

		// DeepCopy the deletion options because individual graceful deleters communicate changes via a mutating
		// function in the delete strategy called in the delete method.  While that is always ugly, it works
		// when making a single call.  When making multiple calls via delete collection, the mutation applied to
		// pod/A can change the option ultimately used for pod/B.
		if _, _, err := e.Delete(ctx, accessor.GetName(), deleteValidation, options.DeepCopy()); err != nil && !apierrors.IsNotFound(err) {
			klog.V(4).InfoS("Delete object in DeleteCollection failed", "object", klog.KObj(accessor), "err", err)
			errs <- err
			return
		}
	}(item)
}
wg.Wait()

select {
case err := <-errs:
	return nil, err
default:
	return listObj, nil
}

@aojea
Member

aojea commented Oct 22, 2021

Whatever wojtek or Jordan say :)

@sxllwx
Member Author

sxllwx commented Oct 25, 2021

I think the code can be simplified as follows.

[quoted code omitted - identical to the previous comment]

I think this code is simpler and faster, but the buffer size of the errs channel needs to be adjusted to len(items).

@wojtek-t
Member

I'm against the above - it could potentially start deletion of all items at once, which may overload etcd.
We need control over the number of in-flight requests to etcd - which workersNumber achieves.

@microyahoo

@wojtek-t The above is essentially the same as the existing code. A goroutine can run only after obtaining a token, which ensures that at most workersNumber requests are in flight at any time.
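That claim is easy to verify with a small toy (illustrative code, not the PR's): the buffered channel acts as a counting semaphore, so the peak number of tasks between acquiring and releasing a token can never exceed workersNumber.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// maxConcurrency runs `items` tasks gated by a token bucket of size
// workersNumber and returns the peak number of tasks observed running
// simultaneously. The token bucket bounds the peak to workersNumber.
func maxConcurrency(items, workersNumber int) int {
	tokenBucket := make(chan struct{}, workersNumber)
	var wg sync.WaitGroup
	var mu sync.Mutex
	cur, peak := 0, 0

	for i := 0; i < items; i++ {
		wg.Add(1)
		tokenBucket <- struct{}{} // blocks while workersNumber tasks hold tokens
		go func() {
			defer wg.Done()
			defer func() { <-tokenBucket }() // release the token

			mu.Lock()
			cur++
			if cur > peak {
				peak = cur
			}
			mu.Unlock()

			time.Sleep(time.Millisecond) // stand-in for the e.Delete call

			mu.Lock()
			cur--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("peak concurrency with 4 tokens:", maxConcurrency(50, 4))
}
```

The printed peak is always between 1 and 4, confirming that the token-bucket variant enforces the same in-flight limit as the worker-pool version.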

@wojtek-t
Member

OK sorry - I misread it. Yes - that looks fine

chenchun pushed a commit to chenchun/kubernetes that referenced this pull request Mar 20, 2024
…!1009)

1.20: backport kubernetes#105606 and kubernetes#105872 to fix the goroutine leak caused by DeleteCollection
kubernetes#105606
kubernetes#105872