Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make kubectl cp resume on transfer error #104792

Merged
merged 1 commit into from Nov 10, 2021
Merged

Conversation

matthyx
Copy link
Contributor

@matthyx matthyx commented Sep 6, 2021

What type of PR is this?

/kind feature

What this PR does / why we need it:

Allows kubectl cp transfers to continue despite network errors.

Which issue(s) this PR fixes:

Fixes #60140

Special notes for your reviewer:

Does this PR introduce a user-facing change?

adding option for kubectl cp to resume on network errors until completion, requires tar in addition to tail inside the container image

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/kubectl sig/cli Categorizes an issue or PR as relevant to SIG CLI. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 6, 2021
@matthyx matthyx marked this pull request as draft September 6, 2021 08:12
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 6, 2021
},

// TODO: Improve error messages by first testing if 'tar' is present in the container?
Command: []string{"tar", "cf", "-", src.File},
Command: []string{"sh", "-c", fmt.Sprintf("tar cf - %s | tail -c+%d", t.src.File, n)},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how related is to this, but Ben and I had to spend a lot of time to make it work flawlessly, to stream a tar over a network pipe, check if you can find something useful for this https://github.com/kubernetes-sigs/kind/blob/b6bc112522651d98c81823df56b7afa511459a3b/pkg/cluster/internal/logs/logs.go#L39-L60

Copy link
Contributor Author

@matthyx matthyx Sep 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah nice trick the shell invocation to have access to pipes...
I'm using tail to skip the first n bytes of the archive when resuming, however the best one is the pattern where I use a reader of a reader to catch errors and keep track of the transferred data to reconnect without untarAll noticing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is only valid if the source has not changed during the re connection, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, in this respect I don't improve on current implementation - however if your use case is to retrieve GBs of backups, usability is much higher
I might add a flag to set a max retry, but for the moment I'm waiting for some feedback on the issue with a prebuilt binary

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backwards compatibility and when not using retries (the proposed default, above) please use only tar, since tail isn't required in that case.

@matthyx matthyx marked this pull request as ready for review September 17, 2021 07:33
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2021
@matthyx matthyx changed the title make kubectl cp resume infinitely on transfer error make kubectl cp resume on transfer error Sep 17, 2021
@matthyx
Copy link
Contributor Author

matthyx commented Sep 19, 2021

/assign @deads2k

@pacoxu pacoxu added this to Needs review in SIG CLI Sep 27, 2021
@Lastin
Copy link

Lastin commented Oct 16, 2021

Hey @matthyx. Retrying seems to work perfectly, though it fails after fetching 2362411520 bytes with error tail: write error: Bad address. Tried few times, each time seems to fail after 2.4GB download.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 21, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 21, 2021
@matthyx
Copy link
Contributor Author

matthyx commented Oct 21, 2021

/assign @soltysh

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks mostly good, please address the bits I've mentioned and ping me on slack to get it included in 1.23.

@@ -155,6 +156,7 @@ func NewCmdCp(f cmdutil.Factory, ioStreams genericclioptions.IOStreams) *cobra.C
}
cmdutil.AddContainerVarFlags(cmd, &o.Container, o.Container)
cmd.Flags().BoolVarP(&o.NoPreserve, "no-preserve", "", false, "The copied file/directory's ownership and permissions will not be preserved in the container")
cmd.Flags().IntVarP(&o.MaxTries, "retries", "t", 20, "Set number of retries to complete a copy operation from a container. Specify 0 to disable or any negative value for infinite retrying. The default is 20.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer this to be opt-in for starters, so default to be 0, and don't use -t as the shorthand, since that's being used elsewhere as shorthand for --tty so that might be confusing here a bit.

},

// TODO: Improve error messages by first testing if 'tar' is present in the container?
Command: []string{"tar", "cf", "-", src.File},
Command: []string{"sh", "-c", fmt.Sprintf("tar cf - %s | tail -c+%d", t.src.File, n)},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For backwards compatibility and when not using retries (the proposed default, above) please use only tar, since tail isn't required in that case.

@soltysh
Copy link
Contributor

soltysh commented Nov 8, 2021

/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 8, 2021
@matthyx
Copy link
Contributor Author

matthyx commented Nov 8, 2021

This looks mostly good, please address the bits I've mentioned and ping me on slack to get it included in 1.23.

Thanks @soltysh updated the PR accordingly.

Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: matthyx, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 10, 2021
@k8s-ci-robot k8s-ci-robot merged commit 8237943 into kubernetes:master Nov 10, 2021
SIG CLI automation moved this from Needs review to Done Nov 10, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Nov 10, 2021
@matthyx matthyx deleted the 60140 branch November 11, 2021 06:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/backlog Higher priority than priority/awaiting-more-evidence. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cli Categorizes an issue or PR as relevant to SIG CLI. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Development

Successfully merging this pull request may close these issues.

kubectl cp fails on large files
6 participants