Fix client IP preservation for NodePort service with protocol SCTP #104756

Merged
merged 1 commit into kubernetes:master from ipvs-sctp-masquerade on Sep 9, 2021

Conversation

tnqn
Member

@tnqn tnqn commented Sep 3, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

The iptables rule that matches kubeNodePortLocalSetSCTP must be inserted before the one that matches kubeNodePortSetSCTP; otherwise all SCTP traffic is masqueraded, regardless of whether the Service's ExternalTrafficPolicy is Local.

The PR also adds a unit test to verify that the iptables rules are in the expected order.
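
To illustrate why the order matters, here is a minimal sketch that simulates chain traversal. It is not the kube-proxy code: the `rule` type, the `masqueraded` helper, and the addresses are hypothetical, and it assumes that a NodePort with ExternalTrafficPolicy Local appears in both sets and that the mark-masq jump marks the packet and then continues with the next rule, while RETURN stops traversal.

```go
package main

import "fmt"

// rule is a hypothetical, simplified stand-in for one iptables rule:
// if the packet's destination is in set, jump to target.
type rule struct {
	set    map[string]bool // stand-in for an ipset of "ip,proto:port" entries
	target string          // "RETURN" or "KUBE-MARK-MASQ"
}

// masqueraded walks the chain top to bottom and reports whether the packet
// ends up marked for masquerade. RETURN ends traversal; marking does not.
func masqueraded(chain []rule, dst string) bool {
	marked := false
	for _, r := range chain {
		if !r.set[dst] {
			continue
		}
		switch r.target {
		case "RETURN":
			return marked
		case "KUBE-MARK-MASQ":
			marked = true
		}
	}
	return marked
}

func main() {
	const localPort = "10.0.0.1,sctp:30001"   // ExternalTrafficPolicy: Local
	const clusterPort = "10.0.0.1,sctp:30002" // ExternalTrafficPolicy: Cluster

	allSCTP := map[string]bool{localPort: true, clusterPort: true}
	localSCTP := map[string]bool{localPort: true}

	// Order after the fix: the Local set is checked first.
	fixed := []rule{{localSCTP, "RETURN"}, {allSCTP, "KUBE-MARK-MASQ"}}
	// Order before the fix: the mark rule runs before RETURN can stop it.
	broken := []rule{{allSCTP, "KUBE-MARK-MASQ"}, {localSCTP, "RETURN"}}

	fmt.Println("broken, Local port masqueraded:", masqueraded(broken, localPort))
	fmt.Println("fixed, Local port masqueraded:", masqueraded(fixed, localPort))
	fmt.Println("fixed, Cluster port masqueraded:", masqueraded(fixed, clusterPort))
}
```

In this model, only the fixed order leaves traffic to the Local port unmarked, so its client IP can be preserved.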

Which issue(s) this PR fixes:

Fixes #104755

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fixed client IP preservation for NodePort service with protocol SCTP in ipvs mode

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 3, 2021
@k8s-ci-robot
Contributor

@tnqn: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 3, 2021
@k8s-ci-robot k8s-ci-robot added area/ipvs sig/network Categorizes an issue or PR as relevant to SIG Network. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 3, 2021
{kubeNodePortLocalSetSCTP, string(KubeNodePortChain), "RETURN", "dst,dst", utilipset.ProtocolSCTP},
{kubeNodePortSetSCTP, string(KubeNodePortChain), string(KubeMarkMasqChain), "dst,dst", utilipset.ProtocolSCTP},
Member

I'm not familiar with the ipset implementation. Why is the SCTP matchType dst,dst while UDP and TCP are only dst?

Member Author

It's because bitmap:port ipset doesn't support matching SCTP port: #74341
https://ipset.netfilter.org/ipset.man.html#lbAV:

The set match and SET target netfilter kernel modules interpret the stored numbers as TCP or UDP port numbers.
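
As a rough illustration of the answer above: the number of direction flags in an iptables set match follows the number of dimensions in the ipset type, so a one-dimensional `bitmap:port` set takes `dst` while a two-dimensional `hash:ip,port` set takes `dst,dst`. The sketch below derives the flags from the type string. The helper names and the set name in `main` are hypothetical, and it assumes every dimension matches on the destination, as in the rules under review.

```go
package main

import (
	"fmt"
	"strings"
)

// matchFlags returns one "dst" flag per dimension of an ipset type string
// of the form "method:datatype[,datatype...]".
func matchFlags(setType string) (string, error) {
	parts := strings.SplitN(setType, ":", 2)
	if len(parts) != 2 || parts[1] == "" {
		return "", fmt.Errorf("malformed set type %q", setType)
	}
	dims := strings.Count(parts[1], ",") + 1
	flags := make([]string, dims)
	for i := range flags {
		flags[i] = "dst"
	}
	return strings.Join(flags, ","), nil
}

// matchArgs builds the iptables arguments that match a packet against a set.
func matchArgs(setName, setType string) ([]string, error) {
	flags, err := matchFlags(setType)
	if err != nil {
		return nil, err
	}
	return []string{"-m", "set", "--match-set", setName, flags}, nil
}

func main() {
	for _, setType := range []string{"bitmap:port", "hash:ip,port"} {
		// "EXAMPLE-SET" is a placeholder, not a real kube-proxy set name.
		args, err := matchArgs("EXAMPLE-SET", setType)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(setType, "->", strings.Join(args, " "))
	}
}
```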

string(kubeServicesChain): {{
JumpChain: string(KubeNodePortChain), MatchSet: "",
}},
string(KubeNodePortChain): {
Member

why is this test affected if we only have changed the SCTP rules order?

Member Author

Previously the test used checkIptables, which only checked whether the target iptables rule existed; it ignored any extra rules returned by the system, as well as rule order. If we just added the two SCTP-related rules to expectedIptablesChains, the test would not validate that one rule comes after the other, so the order could be broken in the future without any test failure. So I added a new method, checkIptablesInOrder, which validates that expectedIptablesChains has the same number and order of rules as those returned by the system. As a result, we need to add the chains that were missing from this test.
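
The difference between the two kinds of check can be sketched as follows. This is not the PR's test helper: the `rule` type mirrors only the JumpChain and MatchSet fields visible in the diff, and `containsRule`, `checkInOrder`, and the set names are hypothetical.

```go
package main

import "fmt"

// rule is a hypothetical, simplified view of an iptables rule in a test.
type rule struct {
	JumpChain string
	MatchSet  string
}

// containsRule is an existence-only check: it ignores position and extras.
func containsRule(actual []rule, want rule) bool {
	for _, r := range actual {
		if r == want {
			return true
		}
	}
	return false
}

// checkInOrder requires the same number of rules in the same order.
func checkInOrder(expected, actual []rule) error {
	if len(expected) != len(actual) {
		return fmt.Errorf("expected %d rules, got %d", len(expected), len(actual))
	}
	for i := range expected {
		if expected[i] != actual[i] {
			return fmt.Errorf("rule %d: expected %+v, got %+v", i, expected[i], actual[i])
		}
	}
	return nil
}

func main() {
	expected := []rule{
		{JumpChain: "RETURN", MatchSet: "LOCAL-SCTP-SET"},
		{JumpChain: "KUBE-MARK-MASQ", MatchSet: "SCTP-SET"},
	}
	// The same two rules, installed in the wrong order.
	swapped := []rule{expected[1], expected[0]}

	// The existence-only check cannot see the regression.
	fmt.Println("existence check passes:",
		containsRule(swapped, expected[0]) && containsRule(swapped, expected[1]))
	// The ordered check reports it.
	fmt.Println("ordered check error:", checkInOrder(expected, swapped))
}
```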

Member

hmm, shouldn't that be the default behavior of checkIptables ?
that method is used in other places and order matters, i.e. we can't have a RETURN as first rule and mark the test as valid XD

Member Author

I was trying to touch less code. But making checkIptables check rule order for all tests makes sense to me. I have updated checkIptables and other tests accordingly.

@aojea
Member

aojea commented Sep 3, 2021

/assign @andrewsykim @uablrek

Member Author

@tnqn tnqn left a comment

@aojea thanks for the review. Answered your questions.

The iptables rule that matches kubeNodePortLocalSetSCTP must be inserted
before the one that matches kubeNodePortSetSCTP, otherwise all SCTP traffic
would be masqueraded regardless of whether its ExternalTrafficPolicy is
Local or not.

To cover the case in tests, the patch adds rule order validation to
checkIptables.
@uablrek
Contributor

uablrek commented Sep 6, 2021

/lgtm

I have tested with and without the PR on IPv4 and IPv6 and it works as expected.

And for completeness I also tested that the src was preserved for type: LoadBalancer and it works with and without this PR.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 6, 2021
@aojea
Member

aojea commented Sep 6, 2021

/test pull-kubernetes-e2e-gci-gce-ipvs

@tnqn
Member Author

tnqn commented Sep 9, 2021

@aojea @uablrek @andrewsykim if there is no new comment, could you please approve it?

@aojea
Member

aojea commented Sep 9, 2021

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aojea, tnqn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 9, 2021
@k8s-triage-robot

The Kubernetes project has merge-blocking tests that are currently too flaky to consistently pass.

This bot retests PRs for certain kubernetes repos according to the following rules:

  • The PR does not have any do-not-merge/* labels
  • The PR does not have the needs-ok-to-test label
  • The PR is mergeable (does not have a needs-rebase label)
  • The PR is approved (has cncf-cla: yes, lgtm, approved labels)
  • The PR is failing tests required for merge

You can:

/retest

@k8s-ci-robot k8s-ci-robot merged commit a402f17 into kubernetes:master Sep 9, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 9, 2021
@tnqn tnqn deleted the ipvs-sctp-masquerade branch September 10, 2021 03:53
@JornShen
Member

JornShen commented May 17, 2022

@tnqn shall we cherry-pick this to 1.22 or even earlier branch ? cc @aojea

@tnqn
Member Author

tnqn commented May 20, 2022

> @tnqn shall we cherry-pick this to 1.22 or even earlier branch ? cc @aojea

@JornShen I could backport it to 1.22, but will there be new patch releases for 1.21? @aojea

@aojea
Member

aojea commented May 23, 2022

1.21 EOL is 2022-06-28 based on https://kubernetes.io/releases/

@tnqn
Member Author

tnqn commented May 24, 2022

@aojea thanks for the information. I created PRs to backport it to 1.21 and 1.22.

Successfully merging this pull request may close these issues.

ipvs mode doesn't preserve client IP for NodePort Service with protocol SCTP and ExternalTrafficPolicy Local
7 participants