Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make dataplanes dual (internal plus external together) #9723

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

VadimEisenberg
Copy link

@VadimEisenberg VadimEisenberg commented Jan 15, 2025

Description

Currently, for Felix it is possible to use either the Internal dataplane driver or an external one. In NeuReality, our hardware, employs a dual network stack for different types of traffic. One is the standard Linux stack, for which the internal Felix dataplane is used. For the other network stack, we developed our own dataplane driver (external).

We propose to change Calico code to enable using two dataplane drivers simultaneously. The internal dataplane driver will be designated as the primary one, the external dataplane driver will be designated as the secondary one. Felix will send messages to both dataplanes, but will receive messages from the primary dataplane only.

The change should not be disruptive to the current users of Calico since they will continue to specify either internal or external dataplane driver. In case both drivers will be specified, the dual dataplane mode will be employed.

Related issues/PRs

Todos

  • Tests
  • Documentation
  • Release note

Release Note

TBD

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@VadimEisenberg VadimEisenberg requested a review from a team as a code owner January 15, 2025 13:33
@marvin-tigera marvin-tigera added this to the Calico v3.30.0 milestone Jan 15, 2025
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Jan 15, 2025
@caseydavenport
Copy link
Member

@VadimEisenberg thanks for the PR, very cool! One thing - you'll need to sign the CLA in order to make the bot happy.

Copy link
Contributor

@tomastigera tomastigera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the secondary driver has and error? Is it OK to ignore it?


package dataplane

type dataplaneDriverDecorator struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps?

Suggested change
type dataplaneDriverDecorator struct {
type dataplaneDriverMultiplexer struct {

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomastigera Great points, thank you! Let me rename the decorator, and return an error in case the secondary driver returns an error.

@VadimEisenberg
Copy link
Author

@caseydavenport Thank you! I have signed the CLA.

image

// PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
// OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need to use the Apache license header here (similar to in the other files in the repo). You're welcome to keep your copyright line (without "All rights reserved", which is typically used when no license is provided).

Please let me know if you have any specific concerns about submitting this under the Apache 2.0 license.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matthewdupre Thank you for pointing my attention to this, I have fixed the header. I have no concerns about submitting this PR under the Apache 2.0 license.

@VadimEisenberg
Copy link
Author

@tomastigera I have applied your changes, please review. More comments are welcome.

@tomastigera
Copy link
Contributor

@VadimEisenberg there is already a mechanism to support multiple dataplanes. Afaik that mechanism is used to push policies to applicationlayer policy engine (Dikastes) Who could we unify these mechanisms. Having 2 seems like and overkill. Feels like your approach would not need separate gorutines to read from the per-dp channels, otoh your approach makes one dp wait for the other as the message is dispatched synchronously. Idk whether it is a requirement of your usecase.

cc @fasaxc @caseydavenport

https://github.com/projectcalico/calico/blob/master/felix/daemon/daemon.go#L462

@VadimEisenberg
Copy link
Author

@tomastigera Using policy sync server will work for our needs, thank you! However, I have the second part of dual dataplanes issue, the one that deals with the CNI plugin dataplanes #9743. Do you have a similar solution for CNI plugin dataplanes?

Let's discuss the two PRs together, what should be the right design for a dual network stack, considering Felix and CNI plugin dataplanes. Once we agree on the design, we can discuss technical details - using goroutines for primary and secondary dataplanes, etc.

@tomastigera
Copy link
Contributor

@VadimEisenberg sent you amessage on CU slack, where it might be easier to discuss.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants