[vnet][1] setup TUN and IPv6 on MacOS #40893

Merged
merged 17 commits into master from nklaassen/vnet1
May 7, 2024

Contributor

@nklaassen nklaassen commented Apr 24, 2024

This is the second in a series of PRs implementing the Teleport VNet RFD. Related PRs: parent, child.

This commit adds a hidden (for now) tsh vnet command that starts up Teleport VNet. It is only supported on macOS.

When run as root, tsh vnet creates a TUN virtual network device in its own process. When run as a regular user, it execs a child process as root, using an AppleScript hack to get administrator privileges. The root process then passes the file descriptor for the TUN device to the non-root process over a Unix domain socket. The AppleScript hack will be superseded by a daemonized process once that is implemented, but it will probably stick around for development versions.
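
As a rough illustration of the descriptor hand-off described above, here is a minimal sketch of passing an open file over a Unix domain socket as SCM_RIGHTS ancillary data, using only the Go standard library. The names sendFD, recvFD, and demo are hypothetical and are not taken from this PR; the sketch also uses a pipe in place of a TUN device and runs both ends in one process, so it makes no claim about how lib/vnet actually does this.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"os"
	"path/filepath"
	"syscall"
)

// sendFD (hypothetical name) sends the descriptor of f as SCM_RIGHTS ancillary data.
func sendFD(conn *net.UnixConn, f *os.File) error {
	rights := syscall.UnixRights(int(f.Fd()))
	// One byte of ordinary data accompanies the control message.
	_, _, err := conn.WriteMsgUnix([]byte{0}, rights, nil)
	return err
}

// recvFD (hypothetical name) receives one descriptor and wraps it in an *os.File.
func recvFD(conn *net.UnixConn, name string) (*os.File, error) {
	buf := make([]byte, 1)
	oob := make([]byte, syscall.CmsgSpace(4)) // room for one 32-bit descriptor
	_, oobn, _, _, err := conn.ReadMsgUnix(buf, oob)
	if err != nil {
		return nil, err
	}
	msgs, err := syscall.ParseSocketControlMessage(oob[:oobn])
	if err != nil {
		return nil, err
	}
	if len(msgs) != 1 {
		return nil, fmt.Errorf("expected 1 control message, got %d", len(msgs))
	}
	fds, err := syscall.ParseUnixRights(&msgs[0])
	if err != nil {
		return nil, err
	}
	if len(fds) != 1 {
		return nil, fmt.Errorf("expected 1 descriptor, got %d", len(fds))
	}
	return os.NewFile(uintptr(fds[0]), name), nil
}

// demo plays both roles in one process: the "root" side sends the read end of
// a pipe (standing in for the TUN device) and the "user" side reads through it.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "fdpass")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	addr := &net.UnixAddr{Name: filepath.Join(dir, "sock"), Net: "unix"}
	ln, err := net.ListenUnix("unix", addr)
	if err != nil {
		return "", err
	}
	defer ln.Close()
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	defer r.Close()
	defer w.Close()

	sendErr := make(chan error, 1)
	go func() { // sender side
		conn, err := net.DialUnix("unix", nil, addr)
		if err != nil {
			sendErr <- err
			return
		}
		defer conn.Close()
		sendErr <- sendFD(conn, r)
	}()

	conn, err := ln.AcceptUnix() // receiver side
	if err != nil {
		return "", err
	}
	defer conn.Close()
	received, err := recvFD(conn, "received-pipe")
	if err != nil {
		return "", err
	}
	defer received.Close()
	if err := <-sendErr; err != nil {
		return "", err
	}
	if _, err := w.Write([]byte("hello")); err != nil {
		return "", err
	}
	buf := make([]byte, 5)
	if _, err := io.ReadFull(received, buf); err != nil {
		return "", err
	}
	return string(buf), nil
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```

The received descriptor is a new descriptor referring to the same open file, which is why the receiving side can keep using it after the sender closes its own copy.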

When creating the TUN device, the root process installs on the host an IPv6 route for all unique local addresses (ULA) in the range of the VNet network. The range is identified by a 64-bit prefix consisting of the 8-bit ULA prefix, a 40-bit randomly generated global ID, and a 16-bit subnet ID.
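
To make the prefix layout concrete, the following is a small sketch that builds such a /64 from the three parts listed above (0xfd, 40 random bits, 16-bit subnet ID). The function name newULAPrefix is an assumption for illustration; the PR's own generation code may differ.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"net/netip"
)

// newULAPrefix (hypothetical name) builds a random /64 unique local prefix:
// byte 0 is the 8-bit ULA prefix 0xfd, bytes 1-5 are the 40-bit random
// global ID, and bytes 6-7 are the 16-bit subnet ID. The host bits stay zero.
func newULAPrefix(subnetID uint16) (netip.Prefix, error) {
	var b [16]byte
	b[0] = 0xfd
	if _, err := rand.Read(b[1:6]); err != nil {
		return netip.Prefix{}, fmt.Errorf("generating global ID: %w", err)
	}
	b[6] = byte(subnetID >> 8)
	b[7] = byte(subnetID)
	return netip.PrefixFrom(netip.AddrFrom16(b), 64), nil
}

func main() {
	prefix, err := newULAPrefix(0)
	if err != nil {
		panic(err)
	}
	// The output varies per run because the global ID is random.
	fmt.Println("VNet prefix:", prefix)
}
```

An address such as fd75:7ee9:a8d5:: from the logs further down fits this layout with a subnet ID of zero.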

After getting the TUN device to the calling process, tsh vnet initiates bidirectional forwarding of traffic between the host OS and the VNet network.
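
The general shape of bidirectional forwarding can be sketched with two copy loops, one per direction. This is a simplified stand-in under stated assumptions: the names forward and demo are hypothetical, in-memory net.Pipe connections replace the TUN device and the VNet networking stack, and the actual code in lib/vnet forwards IP packets rather than byte streams.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
)

// forward (hypothetical name) copies data in both directions between a and b
// until either direction ends, then closes both sides so the other copy loop
// unblocks. It returns the combined errors of both directions.
func forward(a, b io.ReadWriteCloser) error {
	errs := make(chan error, 2)
	copyDir := func(dst io.Writer, src io.Reader) {
		_, err := io.Copy(dst, src)
		a.Close()
		b.Close()
		errs <- err
	}
	go copyDir(a, b)
	go copyDir(b, a)
	return errors.Join(<-errs, <-errs)
}

// demo wires two in-memory pipes together through forward and sends one
// message in each direction.
func demo() (string, error) {
	osEnd, tunEnd := net.Pipe()     // stands in for host OS <-> TUN device
	stackEnd, vnetEnd := net.Pipe() // stands in for netstack <-> VNet
	done := make(chan error, 1)
	go func() { done <- forward(tunEnd, stackEnd) }()

	go osEnd.Write([]byte("ping"))
	request := make([]byte, 4)
	if _, err := io.ReadFull(vnetEnd, request); err != nil {
		return "", err
	}
	go vnetEnd.Write([]byte("pong"))
	reply := make([]byte, 4)
	if _, err := io.ReadFull(osEnd, reply); err != nil {
		return "", err
	}

	// Closing one side ends both copy loops.
	osEnd.Close()
	<-done
	vnetEnd.Close()
	return string(request) + " " + string(reply), nil
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```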

There is currently no DNS support, no app forwarding, and no Teleport login of any kind; those will all come in following PRs.

Because this is all macOS-specific code that must either run as root or get root through an interactive UI element, and that needs to modify the host networking stack, none of it is unit tested. The virtual networking component is being tested in #40889, and the rest of the functionality to be implemented will be tested similarly.

The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with changelog: followed by the changelog entries for the PR.

@github-actions github-actions bot added the size/md and tsh labels Apr 24, 2024
@nklaassen nklaassen added the no-changelog label Apr 25, 2024
Member

There is currently no DNS support, no app forwarding, and no Teleport login of any kind; those will all come in following PRs. At this point the VNet process just logs that it has handled the connection and closes it immediately.

To make it more clear, at the moment all you can do is try to reach the IPv6 address directly, for example:

$ tsh vnet -d
2024-04-26T17:08:07+02:00 INFO  Spawning child process as root to create and setup TUN device vnet/setup_darwin.go:44
2024-04-26T17:08:12+02:00 INFO  Created TUN device. device:utun4 vnet/setup.go:83
2024-04-26T17:08:12+02:00 INFO [VNET]      Running Teleport VNet. ipv6:fd75:7ee9:a8d5:: trace_id:c9080ff4b2b4cd396cdb04c01d2c8a3f span_id:0d365c617eaf8431 vnet/vnet.go:226
2024-04-26T17:08:12+02:00 DEBU  Forwarding IP packets between OS and VNet. trace_id:c9080ff4b2b4cd396cdb04c01d2c8a3f span_id:0d365c617eaf8431 vnet/vnet.go:338

So I grab fd75:7ee9:a8d5:: and then:

$ curl -g -6 'http://[fd75:7ee9:a8d5::]:80'

Maybe there's an easier way to do this, but running that is what worked for me to trigger the following output from tsh vnet -d:

2024-04-26T17:08:47+02:00 DEBU [VNET]      Handling TCP connection. request:{80 fd75:7ee9:a8d5:: 63653 fd75:7ee9:a8d5::1} trace_id:c9080ff4b2b4cd396cdb04c01d2c8a3f span_id:0d365c617eaf8431 vnet/vnet.go:244
2024-04-26T17:08:47+02:00 DEBU [VNET]      No handler for address. request:{80 fd75:7ee9:a8d5:: 63653 fd75:7ee9:a8d5::1} addr:fd75:7ee9:a8d5:: trace_id:c9080ff4b2b4cd396cdb04c01d2c8a3f span_id:0d365c617eaf8431 vnet/vnet.go:249
2024-04-26T17:08:47+02:00 DEBU [VNET]      Finished handling TCP connection. request:{80 fd75:7ee9:a8d5:: 63653 fd75:7ee9:a8d5::1} trace_id:c9080ff4b2b4cd396cdb04c01d2c8a3f span_id:0d365c617eaf8431 vnet/vnet.go:251

Contributor Author

To be even more clear, even this no longer works. Since I removed promiscuous mode in the previous PR, VNet no longer handles packets destined for an IP address that hasn't been assigned yet, and outside of tests, no IPs are assigned yet. Things start actually working in #41031, where we assign an IP for DNS, and querying DNS for an app name assigns an IP to that app.


// VnetAdminSetupSubCommand is the sub-command tsh vnet uses to perform
// a setup as a privileged user.
VnetAdminSetupSubCommand = "vnet-admin-setup"
Collaborator

Suggested change
- VnetAdminSetupSubCommand = "vnet-admin-setup"
+ VNetAdminSetupSubCommand = "vnet-admin-setup"

To be consistent with nklaassen/vnet0.

Contributor Author

@ravicious and I have decided to capitalize as Vnet in code - otherwise JS gRPC codegen starts creating things called vNetXxxxx. Logs and errors still use VNet.

}

func configureOS(ctx context.Context, cfg *osConfig) error {
if cfg.tunIPv6 != "" && cfg.tunName != "" {
Collaborator

Consider inverting the condition here so that the return nil is indented and the body that does most of the work is not in a conditional.

@nklaassen nklaassen force-pushed the nklaassen/vnet1 branch 2 times, most recently from 07c8f8a to 08a8d97 Compare April 29, 2024 16:38
Base automatically changed from nklaassen/vnet0 to master April 30, 2024 01:26
This was referenced Apr 30, 2024
@nklaassen nklaassen requested a review from ibeckermayer May 2, 2024 18:27
@nklaassen nklaassen requested a review from ibeckermayer May 3, 2024 22:08
Contributor

@ibeckermayer ibeckermayer left a comment

Approved because it's all logically correct AFAICT. I left some suggestions for how to keep all the various concurrency mechanisms more readily understandable for future you / others. Your call whether to take them or leave them.

lib/vnet/vnet.go Outdated
Comment on lines 238 to 255
g.Go(func() error {
// When the context is canceled for any reason, the caller may have canceled it or one of the other
// concurrent tasks, destroy everything and quit.
<-ctx.Done()
close(m.destroyed)
m.linkEndpoint.Close()
err := trace.Wrap(m.tun.Close(), "closing TUN device")
allErrors <- err
return err
})
return trace.Wrap(g.Wait())
}

// Destroy closes the link endpoint, waits for all goroutines to terminate, and destroys the networking stack.
func (m *Manager) Destroy() error {
close(m.destroyed)
m.linkEndpoint.Close()
// Deliberately ignoring the error from g.Wait() to return an aggregate of all errors.
_ = g.Wait()
m.wg.Wait()
m.stack.Destroy()
return nil

close(allErrors)
return trace.NewAggregateFromChannel(allErrors, context.Background())
Contributor

Consider adding something like these comments / rearrangement to try and keep the order/reasons for all of the various concurrency mechanisms easy to recall going forward (note that I may have some things wrong here, this is just what I inferred based on studying the code):

	g.Go(func() error {
		// When the context is canceled for any reason (the caller may have canceled it or one of the other
		// concurrent tasks), ensure the `forwardBetweenTunAndNetstack` goroutine is stopped. (The `statsHandler`
		// goroutine will stop on its own when the context is canceled.)
		<-ctx.Done()

		// Close the link endpoint and TUN device to cause the `forwardBetweenTunAndNetstack` function to
		// return.
		m.linkEndpoint.Close()
		err := trace.Wrap(m.tun.Close(), "closing TUN device")

		allErrors <- err

		return err
	})
	// Deliberately ignoring the error from g.Wait() to return an aggregate of all errors.
	_ = g.Wait()
	// Close the destroyed channel to signal that the VNet is no longer running, and that any in
	// flight connections in the `stack` should be closed.
	close(m.destroyed)
	// Wait for all in flight connections to be closed.
	m.wg.Wait()
	// Now we can destroy the `stack` to free all resources.
	m.stack.Destroy()

	close(allErrors)
	return trace.NewAggregateFromChannel(allErrors, context.Background())

Contributor Author

thanks for the review! I added a bunch of comments in ebef36a

@nklaassen nklaassen enabled auto-merge May 6, 2024 23:21
@nklaassen nklaassen added this pull request to the merge queue May 6, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 6, 2024
@nklaassen nklaassen added this pull request to the merge queue May 7, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 7, 2024
@nklaassen nklaassen added this pull request to the merge queue May 7, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks May 7, 2024
@nklaassen nklaassen added this pull request to the merge queue May 7, 2024
Merged via the queue into master with commit 6ce4fdd May 7, 2024
39 checks passed
@nklaassen nklaassen deleted the nklaassen/vnet1 branch May 7, 2024 02:29
Labels
no-changelog, size/md, tsh
4 participants