-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[vnet] windows service stub #50468
[vnet] windows service stub #50468
Conversation
cc @ravicious, I'm once again refactoring a bunch of stuff |
b97bcdd
to
7295ac0
Compare
@ravicious @atburke @codingllama this is ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked through the code and it seems to work fine. I'll give it a go tomorrow in a VM and I'll check if the macOS version still works.
lib/vnet/escalate_windows.go
Outdated
// configureServicePermissions sets the security descriptor DACL on the service | ||
// such that the user who installed the service (identified by userSIDStr) is | ||
// allowed to start, stop, and query the service. | ||
func configureServicePermissions(service *mgr.Service, userSIDStr string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this assume that only a single user on the device can install and use the service?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed everything about installing the service and configuration permissions for now, will solve this later in a way that multiple users can install/use the service
lib/vnet/process_manager.go
Outdated
case <-pm.closed: | ||
// Errors are expected after the process manager has been closed, | ||
// usually due to context cancellation. | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What problems does this solve compared to the version on master? I assume it has to do with what errors are returned by Wait
, but by just looking at this file alone is hard to tell what changed in the whole process orchestration compared to master.
Do we just want Close()
to not make Wait()
return a "context cancelled" error? I know we've changed how we approach this specific error a couple of times, so I'd just want to have some clarity on how to approach it in the future. ;)
Lines 177 to 218 in 47bf4ca
func newProcessManager() (*ProcessManager, context.Context) { | |
ctx, cancel := context.WithCancel(context.Background()) | |
g, ctx := errgroup.WithContext(ctx) | |
return &ProcessManager{ | |
g: g, | |
cancel: cancel, | |
}, ctx | |
} | |
// ProcessManager handles background tasks needed to run VNet. | |
// Its semantics are similar to an error group with a context, but it cancels the context whenever | |
// any task returns prematurely, that is, a task exits while the context was not canceled. | |
type ProcessManager struct { | |
g *errgroup.Group | |
cancel context.CancelFunc | |
} | |
// AddCriticalBackgroundTask adds a function to the error group. [task] is expected to block until | |
// the context returned by [newProcessManager] gets canceled. The context gets canceled either by | |
// calling Close on [ProcessManager] or if any task returns. | |
func (pm *ProcessManager) AddCriticalBackgroundTask(name string, task func() error) { | |
pm.g.Go(func() error { | |
err := task() | |
if err == nil { | |
// Make sure to always return an error so that the errgroup context is canceled. | |
err = fmt.Errorf("critical task %q exited prematurely", name) | |
} | |
return trace.Wrap(err) | |
}) | |
} | |
// Wait blocks and waits for the background tasks to finish, which typically happens when another | |
// goroutine calls Close on the process manager. | |
func (pm *ProcessManager) Wait() error { | |
return trace.Wrap(pm.g.Wait()) | |
} | |
// Close stops any active background tasks by canceling the underlying context. | |
func (pm *ProcessManager) Close() { | |
pm.cancel() | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't feel strongly about this, for some reason while writing i wanted it not to return an error after the processmanager was closed, if you have another opinion i can revert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No opinion, just wanted to understand the reasoning.
tool/tsh/common/vnet_windows.go
Outdated
newVnetCommand(app), | ||
newVnetInstallServiceCommand(app), | ||
newVnetUninstallServiceCommand(app), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Connect will have to perform its own equivalent of those commands, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, the vnet service command will run when tshd re-execs itself, it should be handled by tsh fine
lib/vnet/escalate_windows.go
Outdated
return trace.Wrap(err, "querying admin service") | ||
} else { | ||
if status.State != svc.Running && status.State != svc.StartPending { | ||
return trace.Errorf("service stopped running prematurely, status: %v", status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tsh vnet
works fine on Windows. The macOS daemon works too.
At first I copied the wrong wintun.dll. I copied the arm64 version but my VM is actually amd64. This resulted in tsh vnet
returning this:
$ ./build/tsh.exe vnet
VNet is ready.
ERROR: running admin process in the background
service stopped running prematurely, status: {1 0 0 0 0 0 0}
Is it possible to convert the status to some kind of a text representation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah switched to %+v which gives a better view
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is pretty dense at times (Windows services are nobody's specialty) and holds a large number of related-but-seemingly unnecessary changes (800+ lines in 25 files is no joke). I would advise, in the future, that you split PRs like this into smaller, more focused parts.
Given the number of comments it's probably not worth splitting now, apart from tool/tsh changes which I do think could still be reasonably split.
7295ac0
to
83cda08
Compare
Back to draft for now, i'm taking a stab at chunking this down a bit |
83cda08
to
36bdd77
Compare
Moved the tsh changes in #50935 |
18bf244
to
b47a65a
Compare
9c88cae
to
1984e63
Compare
1984e63
to
a6f09c8
Compare
@codingllama @ravicious this is ready for review again, i did my best to remove unrelated changes and everything about service installation, it's about half the size now |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for refactoring!
lib/vnet/user_process_darwin.go
Outdated
// runPlatformUserProcess creates a network stack for VNet and runs it in the | ||
// background. To do this, it also needs to launch an admin process in the | ||
// background. It returns a [ProcessManager] which controls the lifecycle of | ||
// both background tasks. | ||
// | ||
// The caller is expected to call Close on the process manager to close the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little strange that the caller has to close the ProcessManager on error, but not on success - typically code goes the other way. Should we make sure the ProcessManager on errors here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this ended up weird when i kept this part here and moved process manager creation into the platform-specific functions. Now i've moved closing the ProcessManager down into the platform functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true that it's kind of weird, but I believe that's mostly because runPlatformUserProcess
in user_process_darwin.go
does a bunch of other operations that 1) that can fail independently of ProcessManager
2) depend on ProcessManager
to start its tasks. And also 3) the tasks managed by ProcessManager
are expected to continue running after runPlatformUserProcess
exits.
runPlatformUserProcess
creates a "server" (through a task handled by ProcessManager
) as well as a "client" which then contacts the server. If the client doesn't get a response from the server, we have no user for the server, hence we stop the server.
Given those properties, I'm not sure if there's a better way to manage that? Looking at some other prior art would be helpful, I do have to admit that it's hard to reason about this stuff!
* [vnet] Windows service * address alan's comments * fix typo in comment
* [vnet] add windows tsh cli commands (#50935) * [vnet] add windows tsh cli commands This PR refactors the VNet tsh CLI commands and adds stubs for the commands that are going to be added for Windows VNet support. * fix linux build * update copyright year * use context.AfterFunc * fix typo * [vnet] windows service stub (#50468)
This commit adds a Windows service for VNet. Currently all the service does is create a TUN interface, it doesn't yet actually handle any networking or establish any IPC, the rest will come in later PRs.
Installation of the service will eventually be handled by the Connect installer.
If you want to test this out on a Windows machine/VM you will need to:
tsh vnet -d
sc.exe query TeleportVNet
netsh interface show interface
logs.txt
in the directory wheretsh
is installed (this is temporary until I find a better place for logs).