Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add service cmd #3

Closed
wants to merge 12 commits into from
Closed

Add service cmd #3

wants to merge 12 commits into from

Conversation

djdv
Copy link
Owner

@djdv djdv commented May 25, 2021

The service command acts both as the server command for our service, but also has subcommands that interface with the system service manager.
Allowing the program to be registered with the host, and managed by it.

The service command may be called directly from the command line, either by a user to run interactively, or by the host to run in system service mode.
Client commands will try to find either of these instances by default. If not found, the executor will automatically try to launch an instance of the server. (The launched service shuts down and exits automatically later, if not busy)

Comment on lines 130 to 166
proc := cmd.Process
procState := cmd.ProcessState
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though these are just shorthand variables. They should stand out more than hide. The indented var draws attention to the area near the call to Start which is significant, but easy to gloss over when cluttered like this.
Besides, we use the multi-line var liberally throughout the code, so something something consistency too.

Suggested change
proc := cmd.Process
procState := cmd.ProcessState
var (
proc = cmd.Process
procState = cmd.ProcessState
)

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, it's probably better to at least move procState into the error encountered branch where it's used. I'll reconsider this whole section later.

@djdv djdv mentioned this pull request May 27, 2021
@djdv djdv force-pushed the cmd/fs branch 2 times, most recently from 443f194 to 3fc5ecf Compare May 28, 2021 13:32
@djdv djdv force-pushed the cmd/service branch 11 times, most recently from c4c1d44 to d07faed Compare May 30, 2021 20:12
@djdv
Copy link
Owner Author

djdv commented May 30, 2021

Ci is being a nuisance. Likely has to do with multi-line output from go mod edit -json being sent to set-output.
(ref)
I'm going to continue investigating this later, but unfortunately it's cluttering the thread (and spamming my inbox with failure reports).

Edit:
There's potentially undocumented escaping that needs to happen.
More info here: https://git.luolix.topmunity/t/set-output-truncates-multiline-strings/16852

I was trying to rely more on the action expressions than shelling out to binaries; but it's easier to just use jq here instead of storing the JSON output, converting it to a single-line and/or encoding special characters, then passing that to the fromJSON expression.

Ideally this would just work, but doesn't.

- name: Setup specific Go version
  uses: actions/setup-go@v2
  with:
    go-version: ${{ fromJSON($(go mod edit -json)).Go }}

Even when used from seperate steps, the JSON data is truncated if multi-line.

- name: Parse Go mod file
  id: go_mod_parse
  run: echo "::set-output name=go_mod_json::$(go mod edit -json)"
- name: Setup specific Go version
  uses: actions/setup-go@v2
  with:
    go-version: ${{ fromJSON(steps.go_mod_parse.outputs.go_mod_json).Go }}

With the setup seemingly working, I just need to link the jobs together correctly.
Right now it looks like we're setting up and discarding everything before invoking the tests.

@djdv djdv force-pushed the cmd/service branch 3 times, most recently from 4bc5f9b to 47215f7 Compare May 30, 2021 21:37
@djdv
Copy link
Owner Author

djdv commented May 30, 2021

CI seems good enough for now.
I'm going to look at it again later and move changes back into cmd/fs then rebase off that branch here.

Unfortunately, it looks like the service pkg may need another fix, this time for macOS.
In regards to that package I need to file 1 PR for systemd, fix another issue in the Windows PR (related to error handling on service Control methods.), figure out what's going on with macOS, and then check how things are on Solaris and the BSDs.

If it's not too difficult, those platforms should get a CI instance as well.
Whether that's some container on GitHub's hardware, or a self hosted instance somewhere.

@djdv djdv force-pushed the cmd/service branch 2 times, most recently from c4f42b5 to cf74628 Compare June 2, 2021 19:46
@djdv
Copy link
Owner Author

djdv commented Jun 3, 2021

Things in progress:

  • fs service doesn't work properly on macOS because of a bug in a dependency and an OS specific permission issue.
    • Problem detected, patches are pending.
    • Virtualized macOS graphics is painfully slow on non-Apple hardware.
  • fs service install needs a way to set the arguments used by the daemon (like --api) when the service starts. Current ideas:
    • Use the arguments of fs service install and persist them into the service's config file.
      • E.g. fs service install --api=x results in a service which starts with command line: fs service --api=x
    • Use a string argument like --serviceArguments.
      • E.g fs service install --serviceArguments="--api=x" results in a service which starts with command line: fs service --api=x.
  • The service library is missing some platform specific options (specifically for launchd).
    • The missing service options and their default settings need to be added to the service pkg and its templates.
    • The same options need to be added to our pkg as well, except with additional information (argument's data type, env-var lookup name, etc.)
  • How we find a local instance's maddr needs to be changed.
    • Needs to at least accommodate the macOS fix which uses a different UDS socket path.
  • How we construct the default system service config needs to be changed.
    • Needs to at least make launchd aware of the socket address we use.
  • Move the technical/rationale text in the section below into PR's, and maybe move the brainstorming text into discussion threads.

With the exception of bugfixes and refactoring, there shouldn't be much more to do in regards to spawning the daemon, or managing the system service.

fs service sets up a server which handles our API requests. And has subcommands to query the status of the server.
As well as commands to create and control a host system "service" which utilizes this server (usually as a daemonized process managed by the OS).
Most of this is tested on multiple platforms already with a few mocks in mind to add more coverage to some of the platform specific things.

When that's all out of the way, we should be able to start focusing on the portions of code which perform the actual file system binding. Giving the program the ability to fulfill the requests we've been preparing it to receive.
I.e. the fs mount portions of fs.

The journal version of basically the same message but very verbose: I broke the macOS builds with the systemd patch to the `service` library, but resolved that [in this diff](https://github.com/djdv/service/compare/eb90fecf335399bac9202b2df592a9cf36f207d1...3b4ca4320e3c98c8c923bae54a8a04540300ec75?diff=split). This needs to be split up later, and ideally moved into more upstream PRs.

The diff fixes the status issue with systemd on Linux (if the service is not installed fs service status says the service is not "running" when it should say it's not "installed" at all).
But launchd needs the same fix for things to work on macOS.
As well as a separate issue that needs to be resolved; the system level socket is not accessible to users.


Right now on macOS, we create the socket in /Library/Application Support/fs/server by default.
But .../fs' is created with permissions 0o700 which denies users from accessing it.

We can initialize the parent directory with more lax permissions on service start to resolve this (like we currently do for the Windows implementation), however; launchd offers a Sockets service property which we can configure (ref:launchd.plist(5)).

Using the service property would give the service manager better comprehension and control over our service, allowing for things like starting the service on demand rather than say on boot.
I believe users can do similar things with systemd on Linux as well.
So it seems like the more correct option to go with.

I tested this by modifying the plist directly, but there seems to be restrictions around SockPathName.
Using the macOS path /Library/Application Support/fs/server as a socket target has no effects on the socket, but using the conventional BSD path /var/run/fsservice as a target seems to have launchd create the socket with the same permissions I specified in SockPathMode.
So I'll have to check Apple's "File System Programming Guide" later to see where the most appropriate place to put a system-wide socket is, but It's likely going to be/remain /var/run.

Since the system service socket's path will change, our functions that look for it will have to change as well.
Currently, we use the value returned from xdg.ConfigFile on all platforms.


While looking into this I realized we'll probably want to preserve (non-OS specific) arguments passed to fs service install as well.
For example, consider an invocation like fs service install --api=$sockMaddr --KeepAlive=false --SockPathName=$sameAddr --stop-after=30s.

On a launchd system, this should construct a launchd plist which tells the service manager to spawn a socket that starts the service on demand (when a client tries to connect to it,) rather than starting it at boot.

We'd also retain the generic arguments provided like --api and --stop-after commands, and place them in the service configuration, so that they're used when the service starts.
(Right now the service is always called with a fixed set of default arguments.)

Combined, this would result in a launchd service that starts on demand and calls our daemon with its auto-stop arguments.
During the service manager's initialization, the OS will create the sockets we use to accept incoming requests, and flag the service as available, but won't execute the daemon's process until a request actually comes in.
In addition, the daemon process should be able to exit without flagging the service as unavailable.
(When the service socket is operated on, the service will be started again automatically/transparently if it decided to stop earlier).

*I know this is true for Start, but need to check how systemd and launchd deal with soft-stops like that. It might require patching the service pkg. If it does, we'll probably punt on that and be content with auto-start for now.


Considering the binary currently does nothing, none of this matters right now.
But this should still appeal to system operators.
With these fixes/features implemented, we gain the ability to compose native system services, and dependents of them, for multiple platforms. With a good deal of user configuration available.

For example, we could compose a hypothetical auto-mount service.
It would depend on our file system service, and itself wants to make requests to our API(, such as calling our mount method).
It would also be configured to use whatever triggers and conditions the system's service manager provides.

You can imagine creating this auto-mount service with the condition of starting after login (as opposed to starting on init).
And mounting an IPFS node via our API.
The implication of this is that a connection to an IPFS node wouldn't be established immediately on boot and maintained forever, it would only be established and mounted after login.
This could be paired with some kind of auto-unmount service triggered by logoff, or some other system event. To handle conditional teardown and freeing of resources in a way that make sense for that particular system.
etc.

Considering we don't have a config format defined yet, another example would be using your own.
A dependent service could be created which parses some config file, generates a list of mount requests, then sends them to the file system service. The author of the dependent service doesn't need to be concerned about starting the daemon, they only need to be concerned with making requests to it.
*Note this part is just for example's sake. When a config format is defined and used by us, it's not going to make sense to construct your own arguments for this purpose. You still could, but wouldn't have to.

@djdv
Copy link
Owner Author

djdv commented Jun 8, 2021

I made the modifications mentioned above to the service library we depend on.
It can now create and manage systemd+launchd services that utilize socket based activation.
But the changes aren't reviewed yet. I'll do that sometime later and put them into a PR in the source repo.
This text will get edited into links when that happens.

The code in this PR will need some changes to accommodate that feature too.
We only need to modify the daemon logic slightly, so that it receives and uses the handles provided by the system service manager (instead of the creating/using its own like it does now).

While that change is minor, I developed some concerns with the way settings data is handled in the program generally.
And want to amend some of that next.

Keeping with the trend, I put more info than necessary in the block below. Mostly notes for myself that might be needed later, but obviously anyone interested can read them too.

Journal

As with most programs, our API logic is set up to handle various parameters for the program.
If the user wants to use something besides a default value, they can just provide their own through a parameter.
The arguments of which come from various sources, like the command line, process environment, HTTP request, etc.

For example, the api parameter can be provided and used when hosting or connecting to the service daemon.
(E.g. fs service --api=/dns4/localhost/tcp/1234 to host an instance on some non-default address, fs list --api=/dns4/localhost/tcp/1234 to make requests to it)

For reasons like the patch above, the API surface might have to change.
--api needs to start accepting variadic arguments, instead of a singe value like it does now.
And any logic which depends on this value would have to be adapted too.

We use an abstraction CmdsParameterSet which is similar to (and interacts with) cmds.Options.
It defines a parameter's name, purpose, and methods to retrieve an argument value from various sources.
Instances of this type are currently being used when interfacing with the APIs we interact with (such as cmds, service, and the various OS-level APIs+formats like systemd+unit-file, et al.)

The CmdsParameterSet abstraction in its current form is pretty basic, and has already been useful.
However, not much thought has gone into it. Just enough to get fs built, and just flexible enough to make extensions not too painful.

We're going to have to start porting the mount command from the other repo soon, and that's likely going to warrant a persistent config file to go with it.
We also need to consider how clients (other than fs itself) can/will interface with our service daemon (and enable them to if they currently can't).
Both of these would benefit from refining the parameter logic, so now seems like a good time to go over those parts of the code more seriously. And we shouldn't have to worry about it later.


One of the concerns I have at the moment is around how we expose fundamental settings to a client.
For example, a list of addresses a client should try by default when looking for a daemon instance.
Right now, the way fs deals with this is unexposed magic. If something like go-ipfs needed to coordinate with us (and it will), it won't even know how to find us in the first place. Since we don't currently provide anything for this, it's up to the client developer to provide their own address value.

For --api specifically, we can easily define static values in text documentation, saying something like "try tcp port 1234 on localhost by default", but this isn't ideal.
What if the target system can't use the value we chose as the default?
(Port in use, protocol not supported, permission issues, etc.)
Consider also, platform specific features that could be taken advantage of, but wouldn't be used when default values target the lowest common denominator.
I.e. You wouldn't want to use a tcp socket for your service on a Unix machine if uds is available to you. This goes further with other protocol+system pairs like XPC on XNU. If we can use the best by default, then we should.

Something as simple as exporting fscmds.SomeName() []multiaddr.Multiaddr, and/or using mDNS, would alleviate the concern around the --api parameter specifically.
But the intent is to review how we define, transform, pass, etc. the whole set of parameters we use.


I'm am worried about Go's type system in relation to some of this. It may or may not get in the way of some of the generic patterns I have in mind.

E.g. Something concise like

type Parameter struct{}
var someParameter = &Parameter{...}
var someIntValue, provided, err = someParameter.ArgFrom(request)

is likely impossible to define when considering variable types.
We'd probably end up with something more type-explicit like this

type StringParameter {...}
type IntParameter {...}
...
var someParameter = &IntParameter{...}
var someInt, provided, err =  someParameter.From(request)

or some equally explicit combination. Like maybe on the method side instead.

someParameter.StringFrom(request)

Ideally I'll be happy if I can get something that looks closer to this

apiMaddr, provided, err := serviceMaddr.From(cmdline, env, conf, default)

Where serviceMaddr is a theoretical type of Parameter.
From is a method on it which takes an ordered list of "sources" to pull the value from, and returns a concrete type.
This seems more possible, but still likely impossible to define when strict typing is required.
Specifically, From needs to return a concrete type, and that type is going to be a different type for each value type we use in our parameters.

I'm going to mess around with this more later. I'm wondering if something conceptually similar to functional options would be applicable here.
E.g. something more like typeMagic(pointerToArgumentVariable interface{}) (provided bool, err error) { *magicArgPtr = Interface{*realConcreteValue} } that will accept an interface{} as if it were void*, and avoids returning the argument value altogether by just directly assigning to it.
Any magic required to make this happens would be internal to the function.
This forces the caller to pass in a typed reference, but removes the need for them to do a type assertion at the call site. And may be the best way to work around types in function signatures for this problem. *If we can find a way to do it that's both compiler legal and doesn't require accepting too many other compromises.

It's also possible (although unlikely) that we decide to use go generate somewhere in this.
Considering most of the boilerplate in this tree was synthesized with vim macros from spec text, we may be able to just make a template tool which does the same thing.
That way if some upstream API adds a parameter for its API, we'd literally just run go generate and go build, and the binary would contain any new parameters, without manual code changes/intervention.

Whether this is worth the time or not, I'm not sure. Likely not worth it since these things don't change much and anything important is already in here. Anything new could just be added "by hand" upon request from actual users when they actually want/need to use them with these packages.

@djdv
Copy link
Owner Author

djdv commented Aug 1, 2021

Edit: It looks like I dropped commits during the rebase. ee3c99e is missing a few changes from its parent branch cmd/fs. Going to redo it. The push should show up below when that happens.

Resetting the branch, basically just squashing everything and rebasing it on the other branch that was reset. Old history is here: dd78d01
The commit is one large commit that still needs to be broken up later, at least splitting the service command from its subcommands.
I'm also not happy with the testing code. It works and covers a lot but I'm willing to bet it can be laid out in a more sensible way and probably simplified. As-is it's just a big mess. I pretty much added things wherever it was convenient rather than writing proper units that can stand on their own.
Some of this likely can't be avoided, but I don't think that's true for all of it.
For example, removing the status pkg breaks the top level service tests because we use it to query the status in the tests.
Really, the service command itself should emit a response to use saying "I'm ready" instead.
And there's probably more odd things like this too, will need to review it.

For now though I want to spawn a new branch off of this one, so it's getting pushed here and will be worked on after some progress on the other branch (relating to configuration and the mount interfaces).


Responding to myself

Ideally I'll be happy if I can get something that looks closer to this
apiMaddr, provided, err := serviceMaddr.From(cmdline, env, conf, default)

Ended up with something close enough which I'm content with.

func fileSystemServiceRun(request *cmds.Request, _ cmds.ResponseEmitter, env cmds.Environment) error {
	var (
		ctx             = request.Context
		settings        = new(Settings)
		unsetArgsChan, errs = parameters.ParseSettings(ctx, settings,
			parameters.SettingsFromCmds(request),
			parameters.SettingsFromEnvironment(),
		)
		unsetArgsSlice, err = parameters.AccumulateArgs(ctx, unsetArgs, errs)
	)
	if err != nil {
		return err
	}
	...

We should be able to add something like
parameters.SettingsFromConfig(configFilePath) below this. (After the other args are processed first, so that we can also provide a global --config parameter to override a default or something like that).

@djdv
Copy link
Owner Author

djdv commented Aug 5, 2021

This all seems to work okay. There's some platform stuff that needs to be cleaned up, as well as general review and documentation.
But I'm going to focus on breaking this up a little more. As-is it's really easy to create a cyclical dependency between client and server commands, as well as hosting programs (fs itself) which reference both.

I'm going to try and move the cmds Executor and Environment into their own pkg, rpc. And move some of the server constants and settings into this pkg as well. This should alleviate some of this in a way that doesn't seem too terrible.
I'd like all the service relevant stuff to remain in the service pkg but the compiler already disagrees with this in relation to the subcommand status where I have to use a lambda to circumvent a cycle.
And while importing the old code here https://github.com/djdv/go-filesystem-utils/commits/tmp I realized a lot of the commands are going to have to reference service via the Environment interface, so it might make more sense to just have those constants and methods within that pkg (in this case mostly as methods on the Environment because this is how the cmdslib seems to be intended to work - it's hard to say because that library isn't really documented)

We'll see how this goes.

cmds lint
djdv added 3 commits August 9, 2021 12:54
82a961b had logical problems.
We'd never have arguments to supply from the user if we're calling this
function. And supplying them ourselves caused the service to expect
the service socket directory to already exist (which it usually won't).
Instead of telling the server what address to use and trusting it,
we use an address provided by the server and try to dial it.
@djdv
Copy link
Owner Author

djdv commented Sep 10, 2021

More journaling:

We'll see how this goes.

The separation using an intermediate ipc pkg is still a little gross, but I don't think this can be avoided.
It seems like on a linking level we need to have some seperation between the client and server commands, as well as an intermediate for shared code.
This puts basically all of the important responsibility in the ipc package, and the Environment type which exposes the remote environment's methods to client commands.


Despite the CI failure, the current commit has worked pretty well for what it's intended for.
The mount code from years ago was brought over from the go-ipfs repo with basically no effort required at all. The only changes that had to be made where related to CLI argument parsing. Moving away from manually parsing the request, to using the new parameters pkg which automates this. With the exception of the machine translating the import paths, and removing unused things, the actual manual changes needed was something like 20 lines.
The code for that currently lives in the "tmp" branch, but this will disappear eventually. It's just the unstable prototype code ported over from go-ipfs to this daemon.


The current issue with this code is mainly the shoddy tests. Which come as a result of poor orchestration capabilities.
As-is in this branch, the service command doesn't emit any values. So there's no way besides parsing stdio to know the status it during startup, or just waiting to test the socket it should be listening on.
In the tests we just assume that it's running if it doesn't immediately return an error.
We also don't/can't do a good job of waiting for it to stop. We can cancel it, and know that it will stop, but not when/that it had stopped.

While implementing the new version of the mount-related code in the development branch, I have made changes to the code in this PR to remedy these problems.
There's also an informal demo of that working here: https://www.youtube.com/watch?v=q-hKANpM9Xg


This PR/command however, is overly complicated. It deals both with the host OS service interfaces (like the service manager, systemd, launchd, et al.) as well as being the command which hosts/exposes our IPC API via a daemon process listening on sockets.

I want to split this up so that the service command and its subcommands continue to deal with the host OS's service management APIs. And move the daemon code somewhere else. Maybe as fs daemon, fs service daemon, or something else. But regardless of its command path, it will be a logically separate command/pkg. That fs service wraps in a way that interfaces with the conventions of the host OS.
For example, passing sockets from the OS to the daemon, printing the status to the syslog instead of to stdio, etc.
(These are handled by fs service currently but done so in a way I think is bad/unnecessarily complicated. The daemon shouldn't be concerned with these, only the wrapping command.)

fs service shall become something that only the OS will call. And fs service daemon may be called by users if they want to manually start and watch the daemon. But for normal use, we'll continue to spawn this process in the background when needed if it's not found.

All of this should simplify both the code and allow me to better test them separately.
So sometime soon I'm going to split the logic, refactor the tests, and reset this branch to omit the daemon code, putting that in a separate PR, with this on top of that.
After all this, focus will resume on trying to make sure the CLI/API makes sense and fixing up + testing the file system implementations and interfaces that I have in the development branch.

@djdv
Copy link
Owner Author

djdv commented Feb 12, 2022

The comment above was done but it's in need of review. A new PR will replace this one since it's full of me talking to myself with outdated information.

@djdv djdv closed this Feb 12, 2022
@djdv djdv mentioned this pull request Aug 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant