Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environment variable support for configuration #1049

Closed
RyanSquared opened this issue May 25, 2020 · 23 comments
Closed

Environment variable support for configuration #1049

RyanSquared opened this issue May 25, 2020 · 23 comments
Milestone

Comments

@RyanSquared
Copy link

thank. ❤️

cc @daurnimator

@DanielOaks
Copy link
Member

DanielOaks commented May 25, 2020

Hmm, as in an environment variable setting the location of the config file, or specific config entries being populated by env variables? If the latter, there may be a way to do that already via yaml's weird builtins

@slingamn slingamn added this to the v2.2 milestone May 25, 2020
@slingamn
Copy link
Member

The second. I think kodah might have mentioned something relevant?

@DanielOaks
Copy link
Member

Yeah if we can make it clean and simple then coolcool. Having a separate blah-env-variable entry in the config file is uuugly and probably confusing.

If we don't find a nice clean yaml way to do this and still want to support it: Rather than e.g. what we did for the secret cloak key env var, if we do end up supporting this we can just skip the config file altogether and have ora just load the ORAGONO_BLAH env vars if they exist and ignore those entries in the config file. have a really obvious warning message on startup, e.g. BLAH, BLAH, and BLAH are being loaded from ENVIRONMENT VARIABLES and NOT the config file.. If we want to we could even add a command line flag like --ignore-env-vars to force us to just ignore all of them.

@slingamn slingamn modified the milestones: v2.2, v2.3 Jun 28, 2020
@RyanSquared
Copy link
Author

if we do end up supporting this we can just skip the config file altogether and have ora just load the ORAGONO_BLAH env vars if they exist and ignore those entries in the config file.

That would be ideal for most things. However, I do see one issue where this becomes a bit difficult, which is the list/dict variables, like the opers config. It might also be useful to allow multiple configuration files (or an "include" option) instead of env vars. The reason this came about was because we have a Kubernetes setup, and I'd like to have one file which is the base config (i.e. allow from local traffic, configure where to load TLS certs from, base amount of connections, etc.), and I'd like to have another file which is instance-specific, such as the network/server name, allowed origins for websockets, IP cloaking address, and oper classes/oper blocks.

This would mean users would only have make a ConfigMap for the things they want to override, like the above variables.

I realize that this may also complicate things, but I think this may be a cleaner solution given you'd otherwise have to deal with how to load lists/maps from environment variables.

@slingamn
Copy link
Member

slingamn commented Jul 8, 2020

Sounds like you could do this externally with templating/scripting?

@slingamn
Copy link
Member

slingamn commented Jul 8, 2020

A real-world example: our integration test suite can modify the config programmatically from Python, then dump it as JSON for consumption by Oragono (since valid JSON is valid YAML).

@RyanSquared
Copy link
Author

Sounds like you could do this externally with templating/scripting?

I would like to avoid overriding the entrypoint of the oragono container, and there's not really a way to automatically template or script configs without adding all the necessary components into the Docker container.

@flavienbwk
Copy link

flavienbwk commented Jul 18, 2020

Anyway the docker-compose.yml file should be at the root of the project with environment variables defining the important variables of the server (the standard way to use docker-compose).

Currently, it is not obvious how to launch the project with docker-compose.

@csmith
Copy link
Contributor

csmith commented Jul 20, 2020

I'm a fan of twelve-factor apps -- I think configuration is the only facet of that where Ora really falls down.

Using something like https://github.com/caarlos0/env it could be fairly straight-forward to add some extra tags to the config struct and some parsing logic. I'm not sure how it should interact with the yaml config if it exists - the easiest way is probably to just have a flag to switch entirely between them. Then the docker entrypoint can default to using that flag and the compose file can include the necessary vars.

If there were strong objections to doing that then it could be done in the existing entrypoint script which already does some config munging to set the oper password. I feel like that would get very messy very quickly, but it'd be nicer than end-users trying to do their own versions and having to override the entrypoint and keep it up-to-date. (I'd much rather get rid of that shell script entirely rather than make it more important, so we can avoid having a shell and all its accoutrements in the container).

@clukawski
Copy link
Contributor

This looks like a templating problem to me, tbh

Just my 2c

@slingamn
Copy link
Member

I'll think more about this; I'm not really up to date on best practices with Docker at all.

In particular, I don't understand the tradeoffs between putting things "inside" and "outside" the container. The ircd.yaml file ends up on a "volume" outside the container, right? So in principle, couldn't it be templated using tools that exist outside of the container? What's the role of the entrypoint in all this?

My main concern is that environment variables don't seem like they can be as expressive as YAML, or at least not without sacrificing cleanliness and visibility. If I squint, I can see a solution where we have environment variables like ORAGONO_config.server.websockets.allowed-origins, and then the value that variable takes is YAML or JSON that gets deserialized into Config.Server.Websockets.AllowedOrigins (there are some implementation challenges here). But it's hard for me to see this as a full replacement for the current YAML config file; it seems really cumbersome and unmaintainable to do a line-by-line translation of a fully functional oragono config into environment variables.

@slingamn
Copy link
Member

Oh, by the way, any thoughts on the narrow issue of moving docker-compose.yml to the root of the project?

@csmith
Copy link
Contributor

csmith commented Jul 22, 2020

In particular, I don't understand the tradeoffs between putting things "inside" and "outside" the container. [...]

Environmental variables allow for small/incremental changes, and are natively supported by container orchestrators like kubernetes. You can maintain plain text config files outside of that, and you can write some tooling to automate that, but the more you go that way the more you're going against the grain and losing benefits.

e.g. you can get ora running right now by doing docker run -p 6697:6697 oragono/oragono. If you then wanted to change the server name from "oragono.test" to "irc.example.com" and ora supported env vars it'd be something like:

docker stop $CONTAINER
docker run -p 6697:6697 -e SERVER_NAME=irc.example.com oragono/oragono

whereas with a config file it's:

docker cp $CONTAINER:/some/path/I/don't/remember/config.yml .
docker stop $CONTAINER
vi config.yml
// Edit the config, hope you don't make yaml syntax errors, etc
docker run -p 6697:6697 -v ./config.yml:/some/path/etc/config.yml oragono/oragono

Running multiple instances with small differences (e.g. different hostnames) is obviously a lot easier with env vars, as is keeping track of state (I can see exactly what env vars each container is running with, whereas I can only see the current state of the config file and hope the instances have been rehashed since it last changed). Finally, upgrading feels a lot nicer with env vars (the base image defines some new defaults, I override them if I want) vs config files (I have to manually merge the new config fields, or start from a new default config, or bury my head in the sand and ignore them; if the config file has incompatible changes I need to manage the editing and the upgrading of the containers in sync, etc).

My main concern is that environment variables don't seem like they can be as expressive as YAML, or at least not without sacrificing cleanliness and visibility

Yes, for sure. I wouldn't try and configure, say, haproxy using env vars, but I think Ora's config is not actually that complex? If you flatten all the namespacing I think there's only a few bits that wouldn't fit straight into a KEY=value or KEY="value1 value2 value3" structure. I'm having a hard time deciding how messy that would make the current config code without actually giving it a try...

What's the role of the entrypoint in all this?

The entrypoint is just the executable that's started when the container starts. Ideally for a go app it'd just be the statically compiled binary, and the image would have almost nothing else in it.

In practice, we have a small script as the entrypoint that (if needed) installs a default config, generates a random oper password, creates self-signed certs, then launches oragono itself. If we didn't want Ora itself to worry about environment variables, this script could potentially edit the config if there are env vars present, but that brings its own world of mess (and would probably be better done in python not shell script, which then drags python into the image as well...)

Oh, by the way, any thoughts on the narrow issue of moving docker-compose.yml to the root of the project?

I'm personally ambivalent. I'd usually put it at the root for my projects, but it's no skin off my nose if it's elsewhere and pointed at by the README (as it is now for Ora). Some software has example compose files inline in the docs or on the website, it's not a hard and fast rule that it should exist in the root of the repo.

(Thank you for attending my TED talk on environment variables.)

@slingamn
Copy link
Member

Thanks for this very detailed analysis, I learned a lot :-)

This side-by-side comparison (for changing the server name) seems to be eliding details that would be necessary for actual productionization. You wouldn't actually have important configuration data floating around on a command line somewhere, you'd have it in some kind of versioned file (I think this is the ConfigMap concept in k8s?) So in the end the workflows are going to look much more similar than in this example.

Finally, upgrading feels a lot nicer with env vars (the base image defines some new defaults, I override them if I want) vs config files (I have to manually merge the new config fields, or start from a new default config, or bury my head in the sand and ignore them; if the config file has incompatible changes I need to manage the editing and the upgrading of the containers in sync, etc).

I don't think this accurately reflects real-world issues with upgrading Oragono. In particular:

  1. The forwards and backwards compatibility guarantees for config files are fairly strong; most releases do not require any changes
  2. Typically, when a new section of the config is added, it gates new functionality that operators may not want to enable by default. The idea is that upgrades should be as safe and nondisruptive as possible.
  3. When the proper way to minimize disruption is to enable the field when unset, or to give it a default value, we have the capability to do this as well (e..g, recover-from-errors and force-trailing)
  4. On the rare occasions when we ship genuinely backwards-incompatible changes (for example, the transition from server.listen to server.listeners), it seems likely that the compatibility issues would affect environment variables as well, except in the case where the user has not modified that section of the config

Anyway, as long as I can get it to work in a reasonably clean way with reflect, I'm OK with implementing the scheme described earlier (where ORAGONO_config.server.websockets.allowed-origins contains actual YAML that can be deserialized into the relevant field). Would that be sufficient?

@slingamn
Copy link
Member

I think Ora's config is not actually that complex? If you flatten all the namespacing I think there's only a few bits that wouldn't fit straight into a KEY=value or KEY="value1 value2 value3" structure.

Oper blocks (which came up earlier as part of hashbang's use case) are a good example of something that doesn't fit neatly into this framework --- hence Ryan's interest in "include" directives or the equivalent.

@slingamn
Copy link
Member

slingamn commented Aug 3, 2020

@csmith: just wanted to get your opinion on the proposal (ORAGONO_config.server.websockets.allowed-origins contains YAML that can be deserialized into the relevant field)

@daurnimator
Copy link

@csmith: just wanted to get your opinion on the proposal (ORAGONO_config.server.websockets.allowed-origins contains YAML that can be deserialized into the relevant field)

I don't think this is a good idea. Usually you'd see e.g. comma separated values; or perhaps JSON; or they're left out of env var support.

@slingamn
Copy link
Member

slingamn commented Aug 4, 2020

All valid JSON is valid YAML, so that seems fine, no?

@slingamn slingamn modified the milestones: v2.3, v2.4 Aug 23, 2020
@slingamn
Copy link
Member

I'd like more feedback on above proposal if this is going to make it into 2.4 :-)

@csmith
Copy link
Contributor

csmith commented Oct 22, 2020

TL;DR: Sounds great! 👍

Sorry for not replying earlier. I was completely swamped with work at the time and then, well, forgot about it :)

This side-by-side comparison (for changing the server name) seems to be eliding details that would be necessary for actual productionization.

Yes, fair point, it's slightly more complex, but it still basically comes down to something like "configure things via a standard key/value method" vs "figure out whatever particular config format this app is using, where it needs to be placed on the file system, etc". Honestly I'm probably being a bit quixotic here - if you're setting up oragono it's likely something you care about and don't mind spend time configuring -- and the config format is actually quite nice -- as opposed to, say, a random dependency-of-a-dependency that requires you to configure it in XML and you really can't be bothered with that kind of nonsense.

I don't think this accurately reflects real-world issues with upgrading Oragono.

Again, fair points. I was speaking in mostly hypotheticals and you've taken good care to avoid the problems there.

Anyway, as long as I can get it to work in a reasonably clean way with reflect, I'm OK with implementing the scheme described earlier (where ORAGONO_config.server.websockets.allowed-origins contains actual YAML that can be deserialized into the relevant field). Would that be sufficient?

As a minor point, I'd much prefer if they could be styled like ORAGONO_SERVER_WEBSOCKETS_ALLOWED_ORIGINS (as in, uppercase and with underscores instead of other punctuation).

Using YAML values seems fine to me. I believe most of the simple options will "just work" because plain strings/numbers/bools are all valid YAML? e.g. ORAGONO_SERVER_STS_ENABLED=true or ORAGONO_SERVER_NAME=irc.example.com.

If you do want to put the more complicated stuff in environment variables it's not actually that inelegant if it's yaml on both ends (which is true for docker-compose and k8s). You can do something like:

  environment:
    ORAGONO_OPERS: |
      chris:
        class: "server-admin"
        whois-line: "etc"
      slingamn:
        class: "etc"
    ORAGONO_SOMETHING_ELSE: "..."

(Where the | operator makes everything under it into a scalar. YAML is such a weird beast sometimes...)

Obviously that's pushing what's sensible and personally I'd probably be reaching for the proper config at that point, but it works and isn't hideous.

I don't think this is a good idea. Usually you'd see e.g. comma separated values; or perhaps JSON; or they're left out of env var support.

I think it's fine to have the option. The stuff on the more complex-end of the spectrum like oper blocks won't be everyone's cup of tea, but strings/bools/ints/etc will just work, and lists using the JSON-esque syntax of [foo, bar, baz] aren't a big deal IMO.

@slingamn
Copy link
Member

Thanks!

As a minor point, I'd much prefer if they could be styled like ORAGONO_SERVER_WEBSOCKETS_ALLOWED_ORIGINS (as in, uppercase and with underscores instead of other punctuation).

The motivation for the other syntax was that it would be easier to parse --- in particular, you could split on ., whereas with your example syntax it's hard to tell whether a _ indicates a new field access or is just a replacement for a space in a field name.

@csmith
Copy link
Contributor

csmith commented Oct 22, 2020

Ah, yeah, I wasn't sure if you were planning to iterate the env vars and map them onto the config struct, or iterate the config struct and look for matching env vars. . isn't a deal breaker, it's just a bit unusual (most shells won't let you easily export an env var with a . in it, but it is doable and technically valid so shouldn't be an issue when configuring via docker/k8s/etc).

Alternatively what about dropping dashes entirely, then _ becomes your splitable divider? e.g. ORAGONO_CHANNELS_DEFAULTMODES. Makes the matching a bit more fiddly than using dots, but a lot easier than the quoted example where you can't tell where to split.

@slingamn
Copy link
Member

Ah, good point about .. It looks like bash doesn't like - either.

I'm looking for a solution where no special casing is required at all, and the config handling code can process environment variables based only on the preexisting struct tags and the reflect package.

I think this would work for me: ORAGONO__SERVER__WEBSOCKETS__ALLOWED_ORIGINS , i.e., two underscores is the field separator, and then the field names just have to be converted from "screaming snake case" to "kebab case" (ALLOWED_ORIGINS to allowed-origins).

slingamn added a commit to slingamn/ergo that referenced this issue Oct 23, 2020
slingamn added a commit to slingamn/ergo that referenced this issue Oct 23, 2020
slingamn added a commit to slingamn/ergo that referenced this issue Oct 23, 2020
slingamn added a commit to slingamn/ergo that referenced this issue Oct 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants