Separate config parsing from rest of the code #383

Closed
wants to merge 13 commits into from
252 changes: 252 additions & 0 deletions sydent/config/server.py
@@ -0,0 +1,252 @@
import configparser
import copy
import logging
import logging.handlers
import os
from typing import Dict

from twisted.python import log

from sydent.config.crypto import CryptoConfig
from sydent.config.database import DatabaseConfig
from sydent.config.email import EmailConfig
from sydent.config.general import GeneralConfig
from sydent.config.http import HTTPConfig
from sydent.config.sms import SMSConfig

logger = logging.getLogger(__name__)

CONFIG_DEFAULTS = {
Member

ok, so: you've said "Can be reviewed commit to commit" - but that only works when each commit actually stands alone, or almost alone, in its own right. In this case (ac78368) you've added a new lump of code which isn't used anywhere, so that doesn't seem to make sense. As a reviewer when I look at this commit on its own, should I assume this is all new code, and review it accordingly? And then: the individual config classes (CryptoConfig and friends) don't exist yet, so I can't really review anything touching that.

So the long and short of it is that I don't think this can be reviewed commit to commit.

Splitting up big changes so that they can be reviewed easily can be challenging, and generally takes a bit of thinking about. What you're aiming for is a change that is simple enough that anyone can look at it for a couple of minutes - even if they haven't been following the progress of change in the project or even in the branch - and say "sure, this is obviously correct". That approach helps anyone else looking at the changes (including yourself when you come back to it in a few weeks' time), but is also a useful way for you as the author to be sure that the changes you are making are correct (and will also help track down where any problems cropped up).

The implication is that I could take any of the commits on your branch, and the code would all work properly: I could fire up a working application, or run all the tests and lint and expect them to pass. You don't necessarily need to actually run them all, but it's a useful guideline to consider.

So it's generally not as simple as "add a load of new code in one commit, then switch to it in later commits". Rather, you need to look for individual bits of functionality you can move around.

As an example, in this case, you might start in your first commit by creating an empty SydentConfig class and passing it into the Sydent constructor alongside the existing config dictionary. Then you can gradually move bits of the configuration parsing into the new SydentConfig, so your second commit could create something simple like DatabaseConfig, and update SqliteDatabase to use DatabaseConfig instead of the dictionary. And so on.
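A minimal sketch of how that second commit might look, under stated assumptions: the classes below are simplified stand-ins rather than Sydent's real ones, and the `database_path` attribute and the constructor signatures are hypothetical. The point is only that the legacy dictionary-style config and the new object coexist while one consumer at a time is moved over.

```python
import configparser


class DatabaseConfig:
    """Hypothetical first section to migrate: it only knows the database path."""

    def parse_config(self, cfg: configparser.ConfigParser) -> None:
        self.database_path = cfg.get("db", "db.file")


class SydentConfig:
    """Empty in the first commit; gains one section per later commit."""

    def __init__(self) -> None:
        self.database = DatabaseConfig()

    def parse_config(self, cfg: configparser.ConfigParser) -> None:
        self.database.parse_config(cfg)


class SqliteDatabase:
    """Stand-in consumer, now reading from DatabaseConfig instead of cfg."""

    def __init__(self, database_config: DatabaseConfig) -> None:
        self.path = database_config.database_path


class Sydent:
    """Stand-in application object that accepts both old and new config."""

    def __init__(self, cfg: configparser.ConfigParser, config: SydentConfig) -> None:
        self.cfg = cfg  # legacy config, still used by unmigrated code
        self.config = config  # new config object, used by migrated code
        self.db = SqliteDatabase(config.database)


cfg = configparser.ConfigParser()
cfg.read_dict({"db": {"db.file": "sydent.db"}})

config = SydentConfig()
config.parse_config(cfg)
sydent = Sydent(cfg, config)
```

Each later commit would add one more section class and switch one more consumer, so the application keeps working at every step.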

Contributor Author

Thank you very much for taking time to look at this.

> So it's generally not as simple as "add a load of new code in one commit, then switch to it in later commits". Rather, you need to look for individual bits of functionality you can move around.

I see your point about making changes that work standalone, and I will have another go at this where I move individual bits of functionality in pieces. I can see how that would be much easier to review!

> The implication is that I could take any of the commits on your branch, and the code would all work properly: I could fire up a working application, or run all the tests and lint and expect them to pass. You don't necessarily need to actually run them all, but it's a useful guideline to consider.

Thank you, that's really clear advice!

I'm sorry for taking up an unnecessary amount of your time, and I will try to make my next attempt very clear!

Member

@Azrenbeth please don't apologise. This is great work, and it's not at all obvious how best to structure these things. You're doing great!

    "general": {
        "server.name": os.environ.get("SYDENT_SERVER_NAME", ""),
        "log.path": "",
        "log.level": "INFO",
        "pidfile.path": os.environ.get("SYDENT_PID_FILE", "sydent.pid"),
        "terms.path": "",
        "address_lookup_limit": "10000",  # Maximum number of addresses in a single /lookup request
        # The root path to use for loading templates. This should contain branded
        # directories. Each directory should contain the following templates:
        #
        # * invite_template.eml
        # * verification_template.eml
        # * verify_response_template.html
        "templates.path": "res",
        # The brand directory to use if no brand hint (or an invalid brand hint)
        # is provided by the request.
        "brand.default": "matrix-org",
        # The following can be added to your local config file to enable prometheus
        # support.
        # 'prometheus_port': '8080',  # The port to serve metrics on
        # 'prometheus_addr': '',  # The address to bind to. Empty string means bind to all.
        # The following can be added to your local config file to enable sentry support.
        # 'sentry_dsn': 'https://...'  # The DSN as configured in the sentry instance project.
        # Whether clients and homeservers can register an association using v1 endpoints.
        "enable_v1_associations": "true",
        "delete_tokens_on_bind": "true",
        # Prevent outgoing requests from being sent to the following blacklisted
        # IP address CIDR ranges. If this option is not specified or empty then
        # it defaults to private IP address ranges.
        #
        # The blacklist applies to all outbound requests except replication
        # requests.
        #
        # (0.0.0.0 and :: are always blacklisted, whether or not they are
        # explicitly listed here, since they correspond to unroutable
        # addresses.)
        "ip.blacklist": "",
        # List of IP address CIDR ranges that should be allowed for outbound
        # requests. This is useful for specifying exceptions to wide-ranging
        # blacklisted target IP ranges.
        #
        # This whitelist overrides `ip.blacklist` and defaults to an empty
        # list.
        "ip.whitelist": "",
    },
    "db": {
        "db.file": os.environ.get("SYDENT_DB_PATH", "sydent.db"),
    },
    "http": {
        "clientapi.http.bind_address": "::",
        "clientapi.http.port": "8090",
        "internalapi.http.bind_address": "::1",
        "internalapi.http.port": "",
        "replication.https.certfile": "",
        "replication.https.cacert": "",  # This should only be used for testing
        "replication.https.bind_address": "::",
        "replication.https.port": "4434",
        "obey_x_forwarded_for": "False",
        "federation.verifycerts": "True",
        # verify_response_template is deprecated, but still used if defined. Define
        # templates.path and brand.default under general instead.
        #
        # 'verify_response_template': 'res/verify_response_page_template',
        "client_http_base": "",
    },
    "email": {
        # email.template and email.invite_template are deprecated, but still used
        # if defined. Define templates.path and brand.default under general instead.
        #
        # 'email.template': 'res/verification_template.eml',
        # 'email.invite_template': 'res/invite_template.eml',
        "email.from": "Sydent Validation <noreply@{hostname}>",
        "email.subject": "Your Validation Token",
        "email.invite.subject": "%(sender_display_name)s has invited you to chat",
        "email.invite.subject_space": "%(sender_display_name)s has invited you to a space",
        "email.smtphost": "localhost",
        "email.smtpport": "25",
        "email.smtpusername": "",
        "email.smtppassword": "",
        "email.hostname": "",
        "email.tlsmode": "0",
        # The web client location which will be used if it is not provided by
        # the homeserver.
        #
        # This should be the scheme and hostname only, see res/invite_template.eml
        # for the full URL that gets generated.
        "email.default_web_client_location": "https://app.element.io",
        # When a user is invited to a room via their email address, that invite is
        # displayed in the room list using an obfuscated version of the user's email
        # address. These config options determine how much of the email address to
        # obfuscate. Note that the '@' sign is always included.
        #
        # If the string is longer than a configured limit below, it is truncated to that limit
        # with '...' added. Otherwise:
        #
        # * If the string is longer than 5 characters, it is truncated to 3 characters + '...'
        # * If the string is longer than 1 character, it is truncated to 1 character + '...'
        # * If the string is 1 character long, it is converted to '...'
        #
        # This ensures that a full email address is never shown, even if it is extremely
        # short.
        #
        # The number of characters from the beginning to reveal of the email's username
        # portion (left of the '@' sign)
        "email.third_party_invite_username_obfuscate_characters": "3",
        # The number of characters from the beginning to reveal of the email's domain
        # portion (right of the '@' sign)
        "email.third_party_invite_domain_obfuscate_characters": "3",
    },
    "sms": {
        "bodyTemplate": "Your code is {token}",
        "username": "",
        "password": "",
    },
    "crypto": {
        "ed25519.signingkey": "",
    },
}


class SydentConfig:
Member

It's a bit odd that all the modules in sydent/config are BaseConfig classes, except this one. I'd expect sydent/config/server.py to hold some sort of server-specific config.

Ideas (none of which I love):

  • Move all the BaseConfig classes down into a submodule (sydent/config/sections maybe)?
  • Move SydentConfig into sydent/__init__.py
  • Move SydentConfig into some other module in the top-level Sydent.

    def __init__(self):
        self.general = GeneralConfig()
        self.email = EmailConfig()
        self.database = DatabaseConfig()
        self.http = HTTPConfig()
        self.sms = SMSConfig()
        self.crypto = CryptoConfig()

        self.config_sections = [
Member

it'd be nice to get rid of this, and instead just have a _parse_config method which called section.parse_config(cfg) on each of general, email, etc.
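One way this suggestion could look is sketched below. It is an illustration only: `_Section` is a hypothetical stand-in for the real section classes, only two sections are shown, and the `options` attribute is invented for the example.

```python
import configparser


class _Section:
    """Hypothetical stand-in for GeneralConfig, EmailConfig, and friends."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.options = {}

    def parse_config(self, cfg: configparser.ConfigParser) -> None:
        # Copy this section's options so the example has something to inspect.
        self.options = dict(cfg.items(self.name))


class SydentConfig:
    def __init__(self) -> None:
        self.general = _Section("general")
        self.email = _Section("email")

    def _parse_config(self, cfg: configparser.ConfigParser) -> None:
        # Each section is named explicitly, so no config_sections list is needed.
        self.general.parse_config(cfg)
        self.email.parse_config(cfg)


cfg = configparser.ConfigParser()
cfg.read_dict(
    {
        "general": {"log.level": "INFO"},
        "email": {"email.smtpport": "25"},
    }
)

config = SydentConfig()
config._parse_config(cfg)
```

Both `parse_config_file` and `parse_config_dict` could then call `_parse_config` after building their `ConfigParser`, which would remove the duplicated loop.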

            self.general,
            self.email,
            self.database,
            self.http,
            self.sms,
            self.crypto,
        ]

    def parse_config_file(self, config_file: str):
        """Parse the given config from a filepath, populating missing items and
        sections
        Args:
            config_file (str): the file to be parsed
        """
        # If the config file doesn't exist, prepopulate the config object
        # with the defaults, in the right section.
        #
        # Otherwise, we have to put the defaults in the DEFAULT section,
        # to ensure that they don't override anyone's settings which are
        # in their config file in the default section (which is likely,
        # because sydent used to be braindead).
        use_defaults = not os.path.exists(config_file)
        cfg = configparser.ConfigParser()
        for sect, entries in CONFIG_DEFAULTS.items():
            cfg.add_section(sect)
            for k, v in entries.items():
                cfg.set(configparser.DEFAULTSECT if use_defaults else sect, k, v)

        cfg.read(config_file)

        # Logging is configured in cfg, but these options must be parsed first
        # so that we can log while parsing the rest
        setup_logging(cfg)

        for section in self.config_sections:
            section.parse_config(cfg)

            # Changes may need to be saved back to file (e.g. generated keys)
            if hasattr(section, "update_cfg") and section.update_cfg:
Member

this is pretty nasty. Can we just have section.parse_config return a boolean to indicate whether an update is needed?
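A rough sketch of the boolean-return idea, with assumptions called out: this `CryptoConfig` is a simplified stand-in, the `signing_key` attribute is hypothetical, and the "generated" key is a fixed placeholder string rather than a real key.

```python
import configparser


class CryptoConfig:
    """Simplified stand-in; real key generation is replaced by a placeholder."""

    def parse_config(self, cfg: configparser.ConfigParser) -> bool:
        """Parse the crypto section.

        Returns True if cfg was modified and should be written back to disk.
        """
        key = cfg.get("crypto", "ed25519.signingkey")
        if key != "":
            self.signing_key = key
            return False

        # No key configured: store a new one in cfg and report the change.
        self.signing_key = "placeholder-generated-key"
        cfg.set("crypto", "ed25519.signingkey", self.signing_key)
        return True


cfg = configparser.ConfigParser()
cfg.read_dict({"crypto": {"ed25519.signingkey": ""}})

crypto = CryptoConfig()
first_parse_changed = crypto.parse_config(cfg)   # key missing, cfg updated
second_parse_changed = crypto.parse_config(cfg)  # key now present, no change
```

With this shape the caller no longer needs `hasattr(section, "update_cfg")`: the return value carries the same information.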

                fp = open(config_file, "w")
                cfg.write(fp)
                fp.close()
Member

We should probably do this after we parse all the sections, rather than after each section?
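As a sketch of writing once after the loop, assuming each section's `parse_config` reports whether it changed `cfg` (the boolean suggested in the earlier review comment): the helper name `parse_sections`, the `_StubSection` class, and the in-memory output are all hypothetical and exist only to keep the example self-contained.

```python
import configparser
import io
from typing import Iterable, TextIO


class _StubSection:
    """Hypothetical section whose parse_config reports a fixed 'changed' flag."""

    def __init__(self, changed: bool) -> None:
        self.changed = changed
        self.parsed = False

    def parse_config(self, cfg: configparser.ConfigParser) -> bool:
        self.parsed = True
        return self.changed


def parse_sections(
    cfg: configparser.ConfigParser, sections: Iterable[_StubSection], out: TextIO
) -> bool:
    """Parse every section, then write cfg at most once."""
    needs_write = False
    for section in sections:
        # An explicit loop (rather than any() over a generator) makes sure
        # every section is parsed even after one has reported a change.
        if section.parse_config(cfg):
            needs_write = True

    if needs_write:
        cfg.write(out)
    return needs_write


cfg = configparser.ConfigParser()
cfg.read_dict({"general": {"server.name": "example.org"}})

sections = [_StubSection(True), _StubSection(False), _StubSection(True)]
written = io.StringIO()
did_write = parse_sections(cfg, sections, written)

untouched = io.StringIO()
did_not_write = parse_sections(cfg, [_StubSection(False)], untouched)
```

In the real method the `out` argument would presumably be the config file opened for writing, ideally inside a `with` block so it is closed even if the write fails.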


    def parse_config_dict(self, config_dict: Dict):
        """Parse the given config from a dictionary, populating missing items and sections

        Args:
            config_dict (dict): the configuration dictionary to be parsed
        """
        # Build a config dictionary from the defaults merged with the given dictionary
        config = copy.deepcopy(CONFIG_DEFAULTS)
        for section, section_dict in config_dict.items():
            if section not in config:
                config[section] = {}
            for option in section_dict.keys():
                config[section][option] = config_dict[section][option]

        # Build a ConfigParser from the merged dictionary
        cfg = configparser.ConfigParser()
        for section, section_dict in config.items():
            cfg.add_section(section)
            for option, value in section_dict.items():
                cfg.set(section, option, value)

        # This is only ever called by tests so don't configure logging
        # as tests do this themselves

        for section in self.config_sections:
            section.parse_config(cfg)


def setup_logging(cfg: configparser.ConfigParser):
    """
    Setup logging using the options selected in the config

    Args:
        cfg (ConfigParser): the configuration
    """
    log_format = "%(asctime)s - %(name)s - %(lineno)d - %(levelname)s" " - %(message)s"
    formatter = logging.Formatter(log_format)

    logPath = cfg.get("general", "log.path")
    if logPath != "":
        handler = logging.handlers.TimedRotatingFileHandler(
            logPath, when="midnight", backupCount=365
        )
        handler.setFormatter(formatter)

        def sighup(signum, stack):
            logger.info("Closing log file due to SIGHUP")
            handler.doRollover()
            logger.info("Opened new log file due to SIGHUP")

    else:
        handler = logging.StreamHandler()

    handler.setFormatter(formatter)
    rootLogger = logging.getLogger("")
    rootLogger.setLevel(cfg.get("general", "log.level"))
    rootLogger.addHandler(handler)

    observer = log.PythonLoggingObserver()
    observer.start()