Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pants-plugins/uses_services to check before running tests for required services (like mongo) #5864

Merged
merged 71 commits into from
Feb 3, 2023

Conversation

cognifloyd
Copy link
Member

@cognifloyd cognifloyd commented Jan 9, 2023

Background

This is another part of introducing pants, as discussed in various TSC meetings.

Related PRs can be found in:

Overview of this PR

This PR improves the DX (developer experience) by failing as early as possible if the development (or CI) environment does not have the required services.

This PR only checks for mongo, but I have draft branches to also add rabbitmq and redis.

Developer Experience

I do local development on my linux and mac laptops. Others use vagrant. Wherever you have your st2 sources, you must have mongo running in order to run some of our tests. I don't normally run mongo, so I often forget to start it before running tests. I would rather the tests bail out if the service is not running instead of giving me tons of logs to sift through to realize that mongo isn't running.

So, this PR focuses on failing as early as possible if the required services aren't running. On failure, the error message should be as actionable as possible to help guide new contributors to do the right thing without having to search our docs or ask for help in slack.

This does not run over and over. Pants has a daemon, pantsd, that runs in the background to handle results caching of the various test runs. The is_mongo_running check will re-run whenever pantsd restarts.

Developing pants plugins

Pants has extensive documentation on the plugin API, including how to create targets, how to write rules, and there's even a mini tutorial about how to add a formatter (adding formatters is a very common thing for plugins to do).

Architecture of the uses_services plugin

contents in this section:

  • uses field
  • raise ServiceMissingError when the service is not running
  • Platform dataclass and the rule and inspect_platform.py script that creates it
  • is_mongo_running.py script and the rule that runs it
  • wiring up the test goal so it runs is_mongo_running when pytest runs on a target with the uses field.
  • testing the rules and scripts

uses field

In BUILD files, we can add the uses field like this to say that these tests "use" mongo (so mongo should be running):

python_tests(
name="tests",
uses=["mongo"],
)

This field is defined here:

class UsesServicesField(StringSequenceField):
alias = "uses"
help = "Define the services that a test target depends on (mongo, rabbitmq, redis)."
valid_choices = supported_services

And gets added to the python_test and python_tests targets here:

PythonTestsGeneratorTarget.register_plugin_field(UsesServicesField),
PythonTestTarget.register_plugin_field(UsesServicesField),

raise ServiceMissingError when the service is not running

When a target has the uses field, some rules (described below) handle checking for running services like mongo. When the service is not running, the rule raises a ServiceMissingError which is defined here:

class ServiceMissingError(Exception):
"""Error raised when a test uses a service but that service is missing."""
def __init__(
self, service: str, platform: Platform, instructions: str = "", msg=None
):
if msg is None:
msg = f"The {service} service does not seem to be running or is not accessible!"
if instructions:
msg += f"\n{instructions}"
super().__init__(msg)
self.service = service
self.platform = platform
self.instructions = instructions

Wherever possible, the exception should include general instructions to help guide the developer towards getting the service running. Those instructions are platform-specific (eg development on Linux vs on Mac, or on Ubuntu vs on Rocky). So, the error needs to now the service name (eg mongo), a Platform descriptor dataclass (described below), and the instructions to show.

As I wrote the instructions for mongo, rabbitmq, and redis, I found that I was basically copy/pasting the same message, which meant that if I wanted to reword something, I had to change 3 copies of it, which is not pleasant. So, I wrote a little util, ServiceMissingError.generate() that generates an error with standardized instructions. The platform argument is the same as what the ServiceMissingError requires (described below) and the messages argument takes a dataclass with all of the strings that need to be interpolated in those standard instructions. That class is defined here:

@dataclass(frozen=True)
class ServiceSpecificMessages:
service: str
service_start_cmd_el_7: str
service_start_cmd_el: str
not_installed_clause_el: str
install_instructions_el: str
service_start_cmd_deb: str
not_installed_clause_deb: str
install_instructions_deb: str
service_start_cmd_generic: str

Here is what raising the error looks like for mongo:

# mongo is not running, so raise an error with instructions.
raise ServiceMissingError.generate(
platform=platform,
messages=ServiceSpecificMessages(
service="mongo",
service_start_cmd_el_7="service mongo start",
service_start_cmd_el="systemctl start mongod",
not_installed_clause_el="this is one way to install it:",
install_instructions_el=dedent(
"""\
# Add key and repo for the latest stable MongoDB (4.0)
sudo rpm --import https://www.mongodb.org/static/pgp/server-4.0.asc
sudo sh -c "cat <<EOT > /etc/yum.repos.d/mongodb-org-4.repo
[mongodb-org-4]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/${OSRELEASE_VERSION}/mongodb-org/4.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.0.asc
EOT"
# Install mongo
sudo yum -y install mongodb-org
# Don't forget to start mongo.
"""
),
service_start_cmd_deb="systemctl start mongod",
not_installed_clause_deb="this is one way to install it:",
install_instructions_deb=dedent(
"""\
sudo apt-get install -y mongodb-org
# Don't forget to start mongo.
"""
),
service_start_cmd_generic="systemctl start mongod",
),
)

Please review the messages that these strings get interpolated into in exceptions.py. These are effectively documentation to help developers have a better experience.

Platform dataclass and the rule script that creates it

Here is the platform dataclass:

@dataclass(frozen=True)
class Platform:
arch: str
os: str
distro: str
distro_name: str
distro_codename: str
distro_like: str
distro_major_version: str
distro_version: str
mac_release: str
win_release: str

inspect_platform.py is a simple script that collects the data about the running arch, os, and distro.

The rule that runs this script to collect the Platform details is defined in platform_rules.py

Normally, direct file access is a bad idea in pants rules. However, after experimentation, that turned out to be the cleanest/quickest way to handle loading a python file that is part of the plugin. In most rules, pants keeps track of which files that rule needs so it can watch for changes and restart the rule if that file changes. Because the file in question is part of the plugin that needs to use it, pants is already watching the file, so using a pants Get request is superfluous. So the rule grabs the contents of the script like this:

# pants is already watching this directory as it is under a source root.
# So, we don't need to double watch with PathGlobs, just open it.
with open(inspect_platform_full_path, "rb") as script_file:
script_contents = script_file.read()

Then it runs that script in a pex, which gets cached once per session (which is as long as pantsd is still running) according to this:

# this can change from run to run, so don't cache results.
cache_scope=ProcessCacheScope.PER_RESTART_SUCCESSFUL,

And then creates the Platform here:

platform = json.loads(result.stdout)
return Platform(**platform)

is_mongo_running.py script and the rule that runs it

Finally, is_mongo_running.py is the part that checks to see if mongo is running.

The pants plugin cannot import anything from our other st2* code, so is_mongo_running.py is a minimal self-contained script that mirrors how st2 connects to mongo. It should be as minimal as possible so that keeping it up-to-date with the core st2 code is not onerous.

The is_mongo_running.py rule gets opened and run in a pex with the same rule logic as inspect_platform.py (see above).

I avoided handling any kind of mongo auth; instead I had it use a mongo command that does not require auth, but still guarantees that mongo is actually running and accessible. For now, all of the tests hard-code the credentials and mongo connection settings like ssl. I do not expect this script to support auth in the future, but we might need to add TLS and other connection settings so we can support whatever mongo instance people want to use to run tests when developing locally. For now, it only supports the database connection settings hard-coded in the tests. Here is where the rule defines the default connection settings:

class UsesMongoRequest:
"""One or more targets need a running mongo service using these settings.
The db_* attributes represent the db connection settings from st2.conf.
In st2 code, they come from:
oslo_config.cfg.CONF.database.{host,port,db_name,connection_timeout}
"""
# These config opts currently hard-coded in:
# for unit tests: st2tests/st2tests/config.py
# for integration tests: conf/st2.tests*.conf st2tests/st2tests/fixtures/conf/st2.tests*.conf
# (changed by setting ST2_CONFIG_PATH env var inside the tests)
# TODO: for unit tests: modify code to pull db connect settings from env vars
# TODO: for int tests: modify st2.tests*.conf on the fly to set the per-pantsd-slot db_name
# and either add env vars for db connect settings or modify conf files as well
# with our version of oslo.config (newer are slower) we can't directly override opts w/ environment variables.
db_host: str = "127.0.0.1" # localhost in test_db.DbConnectionTestCase
db_port: int = 27017
# db_name is "st2" in test_db.DbConnectionTestCase
db_name: str = f"st2-test{os.environ.get('ST2TESTS_PARALLEL_SLOT', '')}"
db_connection_timeout: int = 3000

Here is the definition of the rule that runs is_mongo_running.py:

async def mongo_is_running(
request: UsesMongoRequest, platform: Platform
) -> MongoIsRunning:

So, when another rule Gets MongoIsRunning with a UsesMongoRequest, pants will also run the rule that generates Platform (described above), and then it will run this is_mongo_running rule.

The is_mongo_running rule either returns MongoIsRunning() if it is running, or raises ServiceMissingError if it is not. By raising an error, this actually breaks a convention for pants rules. Exceptions stop everything and get shown to the user right away, and for most goals, pants wants to see some dataclass returned that has something like a succeeded boolean instead. But, we want to error as early as possible, so this breaks that convention on purpose.

wiring up the test goal so it runs is_mongo_running when pytest runs on a target with the uses field.

The last piece that ties this all together is a rule that makes sure the is_mongo_running rule runs before pytest runs (if pants is running it on a target with the uses field). Here is the definition of the mongo_is_running_for_pytest rule:

async def mongo_is_running_for_pytest(
request: PytestUsesMongoRequest,
) -> PytestPluginSetup:

This rule is very simple. It just triggers running the is_mongo_running rule:

# this will raise an error if mongo is not running
_ = await Get(MongoIsRunning, UsesMongoRequest())
return PytestPluginSetup()

This rule needs the PytestUsesMongoRequest which selects targets that have the uses field.

class PytestUsesMongoRequest(PytestPluginSetupRequest):
@classmethod
def is_applicable(cls, target: Target) -> bool:
if not target.has_field(UsesServicesField):
return False
uses = target.get(UsesServicesField).value
return uses is not None and "mongo" in uses

This request will be made by the pants-internal pytest rules thanks to this UnionRule, which is a way to declare that our class "implements" the PytestPluginSetupRequest:

UnionRule(PytestPluginSetupRequest, PytestUsesMongoRequest),

If we need to add uses support to other targets, we will need to find similar UnionRules where we can inject our rule logic. For now, I've only wired this up so our uses rules will run for pytest.

testing the rules and scripts

I collected a set of Platform samples from a variety of machines, including the 4 OSes we officially support, for use in tests. We will need them for several tests, so I put them in data_fixtures.py

Then there are some tests that test running the scripts as part of testing the rules that run them.

There is at least one test for every new pants rule.

@cognifloyd cognifloyd added this to the pants milestone Jan 9, 2023
@cognifloyd cognifloyd self-assigned this Jan 9, 2023
@pull-request-size pull-request-size bot added the size/XL PR that changes 500-999 lines. Consider splitting work into several ones that easier to review. label Jan 9, 2023
@cognifloyd cognifloyd changed the title Pants plugins uses services Add pants-plugins/uses_services to check before running tests for required services (like mongo) Jan 10, 2023
@rush-skills
Copy link
Member

@cognifloyd Is this ready for review? One test seems to be still failing

Copy link
Member Author

@cognifloyd cognifloyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs some more work. There are a few bits I could use some help on (see the comment about getting platform samples).

This plugin is not necessary to release wheels. It is strictly to improve Developer Experience. So, I'm working on a pack_metadata plugin which is needed for that. I'll come back to this plugin later.

Also, I had a lot going on this past week so I wasn't able to work on the pants integration much. Hopefully I can wrap up the branches for the next 2 plugins this week.

pants-plugins/uses_services/exceptions.py Outdated Show resolved Hide resolved
pants-plugins/uses_services/mongo_rules.py Outdated Show resolved Hide resolved
pants-plugins/uses_services/mongo_rules_test.py Outdated Show resolved Hide resolved
@cognifloyd cognifloyd force-pushed the pants-plugins-uses_services branch 2 times, most recently from 833df33 to 451a4e6 Compare January 26, 2023 16:48
@cognifloyd cognifloyd marked this pull request as ready for review January 27, 2023 03:53
@cognifloyd
Copy link
Member Author

rebased

@cognifloyd cognifloyd requested review from amanda11 and a team February 1, 2023 17:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
database infrastructure: ci/cd pantsbuild size/XL PR that changes 500-999 lines. Consider splitting work into several ones that easier to review. tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants