Version update and JWT support #15

Open
wants to merge 419 commits into master
Conversation

fdelavega

This includes a merge of the NREL API Umbrella master branch, plus support for Keycloak and JWT.

GUI and others added 30 commits June 14, 2018 08:04
The timing of the keepalive tests is a bit finicky, so loosen them to fix
intermittent CI failures, since we don't necessarily care about the
strictness of this behavior.
If the "core" task was modified but the admin-ui and web-app's caches
were cleared, then the core task could fail to load. Persist the core's
dependencies, so the core task can run successfully in the CI
environment, even if the other normal build files are cleared.
Just to better tidy things up as we shift most of the scripts into
./tasks.
Build process and caching improvements
Under rare circumstances, some of the tests could fail because the
timestamps could get rounded incorrectly when converting to
milliseconds.

This usually cropped up when publishing ConfigVersion records to the
database: the millisecond "version" timestamp Ruby had within the test
suite didn't actually match the real time inserted into the database, so
the test suite would get stuck waiting for a ConfigVersion record that
would never actually be published.

As an example, the following Time.now could get generated:

2018-06-16T10:09:04.437

Which would get inserted into MongoDB as the following (correct)
timestamp:

1529158144437

However, calling `.to_f` on that same time object could incorrectly
result in the milliseconds being off slightly, due to random float
precision issues:

1529158144.438

So by relying on strftime and extracting the milliseconds field from the
time object, we can fix the floating point issues when we need to
convert Ruby times to milliseconds since epoch.
Add retry logic for a couple of other errors that could rarely crop up
during mongodb replicaset changes.

This helps fix rare, but intermittent test suite failures with the
mongodb replicaset tests.
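The retry pattern is roughly this shape. This is a sketch only; the real code would rescue the specific Mongo driver error classes seen during replicaset changes rather than rescuing broadly:

```ruby
# Hypothetical sketch of retrying transient errors during mongodb
# replicaset changes. Rescuing StandardError here stands in for matching
# the specific driver error classes.
def with_retries(attempts: 3, delay: 0)
  tries = 0
  begin
    yield
  rescue StandardError
    tries += 1
    if tries < attempts
      sleep delay
      retry
    end
    raise
  end
end

calls = 0
with_retries(attempts: 3) do
  calls += 1
  raise "transient replicaset error" if calls < 3
end
calls # => 3
```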
The tests that change Trafficserver config and require restarts (the DNS
and keepalive tests) were randomly failing in the CI environment. While I
wasn't able to precisely reproduce this locally, I believe the underlying
cause was the "sleep 1" after sending the kill signal to Trafficserver:
in some cases this wasn't long enough, so the tests were essentially
running before Trafficserver had actually restarted (meaning the expected
configuration was not yet in effect).

This attempts to fix it by properly waiting for the Trafficserver process
to restart, ensuring the PID actually changes before proceeding.

This also better abstracts some of the similar perpctl commands in the
test suite, so other restarts and signals are better handled.
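Waiting for the PID to change can be sketched like this. It's a hypothetical illustration: `read_pid` is passed in as a callable standing in for however the test suite actually reads the process's current PID (e.g. from a pidfile):

```ruby
# Hypothetical sketch of waiting for a restart to complete by confirming
# the PID actually changes, instead of a fixed "sleep 1". `read_pid` is a
# callable returning the process's current PID (or nil if not running).
def wait_for_pid_change(old_pid, read_pid, timeout: 10)
  deadline = Time.now + timeout
  loop do
    new_pid = read_pid.call
    # The restart is only complete once a different PID is observed.
    return new_pid if new_pid && new_pid != old_pid
    raise "Timed out waiting for process to restart" if Time.now > deadline
    sleep 0.1
  end
end
```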
Hopefully this will finally settle this test for good, since the memory
growth is occasionally reported upwards of 12000KB. So change some things
around, and check the overall memory of the process at the start to
better test for this when running in the full suite.
This follows up on the changes made in
27e73cd. While those fixes are
probably still good to have and eliminate other race conditions, it
turns out the real issue was related to sending the reload signal to
Trafficserver at the same time we were writing new templates.
Trafficserver by default tries to manage and write the config files
itself on config changes, so this could lead to races between the reload
signal writing a new config file, and API Umbrella installing a new
configuration file based on the template. To fix this, we disable
Trafficserver's ability to modify the configuration files itself, which
should always ensure our installed templates are the definitive version
of the config.

I was finally able to reproduce this issue locally (sporadically) by
preceding the DNS tests with some other tests (not sure why), and this
seems to finally fix things. Command used to reproduce locally, for
reference:

SEED=42546 bundle exec minitest test/proxy/test_mongodb_replica_set.rb test/proxy/dns/test_custom_server.rb
The badge didn't actually take into account the multiple files (e.g. both
package.json and Gemfile), only the first Gemfile. It also didn't
acknowledge the manually ignored issues and was still reporting them as
vulnerabilities (which they aren't).
The changes made in e3a3e78 to run
tests as root were due to lsof issues in Docker. However, this has
caused other issues, since some processes refuse to start as root.

So we'll go back to not running tests as root, and instead work around
the lsof issues by using "sudo" to execute lsof as the individual process
owners.
GUI and others added 28 commits March 9, 2019 09:15
In Trafficserver 8, the behavior for stale DNS usage has changed
slightly. If a valid NXDOMAIN result is returned, then the stale record
is expired and will no longer be used. Instead, stale records are only
used if the lookup actually fails. This new behavior seems fine (just
different), so we need to change our test suite to account for these
differences.
This is likely only an issue in our test suite, but if sending rapid
reload commands to the api-umbrella process, the TrafficServer config
could go missing for a short period of time, which could lead to
requests not being routed. This fixes it by only changing the config
files if they actually differ (to prevent unnecessary reloads by
TrafficServer), and then also ensuring that all config files are fully
written to disk before being moved into place (so processes don't ever
pick up on partially written config files).
This works regardless of the order in which processes start relative to
when the config files are re-written.
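The two safeguards from the commit above (skip unchanged writes, and write fully before moving into place) can be sketched as follows. `install_config` is a hypothetical helper name, not the actual function in the codebase:

```ruby
# Hypothetical sketch: skip the write when contents are unchanged
# (avoiding needless Trafficserver reloads), and write the full contents
# to a temp file before atomically renaming it into place, so readers
# never observe a partially written config file.
def install_config(path, contents)
  return false if File.exist?(path) && File.read(path) == contents

  tmp_path = "#{path}.tmp"
  File.open(tmp_path, "w") do |f|
    f.write(contents)
    f.fsync # flush the full contents to disk before the rename
  end
  File.rename(tmp_path, path) # rename is atomic within a filesystem
  true
end
```

The return value (whether anything changed) is what lets the caller decide if a reload signal is needed at all.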
[Snyk] Fix for 2 vulnerable dependencies
Use the version of chromedriver that's already installed on the CI
image, which should match the version of Chrome. Things are likely
breaking due to mismatched versions of Chrome and chromedriver.