add stress test #74
… codes/times (relates to Ouranosinc/Magpie#433)
Great idea! We have zero stress tests so far. This is a very good start.
I'd like to propose better reporting so all the info is immediately available in the Jenkins console output. Otherwise, all good.
notebooks/stress-tests.ipynb
"\n",
"for bird in [\"finch\", \"flyingpigeon\", \"raven\"]:\n",
"    bird_url = f\"{TWITCHER_URL}/{bird}/wps?service=wps&request=getcapabilities\"\n",
"    assert stress_test_requests(bird_url, runs=100, max_avg_time=1) == 0, \"Failure condition encountered\""
Could we expand this to something like below for better reporting? This run failed https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/stress-test/1/console and it's impossible to determine the root cause from the Jenkins console output.
stress_test_result = stress_test_requests(bird_url, runs=100, max_avg_time=1)
if (stress_test_result.result == 1):
    # example: "Failed: some non 200 response code observed within the total 100 runs: 401, 401, 401, 409, 500, 502"
    raise AssertionError(f"Failed: some non {stress_test_result.code} response code observed within the total {stress_test_result.runs} runs: {stress_test_result.bad_errors_codes}")
if (stress_test_result.result == 2):
    # example: "Failed: avg_time (1.034s) > max_avg_time (1s)"
    raise AssertionError(f"Failed: avg_time ({stress_test_result.avg_time}s) > max_avg_time ({stress_test_result.max_avg_time}s)")
This implies changing the function stress_test_requests to return a dataclass instead, which I think is not too much work since all the info is already available; it just has to be shuffled into the dataclass.
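A minimal sketch of the dataclass idea, not the actual notebook code. The field names are taken from the suggestion above (`result`, `code`, `runs`, `bad_errors_codes`, `avg_time`, `max_avg_time`); the class name `StressTestResult` and the helper `raise_on_failure` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StressTestResult:
    # Field names follow the review suggestion; the real implementation may differ.
    result: int          # 0 = passed, 1 = unexpected HTTP codes, 2 = average time too long
    code: int            # expected HTTP status code
    runs: int            # number of requests sent
    avg_time: float      # observed average request time, in seconds
    max_avg_time: float  # allowed average request time, in seconds
    bad_errors_codes: List[int] = field(default_factory=list)


def raise_on_failure(res: StressTestResult) -> None:
    """Hypothetical helper: turn a failed result into a descriptive AssertionError."""
    if res.result == 1:
        codes = ", ".join(str(c) for c in res.bad_errors_codes)
        raise AssertionError(
            f"Failed: some non {res.code} response code observed "
            f"within the total {res.runs} runs: {codes}"
        )
    if res.result == 2:
        raise AssertionError(
            f"Failed: avg_time ({res.avg_time}s) > max_avg_time ({res.max_avg_time}s)"
        )
```

Keeping every value on one object means the failure message can be built entirely from the result, which is what makes the Jenkins console output self-explanatory.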
Yes should be easy enough. Good idea.
@tlvu
You can see in the latest commit. Updated to use dataclass and reporting the error/results as suggested.
This is embarrassing, I've run this test on a bunch of servers https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/stress-test/ and our prod is the one failing while the other test servers passed and we do not know which root cause (not |
… better messages per status if failing
Much better reporting in Jenkins now, thanks ! Minor changes requested.
For completeness, here is a report when a failure is detected, on our production instance again :( At iteration 51 there is a 500 return code that took 0.079s, much shorter than the regular request time.
https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/stress-test/8/console
16:21:30 ---------------------------------------------------------------------------
16:21:30 AssertionError Traceback (most recent call last)
16:21:30 <ipython-input-3-90969fe311b4> in <module>
16:21:30 8 results = stress_test_requests(bird_url, runs=100, code=expect_status_code, max_avg_time=expect_max_avg_time)
16:21:30 9 if results.status == 1:
16:21:30 ---> 10 raise AssertionError(f"Detected non HTTP {expect_status_code} codes.\n{results!s}")
16:21:30 11 if results.status == 2:
16:21:30 12 raise AssertionError(f"Detected regression with long request time.\n"
16:21:30
16:21:30 AssertionError: Detected non HTTP 200 codes.
16:21:30 Stress Test:
16:21:30 Test:
16:21:30 code: 200
16:21:30 runs: 100
16:21:30 max-errors: 0
16:21:30 Request:
16:21:30 method: GET
16:21:30 url: https://pavics.ouranos.ca/twitcher/ows/proxy/flyingpigeon/wps?service=wps&request=getcapabilities
16:21:30 args: {}
16:21:30 Times:
16:21:30 min: 0.079s
16:21:30 avg: 0.161s
16:21:30 max: 0.210s
16:21:30 Results:
16:21:30 Index Codes Delta Times
16:21:30 0 200 0.000s 0.165s
16:21:30 1 200 0.038s 0.151s
16:21:30 2 200 0.067s 0.143s
16:21:30 3 200 0.098s 0.175s
16:21:30 4 200 0.011s 0.159s
16:21:30 5 200 0.067s 0.154s
16:21:30 6 200 0.052s 0.170s
16:21:30 7 200 0.090s 0.170s
16:21:30 8 200 0.098s 0.185s
16:21:30 9 200 0.064s 0.175s
16:21:30 10 200 0.041s 0.155s
16:21:30 11 200 0.048s 0.184s
16:21:30 12 200 0.055s 0.196s
16:21:30 13 200 0.012s 0.159s
16:21:30 14 200 0.004s 0.179s
16:21:30 15 200 0.032s 0.138s
16:21:30 16 200 0.100s 0.152s
16:21:30 17 200 0.050s 0.164s
16:21:30 18 200 0.096s 0.145s
16:21:30 19 200 0.096s 0.155s
16:21:30 20 200 0.003s 0.157s
16:21:30 21 200 0.012s 0.135s
16:21:30 22 200 0.050s 0.162s
16:21:30 23 200 0.032s 0.150s
16:21:30 24 200 0.040s 0.148s
16:21:30 25 200 0.071s 0.131s
16:21:30 26 200 0.054s 0.146s
16:21:30 27 200 0.072s 0.140s
16:21:30 28 200 0.036s 0.152s
16:21:30 29 200 0.022s 0.155s
16:21:30 30 200 0.031s 0.143s
16:21:30 31 200 0.041s 0.153s
16:21:30 32 200 0.012s 0.156s
16:21:30 33 200 0.084s 0.154s
16:21:30 34 200 0.061s 0.169s
16:21:30 35 200 0.005s 0.173s
16:21:30 36 200 0.089s 0.170s
16:21:30 37 200 0.035s 0.155s
16:21:30 38 200 0.055s 0.164s
16:21:30 39 200 0.005s 0.198s
16:21:30 40 200 0.030s 0.169s
16:21:30 41 200 0.026s 0.170s
16:21:30 42 200 0.002s 0.172s
16:21:30 43 200 0.067s 0.161s
16:21:30 44 200 0.005s 0.173s
16:21:30 45 200 0.069s 0.158s
16:21:30 46 200 0.017s 0.159s
16:21:30 47 200 0.010s 0.157s
16:21:30 48 200 0.046s 0.170s
16:21:30 49 200 0.014s 0.170s
16:21:30 50 200 0.074s 0.156s
16:21:30 51 500 0.067s 0.079s
16:21:30 52 200 0.015s 0.161s
16:21:30 53 200 0.032s 0.204s
16:21:30 54 200 0.071s 0.150s
16:21:30 55 200 0.054s 0.173s
16:21:30 56 200 0.053s 0.157s
16:21:30 57 200 0.012s 0.164s
16:21:30 58 200 0.095s 0.157s
16:21:30 59 200 0.080s 0.170s
16:21:30 60 200 0.067s 0.147s
16:21:30 61 200 0.046s 0.170s
16:21:30 62 200 0.041s 0.154s
16:21:30 63 200 0.062s 0.160s
16:21:30 64 200 0.042s 0.160s
16:21:30 65 200 0.063s 0.179s
16:21:30 66 200 0.024s 0.173s
16:21:30 67 200 0.033s 0.148s
16:21:30 68 200 0.036s 0.193s
16:21:30 69 200 0.025s 0.172s
16:21:30 70 200 0.086s 0.152s
16:21:30 71 200 0.072s 0.159s
16:21:30 72 200 0.053s 0.132s
16:21:30 73 200 0.089s 0.161s
16:21:30 74 200 0.044s 0.178s
16:21:30 75 200 0.006s 0.153s
16:21:30 76 200 0.039s 0.143s
16:21:30 77 200 0.044s 0.142s
16:21:30 78 200 0.051s 0.173s
16:21:30 79 200 0.005s 0.143s
16:21:30 80 200 0.004s 0.165s
16:21:30 81 200 0.097s 0.146s
16:21:30 82 200 0.019s 0.173s
16:21:30 83 200 0.041s 0.172s
16:21:30 84 200 0.060s 0.161s
16:21:30 85 200 0.025s 0.160s
16:21:30 86 200 0.023s 0.162s
16:21:30 87 200 0.059s 0.165s
16:21:30 88 200 0.100s 0.159s
16:21:30 89 200 0.049s 0.149s
16:21:30 90 200 0.088s 0.149s
16:21:30 91 200 0.074s 0.166s
16:21:30 92 200 0.085s 0.166s
16:21:30 93 200 0.092s 0.145s
16:21:30 94 200 0.003s 0.155s
16:21:30 95 200 0.018s 0.160s
16:21:30 96 200 0.090s 0.153s
16:21:30 97 200 0.096s 0.199s
16:21:30 98 200 0.043s 0.210s
16:21:30 99 200 0.014s 0.156s
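To make the "error responses come back much faster" observation easy to check, here is a small sketch that compares the request time of erroneous responses against the average of the expected ones. The rows are copied from the report above (runs 49 to 53); the helper `split_by_code` is hypothetical and not part of the notebook.

```python
from statistics import mean

# (run index, HTTP code, request time in seconds), copied from the report above.
rows = [
    (49, 200, 0.170),
    (50, 200, 0.156),
    (51, 500, 0.079),
    (52, 200, 0.161),
    (53, 200, 0.204),
]


def split_by_code(rows, expected=200):
    """Separate request times of expected responses from the erroneous rows."""
    good = [elapsed for _, code, elapsed in rows if code == expected]
    bad = [(index, code, elapsed) for index, code, elapsed in rows if code != expected]
    return good, bad


good_times, bad_rows = split_by_code(rows)
avg_good = mean(good_times)
for index, code, elapsed in bad_rows:
    # Report each erroneous run next to the average time of the successful runs.
    print(f"run {index}: HTTP {code} in {elapsed:.3f}s "
          f"(average of HTTP 200 responses: {avg_good:.3f}s)")
```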
@tlvu
I was trying to reproduce it again on our production but it did not happen again. It did happen again on our staging (Medus), see https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/stress-test/10/console and it's Flyingpigeon again throwing an error. I wonder if it's just a coincidence, because with prod it was Flyingpigeon also!
In the for-loop, I think it should be changed to hit all the WPS and collect all the results. Then subsequently go through the results for when status is not 0 and only then raise the AssertionError. Because in both cases, Flyingpigeon was failing and we never get to testing Raven. Or we might have multiple WPS failing and we only see the first failure.
Also you might want to set a sensible timeout. If the server is down, hitting 100 times for each WPS and waiting for a response that will never come will be long!
In the meantime I'll go fish for that log matching that …
There were no errors in the Magpie logs.
I am okay with you merging this now and implementing the following remaining improvements in another PR. These are nice to have but not blockers.
From #74 (comment):
In the for-loop, I think it should be changed to hit all the WPS and collect all the results. Then subsequently go through the results for when status is not 0 and only then raise the AssertionError. Because in both cases, Flyingpigeon was failing and we never get to testing Raven. Or we might have multiple WPS failing and we only see the first failure.
Also you might want to set a sensible timeout. If the server is down, hitting 100 times for each WPS and waiting for a response that will never come will be long !
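The two requested improvements can be sketched as follows: probe every bird first, record a status for each, and assert only once at the end. This is a sketch under stated assumptions: `run_all_birds`, `assert_all_passed`, the `probe` callable and the `-1` timeout status are all hypothetical, and the built-in `TimeoutError` stands in for whatever exception the HTTP client actually raises.

```python
from typing import Callable, Dict, List


def run_all_birds(birds: List[str], probe: Callable[[str], int]) -> Dict[str, int]:
    """Run the probe for every bird and collect {bird: status} without stopping early."""
    statuses = {}
    for bird in birds:
        try:
            statuses[bird] = probe(bird)
        except TimeoutError:
            # Hypothetical status meaning "server did not answer in time".
            statuses[bird] = -1
    return statuses


def assert_all_passed(statuses: Dict[str, int]) -> None:
    """Raise a single AssertionError listing every bird with a non-zero status."""
    failed = {bird: status for bird, status in statuses.items() if status != 0}
    assert not failed, f"Failed birds (status): {failed}"


def fake_probe(bird: str) -> int:
    """Stand-in for the real HTTP stress test, so the sketch runs offline."""
    if bird == "raven":
        raise TimeoutError("no response")
    return {"finch": 0, "flyingpigeon": 1}[bird]


statuses = run_all_birds(["finch", "flyingpigeon", "raven"], fake_probe)
print(statuses)
```

If the real probe is built on the `requests` library (an assumption), passing `timeout=` to `requests.get` bounds the wait per request, so a dead server cannot stall 100 runs per WPS.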
@tlvu I agree the final raise and timeout would be better. I'll look into it.
FYI, we should be able to decommission FlyingPigeon, so no need to invest time over that component.
With more runs of the new stress test, it also happens on other birds as well, not just FP. We even have this randomly on our regular nightly runs without the stress test.
… TEST_MAX_AVG_TIME and TEST_MAX_ERR_CODE + make it more obvious where the http error code is in the list of results
… hummingbird to list
I have improved the stress test to add more configuration variables and full test of all specified WPS birds before a single final assert that they all passed.
+1 again. Let's merge this now so it's easier to trace the intermittent issue because we will have stats from production before enabling the cache optimization. Then once we enable the cache optimization, we can compare.
Merge when ready and copy/paste the PR description into the merge commit description.
…wing-error-in-jenkins
stress-tests.ipynb: fix not showing helpful error in Jenkins and allow to disable

It seems that once the `stress-test.ipynb` starts failing, subsequent notebooks are much more likely to get this error:

```
ServiceException: <?xml version="1.0" encoding="utf-8"?>
<ExceptionReport version="1.0.0" xmlns="http://www.opengis.net/ows/1.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.opengis.net/ows/1.1 http://schemas.opengis.net/ows/1.1.0/owsExceptionReport.xsd">
  <Exception exceptionCode="NoApplicableCode" locator="AccessForbidden">
    <ExceptionText>Not authorized to access this resource. Missing user authentication.</ExceptionText>
  </Exception>
</ExceptionReport>
```

* So I added a new option on Jenkins to allow disable all the notebooks under `./notebooks/*.ipynb`, which includes the `stress-test.ipynb`, default value still `true`.
* I also added a way to send extra environment variables to the notebooks, ex: `TEST_RUNS=20 TEST_WPS_BIRDS=finch,raven,flyingpigeon` that would impact the `stress-test.ipynb`

Demo run with those new Jenkins params: https://daccs-jenkins.crim.ca/job/PAVICS-e2e-workflow-tests/job/fix-stress-tests.ipynb-not-showing-error-in-jenkins/2/parameters/

Related to PR #74

Before this fix, this is what Jenkins shows, not very helpful:

```
_____________________ notebooks/stress-tests.ipynb::Cell 2 _____________________
Notebook cell execution failed
Cell 2: Cell execution caused an exception

Input:
# NBVAL_IGNORE_OUTPUT
test_statuses = []
for bird in TEST_WPS_BIRDS:
    bird_url = f"{TWITCHER_URL}/{bird}/wps?service=wps&request=getcapabilities"
    expect_status_code = 200
    results = stress_test_requests(bird_url, runs=TEST_RUNS, code=expect_status_code,
                                   max_err_code=TEST_MAX_ERR_CODE,
                                   max_avg_time=TEST_MAX_AVG_TIME,
                                   abort_retries=TEST_TIMEOUT_RETRY,
                                   abort_timeout=TEST_TIMEOUT_ABORT)
    test_statuses.append(results.status)
    print(results)
failed_tests = sum(test_statuses)
assert not failed_tests, f"Failed {failed_tests} tests."

print("\nAll tests passed!")

Traceback:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-3-954e90dc833f> in <module>
     12     print(results)
     13 failed_tests = sum(test_statuses)
---> 14 assert not failed_tests, f"Failed {failed_tests} tests."
     15
     16 print("\nAll tests passed!")

AssertionError: Failed -2 tests.
```

After this fix:

```
_____________________ notebooks/stress-tests.ipynb::Cell 2 _____________________
Notebook cell execution failed
Cell 2: Cell execution caused an exception

Input:
# NBVAL_IGNORE_OUTPUT
test_statuses = []
failed_results = ''
for bird in TEST_WPS_BIRDS:
    bird_url = f"{TWITCHER_URL}/{bird}/wps?service=wps&request=getcapabilities"
    expect_status_code = 200
    results = stress_test_requests(bird_url, runs=TEST_RUNS, code=expect_status_code,
                                   max_err_code=TEST_MAX_ERR_CODE,
                                   max_avg_time=TEST_MAX_AVG_TIME,
                                   abort_retries=TEST_TIMEOUT_RETRY,
                                   abort_timeout=TEST_TIMEOUT_ABORT)
    test_statuses.append(results.status)
    print(results)
    if results.status:
        failed_results = f"{failed_results}\n{results}"
failed_tests = sum(test_statuses)
assert not failed_tests, f"Failed {failed_tests} tests. Failed results: {failed_results}"

print("\nAll tests passed!")

Traceback:
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-3-63ac944aa8d7> in <module>
     15         failed_results = f"{failed_results}\n{results}"
     16 failed_tests = sum(test_statuses)
---> 17 assert not failed_tests, f"Failed {failed_tests} tests. Failed results: {failed_results}"
     18
     19 print("\nAll tests passed!")

AssertionError: Failed -2 tests. Failed results:
Stress Test:
Test:
code: 200
runs: 100
max-avg-time: 1s
max-err-code: 0
sum-err-code: 4
timeout-abort: 0s
timeout-retry: 0
timeout-count: 0
Request:
method: GET
url: https://pavics.ouranos.ca/twitcher/ows/proxy/finch/wps?service=wps&request=getcapabilities
args: {}
Times:
min: 0.028s
avg: 0.235s
max: 0.565s
Results:
Run Codes Delta Times
1 200 0.000s 0.189s
2 200 0.021s 0.167s
3 200 0.083s 0.406s
4 200 0.010s 0.185s
5 200 0.028s 0.186s
6 200 0.022s 0.228s
7 200 0.082s 0.373s
8 200 0.070s 0.230s
9 200 0.078s 0.277s
10 200 0.018s 0.180s
11 200 0.084s 0.230s
12 200 0.048s 0.395s
13 200 0.004s 0.227s
14 200 0.002s 0.219s
15 200 0.034s 0.341s
16 200 0.065s 0.222s
17 200 0.015s 0.231s
18 (!) 401 0.071s 0.056s
19 200 0.016s 0.199s
20 200 0.090s 0.331s
21 200 0.067s 0.219s
22 200 0.089s 0.255s
23 200 0.054s 0.184s
24 200 0.023s 0.207s
25 200 0.055s 0.328s
26 200 0.079s 0.219s
27 200 0.045s 0.289s
28 200 0.045s 0.374s
29 200 0.087s 0.565s
30 200 0.026s 0.184s
31 (!) 500 0.087s 0.028s
32 200 0.089s 0.211s
33 200 0.021s 0.365s
34 200 0.078s 0.215s
35 200 0.036s 0.222s
36 200 0.040s 0.182s
37 200 0.053s 0.232s
38 200 0.063s 0.324s
39 200 0.031s 0.171s
40 200 0.037s 0.346s
41 200 0.041s 0.182s
42 200 0.043s 0.196s
43 200 0.066s 0.332s
44 200 0.098s 0.224s
45 200 0.023s 0.187s
46 200 0.072s 0.173s
47 200 0.080s 0.200s
48 200 0.042s 0.296s
49 200 0.027s 0.320s
50 200 0.030s 0.203s
51 200 0.019s 0.177s
52 200 0.056s 0.203s
53 200 0.054s 0.303s
54 200 0.063s 0.195s
55 200 0.021s 0.185s
56 200 0.072s 0.162s
57 200 0.097s 0.199s
58 200 0.062s 0.360s
59 200 0.083s 0.224s
60 200 0.025s 0.194s
61 200 0.058s 0.397s
62 200 0.091s 0.194s
63 200 0.075s 0.189s
64 200 0.002s 0.204s
65 200 0.010s 0.298s
66 200 0.049s 0.187s
67 200 0.072s 0.193s
68 200 0.018s 0.220s
69 200 0.049s 0.241s
70 200 0.050s 0.309s
71 200 0.063s 0.221s
72 200 0.064s 0.205s
73 200 0.021s 0.383s
74 200 0.024s 0.216s
75 200 0.098s 0.178s
76 200 0.012s 0.198s
77 200 0.001s 0.321s
78 200 0.039s 0.195s
79 200 0.018s 0.207s
80 200 0.048s 0.201s
81 200 0.078s 0.211s
82 200 0.100s 0.313s
83 200 0.039s 0.216s
84 200 0.076s 0.214s
85 200 0.006s 0.304s
86 200 0.077s 0.181s
87 200 0.046s 0.205s
88 200 0.047s 0.160s
89 200 0.044s 0.321s
90 200 0.052s 0.184s
91 200 0.035s 0.217s
92 200 0.074s 0.163s
93 200 0.015s 0.169s
94 (!) 500 0.045s 0.033s
95 200 0.024s 0.307s
96 200 0.067s 0.240s
97 200 0.093s 0.241s
98 200 0.087s 0.314s
99 (!) 500 0.065s 0.030s
100 200 0.062s 0.185s
Summary:
Detected 4 erroneous HTTP codes not equal to expected 200.
Test failed.

Stress Test:
Test:
code: 200
runs: 100
max-avg-time: 1s
max-err-code: 0
sum-err-code: 6
timeout-abort: 0s
timeout-retry: 0
timeout-count: 0
Request:
method: GET
url: https://pavics.ouranos.ca/twitcher/ows/proxy/flyingpigeon/wps?service=wps&request=getcapabilities
args: {}
Times:
min: 0.030s
avg: 0.137s
max: 0.212s
Results:
Run Codes Delta Times
1 200 0.000s 0.150s
2 200 0.012s 0.147s
3 200 0.057s 0.137s
4 200 0.086s 0.167s
5 200 0.077s 0.140s
6 200 0.023s 0.130s
7 200 0.061s 0.130s
8 200 0.083s 0.143s
9 200 0.065s 0.160s
10 200 0.087s 0.145s
11 200 0.098s 0.153s
12 200 0.081s 0.146s
13 200 0.024s 0.158s
14 200 0.028s 0.124s
15 200 0.041s 0.148s
16 200 0.045s 0.144s
17 200 0.001s 0.181s
18 200 0.012s 0.149s
19 (!) 500 0.084s 0.037s
20 200 0.078s 0.156s
21 200 0.059s 0.150s
22 200 0.091s 0.149s
23 200 0.075s 0.144s
24 200 0.036s 0.133s
25 200 0.025s 0.128s
26 200 0.020s 0.135s
27 200 0.016s 0.156s
28 200 0.016s 0.148s
29 200 0.034s 0.131s
30 200 0.078s 0.147s
31 200 0.025s 0.143s
32 200 0.043s 0.152s
33 200 0.026s 0.155s
34 200 0.092s 0.138s
35 200 0.047s 0.151s
36 200 0.035s 0.212s
37 200 0.049s 0.158s
38 200 0.041s 0.149s
39 200 0.063s 0.151s
40 (!) 500 0.060s 0.031s
41 200 0.003s 0.142s
42 200 0.014s 0.148s
43 200 0.058s 0.153s
44 200 0.004s 0.141s
45 200 0.062s 0.140s
46 200 0.024s 0.129s
47 200 0.099s 0.126s
48 200 0.086s 0.141s
49 200 0.095s 0.128s
50 200 0.039s 0.138s
51 200 0.100s 0.156s
52 200 0.091s 0.146s
53 200 0.012s 0.130s
54 200 0.029s 0.129s
55 200 0.091s 0.148s
56 200 0.027s 0.134s
57 200 0.072s 0.133s
58 200 0.083s 0.153s
59 200 0.090s 0.140s
60 200 0.053s 0.152s
61 200 0.055s 0.146s
62 200 0.067s 0.123s
63 200 0.056s 0.138s
64 200 0.031s 0.156s
65 200 0.012s 0.138s
66 200 0.013s 0.136s
67 200 0.069s 0.116s
68 200 0.082s 0.122s
69 200 0.008s 0.133s
70 (!) 500 0.024s 0.032s
71 200 0.018s 0.169s
72 200 0.008s 0.151s
73 200 0.042s 0.102s
74 (!) 500 0.052s 0.039s
75 200 0.091s 0.145s
76 200 0.090s 0.116s
77 200 0.078s 0.125s
78 200 0.042s 0.143s
79 (!) 500 0.082s 0.030s
80 200 0.030s 0.149s
81 200 0.084s 0.149s
82 200 0.064s 0.147s
83 200 0.013s 0.164s
84 200 0.081s 0.161s
85 (!) 500 0.096s 0.040s
86 200 0.036s 0.184s
87 200 0.021s 0.148s
88 200 0.012s 0.129s
89 200 0.044s 0.136s
90 200 0.065s 0.146s
91 200 0.037s 0.112s
92 200 0.019s 0.126s
93 200 0.083s 0.139s
94 200 0.041s 0.128s
95 200 0.095s 0.113s
96 200 0.058s 0.149s
97 200 0.045s 0.177s
98 200 0.069s 0.130s
99 200 0.013s 0.125s
100 200 0.004s 0.136s
Summary:
Detected 6 erroneous HTTP codes not equal to expected 200.
Test failed.
```
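The commit message above mentions passing extra environment variables such as `TEST_RUNS=20 TEST_WPS_BIRDS=finch,raven,flyingpigeon` to the notebook. A minimal sketch of how a notebook cell might read them follows; the function `read_test_config` and its fallback values (100 runs, the three birds named in the PR description) are assumptions, not the notebook's actual code.

```python
import os


def read_test_config(environ=os.environ):
    """Read stress-test settings from environment variables, with fallbacks."""
    # Assumed defaults: 100 runs and the three birds listed in the PR description.
    runs = int(environ.get("TEST_RUNS", "100"))
    birds_raw = environ.get("TEST_WPS_BIRDS", "finch,flyingpigeon,raven")
    # Accept a comma-separated list and ignore stray whitespace or empty items.
    birds = [bird.strip() for bird in birds_raw.split(",") if bird.strip()]
    return runs, birds


# Example with the values quoted in the commit message.
runs, birds = read_test_config({"TEST_RUNS": "20", "TEST_WPS_BIRDS": "finch,raven,flyingpigeon"})
print(runs, birds)
```

Taking the mapping as a parameter keeps the parsing testable without touching the real process environment.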
…umber-of-failed-tests-in-stress-test-ipynb
Fix incorrect calculation of number of failed tests in stress-test.ipynb

# Overview

Fix `stress-test.ipynb` incorrectly throwing an error when there are no errors!

## Changes

- stress-test.ipynb: fix incorrect calculation of number of failed tests (when no tests had any errors, it was reporting that all 4 had errors; regression introduced by PR #81)
- stress-test.ipynb: fix incorrect variable access (class method was using global var `results` instead of `self`, copy/paste error during code refactoring in PR #74)
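Why summing status values is a fragile way to count failures can be illustrated with a short sketch. It assumes statuses may be negative, as the "Failed -2 tests." message quoted earlier suggests; `count_failed` is a hypothetical helper and the actual fix in the notebook is not shown here.

```python
def count_failed(statuses):
    """Count tests whose status is non-zero, regardless of the status value or sign."""
    return sum(1 for status in statuses if status != 0)


# Hypothetical statuses for four birds: two passed (0), two failed (-1).
statuses = [0, -1, 0, -1]

print(sum(statuses))           # -2: adds the status values, which is not a count
print(count_failed(statuses))  # 2: number of failing tests
```

Counting non-zero entries also avoids the case where positive and negative statuses cancel out and hide failures.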
## Overview

Applies the same changes as #174 (cache settings) but disable Twitcher caching explicitly (Magpie caching already disabled). This is in order to validate that cache settings are applied and tests suite can run successfully with cache disabled, to isolate that sporadic errors seen in #174 are related to enabled cache.

## Changes

**Non-breaking changes**

- Added cache settings of `MagpieAdapter` through Twitcher (intended to improve response times, but disabled in this PR).
- Update Magpie/Twitcher version 3.14.0

**Breaking changes**

- Unique user email in Magpie 3.13.0 is enforced, see birdhouse-deploy changelog for how to find them and update the conflicted values before the upgrade. Otherwise the upgrade will break. See [Magpie 3.13.0 change log](https://pavics-magpie.readthedocs.io/en/3.13.0/changes.html#bug-fixes) for details regarding the introduced unique email restriction.

## Related Issue / Discussion

- Undo resolution of [DAC-296 Twitcher performance issue](https://www.crim.ca/jira/browse/DAC-296) since caching now disabled
- Undo resolution of bird-house/twitcher#97 since cache becomes disabled
- Based on top of #174 with cache-settings fixes
- Related new stress test Ouranosinc/PAVICS-e2e-workflow-tests#74
Bump version: 1.13.14 → 1.14.0
Following merge of pull request #182 from bird-house/cache-settings-off
Overview
Adds a stress test notebook.
The new notebook uses a function with versatile inputs to control how strict we want to be with this test.
For the moment, it tests 100 times WPS GetCapabilities for each of finch, flyingpigeon and raven birds, but can be extended for basically any request.
Because response errors are sporadic, we will need to tweak the parameters slightly to find a sweet spot that we consider "good enough" to make tests pass. If still too problematic, it can also be placed in excluded tests until required.
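The idea of a test function whose inputs control how strict the check is can be sketched as below. This is not the notebook's `stress_test_requests`: the name `stress_test`, the injected `fetch` callable and the returned tuple are hypothetical. The parameter names `runs`, `code`, `max_err_code` and `max_avg_time` and the status convention (1 for unexpected codes, 2 for long request time) follow what is visible elsewhere in this conversation.

```python
import time
from typing import Callable, List, Tuple


def stress_test(fetch: Callable[[], int], runs: int = 100, code: int = 200,
                max_err_code: int = 0, max_avg_time: float = 1.0) -> Tuple[int, List[int], float]:
    """Call fetch() repeatedly and return (status, observed codes, average time in seconds)."""
    codes: List[int] = []
    times: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        codes.append(fetch())  # fetch returns the HTTP status code of one request
        times.append(time.perf_counter() - start)
    avg_time = sum(times) / len(times) if times else 0.0
    errors = sum(1 for observed in codes if observed != code)
    if errors > max_err_code:
        status = 1  # too many responses with an unexpected HTTP code
    elif avg_time > max_avg_time:
        status = 2  # requests are slower on average than allowed
    else:
        status = 0  # test passed
    return status, codes, avg_time


# Offline usage example: a fake request that always answers 200.
status, codes, avg_time = stress_test(lambda: 200, runs=5)
print(status, codes)
```

Raising `max_err_code` or `max_avg_time` is how the "sweet spot" tolerance could be tuned; a real `fetch` would wrap an HTTP call and return its status code.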
For the moment, tests were run against the following instances (multiple times each):
In all cases, some executions returned 200 for all runs, while others were partial, with a mix of 200 and 401 responses (roughly 50/50).
This demonstrates that the caching feature (bird-house/birdhouse-deploy#174) is not at the root of the sporadic 200/401 responses, since instances that don't even have it yet show the same random behaviour.
The actual cause is still unknown for the moment.
Changes
Related Issue / Discussion