getting timeouts, 500s, and good number of errors logged -- need help with analysis/triage #1117
Comments
This looks to me like a worker crashed when it used too much memory, and that caused other errors to appear. @mvandenburgh has been working on solving memory usage errors, so I wonder if we will be seeing fewer errors like this. @danlamanna, could you look these over and add your own insight as to what might be going on?
I see it mentions
FWIW I do not see assetSummary errors in today's logs (so they might indeed be mitigated by now -- yeay!), but some new types of errors seem to bubble up - never a boring day:
(base) dandi@drogon:/mnt/backup/dandi/heroku-logs/dandi-api$ grep -ih error 20220706* | grep -v sql_error_code | sed -e 's,^.*+00:00 ,,g' | sort | uniq -c | sort -n | nl | tail -n 20
22084 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 3 messages since 2022-07-06T17:57:25.862123+00:00.
22085 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 3 messages since 2022-07-06T17:58:25.919928+00:00.
22086 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 3 messages since 2022-07-06T18:58:58.728623+00:00.
22087 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 3 messages since 2022-07-06T18:59:28.649057+00:00.
22088 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 3 messages since 2022-07-06T19:59:31.397667+00:00.
22089 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 4 messages since 2022-07-06T17:53:12.387464+00:00.
22090 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 4 messages since 2022-07-06T17:59:25.922625+00:00.
22091 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 4 messages since 2022-07-06T18:59:25.730034+00:00.
22092 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 6 messages since 2022-07-06T18:59:23.790925+00:00.
22093 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 8 messages since 2022-07-06T18:59:35.182099+00:00.
22094 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 9 messages since 2022-07-06T16:59:57.066035+00:00.
22095 2 heroku[logplex]: Error L10 (output buffer overflow): drain 'd.3a7ec4ce-ec2a-4a97-9063-bac84479d38a' dropped 9 messages since 2022-07-06T18:59:26.93824+00:00.
22096 12 app[web.2]: raise ValueError("Provided metadata has
22097 12 app[web.2]: ValueError: Provided metadata has no schema
22098 16 app[web.1]: raise ValueError("Provided metadata has
22099 16 app[web.1]: ValueError: Provided metadata has no schema
22100 17 app[worker.1]: dandischema.exceptions.JsonschemaValidationError: [<ValidationError: "'schemaKey' is a required property">]
22101 17 app[worker.1]: raise JsonschemaValidationError(error_list)
22102 51 app[worker.1]: raise ValueError("Provided metadata has no schema version")
22103 51 app[worker.1]: ValueError: Provided metadata has no schema version
I think it would be useful at and what is that L10 buffer error? We have over 20k log lines from today alone.
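In the summary above, every logplex L10 line embeds its own dropped-message count and timestamp, so `sort | uniq -c` cannot collapse them, and that inflates the number of distinct entries. A minimal Python sketch of the same pipeline that also normalizes those two fields follows. It assumes each log line starts with a `+00:00` timestamp, as the `sed` expression implies; `summarize_errors` and the placeholder strings are hypothetical names, not part of any existing tool.

```python
import re
from collections import Counter

# Mirrors sed 's,^.*+00:00 ,,': drop everything up to the last "+00:00 ".
_TS_PREFIX = re.compile(r"^.*\+00:00 ")
# The variable part of logplex L10 messages (dropped count and "since" timestamp).
_L10_VARIABLE = re.compile(r"dropped \d+ messages since \S+")


def summarize_errors(lines):
    """Count distinct error messages, ignoring timestamps and L10 counters."""
    counts = Counter()
    for line in lines:
        # Same filters as `grep -ih error | grep -v sql_error_code`.
        if "error" not in line.lower() or "sql_error_code" in line:
            continue
        msg = _TS_PREFIX.sub("", line.rstrip("\n"))
        msg = _L10_VARIABLE.sub("dropped N messages since <timestamp>", msg)
        counts[msg] += 1
    return counts
```

Calling `summarize_errors(open(path))` on one archived log file and printing `counts.most_common()` would give a ranking comparable to `sort -n`, with all L10 drops for a given drain merged into a single entry.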
As for the
for now I think it is ok to assume that
Also, when the metadata of an asset is saved on the server side, the server should inject the latest schemaVersion and validate if none is provided, or reject the POST.
I.e., it should never save metadata without a schemaVersion.
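The rule proposed above (inject or reject, never store without a version) could be sketched roughly as below. This is only an illustration under assumptions: `ensure_schema_version`, its parameters, and the version string in the usage note are hypothetical, and it is not how dandi-api actually handles metadata. Schema validation would still have to run on the returned dict.

```python
def ensure_schema_version(metadata, latest_version, inject_if_missing=True):
    """Return a copy of `metadata` that is guaranteed to carry a schemaVersion.

    Hypothetical helper: either injects `latest_version` when the key is
    missing or empty, or rejects the metadata, so nothing is ever saved
    without a schemaVersion.
    """
    if metadata.get("schemaVersion"):
        # A version was provided by the client; keep it untouched.
        return dict(metadata)
    if not inject_if_missing:
        # Rejection path: same message as the ValueError seen in the logs.
        raise ValueError("Provided metadata has no schema version")
    # Injection path: copy the metadata and add the latest known version.
    return {**metadata, "schemaVersion": latest_version}
```

For example, `ensure_schema_version({"name": "x"}, "0.6.3")` returns the metadata with `schemaVersion` added, where `"0.6.3"` is just a placeholder value, while passing `inject_if_missing=False` turns the same call into a rejection.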
While backing up dandisets on drogon,
we keep running into various 500s, timeouts, etc.,
and I have difficulty establishing a reliable dump of logs from heroku, but I did archive some, e.g. recent ones:
here is the brief summary of errors without sql_error_code logged:
and for sql_error (including 00000 -- didn't check if legit or not)
and without 00000
the same non-0 sqlerrors in May
edit1: those few which we observed in Aug of last year (2021)
IMHO someone with better knowledge of those systems should review/analyze and report on whether they are all "benign" or some require attention/action.