You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
{{ message }}
This repository has been archived by the owner on Feb 12, 2022. It is now read-only.
metro reported packets not being generated, so i shelled into the metro pdf merger server, and had a look at the logs. i found this:
ubuntu@ip-10-0-0-208:~$ tail -n 200 /tmp/metro-pdf-worker-d-XSTH3EHWO-err.log
...
Traceback (most recent call last):
File "/usr/local/lib/python3.5/threading.py", line 914, in _bootstrap_inner
self.run()
File "/home/datamade/metro-pdf-merger-d-XSTH3EHWO/tasks.py", line 146, in run
self.doWork()
File "/home/datamade/metro-pdf-merger-d-XSTH3EHWO/tasks.py", line 149, in doWork
msg = redis.blpop(REDIS_QUEUE_KEY)
File "/home/datamade/.virtualenvs/metro-pdf-merger-d-XSTH3EHWO/lib/python3.5/site-packages/redis/client.py", line 1163, in blpop
return self.execute_command('BLPOP', *keys)
File "/home/datamade/.virtualenvs/metro-pdf-merger-d-XSTH3EHWO/lib/python3.5/site-packages/redis/client.py", line 578, in execute_command
connection.send_command(*args)
File "/home/datamade/.virtualenvs/metro-pdf-merger-d-XSTH3EHWO/lib/python3.5/site-packages/redis/connection.py", line 563, in send_command
self.send_packed_command(self.pack_command(*args))
File "/home/datamade/.virtualenvs/metro-pdf-merger-d-XSTH3EHWO/lib/python3.5/site-packages/redis/connection.py", line 538, in send_packed_command
self.connect()
File "/home/datamade/.virtualenvs/metro-pdf-merger-d-XSTH3EHWO/lib/python3.5/site-packages/redis/connection.py", line 442, in connect
raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 111 connecting to localhost:6379. Connection refused.
sure enough, the worker process had exited at some point the previous day.
ubuntu@ip-10-0-0-208:~$ sudo supervisorctl status
...
metro-pdf-merger-d-XSTH3EHWO:metro-pdf-merger RUNNING pid 1974, uptime 3 days, 6:34:30
metro-pdf-worker-d-XSTH3EHWO:metro-pdf-worker EXITED Nov 29 06:48 AM
...
there were no alerts in sentry or semaphor that our app had gone down. let's investigate why this happened, and consider putting alerts in place so we can handle this proactively if it happens again.
The text was updated successfully, but these errors were encountered:
@evz and I looked in to why the worker died in the first place. We did not find a logical explanation, though we did find a pretty unlikely one: Redis exited on November 29, and restarted itself. Could this brief milisecond, when redis was down, have collided with the same brief milisecond when the Metro PDF worker executed msg = redis.blpop(REDIS_QUEUE_KEY)? Possibly, but that seems improbable.
[1947 | signal handler] (1543474090) Received SIGTERM, scheduling shutdown...
[1947] 29 Nov 06:48:10.278 # User requested shutdown...
[1947] 29 Nov 06:48:10.278 * Saving the final RDB snapshot before exiting.
[1947] 29 Nov 06:48:10.281 * DB saved on disk
[1947] 29 Nov 06:48:10.282 * Removing the pid file.
[1947] 29 Nov 06:48:10.282 # Redis is now ready to exit, bye bye...
[4352] 29 Nov 06:48:11.874 # Unable to set the max number of files limit to 10032 (Operation not permitted), setting the max clients configuration to 3984
.
_._
_.-``__ ''-._
_.-`` `. `_. ''-._ Redis 2.8.4 (00000000/0) 64 bit
.-`` .-```. ```\/ _.,_ ''-._
( ' , .-` | `, ) Running in stand alone mode
|`-._`-...-` __...-.``-._|'` _.-'| Port: 6379
| `-._ `._ / _.-' | PID: 4352
`-._ `-._ `-./ _.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' | http://redis.io
`-._ `-._`-.__.-'_.-' _.-'
|`-._`-._ `-.__.-' _.-'_.-'|
| `-._`-._ _.-'_.-' |
`-._ `-._`-.__.-'_.-' _.-'
`-._ `-.__.-' _.-'
`-._ _.-'
`-.__.-'
[4352] 29 Nov 06:48:11.875 # Server started, Redis version 2.8.4
Moving forward, let's be better about logging (as the issue suggests!), and if this occurs again, we might have a better sense of the reason why.
metro reported packets not being generated, so i shelled into the metro pdf merger server, and had a look at the logs. i found this:
sure enough, the worker process had exited at some point the previous day.
ubuntu@ip-10-0-0-208:~$ sudo supervisorctl status ... metro-pdf-merger-d-XSTH3EHWO:metro-pdf-merger RUNNING pid 1974, uptime 3 days, 6:34:30 metro-pdf-worker-d-XSTH3EHWO:metro-pdf-worker EXITED Nov 29 06:48 AM ...
there were no alerts in sentry or semaphor that our app had gone down. let's investigate why this happened, and consider putting alerts in place so we can handle this proactively if it happens again.
The text was updated successfully, but these errors were encountered: