Fix issues with mautrix dbs on resource-constrained systems #725

Closed

scottcrossen
Contributor

The issue:

Whenever I restart my server or docker daemon, Mautrix-based bridges with alembic databases refuse to start because of this error:

docker: Error response from daemon: Conflict. The container name "/matrix-mautrix-something-db" is already in use by container "blahblahblah".
You have to remove (or rename) that container to be able to reuse that name.

This happens because the --rm cleanup step times out and, for some reason, the daemon can't remove the old container.

This fix first tries to kill the database container; if that times out, it forcefully removes the container and its network interface.
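
As a rough illustration of that sequence (not the exact change in this PR), a shell sketch could look like the following. The function name cleanup_container, the DOCKER override variable, and the container and network names in the usage line are hypothetical; the docker subcommands (kill, rm --force, network disconnect --force) are standard CLI calls. The sketch treats any failure of docker kill as the trigger for the forced cleanup, which is an assumption, since how the timeout is detected isn't spelled out here.

```shell
#!/bin/sh
# Hypothetical override so the docker binary can be swapped out (e.g. for a dry run).
DOCKER="${DOCKER:-docker}"

# cleanup_container NAME NETWORK
# Tries a plain kill first; if that fails, force-removes the container and
# force-disconnects its (possibly stale) endpoint from the given network.
cleanup_container() {
    name="$1"
    network="$2"

    # Step 1: ask the daemon to kill the container.
    if "$DOCKER" kill "$name" >/dev/null 2>&1; then
        return 0
    fi

    # Step 2 (assumed fallback): force-remove the container, ignoring errors
    # such as "no such container".
    "$DOCKER" rm --force "$name" >/dev/null 2>&1 || true

    # Step 3: drop any endpoint the container left behind on the network.
    "$DOCKER" network disconnect --force "$network" "$name" >/dev/null 2>&1 || true
}
```

It could then be called before starting the container again, for example: cleanup_container matrix-mautrix-something-db matrix (both names are placeholders).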

@spantaleev
Owner

Does that network disconnect even do anything when you've removed the container and then force-removed it? It should be completely gone, shouldn't it?

Thinking about it now, we can probably run the migrations container without networking (--network=none), at least for now, until we've migrated these bridges to use Postgres.

@scottcrossen
Contributor Author

scottcrossen commented Nov 23, 2020

I found that rm --force does leave artifacts in the network bridge even though it removes the container. You won't be able to start another container because the network identifier is still in use on that bridge. I'm not really sure what these "-db" containers are used for. Are you saying that these containers (ending in "-db") are "migrations containers"? Are these the ones you want on --network=none? Right now, two out of three of them run on the default bridge.

@scottcrossen
Contributor Author

According to this, it looks like something might fail if the non-matrix Docker bridge is used:

331c77a

@spantaleev
Owner

You're right. Some people may be overriding the configuration and pointing it to matrix-postgres, in which case it would need to be on the matrix network.

@scottcrossen
Contributor Author

> You're right. Some people may be overriding the configuration and pointing it to matrix-postgres, in which case it would need to be on the matrix network.

Is this just for telegram or for all mautrix bridges?

@spantaleev
Owner

Probably for all bridges (not just mautrix ones). Well, not for now, because none of them use a DB that's on the network. But they should in the future. Especially after #740 and some related work lands.

@spantaleev
Owner

I'd rather we don't merge this, as it appears to be a hacky workaround for some weird container issue which is not commonly experienced and may have been fixed in newer Docker releases (20.10 appeared recently).

If you're still experiencing it, let's see how we can reproduce it and come up with a similar fix.

Thanks anyway! 👍

@spantaleev spantaleev closed this Jan 3, 2021
@scottcrossen
Contributor Author

Hey cool with me! Sorry about the delay (holidays)
