State Manager #538

Open
3 of 4 tasks
wzrdtales opened this issue Feb 3, 2018 · 9 comments

@wzrdtales
Member

wzrdtales commented Feb 3, 2018

Description

The state manager will help overcome the challenges of using db-migrate in an automated and integrated fashion. It solves concurrency problems and makes it possible to continue from aborted states, or roll them back, without the usual pain involved when using a database with exclusively non-transactional DDL, like MySQL.

Tasks

  • State management basis
  • Read and rollback from state
  • Tick for progressing activity check in
  • Authenticator for node selection on concurrency

Implementation details

The state manager is basically a JSON document that is exchanged over the most obvious communication channel that already exists: the db that is being migrated. More options could be added in the future, but that is it for now. The document will be stored in a simple blob column, or better said a text column, and will contain either a JSON payload or jsonic.
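As a rough illustration of what such a document could look like, here is a minimal sketch. The interface and all of its field names (`lockHolder`, `lastTick`, `applied`, `inProgress`) are assumptions made up for this example and are not taken from db-migrate.

```typescript
// Hypothetical shape of the state document; none of these field names
// come from db-migrate itself.
interface MigrationState {
  lockHolder: string | null; // node that currently holds the lock
  lastTick: number | null;   // last activity check-in, epoch milliseconds
  applied: string[];         // migrations that completed
  inProgress: string | null; // migration that was running when the state was saved
}

// Serialize the state so it fits into a single text column.
function serializeState(state: MigrationState): string {
  return JSON.stringify(state);
}

// Parse the column value back; an empty column is treated as a fresh state.
function parseState(column: string | null): MigrationState {
  if (column === null || column.trim() === "") {
    return { lockHolder: null, lastTick: null, applied: [], inProgress: null };
  }
  return JSON.parse(column) as MigrationState;
}
```

Keeping `inProgress` in the document is what would let a node notice an aborted run and decide whether to continue or roll back.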

Lock acquisition will happen through one of the following scenarios:

Scenario 1 - Update WHERE or CAS

Update only an entry that has no current lock, or whose current lock has explicitly expired.
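A minimal sketch of this idea, using an in-memory object in place of a table row. The row shape, the timestamp-based expiry, and the SQL in the comment (including its table and column names) are assumptions for illustration only.

```typescript
// Hypothetical single-row lock record, standing in for one table row.
interface LockRow {
  holder: string | null;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for a conditional update such as:
//   UPDATE state SET holder = :node, expires_at = :now + :ttl
//   WHERE holder IS NULL OR expires_at <= :now
// (table and column names are made up). Returns true when the "row" was
// updated, which is what an affected-row count of 1 would indicate.
function tryAcquire(row: LockRow, node: string, now: number, ttlMs: number): boolean {
  const free = row.holder === null || row.expiresAt <= now;
  if (!free) return false;
  row.holder = node;
  row.expiresAt = now + ttlMs;
  return true;
}
```

In a real database the check and the write happen in one statement, so only one of several competing nodes sees its update applied.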

Scenario 2 - Randomized node name, update and check

We generate a randomized node name, or possibly use a user-provided key for the current node. In the latter case the user is responsible for ensuring that the key is not used by any other node.

Next we simply update the column and refetch it after a short timeout to make sure that we got the lock and not some other node. This strategy is not entirely safe and should only be used in scenarios where neither CAS nor an UPDATE with a WHERE clause is available to guarantee that only a single update is made.
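The update-and-check flow could be sketched roughly as follows. `LockColumn`, `requestLock`, and `confirmLock` are hypothetical names for this example, and the short wait between the two steps is left to the caller.

```typescript
// Hypothetical store holding the single lock column.
class LockColumn {
  private value: string | null = null;

  // Blind update: the last writer wins.
  write(node: string): void {
    this.value = node;
  }

  read(): string | null {
    return this.value;
  }
}

// Step 1: write our node name without any condition.
function requestLock(column: LockColumn, node: string): void {
  column.write(node);
}

// Step 2, after a short wait: we hold the lock only if our name survived.
function confirmLock(column: LockColumn, node: string): boolean {
  return column.read() === node;
}
```

If another node writes after we have already confirmed, both nodes may believe they hold the lock, which is why this approach is only a fallback.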

Another alternative would be for the user to specify which node should receive the lock.

Decided implementation

To decide which worker node executes the migrations, every node generates a UUID and checks the current lock state. If there has been no activity on the current lock for 30 seconds, its holder is assumed dead. The next step inserts a lock request with the node's UUID and the execution date (database time). After inserting, we retrieve the lock requests again and check whether there is more than one. If there is more than one request and they can't be differentiated by date, the smallest UUID takes the lock and the others withdraw their requests. The lock holder starts the normal process and starts ticking; the failed requesters go back to the watching state.
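The selection rule described above might look like this in code. It is a sketch under assumptions: the `LockRequest` shape is made up, and treating the earliest request as the winner when dates differ is my reading of "differentiated by the date", not something the description states outright.

```typescript
// Hypothetical lock request as each node would insert it.
interface LockRequest {
  uuid: string;
  requestedAt: number; // database time, epoch milliseconds
}

// From the description: 30 seconds without activity means the holder is dead.
const DEAD_AFTER_MS = 30000;

// A lock counts as dead when there was never a tick or the last tick is too old.
function isLockDead(lastTick: number | null, now: number): boolean {
  return lastTick === null || now - lastTick >= DEAD_AFTER_MS;
}

// Assumption: the earliest request wins; identical dates fall back to the
// smallest UUID (plain string comparison).
function electWinner(requests: LockRequest[]): LockRequest | null {
  if (requests.length === 0) return null;
  return requests.reduce((best, r) => {
    if (r.requestedAt !== best.requestedAt) {
      return r.requestedAt < best.requestedAt ? r : best;
    }
    return r.uuid < best.uuid ? r : best;
  });
}
```

Every node can run `electWinner` on the same list of requests and reach the same answer, so no extra coordination round is needed once the requests have been read back.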

Relevant Issues

Refers to #464



@Gregoor

Gregoor commented Apr 5, 2018

Heya @wzrdtales, I'm a happy user of db-migrate, specifically in https://github.com/mozilla/voice-web. I was looking into building sth like this on the application side, though it'd be even better if it could be solved as part of this lib. Is there any way one can help with this?

@wzrdtales
Member Author

wzrdtales commented Apr 5, 2018

So I guess that demands background information :)

Basically, I have already made decisions on this item, but have not yet gotten to the point of implementing it. There is one big issue with this item:

It will break all the drivers, either partially or completely. The reason is that a new generalized method needs to be added to the drivers to provide the actual data structure. However, I have dealt with this problem already: I (1) made a decision and (2) put out a backwards compatibility plan to make things less painful in the beginning.

First things first: the decision was to break, but break it right. That means merging in the new loader at the same time and stepping up to major version 1, first to signal future stability and second to make breaking changes more easily visible, although I try to avoid them.

On to the second thing: breaking them right means only breaking when we need to, and avoiding breaking the same thing again in the future. Two things have to happen for this.

  1. Add two new generic methods to the drivers:
  • A method to create an ordered list structure
  • A method to create a (single) key-value structure

Both may be able to specify the maximum length of their information (value) field, but not much more.

  2. Scan for the old creation methods when the new ones are not available and fall back to those, but yell about it in the logs. This makes users aware of the problem without preventing them from moving on. This will hold true until the v2 migration schema is released, which is already planned to make heavy use of the state manager.
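The fallback plan could be sketched like this. Every name here (`Driver`, `createOrderedList`, `createKeyValue`, `legacyCreateTable`, `createStateStore`) is a placeholder invented for the example and not db-migrate's actual driver API.

```typescript
// Hypothetical driver surface; the method names are placeholders.
interface Driver {
  createOrderedList?: (name: string, maxValueLength?: number) => string;
  createKeyValue?: (name: string, maxValueLength?: number) => string;
  // Stands in for whatever the old creation methods are.
  legacyCreateTable?: (name: string) => string;
}

// Prefer the new generic method; fall back to the old one and warn loudly.
function createStateStore(
  driver: Driver,
  name: string,
  warn: (msg: string) => void
): string {
  if (driver.createKeyValue) {
    return driver.createKeyValue(name);
  }
  if (driver.legacyCreateTable) {
    warn(`driver lacks createKeyValue, falling back to legacy creation for ${name}`);
    return driver.legacyCreateTable(name);
  }
  throw new Error("driver supports neither the new nor the old creation methods");
}
```

An outdated driver keeps working but produces a log line on every run, which matches the "yell about it in the logs" idea.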

So, back to where we started: helping here pretty much starts with helping to write the state manager, if you're willing to. The only real reason it has not been done yet is, as always, time on my side. That said, I use db-migrate in my own current projects, which is where the missing time goes, and that always means I end up with more time for db-migrate too, because I have a current need and a benefit on the project to invest time in it. So that is pretty much it. If you want to give this a stab, I am very open to a PR, which I will, as always, be happy to help with.

@wzrdtales
Member Author

Also to add: the state manager is named for what it should do, because locking alone is not enough for future plans. See #508 and #533 for reference.

@wzrdtales
Member Author

The new loader is also already here in #537, but it is waiting for the state manager to be merged in.

@Gregoor

Gregoor commented Apr 9, 2018

Thanks for the detailed information! We also talked with our infrastructure team last week about whether we can solve it (for our case) in our build step, though that probably won't happen in the next few weeks. So I'm still interested in helping with the implementation, which would probably take me a couple of days (I don't have a good mental model of what needs to be done just yet). I doubt it'll be this sprint for me, but I'll get back to you when I find the time!

@wzrdtales
Member Author

@Gregoor My pleasure! Just hit me up when you've got some time. I can't guarantee anything, but at least I can say that I will work on that item quite soon anyway. One of the projects that I work on includes db-migrate in programmable mode, and as soon as it hits production I will need that solved, so there is a definite need for me behind it. If you get some time before I start, I will be happy to welcome your help! :)

wzrdtales added a commit that referenced this issue May 14, 2019
refers to #538

Signed-off-by: Tobias Gurtzick <magic@wizardtales.com>
@wzrdtales
Member Author

Just as a note, @Gregoor: currently there is some momentum on the development from my side, which is finally possible since I dedicated a time block to db-migrate. The first step will be the state manager as a dependency of the new migration schema; the next step will add concurrency control.

@Gregoor

Gregoor commented May 14, 2019

That's great to hear, sorry for the radio silence. We had to solve a coordination issue in our application, which also removed the need we had for this.

But if you need someone to rubber duck or review some code, feel free to hit me up ✋

@wzrdtales
Member Author

wzrdtales commented May 14, 2019

In modern setups I also separate the migrations into a setup step. But concurrency control will also allow some interesting features, like dynamic feature toggles without redeployment, which might be a reason to fall back to integrated migrations at that point. I will see how the design looks in the end, however.

I will reach out when the code hits a certain degree of maturity.
