Add startup lock support (general framework + tested etcd implementation) #6
Conversation
This required completely reworking the startup sequence in `autocluster.erl` without turning the existing dozen-step sequence into an incomprehensible mess. Steps are now more independent and are executed in order by a function that behaves roughly like a combined State + Either monad. The boot process is now split into 3 stages:

- The first stage roughly corresponds to the original implementation.
- The second stage is responsible for registering the node in the backend. TTL timers are now started there as well, instead of spreading `-rabbit_boot_steps` through backend modules. This makes `autocluster.erl` the single source of truth about what actually happens during startup.
- The third stage is responsible for releasing the startup lock. We need it because of the `ignore` failure mode: we want the lock to be released even if some prior steps failed.

During this rework it became evident that an explicit `mnesia:reset/0` is not needed (a more thorough explanation is in the comments for `autocluster:maybe_cluster/1`).

Startup locking support should work the same way for every backend, but for now the only working implementation is for `etcd`. It even has some property-based tests =) Other backends fall back to the original random-delay behaviour.

Also, it looks like `dialyzer` ignores type specs in comments: some of the types there were definitely wrong, and converting them to `-spec` caused a lot of barfing from `dialyzer`.
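To illustrate the "State + Either"-style step runner described above, here is a minimal sketch; the module and function names are invented for this example and are not taken from the PR. Each step receives the accumulated startup state and returns either an updated state or an error that short-circuits the remaining steps.

```erlang
%% Minimal sketch of a "State + Either"-style step runner.
%% Module and function names are illustrative, not taken from the PR.
-module(startup_steps_sketch).
-export([run/2]).

-type state() :: term().
-type step()  :: fun((state()) -> {ok, state()} | {error, term()}).

-spec run([step()], state()) -> {ok, state()} | {error, term()}.
run([], State) ->
    {ok, State};
run([Step | Rest], State) ->
    case Step(State) of
        {ok, NewState} ->
            %% thread the updated state into the next step
            run(Rest, NewState);
        {error, _Reason} = Err ->
            %% short-circuit: the remaining steps are skipped
            Err
    end.
```

With a runner like this, each boot stage can be expressed as a plain list of step functions, which is what keeps the dozen-step sequence readable.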
Force-pushed the …rabbitmq-autocluster_startup-locking-support branch from 6093cd4 to 4909873.
The plugin has to work even if the backend is not configured. Add unit tests for these cases.
Fix consul example
6. Build a [Docker image](https://github.com/rabbitmq/rabbitmq-autocluster/blob/master/Dockerfile)

   ```
   git clone https://github.com/rabbitmq/rabbitmq-autocluster.git rabbitmq-autocluster
   make dist
   ```
Here the users have to build the plugin from source. We can remove:

    git clone https://github.com/rabbitmq/rabbitmq-autocluster.git rabbitmq-autocluster
    make dist

once we put the Docker image on Docker Hub.
We will take a look after 0.7.0 GA is released.
@binarin @Gsantomaggio I don't see where in the code is …
OK, so GitHub does not display the entire diff again. The "find best node to join" part can be a bit too opinionated for inclusion into RabbitMQ core.
src/autocluster.erl (Outdated)

@@ -371,19 +382,34 @@ startup_delay(Max) ->

-spec backend_register(#startup_state{}) -> ok | {error, iolist()}.
backend_register(#startup_state{backend_module = Mod}) ->
    Mod:register().
    case erlang:function_exported(Mod, register, 0) of
Why not simply check if `Mod` is `unconfigured`? All backends should support these callbacks; otherwise it is a proper error.
If it is `unconfigured`, it raises the error `undef`.
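To make the discussion concrete, here is a hedged sketch of the fallback being debated. The record and function names follow the diff above, but the module loading check and the exact return values are assumptions for illustration, not necessarily what the PR does.

```erlang
%% Sketch only: skip registration when the backend module does not export
%% register/0 (e.g. a dummy/unconfigured backend), instead of crashing
%% with undef. Details are assumed, not copied from the PR.
-module(backend_register_sketch).
-export([backend_register/1]).

-record(startup_state, {backend_module :: module()}).

-spec backend_register(#startup_state{}) -> ok | {error, iolist()}.
backend_register(#startup_state{backend_module = Mod}) ->
    %% function_exported/3 only sees already-loaded modules, so load it first.
    _ = code:ensure_loaded(Mod),
    case erlang:function_exported(Mod, register, 0) of
        true  -> Mod:register();
        false -> ok   %% nothing to register against
    end.
```

Checking for the `unconfigured` atom directly, as suggested above, would be simpler; the `function_exported/3` check just generalises it to any backend that chooses not to implement `register/0`.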
examples/k8s_minikube/README.md (Outdated)

minikube service rabbitmq-deployment --namespace=test-rabbitmq

kubectl scale rabbitmq-deployment --replicas=4
Fails with:
$ kubectl scale rabbitmq-deployment --replicas=4
error: resource(s) were provided, but no name, label selector, or --all flag specified
This would work: kubectl scale --replicas=4 -f examples/k8s_minikube/rabbitmq.yaml
This error was in commit a8987df#diff-c9aed055be592401ae2140d96ca90ad3R77 and is already fixed here: https://github.com/rabbitmq/rabbitmq-autocluster/pull/6/files#diff-c9aed055be592401ae2140d96ca90ad3R94
kubectl scale deployment/rabbitmq-deployment --namespace=test-rabbitmq --replicas=6
Fix unit tests by adding autocluster:new_startup_state().
Add consul link example
Fix consul link
Conflicts: src/autocluster_k8s.erl
Available as of 0.8.0.M1.
Sadly …
@binarin thanks, I will edit out that line now.
Moved the PR to aweber#98.