[Feature][Registry] Add MySQL as registry plugin #10408

Closed
3 tasks done
ruanwenjun opened this issue Jun 11, 2022 · 3 comments · Fixed by #10406

Comments

@ruanwenjun
Member

ruanwenjun commented Jun 11, 2022

Search before asking

  • I had searched in the issues and found no similar feature requirement.

Description

Right now, we only support ZooKeeper as a registry plugin. This can be a problem for users who want to deploy DS but don't have a ZooKeeper cluster. For example, if I don't have an existing ZooKeeper cluster and I want to deploy a DS cluster with 2 masters and 2 workers, I also need to deploy a reliable ZooKeeper cluster with 3 instances.

DS already relies on a database to store workflow metadata. This issue proposes a way for DS to use the database as the registry center that stores master/worker metadata.

Design

DS uses the registry to do three things:

  1. Store the metadata of masters/workers so that it is notified when nodes go up or down.
  2. Store worker metadata for load balancing.
  3. Acquire a global lock during failover.

So for DS, the registry needs to notify a server when the data it subscribes to is added, deleted, or updated; support creating and releasing a global lock; and delete a server's metadata when that server goes down.

Subscribe/Notify

MySQL doesn't support subscribe/notify, so we need to poll the data and work out whether a subscribed listener must be notified.
A scheduler thread will periodically load the data from the database and compare it with the previously loaded version. If the data has changed, it triggers the subscribed listeners.
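
The poll-and-compare step could look roughly like the sketch below. It is only an illustration in Python, not the plugin's code: `Event`, `diff_snapshots`, `PollingWatcher`, and the prefix-based subscription are hypothetical names and choices, and `load_snapshot` stands in for a query that reads every key/data pair from the registry table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Event:
    kind: str              # "ADD", "UPDATE", or "REMOVE" (hypothetical event names)
    key: str
    data: Optional[str]


def diff_snapshots(previous: Dict[str, str], current: Dict[str, str]) -> List[Event]:
    """Compare the last loaded version with the newly loaded one."""
    events: List[Event] = []
    for key, data in current.items():
        if key not in previous:
            events.append(Event("ADD", key, data))
        elif previous[key] != data:
            events.append(Event("UPDATE", key, data))
    for key, data in previous.items():
        if key not in current:
            events.append(Event("REMOVE", key, data))
    return events


class PollingWatcher:
    """Assumed shape of the scheduler thread's work, one poll at a time."""

    def __init__(self, load_snapshot: Callable[[], Dict[str, str]]):
        self._load_snapshot = load_snapshot   # stand-in for a SELECT on the data table
        self._last: Dict[str, str] = {}
        self._listeners: List[Tuple[str, Callable[[Event], None]]] = []

    def subscribe(self, prefix: str, listener: Callable[[Event], None]) -> None:
        self._listeners.append((prefix, listener))

    def poll_once(self) -> None:
        current = dict(self._load_snapshot())  # copy so later changes don't alias
        for event in diff_snapshots(self._last, current):
            for prefix, listener in self._listeners:
                if event.key.startswith(prefix):
                    listener(event)
        self._last = current
```

A scheduler thread would call `poll_once()` at a fixed interval, so the polling interval bounds how late a listener can learn about a change.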

Ephemeral Node

An ephemeral node represents a connection: if the connected server goes down, the ephemeral node disappears.
Each server runs a SchedulerThread that holds the ephemeral nodes created by that server and updates their lastUpdateTime. If an ephemeral node's lastUpdateTime has not been updated for a while, the node is cleared.
On each scheduled run, the server renews the term of the ephemeral nodes it created and clears all ephemeral nodes that have expired.
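
As a rough illustration of the renew-and-clear cycle, here is a small Python sketch. Everything in it is an assumption made for the example: a plain dict stands in for the registry data table, `ServerSession` and `tick` are hypothetical names, and time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

EPHEMERAL = 1    # matches the `type` values in the table design
PERSISTENT = 2


@dataclass
class Row:
    data: str
    type: int
    last_update_time: float


class ServerSession:
    """One server's view: it remembers which ephemeral keys it created."""

    def __init__(self, table: Dict[str, Row], expire_seconds: float):
        self.table = table                  # stand-in for the registry data table
        self.expire_seconds = expire_seconds
        self.owned: Set[str] = set()

    def create_ephemeral(self, key: str, data: str, now: float) -> None:
        self.table[key] = Row(data, EPHEMERAL, now)
        self.owned.add(key)

    def tick(self, now: float) -> List[str]:
        """One scheduled run: renew own nodes, then clear expired ones."""
        for key in self.owned:
            row = self.table.get(key)
            if row is not None:
                row.last_update_time = now
        expired = [
            key for key, row in self.table.items()
            if row.type == EPHEMERAL
            and now - row.last_update_time > self.expire_seconds
        ]
        for key in expired:
            del self.table[key]
        return expired
```

If a server stops calling `tick`, its nodes stop being renewed, and the next run on any surviving server removes them once the expiry window has passed. Persistent rows are never touched by the cleanup.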


Global Lock

The global lock is designed the same way as the ephemeral node: a table stores the lock info, and each server renews the term of the locks it holds and clears expired locks.
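
One way such a lock table could be used is sketched below in Python. The issue does not spell out how a lock is acquired, so acquiring by inserting a row and relying on the unique key is an assumption of this example. SQLite (from the Python standard library) stands in for MySQL, and the table name, column names, and function names are hypothetical.

```python
import sqlite3


def open_lock_table() -> sqlite3.Connection:
    # In-memory SQLite stand-in for the MySQL lock table (simplified columns).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registry_lock ("
        " lock_key TEXT NOT NULL UNIQUE,"
        " lock_owner TEXT NOT NULL,"
        " last_term REAL NOT NULL)"
    )
    return conn


def try_acquire(conn: sqlite3.Connection, key: str, owner: str, now: float) -> bool:
    """Acquire by inserting a row; the unique key rejects a second holder."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO registry_lock (lock_key, lock_owner, last_term)"
                " VALUES (?, ?, ?)",
                (key, owner, now),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def renew(conn: sqlite3.Connection, owner: str, now: float) -> None:
    """Refresh the term of every lock held by this owner."""
    with conn:
        conn.execute(
            "UPDATE registry_lock SET last_term = ? WHERE lock_owner = ?",
            (now, owner),
        )


def release(conn: sqlite3.Connection, key: str, owner: str) -> None:
    """Delete the lock row, but only if this owner holds it."""
    with conn:
        conn.execute(
            "DELETE FROM registry_lock WHERE lock_key = ? AND lock_owner = ?",
            (key, owner),
        )


def clear_expired(conn: sqlite3.Connection, now: float, expire_seconds: float) -> int:
    """Remove locks whose term was not renewed in time; returns rows removed."""
    with conn:
        cursor = conn.execute(
            "DELETE FROM registry_lock WHERE last_term < ?",
            (now - expire_seconds,),
        )
    return cursor.rowcount
```

A caller that fails `try_acquire` would retry after a delay. If the holder crashes, its lock is only freed after the expiry window, so that window is a trade-off between failover latency and tolerance for slow renewals.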

Table Design

Two tables need to be created if you want to use MySQL as the registry:

  • t_ds_mysql_registry_data: stores data about ephemeral/persistent nodes.
  • t_ds_mysql_registry_lock: stores the global lock data.

CREATE TABLE `t_ds_mysql_registry_data`
(
    `id`               bigint(11)   NOT NULL AUTO_INCREMENT COMMENT 'primary key',
    `key`              varchar(200) NOT NULL COMMENT 'key, like zookeeper node path',
    `data`             varchar(200) NOT NULL COMMENT 'data, like zookeeper node value',
    `type`             tinyint(4)   NOT NULL COMMENT '1: ephemeral node, 2: persistent node',
    `last_update_time` timestamp    NULL COMMENT 'last update time',
    `create_time`      timestamp    NULL COMMENT 'create time',
    PRIMARY KEY (`id`),
    unique (`key`)
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8;

CREATE TABLE `t_ds_mysql_registry_lock`
(
    `id`               bigint(11)   NOT NULL AUTO_INCREMENT COMMENT 'primary key',
    `key`              varchar(200) NOT NULL COMMENT 'lock path',
    `lock_owner`       varchar(100) NOT NULL COMMENT 'the lock owner, ip_processId',
    `last_term`        timestamp    NOT NULL COMMENT 'last term time',
    `last_update_time` timestamp    NULL COMMENT 'last update time',
    `create_time`      timestamp    NULL COMMENT 'lock create time',
    PRIMARY KEY (`id`),
    unique (`key`)
) ENGINE = InnoDB
  DEFAULT CHARSET = utf8;

Use case

Users who don't have a ZooKeeper cluster can still run DolphinScheduler by using MySQL as the registry.

Related issues

No response

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!


@ruanwenjun added the "feature" and "Waiting for reply" labels on Jun 11, 2022
@github-actions

Thank you for your feedback. We have received your issue; please wait patiently for a reply.

  • To help us understand your request as soon as possible, please provide detailed information, the version, or pictures.
  • If you haven't received a reply for a long time, you can join our Slack and send your question to the #troubleshooting channel.

@mgsky1
Contributor

mgsky1 commented Jun 11, 2022

I have a suggestion. After development is complete, would you mind writing a report on the performance impact on the DB, the master server, and the workers? I would like to know how much this affects performance.

@ruanwenjun
Member Author

I have a suggestion. After development is complete, would you mind writing a report on the performance impact on the DB, the master server, and the workers? I would like to know how much this affects performance.

OK, I will provide a report on database performance. Each server will run three scheduled threads that scan MySQL at a given interval.
