MongoDB ReplicaSet Manager for Docker Swarm

Version: 1.03

Introduction

This tool automates the configuration, initiation, monitoring, and management of a MongoDB replica set within a Docker Swarm environment. It ensures continuous operation and adapts to changes within the Swarm network, maintaining high availability and data consistency.

Features

  • Automated ReplicaSet Initialization: configures and initiates a MongoDB replica set from scratch, determining how many nodes in the MongoDB global service to wait for while accounting for 'down' or 'unavailable' nodes in the Swarm.

  • Primary Node Tracking & Configuration: automatically designates and tracks the replica set primary on new and existing deployments.

  • Dynamic ReplicaSet Reconfiguration: adjusts the replica set as MongoDB instance IPs change within Docker Swarm. Detects whether a replica set is already configured or being redeployed and adjusts its members accordingly.

  • Resilience and Redundancy: ensures the replica set's stability and availability even during node changes. If the primary node is lost, it waits for a new election, or forces a reconfiguration when the replica set is inconsistent.

  • Mongo Admin and Init User Setup: automates the creation of the MongoDB admin (root) account, as well as an initial database user and associated database/collection insertion for the main application/service using Mongo.

  • Continuous Monitoring: watches for changes in the Docker Swarm topology, continuously listening for IP address changes and MongoDB instance additions or removals, and adjusts the replica set accordingly. Tested against a wide variety of potential outage edge cases to ensure reliability.

  • Error Handling and Detailed Logging: provides comprehensive logging for efficient troubleshooting.

  • Scalability: designed to work with multiple nodes in a Docker Swarm environment and scale the ReplicaSet automatically as MongoDB nodes are added to or removed from the stack.
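
For example, once a deployment is up, you can spot-check the elected primary yourself from any node running a mongo task (a hedged sketch - the container name filter and credentials below are the defaults used throughout this README; adjust them to your deployment):

    # Ask the first local mongo container which replica set member is PRIMARY
    docker exec "$(docker ps -q --filter name=database | head -n1)" \
      mongosh -u root -p password123 --authenticationDatabase admin --quiet \
      --eval 'rs.status().members.filter(m => m.stateStr === "PRIMARY").map(m => m.name)'

The address printed should match the "--> Mongo ReplicaSet Primary is: ... <--" line in the dbcontroller logs (see Troubleshooting / Additional Details).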

Requirements

  • MongoDB: version 6.0 and above (recipe uses 7.0.6).
  • PyMongo Driver: 4.5.0 and above - included (image uses 4.6.2).
  • Docker: tested on >= 24.0.5.
  • Operating System: Linux (tested on Ubuntu 23.04). The mongo-replica-ctrl image supports linux/amd64, linux/arm/v7, and linux/arm64.

Prerequisites

A running Docker Swarm cluster. For high availability, the Swarm should have more than one manager node (see Swarm Nodes under Troubleshooting / Additional Details).

How to Use

TL;DR:

  • git clone https://github.com/JackieTreeh0rn/MongoDB-ReplicaSet-Manager
  • ./deploy.sh

  1. Ensure that all required environment variables are set in [`mongo-rs.env`](./mongo-rs.env) (see Environment Variables below).
  2. Modify docker-compose-stack.yml to add your main application that uses the mongo service. Note - set your application's MongoDB URI to the following connection string when connecting to the replica set service:

    mongodb://${MONGO_ROOT_USERNAME}:${MONGO_ROOT_PASSWORD}@database:27017/?replicaSet=${REPLICASET_NAME}

  3. Deploy the compose stack on your Docker Swarm using the deploy.sh script via ./deploy.sh - this performs the following actions (see the sketch after this list):

    • Import environment variables.
    • Create the backend 'overlay' network with encryption enabled.
    • Generate a keyfile for the replicaSet and add it as a Docker secret for the stack to use.
    • Spin up the docker stack services: mongo, dbcontroller, nosqlclient, and your application or service.
    • Run the mongo service in global mode (one instance per Swarm node) and the dbcontroller as a single instance, as defined in the compose YML.
  4. Monitor the logs for the tool's output and any potential errors or adjustments (see the Troubleshooting section).

  5. To remove, run ./remove.sh or delete the stack manually via docker stack rm [stackname]. Note: the _backend 'overlay' network created during initial deployment will not be removed automatically, as it is considered external. If redeploying/updating, leave the existing network in place so as to retain the original network subnet.
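
For reference, the actions performed by deploy.sh map roughly to the following manual commands (a sketch under assumptions - the secret and keyfile names here are illustrative; the authoritative versions live in deploy.sh):

    # Import environment variables from mongo-rs.env
    set -a; source mongo-rs.env; set +a

    # Create the encrypted 'overlay' backend network (left in place between deploys)
    docker network create --driver overlay --opt encrypted "${BACKEND_NETWORK_NAME}"

    # Generate a replicaSet keyfile and register it as a Docker secret
    openssl rand -base64 756 > mongodb.key
    docker secret create mongodb-keyfile mongodb.key

    # Deploy the stack services: mongo, dbcontroller, nosqlclient, your application
    docker stack deploy -c docker-compose-stack.yml "${STACK_NAME}"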

Environment Variables

The script requires the following environment variables, defined in mongo-rs.env:

  • STACK_NAME, the default value is myapp
  • MONGO_VERSION, the default value is 7.0.6
  • REPLICASET_NAME, the default value is rs
  • BACKEND_NETWORK_NAME, the default value is ${STACK_NAME}_backend
  • MONGO_SERVICE_URI, the default value is ${STACK_NAME}_database
  • MONGO_ROOT_USERNAME, the default value is root
  • MONGO_ROOT_PASSWORD, the default value is password123
  • INITDB_DATABASE, the default value is myinitdatabase
  • INITDB_USER, the default value is mydbuser
  • INITDB_PASSWORD, the default value is password
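
Putting the defaults together, a minimal mongo-rs.env looks like this (the values shown are the documented defaults - change the credentials for any real deployment; the shell-style variable references assume deploy.sh sources the file as a shell script):

    STACK_NAME=myapp
    MONGO_VERSION=7.0.6
    REPLICASET_NAME=rs
    BACKEND_NETWORK_NAME=${STACK_NAME}_backend
    MONGO_SERVICE_URI=${STACK_NAME}_database
    MONGO_ROOT_USERNAME=root
    MONGO_ROOT_PASSWORD=password123
    INITDB_DATABASE=myinitdatabase
    INITDB_USER=mydbuser
    INITDB_PASSWORD=password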

How It Works

  • The tool first identifies and assesses the status of MongoDB services in the Docker Swarm.
  • It then either initializes a new MongoDB replica set or manages an existing one based on the current state.
  • Continuous monitoring allows the tool to adapt the replica set configuration in response to changes in the Swarm network, such as node additions or removals, reboots, shutdowns, etc.
  • The nosqlclient service included in the recipe can be used to access and manage the db - after launching the nosqlclient front end via http://<any-swarm-node-ip>:3030, click Connect and select a database to view/manage.
  • The included compose YML uses the latest version available on Docker Hub via jackietreehorn/mongo-replica-ctrl. Alternatively, you can run docker pull jackietreehorn/mongo-replica-ctrl:latest to pull the latest version and push it to your own registry. The included ./build.sh also lets you build the Docker image locally.
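
If you would rather host the controller image yourself, the pull-tag-push flow looks like this (the target registry/repository is a placeholder):

    docker pull jackietreehorn/mongo-replica-ctrl:latest
    docker tag jackietreehorn/mongo-replica-ctrl:latest <your-registry>/mongo-replica-ctrl:latest
    docker push <your-registry>/mongo-replica-ctrl:latest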

Troubleshooting / Additional Details

  • Logs - check the Docker service logs for the mongo controller service for details about its operation (enable DEBUG:1 in the compose YML if you want more detail). If you do not use Portainer or a similar web front end to manage Docker, you can follow the controller logs via the CLI on one of your Docker nodes: docker service logs [servicename]_dbcontroller --follow

    Example:

    docker service logs myapp_dbcontroller --follow --details
    
    | INFO:__main__:Expected number of mongodb nodes: {6} | Remaining to start: {0}
    | INFO:__main__:Mongo service nodes are up and running!
    | INFO:__main__:Mongo tasks ips: ['10.0.26.48', '10.0.26.52', '10.0.26.51', '10.0.26.49', '10.0.26.7', '10.0.26.4']
    | INFO:__main__:Inspecting Mongo nodes for pre-existing replicaset - this might take a few moments, please be patient...
    | INFO:__main__:Pre-existing replicaSet configuration found in node 10.0.26.48: {'10.0.26.52', '10.0.26.51', '10.0.26.49', '10.0.26.4', '10.0.26.7', '10.0.26.48'}
    | INFO:__main__:Checking Task IP: 10.0.26.52 for primary...
    | INFO:__main__:Checking Task IP: 10.0.26.51 for primary...
    | INFO:__main__:Checking Task IP: 10.0.26.7 for primary...
    | INFO:__main__:Checking Task IP: 10.0.26.48 for primary...
    | INFO:__main__:--> Mongo ReplicaSet Primary is: 10.0.26.48 <--
    
    
  • Environment - verify that all required environment variables are correctly set in mongo-rs.env.
  • Docker Stack Compose YML - ensure that the MongoDB service is correctly configured and accessible within the Docker Swarm - see the compose file for a standard configuration. The dbcontroller that maintains the status of the replica set must be deployed as a single instance on a Swarm manager node (see docker-compose-stack.yml); multiple instances of the controller may perform conflicting actions! Also, to ensure that the controller is restarted in case of error, there is a restart policy in the controller service definition.

    IMPORTANT: The default MongoDB port is 27017. This port is only used internally by the services/applications in the compose YML and is not published outside the Swarm by design. Changing or publishing this port in the YML configuration will break management of the MongoDB replicaSet.

  • Firewalls / SELinux - Linux distributions using SELinux are well known for causing issues with MongoDB. To check whether your distribution uses SELinux, run sestatus, and either disable it or configure it for MongoDB if you absolutely must use it. Additionally, ensure your distribution's firewall is disabled during testing or configured for Mongo - check your distribution's docs for the appropriate steps (e.g. systemctl status firewalld, ufw status, etc.).

  • Networking - the _backend 'overlay' external network created during initial deployment is assigned an address space (e.g. 10.0.25.0) automatically by Docker. In the event of overlap with other subnets in your network, you can define your own network space by uncommenting the relevant section in deploy.sh and adjusting as needed (this should only be necessary on extremely rare occasions) - see the example below. In addition, DO NOT remove this network when re-deploying/updating your stack on top of an existing, working replicaSet configuration, so as to avoid subnet changes and connectivity issues between re-deployments.
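
    Example (illustrative values - the subnet and network name are placeholders; deploy.sh contains the commented-out equivalent to adjust):

    docker network create \
      --driver overlay \
      --opt encrypted \
      --subnet 10.0.25.0/24 \
      ${STACK_NAME}_backend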

  • Persistent Data - to use data persistence, the mongo service needs to be deployed in global mode (see docker-compose-stack.yml). This avoids more than one instance being deployed on the same node and prevents different instances from concurrently accessing the same MongoDB data space on the filesystem. The volumes defined in the compose YML allow each mongo node to use its own dedicated data store. They are also set as external so that they aren't inadvertently deleted or recreated between service redeployments.
  • Swarm Nodes - for HA purposes, your Swarm cluster should have more than one manager. This allows the controller to start/restart on different nodes in case of issues.
  • Healthchecks - the Mongo health check script mongo-healthcheck serves only to verify the status of the MongoDB service; no check on Mongo cluster status is made. The cluster status is checked and managed by the dbcontroller service. I use Docker Configs to pass the MongoDB health check script to the MongoDB containers - this is done automatically by Docker once the compose stack is deployed.
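
    As a rough illustration, a liveness-only probe amounts to something like this (a sketch, not necessarily the repo's actual mongo-healthcheck script):

    # Exit 0 only if the local mongod answers a ping - this says nothing about replica set health
    mongosh --quiet --eval 'db.adminCommand("ping").ok' | grep -q 1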

  • MongoDB Configuration Check - the ./docker-mongodb_config-check.sh script can be run from any Docker manager node to locate and connect to a MongoDB instance in the Swarm and fetch configuration information. It runs rs.status() and rs.config() and returns the output. This can help in validating/correlating the PRIMARY shown in the config against the dbcontroller logs, in addition to providing other relevant configuration information for your replicaSet.

    Example:

    ./docker-mongodb_config-check.sh
    
    members: [
        {
        _id: 1,
        name: '10.0.26.51:27017',
        health: 1,
        state: 2,
        stateStr: 'SECONDARY',
        uptime: 20842,
        optime: { ts: Timestamp({ t: 1701196480, i: 1 }), t: Long("26") },
        optimeDurable: { ts: Timestamp({ t: 1701196480, i: 1 }), t: Long("26") },
        optimeDate: ISODate("2023-11-28T18:34:40.000Z"),
        optimeDurableDate: ISODate("2023-11-28T18:34:40.000Z"),
        lastAppliedWallTime: ISODate("2023-11-28T18:34:40.505Z"),
        lastDurableWallTime: ISODate("2023-11-28T18:34:40.505Z"),
        lastHeartbeat: ISODate("2023-11-28T18:34:54.484Z"),
        lastHeartbeatRecv: ISODate("2023-11-28T18:34:54.798Z"),
        pingMs: Long("6"),
        lastHeartbeatMessage: '',
        syncSourceHost: '10.0.26.52:27017',
        syncSourceId: 5,
        infoMessage: '',
        configVersion: 1521180,
        configTerm: 26
        },
        {
        _id: 2,
        name: '10.0.26.48:27017',
        health: 1,
        state: 1,
        stateStr: 'PRIMARY',     <-------------------------- SHOULD match log's output for Primary
        uptime: 20843,
        optime: { ts: Timestamp({ t: 1701196480, i: 1 }), t: Long("26") },
        optimeDurable: { ts: Timestamp({ t: 1701196480, i: 1 }), t: Long("26") },
        optimeDate: ISODate("2023-11-28T18:34:40.000Z"),
        optimeDurableDate: ISODate("2023-11-28T18:34:40.000Z"),
        lastAppliedWallTime: ISODate("2023-11-28T18:34:40.505Z"),
        lastDurableWallTime: ISODate("2023-11-28T18:34:40.505Z"),
        lastHeartbeat: ISODate("2023-11-28T18:34:54.698Z"),
        lastHeartbeatRecv: ISODate("2023-11-28T18:34:55.156Z"),
        pingMs: Long("8"),
        lastHeartbeatMessage: '',
        syncSourceHost: '',
        syncSourceId: -1,
        infoMessage: '',
        electionTime: Timestamp({ t: 1701152367, i: 1 }),
        electionDate: ISODate("2023-11-28T06:19:27.000Z"),
        configVersion: 1521180,
        configTerm: 26
        }
    ]
  • Service Start-up - depending on the number of nodes in your Swarm and your connection speed, it might take some time for images to download, for the MongoDB instances to spin up, and for the replica manager to configure the replica set. Services in the compose stack recipe that depend on the Mongo database being operational (nosqlclient, [your mongo application], etc.) should be allowed enough time to start before showing as READY, particularly on an initial, blank-slate deployment. Additionally, Docker might fail/restart services that depend on MongoDB during start-up if the mongo service isn't yet ready and configured - this is normal for initial deployments, and services will connect to Mongo once it is available.

    MongoDB operating in replicaset mode will not become available for use until the replicaset configuration is finalized and a primary instance is elected.

Contact