A bundle for implementing application level sharding on traditional databases.

ClearTax/dropwizard-db-sharding-bundle

 
 


Dropwizard DB Sharding Bundle

This library adds support for database sharding in Dropwizard-based applications. Make sure you're familiar with Dropwizard, a dependency injection framework like Guice, and concepts like ThreadLocals before going ahead.

License: Apache 2.0

Stable version: 0.2.18

Why this library?

  1. Traditionally in Dropwizard, @UnitOfWork is put on every Jersey resource method to open a database transaction. When @UnitOfWork is required in places other than Jersey resources, a more verbose approach is required, as mentioned here.
  2. Nested @UnitOfWork annotations don't work in Dropwizard, but they are sometimes needed in a sharding use-case.
  3. Dropwizard doesn't support Hibernate's multi-tenancy.
  4. The original library from which this one was forked advocates managing database transactions only at the DAO layer. While that may work for their use-case, Cleartax is a finance system, so a clean rollback of a transaction (which may span multiple entities in some cases) is very important.

This library solves the above problems by:

  1. Using Guice method interceptors to seamlessly use @UnitOfWork anywhere in the code. Assumption: methods annotated with @UnitOfWork aren't private, and the class is created via Guice dependency injection.
  2. Handling nested @UnitOfWork within the same thread. How?
  3. Using Hibernate's multi-tenancy support and integrating it with Dropwizard.
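
One common way to make nesting safe within a thread is a per-thread depth counter, so that only the outermost call begins and commits the transaction. The following is a minimal plain-Java sketch of that idea under stated assumptions; it is not the library's implementation, and `NestedUnitOfWorkSketch`, `inUnitOfWork` and `EVENTS` are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration; not the library's implementation.
final class NestedUnitOfWorkSketch {
    // Per-thread nesting depth: 0 means no unit of work is open on this thread.
    private static final ThreadLocal<Integer> DEPTH = ThreadLocal.withInitial(() -> 0);

    // Records begin/commit/rollback so the behaviour can be observed.
    static final List<String> EVENTS = Collections.synchronizedList(new ArrayList<>());

    static <T> T inUnitOfWork(Supplier<T> body) {
        boolean outermost = DEPTH.get() == 0;
        if (outermost) {
            EVENTS.add("begin"); // a real implementation would open a session and transaction here
        }
        DEPTH.set(DEPTH.get() + 1);
        try {
            T result = body.get();
            if (outermost) {
                EVENTS.add("commit");
            }
            return result;
        } catch (RuntimeException e) {
            if (outermost) {
                EVENTS.add("rollback"); // inner failures propagate; the outermost call rolls back everything
            }
            throw e;
        } finally {
            DEPTH.set(DEPTH.get() - 1);
        }
    }

    public static void main(String[] args) {
        Integer total = inUnitOfWork(() -> inUnitOfWork(() -> 41) + 1);
        System.out.println(total + " " + EVENTS); // prints "42 [begin, commit]"
    }
}
```

In this sketch the inner call joins the transaction already open on the thread, which is what gives a single clean rollback across multiple entities.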

How to Use

Terminology

  1. Shard and tenant mean the same thing: the physical database.
  2. Shard-id and tenant-id also mean the same thing: the id of the physical database. (Refer to point 2 in the High level section.)
  3. Shard-key is the ID on which data is sharded, e.g. if you're sharding by user-id, then user-id becomes your shard-key.
  4. Bucket is an intermediate virtual shard to which a shard-key gets mapped.
  5. Bucket-id gets mapped to the shard-id.
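
The two-step mapping in the terms above (shard-key to bucket, bucket to shard) can be sketched in plain Java as follows. This is an illustration under assumptions, not the library's code: the bucket count of 8, the hash-based key-to-bucket rule, the shard names and the class `ShardRoutingSketch` are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of shard-key -> bucket -> shard routing.
final class ShardRoutingSketch {
    // Assumed number of virtual buckets; it stays fixed so that keys keep their bucket.
    static final int BUCKET_COUNT = 8;

    // Bucket-id -> shard-id table. In practice this could live in a database table.
    static final Map<Integer, String> BUCKET_TO_SHARD = new LinkedHashMap<>();

    static {
        for (int bucket = 0; bucket < BUCKET_COUNT; bucket++) {
            BUCKET_TO_SHARD.put(bucket, bucket < 4 ? "shard1" : "shard2");
        }
    }

    // Assumption: the shard-key is hashed into a bucket. floorMod keeps the result non-negative.
    static int bucketFor(String shardKey) {
        return Math.floorMod(shardKey.hashCode(), BUCKET_COUNT);
    }

    // Resolve the physical shard by going through the bucket table.
    static String shardFor(String shardKey) {
        return BUCKET_TO_SHARD.get(bucketFor(shardKey));
    }

    public static void main(String[] args) {
        String key = "user-42";
        System.out.println(key + " -> bucket " + bucketFor(key) + " -> " + shardFor(key));
    }
}
```

With an intermediate bucket, moving data to a new physical database only requires changing rows in the bucket-to-shard table; the key-to-bucket rule stays untouched.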

High level

  1. Include as dependency:
<dependency>
    <groupId>in.cleartax.dropwizard</groupId>
    <artifactId>sharding-core</artifactId>
    <version>0.2.8</version>
</dependency>

For Liquibase migration support, you can also include:

<dependency>
    <groupId>in.cleartax.dropwizard</groupId>
    <artifactId>sharding-migrations</artifactId>
    <version>0.2.8</version>
</dependency>
  2. Update your Dropwizard YML config to declare all the hosts as described here.

  3. Understand Guice's AbstractModule and @Provides.

  4. Define all the dependencies in an extension of AbstractModule that binds all your classes. For an example, refer to this.

  5. Register your module in your Dropwizard application as described here.

Low level

Consider this method, which is annotated with @UnitOfWork.

UnitOfWorkInterceptor uses Guice's AOP to intercept the method call, figures out the shard-id/tenant-id, and then initiates the transaction. For this to work, you need to do the following:

Map all the shard-keys to buckets

UnitOfWorkInterceptor calls the implementation of ShardKeyProvider to map the shard-key to a bucket.

Map all the buckets to shards

UnitOfWorkInterceptor calls the implementation of BucketResolver to figure out the shard-id. Refer to DbBasedShardResolver for one example use-case. In that example, refer to this SQL script, where the mappings are defined.
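
The flow described so far (intercept an annotated method, map the shard-key to a bucket, map the bucket to a shard, then run the method against that shard) can be sketched without any dependencies. The sketch below deliberately uses JDK dynamic proxies in place of Guice AOP and plain functions in place of ShardKeyProvider and BucketResolver, whose real signatures are not shown here. `DemoUnitOfWork`, `OrderService`, `SHARD_KEY`, `ACTIVE_SHARD` and `intercept` are hypothetical names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Hypothetical illustration of the interception flow; not the library's UnitOfWorkInterceptor.
final class InterceptorFlowSketch {
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface DemoUnitOfWork {}

    interface OrderService {
        @DemoUnitOfWork
        String placeOrder(String item);
    }

    // Stand-ins for the per-thread shard-key and the currently selected shard.
    static final ThreadLocal<String> SHARD_KEY = new ThreadLocal<>();
    static final ThreadLocal<String> ACTIVE_SHARD = new ThreadLocal<>();

    static <T> T intercept(Class<T> type, T target,
                           Function<String, Integer> keyToBucket,
                           Function<Integer, String> bucketToShard) {
        InvocationHandler handler = (proxy, method, args) -> {
            try {
                if (!method.isAnnotationPresent(DemoUnitOfWork.class)) {
                    return method.invoke(target, args); // not annotated: pass through
                }
                // shard-key -> bucket -> shard
                String shardId = bucketToShard.apply(keyToBucket.apply(SHARD_KEY.get()));
                ACTIVE_SHARD.set(shardId); // a real interceptor would open the transaction here
                try {
                    return method.invoke(target, args);
                } finally {
                    ACTIVE_SHARD.remove();
                }
            } catch (InvocationTargetException e) {
                throw e.getCause(); // surface the target's own exception
            }
        };
        return type.cast(Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler));
    }

    public static void main(String[] args) {
        OrderService real = item -> item + "@" + ACTIVE_SHARD.get();
        OrderService proxied = intercept(OrderService.class, real, key -> 3, bucket -> "shard2");
        SHARD_KEY.set("user-42");
        try {
            System.out.println(proxied.placeOrder("book")); // prints "book@shard2"
        } finally {
            SHARD_KEY.remove();
        }
    }
}
```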

Connect to the right shard, by

  1. Setting up the shard-key for every incoming HTTP request - Refer to ShardKeyFeature in the example project. Note: ShardKeyProvider in the example is bound to its implementation in the Guice module described earlier.

  2. Setting up the shard-key manually - Get an instance of ShardKeyProvider and then do:

try {
  shardKeyProvider.setKey("shard-key");
  // Call your method which is annotated with @UnitOfWork
} finally {
  shardKeyProvider.clear();
}

Example use-case: You're invoking your code outside of the HTTP layer, or you're creating a child thread which may not have all the context of its parent.
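
To see why a child thread may lack the parent's context, here is a small sketch that assumes the shard-key is held in a ThreadLocal (how the real ShardKeyProvider stores it is not stated here). `KeyHolder` is a hypothetical stand-in that mirrors only the setKey/clear calls shown above, plus an assumed getKey.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration; KeyHolder is not the library's ShardKeyProvider.
final class ChildThreadKeySketch {
    static final class KeyHolder {
        private final ThreadLocal<String> key = new ThreadLocal<>();
        void setKey(String value) { key.set(value); }
        String getKey() { return key.get(); }
        void clear() { key.remove(); }
    }

    // Runs a task on a separate thread and returns the shard-key that thread observed.
    // Pass keyForChild = null to show what happens when nothing is set in the child.
    static String runInChild(KeyHolder holder, String keyForChild) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = () -> {
                if (keyForChild != null) {
                    holder.setKey(keyForChild); // set the key inside the child thread
                }
                try {
                    return holder.getKey(); // the @UnitOfWork method would be called here
                } finally {
                    holder.clear(); // pooled threads are reused, so always clear
                }
            };
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        KeyHolder holder = new KeyHolder();
        holder.setKey("user-42"); // set on the parent thread only
        try {
            System.out.println(runInChild(holder, null));      // prints "null"
            System.out.println(runInChild(holder, "user-42")); // prints "user-42"
        } finally {
            holder.clear();
        }
    }
}
```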

  3. Connect to a shard by explicitly specifying the shard-id
try {
  DelegatingTenantResolver.getInstance().setDelegate(new ConstTenantIdentifierResolver("your shard-id/tenant-id"));
  // Call your method which is annotated with @UnitOfWork
} finally {
  if (DelegatingTenantResolver.getInstance().hasTenantIdentifier()) {
      DelegatingTenantResolver.getInstance().setDelegate(null);
  }
}

Example use-case: This might be useful when you're aggregating data from across all the shards.
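
A cross-shard aggregation typically repeats that set-then-reset pattern once per shard. The sketch below shows only the loop shape; it uses a hypothetical `CURRENT_SHARD` ThreadLocal in place of DelegatingTenantResolver, and an in-memory map in place of real per-shard queries.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration of visiting every shard in turn.
final class CrossShardAggregationSketch {
    // Stand-in for the explicitly selected shard-id/tenant-id.
    static final ThreadLocal<String> CURRENT_SHARD = new ThreadLocal<>();

    static <T> Map<String, T> onEachShard(List<String> shardIds, Supplier<T> perShardWork) {
        Map<String, T> results = new LinkedHashMap<>();
        for (String shardId : shardIds) {
            CURRENT_SHARD.set(shardId); // pin the shard for this iteration
            try {
                results.put(shardId, perShardWork.get()); // the @UnitOfWork method would run here
            } finally {
                CURRENT_SHARD.remove(); // never leak the selection into the next iteration
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Fake per-shard row counts standing in for real queries.
        Map<String, Integer> rowsPerShard = Map.of("shard1", 10, "shard2", 32);
        Map<String, Integer> counts = onEachShard(List.of("shard1", "shard2"),
                () -> rowsPerShard.get(CURRENT_SHARD.get()));
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        System.out.println(counts + " total=" + total); // prints "{shard1=10, shard2=32} total=42"
    }
}
```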

  4. Connect to a shard without using @UnitOfWork

This is not recommended, because you'll then need Hibernate-specific objects in your business code. If it is required, you can instantiate TransactionRunner and use it as mentioned here.

Note: If you don't need sharding but still need the flexibility of using @UnitOfWork outside of resources and the ability to nest @UnitOfWork, you can still do so. Refer to these tests to understand the use-case.
