Merge pull request #10 from diegopacheco/dev
Refactoring: New Algorithm, Better Performance!
diegopacheco committed May 5, 2017
2 parents 6c81c11 + fe547fb commit 4fbf55b
Showing 35 changed files with 1,081 additions and 953 deletions.
282 changes: 88 additions & 194 deletions README.md
@@ -3,12 +3,13 @@ Dynomite Cluster Checker checks if a Dynomite cluster is working properly via Dy

## Why ?

* Validated Dynomite Cluster Installations
* Telemetry, checks if seeds and nodes are responsive
* Deployment Validation: Dynomite Cluster Checker (DCC) validates Dynomite clusters
* Telemetry: DCC checks if seeds and nodes are responsive
* Troubleshooting: Is my Dynomite cluster okay?

## Features

* Checks and validates Dynomite Cluster Seeds Config
* Checks and validates Dynomite Cluster Seeds and Cluster Configs
* Checks if there are bad nodes (SG issues, ports, etc..)
* Perform GET/SET to check latency between nodes in the cluster
* Check Data consistency
@@ -30,209 +31,55 @@ Dynomite Cluster Checker checks if a Dynomite cluster is working properly via Dy
## Checking Dynomite Cluster

```bash
./gradlew execute -Dexec.args="127.0.0.1:8102:rack1:localdc:1383429731"
./gradlew execute -Dexec.args="172.18.0.101:8101:rack1:dc:100|172.18.0.102:8101:rack2:dc:100|172.18.0.103:8101:rack3:dc:100"
```
SAMPLES RESULTS
DCC will output
```bash
#
# 1. ALL BAD NODES
#

**** BEGIN DYNOMITE CLUSTER CHECKER ****
1. Checking cluster connection...
BAD NODES:
172.28.198.18:8102:rack1:default_dc:100
172.28.198.236:8102:rack2:default_dc:100
172.28.198.118:8102:rack3:default_dc:100
2. Cannot check data replication since there are no valid nodes
4. Shwoing Results as JSON...
[
]
{

**** END DYNOMITE CLUSTER CHECKER ****

#
# 2. One BAD NODES
#

**** BEGIN DYNOMITE CLUSTER CHECKER ****
1. Checking cluster connection...
BAD NODES:
172.28.198.118:8102:rack3:default_dc:100
2. Checking cluster data replication...
SEEDS: [172.28.198.18:8102:rack1:default_dc:100, 172.28.198.236:8102:rack2:default_dc:100]
Checking Node: 172.28.198.18
TIME to Insert DCC_dynomite_123_kt - Value: DCC_replication_works: 2.0 ms - 0 s
TIME to Get: DCC_dynomite_123_kt : 2.0 ms - 0 s
200 OK - set/get working fine!
Checking Node: 172.28.198.236
TIME to Get: DCC_dynomite_123_kt : 2.0 ms - 0 s
200 OK - set/get working fine!
3. Checking cluster failover...
All Seeds Cluster Failover test: OK
4. Shwoing Results as JSON...
[
{
"server":"172.28.198.18",
"seeds":"[172.28.198.18:8102:rack1:default_dc:100, 172.28.198.236:8102:rack2:default_dc:100]",
"insertTime":"2.0 ms",
"getTime":"2.0 ms",
"consistency":"true"
},
{
"server":"172.28.198.236",
"getTime":"2.0 ms",
"consistency":"true"
}
]
"timeToRun": "2 seconds",

**** END DYNOMITE CLUSTER CHECKER ****

#
# 3. all working good.
#

**** BEGIN DYNOMITE CLUSTER CHECKER ****
1. Checking cluster connection...
OK - All nodes are accessible!
2. Checking cluster data replication...
SEEDS: [172.28.198.18:8102:rack1:default_dc:100, 172.28.198.236:8102:rack2:default_dc:100, 172.28.198.118:8102:rack3:default_dc:100]
Checking Node: 172.28.198.18
TIME to Insert DCC_dynomite_123_kt - Value: DCC_replication_works: 2.0 ms - 0 s
TIME to Get: DCC_dynomite_123_kt : 1.0 ms - 0 s
200 OK - set/get working fine!
Checking Node: 172.28.198.236
TIME to Get: DCC_dynomite_123_kt : 1.0 ms - 0 s
200 OK - set/get working fine!
Checking Node: 172.28.198.118
TIME to Get: DCC_dynomite_123_kt : 4.0 ms - 0 s
200 OK - set/get working fine!
3. Checking cluster failover...
All Seeds Cluster Failover test: OK
4. Shwoing Results as JSON...
[
{
"server":"172.28.198.18",
"seeds":"[172.28.198.18:8102:rack1:default_dc:100, 172.28.198.236:8102:rack2:default_dc:100, 172.28.198.118:8102:rack3:default_dc:100]",
"insertTime":"2.0 ms",
"getTime":"1.0 ms",
"consistency":"true"
},
{
"server":"172.28.198.236",
"getTime":"1.0 ms",
"consistency":"true"
},
{
"server":"172.28.198.118",
"getTime":"4.0 ms",
"consistency":"true"
}
]
"failoverStatus": "OK",

"replicationCount": "3",

"badNodes": [],

"nodesReport":

**** END DYNOMITE CLUSTER CHECKER ****

#
# 4. Inconsistency
#

**** BEGIN DYNOMITE CLUSTER CHECKER ****
1. Checking cluster connection...
OK - All nodes are accessible!
2. Checking cluster data replication...
SEEDS: [172.28.198.18:8102:rack1:default_dc:100, 172.28.198.236:8102:rack2:default_dc:100, 172.28.198.118:8102:rack3:default_dc:100]
Checking Node: 172.28.198.18
TIME to Insert DCC_dynomite_123_kt - Value: DCC_replication_works: 2.0 ms - 0 s
TIME to Get: DCC_dynomite_123_kt : 2.0 ms - 0 s
200 OK - set/get working fine!
Checking Node: 172.28.198.236
TIME to Get: DCC_dynomite_123_kt : 2.0 ms - 0 s
200 OK - set/get working fine!
Checking Node: 172.28.198.118
TIME to Get: DCC_dynomite_123_kt : 1.0 ms - 0 s
ERROR - Inconsistency set/get! Set: DCC_dynomite_123_ktGet: null
3. Checking cluster failover...
All Seeds Cluster Failover test: OK
4. Shwoing Results as JSON...
[

{
"server":"172.28.198.18",
"seeds":"[172.28.198.18:8102:rack1:default_dc:100, 172.28.198.236:8102:rack2:default_dc:100, 172.28.198.118:8102:rack3:default_dc:100]",
"insertTime":"2.0 ms",
"getTime":"2.0 ms",
"server":"[172.18.0.101:8101:rack1:dc:100, 172.18.0.102:8101:rack2:dc:100, 172.18.0.103:8101:rack3:dc:100]",
"seeds":"172.18.0.101:8101:rack1:dc:100|172.18.0.102:8101:rack2:dc:100|172.18.0.103:8101:rack3:dc:100",
"insertTime":"0",
"consistency":"true"
},
, },

{
"server":"172.28.198.236",
"getTime":"2.0 ms",
"server":"172.18.0.101",
"seeds":"172.18.0.101:8101:rack1:dc:100",
"getTime":"0",
"consistency":"true"
},
{
"server":"172.28.198.118",
"getTime":"1.0 ms",
"consistency":"false"
}
]
, },

**** END DYNOMITE CLUSTER CHECKER ****

#
# 5. bad cluster - failover issue
#

**** BEGIN DYNOMITE CLUSTER CHECKER ****
1. Checking cluster connection...
BAD NODES:
172.28.198.18:8102:rack1:default_dc:100
2. Checking cluster data replication...
SEEDS: [172.28.198.236:8102:rack2:default_dc:200, 172.28.198.118:8102:rack3:default_dc:300]
Checking Node: 172.28.198.236
TIME to Insert DCC_dynomite_123_kt - Value: DCC_replication_works: 2.0 ms - 0 s
TIME to Get: DCC_dynomite_123_kt : 2.0 ms - 0 s
200 OK - set/get working fine!
Checking Node: 172.28.198.118
TIME to Get: DCC_dynomite_123_kt : 2.0 ms - 0 s
200 OK - set/get working fine!
3. Checking cluster failover...
All Seeds Cluster Failover test: FAIL: PoolOfflineException: [host=Host [hostname=UNKNOWN, ipAddress=UNKNOWN, port=0, rack: null, datacenter: null, status: Down], latency=0(0), attempts=0]host pool is offline and no Racks available for fallback
4. Shwoing Results as JSON...
[
{
"server":"172.28.198.236",
"seeds":"[172.28.198.236:8102:rack2:default_dc:200, 172.28.198.118:8102:rack3:default_dc:300]",
"insertTime":"2.0 ms",
"getTime":"2.0 ms",
"server":"172.18.0.102",
"seeds":"172.18.0.102:8101:rack2:dc:100",
"getTime":"0",
"consistency":"true"
},
, },

{
"server":"172.28.198.118",
"getTime":"2.0 ms",
"server":"172.18.0.103",
"seeds":"172.18.0.103:8101:rack3:dc:100",
"getTime":"0",
"consistency":"true"
}
]
, }

**** END DYNOMITE CLUSTER CHECKER ****
]

#
# 6. Telemetry Mode (i.e: seeds=seed1|seed2|seed3&telemetry=true)
#
curl "http://localhost:7766/dynomite-cluster-checker/check?seeds=127.0.0.1:rack1:local-dc:8101:1|127.0.0.22:rack2:local-dc:8101:2|127.0.0.13:rack1:local-dc:8101:1&telemetry=true"
{
"failoverStatus": "0",
"badNodes": 2,
"nodesReport": [
{
"server": "127.0.0.1",
"seeds": "[127.0.0.1:rack1:local-dc:8101:1]",
"insertTime": "1",
"getTime": "2",
"insertError": "0",
"getError": "0",
"consistency": "0"
}
]
}

```
You can pass multiple nodes. If you send more than one node, we will check data replication consistency across the nodes.<BR>
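
The seeds string is a `|`-separated list of entries that, judging from the sample commands, follow the layout `host:port:rack:dc:token`. As a rough sketch of how a caller might validate such a string before handing it to DCC (the `SeedParser` and `Seed` names are hypothetical and not part of this project, and the field order is an assumption inferred from the examples):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of DCC: parses a seeds string such as
// "172.18.0.101:8101:rack1:dc:100|172.18.0.102:8101:rack2:dc:100".
final class SeedParser {

    // One parsed seed. The order host:port:rack:dc:token is an assumption
    // taken from the sample commands in this README.
    static final class Seed {
        final String host;
        final int port;
        final String rack;
        final String datacenter;
        final String token; // kept as text, since token ranges are not specified here

        Seed(String host, int port, String rack, String datacenter, String token) {
            this.host = host;
            this.port = port;
            this.rack = rack;
            this.datacenter = datacenter;
            this.token = token;
        }
    }

    // Splits on '|' and then on ':'; rejects entries that do not have five fields.
    static List<Seed> parse(String seeds) {
        List<Seed> result = new ArrayList<>();
        for (String entry : seeds.split("\\|")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 5) {
                throw new IllegalArgumentException(
                        "Expected host:port:rack:dc:token but got: " + entry);
            }
            result.add(new Seed(parts[0], Integer.parseInt(parts[1]),
                    parts[2], parts[3], parts[4]));
        }
        return result;
    }

    public static void main(String[] args) {
        String seeds = "172.18.0.101:8101:rack1:dc:100"
                + "|172.18.0.102:8101:rack2:dc:100"
                + "|172.18.0.103:8101:rack3:dc:100";
        for (Seed seed : parse(seeds)) {
            System.out.println(seed.host + " port=" + seed.port + " rack=" + seed.rack);
        }
    }
}
```

Failing fast on a malformed entry keeps a typo in the seeds list from being mistaken for a bad node.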
@@ -254,16 +101,63 @@ Run the embeded Jetty server with.
```
Then you can call: curl http://localhost:7766/dynomite-cluster-checker/check?seeds=127.0.0.1:8101:rack1:local-dc:437425602
Then you can call.
```bash
# Quote the URL so the shell does not treat "|" as a pipe
curl "http://localhost:7766/dynomite-cluster-checker/check?seeds=172.18.0.101:8101:rack1:dc:100|172.18.0.102:8101:rack2:dc:100|172.18.0.103:8101:rack3:dc:100"
```
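
When the check endpoint is called from code rather than from a shell, the `|` and `:` characters in the seeds value are safer percent-encoded. A minimal sketch, assuming the embedded Jetty server decodes query parameters in the usual servlet way (the `CheckUrlBuilder` class is hypothetical; only the host, port, and path come from the curl example):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, NOT part of DCC: builds the check URL with an encoded seeds value.
final class CheckUrlBuilder {

    // Path taken from the curl example in this README.
    private static final String CHECK_PATH = "/dynomite-cluster-checker/check?seeds=";

    static String build(String baseUrl, String seeds) {
        try {
            // URLEncoder turns ':' into %3A and '|' into %7C.
            return baseUrl + CHECK_PATH + URLEncoder.encode(seeds, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:7766",
                "172.18.0.101:8101:rack1:dc:100|172.18.0.102:8101:rack2:dc:100"));
    }
}
```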
```bash
{
"timeToRun": "2 seconds",
"failoverStatus": "OK",
"replicationCount": "3",
"badNodes": [],
"nodesReport":
[
{
"server":"[172.18.0.101:8101:rack1:dc:100, 172.18.0.102:8101:rack2:dc:100, 172.18.0.103:8101:rack3:dc:100]",
"seeds":"172.18.0.101:8101:rack1:dc:100|172.18.0.102:8101:rack2:dc:100|172.18.0.103:8101:rack3:dc:100",
"insertTime":"0",
"consistency":"true"
, },
{
"server":"127.0.0.1",
"seeds":"[127.0.0.1:8101:rack1:local-dc:437425602]",
"insertTime":"1.0 ms",
"getTime":"2.0 ms",
"server":"172.18.0.101",
"seeds":"172.18.0.101:8101:rack1:dc:100",
"getTime":"0",
"consistency":"true"
}
, },
{
"server":"172.18.0.102",
"seeds":"172.18.0.102:8101:rack2:dc:100",
"getTime":"0",
"consistency":"true"
, },
{
"server":"172.18.0.103",
"seeds":"172.18.0.103:8101:rack3:dc:100",
"getTime":"0",
"consistency":"true"
, }
]
}
```
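
A caller that only needs a pass/fail answer could reduce the report to a single boolean. The sketch below assumes the `failoverStatus`, `badNodes`, and per-node `consistency` values have already been extracted from the JSON by whatever parser the caller uses; `ClusterReportCheck` is a hypothetical name, and the rule "healthy means failover OK, no bad nodes, every node consistent" is an interpretation of the sample output rather than something DCC defines:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, NOT part of DCC: folds the report fields into one verdict.
final class ClusterReportCheck {

    // failoverStatus:   value of "failoverStatus" (the sample shows "OK")
    // badNodes:         entries of the "badNodes" array
    // consistencyFlags: the "consistency" value of each entry in "nodesReport"
    static boolean isHealthy(String failoverStatus,
                             List<String> badNodes,
                             List<String> consistencyFlags) {
        if (!"OK".equals(failoverStatus)) {
            return false;
        }
        if (!badNodes.isEmpty()) {
            return false;
        }
        for (String flag : consistencyFlags) {
            if (!"true".equals(flag)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        boolean healthy = isHealthy("OK",
                Collections.<String>emptyList(),
                Arrays.asList("true", "true", "true", "true"));
        System.out.println("cluster healthy: " + healthy);
    }
}
```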
## Dynomite-docker
There is another project I created to make it easier to create Dynomite clusters. It is called dynomite-docker, and it creates 2 Dynomite clusters of 3 nodes each. Dynomite-docker uses DCC to test the clusters. You can find more info here: https://github.com/diegopacheco/dynomite-docker.
Cheers, <BR>
Diego Pacheco (@diego_pacheco)
19 changes: 9 additions & 10 deletions dynomite-cluster-checker/build.gradle
@@ -6,7 +6,7 @@ apply plugin: 'eclipse'
version = '1.0'
sourceCompatibility = 1.8

mainClassName = "com.github.diegopacheco.dynomite.cluster.checker.DynomiteClusterCheckerMain"
mainClassName = "com.github.diegopacheco.dynomite.cluster.checker.main.DynomiteClusterCheckerMain"

applicationDefaultJvmArgs = [
"-Djava.net.preferIPv4Stack=true",
@@ -55,22 +55,22 @@ eclipse {
}

dependencies {

testCompile('junit:junit:4.11')

compile([
'javax.servlet:servlet-api:2.5',
'org.apache.commons:commons-lang3:3.4',
'org.slf4j:slf4j-simple:1.7.21',
'com.netflix.hystrix:hystrix-core:1.5.6'
'com.google.inject:guice:4.1.0'
])
testCompile('junit:junit:4.11')
compile('com.netflix.dyno:dyno-jedis:1.5.7'){
compile('com.netflix.dyno:dyno-jedis:1.5.8-rc.4'){
exclude group: 'org.slf4j', module: 'slf4j-api'
}
}

}

//httpPort = 7766
///stopPort = 7765
//stopKey = "stopKey"

jar {
from(configurations.compile.collect { it.isDirectory() ? it : zipTree(it) }) {
exclude "META-INF/*.SF"
@@ -92,7 +92,6 @@ jar {
// ./gradlew execute -Dexec.args="local"
// ./gradlew execute -Dexec.args="127.0.0.1:8102:rack1:localdc:1383429731"
// ./gradlew execute -Dexec.args="127.0.0.1:8102:rack1:localdc:1383429731|127.0.0.1:8102:rack1:localdc:1383429731|127.0.0.1:8102:rack1:localdc:1383429731"
// ./gradlew execute -Dexec.args="127.0.0.1:8102:rack1:localdc:1383429731|127.0.0.1:8102:rack1:localdc:1383429731|127.011.1:8102:rack1:localdc:1383429731"
//
task execute(type:JavaExec) {
main = mainClassName
4 changes: 4 additions & 0 deletions dynomite-cluster-checker/run.sh
@@ -0,0 +1,4 @@
#!/bin/bash

# Pass the first script argument as the seeds string ("=" is required after -Dexec.args)
./gradlew execute -Dexec.args="$1" -q
