Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

poor performance of querying metric_idx #453

Closed
woodsaj opened this issue Jan 6, 2017 · 7 comments · Fixed by #465
Closed

poor performance of querying metric_idx #453

woodsaj opened this issue Jan 6, 2017 · 7 comments · Fixed by #465
Assignees

Comments

@woodsaj
Copy link
Member

woodsaj commented Jan 6, 2017

The new metric_idx table is performing terribly.

We need to drop the clustering key as it is redundant.

The Partition Key is responsible for data distribution across your nodes.
The Clustering Key is responsible for data sorting within the partition.
The Primary Key is equivalent to the Partition Key in a single-field-key table.
The Composite/Compound Key is just a multiple-columns key

As our paritionKey is unique (metricDef.Id), there is 0 value to have a clusteringKey.

@woodsaj
Copy link
Member Author

woodsaj commented Jan 6, 2017

All that is needed is to change the schema to

const TableSchema = `CREATE TABLE IF NOT EXISTS %s.metric_idx (
    id text,
    orgid int,
    partition int,
    name text,
    metric text,
    interval int,
    unit text,
    mtype text,
    tags set<text>,
    lastupdate int,
    PRIMARY KEY (id)
) WITH compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}`

@woodsaj
Copy link
Member Author

woodsaj commented Jan 6, 2017

It is not possible to alter an existing table but we dont really need a migration tool as the change does not affect how we read or write to the table, it only affects cassandra internals of how the table is stored.

Existing deployments that are experiencing performance problems can be "migrated" by exporting the current data, droping the table, re-creating the table with the new schema, importing the data.

The below script can be used.

#!/bin/sh

KEYSPACE=${1:-raintank}

cat << EOF > /tmp/metric_idx_migrate.cql
COPY metric_idx (id, partition, interval, lastupdate, metric, mtype, name, orgid, tags, unit) TO '/tmp/metric_idx.csv' ;
DROP TABLE metric_idx;
CREATE TABLE metric_idx (
    id text,
    orgid int,
    partition int,
    name text,
    metric text,
    interval int,
    unit text,
    mtype text,
    tags set<text>,
    lastupdate int,
    PRIMARY KEY (id)
) WITH compaction = {'class': 'SizeTieredCompactionStrategy'}
    AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
CREATE INDEX metric_idx_partition_idx ON mt_ops.metric_idx_new (partition);
COPY metric_idx (id, partition, interval, lastupdate, metric, mtype, name, orgid, tags, unit) FROM '/tmp/metric_idx.csv';
EOF

echo "starting migration"
cqlsh -k $KEYSPACE -f /tmp/metric_idx_migrate.cql
echo ".....migration complete"

echo "Cleaning up......"
rm -f /tmp/metric_idx_migrate.cql
rm -f /tmp/metric_idx.csv
echo "done"

@Dieterbe
Copy link
Contributor

Dieterbe commented Jan 9, 2017

The new metric_idx table is performing terribly. ... We need to drop the clustering key as it is redundant.

can you elaborate on how you observed the problem and how you diagnosed it? is the clustering key just the first low hanging fruit, or are you confident it is the problem? (if so, why?)

thanks

@woodsaj
Copy link
Member Author

woodsaj commented Jan 10, 2017

can you elaborate on how you observed

MT crashes at startup due to cassandra timeout loading the index. When testing the queries directly on cassandra with cqlsh also got timeouts. So i looked at the schema and realized that clustering key made no sense. After removing it and testing again with cqlsh, the queries were much faster.

So removing the clustering key improves performance, but we could still improve it more by using the partition as the partitionKey and the id as the clustering key. We then also wouldnt need the secondary index. This would mean that the partition would need to be known for all queries. The only code change i can see that would be needed is to change the DELETE query used to include the partition, ie DELETE from metric_idx where partition=? and id=?

That might be the better option and can also be handled without a migration tool as the updated delete command would still work on the current version of the schema.

@Dieterbe do any of your mt-X tools make queries to the C* index without knowledge of the "partition" they are querying?

@Dieterbe
Copy link
Contributor

Dieterbe commented Jan 10, 2017

So if I understand it correctly, can you please confirm I got this right:

  1. the proposed change is simply:
-     PRIMARY KEY (id, partition)
+     PRIMARY KEY (id)

(note: I think we can just remove the PRIMARY KEY line alltogether, and change
id text, to id text PRIMARY KEY,)

  1. the partition column was used as a clustering key, e.g. for sorting, but only within the data for one given id (not for all data in a given partition, where multiple id's may hash to the same partition). this is a subtle difference that wasn't entirely clear to me.

  2. If we would have multiple values for the same id, they would be ordered (clustered) by the partition column. However, we only have unique id values, so there is nothing to be sorted (clustered)
    or to be more precise: the list of things to be ordered has a len of 1 each time.

  3. despite there nothing being to sort/order , it slows cassandra down significantly

  4. this is unrelated to the recent msgp data blob -> individual fields change, it was introduced via f5f2e75

but we could still improve it more by using the partition as the partitionKey and the id as the clustering key

  1. this effectively maps our cassandra partitions 1:1 to our kafka partitions. could this be a problem? Isn't our number of cassandra partitions supposed to be much higher than our kafka partitions? (which we currently achieve by partitioning on id column)

  2. why the id as clustering key? i'm guessing it could help with id based lookups but not sure? do we do id based lookups anywhere?

FWIW I found this very useful:

do any of your mt-X tools make queries to the C* index without knowledge of the "partition" they are querying?

yes, see
https://github.com/raintank/metrictank/blob/bb1055053cc72457d40a88b9ecde73b27371daf9/idx/cassandra/cassandra.go#L258-L266
the idx in MT's rebuildIndex feature uses LoadPartition to only load data for given partitions,
mt-index-cat uses the Load variant which does not filter or partition.
the goal of mt-index-cat is to get a dump of everything in the index (for troubleshooting, diagnosis etc), and to generate a list of all metrics, which can be used to generate fake traffic. It aims to do this without forcing the user to know anything about the partitioning used.

@woodsaj
Copy link
Member Author

woodsaj commented Jan 10, 2017

So if I understand it correctly, can you please confirm I got this right:

yes

this effectively maps our cassandra partitions 1:1 to our kafka partitions. could this be a problem? Isn't our number of cassandra partitions supposed to be much higher than our kafka partitions? (which we currently achieve by partitioning on id column)

The consequence of using the kafka partition as the cassandra partitionKey is that all metrics in that kafka partition will be saved on the same cassandra node, cassandra partitions do not span nodes. With our query pattern of either querying all metricDefs in a partition or deleting 1 metricDef by id (and partition) this new schema will perform much better. When loading all metricDefs for a partition the cassandra node just needs to stream the local sstables to the client and there is no need for co-ordination between nodes. The primary purpose of splitting data up between cassandra nodes is to ensure that queries get evenly distributed between the nodes. But for best performance each query should ideally be able to be handled by a single node.

The queries used in mt-index-cat will still if the kafka partition is used as the

why the id as clustering key? i'm guessing it could help with id based lookups but not sure? do we do id based lookups anywhere?

we need to delete by id.

mt-index-cat uses the Load variant which does not filter or partition.

the choice of partition and clustering keys does not affect executing a select without any where conditions

@Dieterbe
Copy link
Contributor

ok sounds good to me :)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants