Co-partitioned tables #79
Comments
Will this have the ability to co-partition existing tables?
@ddorian - you mean "after the fact" co-partitioning of two tables that already have data but are arbitrarily partitioned? This requires a lot more engineering and is a bit more complex, and wasn't in the plans for the initial cut. The initial plans are around supporting creation of a "new" table that's co-partitioned with an existing table (which may already contain data). Under the covers, the co-partitioned table will share Raft groups with its parent table.
@kmuthukk Yes. If the number of tablets and the partition key are the same, data should be split the same way, I guess? You would just have to migrate the tablets onto the same servers. Mostly asking because it's a big performance improvement for colocated transactions/joins/aggregations. TL;DR: it's a pretty big feature, any ETA?
Yes - the number of tablets being the same would make things a bit easier. Merging previously independent Raft groups would be a bit trickier... especially if we try to support such an operation in an "online" fashion. Will have to get back to you on ETA after discussion with the team.
Hi @ddorian, we don't have an exact timeline for this feature specifically, but by Q2 2019 quite likely. Specific use cases or customer requests might change/accelerate this.
* Correct mistakes in syntax description. * Removed quote around keyword * Fixed a few more typos in syntax and removed single quote from '(' and ','
I am also interested in this feature. My use case is partitioning tables by The implementation in Citus: Creating And Distributing Tables and Choosing Distribution Column. This is a very powerful way of associating data (especially with If I understand correctly, colocation of tables addresses another issue, not directly related here. And I am not sure about Tablegroups. Note that cockroachdb already had interleaved tables, but removed them again. (I am glad that we didn't use them, even though they fit well into the project). Is there a plan for this feature request? And it would be nice to see how this feature affects the benchmarks when used. |
Summary: Copartitioning was a feature from a few years ago. Original GH issue [[ #79 | request ]]. Some work was done to implement it but it was never completed and the feature was abandoned. This diff removes the logic wired into the master and tserver to support it. Test Plan: builds & no new broken tests Reviewers: timur, bogdan, jhe, yguan Reviewed By: jhe, yguan Subscribers: ybase, slingam, bogdan Differential Revision: https://phabricator.dev.yugabyte.com/D23661
Jira Link: DB-1949
This is a feature request for implementing co-partitioned tables in order to achieve distributed transactions with high performance in certain scenarios.
This should allow creating a table A, then creating another table B with the same partition key as A and specifying that B should be co-partitioned with A. The two tables would share tablets, Raft groups, load-balancing decisions, etc. Updates to these tables should happen atomically if they involve the same partition key (i.e. userid in our example above).
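As a rough illustration of the idea, here is a minimal Python sketch of two tables sharing one set of tablets. Everything in it is an assumption made for illustration: the `SharedTablets` class, the `tablet_for` helper, the tablet count, the 16-bit hash space, and the use of MD5 as a stand-in hash are hypothetical and are not the project's actual implementation.

```python
import hashlib

NUM_TABLETS = 4        # assumed tablet count, shared by both tables
HASH_SPACE = 1 << 16   # assumed 16-bit hash space, split evenly across tablets


def tablet_for(partition_key: str, num_tablets: int = NUM_TABLETS) -> int:
    """Map a partition key to a tablet index (MD5 is only a stand-in hash)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_code = int.from_bytes(digest[:2], "big")  # value in 0..65535
    return hash_code * num_tablets // HASH_SPACE


class SharedTablets:
    """Toy model: every tablet stores rows for all co-partitioned tables."""

    def __init__(self, tables, num_tablets=NUM_TABLETS):
        self.num_tablets = num_tablets
        self.tablets = [{name: {} for name in tables} for _ in range(num_tablets)]

    def write_together(self, partition_key, rows):
        """Write one row per table; `rows` maps table name -> (row_key, value).

        All rows share the partition key, so they land in a single tablet.
        Table names are validated first so a bad request changes nothing.
        """
        tablet = self.tablets[tablet_for(partition_key, self.num_tablets)]
        missing = [name for name in rows if name not in tablet]
        if missing:
            raise KeyError(f"unknown tables: {missing}")
        for name, (row_key, value) in rows.items():
            tablet[name][(partition_key, row_key)] = value
```

Because the tablet is chosen from the partition key alone, a write touching both tables for the same userid never has to span tablets, which is what makes a single-tablet (and therefore cheaper) atomic update possible.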
For example:
This should ensure that the user and messages tables have the same tablet splits with respect to the primary hash keys, and the tablets serving a given key range across the two tables are always colocated on the same node.

Background in other NoSQL systems:
Apache Cassandra
In Apache Cassandra, if two tables share the same replication strategy and same partition key they will co-locate their partitions. So as long as the two tables are in the same keyspace AND their partition keys match, for example:
and an
Then, the data for a given customer_id across both tables is guaranteed to be colocated on the same set of nodes.
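The reason this holds can be sketched with a simplified token ring in Python. This is an illustrative model only: the `Ring` class, the node names, the hypothetical table names `customers` and `orders`, and MD5 as the hash are assumptions, and the replica walk mimics a simple "next nodes clockwise" placement without virtual nodes or rack awareness.

```python
import bisect
import hashlib


def token(partition_key: str) -> int:
    """Stand-in partitioner: the token depends only on the partition key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")  # value in 0..2**64 - 1


class Ring:
    """Toy ring: each node owns the token range ending at its own token."""

    def __init__(self, node_tokens, replication_factor):
        self.entries = sorted((tok, node) for node, tok in node_tokens.items())
        self.tokens = [tok for tok, _ in self.entries]
        self.rf = replication_factor  # assumed <= number of nodes

    def replicas(self, table, partition_key):
        # `table` is deliberately unused: placement ignores the table name.
        count = len(self.entries)
        start = bisect.bisect_left(self.tokens, token(partition_key)) % count
        return [self.entries[(start + i) % count][1] for i in range(self.rf)]


# Four hypothetical nodes spaced evenly around a 64-bit ring.
ring = Ring({f"node-{i}": i * (1 << 62) for i in range(4)}, replication_factor=3)
same_nodes = ring.replicas("customers", "c-1001") == ring.replicas("orders", "c-1001")
```

Here `same_nodes` is true for any key, since neither the token nor the replica walk looks at the table. Note that this gives colocation of data only; it says nothing about atomic updates across the two tables.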
HBase
In HBase, co-partitioning happens across column families that share the same ROW KEY.
Additionally, in HBase, since a common WAL is shared for cross-column family mutations, cross-column family atomicity is also guaranteed.
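The shared-WAL point can be illustrated with a small Python model. This is a sketch under stated assumptions, not HBase code: the `Region` class and its methods are hypothetical, and the "WAL" is just an in-memory list where one entry carries the cells for every column family touched by a row mutation.

```python
class Region:
    """Toy region: several column-family stores behind one shared log."""

    def __init__(self, families):
        self.families = tuple(families)
        self.wal = []  # one log shared by all column families
        self.stores = {family: {} for family in self.families}

    def mutate_row(self, row_key, cells):
        """Apply (family, qualifier, value) cells for one row as a unit."""
        cells = tuple(cells)
        for family, _qualifier, _value in cells:
            if family not in self.stores:
                raise KeyError(f"unknown column family: {family}")
        # A single log entry covers every family, so replay sees all or none.
        self.wal.append((row_key, cells))
        self._apply(row_key, cells)

    def _apply(self, row_key, cells):
        for family, qualifier, value in cells:
            self.stores[family][(row_key, qualifier)] = value

    @classmethod
    def recover(cls, families, wal):
        """Rebuild the stores by replaying the log entries in order."""
        region = cls(families)
        for row_key, cells in wal:
            region.wal.append((row_key, cells))
            region._apply(row_key, cells)
        return region
```

Since each mutation is one log entry, recovery can never restore the update to one column family without the other, which is the property a co-partitioned pair of tables sharing a Raft log would also get.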