Currently, when Presto runs an INSERT or CREATE TABLE AS into a partitioned table, each node can write to all partitions.
This works well when the number of partitions is reasonable, but it does not scale when loading data into a large number of partitions.
Typically, such a query fails with "Too many open files". After bumping the file limits, one can run into other issues (too many threads, out of memory, etc.).
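To make the failure mode concrete, here is a minimal sketch of such a load (the catalog, schema, table, and column names are made up); with the current behavior, every worker may hold an open writer for every distinct ds value, so the number of open files grows roughly as nodes × partitions:

```sql
-- Hypothetical Hive-style table partitioned by a high-cardinality column.
-- Today every worker node can receive rows for every ds value, so each node
-- keeps one open writer per partition it sees.
CREATE TABLE hive.warehouse.events_by_day
WITH (partitioned_by = ARRAY['ds'])
AS
SELECT event_id, payload, ds
FROM hive.staging.raw_events;   -- thousands of distinct ds values
```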
We should repartition data so that each partition is written by one node.
At first, this can be an opt-in feature (or opt-out? repartitioning is a safer bet).
Later, we can perhaps choose smartly at planning time whether or not to repartition the data.

This issue replaces #579
Repartitioning is not safer. If there is only one or a few partitions, the query will be many times slower. At FB, we never implemented this mode because we worried users would blindly set it, causing their queries to run for hours and never finish. Writing a single partition was the common case.
Either mode will be completely wrong for certain types of queries. It definitely needs to be a session property. I also think the current behavior (no repartitioning) should be the default, as running fast and then failing explicitly is better than silently running very slowly.
We had ideas around doing it adaptively at runtime, but that’s a big project and it was never a priority.
A heuristic that might work is to always disable repartitioning when all the partition keys are constant.
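For example (reusing the hypothetical table from above), in the insert below the partition key is a literal constant, so every row belongs to the same partition and hashing on ds would funnel the entire write through a single node:

```sql
-- Illustrative only: ds is constant for the whole query, so repartitioned
-- writes would send every row to one writer instead of spreading the work.
INSERT INTO hive.warehouse.events_by_day
SELECT event_id, payload, '2020-01-01' AS ds
FROM hive.staging.raw_events
WHERE ds = '2020-01-01';
```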
I will create a PR that allows connectors to specify a preferred insert partitioning (#2358). There will be a toggle that enables the engine to use such partitioning. In the future, we could use the CBO to choose the preferred insert partitioning.
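A rough sketch of how the toggle could be used from a session (the property name below is only an assumption for illustration; the actual name will be defined by the PR):

```sql
-- Assumed property name, shown only to illustrate the opt-in toggle.
SET SESSION use_preferred_write_partitioning = true;

INSERT INTO hive.warehouse.events_by_day
SELECT event_id, payload, ds
FROM hive.staging.raw_events;
```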