
[SPARK-2873] [SQL] using ExternalAppendOnlyMap to resolve OOM when aggregating #2029

Closed
wants to merge 1 commit

Conversation

guowei2
Contributor

guowei2 commented Aug 19, 2014

A new PR, cloned from PR 1822.

It fixes a number of problems.

It reuses CompactBuffer from Spark core to save memory and pointer dereferences, as PR 1993 does (a short sketch of the idea follows this description).

Hive UDAFs do not support external aggregation, because Hive's AggregationBuffer would need to be serializable and Hive's GenericUDAFEvaluator provides no method for merging two evaluators.
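For context on the CompactBuffer point above, here is a minimal sketch (not this PR's code) of how one group's values are typically accumulated: CompactBuffer keeps its first two elements in fields, so small groups avoid the extra array allocation and pointer dereferences of an ArrayBuffer. CompactBuffer is Spark-internal, so the package name and the collectGroup helper below are purely illustrative.

package org.apache.spark.sketch  // hypothetical package, only to reach the Spark-internal CompactBuffer

import scala.reflect.ClassTag

import org.apache.spark.util.collection.CompactBuffer

object CompactBufferSketch {
  // Accumulates all values of one group; elements 0 and 1 are stored inline,
  // later elements go into a growable array.
  def collectGroup[T: ClassTag](values: Iterator[T]): CompactBuffer[T] = {
    val buffer = new CompactBuffer[T]()
    values.foreach(buffer += _)  // += appends; ++= would merge another group's buffer
    buffer
  }
}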

@AmplabJenkins

Can one of the admins verify this patch?

@guowei2
Contributor Author

guowei2 commented Aug 19, 2014

@marmbrus
What output should I provide for the benchmarks?

@marmbrus
Contributor

People usually just summarize the benchmark itself and the results in the description of the PR. For example: #1439

@guowei2
Contributor Author

guowei2 commented Aug 26, 2014

import org.apache.spark._
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.types.{DataType, IntegerType}
import org.apache.spark.sql.execution._


// Micro-benchmark comparing the in-memory OnHeapAggregate against the
// spilling ExternalAggregate added by this PR.
object AggregateBenchMark extends App {

  val sc = new SparkContext(
    new SparkConf().setMaster("local").setAppName("agg-benchmark"))

  val dataType: DataType = IntegerType
  // SUM over column 1, grouped by column 0.
  val aggExps = Seq(Alias(sum(BoundReference(1, dataType, true)), "sum")())
  val groupExps = Seq(BoundReference(0, dataType, true))
  val attributes = aggExps.map(_.toAttribute)
  val childPlan = RowsPlan(sc, attributes)

  // Wall-clock time (ms) of one full pass over the aggregate's output.
  def benchmarkOnHeap: Long = {
    val begin = System.currentTimeMillis()
    OnHeapAggregate(false, groupExps, aggExps, childPlan).execute().foreach(_ => {})
    val end = System.currentTimeMillis()
    end - begin
  }

  def benchmarkExternal: Long = {
    val begin = System.currentTimeMillis()
    ExternalAggregate(false, groupExps, aggExps, childPlan).execute().foreach(_ => {})
    val end = System.currentTimeMillis()
    end - begin
  }

  // Run each variant five times so the later runs show steady-state numbers.
  (1 to 5).foreach(_ => println("OnHeapAggregate time: " + benchmarkOnHeap))
  (1 to 5).foreach(_ => println("ExternalAggregate time: " + benchmarkExternal))
}

// Generates 50,000 two-column rows per partition: ~1,500 distinct keys in
// column 0 and a small value to sum in column 1.
class TestRDD(
    sc: SparkContext,
    numPartitions: Int) extends RDD[Row](sc, Nil) with Serializable {

  override def compute(split: Partition, context: TaskContext): Iterator[Row] = {
    new Iterator[Row] {
      var lines = 0
      override final def hasNext: Boolean = lines < 50000
      override final def next(): Row = {
        lines += 1
        val row = new GenericMutableRow(2)
        row(0) = (math.random * 1500).toInt // grouping key
        row(1) = (math.random * 50).toInt   // value to aggregate
        row
      }
    }
  }

  override def getPartitions = (0 until numPartitions).map(i => new Partition {
    override def index = i
  }).toArray

  override def getPreferredLocations(split: Partition): Seq[String] = Nil

  override def toString: String = "TestRDD " + id
}


// A leaf physical plan that exposes the TestRDD to the aggregate operators.
case class RowsPlan(@transient val sc: SparkContext, attributes: Seq[Attribute]) extends LeafNode {

  override def output = attributes

  override def execute() = new TestRDD(sc, 1)
}

@guowei2
Contributor Author

guowei2 commented Aug 26, 2014

@marmbrus

The benchmark results above are disappointing.
Once one spill happens, a batch of further spills usually follows, one after another.

The size of the AppendOnlyMap depends on the number of distinct keys, since values with the same key are merged in place.

I don't think ExternalAppendOnlyMap is a good fit here, because it is too expensive when records with the same key are spilled to disk over and over again (see the sketch after this comment for the code path in question).

Besides, users can easily avoid the OOM by raising spark.sql.shuffle.partitions, which reduces the number of keys each task has to hold.

I think the merge logic of ExternalAppendOnlyMap should be optimized.

Joins seem to have a similar problem; putting both the left and right tables into an ExternalAppendOnlyMap is expensive as well.
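For reference, here is a minimal sketch (not the PR's code) of the path being discussed: values are combined per key in memory, and once the map grows too large it spills, after which every spill file has to be merge-sorted back together per key. The object name ExternalAggregateSketch, the method sumByKey, and the package are hypothetical; ExternalAppendOnlyMap is Spark-internal, so the sketch assumes it is compiled under the org.apache.spark package and run with a live SparkEnv.

package org.apache.spark.sketch  // hypothetical package, only to reach Spark-internal classes

import org.apache.spark.util.collection.ExternalAppendOnlyMap

object ExternalAggregateSketch {
  // Sums the values for each key, spilling the partial map to disk when it grows too large.
  def sumByKey(records: Iterator[(Int, Long)]): Iterator[(Int, Long)] = {
    val map = new ExternalAppendOnlyMap[Int, Long, Long](
      (v: Long) => v,                   // createCombiner: the first value starts the running sum
      (sum: Long, v: Long) => sum + v,  // mergeValue: fold later values into the in-memory sum
      (s1: Long, s2: Long) => s1 + s2)  // mergeCombiners: re-merge partial sums read from spill files
    map.insertAll(records)
    map.iterator  // if spills happened, this merge-sorts all spill files plus the in-memory map
  }
}

The workaround mentioned above is just a configuration change, for example something like sqlContext.setConf("spark.sql.shuffle.partitions", "400") (or the equivalent SET command), so that each reduce task sees fewer distinct keys.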

@marmbrus
Contributor

marmbrus commented Sep 4, 2014

What were the actual results of the benchmark? It is acceptable for there to be some performance hit here. In cases where there are too many keys, it's much better to spill to disk than to OOM, though you have a good point about just adding more partitions.

@SparkQA

SparkQA commented Sep 5, 2014

Can one of the admins verify this patch?

@guowei2
Contributor Author

guowei2 commented Sep 17, 2014

I've run a micro-benchmark locally with 50,000 records and 1,500 keys.

Run              OnHeapAggregate   ExternalAggregate (10 spills)
First run        876 ms            16.9 s
Stabilized runs  150 ms            15.0 s

@marmbrus
Contributor

marmbrus commented Dec 2, 2014

Thanks for working on this, but we are trying to clean up the PR queue (in order to make it easier for us to review). Thus, I think we should close this issue for now and reopen it when it's ready for review.

asfgit closed this in b0a46d8 on Dec 2, 2014