[SPARK-3771][SQL] AppendingParquetOutputFormat should use reflection to prevent from breaking binary-compatibility.

The original problem is [SPARK-3764](https://issues.apache.org/jira/browse/SPARK-3764).

`AppendingParquetOutputFormat` uses a binary-incompatible method `context.getTaskAttemptID`.
This makes Spark itself binary-incompatible across Hadoop versions: if Spark is built against hadoop-1, the resulting artifact works only with hadoop-1, and vice versa.

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #2638 from ueshin/issues/SPARK-3771 and squashes the following commits:

efd3784 [Takuya UESHIN] Add a comment to explain the reason to use reflection.
ec213c1 [Takuya UESHIN] Use reflection to prevent breaking binary-compatibility.
ueshin authored and marmbrus committed Oct 13, 2014
1 parent d3cdf91 commit 73da9c2
Showing 1 changed file with 9 additions and 1 deletion.
@@ -331,13 +331,21 @@ private[parquet] class AppendingParquetOutputFormat(offset: Int)

   // override to choose output filename so not overwrite existing ones
   override def getDefaultWorkFile(context: TaskAttemptContext, extension: String): Path = {
-    val taskId: TaskID = context.getTaskAttemptID.getTaskID
+    val taskId: TaskID = getTaskAttemptID(context).getTaskID
     val partition: Int = taskId.getId
     val filename = s"part-r-${partition + offset}.parquet"
     val committer: FileOutputCommitter =
       getOutputCommitter(context).asInstanceOf[FileOutputCommitter]
     new Path(committer.getWorkPath, filename)
   }
+
+  // The TaskAttemptContext is a class in hadoop-1 but is an interface in hadoop-2.
+  // The signatures of the method TaskAttemptContext.getTaskAttemptID for the both versions
+  // are the same, so the method calls are source-compatible but NOT binary-compatible because
+  // the opcode of method call for class is INVOKEVIRTUAL and for interface is INVOKEINTERFACE.
+  private def getTaskAttemptID(context: TaskAttemptContext): TaskAttemptID = {
+    context.getClass.getMethod("getTaskAttemptID").invoke(context).asInstanceOf[TaskAttemptID]
+  }
 }
 
 /**
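
To illustrate why the reflective lookup in the diff sidesteps the INVOKEVIRTUAL/INVOKEINTERFACE mismatch, here is a minimal, self-contained sketch. It does not use Hadoop: `ClassStyleContext`, `InterfaceStyleContext`, `InterfaceStyleContextImpl`, and `taskAttemptIdOf` are hypothetical stand-ins invented for this example, and the ID is a plain `String` rather than a `TaskAttemptID`. The assumption is only that both shapes expose a public no-argument method with the same name.

```scala
// Hypothetical stand-ins for the two Hadoop shapes; these are NOT Hadoop types.
object ReflectionSketch {

  // hadoop-1 style: the context is a concrete class,
  // so a direct call would compile to INVOKEVIRTUAL.
  class ClassStyleContext(id: String) {
    def getTaskAttemptID: String = id
  }

  // hadoop-2 style: the context is an interface with a separate implementation,
  // so a direct call through the interface would compile to INVOKEINTERFACE.
  trait InterfaceStyleContext {
    def getTaskAttemptID: String
  }

  class InterfaceStyleContextImpl(id: String) extends InterfaceStyleContext {
    def getTaskAttemptID: String = id
  }

  // Resolve the method by name on the runtime class instead of emitting a
  // direct call, so the caller's bytecode contains no call-site opcode that
  // is tied to either shape.
  def taskAttemptIdOf(context: AnyRef): String =
    context.getClass
      .getMethod("getTaskAttemptID")
      .invoke(context)
      .asInstanceOf[String]

  def main(args: Array[String]): Unit = {
    // The same helper works for both shapes without recompilation.
    println(taskAttemptIdOf(new ClassStyleContext("attempt_1")))
    println(taskAttemptIdOf(new InterfaceStyleContextImpl("attempt_2")))
  }
}
```

The trade-off is that the method is looked up on every call and a typo in the method name only fails at run time. Whether that cost matters depends on how often the method is called.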