Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-11051][Core] Do not allow local checkpointing after the RDD is materialized and checkpointed #9072

Closed

Conversation

viirya
Copy link
Member

@viirya viirya commented Oct 12, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-11051

When a RDD is materialized and checkpointed, its partitions and dependencies are cleared. If we allow local checkpointing on it and assign LocalRDDCheckpointData to its checkpointData. Next time when the RDD is materialized again, the error will be thrown.

@SparkQA
Copy link

SparkQA commented Oct 12, 2015

Test build #43560 has finished for PR 9072 at commit 124aab0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -1521,8 +1521,15 @@ abstract class RDD[T: ClassTag](
}

checkpointData match {
case Some(reliable: ReliableRDDCheckpointData[_]) => logWarning(
"RDD was already marked for reliable checkpointing: overriding with local checkpoint.")
case Some(reliable: ReliableRDDCheckpointData[_]) =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this is a good restriction, but cc @tdas regarding the semantics.

I feel like this code construct is getting hard to read. I don't feel strongly, but is if (checkpointData.isDefined && isCheckpointed) not simpler? then there are just two branches, one return point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to checkpointData.isDefined && isCheckpointed, we still need to check isLocallyCheckpointed, because when checkpointData.isDefined is true, it is possibly a local checkpointing.

@viirya
Copy link
Member Author

viirya commented Oct 12, 2015

retest this please.

@SparkQA
Copy link

SparkQA commented Oct 12, 2015

Test build #43565 has finished for PR 9072 at commit 124aab0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Oct 12, 2015

retest this please.

@SparkQA
Copy link

SparkQA commented Oct 12, 2015

Test build #43566 has finished for PR 9072 at commit 124aab0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tdas
Copy link
Contributor

tdas commented Oct 12, 2015

Actually, best for @andrewor14 to take a look, who implemented local checkpointing and has more context.

case Some(reliable: ReliableRDDCheckpointData[_]) => logWarning(
"RDD was already marked for reliable checkpointing: overriding with local checkpoint.")
case Some(reliable: ReliableRDDCheckpointData[_]) =>
if (isCheckpointed) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you rename this to isCheckpointedAndMaterialized and update the java docs? Right now it's a little confusing because it can be interpreted as "marked for checkpointing" only, which is actually what the existing java docs say.

@andrewor14
Copy link
Contributor

@viirya thanks for the fix. I suggested an alternative which I think is a little clearer. Could you also add a regression test for this in CheckpointSuite?

@viirya
Copy link
Member Author

viirya commented Oct 15, 2015

@andrewor14 Thanks. I will update this patch and add the test later today.

@andrewor14
Copy link
Contributor

retest this please

@@ -548,6 +548,11 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
def isCheckpointed: Boolean = rdd.isCheckpointed

/**
* Return whether this RDD has been checkpointed and materialized or not
*/
def isCheckpointedAndMaterialized: Boolean = rdd.isCheckpointedAndMaterialized
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be private[spark]

@SparkQA
Copy link

SparkQA commented Oct 15, 2015

Test build #43813 has finished for PR 9072 at commit 448774f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 16, 2015

Test build #43829 has finished for PR 9072 at commit dc58498.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -548,6 +548,11 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends Serializable {
def isCheckpointed: Boolean = rdd.isCheckpointed

/**
* Return whether this RDD has been checkpointed and materialized or not
*/
private[spark] def isCheckpointedAndMaterialized: Boolean = rdd.isCheckpointedAndMaterialized
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why add this? I don't think this is used except in tests

@andrewor14
Copy link
Contributor

@viirya this looks pretty close. I don't think we need to add the method in Java. Once you remove that and update the java docs I will merge this. Thanks for fixing this issue.

@viirya
Copy link
Member Author

viirya commented Oct 17, 2015

@andrewor14 Thanks. I've updated this. Please see if there is any problem.

@SparkQA
Copy link

SparkQA commented Oct 17, 2015

Test build #43873 has finished for PR 9072 at commit 3958bc2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Copy link
Member Author

viirya commented Oct 19, 2015

ping @andrewor14 any other comments?

@andrewor14
Copy link
Contributor

Yup LGTM I'm merging this into master 1.5. Thanks for fixing this @viirya !

asfgit pushed a commit that referenced this pull request Oct 19, 2015
… materialized and checkpointed

JIRA: https://issues.apache.org/jira/browse/SPARK-11051

When a `RDD` is materialized and checkpointed, its partitions and dependencies are cleared. If we allow local checkpointing on it and assign `LocalRDDCheckpointData` to its `checkpointData`. Next time when the RDD is materialized again, the error will be thrown.

Author: Liang-Chi Hsieh <viirya@appier.com>

Closes #9072 from viirya/no-localcheckpoint-after-checkpoint.

(cherry picked from commit a1413b3)
Signed-off-by: Andrew Or <andrew@databricks.com>
@asfgit asfgit closed this in a1413b3 Oct 19, 2015
@viirya viirya deleted the no-localcheckpoint-after-checkpoint branch December 27, 2023 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants