Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7277][SQL] Throw exception if the property mapred.reduce.tasks is set to -1 #5811

Closed
wants to merge 4 commits into from

Conversation

viirya
Copy link
Member

@viirya viirya commented Apr 30, 2015

JIRA: https://issues.apache.org/jira/browse/SPARK-7277

As automatically determining the number of reducers is not supported (mapred.reduce.tasks is set to -1), we should throw exception to users.

@SparkQA
Copy link

SparkQA commented Apr 30, 2015

Test build #31436 has finished for PR 5811 at commit 68a1c70.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • "public class " + className + extendsText + " implements java.io.Serializable
  • This patch does not change any dependencies.

if (value == "-1") {
logWarning(
s"Set this property to -1 for automatically determining the number of reducers " +
s"is not supported, showing current ${SQLConf.SHUFFLE_PARTITIONS} instead.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "Setting this property ..."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually can I suggest a small rewording a think is a little clearer?

Setting this property to -1 for automatically determing the number of reducers is not supported. Ignoring request, and showing ${SQLConf.SHUFFLE_PARTITIONS} instead

@squito
Copy link
Contributor

squito commented Apr 30, 2015

could you also add a test that things work when you try to set it to -1? from the jira it seems that was a major goal here, to prevent incorrect results.

@squito
Copy link
Contributor

squito commented Apr 30, 2015

though I wonder if instead it should fail (with a good error) if you give a bad value, rather than just logging a message and ignoring the request. any thoughts, @marmbrus?

@marmbrus
Copy link
Contributor

Warnings seem likely to be missed. I would throw an exception.

@viirya viirya changed the title [SPARK-7277][SQL] Show warning message if the property mapred.reduce.tasks is set to -1 [SPARK-7277][SQL] Throw exception if the property mapred.reduce.tasks is set to -1 May 1, 2015
@viirya
Copy link
Member Author

viirya commented May 1, 2015

Thanks. Updated to throw exception now. Also added a test for it.

@SparkQA
Copy link

SparkQA commented May 1, 2015

Test build #31504 has finished for PR 5811 at commit 4ede705.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@viirya
Copy link
Member Author

viirya commented May 1, 2015

should not be a related failure.

@viirya
Copy link
Member Author

viirya commented May 1, 2015

test again please.

@squito
Copy link
Contributor

squito commented May 1, 2015

Jenkins, retest this please

@SparkQA
Copy link

SparkQA commented May 1, 2015

Test build #31552 has finished for PR 5811 at commit 4ede705.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor

squito commented May 1, 2015

lgtm

@TanninOne
Copy link

Has anyone verified that a string comparison against -1 completely fully fixes the issue? Isn't it interpreted as an integer later?
What I'm getting at is: What if the user has the variable to "-01". or to "-2"? Maybe those were valid values for mapred.reduce.tasks as well? How was zero handled in hadoop?
Wouldn't it be safer and cleaner to test any variable against a valid range?
I.e. "if (Integer.parseInt("value") < 1) ..."

@viirya
Copy link
Member Author

viirya commented May 4, 2015

Setting mapred.reduce.tasks to '-1' has special meaning that asks Hive to automatically determine the number of reducers. It is the default value in hive-default.xml. You can also refer to HADOOPNUMREDUCERS.

Since it will parse the given string value, the configuration value of "-01" or "-2" should be able to be parsed to integer value too. As I can tell from the Hive sources, it checks if mapred.reduce.tasks is greater than 0 before using it in SemanticAnalyzer and SetReducerParallelism. So I think these values "-01", "-2" would have the same effect as setting it as '-1'.

Although I don't think it will be possible to have such values as setting values for mapred.reduce.tasks, it is still good to consider these cases. I will update it later.

@TanninOne
Copy link

Great, thanks for looking into this.

@SparkQA
Copy link

SparkQA commented May 4, 2015

Test build #31754 has finished for PR 5811 at commit e518f96.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • trait LDAOptimizer
    • class EMLDAOptimizer extends LDAOptimizer
    • class OnlineLDAOptimizer extends LDAOptimizer

asfgit pushed a commit that referenced this pull request May 7, 2015
…s is set to -1

JIRA: https://issues.apache.org/jira/browse/SPARK-7277

As automatically determining the number of reducers is not supported (`mapred.reduce.tasks` is set to `-1`), we should throw exception to users.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #5811 from viirya/no_neg_reduce_tasks and squashes the following commits:

e518f96 [Liang-Chi Hsieh] Consider other wrong setting values.
fd9c817 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into no_neg_reduce_tasks
4ede705 [Liang-Chi Hsieh] Throw exception instead of warning message.
68a1c70 [Liang-Chi Hsieh] Show warning message if mapred.reduce.tasks is set to -1.

(cherry picked from commit ea3077f)
Signed-off-by: Michael Armbrust <michael@databricks.com>
@marmbrus
Copy link
Contributor

marmbrus commented May 7, 2015

Thanks, merged to master and 1.4

@asfgit asfgit closed this in ea3077f May 7, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
…s is set to -1

JIRA: https://issues.apache.org/jira/browse/SPARK-7277

As automatically determining the number of reducers is not supported (`mapred.reduce.tasks` is set to `-1`), we should throw exception to users.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes apache#5811 from viirya/no_neg_reduce_tasks and squashes the following commits:

e518f96 [Liang-Chi Hsieh] Consider other wrong setting values.
fd9c817 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into no_neg_reduce_tasks
4ede705 [Liang-Chi Hsieh] Throw exception instead of warning message.
68a1c70 [Liang-Chi Hsieh] Show warning message if mapred.reduce.tasks is set to -1.
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
…s is set to -1

JIRA: https://issues.apache.org/jira/browse/SPARK-7277

As automatically determining the number of reducers is not supported (`mapred.reduce.tasks` is set to `-1`), we should throw exception to users.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes apache#5811 from viirya/no_neg_reduce_tasks and squashes the following commits:

e518f96 [Liang-Chi Hsieh] Consider other wrong setting values.
fd9c817 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into no_neg_reduce_tasks
4ede705 [Liang-Chi Hsieh] Throw exception instead of warning message.
68a1c70 [Liang-Chi Hsieh] Show warning message if mapred.reduce.tasks is set to -1.
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
…s is set to -1

JIRA: https://issues.apache.org/jira/browse/SPARK-7277

As automatically determining the number of reducers is not supported (`mapred.reduce.tasks` is set to `-1`), we should throw exception to users.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes apache#5811 from viirya/no_neg_reduce_tasks and squashes the following commits:

e518f96 [Liang-Chi Hsieh] Consider other wrong setting values.
fd9c817 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into no_neg_reduce_tasks
4ede705 [Liang-Chi Hsieh] Throw exception instead of warning message.
68a1c70 [Liang-Chi Hsieh] Show warning message if mapred.reduce.tasks is set to -1.
@viirya viirya deleted the no_neg_reduce_tasks branch December 27, 2023 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants