[SPARK-26322][SS] Add spark.kafka.sasl.token.mechanism to ease delegation token configuration. #23274

Closed
wants to merge 3 commits into from

@gaborgsomogyi gaborgsomogyi commented Dec 10, 2018

What changes were proposed in this pull request?

When a Kafka delegation token is obtained, a SCRAM `sasl.mechanism` has to be configured for authentication. It can be configured on each related source/sink, which is inconvenient from the user's perspective. Such granularity is not required, and this configuration can be implemented with one central parameter.

This PR adds `spark.kafka.sasl.token.mechanism` to configure this centrally (default: `SCRAM-SHA-512`).
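To illustrate what "configured centrally" could mean in practice, here is a minimal Python sketch. It is not the code from this PR: the function name, the plain-dict representation of the Spark configuration, and the rule that the central value is applied only when a token is in use are all assumptions made for illustration. Only the key `spark.kafka.sasl.token.mechanism`, the Kafka key `sasl.mechanism`, and the default `SCRAM-SHA-512` come from the description above.

```python
# Illustrative sketch only; not Spark's implementation.
TOKEN_MECHANISM_KEY = "spark.kafka.sasl.token.mechanism"  # key added by this PR
DEFAULT_TOKEN_MECHANISM = "SCRAM-SHA-512"                 # default stated in this PR


def kafka_params_with_token_mechanism(spark_conf, kafka_params, token_available):
    """Return a copy of kafka_params with sasl.mechanism taken from one central setting.

    Hypothetical helper. Assumption: the central value is applied only when a
    delegation token is in use, and it replaces any per-source value.
    """
    params = dict(kafka_params)  # do not mutate the caller's dict
    if token_available:
        params["sasl.mechanism"] = spark_conf.get(
            TOKEN_MECHANISM_KEY, DEFAULT_TOKEN_MECHANISM
        )
    return params


# Two sources share one central setting instead of repeating it per source.
central_conf = {TOKEN_MECHANISM_KEY: "SCRAM-SHA-256"}
source_a = kafka_params_with_token_mechanism(central_conf, {"bootstrap.servers": "a:9093"}, True)
source_b = kafka_params_with_token_mechanism(central_conf, {"bootstrap.servers": "b:9093"}, True)
```

With an empty configuration, the sketch falls back to `SCRAM-SHA-512`, mirroring the default mentioned above.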

How was this patch tested?

Existing unit tests + on cluster.

@@ -642,9 +642,9 @@ This way the application can be configured via Spark parameters and may not need
configuration (Spark can use Kafka's dynamic JAAS configuration feature). For further information
about delegation tokens, see [Kafka delegation token docs](http://kafka.apache.org/documentation/#security_delegation_token).

-The process is initiated by Spark's Kafka delegation token provider. When `spark.kafka.bootstrap.servers`,
+The process is initiated by Spark's Kafka delegation token provider. When `spark.kafka.bootstrap.servers` set,
Contributor Author

I found this wording issue, so I fixed it here.

Contributor

In that case, it should be "is set".

Contributor Author

Fixed.

 Spark considers the following log in options, in order of preference:
-- **JAAS login configuration**
+- **JAAS login configuration**, please see example below.
Contributor Author

Added this small pointer to make things clearer.

Member

Hi, @gaborgsomogyi. Which example is this pointing to? The previous examples seem to have been removed.

Contributor Author

@dongjoon-hyun The one further down, under the same "JAAS login configuration" name.

SparkQA commented Dec 10, 2018

Test build #99917 has finished for PR 23274 at commit 320040a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gaborgsomogyi
Contributor Author

cc @vanzin @HeartSaVioR

The feature works on a cluster, but I'm not fully happy with the ConfigUpdater test coverage, so I'm planning to file a JIRA to extract it into its own file and test it properly.

docs/structured-streaming-kafka-integration.md Outdated

SparkQA commented Dec 11, 2018

Test build #99963 has finished for PR 23274 at commit 1a6ae00.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 11, 2018

Test build #99986 has finished for PR 23274 at commit de35aa2.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gaborgsomogyi
Contributor Author

retest this, please

@gaborgsomogyi gaborgsomogyi changed the title from "[SPARK-26322][SS] Add spark.kafka.token.sasl.mechanism to ease delegation token configuration." to "[SPARK-26322][SS] Add spark.kafka.sasl.token.mechanism to ease delegation token configuration." on Dec 11, 2018

SparkQA commented Dec 12, 2018

Test build #99991 has finished for PR 23274 at commit de35aa2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@HeartSaVioR HeartSaVioR left a comment

It sounds like a nice improvement to me, and LGTM on the code changes.

Contributor

vanzin commented Dec 13, 2018

Merging to master.

@asfgit asfgit closed this in 6daa783 Dec 13, 2018
holdenk pushed a commit to holdenk/spark that referenced this pull request Jan 5, 2019
…tion token configuration.

Closes apache#23274 from gaborgsomogyi/SPARK-26322.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019