Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf #43238

Closed
wants to merge 4 commits into from

Conversation

hasnain-db
Copy link
Contributor

@hasnain-db hasnain-db commented Oct 6, 2023

What changes were proposed in this pull request?

This PR adds the options added in #43220 to SSLOptions and SparkTransportConf.

By adding it to the SSLOptions we can support inheritance of options, so settings for the UI and RPC SSL settings can be shared as much as possible. The SparkTransportConf changes are needed to support propagating these settings.

I also make some changes to SecurityManager to log when this feature is enabled, and make the existing spark.network.crypto options mutually exclusive with this new settings (it would just involve double encryption then).

Lastly, make these flags propagate down to when executors are launched, and allow the passwords to be sent via environment variables (similar to how it's done for an existing secret). This ensures they are not visible in plaintext, but also ensures they are available at executor startup (otherwise it can never talk to the driver/worker)

Why are the changes needed?

The propagation of these options are needed for the RPC functionality to work

Does this PR introduce any user-facing change?

No

How was this patch tested?

CI

Added some unit tests which I verified passed:

build/sbt
> project core
> testOnly org.apache.spark.SparkConfSuite org.apache.spark.SSLOptionsSuite org.apache.spark.SecurityManagerSuite org.apache.spark.deploy.worker.CommandUtilsSuite

The rest of the changes and integration were tested as part of #42685

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the CORE label Oct 6, 2023
@hasnain-db
Copy link
Contributor Author

cc: @mridulm @JoshRosen

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w.r.t the config namespace, we currently have:

  • spark.ssl for UI related SSL configs.
  • rpc IO is currently under spark.network.crypto
  • spark.rpc. for various rpc related configs, but they dont deal with security.
  • spark.io.encryption which deals with block encryption.

IMO we should namespace to spark.network and add ssl there, instead of moving these configs for rpc to spark.ssl.

Thoughts @JoshRosen, @hasnain-db ?

@hasnain-db
Copy link
Contributor Author

hasnain-db commented Oct 8, 2023

w.r.t the config namespace, we currently have ...
IMO we should namespace to spark.network and add ssl there, instead of moving these configs for rpc to spark.ssl.

My main preference here was to use the existing spark.ssl settings because there is a nice interop with how SSL works for the UI: we can reuse the same keystore and trustore and protocol settings and have things be consistent. We can also reuse the existing code for option inheritance then without much hassle. We could make this reuse existing settings under a difference namespace (if we want to offer the same UX) but then the code gets a bit messier (which is probably still fine if we feel strongly about prioritizing the UX and using the spark.network namespace).

I agree it's a little confusing as it's not under the network settings, but I don't see it being a stretch for a user to expect SSL settings to be under spark.ssl - that's naively where I checked initially when looking for RPC SSL support.

What do you think? I'm happy to defer to you and @JoshRosen for the final call here.

@hasnain-db
Copy link
Contributor Author

addressed part of the feedback. Still waiting for @JoshRosen to also chime in with his thoughts re: which config namespace to use.

Copy link
Contributor

@JoshRosen JoshRosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have super strong feelings about the configuration namespacing: I'm okay with any option as long as our docs help users to chart a clear course for common deployment scenarios.

Some assorted thoughts:

  • As we think through this, it might be useful to also review some of the proposed documentation updates from the main PR at https://github.com/apache/spark/pull/42685/files#diff-2c506cc756a5333f8fb5acb850d443980706602048f9ddb3dc515abacbfbce6a (in security.md) to see if we think these scenarios make sense.
    • Re-reading this now, I think we might want to re-work the page's introduction to offer more clarity on the different options for encryption so users have a clearer understanding of which options are mutually-exclusive. For example, we should clarify whether the existing shared secret authentication can be used with SSL.
    • We might want to re-work the intro to more clearly state that there are two mutually-exclusive choices for RPC encryption, and might want to leave some breadcrums cross-linking some sub-sections to reiterate this point.
  • There may be some advantages to the hierarchical namespacing of SSL configurations for the sake of giving users a single place to configure enabledAlgorithms (e.g. if a user has a compliance protocol that allows only certain algorithms then they shouldn't have to repeat themselves and keep multiple configurations in sync (although I suppose we could have a separate configuration that just inherits its default from the main SSL options)).
  • In the long run, do we expect that new users would choose to use non-SSL RPC encryption mechanisms? Or would they configure SSL everywhere? Do we expect the existing non-UI encryption configurations to eventually be deprecated (SASL already is)? I'm wondering if we use some long-term vision of a future end state to guide us here.

@hasnain-db
Copy link
Contributor Author

FYI The open PR for the docs is here: #43240 -- happy to address and make changes there.

@github-actions github-actions bot added the BUILD label Oct 11, 2023
@hasnain-db
Copy link
Contributor Author

@JoshRosen I updated the docs PR to hopefully add more clarity around the various encryption options (and, to your request, clarified how they overlap with the authentication setting).

There may be some advantages to the hierarchical namespacing of SSL configurations...

Agreed. If we reuse the spark.ssl.rpc namespace as is done in the PR the implementation is also much easier -- but we can also make the settings inheritance work if we use a different namespace for it - implementation complexity shouldn't hinder UX IMO.

I'm wondering if we use some long-term vision of a future end state to guide us here.

I don't have enough context to make a call here and defer to @mridulm . I will say that as a random observer here who hasn't had to manage too many spark deployments just yet, I feel like the current encryption mechanism has a strong point in its favor: it's quite simple to configure just a shared secret. SSL requires provisioning keys and certs which is always more work.

@mridulm
Copy link
Contributor

mridulm commented Oct 12, 2023

I was trying to see if spark.network.ssl made more sense, and why we have spark.ssl and not spark.ui.ssl (given current use is for UI).

We used to have more ssl related configs in the past under that namespace - spark.ssl.akka, spark.ssl.fs (file server), etc ... with the intent to keep spark.ssl as the namespace for various module's ssl config.

Given this, even though today it is just UI, it does make more sense to go back to that approach.
Let us keep it as proposed in this PR.

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of nits, looks good to me

@hasnain-db
Copy link
Contributor Author

test failure looks unrelated, retrying for the third time

@hasnain-db
Copy link
Contributor Author

org.apache.spark.sql.kafka010.KafkaSourceStressSuite is failing on this and the other outstanding PR, and seems to be unrelated to the changes here. I wonder if that is related to the new kafka version bump since it looks like it also failed on https://github.com/apache/spark/actions/runs/6497942586/job/17648192318

@mridulm mridulm closed this in 26aaf1c Oct 14, 2023
@mridulm
Copy link
Contributor

mridulm commented Oct 14, 2023

The failures are not related.
Merging to master.
Thanks for working on this @hasnain-db !
Thanks for the review @JoshRosen :-)

mridulm pushed a commit that referenced this pull request Oct 25, 2023
…rtConf

### What changes were proposed in this pull request?

This change ensures that RPC SSL options settings inheritance works properly after #43238 - we pass `sslOptions` wherever we call `fromSparkConf`.

In addition to that minor mechanical change, duplicate/add tests for every place that calls this method, to add a test case that runs with SSL support in the config.

### Why are the changes needed?

These changes are needed to ensure that the RPC SSL functionality can work properly with settings inheritance. In addition, through these tests we can ensure that any changes to these modules are also tested with SSL support and avoid regressions in the future.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Full integration testing also done as part of #42685

Added some tests and ran them:

```
build/sbt
> project core
> testOnly org.apache.spark.*Ssl*
> testOnly org.apache.spark.network.netty.NettyBlockTransferSecuritySuite
```

and

```
build/sbt -Pyarn
> project yarn
> testOnly org.apache.spark.network.yarn.SslYarnShuffleServiceWithRocksDBBackendSuite
```

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #43387 from hasnain-db/spark-tls-integrate-everywhere.

Authored-by: Hasnain Lakhani <hasnain.lakhani@databricks.com>
Signed-off-by: Mridul Muralidharan <mridul<at>gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants