diff --git a/docs/security.md b/docs/security.md
index 0a6a3cf089185..2a1105fea33fe 100644
--- a/docs/security.md
+++ b/docs/security.md
@@ -147,7 +147,26 @@ Note that when using files, Spark will not mount these files into the containers
 you to ensure that the secret files are deployed securely into your containers and that the driver's
 secret file agrees with the executors' secret file.
 
-## Encryption
+# Network Encryption
+
+Spark supports two mutually exclusive forms of encryption for RPC connections.
+
+The first is an AES-based encryption which relies on a shared secret, and thus requires
+RPC authentication to also be enabled.
+
+The second is an SSL based encryption mechanism utilizing Netty's support for SSL. This requires
+keys and certificates to be properly configured. It can be used with or without the authentication
+mechanism discussed earlier.
+
+One may prefer to use the SSL based encryption in scenarios where compliance mandates the usage
+of specific protocols; or to leverage the security of a more standard encryption library. However,
+the AES based encryption is simpler to configure and may be preferred if the only requirement
+is that data be encrypted in transit.
+
+If both options are enabled in the configuration, the SSL based RPC encryption takes precedence
+and the AES based encryption will not be used (and a warning message will be emitted).
+
+## AES based Encryption
 
 Spark supports AES-based encryption for RPC connections. For encryption to be enabled, RPC
 authentication must also be enabled and properly configured. AES encryption uses the
@@ -209,6 +228,17 @@ The following table describes the different options available for configuring th
+## SSL Encryption
+
+Spark supports SSL based encryption for RPC connections. Please refer to the SSL Configuration
+section below to understand how to configure it. The SSL settings are mostly similar across the UI
+and RPC, however there are a few additional settings which are specific to the RPC implementation.
+The RPC implementation uses Netty under the hood (while the UI uses Jetty), which supports a
+different set of options.
+
+Unlike the other SSL settings for the UI, the RPC SSL is *not* automatically enabled if
+`spark.ssl.enabled` is set. It must be explicitly enabled, to ensure a safe migration path for users
+upgrading Spark versions.
 
 # Local Storage Encryption
 
@@ -437,8 +467,10 @@ application configurations will be ignored.
 Configuration for SSL is organized hierarchically. The user can configure the default SSL settings
 which will be used for all the supported communication protocols unless they are overwritten by
 protocol-specific settings. This way the user can easily provide the common settings for all the
-protocols without disabling the ability to configure each one individually. The following table
-describes the SSL configuration namespaces:
+protocols without disabling the ability to configure each one individually. Note that all settings
+are inherited this way, *except* for `spark.ssl.rpc.enabled` which must be explicitly set.
+
+The following table describes the SSL configuration namespaces:
 
@@ -466,17 +498,22 @@ describes the SSL configuration namespaces:
 spark.ssl.historyServer
 History Server Web UI
+spark.ssl.rpc
+Spark RPC communication
 The full breakdown of available SSL options can be found below. The `${ns}` placeholder should be
 replaced with one of the above namespaces.
 
-Property Name | Default | Meaning
+Property Name | Default | Meaning | Supported Namespaces
 ${ns}.enabled
 false
 Enables SSL. When enabled, ${ns}.ssl.protocol is required.
+ui,standalone,historyServer,rpc
@@ -490,6 +527,7 @@ replaced with one of the above namespaces.
 ${ns}.port
 When not set, the SSL port will be derived from the non-SSL port for the same service. A value of "0" will make the service bind to an ephemeral port.
+ui,standalone,historyServer
@@ -504,6 +542,7 @@ replaced with one of the above namespaces.
 ${ns}.enabledAlgorithms
 Note: If not set, the default cipher suite for the JRE will be used.
+ui,standalone,historyServer,rpc
@@ -511,6 +550,7 @@ replaced with one of the above namespaces.
 ${ns}.keyPassword
 The password to the private key in the key store.
+ui,standalone,historyServer,rpc
@@ -519,16 +559,19 @@ replaced with one of the above namespaces.
 ${ns}.keyStore
 Path to the key store file. The path can be absolute or relative to the directory in which the process is started.
+ui,standalone,historyServer,rpc
 ${ns}.keyStorePassword
 None
 Password to the key store.
+ui,standalone,historyServer,rpc
 ${ns}.keyStoreType
 JKS
 The type of the key store.
+ui,standalone,historyServer
@@ -541,11 +584,15 @@ replaced with one of the above namespaces.
 ${ns}.protocol
 this page.
+ui,standalone,historyServer,rpc
 ${ns}.needClientAuth
 false
-Whether to require client authentication.
+Whether to require client authentication.
+ui,standalone,historyServer
@@ -554,16 +601,68 @@ replaced with one of the above namespaces.
 ${ns}.trustStore
 Path to the trust store file. The path can be absolute or relative to the directory in which the process is started.
+ui,standalone,historyServer,rpc
 ${ns}.trustStorePassword
 None
 Password for the trust store.
+ui,standalone,historyServer,rpc
 ${ns}.trustStoreType
 JKS
 The type of the trust store.
+ui,standalone,historyServer
+${ns}.openSSLEnabled
+false
+Whether to use OpenSSL for cryptographic operations instead of the JDK SSL provider.
+This setting requires the `certChain` and `privateKey` settings to be set.
+This takes precedence over the `keyStore` and `trustStore` settings if both are specified.
+If the OpenSSL library is not available at runtime, we will fall back to the JDK provider.
+rpc
+${ns}.privateKey
+None
+Path to the private key file in PEM format. The path can be absolute or relative to the
+directory in which the process is started.
+This setting is required when using the OpenSSL implementation.
+rpc
+${ns}.certChain
+None
+Path to the certificate chain file in PEM format. The path can be absolute or relative to the
+directory in which the process is started.
+This setting is required when using the OpenSSL implementation.
+rpc
+${ns}.trustStoreReloadingEnabled
+false
+Whether the trust store should be reloaded periodically.
+This setting is mostly only useful in standalone deployments, not k8s or yarn deployments.
+rpc
+${ns}.trustStoreReloadIntervalMs
+10000
+The interval at which the trust store should be reloaded (in milliseconds).
+This setting is mostly only useful in standalone deployments, not k8s or yarn deployments.
+rpc
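
Reviewer note: the precedence and inheritance rules this patch documents can be sketched roughly as follows. This is a hypothetical illustration in plain Python, not Spark's implementation. The helper names `resolve_ssl_setting` and `rpc_encryption_mode` are invented for this note, and the keys `spark.network.crypto.enabled` (AES RPC encryption) and `spark.authenticate` (RPC authentication) come from the wider Spark configuration docs rather than from this diff. Raising an error when AES is enabled without authentication is an assumption of the sketch.

```python
import warnings


def _is_true(value):
    """Treat the string 'true' (any case) as enabled; anything else as disabled."""
    return str(value).strip().lower() == "true"


def resolve_ssl_setting(conf, ns, name):
    """Return the value of `<ns>.<name>`, falling back to the default
    `spark.ssl.<name>` when the namespace does not override it.

    Exception described in the patch: `spark.ssl.rpc.enabled` is never
    inherited from `spark.ssl.enabled`; it must be set explicitly.
    """
    key = f"{ns}.{name}"
    if key in conf:
        return conf[key]
    if key == "spark.ssl.rpc.enabled":
        return None  # no inheritance for this one setting
    return conf.get(f"spark.ssl.{name}")


def rpc_encryption_mode(conf):
    """Return 'ssl', 'aes', or 'none' for a dict of Spark-style settings."""
    ssl_on = _is_true(resolve_ssl_setting(conf, "spark.ssl.rpc", "enabled"))
    aes_on = _is_true(conf.get("spark.network.crypto.enabled", "false"))

    if ssl_on:
        if aes_on:
            # The docs say SSL takes precedence and a warning is emitted.
            warnings.warn("Both SSL and AES RPC encryption enabled; using SSL.")
        return "ssl"
    if aes_on:
        # AES relies on the shared secret, so authentication must be on.
        # (Raising here is this sketch's choice, not confirmed by the diff.)
        if not _is_true(conf.get("spark.authenticate", "false")):
            raise ValueError("AES RPC encryption requires spark.authenticate=true")
        return "aes"
    return "none"
```

With only `spark.ssl.enabled=true` set, `rpc_encryption_mode` returns `"none"`, which mirrors the opt-in migration behaviour described in the SSL Encryption section, while other settings such as `keyStore` still resolve from the default `spark.ssl` namespace.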