:x-pack:qa:full-cluster-restart fails with SecurityException: Keystore has been corrupted or tampered with on FIPS JVM #32737

Closed
alpar-t opened this issue Aug 9, 2018 · 19 comments · Fixed by #32901
Assignees
Labels
:Security/Security Security issues without another label >test-failure Triaged test failures from CI v6.4.1 v6.5.0 v7.0.0-beta1

Comments

@alpar-t
Contributor

alpar-t commented Aug 9, 2018

Relevant log

18:02:02 > Task :x-pack:qa:full-cluster-restart:with-system-key:v6.3.3-SNAPSHOT#oldClusterTestCluster#node0.addToKeystore#xpack.watcher.encryption_key
18:02:02 Task ':x-pack:qa:full-cluster-restart:with-system-key:v6.3.3-SNAPSHOT#oldClusterTestCluster#node0.addToKeystore#xpack.watcher.encryption_key' is not up-to-date because:
18:02:02   Task has not declared any outputs despite executing actions.
18:02:02 	at sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:658)
18:02:02 	at sun.security.provider.JavaKeyStore$JKS.engineLoad(JavaKeyStore.java:56)
18:02:02 	at sun.security.provider.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:224)
18:02:02 	at sun.security.provider.JavaKeyStore$DualFormatJKS.engineLoad(JavaKeyStore.java:70)
18:02:02 	at java.security.KeyStore.load(KeyStore.java:1445)
18:02:02 Starting process 'command '/var/lib/jenkins/workspace/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA/java11/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/elasticsearch-6.3.3-SNAPSHOT/bin/elasticsearch-keystore''. Working directory: /var/lib/jenkins/workspace/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA/java11/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/cwd Command: /var/lib/jenkins/workspace/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA/java11/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/elasticsearch-6.3.3-SNAPSHOT/bin/elasticsearch-keystore add-file xpack.watcher.encryption_key /var/lib/jenkins/workspace/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA/java11/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/src/test/resources/system_key
18:02:02 Successfully started process 'command '/var/lib/jenkins/workspace/elastic+elasticsearch+master+matrix-java-periodic/ES_BUILD_JAVA/java11/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/elasticsearch-6.3.3-SNAPSHOT/bin/elasticsearch-keystore''
18:02:02 warning: ignoring JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
18:02:02 	at sun.security.util.AnchorCertificates$1.run(AnchorCertificates.java:61)
18:02:02 	at sun.security.util.AnchorCertificates$1.run(AnchorCertificates.java:52)
18:02:02 	at java.security.AccessController.doPrivileged(Native Method)
18:02:02 	at sun.security.util.AnchorCertificates.<clinit>(AnchorCertificates.java:52)
18:02:02 	at sun.security.provider.certpath.AlgorithmChecker.checkFingerprint(AlgorithmChecker.java:214)
18:02:02 	at sun.security.provider.certpath.AlgorithmChecker.<init>(AlgorithmChecker.java:164)
18:02:02 	at sun.security.provider.certpath.AlgorithmChecker.<init>(AlgorithmChecker.java:118)
18:02:02 	at sun.security.validator.SimpleValidator.engineValidate(SimpleValidator.java:157)
18:02:02 	at sun.security.validator.Validator.validate(Validator.java:262)
18:02:02 	at sun.security.validator.Validator.validate(Validator.java:238)
18:02:02 	at sun.security.validator.Validator.validate(Validator.java:207)
18:02:02 	at javax.crypto.JarVerifier.isTrusted(JarVerifier.java:610)
18:02:02 	at javax.crypto.JarVerifier.verifySingleJar(JarVerifier.java:530)
18:02:02 	at javax.crypto.JarVerifier.verifyJars(JarVerifier.java:363)
18:02:02 	at javax.crypto.JarVerifier.verify(JarVerifier.java:289)
18:02:02 	at javax.crypto.JceSecurity.verifyProviderJar(JceSecurity.java:164)
18:02:02 	at javax.crypto.JceSecurity.getVerificationResult(JceSecurity.java:190)
18:02:02 	at javax.crypto.JceSecurity.canUseProvider(JceSecurity.java:204)
18:02:02 	at javax.crypto.SecretKeyFactory.nextSpi(SecretKeyFactory.java:295)
18:02:02 	at javax.crypto.SecretKeyFactory.<init>(SecretKeyFactory.java:121)
18:02:02 	at javax.crypto.SecretKeyFactory.getInstance(SecretKeyFactory.java:160)
18:02:02 	at org.elasticsearch.common.settings.KeyStoreWrapper.createCipher(KeyStoreWrapper.java:288)
18:02:02 	at org.elasticsearch.common.settings.KeyStoreWrapper.decrypt(KeyStoreWrapper.java:336)
18:02:02 	at org.elasticsearch.common.settings.AddFileKeyStoreCommand.execute(AddFileKeyStoreCommand.java:73)
18:02:02 	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
18:02:02 	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
18:02:02 	at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:79)
18:02:02 	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
18:02:02 	at org.elasticsearch.cli.Command.main(Command.java:90)
18:02:02 	at org.elasticsearch.common.settings.KeyStoreCli.main(KeyStoreCli.java:41)
18:02:02 Exception in thread "main" java.lang.SecurityException: Keystore has been corrupted or tampered with
18:02:02 	at org.elasticsearch.common.settings.KeyStoreWrapper.decrypt(KeyStoreWrapper.java:349)
18:02:02 	at org.elasticsearch.common.settings.AddFileKeyStoreCommand.execute(AddFileKeyStoreCommand.java:73)
18:02:02 	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86)
18:02:02 	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
18:02:02 	at org.elasticsearch.cli.MultiCommand.execute(MultiCommand.java:79)
18:02:02 	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124)
18:02:02 	at org.elasticsearch.cli.Command.main(Command.java:90)
18:02:02 	at org.elasticsearch.common.settings.KeyStoreCli.main(KeyStoreCli.java:41)

The test is calling elasticsearch-keystore add xpack.watcher.encryption_key -x

CI logs:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.4+matrix-java-periodic/ES_BUILD_JAVA=java10,ES_RUNTIME_JAVA=java8fips,nodes=virtual&&linux/25/console

@alpar-t alpar-t added >test-failure Triaged test failures from CI v7.0.0 :Security/Security Security issues without another label labels Aug 9, 2018
@elasticmachine
Collaborator

Pinging @elastic/es-security

@alpar-t
Contributor Author

alpar-t commented Aug 9, 2018

Actually, it seems that in CI the problem surfaced when calling elasticsearch-keystore create.
When I tried to reproduce this locally, it failed in the same way when adding a file:

elasticsearch-keystore add-file xpack.watcher.encryption_key /home/alpar/work/elastic/elasticsearch/x-pack/qa/full-cluster-restart/src/test/resources/system_key

@albertzaharovits
Contributor

@atorok It is not clear to me how this fails. Can you please detail your reproduction?

@alpar-t
Contributor Author

alpar-t commented Aug 9, 2018

@albertzaharovits sorry for that.
I can reproduce locally with:
RUNTIME_JAVA_HOME=$JAVA8FIPS_HOME ./gradlew :x-pack:qa:full-cluster-restart:check
The test runs elasticsearch-keystore as part of the upgrade. The relevant log is:

10:02:54 Starting process 'command '/var/lib/jenkins/workspace/elastic+elasticsearch+6.4+matrix-java-periodic/ES_BUILD_JAVA/java10/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/elasticsearch-6.3.3-SNAPSHOT/bin/elasticsearch-keystore''. Working directory: /var/lib/jenkins/workspace/elastic+elasticsearch+6.4+matrix-java-periodic/ES_BUILD_JAVA/java10/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/cwd Command: /var/lib/jenkins/workspace/elastic+elasticsearch+6.4+matrix-java-periodic/ES_BUILD_JAVA/java10/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/with-system-key/build/cluster/v6.3.3-SNAPSHOT#oldClusterTestCluster node0/elasticsearch-6.3.3-SNAPSHOT/bin/elasticsearch-keystore add-file xpack.watcher.encryption_key /var/lib/jenkins/workspace/elastic+elasticsearch+6.4+matrix-java-periodic/ES_BUILD_JAVA/java10/ES_RUNTIME_JAVA/java8fips/nodes/virtual&&linux/x-pack/qa/full-cluster-restart/src/test/resources/system_key
This is the call that fails with the exception from the description.

@alpar-t
Contributor Author

alpar-t commented Aug 9, 2018

It actually takes this to reproduce it:
[:~/work/elastic/elasticsearch/x-pack/qa/full-cluster-restart] master+ 130 ± RUNTIME_JAVA_HOME=$JAVA8FIPS_HOME ../../../gradlew check

The previous command looks equivalent, but because of the sub-projects it only runs an empty check.

alpar-t added a commit that referenced this issue Aug 9, 2018
spinscale added a commit to spinscale/elasticsearch that referenced this issue Aug 10, 2018
This disables the x-pack rolling upgrade tests using a fips JVM, as
there are problems creating the keystore.

Relates elastic#32737
spinscale added a commit to spinscale/elasticsearch that referenced this issue Aug 10, 2018
…stic#32775)

This disables the x-pack rolling upgrade tests using a fips JVM, as
there are problems creating the keystore.

Relates elastic#32737
@jkakavas jkakavas self-assigned this Aug 13, 2018
@jkakavas
Member

The java.io.IOException: Invalid keystore format and its associated stack trace are a red herring. That exception is thrown while verifying the signature of the BouncyCastle FIPS Security Provider JAR, specifically when checking whether the certificate containing the public key that signed the JAR is one of the trusted root certificates that ship with the JDK:

/**
 * The purpose of this class is to determine the trust anchor certificates is in
 * the cacerts file.  This is used for PKIX CertPath checking.
 */
public class AnchorCertificates {

    private static final Debug debug = Debug.getInstance("certpath");
    private static final String HASH = "SHA-256";
    private static Set<String> certs = Collections.emptySet();

    static  {
        AccessController.doPrivileged(new PrivilegedAction<Void>() {
            @Override
            public Void run() {
                File f = new File(System.getProperty("java.home"),
                        "lib/security/cacerts");
                KeyStore cacerts;
                try {
                    cacerts = KeyStore.getInstance("JKS");
                    try (FileInputStream fis = new FileInputStream(f)) {
                        cacerts.load(fis, null);
                     .....
                     ......

The code above makes the assumption that the cacerts keystore is a JKS one, but we're running in a FIPS 140 JVM where the keystore has been modified to be of BCFKS type, and this predictably fails with

java.io.IOException: Invalid keystore format

There is an unresolved upstream issue for Java 8: https://bugs.openjdk.java.net/browse/JDK-8202893.
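
To illustrate the failure mode described above, here is a minimal sketch, assuming a stock JDK in which the "JKS" keystore type is available. The class and method names (JksProbeSketch, loadsAsJks) are hypothetical, and the byte array only stands in for a file that is not in JKS format (such as a BCFKS cacerts); it is not a real BCFKS keystore.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.security.KeyStore;

public class JksProbeSketch {

    /**
     * Hypothetical helper: returns true if the bytes load as a "JKS" keystore,
     * false if KeyStore.load rejects them with an IOException.
     */
    static boolean loadsAsJks(byte[] data) throws Exception {
        KeyStore keyStore = KeyStore.getInstance("JKS");
        try (ByteArrayInputStream in = new ByteArrayInputStream(data)) {
            // Same call shape as in AnchorCertificates: no integrity password.
            keyStore.load(in, null);
            return true;
        } catch (IOException e) {
            // A stream in some other format is rejected here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: arbitrary bytes standing in for a non-JKS keystore file.
        byte[] notJks = "stand-in for a keystore that is not JKS"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("loads as JKS: " + loadsAsJks(notJks));
    }
}
```

The point of the sketch is only that hardcoding the keystore type, as the JDK class above does, leaves no fallback when the file on disk has been converted to another format.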

@albertzaharovits
Contributor

Nice catch @jkakavas ! 💯

@jkakavas
Member

The actual issue causing the failure is discussed in #28515 (comment). This was identified as part of the effort for FIPS 140-2 compliance and the relevant change was introduced in #28515. It was not backported to 6.3 as it wasn't deemed necessary at the time.
The two options I see are:

  • Attempting to run the 6.3 keystore commands with a different Java runtime (@atorok any thoughts on how feasible that is?)
  • Muting the BWC tests for anything older than 6.4 in FIPS JVMs

@alpar-t
Contributor Author

alpar-t commented Aug 14, 2018

@jkakavas What would a user do? Do we expect upgrades from non-FIPS to FIPS? AFAIK we don't have any other tests that swap out the JDK during an upgrade, but FIPS is indeed a bit different. We should probably test what we would tell our users to do. Either we remove 6.3 from the BWC tests when running on FIPS (if we tell users to upgrade to 6.4 first and then enable FIPS), or we extend the cluster formation tests to know about FIPS, so that they can go from a FIPS runtime Java version to a non-FIPS one and run with that. @nik9000 any thoughts?

@jkakavas
Member

jkakavas commented Aug 14, 2018

@jkakavas What would a user do ? do we expect upgrades from non-fips to fips ?

FIPS and non-FIPS in this context apply to the JVM only, not Elasticsearch. This, in addition to the fact that we do not support FIPS 140 JVMs for anything < 6.4 , means that one cannot first switch to a FIPS 140 JVM and then upgrade.

Since this is the only upgrade path, I will be muting 6.3 bwc tests when running on fips.

@tvernum
Contributor

tvernum commented Aug 14, 2018

@jkakavas I don't disagree, but I'd like us to be crystal clear on what we do support.

If someone is on 6.3 (or earlier) they will need to do a rolling upgrade to 6.4 on a non-FIPS JVM.
Can they then do a rolling "upgrade" to a FIPS JVM while staying on 6.4?

Do we have a documented upgrade to a FIPS JVM process, or are we currently targeting new installs?

@jkakavas
Member

You are right that

we do not support FIPS 140 JVMs for anything < 6.4 , means that one cannot first switch to a FIPS 140 JVM and then upgrade.

was not very clear. This is probably not the right venue for the detailed discussion about what is supported, but I haven't completed the docs to point to yet so:

If someone is on 6.3 (or earlier) they will need to do a rolling upgrade to 6.4 on a non-FIPS JVM.

If they are not on a trial license (that is, they're already on platinum), they could do a rolling upgrade where each node starts in a FIPS 140 JVM after the upgrade (provided the necessary configuration changes are made on the node before starting it: the elasticsearch keystore manually recreated, JKS keystores replaced, etc.).

Can they then do a rolling "upgrade" to a FIPS JVM while staying on 6.4?

Yes.

Do we have a documented upgrade to a FIPS JVM process, or are we currently targeting new installs?

I'm in the process of writing it.

To be clear, the problem we're seeing in these CI failures occurs because a 6.3.3 node is running in a FIPS 140 JVM.

@alpar-t
Contributor Author

alpar-t commented Aug 14, 2018

If my understanding is correct, removing 6.3 from BWC testing solves the problem of the failing test, but we will support users doing the rolling upgrade and enabling FIPS at the same time, which would not be covered by any tests.

I think we wouldn't want to mix things too much, so removing 6.3 from the BWC tests when running on a FIPS JVM seems like the right thing to do. This will test rolling upgrades between FIPS-enabled versions, i.e. 6.4 -> 6.5. It also sounds to me like we would need a new FIPS-specific rolling upgrade test that does include 6.3 and older, to assert that any version running on non fips can be rolling upgraded to a fips enabled JVM.

@tvernum
Contributor

tvernum commented Aug 15, 2018

any version running on non fips can be rolling upgraded to a fips enabled JVM.

If we do this (and we should), I'd like it to include testing moves in either direction, to and from FIPS, on the same ES version. I suspect those will be the most likely scenarios for customers.

@tvernum
Contributor

tvernum commented Aug 15, 2018

Conceptually this seems very similar to new JDK versions, and the BWC tests must handle that already.
How do we handle the rolling upgrade tests on Java 10? 6.2 and earlier aren't supported on Java 10. Do the rolling upgrade tests detect that and run the old nodes on Java8?

@jasontedor
Member

How do we handle the rolling upgrade tests on Java 10? 6.2 and earlier aren't supported on Java 10. Do the rolling upgrade tests detect that and run the old nodes on Java8?

Yes, this is exactly what we do.

@jkakavas
Member

Yes, this is exactly what we do.

IIUC this sets the JAVA_VERSION so that building the distribution won't fail. What we want here is for the xxx-oldClusterTestCluster task (since we're trying to set a keystore value when configuring it) and the oldClusterTestRunner task (which runs the version we built) to run with a non-FIPS JVM if the version is < 6.4.

I'll give environment('JAVA_RUNTIME_HOME', 'appropriate-non-fips-version') for these tasks a try, unless there is another preferred way to do this (Couldn't see any existing examples)

@jasontedor
Member

Sorry @jkakavas, I pointed you to the wrong place. Take a look at

/** Return the java home used by this node. */
String getJavaHome() {
    return javaVersion == null ? project.runtimeJavaHome : project.javaVersions.get(javaVersion)
}

and

if (nodeVersion.before("6.2.0")) {
    javaVersion = 8
} else if (nodeVersion.onOrAfter("6.2.0") && nodeVersion.before("6.3.0")) {
    javaVersion = 9
}

This is how we ensure the right versions for Java home for the BWC nodes. We will need slightly different logic to pick up that the older nodes should run with a non-FIPS JVM.
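
One way the "slightly different logic" could look, as a minimal sketch written in Java rather than the build's Groovy. Every name here (BwcJavaHomeSketch, javaHomeFor, the runtimeIsFips flag, the example paths) is hypothetical and not part of the Elasticsearch build; the 6.4.0 cutoff is taken from the discussion in this thread, and versions are assumed to have exactly three numeric components with an optional "-qualifier" suffix.

```java
public class BwcJavaHomeSketch {

    /** Parses "6.3.3-SNAPSHOT" into {6, 3, 3}; any "-qualifier" suffix is ignored. */
    static int[] parse(String version) {
        String[] parts = version.split("-", 2)[0].split("\\.");
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    /** Returns true if {@code version} is strictly older than {@code other}. */
    static boolean before(String version, String other) {
        int[] a = parse(version);
        int[] b = parse(other);
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return a[i] < b[i];
            }
        }
        return false;
    }

    /**
     * Hypothetical selection rule: nodes older than 6.4.0 fall back to a
     * non-FIPS JVM when the configured runtime JVM is a FIPS one.
     */
    static String javaHomeFor(String nodeVersion, boolean runtimeIsFips,
                              String runtimeJavaHome, String nonFipsJavaHome) {
        if (runtimeIsFips && before(nodeVersion, "6.4.0")) {
            return nonFipsJavaHome;
        }
        return runtimeJavaHome;
    }

    public static void main(String[] args) {
        // Example paths are placeholders, not real CI locations.
        System.out.println(javaHomeFor("6.3.3-SNAPSHOT", true, "/opt/java8fips", "/opt/java8"));
        System.out.println(javaHomeFor("6.4.0", true, "/opt/java8fips", "/opt/java8"));
    }
}
```

How the build would learn that the runtime JVM is a FIPS one, and where a matching non-FIPS home would come from, is left open here, as it is in the thread.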


7 participants