After update to Lucene 9.4 use `--enable-preview` on Java==19 (exact) to allow mmap use new JDK APIs #4637

uschindler · 2022-09-29T15:28:36Z

Description

Apache Lucene 9.4 will have support for Java 19 Panama APIs to mmap index files (using a MR-JAR). See apache/lucene#912 for more information.

As those APIs are not yet enabled by default in the JDK, we have to still use some opt-in approach, controlled by Java's command line:

Lucene by default uses the old implementation using MappedByteBuffer and several hacks which may also risk crushing the JDK if an open index is closed from another thread while a search is running (this is well known). If Java 19 is detected, Lucene will log a warning through JUL when MMapDirectory is initialized (see below).
If you pass --enable-preview to the Java command line (next to heap settings), it will enable preview APIs in JDK (https://openjdk.org/jeps/12). Lucene detects this and switches MMapDirectory and uses a new implementation MemorySegmentIndexInput for the inputs to use those new APIs (at moment it will also log this as "info" message to JUL). The new APIs are safe and can no longer crush the JVM. But most importantly, all index files are now mapped in portions of 16 GiB instead of 1 GiB into memory. In fact, unless an index is force-merged to one segment, all index files will then consist only of one memory mapping spawning the whole file! This will help Hotspot to further optimize reading as only one implementation on MemorySegmentIndexInput ist used. In addition, because the number of mappings is dramatically reduced (approximately 5 times less mappings, because the maximum segment size is 5 Gigabytes by default and all such segments now use one instead of 5 mappings). This may allow users to no longer change sysctl (see max_map_count @ https://opensearch.org/docs/latest/opensearch/install/important-settings/) and go with defaults of OS. On the other hand users may host more indexes with many more segments on one node.

Some TODOs:

Make sure that Opensearch also redirects stuff logged via java.util.logging (JUL) to its own log file, so they do not land in console. This can be done with log4j by adding the log4j-jul adapter and install it using a system property in the Bootstrap classes. I have not checked if this is already done. The reason for this is that Apache Lucene now logs some events using java.util.logging since Lucene 9.0. Some of those events are MMapDirectory messages (e.g., when unmapping was not working) or few others like some module system settings are incorrect. Logging is very seldom, but for this feature it will definitely log using JUL, so it would be good to make sure Opensearch redirects JUL logging correctly to its own loggers. This could be a separate issue!
The Opensearch startup script should pass --enable-preview as command line flag if exactly Java 19 is used to start up Opensearch. If this is not done, a warning gets logged (see above).

Important: Lucene 9.4 only supports this on Java 19 (exactly), because the APIs are in flux. If you start with Java 20, it falls back to the classical MMapDirectory. We will add support for Java 20 in a later release. The reason for this is that the class files of new implementation are marked by some special version numbers that make them ONLY compatible to Java 19, not earlier or later, to allow the JDK to apply changes to the API before final release in Java 21. But passing --enable-preview to later versions won't hurt, so maybe enable it on all versions >= 19.

A last note: The downside of this new code is that closing and unmapping an index file gets more heavy (it will trigger an safepoint in the JVM). We have not yet found out how much this impacts servers opening/closing index files a lot. Because of this we would really like Amazon/Opensearch to do benchmarking on this, ideally if their users and customers could optionally enable it. But benchmarking should be done now, because with hopefully Java 21, Lucene will use the new implementation by default. Java 20 will be the second and last preview round.

The text was updated successfully, but these errors were encountered:

dblock · 2022-09-29T15:52:55Z

Thanks for the detailed proposal. Alongside these changes I think we need to start running at least some CI on both JDK 17 and JDK19. Currently we run 11 and 17 via https://github.com/opensearch-project/OpenSearch/blob/main/buildSrc/version.properties#L5.

uschindler · 2022-09-29T15:56:49Z

Ah OK, I did not know which versions you are running. Because the sister project ES is already on JDK 19 for testing and bundles 18 at moment.

reta · 2022-09-29T16:03:45Z

We did some work to support JDK-18 [1], somehow overlooked JDK-19, @dblock I can take this part, will update CI images, etc.

[1] #1710

dblock · 2022-09-29T16:16:27Z

@reta What do you think about extracting a small subset of tests (not ITs for example) to run on GHA, so we can make the JDK matrix easy and visible in the repo, just like we do for plugins? (e.g. this one). Something more than precommit but less than gradle check? and reducing gradle check to ITs

reta · 2022-09-29T16:30:22Z

@reta What do you think about extracting a small subset of tests (not ITs for example) to run on GHA, so we can make the JDK matrix easy and visible in the repo, just like we do for plugins? (e.g. this one). Something more than precommit but less than gradle check? and reducing gradle check to ITs

@dblock certainly +1 to give it a try, I think we should be able to run everything on GHA but starting with subset is probably a good idea.

uschindler · 2022-09-29T16:33:42Z

Of course when Lucene comes out tomorrow (hopefully, the schedule suggests it, but something bad can always happen), please update Lucene to 9.4.0 first. I have no idea at which stage the snapshot currently used is: 9.4.0-snapshot-ddf0d0a

dblock · 2022-09-29T16:53:38Z

@uschindler Yes, of course, but don't let us stop you :)

Read more about Lucene Snapshots here: https://github.com/opensearch-project/OpenSearch/blob/main/DEVELOPER_GUIDE.md#lucene-snapshots

dbwiddis · 2022-09-30T05:23:28Z

+1 on this. I've been playing around with JDK19 and there's some great performance gains to be had. Downside is the temporary nature of anything we do, as we'll have to tweak it for 20 and then 21, but will be good to work out the kinks with an opt-in.

uschindler · 2022-09-30T07:24:49Z

@dbwiddis You can use a similar approach for/with JNA vs. panama-foreign like we do by using an MR-JAR and adding some try/catch logic when loading the classes.

dbwiddis · 2022-09-30T20:26:02Z

You can use a similar approach for/with JNA vs. panama-foreign like we do by using an MR-JAR

@uschindler it's less the practical aspect (it can be done) but more the future question of "what do we do a year from now when JDK 21 LTS is out". Or even 6 months from now when JDK 20 is out and JDK 19 is EOL and no longer receiving updates.

Are we going to continue to "support" older versions that are not getting security updates (which I might suggest are more likely with native memory access than other features)? Do we back out the "19" branch from the MRJAR? If it's the same API in 20 but there's a security fix in 20 that's not in 19, which version do we enforce in the MRJAR?

I'm all for trying to enable features (early) that we think will be in JDK 21 (with some possible tweaks to the API) but I think it should be with the expectation that we will flip 19 to 20, and then 21, and not make any attempt to "support" EOL versions.

uschindler · 2022-10-01T13:51:07Z

Hi,
My plan is to have another Gradle SourceSet in Lucene for 20 and then one for 21. The logic to load will be the same (no changes needed).

If I figure out that in JDK 20 there are no API changes (most likely for the part of Panama we use), we may just add a Gradle task to copy the v19 class files to the v20 MR-JAR folder while patching classfile major and preview minor version (directly or via ASM visitor). This spares us to autoprovision JDK20. I will see what works. When JDK 21 LTS comes out we may drop support for the 19 and 20 in newer Lucene versions and use a plain MR-JAR variant without any preview supporting logic. But this would not break any code here. JDK 19 would fallback to the old code. 🤣

In short: I will support 20 in addition later.

dbwiddis · 2022-10-01T23:46:01Z

When JDK 21 LTS comes out we may drop support for the 19 and 20 in newer Lucene versions

I think that aligns with my thinking. Of course I'm not the one to convince. :-)

saratvemulapalli · 2022-10-11T19:07:06Z

Looks like we do want to use --enable-preview with Lucene 9.4, which means OpenSearch has to run on Java 19.
Is everybody agreeing with the change, looking for thoughts from @nknize.
Please do 👍🏻 👎🏻 .

dblock · 2022-10-11T21:01:28Z

So OpenSearch 3.x would be JDK 19 only as a breaking change (I think 👎 for me), or would support --enable-preview as an option and continue to be runnable on JDK11, 17 and 19 (definitely 👍 and could be on 2.x?)?

reta · 2022-10-11T21:14:35Z

Oh I think we should be testing with JDK 19 but not make it the default JDK bundle ... People could use it at will but we should not push it

dbwiddis · 2022-10-12T04:47:15Z

So OpenSearch 3.x would be JDK 19 only as a breaking change (I think 👎 for me), or would support --enable-preview as an option and continue to be runnable on JDK11, 17 and 19 (definitely 👍 and could be on 2.x?)?

The request is in the original Issue text:

support JDK 11 and 17 with no changes
If, and only if, the user is running on JDK 19, add the --enable-preview on the command line

(Original issue also notes some logging changes required.)

nknize · 2022-10-12T18:25:17Z

Make sure that Opensearch also redirects stuff logged via java.util.logging (JUL) to its own log file,...I have not checked if this is already done.

JUL log4j adapter was not configured so java.util.logging was only using stderr handler (note stdout / stderr are redirected to opensearch.stdout.log / opensearch.stderr.log but logger.fine in JUL was never logging since the log4j configuration wasn't propagating to the JUL logger). I opened a PR to install and configure the JUL log4j adapter so it logs through the same logger used across OpenSearch.

Oh I think we should be testing with JDK 19 but not make it the default JDK bundle ... People could use it at will but we should not push it

I think this is a good first step. However 3.0 is scheduled to go GA January next year and this is a good time to introduce any breaking changes (e.g., maybe start working on jigsaw modularization of OpenSearch?). So perhaps we think about opt in for OpenSearch 2.4 but maybe it becomes the default for 3.0 (we'll do this in a separate PR targeted for main branch only)? @uschindler, any concerns with cutting over to 19 as default bundled jdk for 3.0 GA? Maybe we continue to let --enable-preview be opt in in 3.0 even w/ default 19?

nknize · 2022-10-12T22:08:26Z

We have not yet found out how much this impacts servers opening/closing index files a lot. Because of this we would really like Amazon/Opensearch to do benchmarking on this...

@anasalkouz @adnapibar are we publishing nightly benchmarks yet? We should be helping out the Lucene project w/ performance benchmarks like this using opensearch-benchmarks. /cc @mikemccand

uschindler · 2022-10-12T23:13:17Z

That was my plan: bundle and test with JDK 19 and enable the preview mode only for that combination.

I think the magic can be done in this java bootstrap part that precalculates the initial java command before launching the main server.

nknize · 2022-10-12T23:54:15Z

That was my plan: bundle and test with JDK 19 and enable the preview mode only for that combination.

+1 to follow suit and default bundle jdk 19 for OpenSearch 3.0 (main branch); and make it opt-in only for 2.x.

reta · 2022-10-27T16:41:03Z

JDK-19 support on Jenkins: opensearch-project/opensearch-build#2808

reta · 2022-11-08T18:16:41Z

The main is on JDK 19: #4973

uschindler · 2022-11-08T18:47:40Z

Do we also pass enable-preview at runtime?

In the patch I only see it passed to javadoc, but why?

reta · 2022-11-08T19:02:31Z

Do we also pass enable-preview at runtime?

Not yet, will do that shortly

In the patch I only see it passed to javadoc, but why?

> Task :server:javadoc                                                                         
error: class file for /org.apache.lucene/lucene-core/9.5.0-snapshot-a4ef70f/e949897fa24e14d2701a3c41fe27a4f094681b81/lucene-core-9.5.0-snapshot-a4ef70f.jar(/org/apache/lucene/store/MemorySegmentIndexInputProvider.class) uses preview features of Java SE 19.
  (use --enable-preview to allow loading of class files which contain preview features)
error: class file for /org.apache.lucene/lucene-core/9.5.0-snapshot-a4ef70f/e949897fa24e14d2701a3c41fe27a4f094681b81/lucene-core-9.5.0-snapshot-a4ef70f.jar(/org/apache/lucene/store/MemorySegmentIndexInput.class) uses preview features of Java SE 19.
  (use --enable-preview to allow loading of class files which contain preview features)
error: class file for /org.apache.lucene/lucene-core/9.5.0-snapshot-a4ef70f/e949897fa24e14d2701a3c41fe27a4f094681b81/lucene-core-9.5.0-snapshot-a4ef70f.jar(/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class) uses preview features of Java SE 19.
  (use --enable-preview to allow loading of class files which contain preview features)

uschindler · 2022-11-08T20:17:57Z

Do we also pass enable-preview at runtime?

Not yet, will do that shortly

In the patch I only see it passed to javadoc, but why?

> Task :server:javadoc                                                                         
error: class file for /org.apache.lucene/lucene-core/9.5.0-snapshot-a4ef70f/e949897fa24e14d2701a3c41fe27a4f094681b81/lucene-core-9.5.0-snapshot-a4ef70f.jar(/org/apache/lucene/store/MemorySegmentIndexInputProvider.class) uses preview features of Java SE 19.
  (use --enable-preview to allow loading of class files which contain preview features)
error: class file for /org.apache.lucene/lucene-core/9.5.0-snapshot-a4ef70f/e949897fa24e14d2701a3c41fe27a4f094681b81/lucene-core-9.5.0-snapshot-a4ef70f.jar(/org/apache/lucene/store/MemorySegmentIndexInput.class) uses preview features of Java SE 19.
  (use --enable-preview to allow loading of class files which contain preview features)
error: class file for /org.apache.lucene/lucene-core/9.5.0-snapshot-a4ef70f/e949897fa24e14d2701a3c41fe27a4f094681b81/lucene-core-9.5.0-snapshot-a4ef70f.jar(/org/apache/lucene/store/MemorySegmentIndexInput$MultiSegmentImpl.class) uses preview features of Java SE 19.
  (use --enable-preview to allow loading of class files which contain preview features)

The problem is because during javadoc you do not pass --release 17 or like that. So it fails because it can "see" those classes during javadoc, although it won't use them. That's strange, looks like a bug in javadoc to me. I thought the target JDK for compile is still 17, or was this changed to release 19?

reta · 2022-11-08T20:42:25Z

I thought the target JDK for compile is still 17, or was this changed to release 19?

The target is actually set to 11, but we didn't pass anything to javadoc as target

uschindler · 2022-11-08T20:42:45Z

As far as I see you do not pass any release version to the compiler. So your java files are marked as 19. Therefore you need java 19 to build javadoc. And therefore it reads all class files from Lucene, also the preview ones.

I doubt that it will work with JDK 20. If you execute compilation with 20 and then enforce javadoc to 19 it will also fail.

So be consistent: pass consistent predefined release version to both compiler and javadoc. Just run tests with latest version. This makes build reproducible.

In java we compile against release 11 (Lucene 9) and against release 17 (Lucenes main), but run tests with what Gradle provides.

reta · 2022-11-08T20:48:53Z

So be consistent: pass consistent predefined release version to both compiler and javadoc. Just run tests with latest version. This makes build reproducible.

Fixed that for Javadoc (#5151), for compiler we use Gradle's source/target compatibility settings.

reta · 2022-11-08T20:49:41Z

In java we compile against release 11 (Lucene 9) and against release 17 (Lucenes main), but run tests with what Gradle provides.

You use toolchain, correct?

uschindler · 2022-11-08T21:05:55Z

So be consistent: pass consistent predefined release version to both compiler and javadoc. Just run tests with latest version. This makes build reproducible.

Fixed that for Javadoc (#5151), for compiler we use Gradle's source/target compatibility settings.

That's the correct variant. I did not find the target compatibility in your Gradle file. Gradle does tot automatically pass release to java and javadoc globally, you need to configure it in every task. To be 100% safe, you should also pass 11 as release to javac. This also ensures that you compile against 11 API. Target and source only produce class files for 11 but it compiles against actual java library.

But the fix looks much better now.

reta · 2022-11-08T21:08:25Z

Thanks @uschindler here we are https://github.com/opensearch-project/OpenSearch/blob/main/buildSrc/build.gradle#L163 (OpenSearch build is quite complex) :-)

uschindler · 2022-11-08T21:08:50Z

In java we compile against release 11 (Lucene 9) and against release 17 (Lucenes main), but run tests with what Gradle provides.

You use toolchain, correct?

Partially, yes, to compile java 19 classes for mmap. For the main sources we use --release 11 also on compiler. This will tell javac to compile against a signature file with java 11 class declarations (all versions after java 8 ship with some minimized jar file containing stripped class files without bodies for all major versions).

dblock · 2022-12-27T01:22:40Z

FYI apache/lucene#12033

uschindler · 2022-12-27T08:11:57Z

Sorry, forgot to post this here. After update to Lucenes 9.5 you can remove the flag.

uschindler added enhancement Enhancement or improvement to existing feature or request untriaged labels Sep 29, 2022

saratvemulapalli added the Build Libraries & Interfaces label Oct 4, 2022

This was referenced Oct 12, 2022

Install and configure Log4j JUL adapter for Lucene 9.4 #4753

Closed

Install and configure Log4j JUL adapter for Lucene 9.4 #4754

Merged

reta mentioned this issue Oct 27, 2022

Support JDK-19 builds on Jenkins opensearch-project/opensearch-build#2807

Closed

reta mentioned this issue Oct 28, 2022

Update to Gradle 7.6 and JDK-19 #4973

Merged

6 tasks

VachaShah removed the untriaged label Nov 1, 2022

dblock closed this as completed in #4973 Nov 8, 2022

reta mentioned this issue Nov 8, 2022

Allow mmap to use new preview JDK-19 APIs in Apache Lucene 9.4+ #5151

Merged

6 tasks

reta mentioned this issue Dec 27, 2022

Remove --enable-preview feature flag since Apache Lucene now patches class files #5642

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

After update to Lucene 9.4 use `--enable-preview` on Java==19 (exact) to allow mmap use new JDK APIs #4637

After update to Lucene 9.4 use `--enable-preview` on Java==19 (exact) to allow mmap use new JDK APIs #4637

uschindler commented Sep 29, 2022

dblock commented Sep 29, 2022

uschindler commented Sep 29, 2022 •

edited

Loading

reta commented Sep 29, 2022 •

edited

Loading

dblock commented Sep 29, 2022 •

edited

Loading

reta commented Sep 29, 2022

uschindler commented Sep 29, 2022

dblock commented Sep 29, 2022 •

edited

Loading

dbwiddis commented Sep 30, 2022

uschindler commented Sep 30, 2022

dbwiddis commented Sep 30, 2022

uschindler commented Oct 1, 2022 •

edited

Loading

dbwiddis commented Oct 1, 2022

saratvemulapalli commented Oct 11, 2022

dblock commented Oct 11, 2022 •

edited

Loading

reta commented Oct 11, 2022

dbwiddis commented Oct 12, 2022

nknize commented Oct 12, 2022 •

edited

Loading

nknize commented Oct 12, 2022

uschindler commented Oct 12, 2022

nknize commented Oct 12, 2022

reta commented Oct 27, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022 •

edited

Loading

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022 •

edited

Loading

dblock commented Dec 27, 2022

uschindler commented Dec 27, 2022

After update to Lucene 9.4 use --enable-preview on Java==19 (exact) to allow mmap use new JDK APIs #4637

After update to Lucene 9.4 use --enable-preview on Java==19 (exact) to allow mmap use new JDK APIs #4637

Comments

uschindler commented Sep 29, 2022

Description

dblock commented Sep 29, 2022

uschindler commented Sep 29, 2022 • edited Loading

reta commented Sep 29, 2022 • edited Loading

dblock commented Sep 29, 2022 • edited Loading

reta commented Sep 29, 2022

uschindler commented Sep 29, 2022

dblock commented Sep 29, 2022 • edited Loading

dbwiddis commented Sep 30, 2022

uschindler commented Sep 30, 2022

dbwiddis commented Sep 30, 2022

uschindler commented Oct 1, 2022 • edited Loading

dbwiddis commented Oct 1, 2022

saratvemulapalli commented Oct 11, 2022

dblock commented Oct 11, 2022 • edited Loading

reta commented Oct 11, 2022

dbwiddis commented Oct 12, 2022

nknize commented Oct 12, 2022 • edited Loading

nknize commented Oct 12, 2022

uschindler commented Oct 12, 2022

nknize commented Oct 12, 2022

reta commented Oct 27, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022 • edited Loading

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022

reta commented Nov 8, 2022

uschindler commented Nov 8, 2022 • edited Loading

dblock commented Dec 27, 2022

uschindler commented Dec 27, 2022

After update to Lucene 9.4 use `--enable-preview` on Java==19 (exact) to allow mmap use new JDK APIs #4637

After update to Lucene 9.4 use `--enable-preview` on Java==19 (exact) to allow mmap use new JDK APIs #4637

uschindler commented Sep 29, 2022 •

edited

Loading

reta commented Sep 29, 2022 •

edited

Loading

dblock commented Sep 29, 2022 •

edited

Loading

dblock commented Sep 29, 2022 •

edited

Loading

uschindler commented Oct 1, 2022 •

edited

Loading

dblock commented Oct 11, 2022 •

edited

Loading

nknize commented Oct 12, 2022 •

edited

Loading

reta commented Nov 8, 2022 •

edited

Loading

uschindler commented Nov 8, 2022 •

edited

Loading