Add hardware-accelerated codecs for DEFLATE and LZ4 #122

mulugetam · 2024-03-15T06:55:30Z

Description

Adds hardware-accelerated DEFLATE and LZ4 compression codecs for stored fields. The hardware in focus here is Intel (R) QAT, which is an integrated, built-in accelerator on the latest 4th and 5th Gen Intel Xeon processors. The implementation relies on the Qat-Java library.

The PR adds two additional valid values for index.codec: qat_deflate and qat_lz4. It also introduces a new setting, index.codec.qatmode, that specifies the mode of execution for QAT.

Two values are supported for index.codec.qatmode: hardware and auto. The hardware mode uses only the QAT hardware, while the auto mode may fall back to software if hardware resources are not available.
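As a rough illustration of the settings described above, the following self-contained Java sketch models the two new index.codec values and the two index.codec.qatmode values. The class `QatSettingsSketch`, the enum `QatMode`, and both helper methods are hypothetical names made up for this example; this is not the plugin's code, and exact-match parsing of the mode string is an assumption.

```java
import java.util.Map;
import java.util.Set;

public class QatSettingsSketch {

    /** The two execution modes named in the description. */
    enum QatMode { HARDWARE, AUTO }

    /** The two index.codec values this PR adds. */
    static final Set<String> QAT_CODECS = Set.of("qat_deflate", "qat_lz4");

    /** Hypothetical helper: maps a setting value to a mode and rejects anything else. */
    static QatMode parseMode(String value) {
        switch (value) {
            case "hardware":
                return QatMode.HARDWARE;
            case "auto":
                return QatMode.AUTO;
            default:
                throw new IllegalArgumentException("Unsupported index.codec.qatmode: " + value);
        }
    }

    /** Hypothetical helper: true when index.codec names one of the QAT codecs. */
    static boolean usesQatCodec(Map<String, String> indexSettings) {
        String codec = indexSettings.get("index.codec");
        return codec != null && QAT_CODECS.contains(codec);
    }

    public static void main(String[] args) {
        // Example index settings, written as a plain map for illustration only.
        Map<String, String> indexSettings = Map.of(
            "index.codec", "qat_lz4",
            "index.codec.qatmode", "auto"
        );
        System.out.println("QAT codec selected: " + usesQatCodec(indexSettings));
        System.out.println("Execution mode: " + parseMode(indexSettings.get("index.codec.qatmode")));
    }
}
```

Rejecting unknown mode strings up front is a design choice of this sketch; how the plugin itself validates the setting is not stated in the description.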

Closes

#130

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>

sarthakaggarwal97 · 2024-03-15T08:43:28Z

@mulugetam thanks for raising the PR. Could you please share some performance numbers for these modes?

wbeckler · 2024-03-15T19:30:06Z

licenses/qat-java-LICENSE.txt

In case anyone's wondering, it looks like there is already BSD software in the project: https://github.com/search?q=repo%3Aopensearch-project%2FOpenSearch+bsd+license&type=code

asonje · 2024-03-18T16:58:10Z

> @mulugetam thanks for raising the PR. Could you please share some performance numbers for these modes?

Here are some performance numbers for indexing, using the Stack Overflow workload:

| Metric | `qat_deflate` relative to `deflate` | `qat_lz4` relative to `default` |
| --- | --- | --- |
| Total time | -11.4% | -2.5% |
| Mean indexing throughput | +23.7% | +3.4% |
| Store size | +1.6% | -1.76% |

mulerm · 2024-03-19T17:39:06Z

Thank you @asonje. @sarthakaggarwal97 we will also share the performance numbers for search when they're ready.

sarthakaggarwal97 · 2024-03-20T02:47:46Z

Thanks @mulugetam @asonje for the initial numbers. Out of curiosity, if the underlying algorithm is still the same (in this case lz4, zlib), how are we seeing differences in store size?

asonje · 2024-03-20T18:52:47Z

> Out of curiosity, if the underlying algorithm is still same (in this case lz4, zlib), how are we seeing differences in store size?

This is expected. As you know the store size will vary from run to run but it is still a good approximation of compression ratio. qat_deflate and deflate will have different compression ratios because even though the algorithm is the same, their implementations are different. The resulting compressed bytes will not be identical in both cases even though they are compatible.
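The point about compatible but non-identical output can be illustrated with the JDK alone. The sketch below does not use QAT or Qat-Java; as a stand-in for "two different implementations", it runs java.util.zip.Deflater at two different levels, which is an assumption chosen only to show that two DEFLATE encoders may emit different bytes while any conforming decoder recovers the same input. The class name `DeflateCompatibilitySketch` and the sample data are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateCompatibilitySketch {

    /** Compresses the input with the JDK's zlib-backed Deflater at the given level. */
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses a complete zlib stream; the same decoder is used for every encoder setting. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("Truncated or unsupported stream");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up sample documents, repeated so there is something to compress.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            sb.append("{\"id\":").append(i).append(",\"title\":\"stored field ").append(i % 37).append("\"}\n");
        }
        byte[] input = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] fast = compress(input, Deflater.BEST_SPEED);
        byte[] small = compress(input, Deflater.BEST_COMPRESSION);

        System.out.println("BEST_SPEED size:       " + fast.length);
        System.out.println("BEST_COMPRESSION size: " + small.length);
        System.out.println("Identical bytes:       " + Arrays.equals(fast, small));
        System.out.println("Both round-trip:       "
            + (Arrays.equals(input, decompress(fast)) && Arrays.equals(input, decompress(small))));
    }
}
```

The sizes printed depend on the input and the JDK's zlib build, so no particular numbers are claimed here; the round-trip check is the part that mirrors the compatibility argument.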

reta · 2024-03-27T19:26:22Z

@mulugetam it looks pretty cool. Could you please share on which arch/OSes it is available? (The arch part is somewhat clear, but not the arch + OS combinations: Windows / Linux / Intel Macs, ...)

mulerm · 2024-03-27T22:42:24Z

> @mulugetam it looks pretty cool, could you please share what arch/oses it is available? (the arch part is somewhat clear, but not arch + os combinations, windows / linux / intel macs, ...)

The QAT built-in accelerator is available on 4th and 5th gen Intel (R) Xeon Processors. This version requires amd64/Linux. For all other systems, the auto mode can be used to do a software-only compression.

reta · 2024-03-27T23:23:11Z

@mulugetam when you have a chance, could you please resolve the conflicts? Thank you.

mulugetam · 2024-03-27T23:43:20Z

> @mulugetam when you have chance, could you please resolve the conflicts? thank you

It looks like it's asking me to insert new lines because the existing code is not Spotless-formatted.

mulugetam · 2024-05-14T22:29:41Z

> 2. Based on the PR my understanding is that the storage format does not change. I think in future, codec like lz4_qat should be inferred, and continue to work with other implementation of LZ4 in the example case. We should add tests to ensure that the storage remains compatible always if we're adding changes which impact only compute (e.g. use lz4_qat for compression, and lz4 for decompression)

Thanks. I believe some of us discussed similar ideas on Slack. An issue entry was also created: #130.

mulugetam · 2024-05-15T17:27:29Z

@reta do we have any pending issues that need to be resolved?

reta · 2024-05-15T17:33:12Z

> @reta do we have any pending issues that need to be resolved?

@mulugetam we still need a signoff #122 (comment)

sarthakaggarwal97 · 2024-05-15T17:37:20Z

> We should add tests to ensure that the storage remains compatible always if we're adding changes which impact only compute (e.g. use lz4_qat for compression, and lz4 for decompression)

@mulerm I think we can add this test to ensure the only change would be in the compute. What do you think @reta?

reta · 2024-05-15T18:26:24Z

> @mulerm I think we can add this test to ensure the only change would be in the compute. What do you think @reta?

@sarthakaggarwal97 I am not sure we could pull it off; for lz4_qat we need real hardware support, right? I don't think it is available in GA.

mulugetam · 2024-05-15T18:41:34Z

@reta @sarthakaggarwal97

Adding the suggested test in the current implementation is not sufficient and probably will not work, as the fieldsWriter and fieldsReader of Lucene store BEST_COMPRESSION and BEST_SPEED as the value of the MODE_KEY attribute.

My thinking was that we should treat qat_deflate (vs. BEST_COMPRESSION) and qat_lz4 (vs. BEST_SPEED) as separate codecs until such a time when BEST_COMPRESSION and BEST_SPEED would just use the accelerator "transparently", if it is present.

reta · 2024-05-15T19:02:15Z

> My thinking was that we should treat qat_deflate (vs. BEST_COMPRESSION) and qat_lz4 (vs. BEST_SPEED) as separate codecs

Certainly +1 to that

mgodwan · 2024-05-16T07:30:39Z

> I am not sure we could pull it off, for lz4_qat we need a real hardware support, right? I don't think it is available in GA.

If this is the case, I would highly advocate for this to be behind a plugin setting. #148

reta · 2024-05-16T11:58:17Z

> If this is the case, I would highly advocate for this to be behind a plugin setting. #148

But it is a separate codec already, which users have to opt in to use (not a default one)? So users have to pick a codec AND set a setting to use it? That sounds like an unreasonably complicated process to me.

mgodwan · 2024-05-16T12:10:17Z

> But it is separate codec already, which users have to opt-in to use (not a default one)?

That was also the case when zstd was added to OpenSearch in the first iteration, but it was done via a sandbox plugin. Since the plugin is no longer a sandbox plugin, I think it would be beneficial to have a way to denote that this is experimental, and to innovate over time on the settings, codec management, issue handling, etc. for these new codecs without worrying about breaking changes.

The only reason I'm saying this is that the codec affects how data is stored, so if issues arise, simply disabling it may not help users unless the data already written is also fixed.

That said, I'm okay with the call you and @sarthakaggarwal97 take on this.

sarthakaggarwal97 · 2024-05-17T06:16:58Z

Given the precedent we have had with Zstd, I think it's okay to keep the new QAT codecs as experimental for now.
Currently, there is no clean way to mark codecs as experimental, so I have opened an issue: opensearch-project/OpenSearch#13723.

With that, it would be nice if we could also come up with a plan to make new codecs, here QAT, generally available in the future. Let me think more about it. For reference, this is one of the lists I created earlier for Zstd correctness: opensearch-project/OpenSearch#9502

reta · 2024-05-17T13:32:23Z

> Given the precedent we have had with Zstd, I think its okay to keep the new QAT codecs as experimental for now.

@sarthakaggarwal97 we have #148 to address the problem for every custom codec.

mulugetam · 2024-05-29T17:26:14Z

> @sarthakaggarwal97 we have #148 to address the problem for every custom codec.

@reta @sarthakaggarwal97 Are we on hold now until #148 is implemented? I think we should not be, as it is a separate codec that users would have to opt in to use.

reta · 2024-05-29T18:35:08Z

> > @sarthakaggarwal97 we have #148 to address the problem for every custom codec.
>
> @reta @sarthakaggarwal97 Are we on hold now until #148 is implemented? I think we should not be, as it is a separate codec that users would have to opt in to use.

@mulugetam I don't think we need to wait for #148; we could address that right after (before 2.15.0).

sarthakaggarwal97 · 2024-05-30T09:14:20Z

@reta I think we would need to introduce the experimental settings in OpenSearch, since we do the validation of index codecs in EngineConfig

Ideally, we would want to stop the creation of an index only if the codec is not experimental. I feel we would need two changes: a new method in CodecSettings that tells us whether the codec is experimental, and a feature flag setting that tells us whether experimental codecs should be made available.
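One possible reading of the two changes proposed above can be sketched as follows. Everything in this block is hypothetical: the interface `CodecInfo`, its `isExperimental()` method, the class `SimpleCodec`, and the boolean flag stand in for the proposed CodecSettings method and feature flag, neither of which exists in this PR. The sketch assumes that index creation is rejected when a codec is experimental and the flag is off, and it only covers codecs that opt in to the hypothetical method.

```java
public class ExperimentalCodecGateSketch {

    /** Hypothetical stand-in for the proposed "is this codec experimental?" method. */
    interface CodecInfo {
        String name();
        boolean isExperimental();
    }

    /** Minimal illustrative implementation of the hypothetical interface. */
    static final class SimpleCodec implements CodecInfo {
        private final String name;
        private final boolean experimental;

        SimpleCodec(String name, boolean experimental) {
            this.name = name;
            this.experimental = experimental;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public boolean isExperimental() {
            return experimental;
        }
    }

    /**
     * Hypothetical validation step: rejects an experimental codec unless the
     * (assumed) feature flag for experimental codecs is enabled.
     */
    static void validate(CodecInfo codec, boolean experimentalCodecsEnabled) {
        if (codec.isExperimental() && !experimentalCodecsEnabled) {
            throw new IllegalArgumentException(
                "Codec [" + codec.name() + "] is experimental and experimental codecs are disabled");
        }
    }

    public static void main(String[] args) {
        CodecInfo qatLz4 = new SimpleCodec("qat_lz4", true);
        CodecInfo bestSpeed = new SimpleCodec("default", false);

        validate(bestSpeed, false); // non-experimental codecs always pass
        validate(qatLz4, true);     // experimental codec passes when the flag is on
        try {
            validate(qatLz4, false); // experimental codec is rejected when the flag is off
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```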

sarthakaggarwal97 · 2024-05-30T09:15:32Z

Thank you @mulugetam for this change.

opensearch-trigger-bot · 2024-05-30T09:16:22Z

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 128

To backport manually, run these commands in your terminal:

# Navigate to the root of your repository
cd $(git rev-parse --show-toplevel)
# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/custom-codecs/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/custom-codecs/backport-2.x
# Create a new branch
git switch --create backport/backport-122-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 c8b0d80a8286459857f2db2c0e9d3c1c076ada9d
# Push it to GitHub
git push --set-upstream origin backport/backport-122-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/custom-codecs/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-122-to-2.x.

sarthakaggarwal97 · 2024-05-30T09:34:34Z

@mulugetam would you please help with the backport as well? Thank you

reta · 2024-05-30T11:44:28Z

> @reta I think we would need to introduce the experimental settings in OpenSearch, since we do the validation of index codecs in EngineConfig

@sarthakaggarwal97 it would be great, but I think the validation logic won't help us here: the codecs are registered through the Apache Lucene SPI (the validation logic you are referring to only helps ensure that the codec settings are valid).

> Ideally would want to stop the creation of index only if the codecs is not experimental. I feel we would need a two changes. A new method in CodecSettings can tell us if the codec is experimental or not, and a feature flag setting will tell us whether we should make the experimental codecs available or not.

That's one of the problems: we could extend CodecSettings (this is ours) but none of the Codec implementations are required to implement it. Ideally, Apache Lucene SPI should have that feature, or alternatively, we could disable some codecs through settings.

mgodwan · 2024-05-30T11:52:33Z

> the codecs are registered by Apache Lucene SPI (the validation logic you are referring to only helps with ensuring the codec settings validness).

I think a check like this will definitely help to disable its usage in the write path. For the write path, the NamedSPI interface is not used.

…ct#122)

* Add QAT accelerated compression.
* Use own classes for QAT codec. Apply SpotlessJavaCheck.
* Declare fields final, unless required not to. Throw a valid type of exception.
* Use assumeThat in the Qat test classes.
* Add more QAT availability check in QatCodecTests.
* Make LZ4 the default algorithm for QAT.
* Make 'auto' the default execution mode for QAT. Also, minor clean up work.
* Revert compression level for ZSTD to 3.
* Replace QatLz4/DeflateCompressionMode classes with QatCompressionMode.
* Fix a MultiCodecMergeIT test fail.
* Remove hard-coded values for default compression level.

Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>
Signed-off-by: mulugetam <mulugeta.mammo@intel.com>
Co-authored-by: Mulugeta Mammo <cppx86@gmail.com>
(cherry picked from commit c8b0d80)

mulugetam · 2024-05-30T18:06:52Z

> @mulugetam would you please help with the backport as well? Thank you

@reta @sarthakaggarwal97 #150

Add QAT accelerated compression.

a23beae

Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>

mulugetam requested review from andrross, reta, nknize and sarthakaggarwal97 as code owners March 15, 2024 06:55

mulugetam mentioned this pull request Mar 15, 2024

Add a plugin for hardware-accelerated compression. opensearch-project/OpenSearch#12351

Closed

8 tasks

mulugetam changed the title ~~Add hardware-accelerated DEFLATE and LZ4 compression codec~~ Add hardware-accelerated codecs for DEFLATE and LZ4 Mar 15, 2024

wbeckler reviewed Mar 15, 2024

mulerm mentioned this pull request Mar 26, 2024

[RFC] Hardware-accelerated Compression #130

Open

reta reviewed Mar 26, 2024

src/main/java/org/opensearch/index/codec/customcodecs/CustomCodecPlugin.java (outdated)

reta reviewed Mar 26, 2024

src/main/java/org/opensearch/index/codec/customcodecs/Lucene99CustomCodec.java (outdated)

Use own classes for QAT codec. Apply SpotlessJavaCheck.

32e6ab7

Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>

reta reviewed Mar 27, 2024

src/main/java/org/opensearch/index/codec/customcodecs/Lucene99QatStoredFieldsFormat.java (outdated)

reta reviewed Mar 27, 2024

src/main/java/org/opensearch/index/codec/customcodecs/QatLz4CompressionMode.java (outdated)

reta reviewed Mar 27, 2024

src/test/java/org/opensearch/index/codec/customcodecs/QatDeflateCompressorTests.java (outdated)

Declare fields final, unless required not to. Throw a valid type of e…

69a8dc3

…xception. Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>

Use assumeThat in the Qat test classes.

6178b69

Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>

mulugetam added 2 commits March 27, 2024 17:02

Merge branch 'main' into add_qat_acceleration

4266df1

Signed-off-by: mulugetam <mulugeta.mammo@intel.com>

Add more QAT availability check in QatCodecTests.

544aa92

Signed-off-by: Mulugeta Mammo <mulugeta.mammo@intel.com>

sarthakaggarwal97 reviewed Mar 28, 2024

src/main/java/org/opensearch/index/codec/customcodecs/Lucene99QatCodec.java

src/main/java/org/opensearch/index/codec/customcodecs/Lucene99QatStoredFieldsFormat.java (outdated)

sarthakaggarwal97 reviewed Mar 28, 2024

src/main/java/org/opensearch/index/codec/customcodecs/CustomCodecService.java (outdated)

mgodwan mentioned this pull request May 16, 2024

[FEATURE] Add support for experimental codecs #148

Open

sarthakaggarwal97 approved these changes May 30, 2024

sarthakaggarwal97 merged commit c8b0d80 into opensearch-project:main May 30, 2024
17 checks passed

sarthakaggarwal97 added the backport 2.x label May 30, 2024

opensearch-trigger-bot bot added the backport-failed label May 30, 2024

reta mentioned this pull request May 30, 2024

Remove org.opensearch.secure_sm.ThreadPermission from plugin manifest, this permission should not be granted #151

Merged

peterzhuamazon mentioned this pull request Jul 25, 2024

[BUG] 2.15 using qat_deflate with default docker image crashes node because of missing library #168

Closed
