forked from opensearch-project/k-NN
Commit
Quantization Framework Implementation with 1bit, 2bit and 4bit Binary Quantizer
Showing 35 changed files with 2,124 additions and 0 deletions.
34 changes: 34 additions & 0 deletions
src/main/java/org/opensearch/knn/quantization/enums/QuantizationType.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.enums; | ||
|
||
/** | ||
* The QuantizationType enum represents the different types of quantization | ||
* that can be applied in the k-NN plugin. | ||
* | ||
* <ul> | ||
* <li><b>SPACE_QUANTIZATION:</b> This type of quantization focuses on the space | ||
* or the representation of the data vectors. It is commonly used for techniques | ||
* that reduce the dimensionality or discretize the data space.</li> | ||
* <li><b>VALUE_QUANTIZATION:</b> This type of quantization focuses on the values | ||
* within the data vectors. It involves mapping continuous values into discrete | ||
* values, which can be useful for compressing data or reducing the precision | ||
* of the representation.</li> | ||
* </ul> | ||
*/ | ||
public enum QuantizationType { | ||
/** | ||
* Represents space quantization, typically involving dimensionality reduction | ||
* or space partitioning techniques. | ||
*/ | ||
SPACE_QUANTIZATION, | ||
|
||
/** | ||
* Represents value quantization, typically involving the conversion of continuous | ||
* values into discrete ones. | ||
*/ | ||
VALUE_QUANTIZATION, | ||
} |
62 changes: 62 additions & 0 deletions
src/main/java/org/opensearch/knn/quantization/enums/SQTypes.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.enums; | ||
|
||
/** | ||
* The SQTypes enum defines the various scalar quantization types that can be used | ||
* in the k-NN plugin for vector quantization. | ||
* Each type corresponds to a different bit-width representation of the quantized values. | ||
*/ | ||
public enum SQTypes { | ||
/** | ||
* FP16 quantization uses 16-bit floating-point representation. | ||
* This type offers a good balance between range and precision. | ||
*/ | ||
FP16, | ||
|
||
/** | ||
* INT8 quantization uses 8-bit integer representation. | ||
* It is commonly used for efficient storage and processing. | ||
*/ | ||
INT8, | ||
|
||
/** | ||
* INT6 quantization uses 6-bit integer representation. | ||
* It provides lower precision than INT8 but requires less storage space. | ||
*/ | ||
INT6, | ||
|
||
/** | ||
* INT4 quantization uses 4-bit integer representation. | ||
* This type is suitable for highly compressed storage with significant loss of precision. | ||
*/ | ||
INT4, | ||
|
||
/** | ||
* ONE_BIT quantization uses a single bit per coordinate. | ||
* This type is the most compact, representing only two possible values per dimension. | ||
*/ | ||
ONE_BIT, | ||
|
||
/** | ||
* TWO_BIT quantization uses two bits per coordinate. | ||
* This type represents four possible values per dimension, offering a balance between compression and accuracy. | ||
*/ | ||
TWO_BIT, | ||
|
||
/** | ||
* FOUR_BIT quantization uses four bits per coordinate. | ||
* It allows for sixteen possible values per dimension, providing more detail than lower bit-widths. | ||
*/ | ||
FOUR_BIT, | ||
|
||
/** | ||
* UNSUPPORTED_TYPE is used to denote quantization types that are not supported. | ||
* This can be used as a placeholder or default value. | ||
*/ | ||
UNSUPPORTED_TYPE | ||
} | ||
|
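To make the bit-width descriptions above concrete, here is a minimal sketch of how many distinct values each width allows and how many bytes a packed vector would need. The `BitWidthSketch` enum and its methods are hypothetical and not part of this commit; the sketch assumes the quantized bits are packed contiguously and rounded up to a whole byte.

```java
// Hypothetical sketch, not part of the commit. Assumes quantized bits are
// packed contiguously and the total is rounded up to a whole byte.
enum BitWidthSketch {
    ONE_BIT(1),
    TWO_BIT(2),
    FOUR_BIT(4);

    private final int bitsPerCoordinate;

    BitWidthSketch(int bitsPerCoordinate) {
        this.bitsPerCoordinate = bitsPerCoordinate;
    }

    /** Number of distinct values one coordinate can take: 2^bits. */
    int levels() {
        return 1 << bitsPerCoordinate;
    }

    /** Bytes needed to store one quantized vector with the given dimension count. */
    int packedBytes(int dimensions) {
        if (dimensions < 0) {
            throw new IllegalArgumentException("dimensions must not be negative");
        }
        long totalBits = (long) dimensions * bitsPerCoordinate;
        return (int) ((totalBits + 7) / 8); // round up to a whole byte
    }
}
```

Under this packing assumption, a 128-dimensional vector that takes 512 bytes as 32-bit floats would take 16 bytes at one bit per coordinate, 32 bytes at two bits, and 64 bytes at four bits.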
21 changes: 21 additions & 0 deletions
src/main/java/org/opensearch/knn/quantization/enums/ValueQuantizationType.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.enums; | ||
|
||
/** | ||
* The ValueQuantizationType enum defines the types of value quantization techniques | ||
* that can be applied in the k-NN plugin. | ||
* These types represent different methodologies for quantizing the values of vectors. | ||
*/ | ||
public enum ValueQuantizationType { | ||
/** | ||
* SQ (Scalar Quantization) represents a method where each coordinate of the vector is quantized | ||
* independently. This technique is widely used for its simplicity and efficiency. | ||
*/ | ||
SQ | ||
} | ||
|
||
|
72 changes: 72 additions & 0 deletions
src/main/java/org/opensearch/knn/quantization/factory/QuantizerFactory.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.factory; | ||
|
||
import org.opensearch.knn.quantization.enums.QuantizationType; | ||
import org.opensearch.knn.quantization.enums.SQTypes; | ||
import org.opensearch.knn.quantization.models.quantizationParams.QuantizationParams; | ||
import org.opensearch.knn.quantization.models.quantizationParams.SQParams; | ||
import org.opensearch.knn.quantization.quantizer.MultiBitScalarQuantizer; | ||
import org.opensearch.knn.quantization.quantizer.OneBitScalarQuantizer; | ||
import org.opensearch.knn.quantization.quantizer.Quantizer; | ||
|
||
/** | ||
* The QuantizerFactory class is responsible for creating instances of {@link Quantizer} | ||
* based on the provided {@link QuantizationParams}. It uses a registry to look up the | ||
* appropriate quantizer implementation for the given quantization parameters. | ||
*/ | ||
public class QuantizerFactory { | ||
private static volatile boolean isRegistered = false; | ||
|
||
/** | ||
* Retrieves a quantizer instance based on the provided quantization parameters. | ||
* | ||
* @param params the quantization parameters used to determine the appropriate quantizer | ||
* @param <P> the type of quantization parameters, extending {@link QuantizationParams} | ||
* @param <Q> the type of the quantized output | ||
* @return an instance of {@link Quantizer} corresponding to the provided parameters | ||
*/ | ||
public static <P extends QuantizationParams, Q> Quantizer<P, Q> getQuantizer(P params) { | ||
if (params == null) { | ||
throw new IllegalArgumentException("Quantization parameters must not be null."); | ||
} | ||
// Register lazily on first use instead of in a class-level static block. | ||
if (!isRegistered) { | ||
registerDefaultQuantizers(); | ||
} | ||
return QuantizerRegistry.getQuantizer(params); | ||
} | ||
|
||
/** | ||
* Registers default quantizers if not already registered. | ||
*/ | ||
private static synchronized void registerDefaultQuantizers() { | ||
if (!isRegistered) { | ||
// Register OneBitScalarQuantizer for SQParams with VALUE_QUANTIZATION and SQTypes.ONE_BIT | ||
QuantizerRegistry.register( | ||
SQParams.class, | ||
QuantizationType.VALUE_QUANTIZATION, | ||
SQTypes.ONE_BIT, | ||
OneBitScalarQuantizer::new | ||
); | ||
// Register MultiBitScalarQuantizer for SQParams with VALUE_QUANTIZATION and 2 bits per coordinate | ||
QuantizerRegistry.register( | ||
SQParams.class, | ||
QuantizationType.VALUE_QUANTIZATION, | ||
SQTypes.TWO_BIT, | ||
() -> new MultiBitScalarQuantizer(2) | ||
); | ||
// Register MultiBitScalarQuantizer for SQParams with VALUE_QUANTIZATION and 4 bits per coordinate | ||
QuantizerRegistry.register( | ||
SQParams.class, | ||
QuantizationType.VALUE_QUANTIZATION, | ||
SQTypes.FOUR_BIT, | ||
() -> new MultiBitScalarQuantizer(4) | ||
); | ||
isRegistered = true; | ||
} | ||
} | ||
} |
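The lazy registration in `QuantizerFactory` follows the double-checked locking idiom: an unsynchronized read of a `volatile` flag, then a second check inside a `synchronized` method. The following is a minimal, self-contained sketch of that idiom only. `LazyRegistrationSketch`, its counter, and `callFromThreads` are hypothetical names used for illustration and do not appear in the commit.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of one-time lazy registration; not the commit's code.
final class LazyRegistrationSketch {
    // volatile so that a thread seeing 'true' also sees the completed registration work
    private static volatile boolean registered = false;

    // Counts how often the registration body actually ran (illustration only)
    static final AtomicInteger registrations = new AtomicInteger();

    static void ensureRegistered() {
        if (!registered) {          // fast path: no lock once registration is done
            registerOnce();
        }
    }

    private static synchronized void registerOnce() {
        if (!registered) {          // re-check under the lock
            registrations.incrementAndGet(); // stands in for the register(...) calls
            registered = true;      // set the flag only after the work is complete
        }
    }

    /** Calls ensureRegistered() from several threads and returns the run count. */
    static int callFromThreads(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(LazyRegistrationSketch::ensureRegistered);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return registrations.get();
    }
}
```

Setting the flag last matters: if it were set before the registration calls, a concurrent caller could skip registration and look up a quantizer that is not yet in the registry.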
67 changes: 67 additions & 0 deletions
src/main/java/org/opensearch/knn/quantization/factory/QuantizerRegistry.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.factory; | ||
|
||
import org.opensearch.knn.quantization.enums.QuantizationType; | ||
import org.opensearch.knn.quantization.enums.SQTypes; | ||
import org.opensearch.knn.quantization.models.quantizationParams.QuantizationParams; | ||
import org.opensearch.knn.quantization.quantizer.Quantizer; | ||
|
||
import java.util.Map; | ||
import java.util.concurrent.ConcurrentHashMap; | ||
import java.util.function.Supplier; | ||
|
||
/** | ||
* The QuantizerRegistry class is responsible for managing the registration and retrieval | ||
* of quantizer instances. Quantizers are registered with specific quantization parameters | ||
* and type identifiers, allowing for efficient lookup and instantiation. | ||
*/ | ||
class QuantizerRegistry { | ||
|
||
// Use ConcurrentHashMap for thread-safe access | ||
private static final Map<String, Supplier<? extends Quantizer<?, ?>>> registry = new ConcurrentHashMap<>(); | ||
|
||
/** | ||
* Registers a quantizer with the registry. | ||
* | ||
* @param paramClass the class of the quantization parameters | ||
* @param quantizationType the quantization type (e.g., VALUE_QUANTIZATION) | ||
* @param sqType the specific quantization subtype (e.g., ONE_BIT, TWO_BIT) | ||
* @param quantizerSupplier a supplier that provides instances of the quantizer | ||
* @param <P> the type of quantization parameters | ||
*/ | ||
public static <P extends QuantizationParams> void register(Class<P> paramClass, | ||
QuantizationType quantizationType, | ||
SQTypes sqType, | ||
Supplier<? extends Quantizer<?, ?>> quantizerSupplier) { | ||
String identifier = quantizationType.name() + "_" + sqType.name(); | ||
// Ensure that the quantizer for this identifier is registered only once | ||
registry.computeIfAbsent(identifier, key -> { | ||
return quantizerSupplier; | ||
}); | ||
} | ||
|
||
/** | ||
* Retrieves a quantizer instance based on the provided quantization parameters. | ||
* | ||
* @param params the quantization parameters used to determine the appropriate quantizer | ||
* @param <P> the type of quantization parameters | ||
* @param <Q> the type of the quantized output | ||
* @return an instance of {@link Quantizer} corresponding to the provided parameters | ||
* @throws IllegalArgumentException if no quantizer is registered for the given parameters | ||
*/ | ||
public static <P extends QuantizationParams, Q> Quantizer<P, Q> getQuantizer(P params) { | ||
String identifier = params.getTypeIdentifier(); | ||
Supplier<? extends Quantizer<?, ?>> supplier = registry.get(identifier); | ||
if (supplier == null) { | ||
throw new IllegalArgumentException("No quantizer registered for type identifier: " + identifier + | ||
". Available quantizers: " + registry.keySet()); | ||
} | ||
@SuppressWarnings("unchecked") | ||
Quantizer<P, Q> quantizer = (Quantizer<P, Q>) supplier.get(); | ||
return quantizer; | ||
} | ||
} |
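Note that `register` builds its key as `quantizationType.name() + "_" + sqType.name()`, while `getQuantizer` looks up `params.getTypeIdentifier()`, so the two strings have to agree for a lookup to succeed. The sketch below illustrates that contract with plain strings. `SupplierRegistrySketch` and its methods are hypothetical and simplified; they are not the commit's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical, simplified sketch of a supplier registry keyed by a type identifier.
final class SupplierRegistrySketch {
    private static final Map<String, Supplier<?>> REGISTRY = new ConcurrentHashMap<>();

    /** Builds the key the same way the registration side does: TYPE_SUBTYPE. */
    static String key(String quantizationType, String sqType) {
        return quantizationType + "_" + sqType;
    }

    /** First registration wins; returns true only if the key was newly registered. */
    static boolean register(String quantizationType, String sqType, Supplier<?> supplier) {
        return REGISTRY.putIfAbsent(key(quantizationType, sqType), supplier) == null;
    }

    /** Creates a fresh instance for the identifier, or fails if nothing is registered. */
    static Object create(String typeIdentifier) {
        Supplier<?> supplier = REGISTRY.get(typeIdentifier);
        if (supplier == null) {
            throw new IllegalArgumentException(
                "No quantizer registered for type identifier: " + typeIdentifier
                    + ". Available quantizers: " + REGISTRY.keySet());
        }
        return supplier.get();
    }
}
```

Because the registry stores suppliers rather than instances, every lookup calls `get()` on the supplier, which for a constructor reference yields a new object each time.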
31 changes: 31 additions & 0 deletions
...a/org/opensearch/knn/quantization/models/quantizationOutput/BinaryQuantizationOutput.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.models.quantizationOutput; | ||
|
||
/** | ||
* The BinaryQuantizationOutput class represents the output of a quantization process in binary format. | ||
* It implements the QuantizationOutput interface to handle byte arrays specifically. | ||
*/ | ||
public class BinaryQuantizationOutput implements QuantizationOutput<byte[]> { | ||
private final byte[] quantizedVector; | ||
|
||
/** | ||
* Constructs a BinaryQuantizationOutput instance with the specified quantized vector. | ||
* | ||
* @param quantizedVector the quantized vector represented as a byte array. | ||
*/ | ||
public BinaryQuantizationOutput(byte[] quantizedVector) { | ||
if (quantizedVector == null) { | ||
throw new IllegalArgumentException("Quantized vector cannot be null"); | ||
} | ||
this.quantizedVector = quantizedVector; | ||
} | ||
|
||
@Override | ||
public byte[] getQuantizedVector() { | ||
return quantizedVector; | ||
} | ||
} |
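`BinaryQuantizationOutput` only wraps a `byte[]`; how the bits get there is up to the quantizer, whose source is not shown in this part of the diff. As an illustration of one way a one-bit result could be packed, here is a sketch under stated assumptions: one threshold per dimension, a bit set when the value is strictly greater than its threshold, and most-significant-bit-first ordering. `OneBitPackingSketch` is a hypothetical name, and none of these choices are confirmed by the commit.

```java
// Hypothetical sketch of one-bit packing. The thresholds, the strict comparison,
// and the MSB-first bit order are assumptions, not taken from the commit.
final class OneBitPackingSketch {
    static byte[] quantize(float[] vector, float[] thresholds) {
        if (vector == null || thresholds == null || vector.length != thresholds.length) {
            throw new IllegalArgumentException("vector and thresholds must have the same length");
        }
        byte[] packed = new byte[(vector.length + 7) / 8]; // one bit per coordinate
        for (int i = 0; i < vector.length; i++) {
            if (vector[i] > thresholds[i]) {
                // Dimension i goes to byte i / 8, counting bits from the most significant
                packed[i / 8] |= (byte) (1 << (7 - (i % 8)));
            }
        }
        return packed;
    }
}
```

The resulting array could then be handed to `new BinaryQuantizationOutput(packed)`, whose constructor rejects `null`.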
22 changes: 22 additions & 0 deletions
...in/java/org/opensearch/knn/quantization/models/quantizationOutput/QuantizationOutput.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.models.quantizationOutput; | ||
|
||
/** | ||
* The QuantizationOutput interface defines the contract for quantization output data. | ||
* | ||
* @param <T> The type of the quantized data. | ||
*/ | ||
public interface QuantizationOutput<T> { | ||
/** | ||
* Returns the quantized vector. | ||
* | ||
* @return the quantized data. | ||
*/ | ||
T getQuantizedVector(); | ||
} | ||
|
||
|
39 changes: 39 additions & 0 deletions
...in/java/org/opensearch/knn/quantization/models/quantizationParams/QuantizationParams.java
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
/* | ||
* Copyright OpenSearch Contributors | ||
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
package org.opensearch.knn.quantization.models.quantizationParams; | ||
|
||
import org.opensearch.knn.quantization.enums.QuantizationType; | ||
|
||
import java.io.Serializable; | ||
|
||
/** | ||
* Interface for quantization parameters. | ||
* This interface defines the basic contract for all quantization parameter types. | ||
* It provides methods to retrieve the quantization type and a unique type identifier. | ||
* Implementations of this interface are expected to provide specific configurations | ||
* for various quantization strategies. | ||
*/ | ||
public interface QuantizationParams extends Serializable { | ||
|
||
/** | ||
* Gets the quantization type associated with the parameters. | ||
* The quantization type defines the overall strategy or method used | ||
* for quantization, such as VALUE_QUANTIZATION or SPACE_QUANTIZATION. | ||
* | ||
* @return the {@link QuantizationType} indicating the quantization method. | ||
*/ | ||
QuantizationType getQuantizationType(); | ||
|
||
/** | ||
* Provides a unique identifier for the quantization parameters. | ||
* This identifier is typically a combination of the quantization type | ||
* and additional specifics, and it serves to distinguish between different | ||
* configurations or modes of quantization. | ||
* | ||
* @return a string representing the unique type identifier. | ||
*/ | ||
String getTypeIdentifier(); | ||
} |
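A minimal sketch of what an implementation of this contract could look like follows. The `Sketch*` types are hypothetical stand-ins so the example compiles on its own; the commit's actual `SQParams` class is not shown in this part of the diff and may differ. The identifier format is an assumption chosen to match the `TYPE_SUBTYPE` key that `QuantizerRegistry.register` builds.

```java
import java.io.Serializable;

// Hypothetical stand-ins for the commit's enums and interface, for illustration only.
enum SketchQuantizationType { SPACE_QUANTIZATION, VALUE_QUANTIZATION }

enum SketchSQType { ONE_BIT, TWO_BIT, FOUR_BIT }

interface SketchQuantizationParams extends Serializable {
    SketchQuantizationType getQuantizationType();

    String getTypeIdentifier();
}

// Hypothetical scalar-quantization parameters; not the commit's SQParams.
final class SketchSQParams implements SketchQuantizationParams {
    private static final long serialVersionUID = 1L;

    private final SketchSQType sqType;

    SketchSQParams(SketchSQType sqType) {
        if (sqType == null) {
            throw new IllegalArgumentException("sqType must not be null");
        }
        this.sqType = sqType;
    }

    @Override
    public SketchQuantizationType getQuantizationType() {
        return SketchQuantizationType.VALUE_QUANTIZATION;
    }

    @Override
    public String getTypeIdentifier() {
        // Assumed format: quantization type name, underscore, subtype name
        return getQuantizationType().name() + "_" + sqType.name();
    }
}
```

With this format, `new SketchSQParams(SketchSQType.TWO_BIT).getTypeIdentifier()` yields `VALUE_QUANTIZATION_TWO_BIT`.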