coremltools 8.0b2

Pre-release

@jakesabathia2 released this 16 Aug 01:02 · 5e2460f
Release Notes

  • Support for Latest Dependencies
    • Compatible with the latest protobuf Python package, which improves serialization latency.
    • Compatible with numpy 2.0.
    • Supports scikit-learn 1.5.
  • New Core ML model utils
    • coremltools.models.utils.bisect_model can break a large Core ML model into two smaller models of similar size (see the example in the Appendix below).
    • coremltools.models.utils.materialize_dynamic_shape_mlmodel can convert a model with flexible input shapes into one with static input shapes.
  • New compression features in coremltools.optimize.coreml
    • Vector palettization: By setting cluster_dim > 1 in coremltools.optimize.coreml.OpPalettizerConfig, you can perform vector palettization, where each entry in the lookup table (LUT) is a vector of length cluster_dim (see the example in the Appendix below).
    • Palettization with per-channel scale: By setting enable_per_channel_scale=True in coremltools.optimize.coreml.OpPalettizerConfig, weights are normalized along the output channel using per-channel scales before being palettized.
    • Joint compression: A new pattern is supported, where weights are first quantized to int8 and then palettized into an n-bit LUT with int8 entries.
    • Support conversion of palettized models with an 8-bit LUT produced by coremltools.optimize.torch.
  • New compression features / bug fixes in coremltools.optimize.torch
    • Added conversion support for Torch models jointly compressed using the training-time APIs in coremltools.optimize.torch.
    • Added vector palettization support to SKMPalettizer.
    • Fixed a bug in the construction of weight vectors along the output channel for vector palettization with PostTrainingPalettizer and DKMPalettizer.
    • Deprecated the cluster_dtype option in favor of lut_dtype in ModuleDKMPalettizerConfig.
    • Added support for quantizing ConvTranspose modules with PostTrainingQuantizer and LinearQuantizer.
    • Added static grouping for the activation heuristic in GPTQ.
    • Fixed a bug in how quantization scales are computed for Conv2D layers with per-block quantization in GPTQ.
    • Can now perform activation-only quantization with the QAT APIs.
  • Experimental torch.export conversion support
    • Support conversion of stateful models with mutable buffers.
    • Support conversion of models with dynamic input shapes.
    • Support conversion of models with 4-bit weight compression.
  • Support new torch op: clip.
  • Various other bug fixes, enhancements, clean-ups, and optimizations.
  • Special thanks to our external contributors for this release: @dpanshu, @timsneath, @kasper0406, @lamtrinhdev, @valfrom

Appendix

  • Example code of converting a stateful torch.export model
import torch
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.register_buffer("state_1", torch.tensor([0.0, 0.0, 0.0]))

    def forward(self, x):
        # In place update of the model state
        self.state_1.mul_(x)
        return self.state_1 + 1.0

source_model = Model()
source_model.eval()

example_inputs = (torch.tensor([1.0, 2.0, 3.0]),)
exported_model = torch.export.export(source_model, example_inputs)
# Stateful models require the iOS18/macOS15 opset or newer.
coreml_model = ct.convert(exported_model, minimum_deployment_target=ct.target.iOS18)
  • Example code of converting a torch.export model with dynamic input shapes
import torch
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(3, 5)

    def forward(self, x):
        y = self.linear(x)
        return y

source_model = Model()
source_model.eval()

example_inputs = (torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),)
dynamic_shapes = {"x": {0: torch.export.Dim(name="batch_dim")}}
exported_model = torch.export.export(source_model, example_inputs, dynamic_shapes=dynamic_shapes)
coreml_model = ct.convert(exported_model)
  • Example code of converting a torch.export model with 4-bit weight compression
import torch
from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)
import coremltools as ct

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(3, 5)
    def forward(self, x):
        y = self.linear(x)
        return y

source_model = Model()
source_model.eval()

example_inputs = (torch.tensor([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]),)

pre_autograd_graph = capture_pre_autograd_graph(source_model, example_inputs)
# Use the signed 4-bit integer range [-8, 7] for the weights.
quantization_config = get_symmetric_quantization_config(weight_qmin=-8, weight_qmax=7)
quantizer = XNNPACKQuantizer().set_global(quantization_config)
prepared_graph = prepare_pt2e(pre_autograd_graph, quantizer)
converted_graph = convert_pt2e(prepared_graph)

exported_model = torch.export.export(converted_graph, example_inputs)
coreml_model = ct.convert(exported_model, minimum_deployment_target=ct.target.iOS17)
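  • Example code of splitting a large model with coremltools.models.utils.bisect_model (a minimal sketch; the model path, output directory, and the merge_chunks_to_pipeline flag are illustrative assumptions, not part of this release note)
import coremltools as ct

# Hypothetical large mlprogram model and output location.
model_path = "my_model.mlpackage"
output_dir = "./output/"

# Break the model into two chunks of roughly similar size, saved under output_dir.
ct.models.utils.bisect_model(model_path, output_dir)

# Optionally merge the resulting chunks back into a single pipeline model.
ct.models.utils.bisect_model(model_path, output_dir, merge_chunks_to_pipeline=True)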
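  • Example code of vector palettization with per-channel scale in coremltools.optimize.coreml (a minimal sketch; the input model, mode, and nbits values are illustrative assumptions)
import coremltools as ct
import coremltools.optimize as cto

# Hypothetical mlprogram model to compress.
mlmodel = ct.models.MLModel("my_model.mlpackage")

# cluster_dim=2 turns on vector palettization: each LUT entry is a vector of length 2.
# enable_per_channel_scale=True normalizes weights along the output channel with
# per-channel scales before palettization.
op_config = cto.coreml.OpPalettizerConfig(
    mode="kmeans",
    nbits=4,
    cluster_dim=2,
    enable_per_channel_scale=True,
)
config = cto.coreml.OptimizationConfig(global_config=op_config)
compressed_mlmodel = cto.coreml.palettize_weights(mlmodel, config)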
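  • Example code of vector palettization in coremltools.optimize.torch (a minimal sketch assuming the data-free PostTrainingPalettizer workflow; the toy model, n_bits, and cluster_dim values are illustrative)
import torch
from coremltools.optimize.torch.palettization import (
    PostTrainingPalettizer,
    PostTrainingPalettizerConfig,
)

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.linear = torch.nn.Linear(16, 32)

    def forward(self, x):
        return self.linear(x)

source_model = Model()
source_model.eval()

# cluster_dim=4 palettizes length-4 weight vectors instead of scalar weights.
config = PostTrainingPalettizerConfig.from_dict(
    {"global_config": {"n_bits": 4, "cluster_dim": 4}}
)
palettizer = PostTrainingPalettizer(source_model, config)
palettized_model = palettizer.compress()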