Replies: 1 comment
-
A model with 2 million parameters is generally not considered large. Running it as a single ONNX module usually simplifies the inference pipeline, especially for smaller models where the overhead of splitting can outweigh the benefits. ONNX Runtime is designed to optimize inference performance, and its graph-level optimizations work best when the whole model is visible as one graph. A single ONNX module can also handle batch processing more efficiently, since it can optimize memory access patterns and computational resources without needing to synchronize multiple submodules.

With multiple ONNX submodules, if one module becomes a bottleneck (because it is more computationally intensive or poorly optimized), it slows down the entire inference pipeline. Separately, a computationally demanding model can monopolize CPU resources, reducing performance for other processes running on the same core and slowing overall system performance during inference.
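To make the trade-off concrete, here is a minimal sketch (pure NumPy standing in for ONNX Runtime, with assumed layer shapes) of a roughly 2M-parameter model run as one fused module versus two chained submodules. The split version has to materialize and hand off the intermediate tensor at the module boundary, which is exactly the per-call overhead described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ~2M-parameter MLP: 1000 -> 1000 -> 1000 (two 1000x1000 weight
# matrices, ~2M parameters total). Shapes are illustrative assumptions,
# not taken from the question.
W1 = rng.standard_normal((1000, 1000)).astype(np.float32)
W2 = rng.standard_normal((1000, 1000)).astype(np.float32)
x = rng.standard_normal((32, 1000)).astype(np.float32)  # batch of 32

def single_module(x):
    # One "module": both layers run back to back inside one graph,
    # so the runtime is free to fuse and plan memory across them.
    return np.maximum(x @ W1, 0) @ W2

def submodule_a(x):
    # First "submodule": first layer plus ReLU.
    return np.maximum(x @ W1, 0)

def submodule_b(h):
    # Second "submodule": final linear layer.
    return h @ W2

def split_modules(x):
    # Two "modules": the intermediate tensor h is materialized and
    # handed across the boundary, mimicking the per-call overhead of
    # chaining separate ONNX sessions.
    h = submodule_a(x)
    return submodule_b(h)

# Both pipelines compute the same function.
assert np.allclose(single_module(x), split_modules(x))
```

On the "monopolizing a core" concern: in ONNX Runtime itself you would normally address that by limiting threading through `SessionOptions` (for example setting `intra_op_num_threads` to 1) rather than by splitting the model.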
-
Say I have a model of ~2M parameters. What are the major pros/cons, in terms of inference speed on a single-core CPU, of running this model as one single ONNX module versus breaking it into smaller submodules?
Thanks! Looking forward to your insights.