-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimum openvino #467
base: main
Are you sure you want to change the base?
optimum openvino #467
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Summary
This PR adds OpenVINO backend support to optimize model inference on Intel hardware, with changes spanning Docker configuration, embedder implementation, and utility functions.
- Added OpenVINO execution provider in
/libs/infinity_emb/infinity_emb/transformer/utils_optimum.py
with bf16 precision support - Implemented OpenVINO model file handling with new
get_openvino_files()
function - Set default
INFINITY_ENGINE="optimum"
in CPU Docker configuration with OpenVINO extras - Added
CHECK_OPTIMUM_INTEL
optional import for Intel optimization capabilities - Duplicate optimization config code in
optimize_model()
needs to be cleaned up
5 file(s) reviewed, 9 comment(s)
Edit PR Review Bot Settings | Greptile
@@ -20,6 +20,8 @@ cpu: | |||
extra_env_variables: | | |||
# Sets default to onnx | |||
ENV INFINITY_ENGINE="optimum" | |||
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Installing optimum[openvino] in extra_env_variables section is unconventional. Should be moved to main_install or extra_installs_main.
RUN ./requirements_install_from_poetry.sh --no-root --without lint,test "https://download.pytorch.org/whl/cpu" | ||
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Redundant installation of optimum[openvino] - this package is already included via EXTRAS='all openvino' in the environment variables
RUN ./requirements_install_from_poetry.sh --without lint,test "https://download.pytorch.org/whl/cpu" | ||
RUN poetry run python -m pip install --upgrade --upgrade-strategy eager "optimum[openvino]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Installing optimum[openvino] multiple times in different build stages may cause version conflicts or increase build time unnecessarily
except Exception as e: # show error then let the optimum intel compress on the fly | ||
print(str(e)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Silently printing errors and continuing is dangerous. Consider logging the error and/or raising a more specific exception if OpenVINO file loading fails.
) | ||
if provider == "OpenVINOExecutionProvider": | ||
CHECK_OPTIMUM_INTEL.mark_required() | ||
filename = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Empty filename could cause issues if exception occurs. Initialize with None instead to make the failure case more explicit.
else: # Optimum onnx cpu path | ||
optimizer = ORTOptimizer.from_pretrained(unoptimized_model) | ||
|
||
is_gpu = "cpu" not in execution_provider.lower() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Duplicate optimization config block. Remove lines 231-239 as they are identical to 222-230.
openvino_files = [p for p in repo_files if p.match(pattern)] | ||
|
||
if len(openvino_files) > 1: | ||
logger.info(f"Found {len(openvino_files)} onnx files: {openvino_files}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: Log message incorrectly refers to 'onnx files' when listing OpenVINO files
logger.info(f"Found {len(openvino_files)} onnx files: {openvino_files}") | |
logger.info(f"Found {len(openvino_files)} OpenVINO files: {openvino_files}") |
if files_optimized: | ||
file_optimized = files_optimized[-1] | ||
if file_name: | ||
file_optimized = file_name |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: Overwriting file_optimized with file_name could bypass optimization caching if file_name is set
ov_config={ | ||
"INFERENCE_PRECISION_HINT": "bf16" | ||
}, # fp16 for now as it has better precision than bf16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: Using bf16 precision by default may reduce accuracy compared to fp16/fp32 on some hardware
Codecov ReportAttention: Patch coverage is
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## main #467 +/- ##
==========================================
- Coverage 79.51% 78.74% -0.77%
==========================================
Files 41 41
Lines 3417 3468 +51
==========================================
+ Hits 2717 2731 +14
- Misses 700 737 +37 ☔ View full report in Codecov by Sentry. |
@tjtanaa FYI, continued by merging your branch into this and main.