Add migx ep fp8 int4 #78

Merged: 10 commits into rocm6.3_internal_testing on Jan 29, 2025

Conversation

TedThemistokleous

Description

Datatype support for int4 and fp8 (all formats)

Motivation and Context

Allows operators of these data types to be handled by the ONNX Runtime MIGraphX EP.

@TedThemistokleous added the enhancement label on Dec 4, 2024
@TedThemistokleous self-assigned this on Dec 4, 2024
@streamhsa force-pushed the add_migx_ep_fp8_int4 branch from c19fa76 to d1a2609 on December 25, 2024 03:52
@TedThemistokleous changed the title from "Add migx ep fp8 int4" to "Add migx ep fp8" on Jan 5, 2025
@TedThemistokleous changed the title from "Add migx ep fp8" to "Add migx ep fp8 int4" on Jan 5, 2025
@TedThemistokleous (Author) commented:

Rebasing off the ort_value changes used for the llama_V2 pipeline, to verify end to end with an int4 model.

@TedThemistokleous (Author) commented:

Merging this in to avoid blocking fp8-related items further. Will work out any additional int4 items as int4 verification occurs.

@TedThemistokleous merged commit 2117821 into rocm6.3_internal_testing on Jan 29, 2025
11 of 15 checks passed
TedThemistokleous added a commit that referenced this pull request Jan 29, 2025
* Add fp8 and int4 types in supported list for Onnxruntime EP
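
A rough Python-side illustration of the new coverage (the real change is to the EP's C++ supported-types list; the set below is inferred from "int4 and fp8 (all formats)" and requires onnx >= 1.16 for the INT4/UINT4 enums):

```python
from onnx import TensorProto

# Element types the EP should now accept, per the PR description.
# Illustrative only; the authoritative list lives in the EP source.
NEWLY_SUPPORTED = {
    TensorProto.INT4,
    TensorProto.UINT4,
    TensorProto.FLOAT8E4M3FN,
    TensorProto.FLOAT8E4M3FNUZ,
    TensorProto.FLOAT8E5M2,
    TensorProto.FLOAT8E5M2FNUZ,
}
```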

* Add support for int4 inputs

Map things to int8 for now, since we don't explicitly set an int4 input type, and pack/unpack int4 operands.
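
A minimal numpy sketch of that pack/unpack step, assuming ONNX's INT4 layout (two signed 4-bit values per byte, low nibble first, two's complement):

```python
import numpy as np

def unpack_int4_to_int8(packed: np.ndarray, n: int) -> np.ndarray:
    """Unpack n signed int4 values (two per byte, low nibble first) into int8."""
    low = (packed & 0x0F).astype(np.int8)
    high = ((packed >> 4) & 0x0F).astype(np.int8)
    nibbles = np.empty(packed.size * 2, dtype=np.int8)
    nibbles[0::2] = low
    nibbles[1::2] = high
    # Sign-extend 4-bit two's complement (raw 8..15 map to -8..-1).
    return np.where(nibbles >= 8, nibbles - 16, nibbles)[:n]

def pack_int8_to_int4(values: np.ndarray) -> np.ndarray:
    """Pack int8 values in [-8, 7] back into two-per-byte int4 storage."""
    v = (values.astype(np.int8) & 0x0F).astype(np.uint8)
    if v.size % 2:  # pad odd element counts with a zero nibble
        v = np.append(v, np.uint8(0))
    return (v[0::2] | (v[1::2] << 4)).astype(np.uint8)
```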

* Add flag to allow for fp8 quantization through Onnxruntime API
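
From the Python side this would presumably surface as a provider option next to the existing fp16/int8 flags; the name migraphx_fp8_enable below is an assumption by analogy with migraphx_int8_enable, not a confirmed option name:

```python
import onnxruntime as ort

# "migraphx_fp8_enable" is assumed by analogy with the documented
# "migraphx_fp16_enable" / "migraphx_int8_enable" provider options.
session = ort.InferenceSession(
    "model.onnx",  # hypothetical model path
    providers=["MIGraphXExecutionProvider"],
    provider_options=[{"device_id": "0", "migraphx_fp8_enable": "1"}],
)
```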

* Add fp8 quantization to the compile stage of the MIGraphX EP

Mirror the same calibration code we use for int8 and just change which quantize we call through the MIGraphX API
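
Roughly, in terms of MIGraphX's Python bindings (quantize_int8 is an existing entry point; the quantize_fp8 counterpart is assumed here to mirror it, per the commit text):

```python
import migraphx

def quantize_and_compile(onnx_path, calib_data, use_fp8):
    prog = migraphx.parse_onnx(onnx_path)
    target = migraphx.get_target("gpu")
    # Identical calibration path either way; only the quantize call differs.
    if use_fp8:
        migraphx.quantize_fp8(prog, target, calib_data)  # assumed binding
    else:
        migraphx.quantize_int8(prog, target, calib_data)
    prog.compile(target)
    return prog
```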

* Clean up logging

* Cleanup and encapsulate quantization / compile functions

- Add additional flags for fp8 that are shared with int8

- Add a lockout warning message when int8 and fp8 are used at the same time (see the sketch after this list)
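
A minimal sketch of that lockout, assuming int8 wins when both flags are set (which mode takes precedence is an assumption):

```python
import logging

def resolve_quant_mode(int8_enable: bool, fp8_enable: bool) -> str:
    """int8 and fp8 quantization are mutually exclusive; warn if both are requested."""
    if int8_enable and fp8_enable:
        logging.warning(
            "MIGraphX EP: int8 and fp8 quantization requested together; "
            "they are mutually exclusive. Falling back to int8."
        )
        return "int8"  # assumed fallback choice
    if fp8_enable:
        return "fp8"
    return "int8" if int8_enable else "none"
```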

* Run lintrunner pass

* Fix session options inputs + add better logging.

Previous runs using session options failed because we were missing pulling in inputs from the python interface. This, plus additional logging, allowed me to track which options were invoked via env and which were added during the start of an inference session.
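
One way to check from Python that session-level options actually reach the EP (get_provider_options is part of the public ORT API; the model path is hypothetical):

```python
import onnxruntime as ort

session = ort.InferenceSession(
    "model_int4.onnx",  # hypothetical model path
    providers=[("MIGraphXExecutionProvider", {"device_id": "0"})],
)

# Shows which options the EP actually picked up, making it easy to spot
# python-side inputs that were silently dropped.
print(session.get_provider_options()["MIGraphXExecutionProvider"])
```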

* Fix naming for save/load path variables to be consistent with enable.

* Print only env variables that are set as warnings

Need this so the user knows which environment variables are set in the background, to ensure proper consistency between runs.
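
The idea in Python form (the variable names are modeled on the EP's ORT_MIGRAPHX_* environment variables; the exact set is an assumption):

```python
import os

ENV_VARS = [
    "ORT_MIGRAPHX_FP16_ENABLE",
    "ORT_MIGRAPHX_INT8_ENABLE",
    "ORT_MIGRAPHX_FP8_ENABLE",  # assumed name for the new flag
    "ORT_MIGRAPHX_SAVE_COMPILED_MODEL",
    "ORT_MIGRAPHX_LOAD_COMPILED_MODEL",
]

def warn_set_env_vars():
    # Warn only about variables that are actually set, so the user can see
    # exactly which background overrides are active for this run.
    for name in ENV_VARS:
        value = os.environ.get(name)
        if value is not None:
            print(f"[MIGraphX EP] warning: {name}={value} overrides session options")
```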

---------

Co-authored-by: Ted Themistokleous <tedthemistokleous@amd.com>