- Overview
- Installing and running the CLI
- ARISE in action on sample data
- Running ARISE from the UI - Work in Progress
- More on data requirements
- Known tool issues
## Overview

AI Right Sizing Engine (ARISE) is a tool for predicting the required resources and execution time of an AI workload, based on historical executions or performance benchmarks of similar workloads (a workload dataset). ARISE is intended to support configuration decision-making for platform engineers and data scientists operating AI stacks.

ARISE parses and preprocesses the given workload dataset into a standard format, provides descriptive statistics, trains predictive models, and performs predictions based on those models. See the instructions for running the CLI below for details on the commands that invoke these operations. To use these commands, in addition to the workload dataset, you need to provide in your input path a `job_spec.yaml` file indicating the metadata inputs and outputs of your data. See this example of a job spec.
## Installing and running the CLI

- Clone the repo or download the codebase zip
- Install the CLI
To install the CLI in a virtual environment (the preferred installation mode, as it keeps the installation isolated and avoids version conflicts), run:

```
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Windows users should run:

```
python3 -m venv venv
venv\Scripts\activate.bat
pip install -r requirements.txt
```
To run the tests, from the project root directory:

```
python -m unittest -v
```

To see the log messages for failing tests, use the `--buffer` option (and use `tests.utils.logger_redirector` in your test case). See `tests/test_analyze.py` as an example.

```
python -m unittest -v --buffer
```

To run a single test case:

```
python -m unittest -v --buffer tests/test_build_models.py
```
There are five supported commands:
- `analyze-jobs` provides descriptive statistics on the metadata inputs (workload measurements) and generates a number of spreadsheets and plots in a subdirectory called `job-analysis`. The data should be provided in a folder called `data` in the given `input-path`. To use this command, you need to provide in your input path a `job_spec.yaml` file indicating the metadata inputs and outputs of your data. See this example of a job spec.

```
python -m arise_predictions.main analyze-jobs --input-path examples/MLCommons
```
It is also possible to specify the input data file explicitly:

```
python -m arise_predictions.main analyze-jobs --input-path examples/MLCommons --reread-history --input-file inference_data_tokens.csv --custom-job-name inference-thpt
```
In the above example, we also specify a custom job name. In this example dataset there is no column capturing the job id. If there were, we could provide it via the `--job-id-column` argument. With `--custom-job-name`, we instruct the code to insert such a column with the given job name as its values. This tends to improve the output of the descriptive job analysis (e.g., labels in plots).
- `auto-build-models` performs a hyperparameter search over the models and parameter space specified in a configuration file (cf. `config/default-auto-model-search-config.yaml`) and finds the best model and its hyperparameter settings for each target variable in the data. It attempts to build one best model per target variable in the metadata outputs, based on the metadata inputs. To use this command, you need to provide in your input path a `job_spec.yaml` file indicating the metadata inputs and outputs of your data. See this example of a job spec.
Example:

```
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --reread-history
```
The output models, their relative ranking, and the cross-validation results are all stored in a folder named `ARISE-auto-models`, which is created in the given input path. If you run with the flag `--single-output-file`, the models and results are archived into a single output file, `ARISE-auto-models.zip`, in the given input path.
If you do not specify an option for `--config-file`, the default `config/default-auto-model-search-config.yaml` is used. There is also a config file that defines a much smaller parameter search space and hence completes in a shorter time. You can use it like this:

```
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --reread-history --config-file config/small-auto-model-search-config.yaml
```
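A model search configuration typically names the candidate estimators and the hyperparameter grid to explore for each. The fragment below is a hypothetical sketch of that idea only; the estimator and parameter names are illustrative assumptions, and the authoritative schema is the shipped `config/default-auto-model-search-config.yaml`.

```yaml
# Hypothetical sketch -- not the shipped schema; see
# config/default-auto-model-search-config.yaml for the real format.
estimators:
  - name: RandomForestRegressor   # candidate model family (illustrative)
    hyperparameters:              # grid searched exhaustively (see known issues)
      n_estimators: [50, 100, 200]
      max_depth: [4, 8, 16]
```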
If you are running on your local machine, it is advised to limit the number of processors used; however, this will result in a much longer run. To build models using only 2 processors, use this command:

```
python -m arise_predictions.main --num-jobs 2 auto-build-models --input-path examples/MLCommons --reread-history --config-file config/small-auto-model-search-config.yaml
```
By default, `auto-build-models` performs 10-fold cross validation. If you want to perform leave-one-group-out (logo) cross validation instead, add the flag `--leave-one-out-cv`, which takes as an argument a list of one or more feature names to group by, separated by commas.

For example, the following command builds models using logo cross validation on the values of the LLM name. That is, in each iteration it uses a specific LLM as the test set and all other data as the training set.

```
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --leave-one-out-cv "Model MLC"
```
In addition to the above, you can also let `auto-build-models` search for models that are tuned for extrapolation. That is, you can let ARISE build a model that performs relatively well when asked to predict on inputs that are outside the range of values seen for a given feature during training. This is an experimental feature whose performance we expect to improve over time. Currently, only a single extrapolated feature is supported, and the training data needs a number of different data points or levels for this feature so that the model has an opportunity to learn to extrapolate.

You specify the name of the input feature on which to extrapolate (`--feature-column`) as well as a low and/or high threshold (`--low-threshold`, `--high-threshold`) which define the extrapolation region to train on. The thresholds should be chosen from within the range of values that exist in the training data, so that ARISE can define regions used for training and for testing the extrapolation performance of the resulting models. For example:

```
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --reread-history --feature-column "# of Accelerators" --high-threshold 8
```
- `predict` generates estimated values for metadata outputs given metadata input values. It should be run after the `auto-build-models` command and uses its output; the `--model-path` flag points to where the models created by `auto-build-models` are located. `predict` requires specifying a model name and an input space configuration. It generates the space of input features according to the configuration and uses the models previously built with `auto-build-models` to run predictions on this input space for the target variables indicated in the same configuration file. An example configuration file: `example-demo-mlcommons-config.yaml`.
In addition to feature values, the config file requires specifying the target variables for prediction. For each target variable, the boolean parameter `greater_is_better`, indicating whether higher values of the target are better, should be specified. `estimator_file` is an optional parameter providing the name of the model file to use for predictions of this target variable. If it is not provided, ARISE automatically uses the top-ranked model file according to the `auto-build-models` results, which should be located in the provided model path, next to the persisted model files.

```
python -m arise_predictions.main predict --input-path examples/MLCommons --config-file config/example-demo-mlcommons-config.yaml --model-path examples/MLCommons/ARISE-auto-models
```
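For reference, a hypothetical fragment of such a configuration, combining the keys described above (the `estimators` section name, `greater_is_better`, and `estimator_file` follow this document, but the target name, file name, and exact nesting are illustrative assumptions; see the linked example config for the actual layout):

```yaml
# Hypothetical fragment -- see config/example-demo-mlcommons-config.yaml
# for the actual layout.
estimators:
  - target_variable: Tokens per Second   # hypothetical target name
    greater_is_better: true              # required boolean, per this document
    estimator_file: best_model.pkl       # optional; defaults to the top-ranked model
```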
The input space defined by the configuration file and the ARISE predictions for each input combination in this space are stored in a folder named `ARISE-predictions`, which is created in the given input path.
- `demo-predict` is a version of `predict` that facilitates demos by ranking predictions and comparing predictions with ground truth where available. It also enables obtaining prediction values from the given data instead of specifying them explicitly. The `--input-path` should point to historic or benchmark input data so that `demo-predict` can compare predictions with the available ground truth (as far as is possible). The script needs the path to the directory containing the serialized models built by `auto-build-models`. Other parameters are taken from the configuration file.

```
python -m arise_predictions.main demo-predict --input-path examples/MLCommons --config-file config/example-demo-mlcommons-demo-predict-config.yaml --model-path examples/MLCommons/ARISE-auto-models
```
In addition to the outputs described for the `predict` command, `demo-predict` also creates a file named `predictions-with-ground-truth.csv`, containing the predicted versus ground truth values and the resulting MAPE error for any input combination in the defined input space that also appears in the given ground truth data.
Note that a different configuration file is used for `demo-predict` than the one used for `predict`. It includes a `data_values` list: rather than explicitly listing the values to be predicted, as in the `predict` configuration file, the values are taken from the given data. The `values` key can be either `all`, to take all values appearing in the data, or `min_max`, to take the entire range from the minimal to the maximal value appearing in the data (the latter is applicable to numeric inputs only). You can also specify a list of values to exclude from prediction. In our example, the values for `Accelerator` are taken from the data, the values for `# of Accelerators` spread from the minimal to the maximal value appearing in the data, and the case `# of Accelerators = 0` is excluded from the prediction space. If the same input also appears in the `variable_values` list, as is the case for `# of Accelerators`, the values explicitly specified there (`9` in our example) are added to the values derived from the data.
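A hypothetical fragment illustrating this mechanism (the `data_values`, `values`, and `variable_values` names follow this document, while the inner key names and nesting are assumptions; see the linked demo-predict example config for the actual layout):

```yaml
# Hypothetical sketch of the mechanism described above.
data_values:
  - column: Accelerator          # take all Accelerator values found in the data
    values: all
  - column: "# of Accelerators"  # span the min..max range found in the data
    values: min_max
    exclude: [0]                 # drop this value from the prediction space
variable_values:
  - column: "# of Accelerators"
    values: [9]                  # explicit values are added to the data-derived ones
```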
- `data-predict` is a version of `predict` that receives the prediction space directly as a dataframe (read from a csv file given in `--prediction-data-file`) instead of defining it via a configuration file. The configuration file is still provided, but only to specify the properties of the target variables (i.e., the `estimators` section). If an original data file is provided (in `--original-data-file`), ground truth is calculated by comparing the predicted outputs to the outputs that appear in it. If the flag `--delta-only` is provided along with the original data, predictions are performed only for input combinations that appear in the prediction file but not in the original data. This is useful, for example, if the original data is the training data and we want to predict only for input combinations unseen by the model during training.
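A hypothetical invocation assembled from the flags described above (the csv file names are illustrative):

```
python -m arise_predictions.main data-predict --input-path examples/MLCommons --config-file config/example-demo-mlcommons-config.yaml --model-path examples/MLCommons/ARISE-auto-models --prediction-data-file prediction-space.csv --original-data-file original-data.csv --delta-only
```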
The default log level is `DEBUG`. You can change it by specifying a different log level, as in the following example:

```
python -m arise_predictions.main --loglevel info analyze-jobs
```
## Running ARISE from the UI - Work in Progress

To run ARISE from the UI, see the documentation here. Note that the UI is still a work in progress and is missing many features that are available from the CLI.
## ARISE in action on sample data

To see ARISE in action on a sample dataset, go here.
## More on data requirements

The data consists of historical workload executions and/or performance benchmarks. Examples of potential workload properties that can be considered:

1. Input data size and data complexity-related properties
2. Hyperparameters
3. Workload task
4. LLM
5. GPU configuration
6. Total execution time
7. Throughput and latency
8. Consumed resources: number of workers, CPU, GPU, and memory per worker
9. Job status (success, fail/abort, etc.)
Example datasets can be found here and here.
The data is divided into `job-metadata-inputs`: the properties of the workload that are known before it starts running (e.g., items 1-5 above), and `job-metadata-outputs`: the properties of the workload execution and output that are known only once the workload completes (e.g., items 6-9 above). The inputs and outputs specification is provided in the `job_spec.yaml` file. See this example of a job spec.
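To make the structure concrete, here is a minimal illustrative sketch of a `job_spec.yaml`, using the key names described above and feature names from the MLCommons example; the output column name and the exact layout are assumptions, so follow the linked example job spec for the authoritative format.

```yaml
# Illustrative sketch only -- see the linked example job spec for the
# authoritative layout.
job-metadata-inputs:     # known before the workload runs
  - Model MLC
  - Accelerator
  - "# of Accelerators"
job-metadata-outputs:    # known only once the workload completes
  - Tokens per Second    # hypothetical output column name
```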
In your job spec, you can use the `job-entry-filter` key to filter out entries from the original data according to specific input values. In this example, we filter out all entries where the Processor is `2xAMD EPYC 9374F`, but we keep `Processor` as a data input. The semantics between the different entries specified in `job-entry-filter` is OR; that is, an entry matching any of the values specified will be filtered out.
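A hypothetical fragment showing the idea (the `job-entry-filter` key name and the filtered value follow this document, while the inner key names are assumptions; the linked example shows the actual structure):

```yaml
job-entry-filter:
  # Inner key names below are illustrative assumptions.
  - column: Processor
    value: 2xAMD EPYC 9374F   # entries matching any listed value are dropped (OR semantics)
```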
If the format of your data requires special parsing to transform it into a dataframe (i.e., beyond a simple csv file), you can implement your own parser in this class. For example, the sentiment analysis example (here) uses `SAJsonJobParser` as its parser, since its original data consists of a json file per workload execution. The name of your parser should be provided in the optional `job-parser-class-name` field in `job_spec.yaml`; see here.
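For instance, wiring in the parser from the sentiment analysis example would look roughly like this in `job_spec.yaml` (its placement within the file is an assumption; see the linked example):

```yaml
# Optional: name of a custom parser class for non-csv data.
job-parser-class-name: SAJsonJobParser
```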
## Known tool issues

- Currently, the tool uses exhaustive grid search for hyperparameter optimization (HPO). This may result in long run times for large datasets. We plan to move to a sample-based HPO that will scale the model search phase.
- Extrapolation is still a work in progress; hence, we currently expect large errors when predicting outputs for input values that are far beyond the range provided in the training dataset.