Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NPU] Add CLI option to directly pass reference files when running tests #28446

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

seowwj
Copy link

@seowwj seowwj commented Jan 15, 2025

Details:

  • Enables user to directly specify reference files to be used in test mode. If none is specified, the current default file scheme (<netname>_ref_out....blob) is used.

Tickets:

  • E#148130

@seowwj seowwj requested review from a team as code owners January 15, 2025 06:53
@github-actions github-actions bot added the category: NPU OpenVINO NPU plugin label Jan 15, 2025
@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Jan 15, 2025
@Maxim-Doronin Maxim-Doronin added good first issue Good for newcomers and removed ExternalPR External contributor labels Jan 15, 2025
@@ -127,6 +127,7 @@ DEFINE_string(
ref_dir,
"",
"A directory with reference blobs to compare with in run_test mode. Leave it empty to use the current folder.");
DEFINE_string(ref_results, "", "Reference files used during 'run_test'");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, specify input format in case of several inputs.
Is it --ref_results "file1;file2"?

Are relative paths supported? e.g. --ref_dir /part/of/path/ --ref_results rest/of/path/file.blob?

Are absolute paths supported? e.g. --ref_results /abs/path/file.blob

// Make sure that the number of reference files is the same as the number of input files
if (refFilesPerCase.size() != inputFilesPerCase.size()) {
std::cout << "The number of reference files is not equal to the number of input files."
<< " # Reference Files: " << refFilesPerCase.size()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please replace # with number of or # of

}

for (const auto& refResult : refFilesPerCase) {
std::vector<std::string> refFilesPerModel = splitStringList(refResult, ',');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIR, --network supports only a single model to be loaded. Please explain when we expect several models in a single SIT run.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that it is for batched input scenario when a whole row looks like CASE_0;CASE_1;etc
where
CASE_X is a run of SIT and which contains INPUT_0,INPUT_1,INPUT2,etc,
where
INPUT_0 encompasses data for single input in batch BATCH_0|BATCH_1|etc
which makes sense for separation input data per model

If my understanding is correct that here we need to produce tensors which are belong to output data types. In case of batch, though inputs pictures for every batch line are separated, outputs are printed into unified single file without separation of its lines. So probably we should not use this level with '|' as perFiles separator

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that it is for batched input scenario when a whole row looks like CASE_0;CASE_1;etc where CASE_X is a run of SIT and which contains INPUT_0,INPUT_1,INPUT2,etc, where INPUT_0 encompasses data for single input in batch BATCH_0|BATCH_1|etc which makes sense for separation input data per model

This is correct, the separators are mainly for delimiting if there are multiple expected outputs for the model.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but in order to leverage that "multiple expected outputs for the model" you need to write a special logic to split "sigle batched output" onto several non-batched outputs - in order to make it works. Without that logic implemented this "|" is confusing as it doesn't work properly

}

for (const auto& refResult : refFilesPerCase) {
std::vector<std::string> refFilesPerModel = splitStringList(refResult, ',');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that it is for batched input scenario when a whole row looks like CASE_0;CASE_1;etc
where
CASE_X is a run of SIT and which contains INPUT_0,INPUT_1,INPUT2,etc,
where
INPUT_0 encompasses data for single input in batch BATCH_0|BATCH_1|etc
which makes sense for separation input data per model

If my understanding is correct that here we need to produce tensors which are belong to output data types. In case of batch, though inputs pictures for every batch line are separated, outputs are printed into unified single file without separation of its lines. So probably we should not use this level with '|' as perFiles separator

if (!refFiles.empty()) {
blobFileFullPath = refFiles[numberOfTestCase][outputInd];
} else {
std::ostringstream ostr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick:
I'd rather move this into a separate function like 'generate_reference_file_name(...)', which will makes this huge code lesser overwhelming

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick 2:
I wondered whether forming the array refFiles initially regardless either ref_dir or res_output set and stick to it will without this condition checking refFiles.empty() would be more nice or not.
So that we could fill refFiles by using generate_reference_file_name() (because all data to generate is known in the beginning) when ref_dir exist or just parse command line of res_output

Please feel free to ignore this comment as it's arguable

const auto blobFileFullName = fullPath.string();
// If reference files are provided, use them
if (!refFiles.empty()) {
blobFileFullPath = refFiles[numberOfTestCase][outputInd];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found that we have already taken numberOfTestCase here https://github.com/openvinotoolkit/openvino/pull/28446/files#diff-32f65067f63ae981b516b3780122e243b0ed1684efd43992f00063f0d458716eR2118
when we had defined refFiles by itself. Probably when use remove '|' parsing level this second numberOfTestCase would not required anymore

: refFilesForOneInfer[numberOfTestCase];
if (!FLAGS_ref_results.empty()) {
OPENVINO_ASSERT(refFiles.size() == outputsInfo.size(), "Number of reference files ", refFiles.size(),
" doesn't match network configuration ", outputsInfo.size());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+100500 for printing troubleshooting info's!
I'd only suggest emphasize that it must match output count by amending this doesn't match network output count: ', outputsInfo.size()

std::cout << " Reference files direcotry: "
<< (FLAGS_ref_dir.empty() ? "Current directory" : FLAGS_ref_dir) << std::endl;
std::cout << " Reference files directory: "
<< (FLAGS_ref_dir.empty() && FLAGS_ref_results.empty() ? "Current directory" : FLAGS_ref_dir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from a context it sounds like these parameters are mutually exclusive so that I presume that printing a message and exiting with an error is more intentional

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that they are not mutually exclusive in this case.
Case 1: User provides a reference directory + relative file paths
Result: This line should show the reference directory given

Case 2: User only provides a reference directory
Result: This line should show the reference directory given

Case 3: Nothing provided
Result: This line should show "Current directory"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡Perhaps it would be even better to show the actual directory path instead of just "current directory"?

Copy link
Contributor

@sivanov-work sivanov-work Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's the question about UX: for me mostly as a UNIX user it's quite obvious that if the path starting from "/" it must be an absolute path, so that these parameters --ref_dir=A --ref_outputs=/B;/C mean that /B and /C are expected to be absolute paths rather then A/B and A/C. Thus, if we can configure either absolute or relative paths using only one parameter, instead of changing behavior of that parameter by looking at a value of another one parameter - it will be more natural in usage

this is IMHO, so that maybe it would be better to ask an issue reporter opinion

@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Jan 16, 2025
@seowwj seowwj changed the title Add CLI option to directly pass reference files when running tests [NPU] Add CLI option to directly pass reference files when running tests Jan 17, 2025
}

for (const auto& refResult : refFilesPerCase) {
std::vector<std::string> refFilesPerModel = splitStringList(refResult, ',');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but in order to leverage that "multiple expected outputs for the model" you need to write a special logic to split "sigle batched output" onto several non-batched outputs - in order to make it works. Without that logic implemented this "|" is confusing as it doesn't work properly

@@ -2084,6 +2145,12 @@ static int runSingleImageTest() {
const FilesForModelInputs &inputFiles = inputFilesForOneInfer[numberOfTestCase];
OPENVINO_ASSERT(inputFiles.size() == inputsInfo.size(), "Number of input files ", inputFiles.size(),
" doesn't match network configuration ", inputsInfo.size());
const RefFilesForModelOutputs &refFiles = refFilesForOneInfer.empty() ? RefFilesForModelOutputs{}
: refFilesForOneInfer[numberOfTestCase];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still didn't get why we are getting this refFiles as refFilesForOneInfer[numberOfTestCase]; and then getting
refFiles[numberOfTestCase] again inside a function getRefBlobFilePath()

I assume that there is the confusion regarding this "|" layer in parsing refFilesForOneInfer, which brings up this addition array layer

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the "|" layer as it may be less useful in model output.
Hence now the convention becomes: Ref1A, Ref1B; Ref2A, Ref2B

Ref1A, Ref1B will constitute the first test case (test case 0), with 2 outputs (1A and 1B)
Ref2A, Ref2B will constitute the second test case (test case 1), with 2 outputs (2A and 2B)

I've also cleared up the confusion in the help text. (semicolon = different test case, comma = different reference file within same test case)

Copy link
Contributor

@sivanov-work sivanov-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/plugins/intel_npu/tools/single-image-test/main.cpp Outdated Show resolved Hide resolved
Copy link

@MikhailPedus MikhailPedus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Maxim-Doronin Maxim-Doronin added this pull request to the merge queue Feb 12, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: NPU OpenVINO NPU plugin ExternalPR External contributor good first issue Good for newcomers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants