Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Google tests to obey MIOPEN_TEST_HALF etc #1843

Open
atamazov opened this issue Oct 24, 2022 · 10 comments
Open

Google tests to obey MIOPEN_TEST_HALF etc #1843

atamazov opened this issue Oct 24, 2022 · 10 comments

Comments

@atamazov
Copy link
Contributor

atamazov commented Oct 24, 2022

(1) When the tests are run under CTest's control (i.e. in our CI):

  • (1.1) The tests in gtest folder must obey the MIOPEN_TEST_HALF, MIOPEN_TEST_INT8, MIOPEN_TEST_BFLOAT16 etc CMAKE variables. For example FP16 tests should be run only if MIOPEN_TEST_HALF was set to On when the library has been configured.
  • (1.2) Ditto MIOPEN_TEST_ALL.
  • (1.3) The refactored test should obey SKIP_ALL_EXCEPT_TESTS just like the original test does. Explanation: [tests] Refactor cache test to gTest #1652 (comment)

(2) When the tests are run autonomously (in other words, "in standalone mode"),

  • (2.1) All the datatypes needs to be tested
  • (2.2) "Full Testing" should be performed

How to implement detection of "standalone mode":

Query MIOPEN_TEST_ALL to detect if the test is run under CTest. If the variable is unset, then assume the standalone mode. Otherwise, assume that the test is run under CTest's control.

Example of proper implementation of checks:

TEST_P(...)
{
    if(!(IsTestSupportedForDevice()                      //
         && (miopen::IsUnset(ENV(MIOPEN_TEST_ALL))       // standalone run
             || (miopen::IsEnabled(ENV(MIOPEN_TEST_ALL)) // or --half full tests enabled
                 && miopen::GetStringEnv(ENV(MIOPEN_TEST_FLOAT_ARG)) == "--half"))))
    {
        GTEST_SKIP();
    }

    Run....
};

@junliume https://github.com/ROCmSoftwarePlatform/MIOpen/labels/urgency_blocker

@JehandadKhan
Copy link
Collaborator

I do not agree with this design since it requires MIOpen run CMake again and again just to change the test cases which are run. Instead each of those precision can be a different test and be run directly. If manual running of tests requires filtering one over the other that can be done using the GTest filter feature which allows tests to run selectively.

I would like to point out that due to this "feature" part of the test information ( precision and in some tests even test cases ) live in the CMake file while the code that is actually run lives in the CPP file which to my mind is unnecessary complexity.

Furthermore, it can be seen that if I would like to change the precision of the tests to run I need to "rebuild" MIOpen ( since I have to run CMake again) which is again counter intuitive and the source of complexity of our CI pipeline.

Based on the above arguments I suggest we document this change in our test behavior and close this issue.

@atamazov
Copy link
Contributor Author

atamazov commented Nov 1, 2022

@JehandadKhan

I do not agree with this design since it requires MIOpen run CMake again and again just to change the test cases which are run...

However, this is our current design which we leverage in our CI in order to distribute the total testing work among nodes. If Google tests do not follow it, then the same testing work is unnecessarily repeated in different stages thus wasting CI resources. This is the main reason why this issue was opened.

Of course I do agree with many of your statements in previous comment. Current design has many drawbacks and may require redesigning. But this is out of scope of this ticket. I suggest opening another one, dedicated to redesign of our test suite, deciding the new design and then implement the new ticket gradually.

[Off-topic][Idea] For example, we can begin refactoring of our tests with moving all Google tests from our legacy CI stages to the new stages. As a result, the new tests will be run only once (per GPU type).

@atamazov
Copy link
Contributor Author

atamazov commented Nov 7, 2022

[Informative] Issue description updated with a link to #1652 (comment)

@JehandadKhan
Copy link
Collaborator

However, this is our current design which we leverage in our CI in order to distribute the total testing work among nodes. If Google tests do not follow it, then the same testing work is unnecessarily repeated in different stages thus wasting CI resources. This is the main reason why this issue was opened.

@atamazov Since it is our current design we have to live with it, however, that is not an argument to complicate a cleaner test architecture. This was done since we were taking an inordinate amount of time during testing which has been considerably reduced. I suggest instead of forcing the same decisions of the past on new things, we should go towards a direction where testing is simple and not dependent on the build environment. I suggest we survey the CMakeFiles.txt, Jenkinsfile and the tests to assess whether such a division of work is necessary anymore and reduce the number of stages in our CI pipeline.

@atamazov
Copy link
Contributor Author

atamazov commented Nov 8, 2022

@JehandadKhan

Since it is our current design we have to live with it, however, that is not an argument to complicate a cleaner test architecture.

I am not asking to complicate test architecture. My requirements are purely functional. The functionality may be implemented using simple and clean architecture or using entangled and complicate one -- it depends on many circumstances (let me omit description of those).

I suggest instead of forcing the same decisions of the past on new things

No problem because I am not forcing this. What I am trying to force is described below.

I suggest we survey the CMakeFiles.txt, Jenkinsfile and the tests to assess whether such a division of work is necessary...

I am second to this, of course. Let's discuss and possibly invent new architecture/design of the tests.

What I do not like is moving ahead of the new, well-judged and agreed design, which in this specific case leads to performance an quality regressions in our CI.

While the new design is not agreed, we can use legacy approach (which this ticket is about) or create new stages for the refactored tests (as I proposed here). When the new testing architecture is ready and agreed, we can begin transition to it. The old and new testing architecture will coexist during the transition period.

@atamazov
Copy link
Contributor Author

@JehandadKhan @junliume Please note that the tests that do not comply with the requirements listed in this ticket waste CI resources (time).

@atamazov
Copy link
Contributor Author

@CAHEK7 FYI another ticket important for tests

@atamazov
Copy link
Contributor Author

atamazov commented Dec 22, 2023

Description updated: More info about MIOPEN_TEST_FLOAT_ARG added.

@atamazov
Copy link
Contributor Author

Description updated: Added info about MIOPEN_TEST_ALL and SKIP_ALL_EXCEPT_TESTS. Added recommendation about preferred way of detection if the test is run under CTest control or not.

@atamazov
Copy link
Contributor Author

atamazov commented Apr 2, 2024

@junliume Can you please also add specification . Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants