Add back a few large GEMM tests #1219

Open: wants to merge 1 commit into base: develop


krzysz00
Collaborator

@krzysz00 krzysz00 commented Aug 25, 2023

I suspect that we're running into more OOM issues because of transB and transC and the like. So, move the large GEMM tests into their own file where only one or two tests will run, thus ensuring we don't overload the machines.

(Note: historically, we tried to add large GEMM tests to the usual GEMM test set and we'd get OOMs from parallel runs on memory-constrained machines.)

@krzysz00 krzysz00 force-pushed the move-nightly-large-tests branch from a948053 to 4b48a86 Compare October 17, 2023 15:11
@krzysz00 krzysz00 force-pushed the move-nightly-large-tests branch from 4b48a86 to d9e58c2 Compare November 3, 2023 15:21
@jerryyin jerryyin added the skip-ci Don't build Jenkins tests label Feb 20, 2024
@krzysz00 krzysz00 requested review from jerryyin and sjw36 as code owners April 26, 2024 16:42
@krzysz00 krzysz00 force-pushed the move-nightly-large-tests branch from d9e58c2 to f0fc9d9 Compare July 24, 2024 21:55
@krzysz00 krzysz00 force-pushed the move-nightly-large-tests branch from f0fc9d9 to acbf139 Compare September 11, 2024 20:12
@krzysz00 krzysz00 removed the skip-ci Don't build Jenkins tests label Sep 11, 2024
@krzysz00
Collaborator Author

Ping

@manupak
Contributor

manupak commented Sep 17, 2024

> I suspect that we're running into more OOM issues because of transB and transC and the like

Is this true now? I can't recall any issues like that... (unless it's something that happened last week while I was out)

> So, move large gemm tests into their own file where there'll only be one or two tests that run, thus ensuring we don't overload the machines.

If anything, this PR adds more tests and does not reduce the workload on the machines. So what is the actual motivation for adding the tests?

I'm not opposed to merging this in; it's just that the motivation written in the description is confusing.

@krzysz00
Collaborator Author

@manupak The motivation dates from when we were trying to add even more of these tests and they were causing CI hangs because the tests took so long.

@krzysz00 krzysz00 changed the title from "Further reduce large gemm tests" to "Add back a few large GEMM tests" Sep 18, 2024
@krzysz00
Collaborator Author

I have updated the description.

3 participants