AD global indexing sparse container is slower #17674
Comments
@hugary1995 what size container(s) were you using for these?
I believe it's 50, because I don't recall changing the default. Unless the default AD container size changed between these commits.
👍 (it hasn't changed)
With size 75 containers, taking 5 time steps, I see 28 seconds for global sparse vs 32 seconds for local nonsparse. I will try with size 50.
Hmm, going down to 50, the nonsparse container definitely improves significantly (as it should), but it doesn't appear to be significantly faster than sparse. Children of
I will try running the full
Yeah, I was about to say that. I believe the plasticity model starts to play a role later on, where there is a Newton loop per qp while evaluating the material model.
OK, with the full run on my machine I see 311 s in the Jacobian for sparse and 276 s for nonsparse, so that's about a 13% difference.
Sadly, I just don't see much room for optimization in our most expensive function
@roystgnr do you see anything you would do differently?
Yeah, I think 13% isn't enough of a motivation. Could the performance of the AD container be machine-dependent? I remember seeing a bigger difference. Probably I messed something up.
No, I doubt you messed anything up. It could be that I had more chatter on my machine during the nonsparse profiling than you did, or than I had during the sparse profiling.
We could try to special-case the "sparsity patterns are identical" use case; the potential speedup there might make up for the extra overhead in real unions. We could also try to turn some of these manual loops into STL algorithm invocations, and hope that we're on systems that can do better vectorization or assembly or whatever with those. I'm not optimistic about either idea.
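For anyone following along, here is a minimal sketch of what those two ideas could look like together. It is not the actual AD container code: `SparseDerivs` and `add_in_place` are hypothetical names, and the sketch assumes derivatives are stored as a sorted, unique list of global indices with a parallel list of values. The fast path uses `std::transform` when the two index lists are identical; otherwise it falls back to a manual sorted-union merge.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sparse derivative container (an assumption
// for illustration, not the real AD container class).
struct SparseDerivs
{
  std::vector<std::size_t> indices; // sorted, unique global dof indices
  std::vector<double> values;       // values[i] belongs to indices[i]
};

// Computes a += b.
void add_in_place(SparseDerivs & a, const SparseDerivs & b)
{
  assert(a.indices.size() == a.values.size());
  assert(b.indices.size() == b.values.size());

  // Fast path: identical sparsity patterns, so no index union is needed.
  // The comparison itself is O(n), which is the extra overhead paid
  // whenever the patterns turn out to differ.
  if (a.indices == b.indices)
  {
    std::transform(a.values.begin(), a.values.end(), b.values.begin(),
                   a.values.begin(), std::plus<double>());
    return;
  }

  // General path: union of two sorted index lists, summing on overlap.
  SparseDerivs out;
  out.indices.reserve(a.indices.size() + b.indices.size());
  out.values.reserve(a.indices.size() + b.indices.size());

  std::size_t i = 0, j = 0;
  while (i < a.indices.size() && j < b.indices.size())
  {
    if (a.indices[i] < b.indices[j])
    {
      out.indices.push_back(a.indices[i]);
      out.values.push_back(a.values[i++]);
    }
    else if (b.indices[j] < a.indices[i])
    {
      out.indices.push_back(b.indices[j]);
      out.values.push_back(b.values[j++]);
    }
    else
    {
      out.indices.push_back(a.indices[i]);
      out.values.push_back(a.values[i++] + b.values[j++]);
    }
  }
  // Copy whatever remains of either operand.
  for (; i < a.indices.size(); ++i)
  {
    out.indices.push_back(a.indices[i]);
    out.values.push_back(a.values[i]);
  }
  for (; j < b.indices.size(); ++j)
  {
    out.indices.push_back(b.indices[j]);
    out.values.push_back(b.values[j]);
  }
  a = std::move(out);
}
```

Whether this pays off depends on how often the operands really share a pattern; if they rarely do, the up-front comparison is pure overhead on top of the union, which matches the pessimism above.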
Reason
See discussion #17671. For a specific problem, global sparse is slower than local nonsparse.
For the record, I am attaching the input file here:
The csv file and the mesh file are in this archive:
Archive.zip
Design
Have @lindsayad figure out what's going on.
Impact
Speed!