2d, 5th order Nédélec (first kind) and Raviart-Thomas FEs #3884

nmnobre · 2024-06-20T12:00:48Z

Hi Roy,

The good:

The bad:

46% slowdown in compute_shape_functions() and 40% in init_shape_functions() when running vector fe ex3 with 1st order Nédélec elements.
for i in {1..5}; do ./example-opt element_type=TRI6 -pc_type lu grid_size=150 | grep "| FE " -A2; done

BEFORE (196a0a4):

| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0584      0.000000    0.0584      0.000000    5.14     5.14     |
|   init_shape_functions()           181200     0.0567      0.000000    0.0567      0.000000    4.98     4.98     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0599      0.000000    0.0599      0.000000    5.26     5.26     |
|   init_shape_functions()           181200     0.0570      0.000000    0.0570      0.000000    5.01     5.01     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0591      0.000000    0.0591      0.000000    5.15     5.15     |
|   init_shape_functions()           181200     0.0567      0.000000    0.0567      0.000000    4.95     4.95     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0611      0.000000    0.0611      0.000000    5.32     5.32     |
|   init_shape_functions()           181200     0.0566      0.000000    0.0566      0.000000    4.93     4.93     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0598      0.000000    0.0598      0.000000    5.24     5.24     |
|   init_shape_functions()           181200     0.0567      0.000000    0.0567      0.000000    4.98     4.98     |

AFTER (c659c2b):

| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0865      0.000000    0.0865      0.000000    7.31     7.31     |
|   init_shape_functions()           181200     0.0809      0.000000    0.0809      0.000000    6.84     6.84     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0861      0.000000    0.0861      0.000000    7.19     7.19     |
|   init_shape_functions()           181200     0.0795      0.000000    0.0795      0.000000    6.64     6.64     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0867      0.000000    0.0867      0.000000    7.36     7.36     |
|   init_shape_functions()           181200     0.0798      0.000000    0.0798      0.000000    6.77     6.77     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0862      0.000000    0.0862      0.000000    7.24     7.24     |
|   init_shape_functions()           181200     0.0794      0.000000    0.0794      0.000000    6.67     6.67     |
| FE                                                                                                              |
|   compute_shape_functions()        181200     0.0853      0.000000    0.0853      0.000000    7.18     7.18     |
|   init_shape_functions()           181200     0.0801      0.000000    0.0801      0.000000    6.75     6.75     |

The ugly:

No support for non-conforming interfaces and the shape functions are still explicitly hard-coded. On the bright side, all other code should support arbitrary order Nédélec elements. In practice, this means that, in the future, adding support for 3rd order shape functions would literally 'only' require writing those functions and their derivatives.

moosebuild · 2024-06-20T23:21:49Z

Job Coverage on 9b88378 wanted to post the following:

Coverage

	b76c44	#3884 9b8837
	Total	Total	+/-	New
Rate	62.84%	61.35%	-1.49%	8.02%
Hits	69469	69517	+48	243
Misses	41082	43797	+2715	2786

Diff coverage report

Full coverage report

Warnings

New new line coverage rate 8.02% is less than the suggested 90.0%

This comment will be updated on new commits.

Co-authored-by: Nuno Nobre <nuno.nobre@stfc.ac.uk>

roystgnr · 2024-06-27T22:16:18Z

Is this ready to review? I've been busy, and currently the recent force pushes make me cautious, but if I don't get to it in between vacation prep work tomorrow then I'm definitely not getting to it until I get back.

nmnobre · 2024-06-28T09:45:55Z

Is this ready to review? I've been busy, and currently the recent force pushes make me cautious, but if I don't get to it in between vacation prep work tomorrow then I'm definitely not getting to it until I get back.

There are a couple things I want to do:

Try using Elem::node_id instead of Elem::point to recover some of the performance loss but, wickedly, this isn't working for quads (yet), even though it is for tris.
EDIT: Ah, of course, it's caching in FE::reinit()! In that function we avoid reinitialising the shape functions even if shapes_need_reinit() is true by looking at the geometrical similarity between consecutive elements, and so we must also check orientation based on geometrical features. In this way, the same geometry equates to the same orientation. This is obviously not true if we compute orientation based on node ids, which only depend on how we happen to count nodes. So now the question is if caching is a worthwhile optimisation in the general case, i.e. everything which is not a structured quad/hex grid? Are these common in libMesh/MOOSE use cases?
Add support for 2d, 2nd order RT at the cost of the ND1 definitions, which is working for tris for both 1st and 2nd orders, and for quads ~~but (still) only for 1st order~~.
EDIT: Turns out I was just not using the correct space for the scalar variable in ex6, it's looking good now:

I know you've been quite busy so I just kept on working here. If you review this, I'll simply open another PR when I get somewhere with these two items, otherwise I'm happy to continue adding here. I'm happy either way, enjoy your vacation! :)

Okay Roy, I'm happy with this now. We can always transfer the discussion around 1) to #3874. Ready for review when you are. Enjoy your vacation! :)

roystgnr · 2024-07-10T21:07:40Z

This is obviously not true if we compute orientation based on node ids, which only depend on how we happen to count nodes.

And on how we happen to recount nodes. I'd love to have an option someday to compute based on Node::unique_id(), which we preserve through hell or high water, but "if we just coarsened away some elements, let's renumber every id() to get rid of the gaps" is a default libMesh setting.

So now the question is if caching is a worthwhile optimisation in the general case, i.e. everything which is not a structured quad/hex grid?

In the general case, no, and it might be worthwhile to add an option to FE to disable caching entirely upon user request.

In a more-general-than-structured-grids case, yes. There are a lot of FEM users in general who use immersed boundary methods or who use grid generation schemes based on an overset structured grid or grids. In the former case you don't have a structured grid topology but you have lots of identical geometry that benefits from caching; in the latter case you have a lot of weirdly shaped elements but they're around a core of identical quads or hexes. I don't know if anyone is currently doing these with libMesh, but they're ideas that seem to keep resurfacing, usually with good reasons.

Are [structured grids] common in libMesh/MOOSE use cases?

Yes, surprisingly; you can go quite far with geometries captured via phase field methods ... perhaps not on a structured grid but at least on an adaptively refined grid whose coarse level is structured. I did a lot of that in my PhD work (which is where the caching work came from), and I was just at a little conference this spring with a lot of MOOSE users doing that for subsurface modeling.

nmnobre · 2024-07-10T23:40:13Z

You make good points, so let's keep things as they are. :)

Actually... I wanted to convince myself that what I said here,

In practice, this means that, in the future, adding support for 3rd order shape functions would literally 'only' require writing those functions and their derivatives.

is in fact true, so I went ahead and implemented the 3rd order case... :P

As a matter of fact, I may or may not have written a little script that generates the (code for the) shape functions (and their derivatives) for arbitrary order, https://github.com/farscape-project/libMeshGen. It uses symfem and it's only 2d for now, but the objective is to generalise to 3d of course. So, long story short, we can add any order we want in a matter of minutes. :)

I think for our use cases 3rd order suffices, so I'd probably leave it there for now? Also, vector ex6 requires L2_LAGRANGE elements of an order below the RT order, meaning that we can't go to 4th order RT in that example anyway.

What's your take? What order is good enough? I can then add those commits to this PR and then we can finally review/merge. :)

roystgnr · 2024-07-11T21:12:54Z

Oh that's beautiful.

I'm of the general opinion that we should add as many orders as floating point arithmetic lets us get away with; in FEInterface I set unlimited = 11 because that was the point where even in 1D I found double-precision error swamping discretization error in p-refined meshes. :-D

But that was a cap for elements where we have code that handles arbitrary p. What degree we want to hand-code is another question, and I'm not sure we have a settled answer. With fe_hierarchic_shape_1D.C we wrote hand-coded shapes up to p=7 and only go to default: for 8 and up; with fe_hermite_shape_1D.C the default: is 6 and up, and I'm sure the explanation wasn't "Hermite is relatively more efficient in the parameterized case" so much as "Roy is lazier than Ben and John".

If you can't get a parametric solution out of symfem (it does get tricky outside 1D...) then we don't need to go up far. H(curl) stuff is where I've seen hp-adaptivity people go wild with p, but that could be because EM problems are inherently smoother in most domains or because these were academic researchers who tend to solve problems on simpler domains. I don't think I've personally done anything serious past p=4, and sometimes I feel like half of our MOOSE users would use p=-1 if that was an option, so even p=3 is a great start.

Does vector ex6 really need L2_LAGRANGE? We're using MONOMIAL for p there now, and that goes up indefinitely. If we need a bi/tri linear space that goes up indefinitely then we've got L2_HIERARCHIC available too.

nmnobre · 2024-07-11T23:45:29Z

and sometimes I feel like half of our MOOSE users would use p=-1 if that was an option

😆

Does vector ex6 really need L2_LAGRANGE? We're using MONOMIAL for p there now

Probably not... MONOMIAL is enough for triangles because it spans the same space as L2_LAGRANGE, but that's not true for quads, thus why I switched to L2_LAGRANGE... Even then, I spent most part of yesterday's afternoon trying to figure out why my convergence plots weren't showing the right convergence rate... Turns out I was using QUAD8s, not QUAD9s, so I was silently falling into the serendipity case...

I didn't even think of L2_HIERARCHIC, I'll give that a try tomorrow. If it spans the same space as L2_LAGRANGE, I don't see why it won't work.

nmnobre · 2024-07-13T23:21:41Z

Good shout, L2_HIERARCHIC works just fine! I didn't try to understand why, but I can't use L2_HIERARCHIC for CONSTANT order for some reason... I mean it runs, but the errors are higher, and the error on the scalar field doesn't converge.
EDIT: it's by design, only SIDE_HIERARCHIC covers the CONSTANT case:

libmesh/src/fe/fe_l2_hierarchic.C

Line 103 in 7579290

libmesh_assert_greater (o, 0);

I didn't add these higher orders to either run.sh nor to either .html, there's only so much we can add before it becomes too crowded, but let me know what you think.

Other than that, I'm struggling to converge the high-order, highly-refined configurations (the situation improved slightly with the latest commit), though I think I've got enough evidence the implementation is correct:

EDIT: I've rerun the 2nd order case on QUADs in single precision and marked both $\sqrt{\epsilon_\mathrm{single}}$ and $\sqrt{\epsilon_\mathrm{double}}$ with red lines. I'm pretty convinced the issues are due to FP precision limits, though I'm not quite sure I understand why we saturate at the sqrt, maybe related to the l2 norm, but not sure...

nmnobre force-pushed the nd_2d_2nd branch from d2833f5 to c659c2b Compare June 20, 2024 20:09

nmnobre added 2 commits June 26, 2024 13:40

Consolidate p-refinement inconsistency in FEInterface

f1da298

Fix bug in nedelec_one_nodal_soln

6d84463

nmnobre force-pushed the nd_2d_2nd branch from c659c2b to 3c8f9b3 Compare June 26, 2024 12:42

nmnobre and others added 5 commits June 26, 2024 13:43

Allow 2d, 2nd order Nédélec FEs

6a000c8

Provision for high order 2d Nédélec shape functions

906461c

Add 2d, 2nd order Nédélec shape functions

93a2b21

Co-authored-by: Nuno Nobre <nuno.nobre@stfc.ac.uk>

Provision Vector FE Ex3 for high order Nédélec FEs

9ef6586

Run Vector FE Ex3 with 2nd order Nédélec FEs

f42ed32

nmnobre force-pushed the nd_2d_2nd branch from 3c8f9b3 to f42ed32 Compare June 26, 2024 12:43

nmnobre added 3 commits June 26, 2024 13:47

Allow 2d, 2nd order Raviart-Thomas FEs

4762f1e

Provision for high order Raviart-Thomas FEs

29d778e

Define the 2d RT shape functions as rotations of ND1

c430b9f

nmnobre changed the title ~~2d, 2nd order Nédélec FEs~~ 2d, 2nd order Nédélec (first kind) and Raviart-Thomas FEs Jun 28, 2024

nmnobre added 2 commits July 10, 2024 12:04

Provision Vector FE Ex6 for high order Raviart-Thomas FEs

46ea397

Run Vector FE Ex6 with 2d, 2nd order Raviart-Thomas FEs

2ce9faf

nmnobre force-pushed the nd_2d_2nd branch from 92c78db to 2ce9faf Compare July 10, 2024 12:54

Nitpick some code comments

b2f26de

Add 2d, 3rd order Nédélec shape functions

3d5f4c6

nmnobre added 4 commits July 13, 2024 16:28

Add 2d, 4th and 5th order Nédélec shape functions

bcae4d3

Allow 2d, 5th order ND1 and RT FEs

2f416ce

Provision Vector FE Ex3 for 5th order Nédélec FEs

5e8c4e0

Provision Vector FE Ex6 for 5th order Raviart-Thomas FEs

80f135d

Switch to L2_HIERARCHIC for the scalar field in Vector FE Ex6

98d9ace

nmnobre changed the title ~~2d, 2nd order Nédélec (first kind) and Raviart-Thomas FEs~~ 2d, 5th order Nédélec (first kind) and Raviart-Thomas FEs Jul 13, 2024

Get rid of long floats as coefficients

9b88378

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

2d, 5th order Nédélec (first kind) and Raviart-Thomas FEs #3884

2d, 5th order Nédélec (first kind) and Raviart-Thomas FEs #3884

nmnobre commented Jun 20, 2024 •

edited

Loading

moosebuild commented Jun 20, 2024 •

edited

Loading

roystgnr commented Jun 27, 2024

nmnobre commented Jun 28, 2024 •

edited

Loading

roystgnr commented Jul 10, 2024

nmnobre commented Jul 10, 2024 •

edited

Loading

roystgnr commented Jul 11, 2024

nmnobre commented Jul 11, 2024 •

edited

Loading

nmnobre commented Jul 13, 2024 •

edited

Loading

2d, 5th order Nédélec (first kind) and Raviart-Thomas FEs #3884

Are you sure you want to change the base?

2d, 5th order Nédélec (first kind) and Raviart-Thomas FEs #3884

Conversation

nmnobre commented Jun 20, 2024 • edited Loading

The good:

The bad:

The ugly:

moosebuild commented Jun 20, 2024 • edited Loading

Coverage

Warnings

roystgnr commented Jun 27, 2024

nmnobre commented Jun 28, 2024 • edited Loading

roystgnr commented Jul 10, 2024

nmnobre commented Jul 10, 2024 • edited Loading

roystgnr commented Jul 11, 2024

nmnobre commented Jul 11, 2024 • edited Loading

nmnobre commented Jul 13, 2024 • edited Loading

nmnobre commented Jun 20, 2024 •

edited

Loading

moosebuild commented Jun 20, 2024 •

edited

Loading

nmnobre commented Jun 28, 2024 •

edited

Loading

nmnobre commented Jul 10, 2024 •

edited

Loading

nmnobre commented Jul 11, 2024 •

edited

Loading

nmnobre commented Jul 13, 2024 •

edited

Loading