[Gmsh] node and element sparsity in msh4

Sat Sep 15 03:30:49 CEST 2018

On Fri, 2018-09-14 at 19:58 +0200, Christophe Geuzaine wrote:
> > On 14 Sep 2018, at 04:43, jeremy theler <jeremy at seamplex.com>
> > wrote:
> > 
> > Dear developers,
> > 
> > I am updating my .msh parser for the new version 4 format. What I found
> > was that nodes and elements are "sparse" (borrowing exact terms from the
> > Gmsh manual) in the sense that they do not constitute a continuous list
> > of indexes starting at 1.
> > 
> > For example, the attached geo generates 45 nodes numbered up to an
> > index of 90 and 68 elements up to 184 using the latest stable 4.0.1.
> > This happens even when using RenumberMeshNodes and
> > RenumberMeshElements (the output is the same with or without these
> > two instructions... what do they actually do?).
> > 
> 
> RenumberMeshNodes and RenumberMeshElements renumber the nodes and the
> elements in the (whole) mesh. In Gmsh 4.0.1 this is now called by
> default after the meshing stage, so that all the meshes generated by
> Gmsh have nodes/elements numbered in a continuous sequence.

Why would anyone not want this feature by default?

> In your example you only save a part of the mesh, as you define a
> Physical group for only one of the two surfaces. By default Gmsh only
> saves the parts of the mesh that belong to physical groups, so you
> indeed end up with "gaps" in the mesh - corresponding to the parts
> you didn't save. To save the whole mesh, set Mesh.SaveAll=1.

I know. You might recall I always explain this behavior on the list. I
even asked to add this question to the FAQ. The posted example was
something of an edge case that my parser would eventually need to handle.

In any case, the more usual case of a 3d volume where materials are
attached to 3d physical entities and boundary conditions to 2d physical
entities will lead to dense nodes but sparse elements, as those 0d, 1d
and 2d elements with no explicit BCs will not be present in the mesh
file, thus giving a sparse list of elements.

I actually do not care too much about sparse elements, but I do about
sparse nodes because the former need to refer to (i.e. to "find") the
latter.

> To efficiently deal with the format just use arrays: you will waste
> some memory for empty entries, but that's it. In our reference
> implementation (GModelIO_MSH4.cpp) we put a threshold on the "level
> of sparsity", which determines if we use an array or a map.

I would need to know the biggest tag of the nodes or elements before
reading them, not just the number of non-zero elements. Otherwise I
need to either double-parse the file or reallocate the array each time.
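
To make the array-or-map idea concrete, here is a minimal sketch in
Python (for illustration only). It is not the logic in
GModelIO_MSH4.cpp: the function name make_tag_lookup and the threshold
max_waste are made up. It also assumes all tags are already in memory,
which sidesteps exactly the point above about not knowing the biggest
tag before reading.

```python
def make_tag_lookup(tags, max_waste=2.0):
    """Build a tag -> position lookup for a list of positive integer tags.

    Returns a plain list (array-like, -1 marks an unused tag) when the
    tags are dense enough, and a dict otherwise. max_waste is a
    hypothetical threshold: the array is used only if it would need at
    most max_waste slots per stored tag.
    """
    tags = list(tags)
    if not tags:
        return {}
    max_tag = max(tags)
    if min(tags) >= 0 and max_tag + 1 <= max_waste * len(tags):
        lookup = [-1] * (max_tag + 1)  # wastes memory on the gaps
        for position, tag in enumerate(tags):
            lookup[tag] = position
        return lookup
    # Too sparse: fall back to a hash map.
    return {tag: position for position, tag in enumerate(tags)}


dense_lookup = make_tag_lookup([1, 2, 3, 5])  # 6 slots for 4 tags
sparse_lookup = make_tag_lookup([1, 1000])    # 1001 slots for 2 tags
```

Both results can be queried as lookup[tag], so the rest of a parser
would not need to know which container was chosen.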

I might try an intermediate inverse mapping index and stick to dense
vectors. Still, I would need the biggest tag (i.e. the size of the
sparse array) next to the number of non-zeros before the actual data.
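
A rough sketch of that inverse mapping, again in Python and with
made-up names (densify, index_of_tag): nodes are stored densely in file
order, a single array of size biggest tag + 1 maps each tag to its
dense index, and element connectivity is rewritten to use those
indices. It assumes the node tags have already been read, so the
biggest tag is found with a plain max().

```python
def densify(node_tags, elements):
    """Map sparse node tags onto dense 0-based indices.

    node_tags: node tags in the order they appear in the file.
    elements:  list of tuples of node tags (element connectivity).
    Returns (index_of_tag, dense_elements), where index_of_tag[tag] is
    the dense index of that node (-1 if the tag is not in the file).
    """
    biggest_tag = max(node_tags)
    index_of_tag = [-1] * (biggest_tag + 1)
    for index, tag in enumerate(node_tags):
        index_of_tag[tag] = index
    dense_elements = [
        tuple(index_of_tag[tag] for tag in connectivity)
        for connectivity in elements
    ]
    return index_of_tag, dense_elements


# Three nodes with gaps in their tags, two line elements.
index_of_tag, dense_elements = densify([2, 5, 90], [(90, 2), (5, 90)])
```

After this step the solver side only sees dense indices, and the
mapping array can be dropped or kept to translate back to Gmsh tags.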

> 
> > 
> > As the manual correctly says, this "sparsity" impacts efficiency and
> > performance as arrays cannot be used and other types of objects need
> > to appear (linked lists, hashed lists, etc) in order to handle them.
> > 
> > 
> > I still want to use arrays in my code for performance reasons. I
> > might re-order the nodes to get them straight at parse-time, but I
> > think that should be done on the mesher side (because the same mesh
> > might be read several times from the solver side so it makes no sense
> > to re-order every time).
> > 
> > On the other hand, if I use msh2 output (even in Gmsh 4.0.1) the
> > sparsity is gone and I can rely on my good old arrays.
> > 
> > Question is: what are the benefits of this behavior in msh4? 
> > 
> 
> In previous MSH file formats, we renumbered the nodes and the
> elements *during the file export*. It guaranteed that the "tags" in
> the mesh file were in effect "indices" - always dense (and actually
> useless... since they were indices). But this destroyed the link 

I agree

> between the internal representation of the mesh and what was
> exported. It was then impossible to export a mesh and guarantee that
> the state would be the same when reading it back; which led to
> horrible hacks for partitioned (distributed, parallel) meshes,
> periodic conditions, mesh reparametrization, etc. Worse yet, physical

Fair enough.

> groups were handled as attributes at the element level, meaning that
> an element could not belong to more than one group: new elements 

Yes, I understood that one of the "features" of the msh4 format is that
one element may belong to many physical entities. I commented on a commit
on GitLab. Do you have an example of an application where a geometrical
entity needs to belong to many physical entities?

> (with the same connectivity but a new tag) were created in that case
> *during file export*, which in effect created a completely new mesh.
> 
> MSH 4 gets rid of all these hacks by simply exporting the mesh *as
> is*: tags are not modified, groups are just "grouping" entities and
> not creating new elements, the classification of nodes and elements
> on geometrical entities is saved, etc. In addition to the massive
> performance boost, MSH 4 thus makes it possible to keep a consistent
> representation of the mesh with all the topological and geometrical
> information necessary to reuse the information for further
> processing.

Fair enough.

> PS: We had an internal debate about setting Mesh.SaveAll=1 by
> default in Gmsh 4. In the end we decided against it, as it would
> break the workflow of many applications which rely on only saving
> parts of the mesh.

I agree. Again, it is not a SaveAll issue. I will try another approach
and let you know if I get stuck again.

Thanks

--
jeremy theler
www.seamplex.com