[Gmsh] node and element sparsity in msh4

Sat Sep 15 09:22:36 CEST 2018

> On 15 Sep 2018, at 03:30, jeremy theler <jeremy at seamplex.com> wrote:
> 
> On Fri, 2018-09-14 at 19:58 +0200, Christophe Geuzaine wrote:
>>> On 14 Sep 2018, at 04:43, jeremy theler <jeremy at seamplex.com>
>>> wrote:
>>> 
>>> Dear developers,
>>> 
>>> I am updating my .msh parser for the new version 4 format. What I
>>> found
>>> was that nodes and elements are "sparse" (borrowing exact terms
>>> from
>>> Gmsh manual) in the sense that they do not constitute a continuous
>>> list
>>> of indexes starting at 1.
>>> 
>>> For example, the attached geo generates 45 nodes numbered up to an
>>> index of 90 and 68 elements up to 184 using latest stable 4.0.1.
>>> Even
>>> using RenumberMeshNodes and RenumberMeshElements (the output is the
>>> same with these two instructions in and out... what do they
>>> actually
>>> do?).
>>> 
>> 
>> RenumberMeshNodes and RenumberMeshElements renumber the nodes and the
>> elements in the (whole) mesh. In Gmsh 4.0.1 this is now called by
>> default after the meshing stage, so that all the meshes generated by
>> Gmsh have nodes/elements numbered in a continuous sequence.
> 
> Why would anyone not want this feature by default?

Gmsh is often used to pre- or post-process existing meshes: in many cases you don't want to renumber those, so that you can keep track of the original associations. Typical example: read a complex input mesh, remove a single volume and remesh only that part. You don't want to renumber the parts that have not been remeshed.

> 
>> In your example you only save a part of the mesh, as you define a
>> Physical group for only one of the two surfaces. By default Gmsh only
>> saves the parts of the mesh that belong to physical groups, so you
>> indeed end up with "gaps" in the mesh - corresponding to the parts
>> you didn't save. To save the whole mesh, set Mesh.SaveAll=1.
> 
> I know. You might recall I always explain this behavior on the list. I
> even asked to add this question to the FAQ. The posted example was
> rather an edge case that my parser would eventually need to address.
> 
> In any case, the more usual case of a 3d volume where materials are
> attached to 3d physical entities and boundary conditions to 2d physical
> entities will lead to dense nodes but sparse elements, as those 0d, 1d
> and 2d with no explicit BCs will not be present in the mesh file, thus
> giving a sparse list of elements.
> 
> I actually do not care too much about sparse elements, but I do about
> sparse nodes because the former need to refer to (i.e. to "find") the
> latter.
> 
>> To efficiently deal with the format just use arrays: you will waste
>> some memory for empty entries, but that's it. In our reference
>> implementation (GModelIO_MSH4.cpp) we put a threshold on the "level
>> of sparsity", which determines if we use an array or a map.
> 
> I would need to know the biggest tag of the nodes or elements before
> reading them, not just the number of non-zero elements. Otherwise I
> need to either double-parse the file or reallocate the array each time.

No: you need a temporary buffer for reading the raw data anyway (it is mixed int/float data and might have to be byte-swapped). So just keep track of (tag, vertex) pairs in a vector while reading, as we do in the reference GModelIO_MSH4.cpp implementation; then, once you have read the vertices, create the array (or a map, if the tags are too sparse) for indexing.
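
A minimal sketch of that two-step approach, written in Python rather than the C++ of GModelIO_MSH4.cpp. The function name `build_node_index` and the threshold value are assumptions made for illustration; they are not taken from the Gmsh source.

```python
def build_node_index(pairs, max_sparsity=10.0):
    """Build a tag -> coordinates lookup from (tag, xyz) pairs read from the file.

    pairs: list of (tag, xyz) tuples collected in a buffer while reading.
    max_sparsity: hypothetical threshold on max_tag / number_of_nodes.
    Returns ("array", list) or ("map", dict).
    """
    if not pairs:
        return "map", {}
    max_tag = max(tag for tag, _ in pairs)
    if max_tag <= max_sparsity * len(pairs):
        # Dense enough: direct indexing by tag, unused slots stay None.
        table = [None] * (max_tag + 1)  # slot 0 is unused, tags start at 1
        for tag, xyz in pairs:
            table[tag] = xyz
        return "array", table
    # Too sparse: fall back to a hash map keyed by tag.
    return "map", dict(pairs)
```

The largest tag is only known once all pairs are in the buffer, so no second pass over the file is needed. With the 45 nodes numbered up to 90 from the original message the ratio is 2, so with the assumed threshold of 10 the array branch would be taken.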

We could indeed add min/max vertex/element tags in the section header in a future revision of the format. Such a revision will include

- changes based on user feedback
- additional features for high-performance parallel IO (MPI IO)
- ability to use 64 bit node and element tags (for very large meshes)
- reworked post-processing fields with the ability to choose float size

We could include an option to renumber *based on physical definitions*, i.e. we could renumber all the nodes/elements that are needed by physical groups first, followed by all the other nodes/elements. Not sure if it's worth the hassle though?
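
To make the idea concrete, here is a rough sketch of such a renumbering in Python. It is not an existing Gmsh option; the function name and the way tags are passed in are assumptions for illustration only.

```python
def renumber_physical_first(all_tags, physical_tags):
    """Return {old_tag: new_tag}.

    Tags needed by physical groups receive 1..k (in increasing old-tag order),
    all remaining tags follow as k+1..n.
    """
    physical = set(physical_tags)
    ordered = sorted(all_tags)
    mapping = {}
    next_tag = 1
    # First pass: nodes/elements referenced by physical groups.
    for tag in ordered:
        if tag in physical:
            mapping[tag] = next_tag
            next_tag += 1
    # Second pass: everything else.
    for tag in ordered:
        if tag not in physical:
            mapping[tag] = next_tag
            next_tag += 1
    return mapping
```

With such a numbering, a file that saves only the physical groups would contain the tags 1..k without gaps.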

> 
> I might try an intermediate inverse mapping index and stick to dense
> vectors. Still, I would need the biggest tag (i.e. the size of the
> sparse array) next to the number of non-zeros before the actual data.
> 
>> 
>>> 
>>> As the manual correctly says, this "sparsity" impacts efficiency
>>> and
>>> performance as arrays cannot be used and other types of objects
>>> need to
>>> appear (linked lists, hashed lists, etc) in order to handle them.
>>> 
>>> 
>>> I still want to use arrays in my code for performance reasons. I
>>> might re-order the nodes to get them straight at parse-time, but I
>>> think that should be done on the mesher side (because the same mesh
>>> might be read several times from the solver side so it makes no
>>> sense
>>> to re-order every time).
>>> 
>>> On the other hand, if I use msh2 output (even in Gmsh 4.0.1) the
>>> sparsity is gone and I can rely on my good old arrays.
>>> 
>>> Question is: what are the benefits of this behavior in msh4? 
>>> 
>> 
>> In previous MSH file formats, we renumbered the nodes and the
>> elements *during the file export*. It guaranteed that the "tags" in
>> the mesh file were in effect "indices" - always dense (and actually
>> useless... since they were indices). But this destroyed the link 
> 
> I agree
> 
>> between the internal representation of the mesh and what was
>> exported. It was then impossible to export a mesh and guarantee that
>> the state would be the same when reading it back; which led to
>> horrible hacks for partitioned (distributed, parallel) meshes,
>> periodic conditions, mesh reparametrization, etc. Worse yet, physical
> 
> Fair enough.
> 
>> groups were handled as attributes at the element level, meaning that
>> an element could not belong to more than one group: new elements 
> 
> Yes, I understood that one of the "features" msh4 format has is that
> one element may belong to many physical entities. I commented a commit
> in gitlab. Do you have an example of application where a geometrical
> entity needs to belong to many physical entities?

All applications where you have overlapping groups: functional characterization ("left wing") vs. materials ("aluminum" and "carbon fiber"), local feature ("wire 2") vs. global one ("stator winding"), overlapping boundary conditions for multiphysics applications ("zero flux" vs "fixed displacement"), overlapping boundaries of adjacent parts (sharing some entities), etc.
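
As a small illustration of overlapping groups, the sketch below stores each group as a set of entity tags and inverts it to see which groups an entity belongs to. The surface tags 1, 2 and 3 are hypothetical; only the group names come from the examples above.

```python
from collections import defaultdict

# Hypothetical surface tags; group names follow the wing example.
physical_groups = {
    "left wing": {1, 2, 3},    # functional characterization
    "aluminum": {1, 2},        # material
    "carbon fiber": {3},       # material
}

def groups_by_entity(groups):
    """Invert {group: entity tags} into {entity tag: list of group names}."""
    result = defaultdict(list)
    for name in sorted(groups):
        for entity in groups[name]:
            result[entity].append(name)
    return dict(result)

membership = groups_by_entity(physical_groups)
```

Here surface 3 ends up in both "carbon fiber" and "left wing" without being duplicated, which is the situation a per-element single attribute could not represent.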

Christophe

> 
>> (with the same connectivity but a new tag) were created in that case
>> *during file export*, which in effect created a completely new mesh.
>> 
>> MSH 4 gets rid of all these hacks by simply exporting the mesh *as
>> is*: tags are not modified, groups are just "grouping" entities and
>> not creating new elements, the classification of nodes and elements
>> on geometrical entities is saved, etc. In addition to the massive
>> performance boost, MSH 4 thus makes it possible to keep a consistent
>> representation of the mesh with all the topological and geometrical
>> information necessary to reuse the information for further
>> processing.
> 
> Fair enough.
> 
>> PS: We had an internal debate about setting Mesh.SaveAll=1 by
>> default in Gmsh 4. In the end we decided against it, as it would
>> break the workflow of many applications which rely on only saving
>> parts of the mesh.
> 
> I agree. Again, it is not a SaveAll issue. I will try another approach
> and let you know if I get stuck again.
> 
> Thanks
> 
> --
> jeremy theler
> www.seamplex.com

-- 
Prof. Christophe Geuzaine
University of Liege, Electrical Engineering and Computer Science 
http://www.montefiore.ulg.ac.be/~geuzaine

Free software: http://gmsh.info | http://getdp.info | http://onelab.info