Publications from Uppsala University
Kronbichler, Martin
Publications (10 of 37)
Munch, P., Heister, T., Prieto Saavedra, L. & Kronbichler, M. (2023). Efficient distributed matrix-free multigrid methods on locally refined meshes for FEM computations. ACM transactions on parallel computing, 10(1), 1-38, Article ID 3.
2023 (English). In: ACM transactions on parallel computing, E-ISSN 2329-4957, Vol. 10, no 1, p. 1-38, article id 3. Article in journal (Refereed), Published
Abstract [en]

This work studies three multigrid variants for matrix-free finite-element computations on locally refined meshes: geometric local smoothing, geometric global coarsening (both h-multigrid), and polynomial global coarsening (a variant of p-multigrid). We have integrated the algorithms into the same framework, the open-source finite-element library deal.II, which allows us to make fair comparisons regarding their implementation complexity, computational efficiency, and parallel scalability as well as to compare the measurements with theoretically derived performance metrics. Serial simulations and parallel weak and strong scaling on up to 147,456 CPU cores on 3,072 compute nodes are presented. The results obtained indicate that global-coarsening algorithms show a better parallel behavior for comparable smoothers due to the better load balance, particularly on the expensive fine levels. In the serial case, the costs of applying hanging-node constraints might be significant, leading to advantages of local smoothing, even though the number of solver iterations needed is slightly higher. When using p- and h-multigrid in sequence (hp-multigrid), the results indicate that it makes sense to decrease the degree of the elements first from a performance point of view due to the cheaper transfer.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
National Category
Computational Mathematics
Research subject
Scientific Computing with specialization in Numerical Analysis
Identifiers
urn:nbn:se:uu:diva-492939 (URN); 10.1145/3580314 (DOI); 000964905500003 ()
Projects
eSSENCE - An eScience Collaboration
Available from: 2023-01-11. Created: 2023-01-11. Last updated: 2023-08-23. Bibliographically approved
Golshan, S., Munch, P., Gassmoeller, R., Kronbichler, M. & Blais, B. (2023). Lethe-DEM: an open-source parallel discrete element solver with load balancing. Computational Particle Mechanics, 10(1), 77-96
2023 (English). In: Computational Particle Mechanics, ISSN 2196-4378, Vol. 10, no 1, p. 77-96. Article in journal (Refereed), Published
Abstract [en]

Approximately 75% of the raw material and 50% of the products in the chemical industry are granular materials. The discrete element method (DEM) provides detailed insights of phenomena at particle scale, and it is therefore often used for modeling granular materials. However, because DEM tracks the motion and contact of individual particles separately, its computational cost increases nonlinearly, from O(n_p log n_p) to O(n_p^2) depending on the algorithm, with the number of particles (n_p). In this article, we introduce a new open-source parallel DEM software with load balancing: Lethe-DEM. Lethe-DEM, a module of Lethe, consists of solvers for two-dimensional and three-dimensional DEM simulations. Load balancing allows Lethe-DEM to significantly increase the parallel efficiency by approximately 25-70% depending on the granular simulation. We explain the fundamental modules of Lethe-DEM, its software architecture, and the governing equations. Furthermore, we verify Lethe-DEM with several tests including analytical solutions and comparison with other software. Comparisons with experiments in a flat-bottomed silo, wedge-shaped silo, and rotating drum validate Lethe-DEM. We investigate the strong and weak scaling of Lethe-DEM with 1 ≤ n_c ≤ 192 and 32 ≤ n_c ≤ 320 processes, respectively, with and without load balancing. The strong-scaling analysis is performed on the wedge-shaped silo and rotating drum simulations, while for the weak-scaling analysis, we use a dam-break simulation. The best scalability of Lethe-DEM is obtained in the range of 5000 ≤ n_p/n_c ≤ 15,000. Finally, we demonstrate that large-scale simulations can be carried out with Lethe-DEM using the simulation of a three-dimensional cylindrical silo with n_p = 4.3 × 10^6 on 320 cores.

Place, publisher, year, edition, pages
Springer Nature, 2023
Keywords
Discrete element methods (DEMs), High-performance computing, Load balancing, Silo, Rotating drum
National Category
Computational Mathematics
Research subject
Scientific Computing with specialization in Numerical Analysis
Identifiers
urn:nbn:se:uu:diva-492368 (URN); 10.1007/s40571-022-00478-6 (DOI); 000799532200001 ()
Projects
eSSENCE - An eScience Collaboration
Available from: 2023-01-04. Created: 2023-01-04. Last updated: 2023-05-31. Bibliographically approved
Munch, P., Dravins, I., Kronbichler, M. & Neytcheva, M. (2023). Stage-parallel fully implicit Runge-Kutta implementations with optimal multilevel preconditioners at the scaling limit. SIAM Journal on Scientific Computing, 46(2), 71-96
2023 (English). In: SIAM Journal on Scientific Computing, ISSN 1064-8275, E-ISSN 1095-7197, Vol. 46, no 2, p. 71-96. Article in journal (Refereed), Published
Abstract [en]

We present an implementation of a stage-parallel preconditioner for Radau IIA type fully implicit Runge–Kutta methods, which approximates the inverse of the Runge–Kutta matrix A_Q from the Butcher tableau by the lower triangular matrix resulting from an LU decomposition and diagonalizes the system with as many blocks as stages. For the transformed system, we employ a block preconditioner where each block is distributed and solved by a subgroup of processes in parallel. For combination of partial results, we use either a communication pattern resembling Cannon’s algorithm or shared memory. A performance model and a large set of performance studies (including strong-scaling runs with up to 150k processes on 3k compute nodes) conducted for a time-dependent heat problem, using matrix-free finite element methods, indicate that the stage-parallel implementation can reach higher throughputs near the scaling limit. The achievable speedup increases linearly with the number of stages and is bounded by the number of stages. Furthermore, we show that the presented stage-parallel concepts are also applicable to the case that A_Q is directly diagonalized, which requires either complex arithmetic or solutions of two-by-two blocks, both exposing about half the parallelism. Alternatively to distributing stages and assigning them to distinct processes, we discuss the possibility of batching operations from different stages together.

Place, publisher, year, edition, pages
Society for Industrial and Applied Mathematics, 2023
Keywords
implicit Runge–Kutta methods, Radau quadrature, stage-parallel preconditioning, finite element methods, matrix-free methods, geometric multigrid, massively parallel
National Category
Computational Mathematics
Research subject
Scientific Computing with specialization in Numerical Analysis
Identifiers
urn:nbn:se:uu:diva-492935 (URN); 10.1137/22M1503270 (DOI); 001291137100004 (); 2-s2.0-85192680512 (Scopus ID)
Projects
eSSENCE - An eScience Collaboration
Available from: 2023-01-11. Created: 2023-01-11. Last updated: 2025-02-19. Bibliographically approved
Munch, P., Ljungkvist, K. & Kronbichler, M. (2022). Efficient Application of Hanging-Node Constraints for Matrix-Free High-Order FEM Computations on CPU and GPU. In: Varbanescu, A. L., Bhatele, A., Luszczek, P. & Marc, B. (Ed.), High Performance Computing, ISC High Performance 2022. Paper presented at 37th International Supercomputing Conference on High Performance Computing (ISC High Performance Computing), May 29-June 2, 2022, Hamburg, Germany (pp. 133-152). Springer Nature, 13289
2022 (English). In: High Performance Computing, ISC High Performance 2022 / [ed] Varbanescu, A. L., Bhatele, A., Luszczek, P. & Marc, B., Springer Nature, 2022, Vol. 13289, p. 133-152. Conference paper, Published paper (Refereed)
Abstract [en]

This contribution presents an efficient algorithm for resolving hanging-node constraints on the fly for high-order finite-element computations on adaptively refined meshes, using matrix-free implementations. We concentrate on unstructured hex-dominated meshes and on multi-component elements with nodal Lagrange shape functions in at least one of their components. The application of general constraints is split up into two distinct operators, one specialized in the hanging-node part and a generic one for the remaining constraints, such as Dirichlet boundary conditions. The former implements in-face interpolations efficiently by a sequence of 1D interpolations with sum factorization according to the refinement configuration of the cell. We discuss ways to efficiently encode and decode such refinement configurations. Furthermore, we present distinct differences in the interpolation step on GPU and CPU, as well as compare different vectorization strategies for the latter. Experimental comparisons with a state-of-the-art algorithm that does not exploit the tensor-product structure show that, on CPUs, the additional costs of cells with hanging-node constraints can be reduced by a factor of 5-10 for a Laplace operator evaluation with high-order elements (k = 3) and affine meshes. For non-affine meshes, the costs for the application of hanging-node constraints can be completely hidden behind the memory transfer. The algorithm has been integrated into the open-source finite-element library deal.II.

Place, publisher, year, edition, pages
Springer Nature, 2022
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
Adaptively refined meshes, Finite element methods, High order, Hanging-node constraints, Matrix-free operator evaluation, Node-level optimization, SIMD vectorization, Manycore optimizations
National Category
Computational Mathematics; Computer Sciences
Identifiers
urn:nbn:se:uu:diva-488240 (URN); 10.1007/978-3-031-07312-0_7 (DOI); 000871773100007 (); 978-3-031-07312-0 (ISBN); 978-3-031-07311-3 (ISBN)
Conference
37th International Supercomputing Conference on High Performance Computing (ISC High Performance Computing), May 29-June 2, 2022, Hamburg, Germany
Available from: 2022-11-10. Created: 2022-11-10. Last updated: 2022-11-10. Bibliographically approved
Guermond, J.-L., Kronbichler, M., Maier, M., Popov, B. & Tomas, I. (2022). On the implementation of a robust and efficient finite element-based parallel solver for the compressible Navier-Stokes equations. Computer Methods in Applied Mechanics and Engineering, 389, 114250, Article ID 114250.
2022 (English). In: Computer Methods in Applied Mechanics and Engineering, ISSN 0045-7825, E-ISSN 1879-2138, Vol. 389, article id 114250. Article in journal (Refereed), Published
Abstract [en]

This paper describes in detail the implementation of a finite element technique for solving the compressible Navier–Stokes equations that is provably robust and demonstrates excellent performance on modern computer hardware. The method is second-order accurate in time and space. Robustness here means that the method is proved to be invariant domain preserving under the hyperbolic CFL time step restriction, and the method delivers results that are reproducible. The proposed technique is shown to be accurate on challenging 2D and 3D realistic benchmarks.

Place, publisher, year, edition, pages
Elsevier BV, 2022
Keywords
Hyperbolic conservation equations; compressible Navier–Stokes equations; Invariant domains; High-order method; Convex limiting; Finite element method
National Category
Computational Mathematics
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-463461 (URN); 10.1016/j.cma.2021.114250 (DOI); 000784334700002 ()
Funder
eSSENCE - An eScience Collaboration
Available from: 2022-01-10. Created: 2022-01-10. Last updated: 2024-01-15. Bibliographically approved
Arndt, D., Bangerth, W., Feder, M., Fehling, M., Gassmöller, R., Heister, T., . . . Wells, D. (2022). The deal.II library, Version 9.4. Journal of Numerical Mathematics, 30(3), 231-246
2022 (English). In: Journal of Numerical Mathematics, ISSN 1570-2820, E-ISSN 1569-3953, Vol. 30, no 3, p. 231-246. Article in journal (Refereed), Published
Abstract [en]

This paper provides an overview of the new features of the finite element library deal.II, version 9.4.

Place, publisher, year, edition, pages
Walter de Gruyter, 2022
Keywords
software, finite elements, deal.II, 65M60, 65N30, 65Y05
National Category
Computational Mathematics
Research subject
Scientific Computing with specialization in Numerical Analysis
Identifiers
urn:nbn:se:uu:diva-485345 (URN); 10.1515/jnma-2022-0054 (DOI); 000853178400003 ()
Projects
eSSENCE - An eScience Collaboration
Available from: 2022-09-22. Created: 2022-09-22. Last updated: 2023-01-12. Bibliographically approved
Kronbichler, M., Fehn, N., Munch, P., Bergbauer, M., Wichmann, K.-R., Geitner, C., . . . Wall, W. A. (2021). A next-generation discontinuous Galerkin fluid dynamics solver with application to high-resolution lung airflow simulations. In: SC21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. Paper presented at SC21: International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 1-15). St. Louis, MO, USA: Association for Computing Machinery (ACM), Article ID 21.
2021 (English). In: SC21: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, St. Louis, MO, USA: Association for Computing Machinery (ACM), 2021, p. 1-15, article id 21. Conference paper, Published paper (Refereed)
Abstract [en]

We present a novel, highly scalable and optimized solver for turbulent flows based on high-order discontinuous Galerkin discretizations of the incompressible Navier-Stokes equations aimed to minimize time-to-solution. The solver uses explicit-implicit time integration with variable step size. The central algorithmic component is the matrix-free evaluation of discretized finite element operators. The node-level performance is optimized by sum-factorization kernels for tensor-product elements with unique algorithmic choices that reduce the number of arithmetic operations, improve cache usage, and vectorize the arithmetic work across elements and faces. These ingredients are integrated into a framework scalable to the massive parallelism of supercomputers by the use of optimal-complexity linear solvers, such as mixed-precision, hybrid geometric-polynomial-algebraic multigrid solvers for the pressure Poisson problem. The application problems under consideration are fluid dynamical simulations of the human respiratory system under mechanical ventilation conditions, using unstructured/structured adaptively refined meshes for geometrically complex domains typical of biomedical engineering.

Place, publisher, year, edition, pages
St. Louis, MO, USA: Association for Computing Machinery (ACM), 2021
Keywords
high-order discontinuous Galerkin, matrix-free algorithms, multigrid, time-to-solution
National Category
Computational Mathematics
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-463179 (URN); 10.1145/3458817.3476171 (DOI); 000946520100044 (); 978-1-4503-8442-1 (ISBN)
Conference
SC21: International Conference for High Performance Computing, Networking, Storage and Analysis
Funder
eSSENCE - An eScience Collaboration
Available from: 2022-01-06. Created: 2022-01-06. Last updated: 2023-05-16. Bibliographically approved
Munch, P., Kormann, K. & Kronbichler, M. (2021). hyper.deal: An Efficient, Matrix-free Finite-element Library for High-dimensional Partial Differential Equations. ACM Transactions on Mathematical Software, 47(4), 1-34, Article ID 33.
2021 (English). In: ACM Transactions on Mathematical Software, ISSN 0098-3500, E-ISSN 1557-7295, Vol. 47, no 4, p. 1-34, article id 33. Article in journal (Refereed), Published
Abstract [en]

This work presents the efficient, matrix-free finite-element library hyper.deal for solving partial differential equations in two up to six dimensions with high-order discontinuous Galerkin methods. It builds upon the low-dimensional finite-element library deal.II to create complex low-dimensional meshes and to operate on them individually. These meshes are combined via a tensor product on the fly, and the library provides new special-purpose highly optimized matrix-free functions exploiting domain decomposition as well as shared memory via MPI-3.0 features. Both node-level performance analyses and strong/weak-scaling studies on up to 147,456 CPU cores confirm the efficiency of the implementation. Results obtained with the library hyper.deal are reported for high-dimensional advection problems and for the solution of the Vlasov-Poisson equation in up to six-dimensional phase space.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2021
Keywords
Matrix-free operator evaluation, discontinuous Galerkin methods, high-dimensional, high-order, Vlasov-Poisson equation, MPI-3.0 shared memory
National Category
Computational Mathematics
Identifiers
urn:nbn:se:uu:diva-457641 (URN); 10.1145/3469720 (DOI); 000703370900004 ()
Funder
German Research Foundation (DFG), KO5206/1-1; German Research Foundation (DFG), KR4661/2-1; eSSENCE - An eScience Collaboration
Available from: 2021-11-01. Created: 2021-11-01. Last updated: 2024-01-15. Bibliographically approved
Fehn, N., Kronbichler, M., Munch, P. & Wall, W. A. (2021). Numerical evidence of anomalous energy dissipation in incompressible Euler flows: towards grid-converged results for the inviscid Taylor-Green problem. Journal of Fluid Mechanics, 932, Article ID A40.
2021 (English). In: Journal of Fluid Mechanics, ISSN 0022-1120, E-ISSN 1469-7645, Vol. 932, article id A40. Article in journal (Refereed), Published
Abstract [en]

The well-known energy dissipation anomaly in the inviscid limit, related to velocity singularities according to Onsager, still needs to be demonstrated by numerical experiments. The present work contributes to this topic through high-resolution numerical simulations of the inviscid three-dimensional Taylor-Green vortex problem using a novel high-order discontinuous Galerkin discretisation approach for the incompressible Euler equations. The main methodological ingredient is the use of a discretisation scheme with inbuilt dissipation mechanisms, as opposed to discretely energy-conserving schemes, which - by construction - rule out the occurrence of anomalous dissipation. We investigate effective spatial resolution up to 8192^3 (defined based on the 2π-periodic box) and make the interesting phenomenological observation that the kinetic energy evolution does not tend towards exact energy conservation for increasing spatial resolution of the numerical scheme, but that the sequence of discrete solutions seemingly converges to a solution with non-zero kinetic energy dissipation rate. Taking the fine-resolution simulation as a reference, we measure grid-convergence with a relative L^2-error of 0.27% for the temporal evolution of the kinetic energy and 3.52% for the kinetic energy dissipation rate against the dissipative fine-resolution simulation. The present work raises the question of whether such results can be seen as a numerical confirmation of the famous energy dissipation anomaly. Due to the relation between anomalous energy dissipation and the occurrence of singularities for the incompressible Euler equations according to Onsager's conjecture, we elaborate on an indirect approach for the identification of finite-time singularities that relies on energy arguments.

Place, publisher, year, edition, pages
Cambridge University Press, 2021
Keywords
computational methods, Navier-Stokes equations, turbulence theory
National Category
Computational Mathematics; Applied Mechanics
Identifiers
urn:nbn:se:uu:diva-462467 (URN); 10.1017/jfm.2021.1003 (DOI); 000730262900001 ()
Funder
eSSENCE - An eScience Collaboration
Available from: 2021-12-23. Created: 2021-12-23. Last updated: 2024-01-15. Bibliographically approved
Arndt, D., Bangerth, W., Blais, B., Fehling, M., Gassmoller, R., Heister, T., . . . Zhang, J. (2021). The deal.II library, Version 9.3. Journal of Numerical Mathematics, 29(3), 171-186
2021 (English). In: Journal of Numerical Mathematics, ISSN 1570-2820, E-ISSN 1569-3953, Vol. 29, no 3, p. 171-186. Article in journal (Refereed), Published
Abstract [en]

This paper provides an overview of the new features of the finite element library deal.II, version 9.3.

Place, publisher, year, edition, pages
Walter de Gruyter GmbH, 2021
Keywords
software, finite elements, deal.II
National Category
Computational Mathematics
Identifiers
urn:nbn:se:uu:diva-457483 (URN); 10.1515/jnma-2021-0081 (DOI); 000700870700001 ()
Funder
eSSENCE - An eScience Collaboration
Available from: 2021-11-01. Created: 2021-11-01. Last updated: 2024-01-15. Bibliographically approved