Uppsala University Publications (uu.se)
1 - 7 of 7
  • 1.
    Kronbichler, Martin
    et al.
    Tech Univ Munich, Inst Computat Mech, Boltzmannstr 15, D-85748 Garching, Germany.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing.
    Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors. 2019. In: ACM Transactions on Parallel Computing, ISSN 2329-4949, Vol. 6, no 1, article id 2. Article in journal (Refereed)
    Abstract [en]

    This article presents matrix-free finite-element techniques for efficiently solving partial differential equations on modern many-core processors, such as graphics cards. We develop a GPU parallelization of a matrix-free geometric multigrid iterative solver targeting moderate and high polynomial degrees, with support for general curved and adaptively refined hexahedral meshes with hanging nodes. The central algorithmic component is the matrix-free operator evaluation with sum factorization. We compare the node-level performance of our implementation running on an Nvidia Pascal P100 GPU to a highly optimized multicore implementation running on comparable Intel Broadwell CPUs and an Intel Xeon Phi. Our experiments show that the GPU implementation is approximately 1.5 to 2 times faster across four different scenarios of the Poisson equation and a variety of element degrees in 2D and 3D. The lowest time to solution per degree of freedom is recorded for moderate polynomial degrees between 3 and 5. A detailed performance analysis highlights the capabilities of the GPU architecture and the chosen execution model with threading within the element, particularly with respect to the evaluation of the matrix-vector product. Atomic intrinsics are shown to provide a fast way to avoid possible race conditions when summing the elemental residuals into the global vector associated with shared vertices, edges, and surfaces. In addition, the solver infrastructure allows for using mixed-precision arithmetic that performs the multigrid V-cycle in single precision with an outer correction in double precision, increasing throughput by up to 83%.
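As a rough illustration of the sum-factorization idea named in the abstract, the following sketch applies a 2D tensor-product operator in two ways: directly, and as two sweeps of 1D contractions. It is not the authors' GPU implementation; the function names and the small matrices are hypothetical, and plain Python lists stand in for the element data.

```python
# Sketch only: a 2D tensor-product operator applied two ways.
# All names here are hypothetical and chosen for illustration.

def apply_kron_naive(A, B, u):
    """Apply (A kron B) to u, with u stored as an n x n grid u[k][l].

    Every output entry sums over all n*n inputs: O(n^4) work in 2D.
    """
    n = len(A)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 0.0
            for k in range(n):
                for l in range(n):
                    s += A[i][k] * B[j][l] * u[k][l]
            out[i][j] = s
    return out


def apply_sum_factorized(A, B, u):
    """Same result via two sweeps of 1D contractions: O(n^3) work in 2D."""
    n = len(A)
    # Sweep 1: contract B along the second index of u.
    tmp = [[sum(B[j][l] * u[k][l] for l in range(n)) for j in range(n)]
           for k in range(n)]
    # Sweep 2: contract A along the first index of the intermediate.
    return [[sum(A[i][k] * tmp[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    A = [[1.0, 2.0], [3.0, 4.0]]
    B = [[0.0, 1.0], [1.0, 0.0]]
    u = [[1.0, 2.0], [3.0, 4.0]]
    print(apply_kron_naive(A, B, u))
    print(apply_sum_factorized(A, B, u))
```

The saving grows with dimension and polynomial degree, since each sweep touches only one tensor direction at a time.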

  • 2.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes. 2017. In: Proc. 25th High Performance Computing Symposium, San Diego, CA: The Society for Modeling and Simulation International, 2017, p. 1-12. Conference paper (Refereed)
  • 3.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Matrix-free finite-element operator application on graphics processing units. 2014. In: Euro-Par 2014: Parallel Processing Workshops, Part II, Springer, 2014, p. 450-461. Conference paper (Refereed)
  • 4.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Techniques for finite element methods on modern processors. 2015. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    In this thesis, methods for efficient utilization of modern computer hardware for numerical simulation are considered. In particular, we study techniques for speeding up the execution of finite-element methods.

    One of the greatest challenges in finite-element computation is how to perform the system matrix assembly efficiently in parallel, due to its complicated memory access pattern. The main difficulty lies in the fact that many entries of the matrix are being updated concurrently by several parallel threads. We consider transactional memory, an exotic hardware feature for concurrent update of shared variables, and conduct benchmarks on a prototype processor supporting it. Our experiments show that transactions can both simplify programming and provide good performance for concurrent updates of floating point data.

    Furthermore, we study a matrix-free approach to finite-element computation which avoids the matrix assembly. Motivated by its computational properties, we implement the matrix-free method for execution on graphics processors, using either atomic updates or a mesh coloring approach to handle the concurrent updates. A performance study shows that on the GPU, the matrix-free method is faster than a matrix-based implementation for many element types, and allows for solution of considerably larger problems. This suggests that the matrix-free method can speed up execution of large realistic simulations.
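The mesh coloring approach mentioned in the abstract can be sketched as follows. This is a minimal serial illustration under stated assumptions, not the thesis implementation: the greedy coloring rule, the function names, and the tiny one-dimensional mesh are all hypothetical. Elements that share a node receive different colors, so all elements of one color could be processed in parallel without atomic updates.

```python
# Sketch only: greedy element coloring for conflict-free scatter-add.
# Function names and the tiny mesh below are hypothetical.

def color_elements(elements):
    """Give each element the smallest color not yet used by any
    previously colored element that shares a node with it."""
    node_colors = {}  # node index -> set of colors already touching it
    colors = []
    for nodes in elements:
        used = set()
        for n in nodes:
            used |= node_colors.get(n, set())
        c = 0
        while c in used:
            c += 1
        colors.append(c)
        for n in nodes:
            node_colors.setdefault(n, set()).add(c)
    return colors


def scatter_by_color(elements, colors, local_values, n_nodes):
    """Sum element contributions into a global vector, one color at a time.

    Within a color no two elements touch the same node, so the loop over
    those elements could run in parallel; here it runs serially.
    """
    result = [0.0] * n_nodes
    for c in range(max(colors) + 1):
        for e, nodes in enumerate(elements):
            if colors[e] != c:
                continue
            for n, v in zip(nodes, local_values[e]):
                result[n] += v
    return result


if __name__ == "__main__":
    # Four line elements on a chain of five nodes.
    elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
    colors = color_elements(elements)
    values = [[1.0, 1.0] for _ in elements]
    print(colors)
    print(scatter_by_color(elements, colors, values, 5))
```

The alternative named in the abstract, atomic updates, keeps a single pass over all elements and instead relies on hardware atomics to resolve the conflicts at shared nodes.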

    List of papers
    1. Using hardware transactional memory for high-performance computing
    2011 (English). In: Proc. 25th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, Piscataway, NJ: IEEE, 2011, p. 1660-1667. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Piscataway, NJ: IEEE, 2011
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-158551 (URN), 10.1109/IPDPS.2011.322 (DOI), 978-1-61284-425-1 (ISBN)
    Conference
    IPDPS Workshop on Multi-Threaded Architectures and Applications
    Projects
    eSSENCE, UPMARC
    Available from: 2011-09-01 Created: 2011-09-10 Last updated: 2018-01-12. Bibliographically approved
    2. Matrix-free finite-element operator application on graphics processing units
    2014 (English). In: Euro-Par 2014: Parallel Processing Workshops, Part II, Springer, 2014, p. 450-461. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer, 2014
    Series
    Lecture Notes in Computer Science ; 8806
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-238380 (URN), 10.1007/978-3-319-14313-2_38 (DOI), 000354785000038 (), 978-3-319-14312-5 (ISBN)
    Conference
    7th Workshop on Unconventional High-Performance Computing
    Projects
    UPMARC, eSSENCE
    Available from: 2014-12-11 Created: 2014-12-11 Last updated: 2018-01-11. Bibliographically approved
  • 5.
    Ljungkvist, Karl
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Kronbichler, Martin
    Multigrid for matrix-free finite element computations on graphics processors. 2017. Report (Other academic)
  • 6.
    Ljungkvist, Karl
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Tillenius, Martin
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Black-Schaffer, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Karlsson, Martin
    Larsson, Elisabeth
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Using hardware transactional memory for high-performance computing. 2011. In: Proc. 25th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, Piscataway, NJ: IEEE, 2011, p. 1660-1667. Conference paper (Refereed)
  • 7.
    Ljungkvist, Karl
    et al.
    Tillenius, Martin
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Karlsson, Martin
    Larsson, Elisabeth
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Early results using hardware transactional memory for high-performance computing applications. 2010. In: Proc. 3rd Swedish Workshop on Multi-Core Computing, Göteborg, Sweden: Chalmers University of Technology, 2010, p. 93-97. Conference paper (Other academic)