Uppsala University Publications (uu.se)
  • 1.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Finite Element Computations on Multicore and Graphics Processors. 2017. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    In this thesis, techniques for efficient utilization of modern computer hardware for numerical simulation are considered. In particular, we study techniques for improving the performance of computations using the finite element method.

    One of the main difficulties in finite-element computations is performing the assembly of the system matrix efficiently in parallel, because of its complicated memory access pattern. The challenge lies in the fact that many entries of the matrix are updated concurrently by several parallel threads. We consider transactional memory, an exotic hardware feature for concurrent updates of shared variables, and conduct benchmarks on a prototype multicore processor supporting it. Our experiments show that transactions can both simplify programming and provide good performance for concurrent updates of floating-point data.

    Secondly, we study a matrix-free approach to finite-element computation which avoids the matrix assembly. Besides removing the need to store the system matrix, matrix-free methods have a low memory footprint and therefore better match the architecture of modern processors, where memory bandwidth is scarce and compute power is abundant. Motivated by this, we consider matrix-free implementations of high-order finite-element methods for execution on graphics processors, which have seen a revolutionary increase in usage for numerical computations during recent years due to their more efficient architecture. In the implementation, we exploit sum-factorization techniques for efficient evaluation of matrix-vector products, mesh coloring and atomic updates for concurrent updates, and a geometric multigrid algorithm for efficient preconditioning of iterative solvers. Our performance studies show that on the GPU, a matrix-free approach is the method of choice for elements of order two and higher, both yielding significantly faster execution and allowing for solution of considerably larger problems. Compared to corresponding CPU implementations executed on comparable multicore processors, the GPU implementation is about twice as fast, suggesting that graphics processors are about twice as power efficient as multicores for computations of this kind.
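The matrix-free idea in the abstract above can be illustrated with a minimal sketch. It assumes a 1D Laplace operator with linear elements on a uniform mesh, which is far simpler than the high-order GPU setting of the thesis; the function names `assemble_matrix`, `matvec_assembled`, and `matvec_matrix_free` are hypothetical and chosen only for this illustration. The sketch compares a product with an assembled matrix against an element-by-element application that never stores the matrix.

```python
def assemble_matrix(n_elems, h):
    """Assemble a dense global stiffness matrix for 1D linear elements.

    Dense storage is used only to keep the illustration short.
    """
    n = n_elems + 1
    A = [[0.0] * n for _ in range(n)]
    k_local = [[1.0 / h, -1.0 / h], [-1.0 / h, 1.0 / h]]
    for e in range(n_elems):
        dofs = (e, e + 1)
        for a in range(2):
            for b in range(2):
                A[dofs[a]][dofs[b]] += k_local[a][b]
    return A


def matvec_assembled(A, u):
    """Ordinary matrix-vector product with the stored matrix."""
    return [sum(a_ij * u_j for a_ij, u_j in zip(row, u)) for row in A]


def matvec_matrix_free(n_elems, h, u):
    """Apply the same operator element by element without storing A."""
    v = [0.0] * (n_elems + 1)
    for e in range(n_elems):
        # Gather the local degrees of freedom of element e.
        u0, u1 = u[e], u[e + 1]
        # Apply the 2x2 local operator: k_local @ [u0, u1] = [flux, -flux].
        flux = (u0 - u1) / h
        # Scatter-add into the global result. In a parallel code these are
        # the concurrent updates that need atomics or mesh coloring.
        v[e] += flux
        v[e + 1] -= flux
    return v


if __name__ == "__main__":
    n_elems, h = 4, 0.25
    u = [0.0, 1.0, 4.0, 9.0, 16.0]
    print(matvec_assembled(assemble_matrix(n_elems, h), u))
    print(matvec_matrix_free(n_elems, h, u))
```

Both functions compute the same vector; the matrix-free variant trades stored matrix entries for recomputation of the local operator, which is the property the abstract relates to scarce memory bandwidth.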

    List of papers
    1. Using hardware transactional memory for high-performance computing
    2011 (English). In: Proc. 25th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, Piscataway, NJ: IEEE, 2011, p. 1660-1667. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Piscataway, NJ: IEEE, 2011
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-158551 (URN); 10.1109/IPDPS.2011.322 (DOI); 978-1-61284-425-1 (ISBN)
    Conference
    IPDPS Workshop on Multi-Threaded Architectures and Applications
    Projects
    eSSENCE, UPMARC
    Available from: 2011-09-01; Created: 2011-09-10; Last updated: 2018-01-12; Bibliographically approved
    2. Matrix-free finite-element operator application on graphics processing units
    2014 (English). In: Euro-Par 2014: Parallel Processing Workshops, Part II, Springer, 2014, p. 450-461. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer, 2014
    Series
    Lecture Notes in Computer Science ; 8806
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-238380 (URN); 10.1007/978-3-319-14313-2_38 (DOI); 000354785000038 (); 978-3-319-14312-5 (ISBN)
    Conference
    7th Workshop on Unconventional High-Performance Computing
    Projects
    UPMARC, eSSENCE
    Available from: 2014-12-11; Created: 2014-12-11; Last updated: 2018-01-11; Bibliographically approved
    3. Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes
    2017 (English). In: Proc. 25th High Performance Computing Symposium, San Diego, CA: The Society for Modeling and Simulation International, 2017, p. 1-12. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    San Diego, CA: The Society for Modeling and Simulation International, 2017
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-320146 (URN); 978-1-5108-3822-2 (ISBN)
    Conference
    HPC 2017, April 23–26, Virginia Beach, VA
    Projects
    UPMARC
    Available from: 2017-04-23; Created: 2017-04-16; Last updated: 2018-01-13; Bibliographically approved
    4. Multigrid for matrix-free finite element computations on graphics processors
    2017 (English). Report (Other academic)
    Series
    Technical report / Department of Information Technology, Uppsala University, ISSN 1404-3203 ; 2017-006
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-320073 (URN)
    Projects
    UPMARC, eSSENCE
    Available from: 2017-04-20; Created: 2017-04-13; Last updated: 2018-01-13; Bibliographically approved
  • 2.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Matrix-free finite-element computations on graphics processors with adaptively refined unstructured meshes. 2017. In: Proc. 25th High Performance Computing Symposium, San Diego, CA: The Society for Modeling and Simulation International, 2017, p. 1-12. Conference paper (Refereed)
  • 3.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Matrix-free finite-element operator application on graphics processing units. 2014. In: Euro-Par 2014: Parallel Processing Workshops, Part II, Springer, 2014, p. 450-461. Conference paper (Refereed)
  • 4.
    Ljungkvist, Karl
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Techniques for finite element methods on modern processors. 2015. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    In this thesis, methods for efficient utilization of modern computer hardware for numerical simulation are considered. In particular, we study techniques for speeding up the execution of finite-element methods.

    One of the greatest challenges in finite-element computation is how to perform the system matrix assembly efficiently in parallel, due to its complicated memory access pattern. The main difficulty lies in the fact that many entries of the matrix are updated concurrently by several parallel threads. We consider transactional memory, an exotic hardware feature for concurrent updates of shared variables, and conduct benchmarks on a prototype processor supporting it. Our experiments show that transactions can both simplify programming and provide good performance for concurrent updates of floating-point data.

    Furthermore, we study a matrix-free approach to finite-element computation which avoids the matrix assembly. Motivated by its computational properties, we implement the matrix-free method for execution on graphics processors, using either atomic updates or a mesh coloring approach to handle the concurrent updates. A performance study shows that on the GPU, the matrix-free method is faster than a matrix-based implementation for many element types, and allows for solution of considerably larger problems. This suggests that the matrix-free method can speed up execution of large realistic simulations.
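The mesh coloring approach mentioned in the abstract above can be sketched as follows. This is a minimal greedy coloring, assuming each element is given as a tuple of node indices; the function names `color_elements` and `is_valid_coloring` are hypothetical and not taken from the thesis. Elements that share a node receive different colors, so all elements of one color can scatter their contributions in parallel without conflicting writes.

```python
def color_elements(elements):
    """Greedy coloring: elements that share a node get different colors.

    `elements` is a list of tuples of node indices. Returns one color
    (a non-negative integer) per element.
    """
    node_colors = {}  # node index -> colors already used by elements touching it
    colors = []
    for nodes in elements:
        # Collect every color already taken by a neighbouring element.
        used = set()
        for n in nodes:
            used |= node_colors.get(n, set())
        # Pick the smallest color not yet used around this element.
        c = 0
        while c in used:
            c += 1
        colors.append(c)
        for n in nodes:
            node_colors.setdefault(n, set()).add(c)
    return colors


def is_valid_coloring(elements, colors):
    """Check that no two elements of the same color share a node."""
    seen = set()  # (node, color) pairs already claimed by some element
    for nodes, c in zip(elements, colors):
        for n in nodes:
            if (n, c) in seen:
                return False
            seen.add((n, c))
    return True


if __name__ == "__main__":
    # Four 1D elements in a row: neighbours share one node.
    line_mesh = [(0, 1), (1, 2), (2, 3), (3, 4)]
    print(color_elements(line_mesh))

    # 2x2 quadrilateral mesh on a 3x3 node grid: all elements share node 4.
    quad_mesh = [(0, 1, 3, 4), (1, 2, 4, 5), (3, 4, 6, 7), (4, 5, 7, 8)]
    print(color_elements(quad_mesh))
```

A solver would then loop over the colors one after another and process the elements within each color concurrently. The alternative named in the abstract, atomic updates, skips the coloring step and instead makes each individual scatter-add atomic.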

    List of papers
    1. Using hardware transactional memory for high-performance computing
    2011 (English). In: Proc. 25th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, Piscataway, NJ: IEEE, 2011, p. 1660-1667. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Piscataway, NJ: IEEE, 2011
    National Category
    Computer Sciences
    Identifiers
    urn:nbn:se:uu:diva-158551 (URN); 10.1109/IPDPS.2011.322 (DOI); 978-1-61284-425-1 (ISBN)
    Conference
    IPDPS Workshop on Multi-Threaded Architectures and Applications
    Projects
    eSSENCE, UPMARC
    Available from: 2011-09-01; Created: 2011-09-10; Last updated: 2018-01-12; Bibliographically approved
    2. Matrix-free finite-element operator application on graphics processing units
    2014 (English). In: Euro-Par 2014: Parallel Processing Workshops, Part II, Springer, 2014, p. 450-461. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Springer, 2014
    Series
    Lecture Notes in Computer Science ; 8806
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-238380 (URN); 10.1007/978-3-319-14313-2_38 (DOI); 000354785000038 (); 978-3-319-14312-5 (ISBN)
    Conference
    7th Workshop on Unconventional High-Performance Computing
    Projects
    UPMARC, eSSENCE
    Available from: 2014-12-11; Created: 2014-12-11; Last updated: 2018-01-11; Bibliographically approved
  • 5.
    Ljungkvist, Karl
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Kronbichler, Martin
    Multigrid for matrix-free finite element computations on graphics processors. 2017. Report (Other academic)
  • 6.
    Ljungkvist, Karl
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Tillenius, Martin
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Black-Schaffer, David
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Karlsson, Martin
    Larsson, Elisabeth
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Using hardware transactional memory for high-performance computing. 2011. In: Proc. 25th International Symposium on Parallel and Distributed Processing Workshops and PhD Forum, Piscataway, NJ: IEEE, 2011, p. 1660-1667. Conference paper (Refereed)
  • 7.
    Ljungkvist, Karl
    et al.
    Tillenius, Martin
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Karlsson, Martin
    Larsson, Elisabeth
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
    Early results using hardware transactional memory for high-performance computing applications. 2010. In: Proc. 3rd Swedish Workshop on Multi-Core Computing, Göteborg, Sweden: Chalmers University of Technology, 2010, p. 93-97. Conference paper (Other academic)