Publications from Uppsala University (uu.se)
1 - 12 of 12
  • 1.
    Löf, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Iterative and Adaptive PDE Solvers for Shared Memory Architectures. 2006. Doctoral thesis, comprehensive summary (Other academic)
    Abstract [en]

    Scientific computing is used frequently in an increasing number of disciplines to accelerate scientific discovery. Many such computing problems involve the numerical solution of partial differential equations (PDE). In this thesis we explore and develop methodology for high-performance implementations of PDE solvers for shared-memory multiprocessor architectures.

    We consider three realistic PDE settings: solution of the Maxwell equations in 3D using an unstructured grid and the method of conjugate gradients, solution of the Poisson equation in 3D using a geometric multigrid method, and solution of an advection equation in 2D using structured adaptive mesh refinement. We apply software optimization techniques to increase both parallel efficiency and the degree of data locality.

    In our evaluation we use several different shared-memory architectures ranging from symmetric multiprocessors and distributed shared-memory architectures to chip-multiprocessors. For distributed shared-memory systems we explore methods of data distribution to increase the amount of geographical locality. We evaluate automatic and transparent page migration based on runtime sampling, user-initiated page migration using a directive with an affinity-on-next-touch semantic, and algorithmic optimizations for page-placement policies.

    Our results show that page migration increases the amount of geographical locality and that the parallel overhead related to page migration can be amortized over the iterations needed to reach convergence. This is especially true for the affinity-on-next-touch methodology whereby page migration can be initiated at an early stage in the algorithms.
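
    The amortization argument above can be illustrated with a simple cost model. This is only a sketch under stated assumptions: the one-time migration cost and the per-iteration times are hypothetical inputs, not measurements from the thesis, and the function names are invented for illustration.

```python
import math


def break_even_iterations(migration_cost_s, remote_iter_s, local_iter_s):
    """Smallest iteration count at which a one-time page migration pays off.

    All inputs are hypothetical timings in seconds (assumed, not measured):
    the one-time migration overhead, the time per solver iteration with
    poorly placed pages, and the time per iteration after migration.
    """
    saving = remote_iter_s - local_iter_s
    if saving <= 0:
        return None  # under this model, migration never pays off
    return math.ceil(migration_cost_s / saving)


def total_time(iterations, iter_s, one_time_cost_s=0.0):
    """Total runtime: an optional one-time cost plus a fixed cost per iteration."""
    return one_time_cost_s + iterations * iter_s
```

    Under this model, the earlier the migration is triggered, the more iterations remain over which its cost is spread, which is consistent with the remark about initiating migration at an early stage.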

    We also develop and explore methodology for other forms of data locality and conclude that the effect on performance is significant and that this effect will increase for future shared-memory architectures. Our overall conclusion is that, if the involved locality issues are addressed, the shared-memory programming model provides an efficient and productive environment for solving many important PDE problems.

    List of papers
    1. Improving Geographical Locality of Data for Shared Memory Implementations of PDE Solvers
    2004 (English). In: Computational Science – ICCS 2004, Berlin: Springer-Verlag, 2004, p. 9-16. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Berlin: Springer-Verlag, 2004
    Series
    Lecture Notes in Computer Science ; 3037
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-71098 (URN); 10.1007/b97988 (DOI)
    Available from: 2007-03-11. Created: 2007-03-11. Last updated: 2018-01-10. Bibliographically approved
    2. affinity-on-next-touch: Increasing the Performance of an Industrial PDE Solver on a cc-NUMA System
    2005 (English). In: Proc. 19th ACM International Conference on Supercomputing, New York: ACM Press, 2005, p. 387-392. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    New York: ACM Press, 2005
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-80041 (URN); 10.1145/1088149.1088201 (DOI); 1-59593-167-8 (ISBN)
    Available from: 2006-05-19. Created: 2009-01-19. Last updated: 2018-01-13. Bibliographically approved
    3. Algorithmic optimizations of a conjugate gradient solver on shared memory architectures
    2006 (English). In: International Journal of Parallel, Emergent and Distributed Systems, ISSN 1744-5760, E-ISSN 1744-5779, Vol. 21, p. 345-363. Article in journal (Refereed). Published
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-80937 (URN); 10.1080/17445760600568139 (DOI)
    Available from: 2006-06-29. Created: 2006-06-29. Last updated: 2018-01-13. Bibliographically approved
    4. Multigrid and Gauss-Seidel smoothers revisited: Parallelization on chip multiprocessors
    2006 (English). In: Proc. 20th ACM International Conference on Supercomputing, New York: ACM Press, 2006, p. 145-155. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    New York: ACM Press, 2006
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-19810 (URN); 10.1145/1183401.1183423 (DOI); 1-59593-282-8 (ISBN)
    Available from: 2008-02-08. Created: 2008-02-08. Last updated: 2018-01-12. Bibliographically approved
    5. Geographical locality and dynamic data migration for OpenMP implementations of adaptive PDE solvers
    2008 (English). In: OpenMP Shared Memory Parallel Programming, Berlin: Springer-Verlag, 2008, p. 382-393. Conference paper, Published paper (Refereed)
    Place, publisher, year, edition, pages
    Berlin: Springer-Verlag, 2008
    Series
    Lecture Notes in Computer Science ; 4315
    National Category
    Computer Sciences Computational Mathematics
    Identifiers
    urn:nbn:se:uu:diva-17844 (URN); 10.1007/978-3-540-68555-5_31 (DOI); 000256573200031 (); 978-3-540-68554-8 (ISBN)
    Projects
    UPMARC
    Available from: 2008-09-05. Created: 2008-09-05. Last updated: 2022-01-28. Bibliographically approved
  • 2.
    Löf, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Parallelizing the Method of Conjugate Gradients for Shared Memory Architectures. 2004. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Solving Partial Differential Equations (PDEs) is an important problem in many fields of science and engineering. For most real-world problems modeled by PDEs, we can only approximate the solution using numerical methods. Many of these numerical methods result in very large systems of linear equations. A common way of solving these systems is to use an iterative solver such as the method of conjugate gradients. Furthermore, due to the size of these systems we often need parallel computers to be able to solve them in a reasonable amount of time.

    Shared memory architectures represent a class of parallel computer systems commonly used both in commercial applications and in scientific computing. To be able to provide cost-efficient computing solutions, shared memory architectures come in a large variety of configurations and sizes. From a programming point of view, we do not want to spend a lot of effort optimizing an application for a specific computer architecture. We want to find methods and principles of optimizing our programs that are generally applicable to a large class of architectures.

    In this thesis, we investigate how to implement the method of conjugate gradients efficiently on shared memory architectures. We seek algorithmic optimizations that result in efficient programs for a variety of architectures. To study this problem, we have implemented the method of conjugate gradients using OpenMP and we have measured the runtime performance of this solver on a variety of both uniform and non-uniform shared memory architectures. The input data used in the experiments come from a Finite-Element discretization of the Maxwell equations in three dimensions of a fighter-jet geometry.
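
    For orientation, the textbook form of the method of conjugate gradients can be sketched as follows. This is a minimal serial sketch on a small dense matrix, not the OpenMP solver described in the thesis; the helper names `dot`, `matvec`, and `conjugate_gradient` are chosen here for illustration. In a shared-memory implementation, the matrix-vector product and the dot products are the loops one would typically parallelize.

```python
def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    """Dense matrix-vector product; A is a list of rows."""
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A, starting from x = 0."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, which equals b for x = 0
    p = list(r)          # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x
```

    For example, `conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])` approximates the exact solution (1/11, 7/11).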

    Our results show that, for all architectures studied, optimizations targeting the memory hierarchy yielded the largest performance increase. Improving the load balance by balancing the arithmetic work and minimizing the number of global barriers proved to be of lesser importance. Overall, bandwidth minimization of the iteration matrix proved to be the most effective optimization.
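
    To make the notion of matrix bandwidth concrete, the sketch below measures the bandwidth of a symmetric sparsity pattern and reduces it by renumbering the unknowns. The abstract does not say which reordering algorithm was used; reverse Cuthill-McKee is swapped in here as a common bandwidth-reducing heuristic, and the function names and the edge-list representation are assumptions made for illustration.

```python
from collections import deque


def bandwidth(edges, order):
    """Largest index distance between coupled unknowns under a given ordering."""
    pos = {node: i for i, node in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u, v in edges), default=0)


def reverse_cuthill_mckee(n, edges):
    """Return a node ordering from a simple reverse Cuthill-McKee pass.

    `edges` lists the off-diagonal nonzeros (u, v) of a symmetric n x n matrix.
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    degree = [len(a) for a in adj]
    visited = [False] * n
    order = []
    # Start each component from a node of lowest degree (ties broken by index).
    for start in sorted(range(n), key=lambda v: (degree[v], v)):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            # Breadth-first search, visiting low-degree neighbours first.
            for v in sorted(adj[u], key=lambda w: (degree[w], w)):
                if not visited[v]:
                    visited[v] = True
                    queue.append(v)
    return order[::-1]
```

    A smaller bandwidth keeps the vector entries touched by each matrix row close together in memory, which is one way such a reordering can improve data locality in the matrix-vector product.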

    On non-uniform architectures, proper data distribution proved to be very important. In our experiments we used page migration to improve the data distribution during runtime. Our results indicate that page migration can be very efficient if we can keep the migration cost low. Furthermore, we believe that page migration can be introduced in a portable way into OpenMP in the form of a directive with an affinity-on-next-touch semantic.
  • 3.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    affinity-on-next-touch: Increasing the Performance of an Industrial PDE Solver on a cc-NUMA System. 2005. In: Proc. 19th ACM International Conference on Supercomputing, New York: ACM Press, 2005, p. 387-392. Conference paper (Refereed)
  • 4.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Nordén, Markus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Improving Geographical Locality of Data for Shared Memory Implementations of PDE Solvers. 2004. In: Computational Science – ICCS 2004, Berlin: Springer-Verlag, 2004, p. 9-16. Conference paper (Refereed)
  • 5.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Nordén, Markus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Improving geographical locality of data for shared memory implementations of PDE solvers. 2004. Report (Other academic)
  • 6.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing.
    Radović, Zoran
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    THROOM: Running POSIX Multithreaded Binaries on a Cluster. 2003. Report (Other academic)
  • 7.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing.
    Radović, Zoran
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    THROOM — Supporting POSIX Multithreaded Binaries on a Cluster. 2003. In: Euro-Par 2003: Parallel Processing, Berlin: Springer-Verlag, 2003, p. 760-769. Conference paper (Refereed)
  • 8.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Rantakokko, Jarmo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Algorithmic Optimizations of a Conjugate Gradient Solver on Shared Memory Architectures. 2004. Report (Other academic)
  • 9.
    Löf, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Rantakokko, Jarmo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Algorithmic optimizations of a conjugate gradient solver on shared memory architectures. 2006. In: International Journal of Parallel, Emergent and Distributed Systems, ISSN 1744-5760, E-ISSN 1744-5779, Vol. 21, p. 345-363. Article in journal (Refereed)
  • 10.
    Nordén, Markus
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Löf, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Rantakokko, Jarmo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Geographical locality and dynamic data migration for OpenMP implementations of adaptive PDE solvers. 2006. Report (Other academic)
  • 11.
    Wallin, Dan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Löf, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Multigrid and Gauss-Seidel smoothers revisited: Parallelization on chip multiprocessors. 2006. In: Proc. 20th ACM International Conference on Supercomputing, New York: ACM Press, 2006, p. 145-155. Conference paper (Refereed)
  • 12.
    Wallin, Dan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Löf, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Multigrid and Gauss-Seidel smoothers revisited: Parallelization on chip multiprocessors. 2006. Report (Other academic)