Logo: to the web site of Uppsala University

uu.sePublications from Uppsala University
Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Parallelizing the Method of Conjugate Gradients for Shared Memory Architectures
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis. (Software Aspects of High-Performance Computing)
2004 (English)Licentiate thesis, comprehensive summary (Other academic)
Abstract [en]

Solving Partial Differential Equations (PDEs) is an important problem in many fields of science and engineering. For most real-world problems modeled by PDEs, we can only approximate the solution using numerical methods. Many of these numerical methods result in very large systems of linear equations. A common way of solving these systems is to use an iterative solver such as the method of conjugate gradients. Furthermore, due to the size of these systems we often need parallel computers to be able to solve them in a reasonable amount of time.

Shared memory architectures represent a class of parallel computer systems commonly used both in commercial applications and in scientific computing. To be able to provide cost-efficient computing solutions, shared memory architectures come in a large variety of configurations and sizes. From a programming point of view, we do not want to spend a lot of effort optimizing an application for a specific computer architecture. We want to find methods and principles of optimizing our programs that are generally applicable to a large class of architectures.

In this thesis, we investigate how to implement the method of conjugate gradients efficiently on shared memory architectures. We seek algorithmic optimizations that result in efficient programs for a variety of architectures. To study this problem, we have implemented the method of conjugate gradients using OpenMP and we have measured the runtime performance of this solver on a variety of both uniform and non-uniform shared memory architectures. The input data used in the experiments come from a Finite-Element discretization of the Maxwell equations in three dimensions of a fighter-jet geometry.

Our results show that, for all architectures studied, optimizations targeting the memory hierarchy exhibited the largest performance increase. Improving the load balance, by balancing the arithmetical work and minimizing the number of global barriers showed to be of lesser importance. Overall, bandwidth minimization of the iteration matrix showed to be the most efficient optimization.

On non-uniform architectures, proper data distribution showed to be very important. In our experiments we used page migration to improve the data distribution during runtime. Our results indicate that page migration can be very efficient if we can keep the migration cost low. Furthermore, we believe that page migration can be introduced in a portable way into OpenMP in the form of a directive with a affinity-on-next-touch semantic.

Place, publisher, year, edition, pages
Uppsala University, 2004.
Series
Information technology licentiate theses: Licentiate theses from the Department of Information Technology, ISSN 1404-5117 ; 2004-005
National Category
Software Engineering Computational Mathematics
Research subject
Scientific Computing
Identifiers
URN: urn:nbn:se:uu:diva-86295OAI: oai:DiVA.org:uu-86295DiVA, id: diva2:117104
Supervisors
Available from: 2004-11-19 Created: 2006-05-15 Last updated: 2018-01-13Bibliographically approved

Open Access in DiVA

fulltext(885 kB)3368 downloads
File information
File name FULLTEXT01.pdfFile size 885 kBChecksum SHA-512
adaf7a24bfb169ce638936d1735e4f21b412b732c6e65f98961187d8c0779c5ad1439700ff903c7b23394409aae1821b40eb2399692667a57306647ac0935a3d
Type fulltextMimetype application/pdf

Authority records

Löf, Henrik

Search in DiVA

By author/editor
Löf, Henrik
By organisation
Division of Scientific ComputingNumerical Analysis
Software EngineeringComputational Mathematics

Search outside of DiVA

GoogleGoogle Scholar
Total: 3384 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 3279 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf