uu.se: Publications from Uppsala University
Geographical locality and dynamic data migration for OpenMP implementations of adaptive PDE solvers
Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Numerical Analysis. (Software Aspects of High-Performance Computing)
Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Numerical Analysis. (Software Aspects of High-Performance Computing)
Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Applied Computational Science.
Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Section for Mathematics and Computer Science, Department of Information Technology, Applied Computational Science.
2008 (English). In: OpenMP Shared Memory Parallel Programming, Berlin: Springer-Verlag, 2008, pp. 382-393. Conference paper, published paper (refereed)
Place, publisher, year, edition, pages
Berlin: Springer-Verlag, 2008. pp. 382-393
Series
Lecture Notes in Computer Science ; 4315
National subject category
Computer Science; Computational Mathematics
Identifiers
URN: urn:nbn:se:uu:diva-17844; DOI: 10.1007/978-3-540-68555-5_31; ISI: 000256573200031; ISBN: 978-3-540-68554-8 (print); OAI: oai:DiVA.org:uu-17844; DiVA, id: diva2:45615
Project
UPMARC
Available from: 2008-09-05. Created: 2008-09-05. Last updated: 2022-01-28. Bibliographically approved
Included in thesis
1. Iterative and Adaptive PDE Solvers for Shared Memory Architectures
2006 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title [sv]
Iterativa och adaptiva PDE-lösare för parallelldatorer med gemensam minnesorganisation
Abstract [en]

Scientific computing is used frequently in an increasing number of disciplines to accelerate scientific discovery. Many such computing problems involve the numerical solution of partial differential equations (PDE). In this thesis we explore and develop methodology for high-performance implementations of PDE solvers for shared-memory multiprocessor architectures.

We consider three realistic PDE settings: solution of the Maxwell equations in 3D using an unstructured grid and the method of conjugate gradients, solution of the Poisson equation in 3D using a geometric multigrid method, and solution of an advection equation in 2D using structured adaptive mesh refinement. We apply software optimization techniques to increase both parallel efficiency and the degree of data locality.

In our evaluation we use several different shared-memory architectures ranging from symmetric multiprocessors and distributed shared-memory architectures to chip-multiprocessors. For distributed shared-memory systems we explore methods of data distribution to increase the amount of geographical locality. We evaluate automatic and transparent page migration based on runtime sampling, user-initiated page migration using a directive with an affinity-on-next-touch semantic, and algorithmic optimizations for page-placement policies.

Our results show that page migration increases the amount of geographical locality and that the parallel overhead related to page migration can be amortized over the iterations needed to reach convergence. This is especially true for the affinity-on-next-touch methodology whereby page migration can be initiated at an early stage in the algorithms.
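The amortization argument can be illustrated with a toy cost model. This is only a sketch and not the model used in the thesis: the function names `total_time` and `break_even_iterations` and all cost figures are hypothetical, and it assumes a single up-front migration cost followed by iterations of constant cost.

```python
import math


def total_time(iterations, iter_cost, one_time_cost=0.0):
    """Total run time: a one-time overhead plus a constant cost per iteration."""
    return one_time_cost + iterations * iter_cost


def break_even_iterations(migration_cost, remote_iter_cost, local_iter_cost):
    """Smallest iteration count at which migrating is no slower than not migrating.

    Returns None when migration saves nothing per iteration, so the
    overhead can never be recovered.
    """
    saving = remote_iter_cost - local_iter_cost
    if saving <= 0:
        return None
    return math.ceil(migration_cost / saving)


if __name__ == "__main__":
    # Hypothetical figures in arbitrary time units.
    n = break_even_iterations(migration_cost=10.0,
                              remote_iter_cost=1.5,
                              local_iter_cost=1.0)
    print("break-even after", n, "iterations")
    print("100 iterations, no migration:", total_time(100, 1.5))
    print("100 iterations, with migration:", total_time(100, 1.0, 10.0))
```

Under these assumptions, the earlier migration is triggered, the more iterations remain over which to recover its cost, which is consistent with the advantage the abstract reports for affinity-on-next-touch.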

We also develop and explore methodology for other forms of data locality and conclude that the effect on performance is significant and that this effect will increase for future shared-memory architectures. Our overall conclusion is that, if the involved locality issues are addressed, the shared-memory programming model provides an efficient and productive environment for solving many important PDE problems.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2006. p. 49
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 218
Keywords
partial differential equations, iterative methods, finite elements, conjugate gradients, adaptive mesh refinement, multigrid, cc-NUMA, distributed shared memory, OpenMP, page migration, TLB shoot-down, bandwidth minimization, reverse Cuthill-McKee, migrate-on-next-touch, affinity, temporal locality, chip multiprocessors, CMP
National subject category
Software Engineering; Computational Mathematics
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-7136 (URN); 91-554-6648-6 (ISBN)
Public defence
2006-10-07, Auditorium Minus, Museum Gustavianum, Akademigatan 3, Uppsala, 13:15 (English)
Opponent
Supervisors
Available from: 2006-09-15. Created: 2006-09-15. Last updated: 2022-03-11. Bibliographically approved
2. Multithreaded PDE Solvers on Non-Uniform Memory Architectures
2006 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

A trend in parallel computer architecture is that systems with a large shared memory are becoming more and more popular. A shared memory system can be either a uniform memory architecture (UMA) or a cache coherent non-uniform memory architecture (cc-NUMA).

In the present thesis, the performance of parallel PDE solvers on cc-NUMA computers is studied. In particular, we consider the shared namespace programming model, represented by OpenMP. Since the main memory is physically, or geographically, distributed over several multi-processor nodes, the latency of local memory accesses is lower than that of remote accesses. Therefore, the geographical locality of the data becomes important.

The focus of the present thesis is to study multithreaded PDE solvers on cc-NUMA systems, in particular their memory access pattern with respect to geographical locality. The questions posed are: (1) How large is the influence on performance of the non-uniformity of the memory system? (2) How should a program be written in order to reduce this influence? (3) Is it possible to introduce optimizations in the computer system for this purpose?

The main conclusion is that geographical locality is important for performance on cc-NUMA systems. This is shown experimentally for a broad range of PDE solvers as well as theoretically using a model involving characteristics of computer systems and applications.

Geographical locality can be achieved through migration directives that are inserted by the programmer or, possibly in the future, automatically by the compiler. On some systems, it can also be accomplished by means of transparent, hardware-initiated migration and replication. However, a necessary condition for migration to be effective is that the memory access pattern must not be "speckled", i.e. as few threads as possible should access each memory page.
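The condition on the access pattern can be made concrete by counting how many distinct threads touch each memory page under two ways of dividing loop iterations among threads. This is an illustrative sketch rather than code from the thesis: the helper names are hypothetical, and it assumes a one-dimensional array of 8-byte elements on 4 KiB pages (512 elements per page), with each element accessed only by the thread that owns its index.

```python
def block_owner(i, n_elems, n_threads):
    """Owner of index i when each thread gets one contiguous chunk."""
    chunk = -(-n_elems // n_threads)  # ceiling division
    return i // chunk


def cyclic_owner(i, n_elems, n_threads):
    """Owner of index i when indices are dealt out round-robin."""
    return i % n_threads


def threads_per_page(n_elems, elems_per_page, n_threads, owner):
    """Return, for each page in order, the number of distinct threads touching it."""
    pages = {}
    for i in range(n_elems):
        page = i // elems_per_page
        pages.setdefault(page, set()).add(owner(i, n_elems, n_threads))
    return [len(pages[p]) for p in sorted(pages)]


if __name__ == "__main__":
    # Assumed sizes: 4096 elements, 512 elements per page, 4 threads.
    print("block: ", threads_per_page(4096, 512, 4, block_owner))
    print("cyclic:", threads_per_page(4096, 512, 4, cyclic_owner))
```

With the contiguous distribution every page is touched by a single thread, so each page has an obvious home node. With the round-robin distribution every page is touched by all threads, which is the kind of pattern for which no placement of the page can make all of its accesses local.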

We also conclude that OpenMP is competitive with MPI on cc-NUMA systems if care is taken to get a favourable data distribution.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2006. p. 33
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 224
Keywords
PDE solver, high-performance, NUMA, UMA, OpenMP, MPI, data migration, data replication, thread scheduling, data affinity
National subject category
Software Engineering
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-7149 (URN); 91-554-6656-7 (ISBN)
Public defence
2006-10-20, Room 2446, Polacksbacken, Lägerhyddsvägen 2D, Uppsala, 10:15 (English)
Opponent
Supervisors
Available from: 2006-09-28. Created: 2006-09-28. Last updated: 2022-03-11. Bibliographically approved

By author/editor
Nordén, Markus; Rantakokko, Jarmo; Holmgren, Sverker