Publications from Uppsala University (uu.se)
1 - 7 of 7
  • 1.
    Holmgren, Sverker
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Nordén, Markus
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Rantakokko, Jarmo
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Wallin, Dan
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Performance of PDE solvers on a self-optimizing NUMA architecture. 2002. In: Parallel Algorithms and Applications, ISSN 1063-7192, E-ISSN 1029-032X, Vol. 17, p. 285-299. Article in journal (Refereed)
  • 2.
    Holmgren, Sverker
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Wallin, Dan
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Performance of high-accuracy PDE solvers on a self-optimizing NUMA architecture. 2001. In: Euro-Par 2001: Parallel Processing, Berlin: Springer-Verlag, 2001, p. 602-610. Conference paper (Refereed)
  • 3.
    Johansson, Henrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Wallin, Dan
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Analyzing advanced PDE solvers through simulation. 2006. In: Applied Parallel Computing: State of the Art in Scientific Computing, Berlin: Springer-Verlag, 2006, p. 893-900. Conference paper (Refereed)
  • 4.
    Wallin, Dan
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Computer Systems. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Exploiting data locality in adaptive architectures. 2003. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    Processor speed is increasing much faster than memory access time is improving, which makes memory accesses expensive. To address this problem, cache hierarchies are introduced to serve the processor with data. However, the effectiveness of caches depends on the amount of locality in the application's memory access pattern. Programs differ greatly in their cache miss characteristics, access patterns and communication intensity. A computer built for many different computational tasks can therefore benefit from dynamically adapting to the varying needs of its applications.

    This thesis shows that a cc-NUMA multiprocessor with data migration and replication optimizations efficiently exploits the temporal locality of algorithms. The performance of the self-optimizing system is similar to a system with a perfect initial thread and data placement.

    Data locality optimizations do not come for free. Coherence protocols with large cache lines improve spatial locality but increase false sharing misses for many applications. Prefetching techniques that reduce cache misses often lead to increased address and data traffic. Several techniques introduced in this thesis efficiently avoid these drawbacks. The bundling technique reduces the coherence traffic in multiprocessor prefetchers. This is especially important in snoop-based systems, where coherence bandwidth is a scarce resource. Bundled prefetchers reduce both the cache miss rate and the coherence traffic compared with non-prefetching protocols. Averaged over all examined applications, the most efficient bundled prefetching protocol studied lowers cache misses by 27 percent and address snoops by 24 percent relative to a non-prefetching protocol. Another proposed technique, capacity prefetching, avoids false sharing misses by distinguishing at run-time between cache lines involved in communication and non-communicating cache lines.

  • 5.
    Wallin, Dan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Computer Engineering (Datorteknik).
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems. Computer Engineering (Datorteknik).
    Bundling: Reducing the Overhead of Multiprocessor Prefetchers. 2004. In: 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004. Conference paper (Refereed)
    Abstract [en]

    Prefetching has proven to be a useful technique for reducing cache misses in multiprocessors, at the cost of increased coherence traffic. This is especially troublesome for snoop-based systems, where the available coherence bandwidth is often the scalability bottleneck. The bundling technique presented in this paper reduces the overhead caused by prefetching in two ways: piggybacking prefetches with normal requests, and requiring only one device to perform the snoop lookup for each prefetch transaction. This can reduce both the address bandwidth and the number of snoop lookups compared with a non-prefetching system. We describe bundling implementations for two important transaction types: reads and upgrades. While bundling could reduce the overhead of most existing prefetch schemes, the evaluation of bundling performed in this paper has been limited to two of them: sequential prefetching and Dahlgren's adaptive sequential prefetching. Both schemes have their snoop bandwidth halved for all commercial and scientific benchmarks in the study. The combined effect of bundling applied to these prefetch schemes lowers the cache miss rate, the address bandwidth and the snoop bandwidth, compared with a system with no prefetching, for all applications. Bundling will not reduce the data bandwidth introduced by a prefetch scheme. However, we argue that the data bandwidth is more easily scaled than the snoop bandwidth for snoop-based coherence systems.

  • 6.
    Wallin, Dan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Johansson, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Cache memory behavior of advanced PDE solvers. 2004. In: Parallel Computing: Software Technology, Algorithms, Architectures and Applications, Amsterdam, The Netherlands: Elsevier, 2004, p. 475-482. Conference paper (Refereed)
  • 7.
    Wallin, Dan
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Löf, Henrik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Hagersten, Erik
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computer Systems.
    Holmgren, Sverker
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Numerical Analysis.
    Multigrid and Gauss-Seidel smoothers revisited: Parallelization on chip multiprocessors. 2006. In: Proc. 20th ACM International Conference on Supercomputing, New York: ACM Press, 2006, p. 145-155. Conference paper (Refereed)