Shared-memory parallelism can be simple, fast, and scalable
Title:
Shared-memory parallelism can be simple, fast, and scalable
Author:
Shun, Julian, author.
ISBN:
9781970001891
Edition:
First edition.
Physical Description:
1 PDF (xv, 426 pages) : illustrations.
Series:
ACM Books ; #15.
Contents:
1. Introduction -- 1.1 Shared-memory programming -- 1.2 Shared-memory algorithm design -- 1.3 Shared-memory performance -- 1.4 The problem based benchmark suite -- 1.5 Contributions of this book --

2. Preliminaries and notation -- 2.1 Parallel programming model -- 2.2 Algorithmic complexity model -- 2.3 Parallel primitives -- 2.4 Graphs -- 2.5 Strings -- 2.6 Problem definitions -- 2.7 Experimental environment -- Part I. Programming techniques for deterministic parallelism --

3. Internally deterministic parallelism: techniques and algorithms -- 3.1 Introduction -- 3.2 Programming model -- 3.3 Commutative building blocks -- 3.4 Internally deterministic parallel algorithms -- 3.5 Experimental results --

4. Deterministic parallelism in sequential iterative algorithms -- 4.1 Introduction -- 4.2 Analysis tools -- 4.3 Algorithmic design techniques -- 4.4 Maximal independent set -- 4.5 Maximal matching -- 4.6 Random permutation -- 4.7 List contraction -- 4.8 Tree contraction -- 4.9 Limited randomness -- 4.10 Experiments --

5. A deterministic phase-concurrent parallel hash table -- 5.1 Introduction -- 5.2 Related work -- 5.3 Preliminaries -- 5.4 Deterministic phase-concurrent hash table -- 5.5 Applications -- 5.6 Experiments --

6. Priority updates: a contention-reducing primitive for deterministic programming -- 6.1 Introduction -- 6.2 Priority updates -- 6.3 Contention in shared memory operations -- 6.4 Applications of priority update -- 6.5 Experimental study: applications --

Part II. Large-scale shared-memory graph analytics --

7. Ligra: a lightweight graph processing framework for shared memory -- 7.1 Introduction -- 7.2 Related work -- 7.3 Framework -- 7.4 Applications -- 7.5 Experiments --

8. Ligra+: adding compression to Ligra -- 8.1 Introduction -- 8.2 Previous work -- 8.3 Ligra+ implementation -- 8.4 Experiments --

Part III. Parallel graph algorithms --

9. Linear-work parallel graph connectivity -- 9.1 Introduction -- 9.2 Linear-work low-diameter decomposition -- 9.3 Linear-work connectivity -- 9.4 Implementation details -- 9.5 Experiments --

10. Parallel and cache-oblivious triangle computations -- 10.1 Introduction -- 10.2 Preliminaries -- 10.3 Triangle counting -- 10.4 Exact triangle counting -- 10.5 Approximate triangle counting -- 10.6 Extensions -- 10.7 Evaluation -- 10.8 Parallelization of the Pagh-Silvestri algorithm -- 10.9 Prior and related work --

Part IV. Parallel string algorithms --

11. Parallel Cartesian tree and suffix tree construction -- 11.1 Introduction -- 11.2 Preliminaries -- 11.3 Parallel Cartesian trees -- 11.4 Cartesian trees and the ANSV problem -- 11.5 Experiments --

12. Parallel computation of longest common prefixes -- 12.1 Introduction -- 12.2 Preliminaries -- 12.3 Algorithms and analysis -- 12.4 Experiments --

13. Parallel Lempel-Ziv factorization -- 13.1 Introduction -- 13.2 Preliminaries -- 13.3 Parallel Lempel-Ziv factorization algorithm -- 13.4 Implementations -- 13.5 Experiments --

14. Parallel wavelet tree construction -- 14.1 Introduction -- 14.2 Preliminaries -- 14.3 Related work -- 14.4 Parallel wavelet tree construction -- 14.5 Experiments -- 14.6 Parallel construction of rank/select structures -- 14.7 Extensions --

15. Conclusion and future work -- 15.1 Summary -- 15.2 Future work --

References -- Index -- Author's biography.
Abstract:
Parallelism is the key to achieving high performance in computing. However, writing efficient and scalable parallel programs is notoriously difficult, and often requires significant expertise. To address this challenge, it is crucial to provide programmers with high-level tools to enable them to develop solutions easily, and at the same time emphasize the theoretical and practical aspects of algorithm design to allow the solutions developed to run efficiently under many different settings. This book, a revised version of the thesis that won the 2015 ACM Doctoral Dissertation Award, addresses this challenge using a three-pronged approach consisting of the design of shared-memory programming techniques, frameworks, and algorithms for important problems in computing. It provides evidence that with appropriate programming techniques, frameworks, and algorithms, shared-memory programs can be simple, fast, and scalable, both in theory and in practice. The results serve to ease the transition into the multicore era. The book starts by introducing tools and techniques for deterministic parallel programming, including means for encapsulating nondeterminism via powerful commutative building blocks, as well as a novel framework for executing sequential iterative loops in parallel, which lead to deterministic parallel algorithms that are efficient both in theory and in practice. The book then introduces Ligra, the first high-level shared-memory framework for parallel graph traversal algorithms. The framework enables short and concise implementations that deliver performance competitive with that of highly optimized code and up to orders of magnitude faster than previous systems designed for distributed memory. Finally, the book bridges the gap between theory and practice in parallel algorithm design by introducing the first algorithms for a variety of important problems on graphs and strings that are both practical and theoretically efficient.
Electronic Access:
Abstract with links to full text http://dx.doi.org/10.1145/3018787
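Illustrative Example:
The abstract highlights contention-reducing primitives for deterministic programming, and Chapter 6 of the contents covers the priority update operation. Below is a minimal C++ sketch of that idea, not code from the book: a compare-and-swap loop writes a value to a shared location only if it has higher priority (here, is smaller), so losing writers mostly read and give up, which keeps memory contention low and makes the winning value deterministic. The name priority_update and the demo harness are illustrative assumptions, not the book's API.

// Minimal sketch of a priority-update primitive (illustrative, not from the book).
#include <atomic>
#include <cstdio>
#include <thread>
#include <vector>

template <typename T>
void priority_update(std::atomic<T>& loc, T val) {
    T cur = loc.load();
    // Retry only while val still beats the stored value; if another thread
    // installs an equal or better value, stop without writing anything.
    while (val < cur && !loc.compare_exchange_weak(cur, val)) {
        // On failure, compare_exchange_weak refreshes cur; the loop re-tests.
    }
}

int main() {
    std::atomic<int> best{1 << 30};
    std::vector<std::thread> workers;
    // Eight threads race to publish their id; regardless of the schedule,
    // the smallest id (1) deterministically wins.
    for (int id = 1; id <= 8; ++id)
        workers.emplace_back([&best, id] { priority_update(best, id); });
    for (auto& t : workers) t.join();
    std::printf("winning value: %d\n", best.load());
    return 0;
}

Compiled with any C++11 (or later) compiler and the platform thread library (e.g., g++ -std=c++11 -pthread), the program prints 1 on every run, illustrating how such a primitive yields deterministic results under concurrency.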