TY - GEN

T1 - Atomic Power in Forks: A Super-Logarithmic Lower Bound for Implementing Butterfly Networks in the Nonatomic Binary Fork-Join Model

AU - Jacob, Riko

AU - Sitchinava, Nodari

AU - Goodrich, Michael T.

PY - 2021/1/14

Y1 - 2021/1/14

N2 - We prove an Ω (log n log log n) lower bound for the span of implementing the n input, log n-depth FFT circuit (also known as butterfly network) in the nonatomic binary fork-join model. In this model, memory-access synchronizations occur only through fork operations, which spawn two child threads, and join operations, which resume a parent thread when its child threads terminate. Our bound is asymptotically tight for the nonatomic binary fork-join model, which has been of interest of late, due to its conceptual elegance and ability to capture asynchrony. Our bound implies super-logarithmic lower bound in the nonatomic binary fork-join model for implementing the butterfly merging networks used, e.g., in Batcher's bitonic and odd-even mergesort networks. This lower bound also implies an asymptotic separation result for the atomic and nonatomic versions of the fork-join model, since, as we point out, FFT circuits can be implemented in the atomic binary fork-join model with span equal to their circuit depth.

AB - We prove an Ω (log n log log n) lower bound for the span of implementing the n input, log n-depth FFT circuit (also known as butterfly network) in the nonatomic binary fork-join model. In this model, memory-access synchronizations occur only through fork operations, which spawn two child threads, and join operations, which resume a parent thread when its child threads terminate. Our bound is asymptotically tight for the nonatomic binary fork-join model, which has been of interest of late, due to its conceptual elegance and ability to capture asynchrony. Our bound implies super-logarithmic lower bound in the nonatomic binary fork-join model for implementing the butterfly merging networks used, e.g., in Batcher's bitonic and odd-even mergesort networks. This lower bound also implies an asymptotic separation result for the atomic and nonatomic versions of the fork-join model, since, as we point out, FFT circuits can be implemented in the atomic binary fork-join model with span equal to their circuit depth.

KW - lower bound, parallel computation, fork-join model

U2 - https://doi.org/10.1137/1.9781611976465.128

DO - https://doi.org/10.1137/1.9781611976465.128

M3 - Article in proceedings

SP - 2141

EP - 2153

BT - Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA)

PB - Society for Industrial and Applied Mathematics

ER -