Significance Statement
Despite the potential of high-performance computing for simulation and modeling with today's technology, petascale computing platforms remain challenging for engineers and researchers to exploit for their problems. Even a critical algorithm such as sparse matrix factorization, central to many problems in science, engineering, and optimization, has struggled to keep pace with the increasing computational power of high-performance computing systems.
Dr. Seid Koric from the National Center for Supercomputing Applications at the University of Illinois and Dr. Anshul Gupta from the IBM T.J. Watson Research Center demonstrated that a multifrontal sparse factorization algorithm with hybrid parallelization, such as the one in the Watson Sparse Matrix Package (WSMP), can scale efficiently on today's large-scale supercomputers. The hardware used in the study is Blue Waters, the sustained-petascale system hosted at the University of Illinois' National Center for Supercomputing Applications. Blue Waters is considered one of the most powerful supercomputers in the world and is funded by the National Science Foundation (NSF) and the state of Illinois. The challenging large sparse matrices used in this study, with tens of millions of unknowns, come from Dr. Koric's multiphysics research with implicit finite element methods and are often unsolvable by iterative solvers. The study is published in Computer Methods in Applied Mechanics and Engineering.
Large sparse linear systems of equations are generally solved by either direct or iterative methods. Previous research has shown that direct methods provide faster solutions than iterative methods for certain classes of linear systems, and that they possess sufficient scalability and robustness to tackle very large, ill-conditioned finite element systems on many thousands of processor cores.
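As a simple illustration of the two categories, the sketch below factors a small, hypothetical sparse symmetric positive-definite test matrix directly and also solves the same system iteratively with conjugate gradients. It uses SciPy's generic solvers purely for illustration; it is not the WSMP solver evaluated in the study.

```python
# Minimal sketch contrasting a direct sparse solve (LU factorization) with an
# iterative solve (conjugate gradients). The test matrix is hypothetical.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import splu, cg

n = 10_000
# A well-conditioned sparse SPD test matrix (a shifted 1-D Laplacian).
A = diags([-1.0, 2.1, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Direct method: compute the LU factorization once, then solve by substitution.
lu = splu(A)
x_direct = lu.solve(b)

# Iterative method: approach the solution through repeated matrix-vector products.
x_iterative, info = cg(A, b)

rel_diff = np.linalg.norm(x_direct - x_iterative) / np.linalg.norm(x_direct)
print(f"CG converged: {info == 0}, relative difference vs direct: {rel_diff:.2e}")
```

For a well-conditioned system like this one either approach works; the ill-conditioned finite element systems in the study are precisely the cases where a direct factorization is preferred.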
The Watson Sparse Matrix Package (WSMP), a robust, high-performance, and easy-to-use library of which Dr. Anshul Gupta is a lead author, is frequently used for solving large sparse systems of linear equations. A distinctive feature is its ability to exploit both shared-memory and distributed-memory parallelism, using threads and the Message Passing Interface (MPI), respectively. The multifrontal technique in WSMP ensures that most floating-point computation is performed by cache-friendly level 2 and level 3 Basic Linear Algebra Subprograms (BLAS).
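The hybrid pattern can be sketched in a few lines: distributed-memory parallelism comes from MPI processes, while each process performs its dense frontal-matrix arithmetic through threaded, cache-friendly BLAS/LAPACK kernels. The sketch below, using mpi4py and NumPy/SciPy with a made-up frontal-matrix size, illustrates that pattern only; it is not WSMP's actual interface.

```python
# Sketch of hybrid distributed/shared-memory parallelism: MPI ranks across
# nodes, threaded BLAS/LAPACK within each rank. The frontal-matrix size and
# data are hypothetical; this shows the pattern, not WSMP's API.
# Run with, e.g.:  mpiexec -n 4 python hybrid_sketch.py
from mpi4py import MPI
import numpy as np
from scipy.linalg import cholesky

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each MPI rank owns one dense "frontal" block of the factorization.
n = 2000
rng = np.random.default_rng(seed=rank)
B = rng.standard_normal((n, n))
front = B @ B.T + n * np.eye(n)          # symmetric positive-definite block

# Dense Cholesky of the frontal block: the floating-point work runs through
# level-3 BLAS/LAPACK, which uses the threads available to this process
# (shared-memory parallelism), while MPI supplies the distributed-memory side.
L = cholesky(front, lower=True)

local_flops = n ** 3 / 3.0
total_flops = comm.reduce(local_flops, op=MPI.SUM, root=0)
if rank == 0:
    print(f"{size} MPI ranks factored their frontal blocks "
          f"(~{total_flops / 1e9:.1f} GFlop of dense work in total)")
```

In this sketch the per-rank thread count is simply whatever the underlying BLAS is configured to use (e.g., via OMP_NUM_THREADS); in WSMP the process and thread counts are chosen explicitly, as in the node and thread configurations reported below.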
Measurements of factorization time, parallel speedup, and performance showed that 64 nodes is the minimum number of nodes that can fit the largest matrix case, with 1 thread per node for a total of 64 threads (successfully solved in only 18 seconds). WSMP was also found not to be bound by memory-access or interconnect-communication bandwidth on Blue Waters. At smaller CPU thread counts, memory access time was drastically reduced by the large amount of cache memory available, providing extra speedup beyond that from the computation itself. Speedup nevertheless remained high for large problems, reaching 13,179, with 76.4 TFlops of performance, at the largest thread count of 65,536 used in this study.
In the full-scale evaluation, the direct multifrontal factorization method with a hybrid parallel implementation in WSMP performed exceedingly well on the petascale Blue Waters system, delivering robust factorization with high scalability. Indeed, more than 40 million equations were solved on more than 65,000 cores with 76.4 TFlops of sustained performance.
It is believed that the unprecedented level of parallel scalability and robustness of the direct solver in the implicit finite element method, demonstrated for the first time by Drs. Koric and Gupta on the sustained-petascale Blue Waters system, will lead to a major leap in high-fidelity modeling and design in science and engineering.
Journal Reference
Seid Koric¹,², Anshul Gupta³. Sparse matrix factorization in the implicit finite element method on petascale architecture. Computer Methods in Applied Mechanics and Engineering, Volume 302, 2016, Pages 281–292.
1. National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, USA
2. Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, USA
3. Mathematical Sciences, IBM T.J. Watson Research Center, USA