Citation: | DENG Liang, ZHAO Dan, BAI Hanli, et al., “Performance Optimization and Comparison of the Alternating Direction Implicit CFD Solver on Multi-core and Many-Core Architectures,” Chinese Journal of Electronics, vol. 27, no. 3, pp. 540-548, 2018, doi: 10.1049/cje.2018.03.011 |
P. Giangiacomo and V. Michelassi, "An efficient parallel ADI algorithm for turbomachinery flows", International Journal of Computational Fluid Dynamics, Vol.17, No.1, pp.15-26, 2003.
|
P. Panickar, J.P. Erwin, N. Sinha, et al., "Localization of acoustic sources in shock-containing jet flows using phased array measurements", Proc. of 51st AIAA Aerospace Science Meeting, Grapevine, Texas, USA, pp.2013-2025, 2013.
|
M.A. Prakash, K. Mayilsamy and P.R. Kanna, "Numerical simulation of two dimensional laminar wall jet flow over solid obstacle", Applied Mechanics and Materials, Vol.592, No.1, pp.1935-1939, 2014.
|
A. Wood and K.H. Wang, "Modeling dam-break flows in channels with 90 degree bend using an alternating-direction implicit based curvilinear hydrodynamic solver", Computers & Fluids, Vol.114, No.3, pp.254-264, 2015.
|
N. Satish, C. Kim, J. Chhugani, et al., "Can traditional programming bridge the ninja performance gap for parallel computing applications?", Proc. of ACM SIGARCH Computer Architecture News, New York, USA, pp.440-451, 2012.
|
Y. You, H. Fu, S.L. Song, et al., "Evaluating multi-core and many-core architectures through accelerating the three-dimensional Lax–Wendroff correction stencil", International Journal of High Performance Computing Applications, Vol.28, No.3, pp.301-318, 2014.
|
N. Sakharnykh, "Tridiagonal solvers on the GPU and applications to fluid simulation", Proc. of NVIDIA GPU Technology Conference, San Jose, California, USA, pp.22-28, 2009.
|
W. Zhang, B. Jang, Y. Zhang, et al., "Parallelizing alternating direction implicit solver on GPUs", Procedia Computer Science, Vol.18, No.1, pp.389-398, 2013.
|
P.V. Le, P. Kumar, A.J. Valocchi, et al., "GPU-based high-performance computing for integrated surface–sub-surface flow modeling", Environmental Modelling & Software, Vol.73, No.3, pp.1-13, 2015.
|
Y.X. Wang, L.L. Zhang, W. Liu, et al., "Efficient parallel implementation of large scale 3D structured grid CFD applications on the Tianhe-1A supercomputer", Computers & Fluids, Vol.80, No.1, pp.244-250, 2013.
|
J. Treibig, G. Hager and G. Wellein, "Likwid: A lightweight performance-oriented tool suite for x86 multicore environments", Proc. of 39th International Conference on Parallel Processing Workshops, San Diego, California, USA, pp.207-216, 2010.
|
M. Sato, S. Tsutsui, N. Fujimoto, et al., "First results of performance comparisons on many-core processors in solving QAP with ACO: Kepler GPU versus Xeon Phi", Proc. of the 2014 Conference Companion on Genetic and Evolutionary Computation Companion, Vancouver, Canada, pp.1477-1478, 2014.
|
T. Liu, X.G. Xu and C.D. Carothers, "Comparison of two accelerators for Monte Carlo radiation transport calculations, Nvidia Tesla M2090 GPU and Intel Xeon Phi 5110p coprocessor: A case study for X-ray CT imaging dose calculation", Annals of Nuclear Energy, Vol.82, No.1, pp.230-239, 2015.
|
M. Bernaschi, M. Bisson and F. Salvadore, "Multi-Kepler GPU vs. multi-Intel MIC for spin systems simulations", Computer Physics Communications, Vol.185, No.10, pp.2495-2503, 2014.
|
B. Varghese, "The GPU vs Phi debate: Risk analytics using many-core computing", arXiv preprint arXiv:1501.06326, 2015.
|
E.F. Toro, Riemann Solvers and Numerical Methods for Fluid Dynamics, Springer Science & Business Media, Berlin, Germany, pp.32-34, 1997.
|
B. van Leer, "Towards the ultimate conservative difference scheme", Journal of Computational Physics, Vol.135, No.2, pp.229-248, 1997.
|
D.W. Peaceman and H.H. Rachford, Jr., "The numerical solution of parabolic and elliptic differential equations", Journal of the Society for Industrial & Applied Mathematics, Vol.3, No.1, pp.28-41, 1955.
|
M. Harris, "Optimizing parallel reduction in CUDA", http://developer.download.nvidia.com/assets/cuda/files/reduction.pdf, 2012-9-11.
|
S. Rennich, "CUDA C/C++ streams and concurrency", http://on-demand.gputechconf.com/gtc-express/2011/presentations/, 2012-7-1.
|
NVIDIA Corporation, "CUDA C best practices guide version 4.2", http://www.scribd.com/doc/106303214/CUDA-CBest-Practices-Guide/2012.
|
NVIDIA Corporation, "GPU occupancy calculator", http://developer.download.nvidia.com/compute/cuda/, 2010.
|
J. Jeffers and J. Reinders, Intel Xeon Phi Coprocessor High-Performance Programming, Newnes, USA, pp.42-43, 2013.
|
Y.X. Wang, L.L. Zhang, Y.G. Che, et al., "Efficient parallel computing and performance tuning for multi-block structured grid CFD applications on Tian-he supercomputer", Chinese Journal of Electronics, Vol.43, No.1, pp.36-44, 2014 (in Chinese).
|
G. Teodoro, T. Kurc, G. Andrade, et al., "Performance analysis and efficient execution on systems with multi-core CPUs, GPUs and MICs", arXiv preprint arXiv:1505.03819, 2015.
|
X. Tian, H. Saito, S.V. Preis, et al., "Effective SIMD vectorization for Intel Xeon Phi coprocessors", Scientific Programming, Vol.501, No.1, pp.69-76, 2015.
|