PENG Yuanxi, ZHOU Feng, HAI Yue, et al., “A Multi-instruction Streams Extension Mechanism for SIMD Processor,” Chinese Journal of Electronics, vol. 26, no. 6, pp. 1154-1160, 2017, doi: 10.1049/cje.2017.09.013

A Multi-instruction Streams Extension Mechanism for SIMD Processor

doi: 10.1049/cje.2017.09.013
Funds: This work is supported by the National Natural Science Foundation of China (No. 61402493) and the Research Project of National University of Defense Technology (No. GC-14-06-02).
More Information
  • Corresponding author: WANG Yaohua was born in 1985. He received the Ph.D. degree in electronic engineering from National University of Defense Technology. He is a research associate in the College of Computer Science, National University of Defense Technology. (Email: nudtyh@gmail.com)
  • Received Date: 2015-08-24
  • Rev Recd Date: 2016-06-23
  • Publish Date: 2017-11-10
  • Multimedia applications contain loops with multiple branches, which map inefficiently onto traditional Single instruction multiple data (SIMD) structures. To address this, we propose a multi-instruction streams extension method for traditional SIMD structures. The main idea is to dispatch multiple instruction streams to multiple lanes simultaneously. In traditional SIMD, all lanes receive a single unified instruction stream and execute it conditionally under a lane mask vector. The multi-instruction streams extension instead lets each lane receive and execute the instructions of one particular branch path, so multi-branch loops in applications can be mapped efficiently. The design is implemented in Verilog and integrated into the FT-Matrix vector-SIMD chip. Application profiling results show that the proposed method incurs only 2.61% area overhead while achieving a performance gain of about 1.8x to 2.4x.
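
The contrast drawn in the abstract can be illustrated with a toy issue-count model. This is only a sketch under strong assumptions, not the paper's design: it assumes one instruction issues per cycle per stream, that masked SIMD issues every taken branch path in turn, and that the extension can dispatch one stream per taken path at once. The function names, path lengths, and lane assignments are hypothetical, and the resulting ratio is unrelated to the measured 1.8x to 2.4x gain.

```python
def masked_simd_cycles(lane_paths, path_len):
    """Single unified stream: each taken path is issued in turn under a lane mask."""
    return sum(path_len[p] for p in set(lane_paths))


def multi_stream_cycles(lane_paths, path_len):
    """One stream per taken path, dispatched simultaneously (idealized assumption)."""
    return max(path_len[p] for p in set(lane_paths))


# Hypothetical loop body with three branch paths (instruction counts per path).
path_len = {"A": 4, "B": 6, "C": 5}
# Hypothetical branch path taken by each of 8 lanes in one loop iteration.
lane_paths = ["A", "B", "A", "C", "B", "A", "C", "B"]

serial = masked_simd_cycles(lane_paths, path_len)
parallel = multi_stream_cycles(lane_paths, path_len)
print(f"masked SIMD: {serial} cycles, multi-stream: {parallel} cycles")
```

In this model the masked version pays for the sum of all taken paths, while the multi-stream version pays only for the longest one, which is why the benefit grows with the number of divergent paths in a loop.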