YANG Xudong, LIU Quan, JING Ling, LI Jin, YANG Kai. A Scalable Parallel Reinforcement Learning Method Based on Divide-and-Conquer Strategy[J]. Chinese Journal of Electronics, 2013, 22(2): 242-246.

A Scalable Parallel Reinforcement Learning Method Based on Divide-and-Conquer Strategy

Funds:  This work is supported by the National Natural Science Foundation of China (No.60873116, No.61070223), Natural Science Foundation of Jiangsu (No.BK2009116), High School Natural Foundation of Jiangsu (No.09KJA520002), Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (No.93K172012K04).
  • Received Date: 2012-02-01
  • Rev Recd Date: 2012-04-01
  • Publish Date: 2013-04-25
  • To address the slow convergence and poor scalability of reinforcement learning, this paper proposes DCS-SPRL, a scalable parallel reinforcement learning method based on a divide-and-conquer strategy. The method decomposes a learning problem with a large state space into multiple smaller subproblems. A weighted priority scheduling algorithm then dispatches these subproblems to learning agents that learn in parallel. Finally, the learning results for the subproblems are merged into a composite solution. Experimental results show that DCS-SPRL scales well and requires significantly less computation time.
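
The decompose, dispatch, and merge pipeline summarized in the abstract can be illustrated with a small sketch. This is not the authors' DCS-SPRL implementation: the corridor environment, the contiguous-block partitioning, the `priority` weighting, and all function names are assumptions made up for illustration, and the abstract does not specify the actual weighted priority scheduling rule. A thread pool stands in for the parallel learning agents.

```python
import random
from concurrent.futures import ThreadPoolExecutor

N_STATES = 12        # toy corridor, cells 0..11; the goal is the exit right of cell 11
ACTIONS = (-1, +1)   # move left / move right

def decompose(n_states, block):
    """Split the state space into contiguous blocks (assumed partitioning scheme)."""
    return [range(lo, min(lo + block, n_states)) for lo in range(0, n_states, block)]

def priority(states, n_states):
    """Hypothetical weighting: bigger blocks and blocks nearer the goal go first."""
    return 0.5 * len(states) + 0.5 * states[-1] / (n_states - 1)

def learn_subproblem(states, episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning inside one block; leaving the block to the right is the local goal."""
    rng = random.Random(seed)
    lo, hi = states[0], states[-1]
    q = {(s, a): 0.0 for s in states for a in ACTIONS}
    for _ in range(episodes):
        s = rng.choice(list(states))
        for _step in range(100):                 # cap on episode length
            a = rng.choice(ACTIONS)              # random exploration (Q-learning is off-policy)
            nxt = max(lo, s + a)                 # the block's left edge acts as a wall
            if nxt > hi:                         # exited right: reward 1, episode ends
                q[(s, a)] += alpha * (1.0 - q[(s, a)])
                break
            target = gamma * max(q[(nxt, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = nxt
    return q

def solve(n_states=N_STATES, block=4, workers=2):
    subproblems = decompose(n_states, block)
    # Dispatch subproblems in descending priority order to a fixed pool of learners.
    ordered = sorted(subproblems, key=lambda s: priority(s, n_states), reverse=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tables = list(pool.map(learn_subproblem, ordered))
    # Merge: the blocks are disjoint, so the composite Q-table is the union of local tables.
    merged = {}
    for q in tables:
        merged.update(q)
    # Composite greedy policy over the full state space.
    return {s: max(ACTIONS, key=lambda a: merged[(s, a)]) for s in range(n_states)}

policy = solve()
```

In this toy corridor each block learns to head for its right-hand exit, so the merged greedy policy should move right in every cell. A real decomposition would need to handle value consistency across block boundaries, which this sketch sidesteps by using local exit rewards.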