LI Shuang, YAN Yanghui, REN Ju, ZHOU Yuezhi, ZHANG Yaoxue. A Sample-Efficient Actor-Critic Algorithm for Recommendation Diversification[J]. Chinese Journal of Electronics, 2020, 29(1): 89-96. DOI: 10.1049/cje.2019.10.004

A Sample-Efficient Actor-Critic Algorithm for Recommendation Diversification

  • Diversifying recommendation results benefits users both by satisfying their existing interests and by exploring novel information needs. A recently proposed Monte-Carlo based reinforcement learning method suffers from sample inefficiency and large variance, and can even fail to perform well in a large action space. To address these problems, we propose a novel actor-critic reinforcement learning algorithm for recommendation diversification. The actor acts as the ranking policy, while the introduced critic predicts the expected future rewards of each candidate action. The critic target is updated with the full Bellman equation, and the actor network is optimized using the expected gradient over the whole action space. To further stabilize training and improve performance, we also add a policy-filtered critic supervision loss. Experiments on the MovieLens dataset demonstrate the effectiveness of our approach over multiple competitive methods.
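The two update rules named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a softmax ranking policy over a small discrete set of candidate items, reads "full Bellman equation" as an expectation over all next actions under the current policy, and reads "expected gradient" as the gradient of the policy-weighted sum of critic values. The function names, the discount factor `gamma`, and the toy numbers are hypothetical, and the policy-filtered critic supervision loss is left out because the abstract does not specify its form.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over candidate actions."""
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def bellman_target(reward, next_q, next_probs, gamma=0.9, done=False):
    """Expected Bellman target: r + gamma * sum_a' pi(a'|s') * Q(s', a').

    Assumption: the expectation runs over every next action rather than
    over a single sampled one.
    """
    if done:
        return float(reward)
    return float(reward + gamma * np.dot(next_probs, next_q))

def expected_policy_gradient(logits, q_values):
    """Gradient of J = sum_a pi(a|s) * Q(s, a) with respect to the logits.

    For a softmax policy this is pi_k * (Q_k - sum_a pi_a * Q_a), so every
    action contributes, not only the one that was sampled.
    """
    probs = softmax(logits)
    value = np.dot(probs, q_values)  # expected critic value under the policy
    return probs * (q_values - value)

# Toy example with three candidate items (values are made up).
logits = np.zeros(3)                  # uniform policy
q_values = np.array([1.0, 2.0, 3.0])  # critic estimates per candidate
grad = expected_policy_gradient(logits, q_values)
target = bellman_target(1.0, q_values, softmax(logits), gamma=0.9)
```

Because the gradient sums over all candidates, a single state yields a full update signal without sampling whole episodes, which is one plausible reading of how the approach reduces variance relative to a Monte-Carlo estimate.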
