LIU Bingyu, WANG Cuirong, WANG Yiran, ZHANG Kun, WANG Cong. Microblog Topic Mining Based on FR-DATM[J]. Chinese Journal of Electronics, 2018, 27(2): 334-341. doi: 10.1049/cje.2017.12.006

Microblog Topic Mining Based on FR-DATM

doi: 10.1049/cje.2017.12.006
Funds: This paper was supported by the National Natural Science Foundation of China (No.61300195), the Natural Science Foundation of Hebei Province (No.F2014501078), the Technology Planning Project of Hebei Province (No.15210146), and the General Project of Liaoning Province Department of Education Science Research (No.L2013099).
  • Received Date: 2015-11-25
  • Rev Recd Date: 2016-02-16
  • Publish Date: 2018-03-10
  • Microblogs have become a major platform for people to publish and obtain information. Microblog texts are short and carry little term co-occurrence information, which makes topic discovery more complicated. To address this problem, this paper proposes a dynamic author topic model, FR-DATM, and uses Gibbs sampling for inference. FR-DATM analyzes the relationships between blogs and connects related blogs to alleviate data sparseness. It allows a blog to be related to multiple topics, and each author of a blog is also related to the topics of that blog. FR-DATM can also mine the topic evolution of blogs and authors. Experiments on a Twitter dataset show that FR-DATM outperforms the Latent Dirichlet Allocation (LDA) model and the Microblog Latent Dirichlet Allocation (MB-LDA) model from three perspectives: the quality of the generated latent topics, the model perplexity, and the ability to dynamically mine the topics that authors are concerned with.
