WEI Baogang, ZHANG Yin, YUAN Jie, LIU Yonghuai, WANG Lidong. A Novel Approach to Text Detection and Extraction from Videos by Discriminative Features and Density[J]. Chinese Journal of Electronics, 2014, 23(2): 322-328.

A Novel Approach to Text Detection and Extraction from Videos by Discriminative Features and Density

Funds: This work is supported by the National Natural Science Foundation of China (No. 60673088), He-gao-ji National Major Technology Special Project (No. 2010ZX01042-002-003), Chinese Knowledge Center of Engineering Science and Technology (CKCEST), Specialized Research Fund for the Doctoral Program of Higher Education (SRFDP) (No. 20130101110136), and China Academic Digital Associative Library (CADAL).
More Information
  • Corresponding author: ZHANG Yin
  • Received Date: 2012-07-01
  • Rev Recd Date: 2013-04-01
  • Publish Date: 2014-04-05
  • Text is very important for video retrieval, indexing, and understanding. However, detecting and extracting it is challenging because of varying backgrounds, low contrast between text and non-text regions, and perspective distortion. In this paper, we propose a novel two-phase approach that tackles this problem using discriminative features and edge density. The first phase defines and extracts a novel feature, edge distribution entropy, and uses it to remove most non-text regions. The second phase employs a Support Vector Machine (SVM) to further distinguish real text regions from non-text ones. To generate inputs for the SVM, three additional novel features are defined and extracted from each region: foreground pixel distribution entropy, skeleton/size ratio, and edge density. After text regions have been detected, text is extracted from those regions that are surrounded by sufficient edge pixels. A comparative study on two publicly accessible datasets shows that the proposed method significantly outperforms four selected state-of-the-art methods in the accuracy of text detection and extraction.
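The first-phase filter described in the abstract can be illustrated with a small sketch. The abstract names the features (edge distribution entropy, edge density) but does not give their formulas, so everything below is an assumption made for illustration: the entropy is taken to be the normalized Shannon entropy of edge-pixel counts over a grid of blocks, the function names are hypothetical, and the thresholds in `is_text_candidate` are placeholder values rather than values from the paper. The input is a binary edge map (rows of 0/1 values), however it was obtained.

```python
import math


def edge_density(edge_map):
    """Fraction of pixels in a binary edge map that are edge pixels."""
    total = sum(len(row) for row in edge_map)
    edges = sum(sum(row) for row in edge_map)
    return edges / total if total else 0.0


def edge_distribution_entropy(edge_map, grid=(2, 2)):
    """Normalized Shannon entropy of edge counts over a grid of blocks.

    Assumed definition, not the paper's formula. Returns a value in [0, 1]:
    1 means edge pixels are spread evenly over the blocks, 0 means they all
    fall into a single block (or there are no edge pixels at all).
    """
    height, width = len(edge_map), len(edge_map[0])
    grid_rows, grid_cols = grid
    counts = [0] * (grid_rows * grid_cols)
    for y in range(height):
        for x in range(width):
            if edge_map[y][x]:
                block_y = min(y * grid_rows // height, grid_rows - 1)
                block_x = min(x * grid_cols // width, grid_cols - 1)
                counts[block_y * grid_cols + block_x] += 1
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c)
    return entropy / math.log(len(counts))


def is_text_candidate(edge_map, min_entropy=0.8, min_density=0.1):
    """Hypothetical phase-one filter: keep regions whose edges are both
    dense and evenly spread. Thresholds are illustrative placeholders."""
    return (edge_distribution_entropy(edge_map) >= min_entropy
            and edge_density(edge_map) >= min_density)
```

Under these assumptions, a region filled with evenly spread edge pixels passes the filter, while a region whose few edge pixels cluster in one corner is rejected. Regions that survive such a filter would then go to the SVM stage, which this sketch does not cover.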
