WAN Yulong, WANG Xianliang, ZHOU Ruohua, YAN Yonghong. Automatic Piano Music Transcription Using Audio-Visual Features[J]. Chinese Journal of Electronics, 2015, 24(3): 596-603. DOI: 10.1049/cje.2015.07.027

Automatic Piano Music Transcription Using Audio-Visual Features

  • The performance of automatic music transcription seems to have reached a limit over the last decade, and a promising direction for improvement is to incorporate instrument-specific parameters. We propose a novel piano-specific transcription system that uses both audio and visual features for the first time. The paper makes two main contributions: (1) a new onset detection method that applies a specific spectrum-envelope matched filter to multiple frequency bands, and (2) a computer-vision method that enhances audio-only piano music transcription by tracking the positions of the pianist's hands on the piano keyboard. Using the MIDI Aligned Piano Sounds (MAPS) database and a self-recorded video database, we carried out comparative experiments for audio-only onset detection and for the overall system, respectively. Performance was compared with the best piano transcription system in the Music Information Retrieval Evaluation eXchange (MIREX), and the results showed that the proposed system substantially outperforms the state-of-the-art method.
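
The abstract does not specify the matched filter itself, so the following is only a minimal Python sketch of the general idea of matched-filter onset detection over several frequency bands, not the authors' method. The attack-and-decay template, the equal-width band split, the peak-picking rule, the threshold, and all function names (`attack_decay_template`, `band_envelopes`, `detect_onsets`) are assumptions made for illustration.

```python
import numpy as np

def attack_decay_template(length=8, decay=0.5):
    """Hypothetical envelope template: silence, then a sharp attack with exponential decay."""
    pre = np.zeros(length // 2)
    post = np.exp(-decay * np.arange(length - length // 2))
    t = np.concatenate([pre, post])
    t = t - t.mean()                  # zero mean: flat envelopes score 0
    return t / np.linalg.norm(t)      # unit norm: scores are comparable

def band_envelopes(signal, frame=1024, hop=512, n_bands=4):
    """Sum the short-time magnitude spectrum inside n_bands equal-width bands."""
    n_frames = 1 + (len(signal) - frame) // hop
    window = np.hanning(frame)
    frames = np.stack([signal[i * hop:i * hop + frame] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))
    bands = np.array_split(mag, n_bands, axis=1)
    return np.stack([b.sum(axis=1) for b in bands])   # shape (n_bands, n_frames)

def detect_onsets(envelopes, template, threshold=0.3):
    """Correlate each band envelope with the template, average, and pick local peaks."""
    score = np.zeros(envelopes.shape[1])
    for env in envelopes:
        env = env / (env.max() + 1e-12)               # per-band normalisation
        score += np.correlate(env, template, mode="same")
    score /= len(envelopes)
    peaks = [i for i in range(1, len(score) - 1)
             if score[i] > threshold
             and score[i] >= score[i - 1]
             and score[i] > score[i + 1]]
    return np.array(peaks, dtype=int), score

if __name__ == "__main__":
    # Synthetic check: half a second of silence, then a decaying 440 Hz tone.
    sr = 16000
    t = np.arange(sr // 2) / sr
    tone = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)
    sig = np.concatenate([np.zeros(sr // 2), tone])
    peaks, _ = detect_onsets(band_envelopes(sig), attack_decay_template())
    print("onset frames:", peaks, "onset times (s):", peaks * 512 / sr)
```

Correlating with a zero-mean template rewards a quiet stretch followed by a sudden rise and slow decay, which is roughly how a struck piano note behaves. A real system would presumably use perceptually motivated bands and a template fitted to piano data.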