Multi-modal piano note detection using audio and video

Abstract

Description

Music recognition has long attracted interest. The automated transcription of musical compositions and the identification of sound sources, such as the types of instruments used, have required considerable time and effort. With the rise of personal computers and multimedia systems in recent years, research in these areas has received a lot of attention. In this paper, we analyze a piano-based song. We divide the song's video into chunks, called frames, for note recognition. First, we analyze the song manually to identify the notes and establish the correct reference notes. We then use a fingertip-tracking technique to follow the notes being played. This forms our input dataset for image-based (frame-based) input. Next, the audio is extracted and divided into chunks, with the number of chunks matching the number of frames in the video. We perform frequency analysis on the audio to detect notes from the audio signal. When the variables of interest cannot be measured directly but an indirect measurement is available, the Kalman filter and the particle filter are used to estimate them as accurately as possible. They are also used to obtain the best approximation of states in the presence of noise by integrating readings from multiple sensors. The novelty of our research is that we apply the Kalman filter and the particle filter to audio-based and video-based input rather than to sensor data, an approach that, to our knowledge, has not been used before.
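The audio frequency analysis and filter-based fusion described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`dominant_frequency`, `frequency_to_midi`, `midi_to_name`, `kalman_fuse`), the noise parameters, and the choice to treat the note as a scalar MIDI number with a random-walk model are all hypothetical. It also assumes the strongest spectral peak is the fundamental, which is often not true for real piano recordings, and it omits the particle filter.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def dominant_frequency(chunk, sample_rate):
    """Return the frequency (Hz) of the strongest FFT bin in one audio chunk."""
    windowed = chunk * np.hanning(len(chunk))      # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(chunk), d=1.0 / sample_rate)
    spectrum[0] = 0.0                              # ignore the DC component
    return float(freqs[np.argmax(spectrum)])


def frequency_to_midi(freq_hz):
    """Map a frequency to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return int(round(69 + 12 * np.log2(freq_hz / 440.0)))


def midi_to_name(midi):
    """Convert a MIDI note number to a name such as 'A4'."""
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def kalman_fuse(audio_notes, video_notes, r_audio=1.0, r_video=4.0, q=0.5):
    """Fuse per-frame note estimates with a scalar Kalman filter.

    Assumptions (hypothetical): the state is a single MIDI number that
    follows a random walk with process variance q; r_audio and r_video
    are the measurement variances of the two input streams.
    """
    x, p = float(audio_notes[0]), 10.0             # initial state and variance
    estimates = []
    for z_audio, z_video in zip(audio_notes, video_notes):
        p += q                                     # predict step (random walk)
        for z, r in ((z_audio, r_audio), (z_video, r_video)):
            k = p / (p + r)                        # Kalman gain
            x += k * (z - x)                       # correct with this measurement
            p *= 1.0 - k
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    sample_rate = 44100
    t = np.arange(4410) / sample_rate              # one 0.1 s chunk
    chunk = np.sin(2 * np.pi * 440.0 * t)          # synthetic test tone
    midi = frequency_to_midi(dominant_frequency(chunk, sample_rate))
    print(midi_to_name(midi))
    print(kalman_fuse([60, 60, 60], [62, 60, 60]))
```

In this sketch each stream contributes one measurement per frame, and the stream with the smaller assumed variance pulls the fused estimate harder. Rounding the fused value to the nearest integer would give a discrete note per frame.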

Item Type

http://purl.org/coar/resource_type/c_1843

Language

en
