We introduce KaraMIR, a music project dedicated to karaoke song analysis. Within KaraMIR we define Kara1k, a dataset composed of 1,000 cover songs provided by the Recisio Karafun application and the corresponding 1,000 songs by the original artists.
Kara1k is mainly dedicated to cover song identification and singing voice analysis. For both tasks, Kara1k enables novel approaches, as each cover song is a studio-recorded song with the same arrangement as the original recording, but with different singers and musicians.
Essentia, harmony-analyser, Marsyas, Vamp plugins, and YAAFE were used to extract audio features for each track in Kara1k. We provide metadata such as the title, genre, original artist, year, and International Standard Recording Code, together with ground truth for the singer's gender, backing vocals, duets, and lyrics language.
The KaraMIR project focuses on defining new problems and describing features and tools to solve them. We thus provide a comparison of traditional and new features for a cover song identification task, using statistical methods as well as dynamic time warping on chroma, MFCC, chords, keys, and chord distance features.
A supporting experiment on the singer gender classification task is also presented. The KaraMIR project website facilitates continued research.
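
To make the dynamic time warping comparison mentioned above concrete, the following is a minimal sketch of a DTW distance between two frame-wise feature sequences such as chroma vectors. It is an illustration under stated assumptions, not the configuration used in KaraMIR: the function names `cosine_cost` and `dtw_distance`, the cosine frame cost, the unconstrained step pattern, and the length normalization are all hypothetical choices made here for the example.

```python
import numpy as np


def cosine_cost(a, b):
    """Pairwise cosine distance between frames of a (n, d) and b (m, d)."""
    # Normalize each frame to unit length; guard against all-zero frames.
    a_n = a / np.maximum(np.linalg.norm(a, axis=1, keepdims=True), 1e-12)
    b_n = b / np.maximum(np.linalg.norm(b, axis=1, keepdims=True), 1e-12)
    # Clip tiny negative values caused by floating-point rounding.
    return np.clip(1.0 - a_n @ b_n.T, 0.0, None)


def dtw_distance(a, b):
    """Length-normalized DTW distance between two feature sequences.

    Assumption: frame cost is cosine distance and the allowed steps are
    (1, 0), (0, 1), and (1, 1); other choices are equally plausible.
    """
    cost = cosine_cost(a, b)
    n, m = cost.shape
    # acc[i, j] holds the cheapest alignment cost of a[:i] against b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j],      # advance in a only
                acc[i, j - 1],      # advance in b only
                acc[i - 1, j - 1],  # advance in both
            )
    # Normalize so that sequences of different lengths are comparable.
    return acc[n, m] / (n + m)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    original = rng.random((20, 12))            # stand-in for 12-bin chroma frames
    stretched = np.repeat(original, 2, axis=0)  # same content at half tempo
    unrelated = rng.random((25, 12))
    print("same content, different tempo:", dtw_distance(original, stretched))
    print("unrelated content:", dtw_distance(original, unrelated))
```

In this sketch a tempo-stretched copy of a sequence aligns at near-zero cost, while unrelated content does not, which is the property that makes DTW a natural baseline for comparing a cover with its original recording.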