Dataset creation for graphics recognition, especially for hand-drawn inputs, is often an expensive and time-consuming undertaking. The MUSCIMarker tool used for creating the MUSCIMA++ dataset for Optical Music Recognition (OMR) led to efficient use of annotation resources, and it provides enough flexibility to be applicable to creating datasets for other graphics recognition tasks where the ground truth can be represented similarly.
First, we describe the MUSCIMA++ ground truth to define the range of tasks for which using MUSCIMarker to annotate ground truth is applicable. We then describe the MUSCIMarker tool itself, discuss its strong and weak points, and share practical experience with the tool from creating the MUSCIMA++ dataset.