shansongliu/HumTrans

Model Evaluation on the HumTrans Dataset

This is the official repository for HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond. To use the whole dataset, please refer to this link.

Introduction

We present baseline results for four state-of-the-art vocal melody transcription models, VOCANO, Sheet Sage, MIR-ST500, and JDC-STP, on both the validation and test sets of our HumTrans dataset; the results are shown in the following table. For all experiments, we directly used the code provided by the authors to generate the predicted transcriptions (midis/{VOCANO.zip,SheetSage.zip,MIR-ST500.zip,JDC-STP.zip}) and compared them with the reference MIDI files (midis/GroundTruth.zip). Although JDC-STP and MIR-ST500 score higher than the other two models (JDC-STP has the best F1 on the validation set, MIR-ST500 on the test set), the transcription capabilities of all the models are still far from satisfactory. There is therefore significant room for improvement in humming melody transcription.

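| Model | Valid Precision | Valid Recall | Valid F1 | Test Precision | Test Recall | Test F1 |
|---|---|---|---|---|---|---|
| VOCANO | 3.270 | 3.314 | 3.194 | 3.384 | 3.329 | 3.352 |
| Sheet Sage | 2.757 | 2.656 | 2.702 | 3.039 | 2.982 | 3.005 |
| MIR-ST500 | 6.258 | 6.448 | 6.341 | 5.686 | 5.853 | 5.755 |
| JDC-STP | 6.777 | 6.785 | 6.741 | 5.844 | 5.620 | 5.667 |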
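
For readers unfamiliar with note-level transcription metrics, the following is a minimal sketch of how precision, recall, and F1 can be computed for one pair of note sequences. It is not the implementation in calc_transcription_eval_metric.py: the function name `note_prf`, the `(onset, offset, pitch)` note representation, the 50 ms onset tolerance, the exact-pitch requirement, and the greedy matching are all assumptions made for illustration. The repository script may use different criteria.

```python
from typing import List, Tuple

# Hypothetical note representation: (onset_sec, offset_sec, midi_pitch).
Note = Tuple[float, float, int]


def note_prf(
    reference: List[Note],
    estimated: List[Note],
    onset_tol: float = 0.05,  # assumed tolerance, not taken from the repository
) -> Tuple[float, float, float]:
    """Return (precision, recall, f1) using greedy onset-and-pitch matching.

    An estimated note counts as correct if an unmatched reference note has the
    same MIDI pitch and an onset within `onset_tol` seconds. Offsets are ignored.
    Greedy matching is a simplification; it is not guaranteed to find the
    largest possible set of matches.
    """
    matched_ref = set()
    true_positives = 0
    for est_onset, _, est_pitch in sorted(estimated):
        for i, (ref_onset, _, ref_pitch) in enumerate(reference):
            if i in matched_ref:
                continue  # each reference note may be matched only once
            if ref_pitch == est_pitch and abs(ref_onset - est_onset) <= onset_tol:
                matched_ref.add(i)
                true_positives += 1
                break

    precision = true_positives / len(estimated) if estimated else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    ref = [(0.00, 0.50, 60), (0.50, 1.00, 62), (1.00, 1.50, 64)]
    est = [(0.02, 0.48, 60), (0.70, 1.00, 62), (1.01, 1.40, 64), (1.60, 1.80, 65)]
    # Two of the four estimated notes match two of the three reference notes.
    print(note_prf(ref, est))
```

In this toy example the second estimated note has the right pitch but starts 200 ms late, and the fourth has no counterpart in the reference, so both count against precision.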

Script Example Usage

python calc_transcription_eval_metric.py valid_keys.txt midis/GroundTruth/valid midis/VOCANO/valid

The valid_keys.txt file lists the name keys of the validation set, midis/GroundTruth/valid is the reference MIDI folder, and midis/VOCANO/valid is the predicted MIDI folder. The script outputs three numbers: the precision, recall, and F1-score for the compared pair of folders. The train_valid_test_keys.json file contains the official split of this dataset. If you train your own model, please use this official split for a fair comparison.
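
Before running the evaluation on your own predictions, it can help to check that every key has both a reference and a predicted MIDI file. The sketch below shows one way to do that. It assumes the keys file holds one key per line and that files are named `<key>.mid`; neither detail is confirmed by this README, and `pair_midi_files` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List, Tuple


def pair_midi_files(
    keys_file: str,
    ref_dir: str,
    pred_dir: str,
    ext: str = ".mid",  # assumed file extension
) -> Tuple[List[Tuple[Path, Path]], List[str]]:
    """Pair reference and predicted MIDI paths by key.

    Returns (pairs, missing): `pairs` holds (reference_path, predicted_path)
    for keys present in both folders, and `missing` lists keys for which at
    least one of the two files was not found.
    """
    # Assumption: one key per line; blank lines are skipped.
    lines = Path(keys_file).read_text().splitlines()
    keys = [line.strip() for line in lines if line.strip()]

    pairs: List[Tuple[Path, Path]] = []
    missing: List[str] = []
    for key in keys:
        ref_path = Path(ref_dir) / f"{key}{ext}"
        pred_path = Path(pred_dir) / f"{key}{ext}"
        if ref_path.is_file() and pred_path.is_file():
            pairs.append((ref_path, pred_path))
        else:
            missing.append(key)
    return pairs, missing
```

For example, `pair_midi_files("valid_keys.txt", "midis/GroundTruth/valid", "midis/VOCANO/valid")` would return the file pairs to evaluate together with any keys that lack a file, under the naming assumptions stated above.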
