Datasets
MUSDB18
The convolutional neural network used in this project is trained on the MUSDB18 dataset. This dataset consists of 150 full-length music tracks stored as M4A (MPEG-4) files, split 100-50 between training and testing. Each file contains the full song mixture along with its constituent stems: vocals, drums, bass, and "other," the last category covering anything that does not fit into the first three. The dataset is used to train the network itself, and the held-out testing songs can be used to evaluate the final result. The network used in this project was a pre-trained model found in this GitHub repository.
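As an illustration only, the sketch below shows one way the training and testing splits can be loaded in Python using the freely available musdb package; this is not necessarily how the pre-trained model's authors handled the data, and the root path is a placeholder for a local copy of the dataset.

```python
import musdb

# Load the 100-track training split and the 50-track testing split.
# "path/to/MUSDB18" is a placeholder, not a real path from this project.
train = musdb.DB(root="path/to/MUSDB18", subsets="train")
test = musdb.DB(root="path/to/MUSDB18", subsets="test")
print(len(train.tracks), "training tracks,", len(test.tracks), "testing tracks")

# Each track bundles the full mixture and its four constituent stems.
track = train.tracks[0]
mixture = track.audio                  # stereo mixture, shape (n_samples, 2)
for stem in ("vocals", "drums", "bass", "other"):
    audio = track.targets[stem].audio  # stem audio with the same shape
    print(f"{track.name}: {stem} stem, {audio.shape[0] / track.rate:.1f} s")
```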
Basic Scales and Chords
To test the subparts of the project, a set of basic scales, chords, and songs was compiled. This dataset was created with the music notation software MuseScore and was used to test the pitch detection and music splitting algorithms, serving as a baseline before full musical compositions were fed into the final system. It included C-major, F-major, and G-major scales and their respective chords, as well as simple songs such as "Twinkle, Twinkle, Little Star" and Christian Petzold's "Minuet in G," chosen for their monophonic sound and relative simplicity. Working with this dataset made it easier to debug and fine-tune the algorithms.
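For illustration only, the sketch below shows the kind of ground-truth check such a baseline enables. It is not the project's actual test harness and uses synthesized sine tones rather than the MuseScore exports: a C-major scale with known frequencies is generated and run through a simple FFT-peak pitch estimator, so any disagreement points directly to a bug in the detector.

```python
import numpy as np

SR = 44100  # sample rate in Hz

# Equal-tempered frequencies of a C-major scale starting at middle C (C4).
C_MAJOR_HZ = [261.63, 293.66, 329.63, 349.23, 392.00, 440.00, 493.88, 523.25]

def synth_note(freq, dur=0.5, sr=SR):
    """Render a single sine tone of the given frequency and duration."""
    t = np.arange(int(dur * sr)) / sr
    return 0.5 * np.sin(2 * np.pi * freq * t)

def detect_pitch(signal, sr=SR):
    """Estimate the dominant pitch of a mono signal by picking the FFT peak."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    return freqs[np.argmax(spectrum)]

if __name__ == "__main__":
    # Each estimate should land within one FFT bin (about 2 Hz here)
    # of the frequency that produced it.
    for target in C_MAJOR_HZ:
        estimate = detect_pitch(synth_note(target))
        print(f"target {target:7.2f} Hz -> detected {estimate:7.2f} Hz")
```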
Full Musical Compositions
Because the final goal of this project is to transcribe any song given to it, the final dataset consists of full musical compositions. These compositions were used mostly for testing and grading the final result. Melodically simple songs with fewer instruments were preferred, with Neil Young's "Heart of Gold" used most often for presenting and testing. Since both the audio and the sheet music for such songs are typically publicly available, testing the output against a reference was straightforward.