Learn to generate music with LSTM neural networks in Keras
- Python 3.x
- Keras and music21
- GPU strongly recommended for training
After reading Sigurður Skúli's Towards Data Science article 'How to Generate Music using a LSTM Neural Network in Keras', I was astounded at how good LSTM classification networks are at predicting notes and chords in a sequence, and ultimately at how they can generate really nice music.
Sigurður's approach provides some really useful functions for parsing the data, creating dictionaries to translate between notes and class labels, and using the trained model to generate pieces of music.
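For context, the translation between notes and class labels boils down to a pair of lookup dictionaries built from the parsed data. A minimal sketch (the variable names here are illustrative, not necessarily the repo's exact ones):

```python
# Pitches/chords collected while parsing the MIDI files (illustrative values)
notes = ['C4', 'E4', 'G4', 'C4', '4.7.11']

pitch_names = sorted(set(notes))

# Dictionaries to translate between note names and integer class labels
note_to_int = {n: i for i, n in enumerate(pitch_names)}
int_to_note = {i: n for n, i in note_to_int.items()}

sequence = [note_to_int[n] for n in notes]  # class indices fed to the network
```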
The main limitation of that work is that all generated notes have the same length and the same offset from one another, so the music can sometimes sound quite unnatural. In this extension, we use the Keras Functional API (rather than a Sequential model) to branch the neural network so that it considers three separate timeseries from the music:
- The notes and chords in the sequence (referred to simply as 'notes' from here on)
- The offset of each note relative to the previous one (i.e. the note's offset from the start of the MIDI file minus the previous note's offset)
- The durations of the notes in the sequence
These three timeseries also serve as three separate outputs of the network.
Thus, three tasks are trained jointly: classifying the next note, its offset, and its duration. A sketch of how these timeseries can be extracted is shown below.
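To make the three timeseries concrete, here is a rough sketch of how they could be extracted with music21 (the folder name is a placeholder, and the repo's parsing code may differ in its details):

```python
import glob
from music21 import converter, note, chord

notes, offsets, durations = [], [], []

for path in glob.glob("midi_songs/*.mid"):        # placeholder folder of single-track MIDIs
    midi = converter.parse(path)
    prev_offset = 0.0
    for element in midi.flat.notes:
        # Notes and chords become string class labels
        if isinstance(element, note.Note):
            notes.append(str(element.pitch))
        elif isinstance(element, chord.Chord):
            notes.append('.'.join(str(p) for p in element.normalOrder))
        # Offset relative to the previous note, not to the start of the file
        offsets.append(element.offset - prev_offset)
        prev_offset = element.offset
        durations.append(element.duration.quarterLength)
```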
This is the best model I have found so far. Note that the number of outputs in the three final layers depends on what was detected while parsing the files: for example, if you parse 100 MIDI files consisting of only three distinct chords, the notes_output layer will have only three outputs. The number of softmax neurons therefore changes dynamically depending on the kind of music you train on.
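As a rough illustration of that branched, multi-output architecture (layer sizes and names here are illustrative, not the repo's exact model):

```python
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout, concatenate
from tensorflow.keras.models import Model

sequence_length = 100
n_notes, n_offsets, n_durations = 300, 40, 20   # vocabulary sizes found during parsing

# One input branch per timeseries
notes_in = Input(shape=(sequence_length, 1), name='notes_in')
offsets_in = Input(shape=(sequence_length, 1), name='offsets_in')
durations_in = Input(shape=(sequence_length, 1), name='durations_in')

x = concatenate([LSTM(256)(notes_in),
                 LSTM(256)(offsets_in),
                 LSTM(256)(durations_in)])
x = Dropout(0.3)(Dense(512, activation='relu')(x))

# One softmax head per task; output sizes depend on what was detected in the data
notes_out = Dense(n_notes, activation='softmax', name='notes_output')(x)
offsets_out = Dense(n_offsets, activation='softmax', name='offsets_output')(x)
durations_out = Dense(n_durations, activation='softmax', name='durations_output')(x)

model = Model([notes_in, offsets_in, durations_in],
              [notes_out, offsets_out, durations_out])
model.compile(loss='categorical_crossentropy', optimizer='adam')
```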
Everything in this repo can be run as-is to train on classical piano pieces:
- Run train.py for a while, until the note/chord loss is below 1
- Set line 155 in generate-music.py to the name of the .HDF5 file that was last saved (the model weights; see the sketch after this list)
- Run generate-music.py to generate a MIDI file from the model
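The weights-loading step amounts to something like the following (the filename is just a placeholder; edit the existing line in generate-music.py rather than adding new code):

```python
# Load the weights saved during training into the identically structured model
model.load_weights('weights-epoch-50.hdf5')   # placeholder filename
```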
Since most MIDI files are multi-track (e.g. two piano tracks for the left and right hands) and this model only supports a single track, run convertmidis.py to merge all tracks into one (see the sketch below)
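Conceptually, the merge boils down to flattening every part into one music21 stream, along these lines (a sketch, not convertmidis.py's exact code; filenames are placeholders):

```python
from music21 import converter, stream

def merge_tracks(path_in, path_out):
    """Collapse a multi-track MIDI file into a single track (illustrative sketch)."""
    midi = converter.parse(path_in)
    merged = stream.Part()
    # Insert every note/chord from every part at its original offset
    for element in midi.flat.notes:
        merged.insert(element.offset, element)
    merged.write('midi', fp=path_out)

merge_tracks('two_hand_piano.mid', 'single_track.mid')   # placeholder filenames
```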
Change line 53 in train.py to specify where your .midi files are stored
To run with TensorFlow 2.0, use lstm-new-tf2.py
I need to see what happens when the offset and duration problems are treated as regression with an MSE loss. Currently, we classify over a library of classes (i.e. the unique offsets and durations) detected from the dataset during MIDI parsing.
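If you want to try that experiment, the change would roughly amount to replacing the two softmax heads in the model sketch above with single linear units (this snippet continues that sketch, so the names come from there):

```python
# Continuing the model sketch above: regression heads instead of softmax heads
offsets_out = Dense(1, name='offsets_output')(x)       # linear activation by default
durations_out = Dense(1, name='durations_output')(x)

model = Model([notes_in, offsets_in, durations_in],
              [notes_out, offsets_out, durations_out])
model.compile(optimizer='adam',
              loss={'notes_output': 'categorical_crossentropy',
                    'offsets_output': 'mse',
                    'durations_output': 'mse'})
```

The targets for those two heads would then be the raw offset and duration values rather than one-hot class vectors.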
- Enable multi-track training and prediction; currently we just merge all MIDI tracks into one
- Upload some more examples of music
- Experiment with smaller models that are quicker to train but still produce good results
- Try out some different optimizers and loss functions (Adam has worked best as an optimizer so far)
- Maybe experiment with adding a genre-classification network branch, so the model doesn't need curated data as input
- Clean up the code; it's a bit messy in places
I hope to update the model to learn to predict the instrument as well, but at the moment I just use https://onlinesequencer.net/ if I want to hear a piece played by something other than a piano
You can train the network on music for any instrument; all it cares about is the notes and their offsets and durations. That said, the generated output sets each note's instrument to piano. This can be changed via lines 265 and 282 of generate-music.py, which set the instrument for notes and chords respectively.
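Roughly, those lines store a music21 instrument on each generated note/chord, so swapping in another instrument is a one-line change (Violin here is just an example; the exact code in the repo may differ slightly):

```python
from music21 import instrument, note

new_note = note.Note('C4')
# Swap instrument.Piano() for any other music21 instrument
new_note.storedInstrument = instrument.Violin()
```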