Incremental Music Learning with Sparse Ensemble Coding for Regeneration

Cheolho Han, Byoung-Hee Kim, and Byoung-Tak Zhang


This paper presents a computational model of musical memory, focusing on long-term memory for recall. Studies in music psychology propose that segmentation is a foundation for establishing long-term memory representations (Snyder, 2009). To build a recall memory for symbolic music, researchers have introduced segmentation-based models, including n-gram language models and variable-order Markov models. A typical variable-order Markov model for symbolic sequences builds a dictionary tree in which the frequencies of partial sequences are encoded. Here, we present a sparse ensemble coding scheme in which only a small subset of variable-length sequences is stored, in a sparse coding manner. To represent melodies, we apply interval-based coding to pitches and 1/8 quantization to durations. The rationale is that relative pitch, rather than absolute pitch, is what matters in music, and that the set of 1/8-quantized durations is sufficient to cover any duration found in typical music. We also propose an incremental learning scheme. We apply the method to learning a set of musical pieces and regenerating them when cue sequences are given as seeds. Using a dataset of pop song melodies in MIDI format, we search for the capacity at which the model can still regenerate various sequences successfully, and we experimentally examine sudden changes in the model structure and related parameters around the boundary condition. Considering that most theories of the structure of long-term representations of music use the concept of hierarchy to varying degrees (Snyder, 2009), we may add a contour-based representation for pitches (up and down) and durations (increase and decrease).
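
As an illustration of the melody representation described above, the following minimal sketch encodes a melody as pitch intervals and quantized durations. It is not the authors' implementation. The function names, the input format (a list of MIDI pitch and duration pairs), and the reading of "1/8 quantization" as rounding each duration to the nearest multiple of 1/8 of a beat are all assumptions made for this example.

```python
def quantize_duration(beats, step=0.125):
    """Round a duration (in beats) to the nearest multiple of `step`.

    Assumption: "1/8 quantization" means a grid of 1/8 beat. Durations
    shorter than half a step are clamped to one step so no note vanishes.
    """
    return max(1, round(beats / step)) * step


def encode_melody(notes):
    """Encode a melody as (intervals, durations).

    `notes` is a hypothetical input format: a list of
    (midi_pitch, duration_in_beats) pairs.
    Intervals are differences between consecutive pitches in semitones,
    so a melody of n notes yields n - 1 intervals and n durations.
    """
    pitches = [pitch for pitch, _ in notes]
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    durations = [quantize_duration(d) for _, d in notes]
    return intervals, durations


def transpose(notes, semitones):
    """Shift every pitch by the same number of semitones."""
    return [(pitch + semitones, d) for pitch, d in notes]
```

Because only pitch differences are stored, `encode_melody(notes)` and `encode_melody(transpose(notes, k))` return the same result for any `k`, which is the sense in which the interval-based code captures relative rather than absolute pitch.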