Hello,
In your paper, you explain that you split the inputs into chunks of 512 notes before feeding them to the model. (I don't think the 512-note chunking part of the code has been released yet, has it?)
As I understand it, you take 512 consecutive notes from the performance MIDI (two notes are unlikely to be exactly simultaneous, so there is no ambiguity), which makes sense. However, how do you get the corresponding notes from the XML file? I understand that the performance and the XML are beat-aligned, but I'm not sure how you deal with the (likely partially filled) first and last beats of the XML chunk.