Expose or export tempo information during inference

Hi, thanks for releasing this project. I am testing the inference path on piano performance MIDI and exporting the resulting `music21` score to MusicXML.

One interoperability issue I ran into is that the inferred MusicXML does not contain any explicit tempo information (`<direction><metronome>...` or `<sound tempo="..."/>`). As a result, downstream renderers and players such as alphaTab, OSMD-based playback pipelines, and MuseScore round-trips fall back to their default tempo.

Simply copying the input MIDI's first `set_tempo` is not always correct for performance-MIDI-to-score conversion. For example, a MIDI file may carry only the default 120 BPM tempo while the inferred score grid represents a slower or faster tactus, or the score may need a local tempo map to preserve expressive timing.
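To make the concern concrete: a MIDI `set_tempo` event stores microseconds per quarter note, and 500000 (120 BPM) is the value implied when a file has no tempo event at all. The sketch below shows the conversion; the helper names are my own, and treating an exact default as a placeholder is only a heuristic assumption.

```python
# MIDI default tempo when a file contains no set_tempo event.
DEFAULT_US_PER_QUARTER = 500_000


def set_tempo_to_bpm(us_per_quarter: int) -> float:
    """Convert a MIDI set_tempo value (microseconds per quarter note) to BPM."""
    if us_per_quarter <= 0:
        raise ValueError("tempo must be a positive number of microseconds")
    return 60_000_000 / us_per_quarter


def looks_like_placeholder(us_per_quarter: int) -> bool:
    """Heuristic (assumption): an exact default tempo says little about the music."""
    return us_per_quarter == DEFAULT_US_PER_QUARTER


print(set_tempo_to_bpm(500_000))        # 120.0
print(looks_like_placeholder(500_000))  # True
```

A performance recorded without a click track will often carry exactly this value, which is why copying it into the score can mislead playback.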

I noticed `MultistreamTokenizer.detokenize_mxl(..., midi_sequence=...)` has comments around using the input MIDI timings to create `(score offset, time in seconds)` pairs:

```py
# If we have the input midi timings, we can use them to set the tempo
# We first set tempo marks to track where their location `should` be
# The inserted tempo marks therefore form (offset, time in seconds) pairs.
```

However, the public inference helper `quantize_path()` currently calls `detokenize_mxl(y_hat)` without passing the MIDI sequence. In addition, the inserted `MetronomeMark(number=midi_sequence[i].start)` values appear to be seconds rather than BPM, so they are not directly valid MusicXML tempo markings for playback.
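For reference, this is roughly how I would expect `(score offset, time in seconds)` pairs to turn into BPM values. It is a minimal sketch, not the project's code: `tempo_map_from_pairs` is a hypothetical helper, and it assumes offsets are measured in quarter notes and that the pairs are sorted by offset.

```python
from typing import List, Tuple


def tempo_map_from_pairs(
    pairs: List[Tuple[float, float]],
) -> List[Tuple[float, float]]:
    """Turn (offset in quarter notes, time in seconds) pairs into (offset, bpm).

    Each returned entry gives the tempo that holds from that offset until the
    next pair. Segments that do not advance in both score and time are skipped.
    """
    tempo_map = []
    for (o0, t0), (o1, t1) in zip(pairs, pairs[1:]):
        d_offset = o1 - o0
        d_seconds = t1 - t0
        if d_offset <= 0 or d_seconds <= 0:
            continue  # degenerate or non-monotonic segment
        # quarter notes per second, scaled to quarter notes per minute
        tempo_map.append((o0, 60.0 * d_offset / d_seconds))
    return tempo_map


pairs = [(0.0, 0.0), (4.0, 3.0), (8.0, 5.0)]
print(tempo_map_from_pairs(pairs))  # [(0.0, 80.0), (4.0, 120.0)]
```

In practice the per-segment values would probably need smoothing or merging before export, since note-level onsets are noisy.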

Would it make sense for inference/export to expose one of these options?

1. preserve the input MIDI tempo when no score timing alignment is requested;
2. accept an explicit `bpm` argument for MusicXML export;
3. derive and export a tempo map from the alignment between input performance onset times and predicted score offsets/downbeats;
4. return the predicted score timing information separately, so callers can generate their own MusicXML `<sound tempo="..."/>` marks.
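If option 4 were available, the caller side could be fairly small. The sketch below uses only the standard library to add `<sound tempo="..."/>` elements to the first part of an uncompressed `score-partwise` document. The function name and the `tempo_by_measure` mapping are hypothetical, and it assumes one tempo per measure, keyed by the measure's `number` attribute.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def add_sound_tempos(musicxml: str, tempo_by_measure: Dict[str, float]) -> str:
    """Insert <sound tempo="..."/> into the listed measures of the first part."""
    root = ET.fromstring(musicxml)
    first_part = root.find("part")
    if first_part is None:
        raise ValueError("no <part> element found")
    for measure in first_part.findall("measure"):
        bpm = tempo_by_measure.get(measure.get("number"))
        if bpm is None:
            continue
        # Place the tempo right after <attributes> when present, else first.
        index = 0
        for i, child in enumerate(list(measure)):
            if child.tag == "attributes":
                index = i + 1
                break
        measure.insert(index, ET.Element("sound", {"tempo": f"{bpm:g}"}))
    # Note: ElementTree drops the DOCTYPE, so re-add it if a consumer needs it.
    return ET.tostring(root, encoding="unicode")


# Deliberately minimal document (no <part-list>), only to exercise the helper.
doc = (
    '<score-partwise version="4.0"><part id="P1">'
    '<measure number="1"><attributes><divisions>1</divisions></attributes></measure>'
    '<measure number="2"/>'
    "</part></score-partwise>"
)
out = add_sound_tempos(doc, {"1": 80, "2": 120.5})
```

This only covers playback tempo. A visible `<direction><metronome>` marking would need additional elements.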

For my local wrapper I can add a constant metronome mark after inference, but this is not enough for expressive performance MIDI. Without additional output from the model or inference step, downstream tools have no way to know the intended playback tempo.

