Add CPAL backend #786
This provides an out-of-the-box experience. Particularly CoreAudio systems (Macs) may report they only support F32 while librespot defaults to S16.
|
A |
Actually I'm scrapping the idea because it's more complicated than that, and wouldn't be very transparent to the user.
This has been discussed in various places, and is not the intended direction. There are good reasons to keep at least some of the others around not only as backup, but for their features. e.g. CPAL doesn't offer So while I think CPAL is great as a default, because it offers a good out-of-the-box experience, there's a place for (most of) the others as well. |
|
(btw one nice thing about a decoupled playback library would be the possibility of automated tests on Windows) |
|
Windows: |
|
My guess: it doesn't natively support 44100 Hz as a sample rate, so resampling would be necessary. Rodio did that, cpal doesn't. |
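To make the resampling step concrete, here is a minimal sketch of converting one channel from 44.1 kHz to 48 kHz (a ratio of 160/147). It uses naive linear interpolation, which is not what Rodio or any production resampler does; the function name `resample_linear` is made up for illustration, and there is no anti-aliasing filter.

```rust
/// Resample a single channel with naive linear interpolation.
/// Illustration only: no low-pass filtering, so the quality is well
/// below that of a proper sinc-based resampler.
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    if input.is_empty() || from_hz == 0 || to_hz == 0 {
        return Vec::new();
    }
    // Number of output frames that cover the same duration as the input.
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    // How far we advance in the input for every output frame.
    let step = from_hz as f64 / to_hz as f64;
    let last = input.len() - 1;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = (pos as usize).min(last);
            let next = (idx + 1).min(last);
            let frac = (pos - idx as f64) as f32;
            // Blend the two neighbouring input samples.
            input[idx] + (input[next] - input[idx]) * frac
        })
        .collect()
}
```

For example, `resample_linear(&samples, 44_100, 48_000)` turns 441 input frames (10 ms) into 480 output frames. For stereo this would have to run once per channel, which is where the interleaving question comes in.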
|
Thanks. Could you provide me with the binary? Can I set up a Rust compilation environment with MinGW? I don't have the disk space for all the Visual Studio SDK baggage. |
I think you're right. On Windows this seems to be a bit of a minefield, with various Realtek cards (and who knows which others) only taking 48 kHz and WASAPI not doing resampling. We certainly don't want to expose users to all this; it should just work out of the box. If so, there are two things we can do:
|
Yep, pretty much everyone assumes that more bits, more better. Implementation is really the important part.
No 24 bit is kinda a deal breaker. I was not privy to those discussions. I'm sorry. |
|
I briefly played around with the Rubato resampling crate but unfortunately its API is not directly compatible with ours. Rubato works with a To use Rubato we'd have to iterate over all samples two more times: once to split them into two vectors, then again to join them in a single vector. That looks awful and seems wasteful. There are other resampling libraries that I haven't investigated, because they are wrappers around Options I see now:
So as much as it pains me that it's only on Windows, I'm starting to feel it might be best to just leave Rodio be. Your opinion? |
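The two extra passes described above can be sketched as follows. This assumes the resampler wants one vector per channel, as the comment implies; `deinterleave` and `interleave` are hypothetical helper names, not part of Rubato or librespot.

```rust
/// Split interleaved frames (L R L R ...) into one vector per channel.
fn deinterleave(samples: &[f32], channels: usize) -> Vec<Vec<f32>> {
    assert!(channels > 0, "need at least one channel");
    let frames = samples.len() / channels;
    let mut planar = vec![Vec::with_capacity(frames); channels];
    // chunks_exact drops a trailing incomplete frame, if any.
    for frame in samples.chunks_exact(channels) {
        for (channel, &sample) in planar.iter_mut().zip(frame) {
            channel.push(sample);
        }
    }
    planar
}

/// Join per-channel vectors back into interleaved frames.
fn interleave(planar: &[Vec<f32>]) -> Vec<f32> {
    // Use the shortest channel so every frame is complete.
    let frames = planar.iter().map(Vec::len).min().unwrap_or(0);
    let mut out = Vec::with_capacity(frames * planar.len());
    for i in 0..frames {
        for channel in planar {
            out.push(channel[i]);
        }
    }
    out
}
```

Each call walks every sample once and allocates new buffers, which is the overhead the comment objects to.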
|
Lewton doesn't deliver the samples interleaved by default: https://docs.rs/lewton/0.10.2/lewton/samples/trait.Samples.html#tymethod.from_floats. If you want to rewrite everything yet another time, it would be possible to use this fact. Are you still interested in the binary? And are we sure we don't need resampling for any other OS than Windows? Is it possible that some Linux devices don't support the usual sample rate? Is it possible that formats other than Ogg (which librespot doesn't support for now) use different sample rates? Is it possible that Spotify HiFi will give us different sample rates? I think it's another good argument to create a librespot-tailored alternative crate to rodio as I suggested before, @roderickvd. Just to have more flexibility for any of these cases. But well... |
ALSA/Dmix defaults to resampling to 48 kHz if you go through the "Default" device because of the above reason that 48 kHz is actually the most commonly supported sampling rate for integrated sound cards (or at least it was when that decision was made). PulseAudio, PipeWire and GStreamer all sit on top of ALSA/Dmix. So on Linux resampling isn't really a must-have. The only way it would really be a draw is if it were really high quality, comparable to sox. |
Yes for Now suppose we would be able to extract non-interleaved samples from those crates. Then for the backends, it would probably add a bit of complexity. When writing non-interleaved samples (if the backends support it -- I haven't checked all of them although Alsa and GStreamer should be fine) you need to write periods ("chunks") of samples per channel. We would need to be careful to not run into latency issues and underruns.
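As a rough sketch of what writing per-channel periods would involve, the helper below cuts non-interleaved audio into fixed-size periods, each holding one slice per channel. The name `periods` and the idea of handing these slices to a backend are assumptions for illustration; no backend API is modelled here.

```rust
/// Split non-interleaved audio into periods ("chunks").
/// Each period contains one slice per channel, all of the same length.
fn periods<'a>(planar: &'a [Vec<f32>], period_frames: usize) -> Vec<Vec<&'a [f32]>> {
    assert!(period_frames > 0, "period size must be non-zero");
    // Only use as many frames as the shortest channel provides.
    let frames = planar.iter().map(Vec::len).min().unwrap_or(0);
    (0..frames)
        .step_by(period_frames)
        .map(|start| {
            // The final period may be shorter than period_frames.
            let end = (start + period_frames).min(frames);
            planar.iter().map(|channel| &channel[start..end]).collect()
        })
        .collect()
}
```

The period size sets the trade-off mentioned above: at 44.1 kHz a period of 441 frames is 10 ms of audio, so larger periods add latency while smaller ones leave less margin before an underrun.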
The question isn't so much the source format, but rather the flexibility of the platform. Then yes, it's really only an issue on Windows because the other platforms have easy options to transparently (from a UX perspective) resample to whatever the hardware supports. 44.1 kHz remains the Red Book CD-standard and that is what Spotify has also announced for Spotify HiFi. It's certainly possible that other sampling rates will also be offered in the future. Perhaps for streaming video or even multiples of 44.1 or 48 kHz. Again this wouldn't be a big deal on most platforms except on Windows. So I'm thinking: we can delve into changing vector layouts but aren't we introducing too much complexity for a single platform? Especially when the other options seem so much easier?
I know, I noticed your earlier suggestions too 😆 I don't oppose the idea. In fact, I think the way you're handling formats with generics here in CPAL is a nice middle of the road between what we have now, and what you proposed in It's not the highest on my list but it's on there. Once refactoring is done, we could look into extracting it.
No I believe that you were correct in your analysis. Thanks. |
That is true but not the answer to the question 😜 You basically just restated what I said in:
It's honestly surprising that Windows doesn't automatically resample. It has a software mixer doesn't it? But anyway I have nothing more to add so I'll see myself out ☮️ I don't own nor use a Windows machine on a regular basis. |
Which question did I forget to answer? 🤷‍♂️
Usually it should for most cards (except when in WASAPI exclusive mode -- in which we aren't) but Realtek cards in particular seem really aggressive in enforcing their own sampling rate. And they are pretty pervasive. |
I answered this question:
With:
And then you rephrased my answer as:
And responded with this for some reason?:
|
|
I kinda feel like I got mansplained... |
|
Sorry that should have been in response to @Johannesd3’s question if we’d ever need to support other sampling rates. I updated my comment accordingly. No ill intention here. |
|
Coming back to working with
Still doesn't quite fit the bill but if it sparks anyone's creativity... |
|
I'm scrubbing this idea for as long as we target Windows as a "tier 1" platform. Introducing resampling would just be redoing what Rodio already has, in spite of our ideas to do it a little better. One good thing that came from this is this idea for future work:
I'll keep my |
|
For reference, the Psst project is implementing a |
This is a working CPAL backend based on the extensive initial work by @Johannesd3 (thanks!).
The intention is to deprecate Rodio in favour of CPAL: Rodio is based on CPAL, and builds on it, but there is nothing extra we need in Rodio that isn't already in CPAL. This has been discussed in various places, notably #648 and #734 (comment).
So far I have successfully tested this backend on Linux (S16, F32) and macOS (F32).
Todo
CpalJack variant (like we also have RodioJack)
Open questions
What do we want to do with the Rodio backend for the upcoming release: replace it with CPAL? Or keep it around for planned deprecation in a release sometime later?
The code that checks whether the requested audio format is available, or falls back to the system default otherwise, actually returns the highest supported audio format by default. Currently librespot defaults to S16 but this code gives another opportunity: we could change format selection to an Option<AudioFormat>, selecting the highest quality by default unless specified otherwise. For other backends that do not easily support querying supported formats, we could still default to S16 unless specified otherwise. Should we get this in or leave it be? This would be another PR but I'd first like to hear your thoughts before I put time in.
No, it would work with idiosyncrasies for every backend and not be very transparent to the user.
For an out-of-the-box experience on Windows, we need resampling. Do we want to add this, or forget about this PR and stick with Rodio? See: Add CPAL backend #786 (comment)
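For reference, the Option<AudioFormat> idea from the open questions (which was dropped) could look roughly like the sketch below. The enum here is a stand-in with only three variants, the quality ranking is an assumption made through the derived ordering, and `select_format` is a hypothetical name; none of this is the PR's actual code or CPAL's API.

```rust
/// Stand-in for librespot's format enum. Variants are declared from
/// lowest to highest assumed quality so the derived Ord ranks them.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum AudioFormat {
    S16,
    S32,
    F32,
}

/// Pick the requested format if the device supports it; otherwise, or
/// when nothing was requested, pick the highest-ranked supported format.
/// Returns None only when the device reports no supported formats.
fn select_format(
    requested: Option<AudioFormat>,
    supported: &[AudioFormat],
) -> Option<AudioFormat> {
    match requested {
        Some(format) if supported.contains(&format) => Some(format),
        _ => supported.iter().copied().max(),
    }
}
```

With this sketch, a device that only reports F32 yields F32 even when S16 was requested, which matches the CoreAudio case mentioned at the top of the thread. A backend that cannot report its formats would still need its own fixed default, which is the per-backend inconsistency that led to the idea being dropped.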