Loudness war!

Started by bass, June 01, 2019, 00:02:04


bass

Sorry for the clickbait title. After a few years, I started programming my tracker player again. I have noticed that OpenMPT plays XM files louder than my player. I'm mixing everything in floats and finally converting the float data into 16-bit output. For this, I have to divide the data somehow depending on the channel count. There are many theories about what this divider should be. What are your experiences with this? I get clipping when my divider is too low. But OpenMPT doesn't clip and is still louder. Am I doing something wrong here?
Thanks!

Saga Musix

XM doesn't have a well-defined global volume. You could even change it in FT2, and this value wouldn't even be saved in XM files, so you can only guess what the correct amplification factor is. OpenMPT's default amplification is such that one full-volume voice is played at around -15dB in the mix, which has proven to be a rather good average value for lots of different modules. However, for some files it's too quiet and for others it's too loud, so it's not a silver bullet.
» No support, bug reports, feature requests via private messages - they will not be answered. Use the forums and the issue tracker so that everyone can benefit from your post.

bass

Thank you! So, this means, you divide each sample value on a channel by (1 / 0.177828)?
And does it make a difference for the sound scaling the sample values on each channel while mixing or scaling the final sample value after mixing?

Saga Musix

Quote from: bass on June 03, 2019, 14:14:11
Thank you! So, this means, you divide each sample value on a channel by (1 / 0.177828)?
No, you are taking my approximate description a bit too literally. :) The sample pre-amp is just one out of many factors leading to the final total amplification of every channel, and it's user-configurable. In XM it's 48 by default (found on the General tab), but it can be changed. The pre-amp is stored as an OpenMPT extension in XM files, so other players will ignore it completely and just use whatever pre-amp they want. XMPlay, for example, normalizes modules, so the pre-amp is calculated while the file is loading. For a tracker this is not a viable approach, though; it only makes sense for players.

Quote
And does it make a difference for the sound scaling the sample values on each channel while mixing or scaling the final sample value after mixing?
It depends. If you literally want to add a separate multiplication for the scaling to each channel, then the latter is less computationally expensive and introduces fewer floating-point requantization errors than doing it individually for each sample (every multiplication adds more error to the final result, so you want to keep the number of operations that require requantization down). However, you can of course just use the same "trick" as OpenMPT (it really shouldn't be called a trick, as it's a trivial optimization that you will find in any good mixer) and have a single amplification factor that is applied to each mixed sample and that is itself the product of the actual note volume, panning and the final scaling. This would end up with a single multiplication per audio channel per sample.
So if your mixer currently looks like this:

// Mixer
v = sample_value;
v *= note_volume;
v *= panning_left;
v *= global_scaling_factor;
out_left = v;
// same repeated for right channel

Then it would make more sense to optimize it to this:

// Somewhere outside of the mixer where the volume of each channel is calculated
volume_factor_left = note_volume * panning_left * global_scaling_factor;
volume_factor_right = note_volume * panning_right * global_scaling_factor;

...

// Mixer
out_left = sample_value * volume_factor_left;
out_right = sample_value * volume_factor_right;
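
For reference, the optimized variant could be fleshed out into compilable C roughly as follows. This is only a sketch: the Voice struct and the function names are hypothetical, resampling and interpolation are left out, and it assumes the voice supplies one mono sample value per output frame.

```c
#include <stddef.h>

/* Hypothetical per-voice state; the field names are illustrative only. */
typedef struct {
    const float *samples;  /* mono sample data, one value per output frame */
    float volume_left;     /* note_volume * panning_left  * global_scaling_factor */
    float volume_right;    /* note_volume * panning_right * global_scaling_factor */
} Voice;

/* Recompute the combined factors only when volume or panning changes. */
void voice_set_volume(Voice *v, float note_volume, float pan_left,
                      float pan_right, float global_scaling_factor)
{
    v->volume_left  = note_volume * pan_left  * global_scaling_factor;
    v->volume_right = note_volume * pan_right * global_scaling_factor;
}

/* Add one voice into an interleaved stereo buffer (L, R, L, R, ...). */
void mix_voice(const Voice *v, float *out, size_t frames)
{
    for (size_t i = 0; i < frames; ++i) {
        float s = v->samples[i];
        out[2 * i]     += s * v->volume_left;   /* one multiply per channel */
        out[2 * i + 1] += s * v->volume_right;
    }
}
```

The output buffer has to be zeroed before the first voice is mixed, since mix_voice only accumulates.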

bass

Thanks for clarification :-)
I already use the "trick" for the volume factor values (without global scaling factor, because I calculate this at the end when I convert the floating samples [-1, 1] to 16 bit samples).
I just wondered why OpenMPT is much louder playing XM or MOD files than my player, even if my global scaling factor is i.e. 0.25 for 16 channel XM file. Is there a compressor or soft clipping in OpenMPT?
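The final conversion step described above could look roughly like this. It is a sketch only: the function name is made up, it hard-clips anything outside the 16-bit range, and it applies no dithering.

```c
#include <stdint.h>

/* Scale a mixed float sample (nominally in [-1, 1] after scaling) and
 * convert it to 16-bit PCM, hard-clipping anything out of range. */
int16_t float_to_int16(float mixed, float global_scaling_factor)
{
    float scaled = mixed * global_scaling_factor * 32767.0f;
    if (scaled > 32767.0f)  scaled = 32767.0f;
    if (scaled < -32768.0f) scaled = -32768.0f;
    /* Round to nearest by adding half a step before truncating. */
    return (int16_t)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
}
```

A smaller global_scaling_factor avoids the clipping branches but makes the whole mix quieter, which is exactly the trade-off in question.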

Saga Musix

Quote from: bass on June 03, 2019, 14:55:58
my global scaling factor is i.e. 0.25 for 16 channel XM file.
Please don't adjust the global volume based on the number of channels. ModPlug used to do this and it's a terrible idea.
I cannot tell you what exact factor OpenMPT would be using because there are several possible answers - it depends on many things, including the user-chosen mix mode. In particular, the FT2 pan law is louder in the centre than the linear pan law, and whether OpenMPT uses a mix mode with FT2 pan law depends on which tracker was used to create that XM file. With linear pan law, the factor would be 48/256 (assuming a sample pre-amp of 48); with FT2 pan law it's higher.
However, what I can say is that there is certainly no compression or soft clipping going on - you can easily verify by comparing the original waveform and OpenMPT's output.

bass

OK, I didn't expect that this would be so confusing. But somehow the global volume ALSO has to be based on the number of channels. It's a big difference whether I play 4 or 32 channels. And I don't want clipping. Of course, I can let the user control the volume or compute the global volume while loading, but this seems too complicated to me. I thought there would be an "easy" answer to this. I wonder how other players handle this; it seems that there is no correct way?!
I also heard about dividing by sqrt(num of channels) or just dividing by number of channels, but this is unnecessary, I think.

Saga Musix

Imagine a piano player on stage playing for a concert. Now a flute player enters the stage to accompany the piano. Will the piano automatically become more quiet? Of course not! Maybe the piano player will adjust their play style but that is their decision.
The same goes for volume in a tracker: It is up to the user to decide on a volume level:
1. What if they only use quiet samples that are not normalized? If you decided to change the volume just because they used those samples on a lot of channels, their song will be too quiet.
2. What if they mostly use 8 out of those 32 channels, but the remaining 24 channels are only used very occasionally? Again, the song will be too quiet.
3. On the other hand, if they decide to put lots of simultaneous notes at full volume on all of those channels, they already have the opportunity to lower the global volume of their song to prevent it from clipping. But it's not up to you to make that decision for them.

It's okay to have a global amplification factor that is not stored individually for each song (like the sample pre-amp in S3M or IT files), but it should treat all songs equally, no matter how many channels they have.

TheRealByteraver

I'm a bit late to the discussion but I was wondering how MPT doesn't do soft clipping. Do you mean to say the mixer never ever reaches values below -32768 or above 32767? Or did I take this too literally?  ;)
While it might make some sense to change the amplify / divide factor based on the number of channels in an 8 bit mixer, I felt there was no need for such a construct in my 16 bit mixer, so I always use the same value. I guess one could implement some adaptive gain control algorithm such as ModPlug Player did. Not 100% sure how I would do such a thing though.

Saga Musix

Your assumption that OpenMPT uses a 16-bit mixer is wrong. It's always been a 32-bit mixer, or more precisely a 4.28 bit mixer (4 bits of headroom with 28 bits below 0dB). Depending on the output device or file format, the output is either unclipped (e.g. writing to a 32-bit float WAV file) or hard-clipped at 0dB. For all the reasons mentioned above, it makes sense to have a dynamic amplification factor defined by the user, not by the number of channels. Automatic gain control also should never be automatic. If the composer wants their music to be compressed automatically, let them put a compressor on their music.
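As an illustration of the 4.28 format, the sketch below shows how such a mixer value could be hard-clipped at 0dB for 16-bit output or passed on unclipped as float. This is not OpenMPT's actual code; the names and the exact rounding are assumptions for the example.

```c
#include <stdint.h>

#define MIX_FRAC_BITS 28                        /* 28 bits below 0dB */
#define MIX_ONE ((int32_t)1 << MIX_FRAC_BITS)   /* 0dB full scale in 4.28 */

/* Hard-clip a 4.28 mixer value at 0dB and reduce it to 16-bit PCM. */
int16_t mix_to_int16(int32_t v)
{
    if (v > MIX_ONE - 1) v = MIX_ONE - 1;
    if (v < -MIX_ONE)    v = -MIX_ONE;
    /* 16-bit PCM has 15 fractional bits, so drop 28 - 15 = 13 bits.
     * Division is used instead of a shift to stay portable for negatives. */
    return (int16_t)(v / ((int32_t)1 << (MIX_FRAC_BITS - 15)));
}

/* Convert a 4.28 mixer value to float without clipping;
 * values above 0dB simply come out larger than 1.0. */
float mix_to_float(int32_t v)
{
    return (float)v / (float)MIX_ONE;
}
```

A value of twice full scale still fits in the 32-bit integer: mix_to_int16 clips it to 32767, while mix_to_float returns 2.0.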

TheRealByteraver

Interesting, I had to read up on that. Does that mean OpenMPT feeds 32 bit float data per channel to "Windows" (the soundcard driver / wave mapper / ...)?
I made my mixer 32 bit fixed point, it scales down to 16 bit after the mixing, before feeding it to the wave mapper (I should use XAudio2 instead, will do so later). Does it still make sense to make a fixed point mixer in this day and age? For performance reasons I mean? Because it seems somehow easier to make it floating point.

manx

Quote from: TheRealByteraver on November 17, 2019, 08:59:46
Interesting, I had to read up on that. Does that mean OpenMPT feeds 32 bit float data per channel to "Windows" (the soundcard driver / wave mapper / ...)?
As already stated, OpenMPT uses 4.28 fixed point internally. This gets converted to whatever is configured in the soundcard settings as output format when playing live, or selected as output file format when rendering to a file.

Quote from: TheRealByteraver on November 17, 2019, 08:59:46
I made my mixer 32 bit fixed point, it scales down to 16 bit after the mixing, before feeding it to the wave mapper (I should use XAudio2 instead, will do so later).
Depending on the headroom/precision trade-off in your 32bit fixed point representation, you are losing precision by converting it to 16bit. All Windows audio APIs are capable of handling higher precision audio, so you could also send 24bit/32bit integer PCM or just 32bit float.

Quote from: TheRealByteraver on November 17, 2019, 08:59:46
Does it still make sense to make a fixed point mixer in this day and age? For performance reasons I mean?
No, OpenMPT uses fixed point purely due to historical reasons.

TheRealByteraver

Thanks for the answer manx :)
I had no idea Windows could be fed 32 bit audio data. But lo and behold, my mixer is now feeding it 32 bits per sample and it plays just fine. My imagination confirms the sound quality has definitely gone up ;) It should make a difference when using a lot of 16 bit samples I guess.
Quote
This gets converted to whatever is configured in the soundcard settings...
Do you mean OpenMPT's soundcard settings or Windows' soundcard settings? I guess it is the latter?

manx

Quote from: TheRealByteraver on November 17, 2019, 10:51:17
Quote
This gets converted to whatever is configured in the soundcard settings...
Do you mean OpenMPT's soundcard settings or Windows' soundcard settings? I guess it is the latter?
OpenMPT's soundcard settings determine what format OpenMPT uses to talk to the underlying API. Windows (since Vista) system level APIs always use 32bit float internally, and talk to the hardware with the Windows-configured sample format.

TheRealByteraver

All right, thanks for clearing that up :)