In theory, mixing channels is just adding the samples and dividing by the channel count. That leaves enough headroom to avoid clipping, but with 32 channels the output signal becomes very quiet; in that case we have far too much headroom. So which formula or method is useful in these cases? Hard limiting? Compression? Is there an easy, good-sounding way?
For file formats or situations that specify an actual global volume or amplification factor, the problem of reaching the desired target volume is basically transferred to the user.
In all other cases, a player generally resorts to some sort of heuristic, of which "divide by channel count" is one possibility, but not a very good one: for roughly uncorrelated sources, doubling the number of channels raises the mixed signal's average level by only about 3 dB, not the 6 dB that the worst-case peaks would suggest, so dividing by the channel count attenuates the mix far more than necessary.
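For illustration, here is a minimal sketch of that heuristic next to the common 1/sqrt(N) ("equal power") variant, assuming mono float buffers of equal length; all names are made up for the example and this is not any particular player's code.

```cpp
// Minimal sketch of two mixing heuristics for float sample buffers.
// All names are illustrative; nothing here is a specific player's code.
#include <cmath>
#include <cstddef>
#include <vector>

// Mix `channels` (each at least `frames` samples long) into one buffer.
// gain = 1/N      : can never clip, but becomes very quiet for large N.
// gain = 1/sqrt(N): keeps the average (RMS) level roughly constant for
//                   uncorrelated sources, at the cost of occasional peaks
//                   above 1.0 that a later limiter has to handle.
std::vector<float> mixChannels(const std::vector<std::vector<float>>& channels,
                               std::size_t frames,
                               bool equalPower)
{
    const float n = static_cast<float>(channels.size());
    const float gain = equalPower ? 1.0f / std::sqrt(n) : 1.0f / n;

    std::vector<float> out(frames, 0.0f);
    for (const auto& ch : channels)
        for (std::size_t i = 0; i < frames; ++i)
            out[i] += ch[i] * gain;
    return out;
}
```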
Dynamic range compression or hard limiting is a separate concern that can be applied after mixing in any case, in order to mitigate the clipping/distortion caused by going over 0 dBFS.
Also, if the number of mixed channels varies over time (as with the Windows system audio mixer, for example), you cannot even apply a heuristic based on channel count. In that case no channel gets attenuated at all and the result is just a plain addition of the source signals; Windows then applies a hard limiter to the result in order to avoid clipping.
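A crude sketch of that "just add everything, then limit" idea might look like the following; the clamp merely stands in for a real limiter (the actual Windows limiter uses gain smoothing rather than a plain clamp), and the function name is only illustrative.

```cpp
// Crude sketch of "sum everything, then limit": samples are added without
// any per-channel attenuation and the result is clamped to [-1, 1].
// A real system limiter smooths gain over time instead of hard-clipping,
// so this only demonstrates the structure, not production behaviour.
#include <algorithm>
#include <cstddef>
#include <vector>

void sumAndHardLimit(const std::vector<std::vector<float>>& sources,
                     std::vector<float>& out)
{
    std::fill(out.begin(), out.end(), 0.0f);
    for (const auto& src : sources)
        for (std::size_t i = 0; i < out.size() && i < src.size(); ++i)
            out[i] += src[i];

    for (float& s : out)
        s = std::clamp(s, -1.0f, 1.0f);   // hard limit at 0 dBFS
}
```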
And why do many players use floating-point sample data (-1.0 to 1.0) rather than, e.g., 32-bit integer data for their calculations? Is there a big difference in sound quality, or is it just simpler to convert the floating-point data to the actual output format?
Floating-point data is generally easier to work with because you can largely ignore data-type overflow if the signal gets too loud, as well as quantization distortion if it gets too quiet.
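A tiny, hypothetical example of what that forgiveness buys you: an intermediate float sum above full scale is still a perfectly valid number that can simply be scaled back down later, whereas the same operation on 32-bit integers has to be guarded against overflow at every step.

```cpp
// Illustration only: float tolerates intermediate values above full scale,
// 32-bit integer arithmetic does not.
#include <cstdint>
#include <cstdio>

int main()
{
    // Float: two nearly full-scale samples sum to 1.6, still a valid value,
    // which can be scaled back below 0 dBFS with no information lost.
    float a = 0.8f, b = 0.8f;
    float mixed = a + b;          // 1.6, above full scale but representable
    float fixed = mixed * 0.5f;   // bring it back into range afterwards

    // Int32: the same sum would overflow (undefined behaviour for signed
    // types in C++), so the mixer must widen the type or pre-attenuate.
    std::int32_t ia = 1'800'000'000, ib = 1'800'000'000;
    std::int64_t safe = static_cast<std::int64_t>(ia) + ib;

    std::printf("float mix: %f -> %f, int mix needs 64 bits: %lld\n",
                mixed, fixed, static_cast<long long>(safe));
}
```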
Sound-quality-wise, 32-bit float vs. 32-bit integer makes no practical difference; both provide more dynamic range than the human ear can resolve.
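As a back-of-the-envelope check (using the usual ~6 dB-per-bit rule): 32-bit integers give roughly 193 dB of dynamic range, while 32-bit float has a 24-bit mantissa, so about 144 dB at any one exponent, with the exponent sliding that window over an enormous range; either figure is well beyond the roughly 120 dB span of human hearing.

```cpp
// Rough dynamic-range figures, computed as 20*log10(2^bits).
#include <cmath>
#include <cstdio>

int main()
{
    std::printf("32-bit integer : %.1f dB\n", 20.0 * std::log10(std::pow(2.0, 32)));
    std::printf("float mantissa : %.1f dB\n", 20.0 * std::log10(std::pow(2.0, 24)));
}
```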