Mixing channels

Started by bass, May 28, 2017, 18:06:04


bass

Hi,

In theory, mixing channels is just adding the samples and dividing by the channel count, which leaves enough headroom to avoid clipping. But with 32 channels the output signal is very quiet and there is far more headroom than needed. So which formula or method is useful in these cases? Hard limiting? Compression? Is there an easy, good-sounding way?

And why do many players use floating-point sample data (-1..0..1) rather than, e.g., 32-bit integer data for calculations? Is there a big difference in sound quality, or is it just simpler to convert floating-point data to the actual output format?

Thanks!

manx

Quote from: bass on May 28, 2017, 18:06:04
In theory, mixing channels is just adding the samples and dividing by the channel count, which leaves enough headroom to avoid clipping. But with 32 channels the output signal is very quiet and there is far more headroom than needed. So which formula or method is useful in these cases? Hard limiting? Compression? Is there an easy, good-sounding way?

For file formats or situations that specify an actual global volume or amplification factor, the problem of reaching the desired target volume is basically transferred to the user.

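In all other cases, a player generally resorts to some sort of heuristic. "Divide by channel count" is one such heuristic, but not a very good one, because it assumes that all channels peak at the same moment. As a rule of thumb, the level of a mix of unrelated signals rises by only about 3 dB per doubling of the channel count, not by the 6 dB per doubling that dividing by the channel count compensates for.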
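To make the 3 dB versus 6 dB rule of thumb concrete, here is a minimal sketch comparing the "divide by channel count" heuristic with a "divide by the square root of the channel count" heuristic. The square-root rule is an illustration that follows from assuming uncorrelated channels, not something any particular player is known to use, and all function names are made up for this example.

```python
import math

def mix(channels, gain):
    """Sum equally long channels sample by sample, then apply one gain."""
    return [gain * sum(frame) for frame in zip(*channels)]

def gain_divide_by_count(n):
    # Can never clip, but assumes all channels peak at the same moment.
    return 1.0 / n

def gain_divide_by_sqrt(n):
    # Assumes uncorrelated channels, whose powers (not amplitudes) add.
    return 1.0 / math.sqrt(n)

def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)

# Level change when the channel count doubles:
correlated_db = to_db(2.0)               # amplitudes add: about 6.02 dB
uncorrelated_db = to_db(math.sqrt(2.0))  # powers add: about 3.01 dB

# Attenuation applied to a 32-channel mix by each heuristic:
att_count = to_db(gain_divide_by_count(32))  # about -30.1 dB
att_sqrt = to_db(gain_divide_by_sqrt(32))    # about -15.05 dB

mixed = mix([[0.5, -0.5], [0.5, 0.5]], gain_divide_by_count(2))
```

Unlike dividing by the channel count, the square-root rule can still exceed full scale when channels happen to line up, so it needs some form of clipping protection after the mix.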

Dynamic range compression or hard limiting is a separate concern. Either can be applied after mixing anyway, in order to mitigate the clipping/distortion caused by going over 0 dBFS.

Also, if the number of mixed channels varies over time (like for example in the situation of the Windows system audio mixer), you cannot even apply any heuristic based on channel count. Thus, no channel gets attenuated at all and the result is just a plain addition of the source signals. Windows applies a hard limiter to the result in order to avoid clipping.
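As a rough illustration of "plain addition followed by clipping protection", the sketch below sums the channels without any attenuation and then clamps the result to full scale. The clamp is the crudest possible stand-in: a real limiter reduces gain smoothly around peaks, and this is not a description of how the Windows mixer is actually implemented. Function names are hypothetical.

```python
def mix_unattenuated(channels):
    """Plain addition of the source signals, no per-channel attenuation."""
    return [sum(frame) for frame in zip(*channels)]

def hard_clip(samples, ceiling=1.0):
    """Clamp every sample to [-ceiling, +ceiling]."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

a = [0.75, -0.5, 0.25]
b = [0.5, -0.75, 0.25]

mixed = mix_unattenuated([a, b])  # first two samples exceed full scale
output = hard_clip(mixed)         # those samples are held at +/-1.0
```

Samples that stay within full scale pass through unchanged, so quiet mixes are not affected at all.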

Quote from: bass on May 28, 2017, 18:06:04
And why do many players use floating-point sample data (-1..0..1) rather than, e.g., 32-bit integer data for calculations? Is there a big difference in sound quality, or is it just simpler to convert floating-point data to the actual output format?

Floating point data is generally easier to work with because you can basically ignore any data type overflow issues if the signal is too loud or quantization distortion if the signal is too quiet.
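A small sketch of both problems, assuming unchecked 32-bit integer arithmetic (emulated here with `ctypes.c_int32`, since Python's own integers never overflow). The sample values are arbitrary and the function name is made up for this example.

```python
import ctypes

def add_int32_wrapping(a, b):
    """Add two samples the way unchecked 32-bit integer arithmetic would."""
    return ctypes.c_int32(a + b).value

# Too loud: the integer sum does not fit into 32 bits and wraps around.
int_sum = add_int32_wrapping(2_000_000_000, 2_000_000_000)

# The float sum exceeds full scale but stays intact,
# so it can still be scaled back down afterwards.
float_sum = 0.75 + 0.75
recovered = float_sum * 0.5

# Too quiet: attenuating and re-amplifying a small integer sample loses it,
# while the float version survives almost unchanged.
quiet_int = (3 // 1000) * 1000
quiet_float = (3e-9 / 1000.0) * 1000.0
```

An integer mixer can avoid both effects with wider accumulators and careful scaling, but that bookkeeping is exactly what floating point makes unnecessary.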

In terms of sound quality, 32-bit float versus 32-bit integer does not make any practical difference. Both provide more dynamic range than the human ear can resolve.

Saga Musix

To give some ideas:
For example, XMPlay/BASS.DLL renders a very low-quality version of the module while it is loading, which gives a pretty good estimate of how loud the module will be overall. This estimate is then used as an amplification factor.
However, apart from simple normalization, a module player really should not modify the dynamic range of modules. The dynamic range should be under full control of the artist.
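The pre-scan idea can be sketched as follows. This is only an illustration of simple normalization: the peak-based estimate and the function names are assumptions, not the actual XMPlay/BASS.DLL algorithm. Because the result is a single constant gain, the dynamic range of the module is left untouched.

```python
def estimate_gain(preview_samples, target_peak=1.0):
    """Derive one constant gain from a cheap preview render."""
    peak = max((abs(s) for s in preview_samples), default=0.0)
    if peak == 0.0:
        return 1.0  # silent preview: leave the level alone
    return target_peak / peak

def apply_gain(samples, gain):
    """Scale every sample by the same factor."""
    return [gain * s for s in samples]

# Stand-in for a low-quality render of the whole module.
preview = [0.125, -0.25, 0.0625]
gain = estimate_gain(preview)

# Stand-in for the full-quality playback signal.
final = apply_gain([0.125, -0.25], gain)
```

Since the preview is only an approximation of the real render, the final signal may still overshoot slightly, so some safety margin in `target_peak` would be reasonable.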
» No support, bug reports, feature requests via private messages - they will not be answered. Use the forums and the issue tracker so that everyone can benefit from your post.

bass

Thanks for the competent answers!
Last question: How is this solved in OpenMPT? Just manual setting?

Saga Musix

OpenMPT has a deprecated feature named Automatic Gain Control, which is basically a slow compressor/limiter, but it really should not be used. Since OpenMPT is a tracker, the artist has full control over setting the correct volume levels, which then (in theory) sound identical on other people's machines as well.