If your architecture does not provide saturating arithmetic, you have to clamp by hand, which is of course costly. One way to avoid that is to give up some mixing precision in exchange for headroom (e.g. treat an 8 bit mix buffer as 7 bit + 1 bit of headroom), which is what OpenMPT does on a larger scale (32 bit mix buffer, 28 bit = 0dB, so 4 bits of headroom). Mixing in 8 bit precision is a really bad idea, though; even DOS trackers like IT and FT2 didn't do this. You will get extremely noisy output, not only because 8 bit is noisy in general but also because the headroom you need costs you even more resolution. If you absolutely must output 8 bit audio in the end, it makes more sense to mix into a 32 bit buffer and then dither the result down to 8 bit. Or, if you have no restrictions on the output format, do the mixing in floating point right away, so that no wraparound can occur in the first place.
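To make the 32 bit suggestion concrete, here is a rough sketch in C of mixing signed 8 bit samples into a 32 bit buffer where 28 bit = 0dB, then clamping and dithering down to 8 bit once at the end. This is not OpenMPT's actual code; all names (`MIX_BITS`, `mix_channel`, `mix_to_s8`, `tpdf_dither`), the 0..256 volume scale, and the simple LCG used as a noise source are assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: 28-bit signed signal range inside an int32_t,
   which leaves 4 bits of headroom above 0 dB. */
#define MIX_BITS 28
#define MIX_FULL_SCALE ((int32_t)1 << (MIX_BITS - 1))

/* Add one channel of signed 8-bit samples to the mix buffer.
   volume is 0..256 (256 = unity). No clamping happens here; the
   headroom only covers up to 16 full-scale channels. */
static void mix_channel(int32_t *mix, const int8_t *src, size_t n, int32_t volume)
{
    assert(volume >= 0 && volume <= 256);
    for (size_t i = 0; i < n; i++)
        mix[i] += (int32_t)src[i] * volume * (1 << (MIX_BITS - 16));
}

/* Tiny LCG as a stand-in noise source (hypothetical, not high quality). */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Triangular (TPDF) dither: difference of two uniform values,
   so the result lies strictly between -2^bits and +2^bits. */
static int32_t tpdf_dither(uint32_t *state, int bits)
{
    assert(bits > 0 && bits < 31);
    int32_t a = (int32_t)(lcg_next(state) >> (32 - bits));
    int32_t b = (int32_t)(lcg_next(state) >> (32 - bits));
    return a - b;
}

/* Reduce one mixed sample to signed 8 bit: dither, round, clamp. */
static int8_t mix_to_s8(int32_t v, uint32_t *rng)
{
    const int shift = MIX_BITS - 8;                  /* 20 bits get discarded */
    const int64_t max_u = ((int64_t)1 << MIX_BITS) - 1;

    /* Bias into an unsigned range so the shift never sees a negative value. */
    int64_t x = (int64_t)v + MIX_FULL_SCALE;
    x += (1 << (shift - 1)) + tpdf_dither(rng, shift);

    /* The only clamp in the whole path: once per output sample. */
    if (x < 0) x = 0;
    if (x > max_u) x = max_u;
    return (int8_t)((x >> shift) - 128);
}
```

The point of this layout is that the per-channel inner loop is a plain multiply-add with no clamping, and the clamp plus dither runs once per output sample instead of once per channel. If more than 16 full-scale channels can play at once, the channels would have to be attenuated before summing, or the sum could still overflow.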