Curiosity: Is Pattern Data compressed?

Started by Kitsune_Phoenix, March 12, 2015, 09:24:01

Kitsune_Phoenix

This isn't very important, but I learned somewhere that many text-based formats (for pretty much anything) are completely uncompressed, and it is pretty obvious that pretty much all pattern data in modules is text. So this made me curious; when a module is compressed in OpenMPT, are only the samples and instruments compressed, or is the pattern data compressed too?

Thanks in advance! ^.^

Saga Musix

Most formats have simple (S3M or XM) or more clever (IT) pattern compression schemes. MOD for instance is completely uncompressed. Very few formats (including IT) offer optional sample compression, too. As such, OpenMPT never "compresses" modules (for instance, instrument and sample headers always have the same size, no matter what kind of easily compressible information is in there), it merely follows the specifications of these formats, which precisely describe how patterns have to be stored.
If you're interested in the details, you can always look at the format specs.
» No support, bug reports, feature requests via private messages - they will not be answered. Use the forums and the issue tracker so that everyone can benefit from your post.

Kitsune_Phoenix

Yes please, if you could post a link to them, I'd love to read that. ^.^

That makes me think though; why not enable Pattern and Header compression in MPTM format, since it is exclusive to OpenMPT?

Saga Musix

Quote:
Yes please, if you could post a link to them, I'd love to read that. ^.^
ftp://ftp.modland.com/pub/documents/format_documentation/

Quote:
That makes me think though; why not enable Pattern and Header compression in MPTM format, since it is exclusive to OpenMPT?
Because that would require the whole format to be re-specified and would lose 1) backwards compatibility with regular IT players (MPTM is nothing more than an IT file with added features) and 2) backwards compatibility with previous OpenMPT versions, all for no good reason. The uncompressed parts of IT/MPTM are completely outweighed by patterns and samples (the fixed-size headers take 554 bytes per instrument and 80 bytes per sample), so that really doesn't justify screwing up the current format to save a few bytes. MPTM is already small because it's based on IT, but it was never meant to squeeze the last bit out of everything.
Eventually we'll need a completely new MPTM format anyway, but whether that will be completely compressed or not remains to be seen.
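
To put the header sizes quoted above into perspective, here is a rough back-of-the-envelope sketch. The module itself (20 instruments, 30 samples, 2 MiB on disk) is a made-up example, not a measurement of any real file.

```python
# Fixed header sizes as quoted in the post above.
INSTRUMENT_HEADER_BYTES = 554
SAMPLE_HEADER_BYTES = 80


def fixed_header_bytes(num_instruments: int, num_samples: int) -> int:
    """Total size of the fixed-size instrument and sample headers."""
    return (num_instruments * INSTRUMENT_HEADER_BYTES
            + num_samples * SAMPLE_HEADER_BYTES)


# Hypothetical module: 20 instruments, 30 samples, 2 MiB file size.
headers = fixed_header_bytes(20, 30)
module_size = 2 * 1024 * 1024

print(headers)                          # 13480
print(f"{headers / module_size:.2%}")   # 0.64%
```

Even if those headers compressed to nothing, a file like this would shrink by well under one percent.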

LPChip

Also, if you compress something inside the file, running compression software on the file afterwards will not make it any smaller, and will usually make it larger.

So an internally compressed .mptm file would be smaller than the result of zipping it, but if you zip the .mptm file as it is now, chances are the result will be smaller than what an internally compressed .mptm could achieve. Not to mention the difference when you compress lots of .mptm files into one zip file.
"Heh, maybe I should've joined the compo only because it would've meant I wouldn't have had to worry about a damn EQ or compressor for a change. " - Atlantis
"yes.. I think in this case it was wishful thinking: MPT is makng my life hard so it must be wrong" - Rewbs

Saga Musix

Quote from: LPChip on March 12, 2015, 20:39:05
Also, if you compress something inside the file, running compression software on the file afterwards will not make it any smaller, and will usually make it larger.
This common misconception is not necessarily true, because it is incomplete.
Applying several general-purpose packers on the same data set iteratively will indeed gain you next to nothing. However, specialized packers which are optimized to remove the redundancies of a specific kind of data (e.g. patterns) are often a good pre-processing step that can greatly shrink the resulting file. As an example, if you have highly repetitive pattern data, it is often a good idea to transform it into something without all the redundant data. Don't save data that basically doesn't exist or doesn't contain sensible information (e.g. empty pattern cells).
Another great example is IT's sample compression format, which significantly reduces the size of samples in IT files; however, putting such a compressed sample in a ZIP file will still yield a smaller file than just zipping the original, uncompressed sample.
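
The "don't store empty cells" idea can be sketched roughly as follows. This is a made-up scheme for illustration only, not the actual S3M, XM or IT pattern encoding: the 5-byte cell layout, `pack_cells` and `unpack_cells` are all assumptions. Each cell gets one mask byte saying which fields are present, followed by just those fields.

```python
import zlib

CELL_FIELDS = 5  # hypothetical cell: note, instrument, volume, effect, parameter


def pack_cells(cells):
    """Write one mask byte per cell, then only the non-zero fields."""
    out = bytearray()
    for cell in cells:
        mask = 0
        present = bytearray()
        for bit, value in enumerate(cell):
            if value:
                mask |= 1 << bit
                present.append(value)
        out.append(mask)
        out += present
    return bytes(out)


def unpack_cells(data):
    """Inverse of pack_cells: rebuild the full 5-field cells."""
    cells = []
    pos = 0
    while pos < len(data):
        mask = data[pos]
        pos += 1
        cell = []
        for bit in range(CELL_FIELDS):
            if mask & (1 << bit):
                cell.append(data[pos])
                pos += 1
            else:
                cell.append(0)
        cells.append(tuple(cell))
    return cells


# Toy pattern: 64 rows x 8 channels, a note on every 4th row of channel 0.
EMPTY = (0, 0, 0, 0, 0)
pattern = []
for row in range(64):
    for channel in range(8):
        if channel == 0 and row % 4 == 0:
            pattern.append((60, 1, 64, 0, 0))
        else:
            pattern.append(EMPTY)

raw = b"".join(bytes(cell) for cell in pattern)
packed = pack_cells(pattern)
assert unpack_cells(packed) == pattern  # the round trip is lossless

print("raw:", len(raw), "packed:", len(packed))  # raw: 2560 packed: 560
print("zlib(raw):", len(zlib.compress(raw)),
      "zlib(packed):", len(zlib.compress(packed)))
```

The last line lets you compare a general-purpose packer on the raw and on the pre-processed data. A toy pattern this regular is easy for zlib either way, so the comparison is only meaningful on real pattern data; measure rather than assume.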

Kitsune_Phoenix

Quote from: Saga Musix on March 12, 2015, 21:18:32
However, specialized packers which are optimized to remove the redundancies of a specific kind of data (e.g. patterns) are often a good pre-processing step that can greatly shrink the resulting file.
Quote:
Another great example is IT's sample compression format, which significantly reduces the size of samples in IT files; however, putting such a compressed sample in a ZIP file will still yield a smaller file than just zipping the original, uncompressed sample.
Somebody agrees with me! :D

Quote from: Saga Musix on March 12, 2015, 19:07:27
Eventually we'll need a completely new MPTM format anyway, but whether that will be completely compressed or not remains to be seen.

On that subject, you could probably do an XML-based format with an MPX or MPTX extension (maybe with extremely extensive metadata), sorta like how MS Word 2007 uses DOCX as opposed to DOC.

Saga Musix

Quote from: Kitsune_Phoenix on March 12, 2015, 22:55:28
On that subject, you could probably do an XML-based format with an MPX or MPTX extension (maybe with extremely extensive metadata), sorta like how MS Word 2007 uses DOCX as opposed to DOC.
I've gone down that road and I won't go down it again. Honestly, if it's going to be text-based, then I'd rather use something more compact like JSON, maybe. JSON vs XML is another good example of the point in my previous post, btw: XML contains a lot of unnecessary redundancy which just bloats the file, and it also bloats the compressed result. In the explanation above, you can think of converting XML to JSON as a pre-processing step which then results in data which is still smaller than just compressing the XML data.
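
A quick way to try the JSON vs XML comparison yourself, using only the Python standard library. The row data and the field names (`row`, `note`, `instrument`, `volume`) are invented for this sketch and do not correspond to any real module format.

```python
import json
import zlib
import xml.etree.ElementTree as ET

# Hypothetical pattern rows; the field names are made up for illustration.
rows = [{"row": i, "note": 60 + i % 12, "instrument": 1, "volume": 64}
        for i in range(256)]

# Compact JSON: no spaces after separators.
json_bytes = json.dumps(rows, separators=(",", ":")).encode("utf-8")

# The same data as XML, one child element per field.
root = ET.Element("pattern")
for r in rows:
    cell = ET.SubElement(root, "cell")
    for key, value in r.items():
        ET.SubElement(cell, key).text = str(value)
xml_bytes = ET.tostring(root, encoding="utf-8")

for name, blob in (("JSON", json_bytes), ("XML", xml_bytes)):
    print(name, "raw:", len(blob), "zlib:", len(zlib.compress(blob, 9)))
```

The raw XML is larger because every field name appears twice, once in the opening and once in the closing tag. How much of that difference survives compression depends on the data, so treat the printed numbers as one data point only.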

Kitsune_Phoenix

I've... never actually heard of JSON before.

arseniiv

There is also a 'half-text, half-binary' format named Bencode, originally used in BitTorrent. Maybe it's worth mentioning. Bencoded data can be smaller than the equivalent JSON (and can be parsed/rendered more quickly, I suppose), but JSON is more readable (does readability matter for a module format? :-\ ).
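
For anyone who hasn't seen Bencode, here is a minimal encoder sketch. It covers the four Bencode types (integers, byte strings, lists, dictionaries with sorted keys); the `cell` dictionary is a made-up example, and encoding Python `str` values as UTF-8 is a choice made for this sketch.

```python
import json


def bencode(value) -> bytes:
    """Encode ints, str/bytes, lists and dicts using Bencode rules."""
    if isinstance(value, bool):
        raise TypeError("Bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value                   # i<number>e
    if isinstance(value, str):
        value = value.encode("utf-8")            # sketch choice: UTF-8 text
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)    # <length>:<bytes>
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda kv: kv[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


# Hypothetical pattern cell, encoded both ways for comparison.
cell = {"note": 60, "instrument": 1, "volume": 64}
print(bencode(cell))  # b'd10:instrumenti1e4:notei60e6:volumei64ee'
print(json.dumps(cell, separators=(",", ":")).encode("utf-8"))
```

Which encoding ends up smaller depends on the data: Bencode avoids quotes and escaping but adds length prefixes and `i…e` wrappers around integers.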
Feel free to correct my English.
Music & sounds: [Freesound] [SoundCloud] [Direct links to music]

Saga Musix

Well, some people seem to argue that human-readable formats are easier for them to parse etc... I personally don't really think that it's necessary, since binary formats can be easy to parse as well (especially if they follow some standardized formatting that is available to many platforms, such as Google's protobuf). I'd rather have a binary format that's fast to load than a text format.
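
To illustrate that a fixed binary layout can be just as easy to parse, here is a small standard-library sketch. It is not protobuf, and the header layout (a `DEMO` magic, version, channel count, pattern count) is entirely hypothetical.

```python
import struct

# Hypothetical little-endian header:
# 4-byte magic, uint16 version, uint16 channels, uint32 pattern count.
HEADER = struct.Struct("<4sHHI")


def parse_header(data: bytes) -> dict:
    """Parse the made-up 12-byte header at the start of data."""
    magic, version, channels, patterns = HEADER.unpack_from(data)
    if magic != b"DEMO":
        raise ValueError("not a DEMO file")
    return {"version": version, "channels": channels, "patterns": patterns}


blob = HEADER.pack(b"DEMO", 1, 8, 32)
print(len(blob))           # 12
print(parse_header(blob))  # {'version': 1, 'channels': 8, 'patterns': 32}
```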

arseniiv

Speaking of protobuf, which I hadn't known about before I finished reading the topic you linked here, it seems to be a far better route than Bencode... The former is a complete framework! :D

Saga Musix

It's a good and a bad thing. Good because it makes 3rd party code for manipulating the resulting files easier. Bad because libopenmpt doesn't really have any third-party dependencies (apart from zlib for extracting J2B files) and it should stay that way. It's a difficult matter, and that's one of the reasons why MPTM is still a bastardized IT variant.

arseniiv

Ah, definitely. I just forgot about libopenmpt...

Kitsune_Phoenix

Quote from: Saga Musix on March 16, 2015, 13:43:52
It's a good and a bad thing. Good because it makes 3rd party code for manipulating the resulting files easier. Bad because libopenmpt doesn't really have any third-party dependencies (apart from zlib for extracting J2B files) and it should stay that way. It's a difficult matter, and that's one of the reasons why MPTM is still a bastardized IT variant.
A tower of duct tape, sorta like Windows, the GBA Metroid games, and the Source engine. *giggles*

Back on topic, when you eventually create the new format, what should the file extension be?