Metadata for Modules?

Started by Hermandar, June 22, 2015, 14:12:10


Hermandar

First, I'd like to thank the ModPlug developers for their excellent work in creating fun to use music software with accurate playback.

I have a question on organizing repositories of .MOD (.IT, .XM, ...) files. While formats like MP3 or OGG have dedicated areas and structured formats for ample metadata, modules offer only very basic metadata: The least common denominator is that you can write a short title and (ab)use the instrument/sample names as a general comment area.
However, I would like something a little more sophisticated. I also don't want to change the comments that many modules carry in their sample names, and the data should belong to the file itself. That way it is (a) not lost through copying and (b) independent of any music library software's database format.
The best solution I came up with was to use APEv2 tags. These can be written at the end of the file, so they should not interfere with playing/loading them. There are just two problems:
1. Does this really not interfere with playback, or might some programs mistake the metadata for additional song or sample data?
2. Module playback software won't look for APE tags at the end of a module file, so this data can be used for sorting, but is not displayed on playback.

Does anyone have experience with this kind of metadata? What solutions are you using to organize your modules?

Cheers,
Hermandar

Saga Musix

At the very least, OpenMPT's own MPTM format will break if you append anything at the end, since someone had the brilliant idea to store a pointer there that points to some extended data fields. Certain feature detections in the MOD format also depend on the file size being exactly the expected size. No matter what you come up with, one format or another will not like it, so my simple advice is: "don't." It might be smarter to use OS-provided tagging functionality where the file format doesn't allow for it, e.g. Alternate Data Streams, and of course the classic way of organizing modules into artist folders. I don't know what you want to achieve in the end, but proper audio players let you tag files without modifying them, for example through XMPlay's built-in library functionality.

PS: Some module formats don't even have titles or sample names. The least common denominator is not having any metadata at all.
» No support, bug reports, feature requests via private messages - they will not be answered. Use the forums and the issue tracker so that everyone can benefit from your post.

manx


Hello,

Quote from: Hermandar on June 22, 2015, 14:12:10
The best solution I came up with was to use APEv2 tags. These can be written at the end of the file, so they should not interfere with playing/loading them. There are just two problems:
1. Does this really not interfere with playback, or might some programs mistake the metadata for additional song or sample data?

NO! At the very least, this will not work for any XM/IT/MPTM file that makes use of ModPlug Tracker extensions (and probably not for other formats either). ModPlug Tracker extensions are stored at the end of the file and will not be parsed successfully if any other data is appended after them. This feature is used extensively in the wild.
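
The failure mode can be shown with a toy container. The layout below (song data, an extension block, then a 4-byte offset in the last bytes of the file) and the `EXT1` marker are made up for illustration; this is not the real MPTM, IT or XM layout, only a sketch of why a reader that trusts the end of the file stops finding its data once something is appended.

```python
import struct

MAGIC = b"EXT1"  # hypothetical marker, not a real OpenMPT identifier


def build_toy_file(song_data: bytes, extension: bytes) -> bytes:
    """Lay out: song data, extension block, then a 4-byte offset to that block."""
    ext_offset = len(song_data)
    return song_data + MAGIC + extension + struct.pack("<I", ext_offset)


def read_extension(blob: bytes):
    """Follow the trailing pointer; return the extension payload or None."""
    if len(blob) < 4:
        return None
    (ext_offset,) = struct.unpack("<I", blob[-4:])
    if ext_offset + len(MAGIC) > len(blob) - 4:
        return None  # pointer leads outside the file
    if blob[ext_offset:ext_offset + len(MAGIC)] != MAGIC:
        return None  # pointer does not lead to an extension block
    return blob[ext_offset + len(MAGIC):-4]


original = build_toy_file(b"pattern and sample data", b"tuning=custom")
# Stand-in for an appended tag footer: the last 4 bytes are no longer the pointer.
tagged = original + b"APETAGEX" + b"\x00" * 24

print(read_extension(original))  # the extension payload is found
print(read_extension(tagged))    # None: the reader no longer finds its data
```

In this toy case the reader fails cleanly; a real loader might instead silently drop the extended data or misinterpret the appended bytes.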

Quote from: Hermandar on June 22, 2015, 14:12:10
2. Module playback software won't look for APE tags at the end of a module file, so this data can be used for sorting, but is not displayed on playback.
Does anyone have experience with this kind of metadata? What solutions are you using to organize your modules?

libopenmpt (http://lib.openmpt.org/) based players are able to extract the following metadata (for module formats that actually store it):

  • file format (obviously)
  • tracker software
  • song title
  • artist name
  • date
  • comment text
  • duration
  • subsong structure and titles
  • instrument, sample, pattern, channel and subsong names
Which, I guess, is more than most other players will offer (as libopenmpt maintainer, I'm biased of course).


Saga Musix develops ModLibrary (https://github.com/sagamusix/ModLibrary). I myself am not using it, thus I cannot say that much about it. Saga Musix will probably further comment on that one.


Module formats in general are already very fragile and ambiguous. Extending them with some random metadata format will only increase the confusion and make life harder for anyone having to deal with them. We really urge you not to do that, not even for your own private collection. Player software WILL break.

We, the OpenMPT team, will continue to support reading all kinds of metadata that we come across with our libopenmpt player library (the libopenmpt interface is extensible in that regard), and we will also continue to evolve our own format with further metadata that we deem useful (in fact, Saga Musix added artist name support to MPTM just some weeks ago in the current OpenMPT test builds).

PLEASE leave old formats alone and do not, in any regard, change their on-disk format. This will break older player software that does not expect and could never have anticipated that kind of format change.
Whatever you do, just store your own personal metadata outside the actual files themselves. In order to be searchable, any metadata would have to be stored in some kind of player database anyway.
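
A minimal sketch of metadata kept outside the files: a small SQLite table keyed by a hash of each file's contents, so that renaming or copying a module does not detach its tags. The table layout, the columns (artist, year, genre) and all function names are assumptions for illustration; this is not how ModLibrary or XMPlay store their data.

```python
import hashlib
import sqlite3


def content_key(data: bytes) -> str:
    """Identify a module by a hash of its bytes rather than by its path."""
    return hashlib.sha256(data).hexdigest()


def open_library(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the side database; ':memory:' keeps it in RAM."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tags ("
        "key TEXT PRIMARY KEY, artist TEXT, year INTEGER, genre TEXT)"
    )
    return db


def tag(db: sqlite3.Connection, data: bytes, artist: str, year: int, genre: str) -> None:
    """Store or overwrite the tags for the module with these contents."""
    db.execute(
        "INSERT OR REPLACE INTO tags VALUES (?, ?, ?, ?)",
        (content_key(data), artist, year, genre),
    )


def lookup(db: sqlite3.Connection, data: bytes):
    """Return (artist, year, genre) for these contents, or None if untagged."""
    return db.execute(
        "SELECT artist, year, genre FROM tags WHERE key = ?",
        (content_key(data),),
    ).fetchone()


db = open_library()
module_bytes = b"placeholder for the contents of some module file"
tag(db, module_bytes, "Example Artist", 1995, "chiptune")
print(lookup(db, module_bytes))
```

Since the key depends only on the file contents, the module files themselves are never modified, and a plain SQL query can sort or filter by any column.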

manx

Hermandar

Thank you for your quick and competent answers.
I will heed your warnings and look for another way to organize my module collection, probably some cross-platform music player with an open database format for metadata.
(What I want to achieve is mainly the ability to sort / filter by author, year or genre, and ideally also to find cover versions etc., so I would need additional metadata records anyway.)

LPChip

Can't you do that with the ID3 tags of MP3, OGG, etc. files? You could theoretically convert the modules to such a format. Then they'll work in any player on any device, too.
"Heh, maybe I should've joined the compo only because it would've meant I wouldn't have had to worry about a damn EQ or compressor for a change. " - Atlantis
"yes.. I think in this case it was wishful thinking: MPT is makng my life hard so it must be wrong" - Rewbs

Saga Musix

LPChip: How is that going to help with having a module database? If you already store tags in converted files, you can just as well store them in any other kind of external database, which is what I gather Hermandar wanted to avoid if possible.

Hermandar: What you are looking for sounds like a project that is slowly progressing between several module / demoscene databases, namely Nectarine, ModArchive and Demozoo. We want to create audio fingerprints of the whole ModArchive collection, match them against Nectarine and Demozoo entries, and so on - in the end, there would be one database connecting them all, with potentially useful metadata. I don't know the current status of the project, but it's still in the works.

Relabsoluness

Quote from: Saga Musix on June 22, 2015, 14:46:46
At the very least, OpenMPT's own MPTM format will break if you append anything at the end, since someone had the brilliant idea to store a pointer there pointing to some extended data fields.
There has indeed been an incompetent someone hacking around in the OpenMPT code in the past, but does this brilliancy cause actual problems apart from making it harder to append undesirable hacks to the end of the file?

Saga Musix

Well, the actual problems show up in one way or another when trying to implement all these hacks in other software, as it has to implement at least three different OpenMPT extensions to support it all. Placing this pointer at the end of the file is at least still far better than just assuming that the data can be found right after the sample block (as is done with the earlier XM/IT extensions), since that broke extension loading as soon as compressed samples came into play. When Ian from un4seen actually went ahead and added support for an OpenMPT extension in XMPlay/BASS, he had some problems with that because samples are not loaded at first when decoding IT files; however, IT-compressed samples have to be decompressed to know their compressed length. And of course, this mechanism made OpenMPT unable to read its own extensions multiple times in the past for one reason (bug) or another, which luckily was caught by automated testing most of the time.
But I digress... In the end, IMHO it would have been much better to put both the IT/XM and the newer MPTM extensions in the same place as the already existing chunk-based MPT extensions, instead of having the same functionality split into three different and incompatible ways of handling data.

Relabsoluness

Thanks for the reply and the better-explained context. It's been quite some time since its creation, but if I recall correctly, one of the main reasons for introducing yet another extension type was that the existing extension fields, while easy and convenient for simple additions, were limited to a 16-bit size, making them practically useless for certain extensions and inconvenient for many others, such as per-sample or per-pattern items. For example, the PC notes, or extended pattern data in general, are saved in the latest extension type, and it would be somewhat of a hacking exercise to implement them with the older extensions, given that even a single pattern isn't guaranteed to fit within a 16-bit size. Also, the old extensions lived in a global ID list, while the new extensions are modular, allowing e.g. extracting extended pattern data from a single pattern without having to figure out which IDs are related to patterns in general and how to find the data related to the pattern in question. But this is not to say that the ID list wouldn't have its own advantages.

Saga Musix

Yeah, I can definitely see the 16-bit fields being problematic - which is incidentally why e.g. the new sample cue points get one chunk per sample; 4000 samples * 9 cue points would already exceed the 64k limit. Why the fields were limited to 16 bits in the first IT/XM hacks is beyond me (it's not like a great amount of space is saved there), especially since previously existing MPT extensions already used 32-bit chunks.
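
The arithmetic behind that limit, as a quick check. The sample and cue point counts are the ones quoted above; the size of a single cue point is an assumption (one 32-bit value), since it is not stated in this thread.

```python
SAMPLES = 4000           # figure quoted in the post
CUES_PER_SAMPLE = 9      # figure quoted in the post
BYTES_PER_CUE = 4        # assumption: one 32-bit value per cue point
MAX_16BIT_SIZE = 0xFFFF  # largest size a 16-bit length field can express

total = SAMPLES * CUES_PER_SAMPLE * BYTES_PER_CUE
print(total, total > MAX_16BIT_SIZE)
```

Even with only 2 bytes per cue point, 36000 cue points would need 72000 bytes, which is still more than a single 16-bit size field can describe.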
I know history can't be changed, all I wanted to point out is that it would very well have been possible to put all the MPTM-specific data into another chunk after the PNAM, CNAM, etc. stuff (from the old MPT hacks) - naturally the internal chunk data there could still have been of any flavour (like the pattern sub chunks used in MPTM), but at least the outer chunk would have looked the same as the existing chunks and it would have been easy to just use the same chunk-based reader (which the MPT code at that time didn't do anyway) for old and new hacks. The same goes for putting the OpenMPT IT/XM hacks right where MPT's original hacks go.

Relabsoluness

Not that this has much relevance for today's development (or this thread), but just for the fun of analysis:
Would you still consider it very well possible under the requirement that an older version loading a file made with a newer version must recognize and read everything it could have read had the file been saved with that same older version, without any risk of random corruption? For example, keeping the id+size pattern for the new chunks would not have been an option because of the size limitation. Changing the size field and writing the new data at the end would have caused older versions to read differently formatted chunks, since there was no tag that could break the id+size reading loop; those chunks might, by chance, have contained recognized IDs, leading to garbage data being read and a potentially corrupted load.

Saga Musix

Yes, as said, it would of course not have been viable to use the existing "LoadExtendedSongProperties" infrastructure, but both this and the MPTM extensions could very well have extended the chunks which are written after the IT header (or after the XM sample data) which already existed in the MPT 1.16 days and used 32-bit chunk sizes. This mechanism was already very well backwards and forwards compatible - turning it into a proper chunk-based reader (instead of just expecting the chunks being present in a certain order) could just have skipped over unknown chunks.
But it's too late for all that, so hopefully the new MPTM format (once it's out) will prevent all these mistakes from ever happening again.
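
A minimal sketch of the kind of chunk-based reader described above: every record is a 4-byte ID followed by a 32-bit size, and unknown IDs are simply skipped. PNAM and CNAM are the chunk names mentioned in this thread; the byte order, the payloads and the `ZZZZ` chunk are made up, so this is not the actual MPT on-disk layout.

```python
import struct


def iter_chunks(blob: bytes):
    """Yield (chunk_id, payload) pairs from id + 32-bit little-endian size records."""
    pos = 0
    while pos + 8 <= len(blob):
        chunk_id = blob[pos:pos + 4]
        (size,) = struct.unpack_from("<I", blob, pos + 4)
        start = pos + 8
        end = start + size
        if end > len(blob):
            break  # truncated or garbage size: stop rather than read past the end
        yield chunk_id, blob[start:end]
        pos = end


def pack_chunk(chunk_id: bytes, payload: bytes) -> bytes:
    """Serialize one chunk as id, 32-bit size, payload."""
    return chunk_id + struct.pack("<I", len(payload)) + payload


KNOWN = {b"PNAM", b"CNAM"}  # IDs named in the thread; payloads below are invented

blob = (
    pack_chunk(b"PNAM", b"intro")
    + pack_chunk(b"ZZZZ", b"data from a newer version")  # hypothetical unknown chunk
    + pack_chunk(b"CNAM", b"lead")
)

# A reader that only knows PNAM and CNAM steps over ZZZZ by using its size field.
parsed = {cid: data for cid, data in iter_chunks(blob) if cid in KNOWN}
print(parsed)
```

Because the reader never depends on chunk order or on knowing every ID, newer writers can add chunks without confusing older readers, as long as the outer id+size framing stays the same.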

Relabsoluness

We likely have no disagreement on what could have been done at the very beginning of OpenMPT. But since we also seem to agree, contrary to your initial statement, that the actual problem was created long before the pointer at the end was introduced for MPTM, perhaps we can also agree that the pointer is hardly a brilliance worth remarking on, if it is a brilliance at all given the context in which it was created.

Saga Musix

In this particular case, the whole "pointer at the end of file" thing is the reason why e.g. sticking APE tags at the end of MPTM files won't work. As said before, this won't work for some other formats like MOD either, so it's a bad idea to begin with, but with just the "classic" MPT and OpenMPT IT/XM extensions, putting arbitrary data at the end of the file would actually not have been a problem. But as we already figured out, doing so would be a bad thing to do anyway.

Relabsoluness

Adding bytes without duplicating the existing end bytes will break MPTM loading, sure. With OpenMPT IT/XM it probably won't break, but as I see it, the appended data would be parsed as if it were in the extended song property format; with (very) bad luck, arbitrary data might match some ID and have a plausible size, so that arbitrary data gets read and the load is corrupted.
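
A toy illustration of that bad-luck case. The reader below loops over id + 16-bit size records until the data runs out, with no terminator; the `TMPO` ID and its 2-byte payload are hypothetical and do not reflect the real extended song property format. Appended bytes that happen to look like a known record are accepted as if they belonged to the song.

```python
import struct

KNOWN_IDS = {b"TMPO": 2}  # hypothetical ID mapped to its expected payload size


def read_properties(blob: bytes) -> dict:
    """Read id + 16-bit size records until the data runs out (no terminator)."""
    props = {}
    pos = 0
    while pos + 6 <= len(blob):
        field_id = blob[pos:pos + 4]
        (size,) = struct.unpack_from("<H", blob, pos + 4)
        payload = blob[pos + 6:pos + 6 + size]
        if KNOWN_IDS.get(field_id) == size and len(payload) == size:
            # Later records overwrite earlier ones with the same ID.
            props[field_id] = struct.unpack("<H", payload)[0]
        pos += 6 + size
    return props


genuine = b"TMPO" + struct.pack("<H", 2) + struct.pack("<H", 125)
# Appended foreign bytes that, by (very) bad luck, look like a valid record.
appended = b"TMPO" + struct.pack("<H", 2) + b"\xff\xff"

print(read_properties(genuine))
print(read_properties(genuine + appended))
```

The second call returns a different value for the same property, which is the kind of silent corruption described above; appended data that does not resemble a record would merely be skipped.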