ModPlug Central

Community => General Chatter => Topic started by: spacedrone808 on June 20, 2021, 13:18:46

Title: Duplicate finder software...
Post by: spacedrone808 on June 20, 2021, 13:18:46
...of course for tracker music. Any ideas?
Title: Re: Duplicate finder software...
Post by: Saga Musix on June 21, 2021, 07:09:01
Duplicates on which level? Exact byte-for-byte copies? Modules that are "mostly the same" but maybe with a few bytes of difference? Modules sharing the same samples? etc...
Title: Re: Duplicate finder software...
Post by: spacedrone808 on June 21, 2021, 10:21:39
Byte-for-byte comparison can be achieved by generic dup finders.
I suppose that song name and author as matching criteria would be very neat. Yeah, I know that each tracker has its own "exif" signature, but what if I'm missing something?
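For reference, the byte-for-byte case that generic duplicate finders cover can be sketched in a few lines of Python. This is only an illustration of the general technique (group by file size, then by content hash), not the method of any particular tool; the function names `sha256_of` and `find_exact_duplicates` are made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_exact_duplicates(root):
    """Return lists of paths under root whose contents are identical."""
    # Files of different size can never be identical, so group by size first.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            by_size[os.path.getsize(path)].append(path)

    # Only hash files that share their size with at least one other file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[sha256_of(path)].append(path)

    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("some/folder")` returns one list per group of identical files. It only reports them and never deletes anything.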
Title: Re: Duplicate finder software...
Post by: Saga Musix on June 21, 2021, 18:19:32
Finding duplicates by song name is probably rather useless - song names are far from unique, and many modules don't set a song title at all. Author information is also not present at all in most modules.
I have been working on a tool on and off called Mod Library (https://github.com/sagamusix/ModLibrary/). I never had the time to bring it much beyond alpha stage, but I occasionally add new features. As it can already find files that are 100% identical, and as it already keeps track of all pattern contents, I can probably update it to also find modules with identical patterns, which I have found to be a very effective method of finding duplicate songs. I'll try to do that one of these days.
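To illustrate the idea of matching on pattern contents, here is a rough Python sketch. It is not how Mod Library does it; it is a simplified stand-in that only understands plain 4-channel ProTracker MODs with the "M.K." signature and hashes the raw pattern data, ignoring the title, sample headers and sample data. The names `pattern_digest` and `group_by_patterns` are hypothetical.

```python
import hashlib
from collections import defaultdict

MOD_HEADER_SIZE = 1084      # title, 31 sample headers, order list, signature
PATTERN_SIZE = 64 * 4 * 4   # 64 rows, 4 channels, 4 bytes per cell


def pattern_digest(data):
    """Hash the pattern data of a 4-channel "M.K." MOD; None if unsupported."""
    if len(data) < MOD_HEADER_SIZE or data[1080:1084] != b"M.K.":
        return None
    # The highest pattern number in the 128-entry order list tells us how
    # many patterns are stored in the file.
    orders = data[952:1080]
    num_patterns = max(orders) + 1
    end = MOD_HEADER_SIZE + num_patterns * PATTERN_SIZE
    if len(data) < end:
        return None  # truncated file
    return hashlib.sha256(data[MOD_HEADER_SIZE:end]).hexdigest()


def group_by_patterns(modules):
    """modules maps a name to the file's bytes; return groups sharing patterns."""
    groups = defaultdict(list)
    for name, data in modules.items():
        digest = pattern_digest(data)
        if digest is not None:
            groups[digest].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Two files with different titles or resampled instruments but identical patterns end up in the same group. A real tool would have to handle other formats and normalise pattern data across formats, which this sketch does not attempt.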
Title: Re: Duplicate finder software...
Post by: spacedrone808 on June 22, 2021, 09:02:13
Ohaaa, very cool project. But no binaries though?
Title: Re: Duplicate finder software...
Post by: Saga Musix on June 22, 2021, 09:04:47
I'll upload some binaries later today. As said, it's in alpha stage right now, and at this point I have no intent to keep backwards compatibility with previous database versions (I just recreate my own library whenever there's a breaking change) so I normally don't upload binaries for it, unless someone asks for them.
Title: Re: Duplicate finder software...
Post by: spacedrone808 on June 22, 2021, 09:56:46
I see... If I'm not mistaken, there was similar software on the maz-sound site.
But I can't remember its name...
Title: Re: Duplicate finder software...
Post by: Saga Musix on June 22, 2021, 18:16:03
Here's a current test build of Mod Library (https://sagagames.de/stuff/modlibrary2.7z). After adding some files or folders to the database, you can hit the button for finding duplicates. Note that at the moment it will only show one file per duplicate, but for finding the actual duplicates it's at least a start. Hopefully a later version will show all versions of a duplicate file.

Note that adding files to the database is relatively slow at the moment - adding 10,000 files can take about an hour or so - because it's single-threaded and an audio fingerprint is generated for every file. This lets you enter an AcoustID in the search and find songs that sound similar to that AcoustID. However, the search results for that feature aren't that good yet; they include a lot of stuff that really is a completely different song, IIRC.
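As a rough illustration of what "sounds similar" can mean for fingerprints, here is a small Python sketch. It assumes raw Chromaprint-style fingerprints, i.e. lists of 32-bit integers, and compares them by the fraction of differing bits. This is an assumption for illustration only and not necessarily how Mod Library's search works; `bit_error_rate` is a made-up name, and no offset alignment between the two fingerprints is attempted.

```python
def bit_error_rate(fp_a, fp_b):
    """Fraction of differing bits between two fingerprints (0.0 = identical).

    Only the overlapping prefix of the two lists is compared.
    """
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 1.0  # nothing to compare, treat as completely different
    differing = sum(
        bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in zip(fp_a, fp_b)
    )
    return differing / (32 * n)
```

Two unrelated fingerprints tend to land near 0.5, so a search would accept only pairs well below that. Where exactly to put the cut-off is a tuning question, and a too generous cut-off is one plausible way to end up with unrelated songs in the results.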
Title: Re: Duplicate finder software...
Post by: spacedrone808 on June 24, 2021, 08:48:34
Thank you for posting, will check it out in action
Title: Re: Duplicate finder software...
Post by: Tigoro on June 24, 2021, 22:51:54
I use mo3enc for xm\mod\it\s3m\mtm modules.
1) mo3enc -m4 -rmiso for files (no compression, remove any text in modules)
Wait time...
2) find duplicates in *.mo3
Wait time...
3) remove clone modules in original collection
I found a lot of duplicates on ModLand in some formats (for example, stm2mod or mod2stm conversions, or other 4-16k module formats), and found many tracks in the Keygen collection (who made the music, i.e. the authors).
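Steps 2 and 3 above could be scripted roughly like this in Python. The sketch assumes you already ran the conversion yourself and know which converted file belongs to which original; it does not call mo3enc. The function name `clones_to_remove` and the mapping argument are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def clones_to_remove(converted_for):
    """converted_for maps each original path to its converted (.mo3) path.

    Returns a dict mapping the original to keep to the list of originals
    whose converted files are byte-identical to it.
    """
    groups = defaultdict(list)
    # Sort so that the "keeper" in each group is chosen deterministically.
    for original, converted in sorted(converted_for.items()):
        digest = hashlib.sha256(Path(converted).read_bytes()).hexdigest()
        groups[digest].append(original)
    return {
        names[0]: names[1:] for names in groups.values() if len(names) > 1
    }
```

The function only reports candidates. Reviewing the list before deleting anything from the original collection seems safer than removing files automatically.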
Title: Re: Duplicate finder software...
Post by: Saga Musix on June 25, 2021, 08:44:52
Note that you can report duplicates on ModLand to the contacts mentioned in readme_welcome.txt at the root of the server. MO3 can remove a lot of "unnecessary" metadata but it doesn't guarantee to find every duplicate that went through the format conversions you mentioned, because MO3 stores data differently depending on what the original source format was (so e.g. a MOD converted to MO3 and then the same song converted from S3M to MO3 will not result in the same MO3 file).
Title: Re: Duplicate finder software...
Post by: Tigoro on June 25, 2021, 17:31:07
Quote from: Saga Musix on June 25, 2021, 08:44:52
Note that you can report duplicates on ModLand to the contacts mentioned in readme_welcome.txt at the root of the server. MO3 can remove a lot of "unnecessary" metadata but it doesn't guarantee to find every duplicate that went through the format conversions you mentioned, because MO3 stores data differently depending on what the original source format was (so e.g. a MOD converted to MO3 and then the same song converted from S3M to MO3 will not result in the same MO3 file).
To be honest, I'm afraid to flood the project with information. Coma had been sorting through my software collection of trackers for many years :-) I have a fresh slice of ftp.modland.com, I'll try it. Yes, not every duplicate, but many - unknown -
Title: Re: Duplicate finder software...
Post by: Saga Musix on June 26, 2021, 16:14:29
Coma handed over administration of the archive to Menace, and he's eager to clean things up. :) Don't hesitate to report duplicates, they are frequently removed.