TOSEC-MSX WIP (TOSEC-v2011-11-19)

Codenamed ‘Martos release’ (after a well-known MSXer in the scene), this dat file release notably enlarges the TOSEC-MSX catalogue by adding 252 previously uncatalogued dumps. Below follows a description of the steps involved in the dat creation process for anyone with an interest.

Downloading all files from Preservación de cintas MSX
1175 archives (zip) comprising a total of 1943 dumps, each dump being a piece of software on its own regardless of whether it is the only release, an alternative release or a .bak file.

Unzipping 1175 archives (1943 dumps) and zipping 1943 files on a dump-per-zip basis. To determine whether a dump had already been catalogued in TOSEC-MSX, the dumps were checked against the official TOSEC-MSX v2011-11-11 release, resulting in…

187 matching dumps @ \MSX\Compilations\[CAS]
2 matching dumps @ \MSX\Compilations\[DSK]
no matching dumps @ \MSX\Firmware
114 matching dumps @ \MSX\Magazines\[CAS]
no matching dumps @ \MSX\Magazines\[DSK]
1026 matching dumps @ \MSX\Various\[CAS]
14 matching dumps @ \MSX\Various\[DSK]
73 matching dumps @ \MSX\Various\[ROM]
2 matching dumps @ \MSX2\Compilations\[CAS]
6 matching dumps @ \MSX2\Compilations\[DSK]
no matching dumps @ \MSX2\Magazines
4 matching dumps @ \MSX2\Various\[CAS]
5 matching dumps @ \MSX2\Various\[DSK]
no matching dumps @ \MSX2\Various\[ROM]
no matching dumps @ \MSX2+\Various
no matching dumps @ \Turbo-R\Applications
no matching dumps @ \Turbo-R\Demos
no matching dumps @ \Turbo-R\Games
no matching dumps @ \Turbo-R\Operating Systems

Total: 1433 dumps matching TOSEC-MSX v2011-11-11 (or, what amounts to the very same, TOSEC-MSX v2011-04-07)
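For readers who want to reproduce this kind of check without a ROM manager, here is a minimal sketch of the idea. It is not the merger tool used for this release: the function names are hypothetical, and it assumes the catalogue's checksums have already been extracted into a plain set of MD5 strings and that the dumps are unzipped in a single folder.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_by_catalogue(dump_dir: Path, known_md5: set[str]):
    """Split the files in dump_dir into (matching, non_matching) name lists.

    known_md5 is assumed to hold the lowercase MD5 digests of every dump
    already present in the catalogue (hypothetical input, built elsewhere).
    """
    matching, non_matching = [], []
    for path in sorted(p for p in dump_dir.iterdir() if p.is_file()):
        target = matching if md5_of(path) in known_md5 else non_matching
        target.append(path.name)
    return matching, non_matching
```

Calling `split_by_catalogue(Path("dumps"), known_md5)` on a folder of unzipped dumps would return the names already catalogued and the names still to be sorted.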

Therefore, the remaining dumps (those not yet in the TOSEC-MSX catalogue) turn out to be unique dumps to add to a forthcoming TOSEC-MSX release. At least, theoretically. Let us proceed.

Following the step above, dat files were created for the Martos dumps matching TOSEC-MSX v2011-11-11

MSX MSX - Compilations - [CAS] [Martos matching dumps].dat
MSX MSX - Compilations - [DSK] [Martos matching dumps].dat
MSX MSX - Magazines - [CAS] [Martos matching dumps].dat
MSX MSX - Various - [CAS] [Martos matching dumps].dat
MSX MSX - Various - [DSK] [Martos matching dumps].dat
MSX MSX - Various - [ROM] [Martos matching dumps].dat
MSX MSX2 - Compilations - [CAS] [Martos matching dumps].dat
MSX MSX2 - Compilations - [DSK] [Martos matching dumps].dat
MSX MSX2 - Various - [CAS] [Martos matching dumps].dat
MSX MSX2 - Various - [DSK] [Martos matching dumps].dat

One of TOSEC’s aims has to do with providing more descriptive information for the different alternative [a] versions of the same title. Thus, the WIP [Martos] more info field will serve the purpose of tracing the files’ origin by appending a [Martos] field at the end of the filename.
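As an illustration of the tagging, a minimal sketch follows. The helper name and the sample filename are hypothetical, and placing the field directly before the file extension with no separating space is an assumption rather than a statement of the official naming rules.

```python
from pathlib import PurePath


def append_more_info(filename: str, tag: str = "Martos") -> str:
    """Append a [tag] field at the end of the name, before the extension."""
    path = PurePath(filename)
    return f"{path.stem}[{tag}]{path.suffix}"


# Hypothetical filename, used only to show the effect.
print(append_more_info("Some Title (19xx)(-)[a].cas"))
# prints: Some Title (19xx)(-)[a][Martos].cas
```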

Maths are maths, so a total of 1943 dumps minus 1433 matching dumps equals 510 non-matching dumps. However, there are only 492 non-matching dumps, which leads us to formulate the question: what happened to the missing 18 non-matching dumps?
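The figures quoted so far can be cross-checked in a few lines. The per-folder counts below are copied from the list of matching dumps (folders with no matches are left out); the variable names are only illustrative.

```python
# Matching dumps per folder, as listed in the check against v2011-11-11.
matches_per_folder = {
    r"\MSX\Compilations\[CAS]": 187,
    r"\MSX\Compilations\[DSK]": 2,
    r"\MSX\Magazines\[CAS]": 114,
    r"\MSX\Various\[CAS]": 1026,
    r"\MSX\Various\[DSK]": 14,
    r"\MSX\Various\[ROM]": 73,
    r"\MSX2\Compilations\[CAS]": 2,
    r"\MSX2\Compilations\[DSK]": 6,
    r"\MSX2\Various\[CAS]": 4,
    r"\MSX2\Various\[DSK]": 5,
}
total_dumps = 1943          # dumps extracted from the 1175 archives
non_matching_found = 492    # dumps actually left over after the check

total_matching = sum(matches_per_folder.values())
expected_non_matching = total_dumps - total_matching
unaccounted = expected_non_matching - non_matching_found

print(total_matching, expected_non_matching, unaccounted)  # 1433 510 18
```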

To make a long story short:

The workaround entailed renaming the 1943 dumps by their sha1 checksums. The hash renamer nicely renamed all the dumps but… 18! (apart from the unrenamed .bak files, which are of no use for cataloguing purposes). As a further check, those 18 dumps were merged into TOSEC-MSX v2011-11-11, resulting in each of those dumps having a match in TOSEC-MSX v2011-11-11 and, consequently, leaving the ‘ToSort’ folder empty. Therefore, somehow, those 18 dumps were already duplicates. That is why the merger did not leave those dumps alongside the other 492 non-matching dumps.

No statement goes without evidence so here it is:

folder: 18 missing files
md5deep *.* > 18missingfiles.txt (edited to remove .txt and .exe)

241edff03a46058bcc1e28adeb94c414 \ToSort\4X4-B.CAS
00d91a4ca022cb8a68b4db8f38f0dbf8 \ToSort\AMC-B.CAS
78b8201bb0b57feabe880ec11172a876 \ToSort\CAPIESPA.CAS
02a3a944a403efd02f0e54ee2d258522 \ToSort\EUROPG-B.CAS
27d21893af6ac41161e682fb0ada6ad7 \ToSort\FIGUPLAN.CAS
015ceaf1a52c19e401ff35c910fbb971 \ToSort\FREDDY-A(2).CAS
be263c9542a3ba12bfea35d778780b17 \ToSort\GALAXIAN(2).CAS
45a2b0ef9aa163ac9e7ee2138ee4d4a0 \ToSort\MONCLOA.CAS
db3c2b907aaacf43c0380da3bd28825c \ToSort\PETALOCO.CAS
660446e46ea5fe86b65307732e0f49dc \ToSort\PHANTI-A.CAS
b1403ec4ff742f2efe9f1f5b71564fc9 \ToSort\PHANTI-B(2).CAS
84949525ace10c237e36e36831c962bb \ToSort\PRINCINM.CAS
0dc96830dde1115c91552dfa28e5ce1d \ToSort\RAMBO3-B.CAS
df7a3f6d11dad486ba45efefdb4d08df \ToSort\RIOSESPA.CAS
057ef1044ee88bdbe1e7db5f34e13e6e \ToSort\ROBOCO-B.CAS
cf77ed6744053cfdbbbb880969b3fc32 \ToSort\SATAN-A.CAS
58e8cb3d052d7025f765f395f70c9f88 \ToSort\TRATATEX.CAS
92e4bab93982845ee87f14ccbb2094de \ToSort\Vg-estre.cas

folder: 1943 dumps
md5deep -wM 18missingfiles.txt *.* > missingfilesmatching1943originaldumps.txt

241edff03a46058bcc1e28adeb94c414 \1943dumpedfiles\4X4-B(1).CAS matched 18missingfilestosort\4X4-B.CAS
241edff03a46058bcc1e28adeb94c414 \1943dumpedfiles\4X4-B.CAS matched 18missingfilestosort\4X4-B.CAS

00d91a4ca022cb8a68b4db8f38f0dbf8 \1943dumpedfiles\AMC-B(1).CAS matched 18missingfilestosort\AMC-B.CAS
00d91a4ca022cb8a68b4db8f38f0dbf8 \1943dumpedfiles\AMC-B.CAS matched 18missingfilestosort\AMC-B.CAS

78b8201bb0b57feabe880ec11172a876 \1943dumpedfiles\CAPIESPA(1).CAS matched 18missingfilestosort\CAPIESPA.CAS
78b8201bb0b57feabe880ec11172a876 \1943dumpedfiles\CAPIESPA.CAS matched 18missingfilestosort\CAPIESPA.CAS

02a3a944a403efd02f0e54ee2d258522 \1943dumpedfiles\EUROPG-B(1).CAS matched 18missingfilestosort\EUROPG-B.CAS
02a3a944a403efd02f0e54ee2d258522 \1943dumpedfiles\EUROPG-B.CAS matched 18missingfilestosort\EUROPG-B.CAS

27d21893af6ac41161e682fb0ada6ad7 \1943dumpedfiles\FIGUPLAN(1).CAS matched 18missingfilestosort\FIGUPLAN.CAS
27d21893af6ac41161e682fb0ada6ad7 \1943dumpedfiles\FIGUPLAN.CAS matched 18missingfilestosort\FIGUPLAN.CAS

015ceaf1a52c19e401ff35c910fbb971 \1943dumpedfiles\FREDDY-A(1).CAS matched 18missingfilestosort\FREDDY-A(2).CAS
015ceaf1a52c19e401ff35c910fbb971 \1943dumpedfiles\FREDDY-A(2).CAS matched 18missingfilestosort\FREDDY-A(2).CAS

be263c9542a3ba12bfea35d778780b17 \1943dumpedfiles\GALAXIAN(1).CAS matched 18missingfilestosort\GALAXIAN(2).CAS
be263c9542a3ba12bfea35d778780b17 \1943dumpedfiles\GALAXIAN(2).CAS matched 18missingfilestosort\GALAXIAN(2).CAS

45a2b0ef9aa163ac9e7ee2138ee4d4a0 \1943dumpedfiles\MONCLOA(1).CAS matched 18missingfilestosort\MONCLOA.CAS
45a2b0ef9aa163ac9e7ee2138ee4d4a0 \1943dumpedfiles\MONCLOA.CAS matched 18missingfilestosort\MONCLOA.CAS

db3c2b907aaacf43c0380da3bd28825c \1943dumpedfiles\PETALOCO(1).CAS matched 18missingfilestosort\PETALOCO.CAS
db3c2b907aaacf43c0380da3bd28825c \1943dumpedfiles\PETALOCO.CAS matched 18missingfilestosort\PETALOCO.CAS

660446e46ea5fe86b65307732e0f49dc \1943dumpedfiles\PHANTI-A(1).CAS matched 18missingfilestosort\PHANTI-A.CAS
660446e46ea5fe86b65307732e0f49dc \1943dumpedfiles\PHANTI-A.CAS matched 18missingfilestosort\PHANTI-A.CAS

b1403ec4ff742f2efe9f1f5b71564fc9 \1943dumpedfiles\PHANTI-B(1).CAS matched 18missingfilestosort\PHANTI-B(2).CAS
b1403ec4ff742f2efe9f1f5b71564fc9 \1943dumpedfiles\PHANTI-B(2).CAS matched 18missingfilestosort\PHANTI-B(2).CAS

84949525ace10c237e36e36831c962bb \1943dumpedfiles\PRINCINM(1).CAS matched 18missingfilestosort\PRINCINM.CAS
84949525ace10c237e36e36831c962bb \1943dumpedfiles\PRINCINM.CAS matched 18missingfilestosort\PRINCINM.CAS

0dc96830dde1115c91552dfa28e5ce1d \1943dumpedfiles\RAMBO3-B(1).CAS matched 18missingfilestosort\RAMBO3-B.CAS
0dc96830dde1115c91552dfa28e5ce1d \1943dumpedfiles\RAMBO3-B.CAS matched 18missingfilestosort\RAMBO3-B.CAS

df7a3f6d11dad486ba45efefdb4d08df \1943dumpedfiles\RIOSESPA(1).CAS matched 18missingfilestosort\RIOSESPA.CAS
df7a3f6d11dad486ba45efefdb4d08df \1943dumpedfiles\RIOSESPA.CAS matched 18missingfilestosort\RIOSESPA.CAS

057ef1044ee88bdbe1e7db5f34e13e6e \1943dumpedfiles\ROBOCO-B(1).CAS matched 18missingfilestosort\ROBOCO-B.CAS
057ef1044ee88bdbe1e7db5f34e13e6e \1943dumpedfiles\ROBOCO-B.CAS matched 18missingfilestosort\ROBOCO-B.CAS

cf77ed6744053cfdbbbb880969b3fc32 \1943dumpedfiles\SATAN-A(2).CAS matched 18missingfilestosort\SATAN-A.CAS
cf77ed6744053cfdbbbb880969b3fc32 \1943dumpedfiles\SATAN-A.CAS matched 18missingfilestosort\SATAN-A.CAS

58e8cb3d052d7025f765f395f70c9f88 \1943dumpedfiles\TRATATEX(1).CAS matched 18missingfilestosort\TRATATEX.CAS
58e8cb3d052d7025f765f395f70c9f88 \1943dumpedfiles\TRATATEX.CAS matched 18missingfilestosort\TRATATEX.CAS

92e4bab93982845ee87f14ccbb2094de \1943dumpedfiles\VG-ESTRE(1).CAS matched 18missingfilestosort\Vg-estre.cas
92e4bab93982845ee87f14ccbb2094de \1943dumpedfiles\Vg-estre.cas matched 18missingfilestosort\Vg-estre.cas

As the data above demonstrates, each of the 18 files matches two dumps in the 1943 folder with the very same MD5 checksum. Hence there are not 18 missing dumps but, quite the contrary, 18 duplicated dumps.
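The same duplicate hunt can be sketched in Python for anyone without md5deep at hand. This is a minimal sketch, not the procedure used above: the function name is hypothetical and it reads each file fully into memory, which is fine for small cassette images.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(folder: Path) -> dict[str, list[str]]:
    """Map each MD5 digest shared by two or more files to their names."""
    by_digest = defaultdict(list)
    for path in sorted(p for p in folder.iterdir() if p.is_file()):
        digest = hashlib.md5(path.read_bytes()).hexdigest()
        by_digest[digest].append(path.name)
    # Keep only the digests that occur more than once.
    return {d: names for d, names in by_digest.items() if len(names) > 1}
```

Run against the folder holding the 1943 dumps, every entry in the returned dictionary would be a group of byte-identical files, such as the pairs listed above.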

The merger did its job nicely; I just needed to double-check, just in case 🙂 Now, psycho mode goes off.

The long story: (which happens to be the short story)

I tried comparing the checksums of the hashed dumps in both folders (1943 and 1925) but couldn’t find a tool that would serve the purpose. That took a long time…
So then I thought about renaming the dumps in both folders by their sha1 checksums. A text comparison of the sha1’ed filenames in both folders would reveal the sha1’ed filenames of the 18 missing dumps. Knowing those 18 sha1 filenames and running a matching search for them against the 1943 folder would link the sha1 checksum filenames to the original filenames. Question solved.
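The renaming idea can be sketched as follows. This is not the hash renamer used here; the function names are hypothetical, and the behaviour of leaving a file untouched when its sha1 name is already taken is an assumption about why duplicates end up unrenamed.

```python
import hashlib
from pathlib import Path


def rename_to_sha1(folder: Path) -> dict[str, str]:
    """Rename each file to <sha1><ext>; return {new name: original name}.

    A file whose sha1 name is already taken is left untouched, so
    byte-identical duplicates remain visible under their original names.
    """
    renamed = {}
    for path in sorted(p for p in folder.iterdir() if p.is_file()):
        digest = hashlib.sha1(path.read_bytes()).hexdigest()
        target = path.with_name(digest + path.suffix.lower())
        if target.exists():
            continue  # duplicate content: keep the original filename
        path.rename(target)
        renamed[target.name] = path.name
    return renamed


def missing_from(renamed_a: dict[str, str], renamed_b: dict[str, str]) -> dict[str, str]:
    """Return {sha1 name: original name} for entries of A absent from B."""
    return {new: old for new, old in renamed_a.items() if new not in renamed_b}
```

Since the renaming is destructive, it is best tried on copies of the two folders; `missing_from` then links each sha1 filename that exists in only one folder back to its original filename.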

Subtract 185 .bak dumps from 492 and there are 309 real dumps to be catalogued. There are 57 dumps whose renaming I am dubious about at present, so in this WIP release 252 dumps are being added to TOSEC-MSX. A nice figure.

Details of the TOSEC-MSX WIP v2011-11-19 dat comparisons follow; further details in the log files from the dat comparisons.

MSX MSX - Compilations - [CAS] (TOSEC-v2011-04-23_CM) versus MSX MSX - Compilations - [CAS] (TOSEC-v2011-11-19_CM)
34 additions

MSX MSX - Compilations - [DSK] (TOSEC-v2011-04-23_CM) versus MSX MSX - Compilations - [DSK] (TOSEC-v2011-11-19_CM)
no changes

MSX MSX - Firmware (TOSEC-v2011-04-23_CM) versus MSX MSX - Firmware (TOSEC-v2011-11-19_CM)
no changes

MSX MSX - Magazines - [CAS] (TOSEC-v2011-04-23_CM) versus MSX MSX - Magazines - [CAS] (TOSEC-v2011-11-19_CM)
52 additions

MSX MSX - Magazines - [DSK] (TOSEC-v2011-04-23_CM) versus MSX MSX - Magazines - [DSK] (TOSEC-v2011-11-19_CM)
no changes

MSX MSX - Various - [CAS] (TOSEC-v2011-04-23_CM) versus MSX MSX - Various - [CAS] (TOSEC-v2011-11-19_CM)
164 additions

MSX MSX - Various - [DSK] (TOSEC-v2011-04-23_CM) versus MSX MSX - Various - [DSK] (TOSEC-v2011-11-19_CM)
1 addition

MSX MSX - Various - [ROM] (TOSEC-v2011-04-23_CM) versus MSX MSX - Various - [ROM] (TOSEC-v2011-11-19_CM)
1 addition

MSX MSX2 - Compilations - [CAS] (TOSEC-v2011-04-23_CM) versus MSX MSX2 - Compilations - [CAS] (TOSEC-v2011-11-19_CM)
no changes

MSX MSX2 - Compilations - [DSK] (TOSEC-v2011-04-23_CM) versus MSX MSX2 - Compilations - [DSK] (TOSEC-v2011-04-23_CM) v2
no changes

MSX MSX2 - Various - [CAS] (TOSEC-v2011-04-23_CM) versus MSX MSX2 - Various - [CAS] (TOSEC-v2011-11-19_CM)
no changes

MSX MSX2 - Various - [DSK] (TOSEC-v2011-04-23_CM) versus MSX MSX2 - Various - [DSK] (TOSEC-v2011-11-19_CM)
no changes

MSX MSX2 - Various - [ROM] (TOSEC-v2011-04-23_CM) versus MSX MSX2 - Various - [ROM] (TOSEC-v2011-11-19_CM)
no changes

MSX MSX2+ - Various (TOSEC-v2011-04-23_CM) versus MSX MSX2+ - Various (TOSEC-v2011-11-19_CM)
no changes

MSX Turbo-R - Applications (TOSEC-v2011-04-23_CM) versus MSX Turbo-R - Applications (TOSEC-v2011-11-19_CM)
no changes

MSX Turbo-R - Demos (TOSEC-v2011-04-23_CM) versus MSX Turbo-R - Demos (TOSEC-v2011-11-19_CM)
no changes

MSX Turbo-R - Games (TOSEC-v2011-04-23_CM) versus MSX Turbo-R - Games (TOSEC-v2011-11-19_CM)
no changes

MSX Turbo-R - Operating Systems (TOSEC-v2011-04-23_CM) versus MSX Turbo-R - Operating Systems (TOSEC-v2011-11-19_CM)
no changes

Total: 252 additions

Note: The thoughtful reader should observe that regarding the whole ‘Martos’ archive as a one-person task might be inaccurate. The dumps may well come from different sources.

  I didn’t understand a thing, but it looks like an impressive amount of work. Great job.


