Anyone good at guessing filenames?
- Malvineous
- Shikadi Webmaster
- Posts: 382
- Joined: Wed Oct 31, 2007 21:48
- Location: Brisbane, Australia
- Contact:
Hi all,
I need some help finalising the reverse engineering of the .LBR archive files used by Vinyl Goddess From Mars. This file format does not store filenames, using a hash (number) derived from the filename instead. This is very easy to calculate if you know the filename, but if you only know the hash it is impossible to go back the other way and figure out the filename.
This means the only way to figure out what some of the files are called is by guessing, and seeing if the hash of the guess matches one of the existing hashes. I've already matched up all the files I can using this method, so I'm hoping some creative people might spend a few minutes and see if they can guess a few more filenames.
I have put up a quick web page which lets you type in filenames and it automatically calculates the hash and checks to see whether it matches one of the unknown values. If so it will display a message, and you can post the newly discovered filename here so I can add it to the list of known names.
Any help with this would be much appreciated! There are only 11 unknown filenames to go, so it would be great to get them all figured out.
[ KeenWiki | ModdingWiki | Camoto ]
- VikingBoyBilly
- Vorticon Elite
- Posts: 4158
- Joined: Sat Jan 05, 2008 2:06
- Location: The spaghetti island of the faces of dinosaur world for a vacation
Can the program methodically make a list of hashes from file names alphabetically limited by the character space of the filename? Like if you know the file name can only have 8 characters and it can only be letters, numbers, and a few symbols, you can have it methodically go through a process of "guessing" by calculating every possible combination until it finds a match to the hash you put in.
"I don't trust players. Not one bit." - Levellass
-
- Kuliwho?
- Posts: 2167
- Joined: Fri Jan 20, 2012 7:02
- Location: Tied up in the Oracle Chamber's basement
- Contact:
VikingBoyBilly wrote: Can the program methodically make a list of hashes from file names alphabetically limited by the character space of the filename? Like if you know the file name can only have 8 characters and it can only be letters, numbers, and a few symbols, you can have it methodically go through a process of "guessing" by calculating every possible combination until it finds a match to the hash you put in.
Given these limits, maybe a simple program that guesses by the way of iterating over all possible filenames (i.e., the bruteforce way) can solve this, possibly taking advantage of GPGPU code for acceleration.
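A minimal sketch of that brute-force idea in JavaScript, for anyone who wants to try it. The hash routine is the one posted later in this thread, copied here so the sketch runs on its own. The alphabet, the stem length, the extension, the demo name and the function name `bruteForce` are all illustrative assumptions, not anything taken from the game or from Malvineous's page.

```javascript
// Hash routine as posted later in this thread.
function calcHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i) << 8;
    for (let j = 0; j < 8; j++) {
      hash <<= 1;
      if (hash & 0x10000) hash ^= 0x1021;
    }
  }
  return hash & 0xffff;
}

// Try every stem of exactly `length` characters drawn from `alphabet`,
// append `ext`, and collect the names whose hash equals `target`.
function bruteForce(target, alphabet, length, ext) {
  const matches = [];
  const idx = new Array(length).fill(0); // odometer-style counter over the alphabet
  while (true) {
    let name = "";
    for (let i = 0; i < length; i++) name += alphabet[idx[i]];
    name += ext;
    if (calcHash(name) === target) matches.push(name);
    // Advance the odometer, rightmost position first.
    let pos = length - 1;
    while (pos >= 0 && ++idx[pos] === alphabet.length) idx[pos--] = 0;
    if (pos < 0) break; // every combination has been tried
  }
  return matches;
}

// Hypothetical demo: recover a made-up 4-letter name from its own hash.
const target = calcHash("DEMO.GRA");
const found = bruteForce(target, "ABCDEFGHIJKLMNOPQRSTUVWXYZ", 4, ".GRA");
console.log(found.length + " names match, including DEMO.GRA: " + found.includes("DEMO.GRA"));
```

Even this tiny search space normally returns more than one name, because a 16-bit hash cannot tell all the candidates apart.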
Website: https://ny.duke4.net/
wiivn wrote: I've tried all eeXX and b4XX combinations. Is that the kind of work you are asking?
You're supposed to type in a filename, and if it's correct, the four digit hash code that is generated as you type will match one of the four digit hash codes highlighted in yellow.
ny00123 wrote: Given these limits, maybe a simple program that guesses by the way of iterating over all possible filenames (i.e., the bruteforce way) can solve this, possibly taking advantage of GPGPU code for acceleration
Just some back of the envelope calculations here:
An 8 character string gives 26^7*36 = 289 billion possible file names (assuming only letters and numbers are used, only the last character can be a number and the extension is known). We can knock off the first character as most of these files are in alphabetical order, leaving 11 billion possible choices, and if you restrict the second letter to allowable English combinations, then you get maybe a billion strings to start from, which seems like it should be brute-forceable.
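Since there are only 65,536 possible hash values, roughly one candidate in 65,536 will match any given hash. The 11 billion strings would give a list of around 170k matching filenames for one hash, and a billion strings would still give around 15k. I bet at least a couple of those will be the dictionary word you're looking for, though.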
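The arithmetic in this post can be checked with a few lines of JavaScript. This is only a sketch of the estimate: it assumes the hashes are spread evenly over the 65,536 possible 16-bit values, and the variable names are made up for the example.

```javascript
// Seven letters followed by one letter-or-digit, extension known.
const allNames = Math.pow(26, 7) * 36;
// Same, but with the first character already known.
const firstCharKnown = Math.pow(26, 6) * 36;
// Number of distinct 16-bit hash values.
const HASH_VALUES = 0x10000;

// Expected number of candidates sharing one particular hash,
// assuming the hash values are evenly distributed.
function expectedMatches(candidates) {
  return Math.floor(candidates / HASH_VALUES);
}

console.log(allNames);                        // about 289 billion
console.log(firstCharKnown);                  // about 11 billion
console.log(expectedMatches(firstCharKnown)); // about 170 thousand
console.log(expectedMatches(1e9));            // about 15 thousand
```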
Keening_Product wrote: Bloody hell that's hard... but fun! No luck so far but I'll keep coming back to it if needed. How on earth did you figure out SAVEBOXO and SAVEBOXG?
A lot of these filenames are in the executable. The ones that we need to guess were put in the archive, but never used in the game.
Last edited by lemm on Sat Oct 19, 2013 20:38, edited 1 time in total.
wiivn wrote: Found one:
Oh nice. I had guessed "swish.snd"
"You're supposed to type in a filename, and if it's correct, the four digit hash code that is generated as you type will match one of the four digit hash codes highlighted in yellow."
Then why does it mark them green if you type cdwn or eeg`?
The hash function will take any string of characters as a potential filename and output a 4-digit code. But, since there are trillions of strings and only 65536 4-digit hex numbers, many different filenames are going to result in the same 4-digit code. So, we're trying to guess what the most likely filename is, even though there are many strings of garble that will produce a "correct" result.
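To see the collisions for yourself, here is a small sketch that hashes every four-letter lowercase string and counts how many land on each hash value. It uses the hash routine posted further down in this thread; the choice of four lowercase letters is arbitrary and only meant to keep the run short.

```javascript
// Hash routine as posted later in this thread.
function calcHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i) << 8;
    for (let j = 0; j < 8; j++) {
      hash <<= 1;
      if (hash & 0x10000) hash ^= 0x1021;
    }
  }
  return hash & 0xffff;
}

const letters = "abcdefghijklmnopqrstuvwxyz";
const counts = new Uint32Array(0x10000); // one counter per possible hash value
let total = 0;

for (const a of letters)
  for (const b of letters)
    for (const c of letters)
      for (const d of letters) {
        counts[calcHash(a + b + c + d)]++;
        total++;
      }

// The most crowded hash value. With 26^4 = 456,976 strings and only
// 65,536 buckets, at least one bucket must hold 7 or more strings.
const busiest = counts.reduce((max, n) => Math.max(max, n), 0);
console.log(total + " strings, busiest hash shared by " + busiest + " of them");
```

So a green match for something like cdwn only says the string shares a hash with an unknown file, not that it is the real name.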
MoffD wrote: can you give me the method of hashing? I'm willing to try brute forcing all the max length filenames.
This is the JavaScript function from Malv's website.
Code: Select all
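function calcHash(str)
{
    var hash = 0; // running value; bits above 15 are masked off on return
    var len = str.length;
    for (var i = 0; i < len; i++) {
        // XOR the next character into the high byte
        hash ^= str.charCodeAt(i) << 8;
        // Eight shifts per character, XORing in 0x1021 whenever a 1 bit falls out of bit 15
        for (var j = 0; j < 8; j++) {
            hash <<= 1;
            if (hash & 0x10000) hash ^= 0x1021;
        }
    }
    return hash & 0xffff;
}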
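For what it's worth, this routine appears to use the same parameters as the CRC usually catalogued as CRC-16/XMODEM (polynomial 0x1021, initial value 0, no bit reflection), whose published check value for the string "123456789" is 0x31C3. The sketch below checks that, and shows roughly how a guess could be compared against a list of unknown hashes. The `unknownHashes` set and the `checkGuess` function are hypothetical stand-ins, not the actual code or data from Malv's page. a0f and dc79 are unknown hashes mentioned in this thread, and reading a0f as 0x0a0f is an assumption.

```javascript
function calcHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i) << 8;
    for (let j = 0; j < 8; j++) {
      hash <<= 1;
      if (hash & 0x10000) hash ^= 0x1021;
    }
  }
  return hash & 0xffff;
}

// Standard CRC-16/XMODEM check string.
const checkValue = calcHash("123456789");
console.log(checkValue.toString(16)); // 31c3

// Hypothetical list of still-unknown hashes (assumed values, see above).
const unknownHashes = new Set([0x0a0f, 0xdc79]);

// Returns the matching hash as a hex string, or null if the guess matches nothing.
function checkGuess(guess, hashes) {
  const h = calcHash(guess);
  return hashes.has(h) ? h.toString(16) : null;
}

console.log(checkGuess("GAMEOPT.GRA", unknownHashes));
```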
Code: Select all
sub_24C22 proc far ; CODE XREF: sub_1CAAE+9DP
arg_0 = dword ptr 6
arg_4 = word ptr 0Ah
        push bp
        mov bp, sp
        push ds
        push si
        pushf
        cld
        xor dx, dx              ; dx = running hash, starts at 0
        lds si, [bp+arg_0]      ; ds:si -> input buffer
        mov cx, [bp+arg_4]      ; cx = number of bytes
loc_24C31: ; CODE XREF: sub_24C22+2Bj
        lodsb                   ; al = next byte
        sub ah, ah
        xchg ah, al             ; ax = byte << 8
        xor dx, ax
        push cx
        mov cx, 8               ; eight bit steps per byte
loc_24C3C: ; CODE XREF: sub_24C22:loc_24C4Aj
        mov bx, dx
        shl dx, 1
        and bx, 8000h           ; was the top bit set before the shift?
        jz short loc_24C4A
        xor dx, 1021h
loc_24C4A: ; CODE XREF: sub_24C22+22j
        loop loc_24C3C
        pop cx
        loop loc_24C31
        mov ax, dx              ; return the hash in ax
        popf
        pop si
        pop ds
        pop bp
        retf
sub_24C22 endp
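Read line by line, the disassembly looks like the same computation as the JavaScript function. The sketch below is a hand transliteration of it, with DX kept as a 16-bit value, compared against calcHash on a few names that appear in this thread. The function name `hashLikeAsm` and the helper `toBytes` are made up for the example, and the comparison only covers plain ASCII input.

```javascript
function calcHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i) << 8;
    for (let j = 0; j < 8; j++) {
      hash <<= 1;
      if (hash & 0x10000) hash ^= 0x1021;
    }
  }
  return hash & 0xffff;
}

// Transliteration of sub_24C22: `bytes` plays the role of the buffer at
// arg_0, and its length plays the role of arg_4.
function hashLikeAsm(bytes) {
  let dx = 0;                        // xor dx, dx
  for (const al of bytes) {          // lodsb, once per outer loop pass
    const ax = (al & 0xff) << 8;     // sub ah, ah / xchg ah, al
    dx ^= ax;                        // xor dx, ax
    for (let cx = 8; cx > 0; cx--) { // mov cx, 8 ... loop loc_24C3C
      const bx = dx & 0x8000;        // mov bx, dx / and bx, 8000h
      dx = (dx << 1) & 0xffff;       // shl dx, 1 (16-bit register)
      if (bx !== 0) dx ^= 0x1021;    // jz skips this xor
    }
  }
  return dx;                         // mov ax, dx
}

// Convert an ASCII string to an array of byte values.
function toBytes(str) {
  return Array.from(str, (ch) => ch.charCodeAt(0));
}

const samples = ["GAMEOPT.GRA", "swish.snd", "bapple0.omp", "SCRFONT.GRA"];
const allAgree = samples.every((s) => hashLikeAsm(toBytes(s)) === calcHash(s));
console.log("assembly and JavaScript versions agree: " + allAgree);
```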
Another clue:
These both have the same file size, they're adjacent, and they're the same size as GAMEOPT.GRA, which is the game options window. I'd guess that they're both GRA files, and that the last character is a digit.
Malv, can your tool extract the graphics to help with guesses?
I also wonder if running through the game and logging the dosbox output to the file would turn up anything interesting...
lemm wrote: Another clue: These both have the same file size, they're adjacent, and they're the same size as GAMEOPT.GRA, which is the game options window. I'd guess that they're both GRA files, and that the last character is a digit. Malv, can your tool extract the graphics to help with guesses? I also wonder if running through the game and logging the dosbox output to the file would turn up anything interesting...
Thanks to the clues, I found it!
I also searched Google for images as clues, since this game isn't one of my favorites.
More clues for a0f? (this thing kinda looks like a text adventure game)
- Malvineous
- Shikadi Webmaster
- Posts: 382
- Joined: Wed Oct 31, 2007 21:48
- Location: Brisbane, Australia
- Contact:
Holy crap guys, this is fantastic!! I can't believe you figured out so many so quickly! This is amazing. Many, many thanks! I've updated the page with the newly discovered filenames. I've also removed some of the filenames I guessed that I suspect are wrong, to see if you can come up with any better suggestions. These are now listed in brackets after the hash. Be aware that once a hash is marked green (e.g. if you type in my old guess) you'll have to reload the page again so it goes back to yellow, otherwise that hash will no longer be checked as you type.
@Keening_Product: As Lemm said, I scoured the .exe, as well as all files inside the .LBR, and extracted anything that looked like a filename. This got about 90% of the names. I'm not sure whether the remaining files are used or not, given that their names don't appear anywhere in the game!
@wiivn: Thanks for your correct guesses
@lemm: I haven't yet reversed the .gra files, so I can't look at them to figure out what they might be. They seem to be in some kind of planar EGA-like arrangement, which is odd for VGA graphics... And since bapple0.omp was in the .exe but not in the archive, maybe it's one of those cases where they add a number to a character already in a string to construct the filename?
@MoffD: The problem isn't so much brute forcing the filenames, rather it's figuring out which of the matching filenames is the correct one.
I have actually written a program which does a slightly better job than plain brute force: it calculates the hash backwards and prints all possible matches. This means you can restrict it to e.g. files ending in ".GRA" and beginning with "S", and it's quite a bit faster than brute-forcing every possible filename. However there are still thousands of matches. For example, here are all the filenames that match the a0f hash, where the first character is an S, and the second is from A to H, the extension is .GRA and a digit can only appear as the last character in the filename. Even with these restrictions there are a lot of matches.
The correct filename for the a0f hash is probably in that list, but since *all* those filenames match, which one is the correct one??
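For the curious, here is a rough sketch of what "calculating the hash backwards" can look like. It is not the program described above, just one possible way to do it, and the names `unstep` and `hashBeforeSuffix` are invented for the example. The idea is that each character's effect on the hash can be undone if you know the character, so a known ending such as ".GRA" can be peeled off the target hash to leave the value the rest of the name has to produce.

```javascript
function calcHash(str) {
  let hash = 0;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i) << 8;
    for (let j = 0; j < 8; j++) {
      hash <<= 1;
      if (hash & 0x10000) hash ^= 0x1021;
    }
  }
  return hash & 0xffff;
}

// Undo one character: given the 16-bit hash after `ch` was processed,
// return the hash as it was before. Assumes `ch` is a single ASCII character.
function unstep(hash, ch) {
  for (let j = 0; j < 8; j++) {
    if (hash & 1) {
      // Low bit set means 0x1021 was XORed in, so the old top bit was 1.
      hash = ((hash ^ 0x1021) >> 1) | 0x8000;
    } else {
      hash >>= 1;
    }
  }
  return hash ^ (ch.charCodeAt(0) << 8);
}

// Walk a known suffix backwards from the target hash.
function hashBeforeSuffix(target, suffix) {
  let h = target;
  for (let i = suffix.length - 1; i >= 0; i--) h = unstep(h, suffix[i]);
  return h;
}

// Demo with a name from this thread: peeling ".GRA" off the full hash
// leaves the hash of the stem alone.
const fullHash = calcHash("GAMEOPT.GRA");
const stemHash = hashBeforeSuffix(fullHash, ".GRA");
console.log(stemHash === calcHash("GAMEOPT")); // true
```

With the suffix removed like this, candidate stems only need to be hashed up to the dot and compared against the intermediate value. That saves work, but it does nothing about the real problem of choosing among all the names that match.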
There are 13 matches with SAVE in this list and 7 with SHWR. Maybe it's one of these.
Another suggestion is SCRFONT.GRA. There are files with the extension SCR, but more probably it is some screen font, as there are other GRA font files.
Some clues for dc79? Is it something with weapons or with objects to collect?
I think the best course of action would be to reverse the files to get the graphics, and then to filter out the candidate file names by throwing away those that don't have an English word in them. Almost every file in the archive contains a word with three letters or more. I think you might be able to narrow your search by an order of magnitude if you did that, and combined with the information from the picture, the answer should be obvious.
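A minimal sketch of that filtering step, assuming the candidate names are already available as a list. The word list and the candidate names below are made-up placeholders (they are not real matches for any hash in the archive), and `containsWord` and `filterCandidates` are hypothetical helper names. A real run would want a proper dictionary file.

```javascript
// True if the part of the name before the extension contains any word
// from `words` that is at least `minLen` letters long.
function containsWord(name, words, minLen) {
  const stem = name.split(".")[0].toUpperCase();
  return words.some(function (w) {
    return w.length >= minLen && stem.includes(w.toUpperCase());
  });
}

// Keep only the candidates that contain a dictionary word.
function filterCandidates(candidates, words) {
  return candidates.filter(function (name) {
    return containsWord(name, words, 3);
  });
}

// Placeholder data for illustration only.
const words = ["SAVE", "FONT", "GAME", "BOX"];
const candidates = ["SAVEQZX1.GRA", "SBXQZKW.GRA", "SCRFONT.GRA", "SDKJQ2.GRA"];

console.log(filterCandidates(candidates, words));
// keeps SAVEQZX1.GRA and SCRFONT.GRA
```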
@wiivn: The filesize would suggest that it's not a WEAP variant.