Date: Mon, 27 Feb 2012 02:52:15 +0400
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: -ext:keyboard with 8-bit chars

magnum -

Thank you for reporting this!

On Sun, Feb 26, 2012 at 06:11:20PM +0100, magnum wrote:
> I tried making a custom keyboard external mode for producing German
> keyboard output in iso-8859-1. I doubled the array sizes per the comment
> and also changed the while loop that initializes mc accordingly. At
> first I just entered the characters as '??' and so on, and took care that
> john.conf was encoded in iso-8859-1. But I got a segmentation fault when
> running.
>
> After some head scratching I tried defining the 8-bit characters as hex
> instead, and this works. Is this requirement a bug or a known
> limitation? A non-jumbo build shows the same limitation.

It's neither a bug nor a known limitation - I was not aware of it before
(or at least I don't recall it), but it's also not exactly a bug.
... or maybe it is, given that we have a comment that talks about adding
umlauts but does not mention this detail.

Here's what this is about:

In C, which the external mode language is similar to, when you assign a
character constant to an int type variable, you may or may not get sign
extension depending on whether the char type is signed or not in a given
C implementation on a given platform and with given compiler settings. %-)

John's external mode behaves as if char were signed (even though it does
not actually have this type, except that you're able to specify constants
using the same syntax as you would for char). So it acts as a valid
implementation of C in this respect (even though it does not even try to
in some other aspects).

That said, we may want to have it behave the way a C implementation would
for char being unsigned (also valid) - this may be more convenient for us,
as you have found out. To make this change, you may e.g. edit the two
instances of "value = c_getchar(1)" to
"value = (unsigned char)c_getchar(1)" in c_getint(). I did not test this
change.

When you simply assign 8-bit characters e.g. to elements of word, it does
not matter whether they get sign-extended or not because John only uses
the lower 8 bits. However, Keyboard mode itself uses those chars as array
indices, and a sign-extended (negative) index is out of bounds, hence the
problem.

We may adjust the Keyboard mode to avoid the problem e.g. by using
" & 0xff" on the three references to k further down in init() (and only
in init(), so there's no performance impact from this change). I've just
tested this, and it works.

In fact, I think we should adjust the Keyboard sample to ship with this
change already made, along with the m and mc size change and the mc
initialization loop change that you made. (The comment may then be
dropped.)

I will probably commit this change to the main tree and think about also
adjusting compiler.c to treat char constants as unsigned.

> Another thing, the comment "This sample can be enhanced to infer the
> rest of the indices here". What exactly is missing, and what would it
> change? Are we not resuming correctly with current code?

We only infer the length and id[0], but not id[1] and on (they're now set
to 0). This means that when we interrupt and --restore a Keyboard mode
run, it tries some previously-tested candidate passwords for a second
time.

If we add code to infer the rest of the indices in the same way that we
do for id[0], we'll be continuing from almost exactly the place where we
interrupted (skipping only the last incomplete set of passwords that were
in-flight at the time of interruption, which is something John must be
doing and so it does).

The downside is that the sample will become longer and more complicated.
But we can try to implement this and see.

Alexander
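
P.S. To illustrate the sign extension issue and the " & 0xff" workaround
described above, here is a minimal sketch in plain C. It is not code from
John: as_signed_char(), to_index() and count_keys() are hypothetical names
made up for this example, and 0xe4 is used only because it is the
iso-8859-1 code for a-umlaut. The mask relies on two's complement ints,
which is what essentially all current platforms use.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a character constant being assigned to an
 * int when char is treated as signed: bytes 0x80..0xff become negative.
 */
static int as_signed_char(unsigned char byte)
{
	return byte >= 0x80 ? (int)byte - 0x100 : (int)byte;
}

/*
 * The " & 0xff" workaround: keep only the low 8 bits, so that the
 * result is always a valid index into a 0x100-element array.
 * Assumes two's complement representation of negative ints.
 */
static int to_index(int c)
{
	return c & 0xff;
}

/*
 * Count how many times each character occurs, using the characters as
 * array indices (similar in spirit to how Keyboard mode indexes mc[]).
 * Without to_index(), a sign-extended 8-bit character would index
 * before the start of counts[].
 */
static void count_keys(const int *keys, size_t n, int counts[0x100])
{
	size_t i;

	for (i = 0; i < 0x100; i++)
		counts[i] = 0;

	for (i = 0; i < n; i++) {
		int idx = to_index(keys[i]);
		assert(idx >= 0 && idx < 0x100);
		counts[idx]++;
	}
}
```

With these definitions, as_signed_char(0xe4) yields -28, which would be
an out-of-bounds index if used directly, whereas
to_index(as_signed_char(0xe4)) yields 0xe4 again. 7-bit characters pass
through to_index() unchanged, so masking is harmless for them.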