john-dev - Re: err in rules processor and dirty fix [rules.c]



Message-ID: <f5ed62a76154e8f890eb72f3b7e68736@smtp.hushmail.com>
Date: Fri, 22 Mar 2013 09:43:04 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: err in rules processor and dirty fix [rules.c]

On 21 Mar, 2013, at 15:13 , Costin Enache <e_costin@...oo.com> wrote:

> >> So 0x100 is not enough even if we did not have the hex encoding? Your rules must be extreme! Anyway, I think even core John should emit a warning when rules get truncated (should be possible to catch outside the performance critical code).
> 
> Not really: I have an unknown selection of characters from ISO-8859-1, say 97 bytes, c1 ... c97, machine generated, some may be consecutive, some not. Let's assume we do not need hex encoding. I need a rule to append all these characters, three times to each word in the wordlist, covering all the combinations:
>  
> $[c1-c97]$[c1-c97]$[c1-c97]
> (BTW, I will NOT use Az"[c1-c97] [c1-c97] [c1-c97]" as \x22, should it be amongst the c1-c97, will be decoded as double quotes, will be interpreted by the pre-processor, and will mess up everything(bug? should we escape escaped hex chars?). The same does not happen for [], lucky me.)
>  
> I will end up with 100x3=300 characters per rule line. This is a conservative example. To be on the safe side, 0x400-1 will be an uncompressed range, hex encoding (maybe the order the characters are tried matters for some). Say we append 4 characters, makes it ~0x1000++. I also think 0x5000 is heavily exaggerated, but some complex rules could get really long. I now have the buffer size at 0x1000 in params.h, with the 4x patch in rules.c. This means an effective rule text line length of max 4096.
> 
> And yes, John silently truncating rules was a nasty and unexpected surprise :) Warnings are welcome.
> 
> I guess that the best alternative would be to change the pre-processor to check and reject invalid rules in a separate function, with an appropriate buffer. Everything else, up to the reject check, works fine. The expanded rules are also checked fine by the aforementioned function. I may also be overlooking a lot and drifting in the wrong direction here :)

OK, so the problem is the size after expanding ranges but before turning bracket lists into individual rules. $[c1-c97]$[c1-c97]$[c1-c97] is just 27 characters and each resulting rule is just 6 characters. But if there is a string representation in between, I can see it will be 300 as you say.

I will have a look again. Thanks,

magnum
