Date: Tue, 10 Nov 2020 23:06:21 +0100
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Rules characters unicode support.

On Tue, Nov 10, 2020 at 05:01:24PM +0100, François wrote:
> I've just finished writing the john.conf using your micro-optimization
> trick.
> 
> Three last questions before creating a pull request:

Great!

> 1- On the experimental file I'm working on, this rule is surprisingly
> effective (hundreds of passwords cracked). However, I specifically do
> not have uppercase in my sample, so my john.conf change just
> contains lowercase UTF-8. Do you want me to add uppercase?

It will be most flexible to have lowercase and uppercase as two separate
sections, then a section .include'ing both of those, and then have the
latter .include'd from [List.Rules:Jumbo].  That way, lowercase-only can
also be run by requesting just the corresponding ruleset.

> 2- Correct me if I'm wrong, but there is no obvious search-and-replace
> strategy for a pattern of more than one letter in john's rules
> engine. I'm thinking of two-letter substitutions to one Unicode
> character, specifically:
> # Latin small letter thorn (th) -> þ
> # Latin small letter ae -> æ

There's no way to search for a two-character substring, but you can
search for the first character and then check the second:

/a Dp =pe

Unfortunately, if the very first "a" isn't followed by an "e", this will
reject the word instead of searching further.  You can partially
compensate for that by also having:

%2a Dp =pe

and so on.  Of course, you'll need to follow these with commands that
introduce the UTF-8 characters at position "p".
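
To make the rejection behavior concrete, here is a rough Python model of
the search/delete/check part of those two rules.  It's only a sketch of
the logic as described above, not how JtR implements rules; the function
names and the "n" parameter (1 for "/a", 2 for "%2a") are made up for
illustration.

```python
from typing import Optional, Tuple


def nth_index(word: bytes, ch: bytes, n: int) -> int:
    """Return the position of the n-th occurrence of ch in word, or -1."""
    pos = -1
    for _ in range(n):
        pos = word.find(ch, pos + 1)
        if pos < 0:
            return -1
    return pos


def find_a_delete_check_e(word: bytes, n: int = 1) -> Optional[Tuple[bytes, int]]:
    """Hypothetical model of "/a Dp =pe" (n=1) or "%2a Dp =pe" (n=2).

    Returns (word after the delete, p), or None if the rule rejects.
    """
    p = nth_index(word, b"a", n)      # "/a" or "%Na": set p, or reject
    if p < 0:
        return None
    rest = word[:p] + word[p + 1:]    # "Dp": delete the character at p
    if rest[p:p + 1] != b"e":         # "=pe": reject unless "e" is now at p
        return None
    return rest, p


print(find_a_delete_check_e(b"caesar"))    # first "a" is followed by "e"
print(find_a_delete_check_e(b"algae"))     # first "a" is not: rejected
print(find_a_delete_check_e(b"algae", 2))  # the "%2a" variant finds it
```

With "algae", the first call rejects because the first "a" is followed by
"l", while the n=2 call matches the "ae" at the end.  That's the case the
"%2a" rule is there to catch.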

Instead of the "D" command, you can have the rule calculate p+1 and
check the character there, or search for the second character and then
check the first at p-1 (this fits the rule commands better: the "v"
command subtracts, so adding 1 requires putting -1 into a variable
first):

/e vap1 =aa
%2e vap1 =aa

This is likely quicker when the remaining portion of the word is long,
since nothing has to be shifted by a delete.  It's also better if your
UTF-8 character is 2 bytes, because then you just do two overstrikes.
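
For illustration, here is a rough Python model of what "/e vap1 =aa" (or
"%2e vap1 =aa") followed by two overstrikes would do to a word's bytes
for "ae" -> "æ".  This is only a sketch under assumptions: the function
name and the "n" parameter are made up, the overstrike step is my guess
at what would follow the check, and rejecting when p-1 falls before the
start of the word is a choice made for this model, not something checked
against JtR.

```python
from typing import Optional


def ae_to_ligature(word: bytes, n: int = 1) -> Optional[bytes]:
    """Hypothetical model of "/e vap1 =aa" (n=1) or "%2e vap1 =aa" (n=2),
    followed by overstriking positions a and p with the two bytes of "æ".
    Returns the new word, or None if the rule rejects."""
    # "/e" or "%Ne": set p to the n-th "e", or reject
    p = -1
    for _ in range(n):
        p = word.find(b"e", p + 1)
        if p < 0:
            return None
    a = p - 1                          # "vap1": a = p - 1
    # "=aa": reject unless the character at position a is "a"
    # (assumption of this model: a position before the word also rejects)
    if a < 0 or word[a:a + 1] != b"a":
        return None
    lig = "æ".encode("utf-8")          # two bytes: 0xC3 0xA6
    out = bytearray(word)
    out[a] = lig[0]                    # overstrike position a
    out[p] = lig[1]                    # overstrike position p
    return bytes(out)


for w, n in ((b"caesar", 1), (b"elaenia", 1), (b"elaenia", 2)):
    r = ae_to_ligature(w, n)
    print(w, n, r.decode("utf-8") if r is not None else None)
```

The word length in bytes doesn't change here, which is why two
overstrikes are enough for a two-letter to 2-byte substitution.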

I didn't test any of these now, but they should work.

> 3- Do you want me to provide the rules in best-match order?
> It might get a bit confusing; alternatively, I can group them by best
> unicode substitution order.

I have no preference, and I don't know what you mean by "best unicode
substitution".  I suspect these rules will usually be used as part of
the jumbo ruleset, in which case their number will be relatively small
and thus their order won't matter much.

However, I think "best-match order" is valuable if you have that data.

Alexander
