Date: Mon, 24 Jun 2013 08:43:01 +0200
From: Jan Starke <jan.starke@...ofbed.org>
To: john-users@...ts.openwall.com
Subject: Re: Fuzzing with regular expressions

Hello guys,

rexgen now supports UTF-8 input, so for example
rexgen 'M(ü|ö|ue|oe)ller'
generates all 4 variants of this surname. Additionally, cmake now creates a
Visual Studio solution which compiles rexgen.exe and librexgen-0.1.0.dll
natively (assuming you have bison and flex available). Unfortunately, I
didn't get find_library() running in Windows, so the Lua interface is
currently not included on Windows.
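
To illustrate what the expansion above produces, here is a minimal Python
sketch. It is not rexgen's implementation: the `parts` list is a hypothetical,
pre-parsed form of the expression, and `expand` is a name made up for this
example. It also prints the UTF-8 bytes of each word, which shows that ü and ö
occupy two bytes each.

```python
from itertools import product


def expand(parts):
    """Yield every string formed by picking one alternative per part."""
    for combo in product(*parts):
        yield "".join(combo)


# Hypothetical pre-parsed form of 'M(ü|ö|ue|oe)ller' (assumption, not rexgen's
# internal representation): one list of alternatives per position.
parts = [["M"], ["ü", "ö", "ue", "oe"], ["ller"]]
words = list(expand(parts))

for word in words:
    # Show each generated word next to the hex form of its UTF-8 encoding.
    print(word, word.encode("utf-8").hex())
```

The hex column corresponds to what a hexdump of UTF-8 encoded output would
contain for each word.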

In order to integrate rexgen into JtR, what are the necessary requirements
for the library? I'm currently thinking of
 - state serialization (for "john --restore")
 - ... ?
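
What I mean by state serialization can be sketched roughly as follows. This is
only an illustration in Python under the assumption that the generator's whole
state can be reduced to a single integer position. The class
`ResumableGenerator` and its `save`/`restore` methods are hypothetical names;
they are neither rexgen's API nor john's.

```python
import json


class ResumableGenerator:
    """Enumerates all combinations of alternatives; the state is one integer."""

    def __init__(self, parts):
        self.parts = parts
        self.position = 0
        self.total = 1
        for alternatives in parts:
            self.total *= len(alternatives)

    def word_at(self, index):
        # Mixed-radix decoding of the index; the rightmost part varies fastest.
        chosen = []
        for alternatives in reversed(self.parts):
            index, digit = divmod(index, len(alternatives))
            chosen.append(alternatives[digit])
        return "".join(reversed(chosen))

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= self.total:
            raise StopIteration
        word = self.word_at(self.position)
        self.position += 1
        return word

    def save(self):
        # Serialize the state to a string that could be written to a file.
        return json.dumps({"position": self.position})

    def restore(self, blob):
        # Resume from a previously saved state.
        self.position = json.loads(blob)["position"]


parts = [["M"], ["ü", "ö", "ue", "oe"], ["ller"]]
gen = ResumableGenerator(parts)
first = [next(gen), next(gen)]   # consume two candidates
state = gen.save()               # the session is interrupted here

resumed = ResumableGenerator(parts)
resumed.restore(state)           # a later run picks up where it stopped
rest = list(resumed)
```

The point of the design is that a restored generator continues with the next
candidate without replaying the ones already produced.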

Can you point me to a good starting point in john's code, where I can read
how john invokes these features of password generators?

Kind regards, Jan


2013/5/22 magnum <john.magnum@...hmail.com>

> On 22 May, 2013, at 12:40 , Jan Starke <jan.starke@...ofbed.org> wrote:
> > 2013/5/22 magnum <john.magnum@...hmail.com>
> >> I do not quite understand the section about Unicode. And it does not seem
> >> to work (my terminal is UTF-8):
> >>
> >> $ rexgen "M[üö]ller"
> >> Mller
> >> Mller
> >> Mller
> >> $ rexgen -u8 n "M[üö]ller"
> >> Mller
> >> Mller
> >> Mller
> >>
> >> -DUTF_VARIANT=8 does not change the above, in case it was supposed to.
> >
> > rexgen currently cannot use Unicode strings as input, due to limitations of
> > the lexer (GNU flex). flex ignores any characters which are not known to
> > it. If you want to generate Unicode characters, you must specify them with
> > the \uxxxx syntax, e.g.
> >
> > rexgen 'M(ue|oe|\u00fc|\u00f6)ller'
>
> This contradicts the Unicode section on http://code.google.com/p/rexgen/
> so you might want to revise that. Or better, make the code work like the
> docs say :-)
>
> > The aim of the options u8, u16 and u32 is to enforce the output encoding.
> > To verify this, you could create a hexdump of the output:
> >
> > rexgen 'test' | od -x
>
> OK, I see it now. This also contradicts the web docs: the default is UTF-8
> and not UTF-32. And in this case the actual behavior is better - defaulting
> to UTF-32 would be very odd!
>
> magnum
>
