Date: Thu, 18 Aug 2011 22:01:27 -0500
From: "JFoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: Re: 45% wordlist boost just waiting to happen

The bug is easy to see here, in the dummy rule:

static char *dummy_rules_apply(char *word, char *rule, int split, char *last)
{
    /* Unconditional write, even when word is shorter than length. */
    word[length] = 0;
    if (strcmp(word, last))
        return strcpy(last, word);
    return NULL;
}

....
   length = 15;  (assume MD5a)
....
   line = words[nCurLine++];
....
   if ((word = apply(line, rule, -1, last))) {


So, here is what is happening:


Buffer:

line1\0line 2 this is a longer line\0line 3\0

words[0] points to "line1"
length is set to 15

The call to apply() happens.

Now, buffer looks like this:

line1\0line 2 th\0s is a longer line\0line 3\0

See the 'new' null byte, 15 bytes past the start of words[0]?  It landed in 
the middle of the next line, so the buffer is smashed.  When we fetch the 
next line with line = words[nCurLine++], line will now contain "line 2 th" 
and not "line 2 this is a longer line".
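
To make the overwrite concrete, here is a minimal sketch in C that rebuilds 
the example buffer and applies the same unconditional truncation.  The names 
truncate_in_place() and second_line_len_after_truncate() are made up for 
this illustration and are not from wordlist.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the truncation in dummy_rules_apply():
 * it writes a NUL at word[length] without checking the word's length. */
static void truncate_in_place(char *word, int length)
{
    word[length] = 0;
}

/* Rebuilds the packed buffer from the example, truncates the first word
 * at 'length', and returns the length that the second pointer now sees. */
static size_t second_line_len_after_truncate(int length)
{
    /* Three NUL-terminated lines packed back to back. */
    static const char packed[] =
        "line1\0line 2 this is a longer line\0line 3";
    char buf[sizeof(packed)];
    char *words[2];

    memcpy(buf, packed, sizeof(packed));
    words[0] = buf;
    words[1] = words[0] + strlen(words[0]) + 1;

    truncate_in_place(words[0], length);
    return strlen(words[1]);
}
```

With length = 15 the second line shrinks from 28 characters to the 9 
characters of "line 2 th"; with a length that does not reach past the first 
terminator (5 here) it is left alone.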

There is no way to use the buffer in place without making sure that 
everything treats the lines as const.  If anything has to trim a line, it 
MUST do a strlen first, and if the line is too long, copy the data, modify 
the copy, and use the copy from that point on.  Thus the 'best' gain we can 
end up with is the change from a strcpy to a strlen.  I bet there will be 
zero improvement in performance, and there will likely be more places where 
'non-constness' is found.
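
A rough sketch of that strlen-then-copy rule, assuming a caller-supplied 
scratch buffer.  trimmed_view() and its parameters are hypothetical names 
for illustration only, not existing john code.

```c
#include <assert.h>
#include <string.h>

/* Returns 'line' itself when it already fits in max_len characters,
 * otherwise a truncated copy placed in 'scratch' (which must hold at
 * least max_len + 1 bytes).  The source line is never written to. */
static const char *trimmed_view(const char *line, size_t max_len,
                                char *scratch)
{
    if (strlen(line) <= max_len)
        return line;                /* short enough: use it in place */
    memcpy(scratch, line, max_len); /* too long: trim a private copy */
    scratch[max_len] = '\0';
    return scratch;
}
```

Note that every line still pays for a strlen, and long lines pay for the 
copy as well, which is why little gain over a plain strcpy is expected.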

I think your observed performance gains were a red herring.

Thus my vote is to scrap this idea, remove the #define, and simply use the 
strcpy into the known-size static buffer.  That buffer is already larger 
than LINE_BUFFER_SIZE, so it works 'like' the non-memory-caching read 
function.
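
For comparison, a sketch of the always-copy variant.  The value of 
LINE_BUFFER_SIZE, the global length, and the name apply_on_copy() are 
assumptions made for this example; a bounded memcpy is used here in place 
of the plain strcpy so the sketch stays safe on its own.

```c
#include <assert.h>
#include <string.h>

#define LINE_BUFFER_SIZE 0x400  /* assumed value for this sketch */

static int length = 15;         /* assumed, as in the MD5a example */

/* Copies the line into a private static buffer, truncates the copy,
 * and returns it via 'last' unless it repeats the previous word.
 * 'last' must hold at least length + 1 bytes. */
static char *apply_on_copy(const char *line, char *last)
{
    static char copy[LINE_BUFFER_SIZE + 1];
    size_t n = strlen(line);

    if (n > LINE_BUFFER_SIZE)
        n = LINE_BUFFER_SIZE;   /* defensive bound on the copy */
    memcpy(copy, line, n);
    copy[n] = '\0';

    if (n > (size_t)length)
        copy[length] = '\0';    /* truncation touches only the copy */
    if (strcmp(copy, last))
        return strcpy(last, copy);
    return NULL;
}
```

Because only the copy is modified, the packed word buffer stays intact no 
matter how long the lines are.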

Jim.


Full context kept.

----- Original Message ----- 
From: "magnum" <rawsmooth@...dband.net>
To: "JFoug" <jfoug@....net>
Sent: Thursday, August 18, 2011 8:51 PM
Subject: Re: [john-dev] 45% wordlist boost just waiting to happen


> Maybe I fixed it now? I'm not sure but the problem is gone with this
> patch (just this, no parts of 0018)
>
> magnum
>
>
> On 2011-08-19 03:17, JFoug wrote:
>> I saw that fix. For now, I was leaving it out, focusing on why the #if 0
>> is needed.
>>
>> Jim.
>>
>> ----- Original Message ----- From: "magnum" <rawsmooth@...dband.net>
>> To: "JFoug" <jfoug@....net>
>> Sent: Thursday, August 18, 2011 8:00 PM
>> Subject: Re: [john-dev] 45% wordlist boost just waiting to happen
>>
>>
>>> You might want to apply my 0018 patch. It's still on the wiki, just
>>> visual name changed to "0018 PULLED". It has a length fix that I
>>> believe must be there unless it's taken care of by other means. I
>>> could see dummy_rules_apply() write beyond the last word when it
>>> blindly truncated the last word in the buffer.
>>>
>>> It's bound to be the 100-char lines in pw.dic that trigger the fault
>>> but I can't see why. I tried changing the fgetl size from
>>> LINE_BUFFER_SIZE to length+3 (+3 for \r\n\0) but that did not help.
>>>
>>> magnum
>>>
>>>
>>> On 2011-08-19 02:45, JFoug wrote:
>>>> I get the same problem on my 32-bit build. I simply removed the 0 from
>>>> the #define, and did not touch ANYTHING else. I will make sure it also
>>>> affects my MSVC build, and if so, I should be able to figure things out
>>>> by stepping.
>>>>
>>>> Jim.
>>>> ----- Original Message ----- From: "magnum" <rawsmooth@...dband.net>
>>>> To: "JimF" <jfoug@....net>
>>>> Sent: Thursday, August 18, 2011 6:19 PM
>>>> Subject: Re: [john-dev] 45% wordlist boost just waiting to happen
>>>>
>>>>
>>>>> On 2011-08-19 01:01, JimF wrote:
>>>>>> Excellent. I have not had time to look into it (but was going to
>>>>>> tonight).
>>>>>
>>>>> It was a false track. I have no idea what is the problem. Can you
>>>>> replicate it? It happens for all formats in test suite.
>>>>>
>>>>> NOTE 1: if I manually mimic one of the suite's tests, but using --pipe
>>>>> instead, the problem IS still there!
>>>>>
>>>>> NOTE 2: if I create this dummy rule and use it, the problem is gone!
>>>>>
>>>>> [List.Rules:none]
>>>>> :
>>>>>
>>>>> This narrows it down a lot but I'm out of ideas, I'll take a break
>>>>> from this now. If you have time to look at it, please do. That boost
>>>>> would be welcome!
>>>>>
>>>>>> I smoked out the pkzip stuff. I think at this time this may be several
>>>>>> orders of magnitude faster than other tools out there (at least for
>>>>>> .zip files with only 1 file).
>>>>>
>>>>> That's way cool. I'll check it out.
>>>>>
>>>>>>
>>>>>> Jim.
>>>>>>
>>>>>> ----- Original Message ----- From: "magnum" <rawsmooth@...dband.net>
>>>>>> To: "jfoug" <jfoug@....net>
>>>>>> Sent: Thursday, August 18, 2011 5:33 PM
>>>>>> Subject: Re: [john-dev] 45% wordlist boost just waiting to happen
>>>>>>
>>>>>>
>>>>>>> I think I found the wordlist.c problem. Working on it
>>>>>>>
>>>>>>> magnum
>>>>>>>
>>>>>>>
>>>>>>> On 2011-08-18 18:19, jfoug wrote:
>>>>>>>> I am working on getting pkzip fully working. My 'first' alpha for
>>>>>>>> pkzip may be to use unzip -P for the 3rd step (the 'real' test). I
>>>>>>>> have made changes to the format, and have had 4 CORES running on a
>>>>>>>> markov over night. There are 2.4 trillion words in the run, and I am
>>>>>>>> more than 1/4 done. There were only 5 false positives, so I think
>>>>>>>> spawning out, for the 'first' cut would be fine.
>>>>>>>>
>>>>>>>> Right now, I am working on zip2john, so that we can use it to make
>>>>>>>> the hashes. Doing them by hand is a pain in the arse, and if things
>>>>>>>> are not right, you have to pull up the debugger to figure out what
>>>>>>>> part of the validation fails. Certainly NOT user friendly.
>>>>>>>>
>>>>>>>> Jim.
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: magnum [mailto:rawsmooth@...dband.net]
>>>>>>>>> Sent: Thursday, August 18, 2011 10:42 AM
>>>>>>>>> To: jfoug
>>>>>>>>> Subject: Re: [john-dev] 45% wordlist boost just waiting to happen
>>>>>>>>>
>>>>>>>>> I'm not sure. This was plain Linux x86-64, no MP stuff. It's probably
>>>>>>>>> trivial but I probably have no time in the next 24h so if you want to
>>>>>>>>> look at it, just go ahead - if you can replicate the problem in the
>>>>>>>>> first place. I'll send you a note if I start debugging it.
>>>>>>>>>
>>>>>>>>> magnum
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Sounds like there is still a problem.
>>>>>>>>>>
>>>>>>>>>> I know I wrote the thing the way I did, then when solar put it out in
>>>>>>>>>> jumbo, he changed it. Then within a day, people found problems, so he
>>>>>>>>>> took it out.
>>>>>>>>>>
>>>>>>>>>> Do you want me to take a look also? Does it just behave badly on
>>>>>>>>>> certain build types, or is it on all of them?
>>>>>>>>>>
>>>>>>>>>> Jim.
>>>>>>>>>>
>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>> From: magnum [mailto:rawsmooth@...dband.net]
>>>>>>>>>>> Sent: Thursday, August 18, 2011 3:41 AM
>>>>>>>>>>> To: jfoug
>>>>>>>>>>> Subject: Re: [john-dev] 45% wordlist boost just waiting to happen
>>>>>>>>>>>
>>>>>>>>>>> Yeah but it goes from all 1500's to this:
>>>>>>>>>>>
>>>>>>>>>>> Detected John 'jumbo' build, so only testing all jumbo formats
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(0) guesses: 370 time: 0:00:00:00 DONE
>>>>>>>>>>> .pot CHK: md5_gen(0) guesses: 88 time: 0:00:00:00 DONE
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(1) guesses: 419 time: 0:00:00:00 DONE
>>>>>>>>>>> .pot CHK: md5_gen(1) guesses: 88 time: 0:00:00:00 DONE
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(2) guesses: 370 time: 0:00:00:00 DONE
>>>>>>>>>>> .pot CHK: md5_gen(2) guesses: 88 time: 0:00:00:00 DONE
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(3) guesses: 370 time: 0:00:00:00 DONE
>>>>>>>>>>> .pot CHK: md5_gen(3) guesses: 88 time: 0:00:00:00 DONE
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(4) guesses: 279 time: 0:00:00:00 DONE
>>>>>>>>>>> .pot CHK: md5_gen(4) guesses: 75 time: 0:00:00:00 DONE
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(5) guesses: 279 time: 0:00:00:00 DONE
>>>>>>>>>>> .pot CHK: md5_gen(5) guesses: 66 time: 0:00:00:00 DONE
>>>>>>>>>>>
>>>>>>>>>>> -form=md5_gen(6) ^Cmake: *** [test] Interrupt
>>>>>>>>>>>
>>>>>>>>>>> I'm pretty sure the '+length' part of my patch was correct but there
>>>>>>>>>>> is something more to fix. I really hope I can find the problem.
>>>>>>>>>>>
>>>>>>>>>>> magnum
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 2011-08-18 04:02, jfoug wrote:
>>>>>>>>>>>> Keep in mind, the TS as it lives (1.06) has a lot of things not
>>>>>>>>>>>> found (for encodings).
>>>>>>>>>>>>
>>>>>>>>>>>> If your patch is finding all of the non-encoding hashes (ignore
>>>>>>>>>>>> mssql and oracle), then likely the build is not a problem, but the
>>>>>>>>>>>> TS is.
>>>>>>>>>>>>
>>>>>>>>>>>> Jim.
>>>>>>>>>>>>
>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>> From: magnum [mailto:rawsmooth@...dband.net]
>>>>>>>>>>>>> Sent: Wednesday, August 17, 2011 7:02 PM
>>>>>>>>>>>>> To: john-dev@...ts.openwall.com
>>>>>>>>>>>>> Subject: Re: [john-dev] 45% wordlist boost just waiting to happen
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2011-08-17 17:02, magnum wrote:
>>>>>>>>>>>>>> I believe I found the problem. You actually had another XXX comment
>>>>>>>>>>>>>> hinting about it, but I didn't fully understand the issue until
>>>>>>>>>>>>>> seeing in valgrind what happened. Easy fix then.
>>>>>>>>>>>>>
>>>>>>>>>>>>> No, something is still wrong. I tested it quite a bit but the test
>>>>>>>>>>>>> suite (why didn't I use that?) reveals something is very wrong. It
>>>>>>>>>>>>> does not crash - it just fails to crack a lot.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'll investigate more. Anyone that applied 0018, please revert.
>>>>>>>>>>>>>
>>>>>>>>>>>>> magnum
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
> 
