Date: Thu, 4 Aug 2011 09:03:33 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: issues with 1.7.8-jumbo-5

>-----Original Message-----
>From: magnum [mailto:rawsmooth@...dband.net]
>Sent: Thursday, August 04, 2011 8:14 AM
>To: john-dev@...ts.openwall.com
>Subject: Re: [john-dev] issues with 1.7.8-jumbo-5
>
>On 2011-08-04 14:17, Solar Designer wrote:
>> magnum -
>>
>> On Thu, Aug 04, 2011 at 01:28:55PM +0200, magnum wrote:
>>> On 2011-08-04 13:13, magnum wrote:
>>>> The bug is in --pipe
>>>
>>> And here it is: An assumption that average line length is at most 16:
>>>
>>> 	max_pipe_words = (db->options->max_wordfile_memory/16);
>>
>> Thank you for figuring this out!
>
>I was wrong though. Jim does the right thing but somewhere in this code
>block there must be some kind of fence-post error.

I will dig in and have a look. However, the size/16 is only for the 'max'
word count.  If the divisor is raised, then far fewer words will be
possible.  16 was chosen (15-byte PWs on average, plus the terminating
null), as I thought that to be a good 'average' size.  If the words are much
shorter, then we use only a small part of the memory buffer.  If the words
are all very long, then we fill up the memory buffer, but do so in only a
few words (thus wasting space in the array of word pointers).  Since there
is no way to know in advance, I make the above assumption.
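
To make the trade-off concrete, here is a minimal sketch of the arithmetic.
It is not code from john; words_per_block() and its parameters are
hypothetical names, and only the divisor of 16 comes from the quoted line.
A word's "stored size" is taken to be its length plus one byte for the null.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: how many words one block
 * ends up holding, given the buffer size in bytes and the average stored
 * size per word (word length + 1 for the terminating null). */
static size_t words_per_block(size_t buf_bytes, size_t avg_stored)
{
    /* Capacity of the pointer array, as in the quoted line. */
    size_t max_words = buf_bytes / 16;
    /* Number of words of this size that fit in the flat buffer. */
    size_t fit = buf_bytes / avg_stored;

    /* Whichever limit is reached first ends the block. */
    return fit < max_words ? fit : max_words;
}
```

With a 1600-byte buffer, 8-byte words stop at the 100-pointer limit and use
only half the buffer, while 32-byte words fill the buffer after 50 words
and leave half the pointers unused.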

Now, the data structure is a flat buffer that has each null-terminated
word appended after all the previous ones (until we run out of space). There
is also an array of pointers, which point to the start of each word. In the
'normal' wordfile mode, I know the size of buffer needed, read the file in
one go, and put nulls in where the \n chars are (or \r\n, \n\r, etc).  I am
pretty sure I walk that buffer twice: once to get a count of lines, then
allocate the array of pointers to line starts, then walk it a second time,
putting in the nulls and assigning pointers.  For --pipe, I allocate a
fixed-size buffer and a fixed-size array of pointers.  Then I load one line
at a time, setting the pointer to this line properly.  When I exhaust either
the array of line pointers or the memory buffer, I stop loading this block
and use it.  Simple as that.  I will look deeper and see where I missed
something.  But that was supposed to be how it worked.
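
The --pipe loading described above could be sketched roughly as follows.
This is an illustration of the idea, not the code in john: struct
word_block, load_block() and LINE_BUF are made-up names, and the sketch
assumes a fixed maximum line size so that it can stop before a line would
overrun the buffer (a longer line is simply split here).

```c
#include <stdio.h>
#include <string.h>

#define LINE_BUF 128  /* assumed maximum stored line size, including null */

/* Hypothetical layout: one flat buffer plus an array of word pointers. */
struct word_block {
    char  *buf;        /* null-terminated words, stored back to back */
    size_t buf_size;   /* total bytes available in buf */
    size_t used;       /* bytes of buf filled so far */
    char **words;      /* words[i] points at the start of word i in buf */
    size_t max_words;  /* capacity of the words array */
    size_t count;      /* number of words currently loaded */
};

/* Fill one block from `in` and return the number of words loaded.
 * Loading stops at EOF, when the pointer array is full, or when the
 * buffer can no longer hold a maximum-size line. */
static size_t load_block(struct word_block *b, FILE *in)
{
    b->used = 0;
    b->count = 0;

    /* Both limits are checked before each read, so neither is exceeded. */
    while (b->count < b->max_words &&
           b->buf_size - b->used >= LINE_BUF) {
        char *dst = b->buf + b->used;

        if (!fgets(dst, LINE_BUF, in))
            break;                         /* EOF or read error */

        dst[strcspn(dst, "\r\n")] = '\0';  /* cut at the first CR or LF */
        b->words[b->count++] = dst;
        b->used += strlen(dst) + 1;        /* word plus its null */
    }
    return b->count;
}
```

The caller would process b->words[0 .. count-1] and then call load_block()
again for the next block until it returns 0.  The two loop conditions are
exactly where an off-by-one would show up, which is why they are tested
before each line is read rather than after.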

Jim.
