john-dev - RE: Some ideas about enhanced self-test


Message-ID: <05b801cc7d1f$7f1df670$7d59e350$@net>
Date: Tue, 27 Sep 2011 09:12:31 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: RE: Some ideas about enhanced self-test

>From: David Jones [mailto:jonesd@...umbus.rr.com]
>
>On Sep 27, 2011, at 2:29 AM, magnum wrote:
>
>> I take this on-list from here in case someone wants to chime in. This
>>started as a private discussion about eg. crc-32 and dummy getting
>>significantly inflated benchmark figures 
>
>On first blush, I'd suggest having a field bench_weight whose value
>reflects the relative frequency that's to be tested during benchmarking.
>A zero value is used for edge cases that are very rare but need to be
>included in the self test for completeness.  An 8 character password
>would be given a higher weight than the 20 character test password if
>that's relevant to the benchmark.

Nice idea, but remember that we do not want to impose much overhead at all
beyond the cost of the format's own crypt/cmp_all functions.  There are
formats where the simple strcpy that happens during set_key makes up
50% of the format's runtime.  Doing a few floating point ops to find the
next candidate to hand to the format each time will end up making
a rather significant change in the speed.  This will be seen in formats like
dummy, crc32, and NT, and likely even in raw-md5 type formats.

To see how much time is spent JUST in the strcpy of set_key, look at crc32.
It gets 36M on my system (and about 32k in 'actual' -inc:alpha6 testing).
When you switch to multiple salts, the format gets about 65M (and in
testing with 10 salts, I see actual speeds getting close to 80M).  The
ONLY savings at runtime between 1 and multiple salts is that set_key is
called once, then each salt is set and tested.  There is THAT
much difference in speed, simply due to key maintenance when there is
only 1 salt.
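
To make the call pattern concrete, here is a rough sketch that only counts
calls.  It is not taken from the John source: count_calls, struct
call_counts and the batch sizes are all made up for illustration, and it
assumes keys are loaded once per batch and then reused for every salt.

```c
#include <assert.h>

/* Hypothetical counters standing in for the work a real format would do. */
struct call_counts {
    unsigned long set_key_calls;   /* one per candidate key loaded */
    unsigned long set_salt_calls;  /* one per salt per batch */
    unsigned long crypt_calls;     /* one per salt per batch */
};

/*
 * Simulate the outer benchmark loop: each batch of keys is loaded once,
 * then every salt is set and tested against that same batch.
 */
static struct call_counts count_calls(unsigned long batches,
                                      unsigned long keys_per_batch,
                                      unsigned long salts)
{
    struct call_counts c = { 0, 0, 0 };
    unsigned long b, k, s;

    assert(salts > 0);  /* a run needs at least one salt */

    for (b = 0; b < batches; b++) {
        for (k = 0; k < keys_per_batch; k++)
            c.set_key_calls++;       /* the strcpy cost lives here */
        for (s = 0; s < salts; s++) {
            c.set_salt_calls++;
            c.crypt_calls++;         /* tests the whole batch at once */
        }
    }
    return c;
}
```

For the same number of crypt calls, a 10-salt run in this model makes one
tenth of the set_key calls that a 1-salt run makes, which is the only
saving described above.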

Now, one way to proceed is to build the list of words to use prior
to actually 'entering' the loop that feeds the format being benchmarked.  If
done this way we 'could' do frequency percentages, by building a pseudo
wordlist during benchmark setup, adding words in the proper proportion, then
spinning through that list.  At benchmark time (the inner loop), we would
not have to compute percentages.  They would have been done 1 time, prior to
startup.
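
A minimal sketch of that pre-built list, under some assumptions:
struct weighted_test, build_bench_list and next_candidate are hypothetical
names, and bench_weight is the field proposed earlier in this thread, not
something that exists in the format structures today.  Weights are taken
to be small integers so the proportions can be expressed by repetition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical test entry; bench_weight is the proposed field. */
struct weighted_test {
    const char *plaintext;
    unsigned int bench_weight;  /* 0 = self-test only, never benchmarked */
};

/*
 * Build a flat list in which each plaintext appears bench_weight times.
 * Done once, before the timed loop.  Returns a malloc'd array that the
 * caller frees, or NULL if every weight is zero or allocation fails.
 */
static const char **build_bench_list(const struct weighted_test *tests,
                                     size_t ntests, size_t *out_count)
{
    const char **list;
    size_t total = 0, n = 0, i;
    unsigned int w;

    *out_count = 0;
    for (i = 0; i < ntests; i++)
        total += tests[i].bench_weight;
    if (total == 0)
        return NULL;

    list = malloc(total * sizeof(*list));
    if (!list)
        return NULL;

    for (i = 0; i < ntests; i++)
        for (w = 0; w < tests[i].bench_weight; w++)
            list[n++] = tests[i].plaintext;

    *out_count = total;
    return list;
}

/* Inner-loop step: one load, one increment, one compare.  No floats. */
static const char *next_candidate(const char **list, size_t count,
                                  size_t *pos)
{
    const char *key;

    assert(count > 0);
    key = list[*pos];
    if (++*pos == count)
        *pos = 0;
    return key;
}
```

This version lays the repeats out in contiguous runs.  Shuffling the list
once after building it would mix the key lengths without adding anything
to the inner loop.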

Jim.
