Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Tue, 24 Jan 2012 16:02:44 +0100
From: Samuele Giovanni Tonon <>
Subject: Re: best approach to cmp_all and cmp_one

On 01/23/12 22:45, Solar Designer wrote:
> Hi Samuele,
> On Mon, Jan 23, 2012 at 10:03:55PM +0100, Samuele Giovanni Tonon wrote:
>> sorry to bring back another question about cmp_all and cmp_one.
> Why, the question is entirely appropriate and your progress at improving
> your code is desirable.  There's nothing for you to be sorry for.

>> however: if cmp_all get another call before a crypt_all i will just read
>> garbage; would be good on having a global semaphore to avoid a
>> clEnqueReadBuffer if there's hasn't been any previous crypt_all or there
>> are more elegant ways to solve the problem ?
> Having a global flag (not a semaphore) is in fact what I'd recommend.
> You may call it, say, have_full_hashes.  Set it to 0 in crypt_all()
> (because you only read 1/5 - 32 bits out of 160 bits per hash, right?)

> In cmp_all(), you just use your partial hashes - cmp_all()'s return
> value does not need to be precise (non-zero means that there _might_ be
> matches).  You do not need to access this flag there, nor to retrieve
> the full hashes.
> Ditto for cmp_one().
> However, in cmp_exact() you do need to check the flag and if it's still
> 0, then retrieve the full hashes from the GPU (all of them at once) and
> set the flag.  This means that subsequent calls to cmp_exact(), if any,
> will not need to retrieve anything.
>> second: i was thinking on getting the i-th hashes instead of the whole
>> buffer but i discovered cmp_one goes incremental, let me explain by an
>> example
>> suppose NUM_KEYS is 1000 and cmp_all(1000) is called and
>> b == outbuffer[i] is TRUE for 995, then cmp_one will be called like that:
>> cmp_one(0)
>> cmp_one(1)
>> .....
>> cmp_one(995) and password is found.
>> wouldn't be best to have cmp_one start at 995 and incrementing until
>> NUM_KEYS is reached instead of starting from the beginning or to do that
>> there will be side effects or a major rewrite of the code ?
> There should be no need to do that.  cmp_all() should give false
> positives infrequently enough that we should be able to afford doing
> things non-optimally in that case.  In case cmp_all()'s false positives
> are too frequent, you should instead increase the size of your partial
> hashes.  For example, if your max_keys_per_crypt is 0x10000, you load
> 0x10000 of same-salt hashes for cracking, and your partial hashes are
> 32-bit, then false positives become very frequent (not cmp_all()'s, but
> get_hash*() and cmp_one()'s, though - since cmp_all() is not used on
> this many loaded hashes).  Since your cmp_exact() is very expensive,
> you should deal with this by increasing the partial hash size - e.g., to
> 48 bits.
> Sure, increasing the partial hash size may have a performance penalty of
> its own in your case.  This is normally not the case for CPU-only code,
> where larger partial hashes only consume more RAM, whereas their effect
> on performance through them also consuming more cache space is minimal.
> Also, cmp_exact() is normally not that expensive.  But in your case it
> is and thus a slightly larger partial hash size may be justified.
> As I wrote in another posting, I am actually considering merging some or
> all of the hash comparisons into crypt_all().  This would let you use
> specialized algorithms if you like - even offload the comparisons to
> the GPU, which is part of the plan for the formats interface rework that
> I am considering.
>> next question: saltless password.
>> cmp_all is not called for saltless password due to 1. ,
> Sometimes it is - for low hash counts.
>> so instead you get cmp_one and get_hash_xxx .
>> This unfortunately makes me have 4 clEnqueReadBuffer which are quite
>> expensive and do something like that.
> As I wrote above, please use a global flag and please read the entire
> buffer at once - in cmp_exact().
> For get_hash*() and cmp_one(), just use partial hashes that you already
> have.  It is OK for cmp_one() to have false positives - this is why we
> also have cmp_exact().
>> to solve this problem i'd have to revectorize the opencl output,
> There should be no need.  Just use large enough partial hashes so that
> cmp_exact() is infrequent enough and you can afford to fetch the entire
> buffer there.
> Indeed, a lot of this is still non-optimal - starting with generating
> candidate passwords on the CPU - but my proposals above is what still
> fits in the current formats interface best.  Like I said, a longer-term
> plan is to rework the interface, but the above is what you probably want
> to do for now.

this made me
i was able to speed up saltless password and fix some problems on nsldaps.
i'm waiting for some more test and if all goes well i'll send a new
patch for the git.


Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.