Date: Fri, 19 Apr 2013 20:32:56 +0200
From: magnum <>
Subject: Re: Got all dyna formats (except $1$ and $apr1$) working with OMP

On 19 Apr, 2013, at 6:19 , wrote:
> In the 3rd param method, we are calling omp_get_thread_num() 4 times for every 5760 candidates. For the one where the omp_get_thread_num() call was in the unicode getter/setter, omp_get_thread_num() was being called at least 11520 times per 5760 candidates! That could be GREATLY reduced (basically loop-invariant code motion). But using the 2nd method (newest), it is simply an inline function into an array, so a smart compiler will actually do the loop-invariant motion for us.

A very similar problem can be seen with OpenCL: on NVIDIA you should call get_global_id() only once and keep the result in a register, otherwise you'll get a performance hit. On AMD, the compiler does this for you.
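
To make the hoisting concrete, here is a minimal sketch in plain C of the OpenMP case. It is not the actual dynamic-format code: the scratch array, MAX_THREADS, BUF_LEN and both function names are hypothetical, and the fallback omp_get_thread_num() only exists so the sketch also builds without OpenMP. The same shape applies to an OpenCL kernel, where you would read get_global_id(0) once into a local variable.

```c
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so this sketch builds without OpenMP: a single thread, number 0. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_THREADS 64  /* hypothetical upper bound on thread count */
#define BUF_LEN     16  /* hypothetical per-thread scratch size */

/* Hypothetical per-thread scratch buffers, indexed by thread number. */
static unsigned char scratch[MAX_THREADS][BUF_LEN];

/* Naive version: asks for the thread number on every iteration. */
unsigned sum_naive(const unsigned char *in, size_t n)
{
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char *buf = scratch[omp_get_thread_num()];
        buf[0] = in[i];
        total += buf[0];
    }
    return total;
}

/* Hoisted version: the thread number cannot change inside the loop,
 * so it is looked up once and the buffer pointer stays in a local. */
unsigned sum_hoisted(const unsigned char *in, size_t n)
{
    const int tid = omp_get_thread_num();
    unsigned char *buf = scratch[tid];
    unsigned total = 0;
    for (size_t i = 0; i < n; i++) {
        buf[0] = in[i];
        total += buf[0];
    }
    return total;
}
```

Both functions return the same result; the second just does by hand what we would otherwise hope the optimizer does, which is the safer bet when the call is opaque to the compiler.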

Sometimes the optimizers do really astonishing things, sometimes they, well, don't.

