Date: Tue, 26 Jul 2011 17:41:37 +0200
From: magnum <>
Subject: Performance fluctuations

I'm trying to measure the performance boost from the "0010" patch but I 
can't see any. In fact, it seems to drop by less than 1% (it definitely 
should not). This brings up the old issue of fluctuations in performance 
between runs of the exact same binary.

This quote is from an old private thread I had with Jim about --test:

On 2011-07-05 21:01, JFoug wrote:
 > -test=10 hides 'some' of the variation. -test=4 if the system is more
 > stable. I usually run on Winblos with a ton of other shit running.
 > Windows does not task switch 'too' smoothly, and the timing resolution
 > is only 55ms. Thus, for a second or a couple of seconds, the timing is
 > flaky. I use 10s. You may be able to get by with less, but I think
 > default john is 2 or 3s, and I have seen that simply not be enough.
 > However, the code does look a little faster, and almost never slower.

I believe that on Linux, the timing resolution is a couple of orders of 
magnitude better than 55ms, and I get the same amount of fluctuation 
whether I run --test=1 or --test=20. Also, I have an idling dual-core 
system and only run john on one core, yet it still fluctuates wildly. I 
believe the code (in many formats) actually does run this much 
faster/slower between runs. This is an interesting issue. It must have 
to do with how things end up in memory, caches, alignment and so on. 
For a while, when I made mskrb5, I used __align__ heavily, not because 
it was needed but because I thought it would stabilize things (always 
ending up aligned, so a tad faster for various operations), but in the 
end I dropped it because it did not really help.

Another interesting thing is that more of the formats are rock stable 
when compiled with icc. During development of the latest mscash2 (dcc2) 
we had a version that would vary between 459 and 472 c/s when built 
with gcc, but *always* 496 c/s when built with icc. I wonder what 
causes that difference.

This is not just when running --test. I remember a set of test hashes I 
used with a specific dictionary (much like the test suite we have now) 
that would take anywhere between 30s and 1:30 to complete. I believe 
the performance does not vary *within* a run, only *between* runs.

To sum it up, I would like to find ways to "help" the compiler produce 
binaries with less fluctuating performance. Does anyone have a clue? I 
tried googling it to no avail.

