Date: Fri, 9 Sep 2011 13:06:02 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: Strange timings in pdf

Dhiru,

Pdf has some 'strange' timings: multi-salt is slower than single salt. Usually, this is the other way around. The problem is that all the work is being done within the set_salt() function. Normally, this would be much better done in the get_salt() function. Here is why:

Init of john:
Load each hash line. Call get_salt() and store off the data this function returns (the salt).

Running of john (word testing):
Load the max allowable passwords for the format. Then, for each salt that was returned by the calls to get_salt() (done during john loading), call set_salt(); crypt_all(); cmp_all(). This sequence is repeated once for each salt, for every block of passwords.

Thus, if there is 'work' to be done (on the salt only), it is best to do it in get_salt(), so that set_salt() can be as fast as possible. This will speed up the processing of the format (sometimes greatly).

It does not matter how fast the get_salt() function is, within reason. It is only run 1 time against each salt. However, the set_salt() function is run one time against each salt, for each 'set' of passwords loaded. Thus, if your format processes 1 password at a time, and you have 10 salts, and run against 10 billion candidate passwords, then set_salt() is called 100 billion times. However, in this same run, get_salt() is only called 10 times, at john loading.

So, if the processing of the raw salt string takes 0.0000001s, but assigning a pointer takes 0s, then having get_salt() build this object costs 0.0000001s per salt (10 salts, so 0.000001s total), and the calls to set_salt() take 0s (I know this is not true, but bear with me). However, if get_salt() simply returns the unprocessed string, and set_salt() is used to do the processing each time, then the amount of time used in this faked up example is 100 billion * 0.0000001s, or 10k seconds (about 3 hours).
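The call pattern above can be sketched in C as follows. This is only a rough illustration, not the pdf_fmt code and not john's actual format interface: the names demo_salt, demo_get_salt(), demo_set_salt() and demo_run(), and the "V*R*L" salt text layout, are all made up for the example. It also does not model how john itself stores the salt returned by get_salt(). The two counters simply show that the parse runs once per salt, while set_salt runs once per salt per block of passwords.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parsed-salt layout; the real pdf_fmt fields differ. */
struct demo_salt {
    int version;
    int revision;
    int key_length;
};

static struct demo_salt cur_salt;     /* stands in for the format's static data */
static unsigned long parse_calls;     /* how many times the parse ran */
static unsigned long set_salt_calls;  /* how many times set_salt ran */

/* Expensive part: parse the (made-up) "V*R*L" text. Once per loaded hash. */
static void *demo_get_salt(const char *ciphertext)
{
    struct demo_salt *s = malloc(sizeof(*s));

    if (s == NULL)
        return NULL;
    parse_calls++;
    if (sscanf(ciphertext, "%d*%d*%d",
               &s->version, &s->revision, &s->key_length) != 3) {
        free(s);
        return NULL;
    }
    return s;
}

/* Cheap part: cast and copy, no parsing. Once per salt per block. */
static void demo_set_salt(void *salt)
{
    assert(salt != NULL);
    set_salt_calls++;
    memcpy(&cur_salt, (struct demo_salt *)salt, sizeof(cur_salt));
}

/* Simulated run: load all salts once, then loop blocks x salts. */
static int demo_run(const char **hashes, size_t n_salts,
                    unsigned long n_blocks)
{
    void **salts = calloc(n_salts, sizeof(*salts));
    unsigned long b;
    size_t i;
    int rc = 0;

    if (salts == NULL)
        return -1;

    for (i = 0; i < n_salts; i++) {           /* "init of john" */
        salts[i] = demo_get_salt(hashes[i]);
        if (salts[i] == NULL) {
            rc = -1;
            goto out;
        }
    }

    for (b = 0; b < n_blocks; b++) {          /* "running of john" */
        for (i = 0; i < n_salts; i++) {
            demo_set_salt(salts[i]);
            /* crypt_all(); cmp_all(); would go here */
        }
    }

out:
    for (i = 0; i < n_salts; i++)
        free(salts[i]);                       /* free(NULL) is a no-op */
    free(salts);
    return rc;
}
```

Calling demo_run() with 3 salt strings and 1000 blocks leaves parse_calls at 3 and set_salt_calls at 3000. Scale that to 10 salts and 10 billion one-password blocks and you get the 10 versus 100 billion split from the example above.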
The amount of code change required to get pdf_fmt into the 'proper' john format is not that much. Likely, simply have get_salt() do the work currently being done in set_salt(), and instead of setting the 'static' data of the format, have it set the data in some allocated structure, then return the address of that structure. Then in set_salt(), simply cast the void* to this marshalling structure type, and copy the data from that structure into the format's static data. That way, you do not have to interpret the full string, doing all the parsing each time.

The faster you can get this set_salt() function to run, the better your 'multi-salt' times will be, and the faster the format WILL run when someone is trying to run it against a few hashes at the same time. The single salt timings will not matter one way or the other. For a single salt, the salt is loaded one time, and then the code that does the password testing simply loads candidates, calls crypt_all(); cmp_all(); and then loads the next batch of passwords.

If done like this, I bet those reversed numbers will get back to the 'right' order, and the multi salt test times will likely be 2x or 3x better than they are today.

I am not making any changes myself, because I usually stay away from other developers' code if they are likely to still be developing that code.

Another note is that this format HAS had some changes to it (porting, and now, a tiny change to the BENCHMARK_TIMING), so you may want to get to the current version prior to starting any coding changes.

Jim.