Date: Sun, 3 Feb 2008 16:25:08 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: faster hash file loading

On Sat, Feb 02, 2008 at 06:32:44PM +0000, helleye wrote:
> i think that if loader.c will use ifstream::getline
> the loading might be lot faster

I doubt it, although this is system-specific. The loader actually does
quite a lot of work - parsing the input lines, validating and decoding
hash encodings, combining hashes with matching salts, eliminating
duplicate hashes (when "single crack" mode is not to be used), updating
linked lists and hash tables, etc. - yet it is quite fast, given this
amount of work. Typically, it can process hundreds of thousands of
input lines (or perhaps even millions on newer systems) in under a
minute.

If you really want to optimize the buffered file reads, rather than
actual processing of the data (which is what most processor time is
probably spent on), then the way to do so would be by using lower-level
C library functions in a way that you think is more optimal for this
specific task and for your operating system. For example, you can try
to use the read(2) syscall directly and implement your own buffered
input with no support for seeks and writes - and you'd use a larger
buffer (although you can also alter the buffer size with stdio). Or you
could mmap(2) your file into the process address space, then use
madvise(2) with the MADV_SEQUENTIAL flag and have the loader go over the
address space range, avoiding the need for any explicit read buffer
(this approach is only available on some operating systems).

However, let me repeat: I don't expect any significant speedup from
this, and especially not consistent speedup across a wide range of
operating systems, their versions, and underlying hardware (cache sizes,
their relative speeds, etc. will affect optimal buffer size). That
said, I have not done any benchmarks of the loader on Windows, which is
what you appear to be using.
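For illustration only, the mmap(2) plus madvise(2) approach described
above might be sketched as follows. This is not code from loader.c: the
function name count_lines_mmap is hypothetical, and it merely counts
newline-terminated lines where a real loader would parse each one. It
assumes a POSIX-like system that provides mmap(2) and madvise(2).

```c
#define _DEFAULT_SOURCE          /* assumption: needed by glibc in strict ISO modes to expose madvise() */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of John the Ripper): map the file
 * read-only, hint sequential access, and walk the mapping line by line.
 * Returns the number of newline-terminated lines, or -1 on error.
 */
long count_lines_mmap(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {       /* mmap() rejects zero-length mappings */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *base = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                   /* the mapping remains valid after close() */
    if (base == MAP_FAILED)
        return -1;

    /* Advisory only: if the hint is rejected, the walk below still works. */
    (void)madvise(base, len, MADV_SEQUENTIAL);

    long lines = 0;
    const char *p = base;
    const char *end = base + len;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        if (nl == NULL)
            break;               /* trailing data without a newline is not counted */
        lines++;                 /* a real loader would parse [p, nl) here */
        p = nl + 1;
    }

    munmap(base, len);
    return lines;
}
```

Timing a call to this function against a plain fgets() loop over the
same password file would show whether the read path matters at all on a
given system, separately from the parsing and hashing work.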
If you suspect stdio to be the bottleneck, then one easy thing to try is
to provide your own and much larger buffer with setvbuf(3). Please give
this a try and post your results in here (specific load times before and
after the change, as well as what buffer size you found to be optimal
for your system).

> g++ -o john.exe DES_fmt.o ...
...
> DES_fmt.o: file not recognized: File format not recognized

Maybe you did not "make clean", resulting in a mixed build.

> any clue how to use ifstream in loader please ?

You really should not be doing this, and if you are - you're on your own
with it.

-- 
Alexander Peslyak <solar at openwall.com>
GPG key ID: 5B341F15  fp: B3FB 63F4 D7A3 BCCC  6F6E FC55 A2FC 027C 5B34 1F15
http://www.openwall.com - bringing security into open computing environments
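As a rough sketch of the setvbuf(3) suggestion in the message above, and
not the actual loader.c code: the names count_lines_stdio and
LOAD_BUF_SIZE are hypothetical, the 1 MiB size is an arbitrary starting
point rather than a recommendation, and the loop only counts
newline-terminated lines where the real loader would parse them.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LOAD_BUF_SIZE ((size_t)1 << 20)   /* 1 MiB, an assumed value to experiment with */

/*
 * Hypothetical helper: read `path` with fgets(), optionally giving stdio
 * a caller-supplied buffer of `bufsize` bytes (0 keeps the default).
 * Returns the number of newline-terminated lines, or -1 on error.
 */
long count_lines_stdio(const char *path, size_t bufsize)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char *iobuf = NULL;
    if (bufsize > 0) {
        iobuf = malloc(bufsize);
        /* setvbuf() must be called before any other operation on the stream. */
        if (iobuf == NULL || setvbuf(f, iobuf, _IOFBF, bufsize) != 0) {
            free(iobuf);         /* fall back to the default stdio buffer */
            iobuf = NULL;
        }
    }

    char line[4096];
    long lines = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        size_t n = strlen(line);
        /* Over-long lines arrive in several chunks; count only the last one. */
        if (n > 0 && line[n - 1] == '\n')
            lines++;
    }

    fclose(f);                   /* close the stream before freeing its buffer */
    free(iobuf);
    return lines;
}
```

Calling count_lines_stdio(path, 0) and then
count_lines_stdio(path, LOAD_BUF_SIZE) on the same file, with a timer
around each call, gives the kind of before-and-after comparison the
message asks for.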