Date: Tue, 17 Mar 2015 00:40:40 +0100
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: malloc not returning SIMD-aligned (was: Re: [GSoC] building JtR for MIC)

On 2015-03-16 17:58, magnum wrote:
> On 2015-03-15 19:14, Solar Designer wrote:
>> If it runs fine, then the problem was likely caused by memory allocation
>> by the formats that were tested previously. Some of them intentionally
>> allocate memory such that it can't be freed. This lowers overhead for
>> when john is run for cracking, with just one specific format, but it
>> results in increased total memory usage for a many-formats self-test.
>> We should only be making such non-freed allocations for things that are
>> small. For larger allocations, formats should free the memory in their
>> done() methods. Perhaps some formats allocate "too much" memory (at
>> least when the number of threads is this large) and don't free it - and
>> perhaps you should identify them and change them to free the memory.
>
> This has been on my to-do list for some time. The "worst" formats use an
> OMP_SCALE of eg. 128K. Even at just 16K and with a PLAINTEXT_LENGTH of
> 125, the key buffer alone is nearly half a gigabyte for 240 threads.
>
> I just committed fixes for a good number of the worst offenders.
> Hopefully this will fix the problem already but we should fix some more

Solar,

It turns out that some versions of (at least 32-bit) malloc() may return
a buffer aligned to only 8 bytes even on a SIMD-capable machine
(apparently "any object" in reality means "any object not larger than a
double"). This happens even on Linux. I did not see that coming :-(

Right now bleeding is broken for 32-bit x86. I may revert the changes
but I'm trying not to move too fast here.

So we are now considering the alternatives. One is to simply use
posix_memalign(). That one is "IEEE Std 1003.1-2001 (``POSIX.1'')"
whatever that means in practice. That would be the easiest fix.

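For illustration, the posix_memalign() route could look roughly like the sketch below. The helper name alloc_simd() and the SIMD_ALIGN value of 16 are assumptions made for this example, not existing JtR code; wider vector units would need a larger alignment.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* request the posix_memalign() prototype */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 16 bytes is enough for SSE2-sized vectors. */
#define SIMD_ALIGN 16

/*
 * Hypothetical helper (not an existing JtR function): returns a buffer
 * aligned to SIMD_ALIGN, or NULL on failure. The result can be passed
 * directly to free().
 */
static void *alloc_simd(size_t size)
{
    void *p = NULL;

    /* posix_memalign() requires a power-of-two alignment that is also
       a multiple of sizeof(void *). */
    assert((SIMD_ALIGN & (SIMD_ALIGN - 1)) == 0);
    assert(SIMD_ALIGN % sizeof(void *) == 0);

    /* Returns 0 on success and an error number otherwise; errno is
       not set. */
    if (posix_memalign(&p, SIMD_ALIGN, size) != 0)
        return NULL;

    return p;
}
```

A caller would do something like `buf = alloc_simd(n)` and later `free(buf)`, so no extra bookkeeping is needed on platforms that provide the function.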
Another option is to add a new function in misc.h that does almost the
same thing, with the drawback that we need to keep track of a second
pointer for each buffer: the original, unaligned one, which obviously is
the one to pass to free(). This is totally doable but the code gets a
little less nice.

A third option, which has several unrelated benefits, is to finally
implement the "malloc_tiny with domains" we have discussed over the
years.

Any insights, ideas or suggestions while we think this over?

magnum
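A minimal sketch of the second option (the portable helper with a second pointer), assuming a made-up name alloc_align_sketch(); the actual misc.h function may well look different. It over-allocates with plain malloc(), rounds the address up, and hands back the raw pointer separately for free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: returns a pointer aligned to 'align' (which must
 * be a power of two) and stores the raw malloc() result in *to_free.
 * Only *to_free may be passed to free(); the aligned pointer may not.
 */
static void *alloc_align_sketch(size_t size, size_t align, void **to_free)
{
    unsigned char *raw;
    uintptr_t addr, aligned;

    assert(align != 0 && (align & (align - 1)) == 0);

    *to_free = NULL;

    /* Guard against overflow in the padded size. */
    if (size > SIZE_MAX - (align - 1))
        return NULL;

    /* Over-allocate by up to align - 1 bytes of padding. */
    raw = malloc(size + align - 1);
    if (!raw)
        return NULL;

    *to_free = raw;

    /* Round the address up to the next multiple of 'align'. */
    addr = (uintptr_t)raw;
    aligned = (addr + align - 1) & ~(uintptr_t)(align - 1);

    return raw + (aligned - addr);
}
```

The cost mentioned above is visible in the signature: every caller has to carry both the aligned pointer it uses and the raw pointer it later frees.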