Date: Tue, 17 Mar 2015 00:40:40 +0100
From: magnum <>
Subject: malloc not returning SIMD-aligned (was: Re: [GSoC] building
 JtR for MIC)

On 2015-03-16 17:58, magnum wrote:
> On 2015-03-15 19:14, Solar Designer wrote:
>> If it runs fine, then the problem was likely caused by memory allocation
>> by the formats that were tested previously.  Some of them intentionally
>> allocate memory such that it can't be freed.  This lowers overhead for
>> when john is run for cracking, with just one specific format, but it
>> results in increased total memory usage for a many-formats self-test.
>> We should only be making such non-freed allocations for things that are
>> small.  For larger allocations, formats should free the memory in their
>> done() methods.  Perhaps some formats allocate "too much" memory (at
>> least when the number of threads is this large) and don't free it - and
>> perhaps you should identify them and change them to free the memory.
> This has been on my to-do list for some time. The "worst" formats use an
> OMP_SCALE of eg. 128K. Even at just 16K and with a PLAINTEXT_LENGTH of
> 125, the key buffer alone is nearly half a gigabyte for 240 threads.
> I just committed fixes for a good number of the worst offenders.
> Hopefully this will fix the problem already but we should fix some more
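
As a rough check of the "nearly half a gigabyte" figure quoted above, here is a minimal sketch of the arithmetic. It assumes one (PLAINTEXT_LENGTH + 1)-byte slot per candidate key and OMP_SCALE candidates per thread; key_buffer_bytes() is a hypothetical helper, not a function in the tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: bytes needed for a flat key buffer.
 * Assumption: each candidate gets plaintext_length + 1 bytes
 * (room for a terminating NUL) and each thread handles
 * omp_scale candidates.
 */
static uint64_t key_buffer_bytes(uint64_t threads, uint64_t omp_scale,
                                 uint64_t plaintext_length)
{
    return threads * omp_scale * (plaintext_length + 1);
}
```

With 240 threads, an OMP_SCALE of 16K (16384) and a PLAINTEXT_LENGTH of 125 this gives 495452160 bytes, that is 472.5 MiB.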


It turns out that some versions of malloc() (at least 32-bit ones) may
return a buffer aligned to only 8 bytes even on a SIMD-capable machine
(apparently "any object" in reality means "any object not larger than a
double"). This happens even on Linux. I did not see that coming :-(
Right now bleeding is broken for 32-bit x86. I may revert the changes
but I'm trying not to move too fast here.
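
For anyone who wants to check what their own malloc() returns, a minimal sketch of an alignment test follows. The helper names are made up for illustration, and the 16-byte figure in the usage note assumes SSE2-sized vectors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if p is aligned to `alignment` bytes.
 * `alignment` must be a non-zero power of two.
 */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/*
 * Hypothetical helper: largest power-of-two alignment, up to `cap`,
 * that the pointer happens to satisfy.
 */
static size_t pointer_alignment(const void *p, size_t cap)
{
    size_t a = 1;

    while (a < cap && is_aligned(p, a * 2))
        a *= 2;
    return a;
}
```

Calling is_aligned(malloc(n), 16) on an affected 32-bit build would be expected to return 0 at least some of the time, while the same call with 8 should always return 1 there.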

So we are now considering the alternatives. One is to simply use
posix_memalign(). That one is "IEEE Std 1003.1-2001 (``POSIX.1'')"
whatever that means in practice. That would be the easiest fix.
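
A minimal sketch of what the posix_memalign() route could look like. SIMD_ALIGN and alloc_simd() are hypothetical names, and 16 bytes is only an assumed vector width; the real value would depend on the SIMD instruction set in use.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for the posix_memalign() prototype */
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SIMD_ALIGN 16   /* assumption: SSE2-sized vectors */

/*
 * Hypothetical wrapper: returns a SIMD_ALIGN-aligned buffer, or NULL
 * on failure. The result can be released with plain free().
 */
static void *alloc_simd(size_t size)
{
    void *p = NULL;

    /* posix_memalign() returns 0 on success, an error number otherwise */
    if (posix_memalign(&p, SIMD_ALIGN, size) != 0)
        return NULL;
    return p;
}
```

The nice part is that callers keep a single pointer and release it with free() as before. The alignment passed to posix_memalign() has to be a power of two and a multiple of sizeof(void *), which 16 satisfies on both 32-bit and 64-bit builds.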

Another option is to use a new function in misc.h that does almost the
same, but it has the drawback that we need to keep track of a second
pointer for each buffer: the one malloc() originally returned, which
obviously is the one to pass to free(). This is totally doable but the
code gets a little less nice.
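
To make the two-pointer bookkeeping concrete, here is a minimal sketch of such a helper. The name alloc_aligned() and its signature are assumptions for illustration, not the function that would actually go into misc.h.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: over-allocate with malloc() and round the
 * address up to the requested alignment (a non-zero power of two).
 * *base receives the pointer malloc() returned; that is the one
 * that must later be passed to free(). Returns NULL on failure.
 */
static void *alloc_aligned(size_t size, size_t alignment, void **base)
{
    uintptr_t addr;
    void *raw;

    *base = NULL;
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        return NULL;
    if (size > SIZE_MAX - (alignment - 1))
        return NULL;                    /* size + padding would overflow */

    raw = malloc(size + alignment - 1); /* worst-case padding */
    if (raw == NULL)
        return NULL;

    *base = raw;
    addr = ((uintptr_t)raw + alignment - 1) & ~(uintptr_t)(alignment - 1);
    return (void *)addr;
}
```

A caller would use the returned pointer for the data and keep the base pointer around only for cleanup, which is the extra bookkeeping mentioned above: free(base), never free() on the aligned pointer.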

A third option, which has several unrelated benefits, is to finally
implement the "malloc_tiny with domains" we have discussed over the years.

Any insights, ideas or suggestions while we think this over?

