Date: Wed, 17 Dec 2014 20:23:16 -0900
From: Royce Williams <royce@...ho.org>
To: john-users@...ts.openwall.com
Subject: Re: bleeding-jumbo: make -j random opencl errors

On Wed, Dec 17, 2014 at 8:24 AM, Solar Designer <solar@...nwall.com> wrote:
> On Wed, Dec 17, 2014 at 07:03:45AM -0900, Royce Williams wrote:
>> On two different 64-bit Linux gcc systems using NVIDIA OpenCL and CUDA
>> 6.5, a non-parallel make works fine, but parallel makes die randomly
>> with errors like the following, but with different errors on some
>> attempts.
>>
>> $ make -j -s
>> opencl_mscash2_fmt_plug.c:457:1: internal compiler error: Segmentation fault
>>  };
>
> Why do you use -j, as opposed to -j8 or whatever matches your machine?
> Is this a stress test?  Do you even have enough RAM for it?  I think not.

Heh.  Point taken.  Call it an inadvertent stress test.  I'd been
doing this for a while and never had a problem.  I typo'd it one day
(leaving off the number of cores), and since it worked well and
finished much faster, I just kept using it.  I realized that it was
doing a *lot* more "parallelizing" than before, but it seemed to be
fine until now.

> So this looks like a non-issue to me.  It is expected behavior to
> receive a SIGSEGV on an out of memory condition, as long as some memory
> overcommitment is allowed in the kernel.
>
>> IIRC, this was working a few days ago on at least one of the systems,
>> and neither has had this failure mode before.
>
> Maybe memory usage by this build or by something else on your system has
> increased.

I'm mildly curious about what changed -- but not enough to waste any
more of anyone's time on it, especially during a bugfix-only cycle.

> Just do not use -j without specifying a limit, like "-j8" or
> e.g. -j32 on our "super" machine.

Understood.

> We do have a large number of source
> files, so plain -j will result in excessive load and likely a slightly
> slower build (cache thrashing, extra context switches), and it may very
> well run out of memory unless you have lots installed.

Thanks -- that helps me understand the root cause.  And "lots" clearly
must mean something more than the 16G that I have in that system.
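
For anyone else who hits this, here is a rough sketch of picking a
bounded -j value instead of a bare -j.  The pick_jobs helper is
hypothetical (it is not part of the john tree), and the 1024 MB per
compiler process is only a guess, not a measured figure for this build.
It takes the smaller of the core count and the number of jobs that fit
in RAM, with a floor of one job.

```shell
#!/bin/sh
# Hypothetical helper: choose a job count for "make -jN".
# Usage: pick_jobs CORES MEM_MB PER_JOB_MB
# Prints the smaller of CORES and MEM_MB / PER_JOB_MB, never less than 1.
pick_jobs() {
    cores=$1
    mem_mb=$2
    per_job_mb=$3

    # How many jobs fit in memory at the assumed per-job footprint.
    by_mem=$((mem_mb / per_job_mb))
    if [ "$by_mem" -lt 1 ]; then
        by_mem=1
    fi

    # Cap at the number of cores.
    if [ "$by_mem" -lt "$cores" ]; then
        echo "$by_mem"
    else
        echo "$cores"
    fi
}

# Example with fixed numbers: 8 cores, 16 GB RAM, 1 GB assumed per job.
echo "make -j$(pick_jobs 8 16384 1024) -s"
```

On a real Linux system the first two arguments could come from nproc
and the MemTotal line of /proc/meminfo (converted from kB to MB).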

Back to the real bugs - sorry, guys.

Royce
