Message-ID: <20111216231340.GA23495@openwall.com>
Date: Sat, 17 Dec 2011 03:13:40 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: 1.7.9's --external + OpenMP fails on Cygwin
On Sat, Dec 17, 2011 at 01:46:48AM +0400, Solar Designer wrote:
> src/winsup/cygwin/thread.cc:
>
> int
> pthread_mutex::init (pthread_mutex_t *mutex,
>                      const pthread_mutexattr_t *attr,
>                      const pthread_mutex_t initializer)
> {
>   if (attr && !pthread_mutexattr::is_good_object (attr))
>     return EINVAL;
>
>   mutex_initialization_lock.lock ();
>   if (initializer == NULL || pthread_mutex::is_initializer (mutex))
>
> Notice how the not yet initialized mutex is checked with
> "pthread_mutex::is_initializer (mutex)". And yes, it catches faults:
...
This was close, but not quite it. The same approach is used in other
parts of the Cygwin threads code, including in:
int
semaphore::init (sem_t *sem, int pshared, unsigned int value)
{
  /*
     We can't tell the difference between reinitialising an
     existing semaphore and initialising a semaphore who's
     contents happen to be a valid pointer
   */
  if (is_good_object (sem))
    {
      paranoid_printf ("potential attempt to reinitialise a semaphore");
    }
where:
inline bool
semaphore::is_good_object (sem_t const * sem)
{
  if (verifyable_object_isvalid (sem, SEM_MAGIC) != VALID_OBJECT)
    return false;
  return true;
}
While paranoid_printf() is probably not triggered, the check itself often
faults when it dereferences the invalid pointer stored in the
not-yet-initialized semaphore. And apparently there's something wrong
with how that fault is handled.
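
To illustrate why the check can fault, here is a minimal sketch in C. It is not the Cygwin code: demo_object, demo_sem_t, DEMO_SEM_MAGIC and demo_is_good_object() are hypothetical stand-ins, and unlike verifyable_object_isvalid() the sketch has no fault handling at all. It assumes only what the quoted source suggests: a sem_t holds a pointer to the real object, and validity is judged by a magic value read through that pointer.

```c
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the Cygwin definitions. */
#define DEMO_SEM_MAGIC 0xdf0df04cU  /* constant compared in the disassembly in this message */

struct demo_object {
    void    *reserved;  /* placeholder for whatever precedes the magic field */
    uint32_t magic;     /* identifies the object type */
};

/* As with sem_t here, the user-visible handle is just a pointer. */
typedef struct demo_object *demo_sem_t;

/*
 * Returns 1 if *sem points at an object carrying the expected magic,
 * 0 otherwise.  The read of obj->magic goes through whatever pointer
 * value the caller's handle happens to contain.  If the handle was never
 * initialized, that value is garbage and the read itself may fault
 * before any comparison takes place.
 */
int demo_is_good_object(const demo_sem_t *sem)
{
    const struct demo_object *obj = *sem;
    return obj->magic == DEMO_SEM_MAGIC;
}
```

Calling demo_is_good_object() on a handle that points at a live object is harmless. Calling it on an uninitialized handle is exactly the case discussed above: the outcome depends on the stale bytes in the handle, so the real code has to rely on catching the fault.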
Since this check is not needed, I binary-patched it out of my copy of
cygwin1.dll. The change, as seen with "objdump -d" and "diff -u":
610ecff6: e8 75 d4 06 00 call 6115a470 <__Z11__set_errnoPKcii>
610ecffb: b8 ff ff ff ff mov $0xffffffff,%eax
610ed000: eb 42 jmp 610ed044 <__ZN9semaphore4initEPPS_ij+0x194>
-610ed002: 8b 06 mov (%esi),%eax
-610ed004: 81 78 04 4c f0 0d df cmpl $0xdf0df04c,0x4(%eax)
+610ed002: 33 c0 xor %eax,%eax
+610ed004: 40 inc %eax
+610ed005: 90 nop
+610ed006: 90 nop
+610ed007: 90 nop
+610ed008: 90 nop
+610ed009: 90 nop
+610ed00a: 90 nop
610ed00b: 0f 85 0f ff ff ff jne 610ecf20 <__ZN9semaphore4initEPPS_ij+0x70>
610ed011: 8b 95 14 ff ff ff mov -0xec(%ebp),%edx
610ed017: 64 a1 04 00 00 00 mov %fs:0x4,%eax
After this change, the problem went away.
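
For reference, the patch replaces the 9 bytes of the load and compare with 9 bytes that never touch memory: "xor %eax,%eax" followed by "inc %eax" leaves 1 in %eax with the zero flag clear, the nops do not alter flags, and so the jne that follows is always taken. The sketch below shows one way such a same-length patch could be applied to a buffer holding the file contents. apply_patch() and its offset parameter are hypothetical; the message does not say how the bytes were edited, and the file offset is not the same thing as the virtual address 0x610ed002 shown by objdump.

```c
#include <stddef.h>
#include <string.h>

/* Bytes copied from the objdump/diff output in this message. */
static const unsigned char original_bytes[] = {
    0x8b, 0x06,                               /* mov  (%esi),%eax */
    0x81, 0x78, 0x04, 0x4c, 0xf0, 0x0d, 0xdf  /* cmpl $0xdf0df04c,0x4(%eax) */
};
static const unsigned char patched_bytes[] = {
    0x33, 0xc0,                         /* xor %eax,%eax */
    0x40,                               /* inc %eax: result is 1, ZF clear */
    0x90, 0x90, 0x90, 0x90, 0x90, 0x90  /* six nops pad to the same length */
};

/*
 * Hypothetical helper: overwrite the original sequence at `offset` inside
 * `image` with the patched one.  Returns 0 on success, -1 if the lengths
 * differ, the range is out of bounds, or the expected original bytes are
 * not found there (wrong offset, other DLL build, or already patched).
 */
int apply_patch(unsigned char *image, size_t image_len, size_t offset)
{
    if (sizeof patched_bytes != sizeof original_bytes)
        return -1;
    if (offset > image_len || image_len - offset < sizeof original_bytes)
        return -1;
    if (memcmp(image + offset, original_bytes, sizeof original_bytes) != 0)
        return -1;
    memcpy(image + offset, patched_bytes, sizeof patched_bytes);
    return 0;
}
```

Verifying the original bytes before writing is the important part: the same offset in a different cygwin1.dll build would very likely hold unrelated code.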
Another workaround that worked was to add:
free(calloc(1, 0x100));
free(calloc(1, 0x1000));
free(calloc(1, 0x10000));
free(calloc(1, 0x100000));
right before one of the parallel regions where the problem was otherwise
triggered, but this obviously has a performance impact.
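
A minimal sketch of where these calls would sit relative to a parallel region. prewarm_heap() and sum_squares() are hypothetical names with a made-up workload, not code from John the Ripper, and the sketch makes no claim about why the allocations hide the problem. Without OpenMP enabled the pragma is ignored and the loop simply runs serially.

```c
#include <stdlib.h>

/*
 * Hypothetical wrapper around the four calls above: allocate zeroed
 * blocks of increasing size and free them right away.  free(NULL) is a
 * no-op, so a failed calloc() is harmless here.
 */
static void prewarm_heap(void)
{
    free(calloc(1, 0x100));
    free(calloc(1, 0x1000));
    free(calloc(1, 0x10000));
    free(calloc(1, 0x100000));
}

/* Hypothetical workload: sum of i*i for 0 <= i < n, computed in parallel. */
long sum_squares(int n)
{
    long total = 0;
    int i;

    prewarm_heap();  /* right before the parallel region, as described */

#pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++)
        total += (long)i * i;

    return total;
}
```

The cost is four allocate, zero and free cycles, including a 1 MB one, on every entry to the region, which is where the performance impact comes from.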
Alexander