Date: Sun, 6 Sep 2015 12:02:07 -0500
From: JimF <jfoug@....net>
To: john-dev@...ts.openwall.com
Subject: Re: Large stack alignment

I fixed things here. We still must use mem_align(), but it is now a
macro that does this 'magic' by simple pointer readjustment, with no
writes to memory variables.

Here is the macro:

#define mem_align(a,b) (void*)(((char*)(a))+(((b)-1)-(((size_t)((char*)(a))-1)&((b)-1))))

Now to align, just build a buffer that contains align_wanted extra
bytes (here SIMD_ALIGN):

    unsigned long _buf[whatever_size + (SIMD_ALIGN)/sizeof(unsigned long)];

then use the macro:

    unsigned long *buf = mem_align(_buf, SIMD_ALIGN);

buf will be properly aligned somewhere in the range of (char*)_buf to
(char*)_buf + SIMD_ALIGN - 1.

On 9/6/2015 9:17 AM, Solar Designer wrote:
> On Sun, Sep 06, 2015 at 03:15:11PM +0200, magnum wrote:
>> On 2015-09-06 13:20, magnum wrote:
>>> So Lei reminded me of this:
>>> http://www.openwall.com/lists/john-dev/2015/08/24/8
>>>
>>> We have an issue for changing all stack allocs using MEM_ALIGN_SIMD to
>>> align ourselves.
>> This is fixed in 1bd8d9d.
> I think we (including me) still don't have adequate understanding of
> the problem. This commit fixes the instances where we were using gcc's
> alignment attributes, but it does not touch explicit vector variables on
> the stack such as SIMDmd5body()'s:
>
> vtype w[16*SIMD_PARA_MD5];
> vtype a[SIMD_PARA_MD5];
> vtype b[SIMD_PARA_MD5];
> vtype c[SIMD_PARA_MD5];
> vtype d[SIMD_PARA_MD5];
> vtype tmp[SIMD_PARA_MD5];
> vtype tmp2[SIMD_PARA_MD5];
> vtype mask;
>
> We're hoping these will be in registers, but when not will they be
> properly aligned for AVX2? I guess this is just as dependent on stack
> pointer alignment and gcc's capabilities as uses of gcc's alignment
> attribute were. (And if so, the 1bd8d9d commit makes little sense.)
>
> Also, when gcc spills AVX2 registers to stack, does it ensure proper
> alignment? Or does it use unaligned-capable instructions? Or neither?
>
> We need to figure all of this out.
>
> A related issue is library callbacks when our code is called by a
> library that was compiled with a smaller -mpreferred-stack-boundary or
> with non-gcc (and for an ABI permitting a smaller stack alignment than
> we need). This may be the cause of some of the problems, such as what
> Lei and Jim saw with OpenMP (the threads are started via OpenMP runtime
> library), even when the program initially started with proper stack
> alignment (which is also dependent not only on gcc, but on dynamic
> linker and libc startup code).
>
> We might want to consider reading up on and using -mstackrealign or/and
> -mincoming-stack-boundary. And maybe we'd be able to revert the
> explicit alignment commits then, which as I suggested above are likely
> not a complete solution anyway. I expect that forcing gcc to realign
> the stack would have performance impact, though. And then there are
> non-gcc compilers.
>
> It's a mess.
>
> Alexander
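
For reference, the mem_align() usage described at the top of this message can be sketched as below. This is only an illustration under stated assumptions, not the code from the tree: the value 32 for SIMD_ALIGN (what AVX2 would want), the N_WORDS size, and the names raw_buf, check_mem_align() and use_aligned_stack_buffer() are all made up for the example. The macro itself is copied as posted and assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Macro as posted: round pointer a up to the next multiple of b.
 * b must be a power of two. Pointer arithmetic only, no stores. */
#define mem_align(a,b) (void*)(((char*)(a))+(((b)-1)-(((size_t)((char*)(a))-1)&((b)-1))))

/* Assumption for this sketch: 32-byte alignment, as AVX2 would need.
 * The real SIMD_ALIGN is defined elsewhere in the source tree. */
#define SIMD_ALIGN 32
#define N_WORDS 64 /* hypothetical payload size, in unsigned longs */

/* Try mem_align() on 64 consecutive start addresses inside a pool and
 * check that each result is aligned and lies within [start, start + align).
 * Returns 1 if every case passes, 0 otherwise. Intended for align <= 64. */
int check_mem_align(size_t align)
{
    static char pool[256];
    size_t i;

    for (i = 0; i < 64; i++) {
        char *start = pool + i;
        char *p = mem_align(start, align);

        if ((uintptr_t)p % align != 0)
            return 0;
        if (p < start || (size_t)(p - start) >= align)
            return 0;
    }
    return 1;
}

/* Over-allocate a stack buffer by SIMD_ALIGN bytes, align a pointer into
 * it, then use N_WORDS elements through the aligned pointer.
 * Returns the sum 0 + 1 + ... + (N_WORDS - 1). */
unsigned long use_aligned_stack_buffer(void)
{
    unsigned long raw_buf[N_WORDS + SIMD_ALIGN / sizeof(unsigned long)];
    unsigned long *buf = mem_align(raw_buf, SIMD_ALIGN);
    unsigned long sum = 0;
    size_t i;

    /* The aligned pointer must be aligned and its payload must still fit. */
    assert((uintptr_t)buf % SIMD_ALIGN == 0);
    assert((char *)(buf + N_WORDS) <= (char *)raw_buf + sizeof(raw_buf));

    for (i = 0; i < N_WORDS; i++)
        buf[i] = i;
    for (i = 0; i < N_WORDS; i++)
        sum += buf[i];
    return sum;
}
```

Because the alignment is done by hand inside an oversized buffer, this does not rely on the incoming stack pointer alignment or on the compiler honoring an alignment attribute. It does nothing for compiler-managed vector locals or register spills, which is the open question raised in the quoted message.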