Date: Sat, 25 Jul 2015 21:26:34 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: musl@...ts.openwall.com
Subject: Re: Left-shift of negative number

On 2015-07-25 06:22, Rich Felker wrote:
> On Fri, Jul 17, 2015 at 05:28:58PM -0400, Rich Felker wrote:
>> On Fri, Jul 17, 2015 at 06:28:00PM +0000, Loïc Runarvot wrote:
>>>
>>> According to the C11 standard, left-shifting a negative integer
>>> is undefined behavior (6.5.7:4).
>>>
>>> This undefined behavior occurs in files src/multibyte/internal.c and
>>> src/multibyte/internal.h. At line 21 in the header
>>> (http://git.musl-libc.org/cgit/musl/tree/src/multibyte/internal.h?id=0f9c2666aca95eb98eb0ef4f4d8d1473c8ce3fa0#n21),
>>> the definition of the macro R allows a negative value in the
>>> expression ((a == 0x80) ? 0x40-b : -a) << 23.
>>>
>>> This happens in the source file at line 11
>>> (http://git.musl-libc.org/cgit/musl/tree/src/multibyte/internal.c?id=0f9c2666aca95eb98eb0ef4f4d8d1473c8ce3fa0#n11):
>>> when the macro is expanded as R(0x90, 0xc0), we have a != 0x80,
>>> so it tries to compute (-0x90) << 23, which is undefined behavior.
>>
>> Thank you. Reporting of such issues is very welcome, as it is the
>> intent in musl to avoid undefined behavior regardless of whether it's
>> believed to cause problems with current compilers. The cleanest
>> solution is probably to use unsigned arithmetic here (e.g. replace -a
>> with 0u-a or -(unsigned)a) but I'd like to look at the code in more
>> detail again and check all of the consequences before committing to a
>> particular approach to fixing it.
>
> This looks like the best approach, and the macro is only used in
> initializers so it was easy to confirm that the object file is not
> changed. I also considered replacing <<23 with *(1<<23), which is a
> standard idiom I'd like to promote for working around the standard's
> failure to define left-shift of negative numbers properly, but
> ensuring that the multiplication doesn't overflow is non-trivial
> without re-examining the logic, so I'd rather just work with unsigned
> arithmetic.
>
> I've gone ahead and made the change as commit
> fe7582f4f92152ab60e9523bf146fe28ceae51f6. If anything looks wrong,
> please let me know. Thanks again for the bug report.

The new definition of R:

  #define R(a,b) ((uint32_t)((a==0x80 ? 0x40u-b : 0u-a) << 23))

It implicitly converts a and b to unsigned (and triggers
-Wsign-conversion). Isn't it better to express this explicitly, e.g. by
moving the cast to uint32_t inside the conditional operator?

Or maybe it is more intuitive to move the negation outside the
conditional operator:

  #define R(a,b) (-(uint32_t)(a==0x80 ? b-0x40 : a) << 23)

While at it, maybe change -1 to -1u in the definition of C in internal.c
(it currently triggers -Wsign-compare)?

--
Alexander Cherepanov