Date: Mon, 20 Jan 2020 00:34:31 +0300
From: Alexander Cherepanov <ch3root@...nwall.com>
To: musl@...ts.openwall.com
Subject: Re: Minor style patch to exit.c

On 19/01/2020 19.39, Rich Felker wrote:
> On Sun, Jan 19, 2020 at 07:33:08PM +0300, Alexander Cherepanov wrote:
>> On 19/01/2020 17.46, Alexander Monakov wrote:
>>> On Sun, 19 Jan 2020, Alexander Cherepanov wrote:
>>>
>>>> Couldn't _start be defined as an array? Then separate values could be accessed
>>>> simply as elements of this array. And casts to integers could be limited to
>>>> calculating the number of elements, the terminating value or something.
>>>
>>> Yeah, I think usually such linker-provided symbols are declared as
>>> extern arrays. I'm surprised that isn't the case in musl.  I don't think
>>> declaring them as arrays helps with making casts pedantically suitable for
>>> calculating number of elements though - as you said, any bijection between
>>> intptr_t and pointers would be a valid implementation of a cast, you're not
>>
>> Well, we want to use some outside info from C, so there may be no
>> pedantic way to do this. Let's see, we know that the _end array
>> follows the _start array in memory. This means that &_start[i] ==
>> &_end[0] for some i. But different provenance of the pointers means
>> that we cannot do it just like that. Adding a cast should fix this.
>> Summarizing, it should look like this:
>>
>> for (size_t i = 0; (uintptr_t)&_start[i] != (uintptr_t)&_end[0]; i++)
>>
>> or
>>
>> for (type *p = _start; (uintptr_t)p != (uintptr_t)_end; p++)
> 
> This works for forward walk, not backwards walk.

Oops, then asm barriers look more attractive.
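
A minimal sketch of the two walks, for illustration only. The parameters
start and end stand in for the linker-provided _start and _end, and
func_t, launder, count_forward and count_backward are hypothetical names
that do not come from musl. The empty asm statement is a GCC/Clang
extension, assumed here as one way to hide a pointer's origin from the
optimizer before stepping backwards from the end.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*func_t)(void);

/* Hypothetical helper: an empty asm statement that takes p as an
   input/output register operand, so the optimizer can no longer
   assume which object p was derived from (GCC/Clang extension). */
static inline func_t *launder(func_t *p)
{
    __asm__ volatile ("" : "+r"(p));
    return p;
}

/* Forward walk: p is derived from start and the loop stops when its
   address, compared as an integer, reaches the address of end. */
static size_t count_forward(func_t *start, func_t *end)
{
    size_t n = 0;
    for (func_t *p = start; (uintptr_t)p != (uintptr_t)end; p++)
        n++;
    return n;
}

/* Backward walk: p would be derived from end and stepped below it,
   so the pointer is passed through the barrier first. */
static size_t count_backward(func_t *start, func_t *end)
{
    size_t n = 0;
    for (func_t *p = launder(end); (uintptr_t)p != (uintptr_t)start; p--)
        n++;
    return n;
}
```

With a single backing array both walks count the same number of slots;
the interesting case is the one where start and end are separate
linker symbols, which this sketch cannot reproduce on its own.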

>>> guaranteed that (intptr_t)&a[i] == (intptr_t)a + i * sizeof *a.
>>
>> While you are inside one object, I think this should be safe in
>> practice. For gcc, this is more or less guaranteed by [3]. BTW there
>> is an explicit restriction there:
>>
>> "When casting from pointer to integer and back again, the resulting
>> pointer must reference the same object as the original pointer,
>> otherwise the behavior is undefined. That is, one may not use
>> integer arithmetic to avoid the undefined behavior of pointer
>> arithmetic as proscribed in C99 and C11 6.5.6/8."
>>
>> [3] https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html
> 
> GCC is badly wrong here, and it breaks XOR linked lists and other
> things. 

Why is that? Integers (and pointers, as it turned out) can have several
provenances as tracked by gcc, so XOR linked lists should be fine.
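
For reference, a small sketch of the kind of XOR linked list under
discussion. The names struct node, xor_list_build and xor_list_sum are
made up for this example. It assumes that a null pointer converts to 0
as uintptr_t and back, which holds on mainstream platforms but is
implementation-defined. Whether a given compiler is entitled to
miscompile such code is exactly the open question here.

```c
#include <stddef.h>
#include <stdint.h>

struct node {
    int value;
    uintptr_t link;   /* (uintptr_t)prev ^ (uintptr_t)next */
};

/* Link n separately allocated nodes into an XOR list in the given order.
   Assumption: 0 stands for "no neighbour" (null pointer as integer). */
static void xor_list_build(struct node **nodes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uintptr_t prev = i > 0     ? (uintptr_t)nodes[i - 1] : 0;
        uintptr_t next = i + 1 < n ? (uintptr_t)nodes[i + 1] : 0;
        nodes[i]->link = prev ^ next;
    }
}

/* Walk the list from either end and sum the values. Each step recovers
   the next node's address from an integer that mixes two pointers. */
static long xor_list_sum(struct node *head)
{
    long sum = 0;
    uintptr_t prev = 0;
    struct node *cur = head;
    while (cur) {
        sum += cur->value;
        uintptr_t next = cur->link ^ prev;
        prev = (uintptr_t)cur;
        cur = (struct node *)next;
    }
    return sum;
}
```

Every pointer that comes out of the XOR is built from the integer
values of two different objects, which is the situation the gcc
wording about "the same object" has to deal with.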

> It's also worded imprecisely.

Sure.

> What does it mean if arithmetic
> is performed on the value between the cast and cast back? What if two
> pointers go into the arithmetic, but complex mathematical relations
> result in one of the original values coming out, and the compiler can
> only "see" the other pointer going in? Will it then wrongly assume
> that the result points to the same object as the pointer it "saw" go
> in?

I looked into exactly this about 3 weeks ago :-) Rediscovered an old gcc 
bug and found that the problem happens even without any casts -- see 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49330#c28 .

> This whole provenance thing is a trashfire.

Pluses of the provenance thing: `a[i] = 1;` could be moved over `b[j] =
2;` when `a` and `b` are different arrays while `i` and `j` are unknown.
Minuses of the provenance thing: slight inconvenience in cases like the
one with _start & _end. The pluses seem to outweigh the minuses. Did I
miss something important?
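
A tiny illustration of the "plus" case, with a hypothetical function
store_both. Because `a` and `b` are distinct objects, a compiler is
allowed to assume the two stores never overlap, whatever `i` and `j`
are, so it may reorder them or return the constant without reloading.

```c
#include <stddef.h>

static int a[8];
static int b[8];

/* The store to b[j] cannot change a[i], since a and b are different
   arrays; the stores may be reordered and the final load folded to 1.
   Indices are assumed to be in range (less than 8). */
static int store_both(size_t i, size_t j)
{
    a[i] = 1;
    b[j] = 2;
    return a[i];
}
```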

What I recently found to be definitely wrong is the instability of the
equality `&x + 1 == &y`. This leads to outright nonsense.
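
A small probe for this, as a sketch only. The function names are
hypothetical, and no output is shown on purpose: the results depend on
the compiler, its version, the optimization level and how `x` and `y`
happen to be laid out in memory.

```c
#include <stdint.h>
#include <stdio.h>

static int x, y;

/* Comparison written directly on the addresses of the two objects.
   A compiler may decide this at compile time. */
static int eq_direct(void)
{
    return &x + 1 == &y;
}

/* The same comparison after converting both sides to integers. */
static int eq_as_integers(void)
{
    return (uintptr_t)(&x + 1) == (uintptr_t)&y;
}

/* Print both results so they can be compared across compilers
   and optimization levels. */
static void report(void)
{
    printf("pointers: %d, integers: %d\n", eq_direct(), eq_as_integers());
}
```

If the two numbers differ, or change when the comparison is moved into
another function, that is the kind of instability meant above.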

-- 
Alexander Cherepanov
