Date: Wed, 21 Apr 2021 16:02:00 -0300
From: Érico Nogueira <ericonr@...root.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] shorten __aeabi_memset by one instruction

On 21/04/2021 14:38, Rich Felker wrote:
> On Wed, Apr 21, 2021 at 10:24:58AM +0200, Szabolcs Nagy wrote:
>> * Érico Nogueira <ericonr@...root.org> [2021-04-20 16:15:19 -0300]:
>>> when building for armhf, this makes libc.so text smaller by 4 bytes:
>>> 606619 to 606615
>>> ---
>>>   src/string/arm/__aeabi_memset.s | 3 +--
>>>   1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/src/string/arm/__aeabi_memset.s b/src/string/arm/__aeabi_memset.s
>>> index f9f60583..980774e8 100644
>>> --- a/src/string/arm/__aeabi_memset.s
>>> +++ b/src/string/arm/__aeabi_memset.s
>>> @@ -24,8 +24,7 @@ __aeabi_memset:
>>>   	cmp   r1, #0
>>>   	beq   2f
>>>   	adds  r1, r0, r1
>>> -1:	strb  r2, [r0]
>>> -	adds  r0, r0, #1
>>> +1:	strb  r2, [r0], #1
>>
>> this is not available before armv7 as thumb instruction (and it
>> has 32bit thumb encoding, so you replace two 16bit instructions
>> with a 32bit one.)
>>
>> normally this asm is compiled in arm mode even if your toolchain
>> defaults to thumb (i'm not sure why), but if you select a cpu or
>> arch that only supports thumb then the assembler will try to use
>> thumb and fail e.g. on -march=armv6-m (but i'm not sure if musl
>> supports that compilation mode throughout)
> 
> Should we hold off on doing anything about this for now then? I'd
> rather avoid making more work for future, and this is pure *junk* code
> that we do not expect to be called from anywhere (it's extremely slow)
> and only there to satisfy broken tooling generating calls to it rather
> than to the standard functions.

That's fine with me. I was just browsing this file and noticed the
potential to "simplify" it.

That said, src/string/arm/memcpy.S also uses this post-indexed addressing
mode, so it is probably worth watching out for in an eventual Thumb-only
port:

	/* align source to 32 bits. We need to insert 2 instructions between
	 * a ldr[b|h] and str[b|h] because byte and half-word instructions
	 * stall 2 cycles.
	 */
	movs    r12, r3, lsl #31
	sub     r2, r2, r3              /* we know that r3 <= r2 because r2 >= 4 */
	ldrbmi r3, [r1], #1
	ldrbcs r4, [r1], #1
	ldrbcs r12,[r1], #1
	strbmi r3, [r0], #1
	strbcs r4, [r0], #1
	strbcs r12,[r0], #1
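
For readers less used to ARM condition codes, here is a rough C model of
what the snippet above does. It is only a sketch, not code from musl:
copy_head is a hypothetical helper name, and it assumes r3 holds the number
of leading bytes to copy (0 to 3). After "movs r12, r3, lsl #31", the N flag
is bit 0 of r3 and the C flag is bit 1 of r3, so the "mi" pair moves one
byte and the two "cs" pairs move two more. Each "[rN], #1" access
post-increments its pointer, which is what the "*p++" forms stand for.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of musl): models the flag-driven head copy.
 * Assumes r3 is the number of leading bytes to copy, in the range 0..3.
 * Advances both pointers and returns the number of bytes copied.
 */
static size_t copy_head(unsigned char **dst, const unsigned char **src,
                        uint32_t r3)
{
	unsigned char *d = *dst;
	const unsigned char *s = *src;

	/* movs r12, r3, lsl #31: N = bit 0 of r3, C = bit 1 of r3 */
	int n_flag = r3 & 1;
	int c_flag = (r3 >> 1) & 1;

	if (n_flag)		/* ldrbmi / strbmi: one byte */
		*d++ = *s++;
	if (c_flag) {		/* ldrbcs / strbcs, twice: two bytes */
		*d++ = *s++;
		*d++ = *s++;
	}

	size_t copied = (size_t)(d - *dst);
	*dst = d;
	*src = s;
	return copied;
}
```

The model ignores instruction scheduling (the assembly issues all loads
before any store to avoid the stall its comment mentions) and only
reproduces the resulting bytes and pointer values.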


> 
> Rich
