Date: Wed, 21 Apr 2021 16:02:00 -0300
From: Érico Nogueira <ericonr@...root.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] shorten __aeabi_memset by one instruction

On 21/04/2021 14:38, Rich Felker wrote:
> On Wed, Apr 21, 2021 at 10:24:58AM +0200, Szabolcs Nagy wrote:
>> * Érico Nogueira <ericonr@...root.org> [2021-04-20 16:15:19 -0300]:
>>> when building for armhf, this makes libc.so text smaller by 4 bytes:
>>> 606619 to 606615
>>> ---
>>>  src/string/arm/__aeabi_memset.s | 3 +--
>>>  1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/src/string/arm/__aeabi_memset.s b/src/string/arm/__aeabi_memset.s
>>> index f9f60583..980774e8 100644
>>> --- a/src/string/arm/__aeabi_memset.s
>>> +++ b/src/string/arm/__aeabi_memset.s
>>> @@ -24,8 +24,7 @@ __aeabi_memset:
>>>  	cmp   r1, #0
>>>  	beq   2f
>>>  	adds  r1, r0, r1
>>> -1:	strb  r2, [r0]
>>> -	adds  r0, r0, #1
>>> +1:	strb  r2, [r0], #1
>>
>> this is not available before armv7 as thumb instruction (and it
>> has 32bit thumb encoding, so you replace two 16bit instructions
>> with a 32bit one.)
>>
>> normally this asm is compiled in arm mode even if your toolchain
>> defaults to thumb (i'm not sure why), but if you select a cpu or
>> arch that only supports thumb then the assembler will try to use
>> thumb and fail e.g. on -march=armv6-m (but i'm not sure if musl
>> supports that compilation mode throughout)
>
> Should we hold off on doing anything about this for now then? I'd
> rather avoid making more work for future, and this is pure *junk* code
> that we do not expect to be called from anywhere (it's extremely slow)
> and only there to satisfy broken tooling generating calls to it rather
> than to the standard functions.

That's ok for me. I was just browsing this file for some reason and
noted the potential to "simplify" it.

That said, src/string/arm/memcpy.S also uses this addressing mode, so it
is probably relevant to watch out for it for an eventual port:

        /* align source to 32 bits. We need to insert 2 instructions between
         * a ldr[b|h] and str[b|h] because byte and half-word instructions
         * stall 2 cycles.
         */
        movs    r12, r3, lsl #31
        sub     r2, r2, r3      /* we know that r3 <= r2 because r2 >= 4 */
        ldrbmi  r3, [r1], #1
        ldrbcs  r4, [r1], #1
        ldrbcs  r12,[r1], #1
        strbmi  r3, [r0], #1
        strbcs  r4, [r0], #1
        strbcs  r12,[r0], #1

>
> Rich
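
For readers following the thread, the two loop bodies in the quoted diff can be sketched in C roughly as below. This is only an illustration, not code from musl: the function names are hypothetical, the argument order (dest, n, c) is chosen to follow __aeabi_memset, and the instruction comments show the correspondence to the diff rather than what any particular compiler will emit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring the original loop body:
 * store the byte, then advance the pointer as a separate step. */
static void fill_store_then_add(unsigned char *dest, size_t n, int c)
{
    unsigned char *end = dest + n;      /* adds r1, r0, r1   */
    while (dest != end) {
        *dest = (unsigned char)c;       /* strb r2, [r0]     */
        dest = dest + 1;                /* adds r0, r0, #1   */
    }
}

/* Hypothetical helper mirroring the patched loop body:
 * store and advance in one post-indexed access. */
static void fill_post_increment(unsigned char *dest, size_t n, int c)
{
    unsigned char *end = dest + n;      /* adds r1, r0, r1   */
    while (dest != end)
        *dest++ = (unsigned char)c;     /* strb r2, [r0], #1 */
}

/* Returns 1 when both variants fill an n-byte prefix of a 64-byte
 * buffer identically, 0 otherwise (including when n is too large). */
static int fills_agree(size_t n, int c)
{
    unsigned char a[64] = {0}, b[64] = {0};
    if (n > sizeof a)
        return 0;
    fill_store_then_add(a, n, c);
    fill_post_increment(b, n, c);
    return memcmp(a, b, sizeof a) == 0;
}
```

The two variants are equivalent in effect; the point raised in the thread is only about which encodings are available, since the post-indexed form has no 16-bit Thumb encoding.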