Message-ID: <aTNN4hcHRjGWLAqp@intrepid>
Date: Fri, 5 Dec 2025 22:25:54 +0100
From: Markus Wichmann <nullplan@....net>
To: musl@...ts.openwall.com
Subject: How supported is Thumb 1?

Hi all,

I recently got into reading ARM assembly, and then noticed a couple of weird things in musl's sources. In particular, at the moment, compiling musl with "-march=armv4t -mthumb" produces non-working binaries. Or at least I assume so, I haven't actually checked. But the code sequences contained in the object files are specifically called out in the AAPCS as not working.

Before I get into fixing it, I thought I'd ask whether this is even desired, and if so to what degree? It seems like you went for having Thumb1 only in objects compiled from C code, while not having support for it in assembler source files.

OK, but that still requires all the inline assembly to work in the correct mode. And here I want to call out the BLX macro (contained in atomic_arch.h and pthread_arch.h) as not working for ARMv4 Thumb mode. "mov lr, pc" loads the LR with the address from two instructions down, yes, but fails to set the Thumb bit where required. The AAPCS says to use a "bl" instruction to a veneer that does the "bx".

In __tlsdesc_dynamic, you attempted to work around this with

    #if __ARM_ARCH >= 5
        blx r0          // r0 = tp
    #else
    #if __thumb__
        add lr,pc,#1
    #else
        mov lr,pc
    #endif
        bx r0
    #endif
    #endif

Cute, but it doesn't actually work. "add lr,pc,#1" is not a Thumb1 instruction. It seems there is no Thumb1 instruction that can do what is required here. There are also many instructions around it that are not Thumb1, and cannot be easily adjusted. I mean, the two instructions following the above snippet are:

        ldr r3,[r0,#-4] // r3 = dtv
        ldr ip,[r3,r1,LSL #2]

Which doesn't work as Thumb1, because in Thumb1, memory offsets cannot be negative, shifted addressing modes are not supported, and you cannot directly load to high registers such as "ip".

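To make the Thumb-bit problem with "mov lr, pc" concrete, the arithmetic can be modelled on any host in a few lines of C. This is only a sketch, not musl code: the function names and the example address 0x8000 are made up for illustration. It assumes the ARMv4T behaviour described above: a Thumb-state read of pc yields the current instruction's address plus 4 with bit 0 clear, and "bx" picks the instruction set from bit 0 of its target.

```c
#include <stdint.h>

/* Model only (hypothetical helper): value seen when a Thumb-state
 * instruction at insn_addr reads pc. It is the instruction's own
 * address plus 4, and bit 0 is not set. */
static uint32_t thumb_read_pc(uint32_t insn_addr)
{
    return (insn_addr + 4) & ~UINT32_C(1);
}

/* Model only: "bx" enters Thumb state if bit 0 of the target is 1,
 * ARM state if it is 0. */
static int bx_enters_thumb(uint32_t target)
{
    return (int)(target & 1);
}

/* Return address left in lr by a 2-byte "mov lr, pc" at insn_addr
 * that is followed by a 2-byte "bx rN". */
static uint32_t lr_after_mov_lr_pc(uint32_t insn_addr)
{
    return thumb_read_pc(insn_addr);
}
```

With "mov lr, pc" at the assumed address 0x8000, lr ends up as 0x8004. That is the correct location (just past the "bx"), but bit 0 is clear, so a callee returning with "bx lr" would resume there in ARM state. Only the same address with bit 0 set, 0x8005, would return to Thumb code.
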
A word on instruction set selection: It appears that the assembler, confronted with "-mthumb", will scan the source file for any instructions that cannot be represented in Thumb mode (for the desired architecture version), and if any are found, will assemble in ARM mode. This is independent of the preprocessor, so the existence of the __thumb__ macro doesn't say we are actually in Thumb mode. This is also different from the ".thumb" directive, which forces it into Thumb mode and gives errors when something cannot be represented.

All of this means that with the given settings, the file will assemble in ARM mode, but contain the "add lr,pc,#1" instruction.

I see no issue with just assuming the file will not be Thumb1. You can make the assumption explicit with the declaration

    #ifndef __thumb2__
    .arm
    #endif

at the start of the file. This changes nothing about the code, but makes it more obvious that support for Thumb1 is misplaced and can be deleted.

For the BLX macro, I also have a suggestion, but it is kind of a hack. Namely something along the lines of

    #if __ARM_ARCH >= 5
    #define BLX "blx "
    #elif !defined __thumb__
    #define BLX "mov lr, pc\n\tbx "
    #else
    #define BLX "bl __thumb_bx_"
    #endif

Note the lack of space in the last case. That is deliberate. You then create a new source file, and it contains

    __thumb_bx_r0: bx r0

and so on for all registers. Make them global hidden functions, put them each in their own section if need be (so the linker can cull the unused variants).

Finally, the commit adding the "add lr,pc,#1" line mentions that the "bl" thing was not done for reasons of speed, but rapidly doing the wrong thing is not an improvement. It is unnecessary, anyway, by just selecting ARM mode for those old processor versions.

Ciao,
Markus
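
The role of the missing trailing space in the proposed BLX macro can be checked on any host compiler with a small C sketch. Nothing here is musl code. The three macro names are hypothetical, introduced only so that all three candidate expansions can be seen side by side, and the literal register name "r0" stands in for whatever operand text follows the macro at a real use site.

```c
#include <string.h>

/* Hypothetical names: the three candidate expansions of BLX from the
 * proposal above, kept apart so they can be compared on one host. */
#define BLX_ARMV5    "blx "
#define BLX_ARM_V4   "mov lr, pc\n\tbx "
#define BLX_THUMB_V4 "bl __thumb_bx_"

/* Adjacent string literals are concatenated by the compiler, so the
 * register name following the macro completes either an instruction
 * operand (first two cases) or a veneer symbol name (last case). */
static const char call_armv5[]    = BLX_ARMV5    "r0";
static const char call_arm_v4[]   = BLX_ARM_V4   "r0";
static const char call_thumb_v4[] = BLX_THUMB_V4 "r0";
```

The first two strings end in "blx r0" and "bx r0", while the Thumb1 variant becomes "bl __thumb_bx_r0", a call to the per-register veneer. A space at the end of the last macro would instead produce "bl __thumb_bx_ r0", which does not name the veneer symbol.
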