libc-coord - Re: RSEQ symbols: __rseq_size, __rseq_flags vs __rseq_feature

Message-ID: <be4ef8b2-bbd3-1125-bf10-e6e58b5c87c2@efficios.com>
Date: Fri, 16 Sep 2022 16:44:53 +0200
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Florian Weimer <fw@...eb.enyo.de>, "carlos@...hat.com"
 <carlos@...hat.com>, libc-alpha <libc-alpha@...rceware.org>,
 szabolcs.nagy@....com, libc-coord@...ts.openwall.com
Subject: Re: RSEQ symbols: __rseq_size, __rseq_flags vs __rseq_feature_size

On 2022-09-16 16:36, Mathieu Desnoyers wrote:
> Hi Florian,
> 
> I wanted to clarify by email what we each have in mind with respect
> to exposing the RSEQ feature set available to the outside world
> through libc symbols.
> 
> I have 3 different possible approaches in mind, shown below with
> 3 examples:
> 
> #include <stdint.h>
> 
> #undef likely
> #define likely(x)       __builtin_expect(!!(x), 1)
> #undef __aligned
> #define __aligned(x)    __attribute__((__aligned__(x)))
> #undef offsetof
> #define offsetof(TYPE, MEMBER)  __builtin_offsetof(TYPE, MEMBER)
> #undef sizeof_field
> #define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
> #undef offsetofend
> #define offsetofend(TYPE, MEMBER) \
>          (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
> 
> #define __RSEQ_FLAG_FEATURE_EXTENDED    0x2
> 
> #define __RSEQ_FLAG_FEATURE_VM_VCPU_ID  0x4
> 
> typedef uint32_t __u32;
> typedef uint64_t __u64;
> 
> /* Original: size=32 bytes */
> 
> struct rseq_orig {
>    uint32_t cpu_id_start;
>    uint32_t cpu_id;
>    uint64_t rseq_cs;
>    uint32_t flags;
>    uint32_t padding[3];
> } __aligned(32);
> 
> /* Extended */
> 
> struct rseq_ext {
>    uint32_t cpu_id_start;
>    uint32_t cpu_id;
>    uint64_t rseq_cs;
>    uint32_t flags;
>    /* New */
>    uint32_t node_id;
>    uint32_t vm_vcpu_id;
>    uint32_t padding[1];
> } __aligned(32);
> 
> unsigned int __rseq_flags;
> unsigned int __rseq_size;
> unsigned int __rseq_feature_size;
> 
> /*
>   * A) Check extended feature flag and size. One mask and two
>   * comparisons.
>   */
> void fA(void)
> {
>          if (likely((__rseq_flags & __RSEQ_FLAG_FEATURE_EXTENDED)
>                     && __rseq_size >= offsetofend(struct rseq_ext,
>                                                   vm_vcpu_id))) {
>                  /* Use rseq with vcpu_id. */
>                  asm volatile ("ud2\n\t");
>          } else {
>                  /* Fallback. */
>                  asm volatile ("int3\n\t");
>          }
> }
> 
> /*
>   * B) Check rseq feature size. Feature number only limited by size of
>   * uint32_t. One comparison.
>   */
> void fB(void)
> {
>          if (likely(__rseq_feature_size >= offsetofend(struct rseq_ext,
>                                                        vm_vcpu_id))) {
>                  /* Use rseq with vcpu_id. */
>                  asm volatile ("ud2\n\t");
>          } else {
>                  /* Fallback. */
>                  asm volatile ("int3\n\t");
>          }
> }
> 
> /*
>   * C) Check only rseq flags. 32 features at most. One mask and one
>   * comparison.
>   */
> 
> void fC(void)
> {
>          if (likely(__rseq_flags & __RSEQ_FLAG_FEATURE_VM_VCPU_ID)) {
>                  /* Use rseq with vcpu_id. */
>                  asm volatile ("ud2\n\t");
>          } else {
>                  /* Fallback. */
>                  asm volatile ("int3\n\t");
>          }
> }
> 
> Here is the resulting objdump:
> 
> 
> rseq-flags.o:     file format elf64-x86-64
> 
> 
> Disassembly of section .text:
> 
> 0000000000000000 <fA>:
>     0:   f6 05 00 00 00 00 02    testb  $0x2,0x0(%rip)        # 7 <fA+0x7>
>     7:   74 0f                   je     18 <fA+0x18>
>     9:   83 3d 00 00 00 00 1b    cmpl   $0x1b,0x0(%rip)        # 10 <fA+0x10>
>    10:   76 06                   jbe    18 <fA+0x18>
>    12:   0f 0b                   ud2
>    14:   c3                      retq
>    15:   0f 1f 00                nopl   (%rax)
>    18:   cc                      int3
>    19:   c3                      retq
>    1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
> 
> 0000000000000020 <fB>:
>    20:   83 3d 00 00 00 00 1b    cmpl   $0x1b,0x0(%rip)        # 27 <fB+0x7>
>    27:   76 07                   jbe    30 <fB+0x10>
>    29:   0f 0b                   ud2
>    2b:   c3                      retq
>    2c:   0f 1f 40 00             nopl   0x0(%rax)
>    30:   cc                      int3
>    31:   c3                      retq
>    32:   66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
>    39:   00 00 00 00
>    3d:   0f 1f 00                nopl   (%rax)
> 
> 0000000000000040 <fC>:
>    40:   f6 05 00 00 00 00 04    testb  $0x4,0x0(%rip)        # 47 <fC+0x7>
>    47:   74 07                   je     50 <fC+0x10>
>    49:   0f 0b                   ud2
>    4b:   c3                      retq
>    4c:   0f 1f 40 00             nopl   0x0(%rax)
>    50:   cc                      int3
>    51:   c3                      retq
> 
> I can think of 4 approaches that applications will use to detect
> availability of their specific rseq feature for each rseq critical
> section:
> 
> 1) Dynamically check whether the feature is implemented at runtime
>     with conditional branches. Those using this approach will probably
>     not want to have the overhead of the two comparisons in approach (A)
>     above. Applications and libraries should probably use their own copy
>     of the glibc symbols for speed purposes.
> 
> 2) Implement the entire function as IFUNC and select whether a rseq or
>     non-rseq implementation should be used at C startup. The tradeoff
>     here is code size vs speed, and using IFUNC for things like malloc
>     may add additional constraints on the startup order.
> 
> 3) Code rewrite (dynamic code patching) between rseq and non-rseq code.
>     This may be frowned upon in the security area and may not always be
>     possible depending on the context.
> 
> 4) JIT compilation of specialized rseq vs non-rseq code. Not generally
>     available in C.
> 
> I suspect that glibc may rely on approaches 1+2 depending on the
> situation, and many applications may use approach (1) for simplicity
> reasons.
> 
> Ideally I would like to keep approach (1) fast, so I'd prefer to
> keep the check to one single conditional branch. This eliminates
> approach (A) and leaves approaches (B) and (C). Approach (B) has
> the advantage of not limiting us to 32 features, but its downside
> is that we need to introduce a new __rseq_feature_size symbol to
> the libc ABI. Approach (C) has the advantage of using __rseq_flags
> which is already exposed, but limits us to 32 features.
> 
> Did you have in mind an approach like (A), (B) or (C) for exposing
> the rseq feature set or something else entirely?
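
For illustration, detection approach (1) combined with check (B) could be
sketched as below. This is only a sketch under assumptions: the names
local_rseq_feature_size, local_rseq_init and have_vm_vcpu_id are
hypothetical, and the value passed to local_rseq_init stands in for
whatever libc would eventually export. The struct mirrors rseq_ext from
the quoted code, without the alignment attribute.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct rseq_ext from the message; alignment attribute omitted. */
struct rseq_ext_sketch {
	uint32_t cpu_id_start;
	uint32_t cpu_id;
	uint64_t rseq_cs;
	uint32_t flags;
	uint32_t node_id;
	uint32_t vm_vcpu_id;
	uint32_t padding[1];
};

/* End offset of vm_vcpu_id: 24 + 4 = 28 bytes (0x1c). */
#define VM_VCPU_ID_END \
	(offsetof(struct rseq_ext_sketch, vm_vcpu_id) + sizeof(uint32_t))

/* Hypothetical local copy of the libc-provided feature size. */
static unsigned int local_rseq_feature_size;

/* Hypothetical init hook: call once at startup with the libc value. */
void local_rseq_init(unsigned int libc_feature_size)
{
	local_rseq_feature_size = libc_feature_size;
}

/* Check (B): a single comparison against the cached value. */
int have_vm_vcpu_id(void)
{
	return local_rseq_feature_size >= VM_VCPU_ID_END;
}
```

The threshold of 28 bytes agrees with the "cmpl $0x1b" followed by "jbe"
in the objdump for fB: the fallback is taken when the size is at most
0x1b, that is, below 0x1c.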

One more detail: Approach (C) using __rseq_flags would allow us to 
eventually deprecate features if need be by clearing the bit associated 
with the deprecated feature and leaving the field unpopulated in the 
rseq structure. I'm not certain whether this is something we want to be 
able to do or not, but it may be a nice property of the flags approach.
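
A minimal sketch of that property, under assumptions: the helper names
and the SKETCH_FEATURE_NODE_ID bit value are hypothetical and only serve
the illustration; 0x4 is the __RSEQ_FLAG_FEATURE_VM_VCPU_ID value from the
quoted code. With flags, any single bit can be cleared independently. With
a size comparison, a size that covers a later field necessarily covers
every earlier one as well.

```c
#include <stdint.h>

#define SKETCH_FEATURE_NODE_ID     0x8u  /* hypothetical bit, not in the message */
#define SKETCH_FEATURE_VM_VCPU_ID  0x4u  /* __RSEQ_FLAG_FEATURE_VM_VCPU_ID */

/* Approach (C): a feature is available when its bit is set. */
int flags_has(uint32_t flags, uint32_t feature)
{
	return (flags & feature) != 0;
}

/* Deprecating a feature clears its bit and leaves all others untouched. */
uint32_t flags_deprecate(uint32_t flags, uint32_t feature)
{
	return flags & ~feature;
}

/*
 * Approach (B): a feature is available when the feature size reaches the
 * end offset of its field, so available features always form a prefix of
 * the structure.
 */
int size_has(uint32_t feature_size, uint32_t end_offset)
{
	return feature_size >= end_offset;
}
```

For example, after flags_deprecate(flags, SKETCH_FEATURE_NODE_ID) the
vm_vcpu_id bit is still reported while node_id is not. With the size
check, a feature size of 28 (end of vm_vcpu_id) also satisfies the check
for node_id, which ends at offset 24.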

Thanks,

Mathieu

> 
> Thanks,
> 
> Mathieu
> 


-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com