Date: Fri, 16 Sep 2022 16:36:46 +0200
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Florian Weimer <fw@...eb.enyo.de>, "carlos@...hat.com"
 <carlos@...hat.com>, libc-alpha <libc-alpha@...rceware.org>,
 szabolcs.nagy@....com, libc-coord@...ts.openwall.com
Subject: RSEQ symbols: __rseq_size, __rseq_flags vs __rseq_feature_size

Hi Florian,

I wanted to clarify by email what we each have in mind with respect
to exposing the RSEQ feature set available to the outside world
through libc symbols.

I have 3 different possible approaches in mind, shown below with
3 examples:

#include <stdint.h>

#undef likely
#define likely(x)       __builtin_expect(!!(x), 1)
#undef __aligned
#define __aligned(x)    __attribute__((__aligned__(x)))
#undef offsetof
#define offsetof(TYPE, MEMBER)  __builtin_offsetof(TYPE, MEMBER)
#undef sizeof_field
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
#undef offsetofend
#define offsetofend(TYPE, MEMBER) \
         (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))

#define __RSEQ_FLAG_FEATURE_EXTENDED    0x2

#define __RSEQ_FLAG_FEATURE_VM_VCPU_ID  0x4

typedef uint32_t __u32;
typedef uint64_t __u64;

/* Original: size=32 bytes */

struct rseq_orig {
   uint32_t cpu_id_start;
   uint32_t cpu_id;
   uint64_t rseq_cs;
   uint32_t flags;
   uint32_t padding[3];
} __aligned(32);

/* Extended */

struct rseq_ext {
   uint32_t cpu_id_start;
   uint32_t cpu_id;
   uint64_t rseq_cs;
   uint32_t flags;
   /* New */
   uint32_t node_id;
   uint32_t vm_vcpu_id;
   uint32_t padding[1];
} __aligned(32);

unsigned int __rseq_flags;
unsigned int __rseq_size;
unsigned int __rseq_feature_size;

/* A) Check extended feature flag and size. One mask and two comparisons. */
void fA(void)
{
         if (likely((__rseq_flags & __RSEQ_FLAG_FEATURE_EXTENDED)
                    && __rseq_size >= offsetofend(struct rseq_ext, vm_vcpu_id))) {
                 /* Use rseq with vcpu_id. */
                 asm volatile ("ud2\n\t");
         } else {
                 /* Fallback. */
                 asm volatile ("int3\n\t");
         }
}

/*
  * B) Check rseq feature size. Feature number only limited by size of
  * uint32_t. One comparison.
  */
void fB(void)
{
         if (likely(__rseq_feature_size >= offsetofend(struct rseq_ext, vm_vcpu_id))) {
                 /* Use rseq with vcpu_id. */
                 asm volatile ("ud2\n\t");
         } else {
                 /* Fallback. */
                 asm volatile ("int3\n\t");
         }
}

/*
  * C) Check only rseq flags. 32 features at most. One mask and one
  * comparison.
  */

void fC(void)
{
         if (likely(__rseq_flags & __RSEQ_FLAG_FEATURE_VM_VCPU_ID)) {
                 /* Use rseq with vcpu_id. */
                 asm volatile ("ud2\n\t");
         } else {
                 /* Fallback. */
                 asm volatile ("int3\n\t");
         }
}

Here is the resulting objdump:


rseq-flags.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <fA>:
    0:   f6 05 00 00 00 00 02    testb  $0x2,0x0(%rip)        # 7 <fA+0x7>
    7:   74 0f                   je     18 <fA+0x18>
    9:   83 3d 00 00 00 00 1b    cmpl   $0x1b,0x0(%rip)        # 10 <fA+0x10>
   10:   76 06                   jbe    18 <fA+0x18>
   12:   0f 0b                   ud2
   14:   c3                      retq
   15:   0f 1f 00                nopl   (%rax)
   18:   cc                      int3
   19:   c3                      retq
   1a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)

0000000000000020 <fB>:
   20:   83 3d 00 00 00 00 1b    cmpl   $0x1b,0x0(%rip)        # 27 <fB+0x7>
   27:   76 07                   jbe    30 <fB+0x10>
   29:   0f 0b                   ud2
   2b:   c3                      retq
   2c:   0f 1f 40 00             nopl   0x0(%rax)
   30:   cc                      int3
   31:   c3                      retq
   32:   66 66 2e 0f 1f 84 00    data16 nopw %cs:0x0(%rax,%rax,1)
   39:   00 00 00 00
   3d:   0f 1f 00                nopl   (%rax)

0000000000000040 <fC>:
   40:   f6 05 00 00 00 00 04    testb  $0x4,0x0(%rip)        # 47 <fC+0x7>
   47:   74 07                   je     50 <fC+0x10>
   49:   0f 0b                   ud2
   4b:   c3                      retq
   4c:   0f 1f 40 00             nopl   0x0(%rax)
   50:   cc                      int3
   51:   c3                      retq

I can think of 4 approaches that applications will use to detect
availability of their specific rseq feature for each rseq critical
section:

1) Dynamically check whether the feature is implemented at runtime
    with conditional branches. Those using this approach will probably
    not want to have the overhead of the two comparisons in approach (A)
    above. Applications and libraries should probably use their own copy
    of the glibc symbols for speed purposes.

2) Implement the entire function as IFUNC and select whether a rseq or
    non-rseq implementation should be used at C startup. The tradeoff
    here is code size vs speed, and using IFUNC for things like malloc
    may add additional constraints on the startup order.

3) Code rewrite (dynamic code patching) between rseq and non-rseq code.
    This may be frowned upon in the security area and may not always be
    possible depending on the context.

4) JIT compilation of specialized rseq vs non-rseq code. Not generally
    available in C.
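
As a rough illustration of the idea behind approach (2), here is a
minimal sketch that selects an implementation once at startup. It uses
a plain function pointer as a portable stand-in for IFUNC (a real IFUNC
would use GCC's ifunc attribute and a resolver run by the dynamic
loader). All demo_* names are hypothetical, and demo_rseq_feature_size
only stands in for the proposed __rseq_feature_size symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extended layout, copied from the example earlier in this message. */
struct rseq_ext {
        uint32_t cpu_id_start;
        uint32_t cpu_id;
        uint64_t rseq_cs;
        uint32_t flags;
        uint32_t node_id;
        uint32_t vm_vcpu_id;
        uint32_t padding[1];
} __attribute__((__aligned__(32)));

#define offsetofend(TYPE, MEMBER) \
        (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

static_assert(offsetofend(struct rseq_ext, vm_vcpu_id) == 28,
              "unexpected rseq_ext layout");

/* Hypothetical stand-in for the proposed libc symbol. */
unsigned int demo_rseq_feature_size;

/* Placeholder bodies: a real version would contain the rseq critical
 * section and its non-rseq equivalent. */
static int demo_op_rseq(void)     { return 1; }
static int demo_op_fallback(void) { return 0; }

/* Chosen once; callers pay an indirect call instead of a feature check. */
static int (*demo_op_impl)(void) = demo_op_fallback;

/* Would run once at startup, e.g. from a constructor. */
void demo_select_impl(void)
{
        if (demo_rseq_feature_size >= offsetofend(struct rseq_ext, vm_vcpu_id))
                demo_op_impl = demo_op_rseq;
        else
                demo_op_impl = demo_op_fallback;
}

int demo_op(void)
{
        return demo_op_impl();
}
```

The feature check moves out of the fast path entirely, at the cost of
keeping both implementations around and of an indirect call.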

I suspect that glibc may rely on approaches 1+2 depending on the
situation, and many applications may use approach (1) for simplicity.
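
To make the "own copy of the glibc symbols" remark in approach (1)
concrete, here is a minimal sketch, assuming a feature-size symbol along
the lines of approach (B). The demo_* names are hypothetical.
demo_libc_rseq_feature_size merely stands in for whatever libc would
export. The point is that a hidden-visibility copy can typically be
read with a direct PC-relative load from position-independent code,
rather than through the GOT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extended layout, copied from the example earlier in this message. */
struct rseq_ext {
        uint32_t cpu_id_start;
        uint32_t cpu_id;
        uint64_t rseq_cs;
        uint32_t flags;
        uint32_t node_id;
        uint32_t vm_vcpu_id;
        uint32_t padding[1];
} __attribute__((__aligned__(32)));

#define offsetofend(TYPE, MEMBER) \
        (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

static_assert(offsetofend(struct rseq_ext, vm_vcpu_id) == 28,
              "unexpected rseq_ext layout");

/* Hypothetical stand-in for the symbol exported by libc. */
unsigned int demo_libc_rseq_feature_size;

/* Private copy owned by the application or library. */
__attribute__((visibility("hidden")))
unsigned int demo_local_rseq_feature_size;

/* Would typically run once at load time, e.g. from a constructor. */
void demo_rseq_cache_init(void)
{
        demo_local_rseq_feature_size = demo_libc_rseq_feature_size;
}

/* Fast-path check: one load, one comparison, one branch. */
int demo_can_use_vm_vcpu_id(void)
{
        return __builtin_expect(demo_local_rseq_feature_size >=
                        offsetofend(struct rseq_ext, vm_vcpu_id), 1);
}
```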

Ideally I would like to keep approach (1) fast, so I'd prefer to
keep the check to one single conditional branch. This eliminates
approach (A) and leaves approaches (B) and (C). Approach (B) has
the advantage of not limiting us to 32 features, but its downside
is that we need to introduce a new __rseq_feature_size symbol to
the libc ABI. Approach (C) has the advantage of using __rseq_flags
which is already exposed, but limits us to 32 features.
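
To show how approach (B) scales past a fixed number of flag bits, here
is a minimal sketch of a per-field check driven only by the feature
size. demo_rseq_has_field and the demo_has_* wrappers are hypothetical
names, not an existing API. The sketch assumes fields are only ever
appended to the structure, so a feature size that covers a field
implies that all earlier fields are present as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Extended layout, copied from the example earlier in this message. */
struct rseq_ext {
        uint32_t cpu_id_start;
        uint32_t cpu_id;
        uint64_t rseq_cs;
        uint32_t flags;
        uint32_t node_id;
        uint32_t vm_vcpu_id;
        uint32_t padding[1];
} __attribute__((__aligned__(32)));

#define offsetofend(TYPE, MEMBER) \
        (offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical helper: nonzero if feature_size covers MEMBER entirely. */
#define demo_rseq_has_field(feature_size, MEMBER) \
        ((feature_size) >= offsetofend(struct rseq_ext, MEMBER))

static_assert(offsetofend(struct rseq_ext, node_id) == 24,
              "node_id ends at byte 24");
static_assert(offsetofend(struct rseq_ext, vm_vcpu_id) == 28,
              "vm_vcpu_id ends at byte 28");

int demo_has_node_id(unsigned int feature_size)
{
        return demo_rseq_has_field(feature_size, node_id);
}

int demo_has_vm_vcpu_id(unsigned int feature_size)
{
        return demo_rseq_has_field(feature_size, vm_vcpu_id);
}
```

With a feature size of 24, for example, node_id would be reported as
available and vm_vcpu_id as unavailable. Each new field only needs a
new offsetofend() constant, not a new flag bit.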

Did you have in mind an approach like (A), (B) or (C) for exposing
the rseq feature set, or something else entirely?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
