Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 21 Feb 2019 17:06:47 +0100
From: Jann Horn <>
To: Kees Cook <>
Cc: "Tobin C. Harding" <>, "Tobin C. Harding" <>, Shuah Khan <>, 
	Alexander Shishkin <>, 
	Greg Kroah-Hartman <>, 
	Andy Shevchenko <>, 
	Kernel Hardening <>, 
	kernel list <>, Andy Lutomirski <>, 
	Rasmus Villemoes <>, Daniel Micay <>, 
	Arnd Bergmann <>, Stephen Rothwell <>, 
	Miguel Ojeda <>, 
	"Gustavo A. R. Silva" <>
Subject: Re: [PATCH 5/6] lib: Fix function documentation for strncpy_from_user

On Thu, Feb 21, 2019 at 7:02 AM Kees Cook <> wrote:
> On Wed, Feb 20, 2019 at 9:25 PM Tobin C. Harding <> wrote:
> >
> > On Wed, Feb 20, 2019 at 05:05:10PM -0800, Kees Cook wrote:
> > > So, generally speaking, I'd love to split all strncpy* uses into
> > > strscpy_zero() (when expecting to do str->str copies), and some new
> > > function, named like mempadstr() or str2mem() that copies a str to a
> > > __nonstring char array, with trailing padding, if there is space. Then
> > > there is no more mixing the two cases and botching things.
> I should use "converts" instead of "copies" above, just to drive the
> point home. :)
> >
> > Oh cool, treewide changes, I'm down with that.  So to v2 I'll add
> > str2mem() and then attack the tree as suggested.  What could possibly go
> > wrong :)?
> Some clear documentation needs to be written for str2mem() to really
> help people understand what a "non string" character array is
> (especially given that it LOOKS like it has NUL termination -- when in
> fact it's just "padding").
> The tree-wide changes will likely take a while (and don't need to be
> part of this series unless you want to find a couple good examples)
> since we have to do them case-by-case: it's not always obvious when
> it's actually a non-string, so getting help from maintainers here will
> be needed. (And maybe some kind of flow chart added to
> Documentation/process/deprecated.rst for how to stop using strncpy()
> and strlcpy().)

Wild brainstorming ahead...

Such non-string character arrays are usually fixed-size, right? We
could use struct types around such character arrays to hide the actual
characters (so that people don't accidentally use them with C string
APIs), and enforce that the length is passed around as part of the
type... something like:

#define char_array(len) struct { char __ca_data[len]; }
#define __CA_UNWRAP(captr) (captr)->__ca_data, sizeof((captr)->__ca_data)

void __str2ca(char *dst, size_t dst_len, char *src) {
  size_t src_len = strlen(src);
  if (WARN_ON(src_len > dst_len)) {
    // or whatever else we think is the safest way to deal with this
    // without crashing, if BUG() is not an option.
    memset(dst, 0, dst_len);
  memcpy(dst, src, src_len);
  memset(dst + src_len, 0, dst_len - src_len);
#define str2ca(dst, src) __str2ca(__CA_UNWRAP(dst), (src))

static inline void __ca2ca(char *dst, size_t dst_len, char *src,
size_t src_len) {
  BUILD_BUG_ON(dst_len < src_len);
  memcpy(dst, src, src_len);
  if (src_len != dst_len) {
    memset(dst + src_len, 0, dst_len - src_len);
#define ca2ca(dst, src) __ca2ca(__CA_UNWRAP(dst), __CA_UNWRAP(src))

And if you want to do direct assignments instead of using helpers, you
make a typedef like this (since just assigning between two equivalent
structs doesn't work):

typedef char_array(20) blah_name;
blah_name global_name;
int main(int argc, char **argv) {
  blah_name local_name;
  str2ca(&local_name, argv[1]);
  global_name = local_name;

You'd also have to use a typedef like this if you want to pass
references to other functions.

Something like this would also be neat for classic C strings in some
situations - if you can put bounds in the types, or (if the string is
dynamically-sized) in the struct, you don't need to messily pass
around bounds for things like snprint() and so on.

> What I can't quite figure out yet is how to find a way for sfr to flag
> newly added users of strcpy, strncpy, and strlcpy. We might need to
> bring back __deprecated, but hide it behind a W=linux-next flag or
> something crazy. Stephen, in your builds you're already injecting
> -Wimplicit-fallthrough: do you do W=1 or anything like that? If not, I
> think we need some W= setting for your linux-next builds that generate
> the maintainer-nag warnings...
> -Kees
> P.S. Here's C string API Rant (I just had to get this out, please feel
> free to ignore):
> strcpy returns dest ... which is already known, so it's a meaningless
> return value.
> strncpy returns dest (still meaningless)

Weeeell... it kinda makes sense if you're trying to micro-optimize the
amount of stack spills. If you write code like this:

char *a = malloc(10);
a = strcpy(a, other_string);
return a;

... then the compiler doesn't have to shove `a` in one of the
callee-saved registers or spill it to the stack around the strcpy().
Plus, it might allow tail-call optimizations in some cases.

> strlcpy returns strlen(src) ... the length we WOULD have copied. Why
> would we care about that? I'm operating on dest. Were there callers
> that needed to both copy part of src and learn how long it was at the
> same time?

You could use it to optimistically attempt a copy in a fastpath, and
then fall back to a reallocation if that failed.

> strscpy returns -E2BIG or non-NUL bytes copied: yay, a return about
> what actually happened from the operation!
> ... snprintf returns what it WOULD have written, much like strlcpy
> above. At least snprintf has an excuse: it can be used to calculate an
> allocation size (called with a NULL dest and 0 size) ... but shouldn't
> "how big is this format string going to be?" be a separate function? I
> wonder if we can kill all kernel uses of snprintf too (after
> introducing a "how big would it be?" function and switching all other
> callers over to scnprintf)...
> So scnprintf() does the right thing (count of non-NUL bytes copied out).

That seems kinda suboptimal. Wouldn't you ideally want to bail out
with an error, or at least do a WARN(), if you're trying to format a
string that doesn't fit into its destination? Like, for example, if
you're assembling a path, and the path doesn't fit into a buffer, and
so you just use half of it, you might end up in a very different place
from where you intended to go.

> So now our safe(r?) string API versions use different letters to show
> they're safe: "s" in strcpy and "cn" in sprintf. /me cries forever.

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.