Date: Mon, 24 Jul 2017 14:23:27 +0200
From: Peter Zijlstra <>
To: Michael Ellerman <>
Cc: Kees Cook <>,
	Andrew Morton <>,
	Ingo Molnar <>,
	Josh Poimboeuf <>,
	Christoph Hellwig <>,
	"Eric W. Biederman" <>,
	Jann Horn <>, Eric Biggers <>,
	Elena Reshetova <>,
	Hans Liljestrand <>,
	Greg KH <>,
	Alexey Dobriyan <>,
	"Serge E. Hallyn" <>,
	Davidlohr Bueso <>,
	Manfred Spraul <>,
	"" <>,
	James Bottomley <>,
	"" <>, Arnd Bergmann <>,
	"David S. Miller" <>,
	Rik van Riel <>, LKML <>,
	linux-arch <>,
	"" <>
Subject: Re: [PATCH v6 0/2] x86: Implement fast refcount overflow protection

On Mon, Jul 24, 2017 at 10:09:32PM +1000, Michael Ellerman wrote:
> Peter Zijlstra <> writes:
> > anyway, and the fact that your LL/SC is horrendously slow in any case.
> Boo :/
>
> Just kidding. I suspect you're right that we can probably pack a
> reasonable amount of tests in the body of the LL/SC and not notice.
> > Also, I still haven't seen an actual benchmark where our cmpxchg loop
> > actually regresses anything, just a lot of yelling about potential
> > regressions :/
> Heh yeah. Though I have looked at the code it generates on PPC and it's
> not sleek, though I guess that's not a benchmark is it :)

Oh for sure, GCC still can't sanely convert a cmpxchg loop (esp. if the
cmpxchg is implemented using asm) into a native LL/SC sequence, so the
generic code will end up looking pretty horrendous.
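
To illustrate the kind of generic loop being discussed, here is a minimal userspace sketch using C11 atomics. It is not the kernel's refcount_t code: the type sketch_refcount and the function sketch_inc_not_zero() are hypothetical names, and saturating at UINT_MAX is an assumption made for the example. The load, the checks, and the compare-and-swap are separate steps, which is why a compiler has a hard time folding them into one LL/SC sequence.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for illustration; not the kernel's refcount_t. */
struct sketch_refcount {
	atomic_uint refs;
};

/*
 * Take a reference unless the counter is already 0.
 * A counter that has reached UINT_MAX is left pinned there (assumed
 * saturation policy for this sketch) instead of wrapping around.
 */
static bool sketch_inc_not_zero(struct sketch_refcount *r)
{
	unsigned int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	for (;;) {
		if (old == 0)
			return false;	/* object is already on its way out */
		if (old == UINT_MAX)
			return true;	/* saturated: leave the counter pinned */

		/* On failure 'old' is reloaded with the current value; retry. */
		if (atomic_compare_exchange_weak_explicit(&r->refs, &old, old + 1,
							   memory_order_relaxed,
							   memory_order_relaxed))
			return true;
	}
}

/* Usage: take a second reference on a live object. */
static void sketch_inc_usage(void)
{
	struct sketch_refcount r;

	atomic_init(&r.refs, 1);
	assert(sketch_inc_not_zero(&r));
	assert(atomic_load(&r.refs) == 2);
}
```

On an LL/SC machine the two checks could in principle sit between the load-reserve and the store-conditional, which is the "tests in the body of the LL/SC" idea from the quoted text.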

A native implementation of the same semantics should look loads better.

One thing that might help you is that refcount_dec_and_test() is weaker
than atomic_dec_and_test() wrt ordering (RELEASE vs fully ordered), so
a native implementation can get away with lighter barriers.
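
The ordering difference can be sketched in userspace C11 as follows. This is only an approximation under stated assumptions: memory_order_seq_cst stands in for the kernel's "fully ordered" and is not an exact equivalent, the function names are hypothetical, and the overflow/underflow checks that refcount_t performs are left out to keep the focus on ordering.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Fully ordered flavour, approximated here with seq_cst (assumption). */
static bool sketch_dec_and_test_full(atomic_uint *refs)
{
	/* fetch_sub returns the old value; old == 1 means we hit zero. */
	return atomic_fetch_sub_explicit(refs, 1, memory_order_seq_cst) == 1;
}

/* RELEASE-only flavour: earlier accesses are ordered before the decrement. */
static bool sketch_dec_and_test_release(atomic_uint *refs)
{
	return atomic_fetch_sub_explicit(refs, 1, memory_order_release) == 1;
}

/* Usage: two references dropped, only the last drop reports zero. */
static void sketch_dec_usage(void)
{
	atomic_uint refs;

	atomic_init(&refs, 2);
	assert(!sketch_dec_and_test_release(&refs));
	assert(sketch_dec_and_test_release(&refs));
}
```

In portable C11 the caller that sees true would customarily issue atomic_thread_fence(memory_order_acquire) before freeing the object. On weakly ordered architectures the release variant typically needs a cheaper barrier than the fully ordered one.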
