Date: Fri, 23 Jun 2017 20:53:48 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 4/8] determine the existence of private futexes at
 the first thread creation

On Sat, Jun 24, 2017 at 01:42:20AM +0200, Jens Gustedt wrote:
> Hello Rich,
> 
> On Fri, 23 Jun 2017 18:08:23 -0400 Rich Felker <dalias@...c.org> wrote:
> 
> > On Fri, Jun 23, 2017 at 11:48:25PM +0200, Jens Gustedt wrote:
> > > Hello Rich,
> > > 
> > > On Fri, 23 Jun 2017 13:05:35 -0400 Rich Felker <dalias@...c.org>
> > > wrote: 
> > > > This was intentional, the idea being that a 100% predictable
> > > > branch in a path where a syscall is being made anyway is much
> > > > less expensive than a GOT address load that gets hoisted all the
> > > > way to the top of the function and affects even code paths that
> > > > don't need to make the syscall. Whether it was a choice that
> > > > makes sense overall, I'm not sure, but that was the intent.  
> > > 
> > > So if we can avoid going through GOT, this would be better?
> > > I'd just add ATTR_LIBC_VISIBILITY to the variable, and then this
> > > should go away the same way as it is done for the libc object.  
> > 
> > It's not going through the GOT that's costly, but actually getting the
> > GOT address, which is used for both accesses through the GOT and
> > GOT-relative addressing. On several archs including i386, PC-relative
> > addressing is not directly available and requires hacks to load the PC
> > into a GPR, and these usually take some cycles themselves and spill
> > out of the free call-clobbered registers so that additional stack
> > shuffling is needed.
> 
> So you are saying that when I add ATTR_LIBC_VISIBILITY
> and see something like
> 
> 	movslq	__futex_private(%rip), %rsi

i386 does not have (%rip). x86_64 is one of the archs with very
efficient PC-relative addressing.
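
A small test file makes the difference visible. This is only a sketch:
the names example_futex_private and example_wake_op are made up for
illustration and are not musl identifiers. Building it with something
like "-O2 -fPIC -S", once with -m64 and once with -m32, typically shows
a single %rip-relative load on x86_64, while on i386 the compiler first
has to materialize the PC in a register before it can address the
variable. The exact output depends on the compiler and its version.

```c
/* Hypothetical stand-in for a libc-internal variable. Hidden visibility
 * lets the compiler address it relative to the PC/GOT base instead of
 * loading its address from a GOT slot. */
__attribute__((__visibility__("hidden")))
int example_futex_private = 128;

/* Returns the futex operation to pass to the syscall. The global is
 * only read on the private path, but on archs without cheap
 * PC-relative addressing the setup needed to reach it may be hoisted
 * to the top of the function. */
int example_wake_op(int base_op, int is_private)
{
	if (is_private)
		return base_op | example_futex_private;
	return base_op;
}
```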

> What would you think of a patch that just cleans up the 128 vs
> FUTEX_PRIVATE issue? Just to improve readability?

That seems like a good thing. Regarding the mutex flag, I thought that
in some places we depend on the 128 shared-mutex flag being the inverse
of the FUTEX_PRIVATE flag (so ^128 translates between them). If this is
the case, do you have an elegant way to make it work?
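
To make the concern concrete, here is a minimal sketch of the kind of
translation meant above. The macro name MUTEX_SHARED and the function
name are assumptions made up for this example, not existing musl
identifiers; 128 is the value of the Linux FUTEX_PRIVATE_FLAG. The
trick only works while both flags occupy the same bit, which the
static assertion spells out.

```c
#define FUTEX_PRIVATE 128 /* same value as Linux FUTEX_PRIVATE_FLAG */
#define MUTEX_SHARED  128 /* hypothetical name for the shared-mutex bit */

/* The xor below relies on the two flags being the same bit. */
_Static_assert(MUTEX_SHARED == FUTEX_PRIVATE,
	"shared-mutex bit must equal FUTEX_PRIVATE for the xor to work");

/* Isolate the shared bit of a mutex type and invert it:
 * shared mutex -> 0, non-shared mutex -> FUTEX_PRIVATE. */
static int futex_priv_from_mutex_type(int type)
{
	return (type & MUTEX_SHARED) ^ FUTEX_PRIVATE;
}
```

Any cleanup that replaces the literal 128 with named constants would
need to keep this relationship explicit in some way.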

> Also there is this missing volatile in __get_locale.

Nice catch.

Rich
