Date: Sat, 15 Aug 2015 23:01:40 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg
 instruction

On Saturday, 15.08.2015, at 16:17 -0400, Rich Felker wrote:
> On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> > according to the wisdom of the Internet, e.g
> > 
> > https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> > 
> > a mfence instruction is about 3 times slower than an xchg instruction.
> 
> Uhg, then why does this instruction even exist if it does less and
> does it slower?

Because they do different things ;)

mfence synchronizes all memory; xchg, at least at first glance,
only one word.

But I also read that the relative performance of these instructions
depends a lot on the actual die you are dealing with.

> > Here we not only had mfence but also the mov instruction that was to be
> > protected by the fence. Replace all that by a native atomic instruction
> > that gives all the ordering guarantees that we need.
> > 
> > This a_store function is performance critical for the __lock
> > primitive. In my benchmarks to test my stdatomic implementation I have a
> > substantial performance increase (more than 10%), just because malloc
> > does better with it.
> 
> Is there a reason you're not using the same approach as on i386? It
> was faster than xchg for me, and in principle it "should be faster".

I discovered your approach for i386 after I experimented with "xchg"
for x86_64. I guess the "lock orl" instruction is a replacement for
"mfence" because that one is not implemented for all variants of i386?

Exactly why a "mov" followed by a read-modify-write operation on some
unrelated address (here the one in the stack pointer) should be faster
than a single read-modify-write operation on exactly the address you
want to deal with is not clear to me. It looks weird.

I trust you that it is, but seen from the outside this arch stuff
resembles voodoo more than anything else.

I'll experiment a bit with "mov" and your approach and see what I get.
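
For reference, the three store variants discussed in this thread can be
sketched roughly as below. This is only an illustration, not the actual
musl source: the function names (store_mfence, store_xchg,
store_lock_or, self_check) are made up for this mail, the inline asm
assumes GCC or Clang with AT&T syntax on x86_64, and other targets fall
back to the compiler's __atomic builtins. It says nothing about which
variant is faster; that still has to be measured.

```c
#include <assert.h>

/* Hypothetical names; a sketch of the variants, not musl's a_store. */
#if defined(__x86_64__)

/* Variant 1: plain store followed by a full memory fence. */
static inline void store_mfence(volatile int *p, int x)
{
    __asm__ __volatile__("mov %1, %0 ; mfence"
                         : "=m"(*p) : "r"(x) : "memory");
}

/* Variant 2: xchg with a memory operand is implicitly locked, so the
   store and the barrier happen in a single instruction. */
static inline void store_xchg(volatile int *p, int x)
{
    __asm__ __volatile__("xchg %0, %1"
                         : "+r"(x), "+m"(*p) : : "memory");
}

/* Variant 3 (the i386-style approach): plain store, then a locked
   no-op read-modify-write on the word at the stack pointer. */
static inline void store_lock_or(volatile int *p, int x)
{
    __asm__ __volatile__("mov %1, %0 ; lock ; orl $0,(%%rsp)"
                         : "=m"(*p) : "r"(x) : "memory", "cc");
}

#else  /* Not x86_64: let the compiler pick a sequentially consistent store. */

static inline void store_mfence(volatile int *p, int x)
{ __atomic_store_n(p, x, __ATOMIC_SEQ_CST); }
static inline void store_xchg(volatile int *p, int x)
{ (void)__atomic_exchange_n(p, x, __ATOMIC_SEQ_CST); }
static inline void store_lock_or(volatile int *p, int x)
{ __atomic_store_n(p, x, __ATOMIC_SEQ_CST); }

#endif

/* Single-threaded sanity check: every variant must at least store the
   value. It does not exercise the ordering guarantees. */
static int self_check(void)
{
    volatile int v = 0;
    store_mfence(&v, 1);  assert(v == 1);
    store_xchg(&v, 2);    assert(v == 2);
    store_lock_or(&v, 3); assert(v == 3);
    return v;
}
```

Calling self_check() only confirms that the stores land; comparing the
variants needs a contended benchmark, such as the malloc test mentioned
above.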

Thanks

Jens


-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::



