musl - Re: [PATCH] replace a mfence instruction by an xchg instruction

Message-ID: <20150815201755.GL31018@brightrain.aerifal.cx>
Date: Sat, 15 Aug 2015 16:17:55 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg
 instruction

On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> according to the wisdom of the Internet, e.g.,
> 
> https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> 
> an mfence instruction is about 3 times slower than an xchg instruction.

Ugh, then why does this instruction even exist if it does less and
does it slower?

> Here we not only had mfence but also the mov instruction that was to be
> protected by the fence. Replace all that by a native atomic instruction
> that gives all the ordering guarantees that we need.
> 
> This a_store function is performance critical for the __lock
> primitive. In my benchmarks to test my stdatomic implementation I have a
> substantial performance increase (more than 10%), just because malloc
> does better with it.
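
For readers following along, the two store strategies under discussion can
be sketched in portable C11 as below. This is only an illustration, not the
patch itself: the function names are hypothetical, and the claim that the
first variant typically compiles to a mov followed by a fence and the second
to a single xchg on x86 is an assumption about common compilers, not
something stated in this thread.

```c
#include <stdatomic.h>

/* Variant 1 (hypothetical name): plain store, then a full barrier.
 * On x86 this is typically emitted as a mov plus a fence instruction. */
static inline void store_then_fence(atomic_int *p, int x)
{
    atomic_store_explicit(p, x, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
}

/* Variant 2 (hypothetical name): atomic exchange, old value discarded.
 * On x86 this is typically emitted as one xchg, which is implicitly
 * locked and therefore also acts as a full barrier. */
static inline void store_by_exchange(atomic_int *p, int x)
{
    (void)atomic_exchange_explicit(p, x, memory_order_seq_cst);
}
```

Both variants leave the same value in *p; they differ only in which
instructions provide the ordering, which is what the benchmark numbers
above are about.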

Is there a reason you're not using the same approach as on i386? It
was faster than xchg for me, and in principle it "should be faster".
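
For context, one barrier idiom commonly used on x86 is a plain store
followed by a locked no-op on the top of the stack. The sketch below shows
that idiom; that it matches the i386 code referred to here is an assumption,
and the function name is hypothetical. It uses GCC-style inline asm and
falls back to a compiler builtin on other targets.

```c
/* Hypothetical helper: store, then use a locked "or $0" on the stack
 * as a full barrier. The or leaves the stack contents unchanged; only
 * its lock prefix matters. */
static inline void store_lock_or(volatile int *p, int x)
{
#if defined(__x86_64__)
    __asm__ __volatile__("mov %1, %0 ; lock ; orl $0,(%%rsp)"
                         : "=m"(*p) : "r"(x) : "memory");
#elif defined(__i386__)
    __asm__ __volatile__("movl %1, %0 ; lock ; orl $0,(%%esp)"
                         : "=m"(*p) : "r"(x) : "memory");
#else
    /* Non-x86 fallback so the sketch still builds: store + full barrier. */
    *p = x;
    __sync_synchronize();
#endif
}
```

Unlike xchg, the store itself stays an ordinary mov and the locked
operation touches a stack line that is almost certainly already in cache,
which is one reason it could in principle be cheaper.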

Rich
