Date: Sun, 16 Aug 2015 17:50:21 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg
 instruction

On Sunday, 16.08.2015, at 11:16 -0400, Rich Felker wrote:
> On Sun, Aug 16, 2015 at 02:42:33PM +0200, Jens Gustedt wrote:
> > Hello,
> > 
> > On Saturday, 15.08.2015, at 19:28 -0400, Rich Felker wrote:
> > > On Sat, Aug 15, 2015 at 11:01:40PM +0200, Jens Gustedt wrote:
> > > > On Saturday, 15.08.2015, at 16:17 -0400, Rich Felker wrote:
> > > > > On Sat, Aug 15, 2015 at 08:51:41AM +0200, Jens Gustedt wrote:
> > > > > > according to the wisdom of the Internet, e.g.
> > > > > > 
> > > > > > https://peeterjoot.wordpress.com/2009/12/04/intel-memory-ordering-fence-instructions-and-atomic-operations/
> > > > > > 
> > > > > > a mfence instruction is about 3 times slower than an xchg instruction.
> > > > > 
> > > > > Uhg, then why does this instruction even exist if it does less and
> > > > > does it slower?
> > > > 
> > > > Because they do different things ?)
> > > > 
> > > > mfence is to synchronize all memory, xchg, at least at a first glance,
> > > > only one word.
> > > 
> > > No, any lock-prefixed instruction, or xchg which has a builtin lock,
> > > fully orders all memory accesses. Essentially it contains a builtin
> > > mfence.
> > 
> > Hm, I think mfence does a bit more than that. The three fence
> > instructions were introduced when they invented the asynchronous
> > ("non-temporal") move instructions that came with sse.
> > 
> > I don't think that "lock" instructions synchronize with these
> > asynchronous moves, so the two (lock instructions and fences) are just
> > different types of animals. And this answers perhaps your question
> > up-thread, why there is actually something like mfence.
> 
> The relevant text seems to be the Intel manual, Vol 3A, 8.2.2 Memory
> Ordering in P6 and More Recent Processor Families:
> 
> ----------------------------------------------------------------------
> Reads are not reordered with other reads.
> 
> Writes are not reordered with older reads.
> 
> Writes to memory are not reordered with other writes, with the
> following exceptions:
> —   writes executed with the CLFLUSH instruction;
> —   streaming stores (writes) executed with the non-temporal move
> instructions (MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPS, and MOVNTPD); and
> —   string operations (see Section 8.2.4.1).
> 
> Reads may be reordered with older writes to different locations but
> not with older writes to the same location. 
> 
> Reads or writes cannot be reordered with I/O instructions, locked
> instructions, or serializing instructions.
> 
> Reads cannot pass earlier LFENCE and MFENCE instructions.
> 
> Writes cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.
> 
> LFENCE instructions cannot pass earlier reads.
> 
> SFENCE instructions cannot pass earlier writes.
> 
> MFENCE instructions cannot pass earlier reads or writes
> ----------------------------------------------------------------------
> 
> See page 330, http://www.intel.com/Assets/en_US/PDF/manual/253668.pdf
> 
> So mfence seems to be weaker than lock-prefixed instructions in terms
> of the ordering it imposes (lock-prefixed instructions forbid
> reordering and also have a total ordering across all cores).

Yes, page 8-26 states explicitly that the fences are not serializing
instructions.

(But what I tried to show in my previous mail still holds: the latency
of the instruction itself plays a big part in the efficiency of these
instructions.)
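
To make the two candidates concrete, here is a minimal sketch of a
barrier written once with mfence and once with xchg. It assumes
GCC-style inline asm on x86_64 and falls back to a compiler builtin
elsewhere. The names barrier_mfence, barrier_xchg and barrier_demo are
made up for illustration; this is neither the patch itself nor musl's
actual barrier code.

```c
#include <assert.h>

/* Hypothetical name: full barrier via the dedicated fence instruction. */
static inline void barrier_mfence(void)
{
#if defined(__x86_64__)
	__asm__ __volatile__("mfence" : : : "memory");
#else
	/* Assumption: non-x86 fallback via the GCC/Clang builtin. */
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Hypothetical name: barrier via xchg on a dummy stack slot.
 * xchg with a memory operand is implicitly locked. */
static inline void barrier_xchg(void)
{
#if defined(__x86_64__)
	int dummy = 0, tmp = 1;
	__asm__ __volatile__("xchg %0, %1"
		: "+r"(tmp), "+m"(dummy) : : "memory");
#else
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/* Both variants are drop-in replacements for each other at the call site. */
static int barrier_demo(void)
{
	volatile int data = 0;
	data = 42;
	barrier_mfence();
	barrier_xchg();
	assert(data == 42);
	return data;
}
```

Which of the two is cheaper has to be measured on the target CPU; the
sketch only shows that the call sites look the same either way.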

I read all of that as:

 - mfence can be used to achieve acq_rel ordering
 - none of the fences can be used to achieve seq_cst ordering

Wasn't the idea that all atomic.h functions implement sequential
consistency?
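
For reference, the property in question can be stated as a small litmus
test in C11 terms. This is only a sketch: the variables x and y and the
sb_* function names are hypothetical, and it uses <stdatomic.h> rather
than musl's atomic.h. Two threads each store to one location and then
load the other. With seq_cst on all four accesses, the outcome where
both loads return 0 is forbidden; with only release stores and acquire
loads it is allowed, which matches the "reads may be reordered with
older writes to different locations" rule quoted above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical litmus-test variables, not taken from musl. */
static atomic_int x, y;

/* Intended to run in thread 0: store x, then load y. */
static int sb_thread0(void)
{
	atomic_store_explicit(&x, 1, memory_order_seq_cst);
	return atomic_load_explicit(&y, memory_order_seq_cst);
}

/* Intended to run in thread 1: store y, then load x. */
static int sb_thread1(void)
{
	atomic_store_explicit(&y, 1, memory_order_seq_cst);
	return atomic_load_explicit(&x, memory_order_seq_cst);
}

/* Reset both locations before each run of the test. */
static void sb_reset(void)
{
	atomic_store(&x, 0);
	atomic_store(&y, 0);
}

/* Under seq_cst, at least one thread must observe the other's store. */
static void sb_check(int r0, int r1)
{
	assert(r0 == 1 || r1 == 1);
}
```

To exercise it, run sb_thread0 and sb_thread1 concurrently in two
threads after sb_reset, and pass both return values to sb_check.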

Jens

-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::




