Date: Sun, 16 Aug 2015 18:16:41 +0200
From: Jens Gustedt <jens.gustedt@...ia.fr>
To: musl@...ts.openwall.com
Subject: Re: [PATCH] replace a mfence instruction by an xchg instruction

On Sunday, 16.08.2015, at 11:58 -0400, Rich Felker wrote:
> On Sun, Aug 16, 2015 at 05:50:21PM +0200, Jens Gustedt wrote:
> > > See page 330, http://www.intel.com/Assets/en_US/PDF/manual/253668.pdf
> > >
> > > So mfence seems to be weaker than lock-prefixed instructions in terms
> > > of the ordering it imposes (lock-prefixed instructions forbid
> > > reordering and also have a total ordering across all cores).
> >
> > Yes, it says so on page 8-26 that the fences are definitively not
> > serializing instructions.
> >
> > (But what I tried to show in my previous mail still holds, the
> > instruction latency itself plays a big part in the efficiency of these
> > instructions.)
>
> I wasn't trying to contradict anything you've said, just expressing
> the absurdity of mfence being slower than lock-prefixed instructions,
> since it's a strictly-weaker operation.

Yes, I got that :)

One argument that we have neglected so far is the impact on other
threads/cores. Even if an mfence instruction may be more expensive for
the thread that issues it, it imposes fewer constraints on other
threads. Maybe overall this could be a win?

> > I read all of that as:
> >
> > - mfence can be used to achieve acq_rel ordering
> > - none of the fences can be used to achieve seq_cst ordering
>
> By this you mean that only lock-prefixed instructions impose a total
> order across all cores?

Plus the very expensive, fully serializing instructions that are
listed in the manual.

> > Wasn't the idea that all atomic.h functions implement sequential
> > consistency?
>
> Yes, that's the intent, but I don't want to introduce 'major'
> performance regressions fixing 'minor' failures to be seq_cst if
> there's no observable misbehavior in the code using them.

Misbehavior here is really hard to track down. In particular, an
application that changes behavior because seq_cst is not guaranteed is
probably quite difficult to observe.

> Still it
> would be nice to know whether such failures still exist, and if so
> where, so we can eventually clean this up.

Replacing "mfence" by "lock ; orl $0,(%%rsp)" would give us that
security without compromising performance :)

Jens

-- 
:: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::
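
For reference, the replacement suggested in this mail can be sketched as follows. This is only a minimal illustration, assuming GCC or Clang extended inline assembly on x86-64. The helper names (barrier_mfence, barrier_lock_orl, publish_then_check) are hypothetical and are not taken from musl's atomic.h, and on other architectures the sketch simply falls back to a C11 fence.

```c
#include <stdatomic.h>

/* Hypothetical helper: the barrier as currently written with mfence. */
static inline void barrier_mfence(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("mfence" ::: "memory");
#else
    /* Assumption: portable fallback so the sketch builds elsewhere. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Hypothetical helper: the proposed replacement. The locked OR with 0
 * leaves the word at the top of the stack unchanged; only the ordering
 * effect of the lock prefix is wanted. */
static inline void barrier_lock_orl(void)
{
#if defined(__x86_64__)
    __asm__ __volatile__("lock ; orl $0,(%%rsp)" ::: "memory", "cc");
#else
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Store to *mine, then load *theirs, with a full barrier in between.
 * A store followed by a load from a different location is the case
 * that x86 may reorder when no barrier is present. */
static inline int publish_then_check(atomic_int *mine, atomic_int *theirs)
{
    atomic_store_explicit(mine, 1, memory_order_relaxed);
    barrier_lock_orl();
    return atomic_load_explicit(theirs, memory_order_relaxed);
}
```

Whether the locked variant is actually cheaper than mfence depends on the CPU; the sketch only shows the shape of the two barriers, not a measurement.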