Date: Tue, 10 Feb 2015 16:37:56 -0500
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 1/2] x86_64/memset: simple optimizations

On Tue, Feb 10, 2015 at 10:08:29PM +0100, Denys Vlasenko wrote:
> On Tue, Feb 10, 2015 at 9:50 PM, Rich Felker <dalias@...c.org> wrote:
> > On Tue, Feb 10, 2015 at 06:30:56PM +0100, Denys Vlasenko wrote:
> >> "and $0xff,%esi" is a six-byte insn (81 e6 ff 00 00 00), can use
> >> 4-byte "movzbl %sil,%esi" (40 0f b6 f6) instead.
> >> [...]
> >
> > Do you want to go ahead with these patches as-is, or consider some of
> > the other ideas we discussed off-list like avoiding the 64-bit imul
> > entirely in the small-n case? If you think that's easy as another
> > incremental change I'll go ahead with these
>
> I think you can apply these patches without waiting
> for potential future improvements.

OK. Based on some casual testing on my Celeron 847:

- For small sizes, your patches make significant improvement, 20-30%.
- For rep stosq path, the improvement is minimal (roughly 1-2 cycles).
- Using 32-bit imul instead of 64-bit makes no difference at all.

I'll review the patches again for correctness, but so far they look
good, and it doesn't look like these are things we'd want to back out
or rewrite for subsequent improvements anyway.

Thanks!

Rich
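
For readers following the thread: in a typical memset, the imul discussed above is the multiply that replicates the fill byte across a register before the stores. Below is a minimal C sketch of that idea, assuming the usual 0x0101... multiplier. It is not the code from the patches, and the function names broadcast64 and broadcast32 are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: replicate the low byte of c into all 8 bytes.
 * The cast to unsigned char plays the role of the zero-extension
 * ("and $0xff,%esi" or "movzbl %sil,%esi"); the multiply plays the
 * role of the 64-bit imul. */
static uint64_t broadcast64(int c)
{
    return (uint64_t)(unsigned char)c * 0x0101010101010101ULL;
}

/* Hypothetical 32-bit variant: replicate the low byte into 4 bytes,
 * which needs only a 32-bit multiply. */
static uint32_t broadcast32(int c)
{
    return (uint32_t)(unsigned char)c * 0x01010101U;
}
```

With c = 0x1ab the upper bits are discarded first, so broadcast64 yields 0xabababababababab and broadcast32 yields 0xabababab. Whether the narrower multiply is any faster depends on the CPU; the measurement reported above found no difference on a Celeron 847.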