Date: Tue, 10 Feb 2015 16:37:56 -0500
From: Rich Felker <>
Subject: Re: [PATCH 1/2] x86_64/memset: simple optimizations

On Tue, Feb 10, 2015 at 10:08:29PM +0100, Denys Vlasenko wrote:
> On Tue, Feb 10, 2015 at 9:50 PM, Rich Felker <> wrote:
> > On Tue, Feb 10, 2015 at 06:30:56PM +0100, Denys Vlasenko wrote:
> >> "and $0xff,%esi" is a six-byte insn (81 e6 ff 00 00 00), can use
> >> 4-byte "movzbl %sil,%esi" (40 0f b6 f6) instead.
> >> [...]
> >
> > Do you want to go ahead with these patches as-is, or consider some of
> > the other ideas we discussed off-list like avoiding the 64-bit imul
> > entirely in the small-n case? If you think that's easy as another
> > incremental change I'll go ahead with these.
> I think you can apply these patches without waiting
> for potential future improvements.

OK. Based on some casual testing on my Celeron 847:

- For small sizes, your patches make a significant improvement, 20-30%.

- For the rep stosq path, the improvement is minimal (roughly 1-2 cycles).

- Using a 32-bit imul instead of the 64-bit one makes no difference at all.

I'll review the patches again for correctness, but so far they look
good, and it doesn't look like these are things we'd want to back out
or rewrite for subsequent improvements anyway.
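
For reference, a minimal C sketch of what the two instructions under
discussion compute. This is an illustration only, not the musl source:
the function names are hypothetical, and it assumes the imul is used to
replicate the fill byte across a register. The zero-extension step
corresponds to "movzbl %sil,%esi" (equivalently "and $0xff,%esi").

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low byte of the fill value,
 * the same result "movzbl %sil,%esi" or "and $0xff,%esi" produces. */
static inline uint32_t fill_byte(int c)
{
    return (unsigned char)c;
}

/* Replicate the byte into all 8 bytes of a 64-bit word (64-bit multiply). */
static inline uint64_t broadcast64(int c)
{
    return fill_byte(c) * 0x0101010101010101ULL;
}

/* Replicate the byte into all 4 bytes of a 32-bit word (32-bit multiply).
 * Enough when no store in the small-n case is wider than 4 bytes. */
static inline uint32_t broadcast32(int c)
{
    return fill_byte(c) * 0x01010101U;
}
```

The low 32 bits of broadcast64(c) always equal broadcast32(c), so the
narrower multiply loses nothing when only 32-bit or smaller stores follow.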


