Date: Wed, 26 Oct 2011 11:17:35 +0400
From: Solar Designer <solar@...nwall.com>
To: owl-dev@...ts.openwall.com
Subject: Re: %optflags for new gcc

On Wed, Oct 26, 2011 at 10:22:37AM +0400, Vasiliy Kulikov wrote:
> IMHO -O2 is much, MUCH better and more consistent.  I heard numerous
> complaints about -Os, e.g. that -Os doesn't do its job well, producing
> code which is even bigger than with -O2.  Also, sometimes -O2 is faster
> by orders of magnitude.

Sounds like FUD.  Any specific examples?

I am willing to believe that -Os sometimes increases code size, although
what I am seeing so far is that with 4.6.1 it decreases code size
compared to -O2 and it often makes the code faster as well (it does for
John the Ripper, for example).

As to an orders-of-magnitude slowdown with -Os compared to -O2, I do not
believe it.  Such a large difference would typically be caused by
cache-related issues, and even then it is tricky to achieve even on
purpose.  (I tried that before - to measure cache associativity.)  While
there can be occasional poor luck (e.g., addresses of frequently
accessed variables just happened to be such that they conflict in the
cache), it is wrong to blame these gcc options for that.
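
To illustrate the kind of "poor luck" meant here, below is a toy model of
a set-associative cache.  It is only a sketch: the cache geometry (32 KiB,
8-way, 64-byte lines) and the addresses are assumptions chosen for
illustration, not measurements from any particular CPU or from the
benchmarks discussed in this message.

```python
# Toy model of a set-associative cache.  All parameters are illustrative
# assumptions, not taken from any particular CPU.
LINE_SIZE = 64                                # bytes per cache line
WAYS = 8                                      # associativity
CACHE_SIZE = 32 * 1024                        # total cache size in bytes
NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)   # 64 sets
WAY_SIZE = NUM_SETS * LINE_SIZE               # 4096 bytes


def set_index(addr):
    """Return the cache set that the given byte address maps to."""
    return (addr // LINE_SIZE) % NUM_SETS


def conflicting(addrs):
    """Return True if more distinct lines than WAYS map to a single set."""
    per_set = {}
    for a in addrs:
        per_set.setdefault(set_index(a), set()).add(a // LINE_SIZE)
    return any(len(lines) > WAYS for lines in per_set.values())


base = 0x100000  # hypothetical base address
# Nine hot variables spaced exactly WAY_SIZE apart all land in one set,
# so an 8-way cache cannot hold them all at once.
unlucky = [base + i * WAY_SIZE for i in range(WAYS + 1)]
# The same nine variables on adjacent lines spread over nine sets.
lucky = [base + i * LINE_SIZE for i in range(WAYS + 1)]

print(conflicting(unlucky), conflicting(lucky))
```

In this model the conflict depends only on where the variables happen to
be placed, which any change of compiler version or options can shift.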

At least on one of my benchmarks, merely going from 4.5.0 to 4.6.1, both
at -O2, decreased the speed by over 25%.  Going from -O2 to -Os gave
some of that speed back (but not all of it...)  This was with code
making use of SSE2 intrinsics, though, which is not very common.  But I
am also seeing a slowdown with 4.6.1's -O2 with the intrinsics disabled
(same algorithm), just not that much of a slowdown (a 12% speed
difference between -O2 and -Os, in favor of -Os).
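
A comparison of this kind could be scripted roughly as follows.  This is
a minimal sketch, not the procedure actually used for the numbers above:
it assumes gcc is on PATH, the source file passed to compare() is a
hypothetical self-contained benchmark, and the helper names are made up
for this example.  Output file size is only a rough proxy for code size.

```python
import os
import subprocess
import time


def build(source, flag, out):
    """Compile source with one optimization flag; return output file size."""
    # Assumes gcc is on PATH; "source" is a hypothetical benchmark file.
    subprocess.run(["gcc", flag, "-o", out, source], check=True)
    return os.path.getsize(out)  # whole-file size, a rough proxy only


def best_time(cmd, runs=5):
    """Run cmd several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        t0 = time.perf_counter()
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - t0)
    return best


def speed_change_percent(t_base, t_new):
    """Speed is 1/time; a positive result means the new build is faster."""
    return (t_base / t_new - 1.0) * 100.0


def compare(source, flags=("-O2", "-Os")):
    """Build and time the benchmark once per flag; return {flag: (size, time)}."""
    results = {}
    for flag in flags:
        out = "./bench" + flag  # e.g. "./bench-O2"
        size = build(source, flag, out)
        results[flag] = (size, best_time([out]))
    return results
```

For example, with results from compare("bench.c"), the call
speed_change_percent(results["-O2"][1], results["-Os"][1]) gives the
speed difference in favor of -Os as a percentage.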

> I haven't thoroughly compared both variants, though.  I suppose the
> latest versions of gcc were improving -O2 optimizations, but not -Os,
> and the profit of -Os might be negligible in most (not all) cases.

To me, it feels like in 4.6.x -O2 includes some optimizations that are
often making things worse.  This didn't happen with 4.5.0 in my testing -
maybe not so much because the included sets of optimizations changed
(I think they didn't change much between these versions), but rather
because of some yet unidentified regressions in some specific
optimizations that were enabled with -O2 previously as well.  Thus, in
4.6.x -O2 feels questionable much like -O3 did before.  -Os seems to be
a more conservative alternative, maybe like -O2 was before.

On the other hand, gcc 4.6.0 prereleases probably were extensively
tested with SPEC benchmarks and others, so perhaps things are a lot
better with -O2 on average, and I just happened to trigger non-typical
behavior with my less usual code (need to get it into SPEC CPU and
OpenMP benchmarks).

Maybe we need to run more benchmarks ourselves.  ...or maybe this gcc
options stuff is not worth our time right now since we have a lot of
tasks that would be of more importance for making Owl an attractive
choice for more of our prospective users.  There's little point in
having gcc options tuned when our glibc is out of date and there's no
official LAMP stack for Owl yet.  So please proceed with those tasks.

Alexander
