Date: Wed, 26 Oct 2011 11:17:35 +0400
From: Solar Designer <solar@...nwall.com>
To: owl-dev@...ts.openwall.com
Subject: Re: %optflags for new gcc

On Wed, Oct 26, 2011 at 10:22:37AM +0400, Vasiliy Kulikov wrote:
> IMHO -O2 is much, MUCH better and more consistent. I heard numerous
> complains about -Os like -Os doesn't do it's job well providing code
> which is even bigger than with -O2. Also sometimes -O2 is faster by
> order of magnitudes.

Sounds like FUD. Any specific examples?

I am willing to believe that -Os sometimes increases code size, although
what I am seeing so far is that with 4.6.1 it decreases code size
compared to -O2 and it often makes the code faster as well (it does for
John the Ripper, for example).

As to an orders of magnitude slowdown with -Os compared to -O2, I do not
believe. Such a large difference would typically be caused by
cache-related issues, and even then it is tricky to achieve even on
purpose. (I tried that before - to measure cache associativity.) While
there can be occasional poor luck (e.g., addresses of frequently
accessed variables just happened to be such that they conflict in the
cache), it is wrong to blame these gcc options for that.

At least on one of my benchmarks, merely going from 4.5.0 to 4.6.1, both
at -O2, decreased the speed by over 25%. Going from -O2 to -Os gave
some of that speed back (but not all of it...) This was with code
making use of SSE2 intrinsics, though, which is not very common. But I
am also seeing a slowdown with 4.6.1's -O2 with the intrinsics disabled
(same algorithm), just not that much of a slowdown (a 12% speed
difference between -O2 and -Os, in favor of -Os).

> I didn't thoroughly compared both variants, though. I suppose the
> latest versions of gcc were improving -O2 optimizations, but not -Os,
> and the profit of -Os might be negligible in most (not all) cases.

To me, it feels like in 4.6.x -O2 includes some optimizations that are
often making things worse. This didn't happen with 4.5.0 in my testing -
maybe not so much because the included sets of optimizations changed (I
think they didn't change much between these versions), but rather
because of some yet unidentified regressions in some specific
optimizations that were enabled with -O2 previously as well. Thus, in
4.6.x -O2 feels questionable much like -O3 did before. -Os seems to be
a more conservative alternative, maybe like -O2 was before.

On the other hand, gcc 4.6.0 prereleases probably were extensively
tested with SPEC benchmarks and others, so perhaps things are a lot
better with -O2 on average, and I just happened to trigger non-typical
behavior with my less usual code (need to get it into SPEC CPU and
OpenMP benchmarks).

Maybe we need to run more benchmarks ourselves.

...or maybe this gcc options stuff is not worth our time right now since
we have a lot of tasks that would be of more importance for making Owl
an attractive choice for more of our prospective users. There's little
point in having gcc options tuned when our glibc is out of date and
there's no official LAMP stack for Owl yet. So please proceed with
those tasks.

Alexander
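
The "poor luck" case mentioned in the message (frequently accessed variables whose addresses happen to conflict in the cache) can be illustrated with a small sketch. It is not from the original message. It assumes a hypothetical 32 KiB, 8-way set-associative cache with 64-byte lines, and it treats addresses as plain integers while ignoring the virtual versus physical indexing details of real CPUs. The names `set_index` and `conflicting` are made up for this illustration.

```python
# Hypothetical L1 data cache geometry (assumed, not taken from the message).
LINE_SIZE = 64          # bytes per cache line
WAYS = 8                # associativity: lines that fit in one set
CACHE_SIZE = 32 * 1024  # total cache size in bytes
NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)  # 64 sets for this geometry


def set_index(addr):
    """Return the cache set that a byte address maps to."""
    return (addr // LINE_SIZE) % NUM_SETS


def conflicting(addrs):
    """Group addresses by cache set and keep only the sets that are
    asked to hold more distinct lines than the associativity allows."""
    by_set = {}
    for a in addrs:
        by_set.setdefault(set_index(a), set()).add(a // LINE_SIZE)
    return {s: lines for s, lines in by_set.items() if len(lines) > WAYS}


# Nine hot variables spaced exactly LINE_SIZE * NUM_SETS bytes apart
# all map to the same set, one more than an 8-way set can hold at once.
stride = LINE_SIZE * NUM_SETS
hot = [i * stride for i in range(9)]
print(conflicting(hot))
```

With these assumed parameters, the nine addresses compete for a single set, so at least one of them is evicted on every pass over the data. Where such variables end up is a matter of data layout and link order, which is why the message attributes this kind of slowdown to luck and not to -O2 or -Os as such.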