Date: Sat, 6 Jun 2015 14:47:35 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Interleaving of intrinsics

Lei,

Your use of VTune appears to be similar to use of gprof. If you use VTune at all, I'd expect you to profile things such as cache misses and pipeline stalls, as well as utilization of the CPU's execution units. Things that only the CPU vendor's profiler is capable of. For what you do with it now, I'd just use gprof.

On Sat, Jun 06, 2015 at 07:38:18PM +0800, Lei Zhang wrote:
> Same settings as the previous, except for longer run time (--test=20):

Are the benchmark results significantly affected by your use of profiling, vs. a non-profiled run? This is very important. In some cases, profiling may change performance by an order of magnitude or even worse, which means that its results would be of questionable relevance.

> Use of intrinsics is counted as function calls

That's weird. You need to make sure they haven't, in fact, been turned into function calls or the like in this profiling build. If they have, performance is probably at a level much worse than what we normally see, and if so this is an instance of the problem I mentioned above.

Alexander
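
The profiled vs. non-profiled comparison asked about above can be sketched as a small check. This is only an illustration, not part of the original message: the function names, the 1.1x tolerance, and the sample rates are all assumptions made up for the example, and the rates would have to be read by hand from two benchmark runs of the same format.

```python
def slowdown(baseline_cps: float, profiled_cps: float) -> float:
    """Return how many times slower the profiled run was (1.0 means no change).

    Both arguments are benchmark rates (e.g. candidates per second) for the
    same test, one from a normal build and one from the profiling build.
    """
    if baseline_cps <= 0 or profiled_cps <= 0:
        raise ValueError("benchmark rates must be positive")
    return baseline_cps / profiled_cps


def profile_is_representative(baseline_cps: float, profiled_cps: float,
                              max_slowdown: float = 1.1) -> bool:
    """Treat the profile as usable only if the slowdown stays within tolerance.

    max_slowdown is an assumed, arbitrary threshold; choose one that fits
    the noise level of your own benchmarks.
    """
    return slowdown(baseline_cps, profiled_cps) <= max_slowdown


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    baseline, profiled = 1000.0, 100.0
    factor = slowdown(baseline, profiled)
    print(f"profiled run is {factor:.2f}x slower than the normal run")
    if not profile_is_representative(baseline, profiled):
        print("profile is likely not representative of normal performance")
```

A slowdown near an order of magnitude, as with the made-up numbers here, would suggest that the profiling build changed the code being measured (for example, intrinsics no longer being inlined) rather than merely observing it.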