Date: Sat, 6 Jun 2015 14:47:35 +0300
From: Solar Designer
Subject: Re: Interleaving of intrinsics


You appear to be using VTune much like one would use gprof.  If you use
VTune at all, I'd expect you to profile things such as cache misses and
pipeline stalls, as well as utilization of the CPU's execution units:
things that only the CPU vendor's profiler is capable of.

For what you do with it now, I'd just use gprof.

On Sat, Jun 06, 2015 at 07:38:18PM +0800, Lei Zhang wrote:
> Same settings as the previous, except for longer run time (--test=20):

Are the benchmark results significantly affected by your use of
profiling, vs. a non-profiled run?  This is very important.  In some
cases, profiling may change performance by an order of magnitude or even
worse, which means that its results would be of questionable relevance.
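
As a rough illustration of that check, here is a minimal sketch that
compares the throughput of a profiled run against a non-profiled one.
The function names and the threshold are hypothetical, not anything from
JtR or VTune; the inputs would be whatever c/s figures the two --test
runs report.

```c
#include <assert.h>

/* Hypothetical helper: how many times slower the profiled run is.
 * Inputs are throughput figures (e.g. c/s) from the two runs.
 * 1.0 means profiling had no measurable effect. */
double profiling_slowdown(double plain_cps, double profiled_cps)
{
    assert(plain_cps > 0.0 && profiled_cps > 0.0);
    return plain_cps / profiled_cps;
}

/* Returns 1 if the profiled run stays within max_slowdown of the plain
 * run, 0 otherwise.  The threshold is an arbitrary assumption; pick one
 * that suits how much distortion you are willing to accept. */
int profile_is_representative(double plain_cps, double profiled_cps,
                              double max_slowdown)
{
    return profiling_slowdown(plain_cps, profiled_cps) <= max_slowdown;
}
```

With a threshold of, say, 1.1, a profiled run that reaches only a tenth
of the normal speed would be flagged as not representative.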

> Use of intrinsics is counted as function calls

That's weird.  You need to make sure they haven't, in fact, been turned
into function calls or the like in this profiling build.  If they have,
performance is probably at a level much worse than what we normally see,
and if so this is an instance of the problem I mentioned above.
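
One way to check this, sketched below under assumptions: compile a small
kernel with exactly the flags of the profiling build, adding -S (or run
objdump -d on the object file), and see whether the intrinsic shows up as
a single SIMD instruction or as a call.  The function add4 is a
hypothetical stand-in, not code from JtR; the scalar branch is only there
so the sketch also builds where SSE2 is unavailable.

```c
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical kernel: add four 32-bit lanes of a and b into out. */
void add4(uint32_t *out, const uint32_t *a, const uint32_t *b)
{
#if defined(__SSE2__)
    /* Unaligned loads/stores so the caller need not align the arrays. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    /* If properly inlined, this should become one paddd instruction. */
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Scalar fallback for non-SSE2 targets. */
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

In the assembly for add4, a paddd (or vpaddd) with no call instruction
around it would suggest the intrinsic is still inlined in that build.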

