Date: Fri, 9 Aug 2013 16:03:56 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Intel OpenCL on CPU and MIC

On 8 Aug, 2013, at 23:32 , Solar Designer <solar@...nwall.com> wrote:
> -cl-strict-aliasing was not understood by (and resulted in errors from) Intel's OpenCL compiler.  (Should we make this change standard?)

This is a violation of the OpenCL spec: that option has been mandatory ever since 1.0, afaik. We can drop it for now, but we should probably use it selectively, since some compilers may produce faster code with this option, right?
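To illustrate what "selectively" could look like, here is a minimal sketch, not the actual JtR code. The helper name build_cl_options() is hypothetical. Matching "Intel" in the CL_PLATFORM_VENDOR string is only an assumption about how one might detect the compiler that rejects the flag, and -cl-mad-enable below is just a stand-in for whatever base options a format already passes.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not John the Ripper's actual code): build the
 * options string that would be passed to clBuildProgram(), appending
 * -cl-strict-aliasing only when the platform is not known to reject it.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int build_cl_options(const char *platform_vendor, const char *base_opts,
                            char *out, size_t out_size)
{
    /* Assumption: a substring match on the vendor string is good enough. */
    int strict_ok = (strstr(platform_vendor, "Intel") == NULL);
    int n = snprintf(out, out_size, "%s%s", base_opts,
                     strict_ok ? " -cl-strict-aliasing" : "");

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

With a vendor string of "Intel(R) Corporation" the result is just the base options; with any other vendor the flag is appended. A per-device benchmark could replace the vendor check if it turns out the flag matters for speed.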

> I am fairly certain that the kernel was not vectorized, which is why the poor speed.

My experience with different CPU drivers is that some do best with a scalar kernel (the driver auto-vectorizes it and uses SIMD), while others do better with an explicitly vectorized kernel. Unfortunately there is no one-size-fits-all. Some of my formats and kernels honor the --request-vectorize option just so we can experiment with that.
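A rough sketch of how such a knob might be wired up on the host side, under stated assumptions: the function names and the V_WIDTH macro are made up for illustration, preferred_width stands for the value clGetDeviceInfo() reports for CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT, and user_width is an explicit override where 0 means "no override" and 1 forces a scalar kernel.

```c
#include <stdio.h>

/*
 * Hypothetical sketch: decide which vector width a kernel is built with.
 * An explicit user request wins over the device's preferred width.
 */
static unsigned choose_vector_width(unsigned preferred_width, unsigned user_width)
{
    unsigned w = user_width ? user_width : preferred_width;

    /* This sketch only accepts power-of-two widths; anything else is scalar. */
    switch (w) {
    case 2: case 4: case 8: case 16:
        return w;
    default:
        return 1;
    }
}

/*
 * Emit a -D define so the kernel source could pick uint vs. uintN.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
static int vector_width_define(unsigned width, char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "-DV_WIDTH=%u", width);

    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

The point is only that the same kernel source can be built both ways and benchmarked per driver, rather than hard-coding either choice.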

> I am getting roughly the same cumulative speed by building the whole JtR right for Xeon Phi and running it there.  No OpenMP that way for a subtle reason that I can explain separately, but I did run with --fork=228, for a cumulative speed at bcrypt of around 6000 c/s.

A good speed would be at least 10x that, no? What could be the reason for such a low figure? What speed do you get from one core?

I'm hoping I will find time to start experimenting on 'well' within a few weeks.

magnum
