john-dev - MSCash2 OpenCL (was: OpenCL tests on HD 7970)

Message-ID: <20120411225741.GA19179@openwall.com>
Date: Thu, 12 Apr 2012 02:57:41 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: MSCash2 OpenCL (was: OpenCL tests on HD 7970)

Hi Sayantan,

On Thu, Apr 12, 2012 at 12:11:22AM +0530, SAYANTAN DATTA wrote:
> I will post a new patch soon which will increase the speeds further by
> around 13%. Currently I'm trying to squeeze in as many optimizations as
> possible, and it will be ready in a day or two. Here's a sample benchmark
> for my new code:
> 
> OpenCL platform 0: AMD Accelerated Parallel Processing, 2 device(s).
> Using device 0: ATI RV770
> Benchmarking: MSCASH2-OPENCL [PBKDF2_HMAC_SHA1]... DONE
> Raw:    19277 c/s real, 19306 c/s virtual

BTW, your previous revision of the code (in magnum-jumbo as of
yesterday, perhaps the same as today's), which gives 75k c/s on my 7970,
brings my card to 82 degrees Celsius (measured after 15 minutes of
running incremental mode) and 85% GPU load:

root@...l:~# XAUTHORITY=~user/.Xauthority DISPLAY=:0 aticonfig --odgt

Default Adapter - AMD Radeon HD 7900 Series
                  Sensor 0: Temperature - 82.00 C
root@...l:~# XAUTHORITY=~user/.Xauthority DISPLAY=:0 aticonfig --odgc

Default Adapter - AMD Radeon HD 7900 Series
                            Core (MHz)    Memory (MHz)
           Current Clocks :    925           1375
             Current Peak :    925           1375
  Configurable Peak Range : [300-1125]     [150-1575]
                 GPU load :    85%

I think this means that we're already quite close to optimal
performance, and further optimizations may come from two areas: reducing
the number of operations performed per password hashed (e.g., by making
use of bitselect() and rotate(), not re-computing common subexpressions,
etc.) and avoiding various stalls (with the GPU load already at 85%, the
possible benefit from this is apparently limited to about 15% now).
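
To illustrate the first area, here is a minimal sketch in plain C (so it
can be checked on a CPU) of how SHA-1's round functions and rotations map
onto bitselect() and rotate(). This is not taken from the MSCash2 kernel;
bitselect32() and rotate32() are hypothetical stand-ins that mimic the
OpenCL built-ins, and the ch_*/maj_* helper names are made up for this
example. Whether the compiler actually turns the built-ins into single
instructions on a given GPU is an assumption to verify by benchmarking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for OpenCL bitselect(a, b, c):
 * each result bit comes from b where c has a 1 bit, from a otherwise. */
static uint32_t bitselect32(uint32_t a, uint32_t b, uint32_t c)
{
	return (a & ~c) | (b & c);
}

/* Hypothetical stand-in for OpenCL rotate(x, n) on 32-bit values
 * (rotate left; the shift count is taken modulo 32). */
static uint32_t rotate32(uint32_t x, unsigned n)
{
	n &= 31;
	return (x << n) | (x >> ((32 - n) & 31));
}

/* SHA-1 "choose" (rounds 0-19), textbook form: 3 logical ops. */
static uint32_t ch_plain(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (~x & z);
}

/* Same function as one bitselect: pick y where x is 1, z where x is 0. */
static uint32_t ch_bitselect(uint32_t x, uint32_t y, uint32_t z)
{
	return bitselect32(z, y, x);
}

/* SHA-1 "majority" (rounds 40-59), textbook form: 5 logical ops. */
static uint32_t maj_plain(uint32_t x, uint32_t y, uint32_t z)
{
	return (x & y) | (x & z) | (y & z);
}

/* Same function as one XOR plus one bitselect: where x and z agree,
 * they are the majority; where they differ, y decides. */
static uint32_t maj_bitselect(uint32_t x, uint32_t y, uint32_t z)
{
	return bitselect32(x, y, z ^ x);
}

/* Compare both forms over a sequence of scrambled inputs.
 * Returns 1 if every comparison held (asserts otherwise). */
static int check_equivalence(void)
{
	uint32_t x = 0x67452301u, y = 0xefcdab89u, z = 0x98badcfeu;
	unsigned i;

	for (i = 0; i < 1000; i++) {
		assert(ch_plain(x, y, z) == ch_bitselect(x, y, z));
		assert(maj_plain(x, y, z) == maj_bitselect(x, y, z));
		/* rotate left by 5 equals the shift-and-OR form */
		assert(rotate32(x, 5) == ((x << 5) | (x >> 27)));
		/* cheap scramble to get different bit patterns */
		x = x * 1664525u + 1013904223u;
		y ^= rotate32(x, 7);
		z += y;
	}
	return 1;
}
```

In an OpenCL kernel the two stand-ins would simply be replaced by the
built-in bitselect() and rotate(), leaving the ch_bitselect() and
maj_bitselect() bodies otherwise unchanged.
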

For comparison, Lukas' phpass OpenCL code (which achieves around 1010K c/s
on this card) brings the reported GPU load to 93% (so it is closer to
optimal in that respect).

Thanks,

Alexander
