Date: Thu, 9 Feb 2012 07:54:50 +0100
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: cryptmd5 optimizations

2012/2/8 Simon Marechal <simon@...quise.net>:
> I suggest looking for the md5cryptsse function in sse-intrinsics.c. It
> will probably look a lot more GPU friendly to start with. It starts by
> preparing buffers for the 8 cases, computes the base fingerprints with
> the "slow" md5 function, then runs into "dispatch", where you should be
> able to see the logic.

Thank you, Simon. I dug through the code and made changes, but
unfortunately 8*64 bytes for each thread is still too much memory (42
from MD5_std.c was overkill).
The OpenCL compiler says:
"Warning: cryptmd5 kernel has register spilling. Lower performance is expected."
And overall performance dropped from 143k c/s to 94k c/s.
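
To make the memory figures concrete, here is a rough sketch in C. It
assumes the usual md5crypt round structure, where what gets hashed in
round i depends on three tests (i odd, i % 3 != 0, i % 7 != 0), giving
2*2*2 = 8 buffer layouts. The 42 is presumably lcm(2, 3, 7), the period
after which the pattern repeats. The function names are hypothetical
and are not taken from sse-intrinsics.c or MD5_std.c.

```c
#include <stddef.h>

/* Hypothetical helpers for illustration only; not code from John the Ripper. */

/* Map md5crypt round i to one of 8 layouts, assuming the layout depends
 * only on: i odd, i % 3 != 0 (salt included), i % 7 != 0 (password included). */
static unsigned md5crypt_case(unsigned i)
{
    unsigned odd  = i & 1;
    unsigned salt = (i % 3) != 0;
    unsigned pass = (i % 7) != 0;
    return (odd << 2) | (salt << 1) | pass;
}

/* Count how many distinct layouts occur in rounds [0, rounds). */
static unsigned distinct_cases(unsigned rounds)
{
    unsigned seen = 0, count = 0, i;
    for (i = 0; i < rounds; i++) {
        unsigned bit = 1u << md5crypt_case(i);
        if (!(seen & bit)) {
            seen |= bit;
            count++;
        }
    }
    return count;
}

/* Private bytes per thread if one 64-byte MD5 block is kept per layout. */
static size_t buffer_bytes(unsigned layouts)
{
    return (size_t)layouts * 64;
}
```

Under these assumptions, buffer_bytes(8) is 512 bytes per thread and
buffer_bytes(42) is 2688, and distinct_cases(42) already covers all 8
layouts, so keeping 42 buffers buys nothing over keeping 8.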

The good news is that I've got phpass-opencl code making 960k c/s on
an overclocked 5850. If only I knew how to remove the memory bottleneck,
cryptmd5 could be even faster.

By the way:
The tctx value in md5cryptsse() is redundant. I am not sure what the
compiler will do with it, but it can be removed just by reordering the code.

Lukas
