Date: Thu, 9 Feb 2012 07:54:50 +0100
From: Lukas Odzioba <>
Subject: Re: cryptmd5 optimizations

2012/2/8 Simon Marechal <>:
> I suggest looking for the md5cryptsse function in sse-intrinsics.c. It
> will probably look a lot more GPU friendly to start with. It starts by
> preparing buffers for the 8 cases, computes the base fingerprints with
> the "slow" md5 function, then runs into "dispatch", where you should be
> able to see the logic.
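
For reference, the 8 cases presumably come from the three per-iteration
conditions in the md5crypt main loop (i odd, i % 3 != 0, i % 7 != 0), and
the 42 in MD5_std.c from their period, lcm(2, 3, 7) = 42. A minimal sketch
in C; md5crypt_case() and count_distinct_cases() are hypothetical names, not
functions from sse-intrinsics.c, and the bit assignment is my own choice:

```c
#include <assert.h>

/* Hypothetical helper: maps iteration i of the md5crypt loop to one of
 * 8 buffer layouts.
 *   bit 0: i is odd (password goes first, previous digest goes last)
 *   bit 1: salt is appended     (i % 3 != 0)
 *   bit 2: password is appended (i % 7 != 0)
 * The bit assignment is an assumption made for illustration only. */
int md5crypt_case(int i)
{
    assert(i >= 0);
    return (i & 1) | ((i % 3 != 0) << 1) | ((i % 7 != 0) << 2);
}

/* Counts how many distinct layouts occur in iterations 0 .. n-1. */
int count_distinct_cases(int n)
{
    int seen[8] = {0};
    int distinct = 0;
    for (int i = 0; i < n; i++) {
        int c = md5crypt_case(i);
        if (!seen[c]) {
            seen[c] = 1;
            distinct++;
        }
    }
    return distinct;
}
```

count_distinct_cases(42) and count_distinct_cases(1000) both give 8, so 8
prepared buffers cover the same layouts as a 42-entry schedule.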

Thank you, Simon. I dug through the code and made changes, but
unfortunately 8*64 bytes for each thread is still too much memory (the 42
from MD5_std.c was overkill).
OpenCL compiler says:
"Warning: cryptmd5 kernel has register spilling. Lower performance is expected."
Overall performance dropped from 143k c/s to 94k c/s.
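
To put numbers on this, here is a rough sketch in C of the per-thread
footprint and the slowdown. It assumes each of the 8 buffers is one 64-byte
MD5 block kept in private memory; the helper names are made up for
illustration.

```c
#include <assert.h>

/* Bytes of private memory per work-item for `buffers` blocks of
 * `block_bytes` each (assumption: nothing is shared between threads). */
int private_bytes(int buffers, int block_bytes)
{
    assert(buffers >= 0 && block_bytes >= 0);
    return buffers * block_bytes;
}

/* Relative slowdown in percent when throughput falls from
 * `before` to `after` (both in c/s). */
double slowdown_percent(double before, double after)
{
    assert(before > 0.0);
    return 100.0 * (before - after) / before;
}
```

private_bytes(8, 64) is 512 bytes, i.e. 128 32-bit words per work-item, and
slowdown_percent(143000, 94000) comes out at about 34%.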

The good news is that I've got the phpass-opencl code doing 960k c/s on
an overclocked 5850. If only I knew how to remove the memory bottleneck,
cryptmd5 could be even faster.

By the way:
The tctx value in md5cryptsse() is redundant. I am not sure what the
compiler will do with it, but it can be removed just by reordering the code.

