Date: Thu, 9 Feb 2012 07:54:50 +0100
From: Lukas Odzioba <lukas.odzioba@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: cryptmd5 optimizations

2012/2/8 Simon Marechal <simon@...quise.net>:
> I suggest looking for the md5cryptsse function in sse-intrinsics.c. It
> will probably look a lot more GPU friendly to start with. It starts by
> preparing buffers for the 8 cases, computes the base fingerprints with
> the "slow" md5 function, then runs into "dispatch", where you should be
> able to see the logic.

Thank you, Simon. I dug through the code and made changes, but
unfortunately 8*64 bytes for each thread is still too much memory
(42 from MD5_std.c was overkill). The OpenCL compiler says:
"Warning: cryptmd5 kernel has register spilling. Lower performance is expected."
Overall performance dropped from 143k c/s to 94k c/s.

The good news is that I've got phpass-opencl code making 960k c/s on
an overclocked 5850. If only I knew how to remove the memory bottleneck,
cryptmd5 could be even faster.

By the way: the tctx value in md5cryptsse() is redundant. I am not sure
what the compiler will do with it, but it can be removed just by
reordering the code.

Lukas
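
For readers following the thread, the "8 cases" and the 8*64 bytes figure can be illustrated with a small sketch. It is not the code from sse-intrinsics.c or from the OpenCL kernel; it assumes only the standard md5crypt main loop (1000 iterations, salt added when i % 3 != 0, a second password copy added when i % 7 != 0, order swapped on odd i). All helper names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum {
    MD5CRYPT_ROUNDS = 1000, /* iterations in the md5crypt main loop */
    MD5_BLOCK_BYTES = 64,   /* MD5 consumes input in 64-byte blocks */
    NUM_CASES       = 8     /* odd/even * salt yes/no * extra password yes/no */
};

/* Hypothetical helper: map iteration i to a case number 0..7.
 * bit 0: i is odd (password first, previous digest last)
 * bit 1: i % 3 != 0 (salt is included)
 * bit 2: i % 7 != 0 (a second copy of the password is included) */
static unsigned case_index(unsigned i)
{
    assert(i < MD5CRYPT_ROUNDS);
    return (i & 1u)
         | ((unsigned)(i % 3 != 0) << 1)
         | ((unsigned)(i % 7 != 0) << 2);
}

/* Bytes hashed in iteration i: the 16-byte previous digest and one
 * password copy are always present; the rest depends on i. */
static size_t message_len(unsigned i, size_t pw_len, size_t salt_len)
{
    size_t len = 16 + pw_len;
    if (i % 3) len += salt_len;
    if (i % 7) len += pw_len;
    return len;
}

/* A single MD5 block holds at most 55 message bytes, because padding
 * needs one 0x80 byte plus an 8-byte length field. */
static int fits_one_block(size_t pw_len, size_t salt_len)
{
    return 16 + 2 * pw_len + salt_len <= 55;
}

/* Scratch space if every work-item keeps one prepared block per case. */
static size_t scratch_bytes_per_work_item(void)
{
    return (size_t)NUM_CASES * MD5_BLOCK_BYTES;
}

/* Count how many of the 1000 iterations land in each case. */
static void count_cases(unsigned counts[NUM_CASES])
{
    for (unsigned c = 0; c < NUM_CASES; c++)
        counts[c] = 0;
    for (unsigned i = 0; i < MD5CRYPT_ROUNDS; i++)
        counts[case_index(i)]++;
}
```

Under these assumptions each work-item would carry 8 * 64 = 512 bytes of prepared buffers, which is consistent with the register spilling warning quoted above. The counts from count_cases() also show that the cases are far from equally frequent, which may matter when deciding which buffers are worth keeping resident.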