Date: Sun, 1 Apr 2012 23:56:05 +0300
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: fast hashes on GPU

> > Is this for AMD VLIW?

Yes, my Barts kernel (I have a machine with 2x5870s so that I can test the Cypress one - but I don't think it would make much of a difference).

> Do you limit this to uint2 because you can't afford 100+ GPRs?

Yes. It does not get to more than 100 though, more like 80-90.

> Do you have similar stats for Nvidia?

Not at the moment. I have an Nvidia card, but not a free PCIe slot right now...

> If things are so bad in terms of register pressure anyway, maybe
> bitslicing would be of help - at least we'll avoid the 64-bit rotates.

Yes, but unfortunately my design would not allow for processing 64 or 128 hashes per workitem, given the way I currently generate plaintexts. And I guess this will not change. But I am interested in the results if you do that in jtr.

In fact, I was considering doing bitslice DES on GPUs before and I did some experiments. GPR usage is a disaster, but part of the data could be shifted to local memory and I still believe it's quite possible. Then again, unfortunately, it is completely incompatible with my model. If I were to implement it, the changes would be so huge that it would kind of become a "самоцел" [an end in itself] (I believe you have the same word in Russian, I can think of no good equivalent in English though :) )

Anyway, I am not quite happy with the generated ISA code. Perhaps the situation would improve somewhat if I did not use 64-bit longs and handled the emulation of 64-bit operations on 32-bit uints myself. I wish I had more time for that :(
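To illustrate what "64-bit operations emulation on 32-bit uints" could look like, here is a minimal sketch of a 64-bit left rotate built only from 32-bit shifts and ORs. It is plain C rather than OpenCL, and the names (`u64pair`, `rotl64_emulated`, `split64`, `join64`) are hypothetical and not taken from any existing kernel. Whether this produces better ISA than the compiler's own handling of 64-bit longs is untested here.

```c
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves (hypothetical helper type). */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} u64pair;

/* Split a native 64-bit value into its two 32-bit halves. */
static u64pair split64(uint64_t v)
{
    u64pair p;
    p.hi = (uint32_t)(v >> 32);
    p.lo = (uint32_t)v;
    return p;
}

/* Join two 32-bit halves back into a native 64-bit value. */
static uint64_t join64(u64pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Rotate left by n bits (n is reduced modulo 64), using 32-bit operations only. */
static u64pair rotl64_emulated(u64pair x, unsigned n)
{
    u64pair r;

    n &= 63;
    if (n >= 32) {
        /* A rotate by 32 is just a swap of the two halves. */
        uint32_t t = x.hi;
        x.hi = x.lo;
        x.lo = t;
        n -= 32;
    }
    if (n == 0)
        return x; /* avoid a shift by 32, which is undefined in C */

    /* Each half takes its own bits shifted up plus the other half's top bits. */
    r.hi = (x.hi << n) | (x.lo >> (32 - n));
    r.lo = (x.lo << n) | (x.hi >> (32 - n));
    return r;
}

/* Native 64-bit rotate, used only as a reference for checking the emulation. */
static uint64_t rotl64_native(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}
```

When the rotate count is a compile-time constant, as it typically is inside a hash round, the branches disappear and each rotate reduces to two shifts and one OR per half, or to a plain swap when the count is exactly 32.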