Date: Fri, 31 Oct 2014 15:44:42 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: descrypt speed

On Fri, Oct 31, 2014 at 05:26:14PM +0530, Sayantan Datta wrote:
> On Fri, Oct 31, 2014 at 11:03 AM, magnum <john.magnum@...hmail.com> wrote:
> 
> > Your numbers are not particularly bad afaik. I'm concerned about "fork"
> > being beneficial at all, for anyone. For a semi-slow format like this I
> > think a 50% speedup from using fork means we should be able to improve
> > something.
> 
> The bottlenecks with descrypt-opencl are twofold. The first is candidate
> generation. The second is hiding the latency of transferring candidates to
> the GPU and back. The second bottleneck can be scrutinized by looking at
> the GPU usage percentage.

Besides latency, PCIe bandwidth may also be a bottleneck in slots with a
low lane count or an older PCIe revision.  My guess is that magnum's GTX
980 might be in a faster PCIe slot than Royce's GTX 970.

Sayantan, how much data does descrypt-opencl transfer over PCIe per
candidate, in each direction?  Actually, only the higher of the two
numbers should matter in terms of a possible bandwidth bottleneck, since
PCIe is full-duplex.

PCIe 2.x is 500 MB/s per lane in each direction, 3.0 is about 985 MB/s
per lane.  If we're transferring 8 bytes per candidate password (for max
length, with no separator char since none is necessary at this length),
then 100M c/s gives us 800 MB/s - this should fit in 1 lane with PCIe
3.0, but not with PCIe 2.x.  I doubt Royce uses a 1-lane slot, though -
it must be 4, 8, or 16 lanes if it's a typical motherboard with no PCIe
extender in use.  But are we possibly transferring much more data?  Can
we reduce it to 8 bytes/password (max for either direction)?
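
To make the arithmetic above easy to redo with other numbers, here is a
rough sketch in Python.  The function names are made up for illustration
and are not part of JtR.  The per-lane figures are the nominal ones
quoted above, not measurements; real transfers will see somewhat less.
Since PCIe is full-duplex, only the larger of the two per-candidate
sizes is used.

```python
import math

# Nominal PCIe bandwidth per lane, per direction, in MB/s (1 MB = 10**6 bytes).
# These are the figures from the message above, not measured throughput.
PCIE_MB_PER_S_PER_LANE = {"2.x": 500, "3.0": 985}


def required_mb_per_s(candidates_per_s, bytes_to_gpu, bytes_from_gpu):
    """Bandwidth needed in the busier direction, in MB/s.

    PCIe is full-duplex, so the two directions do not compete;
    only the larger per-candidate size matters.
    """
    worst = max(bytes_to_gpu, bytes_from_gpu)
    return candidates_per_s * worst / 1e6


def lanes_needed(candidates_per_s, bytes_to_gpu, bytes_from_gpu, revision):
    """Minimum lane count whose nominal bandwidth covers the requirement."""
    need = required_mb_per_s(candidates_per_s, bytes_to_gpu, bytes_from_gpu)
    return math.ceil(need / PCIE_MB_PER_S_PER_LANE[revision])


if __name__ == "__main__":
    # 100M c/s at 8 bytes per candidate in the busier direction (assumed).
    for rev in ("2.x", "3.0"):
        print(rev, lanes_needed(100_000_000, 8, 8, rev))
```

With 8 bytes per candidate at 100M c/s this gives 2 lanes for PCIe 2.x
and 1 lane for PCIe 3.0, matching the estimate above.  Plugging in the
actual per-candidate transfer sizes, once known, would show whether a 4-
or 8-lane slot could be a limit.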

Alexander
