Date: Wed, 31 May 2017 10:03:38 -0800
From: Royce Williams <royce@...ho.org>
To: john-users@...ts.openwall.com
Subject: Re: other algorithms on ZTEX 1.15y?

On Wed, May 31, 2017 at 9:33 AM, Solar Designer <solar@...nwall.com> wrote:
> On Wed, May 31, 2017 at 06:07:56AM -0800, Royce Williams wrote:
>> Beyond the algorithms either already supported in john or implemented
>> elsewhere (descrypt, bcrypt, DES), what other algorithms are feasible
>> or worthwhile on ZTEX?
>
> Are you aware of bcrypt already implemented on ZTEX elsewhere? Where
> exactly? Have you tested?

I'm only aware of the work that you and Katja presented a couple of
years ago.

> Regarding DES, are you referring to Gifts' implementation? Have you
> tried using it, or anything else?

I'm only aware of Gifts', as I believe that he participated in the
work as described in the Positive Technologies blog here:

http://blog.ptsecurity.com/2014/12/4g-security-hacking-usb-modem-and-sim.html

> Maybe we need to add a plain DES cracker mode to JtR, like I think
> hashcat has now (but not on FPGAs yet).

Yes, hashcat has DES (mode 14000) and 3DES (mode 14100) now.

> As to our developments so far, after the descrypt-ztex format Denis has
> also been working on bcrypt-ztex, citing speeds of ~105k c/s per board
> at bcrypt cost 5 - but this work is yet to be completed and merged.
> Actual speeds will vary by cracking mode since the current synchronous
> crypt_all() API combined with the not-so-fast USB interface results in
> significant idle time when the candidate passwords are fed from the
> host. On-FPGA mask mode mostly avoids that (and so will an API revision
> for asynchronous processing, but we haven't gotten around to that yet).

Interesting - good to know!

>> This project is working on WPA2 support, which seems interesting:
>>
>> https://github.com/JarrettR/FPGA-Cryptoparty
>>
>> From a brief review of the project's files, I infer that SHA1 and
>> PBKDF2 would be possible on ZTEX.
>> Would they be worth the effort?
>
> For PBKDF2 with MD*/SHA-1/SHA-2, it should be possible to obtain
> GPU-like speeds on ZTEX, roughly like these boards worked for Bitcoin
> mining (thus, one quad-FPGA board is roughly same as one high-end GPU
> from 2015 or so). The purpose would be to put these boards to more
> general use and to achieve better energy efficiency (compared to GPUs).
>
> For fast unsalted hashes, good speeds may only be achieved for up to a
> few thousand hashes loaded for cracking. This is a lot worse than with
> GPUs, which handle millions. So focusing on PBKDF2 makes more sense.

I don't disagree about the priority, though I should point out that
there are also use cases for which only one hash, or a few hashes, are
the target.

> We didn't come up with a good enough idea for a generic password hashing
> soft CPU yet. My current thinking is that, to avoid bumping into BRAM
> port count for the register file as we would with instructions doing
> little work each, maybe we should have different bitstreams for
> different crypto primitives like MD5, SHA-1, etc. (one at a time) and
> have those available through very high latency instructions in the soft
> CPU to allow for full pipelining - thus, 64 cycles latency for MD5, etc.
> We'd also have a handful of simpler instructions (same or similar in the
> different bitstreams) for implementing higher-level crypto schemes
> around the current bitstream's crypto primitive (this way, the same
> bitstream will be usable for multiple higher-level schemes sharing the
> same crypto primitive). These would include data copying and control
> transfer instructions. A tough question is how to combine the extreme
> high-latency crypto instructions with control flow transfer - do we have
> like 63 delay slots? SPARC has 1, some DSPs have a few, but I've never
> heard of an ISA having tens of delay slots. Yet maybe this is the way
> to go.
>
> Meanwhile, or alternatively, maybe we need PBKDF2-SHA* bitstreams.
> There are many JtR formats that use PBKDF2, so it would have been a
> primary candidate for implementation on the soft CPU anyway.
>
> For NTLM, we could use a soft CPU having an MD4 primitive, but then do
> we have anything else needing MD4? Perhaps just raw-MD4? That's very
> rare, and other MD4-based things are probably even more rare. So
> perhaps a separate bitstream for NTLM as well, or maybe one usable for
> NTLM and for raw-MD4 (different placement of characters into the current
> block in on-FPGA mask mode; the rest of the difference can probably be
> handled on host).
>
> LM will need to be its own bitstream, although it could be a revision of
> the descrypt design. Denis probably has specific thoughts on it.
>
> Technically, we could share a bitstream between descrypt and LM, as
> that's basically different IV (0 vs. non-0), iterations (25 vs. 1), and
> salt size (12 vs. 0 bits, but we can simply set the 12 bits to all 0's),
> but this would be suboptimal.
>
> Overall, most JtR formats (perhaps 90%+, with exception for scrypt and
> the like) could be reasonably implemented for ZTEX, but a speedup over
> GPU is expected for only a few (bcrypt, maybe Lotus/Domino), the
> required effort is substantial, and there's almost no demand.

Fair points, and informative exploration of the potential. Thanks!

Royce
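
P.S. For anyone following the PBKDF2 part of the thread, here is a rough sketch of the PBKDF2-HMAC-SHA1 loop with the WPA2 parameters (SSID as salt, 4096 iterations, 32-byte output) as the example. The function names are made up for illustration and are not taken from JtR or FPGA-Cryptoparty. The compression count at the end is an estimate that assumes the HMAC ipad/opad states are precomputed once per candidate and that the salt fits in a single SHA-1 block.

```python
import hashlib
import hmac
import struct

SHA1_LEN = 20  # SHA-1 digest size in bytes


def pbkdf2_hmac_sha1(password: bytes, salt: bytes, iterations: int, dklen: int) -> bytes:
    """Textbook PBKDF2 with HMAC-SHA1, written out to expose the inner loop."""
    out = b""
    block_index = 1
    while len(out) < dklen:
        # U_1 = HMAC(password, salt || INT(block_index))
        u = hmac.new(password, salt + struct.pack(">I", block_index), hashlib.sha1).digest()
        t = int.from_bytes(u, "big")
        # U_k = HMAC(password, U_{k-1}); the block is the XOR of all U_k
        for _ in range(iterations - 1):
            u = hmac.new(password, u, hashlib.sha1).digest()
            t ^= int.from_bytes(u, "big")
        out += t.to_bytes(SHA1_LEN, "big")
        block_index += 1
    return out[:dklen]


def wpa2_pmk(passphrase: str, ssid: str) -> bytes:
    """WPA2 pairwise master key: PBKDF2-HMAC-SHA1, 4096 iterations, 32 bytes."""
    return pbkdf2_hmac_sha1(passphrase.encode(), ssid.encode(), 4096, 32)


def sha1_compressions_per_candidate(iterations: int = 4096, dklen: int = 32) -> int:
    """Rough count of SHA-1 compression calls per candidate password.

    Assumption: ipad/opad states are cached, so each HMAC call costs two
    compressions (one inner, one outer); the one-time key setup is ignored.
    """
    blocks = -(-dklen // SHA1_LEN)  # ceiling division
    return blocks * iterations * 2


if __name__ == "__main__":
    print(wpa2_pmk("password", "IEEE").hex())
    print(sha1_compressions_per_candidate())
```

Under those assumptions the work per candidate is almost entirely back-to-back SHA-1 compressions with no data dependence on the host, which is presumably why a fully pipelined SHA-1 core suits this kind of workload.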