Date: Wed, 8 Aug 2012 10:03:56 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: Mask mode (was Password Generation on GPU)

myrice -

I thought you'd try multiple sequential bitmaps like what Bit Weasil
does in Multiforcer first. But I don't mind you working on both tasks
in parallel.

On Wed, Aug 08, 2012 at 03:10:47AM +0800, myrice wrote:
> I am changing hard coded password generation to mask mode. If I got it
> correct, hashcat use mask mode generated password based on mask only.

I've never used it, but I think that historically mask mode was a
separate cracking mode invoked on its own, but later hashcat gained the
ability to chain cracking modes together (or does this only work for
multiple wordlist rulesets? I don't know). If we have a hashcat user in
here, perhaps he can enlighten us.

Also, I think the mask syntax originated in PasswordsPro, which
pre-dates hashcat. But I could be wrong. Anyhow, both *hashcat* and
PasswordsPro are worth looking at if we want to avoid unnecessary syntax
incompatibilities.

> From previous discuss, JtR could apply mask on exist keys that passed
> from set_keys() interface.

Please don't confuse JtR as a whole and the formats interface. Yes, I
suggested that set_mask(), which is to be added to the formats
interface, will apply the mask to keys previously set with set_key().
Both are part of the formats interface. This does not imply that JtR as
a whole would let the user combine masks with other cracking modes. It
may, or it may not, independently of how it's implemented in the formats
interface and whether a given format even provides set_mask().

The standalone mask mode (to be invoked on its own, not in combination
with any other mode) would have code to generate candidate passwords on
its own, without reliance on a format's set_mask(). It would also make
use of set_mask() when available, for some of the character positions
(like 2).
It shouldn't do that for too many because then we'd be spending too
much time per crypt_all() call, which would make the program
non-interactive, prevent frequent enough updates of the .rec file, and
cause "ASIC hangs" on GPUs.

I suggest that initially we only implement this standalone mask mode,
not supporting combinations with other cracking modes. (In a sense this
will be inferior to your current hack. That's a pity.)

Allowing for the use of masks along with other cracking modes is an
enhancement to add later. Again, this should not depend on set_mask()
being available (it should also work for formats that don't provide it),
but set_mask() should be made use of (for some character positions, not
necessarily for the entire mask) when available.

I think that formats should provide the number of character positions
for which they can apply masks on their own - maybe a min-max range
(e.g., we may have params.min_mask_positions and
params.max_mask_positions). Of course, it will always be allowed to
avoid set_mask() altogether - for other cracking modes - so the min will
only apply in case set_mask() is actually used.

To provide an example, if you determine that iterating over two
characters is optimal in a given format, it can report "2" for both min
and max, or if you want to provide better performance with small
charsets (such as digits only), you may report min=2, max=3 (and indeed
support both 2 and 3 in your code then). Mask mode's code in JtR itself
must adapt to that.

> I provided void set_mask(int count, int *positions, char* masks). For
> example, we could use set_mask(2, [2,4], ['d', 'l']). It will replace
> position 2 with digits and 4 with lower case letters. 'd' indicates
> digits and 'l' indicates letters which borrow from hashcat and
> correspond to wordlist rules in JtR.

I think these shortcuts for character sets should be at high level only,
not in the formats interface.
For example, if a user specifies ?l?l?l?l?d?l on the command-line, this
string is passed on to JtR's mask mode implementation (on CPU), which
turns it into strings "abc[...]xyz" and "0123456789" (or maybe with
characters sorted for decreasing frequency) and then e.g. passes the
strings (character lists) for positions 4 and 5 into set_mask(), and
iterates over them on its own for positions 0 - 3.

On the other hand, if significant speedup is expected for hard-coded
character lists (e.g., if you'd increment the ASCII code rather than
read the next character from an array), then we may consider an
interface that would accommodate that as well. But we do need the
flexible interface supporting arbitrary character lists anyway, because
the user might as well specify arbitrary characters rather than use one
of the shortcuts.

BTW, this is a reason why the mask mode implementation might choose to
use set_mask() on other than the last few character positions: there
might be too few different characters in those (but more in other
positions).

> But this may be overlap. If we have a wordlist file contains
> "password[1-9]" and we set mask to set_mask(1, [9], ['d']). In
> crypt_all, we will have "password[1-9]" multiple times. Or we just
> append to the keys not replace the exist character in the wordlist?

As I explained above, with the initial implementation of mask mode this
issue won't arise. When we later allow for combining of masks with
other cracking modes, I think we should be appending the masks. In
terms of the set_mask() interface, I think appending may be requested
e.g. by specifying the positions as -1.

> Also, what if some words don't have such positions. Using set_mask(2,
> [2,4], ['d', 'l']) as example, if a word is 'am', it do not have
> position 4 and am_'l'( _ is null) do not seems like a meaningful word.
> I prefer to discard these invalid positions.

The high-level code should ensure that this issue does not occur inside
a format's set_mask().
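
To make the host/format split discussed above a bit more concrete, here
is a rough C sketch. It is not actual JtR code. parse_mask(),
choose_device_positions(), host_prefix(), MAX_POSITIONS and the "budget"
parameter are hypothetical names made up for illustration; min_pos and
max_pos stand in for the proposed params.min_mask_positions and
params.max_mask_positions. It assumes that only the ?l and ?d shortcuts
are understood, and it only ever hands trailing positions to set_mask(),
which is a simplification (other positions may be the better choice when
the last few have too few different characters).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_POSITIONS 16 /* arbitrary limit for this sketch */

/* Expand one shortcut letter into an explicit character list.
 * Only 'l' and 'd' are handled here; anything else is rejected. */
static const char *expand_shortcut(char c)
{
    switch (c) {
    case 'l': return "abcdefghijklmnopqrstuvwxyz";
    case 'd': return "0123456789";
    default:  return NULL;
    }
}

/* Turn a mask such as "?l?l?l?l?d?l" into one character list per
 * position. Returns the number of positions, or -1 if the mask is
 * malformed or too long. Shortcuts stay at this high level only. */
int parse_mask(const char *mask, const char *lists[MAX_POSITIONS])
{
    int n = 0;
    while (*mask) {
        if (mask[0] != '?' || n == MAX_POSITIONS)
            return -1;
        lists[n] = expand_shortcut(mask[1]);
        if (!lists[n])
            return -1;
        n++;
        mask += 2;
    }
    return n;
}

/* Decide how many trailing positions to delegate to the format.
 * Stops growing once the per-call candidate count would exceed budget
 * or max_pos is reached. Returns 0 (do not use set_mask() at all) when
 * fewer than min_pos positions fit. */
int choose_device_positions(const char *lists[], int n,
                            int min_pos, int max_pos, unsigned long budget)
{
    unsigned long total = 1;
    int k = 0;
    while (k < n && k < max_pos) {
        unsigned long size = strlen(lists[n - 1 - k]);
        if (total > budget / size) /* total * size would exceed budget */
            break;
        total *= size;
        k++;
    }
    return k >= min_pos ? k : 0;
}

/* Write the idx-th combination for host-side positions 0..host_n-1
 * into key, with the last host position varying fastest. */
void host_prefix(const char *lists[], int host_n, unsigned long idx, char *key)
{
    assert(host_n >= 0 && host_n <= MAX_POSITIONS);
    for (int i = host_n - 1; i >= 0; i--) {
        size_t size = strlen(lists[i]);
        key[i] = lists[i][idx % size];
        idx /= size;
    }
    key[host_n] = '\0';
}
```

With the mask ?l?l?l?l?d?l, a reported range of 2 to 3 and a budget of
1000 candidates per call, this picks the last two positions (26 * 10 =
260 candidates) for set_mask() and leaves positions 0 - 3 to the host,
which would fill them in with host_prefix() before calling set_key().
The budget is what keeps a single crypt_all() call short.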

> Another way of using mask is use mask alone (or it is real mask
> mode?).

Exactly.

> We do not use set_keys() and only use set_masks().

No, we'll use both.

> For
> example, we could use set_mask(4, [1,2,3,4], ['d', 'd', 'd', 'd']) to
> produce 0000-9999 on GPU. But if the mask is too large, such as
> aaaaaaaa-zzzzzzzz, I am afraid of long GPU run which cause ASIC hang.

This is one of the reasons to use both functions.

> I am test mask mode only in raw-md5-opencl and have not produce
> set_mask() interface into john. I manually set_mask in reset(). I want
> to make these clear before I add interface into it: overlap, only
> append to word and invalid positions.

Initially, set_mask() needs to support overstrike only, and it should
accept strings with character lists for each position.

On the other hand, if it is easier for you to implement appending now
(such as because you don't want to implement the standalone mask mode on
your own yet), please do that.

Thanks,

Alexander