Date: Sun, 26 Jul 2009 15:35:52 -0500
From: "JimF" <jfoug@....net>
To: <john-users@...ts.openwall.com>
Subject: Re: Work (optimization) in progress, and some ideas

> That's true.  Speaking of your specific intended use described above,
> I think a "next generation" version of JtR should introduce some
> multi-word or passphrase cracking mode.  This is a topic we could
> discuss separately and at a later time.

This is the 'exact' cracking mode I was looking for.  Most were
multi-word, but there was also common prefix/suffix work.  John already
has some 'common' append rules, like English plurals, ing, ly, and
others.  Those are great rules, and much of what was done here.  But
finding many arbitrary prefixes and suffixes for a language, then using
those as the rules and running against a larger dictionary (or even
running BACK against just the language dictionary), cracked a lot.

Simply put, all I did was try to figure out a way to do something like
this with the existing john.  The 'solution' I have come to is
certainly far from optimal; it really beats on the rule processor (and
the preprocessor, in the way I have implemented it), but it certainly
DID work, and it shows that this POC method does crack a lot of words.
Making john better support this type of searching at the 'core' level
would do nothing but improve john.

Note, I will put a 'howto' together of what I have done to get this to
work.  It was a few specialized pieces of code, and some 'standard' nix
filters (grep, sort, cut, etc).  Even if john gets enhanced to handle
prepend/append/insert and multi-word packing in a more 'native' manner
than I did, this type of word-building implementation may help in
generating more and better ideas, and may lead to a better enhancement
to john.

>> Do they have to be 1 letter?
>
> For optimal performance, yes, as long as we don't introduce a compiler
> into some internal representation (which is something to consider when
> we're ready for a major rework).
> Also, single-letter commands allow for using the preprocessor on them:

Ok, noted.

>> It could be something like:
>> $.!string! and ^.!string! (again ! being any char not used in
>> string)
>>
>> this would require $\. or ^\. (or $[.] ^[.] for the pre-processor) to
>> append just a period.
>
> I thought of this before, and decided against this approach for the
> following reasons:
>
> 1. This would break some existing valid rules (those appending/prepending
> the magic character or the escape character).

?? Not sure of this.  The rule change I list is $charvalue or
$.strvalue.  Thus, the 'only' value you could not do as before is $.
itself.  To do that, you would have to escape it: $\. would append a
period, and $\\ would append the escape char.  Am I missing something
here?  I think not, but sometimes I miss things that I think I
understand, but truly do not know the inner behavior.  It is possible
that the code handling the append would need to be patched to handle
$\. properly, but if it did, then a period right after the $ would
signal the rules processor to expect a string (the format being
$.!string!, with ! being the same char on both ends and unused within
string).

> 2. The checks for the magic character and the escape character would have
> some performance cost.

Fully agreed.  But is there not a check now for $\\ ?  If not, then at
this time you could not append the \ char anyway.

Speaking of checking the char after, what different logic do these 2
rules get?

$a
$[a]

It appears to me that '[' is looked at.  Or is this 'rule' removed and
handled by the preprocessor?  This question is due to my lack of
'overall' knowledge of the code.  I understand some of the 'micro' flow
of processing, but still lack some of the 'overall' processing and code
flow at the macro level.

> Also, your preprocessor example is not correct.  The preprocessor is
> ... Big block of explanation

I will have to read this carefully, and get the debugger out and do
some stepping.
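To make the proposal above concrete, here is a small sketch of how the
hypothetical $.!string! form could be parsed, alongside the existing
single-char $c append and an escaped $\.  This is only an illustration
of the idea under discussion, not existing JtR code, and the function
name is made up:

```python
def parse_append(rule):
    """Return the string a (hypothetical) append command would add.

    $c      -> append the single char c
    $\c     -> append c literally (escape, so $\. appends a period)
    $.!s!   -> append the string s; ! is any delimiter char not in s
    """
    assert rule[0] == '$'
    if rule[1] == '\\':              # escaped: next char is literal
        return rule[2]
    if rule[1] == '.':               # magic char: delimited string follows
        delim = rule[2]
        end = rule.index(delim, 3)   # find the closing delimiter
        return rule[3:end]
    return rule[1]                   # normal single-char append

print(parse_append('$a'))            # a
print(parse_append('$.!ing!'))       # ing
print(parse_append(r'$\.'))          # .
```

This is exactly the trade-off being debated: the extra checks for '.'
and '\\' after the $ cost a branch or two on every append.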
Thank you VERY much for this information.  It will help me get a better
grasp of wtf is going on within this code, and allow me to make better,
more knowledgeable choices when asking questions or coding with the
rules / preprocessor.

>> Speaking of rules, I have found the rejection rules to be a little on
>
> I think this problem can be generalized as follows:
>
> Right now, JtR tries the entire wordlist against each rule before it
> advances to the next rule.  If the next rule has the same initial few
> commands (not only rejections, but any commands), those will be applied
> to all entries in the wordlist for a second time.
>
> To deal with this, as well as to provide a way to temporarily swap the
> loops for some other reason (and I can think of at least one other valid
> reason), I've been thinking of introducing a way to declare "rule
> blocks" - some sort of BEGIN and END directives within a ruleset.
> JtR would then swap the loops while working on such a block.

This sounds exactly like what would do the 'trick'.  Changing from a
depth-first search to a breadth-first search, so that each word gets
the common rules done once, and then gets the 'specialized' rule work
done if and only if the word is still there, would obtain 99.9..% of
the speed gain (probably more, since there would be no intermediate
grepping of data files).

This 'BEGINBLOCK=common_code' rule1 rule2 rule3 'ENDBLOCK' method
certainly far improves on what I was doing.  I really see that the
'common_code' in this instance works great as 'pre' work prior to doing
rule1, rule2, rule3 on the word, but could there be instances where
common 'post' processing would be needed?  I can't think of any at this
time, because I have a 'pre' processing view of this new feature, but
others might see useful post rules.
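The loop swap being described can be sketched with trivial stand-in
"commands" (plain string operations, since the real rule engine is far
more involved).  The common part runs once per word; if it rejects, all
the specialized tails are skipped at once.  Everything here is
hypothetical, a sketch of the idea rather than JtR's implementation:

```python
def run_block(words, common, tails):
    """Breadth-first over the rules within a block: apply the shared
    'common' commands once per word, then run each specialized tail
    on the surviving intermediate result."""
    out = []
    for word in words:
        base = common(word)
        if base is None:          # rejected by the common commands:
            continue              # every tail is skipped at once
        for tail in tails:
            out.append(tail(base))
    return out

# Stand-ins: the common part rejects words longer than 8 chars and
# lowercases the rest; the tails append sample suffixes.
common = lambda w: w.lower() if len(w) <= 8 else None
tails = [lambda w: w + 'ing', lambda w: w + 's', lambda w: w + 'ly']

print(run_block(['Pass', 'verylongpassword'], common, tails))
```

In the depth-first arrangement, the length check and lowercasing would
run three times per word, once for each tail.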
> When going through the rules for the first time, JtR would need to
> detect common initial substrings between adjacent rules (or better but
> trickier - detect common initial sets of commands) and record the
> character position (or command number) of the first difference into the
> rule struct (to be introduced, and we'd have to keep the preprocessor
> output in memory).  Then, when actually working at full speed, the rules
> engine would use this "first difference position" to determine when to
> cache the intermediate result, when to recall it, and how many initial
> characters or rule commands to skip.

Automatic detection is one way, but could this also be left up to the
user to provide?  If this sort of behavior is wanted, the user would be
pretty 'sure' what he/she wanted.  Thus, instead of having

!?d-8-c<8/?Lrule1
!?d-8-c<8/?Lrule2
!?d-8-c<8/?Lrule3
!?d-8-c<8/?Lrule4
...

where john has to pre-scan, find, and 'remove' all of the !?d-8-c<8/?L
values, simply allow the user to pre-inform john that this situation is
happening, like

STARTBLOCK=!?d-8-c<8/?L
rule1
rule2
rule3
....
ENDBLOCK

When john is pre-scanning rules, it would 'spot' these blocks, and of
course not load them as-is, but track where they start, where they end,
and what preliminary work is done.  That way, once that rule number is
hit, john could start working breadth-first over the rules, applying
the common stuff, then running the result (if any) over each of the
rules within the group.

> If a rejection occurs at or before the caching position, that fact
> should be cached instead of the intermediate result (string).  Then the
> following rules with the same or larger caching position should be
> skipped.  The pointer to the next rule to process in such a case could
> have been cached too.  To make this even more optimal, two caching
> positions could be kept track of: one as described above and the other
> at the end of the last rejection command within the common initial set
> of commands.
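The automatic "first difference position" detection quoted above can be
sketched as follows: for each rule, record how many leading characters
it shares with the previous rule; that is how much cached intermediate
work could be reused.  This simplification assumes one character per
command, which the real engine would not (multi-char commands make the
command-level variant trickier, as noted):

```python
def common_prefix_len(a, b):
    """Length of the longest common leading substring of a and b."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def first_diff_positions(rules):
    """For each rule, the character position of the first difference
    from the preceding rule (0 for the first rule: nothing cached)."""
    return [0] + [common_prefix_len(rules[i - 1], rules[i])
                  for i in range(1, len(rules))]

rules = ['!?d-8-c<8/?Lrule1', '!?d-8-c<8/?Lrule2', '!?d-8-c<8/?Lrule3']
print(first_diff_positions(rules))   # [0, 16, 16]
```

At full speed, a position of 16 means the engine could skip the first
16 characters of the rule and start from the cached intermediate
string; and if the cached fact is "rejected at or before position 16",
the whole rule is skipped.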
I will have to re-read that a few times before it 'sinks in' to my
brain, which can be slow to grasp things at times.

Jim.

--
To unsubscribe, e-mail john-users-unsubscribe@...ts.openwall.com and
reply to the automated confirmation request that will be sent to you.