Date: Tue, 11 Dec 2012 00:43:41 -0500 From: "Matt Weir" <cweir@...edu> To: <john-users@...ts.openwall.com> Subject: Hashcat BF++ vs JtR Incremental and Markov Modes: (was How does incremental mode works?) So I've been playing around with Hashcat's Brutefoce++ mode and I figured I should share my findings with everyone else. Quick Disclaimer: All my observations about Hashcat's Bruteforce++ mode are based on using Hashcat's statsprocessor (version 0.08). According to the documentation, it uses the same algorithm as Hashcat's other cracking programs but I could be wrong. Speaking about documentation, there isn't a whole lot about what BF++ actually does under the hood. I also don't have access to the source-code nor have I reversed it. So please take everything I say with a grain of salt. On a related note, Hashcat's wiki states: "There is no known alternative that can handle markov like this, but the following programs can at least generate wordlists and are partially configurable: John the Ripper (using --stdout) http://www.openwall.com/john/" That seems a little bit unfair. It kind of shows the lack of communication between the different password cracking communities. This isn't just a Hashcat thing, as I'm willing to bet the JtR community's general knowledge of Hashcat is limited as well. Hashcat Bruteforce++, (plus some JtR), background: So just like JtR's Incremental and Markov modes, Hashcat's Bruteforce++ mode uses the conditional probability of different letters appearing together to determine how it generates its guesses, (aka Markov chain modeling). The biggest different between Hashcat's Bruteforce++ mode and JtR's modes is the algorithm it uses to go through the search space. As can be seen in the differences between JtR's Incremental and Markov modes, the actual algorithm used, (aka how it takes advantage of the conditional probability info), makes a huge difference in how it performs in real cracking attacks. JtR's Incremental mode attempts to generate guesses in probability order while going through the entire search space, (aka it will eventually try every combo possible no matter how unlikely it is). JtR's Markov mode on the other hand makes no attempt to generate guesses in true probability order but it will only generate guesses when their probability falls above a limit. Aka it won't try 'vsFD82,sq' but it will go through all the most likely guesses in the time you allocate to it. While this sounds like a negative, the reality is you'll likely never have enough time to bruteforce all 12 character passwords, so ignoring low probability guesses isn't as harmful as you'd expect. As far as how they compare, given the same number of guesses Markov mode will almost always crack more passwords than Incremental mode does. I'm still looking into why that's the case, but if you know exactly how long your cracking session will be Markov mode is the way to go. That being said, since Markov mode requires you to specify a limit, if you don't know how long your password cracking session will be (or you want to crack a password as quickly as possible and then just stop) Incremental mode is a better option. So that leaves us with Bruteforce++. Like JtR's Markov mode you can specify a limit so that it will only search a limited portion of the keyspace. If you don't specify any limit it will search the whole keyspace but it attempts to do so in an optimized fashion. The actual algorithm/looping it uses is a bit quirky but in a nutshell: 1) Bruteforce++ will exhaust the keyspace for each length guess before moving on to the next size guesses. Aka it will guess all 3 character passwords before trying 4 character passwords. This is a big difference from JtR's incremental mode where it will switch between longer and shorter passwords depending on their current probability. 2) The order that Bruteforce++ ranks characters to use is dependent on where that character appears. Aka uppercase characters will be more likely at the front of a guess and numbers more common at the end. This is like JtR's Incremental mode, and different from JtR's Markov mode. 3) The first character's order in a guess in Bruteforce++ is based on simple letter frequency, (since there is no character before it). What's strange though is that it also appears the second character's order is based on simple letter frequency as well. I've run a bunch of tests and all of them seem to show that. It isn't until the third character that it uses 1st order Markov chains, (aka it calculates the probability of a letter based on the letter before it). 4) This is the part that'll get me in trouble, but it "seems" that Bruteforce++ uses the ordering of characters appearing vs. the probability of them appearing. Aka instead of saying that given 'q', 'u' follows it 98% of the time, it'll say 'u' is the most likely character to follow 'q'. While this sounds like an esoteric point, it's actually a pretty big distinction. As an example of that, the probability info might tell me to only try 'u' after 'q', while if I'm just using ordering info I have to blindly guess the average case for how many transitions I want to allow. Aka there might be a lot of characters that are beneficial to try after 'a', but I have to treat 'a' and 'q' the same way. Both JtR's Incremental and Markov modes use probability info. 5) Bruteforce++ allows the user to set a threshold for how many potential transitions they want to allow so unlikely guesses are skipped. This is slightly different than the limit JtR's Markov model uses which calculates the total probability of a guess instead. For example, if you set the threshold in Bruteforce++ to be 10 then only the ten most likely starting characters will be used. This applies for every following transition as well. The plus side is it becomes very easy to calculate the total keyspace BF++ will search. Aka if you have a threshold of 10, and are only cracking 8 character passwords, the total keyspace searched would be 10^8, or 100,000,000. 6) I'm not going to go too much into it in this post, but Bruteforce++ also allows you to specify a "Mask" to use as well. Aka you can say "only try digits at the end" or "only try capital letters at the front". This is very useful when targeting password creation requirements or you have some knowledge about the target passwords that was not contained in the training set. Bruteforce++ vs. Incremental vs. Markov Results: So now that all the background is out of the way, let's see how they actually compare when cracking real passwords. The first step I took was to make sure all three methods were trained on the same training set. As I've done previously, I randomized and then divided the RockYou list into 32 different 1 million password sets. This way I can train and test against very similar passwords without training and testing against the exact same passwords. Please note that I kept duplicate passwords in for both training and testing. That's been a point of contention in the past, but I strongly believe that keeping duplicates in is important when figuring out how a password cracking session will work in real life. I then trained all three methods using the RockYou_32 list. For the first test I ran a short cracking session allowing each tool to make 269,538,503 guesses against the RockYou_1 list. Why did I choose such a random number? Well that's how many guesses JtR's Markov mode would make with a limit of 200. For the record, I did not limit the charset for any of the three modes, (aka no Masks), nor did I set a threshold for Bruteforce++ in this test. The results can be found at the link below: https://sites.google.com/site/reusablesec/Home/pictures/hc_bruteforce_graph1 .png As you can see, both of JtR's modes were clear winners here. Now while I didn't include it in the graph, I also ran a pure brutefoce attack, (which cracked 2,349 passwords), so a no-threshold Hashcat's BF++ mode did do substantially better than that (11,419 cracked). Still that wasn't anywhere near the 446,991 that JtR's Markov mode cracked. Upon reflection though, this makes sense since Bruteforce++ essentially got 'stuck' trying to perform a full bruteforce of five character passwords before it starting trying to crack 6 character passwords, (which it never got to). For the next test then I limited all three methods to only try 8 character passwords. I also upped the number of guesses to 425 million total, (Markov limit of 215), and tested against the RockYou_2 list. This way we can get a better idea of how Bruteforce++ goes about searching the keyspace. The results can be found at the link below: https://sites.google.com/site/reusablesec/Home/pictures/hc_bruteforce_graph2 .png Once again, both of JtR's bruteforce modes performed significantly better. Still, it wasn't a completely fair comparison since I haven't made use of the threshold option in Hashcat's BF++ mode to keep it from making unlikely guesses. So I then re-ran the previous test, (requiring 8 character long passwords), using a threshold of 12 which was optimal for making around 425 million guesses, (12^8 works out to around 430 million guesses). The results can be found at the link below: https://sites.google.com/site/reusablesec/Home/pictures/hc_bruteforce_graph3 .png While changing the threshold really helped the effectiveness of the Hashcat BF++ run, it still wasn't enough for it to come close to cracking as many passwords as either of JtR's modes. So in conclusion, both of JtR's Incremental and Markov modes appear to perform significantly better than Hashcat's Bruteforce++ mode even when the threshold is optimally set for Hashcat's Bruteforce++ mode. As a caveat to that, I haven't taken into account two potential variables that might make Brutefoce++ a better algorithm in certain situations. First of all, there's the question of how all three algorithms perform when used in GPU cracking. I unfortunately don't have any side-by-side comparison about how those algorithms perform when implemented in GPUs and I'll have to leave that testing to someone else. Second there's the Mask feature built into Hashcat's Bruteforce++ mode. This functionality can also be incorporated into JtR's modes using an '-external' filter but it's a bit of a pain. Still the advantage of using Masks in conjunction with Markov enhanced brutefoce attacks might be worth looking into some more. Bonus Test: I had a moment of doubt where I wondered if a threshold for Hashcat's Bruteforce++ mode of 12 was actually the best possible threshold when generating 425 million guesses, (long story but it has to do with the fact that it uses ordering vs probabilities). Luckily this was something easy to test. All I needed to do was re-run the previous cracking runs with different threshold settings! As a quick spoiler, yes a threshold of 12 was the best setting to use. Another useful result from this test is it'll hopefully give you an idea how much 'flexibility' you have when selecting a threshold that isn't optimal for your cracking session. The results can be found at the link below: https://sites.google.com/site/reusablesec/Home/pictures/hc_bruteforce_graph4 .png As always, if I've made any mistakes or incorrect assumptions please correct me. Matt
Powered by blists - more mailing lists
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.