Date: Mon, 29 Jun 2015 13:17:30 -0400
From: Matt Weir <cweir@...edu>
To: john-dev@...ts.openwall.com
Subject: Re: Using Probabilistic Context Free Grammars (Was precomputed attacks)

>> The main downside for me is that PCFG is not easily parallelizable so no
>> GPU for it.

Well I guess *technically* it is parallelizable, it's just that the memory requirements currently aren't compatible with implementing the full PCFG on the GPU, (aka you are not going to load up your dictionaries there). In a CPU setup you can do a whole lot of parallelization by splitting up the work of filling out the terminal structures and actually making the guesses. In that setup I'll admit the core PCFG "next" function that finds the next terminal structure to use would be a blocking item, but I suspect even that could be parallelized quite a bit if you are willing to make guesses that aren't exactly in probability order, (aka you are willing to try a slightly lower probability password guess before the highest possible one remaining).

At the end of the day it really is a wordlist attack, (it's just creating very fine-grained mangling rules), so worst case you can even pre-generate the mangling rules and save them to disk. So yeah, not easily parallelizable ;p If that was a priority then there at least is a way forward on CPU parallelization.

For GPU cracking it might be better to handle it like most implementations currently handle wordlist attacks: apply a set of mangling rules to a small subset of dictionary words. It's just that in this case those mangling rules would be selected by the PCFG vs hand crafted.

>> I also find your performance comparison unfair given that you don't take
>> into account implementation speed

When talking about performance I think it comes down to the law of diminishing returns. There are a couple of costs associated with a password cracking session:

1) The cost of making a plaintext guess.
   The overhead of incremental mode, PCFGs, PRINCE, etc. comes into play here.

2) The cost of checking a guess. This is the cost of hashing/comparing.

3) The number of guesses required to crack a password. Aka incremental will likely require fewer guesses than dumbforce mode.

A true performance comparison would take into account all three of those, and I tried to address the actual time involved in my dissertation, (section 5.3.3, Time and Memory Requirements). I'll admit in retrospect I needed to run longer sessions though, (aka I did a "dumb" comparison that included the amount of time JtR spent starting up vs the guesses-per-second output that JtR gives), and since then JtR's incremental mode has gotten even faster while my code has gotten slower with added features....

That being said, with the cost of checking guesses going up with stronger hashing algorithms, the cost of making a plaintext guess goes down relative to the overall cost. Therefore it may be optimal to spend more time making targeted guesses to reduce the overall number of guesses made. That's why right now I'd say the overhead is probably too high to make PCFGs the best choice for fast hashes like unsalted MD5s, but when the hashing algorithm gets more expensive PCFGs start to be more useful.

That's also why most papers focus on the guessing efficiency of different algorithms, (the number of guesses required to crack a certain percentage of passwords). It is easy to make a repeatable test using that, and it's assumed the cost of hashing will go up while future improvements to the algorithm will make it faster, so the most relevant piece of info to share is the guessing efficiency. I'm not arguing that performance statistics aren't worthwhile, (especially when looking to translate them to the real world). It's just that if you are doing the initial research it's more appealing to look at which underlying technique might be useful vs optimizing your code.
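To make the "next" function I mentioned above concrete: one common way to walk terminal structures in decreasing probability order is a priority queue that pops the most probable structure and pushes its slightly-less-probable children. The sketch below is a toy illustration, not the code from my dissertation: the grammar, replacement strings, and probabilities are all made up, and the simple "seen" set stands in for the smarter duplicate suppression a real implementation would use.

```python
import heapq

# Toy grammar (made-up values): one base structure "L3D2" with two slots,
# each slot listing (replacement, probability) sorted high to low.
GRAMMAR = {
    "L3D2": [
        [("abc", 0.5), ("dog", 0.3), ("cat", 0.2)],  # three-letter slot
        [("12", 0.6), ("99", 0.4)],                  # two-digit slot
    ],
}

def next_preterminals(grammar):
    """Yield (probability, guess) pairs in decreasing probability order."""
    heap, seen = [], set()
    for name, slots in grammar.items():
        indices = tuple(0 for _ in slots)
        prob = 1.0
        for slot, i in zip(slots, indices):
            prob *= slot[i][1]
        # heapq is a min-heap, so store the negated probability.
        heapq.heappush(heap, (-prob, name, indices))
        seen.add((name, indices))
    while heap:
        negp, name, indices = heapq.heappop(heap)
        slots = grammar[name]
        guess = "".join(slots[k][i][0] for k, i in enumerate(indices))
        yield -negp, guess
        # Children: bump one slot at a time to its next (lower probability)
        # replacement. A "seen" set avoids pushing the same child twice.
        for k in range(len(slots)):
            if indices[k] + 1 < len(slots[k]):
                child = indices[:k] + (indices[k] + 1,) + indices[k + 1:]
                if (name, child) not in seen:
                    seen.add((name, child))
                    prob = 1.0
                    for slot, i in zip(slots, child):
                        prob *= slot[i][1]
                    heapq.heappush(heap, (-prob, name, child))
```

This is also where the blocking behavior I mentioned shows up: the heap pop is inherently serial, but if you relax strict probability order you can hand out several queue entries to workers at once.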
Of course on the john-dev board we are worried about real world issues ;p I don't want to sound like I'm advocating using PCFG techniques for everything. I do think it could be useful in certain situations though.

Matt

On Mon, Jun 29, 2015 at 12:17 PM, Alain Espinosa <alainesp@...ta.cu> wrote:
>
> -------- Original message --------
> From: Matt Weir <cweir@...edu>
> Date: 06/29/2015 11:09 AM (GMT-05:00)
> To: john-dev@...ts.openwall.com
> Subject: [john-dev] Using Probabilistic Context Free Grammars (Was
> precomputed attacks)
>
> ...Now for the downsides. The overhead is pretty high so against fast
> hashes the increase in precision isn't worth the slower guessing rate. Also
> the grammar is imprecise enough that against really slow hashes hand
> written mangling rules still are the way to go. So it currently performs
> best against medium speed hashes.
>
> I read your dissertation; it is very interesting. The main downside for me
> is that PCFG is not easily parallelizable, so no GPU for it.
>
> I also find your performance comparison unfair given that you don't take
> into account implementation speed (this is a very common problem in papers
> talking about password generation). For example, if brute force is 4x
> faster than incremental we need to test it with 4x bigger tries, so we
> spend the same time on each attack.
>
> One thing that woke my interest is the entropy calculation. One way to view
> this is that 10-character passwords are only 4.8x stronger than 8-character
> passwords. Given my experience I was expecting more, or perhaps I was too
> used to contest hashes.
>
> Regards,
> Alain