Date: Mon, 13 May 2013 20:42:06 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Incremental mode in 22.214.171.124

On 13 May, 2013, at 20:22 , Solar Designer <solar@...nwall.com> wrote:
> On Mon, May 13, 2013 at 07:30:44PM +0200, magnum wrote:
>> I had similar results with two-character candidates and so on. Is there any way short lengths could get more "weight", or some other mitigation for this "regression"?
>
> They get so little weight because they're so rare in the training set
> (perhaps non-existent, for these specific characters?) However, you may
> adjust their weight here in charset.c:
>
> 		est *= (*cracks)[length][pos][count];
> 		if (est < 1e-3) /* may adjust this */
> 			est = 1e-3;
>
> Change the 1e-3 (in both places) to something larger (e.g., 1e-2).
> I think the largest value that makes sense is 1.0. So maybe test these:
>
> 0.01
> 0.1
> 0.5
> 0.9
> 1.0
>
> ... and you've already tested 0.001 and are unhappy with it. In your
> testing, also see how this affects efficiency (in terms of successful
> guesses per candidates tested) for actual runs (e.g. train on one half
> of RockYou, test on the other, or train on RockYou and test on another
> real-world data set). I suspect that as results "improve" in terms of
> uncommon short strings being tried sooner, they will be getting worse in
> terms of efficiency. I understand that we do need to be testing really
> short strings reasonably early anyway, though.

Thanks, I will try this. So if I understand the old code & comments, 1.7.9 used 1.0 for this, right?

Just thinking out loud, how about using some variant of "1/length" instead of a fixed figure? That would benefit really short lengths but not skew the longer ones.

magnum
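
For illustration only, here is a minimal sketch of the two options discussed above: the fixed floor from the quoted snippet, and one possible reading of the "1/length" idea. This is not the actual charset.c code. The function names (clamp_fixed, length_floor, clamp_by_length) and the scale parameter are hypothetical, and capping the floor at 1.0 is an assumption taken from the remark that 1.0 is the largest value that makes sense.

```c
#include <assert.h>

/* Hypothetical helper: the fixed floor from the quoted snippet,
 * "if (est < 1e-3) est = 1e-3;", with the constant made a parameter. */
double clamp_fixed(double est, double floor_value)
{
	return est < floor_value ? floor_value : est;
}

/* Hypothetical variant of the "1/length" idea: the floor shrinks as the
 * candidate length grows. "scale" is an assumed tuning knob (not in the
 * original code); the result is capped at 1.0. */
double length_floor(int length, double scale)
{
	double f;

	assert(scale > 0.0);
	if (length < 1)
		length = 1; /* guard against division by zero */
	f = scale / (double)length;
	return f > 1.0 ? 1.0 : f;
}

/* Apply the length-dependent floor to an estimate. */
double clamp_by_length(double est, int length, double scale)
{
	return clamp_fixed(est, length_floor(length, scale));
}
```

With scale = 1.0, a length-2 estimate is floored at 0.5 while a length-8 estimate is floored at 0.125, so short lengths get more weight and longer ones are skewed less. Whether this helps or hurts guesses per candidate tested would still need to be measured on real data, as suggested in the quoted message.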