Date: Fri, 17 Aug 2012 10:04:35 -0600
From: Kevin Young <kevin.p.young@...il.com>
To: john-users@...ts.openwall.com
Subject: Passphrase Creation

Hello everyone,

First off, thanks to Matt, Solar Designer, and the other John-users for inviting me to participate in the CMIYC contest. I learned a lot and had a great time.

I've been using passphrases for several months now and have seen some chatter on the subject, so I thought I'd chip in. Most of my phrase creation is contained in a bash shell script. But I'm sure there's someone out there with a much better tool, method, or way to do this.

Step 1. Find a good source of words
As mentioned in other posts, the Gutenberg project is a good source. I've also tried mining the Library of Congress, and a few others.

Step 2. Store and organize
Storage proved an early challenge as I underestimated the space requirements. The 15,000 raw (unprocessed) books I currently have fill a 300GB drive. It doesn't sound like much, but things grow quickly. An SSD helps as disk I/O becomes a bottleneck.

Step 3. Download your material
I use a simple wget loop here. Don't saturate the bandwidth of your source or you'll get booted.

Step 4. Scrub raw input
Strip special characters and punctuation. Convert to lowercase and remove excess space characters (sed and awk). Convert between file formats if necessary (dos2unix, unix2dos, or unix2mac). Using these commands I create a single long "sentence".

Before:
It was a dark and stormy night. All the animals were asleep. Somewhere overhead a flash of lightning illuminated the canyon walls followed by the thunder's rumble.

After:
it was a dark and stormy night all the animals were asleep somewhere overhead a flash of lightning illuminated the canyon walls followed by the thunders rumble

Step 5. Phrase length and create phrases
I've tried phrase lengths from 3-10 words.
Using the above example, a 5-word length, and a custom app (arrays and recursion are your friend here), phrase creation begins:

it
it was
it was a
it was a dark
it was a dark and
was
was a
was a dark
was a dark and
was a dark and stormy
a
a dark
a dark and
a dark and stormy
a dark and stormy night
dark
dark and
dark and stormy
dark and stormy night
dark and stormy night all
and
and stormy
and stormy night
and stormy night all
and stormy night all the

I also create a no-space version at the same time. (Is there a mangling rule that can handle this?)

itwas
itwasa
itwasadark
itwasadarkand
wasa
wasadark
wasadarkand
wasadarkandstormy

Step 6. Optimize and reduce
As expected, there are a lot of duplicates, so my script performs a dictionary sort and filters out the duplicates (sort and uniq). I also filter out (grep) things like open source verbiage, distribution notices, credits, etc.

Step 7. You're done
I typically get 1-5 million phrases per book. It isn't optimal but the combinations are vast. (See sample phrases submitted for CMIYC 2012.) I've plucked thousands of similar phrases from LinkedIn and Stratfor -- some were as long as 28 characters. = : )

So there it is... I'm sure there are better ways to do this and I clearly have a lot to learn. (Perhaps mangling rules can solve many of the above-mentioned hurdles?) I still have a LOT of things to do to improve the process but I'll save those tricks for CMIYC 2013 ;)

Thanks go to Matt Weir for his willingness to share a password dialog. I also throw a shout to @joshdustin ( http://7habitsofhighlyeffectivehackers.blogspot.com/ ) for his insight, assistance, and suggestions -- the guy is a Linux wizard, white-hat genius, and great friend.

If anyone has suggestions for improvement or questions, look me up.

Best of luck,
-Kevin-

CMIYC 2012 sample:
----------------------
He pondered a moment rummaged in his pack She was ashamed to shorter space of time to look at some treatment of the slaves I must be aware you and your master back of his head panel in the wall to his aid more capable of giving fathers shall eat establishment of so many have been here before There are a few upperhand a thousand years ago then he was thinking shall they utter Iamsorry1 been able to find
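
For anyone who wants to experiment, here is a minimal Python sketch of Steps 4 through 6 as described above. It is not the bash script or custom app from the post; the function names (scrub, phrases, build_wordlist) are hypothetical, and the scrubbing rule is an assumption: anything other than a-z, 0-9, and whitespace is simply dropped, so hyphenated words get joined and accented letters are lost. The grep-style filtering of license and credit text from Step 6 is left out.

```python
import re


def scrub(text):
    """Step 4: lowercase, drop punctuation, collapse runs of whitespace."""
    text = text.lower()
    # Assumption: keep only a-z, 0-9 and whitespace ("thunder's" -> "thunders").
    text = re.sub(r"[^a-z0-9\s]", "", text)
    return " ".join(text.split())


def phrases(words, max_len=5):
    """Step 5: yield every run of 1..max_len consecutive words."""
    for start in range(len(words)):
        for length in range(1, max_len + 1):
            if start + length > len(words):
                break  # ran off the end of the text
            yield " ".join(words[start:start + length])


def build_wordlist(raw, max_len=5):
    """Steps 4-6: scrub, generate spaced and no-space phrases, dedupe, sort."""
    words = scrub(raw).split()
    spaced = set(phrases(words, max_len))
    nospace = {p.replace(" ", "") for p in spaced}
    # The set union removes duplicates (single words appear in both sets).
    return sorted(spaced | nospace)


if __name__ == "__main__":
    sample = "It was a dark and stormy night. All the animals were asleep."
    for candidate in build_wordlist(sample, max_len=5):
        print(candidate)
```

Holding everything in a set is fine for a single book, but a corpus of thousands of books would probably need the phrases streamed to disk and deduplicated with an external sort instead.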