john-users - walkthrough for Crack The Password contest

Message-ID: <87d0ef3685.fsf@gmail.com>
Date: Tue, 29 Oct 2019 18:46:02 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-users@...ts.openwall.com
Subject: walkthrough for Crack The Password contest

Team john-users participated in Crack The Password / PCrack /
CrackThe.PW online hash cracking competition held at SAINTCON
conference.

The page of the contest:
https://www.saintcon.org/2019/contests/crackpw/

The tasks were interesting. Many tasks contained hints about
directions to try. Yet it was easy to over-complicate things and miss
easy solutions.

Initially there were 10 tasks for all teams and 1 task for on-site
teams only. 6 hours before the extended end of the contest, 3
additional tasks were added (Secure Document, Smart Candidate
Generation, Iteration). We participated as an off-site team, so the
on-site task is not covered.


______________________________________________________________________
Our resources:

- 4 active members:

Aleksey Cherepanov
Ivan U
Jens T
rofl0r

- ~50 CPU cores / ~100 threads,
- ~4 GPUs initially,
  - ~27 GPUs during the last ~5 hours,
- 4 FPGA boards (ztex 1.15y, 4 chips per board).

Software used:
- John the Ripper [1]
- hashcat [2]
- EtherCalc [3] for collaboration (we used our own instance)
- common auxiliary software
- also we tried: gramtropy, CeWL, ZAP's ajax spider

[1] https://github.com/magnumripper/JohnTheRipper/
[2] https://github.com/hashcat/hashcat/
[3] https://ethercalc.org/


______________________________________________________________________
Our solutions (in arbitrary order):
______________________________________________________________________
    Algebraic Notation

Password: 18. Rxe7 Bxe7

After some edits, the task description included 2 examples:
------------------------------------------------------------
2. Nf3 Nc6
39. Qxh5 Bf6
------------------------------------------------------------

So I considered the following regexp to describe the syntax:
------------------------------------------------------------
[1-9][0-9]?\. [KQRBN]?x?[a-h][1-8] [KQRBN]?x?[a-h][1-8]
------------------------------------------------------------

I expressed this as a set of masks:
------------------------------------------------------------
?d. ?2?3 ?2?3
?d?d. ?2?3 ?2?3
?d. ?1?2?3 ?2?3
?d?d. ?1?2?3 ?2?3
?d. ?1x?2?3 ?2?3
?d?d. ?1x?2?3 ?2?3
?d. ?2?3 ?1?2?3
?d?d. ?2?3 ?1?2?3
?d. ?2?3 ?1x?2?3
?d?d. ?2?3 ?1x?2?3
?d. ?1?2?3 ?1?2?3
?d?d. ?1?2?3 ?1?2?3
?d. ?1?2?3 ?1x?2?3
?d?d. ?1?2?3 ?1x?2?3
?d. ?1x?2?3 ?1?2?3
?d?d. ?1x?2?3 ?1?2?3
?d. ?1x?2?3 ?1x?2?3
?d?d. ?1x?2?3 ?1x?2?3
------------------------------------------------------------
with --1=KQRBN --2=abcdefgh --3=12345678 options.
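
The list above is the product of two move-number forms and three
forms for each of the two moves. A minimal Python 3 sketch that
regenerates it (the names MOVES, NUMBERS and build_masks are
illustrative only, not part of john or hashcat; the output order may
differ from the list above):

```python
from itertools import product

# Per-move alternatives taken from the regexp [KQRBN]?x?[a-h][1-8],
# minus the piece-less capture ("x?2?3"), which the list above omits.
MOVES = ["?2?3", "?1?2?3", "?1x?2?3"]
NUMBERS = ["?d", "?d?d"]   # one- or two-digit move number


def build_masks():
    """Return every '<number>. <first move> <second move>' mask."""
    return [
        "{}. {} {}".format(n, w, b)
        for w, b in product(MOVES, repeat=2)
        for n in NUMBERS
    ]


if __name__ == "__main__":
    for mask in build_masks():
        print(mask)
```

Redirecting the output to a file gives one mask per line, the layout
that the shell loop reads.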

[4] https://hashcat.net/wiki/doku.php?id=mask_attack#hashcat_mask_files

To run the set of masks in john, I used a shell loop:
------------------------------------------------------------
while read -r m; do
    echo "$m";
    ./JohnTheRipper/run/john pw/1.* --format=bcrypt-ztex \
         --1=KQRBN --2=abcdefgh --3=12345678 --mask="$m";
done < a.masks
------------------------------------------------------------

(For hashcat, a mask file[4] could be used, but definitions for ?1,
?2, ?3 should be provided on each line.)

We used the 4 FPGA boards, hashing ~15k c/s for bcrypt cost 10, so
the large number of candidates was not a problem. Still, we sorted the
masks (approximately) by number of candidates, so that smaller masks
would finish earlier. We had started to attack the bcrypt hash very
early, but our initial attacks missed the password.
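
Sorting by number of candidates can be made exact by multiplying the
sizes of the placeholders in each mask. A minimal Python 3 sketch,
assuming the custom charsets given earlier (?1 = 5 pieces, ?2 = 8
files, ?3 = 8 ranks); SIZES, keyspace and sort_masks are illustrative
names:

```python
# Sizes of the placeholders used in the masks:
# ?d = 10 digits, ?1 = 5 pieces, ?2 = 8 files, ?3 = 8 ranks.
SIZES = {"d": 10, "1": 5, "2": 8, "3": 8}


def keyspace(mask):
    """Count the candidates a mask expands to (literals count as 1)."""
    total = 1
    i = 0
    while i < len(mask):
        if mask[i] == "?":
            total *= SIZES[mask[i + 1]]
            i += 2
        else:
            i += 1
    return total


def sort_masks(masks):
    """Order masks so the smallest keyspaces are tried first."""
    return sorted(masks, key=keyspace)


if __name__ == "__main__":
    sample = ["?d?d. ?1x?2?3 ?1x?2?3", "?d. ?2?3 ?2?3", "?d. ?1?2?3 ?2?3"]
    for mask in sort_masks(sample):
        print(keyspace(mask), mask)
```

By this count the smallest mask has 40,960 candidates and the largest
10,240,000. All 18 masks together come to 110 * 704^2 = 54,517,760
candidates, roughly an hour at ~15k c/s.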

This set of masks does not cover cases like "1. xa1 xa1", although the
regexp matches them. After the contest, PingTrip pointed out to me
that "18. Rxe7 Bxe7" can be found in a well-known game [5], and that
ambiguous moves use a different syntax in which the originating file
or rank is added[6] (e.g. "N4f2" or "cxd5" for one turn).

[5] https://en.wikipedia.org/wiki/Deep_Blue_versus_Kasparov,_1997,_Game_6
[6] https://www.cheatography.com/davechild/cheat-sheets/chess-algebraic-notation/


______________________________________________________________________
    Diceware C

Password: Skateboard Aftershave Luxurious Vestibule

"This passphrase was generated using dice and EFF's wordlist[7]. Each
word is capitalized and has a space between words."

[7] https://www.eff.org/files/2016/09/08/eff_short_wordlist_2_0.txt

John's default length limit is 27 for NT format. Ad-hoc dynamic format
--format='dynamic=md4(utf16($p))' may be used to extend it.

With john, we would create a set of rules of the form Az" Word" and
Az" Word1 Word2". Getting the crack on CPU would then take some time.

We used hashcat with the Combinator attack (-a 1).

Pairs of words as 1+1:
------------------------------------------------------------
$ head -n 2 eff_words_c.txt
Aardvark
Abandoned

$ sed -e 's/^/ /' < eff_words_c.txt > e_space.txt

$ ./hashcat/hashcat -d 1 -m 1000 -a 1 f0c845c934251926ee2a8b1d87a16b64 eff_words_c.txt e_space.txt
------------------------------------------------------------

Triplets of words as 2+1:
------------------------------------------------------------
$ ./hashcat/hashcat -a 1 --stdout eff_words_c.txt e_space.txt > e2.txt

$ ./hashcat/hashcat -d 1 -m 1000 -a 1 f0c845c934251926ee2a8b1d87a16b64 e2.txt e_space.txt
------------------------------------------------------------

Quadruples of words as 2+2:
------------------------------------------------------------
$ sed -e 's/^/ /' < e2.txt > e2_space.txt

$ ./hashcat/hashcat -d 1 -m 1000 -a 1 f0c845c934251926ee2a8b1d87a16b64 e2.txt e2_space.txt
------------------------------------------------------------

The last attack took 4m33s on a GTX 1080, even without the -O -w 3
options.
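
To get a feel for the size of the 2+2 run, here is a minimal Python 3
sketch of the combination step on a toy list, plus the keyspace
arithmetic. It assumes the EFF short wordlist 2.0 holds 1296 words
(four dice rolls, 6^4); combine and keyspace are illustrative names:

```python
from itertools import product

# Assumption: the EFF short wordlist 2.0 holds 1296 words (6**4),
# and each position of the passphrase is chosen independently.
WORDLIST_SIZE = 6 ** 4


def keyspace(words_in_phrase, wordlist_size=WORDLIST_SIZE):
    """Number of passphrases with the given number of words."""
    return wordlist_size ** words_in_phrase


def combine(left, right):
    """Mimic a combinator attack: every left entry + every right entry."""
    return [l + r for l, r in product(left, right)]


if __name__ == "__main__":
    toy = ["Aardvark", "Abandoned"]
    spaced = [" " + w for w in toy]                    # like e_space.txt
    pairs = combine(toy, spaced)                       # like e2.txt
    quads = combine(pairs, [" " + p for p in pairs])   # like the 2+2 run
    print(len(quads), quads[0])
    print(keyspace(2), keyspace(4))
```

Under that assumption, e2.txt has 1,679,616 lines and the 2+2 run
covers 2,821,109,907,456 candidates.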


______________________________________________________________________
    Creating Wordlists

Password 1: qJRMyAUXp94wb
Password 2: kqTc1v7eXwRnqKjT9

The description suggested that the password could be seen on the site.
So we downloaded the site and extracted words, but did not find the
password. Then I tried extracting all substrings, and we found
password 1. Later the task was simplified: the new password could be
extracted as a word (a sequence of uppercase letters, lowercase
letters and digits), and only the contest page was needed.

Download the site, put all lines together, and extract substrings up
to length 20, keeping only printable ASCII:
------------------------------------------------------------
$ wget -r https://www.saintcon.org
[...]

$ find www.saintcon.org/ -type f -print0 | xargs -0 cat | sort -u > site.txt

$ perl -C0 -le 'while (<>) { chomp; for $i (0 .. length($_)) { for $l (1 .. 20) { $t = substr $_, $i, $l; if ($t =~ /^[ -~]*$/) { $h{$t} = 1; } } } } for (keys %h) { print }' < site.txt > site2.txt

$ wc -l site2.txt
24983770 site2.txt

$ john wl.pw --format=sha512crypt-ztex --wordlist=site2.txt
------------------------------------------------------------
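
For readers who prefer Python, a minimal Python 3 sketch of the same
substring extraction follows. It is not the command that was run, and
unlike the perl one-liner it never emits the empty string. Reading the
file as latin-1 is an assumption made here so that every byte maps to
one character:

```python
def substrings(lines, max_len=20):
    """Collect every printable-ASCII substring of up to max_len chars."""
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        for start in range(len(line)):
            stop = min(start + max_len, len(line))
            for end in range(start + 1, stop + 1):
                # Stop extending once a character outside ' '..'~' appears.
                if not (" " <= line[end - 1] <= "~"):
                    break
                seen.add(line[start:end])
    return seen


def convert(src="site.txt", dst="site2.txt"):
    """Write one substring per line, like the one-liner above."""
    with open(src, encoding="latin-1") as fin:
        found = substrings(fin)
    with open(dst, "w", encoding="latin-1") as fout:
        for piece in found:
            fout.write(piece + "\n")
```

Holding tens of millions of substrings in a set needs a lot of memory,
the same trade-off as the hash in the perl version.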

Password 2 can be found the same way.

Extracting all substrings seems excessive, but it is an easy way to
extract more without understanding the contents. When working with
sites, you may also need to consider the following points: HTML
entities, soft hyphens, tags and joining of lines (effects of HTML
rendering), content downloaded by AJAX, and content in other markup
languages rendered by JavaScript in the browser (so it may be tricky
to clean up the input).


______________________________________________________________________
    OPVault

Password: 652148

"The password is a number below 1000000."

------------------------------------------------------------
$ 7z x tasks/4/files/crackme.opvault.zip

$ ./JohnTheRipper/run/1password2john.py crackme.opvault/ > keych.pw

$ john keych.pw --mask='?d' --fork=24 --min-length=1 --max-length=6
------------------------------------------------------------


______________________________________________________________________
    7-Zip

Password: Oeb8p14KLu3pe9jK

The task did not require cracking: the password was part of file name
inside the archive: password_is_Oeb8p14KLu3pe9jK.

The contents of the file after unpacking:
Flag: encrypt_your_filenames

"encrypt_your_filenames" was the solution accepted by the CTFd.


______________________________________________________________________
    Hash Identification (#7)

Password: CrackMe766

A bare hash of 40 hex digits was provided. It could be anything
strange, including a truncated longer hash or a shorter hash packed
with a salt. But usually 40 hex digits mean that the final iteration
is sha1. It could still be sha1(md5($p)) or sha1(md5($p).md5($p)), but
it turned out to be a simple triple sha1.

I used an ad-hoc dynamic format:
------------------------------------------------------------
$ john h40.pw --mask='CrackMe?d?d?d' --format='dynamic=sha1(sha1(sha1($p)))'
------------------------------------------------------------
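
A minimal Python 3 sketch of what this expression computes, useful for
checking a guess by hand. It assumes that each inner sha1() result is
passed on as lowercase hex text, which is how I understand john's
dynamic expressions; sha1_hex and triple_sha1 are illustrative names:

```python
import hashlib


def sha1_hex(data):
    """SHA-1 of bytes, returned as lowercase hex encoded to bytes."""
    return hashlib.sha1(data).hexdigest().encode("ascii")


def triple_sha1(password):
    """sha1(sha1(sha1($p))), feeding hex text into each outer round."""
    digest = sha1_hex(sha1_hex(sha1_hex(password.encode("utf-8"))))
    return digest.decode("ascii")


if __name__ == "__main__":
    print(triple_sha1("CrackMe766"))
```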


______________________________________________________________________
    Hash Identification (#10)

Password: CrackMe802238

"qSqZIBB2BvKxyVy2au0CoA==" seems base64 encoded. Decoding gives 16
bytes of binary data, which I recoded into hex. md5 is a popular hash
of that size, so I tried different combinations with md5 and the mask
CrackMe?d?d?d?d, but could not find the solution. So I decided to
extend the mask and try again. It turned out to be plain raw-md5.
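
The recoding step can be sketched in Python 3 as follows (a minimal
sketch; b64_to_hex and is_raw_md5 are illustrative names). The hex
string is what a raw-md5 cracker expects as input:

```python
import base64
import hashlib

ENCODED = "qSqZIBB2BvKxyVy2au0CoA=="   # value given in the task


def b64_to_hex(text):
    """Decode base64 text and return the bytes as lowercase hex."""
    return base64.b64decode(text).hex()


def is_raw_md5(password, encoded=ENCODED):
    """True if md5(password) equals the base64-decoded digest."""
    digest = hashlib.md5(password.encode("utf-8")).hexdigest()
    return digest == b64_to_hex(encoded)


if __name__ == "__main__":
    print(b64_to_hex(ENCODED))
    print(is_raw_md5("CrackMe802238"))
```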


______________________________________________________________________
    Masks

Password: E*IF$?#

"This password consists solely of uppercase and special characters."

------------------------------------------------------------
$ john m.pw --format=raw-md5-opencl --1='?u?s' --mask='?1' --min-length=1 --max-length=10
------------------------------------------------------------


______________________________________________________________________
    Control Characters

Password: ^P@ssw0rd
    with binary ^P

Submitted as: $HEX[9040737377307264]
    (notice 90 instead of 10)

"His password was created replacing a normal letter with it's command
character equivalent on the keyboard."

Following the hint, I generated custom rules (commands are
reformatted):
------------------------------------------------------------
$ python -c '
import string, sys;
print ".include <john.conf>\n[List.Rules:rep]";
[ sys.stdout.write("s{0}\\x{1:02x}\ns{2}\\x{1:02x}\n".format(
    c, ord(c) - ord("a") + 1, c.upper()))
  for c in string.lowercase ]
' > rep.conf

$ head -n 5 rep.conf
.include <john.conf>
[List.Rules:rep]
sa\x01
sA\x01
sb\x02

$ john c.pw --config=rep.conf --rules=rep --wordlist=rockyou.txt
------------------------------------------------------------
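
The generator above is Python 2. A minimal Python 3 sketch that should
produce the same file (build_rules is an illustrative name):

```python
import string


def build_rules(section="rep"):
    """Return john.conf text with rules replacing letters by Ctrl codes."""
    lines = [".include <john.conf>", "[List.Rules:{}]".format(section)]
    for c in string.ascii_lowercase:
        code = ord(c) - ord("a") + 1      # Ctrl-A = 0x01 ... Ctrl-Z = 0x1a
        lines.append("s{}\\x{:02x}".format(c, code))
        lines.append("s{}\\x{:02x}".format(c.upper(), code))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_rules(), end="")
```

Redirect the output to rep.conf as before.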

"Submit the password using $HEX[666c6167] format."

^P is \x10. But the password was not accepted as $HEX[10...]. I had to
replace 10 with 90 (i.e. 0x10 | 0x80). descrypt ignores the upper bit
of each byte, so it was logical to try setting the most significant
bit.
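
The two $HEX[...] strings can be reproduced with a few lines of
Python 3 (a minimal sketch; to_hex_notation and set_high_bit are
illustrative names):

```python
def to_hex_notation(data):
    """Format bytes in the $HEX[...] notation."""
    return "$HEX[{}]".format(data.hex())


def set_high_bit(byte):
    """Set the most significant bit of one byte value."""
    return byte | 0x80


plain = b"\x10@ssw0rd"                       # ^P followed by "@ssw0rd"
flipped = bytes([set_high_bit(plain[0])]) + plain[1:]

print(to_hex_notation(plain))     # $HEX[1040737377307264]
print(to_hex_notation(flipped))   # $HEX[9040737377307264]
```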


______________________________________________________________________
    Analysis / Diceware A / Diceware B

Password 1: angelfis
Password 2: angelfishtagalong

We solved the Analysis task. It contained a descrypt hash of
Password 1, which rofl0r cracked using the crackstation wordlist
(which seems to be quite good in contests).

The same descrypt hash was in the Diceware B task, and the same
password was accepted as the solution. We reported that. The Analysis
task was removed and Diceware B was renamed to Diceware A. But
descrypt seems strange for a passphrase challenge because its maximum
password length is 8; everything longer is truncated. Later the hash
was replaced by sha256crypt with rounds=1000 and the task was renamed
to Diceware B.
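
The truncation explains why Password 1 is a prefix of Password 2. A
tiny Python 3 illustration (descrypt_effective is an illustrative
name, not a real hashing call):

```python
DESCRYPT_MAX_LEN = 8   # descrypt uses only the first 8 characters


def descrypt_effective(password):
    """Return the part of the password that descrypt actually hashes."""
    return password[:DESCRYPT_MAX_LEN]


print(descrypt_effective("angelfishtagalong"))   # angelfis
```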

To find Password 2 (reformatted commands):
------------------------------------------------------------
$ head -n 2 eff_words.txt
aardvark
abandoned

$ (printf '.include <john.conf>\n[List.Rules:eff]\n';
   perl -C0 -lpe 's/^/Az"/; s/$/"/' < eff_words.txt
) > eff.conf

$ head -n 4 eff.conf
.include <john.conf>
[List.Rules:eff]
Az"aardvark"
Az"abandoned"

$ john dw.pw --format=sha256crypt-opencl \
    --config=eff.conf --rules=eff --wordlist=eff_words.txt
------------------------------------------------------------

Without mask mode, john forms all candidates on the CPU and transfers
them to the GPU when an -opencl format is used. For fast NT this is a
problem, because PCIe bandwidth limits the cracking speed. Although
rounds=1000 is lower than the default for sha256crypt, the hash is
still slow enough that bandwidth is not a problem. (To be precise,
john's sha256crypt-opencl format always transfers candidates from CPU
to GPU. Also, an -opencl format may run fully on the CPU if the CPU is
chosen with the --device= option. But john's OpenCL on CPU may be
suboptimal compared to the respective native format. hashcat's OpenCL
on CPU may be a different story[8].)

[8] https://twitter.com/hashcat/status/688737453671342080


______________________________________________________________________
    Secure Document

Password: eDEwMHByZTIzMTA=

"The password to this very import document is an entry in the RockYou
wordlist encoded in base64."

Encode each line into base64, use the output as wordlist:
----------------------------------------------------------------------
$ python -c 'import sys'$'\n''for l in sys.stdin: print l.rstrip("\r\n").encode("base64").replace("\n", "")' < rockyou.txt > rb.txt

$ ./JohnTheRipper/run/libreoffice2john.py document.odt > odt.pw

$ ./JohnTheRipper/run/john odt.pw --wordlist=rb.txt
----------------------------------------------------------------------
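
The one-liner above is Python 2. A minimal Python 3 sketch of the same
encoding step (encode_wordlist and convert are illustrative names). It
works on bytes, since rockyou.txt may contain bytes that are not valid
UTF-8:

```python
import base64


def encode_wordlist(lines):
    """Yield the base64 form of each wordlist entry (bytes in and out)."""
    for line in lines:
        yield base64.b64encode(line.rstrip(b"\r\n"))


def convert(src="rockyou.txt", dst="rb.txt"):
    """Write the base64 form of every line of src to dst."""
    # Binary mode avoids decoding errors on non-UTF-8 entries.
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        for encoded in encode_wordlist(fin):
            fout.write(encoded + b"\n")
```

As a check, the entry x100pre2310 encodes to eDEwMHByZTIzMTA=, the
password found above.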


______________________________________________________________________
    Smart Candidate Generation

Password: frontporchbubblegum123

"When cracking a large amount of salted/slow hashes it is very helpful
to have a smart candidate generator. Additionally, passwords that come
from the same source tend to be more similar to each other than to
those from other sources.

This password is similar to those found in the RockYou wordlist."

To identify the hash type, we grepped for the hash's tag in the
hashcat source code:
------------------------------------------------------------
$ grep -rlF '$pbkdf2$' hashcat
hashcat/tools/test_modules/m20400.pm
hashcat/src/modules/module_20400.c
hashcat/modules/module_20400.so

$ ./hashcat/hashcat -h | grep 20400
  20400 | Python passlib pbkdf2-sha1                       | Generic KDF
------------------------------------------------------------

We created a test hash using passlib:
------------------------------------------------------------
>>> from passlib.hash import pbkdf2_sha1
>>> pbkdf2_sha1.hash('123456')
'$pbkdf2$131000$u9da6z0n5JzTem8NQUhpbQ$DHo7yf6i/UA5RHc0iW2kXImcep8'
------------------------------------------------------------

It was similar to the hash in the task. We used it to test our
recoding. John has support for pbkdf2-hmac-sha1 (on cpu and gpu) but
uses a different syntax:
- the tag is "$pbkdf2-hmac-sha1$",
- iterations, salt and digest are delimited by '.',
- salt and digest are encoded in hex, while passlib encodes in base64.
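
The recoding described by these three points can be sketched in
Python 3 as below. This is not the script we used. It assumes that
passlib's encoding is its "adapted base64" ('.' in place of '+',
padding dropped) and that lowercase hex is acceptable on john's side;
ab64_decode and passlib_to_john are illustrative names:

```python
import base64


def ab64_decode(text):
    """Decode passlib's adapted base64 ('.' for '+', padding dropped)."""
    text = text.replace(".", "+")
    text += "=" * (-len(text) % 4)     # restore the stripped padding
    return base64.b64decode(text)


def passlib_to_john(hash_string):
    """Rewrite '$pbkdf2$rounds$salt$digest' into the layout listed above."""
    _, tag, rounds, salt, digest = hash_string.split("$")
    if tag != "pbkdf2":
        raise ValueError("not a passlib pbkdf2-sha1 hash")
    return "$pbkdf2-hmac-sha1${}.{}.{}".format(
        int(rounds), ab64_decode(salt).hex(), ab64_decode(digest).hex()
    )


if __name__ == "__main__":
    test_hash = ("$pbkdf2$131000$u9da6z0n5JzTem8NQUhpbQ"
                 "$DHo7yf6i/UA5RHc0iW2kXImcep8")
    print(passlib_to_john(test_hash))
```

Cracking the recoded test hash with the known password 123456 is a
quick way to confirm the conversion before spending time on the real
hash.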

Only ~6 minutes before the end, Jens cracked the hash with hashcat on
4 GPUs, using best64 rules on rockyou.txt. Before that, we had tried a
lot of other simple attacks based on rockyou.

It seems that the easiest way to get the password was to pick a word
from rockyou.txt and append "123". Appending "123" occurs in several
rule sets. It seems rockyou-30000.rule would have been better for the
task because the rule appears earlier in that set:
------------------------------------------------------------
$ grep -nF '$1 $2 $3' hashcat/rules/*
hashcat/rules/best64.rule:33:$1 $2 $3
[...]
hashcat/rules/dive.rule:41:$1 $2 $3
[...]
hashcat/rules/rockyou-30000.rule:5:$1 $2 $3
[...]
------------------------------------------------------------


______________________________________________________________________
    Iteration

Answer: 12999016

Both the password and the hash were provided. The hashing algorithm
was said to be iterated sha512. The goal was to find and submit the
number of iterations.

The code in Python 2:
------------------------------------------------------------
from hashlib import sha512

h = 'aa91c7391f6cddba095e22729bde2707931dcf23cddd4a95bdef6c08aace9cac6a3e611ca2ba492c27062731f07c79aa3c512be138412b910cf70e210545e5b7'

t = 'CrackMe'
k = 0
while t != h:
    k += 1
    t = sha512(t).hexdigest()

print k
------------------------------------------------------------

Notice that the hash in hex is passed to the next iteration.
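
A minimal Python 3 sketch of the same search follows. The function
name count_iterations and the limit argument are additions for
illustration; the limit only guards against an endless loop when the
target is never reached:

```python
from hashlib import sha512


def count_iterations(password, target_hex, limit=20_000_000):
    """Return how many sha512 rounds turn password into target_hex."""
    current = password
    for k in range(1, limit + 1):
        # The hex text of each digest is the input of the next round.
        current = sha512(current.encode("ascii")).hexdigest()
        if current == target_hex:
            return k
    return None   # not reached within the limit
```

Call it as count_iterations('CrackMe', h) with the hash from the task.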


______________________________________________________________________

That's all for off-site teams.

Thanks!

-- 
Regards,
Aleksey Cherepanov