Date: Sun, 13 Sep 2015 15:34:09 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: experiment with functions to reject computed hashes

I was curious whether it is possible to build a function that rejects
computed hashes that are useless for us. Such a function has to:
- never reject hashes from the given set,
- reject as many other hashes as possible.

I tried the following function:
(T1[(v >> a) & 0xF] ^ T2[(v >> b) & 0xF]) == 0
where a true result means reject, v is some integer from the hash, a and b
are shifts chosen so that the two 4-bit fields do not overlap (maybe that
is not strictly needed), and T1 and T2 are tables of 16 values each
(0 - 255). This function is easy to implement with SSE using the shuffle
instruction.
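As a scalar illustration (this sketch is mine, with made-up table contents
and shift values; note the XOR needs parentheses, since == binds tighter
than ^ in both C and Python):

```python
import random

random.seed(1)  # arbitrary seed; these tables are just for illustration
T1 = [random.randint(0, 255) for _ in range(16)]
T2 = [random.randint(0, 255) for _ in range(16)]
a, b = 0, 28  # two non-overlapping 4-bit fields of a 32-bit value

def reject(v):
    # True means the computed hash value v can be rejected
    return (T1[(v >> a) & 0xF] ^ T2[(v >> b) & 0xF]) == 0
```

The SSE version would do both 16-entry lookups at once with a byte shuffle
(pshufb) and XOR the results.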

a and b should be either 0 and word_size_in_bits - 4 to save 2 ops, or
they should be chosen so that the given hashes produce more identical
values.

I used hill climbing to find T1 and T2 for random sets. Actually there
can be more tables; the attached code has a tunable table count.

So far, the code can produce 2 tables for some sets of 2^5 hashes that
reject half of the values that should be rejected and accept everything
else. To do that, I fill the tables with random values 0 - 255 and then
replace random elements in each table with 0 or 1 at random, updating the
state when a good replacement occurs. Usually the initial state accepts
all values; mutations improve the reject ratio. It often gets stuck, but
otherwise it works quickly, so restarts are needed.

So the generated function can reject almost half of the candidates using
~7 SSE instructions (not counting the table loads, but counting the final
& 0x0..0ff), while cracking 32 hashes.

In john, it might be used at the end of crypt_all() to report that there
are no possible passwords. But for that use, all computed candidates have
to fail the check, so the fraction of rejectable batches would be ~1/4
for MAX_KEYS_PER_CRYPT == 2.
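A quick sanity check of that figure (my own arithmetic, assuming the
candidates are independent and each is rejected with probability ~1/2):

```python
# Probability that a whole batch of MAX_KEYS_PER_CRYPT candidates fails
# the check, assuming each is rejected independently with p = 1/2
p_reject = 0.5
for keys_per_crypt in (1, 2, 4):
    print(keys_per_crypt, p_reject ** keys_per_crypt)  # 2 -> 0.25
```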

My algorithm does not scale at all. Naive hill climbing is the wrong
approach because it needs high redundancy. And I am not sure that XOR is
a good way to combine the tables, especially with 3+ tables.

What do you think?

Thanks!

-- 
Regards,
Aleksey Cherepanov

# -*- coding: utf-8 -*-
# Experiment with function for fast reject

# Copyright © 2015 Aleksey Cherepanov <lyosha@...nwall.com>
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted.

# ######################################################################
# Tunable parameters
# (Actually the whole algo may be tuned...)

# table_bits == 4 allows sse's shuffle to perform dereference
table_bits = 4

# for tests
test_target_count = 2**5

# %% it should be auto-tunable
table_count = 2

rand_value = lambda: random.randint(0, 255)
# rand_value = lambda: random.randint(0, 1)
# rand_value = lambda: random.randint(0, 7)
# rand_value = lambda: random.randint(0, 3)
# rand_value = lambda: random.randint(0, 2 ** 5 - 1)

rand_value_for_init = lambda: random.randint(0, 255)

mutations_for_each_table = 1

# ######################################################################
# Code

import random
import sys

table_size = 2 ** table_bits
table_max = table_size - 1

# return 0 to accept and 1 to reject (an XOR of the table values equal
# to 0 means reject)
def check(candidate, tables):
    r = 0
    for c, t in zip(candidate, tables):
        r ^= t[c]
    # With 0 meaning a miss, it may be easier to build tables that do
    # not reject what they should not reject
    r = 1 if r == 0 else 0
    return r


def make_tables(target):

    tables = [[rand_value_for_init() for j in range(table_size)] for i in range(table_count)]
    ar, aw, rr, rw, total = measure_tables(tables, target)

    count_multiplexer = lambda rw, rr: rw * 1000000 - rr
    deep_copy = lambda t: [list(x) for x in t]  # map() returns an iterator in Python 3
    rand_bit = lambda: 1 << random.randint(0, 7)
    rand_table = lambda: random.randint(0, table_count - 1)
    rand_index = lambda: random.randint(0, table_max)

    # hill climbing

    old_rw = rw
    old_aw = aw
    old_rr = rr
    old_tables = deep_copy(tables)

    while old_rw != 0 or old_aw > old_rr:
    # while old_rw != 0 or old_rr < total / 2:

        # reset tables
        tables = deep_copy(old_tables)

        # mutate tables

        # tables[rand_table()][rand_index()] = rand_value()
        # tables[0][rand_index()] ^= rand_bit()
        for i in range(table_count):
            for j in range(mutations_for_each_table):
                # tables[i][rand_index()] ^= rand_value()
                tables[i][rand_index()] = rand_value()

        # measure tables
        ar, aw, rr, rw, total = measure_tables(tables, target)

        # Did we improve tables?
        if count_multiplexer(rw, rr) < count_multiplexer(old_rw, old_rr):
            # we improved state, save it
            old_rw = rw
            old_rr = rr
            old_aw = aw
            old_tables = deep_copy(tables)
            print(rw, rr, aw, count_multiplexer(rw, rr), file=sys.stderr)

    print("finish", old_rr, old_rw, file=sys.stderr)
    return tables


def measure_tables(tables, test_target):
    # test tables: we try all possible candidates
    test_candidate = [0] * table_count
    test_target_hash = {}
    for t in test_target:
        if t in test_target_hash:
            test_target_hash[t] += 1
        else:
            test_target_hash[t] = 1
    # We need a function that:
    # - never rejects elements of target set
    # - rarely accepts elements not from target set, but such are
    # permitted
    # So our target is accepted_right == number of keys in
    # test_target_hash, rejected_wrong == 0, high rejected_right and low
    # accepted_wrong
    accepted_wrong = 0
    accepted_right = 0
    rejected_wrong = 0
    rejected_right = 0
    total = 0
    while True:
        tup = tuple(test_candidate)
        r = check(tup, tables)
        total += 1
        should_accept = tup in test_target_hash
        if r == 0:
            if should_accept:
                accepted_right += 1
            else:
                accepted_wrong += 1
        else:
            if should_accept:
                rejected_wrong += 1
            else:
                rejected_right += 1
        # pick next candidate
        i = 0
        test_candidate[i] += 1
        while i < len(test_candidate) - 1 and test_candidate[i] == table_size:
            test_candidate[i] = 0
            test_candidate[i + 1] += 1
            i += 1
        if i == len(test_candidate) - 1 and test_candidate[i] == table_size:
            # that's all
            break
    return [accepted_right, accepted_wrong, rejected_right, rejected_wrong, total]


# ######################################################################
# Test run

# Return table_count random numbers with table_bits bits each. Like
# get_hash() but already split into chunks suitable for indexing.
def make_candidate():
    return tuple(random.randint(0, table_max) for i in range(table_count))

# make test set
test_target = tuple(make_candidate() for i in range(test_target_count))

# test_target can have duplicates. Actually we'd like to have more of
# them.
# %% implement choosing of bits to check

# make tables to check with
tables = make_tables(test_target)

accepted_right, accepted_wrong, rejected_right, rejected_wrong, total = measure_tables(tables, test_target)
print("accepted_wrong", accepted_wrong, file=sys.stderr)
print("accepted_right", accepted_right, file=sys.stderr)
print("rejected_wrong", rejected_wrong, file=sys.stderr)
print("rejected_right", rejected_right, rejected_right / total, file=sys.stderr)
print("total real", total, file=sys.stderr)
