john-dev - Re: PHC: Lyra2 on CPU


Date: Sat, 6 Jun 2015 12:41:10 +0200
From: Agnieszka Bielec <bielecagnieszka8@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: PHC: Lyra2 on CPU

It seems that the speed of versions b) and c) does not differ on my
laptop after allocating memory outside the hash function and moving
nCols and nThreads to the salt.

tests on super:

version c)

[a@...er run]$ ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     1394 c/s real, 44.8 c/s virtual
Only one salt:  1411 c/s real, 45.9 c/s virtual

[a@...er run]$ GOMP_CPU_AFFINITY=0-31 ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     17696 c/s real, 554 c/s virtual
Only one salt:  17664 c/s real, 553 c/s virtual

version b)

[a@...er run]$ ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     11904 c/s real, 372 c/s virtual
Only one salt:  11722 c/s real, 370 c/s virtual

[a@...er run]$ GOMP_CPU_AFFINITY=0-31 ./john --test --format=lyra2
Will run 32 OpenMP threads
Benchmarking: Lyra2, Generic Lyra2 [ ]... (32xOMP) DONE
Speed for cost 1 (t) of 8, cost 2 (m) of 8
Many salts:     112 c/s real, 3.8 c/s virtual
Only one salt:  112 c/s real, 3.8 c/s virtual

The GOMP_CPU_AFFINITY=0-31 ./john --test --format=lyra2 run of
version b) may be so slow because of my strange construction of
crypt_all().

But it looks like OpenMP has problems with barriers, or there is
something I don't know. I wrote a simple program in C:

#include <stdio.h>
#include <omp.h>

static void func(void)
{
    printf("checkpoint 1\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
    #pragma omp barrier
    printf("checkpoint 2\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
    #pragma omp barrier
    printf("checkpoint 3\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
    #pragma omp barrier
    printf("checkpoint 4\n");
    printf("threads_num=%d, my_thread_num=%d\n",omp_get_num_threads(),omp_get_thread_num());
}

int main(void)
{
    int i;
#pragma omp parallel for
    for(i=0;i<2;i++)
    {
       func();
    }
    return 0;
}

and the output is:
none@...e ~/Desktop $ ./omp
checkpoint 1
checkpoint 1
threads_num=8, my_thread_num=1
threads_num=8, my_thread_num=0
checkpoint 2
threads_num=8, my_thread_num=1
checkpoint 2
threads_num=8, my_thread_num=0
[here program blocks]

If I change the 2 in for(i=0;i<2;i++) to 8, the program works OK.
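
A possible explanation, offered as a reading of the OpenMP rules rather than a confirmed diagnosis: a barrier must be reached by every thread of the team, and the specification does not allow a barrier region closely nested inside a worksharing loop. With 8 threads and only 2 iterations, 6 threads never call func(), so the program would be non-conforming and may hang. The sketch below puts the barriers directly in a parallel region so that all threads pass every checkpoint. The name run_checkpoints, the CHECKPOINTS constant, and the serial fallbacks are hypothetical, introduced only for this illustration.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption) so the sketch also builds without -fopenmp. */
static int omp_get_num_threads(void) { return 1; }
static int omp_get_thread_num(void) { return 0; }
#endif

#define CHECKPOINTS 4

/*
 * Hypothetical helper: every thread of the team walks through the same
 * checkpoints. Returns 1 if, after each barrier, all threads of the team
 * had already arrived at that checkpoint, 0 otherwise.
 */
static int run_checkpoints(void)
{
    int arrived[CHECKPOINTS] = {0};
    int ok = 1;

    /* Plain parallel region: no worksharing loop around the barriers. */
    #pragma omp parallel reduction(&&:ok)
    {
        int c;
        for (c = 0; c < CHECKPOINTS; c++) {
            printf("checkpoint %d: threads_num=%d, my_thread_num=%d\n",
                   c + 1, omp_get_num_threads(), omp_get_thread_num());

            #pragma omp atomic
            arrived[c]++;

            /* Each thread executes this barrier exactly CHECKPOINTS times. */
            #pragma omp barrier

            /* After the barrier, the whole team must have been counted. */
            if (arrived[c] != omp_get_num_threads())
                ok = 0;
        }
    }
    return ok;
}
```

Calling run_checkpoints() from main() and building with gcc -fopenmp should let every thread reach all four checkpoints. If the work itself has to be split with "omp parallel for", the barriers would need to stay outside the loop body.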
