Date: Fri, 3 Feb 2012 15:33:23 +0000
From: Alex Sicamiotis <>
To: <>
Subject: RE: DES with OpenMP


> No parameters to tweak, I think - it's just different code.
> You may try building with -D_OPENMP instead of -fopenmp - that is, don't
> actually enable OpenMP, but request that version of John's source code.
> This should complain on truly OpenMP-specific constructs such as calls
> to omp_get_max_threads(), which you'll need to remove (just put 1 for
> the threads count, etc.) It should also give warnings about the
> #pragma's, which you may ignore.
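
A minimal sketch of what the quoted suggestion amounts to, assuming a source file that keys on `_OPENMP` the way the quote describes. The names `thread_count`, `fill_squares` and the `FAKE_OPENMP` macro are hypothetical and exist only for this illustration; they are not taken from John's source.

```c
#include <assert.h>
#include <stddef.h>

/* FAKE_OPENMP is a hypothetical macro used only in this sketch. It marks
 * a build that defines _OPENMP by hand (-D_OPENMP) without -fopenmp, so
 * the OpenMP runtime is not linked in. */
#if defined(_OPENMP) && !defined(FAKE_OPENMP)
#include <omp.h>
#endif

/* Number of threads the code should plan for. Without the OpenMP
 * runtime, the call is replaced by the constant 1, as the quote says. */
int thread_count(void)
{
#if defined(_OPENMP) && !defined(FAKE_OPENMP)
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* A loop written for OpenMP. Without -fopenmp the pragma is ignored
 * (the compiler may warn about it) and the loop simply runs serially. */
void fill_squares(int *out, int n)
{
    int i;

    assert(out != NULL && n >= 0);
#ifdef _OPENMP
#pragma omp parallel for
#endif
    for (i = 0; i < n; i++)
        out[i] = i * i;
}
```

Compiling this with something like `gcc -c -Wall -D_OPENMP -DFAKE_OPENMP sketch.c` (the file name is an assumption) selects the OpenMP-style source path while the resulting code stays single-threaded.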

> You may analyze the generated assembly code and try to figure out why
> one version of it is faster than the other on your CPU.
> This is getting off-topic for john-users, though (not just tweaking, but
> source code changes) - want to join us on the john-dev list maybe?

Analyzing or writing code has not been my kind of thing for the
last... 17 years or so. I did some BASIC in the 80's, but for more
serious programming I used Pascal. I never saw the need to learn
another language like C, because my programming needs were not large
enough for me to feel constrained by the language itself. There was no
Linux back then either as an incentive for C. Out of curiosity (a
friend was into assembly at the time) I did a bit of asm, which was
intriguing, but my knowledge in that area was limited and has since
been forgotten... Then I moved on to networks, HTML and then other
things, losing track of programming. When I see asm code today I can't
even recognize many of the instructions or how they operate, other
than what I've read in a wiki about their functionality.

So I wish I could help with the development, but I can't really claim
I know C or asm. I can make sense of some portions of C code which
have comments (lol), but understanding, say, 10% of what you see and
writing new, more optimized code are entirely different things.
However, if you need any benchmarking, or require testing for
something, I'll be more than willing to assist.

> With OpenMP, the code is thread-safe, so it references the DES_bs_all
> structure via a pointer. On one hand, this consumes a register (leaving
> fewer registers for other stuff), but on the other it may result in
> smaller code size (only need to encode offsets relative to a pointer
> rather than larger absolute addresses) and thus more other stuff
> staying in L1 instruction

Theoretically, this *sounds* easy to replicate if someone alters the
non-omp version to use a pointer just like the omp version. A
compilation with icc could then show whether the non-omp build
benefits from this despite the wasted register. Of course I have no
idea how much code change this requires or whether it's even worth the
time (it could just slow things down, due to the wasted register).
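
A minimal sketch of the two access styles being compared, assuming a hypothetical `struct state` as a small stand-in for something like DES_bs_all. The layout and the function names are invented for illustration and do not reflect John's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define WORDS 64

/* Hypothetical stand-in for a large shared state block. */
struct state {
    unsigned long keys[WORDS];
    unsigned long block[WORDS];
};

/* Non-OpenMP style: a single global instance at a fixed address. */
struct state global_state;

/* Direct access: each reference names the global itself, so the
 * compiler may encode its address in the instructions. */
unsigned long xor_direct(void)
{
    unsigned long acc = 0;
    size_t i;

    for (i = 0; i < WORDS; i++)
        acc ^= global_state.keys[i] ^ global_state.block[i];
    return acc;
}

/* OpenMP style: the state arrives via a pointer, which occupies a
 * register but lets fields be addressed as offsets from that base. */
unsigned long xor_via_pointer(const struct state *s)
{
    unsigned long acc = 0;
    size_t i;

    assert(s != NULL);
    for (i = 0; i < WORDS; i++)
        acc ^= s->keys[i] ^ s->block[i];
    return acc;
}
```

Both functions compute the same result, so only the generated code differs. Compiling with `gcc -O2 -c` and comparing the output of `size` or `objdump -d` on the object file is one way to see whether the pointer form is actually smaller on a given compiler. A loop this small may well show no difference at all.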

> An easy thing to check is "size DES_bs_b.o".

Given my lack of ability to discern asm differences, I just did the
easy thing, heh. Out of curiosity I checked the sizes for GCC 4.6.2
and ICC 12.1 (.o files from plain compilation and from -fopenmp
compilation)... there were huge differences in size (more than 10X),
which is counter-intuitive given the 32 KB of an L1 cache. Then I
tried GCC 4.3.2, which was fast in the non-omp version. Indeed, this
had differences too. Anyway, I gathered most* benchmarks + file sizes
in one spreadsheet:

* I left out an ICC batch built with -march=core2. It doesn't make
much difference performance-wise, so I opted for the generic one.

