Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 24 Mar 2013 15:20:16 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: inlining strzncpy - and others?

magnum -

On Sun, Mar 24, 2013 at 03:16:17AM +0100, magnum wrote:
> I just replaced Dhiru's four identical instances of inlined _memmem() with a shared jtr_memmem() in misc.h (using a faster implementation while at it) and made it 'extern inlined' as per the portable model described in http://www.greenend.org.uk/rjk/tech/inline.html (i.e. with function source in misc.h instead of misc.c).

Now we have these warnings:

In file included from unafs.c:8:0:
misc.h: In function 'jtr_memmem':
misc.h:149:7: warning: implicit declaration of function 'memcmp' [-Wimplicit-function-declaration]

In file included from common-opencl.h:27:0,
                 from common_opencl_pbkdf2.h:11,
                 from common_opencl_pbkdf2.c:8:
misc.h: In function 'jtr_memmem':
misc.h:149:7: warning: implicit declaration of function 'memcmp' [-Wimplicit-function-declaration]

> Do you have any opinions or wisdom on such inlining?

It might be the way to go, but so far I avoided it in part because we
still try to support old C compilers that lack "inline".  (Maybe we
should not.)  For example, the MAYBE_INLINE macro that we use is there
not only to make use of gcc's attribute always_inline, but also to omit
the inline with compilers that lack support for this keyword.  Of
course, jumbo may be ahead of core in terms of its requirements for the
C compiler.  It already is in terms of required libraries.

> The exercise made me think we might want to do the same to strnzcpy() and perhaps some other tiny functions that may be used in performance-critical code.

strnzcpy() is not meant to be used in performance critical code.  We may
try to optimize those uses.  For example, in dummy.c I replaced (during
dummy.c development/testing) its use in set_key() with libc-provided
strncat() (which may be in asm and may be processing multiple chars per
loop iteration):

static void set_key(char *key, int index)
{
	char *p = saved_key[index];
	*p = 0;
	strncat(p, key, PLAINTEXT_LENGTH);
}

I also tried changing strnzcpy() itself to use strncat() as above, but
on very short strings (such as typical candidate passwords) this was of
no help (in my testing), so I reverted that change.  With two function
calls - to strnzcpy(), and from it to strncat() - this was not worth it.

> I tested this briefly with strnzpy. The john binary grews with about 20 KB. A benchmark of raw-sha0 (using strnzpy in set_key) shows a one percent boost. That's not a lot but with slight boosts also in the rule engine, cracking modes and other parts of John it will add up to more.

This may be a fine change to make, but regardless I think we should also
review those uses of strnzcpy() where this matters, and try to optimize
them.  In rules.c, there are such uses in just 3 commands: 'x', '1', and
'2'.  Of these, maybe we only care about 'x' (since '1' and '2' are for
single crack mode only, which rarely runs for long and usually has other
bottlenecks anyway; there are also uses of strnz*() in single.c).  We
may try the strncat() trick for 'x' (need to benchmark it on multiple
systems/libc's).

Alexander

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.