Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 15 Jul 2012 00:02:25 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: magnum-jumbo and magnum-bleeding (NOT J7), and the source() function

Jim -

I feel that your message is emotional.  If you continue to be so
uncomfortable about the current state of things, I might make code
changes just to make you happier since you're an important contributor,
even though I continue to think that no such changes are desirable right
now.  I think that the current (limited) interface I introduced is
sufficient for bleeding's immediate needs, whereas we have other and
higher-priority formats interface changes to introduce as well (such as
to support myrice's work, which is time-sensitive because of GSoC
constraints), and I dislike your get_source() interface from bleeding
for two reasons (so I'd try to come up with a different enhancement to
the current interface when the time is right).

On Sat, Jul 14, 2012 at 01:27:00PM -0500, jfoug wrote:
> Yes, there was a lot hackish about it, but 'simplifying' with individual
> params is no better, IMHO.  Now, you have to keep adding params, and
> changing the interface, vs updating a structure.  

There's exactly one parameter to add in the next revision: void *salt.
And that will be all - no more parameters to add in the future.

We could add it now (without full support from some callers yet), but I
dislike having an incomplete implementation in there, and it is not the
best use of my time to work on a complete implementation of this right
now (e.g., it wouldn't help myrice's GSoC project, unlike certain other
formats interface changes).

> Also, since this is individual params, if we make changes, required for
> jumbo, this means we also have to change the CORE formats in jumbo,
> different than what they are in core.  Jumbo having to live within the
> limitations of core has been something that has caused some of the hackish
> coding (like globals, etc) to have to be used in certain places within
> jumbo.

I think jumbo's support for source() with salted hashes, if we want it
at all (I am not sure), can wait for me making the enhancement to the
interface in core.  And if you really want to do it sooner, you can, by
only adding one parameter to source() and one field to struct
db_password (so that you will easily have the salt pointer in loader).
Yes, the latter would be wasteful.

> >So I opted for the simpler thing first, with
> >intent to enhance it in a slightly different way from what you did.
> >
> >Doesn't this simpler interface cover all use cases you currently have in
> >bleeding?  I thought you were not using this for any salted hashes yet,
> >were you?
> 
> The interface I had worked perfectly for salted formats. I have tested with
> several dynamic formats (but only on testing, this was never committed).
> SapG at least (and I think others), are salted formats using source
> (get_source). 

I had no doubt that it would work for salted hashes.  My question was
whether you were actually using it for those already.  You've answered
it above: "yes, but never committed".  To me, this means that we're not
making things worse by not supporting salted hashes in this interface yet.

> Also well within dynamic, which in of itself, adds a LOT more requirements,
> due to the format not knowing just WHAT is expected at compile time.

What does this mean?

Specifically: is there any code currently in bleeding that you can't
easily modify to fit the source() interface currently in core?

> >Besides, with salted hashes we'd normally have a unique salt for almost
> >every hash (unless the system generated salts poorly, which does
> >happen), so we'd have a struct db_salt allocated anyway and thus the
> >savings from not keeping the source string are relatively smaller.
> 
> That is why I stole a pointer. There was no salt reallocation. The original
> salt did just what it did.  However, each hash 'knew' the salt, and always
> had it available. This really turned out to be a key to getting the code for
> get_source() to work well. Thus, there was absolute NO additional memory
> used.  It was 100% memory savings.

I think you misunderstood what I wrote/meant in the paragraph you quoted.

I do understand how your implementation worked, why it stole the
pointer, and that it did not copy the struct.  This is not what I was
referring to.

Nevermind.

> >What I implemented so far is
> >"inline" storage of very small binary hashes (up to 64 bits in a 64-bit
> >John build) by reusing the "char *source" pointer to store them in
> >formats where we don't need that pointer for its original purpose.  This
> >currently works for LM hashes and it should work for a few more in jumbo
> >(old MySQL, anything else?), but it wouldn't work e.g. for NTLM.
[...]

> I personally think this usage is more 'hackish' than using the same pointer
> to point to different 'type' objects, depending upon runtime conditions.
> Now you have a pointer that sometimes is not a pointer.  Oh well, your call.

I was not complaining about your reuse of the pointer field from a
purist point of view.  My complaint was not it.

Rather, I found such reuse wasteful.  I am currently making better use
of that memory in some cases, and will likely be making even better use
with proper hot/cold separation - or alternatively that field might
become optional.  Under your interface, we'd be stuck with it.

I did in fact complain about another aspect from a purist point of view -
namely, mixing up the hashes database and formats unnecessarily.
source() only needs the salt value, not the db_password struct.  Even to
test source() in the self-test you have to create a fake db_password.
This is a bit dirty.

Like I wrote earlier, I do in fact intend to mix things up in another
formats method, but that would be for a good reason.  The method would
actually be expected to use the struct for its intended purpose, when it
uses the struct at all.  This is not the case for source().

> >One idea is to revise struct db_password such that it would either have
> >explicit inline space for the hot portion (perhaps a 32-bit field) or
> >have the hot portion right before or right after it (a fixed offset
> >relative to struct start either way).  
> 
> Something like this 'could' be done.  One way, is to make  an array, and
> know that this hash list (salt) uses this buffer.  Then simply use the index
> into the array, for salt count.   I think inlining into the db_password is
> not good, because now it is not a packed array of 32 bit values, but an
> array of db_passwords with a small (32 bit) part being in the hot.

Whatever allocation method we use (array or not), we need to identify
the db_password fields that are actually hot (and it'll be more than
just the hot portion of binary - it will also be the next_hash pointer
and maybe something else) and keep exactly those in the hot region.  The
rest should go into cold.

> >> thus jumbo will be hamstrung.
> >
> >I don't feel that my initial source() interface reduces the current
> >functionality of bleeding in any way.  Does it?  Please be specific if
> >so.
> 
> You can only do a subset of formats, non-salted.

This does not reduce the _current_ functionality of bleeding in any way,
as far as I am aware.

> There is no thought into
> salted formats at all.  Yes, that was NOT the easiest thing to get done.  It
> took several versions, before getting it the way it was.  

There is "thought" into salted formats.  It is not in the code yet, but
the code does not contradict adding support for salted hashes in
source() later - in fact, it'd be a trivial addition of "void *salt", as
far as the interface is concerned (and I have thoughts on obtaining this
salt in the loader).  For comparison, your implementation did contradict
reuse of the source field or making this field optional.

> I will wait on final thoughts, until I see how this gets inserted into
> jumbo.  But I do think that salted formats, an likely dynamic may run into
> issues working with the source() interface as is in core.

Please post in here if/when you do run into specific issues.  For now, I
think my time is better spent on other formats interface enhancements,
if I manage to allocate more time to this at all.  I think that
supporting salts in source() is not relevant until the end of August (at
least), whereas some other changes to the formats interface are.  JtR
1.8 is not coming out before the end of August anyway, so we'll have
time to enhance the source() interface in September and on, if we want to.

Thanks,

Alexander

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.