Date: Fri, 27 Mar 2020 11:37:59 +0100
From: Solar Designer <>
Subject: Re: Getting error while using john command

On Fri, Mar 27, 2020 at 03:35:11AM +0100, magnum wrote:
> On 2020-03-26 16:08, Solar Designer wrote:
> >Unrelated, but reminded by the above:
> >
> >magnum, why is it that we care about character encodings of password
> >hash files?  Should we?  I understand why we care about character
> >encodings in wordlists, but not in password hash files.
> Because of the *other* fields

Oh, indeed.  In fact, I realized that this matters for single mode after
I sent the message.  I also think we discussed this on some GitHub issue
before, but I forget.

I guess this also matters for "--show" and for displaying " (usernames)"
while cracking?

Anyway, perhaps we can ignore the encodings (and speed up loading, too)
until the first line containing a colon is seen.  And have this logic
apply separately to each input file.  That way, files with bare hashes
won't be expected to be of any particular encoding (assuming that the
characters used inside hashes are universal across supported encodings)
and will be loaded faster.
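
To illustrate, here is a rough sketch of that per-file logic in Python
(not JtR's actual C loader; the function name and the use of Latin-1 as a
byte-preserving pass-through are my assumptions for illustration only):

```python
def load_hash_lines(raw_lines, encoding="utf-8"):
    """Decode lines with `encoding` only once a colon-containing line is seen.

    raw_lines: iterable of bytes, one per input line of a single file.
    Returns (lines, saw_colon).  Hypothetical helper, not JtR code.
    """
    saw_colon = False
    lines = []
    for raw in raw_lines:
        raw = raw.rstrip(b"\r\n")
        if not saw_colon and b":" in raw:
            # First line with other fields (login, GECOS, ...): from here
            # on the file's encoding matters.
            saw_colon = True
        if saw_colon:
            lines.append(raw.decode(encoding, errors="replace"))
        else:
            # Bare hash: no encoding assumed, bytes pass through 1:1.
            lines.append(raw.decode("latin-1"))
    return lines, saw_colon
```

The state would be reset for each input file, so a file of bare hashes
never reaches the encoding-aware branch.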

> >Also, why is seeing a UTF-16 BOM a fatal error?  Apparently, people are
> >running into this once in a while - perhaps in misuses of the tools
> >similar to the above (in which case it's good luck the error happens to
> >be triggered), but maybe not always.
> I'm really curious how you somehow do not think that should be a fatal 
> error? Before I added it (IIRC) you could run a perfectly fine wordlist 
> encoded in UTF-16 and just not get a single crack, with NO clue as to
> why, even though *all the right words were there*. Now THAT is a problem 
> if you ask me.

Do wordlists encoded in UTF-16 exist in the wild?  I'd expect this error
to be triggered mostly when the input file is (partly) binary garbage,
even though the rest of the file may still be usable (as a wordlist or
hash list).

I guess the solution is to use "--encoding=raw" in such cases, but it
can be rather unfortunate if one runs JtR against a huge partly-garbage
wordlist and the session terminates with that error while unattended.

There's also even less reason to abort rather than try to proceed when
this is seen in a hash file.

So I'd change this from a fatal error to a warning, e.g.:

Warning: UTF-16 BOM seen in input file %s.  We don't support UTF-16.
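
A minimal sketch of such a non-fatal check, again in Python rather than
JtR's C (the function name is hypothetical, and checking only the first
bytes of the file is my assumption about where the BOM is looked for):

```python
import codecs


def utf16_bom_warning(head, filename):
    """Return a warning string if `head` starts with a UTF-16 BOM, else None.

    head: the first few bytes of the input file.  Hypothetical helper.
    Note: a UTF-32LE BOM (FF FE 00 00) also begins with the UTF-16LE BOM,
    so it is reported here as well.
    """
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return ("Warning: UTF-16 BOM seen in input file %s.  "
                "We don't support UTF-16." % filename)
    return None
```

The caller would print the message and keep going instead of exiting.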

> If you mean we should actually support that dreadful encoding, sure. 
> Just open an issue. I think all the needed bits are in there so it 
> should be fairly trivial.

I see no need to support it.

