Message-ID: <4E466D8E.9050101@bredband.net>
Date: Sat, 13 Aug 2011 14:26:54 +0200
From: magnum <rawsmooth@...dband.net>
To: john-dev@...ts.openwall.com
Subject: Re: Unicode, casing, obtaining data, and some real-world
 MSSQL (2000) data.

On 2011-08-12 22:49, jfoug wrote:
>> Do you mean the reinstated "third case" in utf8towcs()?
>
> I believe so.  There were a couple of if blocks which printf error codes, and 'tried' to correct their location within the data stream.  I commented both of those out, at this time. I know it is not right, and we will have to work through what 'is' right, but it allowed the format to process every data point from U+0 to U+FFFF.
>
> It is likely that I was simply spitting out invalid nonsense data, and the code was correct in 'expecting' another UTF-16 character, which was not present.  However, I think this is simply garbage avoidance code.  We simply have to get it to where it keeps the process image 'safe' and does not output unneeded warnings.  Like I said, what I initially publish will likely need some tuning.  However, I do not think this would cause anyone's cracking job any heartburn at all right now.

OK, you mean in utf16toutf8_r():

...
         } else { /* it's an unpaired high surrogate */
                 --source; /* return to the illegal value itself */
                 fprintf(stderr, "warning, utf16toutf8 failed (illegal) - this is a bug in JtR\n");
                 break;
         }
} else { /* We don't have the 16 bits following the high surrogate. */
         --source; /* return to the high surrogate */
         fprintf(stderr, "warning, utf16toutf8 failed (no surrogate) - this is a bug in JtR\n");
         break;
}
...

The original code puts in replacement characters and uses return codes
(conversion_fail etc.), but since this function is only used for
converting back from UTF-16 that we "know" is correct, I simplified it
and put those error messages in there instead. If you hit those clauses,
we should find out why.
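
To make the "warn and stop" behaviour concrete, here is a minimal
standalone sketch of the same idea. It is not the actual
utf16toutf8_r() from the tree: the name utf16_to_utf8_sketch(), its
signature, and the assumption of NUL-terminated host-endian input are
all made up for illustration. It warns on stderr and stops at the first
unpaired surrogate, so whatever was converted up to that point is still
valid, NUL-terminated UTF-8.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (NOT the JtR utf16toutf8_r()): converts a
 * NUL-terminated, host-endian UTF-16 string to NUL-terminated UTF-8.
 * On an unpaired surrogate it prints a warning and stops converting.
 * Returns the number of bytes written, not counting the NUL. */
static size_t utf16_to_utf8_sketch(uint8_t *dst, size_t dst_size,
                                   const uint16_t *src)
{
    uint8_t *out = dst;
    uint8_t *end;

    if (dst_size == 0)
        return 0;
    end = dst + dst_size - 1;            /* reserve room for the NUL */

    while (*src) {
        uint32_t ch = *src++;
        size_t need;

        if (ch >= 0xD800 && ch <= 0xDBFF) {      /* high surrogate */
            uint32_t lo = *src;                  /* may be the NUL */

            if (lo >= 0xDC00 && lo <= 0xDFFF) {  /* valid pair */
                ch = 0x10000 + ((ch - 0xD800) << 10) + (lo - 0xDC00);
                ++src;
            } else if (lo == 0) {                /* input ended early */
                fprintf(stderr, "warning, no low surrogate\n");
                break;
            } else {                             /* something else follows */
                fprintf(stderr, "warning, unpaired high surrogate\n");
                break;
            }
        } else if (ch >= 0xDC00 && ch <= 0xDFFF) {
            fprintf(stderr, "warning, stray low surrogate\n");
            break;
        }

        /* Number of UTF-8 bytes this code point needs. */
        need = ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;
        if ((size_t)(end - out) < need)
            break;                               /* output buffer full */

        switch (need) {
        case 1:
            *out++ = (uint8_t)ch;
            break;
        case 2:
            *out++ = (uint8_t)(0xC0 | (ch >> 6));
            *out++ = (uint8_t)(0x80 | (ch & 0x3F));
            break;
        case 3:
            *out++ = (uint8_t)(0xE0 | (ch >> 12));
            *out++ = (uint8_t)(0x80 | ((ch >> 6) & 0x3F));
            *out++ = (uint8_t)(0x80 | (ch & 0x3F));
            break;
        default:
            *out++ = (uint8_t)(0xF0 | (ch >> 18));
            *out++ = (uint8_t)(0x80 | ((ch >> 12) & 0x3F));
            *out++ = (uint8_t)(0x80 | ((ch >> 6) & 0x3F));
            *out++ = (uint8_t)(0x80 | (ch & 0x3F));
            break;
        }
    }

    *out = 0;
    return (size_t)(out - dst);
}
```

With input we "know" is correct, neither warning should ever fire, so
seeing one points at a bug upstream of the conversion rather than at
bad user data.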

magnum
