Date: Wed, 21 Oct 2015 21:07:32 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: locale

I first brought this up off-list, but I think it should be in here:

On Wed, Oct 21, 2015 at 09:21:27AM +0200, magnum wrote:
> On 2015-10-19 22:05, Solar Designer wrote:
> >On Mon, Oct 19, 2015 at 02:51:36AM +0200, magnum wrote:
> >>On 2015-10-18 15:06, Solar Designer wrote:
> >>>BTW, magnum, can we please get rid of the UTF-8 char for degrees? Don't
> >>>assume everyone has their terminal set to UTF-8 all the time, especially
> >>>as it's a totally unnecessary assumption here.
> >>
> >>I made it configurable but it still defaults to UTF-8. I dislike the
> >>idea of dropping it by default - users might not realize that "GPU:73C"
> >>is a temp reading at all.
> >
> >Maybe check the current locale and default to plain "C" if the current
> >locale is not UTF-8? To avoid checking env vars explicitly, maybe use
> >mbrtowc() and see what it returns for the UTF-8 character under the
> >current locale?
>
> I'm now checking/setting locale (if autoconf says I can) and fall back
> to skipping the degree sign. Let me know if it misbehaves.

This is:

https://github.com/magnumripper/JohnTheRipper/issues/1841
https://github.com/magnumripper/JohnTheRipper/commit/5acb98062d25efb319e9ac4dbd04555693b1d739

Looking at these changes, I realize that my idea was probably bad:
initializing the locale with setlocale() affects lots of things,
including the ctype macros. With some cracking modes, this might affect
what candidate passwords they generate. IIRC, we avoided using the ctype
macros in our wordlist rules engine, but now that I grep e.g. for
"islower", I find uses in dynamic_compiler.c, jumbo.c, mask.c.

While we might later choose to add initializing locale to JtR for other
reasons, I think DEGREE_SIGN alone isn't a sufficient reason, and if we
do add locale support, we should make it consistent: initialize it all
the time and do so early on, and not only do it for OpenCL and CUDA
formats like the current code does.

For now, maybe we should in fact check env vars explicitly to decide on
DEGREE_SIGN.

A maybe acceptable hack (for jumbo) is to do something like:

	setlocale(LC_ALL, "");
	... check for UTF-8 here ...
	setlocale(LC_CTYPE, "C");

so that ctype macros are unaffected by the current locale (since our
uses of them appear to be of the kind where we prefer consistency over
customization; arguably, this means they are misuses). But we'll need
to do it all the time, and early on, to ensure consistent behavior
regardless of whether an OpenCL or CUDA format is run.

Also, the current checks for strchr(setlocale(LC_ALL, NULL), '.') do not
tell us whether the locale is UTF-8 or not. We'll need to do better.

Alexander
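
To make the two options above concrete, here is a minimal sketch and not
the code from the commit referenced above. All function names are
hypothetical. It assumes a POSIX system that provides nl_langinfo(), and
it assumes that a codeset suffix of "UTF-8" or "utf8" (in any letter
case) is what marks a UTF-8 locale name. The string comparison avoids
the ctype macros on purpose, so it does not depend on the current
locale.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* ASCII-only lowercase, deliberately not tolower(): must not depend on locale */
static int ascii_lower(int c)
{
	return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/* Compare the first len bytes of s with a lowercase literal, ignoring ASCII case */
static int ascii_ieq(const char *s, size_t len, const char *lit)
{
	size_t i;

	if (strlen(lit) != len)
		return 0;
	for (i = 0; i < len; i++)
		if (ascii_lower((unsigned char)s[i]) != lit[i])
			return 0;
	return 1;
}

/* Hypothetical helper: does this codeset name spell UTF-8? */
static int codeset_is_utf8(const char *s, size_t len)
{
	return ascii_ieq(s, len, "utf-8") || ascii_ieq(s, len, "utf8");
}

/*
 * Look at the codeset part of a locale name such as "en_US.UTF-8" or
 * "de_DE.utf8@euro".  Unlike a bare strchr(name, '.') test, this checks
 * what follows the dot, up to an optional "@modifier".
 */
static int locale_name_is_utf8(const char *name)
{
	const char *dot;

	if (!name || !(dot = strchr(name, '.')))
		return 0;
	dot++;
	return codeset_is_utf8(dot, strcspn(dot, "@"));
}

/*
 * Option 1: check env vars explicitly, without calling setlocale().
 * The first variable that is set and non-empty wins, in the POSIX
 * precedence order for the LC_CTYPE category.
 */
static int env_locale_is_utf8(void)
{
	static const char *const vars[] = { "LC_ALL", "LC_CTYPE", "LANG" };
	size_t i;

	for (i = 0; i < sizeof(vars) / sizeof(vars[0]); i++) {
		const char *v = getenv(vars[i]);
		if (v && *v)
			return locale_name_is_utf8(v);
	}
	return 0;
}

/*
 * Option 2: the hack from this message.  Initialize the locale, ask the
 * C library for the codeset, then force LC_CTYPE back to "C" so that
 * the ctype macros behave the same for every user.
 */
static int probe_utf8_then_reset_ctype(void)
{
	int utf8 = 0;

	if (setlocale(LC_ALL, "")) {
		const char *cs = nl_langinfo(CODESET);
		utf8 = codeset_is_utf8(cs, strlen(cs));
	}
	setlocale(LC_CTYPE, "C");
	return utf8;
}
```

Either function would have to be called once, early in startup and for
all formats, with its result deciding whether DEGREE_SIGN is used. Note
that option 1 is only a heuristic: a locale name without a codeset
suffix says nothing about the encoding. Option 2 leaves the categories
other than LC_CTYPE set from the environment, exactly as in the snippet
above.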