Date: Tue, 7 May 2013 16:21:17 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: Non-ASCII characters in various files -- core and jumbo

On 7 May, 2013, at 16:00, Alexander Cherepanov <cherepan@...me.ru> wrote:
> There are some non-ascii chars in various files and not everything is in the same encoding.
>
> 1. Core.
>
> There is only one non-ascii char in core tree -- in "R<E9>mi Guyomarch" in doc/CREDITS. If this is in iso-8859-1 then it translates to Rémi Guyomarch. Isn't it better to convert it to utf-8 or to pure ascii form?

This is for Solar to decide, but my vote would be UTF-8.

> 2. Jumbo.
>
> - There are some places where non-ascii char can easily be eliminated -- patches attached.

As long as they are UTF-8 I do not think they need to be fixed. The pass_gen.pl fix is probably good though, for avoiding accidental trashing.

> - There are multiple names in utf-8. This is probably Ok.

It is the canonical character set nowadays. If my name had non-ascii characters I would hate having it mangled in some inferior and ambiguous encoding.

> - There are two strings of lower- and upper-case letters from iso-8859-1 in doc/RULES. They are -- surprise:-) -- in iso-8859-1. IMHO it's better to remove them or to convert the file to utf-8.

Actually, earlier today I converted doc/RULES to UTF-8 independently of your findings :-)

> - src/encoding_data.h contains many chars in utf-8. It's probably Ok to have one file where all such stuff lives.

Yes, that file is clearly documented as being supposed to be UTF-8.

> - src/rules.c contains several comments with non-ascii chars copied from src/encoding_data.h. Not sure, maybe remove them or non-ascii chars?

Replacing them with some other characters would totally void their meaning :-) We could drop them though. I really don't think we need to.

I'll apply some, most, maybe all of your patches.

magnum
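
The distinction drawn in this thread (pure ASCII, valid UTF-8, or something else such as iso-8859-1) can be checked mechanically. What follows is only a minimal sketch, assuming Python 3; the helper names classify_encoding and latin1_to_utf8 are hypothetical and are not part of the John the Ripper tree or of the attached patches. Note that any byte sequence decodes as iso-8859-1, so "other" means only "not ASCII and not valid UTF-8", not proof of a particular legacy encoding.

```python
def classify_encoding(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'other' for a file's raw bytes."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Not valid UTF-8; possibly iso-8859-1, but this cannot be proven.
        return "other"


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that are ASSUMED to be iso-8859-1 as UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


# The name from doc/CREDITS as quoted above, with byte 0xE9 for the accented e.
sample = b"R\xe9mi Guyomarch"
print(classify_encoding(sample))                  # other
print(classify_encoding(latin1_to_utf8(sample)))  # utf-8
```

To scan a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes to classify_encoding; converting should be done only after confirming by eye that iso-8859-1 is the right guess for that file.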