Date: Wed, 3 Aug 2011 15:28:02 -0500
From: "jfoug" <>
To: <>
Subject: RE: Character casing question for U+0131

>-----Original Message-----
>From: magnum []
>On 2011-08-01 23:15, magnum wrote:
>> Notes for NT:
>> 1. The german double-s (ß) is NOT uppercased to SS (just as we
>> 2. The micro sign is NOT uppercased to the greek uppercase version of
>> that character (Unicode specs suggest that could be done)
>The above is obviously rubbish as NT is case significant. JimF is about
>to do a similar test for old mssql (which is also Unicode, but
>uppercasing), that will be more interesting.

I have a long write-up about this, but I would like to get a few more facts lined up before posting.

It appears that MS-SQL does not perform any of the one-character to multi-character upcasing.  Also, many other characters that are listed to be upcased were not being upcased.  In addition, there are some characters which 'should' have been cased (up and down) that I did not have in the UnicodeData.h file.
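To illustrate what "1 char to multi-char upcasing" means, here is a minimal Python sketch. It relies only on Python's built-in Unicode tables, so it says nothing about what MS-SQL or the Win32 API actually do. The function names expanding_uppercase and upcase_one_to_one are made up for this example; the second one is only an assumption about how a "no expansion" upcaser might behave, not a model of MS-SQL.

```python
def expanding_uppercase(start=0x0000, end=0xFFFF):
    """Map each character in [start, end] whose uppercase form is
    longer than one character to that uppercase form.

    Uses Python's own str.upper(), i.e. the Unicode data shipped with
    the interpreter, not any database or OS API.
    """
    result = {}
    for cp in range(start, end + 1):
        ch = chr(cp)
        up = ch.upper()
        if len(up) > 1:
            result[ch] = up
    return result


def upcase_one_to_one(text):
    """Uppercase text, but leave a character unchanged when its
    uppercase form would expand to several characters.

    Hypothetical helper: it only mimics the 'no 1 char to multi-char
    upcasing' observation, nothing else.
    """
    out = []
    for ch in text:
        up = ch.upper()
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


if __name__ == "__main__":
    table = expanding_uppercase()
    print(len(table), "BMP characters expand when uppercased")
    print(upcase_one_to_one("straße"))
```

With this sketch, the German double-s stays as it is inside an otherwise uppercased word, while one-to-one mappings such as U+0131 (dotless i) to "I" still apply.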

I will put together a comprehensive list and post it to the email group shortly.  I also want to build a test app to find out just what is happening at the Win32 API level when it comes to Unicode and case changing.  For MSSQL, I have information about all characters from U+0000 to U+FFFF.  I will try to gather the same for the Win32 API and have that information before posting.  The amount of information is rather large, so I may write it up as a white paper (along with the sample code, scripts, etc. used in obtaining the information) rather than as a simple email.

