Date: Fri, 12 Aug 2011 16:48:14 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: Patch 0007  Codepage enhancements

I have added a new patch (0007) to the wiki page. This patch adds numerous
new code page encodings to the Unicode.c file and to rules.  It also
adds 2 'types' of Unicode casing data: one from Unicode.org, and the other
from the observed behavior of M$ Windows and MSSQL.  Also, a couple
of strange bugs showed up in the mscash1 and NT formats when loading the
character U+0080.
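
To illustrate what Unicode.org-style casing data looks like once it is
restricted to a single code page, here is a rough Python sketch. It is not
the code from the patch: it uses Python's built-in Unicode tables instead of
the patch's own data, the name upcase_table is made up for the example, and
cp1252 is only an assumed sample code page. Casing observed on Windows or
MSSQL may differ from what this produces, which is presumably why a second
set of data is kept.

```python
def upcase_table(encoding="cp1252"):
    """Map each high-half byte (0x80-0xFF) to its upper-case byte, if any.

    Hypothetical helper for illustration only. Uses Python's Unicode
    case mapping, not the tables shipped in the patch.
    """
    table = {}
    for byte in range(0x80, 0x100):
        try:
            ch = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this code page
        up = ch.upper()
        if up == ch or len(up) != 1:
            continue  # no upper case, or it expands to several characters
        try:
            encoded = up.encode(encoding)
        except UnicodeEncodeError:
            continue  # upper-case form does not exist in this code page
        if len(encoded) == 1:
            table[byte] = encoded[0]
    return table


if __name__ == "__main__":
    for lower, upper in sorted(upcase_table("cp1252").items()):
        print("0x%02X -> 0x%02X" % (lower, upper))
```

Bytes whose upper-case form falls outside the code page, or expands to more
than one character, are simply left out of the table.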

There is also a new version of cmpt_cp.pl.  This script requires the
UnicodeData.txt file to be located in ./src/unused (and the script has to be
run from ./src).  It now detects MANY things other than simple upper case /
lower case: control characters, numbers, white space, etc.  These can then
be loaded into rules.c when that code page is selected.
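
As an illustration of the kind of classification described above, here is a
small Python sketch. It is not cmpt_cp.pl: it uses Python's unicodedata
module in place of reading UnicodeData.txt directly, and the function name
classify_codepage and the class names are invented for the example.

```python
import unicodedata


def classify_codepage(encoding="cp1252"):
    """Group the high-half bytes (0x80-0xFF) of a code page by character class.

    Hypothetical helper for illustration only. Classes are derived from
    the Unicode general category of each decoded character.
    """
    classes = {"upper": [], "lower": [], "number": [], "control": [],
               "space": [], "punct": [], "other": []}
    for byte in range(0x80, 0x100):
        try:
            ch = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this code page
        cat = unicodedata.category(ch)  # e.g. "Lu", "Ll", "Nd", "Cc", "Zs"
        if cat == "Lu":
            classes["upper"].append(byte)
        elif cat == "Ll":
            classes["lower"].append(byte)
        elif cat.startswith("N"):
            classes["number"].append(byte)
        elif cat == "Cc":
            classes["control"].append(byte)
        elif cat.startswith("Z"):
            classes["space"].append(byte)
        elif cat.startswith("P") or cat.startswith("S"):
            classes["punct"].append(byte)  # punctuation and symbols
        else:
            classes["other"].append(byte)
    return classes


if __name__ == "__main__":
    for name, members in classify_codepage("iso-8859-1").items():
        print(name, " ".join("%02X" % b for b in members))
```

In ISO-8859-1, byte 0x80 decodes to U+0080 and lands in the control class.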

I still have a little work to do on the test suite.  The current mssql-old
will likely have problems with the 'existing' test suite, because that test
suite's data is wrong.  I DO have a large set of test files which were 100%
generated BY mssql, so I trust them as golden far more than the fake hashes
generated by a perl script.  The mssql-old-fmt-plug.c format works 100% with
these new test suite files.

Jim.