Date: Fri, 12 Aug 2011 16:48:14 -0500
From: "jfoug" <jfoug@....net>
To: <john-dev@...ts.openwall.com>
Subject: Patch 0007 Codepage enhancements

I have added a new patch (0007) to the wiki page. This patch adds numerous new code page encodings to the Unicode.c file and to rules. It also adds 2 'types' of Unicode casing data: one from Unicode.org, and the other from observations of M$ Windows and MSSQL behavior. Also, a couple of strange bugs showed up in the mscash1 and NT formats when loading the character U+0080.

There is a new version of cmpt_cp.pl. This script requires the UnicodeData.txt file to be located in ./src/unused (and the script has to be run from ./src). It now detects MANY things other than simple upcase / downcase: control characters, numbers, white space, etc. These can then get loaded into rules.c when that code page is selected.

I still have a little work to do on the test suite. The current mssql-old will likely have problems with the 'existing' test suite. This is due to the test suite's data being wrong. I DO have a large set of test files which were 100% generated BY mssql, so I am taking them as golden, much more than fake hashes generated by a perl script. The mssql-old-fmt-plug.c format works 100% with these new test suite files.

Jim.
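
For readers who have not worked with UnicodeData.txt, here is a rough Python sketch of the kind of per-code-page classification the message describes for cmpt_cp.pl. It is not the script's actual logic. The function names, the class labels, and the small embedded sample are assumptions made for illustration; a real run would read the full UnicodeData.txt file instead. The only fields used are the code point (field 0), the general category (field 2), and the simple uppercase/lowercase mappings (fields 12 and 13).

```python
# Abbreviated sample in UnicodeData.txt's semicolon-separated layout.
# Assumption: a real run would read ./unused/UnicodeData.txt instead.
SAMPLE_LINES = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    "0080;<control>;Cc;0;BN;;;;;N;;;;;",
    "00A0;NO-BREAK SPACE;Zs;0;CS;<noBreak> 0020;;;;N;NON-BREAKING SPACE;;;;",
    "00B2;SUPERSCRIPT TWO;No;0;EN;<super> 0032;;2;2;N;SUPERSCRIPT DIGIT TWO;;;;",
    "00C0;LATIN CAPITAL LETTER A WITH GRAVE;Lu;0;L;0041 0300;;;;N;LATIN CAPITAL LETTER A GRAVE;;;00E0;",
    "00E0;LATIN SMALL LETTER A WITH GRAVE;Ll;0;L;0061 0300;;;;N;LATIN SMALL LETTER A GRAVE;;00C0;;00C0",
]

# Hypothetical mapping from Unicode general category to a coarse class label.
CLASS_BY_CATEGORY = {
    "Lu": "upper",
    "Ll": "lower",
    "Cc": "control",
    "Nd": "number",
    "Nl": "number",
    "No": "number",
    "Zs": "whitespace",
}


def parse_unicode_data(lines):
    """Return {code point: (general category, simple upper, simple lower)}."""
    table = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 15:
            continue  # skip blank or malformed lines
        code_point = int(fields[0], 16)
        upper = int(fields[12], 16) if fields[12] else None
        lower = int(fields[13], 16) if fields[13] else None
        table[code_point] = (fields[2], upper, lower)
    return table


def classify_codepage(codec, table):
    """Group the high bytes (0x80-0xFF) of a single-byte codec by class."""
    classes = {}
    for byte in range(0x80, 0x100):
        try:
            char = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue  # byte is unassigned in this code page
        entry = table.get(ord(char))
        if entry is None:
            continue  # code point not present in the (sample) data
        label = CLASS_BY_CATEGORY.get(entry[0])
        if label is not None:
            classes.setdefault(label, []).append(byte)
    return classes


def lower_to_upper_bytes(codec, table):
    """Map each lowercase high byte to its uppercase byte, where both exist."""
    pairs = {}
    for byte in range(0x80, 0x100):
        try:
            char = bytes([byte]).decode(codec)
        except UnicodeDecodeError:
            continue
        entry = table.get(ord(char))
        if entry is None or entry[1] is None:
            continue
        try:
            pairs[byte] = chr(entry[1]).encode(codec)[0]
        except UnicodeEncodeError:
            continue  # uppercase form is not representable in this code page
    return pairs


if __name__ == "__main__":
    unicode_table = parse_unicode_data(SAMPLE_LINES)
    print(classify_codepage("iso-8859-1", unicode_table))
    print(lower_to_upper_bytes("iso-8859-1", unicode_table))
```

Note that the same byte can land in different classes depending on the code page: in ISO-8859-1 the byte 0x80 decodes to the control character U+0080, while in CP1252 it decodes to U+20AC, which is why the classification has to be done per code page.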