Date: Sat, 9 May 2015 00:55:12 -0700
From: John Sully <john@...uare.ca>
To: luoyonggang@...il.com
Cc: musl@...ts.openwall.com, James McNellis <james@...esmcnellis.com>, 
	austin-group-l@...ngroup.org, Clang Dev <cfe-dev@...uiuc.edu>, blees@...n.de, 
	dplakosh@...t.org, hsutter@...rosoft.com, writeonce@...ipix.org
Subject: Re: [cfe-dev] Is that getting wchar_t to be 32bit on win32 a good
 idea for compatible with Unix world by implement posix layer on win32 API?

wchar_t is also pretty common in the win32 world.  You shouldn't assume
people use the windows macros.  Regardless of what you choose someone is
going to lose, so it might make more sense to think about what is more
useful long term.

In my opinion, you almost never want 32-bit wide characters once you learn
of their limitations.  Most people assume that using them lets them return
to the one character -> one glyph idiom of ASCII.  But Unicode is vastly
more complex than that: while you avoid surrogates, you don't avoid things
like combining characters and diacritics, so the idiom does not hold.
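
To make that concrete, here is a minimal sketch (the names precomposed,
decomposed and utf32_len are hypothetical, chosen only for illustration).
It stores the same visible glyph, "é", as 32-bit code points in two ways
and counts the units.  It assumes nothing about any platform's wchar_t
and uses uint32_t instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Two spellings of the same visible glyph, as zero-terminated UTF-32. */
static const uint32_t precomposed[] = { 0x00E9, 0 };         /* U+00E9 */
static const uint32_t decomposed[]  = { 0x0065, 0x0301, 0 }; /* 'e' + U+0301
                                                                COMBINING ACUTE ACCENT */

/* Count 32-bit units up to (not including) the terminating zero. */
size_t utf32_len(const uint32_t *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != 0)
        n++;
    return n;
}
```

utf32_len(precomposed) is 1 and utf32_len(decomposed) is 2, although both
render as a single glyph, so even with 32-bit units the unit count is not
the glyph count.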

Given that almost every character in frequent use around the world is in
the BMP (Basic Multilingual Plane), 16-bit wide chars make the most sense
for most applications.
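
As a rough illustration of what 16-bit units cost outside the BMP, the
sketch below encodes one code point as UTF-16.  The function name
utf16_encode is hypothetical, and uint16_t stands in for a 16-bit
wchar_t.  BMP code points take one unit; anything above U+FFFF takes a
surrogate pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode scalar value as UTF-16.
 * Returns the number of 16-bit units written to out (1 or 2). */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(cp <= 0x10FFFF);             /* inside the Unicode range */
    assert(cp < 0xD800 || cp > 0xDFFF); /* not itself a surrogate   */

    if (cp < 0x10000) {                 /* BMP: a single unit */
        out[0] = (uint16_t)cp;
        return 1;
    }

    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate: top 10 bits */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate: low 10 bits  */
    return 2;
}
```

For example, U+4E2D encodes as the single unit 0x4E2D, while U+1F600
encodes as the pair 0xD83D 0xDE00.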


On Fri, May 8, 2015 at 8:16 PM, 罗勇刚(Yonggang Luo) <luoyonggang@...il.com>
wrote:

> Two solutions:
> 1. Change the width of wchar_t to 32 bits. I guess that would break a
> lot of things that exist in the Win32 world.
> 2. Or preserve wchar_t as 16 bits on win32, and add char16_t and
> char32_t variants of every API that has both narrow and wide
> versions?
>
>
> I support the second one, even if the second option is not
> applicable. The first option would cause a lot of problems. The first
> is that all Windows APIs use wchar_t and depend on wchar_t being
> 2 bytes wide.  The second is that there are open source libraries that
> depend on the de facto 16-bit wchar_t, such as Qt and
> Git (maybe).
> Almost all existing open source libraries that have already been ported
> to Win32 depend on wchar_t being 16 bits.  Cygwin has also discussed
> whether to make wchar_t 32 bits on win32:
>
> https://www.cygwin.com/ml/cygwin/2011-02/msg00037.html
>
>
> >>>>> I think no one would use
> >>>>> wchar_t for cross text processing, because on some systems wchar_t
> >>>>> is just 8 bits wide!
> >>>>
> >>>> anybody would use wchar_t who cares about standard conformant
> >>>> implementations.
> >>>>
> >>>> non-standard broken platforms may get an unmaintained #ifdef
> >>>> as usual..
> >>>
> >>> I think we (and midipix) have a different perspective from Yonggang
> >>> Luo on portable development. Our view is that you write to a POSIX (or
> >>> nearly-POSIX) target with fully working Unicode support and fix the
> >>> small number of targets (i.e. just Windows) that don't already provide
> >> Small is relative; if counting by number of distributions, well, Unix wins.
> >>> these things. Yonggang Luo's perspective seems to be more of a
> >>> traditional Windows approach with #ifdef and lots of OS-specific code,
> >>> but just making the Windows branch of the #ifdefs less hideous than it
> >>> was before. :)
> >> If wchar_t becomes 32 bits on win32, then there truly will be a lot
> >> of #ifdefs. I am not sure whether you have experience with Win32 API
> >> development; I hope we can discuss the problems in a more objective
> >> way.
> >>
> >
> > One primary objective of code portability and posix-compatibility layer
> > for win32 is to _remove_ the need for OS-specific code-paths. A wchar_t
> > that is anything short (no pun intended) of a 32-bit integer will render
> > it impossible to build out of the box many pieces of commonly-used
> > software, including, but not limited to musl libc, the curses library,
> > and anything that expects wchar_t to cover the entire unicode range.
> >
> > As for your suggested framework: there are currently at least three
> > compilers that can produce optimized code for the target platform (gcc,
> > clang, and cparser), and which work very well with most open-source
> > software out there. As an aside, if you are interested in an 8-byte long
> > on 64-bit windows then an open-source compiler is probably your only
> > option. To compile musl with msvc, on the other hand, you'd have to make
> > so many changes to the source code that you might as well write your own
> > libc from scratch. To see why, please attempt to compile some ten or
> > fifteen core libc headers (stdio.h, unistd.h, etc.) with msvc. If that
> > goes well (spoiler: it won't), then the next step would be to compile a
> > subset of the source files (src/pthread or src/stdio, for instance) and
> > remove any remaining obstacles.
> >
> > m.
> >
> >
> >>>
> >>> Rich
> >>
> >>
> >>
> >
> >
>
