Date: Sat, 18 Jul 2015 13:01:43 -0700
From: ibid.ag@...il.com
To: musl@...ts.openwall.com
Subject: Re: Left-shift of negative number

On Fri, Jul 17, 2015 at 05:35:22PM -0400, Rich Felker wrote:
> > What worries me more than the shift of a negative value, is that this
> > code is erroneous if `int` is only 16 bit wide. Whereas we can
> > reasonably assume that a shift of a negative value in two's complement
> > is the same as an unsigned shift, compilers tend to produce just crap
> > if the shift exceeds the width.
> > 
> > So I would feel much more comfortable if we'd use UINT32_C(0x40)
> > inside the R macro.
> 
> The entire internal API here uses the type unsigned for character
> codes and state, so like the rest of musl there is an assumption
> (guaranteed by POSIX) that int is at least 32-bit. Since the
> UTF-8/multibyte code is written to be largely self-contained and
> independent of musl, we could look into enhancing the code to be
> portable to systems with 16-bit int, but I suspect this would be
> rather useless in practice. If we did that, we would need to use
> something ugly like uint_least32_t rather than uint32_t to gain any
> portability since the latter need not even exist.

As far as I know, 16-bit int is applicable to the following platforms:
-Some ports of certain RTOSes to 8 or 16 bit microcontrollers
 (e.g., FreeRTOS and perhaps eCos)
-DOS, when *not* using GCC (DJGPP uses 32-bit int); this boils down to
 OpenWatcom, the old C89 compilers, and even older K&R-ish compilers.
-FUZIX
-ELKS
-Minix (8086 version)
-Xenix and other old commercial 16-bit *nixes

FUZIX uses sdcc, which is an incomplete C89-ish compiler.
ELKS uses a K&R compiler that can be used with a preprocessor
to compile some C89 code.
Old *nixes use K&R C.
8086 Minix uses ACK, which is C89; there's an experimental port of
PCC to the 8086, but that's a long way from being usable right now.
(Alan Cox is working on it part time so he can port FUZIX to the PC.)

In short, the possible compilers are OpenWatcom, or various compilers
that are C89 at best (can't rely on uint* being available at all,
short of a custom "stdint.h").
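
For illustration only, here is a rough sketch of how a type with at least
32 bits could be picked with nothing but C89, so that a shift by 23 stays
defined even where int is 16 bits wide. The names u32_least and shl23 are
hypothetical and are not taken from musl; the only assumption is the C89
guarantee that unsigned long has at least 32 bits.

```c
#include <limits.h>

/* Hypothetical name: an unsigned type of at least 32 bits, C89 only. */
#if UINT_MAX >= 0xFFFFFFFFUL
typedef unsigned int u32_least;   /* int is already wide enough */
#else
typedef unsigned long u32_least;  /* C89 guarantees >= 32 bits here */
#endif

/* Hypothetical helper: widen first, then shift, so the shift count (23)
 * never reaches the width of the operand type. The mask keeps the result
 * within 32 bits on platforms where the chosen type is wider. */
static u32_least shl23(unsigned v)
{
    return ((u32_least)v << 23) & 0xFFFFFFFFUL;
}
```

The cast has to happen before the shift; shifting first and casting
afterwards would still perform the shift in a 16-bit type.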

I'm not sure if OpenWatcom uses 16-bit int when building in 32-bit mode;
the compatibility with HXRT would suggest that it doesn't.
So to make it meaningful, you would have to make it work with segmented
memory and probably C89.

Odd as it may sound, there are people using UTF on DOS (the Blocek text
editor comes to mind); but I'm not aware of interest in UTF on 16-bit DOS.


Thanks,
Isaac Dunham


