Date: Mon, 27 Dec 2021 13:41:38 -0500
From: Rich Felker <>
To: Markus Wichmann <>
Subject: Re: ASM-to-C conversion for i386

On Mon, Dec 27, 2021 at 07:04:37PM +0100, Markus Wichmann wrote:
> On Mon, Dec 27, 2021 at 11:30:56AM -0500, Rich Felker wrote:
> > One thought, and I'm not sure if this is a good idea or a bad one but
> > worth discussing:
> >
> > Using your acos.c as an example, where you have the comment:
> >
> > 	atan2(fabs(sqrt((1-x)*(1+x))), x)
> >
> That comment was copied from acos.s. In general, I have tried to
> preserve comments. Except in fenv.s, where each time __hwcap was tested,
> the same comment was prefixed, and its point should be coming across ten
> times more easily by just creating a symbolic constant.
> > The actual code could be written as:
> >
> > 	return (double)x87_fpatan(x, x87_fabs(x87_fsqrt((1-x)*(1+x))));
> >
> > with the appropriate "x87.h" defining each of these with the
> > appropriate asm & constraints. This kinda makes the individual
> > functions self-documenting and non-error-prone (repetition of
> > error-prone constraints, especially the hidden requirement that, in
> > "=t"(x), x have type long double).
> >
> That's probably an even better idea than what I am currently doing:
> Moving the "core" functions into a new header file (as static inline
> functions), and using these in the function implementations. I could not
> get all the duplication out; in some cases the duplication is only
> conceptual (hypot() and hypotf() have the same idea, but it needs to be
> implemented differently due to the different precisions/representations).
> I think I can combine both approaches, because what I'm doing appears to
> have the effect of moving the __asm__ statements entirely out of the C
> files into the new header file. And it appears that we are only using a
> couple of instructions, anyway.
> Downside is that implementing
> static inline long double x87_fabs(long double x) {
>     __asm__("fabs" : "+t"(x));
>     return x;
> }
> now actually carries the connotation that the result is of
> double-extended precision and needs rounding before being returned.
> Unlike the current version which does not do that. However, to my
> knowledge that will not actually be wrong, only slower, so a solution
> that preserves the current connotations for these few instructions can
> probably be considered a micro-optimization.
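
For illustration, here is a minimal sketch of what the "x87.h" wrappers discussed above might look like. It is not the code from the patch under review. It assumes GCC-style extended asm (GCC or Clang) targeting i386 or x86-64, where the "t" and "u" constraints name st(0) and st(1). The name sketch_acos and the non-x86 fallback branch are hypothetical and exist only so the example is self-contained.

```c
#if defined(__i386__) || defined(__x86_64__)

/* "t" is the top of the x87 stack, st(0). Keeping every operand as
   long double avoids the hidden type requirement on "t" operands. */
static inline long double x87_fabs(long double x)
{
	__asm__("fabs" : "+t"(x));
	return x;
}

static inline long double x87_fsqrt(long double x)
{
	__asm__("fsqrt" : "+t"(x));
	return x;
}

/* fpatan replaces st(1) with atan2(st(1), st(0)) and pops the stack,
   so the result ends up in st(0). The popped input register must be
   listed as a clobber. Argument order follows the call quoted above:
   x87_fpatan(x, y) yields atan2(y, x). */
static inline long double x87_fpatan(long double x, long double y)
{
	long double r;
	__asm__("fpatan" : "=t"(r) : "0"(x), "u"(y) : "st(1)");
	return r;
}

#else

/* Hypothetical stand-ins so the sketch still compiles on other targets. */
#include <math.h>
static inline long double x87_fabs(long double x) { return fabsl(x); }
static inline long double x87_fsqrt(long double x) { return sqrtl(x); }
static inline long double x87_fpatan(long double x, long double y)
{
	return atan2l(y, x);
}

#endif

/* acos(x) = atan2(fabs(sqrt((1-x)*(1+x))), x). The cast rounds the
   double-extended result to double before it is returned. */
double sketch_acos(double x)
{
	return (double)x87_fpatan(x, x87_fabs(x87_fsqrt((1-x)*(1+x))));
}
```

With wrappers like these, the constraints are written once in the header, and the explicit cast at the call site is where the rounding mentioned above would take place.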

Yes, I think the insns that can emit other precisions probably would
need 3 versions, but there are very few of these -- just fabs and

