Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 9 Mar 2015 13:27:30 +0300
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: 256/128 bit integer arithmatic

On Mon, Mar 09, 2015 at 02:24:49PM +0530, Sayantan Datta wrote:
> I have my own emulation based on 64bit uint in HI, LO configuration for
> performing modulo operations specific to my requirements and it's much
> faster than native 128bit modulo operations.

What do you mean by native here?  gcc's __uint128_t?

> I tried to do the three 64bit
> modulo operations required for emulating 128bit modulo using avx intrinsics

Why use SIMD for this?  Did you want to compute multiple instances of
the modulo division in parallel (and does your task have the parallelism
for this to make sense)?  Anyway, there's no SIMD integer division on x86.

> but it turns out there are no integer division built_ins in gcc, let alone
> modulo operations. No wonder modulo operations are slow!!

I'm not sure what you're referring to, and why you're blaming gcc.  This
shows that gcc is able to produce widening integer multiply instructions
(and I guess the corresponding division instructions as well, although
that would need to be tested separately) without needing an intrinsic:

http://stackoverflow.com/questions/13187629/gcc-intrinsic-for-extended-division-multiplication/13187798#13187798

Also relevant:

http://stackoverflow.com/questions/16822757/sse-integer-division/16830506#16830506
http://libdivide.com

I think we shouldn't depend on external libraries that we can easily
avoid, though.

Sayantan, you haven't mentioned what you're trying to accomplish, but I
think you should redefine the task.  For example, you probably don't
actually need 128-bit precision to divide a keyspace.

Alexander

Powered by blists - more mailing lists

Your e-mail address:

Powered by Openwall GNU/*/Linux - Powered by OpenVZ