Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 9 Mar 2015 13:27:30 +0300
From: Solar Designer <>
Subject: Re: 256/128 bit integer arithmatic

On Mon, Mar 09, 2015 at 02:24:49PM +0530, Sayantan Datta wrote:
> I have my own emulation based on 64bit uint in HI, LO configuration for
> performing modulo operations specific to my requirements and it's much
> faster than native 128bit modulo operations.

What do you mean by native here?  gcc's __uint128_t?

> I tried to do the three 64bit
> modulo operations required for emulating 128bit modulo using avx intrinsics

Why use SIMD for this?  Did you want to compute multiple instances of
the modulo division in parallel (and does your task have the parallelism
for this to make sense)?  Anyway, there's no SIMD integer division on x86.

> but it turns out there are no integer division built_ins in gcc, let alone
> modulo operations. No wonder modulo operations are slow!!

I'm not sure what you're referring to, and why you're blaming gcc.  This
shows that gcc is able to produce widening integer multiply instructions
(and I guess the corresponding division instructions as well, although
that would need to be tested separately) without needing an intrinsic:

Also relevant:

I think we shouldn't depend on external libraries that we can easily
avoid, though.

Sayantan, you haven't mentioned what you're trying to accomplish, but I
think you should redefine the task.  For example, you probably don't
actually need 128-bit precision to divide a keyspace.


Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.