Date: Mon, 22 Aug 2011 12:09:31 -0700
From: David Hulton <0x31337@...il.com>
To: crypt-dev@...ts.openwall.com
Subject: Re: Yuri's Status Report - #14 of 15

You should have much lower latency on the M501 for Writes/Reads but
you'll definitely want to be transferring your data in larger chunks
(instead of calling WriteDevice multiple times, call it once with a
large length). This will take advantage of bus mastering and bursting
on the bus and if you use a FIFO on the other end it should reduce
your latency almost down to 0 if you code it properly so the core is
never waiting for the software to fill the FIFO.

Also, looking at your Manager.v code you have multiple modules
outputting to the same PicoDataOut signal. You should create a
PicoDataOut wire for each core and OR them all together; this might be
causing issues with your build... I've attached a patched Manager.v
that tries to instantiate 6 cores.

Also, it seems like a lot of the logic is probably used up by larger
resources that could be shared (since they are used at different
states in the state machine). I would recommend trying to break it up
to use modules that perform the 32-bit ADDs and other more resource
intensive operations (could also make use of a DSP48 block for the
32-bit ADD for example) and then have the different parts of the state
machine that need to perform a 32-bit ADD use the module instead of
the c <= a + b, because usually the tools aren't very good at realizing
that all of the ADDs can be performed using a shared resource.

I would also just look into the possibility of doing a fully or
partially unrolled pipeline design for the LX240 since there's a lot
more logic that you can make use of.

-David

On Thu, Aug 18, 2011 at 10:18 PM, Yuri Gonzaga <yuriggc@...il.com> wrote:
>> Does this apply to E-101 only or also to M-501?
>
> I don't know yet. This answer will have to wait for the bitstream
> generation.
>
>> I notice that in your
>> changes to the JtR tree, you call drv->WriteDeviceAbsolute() with sizes
>> larger than 4 bytes. I guess this is untested yet, but you're hoping
>> that it'll work. Correct?
>
> Right. It is untested. My intention is to transfer bigger block of data at
> a time.
>
>> And indeed for decent performance you'll need sizes not merely larger
>> than 4 bytes, but rather you need to send/receive the entire blob of
>> around 4.5 KB in size in one call.
>
> I will have to change the loop verilog construction to receive and send
> everything together.
> Regards,
> Yuri
>

Download attachment "Manager.v" of type "application/octet-stream" (2738 bytes)
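
The point about transfer size can be illustrated with a minimal host-side
sketch. Everything here is an assumption made for illustration: MockDevice
and its WriteDevice(data, len) signature are hypothetical stand-ins that
only count calls, not the Pico driver API, and the 4608-byte size used in
the note after the code is just a round figure for the "around 4.5 KB" blob
mentioned in the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the device driver (NOT the real Pico API).
// It only records how many calls were made and how many bytes were passed.
struct MockDevice {
    std::size_t calls = 0;  // number of driver calls issued
    std::size_t bytes = 0;  // total payload bytes accepted

    // A real driver pays a fixed per-call cost (setup, bus transaction);
    // here the cost is modelled simply by counting the call.
    void WriteDevice(const std::uint8_t* data, std::size_t len) {
        assert(data != nullptr || len == 0);
        ++calls;
        bytes += len;
    }
};

// One driver call per 4-byte word: the pattern the advice warns against.
void send_wordwise(MockDevice& dev, const std::vector<std::uint8_t>& buf) {
    for (std::size_t off = 0; off < buf.size(); off += 4) {
        const std::size_t left = buf.size() - off;
        dev.WriteDevice(buf.data() + off, left < 4 ? left : 4);
    }
}

// One driver call for the whole buffer, so the per-call cost is paid once
// and the hardware is free to burst the data.
void send_bulk(MockDevice& dev, const std::vector<std::uint8_t>& buf) {
    dev.WriteDevice(buf.data(), buf.size());
}
```

With a 4608-byte buffer, send_wordwise() issues 1152 calls while send_bulk()
issues one, and both pass the same 4608 bytes. Whatever the fixed cost per
call is on the real hardware, the word-wise loop pays it 1152 times.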