Date: Tue, 5 Jun 2012 13:34:24 +0530
From: SAYANTAN DATTA <std2048@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: HD 7970 LDS Bank Conflicts

Hi Milen,

On Tue, Jun 5, 2012 at 1:18 PM, Milen Rangelov <gat3way@...il.com> wrote:

> This is a good candidate for local memory, I would not leave it like that.
> Arrays declared as local variables are almost always a bad idea, even if
> they are backed by private memory, but in reality just very small arrays
> do, it is more likely that the scratchpad memory is used instead which is
> as slow as global. Also I am not sure if your calculations regarding
> avoiding channel conflicts that involve division by 13 do not end up slower
> than actually having the LDS conflicts because integer division and
> especially modulus are very expensive.

The code you saw in the current repository only tries to maximize global
memory bandwidth. It doesn't use LDS. Since the 7970 has 12 global memory
channels, that value is fixed at 12 + 1 = 13. Ideally 12 should give the
best performance, but in practice 13 turned out to give the maximum
performance on the 7970 when using global memory.

Also, this kernel has an ALU-to-fetch instruction ratio of only 3.29, so I'm
not worried about the ALU instructions.

Thanks for the feedback.

Regards,
Sayantan