Date: Sat, 16 Jul 2011 01:55:02 +0400
From: Solar Designer <solar@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: memory usage question, caused by new Unicode-casing

Jim -

Thanks for bringing this up.

On Fri, Jul 15, 2011 at 10:35:16AM -0500, JFoug wrote:
> UTF16 utc2_upcase[0x10000];
> UTF16 utc2_downcase[0x10000];

What does "utc2" stand for?

> Now for the question. utc2_up[down]case arrays require 256K of heap.

Not heap, but .bss, although the effect is similar.

> Is 256K of heap a problem, that should be changed if running in --save-mem
> mode?

I think 256 KB is not too much of a problem, especially not in -jumbo
and not with OpenMP builds where we have some other arrays in .bss that
are of comparable size.

When you don't actually write to those memory pages, then no virtual
memory is allocated for them (but address space is, and it is counted
against certain limits the OS or sysadmin might impose).

You say that the arrays are sparse - if so, and if zero reads from many
elements are expected, you may choose to not initialize those regions,
relying on the read-as-zero property of everything uninitialized in
.bss.  This saves a little bit of memory.

You may also skip initialization of these altogether when a given
invocation of John does not need them.

> IF SO, then there will likely have to be substantial changes made to make
> it 'work' properly, if in --save-mem mode, while at the same time, trying
> to preserve the speed within the 'normal' mode. We 'could' fall back to
> only handling 'a'-'z' casing even in -utf8 if in --save-mem, but that
> pretty much neuters certain formats.

I think that having this depend on --save-mem would be confusing.

Alexander
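
The sparse-table suggestion above can be illustrated with a small sketch. This is not John's actual code. It assumes that a zero entry means "no case mapping, return the character unchanged", that UTF16 is a 16-bit unsigned type, and that only the ASCII and Latin-1 letter ranges are filled in as an example. The names init_case_tables, upcase_char, downcase_char and case_tables_ready are hypothetical, introduced only for illustration.

```c
#include <stdint.h>

typedef uint16_t UTF16; /* assumption: a 16-bit unsigned type */

/*
 * File-scope arrays without an initializer are zero-initialized by the C
 * language rules, and common toolchains place them in .bss.  Pages that
 * are never written to normally stay untouched.
 */
UTF16 utc2_upcase[0x10000];
UTF16 utc2_downcase[0x10000];

/* Hypothetical flag so that repeated calls do no extra work. */
static int case_tables_ready;

/*
 * Fill in only the entries that have a case mapping.  Everything else is
 * left alone and keeps reading as zero.  Call this only when the current
 * invocation actually needs case conversion.
 */
void init_case_tables(void)
{
	unsigned int c;

	if (case_tables_ready)
		return;

	/* ASCII a-z <-> A-Z */
	for (c = 'a'; c <= 'z'; c++) {
		utc2_upcase[c] = (UTF16)(c - 'a' + 'A');
		utc2_downcase[c - 'a' + 'A'] = (UTF16)c;
	}

	/* Latin-1 0xE0-0xFE <-> 0xC0-0xDE, skipping the division sign 0xF7 */
	for (c = 0xE0; c <= 0xFE; c++) {
		if (c == 0xF7)
			continue;
		utc2_upcase[c] = (UTF16)(c - 0x20);
		utc2_downcase[c - 0x20] = (UTF16)c;
	}

	case_tables_ready = 1;
}

/* A zero entry means "no mapping": return the input unchanged. */
UTF16 upcase_char(UTF16 c)
{
	UTF16 m = utc2_upcase[c];
	return m ? m : c;
}

UTF16 downcase_char(UTF16 c)
{
	UTF16 m = utc2_downcase[c];
	return m ? m : c;
}
```

With this layout, a run that never calls init_case_tables() writes to neither array, and a run that does call it only touches the few pages that hold real mappings. The cost is one extra test per lookup compared to a fully populated identity table.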