Date: Tue, 5 May 2015 19:30:03 +0800
From: Lei Zhang <zhanglei.april@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: [GSoC] JtR SIMD support enhancements

> On Apr 25, 2015, at 8:34 PM, Solar Designer <solar@...nwall.com> wrote:
>
>> Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (240xOMP) DONE
>> Raw: 17976 c/s real, 75.5 c/s virtual
>
> This is very poor speed. Needs to be investigated.

Out of curiosity, I used Intel VTune to profile this self-test, and got
an execution time distribution table:

Function              CPU Time
-----------------------------------------
[libiomp5.so]         248.451s
[vmlinux]              22.605s
[john]                  6.882s
[libc-2.14.90.so]       0.627s
[libcrypto.so.10]       0.171s
[Others]                0.067s

The program spends most of its time in libiomp5.so, which I guess is
where inter-thread synchronization happens. I think this poor speed
results from the high synchronization overhead.

Actually, by setting OMP_NUM_THREADS to smaller values, I could get
better results than the above.

--------------------------------------------------------
[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=120 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (120xOMP) DONE
Raw: 21576 c/s real, 179 c/s virtual

[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=60 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (60xOMP) DONE
Raw: 22494 c/s real, 374 c/s virtual

[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=30 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (30xOMP) DONE
Raw: 21530 c/s real, 714 c/s virtual

[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=15 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (15xOMP) DONE
Raw: 24237 c/s real, 1613 c/s virtual

[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=8 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (8xOMP) DONE
Raw: 24000 c/s real, 3000 c/s virtual

[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=4 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (4xOMP) DONE
Raw: 12166 c/s real, 3049 c/s virtual

[zhanglei@...0 zhanglei]$ OMP_NUM_THREADS=2 jumbo/john --test --format=phpass
Benchmarking: phpass ($P$9) [phpass ($P$ or $H$) 128/128 MIC 16x1]... (2xOMP) DONE
Raw: 8188 c/s real, 4094 c/s virtual
--------------------------------------------------------

It appears that the default OMP_NUM_THREADS=240 isn't optimal for MIC,
as the synchronization overhead is too high. Maybe we should tune
OMP_NUM_THREADS individually for each format, just like OMP_SCALE.

Lei
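
P.S. To make the scaling in the runs above easier to see, here is a rough sketch that takes the "real" c/s figures from this message and computes per-thread throughput and scaling efficiency relative to the 2-thread run. The function names and the choice of the smallest thread count as the baseline are only assumptions for illustration; this is not part of JtR.

```python
# "real" c/s figures copied from the benchmark runs in this message,
# keyed by OMP_NUM_THREADS (240 is the default run quoted at the top).
REAL_CPS = {
    240: 17976,
    120: 21576,
    60: 22494,
    30: 21530,
    15: 24237,
    8: 24000,
    4: 12166,
    2: 8188,
}


def per_thread(results):
    """Real c/s divided by thread count (roughly john's 'virtual' c/s)."""
    return {n: cps / n for n, cps in results.items()}


def scaling_efficiency(results):
    """Per-thread throughput relative to the smallest thread count measured.

    Using the smallest run as the baseline is an assumption; a 1-thread
    run was not measured here.
    """
    base_n = min(results)
    base = results[base_n] / base_n
    return {n: (cps / n) / base for n, cps in results.items()}


def best_setting(results):
    """Thread count with the highest real c/s."""
    return max(results, key=results.get)


if __name__ == "__main__":
    eff = scaling_efficiency(REAL_CPS)
    for n in sorted(REAL_CPS):
        print(f"{n:4d} threads: {REAL_CPS[n]:6d} c/s real, "
              f"{REAL_CPS[n] / n:8.1f} c/s per thread, "
              f"efficiency {eff[n]:.3f}")
    print("best real c/s at OMP_NUM_THREADS =", best_setting(REAL_CPS))
```

On these numbers, per-thread throughput stays close to the 2-thread level up to 8 threads and then falls off quickly, which fits the guess that synchronization dominates at high thread counts.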