Date: Thu, 30 Aug 2012 01:37:44 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: OpenCL on OSX

On 08/28/2012 02:18 AM, magnum wrote:
> I got hold of a Macbook with Kepler GPU so I did some OpenCL testing.
> That was depressing. Every single format fails on GPU (while most or
> all work fine on CPU). There are also tons of benign but noisy
> warnings about things like comparing integers of different signs.
> I'll fix that too but it's not the real problem.
>
> The symptoms are just like a few others already have reported:
>
> OpenCL platform 0: Apple, 2 device(s).
> Using device 1: GeForce GT 650M
> Compilation log:
> Error building kernel. Returned build code: -11. DEVICE_INFO=130
> OpenCL error (CL_BUILD_PROGRAM_FAILURE) in file (common-opencl.c) at
> line (136) - (clBuildProgram failed.)

Some progress. After forcing the discrete GPU (OSX bug), CUDA works fine
(with GPU auto-switching enabled, it kernel panics). Also, some OpenCL
formats work now (I committed a load of minor fixes):


CUDA Device #0
	Name:                          GeForce GT 650M
	Compute capability:            sm_30
	Number of multiprocessors:     2
	Clock rate:                    878 Mhz
	Total global memory:           1.0 GB
	Total shared memory per block: 48.0 kB
	Total constant memory:         64.0 kB
	Kernel execution timeout:      Yes
	Concurrent copy and execution: Yes
	Warp size:                     32

Benchmarking: md5crypt [CUDA]... DONE
Raw:	157827 c/s real, 157827 c/s virtual

Benchmarking: M$ Cache Hash MD4 len(pass)=8, len(salt)=13 [CUDA]... DONE
Raw:	21665K c/s real, 21884K c/s virtual

Benchmarking: M$ Cache Hash 2 (DCC2) PBKDF2-HMAC-SHA-1 [CUDA]... DONE
Raw:	4843 c/s real, 4843 c/s virtual

Benchmarking: phpass MD5 ($P$9 lengths 1 to 15) [CUDA]... DONE
Raw:	158117 c/s real, 156582 c/s virtual

Benchmarking: Password Safe SHA-256 [CUDA]... DONE
Raw:	20928 c/s real, 20928 c/s virtual

Benchmarking: Raw SHA-224 [CUDA]... DONE
Raw:	25305K c/s real, 25305K c/s virtual

Benchmarking: Raw SHA-256 [CUDA]... DONE
Raw:	25057K c/s real, 25057K c/s virtual

Benchmarking: Raw SHA-512 [CUDA]... DONE
Raw:	9074K c/s real, 9074K c/s virtual

Benchmarking: sha256crypt (rounds=5000) [CUDA]... DONE
Raw:	2466 c/s real, 2443 c/s virtual

Benchmarking: sha512crypt (rounds=5000) [CUDA]... DONE
Raw:	2059 c/s real, 2059 c/s virtual

Benchmarking: WPA-PSK PBKDF2-HMAC-SHA-1 [CUDA]... DONE
Raw:	5973 c/s real, 5973 c/s virtual

Benchmarking: Mac OS X 10.7+ salted SHA-512 [CUDA]... DONE
Many salts:	9766K c/s real, 9671K c/s virtual
Only one salt:	7349K c/s real, 7419K c/s virtual


Platform #0 name: Apple
Platform version: OpenCL 1.2 (Jun 20 2012 14:18:19)
	Device #0 name:		Intel(R) Core(TM) i7-3615QM CPU @ 2.30GHz
	Device vendor:		Intel
	Device type:		CPU (LE)
	Device version:		OpenCL 1.2
	Driver version:		1.1
	Global Memory:		8192 MB
	Global Memory Cache:	64 bytes
	Local Memory:		32 KB (Global)
	Max clock (MHz) :	2300
	Max Work Group Size:	1024
	Parallel compute cores:	8

	Device #1 name:		GeForce GT 650M
	Device vendor:		NVIDIA
	Device type:		GPU (LE)
	Device version:		OpenCL 1.1
	Driver version:		CLH 1.0
	Global Memory:		1024 MB
	Global Memory Cache:	0 bytes
	Local Memory:		48 KB (Local)
	Max clock (MHz) :	405
	Max Work Group Size:	1024
	Parallel compute cores:	2
	Stream processors:	16  (2 x 8)

That last line is incorrect. It should be 384 (2 x 192). Claudio's code
does not work because Apple's NVIDIA framework does not export everything
that the native NVIDIA driver does. Not sure how to fix.
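
For reference, the arithmetic behind the expected figure can be sketched as
below. This is only an illustration, not Claudio's code: the helper names
cores_per_mp() and stream_processors() are made up, and it assumes the
compute capability is already known from somewhere (here, CUDA reports
sm_30), which is exactly the part that is missing on Apple's OpenCL. The
per-multiprocessor core counts are the usual ones for NVIDIA hardware up to
Kepler.

```c
/*
 * Hypothetical helper (not from the JtR tree): CUDA cores per
 * multiprocessor for a given compute capability, covering the
 * generations up to Kepler. Returns 0 when the capability is unknown.
 */
static unsigned int cores_per_mp(unsigned int major, unsigned int minor)
{
	switch (major) {
	case 1:
		return 8;                     /* sm_1x */
	case 2:
		return minor == 0 ? 32 : 48;  /* sm_20: 32, sm_21: 48 */
	case 3:
		return 192;                   /* sm_3x (Kepler) */
	default:
		return 0;                     /* unknown, caller must handle */
	}
}

/*
 * Hypothetical helper: total stream processors, given the number of
 * multiprocessors ("Parallel compute cores" in the listing above).
 */
static unsigned int stream_processors(unsigned int mps,
                                      unsigned int major,
                                      unsigned int minor)
{
	return mps * cores_per_mp(major, minor);
}
```

With 2 multiprocessors and sm_30, stream_processors(2, 3, 0) gives
2 x 192 = 384. Assuming 8 cores per multiprocessor, as for sm_1x, gives the
16 (2 x 8) printed above, which may be what happens when the compute
capability cannot be queried.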


OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Benchmarking: md5crypt [OpenCL]... DONE
Raw:	68266 c/s real, 7372K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 64, Global work size (GWS) 2097152
Benchmarking: MySQL 4.1 double-SHA-1 [OpenCL]... DONE
Many salts:	11870K c/s real, 83886K c/s virtual
Only one salt:	11983K c/s real, 89877K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Benchmarking: phpass MD5 ($P$9 length 8) [OpenCL]... DONE
Raw:	127746 c/s real, 4300K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Benchmarking: Password Safe SHA-256 [OpenCL]... DONE
Raw:	15753 c/s real, 5734K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 128, Global work size (GWS) 2097152
Benchmarking: Raw MD4 [OpenCL]... DONE
Raw:	19972K c/s real, 69905K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 128, Global work size (GWS) 2097152
Benchmarking: Raw MD5 [OpenCL]... DONE
Raw:	39064K c/s real, 99614K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 64, Global work size (GWS) 2097152
Benchmarking: Raw SHA-1 OpenCL [OpenCL]... DONE
Raw:	23741K c/s real, 119837K c/s virtual

OpenCL platform 0: Apple, 2 device(s).
Using device 1: GeForce GT 650M
Local work size (LWS) 1024, global work size (GWS) 2048
Benchmarking: sha256crypt (rounds=5000) [OpenCL]... DONE
Raw:	2482 c/s real, 409600 c/s virtual


Not bad for a laptop. The rest of the OpenCL formats do not yet work though.

magnum
