Date: Wed, 11 Apr 2012 07:27:59 +0400
From: Solar Designer
Subject: Re: JtR:Multi-GPU Setups

Samuele -

On Tue, Apr 10, 2012 at 12:58:44PM +0200, Samuele Giovanni Tonon wrote:
> I hope with this one to give you some good advice and extend that advice to
> all the people interested in GPU computing for john.

Thank you for posting this!  I think we/you should turn this into a wiki
page like:

(it does not exist yet).

> ===
> = Single GPU hints
> ===
> As already noted by many people, AMD/ATI cards need a running X server
> AND a user logged in in order to work

With typical distros, yes, although I think the only thing that's
actually needed is that the program (where we use OpenCL) is allowed to
talk to the X server.  I think this has to do with /usr/sbin/atieventsd
and /etc/ati/, which simply runs some xauth commands.
Anyway, for my testing today I simply configured autologin in
/etc/lightdm/lightdm.conf as suggested at the URL you posted (thanks!)
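
As a rough illustration of the autologin setup mentioned above, the
sketch below checks whether the text of a lightdm.conf sets
"autologin-user".  The "[SeatDefaults]" section name and the account
name "user" in the sample are assumptions made for illustration, not
taken from this thread, so check them against the lightdm version in use.

```python
import configparser


def autologin_user(conf_text):
    """Return the autologin-user value found in lightdm.conf text, or None."""
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Look in every section so the helper does not depend on the section name.
    for section in parser.sections():
        user = parser.get(section, "autologin-user", fallback=None)
        if user:
            return user
    return None


# Hypothetical file content; section and account names are assumptions.
sample = """\
[SeatDefaults]
autologin-user=user
"""

print(autologin_user(sample))
```

On a real system the text would come from reading
/etc/lightdm/lightdm.conf instead of the inline sample.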

> and use 'ssh -X' (X11 protocol enabled in sshd)

Not confirmed.  Moreover, if this command is used when connecting from a
remote machine, it will prevent things from working because our program
(such as JtR) will be talking to the remote machine's X server instead
of to the one with the AMD GPU in it.  In my testing today, when
connecting from a remote machine, simple "ssh" works, "ssh -X" does not
(the AMD GPU is not seen with "clinfo" and "john --platform=list", then).
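
One quick way to tell which X server a program will talk to is to look
at the DISPLAY variable.  The following is a minimal sketch, assuming the
usual conventions: a local display looks like ":0", while "ssh -X"
typically sets something like "localhost:10.0".  The helper name is
hypothetical, and the check is a heuristic rather than a guarantee.

```python
import os


def display_is_forwarded(display):
    """Guess whether a DISPLAY value points at a forwarded X server.

    Local displays normally have an empty host part (":0", ":0.0"),
    while X11 forwarding over SSH usually yields "localhost:10.0" or similar.
    """
    if not display:
        return False
    host, _, _ = display.rpartition(":")
    return host not in ("", "unix")


current = os.environ.get("DISPLAY", "")
if display_is_forwarded(current):
    print("DISPLAY=%s looks forwarded; the local AMD GPU may not be seen" % current)
else:
    print("DISPLAY=%r does not look forwarded" % current)
```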

> this means that if that card is your only card but you plan to work on that
> server from remote (ssh), you must set up a lightdm session with an autologin
> this should help you in doing that:

This has helped, thanks.

> Nvidia on the other hand doesn't need any xserver running,

In my testing today, although the X server does not have to be running
at the moment in order for the Nvidia GPU to be usable (both CUDA and
OpenCL), it does need to have run at least once since the system was
booted up.  I think it initializes the driver and/or the hardware in
some relevant way - and leaves them initialized even upon exit.

> so you can use OpenCL through a normal ssh.

OpenCL works through "normal" SSH (without X forwarding) for both AMD
and Nvidia for me.

> both nvidia and amd give you libraries for opencl and the driver to use it; that
> driver is the proprietary one. If you find yourself stuck with some errors it's most likely
> because you are using an open source driver (radeon, vesa) instead of a proprietary one (nvidia, fglrx)

Yes, when I ran X with an AMD-only xorg.conf (no Nvidia driver
explicitly mentioned there), somehow I was also getting the free nouveau
driver loaded (along with fglrx).  This prevented me from loading the
proprietary nvidia driver later (and somehow the nouveau driver would
refuse to be rmmod'ed because of what looked like a circular dependency
between nouveau-related modules).  So I added "blacklist nouveau" to
/etc/modprobe.d/blacklist.conf, which prevented this annoyance.
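
To confirm that such an entry is actually in place, the file can be
scanned for an active "blacklist" line.  This is only a sketch, assuming
the usual "blacklist <module>" syntax with "#" starting a comment; the
function name is hypothetical.

```python
def is_blacklisted(conf_text, module):
    """Return True if conf_text has an active 'blacklist <module>' line."""
    for raw_line in conf_text.splitlines():
        # Drop anything after '#', then split into whitespace-separated words.
        words = raw_line.split("#", 1)[0].split()
        if len(words) == 2 and words[0] == "blacklist" and words[1] == module:
            return True
    return False


sample = """\
# keep the free driver from claiming the Nvidia card
blacklist nouveau
"""

print(is_blacklisted(sample, "nouveau"))
```

In practice conf_text would be the content of
/etc/modprobe.d/blacklist.conf.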

Another issue was that the new Catalyst 12.3 driver turned out to be too
buggy(?) for me (with my 7970) - more so than the previous beta I was
using, so I had to revert to the latter.  This was unexpected, so it
took me a while to figure out - at first, I suspected that I broke
something else.  The symptoms were that
the AMD GPU wouldn't be listed in "clinfo", the X desktop would have
various artifacts (lots of them!), and when I tried to run a game
(TORCS) it failed with a GLX related error.  Simply downgrading the
driver made all of these go away at once.  To be fair: I am doing this
on Ubuntu 12.04 Beta, which is not officially supported by AMD yet; it
is possible that things would be better with the new driver on a
supported distro.

Even with the older driver, I still get occasional X server segfaults on
its startup, which are cured only by rebooting the system (rmmod'ing
fglrx is not enough).

> What card to choose: i suggest moving to the ati hd 6xxx family, which is a good tradeoff of
> price per speed; the ati hd 5xxx family is good too but maybe a bit outdated and with different
> "bandwidth" pci speed from cpu -> gpu and vice versa.

I think the bandwidth is mostly irrelevant, but I agree with the
recommendation - the newer cards are a better target for

> Nvidia are generally more expensive than amd

Not quite, but they're about 3 times slower for what we're doing.

> but they offer a bit more stability of driver and issue .

Yes, this matches my experience so far.

> Everybody atm is "going wild" for the amd hd 7970; i still don't know if it's worth getting one, i'd wait a bit more

It's slower than 5970 or 6990 (dual-GPU cards) while being priced
similarly.  So 6990 is a better thing to buy in terms of performance per
dollar.  In fact, cheaper cards like 5870/5850 or 6970/6950 are even
better in that respect (and many 6950's are BIOS-upgradable to 6970),
but of course they're slower.

7970 is good for some of us to have because it's a new architecture,
which we need to target as well.  Also, it is slightly more energy
efficient.

Oh, and indeed it is the fastest single-GPU card, so if we only have
code supporting one GPU and we don't want to bother running multiple
instances manually, then the 7970 has an advantage over dual-GPU cards.
On the other hand, it encourages laziness in implementing better
multi-GPU support in JtR. ;-)

> ===
> = CPU mode
> ===
> if you have an AMD card and an AMD cpu you can switch your OpenCL code to run either on CPU or on GPU.
> while this can help with tracking down problems by debugging with gdb

Thank you for this link!

> it could lead to some misbehaviours due to using a different architecture

And that's a good thing - it may help produce more portable OpenCL code.

> ===
> = Dual GPU system:
> ===
> As already mentioned in other messages, this is a bit messy at the moment: linux distributions are not yet
> ready with packages for both nvidia and AMD, so mixing both drivers and libraries could lead to strange
> side effects.

In fact, my conclusion so far is that it's best not to use whatever
facilities a distro might provide to manage the proprietary drivers.
We're likely to end up needing to install different revisions of those
drivers anyway, and then it's better for the distro's package manager
and scripts to be completely unaware of the drivers than for them to be
aware of some version that is no longer there.

> Nothing to worry about, but you could end up spending a lot of time trying to figure out the correct
> order of libraries to specify when using nvidia and which one to use when using AMD.

Yes, and getting X to work with both drivers may also trigger issues.
On my system, I can run X with nvidia only or with fglrx only, but when
I try with both at once (custom xorg.conf, which I think is valid), the
X server segfaults at startup (in fact, in what looks like exactly the
same way that is occasionally triggered by fglrx alone - perhaps there's
simply a bug in fglrx or an incompatibility with the X server version).

So to get both cards initialized after system bootup, I first run with
xorg.conf for Nvidia (the nvidia driver), then terminate the X server,
then run it with xorg.conf for AMD (fglrx).  I suppose I could instead
run two X servers at once (and keep them running).

> Having dual AMD or dual Nvidia is simpler but not so useful: at the moment jtr still needs a good interface
> design for GPU in order to scale with more than one GPU, so for now it is not really *needed* to have more
> than one GPU.

You're speaking of fast hashes only.  JtR is already perfectly capable
of fully loading a GPU (or multiple GPUs with multiple instances) for
slow hashes.
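
Running one instance per GPU can be scripted.  The sketch below only
builds the command lines, one per device.  "--platform" appears earlier
in this message and "--session" is a standard john option; the
"--device" option name, the session naming scheme, and the hash file
names are assumptions for illustration.  A real run would also need a
"--format=" choice, which is left out because no format is named here,
and the work has to be divided by hand (here, one hash file per
instance).

```python
def build_john_commands(john_path, jobs):
    """Build one john argv list per (platform, device, hash_file) job."""
    commands = []
    for platform, device, hash_file in jobs:
        commands.append([
            john_path,
            "--platform=%d" % platform,
            "--device=%d" % device,  # assumed option name, not confirmed here
            # A distinct session name keeps the instances' state files apart.
            "--session=gpu-p%dd%d" % (platform, device),
            hash_file,
        ])
    return commands


# Hypothetical jobs: (platform number, device number, file with hashes).
jobs = [(0, 0, "hashes-a.txt"), (1, 0, "hashes-b.txt")]
for argv in build_john_commands("../run/john", jobs):
    print(" ".join(argv))
```

Each argv list could then be passed to subprocess.Popen to start the
instances side by side.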

> I hope i have clarified a bit more the situation of OpenCL developing and current jtr opencl status.

Your posting was helpful, thanks again!  I think I've provided some
helpful info for the new wiki page as well.

user@...l:~/john/magnum-jumbo/src$ ../run/john -pla=list
Platform #0 name: NVIDIA CUDA
Platform version: OpenCL 1.1 CUDA 4.2.1
        Device #0 name:         GeForce GTX 570
        Device vendor:          NVIDIA Corporation
        Device type:            GPU (LE)
        Device version:         OpenCL 1.1 CUDA
        Driver version:         295.33
        Global Memory:          1279 MB
        Global Memory Cache:    240 KB
        Local Memory:           48 KB (Local)
        Max clock (MHz) :       1600
        Max Work Group Size:    1024
        Parallel compute cores: 15

Platform #1 name: AMD Accelerated Parallel Processing
Platform version: OpenCL 1.1 AMD-APP (844.4)
        Device #0 name:         Tahiti
        Device vendor:          Advanced Micro Devices, Inc.
        Device type:            GPU (LE)
        Device version:         OpenCL 1.1 AMD-APP (844.4)
        Driver version:         CAL 1.4.1658 (VM)
        Global Memory:          3072 MB
        Global Memory Cache:    0 bytes
        Local Memory:           32 KB (Local)
        Max clock (MHz) :       0
        Max Work Group Size:    256
        Parallel compute cores: 32

        Device #1 name:         AMD FX(tm)-8120 Eight-Core Processor
        Device vendor:          AuthenticAMD
        Device type:            CPU (LE)
        Device version:         OpenCL 1.1 AMD-APP (844.4)
        Driver version:         2.0
        Global Memory:          7966 MB
        Global Memory Cache:    16 KB
        Local Memory:           32 KB (Global)
        Max clock (MHz) :       1400
        Max Work Group Size:    1024
        Parallel compute cores: 8

("1400 MHz" for the CPU is actually the minimum clock rate, because the
CPU was almost idle when that was measured.)
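
A listing like the one above is easy to turn into a data structure, for
example to pick platform and device numbers in a script.  This is a
minimal sketch that assumes the "key: value" layout shown above stays
the same; the function name is hypothetical and the sample is an
abbreviated copy of the output in this message.

```python
def parse_platform_list(text):
    """Parse 'john --platform=list' style output into a list of platforms."""
    platforms = []
    device = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if key.startswith("Platform #"):
            platforms.append({"name": value, "devices": []})
            device = None
        elif key == "Platform version" and platforms:
            platforms[-1]["version"] = value
        elif key.startswith("Device #") and platforms:
            device = {"name": value}
            platforms[-1]["devices"].append(device)
        elif device is not None:
            # Any other "key: value" line belongs to the current device.
            device[key] = value
    return platforms


sample = """\
Platform #0 name: NVIDIA CUDA
Platform version: OpenCL 1.1 CUDA 4.2.1
        Device #0 name:         GeForce GTX 570
        Device type:            GPU (LE)

Platform #1 name: AMD Accelerated Parallel Processing
Platform version: OpenCL 1.1 AMD-APP (844.4)
        Device #0 name:         Tahiti
        Device type:            GPU (LE)

        Device #1 name:         AMD FX(tm)-8120 Eight-Core Processor
        Device type:            CPU (LE)
"""

for p_index, platform in enumerate(parse_platform_list(sample)):
    for d_index, dev in enumerate(platform["devices"]):
        print(p_index, d_index, dev["Device type"], dev["name"])
```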

