Date: Fri, 7 Mar 2014 19:59:41 -0800 (PST)
From: Anthony Tanoury <tanoury@...oo.com>
To: "john-users@...ts.openwall.com" <john-users@...ts.openwall.com>
Subject: Re: Help - mpi ocl restore session

Thanks for the response magnum! 

On 14-03-07 02:31 AM, magnum wrote: 

> On 2014-03-07 07:26, Anthony Tanoury wrote:
>> I use John the Ripper, version 1.8.0.2-bleeding-jumbo_mpi
>> [linux-x86-64-opencl]
>
> Is this a very recent snapshot or an older one? Some timer oddities have
> changed (for the better, hopefully) very recently. Like yesterday...
>
My snapshot is about two weeks old.


>> I can run an mpirun opencl session just fine and all sessions
>> complete just fine. My only trouble is with session restore and only
>> if it involves remote hosts. I can resume a session if there is not a
>> remote host. However, if I terminate a session with one or more
>> remote hosts using "killall mpirun", "kill HUP" or Ctrl-c and try to
>> restore the session, only one core or one GPU will resume.
>
> When you say "one" I assume you mean only the root node resumes?
>
All 6 cores resume on the master node, but only one core on each of the 
remote computers. 


> Does this happen even if the session had been running for 30 minutes or
> more? Did you set "Save = 60" in john.conf per the instructions? Before
> killing an MPI session, you should "kill -USR1" the mpirun that controls
> it. This should trigger a session save. Then wait at least 30 seconds
> before aborting them.
>
Yes, it happens even if the session has been running for more than 30 
minutes. 

I had "Save=600", but changed it to 60 in john.conf on all computers. I 
did not notice any difference. 
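
One way to confirm what each machine actually has configured is to read the value back out of john.conf. The helper below is only a sketch: the function name john_save_interval is made up for this example, and it assumes the setting sits on its own line as "Save = N" or "Save=N".

```shell
# john_save_interval (hypothetical helper): print the number assigned to
# the "Save" setting in a john.conf-style file given as the first argument.
# Assumes a line of the form "Save = N" or "Save=N"; commented-out lines
# (starting with "#") are ignored because the pattern is anchored.
john_save_interval() {
    sed -n 's/^[[:space:]]*Save[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" |
        head -n 1
}
```

Run against the john.conf on each host, it should print 60 after the change described above.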

To abort, I use "pkill -USR1 mpirun" to trigger a session save, wait 30 
seconds, then I do "killall mpirun". Is this the correct way to end a 
session?? 
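
The save-then-abort sequence above could be scripted roughly as follows. This is a sketch, not a recommended procedure: the names run_step, stop_mpi_session and DRY_RUN are invented for the example, and the 30-second default is simply the wait magnum suggested.

```shell
# run_step (hypothetical helper): print the command when DRY_RUN=1,
# otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# stop_mpi_session (hypothetical helper): request a session save, wait,
# then abort. The optional first argument is the wait in seconds.
stop_mpi_session() {
    wait_secs=${1:-30}
    run_step pkill -USR1 mpirun     # ask mpirun to trigger a session save
    run_step sleep "$wait_secs"     # give the nodes time to write john.rec
    run_step killall mpirun         # then abort the session
}
```

Setting DRY_RUN=1 before calling stop_mpi_session prints the three commands without sending any signals, which is a safe way to check the order first.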


> What happens with the other nodes? Do they silently just not resume or
> are there any errors or other clues?
>
The remote nodes resume but with only one core each. 

When I start a new session I get this: 

Device 2: Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz 
Device 1: Intel(R) Core(TM) i7 CPU       X 980  @ 3.33GHz 
Device 2: Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz 
Local worksize (LWS) 1, Global worksize (GWS) 8192 
Local worksize (LWS) 1, Global worksize (GWS) 16384 
Local worksize (LWS) 1, Global worksize (GWS) 1024 
Loaded 4 password hashes with 4 different salts (wpapsk-opencl, WPA/WPA2 
PSK [PBKDF2-SHA1 OpenCL 4x]) 
Node numbers 1-3 of 3 (MPI) 
Note: minimum length forced to 8 
Send SIGHUP to john process for status 

However, when I try to resume a session I get this: 

0 Session completed 
0 Session completed 
Device 1: Intel(R) Core(TM) i7 CPU       X 980  @ 3.33GHz 
Local worksize (LWS) 1, Global worksize (GWS) 16384 
Loaded 4 password hashes with 4 different salts (wpapsk-opencl, WPA/WPA2 
PSK [PBKDF2-SHA1 OpenCL 4x]) 
Node numbers 1-3 of 3 (MPI) 
Note: minimum length forced to 8 
Send SIGHUP to john process for status 

Notice that only Device 1 of the master node is listed above. All six
cores on the master start; however, only core 2 on each of the remote
computers starts.
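
The difference between the two listings can be checked mechanically by counting the "Device N:" lines in a saved copy of the startup output. The helper below is a sketch; count_devices is a made-up name, and it assumes each reporting node prints exactly one line beginning with "Device", as in the listings above.

```shell
# count_devices (hypothetical helper): count the lines on stdin that look
# like "Device <number>: ...", one per node that reported an OpenCL device.
count_devices() {
    grep -c '^Device [0-9][0-9]*:'
}
```

Fed the new-session listing above it counts three device lines, and fed the resumed-session listing it counts one.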

If I do a "pkill -USR1 mpirun" after a session resume I will get: 
-------------------------------------------------------------------------- 
mpirun noticed that process rank 1 with PID 7490 on node ub1 exited on 
signal 10 (User defined signal 1). 
-------------------------------------------------------------------------- 
and the session will abort and take me back to the prompt.

The message above does not always indicate the same node but varies
between all nodes, including the master.


> Can you see any clues in the log files?
>
The john log files look good, no errors. Are there any other logs I
should check?

Also, the john.rec files on each computer are updated each time I do a 
"pkill -USR1 mpirun" and look good. 
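
Whether a john.rec file was really rewritten by the last USR1 can be checked from its modification time. The helper below is a sketch: rec_age_seconds is a made-up name, and it assumes GNU stat (the "-c %Y" form), which is what Linux systems normally provide.

```shell
# rec_age_seconds (hypothetical helper): print how many seconds ago the
# file given as the first argument was last modified.
# Assumes GNU coreutils stat; returns non-zero if the file cannot be read.
rec_age_seconds() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}
```

Run on each host against its john.rec shortly after "pkill -USR1 mpirun", a small number of seconds means the file was just saved.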


> If it's a very slow (unresponsive) format, try running with lower GWS
> (using e.g. "mpirun -x GWS=2048" or whatever number is a lot lower than
> what is otherwise used) when testing.
>
I lowered GWS down to 512 and no difference. Any more ideas? 
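
For reference, the command line for such a test can be assembled as below. This is a sketch that only prints the command: restore_cmd is a made-up name, the host names are the ones used elsewhere in this thread, and combining "-x GWS=..." with "--restore" is an assumption about how the two suggestions fit together rather than something confirmed here.

```shell
# restore_cmd (hypothetical helper): print, without running it, an mpirun
# command that exports GWS to all nodes and restores the john session.
# Usage: restore_cmd <gws> <host> [<host> ...]
restore_cmd() {
    gws=$1
    shift
    cmd="mpirun -x GWS=$gws"
    for host in "$@"; do
        cmd="$cmd --host $host"
    done
    echo "$cmd ./john --restore"
}
```

For example, "restore_cmd 512 ub0 ub1 ub2" prints a single mpirun line that can be reviewed before it is run.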


> magnum
>
Thanks again magnum!

On Friday, March 7, 2014 1:26 AM, Anthony Tanoury <tanoury@...oo.com> wrote:
 
Greetings JTR wizards- 
 
I humbly bow before your knowledge again, in quest of jtr enlightenment.... 

I use John the Ripper, version 1.8.0.2-bleeding-jumbo_mpi [linux-x86-64-opencl] 

I can run an mpirun opencl session just fine and all sessions complete
just fine. My only trouble is with session restore and only if it
involves remote hosts. I can resume a session if there is not a remote
host. However, if I terminate a session with one or more remote hosts
using "killall mpirun", "kill HUP" or Ctrl-c and try to restore the
session, only one core or one GPU will resume.

I use the following syntax to restore:

mpirun --host ub0 --host ub1 --host ub2 ./john --restore

I also have version 1.8.0.2-bleeding-jumbo_mpi [linux-x86-64-native] and
I can restore multi host sessions just fine using:

mpirun -n 32 --hostfile /etc/nodes ./john --restore

Any idea why I'm having so much trouble restoring mpi opencl multi host sessions??

Thanks,
Tony
