john-dev - Re: [RFC] Johnny further development proposal

Message-ID: <20150422174305.GA22988@openwall.com>
Date: Wed, 22 Apr 2015 20:43:05 +0300
From: Aleksey Cherepanov <lyosha@...nwall.com>
To: john-dev@...ts.openwall.com
Subject: Re: [RFC] Johnny further development proposal

On Wed, Apr 22, 2015 at 08:41:38AM +0300, Shinnok wrote:
> > On Apr 21, 2015, at 12:29 PM, Aleksey Cherepanov <lyosha@...nwall.com> wrote:
> > On Tue, Apr 21, 2015 at 10:24:46AM +0300, Shinnok wrote:
> >>> On Apr 20, 2015, at 11:20 PM, Aleksey Cherepanov <lyosha@...nwall.com> wrote:
> >>> On Mon, Apr 20, 2015 at 10:47:06AM -0400, Mathieu Laprise wrote:
> >>>> Sprint 1(week 1 and 2) :  Get familiar with john the ripper doc and
> >>>> codebase. Code, integrate, test version 1.4 requirements . Translation is
> >>>> already advanced but proper threading will introduce new changes and bugs
> >>>> that I'll fix.
> >>> 
> >>> Threading is a pain and unneeded complexity. So I propose to not
> >>> implement it as long as we can. Is there an example of slowness
> >>> solvable by threading?
> >> 
> >> Threading is not so much of a pain nowadays and is quite a necessity. We have plenty of Qt cross-platform support for that in both QThread and QtConcurrent.
> >> This is not just a threading task, what I ultimately desire is to have proper separation between UI specific logic and all the backend related tasks(cli invocation and monitoring, cli output parsing, post and pre file input/output processing). Reasons for having this separation include:
> >> 
> >> * Better code where we do not mix parsing of JtR process status and output right in the UI related slots and activities. The UI code(mainwindow.cpp) is a particularly bad choice of handling other processes or doing any kind of compute intensive task. Can lead to crashes, slow responses, slow parsing and bad user experience in the end.
> >> * Loose coupling between UI code and backend code(as defined previously), leads to better code clarity, the possibility of defining an interface of communication between JtR  and Johnny, less crashes, easier debugging and greater overall architecture design.
> >> * Having a loosely coupled relation between the UI and the backend tasks will leave room for easier extensibility(via a bit more management code yes). We'll need to call 2john scripts and gracefully recover if some fail, aren't available or are hanging, we can't really do that from MainWindow. Apart from those obvious extensions, I'm thinking the big picture here, when we'll need to transition Johnny to managing multiple instances of JtR running on different machines(which I am a serious believer is the only maturity conclusion for Johnny, even if nobody agrees with me :) ), it will not warrant a complete rewrite, but just an extension of that mediating intermediary step(the language).
> > 
> > So far there is no main solution to run john on remote machines.
> 
> What about --node? Can't that be used to crop a working solution for distributing a john task across several machines?

Yes, it can be used for distributed attacks. There are even scripts that
do that (they are called "slave"; I am not sure where they are now, but we
used them in contests several times). These scripts were more or less the
main solution, but Solar Designer is not really happy about them and calls
the mechanism an "abuse" of --node. The scripts use a very high total number
of nodes, so each attack finishes quite fast.
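
To illustrate the --node idea, here is a minimal sketch that only builds
the command lines, one node per machine. It is not what the "slave"
scripts do. The host names and the hash file name are made up for the
example, and running the commands remotely is left out.

```python
def node_commands(hosts, hash_file):
    """Return one john argv list per host, splitting work with --node=K/TOTAL."""
    total = len(hosts)
    commands = {}
    # Node numbers are 1-based: the first host gets node 1 of TOTAL.
    for number, host in enumerate(hosts, start=1):
        commands[host] = ["john", f"--node={number}/{total}", hash_file]
    return commands


if __name__ == "__main__":
    # "box-a" etc. and "hashes.txt" are hypothetical names.
    for host, argv in node_commands(["box-a", "box-b", "box-c"], "hashes.txt").items():
        print(host, " ".join(argv))
```

Every machine gets the same total and a different node number, so the
shares do not overlap.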

BTW Tavis Ormandy wrote a bit on parallelism:
http://www.openwall.com/lists/john-dev/2012/06/14/5

> > Though it is useful to run several john instances even on 1 machine.
> 
> Can you give a practical example for that?

1) You have 2+ hash formats to attack.
2) You have a GPU to utilize in addition to the CPU, or just 2+ GPUs. (I
think one john instance can't run a GPU together with the CPU or another GPU.)
3) You are short on time and want to dispatch as many attacks as you
can without caring about performance too much (while this is rather common
in a contest environment, I am not sure about real life).
4) You want to dispatch many attacks and then stop the less effective ones.

> > Separation is good. I agree. Though it does not need threading.
> > 
> >> * Now about the threading itself, that's just a piece of the decoupling procedure. Once we properly decouple, we can run any compute intensive or busy wait in different threads. Another aspect is that the QProcess event loop polling for stdin/stdout is happening on the same UI thread in the current implementation. That's a problem since each time the event loop is awaken for the main UI thread it also does quite a lot of other UI stuff instead of just polling the QProcess. A separate thread for that would be so much better and it might even fix one CPU usage problem I noticed.
> > 
> >> I know this all sounds very complicated, but it's not. It's not like we're going to do this in C and POSIX interfaces. We use C++ and Qt and that eases things quite a lot. If you look in the ancient history of Johnny, I intended to do that from the ground up(look for johnParser and johnThread). (https://github.com/shinnok/johnny/tree/9357c12b585fd5270d7c8fca29bf343a1bdbea4f)
> >> Though I think you shun it away for convenience reasons and in absence of some guidance, which was mostly my fault since I wasn't around.
> > 
> > I remember that I removed them because they caused crashes and did not
> > solve freezes (you can't move work with table view to another
> > thread). Though I can't find a mail about it. So let's think I
> > removed them because the less threads the less problems while it works
> > good enough without them as written here:
> > http://openwall.com/lists/john-dev/2011/07/18/4
> > 
> > I think the arguments are too general without examples.
> 
> What kind of examples do you need? Do you realize how easy it is to "move" an object to a thread so that all slots are executed in that separate thread(think JohnClient class)? As easy as QObject::moveToThread(someThread). That's just one example of using QThread. Seriously, you being scared of threads in general is just not the way to go. I'd like you to first shake away that problem, so that we can have an informed discussion of this topic afterwards. How can I help you with that?

Oh, I was not clear about examples: I'd like to see an example of
input files that make Johnny work slowly or freeze.

I am curious: can you move a table view to another thread? Doesn't it
break displaying?

Though you wrote about 15-20% below, so I see a practical use for
threads now. It will be interesting to see whether threads can solve that.

> > 
> > Affection on "CPU usage problem I noticed" is interesting.
> 
> Yep, I previously mentioned that Johnny eats at least 15-20% of one core constantly on my machine without doing nothing. No other app does that on my system. I didn't look into it yet.


> >>> 1.4.4: I'd print commands that can be copy-pasted to term. It may be
> >>> tricky to implement because there should be proper quoting that
> >>> depends onto platform. Though "Ability to select/deselect individual
> >>> hashes" makes it less important.
> >> 
> >> It's more for debugging and learning I think. I'd like that have that for all CLI wrappers if possible, in a perfect world.
> > 
> > I think the complexity with quoting is that it should be done manually
> > (i.e. by Johnny) while it is not needed to call the programs reliably
> > through QProcess.
> 
> Quoting? Do you refer to argument shell escaping?

Yes, shell escaping.
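
A minimal sketch of what I mean, for POSIX shells only. It uses Python's
shlex.quote to turn an argument list into one line that can be pasted
into a terminal. The file names are made up, and Windows cmd.exe needs
different quoting rules that are not handled here.

```python
import shlex


def printable_command(argv):
    """Render an argv list as a single POSIX-shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    # Hypothetical arguments with a space and a single quote in them.
    argv = ["john", "--wordlist=my words.lst", "--format=raw-md5", "it's.txt"]
    print(printable_command(argv))
```

The program itself can still be started with the plain argument list;
the quoting is needed only for the text shown to the user.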

> > 
> >>> 
> >>> 1.5.1 may be started earlier. 1.5.2 is either small (and ok to be done
> >>> at that position) or overlaps with 1.6.2 (though in such form it
> >>> should be small too).
> >>> 
> >>> 1.5.3 may need support from john. We need it anyway so I may implement
> >>> john's part. Though there is a pitfall: there are hashes with 1 format
> >>> in john, some hashes have several identical formats in john (several
> >>> implementations), while some hashes may be of different formats with
> >>> different passwords (md5u vs md5).
> >> 
> >> This is in an important aspect. What are the actual chances of getting any support code for Johnny in at least Jumbo when warranted? Thinking more explicit return codes, error messages and possibly new key interrupts for more in depth or new data into what's JtR actually doing at any point in time.
> > 
> > While jumbo already has a lot of utility interfaces through --list=...
> > option I doubt there will be more specific support for Johnny. Also
> > Johnny have to support core well too.
> > 
> >>> 
> >>> So I'd move 1.5.1 earlier and 1.4.3 later.
> >>> 
> >>>> Sprint 2(week 3 and 4) :  Code, integrate, test version 1.5
> >>>> requirements . *Make
> >>>> builds for major platforms and send it to the list and johnny website.*
> >>> 
> >>> We don't really send releases itself to the list, only announces.
> >>> Johnny does not have a separate website. It is hosted on Openwall's
> >>> wiki:
> >>> http://openwall.info/wiki/john/johnny
> >> 
> >> Mathieu, can you please detail a bit more the builds and release step in your timeline? We need Linux and OS X for now, it would help us in better gauging how much you could help with this and if you need any help.
> >> 
> >> Aleksey, Solar, is it enough if we test and provide deb's for Debian latest and Ubuntu latest in "house"? I personally don't want to test on rpm based distros, mainly since I don't use any anymore. For OS X I'm thinking just a DMG, will include JtR, Qt libs and Johnny build(I'm thinking of testing only Yosemite). If there's trouble with bundling JtR in that I need to know. Johnny should and will be able to look for an existing JtR in PATH and override in settings either way.
> > 
> > Release 1.1 was tested on 6 most popular distros including 3 deb based
> > and 3 rpm based. I tested 32 and 64 bit version in KDE and GNOME. I
> > used a lot of virtual machines. There are spec files for .deb and
> > .rpm. Though the build was around one binary so we don't have source
> > packages: I just packaged one well tested binary into .deb and .rpm
> > and then I tested these packages on various distros.
> 
> Can you take on the rpm based machines once we get to that?

Of course, I'll give you advice. There are no big differences between
deb-based and rpm-based distros for testing Johnny (mostly the name of
the default package manager).

> >>> I guess multiple pwd files session management is quite big task.
> >> 
> >> You sure? I'm thinking of just keeping a track of multiple .pot, .recs and input sources. Just the ability of switching between .recs and a last used history session menu with an option to clear that on demand. We can use QSettings further to keep track of this along with the current app settings.
> > 
> > Oh, does not it include running of multiple john instances at the same time?
> 
> No. There is one question regarding the implementation though. I was hoping we could just use --session to do this, but then we'd either have to teach Johnny to parse a .rec so that we can extract the options and translate them to the UI on session restore or store them using QSettings also.

To restore a session, you just call john with --restore[=name] and it
runs. Currently the "Resume Attack" button uses it (but for 1 session).

Parsing .rec files is not hard, though it is not considered good
style. You can store settings in a file created by Johnny, maybe next to
john's session file. I think it may be used for some "whistles".
Storing in your own file may be interesting because you can store
additional values you need for other features.

Though when storing anything in files, you have to think about sensitive
data and whether the user expects the data to be written there.
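
As a rough sketch of the "own file" idea (in Python for brevity, not Qt):
settings go into a JSON file next to the session file, and only keys from
a whitelist are written, so nothing sensitive lands on disk by accident.
The ".johnny.json" suffix, the key names and the helper names are all
assumptions for the example, not an existing Johnny format.

```python
import json
import os
import tempfile

# Hypothetical whitelist: only these UI settings are ever written to disk.
SAFE_KEYS = {"format", "mode", "wordlist_path"}


def sidecar_path(session_file):
    """Path of Johnny's own settings file next to john's session file."""
    root, _ext = os.path.splitext(session_file)
    return root + ".johnny.json"


def save_ui_settings(session_file, settings):
    """Write whitelisted settings only; return the path that was written."""
    safe = {key: value for key, value in settings.items() if key in SAFE_KEYS}
    path = sidecar_path(session_file)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(safe, handle, indent=2, sort_keys=True)
    return path


def load_ui_settings(session_file):
    """Read settings back; an absent sidecar file means no stored settings."""
    try:
        with open(sidecar_path(session_file), encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


def restore_argv(session_name):
    """Argument list that resumes a named john session."""
    return ["john", f"--restore={session_name}"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        rec = os.path.join(directory, "attack1.rec")
        save_ui_settings(rec, {"format": "raw-md5", "mode": "wordlist"})
        print(load_ui_settings(rec), restore_argv("attack1"))
```

The session itself is still resumed by john through --restore; the extra
file only helps to put the options back into the UI.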


Thanks!

-- 
Regards,
Aleksey Cherepanov