Date: Sun, 10 Jun 2012 23:51:25 +0800
From: orc <>
Subject: Re: Re: Vision for new platform

On Sun, 10 Jun 2012 11:13:11 -0400
Rich Felker <> wrote:

> On Sun, Jun 10, 2012 at 10:52:26PM +0800, orc wrote:
> > If we need no starting and stopping, than this can be already
> > implemented in init scripts. Only a simple program-wrapper that
> > forcibly daemonizes that daemons with "do not fork" option needed.
> > Optionally it can report a pid after fork() before execvp().
> I don't think you're getting the issue at hand. Suppose you want to be
> able to automatically bring down a particular daemon -- perhaps to
> restart it with completely new configuration or to switch to a new
> version of it. This could happen as part of an automated upgrade
> process or under manual admin control.
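
For reference, the wrapper I had in mind for daemons run with a "do not
fork" option could look roughly like the sketch below. It is in Python
only for brevity, the name spawn_detached is made up for this example,
and it is not a full daemonization (no double fork, no redirection of
stdio). The pid is known right after fork(), before execvp() replaces
the child.

```python
import os


def spawn_detached(argv):
    """Fork, put the child in its own session, then exec argv.

    Returns the child's pid to the caller (the "report a pid" step).
    """
    pid = os.fork()
    if pid == 0:
        # Child: leave the caller's session, then become the daemon.
        try:
            os.setsid()
            os.execvp(argv[0], argv)
        finally:
            # Only reached if setsid() or execvp() failed.
            os._exit(127)
    # Parent: the child's pid is already final, exec does not change it.
    return pid
```

An init script would call this with the daemon's command line and
store or print the returned pid.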

'Automated' often becomes a source of problems if the automated
subsystem is not engineered properly. If we want a daemon that is
responsible for the status of other daemons, and that starts and stops
them automatically based on the admin's decisions, then it must be
well engineered and tested in many kinds of situations first.

> Traditional init scripts DO NOT solve this problem. They are extremely
> buggy, ranging from doing things as stupid as killing any instance
> of the daemon

Are you talking about traditional SysV init scripts? Yes, they're
buggy, I fully agree.

> (even one run by a user as opposed to by root with a
> separate config file and running on a separate port)

Killing processes based on uid/gid and cmdline can be achieved with
pkill already,

> to killing
> unrelated processes (by scanning /proc or reading a pid file, then
> subsequently killing the pid which might now belong to a different
> process).

Again, pkill is much better than the "traditional"
"kill $(cat /var/run/" that most init scripts use today
(am I right?).
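
To illustrate the difference, here is a rough sketch of selecting
processes by owner and command line, which is approximately what
"pkill -u <user> -f <pattern>" does. The function names are made up for
this example and the /proc layout assumed is the Linux one. Note that
it only narrows the selection: the process can still exit between the
scan and the kill, so the race you describe is reduced, not removed.

```python
import os
import re


def read_procs(proc_root="/proc"):
    """Yield (pid, uid, cmdline) for each process visible under proc_root."""
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        path = os.path.join(proc_root, entry)
        try:
            # Owner of /proc/<pid>, normally the process's effective uid.
            uid = os.stat(path).st_uid
            with open(os.path.join(path, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited while we were scanning
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace").strip()
        yield int(entry), uid, cmdline


def select_pids(procs, uid, pattern):
    """Return pids owned by uid whose command line matches pattern."""
    rx = re.compile(pattern)
    return [pid for pid, puid, cmd in procs if puid == uid and rx.search(cmd)]
```

With this, a user's private sshd on another port is not selected when
the script asks for uid 0, unlike a blind kill by name.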

> I agree that the problem of daemons crashing or otherwise exiting
> unexpectedly is one that should be fixed in the daemons. Unfortunately
> that's much harder than it sounds. A large portion of the daemons in
> modern use are using "xmalloc" type wrappers that abort
> unconditionally on malloc failure, either directly or by virtue of
> using atrociously-bad libraries like glib that abort without the
> caller's consent.

I fully agree that in reality we have no ideal daemons in this
respect; many of them are unreliable.

> If daemons really didn't exit unexpectedly, the only race condition in
> pid-based approaches to lifetime management would be races between
> multiple scripted administrative actions (e.g. 2 admins trying to down
> the daemon at the same time) which could be fixed by locking at the
> script level.

Hm, to me that situation sounds a bit strange: either the script will
exit with 'daemon already stopped', or it will send an additional
signal to the daemon, which will not harm it much (I leave signal
handlers aside here; most daemons do not crash after a second signal
unless it is SIGKILL).
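
If two admin actions really do have to be serialized, the script-level
locking you mention could be sketched like this. It assumes flock() on
a per-service lock file; the names try_lock and unlock and the lock
path are invented for the example.

```python
import fcntl
import os


def try_lock(path):
    """Try to take an exclusive lock on path without blocking.

    Returns an open fd holding the lock, or None if someone else has it.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        # Another start/stop action is already in progress.
        os.close(fd)
        return None
    return fd


def unlock(fd):
    """Release the lock by closing the descriptor that holds it."""
    os.close(fd)
```

The second admin's script would get None back and could simply report
that another action is in progress.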

I partially agree that such a daemon for monitoring the status of
other daemons should be developed, but I think it should control only
the processes that are critical for the admin, such as:
- the syslog daemon (this happened to me once, when rsyslog crashed
  for no reason)
- possibly various daemons for remote network access, such as sshd (?)
- other daemons, if their task is not to write/read something important
  from disk. For example, database daemons should NOT be restarted.
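
A minimal sketch of that policy, assuming the supervised daemon stays
in the foreground (the "do not fork" mode) so that its exit status can
be collected. The RESTART_OK table and the function names are
hypothetical, chosen only for this example.

```python
import subprocess

# Hypothetical policy table: services that may be restarted automatically.
RESTART_OK = {"syslogd", "sshd"}


def should_restart(name):
    """Only restart services the admin has marked as safe to restart."""
    return name in RESTART_OK


def supervise(name, argv, max_restarts=3):
    """Run argv in the foreground; restart on failure if policy allows.

    Returns the number of times the program was started.
    """
    starts = 0
    while True:
        starts += 1
        status = subprocess.call(argv)
        if status == 0:
            break  # clean exit: nothing to do
        if not should_restart(name) or starts > max_restarts:
            break  # leave it down and let the admin look at it
    return starts
```

A database daemon is simply absent from the table, so after a crash it
stays down until the admin has checked its data.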

P.S. If you are talking about the traditional init scripts that appear
in most distros today, then I fully agree with you on every point you
raised here. I was paranoid and rewrote them from scratch some time
ago.
