Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date: Thu, 3 Nov 2016 10:12:55 -0600
From: Kees Cook <keescook@...omium.org>
To: Lafcadio Wluiki <wluikil@...il.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Andrew Morton <akpm@...ux-foundation.org>, 
	linux-arch <linux-arch@...r.kernel.org>, Jann Horn <jann@...jh.net>, 
	"kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>
Subject: Re: [PATCH 2/2] procfs/tasks: add a simple per-task procfs hidepid= field

On Thu, Nov 3, 2016 at 9:30 AM, Lafcadio Wluiki <wluikil@...il.com> wrote:
> (Third, rebased submission, since first two submissions yielded no replies.)
>
> This adds a new per-task hidepid= flag that is honored by procfs when
> presenting /proc to the user, in addition to the existing hidepid= mount
> option. So far, hidepid= was exclusively a per-pidns setting. Locking
> down a set of processes so that they cannot see other user's processes
> without affecting the rest of the system thus currently requires
> creation of a private PID namespace, with all the complexity it brings,
> including maintaining a stub init process as PID 1 and losing the
> ability to see processes of the same user on the rest of the system.
>
> With this patch all acesss and visibility checks in procfs now
> honour two fields:
>
>         a) the existing hide_pid field in the PID namespace
>         b) the new hide_pid in struct task_struct
>
> Access/visibility is only granted if both fields permit it; the more
> restrictive one wins. By default the new task_struct hide_pid value
> defaults to 0, which means behaviour is not changed from the status quo.
>
> Setting the per-process hide_pid value is done via a new PR_SET_HIDEPID
> prctl() option which takes the same three supported values as the
> hidepid= mount option. The per-process hide_pid may only be increased,
> never decreased, thus ensuring that once applied, processes can never
> escape such a hide_pid jail.  When a process forks it inherits its
> parent's hide_pid value.
>
> Suggested usecase: let's say nginx runs as user "www-data". After
> dropping privileges it may now call:
>
>         …
>         prctl(PR_SET_HIDEPID, 2);
>         …
>
> And from that point on neither nginx itself, nor any of its child
> processes may see processes in /proc anymore that belong to a different
> user than "www-data". Other services running on the same system remain
> unaffected.
>
> This should permit Linux distributions to more comprehensively lock down
> their services, as it allows an isolated opt-in for hidepid= for
> specific services. Previously hidepid= could only be set system-wide,
> and then specific services had to be excluded by group membership,
> essentially a more complex concept of opt-out.
>
> A test-tool that validates this functionality is available here:
>
>         https://paste.fedoraproject.org/412975/71967605/
>
> Signed-off-by: Lafcadio Wluiki <wluikil@...il.com>

I like this idea: it meaningfully reduces attack surface, even though
it doesn't ensure the same confinement as a pid namespace (e.g. a
process with this prctl set can still direct syscalls to pids that are
hidden). However, the attack surface in /proc is relatively large
compared to the syscalls that use pids.

For task launchers, this may overlap nicely with no_new_privs too.

Some suggestions/nits below...

> ---
>  fs/proc/array.c            |  3 +++
>  fs/proc/base.c             |  6 ++++--
>  include/linux/init_task.h  |  1 +
>  include/linux/sched.h      |  1 +
>  include/uapi/linux/prctl.h |  4 ++++
>  kernel/fork.c              |  1 +
>  kernel/sys.c               | 10 ++++++++++
>  7 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/fs/proc/array.c b/fs/proc/array.c
> index 81818ad..ea801e5 100644
> --- a/fs/proc/array.c
> +++ b/fs/proc/array.c
> @@ -163,6 +163,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         const struct cred *cred;
>         pid_t ppid, tpid = 0, tgid, ngid;
>         unsigned int max_fds = 0;
> +       int hide_pid;
>
>         rcu_read_lock();
>         ppid = pid_alive(p) ?
> @@ -183,6 +184,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         task_lock(p);
>         if (p->files)
>                 max_fds = files_fdtable(p->files)->max_fds;
> +       hide_pid = p->hide_pid;
>         task_unlock(p);
>         rcu_read_unlock();
>
> @@ -201,6 +203,7 @@ static inline void task_state(struct seq_file *m, struct pid_namespace *ns,
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->egid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->sgid));
>         seq_put_decimal_ull(m, "\t", from_kgid_munged(user_ns, cred->fsgid));
> +       seq_put_decimal_ull(m, "\nHidePID:\t", hide_pid);
>         seq_put_decimal_ull(m, "\nFDSize:\t", max_fds);
>
>         seq_puts(m, "\nGroups:\t");

This should get an addition to table 1-2 of
Documentation/filesystems/proc.txt that covers the contents of
/proc/$pid/status


> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index ae5e13c..6c9a42b 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -709,7 +709,8 @@ static bool has_pid_permissions(struct pid_namespace *pid,
>                                  struct task_struct *task,
>                                  int hide_pid_min)
>  {
> -       if (pid->hide_pid < hide_pid_min)
> +       if (pid->hide_pid < hide_pid_min &&
> +           current->hide_pid < hide_pid_min)
>                 return true;
>         if (in_group_p(pid->pid_gid))
>                 return true;
> @@ -730,7 +731,8 @@ static int proc_pid_permission(struct inode *inode, int mask)
>         put_task_struct(task);
>
>         if (!has_perms) {
> -               if (pid->hide_pid == HIDEPID_INVISIBLE) {
> +               if (pid->hide_pid == HIDEPID_INVISIBLE ||
> +                   current->hide_pid == HIDEPID_INVISIBLE) {
>                         /*
>                          * Let's make getdents(), stat(), and open()
>                          * consistent with each other.  If a process

Instead of open-coding both of these "||" tests, I think it might be
cleaner to just choose the highest protection value, and use that in
both comparisons. e.g.

    int hide_pid = max(pid->hide_pid, current->hide_pid);

    if (hide_pid < hide_pid_min) ...

    9if (hide_pid == HIDEPID_INVISIBLE) ...


> diff --git a/include/linux/init_task.h b/include/linux/init_task.h
> index 325f649..c87de0e 100644
> --- a/include/linux/init_task.h
> +++ b/include/linux/init_task.h
> @@ -250,6 +250,7 @@ extern struct task_group root_task_group;
>         .cpu_timers     = INIT_CPU_TIMERS(tsk.cpu_timers),              \
>         .pi_lock        = __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),        \
>         .timer_slack_ns = 50000, /* 50 usec default slack */            \
> +       .hide_pid       = 0,                                            \
>         .pids = {                                                       \
>                 [PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),            \
>                 [PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),           \
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 348f51b..3e8ca16 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1572,6 +1572,7 @@ struct task_struct {
>         /* unserialized, strictly 'current' */
>         unsigned in_execve:1; /* bit to tell LSMs we're in execve */
>         unsigned in_iowait:1;
> +       unsigned hide_pid:2; /* per-process procfs hidepid= */
>  #if !defined(TIF_RESTORE_SIGMASK)
>         unsigned restore_sigmask:1;
>  #endif
> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> index a8d0759..ada62b6 100644
> --- a/include/uapi/linux/prctl.h
> +++ b/include/uapi/linux/prctl.h
> @@ -197,4 +197,8 @@ struct prctl_mm_map {
>  # define PR_CAP_AMBIENT_LOWER          3
>  # define PR_CAP_AMBIENT_CLEAR_ALL      4
>
> +/* Per process, non-revokable procfs hidepid= option */
> +#define PR_SET_HIDEPID 48
> +#define PR_GET_HIDEPID 49
> +
>  #endif /* _LINUX_PRCTL_H */
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 623259f..d4fe951 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -1562,6 +1562,7 @@ static __latent_entropy struct task_struct *copy_process(
>  #endif
>
>         p->default_timer_slack_ns = current->timer_slack_ns;
> +       p->hide_pid = current->hide_pid;
>
>         task_io_accounting_init(&p->ioac);
>         acct_clear_integrals(p);
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 89d5be4..c0a1d3e 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2270,6 +2270,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
>         case PR_GET_FP_MODE:
>                 error = GET_FP_MODE(me);
>                 break;
> +       case PR_SET_HIDEPID:
> +               if (arg2 < HIDEPID_OFF || arg2 > HIDEPID_INVISIBLE)
> +                       return -EINVAL;
> +               if (arg2 < me->hide_pid)
> +                       return -EPERM;
> +               me->hide_pid = arg2;
> +               break;
> +       case PR_GET_HIDEPID:
> +               error = put_user((int) me->hide_pid, (int __user *)arg2);
> +               break;
>         default:
>                 error = -EINVAL;
>                 break;
> --
> 2.7.4
>

Since this adds a new prctl interface, it's best to Cc linux-arch
(which I added now).

Thanks for proposing this!

-Kees

-- 
Kees Cook
Nexus Security

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.