Openwall GNU/*/Linux - a small security-enhanced Linux distro for servers
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Thu, 09 Feb 2017 14:32:56 +0100
From: Yann Droneaud <ydroneaud@...eya.com>
To: Kees Cook <keescook@...omium.org>, Thomas Gleixner <tglx@...utronix.de>
Cc: Xing Gao <xgao01@...il.wm.edu>, Jessica Frazelle <me@...sfraz.com>, John
 Stultz <john.stultz@...aro.org>, "Eric W. Biederman"
 <ebiederm@...ssion.com>, Jonathan Corbet <corbet@....net>, Tejun Heo
 <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>, Petr Mladek
 <pmladek@...e.com>, Andrew Morton <akpm@...ux-foundation.org>, Oleg
 Nesterov <oleg@...hat.com>, Nicolas Iooss <nicolas.iooss_linux@....org>,
 Nicolas Pitre <nicolas.pitre@...aro.org>,  Richard Cochran
 <richardcochran@...il.com>, "Paul E. McKenney"
 <paulmck@...ux.vnet.ibm.com>, Michal Marek <mmarek@...e.com>, Josh
 Poimboeuf <jpoimboe@...hat.com>, Dmitry Vyukov <dvyukov@...gle.com>, Olof
 Johansson <olof@...om.net>, Shuah Khan <shuah@...nel.org>,
 linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org, 
 kernel-hardening@...ts.openwall.com, linux-api@...r.kernel.org
Subject: Re: [PATCH v2] time: Remove CONFIG_TIMER_STATS

Hi,

Don't forget to send to linux-api@...r.kernel.org 

Le mercredi 08 février 2017 à 11:26 -0800, Kees Cook a écrit :
> Currently CONFIG_TIMER_STATS exposes process information across
> namespaces:
> 
> kernel/time/timer_list.c print_timer():
> 
>         SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
> 
> /proc/timer_list:
> 
>  #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep,
> cron/2570
> 
> Given that the tracer can give the same information, this patch
> entirely
> removes CONFIG_TIMER_STATS.
> 
> Suggested-by: Thomas Gleixner <tglx@...utronix.de>
> Signed-off-by: Kees Cook <keescook@...omium.org>
> Acked-by: John Stultz <john.stultz@...aro.org>
> ---
> v2:
> - dropped doc comments for removed structure elements; thx 0-day
> builder.
> ---
>  Documentation/timers/timer_stats.txt |  73 ------
>  include/linux/hrtimer.h              |  11 -
>  include/linux/timer.h                |  45 ----
>  kernel/kthread.c                     |   1 -
>  kernel/time/Makefile                 |   1 -
>  kernel/time/hrtimer.c                |  38 ----
>  kernel/time/timer.c                  |  48 +---
>  kernel/time/timer_list.c             |  10 -
>  kernel/time/timer_stats.c            | 425 -----------------------
> ------------
>  kernel/workqueue.c                   |   2 -
>  lib/Kconfig.debug                    |  14 --
>  11 files changed, 2 insertions(+), 666 deletions(-)
>  delete mode 100644 Documentation/timers/timer_stats.txt
>  delete mode 100644 kernel/time/timer_stats.c
> 
> diff --git a/Documentation/timers/timer_stats.txt
> b/Documentation/timers/timer_stats.txt
> deleted file mode 100644
> index de835ee97455..000000000000
> --- a/Documentation/timers/timer_stats.txt
> +++ /dev/null
> @@ -1,73 +0,0 @@
> -timer_stats - timer usage statistics
> -------------------------------------
> -
> -timer_stats is a debugging facility to make the timer (ab)usage in a
> Linux
> -system visible to kernel and userspace developers. If enabled in the
> config
> -but not used it has almost zero runtime overhead, and a relatively
> small
> -data structure overhead. Even if collection is enabled runtime all
> the
> -locking is per-CPU and lookup is hashed.
> -
> -timer_stats should be used by kernel and userspace developers to
> verify that
> -their code does not make unduly use of timers. This helps to avoid
> unnecessary
> -wakeups, which should be avoided to optimize power consumption.
> -
> -It can be enabled by CONFIG_TIMER_STATS in the "Kernel hacking"
> configuration
> -section.
> -
> -timer_stats collects information about the timer events which are
> fired in a
> -Linux system over a sample period:
> -
> -- the pid of the task(process) which initialized the timer
> -- the name of the process which initialized the timer
> -- the function where the timer was initialized
> -- the callback function which is associated to the timer
> -- the number of events (callbacks)
> -
> -timer_stats adds an entry to /proc: /proc/timer_stats
> -
> -This entry is used to control the statistics functionality and to
> read out the
> -sampled information.
> -
> -The timer_stats functionality is inactive on bootup.
> -
> -To activate a sample period issue:
> -# echo 1 >/proc/timer_stats
> -
> -To stop a sample period issue:
> -# echo 0 >/proc/timer_stats
> -
> -The statistics can be retrieved by:
> -# cat /proc/timer_stats
> -
> -While sampling is enabled, each readout from /proc/timer_stats will
> see
> -newly updated statistics. Once sampling is disabled, the sampled
> information
> -is kept until a new sample period is started. This allows multiple
> readouts.
> -
> -Sample output of /proc/timer_stats:
> -
> -Timerstats sample period: 3.888770 s
> -  12,     0 swapper          hrtimer_stop_sched_tick
> (hrtimer_sched_tick)
> -  15,     1 swapper          hcd_submit_urb (rh_timer_func)
> -   4,   959 kedac            schedule_timeout (process_timeout)
> -   1,     0 swapper          page_writeback_init (wb_timer_fn)
> -  28,     0 swapper          hrtimer_stop_sched_tick
> (hrtimer_sched_tick)
> -  22,  2948 IRQ 4            tty_flip_buffer_push
> (delayed_work_timer_fn)
> -   3,  3100 bash             schedule_timeout (process_timeout)
> -   1,     1 swapper          queue_delayed_work_on
> (delayed_work_timer_fn)
> -   1,     1 swapper          queue_delayed_work_on
> (delayed_work_timer_fn)
> -   1,     1 swapper          neigh_table_init_no_netlink
> (neigh_periodic_timer)
> -   1,  2292 ip               __netdev_watchdog_up (dev_watchdog)
> -   1,    23 events/1         do_cache_clean (delayed_work_timer_fn)
> -90 total events, 30.0 events/sec
> -
> -The first column is the number of events, the second column the pid,
> the third
> -column is the name of the process. The forth column shows the
> function which
> -initialized the timer and in parenthesis the callback function which
> was
> -executed on expiry.
> -
> -    Thomas, Ingo
> -
> -Added flag to indicate 'deferrable timer' in /proc/timer_stats. A
> deferrable
> -timer will appear as follows
> -  10D,     1 swapper          queue_delayed_work_on
> (delayed_work_timer_fn)
> -
> diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
> index cdab81ba29f8..e52b427223ba 100644
> --- a/include/linux/hrtimer.h
> +++ b/include/linux/hrtimer.h
> @@ -88,12 +88,6 @@ enum hrtimer_restart {
>   * @base:	pointer to the timer base (per cpu and per clock)
>   * @state:	state information (See bit values above)
>   * @is_rel:	Set if the timer was armed relative
> - * @start_pid:  timer statistics field to store the pid of the task
> which
> - *		started the timer
> - * @start_site:	timer statistics field to store the site
> where the timer
> - *		was started
> - * @start_comm: timer statistics field to store the name of the
> process which
> - *		started the timer
>   *
>   * The hrtimer structure must be initialized by hrtimer_init()
>   */
> @@ -104,11 +98,6 @@ struct hrtimer {
>  	struct hrtimer_clock_base	*base;
>  	u8				state;
>  	u8				is_rel;
> -#ifdef CONFIG_TIMER_STATS
> -	int				start_pid;
> -	void				*start_site;
> -	char				start_comm[16];
> -#endif
>  };
>  
>  /**
> diff --git a/include/linux/timer.h b/include/linux/timer.h
> index 51d601f192d4..5a209b84fd9e 100644
> --- a/include/linux/timer.h
> +++ b/include/linux/timer.h
> @@ -20,11 +20,6 @@ struct timer_list {
>  	unsigned long		data;
>  	u32			flags;
>  
> -#ifdef CONFIG_TIMER_STATS
> -	int			start_pid;
> -	void			*start_site;
> -	char			start_comm[16];
> -#endif
>  #ifdef CONFIG_LOCKDEP
>  	struct lockdep_map	lockdep_map;
>  #endif
> @@ -197,46 +192,6 @@ extern int mod_timer_pending(struct timer_list
> *timer, unsigned long expires);
>   */
>  #define NEXT_TIMER_MAX_DELTA	((1UL << 30) - 1)
>  
> -/*
> - * Timer-statistics info:
> - */
> -#ifdef CONFIG_TIMER_STATS
> -
> -extern int timer_stats_active;
> -
> -extern void init_timer_stats(void);
> -
> -extern void timer_stats_update_stats(void *timer, pid_t pid, void
> *startf,
> -				     void *timerf, char *comm, u32
> flags);
> -
> -extern void __timer_stats_timer_set_start_info(struct timer_list
> *timer,
> -					       void *addr);
> -
> -static inline void timer_stats_timer_set_start_info(struct
> timer_list *timer)
> -{
> -	if (likely(!timer_stats_active))
> -		return;
> -	__timer_stats_timer_set_start_info(timer,
> __builtin_return_address(0));
> -}
> -
> -static inline void timer_stats_timer_clear_start_info(struct
> timer_list *timer)
> -{
> -	timer->start_site = NULL;
> -}
> -#else
> -static inline void init_timer_stats(void)
> -{
> -}
> -
> -static inline void timer_stats_timer_set_start_info(struct
> timer_list *timer)
> -{
> -}
> -
> -static inline void timer_stats_timer_clear_start_info(struct
> timer_list *timer)
> -{
> -}
> -#endif
> -
>  extern void add_timer(struct timer_list *timer);
>  
>  extern int try_to_del_timer_sync(struct timer_list *timer);
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 2318fba86277..8461a4372e8a 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -850,7 +850,6 @@ void __kthread_queue_delayed_work(struct
> kthread_worker *worker,
>  
>  	list_add(&work->node, &worker->delayed_work_list);
>  	work->worker = worker;
> -	timer_stats_timer_set_start_info(&dwork->timer);
>  	timer->expires = jiffies + delay;
>  	add_timer(timer);
>  }
> diff --git a/kernel/time/Makefile b/kernel/time/Makefile
> index 976840d29a71..938dbf33ef49 100644
> --- a/kernel/time/Makefile
> +++ b/kernel/time/Makefile
> @@ -15,6 +15,5 @@ ifeq ($(CONFIG_GENERIC_CLOCKEVENTS_BROADCAST),y)
>  endif
>  obj-$(CONFIG_GENERIC_SCHED_CLOCK)		+= sched_clock.o
>  obj-$(CONFIG_TICK_ONESHOT)			+= tick-oneshot.o
> tick-sched.o
> -obj-$(CONFIG_TIMER_STATS)			+= timer_stats.o
>  obj-$(CONFIG_DEBUG_FS)				+=
> timekeeping_debug.o
>  obj-$(CONFIG_TEST_UDELAY)			+= test_udelay.o
> diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
> index c6ecedd3b839..edabde646e58 100644
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -766,34 +766,6 @@ void hrtimers_resume(void)
>  	clock_was_set_delayed();
>  }
>  
> -static inline void timer_stats_hrtimer_set_start_info(struct hrtimer
> *timer)
> -{
> -#ifdef CONFIG_TIMER_STATS
> -	if (timer->start_site)
> -		return;
> -	timer->start_site = __builtin_return_address(0);
> -	memcpy(timer->start_comm, current->comm, TASK_COMM_LEN);
> -	timer->start_pid = current->pid;
> -#endif
> -}
> -
> -static inline void timer_stats_hrtimer_clear_start_info(struct
> hrtimer *timer)
> -{
> -#ifdef CONFIG_TIMER_STATS
> -	timer->start_site = NULL;
> -#endif
> -}
> -
> -static inline void timer_stats_account_hrtimer(struct hrtimer
> *timer)
> -{
> -#ifdef CONFIG_TIMER_STATS
> -	if (likely(!timer_stats_active))
> -		return;
> -	timer_stats_update_stats(timer, timer->start_pid, timer-
> >start_site,
> -				 timer->function, timer->start_comm, 
> 0);
> -#endif
> -}
> -
>  /*
>   * Counterpart to lock_hrtimer_base above:
>   */
> @@ -932,7 +904,6 @@ remove_hrtimer(struct hrtimer *timer, struct
> hrtimer_clock_base *base, bool rest
>  		 * rare case and less expensive than a smp call.
>  		 */
>  		debug_deactivate(timer);
> -		timer_stats_hrtimer_clear_start_info(timer);
>  		reprogram = base->cpu_base ==
> this_cpu_ptr(&hrtimer_bases);
>  
>  		if (!restart)
> @@ -990,8 +961,6 @@ void hrtimer_start_range_ns(struct hrtimer
> *timer, ktime_t tim,
>  	/* Switch the timer base, if necessary: */
>  	new_base = switch_hrtimer_base(timer, base, mode &
> HRTIMER_MODE_PINNED);
>  
> -	timer_stats_hrtimer_set_start_info(timer);
> -
>  	leftmost = enqueue_hrtimer(timer, new_base);
>  	if (!leftmost)
>  		goto unlock;
> @@ -1128,12 +1097,6 @@ static void __hrtimer_init(struct hrtimer
> *timer, clockid_t clock_id,
>  	base = hrtimer_clockid_to_base(clock_id);
>  	timer->base = &cpu_base->clock_base[base];
>  	timerqueue_init(&timer->node);
> -
> -#ifdef CONFIG_TIMER_STATS
> -	timer->start_site = NULL;
> -	timer->start_pid = -1;
> -	memset(timer->start_comm, 0, TASK_COMM_LEN);
> -#endif
>  }
>  
>  /**
> @@ -1217,7 +1180,6 @@ static void __run_hrtimer(struct
> hrtimer_cpu_base *cpu_base,
>  	raw_write_seqcount_barrier(&cpu_base->seq);
>  
>  	__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);
> -	timer_stats_account_hrtimer(timer);
>  	fn = timer->function;
>  
>  	/*
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index ec33a6933eae..82a6bfa0c307 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -571,38 +571,6 @@ internal_add_timer(struct timer_base *base,
> struct timer_list *timer)
>  	trigger_dyntick_cpu(base, timer);
>  }
>  
> -#ifdef CONFIG_TIMER_STATS
> -void __timer_stats_timer_set_start_info(struct timer_list *timer,
> void *addr)
> -{
> -	if (timer->start_site)
> -		return;
> -
> -	timer->start_site = addr;
> -	memcpy(timer->start_comm, current->comm, TASK_COMM_LEN);
> -	timer->start_pid = current->pid;
> -}
> -
> -static void timer_stats_account_timer(struct timer_list *timer)
> -{
> -	void *site;
> -
> -	/*
> -	 * start_site can be concurrently reset by
> -	 * timer_stats_timer_clear_start_info()
> -	 */
> -	site = READ_ONCE(timer->start_site);
> -	if (likely(!site))
> -		return;
> -
> -	timer_stats_update_stats(timer, timer->start_pid, site,
> -				 timer->function, timer->start_comm,
> -				 timer->flags);
> -}
> -
> -#else
> -static void timer_stats_account_timer(struct timer_list *timer) {}
> -#endif
> -
>  #ifdef CONFIG_DEBUG_OBJECTS_TIMERS
>  
>  static struct debug_obj_descr timer_debug_descr;
> @@ -789,11 +757,6 @@ static void do_init_timer(struct timer_list
> *timer, unsigned int flags,
>  {
>  	timer->entry.pprev = NULL;
>  	timer->flags = flags | raw_smp_processor_id();
> -#ifdef CONFIG_TIMER_STATS
> -	timer->start_site = NULL;
> -	timer->start_pid = -1;
> -	memset(timer->start_comm, 0, TASK_COMM_LEN);
> -#endif
>  	lockdep_init_map(&timer->lockdep_map, name, key, 0);
>  }
>  
> @@ -1001,8 +964,6 @@ __mod_timer(struct timer_list *timer, unsigned
> long expires, bool pending_only)
>  		base = lock_timer_base(timer, &flags);
>  	}
>  
> -	timer_stats_timer_set_start_info(timer);
> -
>  	ret = detach_if_pending(timer, base, false);
>  	if (!ret && pending_only)
>  		goto out_unlock;
> @@ -1130,7 +1091,6 @@ void add_timer_on(struct timer_list *timer, int
> cpu)
>  	struct timer_base *new_base, *base;
>  	unsigned long flags;
>  
> -	timer_stats_timer_set_start_info(timer);
>  	BUG_ON(timer_pending(timer) || !timer->function);
>  
>  	new_base = get_timer_cpu_base(timer->flags, cpu);
> @@ -1176,7 +1136,6 @@ int del_timer(struct timer_list *timer)
>  
>  	debug_assert_init(timer);
>  
> -	timer_stats_timer_clear_start_info(timer);
>  	if (timer_pending(timer)) {
>  		base = lock_timer_base(timer, &flags);
>  		ret = detach_if_pending(timer, base, true);
> @@ -1204,10 +1163,9 @@ int try_to_del_timer_sync(struct timer_list
> *timer)
>  
>  	base = lock_timer_base(timer, &flags);
>  
> -	if (base->running_timer != timer) {
> -		timer_stats_timer_clear_start_info(timer);
> +	if (base->running_timer != timer)
>  		ret = detach_if_pending(timer, base, true);
> -	}
> +
>  	spin_unlock_irqrestore(&base->lock, flags);
>  
>  	return ret;
> @@ -1331,7 +1289,6 @@ static void expire_timers(struct timer_base
> *base, struct hlist_head *head)
>  		unsigned long data;
>  
>  		timer = hlist_entry(head->first, struct timer_list,
> entry);
> -		timer_stats_account_timer(timer);
>  
>  		base->running_timer = timer;
>  		detach_timer(timer, true);
> @@ -1868,7 +1825,6 @@ static void __init init_timer_cpus(void)
>  void __init init_timers(void)
>  {
>  	init_timer_cpus();
> -	init_timer_stats();
>  	open_softirq(TIMER_SOFTIRQ, run_timer_softirq);
>  }
>  
> diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
> index afe6cd1944fc..387a3a5aa388 100644
> --- a/kernel/time/timer_list.c
> +++ b/kernel/time/timer_list.c
> @@ -62,21 +62,11 @@ static void
>  print_timer(struct seq_file *m, struct hrtimer *taddr, struct
> hrtimer *timer,
>  	    int idx, u64 now)
>  {
> -#ifdef CONFIG_TIMER_STATS
> -	char tmp[TASK_COMM_LEN + 1];
> -#endif
>  	SEQ_printf(m, " #%d: ", idx);
>  	print_name_offset(m, taddr);
>  	SEQ_printf(m, ", ");
>  	print_name_offset(m, timer->function);
>  	SEQ_printf(m, ", S:%02x", timer->state);
> -#ifdef CONFIG_TIMER_STATS
> -	SEQ_printf(m, ", ");
> -	print_name_offset(m, timer->start_site);
> -	memcpy(tmp, timer->start_comm, TASK_COMM_LEN);
> -	tmp[TASK_COMM_LEN] = 0;
> -	SEQ_printf(m, ", %s/%d", tmp, timer->start_pid);
> -#endif
>  	SEQ_printf(m, "\n");
>  	SEQ_printf(m, " # expires at %Lu-%Lu nsecs [in %Ld to %Ld
> nsecs]\n",
>  		(unsigned long
> long)ktime_to_ns(hrtimer_get_softexpires(timer)),
> diff --git a/kernel/time/timer_stats.c b/kernel/time/timer_stats.c
> deleted file mode 100644
> index afddded947df..000000000000
> --- a/kernel/time/timer_stats.c
> +++ /dev/null
> @@ -1,425 +0,0 @@
> -/*
> - * kernel/time/timer_stats.c
> - *
> - * Collect timer usage statistics.
> - *
> - * Copyright(C) 2006, Red Hat, Inc., Ingo Molnar
> - * Copyright(C) 2006 Timesys Corp., Thomas Gleixner <tglx@...esys.co
> m>
> - *
> - * timer_stats is based on timer_top, a similar functionality which
> was part of
> - * Con Kolivas dyntick patch set. It was developed by Daniel Petrini
> at the
> - * Instituto Nokia de Tecnologia - INdT - Manaus. timer_top's design
> was based
> - * on dynamic allocation of the statistics entries and linear search
> based
> - * lookup combined with a global lock, rather than the static array,
> hash
> - * and per-CPU locking which is used by timer_stats. It was written
> for the
> - * pre hrtimer kernel code and therefore did not take hrtimers into
> account.
> - * Nevertheless it provided the base for the timer_stats
> implementation and
> - * was a helpful source of inspiration. Kudos to Daniel and the
> Nokia folks
> - * for this effort.
> - *
> - * timer_top.c is
> - *	Copyright (C) 2005 Instituto Nokia de Tecnologia - INdT -
> Manaus
> - *	Written by Daniel Petrini <d.pensator@...il.com>
> - *	timer_top.c was released under the GNU General Public
> License version 2
> - *
> - * We export the addresses and counting of timer functions being
> called,
> - * the pid and cmdline from the owner process if applicable.
> - *
> - * Start/stop data collection:
> - * # echo [1|0] >/proc/timer_stats
> - *
> - * Display the information collected so far:
> - * # cat /proc/timer_stats
> - *
> - * This program is free software; you can redistribute it and/or
> modify
> - * it under the terms of the GNU General Public License version 2 as
> - * published by the Free Software Foundation.
> - */
> -
> -#include <linux/proc_fs.h>
> -#include <linux/module.h>
> -#include <linux/spinlock.h>
> -#include <linux/sched.h>
> -#include <linux/seq_file.h>
> -#include <linux/kallsyms.h>
> -
> -#include <linux/uaccess.h>
> -
> -/*
> - * This is our basic unit of interest: a timer expiry event
> identified
> - * by the timer, its start/expire functions and the PID of the task
> that
> - * started the timer. We count the number of times an event happens:
> - */
> -struct entry {
> -	/*
> -	 * Hash list:
> -	 */
> -	struct entry		*next;
> -
> -	/*
> -	 * Hash keys:
> -	 */
> -	void			*timer;
> -	void			*start_func;
> -	void			*expire_func;
> -	pid_t			pid;
> -
> -	/*
> -	 * Number of timeout events:
> -	 */
> -	unsigned long		count;
> -	u32			flags;
> -
> -	/*
> -	 * We save the command-line string to preserve
> -	 * this information past task exit:
> -	 */
> -	char			comm[TASK_COMM_LEN + 1];
> -
> -} ____cacheline_aligned_in_smp;
> -
> -/*
> - * Spinlock protecting the tables - not taken during lookup:
> - */
> -static DEFINE_RAW_SPINLOCK(table_lock);
> -
> -/*
> - * Per-CPU lookup locks for fast hash lookup:
> - */
> -static DEFINE_PER_CPU(raw_spinlock_t, tstats_lookup_lock);
> -
> -/*
> - * Mutex to serialize state changes with show-stats activities:
> - */
> -static DEFINE_MUTEX(show_mutex);
> -
> -/*
> - * Collection status, active/inactive:
> - */
> -int __read_mostly timer_stats_active;
> -
> -/*
> - * Beginning/end timestamps of measurement:
> - */
> -static ktime_t time_start, time_stop;
> -
> -/*
> - * tstat entry structs only get allocated while collection is
> - * active and never freed during that time - this simplifies
> - * things quite a bit.
> - *
> - * They get freed when a new collection period is started.
> - */
> -#define MAX_ENTRIES_BITS	10
> -#define MAX_ENTRIES		(1UL << MAX_ENTRIES_BITS)
> -
> -static unsigned long nr_entries;
> -static struct entry entries[MAX_ENTRIES];
> -
> -static atomic_t overflow_count;
> -
> -/*
> - * The entries are in a hash-table, for fast lookup:
> - */
> -#define TSTAT_HASH_BITS		(MAX_ENTRIES_BITS - 1)
> -#define TSTAT_HASH_SIZE		(1UL << TSTAT_HASH_BITS)
> -#define TSTAT_HASH_MASK		(TSTAT_HASH_SIZE - 1)
> -
> -#define __tstat_hashfn(entry)					
> 	\
> -	(((unsigned long)(entry)->timer       ^			
> 	\
> -	  (unsigned long)(entry)->start_func  ^			
> 	\
> -	  (unsigned long)(entry)->expire_func ^			
> 	\
> -	  (unsigned long)(entry)->pid		) &
> TSTAT_HASH_MASK)
> -
> -#define tstat_hashentry(entry)	(tstat_hash_table +
> __tstat_hashfn(entry))
> -
> -static struct entry *tstat_hash_table[TSTAT_HASH_SIZE]
> __read_mostly;
> -
> -static void reset_entries(void)
> -{
> -	nr_entries = 0;
> -	memset(entries, 0, sizeof(entries));
> -	memset(tstat_hash_table, 0, sizeof(tstat_hash_table));
> -	atomic_set(&overflow_count, 0);
> -}
> -
> -static struct entry *alloc_entry(void)
> -{
> -	if (nr_entries >= MAX_ENTRIES)
> -		return NULL;
> -
> -	return entries + nr_entries++;
> -}
> -
> -static int match_entries(struct entry *entry1, struct entry *entry2)
> -{
> -	return entry1->timer       == entry2->timer	  &&
> -	       entry1->start_func  == entry2->start_func  &&
> -	       entry1->expire_func == entry2->expire_func &&
> -	       entry1->pid	   == entry2->pid;
> -}
> -
> -/*
> - * Look up whether an entry matching this item is present
> - * in the hash already. Must be called with irqs off and the
> - * lookup lock held:
> - */
> -static struct entry *tstat_lookup(struct entry *entry, char *comm)
> -{
> -	struct entry **head, *curr, *prev;
> -
> -	head = tstat_hashentry(entry);
> -	curr = *head;
> -
> -	/*
> -	 * The fastpath is when the entry is already hashed,
> -	 * we do this with the lookup lock held, but with the
> -	 * table lock not held:
> -	 */
> -	while (curr) {
> -		if (match_entries(curr, entry))
> -			return curr;
> -
> -		curr = curr->next;
> -	}
> -	/*
> -	 * Slowpath: allocate, set up and link a new hash entry:
> -	 */
> -	prev = NULL;
> -	curr = *head;
> -
> -	raw_spin_lock(&table_lock);
> -	/*
> -	 * Make sure we have not raced with another CPU:
> -	 */
> -	while (curr) {
> -		if (match_entries(curr, entry))
> -			goto out_unlock;
> -
> -		prev = curr;
> -		curr = curr->next;
> -	}
> -
> -	curr = alloc_entry();
> -	if (curr) {
> -		*curr = *entry;
> -		curr->count = 0;
> -		curr->next = NULL;
> -		memcpy(curr->comm, comm, TASK_COMM_LEN);
> -
> -		smp_mb(); /* Ensure that curr is initialized before
> insert */
> -
> -		if (prev)
> -			prev->next = curr;
> -		else
> -			*head = curr;
> -	}
> - out_unlock:
> -	raw_spin_unlock(&table_lock);
> -
> -	return curr;
> -}
> -
> -/**
> - * timer_stats_update_stats - Update the statistics for a timer.
> - * @timer:	pointer to either a timer_list or a hrtimer
> - * @pid:	the pid of the task which set up the timer
> - * @startf:	pointer to the function which did the timer setup
> - * @timerf:	pointer to the timer callback function of the
> timer
> - * @comm:	name of the process which set up the timer
> - * @tflags:	The flags field of the timer
> - *
> - * When the timer is already registered, then the event counter is
> - * incremented. Otherwise the timer is registered in a free slot.
> - */
> -void timer_stats_update_stats(void *timer, pid_t pid, void *startf,
> -			      void *timerf, char *comm, u32 tflags)
> -{
> -	/*
> -	 * It doesn't matter which lock we take:
> -	 */
> -	raw_spinlock_t *lock;
> -	struct entry *entry, input;
> -	unsigned long flags;
> -
> -	if (likely(!timer_stats_active))
> -		return;
> -
> -	lock = &per_cpu(tstats_lookup_lock, raw_smp_processor_id());
> -
> -	input.timer = timer;
> -	input.start_func = startf;
> -	input.expire_func = timerf;
> -	input.pid = pid;
> -	input.flags = tflags;
> -
> -	raw_spin_lock_irqsave(lock, flags);
> -	if (!timer_stats_active)
> -		goto out_unlock;
> -
> -	entry = tstat_lookup(&input, comm);
> -	if (likely(entry))
> -		entry->count++;
> -	else
> -		atomic_inc(&overflow_count);
> -
> - out_unlock:
> -	raw_spin_unlock_irqrestore(lock, flags);
> -}
> -
> -static void print_name_offset(struct seq_file *m, unsigned long
> addr)
> -{
> -	char symname[KSYM_NAME_LEN];
> -
> -	if (lookup_symbol_name(addr, symname) < 0)
> -		seq_printf(m, "<%p>", (void *)addr);
> -	else
> -		seq_printf(m, "%s", symname);
> -}
> -
> -static int tstats_show(struct seq_file *m, void *v)
> -{
> -	struct timespec64 period;
> -	struct entry *entry;
> -	unsigned long ms;
> -	long events = 0;
> -	ktime_t time;
> -	int i;
> -
> -	mutex_lock(&show_mutex);
> -	/*
> -	 * If still active then calculate up to now:
> -	 */
> -	if (timer_stats_active)
> -		time_stop = ktime_get();
> -
> -	time = ktime_sub(time_stop, time_start);
> -
> -	period = ktime_to_timespec64(time);
> -	ms = period.tv_nsec / 1000000;
> -
> -	seq_puts(m, "Timer Stats Version: v0.3\n");
> -	seq_printf(m, "Sample period: %ld.%03ld s\n",
> (long)period.tv_sec, ms);
> -	if (atomic_read(&overflow_count))
> -		seq_printf(m, "Overflow: %d entries\n",
> atomic_read(&overflow_count));
> -	seq_printf(m, "Collection: %s\n", timer_stats_active ?
> "active" : "inactive");
> -
> -	for (i = 0; i < nr_entries; i++) {
> -		entry = entries + i;
> -		if (entry->flags & TIMER_DEFERRABLE) {
> -			seq_printf(m, "%4luD, %5d %-16s ",
> -				entry->count, entry->pid, entry-
> >comm);
> -		} else {
> -			seq_printf(m, " %4lu, %5d %-16s ",
> -				entry->count, entry->pid, entry-
> >comm);
> -		}
> -
> -		print_name_offset(m, (unsigned long)entry-
> >start_func);
> -		seq_puts(m, " (");
> -		print_name_offset(m, (unsigned long)entry-
> >expire_func);
> -		seq_puts(m, ")\n");
> -
> -		events += entry->count;
> -	}
> -
> -	ms += period.tv_sec * 1000;
> -	if (!ms)
> -		ms = 1;
> -
> -	if (events && period.tv_sec)
> -		seq_printf(m, "%ld total events, %ld.%03ld
> events/sec\n",
> -			   events, events * 1000 / ms,
> -			   (events * 1000000 / ms) % 1000);
> -	else
> -		seq_printf(m, "%ld total events\n", events);
> -
> -	mutex_unlock(&show_mutex);
> -
> -	return 0;
> -}
> -
> -/*
> - * After a state change, make sure all concurrent lookup/update
> - * activities have stopped:
> - */
> -static void sync_access(void)
> -{
> -	unsigned long flags;
> -	int cpu;
> -
> -	for_each_online_cpu(cpu) {
> -		raw_spinlock_t *lock = &per_cpu(tstats_lookup_lock,
> cpu);
> -
> -		raw_spin_lock_irqsave(lock, flags);
> -		/* nothing */
> -		raw_spin_unlock_irqrestore(lock, flags);
> -	}
> -}
> -
> -static ssize_t tstats_write(struct file *file, const char __user
> *buf,
> -			    size_t count, loff_t *offs)
> -{
> -	char ctl[2];
> -
> -	if (count != 2 || *offs)
> -		return -EINVAL;
> -
> -	if (copy_from_user(ctl, buf, count))
> -		return -EFAULT;
> -
> -	mutex_lock(&show_mutex);
> -	switch (ctl[0]) {
> -	case '0':
> -		if (timer_stats_active) {
> -			timer_stats_active = 0;
> -			time_stop = ktime_get();
> -			sync_access();
> -		}
> -		break;
> -	case '1':
> -		if (!timer_stats_active) {
> -			reset_entries();
> -			time_start = ktime_get();
> -			smp_mb();
> -			timer_stats_active = 1;
> -		}
> -		break;
> -	default:
> -		count = -EINVAL;
> -	}
> -	mutex_unlock(&show_mutex);
> -
> -	return count;
> -}
> -
> -static int tstats_open(struct inode *inode, struct file *filp)
> -{
> -	return single_open(filp, tstats_show, NULL);
> -}
> -
> -static const struct file_operations tstats_fops = {
> -	.open		= tstats_open,
> -	.read		= seq_read,
> -	.write		= tstats_write,
> -	.llseek		= seq_lseek,
> -	.release	= single_release,
> -};
> -
> -void __init init_timer_stats(void)
> -{
> -	int cpu;
> -
> -	for_each_possible_cpu(cpu)
> -		raw_spin_lock_init(&per_cpu(tstats_lookup_lock,
> cpu));
> -}
> -
> -static int __init init_tstats_procfs(void)
> -{
> -	struct proc_dir_entry *pe;
> -
> -	pe = proc_create("timer_stats", 0644, NULL, &tstats_fops);
> -	if (!pe)
> -		return -ENOMEM;
> -	return 0;
> -}
> -__initcall(init_tstats_procfs);
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 1d9fb6543a66..072cbc9b175d 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1523,8 +1523,6 @@ static void __queue_delayed_work(int cpu,
> struct workqueue_struct *wq,
>  		return;
>  	}
>  
> -	timer_stats_timer_set_start_info(&dwork->timer);
> -
>  	dwork->wq = wq;
>  	dwork->cpu = cpu;
>  	timer->expires = jiffies + delay;
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index eb9e9a7870fa..132af338d6dd 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -980,20 +980,6 @@ config DEBUG_TIMEKEEPING
>  
>  	  If unsure, say N.
>  
> -config TIMER_STATS
> -	bool "Collect kernel timers statistics"
> -	depends on DEBUG_KERNEL && PROC_FS
> -	help
> -	  If you say Y here, additional code will be inserted into
> the
> -	  timer routines to collect statistics about kernel timers
> being
> -	  reprogrammed. The statistics can be read from
> /proc/timer_stats.
> -	  The statistics collection is started by writing 1 to
> /proc/timer_stats,
> -	  writing 0 stops it. This feature is useful to collect
> information
> -	  about timer usage patterns in kernel and userspace. This
> feature
> -	  is lightweight if enabled in the kernel config but not
> activated
> -	  (it defaults to deactivated on bootup and will only be
> activated
> -	  if some application like powertop activates it
> explicitly).
> -
>  config DEBUG_PREEMPT
>  	bool "Debug preemptible kernel"
>  	depends on DEBUG_KERNEL && PREEMPT && TRACE_IRQFLAGS_SUPPORT
> -- 
> 2.7.4
> 
> 

Powered by blists - more mailing lists

Your e-mail address:

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.