kernel-hardening - Re: Sparse context checking Vs Clang Thread Safety analysis

Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20190521182322.GB15859@himanshu-Vostro-3559>
Date: Tue, 21 May 2019 23:53:23 +0530
From: Himanshu Jha <himanshujha199640@...il.com>
To: Jann Horn <jannh@...gle.com>
Cc: linux-sparse@...r.kernel.org, Johannes Berg <johannes@...solutions.net>,
	Philipp Reisner <philipp.reisner@...bit.com>,
	Lukas Bulwahn <lukas.bulwahn@...il.com>,
	clang-built-linux <clang-built-linux@...glegroups.com>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>
Subject: Re: Sparse context checking Vs Clang Thread Safety analysis

Hi Jann,

On Mon, May 20, 2019 at 09:01:26PM +0200, 'Jann Horn' via Clang Built Linux wrote:
> +kernel-hardening
> 
> On Mon, May 20, 2019 at 6:42 PM Himanshu Jha
> <himanshujha199640@...il.com> wrote:
> > I'm an undergrad student working on Google Summer of Code'19 Project[1]
> > to apply clang thread safety analysis feature on linux kernel to find bugs
> > related to concurrency/race condtions with Lukas & clangbuiltlinux
> > community.
> >
> > Since sparse has similar context checking feature, I started
> > investigating by looking the source and some other resources such as
> > LWN[2] about the internals.
> >
> > `-Wcontext` is my prime focus for now and currently we have:
> >
> > himanshu@...anshu-Vostro-3559:~/linux-next$ make C=2 CF="-Wcontext" 2>&1 >/dev/null | grep -w 'context' | wc -l
> > 772
> >
> > o Why do we have so many open warnings for context imbalance ? Or
> >   Why did we stop at some point annotating the codebase ?
> 
> Many developers don't use sparse, and sparse doesn't support some
> locking patterns that the kernel uses.

I understand little interest in sparse warnings among developers.

I see frequently patches related to fixing codebase such as:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/commit/?h=char-misc-testing&id=324fa64cf4189094bc4df744a9e7214a1b81d845

and by 0-day bot:
https://lore.kernel.org/lkml/201905210520.GS4ztecg%25lkp@intel.com/

> > o Does sparse stores some sort of context counter since we didn't get
> > any warnings for `bad_difflocks` which locks 'lock1' and unlocks 'lock2'
> > ?
> 
> Yes. Sparse currently ignores the context and only has a simple
> counter shared by all locks.
> 
> > o What exactly the usage of `__acquire/__release` ?
> > I have used it to shut up the warning for lockfn & unlockfn above.
> 
> You use those to inform sparse where you're taking a lock or releasing
> a lock; and some parts of the kernel also use it to inform sparse of
> places where a lock is effectively taken, even though there is no
> actual locking call (e.g. when two pointers point to the same object,
> and therefore you only need to actually call the locking function once
> instead of twice).
> 
> [...]
> > So, clang thread safety analysis[3] follows a different mechanism
> > to overcome what we have observed above.
> >
> > I did small analysis on a C program[4] and a device driver[5].
> >
> > Clang analysis has many annotations available to suitable annotate the
> > codebase which can be found in the documentation[3].
> >
> > Quite surprisingly, Philipp proposed[6] `__protected_by` feature which is
> > very similar to `guarded_by`[7] feature implemented in Clang.
> >
> > Similarly, Johannes proposed[8] the same with a different implementation.
> >
> > Questions from both you:
> >
> > o Why was it not deployed in sparse ?
> >
> > o Does the lock protecting the data should be a global variable ?
> >
> > ie.,
> >
> > struct foo {
> >         struct mutex lock;
> >         int balance __protected_by(lock);
> > }
> >
> > Can this be done ? Or lock should be global ?
> >
> > Because clang analysis wants it to be global!
> >
> > There are other attribute restrictions as well for clang analysis:
> > https://github.com/llvm-mirror/clang/blob/master/test/Sema/attr-capabilities.c
> >
> >
> > *Most Important*
> > Could you please point me some critical data examples that you know in
> > the kernel source which should be protected. This would help us a lot!
> 
> The complicated thing in the kernel is that almost any structure
> member can be accessed without locking under some circumstances - for
> example, when a structure is initialized, or when the structure is
> being freed and all other references to the object have gone away. On
> top of that, many fields can be accessed under multiple locking
> mechanisms - e.g. many fields can be read under either a
> spinlock/mutex or in an RCU read-critical section. And there are
> functions that conditionally acquire a lock and signal the state of
> the lock through their return value - for example,
> mutex_lock_killable() and mutex_lock_interruptible().

Yes, I understand.

Maybe TRY_ACQUIRE() would fit such a situation:
https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#try-acquire-bool-try-acquire-shared-bool


> I think that static analysis of locking is a great thing, but the
> kernel doesn't exactly make it easy. In particular, I think it is
> going to require annotations that you can use to tell the compiler
> which state of its lifecycle an object is in (since that can influence
> locking rules), and annotations that tell the compiler what the
> semantics of functions like mutex_lock_killable() are.

I will try with the easy ones in the beginning and share the analysis.


Thanks for your time!
-- 
Himanshu Jha
Undergraduate Student
Department of Electronics & Communication
Guru Tegh Bahadur Institute of Technology
Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.