Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Tue, 21 May 2019 23:53:23 +0530
From: Himanshu Jha <himanshujha199640@...il.com>
To: Jann Horn <jannh@...gle.com>
Cc: linux-sparse@...r.kernel.org, Johannes Berg <johannes@...solutions.net>,
	Philipp Reisner <philipp.reisner@...bit.com>,
	Lukas Bulwahn <lukas.bulwahn@...il.com>,
	clang-built-linux <clang-built-linux@...glegroups.com>,
	Kernel Hardening <kernel-hardening@...ts.openwall.com>
Subject: Re: Sparse context checking Vs Clang Thread Safety analysis

Hi Jann,

On Mon, May 20, 2019 at 09:01:26PM +0200, 'Jann Horn' via Clang Built Linux wrote:
> +kernel-hardening
> 
> On Mon, May 20, 2019 at 6:42 PM Himanshu Jha
> <himanshujha199640@...il.com> wrote:
> > I'm an undergrad student working on Google Summer of Code'19 Project[1]
> > to apply clang thread safety analysis feature on linux kernel to find bugs
> > related to concurrency/race condtions with Lukas & clangbuiltlinux
> > community.
> >
> > Since sparse has similar context checking feature, I started
> > investigating by looking the source and some other resources such as
> > LWN[2] about the internals.
> >
> > `-Wcontext` is my prime focus for now and currently we have:
> >
> > himanshu@...anshu-Vostro-3559:~/linux-next$ make C=2 CF="-Wcontext" 2>&1 >/dev/null | grep -w 'context' | wc -l
> > 772
> >
> > o Why do we have so many open warnings for context imbalance ? Or
> >   Why did we stop at some point annotating the codebase ?
> 
> Many developers don't use sparse, and sparse doesn't support some
> locking patterns that the kernel uses.

I understand little interest in sparse warnings among developers.

I see frequently patches related to fixing codebase such as:
https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git/commit/?h=char-misc-testing&id=324fa64cf4189094bc4df744a9e7214a1b81d845

and by 0-day bot:
https://lore.kernel.org/lkml/201905210520.GS4ztecg%25lkp@intel.com/

> > o Does sparse stores some sort of context counter since we didn't get
> > any warnings for `bad_difflocks` which locks 'lock1' and unlocks 'lock2'
> > ?
> 
> Yes. Sparse currently ignores the context and only has a simple
> counter shared by all locks.
> 
> > o What exactly the usage of `__acquire/__release` ?
> > I have used it to shut up the warning for lockfn & unlockfn above.
> 
> You use those to inform sparse where you're taking a lock or releasing
> a lock; and some parts of the kernel also use it to inform sparse of
> places where a lock is effectively taken, even though there is no
> actual locking call (e.g. when two pointers point to the same object,
> and therefore you only need to actually call the locking function once
> instead of twice).
> 
> [...]
> > So, clang thread safety analysis[3] follows a different mechanism
> > to overcome what we have observed above.
> >
> > I did small analysis on a C program[4] and a device driver[5].
> >
> > Clang analysis has many annotations available to suitable annotate the
> > codebase which can be found in the documentation[3].
> >
> > Quite surprisingly, Philipp proposed[6] `__protected_by` feature which is
> > very similar to `guarded_by`[7] feature implemented in Clang.
> >
> > Similarly, Johannes proposed[8] the same with a different implementation.
> >
> > Questions from both you:
> >
> > o Why was it not deployed in sparse ?
> >
> > o Does the lock protecting the data should be a global variable ?
> >
> > ie.,
> >
> > struct foo {
> >         struct mutex lock;
> >         int balance __protected_by(lock);
> > }
> >
> > Can this be done ? Or lock should be global ?
> >
> > Because clang analysis wants it to be global!
> >
> > There are other attribute restrictions as well for clang analysis:
> > https://github.com/llvm-mirror/clang/blob/master/test/Sema/attr-capabilities.c
> >
> >
> > *Most Important*
> > Could you please point me some critical data examples that you know in
> > the kernel source which should be protected. This would help us a lot!
> 
> The complicated thing in the kernel is that almost any structure
> member can be accessed without locking under some circumstances - for
> example, when a structure is initialized, or when the structure is
> being freed and all other references to the object have gone away. On
> top of that, many fields can be accessed under multiple locking
> mechanisms - e.g. many fields can be read under either a
> spinlock/mutex or in an RCU read-critical section. And there are
> functions that conditionally acquire a lock and signal the state of
> the lock through their return value - for example,
> mutex_lock_killable() and mutex_lock_interruptible().

Yes, I understand.

Maybe TRY_ACQUIRE() would fit such a situation:
https://clang.llvm.org/docs/ThreadSafetyAnalysis.html#try-acquire-bool-try-acquire-shared-bool


> I think that static analysis of locking is a great thing, but the
> kernel doesn't exactly make it easy. In particular, I think it is
> going to require annotations that you can use to tell the compiler
> which state of its lifecycle an object is in (since that can influence
> locking rules), and annotations that tell the compiler what the
> semantics of functions like mutex_lock_killable() are.

I will try with the easy ones in the beginning and share the analysis.


Thanks for your time!
-- 
Himanshu Jha
Undergraduate Student
Department of Electronics & Communication
Guru Tegh Bahadur Institute of Technology

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.