oss-security - Re: CVE Request: More perf security fixes

Message-ID: <20130605194833.GA27861@tassilo.jf.intel.com>
Date: Wed, 5 Jun 2013 12:48:33 -0700
From: Andi Kleen <ak@...ux.jf.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Andi Kleen <ak@...ux.jf.intel.com>, Marcus Meissner <meissner@...e.de>,
        OSS Security List <oss-security@...ts.openwall.com>,
        eranian@...gle.com, security@...nel.org
Subject: Re: CVE Request: More perf security fixes

On Wed, Jun 05, 2013 at 10:23:02AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 04, 2013 at 10:59:33AM -0700, Andi Kleen wrote:
> > > 3. Information leak (??) via perf LBR filter 
> > 
> > Leak + crash actually.
> > 
> > > 
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=6e15eb3ba6c0249c9e8c783517d131b47db995ca
> > > 
> > > commit 6e15eb3ba6c0249c9e8c783517d131b47db995ca
> > > Author: Peter Zijlstra <a.p.zijlstra@...llo.nl>
> > > Date:   Fri May 3 14:11:24 2013 +0200
> > > 
> > >     perf/x86/intel/lbr: Fix LBR filter
> > >     
> > >     The LBR 'from' adddress is under full userspace control; ensure
> > >     we validate it before reading from it.
> > 
> > This patch is known broken and causes additional crashes.
> > There's no updated patch for that so far.
> 
> And yet there's no crash report in my inbox.. how kind of you, Andi.

It was brand new. Normally we only report once we know more.

I just posted it here to stop people from releasing buggy updates.

> 
> And I know you don't agree with the patch, but since you're too lazy to
> provide a better one I didn't think you minded _that_ much :-)

It has nothing to do with the old performance problem
(the patch adding an O(n^2) performance path).

We recently found a correctness problem during some stress testing.
It seems to be a latent bug in the module list handling.

Under some conditions the module list walk from the NMI handler
crashes.

Strangely, it even happens under normal operation; you don't
even need to load or unload a module.

Here's an example oops (unfortunately with a lot of junk):

[ 2041.104399]  [<c26bc64d>] ? do_page_fault+0xd/0x10
[ 2041.109766]  [<c26b9cac>] ? error_code+0x6c/0x74
[ 2041.114943]  [<c20967b8>] ? print_modules+0x38/0xe0
[ 2041.120408]  [<c26b07e6>] ? printk+0x3d/0x3f
[ 2041.125195]  [<c2033d1f>] ? warn_slowpath_common+0x5f/0x80
[ 2041.131342]  [<c26bbfbf>] ? vmalloc_fault+0x5f/0xd2
[ 2041.136807]  [<c26bbfbf>] ? vmalloc_fault+0x5f/0xd2
[ 2041.142277]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.149686]  [<c26bc640>] ? __do_page_fault+0x550/0x550
[ 2041.155546]  [<c2033d62>] ? warn_slowpath_null+0x22/0x30
[ 2041.161499]  [<c26bbfbf>] ? vmalloc_fault+0x5f/0xd2
[ 2041.166967]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.174376]  [<c26bc56d>] ? __do_page_fault+0x47d/0x550
[ 2041.180237]  [<c20137cf>] ? intel_pmu_enable_all+0x1f/0x90
[ 2041.186386]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.193795]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.201206]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.208617]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.216027]  [<c26bc640>] ? __do_page_fault+0x550/0x550
[ 2041.221881]  [<c26bc64d>] ? do_page_fault+0xd/0x10
[ 2041.227250]  [<c26b9cac>] ? error_code+0x6c/0x74
[ 2041.232426]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.239837]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.247248]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.254658]  [<c2090e89>] ? __module_address+0x39/0x80
[ 2041.260415]  [<c2090ee0>] ? __module_text_address+0x10/0x60
[ 2041.266660]  [<f967e0a0>] ? cpufreq_get_measured_perf+0xa0/0xe0 [mperf]
[ 2041.274068]  [<c209674c>] ? is_module_text_address+0x1c/0x50
[ 2041.280411]  [<c20523c7>] ? kernel_text_address+0x47/0x50
[ 2041.286459]  [<c2011167>] ? branch_type+0x47/0x240
[ 2041.291834]  [<c24c1ceb>] ? __cpufreq_driver_getavg+0x4b/0x70
[ 2041.298271]  [<c24c1cec>] ? __cpufreq_driver_getavg+0x4c/0x70
[ 2041.304709]  [<c20117b1>] ? intel_pmu_lbr_read+0x241/0x420
[ 2041.310858]  [<f967e0c0>] ? cpufreq_get_measured_perf+0xc0/0xe0 [mperf]
[ 2041.318270]  [<c201409a>] ? intel_pmu_handle_irq+0x9a/0x360
[ 2041.324516]  [<c206dab0>] ? find_busiest_group+0x110/0x9e0
[ 2041.330668]  [<c26baedb>] ? perf_event_nmi_handler+0x1b/0x20
[ 2041.337008]  [<c26ba641>] ? nmi_handle.isra.0+0x41/0x60
[ 2041.342862]  [<c26ba753>] ? do_nmi+0xf3/0x420

struct module *__module_address(unsigned long addr)
{
        struct module *mod;

        if (addr < module_addr_min || addr > module_addr_max)
                return NULL;

        list_for_each_entry_rcu(mod, &modules, list) {
                if (mod->state == MODULE_STATE_UNFORMED) /* <-- triggers page fault */
                        continue;
                if (within_module_core(addr, mod)
                    || within_module_init(addr, mod))
                        return mod;
        }
        return NULL;
}
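
For readers who want to step through the lookup outside the kernel, here is a
minimal userspace sketch of the same logic. It is only a model under stated
assumptions: the struct layout, the singly linked list, and the helper names
(module_add, note_region, within, module_address) are hypothetical stand-ins,
not the kernel's definitions, and there is no RCU or NMI context here. It does
show that the state check is the first place the loop dereferences a list
entry, which is consistent with the fault being reported on that line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; NOT the kernel's real definitions. */
enum module_state { MODULE_STATE_LIVE, MODULE_STATE_UNFORMED };

struct module {
        enum module_state state;
        uintptr_t core_base, core_size;   /* core text/data region */
        uintptr_t init_base, init_size;   /* init region, may be empty */
        struct module *next;              /* plain list instead of an RCU list */
};

static struct module *modules;
static uintptr_t module_addr_min = UINTPTR_MAX;
static uintptr_t module_addr_max = 0;

/* True if addr lies in [base, base + size). */
static int within(uintptr_t addr, uintptr_t base, uintptr_t size)
{
        return size != 0 && addr >= base && addr - base < size;
}

/* Widen the global [min, max] window to cover one region. */
static void note_region(uintptr_t base, uintptr_t size)
{
        if (size == 0)
                return;
        if (base < module_addr_min)
                module_addr_min = base;
        if (base + size > module_addr_max)
                module_addr_max = base + size;
}

/* Register a module at the head of the list (assumed helper). */
static void module_add(struct module *mod)
{
        assert(mod != NULL);
        note_region(mod->core_base, mod->core_size);
        note_region(mod->init_base, mod->init_size);
        mod->next = modules;
        modules = mod;
}

/* Model of the lookup: cheap range check first, then the list walk. */
static struct module *module_address(uintptr_t addr)
{
        struct module *mod;

        if (addr < module_addr_min || addr > module_addr_max)
                return NULL;

        for (mod = modules; mod != NULL; mod = mod->next) {
                /* First dereference of the list entry happens here. */
                if (mod->state == MODULE_STATE_UNFORMED)
                        continue;
                if (within(addr, mod->core_base, mod->core_size) ||
                    within(addr, mod->init_base, mod->init_size))
                        return mod;
        }
        return NULL;
}
```

In this model, an address outside the window never touches the list at all,
so only addresses that pass the range check can reach the dereference.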



-Andi




-- 
ak@...ux.intel.com -- Speaking for myself only