Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Wed, 14 Mar 2018 10:33:43 -0700
From: Matthew Wilcox <willy@...radead.org>
To: Igor Stoppa <igor.stoppa@...wei.com>
Cc: keescook@...omium.org, david@...morbit.com, rppt@...ux.vnet.ibm.com,
	mhocko@...nel.org, labbott@...hat.com,
	linux-security-module@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, kernel-hardening@...ts.openwall.com
Subject: Re: [RFC PATCH v19 0/8] mm: security: ro protection for dynamic data

On Wed, Mar 14, 2018 at 06:11:22PM +0200, Igor Stoppa wrote:
> On 14/03/18 15:04, Matthew Wilcox wrote:
> > but the principle it uses
> > seems like a better match to me than the rather complex genalloc.
> 
> It uses meta data in a different way than genalloc.
> There is probably a tipping point where one implementation becomes more
> space-efficient than the other.

Certainly there are always tradeoffs in writing a memory allocator.

> Probably page_frag does well with relatively large allocations, while
> genalloc seems to be better for small (few allocation units) allocations.

I don't understand why you would think that.  If you allocate 4096 1-byte
elements, page_frag will just use up a page.  Doing the same thing with
genalloc requires allocating two bits per byte (1kB of bitmap), plus
other overheads.

> Also, in case of high variance in the size of the allocations, genalloc
> requires the allocation unit to be small enough to fit the smallest
> request (otherwise one must accept some slack), while page_frag doesn't
> care if the allocation is small or large.

Right; internal versus external fragmentation.  The bane of memory
allocators ;-)

> page_frag otoh, seems to not support the reuse of space that was freed,
> since there is only

To a certain extent it does.  If you free everything on a page, and that
page is still in the page_frag_cache, it will get reused.

> But could you please explain to what you are referring to, when you say
> that page_frag has "significantly lower overhead" ?

Less CPU time taken per allocation, less metadata stored per object.

> Ex: if the pfree is called only on error paths, is it ok to not claim
> back the memory released, if it's less than one page?

Yes, I think that's a great example.

> To be clear: I do not want to hold to genalloc just because I have
> already implemented it. I can at least sketch a version with page_frag,
> but I would like to understand why its trade-offs are better :-)
> 
> > Just allocate some pages and track the offset within those pages that 
> 
> > is the current allocation point.
> 
> 
> > It's less than 100 lines of code!
> 
> Strictly speaking it is true, but it all relies on other functions,
> which must be rewritten, because they use linear address, while this
> must work with virtual (vmalloc) addresses.

No, that's basically the whole thing.  I think an implementation of
pmalloc which used a page_frag-style allocator would be larger than
100 lines, but I don't think it would have to be significantly larger
than that.

> Also, I see that the code relies a lot on order of allocation.
> I think we had similar discussion wrt compound pages.
> 
> It seems to me wasteful, if I have a request of, say, 5 pages, and I end
> up allocating 8.

Yes, but the other three pages are available for use by the pmalloc pool.
Now, at pmalloc_protect() time, you might well want to release the unused
pages by calling make_alloc_exact() and hand those three pages back to the
page allocator.

> I do not recall anyone giving a justification like:
> "yeah, it uses extra pages, but it's preferable, for reasons X, Y and Z,
> so it's a good trade-off"

Sometimes it is, sometimes it isn't.

> Could it be that it's normal RAM is considered less precious than the
> special memory genalloc is written for, so normal RAM is not really
> proactively reused, while special memory is treated as a more valuable
> resource that should not be wasted?

We're certainly at the point where normal RAM is pretty cheap.  A 16GB
DIMM is $200, so that's $12.50 per gigabyte.  We have more of a problem
with fragmentation than we do with squeezing every last byte out of
the system.

Of course, Linux still runs on tiny systems, and we don't want to
unnecessarily bloat the kernel.  And cachelines are also a precious
resource; the fewer we touch, the faster the system runs.  The bitmap
in genalloc can easily occupy several cachelines; the page_frag allocator
touches a single cacheline for most allocations.

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.