Date: Thu, 2 Sep 2021 15:56:55 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Mike Rapoport <rppt@...nel.org>,
 "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "peterz@...radead.org" <peterz@...radead.org>,
 "keescook@...omium.org" <keescook@...omium.org>,
 "Weiny, Ira" <ira.weiny@...el.com>,
 "linux-hardening@...r.kernel.org" <linux-hardening@...r.kernel.org>,
 "linux-mm@...ck.org" <linux-mm@...ck.org>, "x86@...nel.org"
 <x86@...nel.org>, "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
 "Williams, Dan J" <dan.j.williams@...el.com>,
 "Lutomirski, Andy" <luto@...nel.org>,
 "kernel-hardening@...ts.openwall.com" <kernel-hardening@...ts.openwall.com>,
 "Hansen, Dave" <dave.hansen@...el.com>,
 "shakeelb@...gle.com" <shakeelb@...gle.com>
Subject: Re: [RFC PATCH v2 11/19] mm/sparsemem: Use alloc_table() for table
 allocations

On 9/1/21 09:22, Mike Rapoport wrote:
> On Tue, Aug 31, 2021 at 06:25:23PM +0000, Edgecombe, Rick P wrote:
>> On Tue, 2021-08-31 at 11:55 +0300, Mike Rapoport wrote:
>> > On Mon, Aug 30, 2021 at 04:59:19PM -0700, Rick Edgecombe wrote:
>> <trim> 
>> > > -static void * __meminit vmemmap_alloc_block_zero(unsigned long size, int node)
>> > > +static void * __meminit vmemmap_alloc_table(int node)
>> > >  {
>> > > -	void *p = vmemmap_alloc_block(size, node);
>> > > +	void *p;
>> > > +	if (slab_is_available()) {
>> > > +		struct page *page = alloc_table_node(GFP_KERNEL | __GFP_ZERO, node);
>> > 
>> > This change removes __GFP_RETRY_MAYFAIL|__GFP_NOWARN from the original
>> > gfp that vmemmap_alloc_block() used.
>> Oh, yea good point. Hmm, I guess grouped pages could be aware of that
>> flag too. Would be a small addition, but it starts to grow
>> unfortunately.
>> 
>> > Not sure __GFP_RETRY_MAYFAIL is really needed in
>> > vmemmap_alloc_block_zero() in the first place, though.
>> Looks like due to a real issue:
>> 055e4fd96e95b0eee0d92fd54a26be7f0d3bcad0

That commit added __GFP_REPEAT, but __GFP_RETRY_MAYFAIL, which replaced it,
has since become subtly different.

> I believe the issue was with memory map blocks rather than with page
> tables, but since sparse-vmemmap uses the same vmemmap_alloc_block() for
> both, the GFP flag got stuck with both.
> 
> I'm not familiar enough with reclaim internals to say if
> __GFP_RETRY_MAYFAIL would help much for an order-0 allocation.

For costly allocations (order above PAGE_ALLOC_COSTLY_ORDER),
__GFP_RETRY_MAYFAIL will try harder, hence the RETRY part of the name. For
order-0 the only difference is that it will skip the OOM killer, hence the
MAYFAIL part. The flag usually means the caller has a fallback. I guess in
this case there's no fallback, so allocating without __GFP_RETRY_MAYFAIL
would be better.
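
To make the distinction concrete, here is a minimal userspace sketch of the
behaviour described above. It is a toy model, not kernel code: the TOY_* names,
struct toy_policy and toy_alloc_policy() are hypothetical, and the rule that
costly orders never reach the OOM killer is my reading of the page allocator,
stated here as an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; the real flags live in include/linux/gfp.h. */
#define TOY_GFP_RETRY_MAYFAIL	0x1u
#define TOY_COSTLY_ORDER	3	/* assumed to mirror PAGE_ALLOC_COSTLY_ORDER */

struct toy_policy {
	bool retries_harder;	/* keeps retrying reclaim longer before failing */
	bool may_oom_kill;	/* may fall back to the OOM killer */
};

/* Toy model of how the flag changes allocator behaviour per order. */
static struct toy_policy toy_alloc_policy(int order, unsigned int flags)
{
	bool mayfail = (flags & TOY_GFP_RETRY_MAYFAIL) != 0;
	bool costly = order > TOY_COSTLY_ORDER;
	struct toy_policy p;

	assert(order >= 0);

	/* RETRY part: only matters for costly orders. */
	p.retries_harder = costly && mayfail;
	/* MAYFAIL part: the flag skips the OOM killer.
	 * Assumption: costly orders do not invoke it either way. */
	p.may_oom_kill = !costly && !mayfail;
	return p;
}
```

In this model, toy_alloc_policy(0, TOY_GFP_RETRY_MAYFAIL) differs from
toy_alloc_policy(0, 0) only in may_oom_kill, which is why the flag only makes
sense for an order-0 caller that can cope with a NULL return.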

> Vlastimil, can you comment on this?
>  
>> I think it should not affect PKS tables for now, so maybe I can make
>> separate logic instead. I'll look into it. Thanks.
>> > 
>> > More broadly, maybe it makes sense to split boot time and memory
>> > hotplug paths and use pxd_alloc() for the latter.
>> > 
>> > > +
>> > > +		if (!page)
>> > > +			return NULL;
>> > > +		return page_address(page);
>> > > +	}
>> > >  
>> > > +	p = __earlyonly_bootmem_alloc(node, PAGE_SIZE, PAGE_SIZE, __pa(MAX_DMA_ADDRESS));
>> > 
>> > Opportunistically rename to __earlyonly_memblock_alloc()? ;-)
>> > 
>> Heh, I can. Just grepping, there are several other instances of the
>> "foo_bootmem() only calls foo_memblock()" pattern scattered about. Or
>> maybe I'm missing the distinction.
> 
> Heh, I didn't do s/bootmem/memblock/g, so the foo_bootmem() names are
> reminders that we once had a bootmem allocator.
> Maybe it's a good time to remove them :)
>  
>> <trim>
> 
