Date: Wed, 25 Jan 2023 15:48:37 +0900
From: Dominique MARTINET <dominique.martinet@...ark-techno.com>
To: Rich Felker <dalias@...c.org>
Cc: musl@...ts.openwall.com
Subject: Re: infinite loop in mallocng's try_avail

Rich Felker wrote on Wed, Jan 25, 2023 at 12:53:23AM -0500:
> > > This is really weird, because at the point of the infinite loop, the
> > > new group should not yet be activated (line 163), so
> > > __malloc_context->active[0] should still point to the old active
> > > group. But its avail_mask has all bits set and active_idx is not
> > > corrupted, so try_avail should just have obtained an available slot
> > > from it without ever entering the block at line 120. So I'm confused
> > > how it got to the loop.
> > 
> > try_avail's pm is `__malloc_context->active[0]`, which is overwritten by
> > either dequeue(pm, m) or *pm = m (lines 123,128), so the original
> > m->avail_mask could have been zero, with the next element having a zero
> > freed mask?
> 
> No, avail_mask is only supposed to be able to be nonzero after
> activate_group, which is only called on the head of an active list
> (free.c:86 or malloc.c:163) and which atomically pulls bits off
> freed_mask to move them to avail_mask. If we're observing avail_mask
> nonzero at the point you saw it, some invariant seems to have been
> violated.

Ok, so if I understood that correctly, the second item in the list
should not have avail_mask set, so it having a non-zero value after we
popped the first element is unexpected?

The avail_mask value is 1073741822, which from a naive interpretation
looks a lot like it got through alloc_group ((2<<29)-1, 29 being its
last_idx, which matches active_idx; lines 272 and 280), then
perhaps alloc_slot (avail_mask--) to account for that extra -1... But
that does not explain how it ended up in second place in the active
list, and guessing in a code base I haven't even fully read will only
get so far.
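
As a sanity check on the arithmetic above, here is a minimal sketch. The
helper names full_mask and take_slot0 are made up for illustration and are
not mallocng functions; they only mirror the two steps guessed at above.

```c
#include <stdint.h>

/* Hypothetical helper: mask with bits 0..last_idx set, following the
 * (2 << last_idx) - 1 reading of alloc_group described above. */
static uint32_t full_mask(int last_idx)
{
    return (UINT32_C(2) << last_idx) - 1;
}

/* Hypothetical helper: the "avail_mask--" step. When bit 0 is set,
 * subtracting one clears exactly that bit. */
static uint32_t take_slot0(uint32_t mask)
{
    return mask - 1;
}
```

With last_idx = 29, full_mask gives 1073741823 (0x3fffffff) and
take_slot0 of that gives 1073741822 (0x3ffffffe), the value observed.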


Alright, it does not look like we can get any more information out of
the currently pending process, so I will try to get more traces.

I'll add a circular buffer to log things like the active[0] at entry and
its mask values, then set my board up to reproduce again, which will
probably bring us to next Monday.
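
A circular trace buffer of that kind could look roughly like the sketch
below. All names here (trace_ring, trace_record, trace_get, TRACE_SLOTS)
and the choice of recorded fields are assumptions for illustration, not
the actual instrumentation. It is not thread-safe, so it would need to be
called with the malloc lock held.

```c
#include <stdint.h>

#define TRACE_SLOTS 64  /* assumed size; must be a power of two */

/* One record: the fields one might want to see at try_avail entry. */
struct trace_entry {
    const void *active0;   /* value of active[0] at entry */
    uint32_t avail_mask;
    uint32_t freed_mask;
};

struct trace_ring {
    struct trace_entry slot[TRACE_SLOTS];
    unsigned next;         /* total number of records written so far */
};

/* Append a record, overwriting the oldest one once the ring is full. */
static void trace_record(struct trace_ring *r, const void *active0,
                         uint32_t avail_mask, uint32_t freed_mask)
{
    struct trace_entry *e = &r->slot[r->next++ & (TRACE_SLOTS - 1)];
    e->active0 = active0;
    e->avail_mask = avail_mask;
    e->freed_mask = freed_mask;
}

/* Return the record written `age` calls ago (0 = newest), or a null
 * pointer if it was never written or has already been overwritten. */
static const struct trace_entry *trace_get(const struct trace_ring *r,
                                           unsigned age)
{
    if (age >= r->next || age >= TRACE_SLOTS)
        return 0;
    return &r->slot[(r->next - 1 - age) & (TRACE_SLOTS - 1)];
}
```

Since the ring is a plain static array, it can be dumped from gdb on the
hung process without any extra logging code.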

If there is anything else you'd like to see, please ask.

-- 
Dominique Martinet
