Date: Wed, 15 Apr 2020 11:58:11 -0400
From: Rich Felker <dalias@...c.org>
To: Florian Weimer <fw@...eb.enyo.de>
Cc: Norbert Lange <nolange79@...il.com>, musl@...ts.openwall.com
Subject: Re: [BUG] sysconf implementing _SC_NPROCESSORS_(CONF|ONLN)
 incorrectly

On Wed, Apr 15, 2020 at 11:50:36AM +0200, Florian Weimer wrote:
> * Norbert Lange:
> 
> > How should one deal with this?
> > I understand that the semantics are vague, but given that musl now
> > implements this function, it will make detection and fallback hard
> > (especially as musl doesn't want to be identified by the likes of
> > macros).
> >
> > As it is now, just using the affinity mask definitely can't be useful;
> > an application wanting that behavior should be patched to use that
> > function directly.
> > If musl did not define the _SC_NPROCESSORS_* macros (but still kept
> > the implementation), this could at least be used for compile-time
> > detection. Enabling the current implementation would then be just a
> > matter of explicitly defining those macros.
> 
> _SC_NPROCESSORS_* as implemented in glibc is bad because those values
> are not adjusted by cgroups, so they can grossly overestimate available
> resources.
> 
> The cgroups interfaces themselves are not stable and very complicated.
> I don't think it's a good idea to target them, especially not from
> code that is expected to be linked statically into applications.
> 
> Given that, I'm not sure that glibc's way is a significant
> improvement.  musl should perhaps be changed to cope more gracefully
> with a sched_getaffinity failure, though (by not reporting a UP
> environment by accident).
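
A hypothetical sketch of the graceful handling Florian suggests (not
musl's actual code; the fallback value of 2 is an arbitrary assumption,
chosen only so that failure never looks like a UP environment):

    #define _GNU_SOURCE
    #include <sched.h>

    /* Sketch: derive the CPU count from the affinity mask, but never
     * report a uniprocessor environment just because the syscall
     * failed. */
    static long nprocs_onln(void)
    {
        cpu_set_t set;
        if (sched_getaffinity(0, sizeof set, &set) == 0) {
            long n = CPU_COUNT(&set);
            if (n > 0) return n;
        }
        return 2; /* conservative fallback: don't claim UP on failure */
    }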

For what it's worth, even without the sched_getaffinity failure, it's
still problematic for programs linked to musl to use the values
obtained to omit memory barriers: the process may itself be restricted
to a single core while communicating over shared memory with another
process that's unrestricted, or restricted to a different core.
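
Concretely, the hack in question looks something like this
(hypothetical application code, not from any real program):

    #include <unistd.h>
    #include <stdatomic.h>

    /* Hypothetical application hack: skip the release fence when the
     * system looks uniprocessor. This breaks when the process is
     * pinned to one core but shares memory with a process running on
     * another core. */
    void publish(atomic_int *flag, int *data)
    {
        *data = 42;
        if (sysconf(_SC_NPROCESSORS_ONLN) > 1)
            atomic_thread_fence(memory_order_release);
        atomic_store_explicit(flag, 1, memory_order_relaxed);
    }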

There really should be some documented meaning for the return values,
whereby we decide either that such sketchy application usage is
supported (e.g. document that values less than 2 are never returned,
so that applications doing the hack always use barriers and have no
remaining documented way to determine it's really a UP environment),
or that the application usage is incorrect/buggy (i.e. that the values
may be specific to the cgroup or other resource constraints, possibly
virtualized, and can't be relied on when communicating with processes
that might live outside those constraints).
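
The first option would amount to a one-line clamp in sysconf (a
sketch; count_affinity_cpus is a hypothetical helper, not an existing
function):

    /* Sketch of the first option: never return a value below 2, so
     * the barrier-omission hack always keeps its barriers. */
    long n = count_affinity_cpus(); /* hypothetical helper */
    return n < 2 ? 2 : n;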

Rich
