Date: Tue, 25 Sep 2018 16:54:37 +0200
From: Rabbitstack <rabbitstack7@...il.com>
To: musl@...ts.openwall.com
Subject: Re: setrlimit hangs the process

Sorry. Let me describe the problem in more detail.

The process hangs only when it is launched without root privileges on the
host (Arch Linux x86_64, kernel 4.17.5-1) where the Alpine Docker container
is running. With root privileges it starts up correctly (which is expected,
since it doesn't hit the setrlimit call). The odd part is that on other
hosts it hangs even when started as root. No error messages so far.
Strace output:

$ sudo strace -p 9285

futex(0x2cddfc0, FUTEX_WAIT_PRIVATE, 0, NULL

$ sudo strace -f -p 9285

.....
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
[pid  9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
{tv_sec=1537887068, tv_nsec=51442144}) = -1 ETIMEDOUT (Connection timed out)
[pid  9287] getdents64(10, /* 0 entries */, 2048) = 0
[pid  9287] lseek(10, 0, SEEK_SET)      = 0
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
[pid  9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
{tv_sec=1537887068, tv_nsec=62384239}) = -1 ETIMEDOUT (Connection timed out)
[pid  9287] getdents64(10, /* 0 entries */, 2048) = 0
[pid  9287] lseek(10, 0, SEEK_SET)      = 0
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
[pid  9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
{tv_sec=1537887068, tv_nsec=73251219}) = -1 ETIMEDOUT (Connection timed out)
[pid  9287] getdents64(10, /* 0 entries */, 2048) = 0
[pid  9287] lseek(10, 0, SEEK_SET)      = 0
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
[pid  9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
{tv_sec=1537887068, tv_nsec=84458579}) = -1 ETIMEDOUT (Connection timed out)
[pid  9287] getdents64(10, /* 0 entries */, 2048) = 0
[pid  9287] lseek(10, 0, SEEK_SET)      = 0
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
[pid  9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
{tv_sec=1537887068, tv_nsec=95098614}) = -1 ETIMEDOUT (Connection timed out)
[pid  9287] getdents64(10, /* 0 entries */, 2048) = 0
[pid  9287] lseek(10, 0, SEEK_SET)      = 0
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
[pid  9287] futex(0x7efbff70008c, FUTEX_LOCK_PI_PRIVATE,
{tv_sec=1537887068, tv_nsec=106005502}) = -1 ETIMEDOUT (Connection timed out)
[pid  9287] getdents64(10, /* 0 entries */, 2048) = 0
[pid  9287] lseek(10, 0, SEEK_SET)      = 0
[pid  9287] getdents64(10, /* 14 entries */, 2048) = 336
[pid  9287] tgkill(9285, 9285, SIGRT_2) = 0
.....


I'll try to build a tiny example to isolate the problem and hopefully
provide more feedback.

Thanks

On Tue, Sep 25, 2018 at 4:15 PM Szabolcs Nagy <nsz@...t70.net> wrote:

> * Rabbitstack <rabbitstack7@...il.com> [2018-09-25 14:59:45 +0200]:
> > I'm using the latest golang:alpine Docker image to produce a
> > statically-linked Go binary. Even though I'm able to build the binary,
> > when I run it the process gets stuck during ebpf program loading. I've
> > investigated a bit and found the root cause is the call to setrlimit
> > (this is the offending line
> > https://github.com/iovisor/gobpf/blob/2e314be67b1854ad226f012f08a984e0e89b6da9/elf/elf.go#L105
> > ).
> > Are you aware of such behaviour in musl?
> >
>
> well you could have described what goes wrong in more detail
> (error message, strace output, target platform, are you root, ...)
>
> i assume you are not running this on mips (since there is no
> alpine docker image for mips), which has the issue of
> SYSCALL_RLIM_INFINITY != RLIM_INFINITY
> the kernel side value is different from userspace so musl
> has to translate the value which may go wrong.
>
> nor on x32 (which may have various issues with the raw
> syscalls both in the go code and c code).
>
> increasing rlimit is not allowed by default, so you have to
> ensure you have permissions, musl should have no special
> behaviour with respect to RLIMIT_MEMLOCK, so it's more likely
> that you just don't have bpf and setrlimit permissions.
>
> instead of using a complex system like go + c code + elf loader,
> try a minimal c program to see if the bpf syscall succeeds at
> all in your docker environment.
>
>
