Message-ID: <544A0152.4040201@i-soft.com.cn>
Date: Fri, 24 Oct 2014 15:35:46 +0800
From: 黄建忠 <jianzhong.huang@...oft.com.cn>
To: musl@...ts.openwall.com
Subject: Re: musl pthread/tls issue.

Great clue, thanks.

It's a stack overflow issue.

The default pthread stack size in musl is 81920 bytes, i.e. 80k.
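
For anyone who wants to check the default on their own system, here is a
minimal sketch. It assumes that querying a freshly initialized attribute
object reports the libc default, which is my understanding of both musl and
glibc but is not guaranteed everywhere. The function names are hypothetical,
made up for this example. The second helper just does the subtraction behind
the 56k figure in the quoted message below (80k stack minus the roughly 24k
frame of general_composite_rect).

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: returns the stack size a default-initialized
 * pthread_attr_t reports, or 0 if it cannot be queried. */
size_t default_thread_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Hypothetical helper: bytes left for all other frames once a single
 * frame of frame_size bytes is placed on a stack of stack_size bytes. */
size_t stack_headroom(size_t stack_size, size_t frame_size)
{
    return frame_size < stack_size ? stack_size - frame_size : 0;
}
```

With the numbers from this thread, stack_headroom(81920, 24 * 1024) is
57344 bytes, i.e. 56k.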

I increased the stack size to 8M and the bug disappeared.
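
A rough sketch of that kind of change, assuming the thread is created
directly with pthread_create (gnome-shell goes through GLib threads, so the
real fix sits elsewhere). The names run_with_stack and deep_worker are
hypothetical. The worker touches about 256k of stack, which would not fit in
an 80k thread stack.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: touches ~256 KiB of its own stack, one byte per
 * 4 KiB page, then sets the int pointed to by arg to 1. */
void *deep_worker(void *arg)
{
    volatile char buf[256 * 1024];
    size_t i;

    for (i = 0; i < sizeof buf; i += 4096)
        buf[i] = 1;
    *(int *)arg = buf[0];
    return NULL;
}

/* Hypothetical helper: runs fn(arg) in a new thread whose stack is
 * stack_size bytes and waits for it. Returns 0 or a pthread error code. */
int run_with_stack(size_t stack_size, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return rc;
    return pthread_join(tid, NULL);
}
```

Calling run_with_stack(8 * 1024 * 1024, deep_worker, &flag) corresponds to
the 8M setting I used.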

I had tried adding locks and making local copies, and had even worked out
that it was an overflow issue, but stupidly forgot about the thread stack
size (since the default is sufficient under glibc).

As for webkit, the different webkitgtk codebases behave differently:
2.4.x runs but reports a RangeError exception.
2.6.x (which they call webkitgtk4) uses the same codebase as ewebkit and
segfaults directly.

I guess it's related to the "fastmalloc" of JavaScriptCore.



On 10/22/14 15:45, Szabolcs Nagy wrote:
> * 黄建忠 <jianzhong.huang@...oft.com.cn> [2014-10-22 14:33:01 +0800]:
>> These days, I finished build a bootable x86_64 system(rpm based) include
>> musl/systemd/dracut/gcc-4.9.1/gcc-5/clang-3.5 and wayland/Xorg and the
>> whole GNOME-3.14 desktop(except webkit js segfault issue I mentioned
>> before) with a lot of patches(I will release all of them someday until
>> it reach a stable state.)
>>
>> After a simple try, I found gnome-shell will segfault If I triggered the
>> app list(not always but often).
>>
>> The dmesg report "pool [<some pid>] segfault xxxxxxxxxxx
>> libpixman-xxxxx", That's to say, it segfault in pixman library(A common
>> library used by Xorg and cairo),
>> gdb report it's a thread issue(a thread of gnome-shell) and segfault at
>> the beginning of general_composite_rect function in pixman-general.c,
>> the pointer of argument can not be accessed.
>>
> that's not enough info..
>
> both the webkit js and this crash sounds like thread stack overflow
>
>> That's to say, there must be a problem exist in musl pthread/tls
>> implementation and can be triggered under certain circumstances. Please
>> help to solve it.
>>
> i don't believe that without evidence: general_composite_rect itself
> allocates >24k on the stack, that is about a third of the musl default
> stack size
>
> you can verify it by checking the diff of the top and bottom of the stack
> (gdb backtrace prints the stack pointer, if the diff is >56k when that
> func was entered then this was the problem) or looking at /proc/pid/maps
> and if the crash happened in a guard page after a thread stack
>
> to fix: make the application create a larger thread stack eg 1M
> (pthread_attr_setstacksize, but gnome* will use gthread most likely
> which has different api)
>


-- 
Huang JianZhong
