Date: Sat, 24 Oct 2015 00:00:56 +0200
From: Denys Vlasenko <vda.linux@...glemail.com>
To: Rich Felker <dalias@...c.org>
Cc: musl <musl@...ts.openwall.com>
Subject: Re: [PATCH] configure: add gcc flags for better link-time optimization

On Fri, Oct 23, 2015 at 4:48 PM, Rich Felker <dalias@...c.org> wrote:
>> Minimizing the number of data pages is more important
>> than text pages. A text page is shared among all processes linked
>> to this libc.so; data page is allocated in every process
>> (as soon as even one byte in this page is written to.
>> With only 4 pages in total like in this example, I'm pretty sure
>> all of them get dirtied by libc init, use of stdio or malloc).
>>
>> Make libc (.data + .bss) fit into one page less and you get about
>> as many pages saved as you have processes running.
>
> FYI all the data/bss in libc except a few large objects _easily_ fits
> in a single page.

What's important is how many pages are dirtied.
Here's a test with Aboriginal's x86_64 static busybox:

# sleep 9999 | sh -c 'echo $$; exec ./busybox dd bs=1'
16290

# pmap 16290
16290:   ./busybox dd bs 1
0000000000400000    320K r-x--
/home/srcdevel/aboriginal/a.0/build/root-filesystem-x86_64/usr/bin/busybox
^^^^ text + rodata
000000000064f000      4K rw---
/home/srcdevel/aboriginal/a.0/build/root-filesystem-x86_64/usr/bin/busybox
^^^^ data + start of bss
0000000000650000      8K rw---    [ anon ]
^^^^ the rest of bss

0000000001655000      4K rw---    [ anon ]
^^^^ brk

00007fff57196000    132K rw---    [ stack ]
00007fff571df000      8K r----    [ anon ]
00007fff571e1000      8K r-x--    [ anon ]
ffffffffff600000      4K r-x--    [ anon ]
 total              488K


Thus, for this binary, three RW pages are mapped immediately for .data
and .bss, whichever applet is run.
Are all of these pages touched?

# cat /proc/16290/smaps
00400000-00450000 r-xp 00000000 08:02 1810890
  /home/srcdevel/aboriginal/a.0/build/root-filesystem-x86_64/usr/bin/busybox
Size:                320 kB
...
0064f000-00650000 rw-p 0004f000 08:02 1810890
  /home/srcdevel/aboriginal/a.0/build/root-filesystem-x86_64/usr/bin/busybox
Size:                  4 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
...
00650000-00652000 rw-p 00000000 00:00 0
Size:                  8 kB
Rss:                   4 kB
Pss:                   4 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:         4 kB
Referenced:            4 kB
Anonymous:             4 kB

No. Only two of the three mapped pages are actually touched (Rss and
Private_Dirty are 4 kB in each of the two mappings).
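
For anyone who wants to repeat this count without reading smaps by eye,
here is a rough sketch of a parser. It is not something I used for the
numbers above; the function name dirty_rw_pages, the SAMPLE text, and the
4 kB page size are assumptions for illustration. It sums Private_Dirty
over all writable mappings, so on a live process it would also count
brk and stack unless you filter those out.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, as on the x86_64 system in this test

# Matches an smaps mapping header such as
# "0064f000-00650000 rw-p 0004f000 08:02 1810890"
HEADER = re.compile(r"^([0-9a-f]+)-([0-9a-f]+)\s+(\S{4})\s")


def dirty_rw_pages(smaps_text):
    """Return the number of privately dirtied pages in writable mappings."""
    total_kb = 0
    writable = False
    for line in smaps_text.splitlines():
        m = HEADER.match(line)
        if m:
            # A new mapping starts here; remember whether it is writable.
            writable = "w" in m.group(3)
            continue
        if writable and line.startswith("Private_Dirty:"):
            total_kb += int(line.split()[1])  # value is reported in kB
    return total_kb // PAGE_KB


# Hypothetical sample, trimmed down from the smaps output quoted above.
SAMPLE = """\
00400000-00450000 r-xp 00000000 08:02 1810890 /usr/bin/busybox
Size:                320 kB
Private_Dirty:         0 kB
0064f000-00650000 rw-p 0004f000 08:02 1810890 /usr/bin/busybox
Size:                  4 kB
Private_Dirty:         4 kB
00650000-00652000 rw-p 00000000 00:00 0
Size:                  8 kB
Private_Dirty:         4 kB
"""

print(dirty_rw_pages(SAMPLE))
```

To run it against a live process, pass in the contents of
/proc/<pid>/smaps as the text argument.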

This is pretty impressive. However, this is a small busybox config:
only 31 applets.

I have a complete (~320 applets) 32-bit static busybox config built
against uclibc, and it has only 2 pages of .data+.bss.

I will test and see how close musl can get to that.

I'll continue sending patches that carry over some of the
data size reductions from uclibc to musl.
