Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Sun, 24 May 2015 20:36:48 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: ppc soft-float regression

On Sat, May 23, 2015 at 11:08:09PM -0400, Rich Felker wrote:
> > Attached is a patch that finishes the job by completing option 3. I
> > haven't tested it much yet so I'll hold off on committing it for a
> > while but it seems to work fine (not break anything) on i386.
> > 
> > diff --git a/src/ldso/dynlink.c b/src/ldso/dynlink.c
> > index 93595a0..485bd4f 100644
> > --- a/src/ldso/dynlink.c
> > +++ b/src/ldso/dynlink.c
> > @@ -280,12 +280,17 @@ static void do_relocs(struct dso *dso, size_t *rel, size_t rel_size, size_t stri
> >  			def.dso = dso;
> >  		}
> >  
> > -		int gotplt = (type == REL_GOT || type == REL_PLT);
> > -		if (dso->rel_update_got && !gotplt && stride==2) continue;
> > -
> > -		addend = stride>2 ? rel[2]
> > -			: gotplt || type==REL_COPY ? 0
> > -			: *reloc_addr;
> > +		if (stride > 2) {
> > +			addend = rel[2];
> > +		} else if (type==REL_GOT || type==REL_PLT || type==REL_COPY) {
> > +			addend = 0;
> > +		} else {
> > +			addend = *reloc_addr;
> > +			if (dso->rel_update_got) {
> > +				struct symdef old = find_sym(&ldso, name, 0);
> > +				addend -= (size_t)ldso.base+old.sym->st_value;
> > +			}
> > +		}
> 
> Actually I'm not happy with this patch as-is. It's only valid for
> REL_SYMBOLIC (or REL_SYM_OR_REL with a symbol) type relocations,
> because it's assuming that the value at reloc_addr is sym_val+addend.
> We could restrict reprocessing to just those types, but there are a
> number of other reloc types that could theoretically arise and that we
> should be treating correctly. REL_OFFSET/REL_OFFSET32 probably should
> not appear in libc.so (or anything without TEXTRELs), but if we need
> to support them, we would also need to adjust by (size_t)reloc_addr.
> What's more important, though, are TLS-type relocations which in
> principle could appear if libgcc.a is emulating floating point
> environment for softfloat via TLS. REL_DTPOFF and REL_TLSDESC are
> probably the only ones that would be valid here (only GD model is
> valid in shared libraries) and REL_DTPOFF is trivial to reverse and
> extract an addend, but REL_TLSDESC is relatively complex to handle.
> 
> Sure we could just do REL_SYMBOLIC for now, but if we can't yet solve
> the problem in a future-proof way, I'm not sure there's much value in
> committing the patch at this point, since there's no present issue
> it's fixing.

I think I have a potential solution. What I've wanted to do is backup
the original addends before stage 2 relocation, either by constructing
a RELA table to replace the REL or just backing up the addends
out-of-line and having special logic in do_relocs to pull them instead
of the inline addends (the latter takes 1/3 the space, which is
appealing). Unfortunately, this requires allocation of storage, which
pessimizes archs where this whole issue doesn't matter -- extra
syscalls (mmap) and/or large enough static storage to be safely above
the possible size of the addends.

There's a simple alternative I just came up with though: have
dlstart.c compute the number of REL entries that need their addends
saved and allocate a VLA on its stack for stages 2 and 3 to use. While
the number of addends could be significant, it's many orders of
magnitude smaller than the smallest practical stack sizes we could
actually run with, so it's perfectly safe to put it on the stack.

Here're the basic changes I'd like to make to dlstart.c to implement
this:

1. Remove processing of DT_JMPREL REL/RELA table. All entries in this
   table are necessarily JMP_SLOT type, and thus symbolic, so there's
   nothing stage 1 can do with them anyway. Also, being JMP_SLOT type,
   they have implicit addends of zero if they're REL-type, so there's
   no need to save addends.

2. Remove the loop in dlstart.c that works like a fake function call 3
   times to process DT_JMPREL, DT_REL, and DT_RELA. Instead we just
   need 2 iterations, and now the stride is constant in each, so they
   should simplify down a lot more inline.

3. During the loop that processes DT_REL, count the number of
   non-relative relocations (ones we skip at this stage), then make a
   VLA this size and pass its address to __dls2 as a second argument.

4. Have the do_relocs in stage 2 save addends in this provided array
   before overwriting them, and save its address for use by stage 3.

5. Have the do_relocs in stage 3 (for ldso/libc only) pull addends
   from this array instead of of from inline.

Steps 1 and 2 are purely code removal/simplification and should be
done regardless of whether we move forward on the above program, I
think. Steps 3-5 add some complexity but hardly any code, just a few
lines here and there.

Comments?

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.