Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 12 Oct 2015 11:12:03 -0400
From: Rich Felker <dalias@...c.org>
To: musl@...ts.openwall.com
Subject: Re: [PATCH 2/3] fix matching errors related to i386
 addressing modes in CFI generation script

On Mon, Oct 12, 2015 at 03:58:59PM +0200, Alex Dowad wrote:
> the regexps previously used to identify registers clobbered by MOVs, ADDs,
> and various other operations would erroneously match index registers. In other
> words, the following asm:
> 
>     mov $0, (%eax,%ebx,4)
> 
> ....would cause EBX to be considered as overwritten, which might prevent a
> debugger from displaying a variable's value in a higher stack frame.
> 
> thanks to Rich Felker for noticing this problem.
> ---
>  tools/add-cfi.i386.awk | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/tools/add-cfi.i386.awk b/tools/add-cfi.i386.awk
> index fc0d8cf..bd7932f 100644
> --- a/tools/add-cfi.i386.awk
> +++ b/tools/add-cfi.i386.awk
> @@ -184,13 +184,13 @@ function trashed(register) {
>  }
>  # this does NOT exhaustively check for all possible instructions which could
>  # overwrite a register value inherited from the caller (just the common ones)
> -/mov.*,%e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg2()) }
> -/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)/ {
> +/mov.*,%e(ax|bx|cx|dx|si|di|bp)$/  { trashed(get_reg2()) }
> +/(add|addl|sub|subl|and|or|xor|lea|sal|sar|shl|shr).*,%e(ax|bx|cx|dx|si|di|bp)$/ {
>    trashed(get_reg2())
>  }
> -/^i?mul [^,]*$/                     { trashed("eax"); trashed("edx") }
> -/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)/ { trashed(get_reg2()) }
> -/^i?div/                            { trashed("eax"); trashed("edx") }
> +/^i?mul [^,]*$/                      { trashed("eax"); trashed("edx") }
> +/^i?mul.*,%e(ax|bx|cx|dx|si|di|bp)$/ { trashed(get_reg2()) }
> +/^i?div/                             { trashed("eax"); trashed("edx") }
>  /(dec|inc|not|neg|pop) %e(ax|bx|cx|dx|si|di|bp)/  { trashed(get_reg()) }
>  /cpuid/ { trashed("eax"); trashed("ebx"); trashed("ecx"); trashed("edx") }

Clever. At first I didn't see how this was fixing anything, with the
.* still there, but given that you strip comments and extra
whitespace, anchoring to the end with $ seems to work.

While seeing them separately was useful for seeing how you fixed the
bug, patches 1 and 2 should be merged for commit. All patch 2 is doing
is fixing a bug that patch 1 introduces; together they just form a
non-buggy version of "fix operand order". I can take care of the
merging though.

One other thing I noticed for future improvement: your patterns don't
seem to catch instructions that modify just the low byte or half of a
register. These are fairly uncommon in musl's i386 asm, but for
x86_64, I would estimate a good 50% of register usage uses the 32-bit
half (%e..) of a register rather than the full %r.., and your current
script fails to mark these clobbers at all. Probably the regex should
be something like %[er]?([abcd][xlh]|si|di|bp|...) - I don't recall
the right form for the numbered x86_64 registers' low parts right off,
though.

Rich

Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.