Date: Fri, 7 Dec 2012 08:09:57 +0200
From: Milen Rangelov <>
Subject: Re: bitslice DES on GPU

Why would you want to do that via patching (given that they are
compile-time constants)?

On Fri, Dec 7, 2012 at 6:28 AM, Sayantan Datta <> wrote:

> Hi Milen,
> On Fri, Dec 7, 2012 at 3:02 AM, Milen Rangelov <> wrote:
>> Hello,
>> I did not follow the whole thread (and I should have, bitslice DES on GPU
>> sounds interesting, I did not manage to make it practical though :( ).
>> Why do you want to patch your kernels?
>> I used to patch the kernel binaries for BFI (before AMD mapped it to
>> bitselect()). It was not the IL code that was being patched but rather
>> the binary. I have never patched the AMDIL part. Binary patching is easy
>> because the kernels themselves were coded so that all we needed
>> was to replace one instruction with another. The kernel binary is an ELF
>> file indeed, and from what I remember, it had one or more ELF images embedded
>> in it, so it's like ELF inside ELF. The general idea was to find all
>> occurrences of the instruction to replace and swap each one for another
>> instruction (BFI) that had the same number of operands. There were several
>> potential candidates for such "replacement" instructions, but the
>> BYTEALIGN_INT one was best as it was easy to have it generated from OpenCL
>> code (by using an AMD extension to OpenCL). The VLIW5/VLIW4 ISA is in fact
>> simple: instructions are always 64-bit (though they may have 2 or more
>> operands), and each one encodes the instruction ID, src/dst register IDs, some
>> flags, etc. By exploiting the fact that the instruction ID part is known
>> and some flags should have fixed values, you can (heuristically) find and
>> replace your instructions.
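
The mask-and-replace scan described above could be sketched roughly like this in Python. It assumes the code region has already been isolated and that instructions are little-endian 64-bit words. The mask and opcode values in the usage note are placeholders for illustration, not verified encodings; real values would have to come from the ISA documentation for the target chip.

```python
import struct

def patch_instructions(code, mask, match, replacement_bits):
    """Scan `code` as little-endian 64-bit words. Wherever (word & mask) == match,
    replace the masked bits with `replacement_bits`.
    Returns (patched_bytes, number_of_words_patched)."""
    if len(code) % 8:
        raise ValueError("code length must be a multiple of 8 bytes")
    if (match | replacement_bits) & ~mask:
        raise ValueError("match and replacement_bits must lie inside mask")
    out = bytearray(code)
    count = 0
    for off in range(0, len(code), 8):
        (word,) = struct.unpack_from("<Q", code, off)
        if (word & mask) == match:
            # XOR flips exactly the bits where old and new patterns differ,
            # leaving operands and other fields untouched.
            word ^= match ^ replacement_bits
            struct.pack_into("<Q", out, off, word)
            count += 1
    return bytes(out), count
```

Usage would look like `patched, n = patch_instructions(text, HYP_MASK, HYP_OLD_OP, HYP_NEW_OP)`, where the three `HYP_` names are hypothetical constants you supply. Since the match is heuristic, checking `n` against the number of replacements you expect is a cheap sanity test before loading the result on a GPU.
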
>> In fact I had several versions of this. The first version was dumb: it
>> assumed the ISA code started right after the ELF header, and since that's
>> not true, it tried several alignments and chose the one that produced the most
>> "instructions found" results. This of course was error-prone and I had to
>> implement per-kernel quirks that failed often :) Then I decided that it
>> would be better to parse the ELF file properly and find exactly where the
>> text section of the kernel lies. I failed a lot of times (most attempts
>> ended with GPU crashes :) ) until I found some bitcoin miner code that
>> _reliably_ patched the BFI thing. Then I was rather surprised to find out
>> that we have the ELF-inside-ELF situation. Once I understood that, I was
>> finally able to find out where the binary code starts so that I could
>> reliably patch the opcode.
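
For the ELF-inside-ELF part, a minimal sketch in Python of locating the inner image and its .text section might look like the following. Two assumptions that are mine and not from the description above: the inner image is found by scanning for a second ELF magic past byte 0 (a heuristic, since those four bytes could also occur by chance), and the inner image is a 32-bit little-endian ELF. The function names are made up for illustration.

```python
import struct

ELF_MAGIC = b"\x7fELF"

def find_inner_elf(outer):
    """Offset of the first ELF magic after byte 0, or -1 if there is none."""
    return outer.find(ELF_MAGIC, 1)

def find_text_sections(elf):
    """Return [(offset, size), ...] for every section named .text
    in a 32-bit little-endian ELF image. Offsets are relative to `elf`."""
    if elf[:4] != ELF_MAGIC or elf[4] != 1 or elf[5] != 1:
        raise ValueError("expected a 32-bit little-endian ELF image")
    # ELF32 header: e_shoff is at byte 32; e_shentsize, e_shnum and
    # e_shstrndx are the three 16-bit fields at bytes 46..51.
    (e_shoff,) = struct.unpack_from("<I", elf, 32)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", elf, 46)

    def section(i):
        # An ELF32 section header is ten 32-bit fields:
        # field 0 = sh_name, field 4 = sh_offset, field 5 = sh_size.
        f = struct.unpack_from("<10I", elf, e_shoff + i * e_shentsize)
        return f[0], f[4], f[5]

    _, strtab_off, _ = section(e_shstrndx)
    found = []
    for i in range(e_shnum):
        name_idx, off, size = section(i)
        start = strtab_off + name_idx
        name = elf[start:elf.index(b"\0", start)]
        if name == b".text":
            found.append((off, size))
    return found
```

With `pos = find_inner_elf(binary)`, the code region would then be `binary[pos + off : pos + off + size]` for each `(off, size)` returned by `find_text_sections(binary[pos:])`. Which of the .text sections actually holds the GPU ISA is something to verify per driver version rather than assume.
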
>> Note that this is a rather simple case; more advanced binary
>> patching would be much more complex. I've seen some people in the AMD
>> forums posting about IL patching inside the kernel. I don't really
>> know how that works. Perhaps they compile from source, patch the IL section
>> inside the binary, strip the text section, then pass that to
>> clBuildProgram again to get the final binary, but I am not quite sure about this.
>> Regards,
>> Milen
>>>> I know some people "binary patch" AMD kernels for BFI and stuff but I
>>>> always thought they actually patch IL code in an ELF binary file and then
>>>> load that. This will of course be a lot faster than actually recompiling so
>>>> it might be a better alternative (of course vendor dependent, but I think
>>>> that currently goes for any method).
>>>> I bet Milen would know for sure how to proceed.
> My task is even simpler though. I only need to replace a set of
> compile-time constants. Using build options I can eliminate the source,
> llvmir, and amdil sections and keep only the text/ISA section in the
> binaries. My problem is that I cannot pinpoint any location for patching.
> First I searched the binary for an integer constant and came up with a set
> of locations where that integer occurs. Next I changed the constant and
> repeated the search to get a second set of locations. But I couldn't find
> any common location between the two sets. Should I be looking for the
> character equivalent of the integers (as we do for text files, e.g. string
> matching) instead of searching for binary integers in the header?
> Regards,
> Sayantan
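
The two-build search described above could be sketched in Python as follows. It assumes each constant would appear as a raw little-endian 32-bit value, which is exactly the assumption in question; the helper names are made up for illustration. A plain byte-level diff of the two builds is included because it does not depend on how the compiler encoded the constant.

```python
import struct

def find_offsets(blob, value, fmt="<I"):
    """All byte offsets where `value`, packed with struct format `fmt`, occurs in `blob`."""
    needle = struct.pack(fmt, value)
    hits = []
    pos = blob.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = blob.find(needle, pos + 1)
    return hits

def candidate_sites(blob_a, value_a, blob_b, value_b, fmt="<I"):
    """Offsets that hold value_a in build A and value_b in build B."""
    hits_a = set(find_offsets(blob_a, value_a, fmt))
    hits_b = set(find_offsets(blob_b, value_b, fmt))
    return sorted(hits_a & hits_b)

def diff_offsets(blob_a, blob_b):
    """Byte offsets at which two equal-length binaries differ."""
    if len(blob_a) != len(blob_b):
        raise ValueError("binaries differ in length; offsets are not comparable")
    return [i for i, (a, b) in enumerate(zip(blob_a, blob_b)) if a != b]
```

If `candidate_sites` comes back empty while `diff_offsets` shows changes, the constant is probably not stored verbatim at a fixed offset (it may have been folded into other values or encoded differently), and the bytes that did change are the place to look. If the two builds differ in length, the offsets have shifted and a fixed-offset patch is not applicable as is.
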

