Date: Thu, 6 Dec 2012 23:32:41 +0200
From: Milen Rangelov <gat3way@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: bitslice DES on GPU

Hello,

I did not follow the whole thread (and I should have, bitslice DES on GPU sounds interesting, though I did not manage to make it practical :( ).

Why do you want to patch your kernels?

I used to patch the kernel binaries for BFI (before AMD mapped it to bitselect()). It was not the IL code that was being patched but rather the binary. I have never patched the AMDIL part. Binary patching is easy because the kernels themselves were coded in such a way that all we needed was to replace one instruction with another. The kernel binary is indeed an ELF file, and from what I remember, it had one or more ELF images embedded in it, so it's like ELF inside ELF.

The general idea was to find all occurrences of the instruction to replace and change each one to another instruction (BFI) that had the same number of operands. There were several potential candidates for such "replacement" instructions, but BYTEALIGN_INT was the best one, as it was easy to have it generated from OpenCL code (by using an AMD extension to OpenCL). The VLIW5/VLIW4 ISA is in fact simple: instructions are always 64-bit (though they may have 2 or more operands), and each one consists of the instruction ID, src/dst register IDs, some flags, etc. By exploiting the fact that the instruction ID part is known and that some flags should have a fixed value, you can (heuristically) find and replace your instructions.

In fact I had several versions of this. The first version was dumb. It assumed the ISA code started right after the ELF header, and since that's not true, it tried several alignments and chose the one that produced the most "instructions found" results. This of course was error-prone, and I had to implement per-kernel quirks that failed often :) Then I decided that it would be better to parse the ELF file properly to find exactly where the text section of the kernel lies.
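The heuristic scan described above (match the known instruction-ID and fixed-flag bits of each 64-bit word, swap in the replacement opcode, and guess the alignment by counting hits) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the post: the function names are made up, words are assumed to be little-endian, and the caller has to supply the mask and opcode values, since the real VLIW5/VLIW4 bit layouts are not given in this message.

```python
import struct

WORD = 8  # the post says instructions are always 64 bits wide


def find_matches(code, offset, mask, value):
    """Return byte offsets of 64-bit words w with (w & mask) == value.

    Words are read little-endian starting at `offset` (an assumption).
    """
    hits = []
    for pos in range(offset, len(code) - WORD + 1, WORD):
        (word,) = struct.unpack_from("<Q", code, pos)
        if word & mask == value:
            hits.append(pos)
    return hits


def patch_words(code, offset, mask, old, new):
    """Replace the masked bits `old` with `new` in every matching word.

    All bits outside `mask` (register IDs, other flags) are left untouched,
    which is why the replacement must take the same number of operands.
    Returns (patched_bytes, number_of_words_patched).
    """
    buf = bytearray(code)
    hits = find_matches(code, offset, mask, old)
    for pos in hits:
        (word,) = struct.unpack_from("<Q", buf, pos)
        struct.pack_into("<Q", buf, pos, (word & ~mask) | new)
    return bytes(buf), len(hits)


def best_alignment(code, mask, value, candidates=range(WORD)):
    """The 'dumb' first version: pick the start offset with the most hits."""
    return max(candidates,
               key=lambda off: len(find_matches(code, off, mask, value)))
```

As the post notes, guessing the alignment this way is error-prone: any data that happens to match the mask is patched too, so locating the real text section first is the safer route.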
I failed a lot of times (most attempts ended with GPU crashes :) ) until I found some bitcoin miner code that _reliably_ patched the BFI thing. Then I was rather surprised to find out that we have the ELF-inside-ELF situation. Once I understood that, I was finally able to find out where the binary code starts, so that I could reliably patch the opcode.

Note that this is a rather simple case. For more advanced binary patching, this would become much more complex.

I've seen some people in the AMD forums posting stuff about IL patching inside the kernel. I don't really know how that works. Perhaps they compile from source, patch the IL section inside the binary, strip the text section, then pass that to clBuildProgram again to get the final binary, but I am not quite sure about this.

Regards,
Milen

>>
>> I know some people "binary patch" AMD kernels for BFI and stuff but I
>> always thought they actually patch IL code in an ELF binary file and then
>> load that. This will of course be a lot faster than actually recompiling so
>> it might be a better alternative (of course vendor dependent, but I think
>> that currently goes for any method).
>>
>> I bet Milen would know for sure how to proceed.
>>
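The ELF-inside-ELF layout mentioned in the reply can be probed with a few lines of parsing. The sketch below is an assumption-laden illustration rather than the miner's or the author's code: it assumes the embedded image is a little-endian ELF32 file, it finds candidate images by scanning for the ELF magic (which can produce false positives), and the helper names are hypothetical. It reads only the standard ELF32 header and section-header fields.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def find_elf_images(blob):
    """Return every offset where the ELF magic appears (offset 0 is the outer file)."""
    offsets = []
    pos = blob.find(ELF_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = blob.find(ELF_MAGIC, pos + 1)
    return offsets


def elf32_sections(blob, base=0):
    """Map section name -> (absolute file offset, size) for the image at `base`.

    Assumes a little-endian ELF32 image; raises ValueError otherwise.
    """
    if blob[base:base + 4] != ELF_MAGIC or blob[base + 4:base + 6] != b"\x01\x01":
        raise ValueError("not a little-endian ELF32 image")
    # e_shoff lives at 0x20; e_shentsize, e_shnum, e_shstrndx start at 0x2E.
    (e_shoff,) = struct.unpack_from("<I", blob, base + 0x20)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", blob, base + 0x2E)
    headers = []
    for i in range(e_shnum):
        off = base + e_shoff + i * e_shentsize
        # sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        sh_name, _, _, _, sh_offset, sh_size = struct.unpack_from("<6I", blob, off)
        headers.append((sh_name, sh_offset, sh_size))
    _, str_off, str_size = headers[e_shstrndx]
    strtab = blob[base + str_off:base + str_off + str_size]
    sections = {}
    for sh_name, sh_offset, sh_size in headers:
        end = strtab.find(b"\0", sh_name)
        if end == -1:
            end = len(strtab)
        name = strtab[sh_name:end].decode("ascii", "replace")
        sections[name] = (base + sh_offset, sh_size)
    return sections
```

With the inner image's `.text` offset and size in hand, a patcher only has to scan that byte range instead of guessing where the ISA code begins.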