Follow @Openwall on Twitter for new release announcements and other news
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Mon, 07 May 2012 21:00:36 +0200
From: magnum <john.magnum@...hmail.com>
To: john-dev@...ts.openwall.com
Subject: Re: New RAR OpenCL kernel

On 05/07/2012 09:27 AM, Milen Rangelov wrote:
> I think we can narrow the case further (that's what I did, still not sure
> it's completely safe).
> 
> To understand why a bad stream returns long series of same values, you need
> to have a look at the end of rar_decode_number() function. You can do one
> more stricter check: if rar_decode_number() returns repeatedly
> decode->DecodeNum[0], then it's "likely" a bad stream. Still I think it
> might happen for some valid stream too. But I guess there must be some
> threshold that can be considered relatively safe.

I totally missed that one. Even if decode->DecodeNum[0] is valid several
times, I would *guess* this code is safe:

@@ -387,7 +387,7 @@ int rar_decode_number(unpack_data_t *unpack_data,
struct Decode *decode)
        rar_addbits(unpack_data, bits);

n=decode->DecodePos[bits]+((bit_field-decode->DecodeLen[bits-1])>>(16-bits));
        if (n >= decode->MaxNum) {
-               n=0;
+               return -1;
        }
        rar_dbgmsg("rar_decode_number return(%d)\n", decode->DecodeNum[n]);

...and add code in read_tables() and rar_unpack29() so this -1
propagates to an immediate bailout. Seems to work fine but this alone
doesn't help performance as much as the threshold for repeated numbers.
BTW forget what I wrote seeing 26 of them, I did not reset the counter
properly. Besides, it depends on how/where you count them. I had the
check in read_tables() but it should be inside rar_decode_number()
itself (again returning -1 for bailout) because there are more callers.

magnum


> I think with ZIP we have somewhat similar situation - it is possible
> (though very unlikely) that for a small file the verifier matches, file is
> inflated correctly and checksum matches, yet password candidate is wrong.
> 
> 
> 
> On Mon, May 7, 2012 at 2:59 AM, magnum <john.magnum@...hmail.com> wrote:
> 
>> On 05/05/2012 02:09 AM, Milen Rangelov wrote:
>>> Well, there are two cases here, a PPM block or a LZ block. The PPM case
>>> would very quickly give you an error if stream is invallid and unpack29
>>> would switch back to the pure LZSS case.
>>
>> Do you mean you modified something in unpack29 for this case? When I
>> debugged it I got the impression it already rejects properly in this case.
>>
>>> Then for Lempel-Ziv  it's highly
>>> unlikely that you'd get the same result from rar_decode_number()  several
>>> times at a row unless the stream is bad. The key point here being
>> "several"
>>> and that's where  my concerns are.
>>
>> Nice. I tested a little and I've seen the same result at most 26 times
>> in a row for a valid stream while bad streams can have several hundreds.
>> The question is what threshold would be safe...
>>
>> I think I'll have a look at Jim's pkzip code again, maybe something in
>> there is usable for the LZ case.
>>
>> magnum
>>
>>
>>
>>> On Sat, May 5, 2012 at 2:45 AM, magnum <john.magnum@...hmail.com> wrote:
>>>
>>>> On 05/04/2012 08:19 PM, Milen Rangelov wrote:
>>>>> OK, my tests are all successful until now. Speed is relatively the
>> same,
>>>>> doesn't matter whether the compressed file is a 4KB text file, or a
>> 50MB
>>>>> avi movie :) Unfortunately there is still overhead as compared to the
>> HE
>>>>> case, speed being somewhat 20% slower. I am still not 100% whether this
>>>>> would hold true in all cases and if I am wrong, it might happen that
>> for
>>>>> some archives we can get false negatives.
>>>>>
>>>>> I am not yet 100% convinced I am correct yet though. The approach is
>>>> based
>>>>> on 2 assumptions, first one can easily be proved, the second one is
>>>> related
>>>>> to the LZ algorithm and I cannot yet prove what I check is safe and
>> can't
>>>>> occur in a valid LZ stream.
>>>>
>>>> This is good news Milen, just what I hoped for. I am SURE there are such
>>>> short-cuts because cRARk runs full speed no matter what you throw at it.
>>>> Please do share your findings, you know I would.
>>>>
>>>> magnum
>>>>
>>>>
>>>
>>
>>
>>
> 


Powered by blists - more mailing lists

Confused about mailing lists and their use? Read about mailing lists on Wikipedia and check out these guidelines on proper formatting of your messages.