```Date: Tue, 05 Apr 2011 19:13:02 +0200
From: klondike <klondike@...cosoft.es>
To: oss-security@...ts.openwall.com
Subject: Re: A new way of writing secure data backups, combining

El 05/04/11 07:17, Gareth Randall escribió:
> Hi,
>
> I have published a free software project called "Triplyx", which
> writes data to a set of three storage devices in such a way that if
> any one of them is lost or stolen, it cannot be used to recover the
> data. Any two storage devices can be brought together to recover the
> data. It was created for use with offsite data backups.
>
> The concept is simple, although I have never seen it done in a
> commercial or open source product.
Well long ago I tried to make a similar thing to divide a private key so
a x out of the n people allowed to see the document could recover the
data, the problem is that although the solution was trivial for small
numbers it started getting complex as the n increased.

Anyway there is a way to increase the amount of data ciphered. We take
D1,D2 as two chunks of data and A and B as two one time pads:
D1^A and B
D2^B and A
D1^D2^A and D1^D2^B

Recovery works as follows:
Case:
D1^A and B
D2^B and A

(D1^A)^A=D1
(D2^B)^B=D2

Case:
D1^A and B
D1^D2^A and D1^D2^B

(D1^A)^(D1^D2^A)=D2
(D1^D2^B)^(D2)^(B)=D1

Case:
D2^B and A
D1^D2^A and D1^D2^B

(D2^B)^(D1^D2^B) = D1
(D1^D2^B)^(D1)^(A)=D2

Cool isn't it I have reduced your 6 bytes of cipher+data for byte of
data into 6 bytes of data for 2 bytes of data so I basically doubled
your performance :P Also I only need two sources of randomness instead
of three.

Now let's see what the attacker can get out of this:
With D1^A and B -> D1^A^B
With D2^B and A -> D2^A^B
With D1^D2^A and D1^D2^B -> A^B

Only problem is that you need a way to tag the disks as 1, 2 or 3, but
that'd be just two bits or a label ;-) And an extra bit telling whether
we count or not the last data. It can also take an extra byte at the end
when the number of bytes is odd (why should someone want to do that) in
that case simply make D2 a random value.

Another implementation trick: on disk 3 you can do first (D1^D2) then
compute (D1^D2)^A and (D1^D2)^B to make only 3 xors.

Of course the faster way to recover data involves disks 1 and 2 but with
3 disks we can also check the data for some errors look:
D1^A and B
D2^B and A
D1^D2^A and D1^D2^B

Get data:
(D1^A)^(A)=D1
(D2^B)^(B)=D2

Or:
(D1^D2^B)^(D2^B)=D1
(D1^D2^A)^(D1^A)=D2

(Efficiency detail: applying the first method once and the second one
twice will result in 4 access to each of the disks, so it can be used
smartly to maximize throughput).

Check data:
(D1^D2^B)^(D2^B)=D1=(D1^A)^(A)
(D1^D2^A)^(D1^A)=D2=(D2^B)^(B)

Note that chained errors may cause undetectable data corruption for that
an recovery the usage of FEC codes is highly recommended.

All I ask if you decide to use this is proper credit. If you aren't
going to use it tell me because I may then try to make a fork to make use.

If you are interested I can also write sse and mmx routines to handle
the data in a more efficient way (if the data is properly aligned :P).

klondike

PD: Some free C code under a GPL license I wrote:
Released under GPLv3, contact me on klondike ( a t ) xiscosoft ( d o t )
/*This macros will read and of data and split and encrypt them in such a
way that at least 2 out of the three streams are needed to recover it*/
/*In a similar way they can also recover the data*/
#define disk1(d1, d2, a, b, o1, o2) {(o1)=((d1)^(a)); (o2)=(b);}
#define disk2(d1, d2, a, b, o1, o2) {(o1)=((d2)^(b)); (o2)=(a);}
#define disk3(d1, d2, a, b, o1, o2) {(o1)=((d1)^(d2)); (o2)=(o1)^(b);
(o1)^=(a);}

#define recoverdisk12(d1o1, d1o2, d2o1, d2o1, d1, d2)
{(d1)=(d1o1)^(d2o2); (d2)=(d1o2)^(d2o1);}
#define recoverdisk13(d1o1, d1o2, d3o1, d3o1, d1, d2)
{(d2)=(d1o1)^(d3o1); (d1)=(d3o2)^(d1o2)^(d2);}
#define recoverdisk23(d2o1, d2o2, d3o1, d3o1, d1, d2)
{(d1)=(d2o1)^(d3o2); (d2)=(d3o1)^(d2o2)^(d1);}