Date: Thu, 12 Jul 2012 15:43:29 +0400
From: Aleksey Cherepanov <aleksey.4erepanov@...il.com>
To: john-dev@...ts.openwall.com
Subject: Re: Re: Aleksey's status report #10

On Thu, Jul 12, 2012 at 01:16:16PM +0200, Frank Dittrich wrote:
> On 07/12/2012 12:53 PM, Aleksey Cherepanov wrote:
> > The opposite problem: how to find the same files? In general before
> > addition we should compare new file with all existing in the store. I
> > think I will speed it up with index file (consisted of checksums) in
> > the store (I will not push that file to avoid conflicts). Though there
> > is a race condition: two users could add the same file in parallel. So
> > we could get two equal files. But it does not really matter.
>
> I am not a git expert, but:

I am not a git expert either.

> Can't you define a pre-commit hook which either computes sha1sum of a
> file, and commits the file under this name instead of the name given by
> the user, then adds a line to your index file in a post-commit hook?
> Can't possible conflicts be resolved automatically if you keep that
> index file sorted?

It does not seem that conflicts could be resolved by sorting. We touched
on that when we talked about committing additions to a single .pot file.
git is not good at tracking an unordered file that only grows.

Though if we store files named by their checksums, we do not need an
index file at all.

> But that would possibly require to rewrite attack descriptions in a
> similar way, so that they use the checksum instead of the user-supplied
> file name.
> And when you checkout the files, they could be renamed to the
> user-specified file name again.

I think renaming is not needed. We could just store two names: the
original name would be used in the properties of the attack, and the
checksum would be used to refer to the real file when the user runs the
attack.

> Am I missing something? Can this work? Does it make sense at all? How
> hard would it be to implement it in a way that works flawlessly?
I'd say that there could be two different files with the same sha1
checksum, but that does not seem very probable. At least git itself
stores its objects in files named by a sha1 taken from their content.
(Though it could be useful, for my paranoid nerves, to compare files
byte by byte and yell if we have different files with the same sha1sum.)

I guess original file names only make sense if we would like to look at
what is in the store manually (like for debugging during the contest),
because original file names would be easier to understand.

So original file names are a bit more convenient for investigation but
hard to implement and/or slow. I'll store files renamed to their sha1.

Thanks!

Regards,
Aleksey Cherepanov
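A minimal sketch of the scheme described above, only to make the idea concrete: files go into the store under the sha1 of their content, the original name is kept separately for the attack's properties, and an existing file with the same checksum is compared byte by byte. The helper names (sha1_of_file, add_to_store) and the flat store directory are assumptions for illustration, not the actual implementation.

```python
import filecmp
import hashlib
import shutil
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file's content, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_to_store(store_dir, path):
    """Copy `path` into `store_dir` under its SHA-1 name.

    Returns (original_name, checksum): the original name is meant for
    the attack's properties, the checksum for locating the real file.
    Hypothetical helper; the store layout is an assumption.
    """
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    source = Path(path)
    checksum = sha1_of_file(source)
    target = store / checksum
    if target.exists():
        # Same checksum already in the store: compare byte by byte
        # and yell if the contents actually differ.
        if not filecmp.cmp(str(source), str(target), shallow=False):
            raise RuntimeError("different files share sha1 " + checksum)
    else:
        shutil.copyfile(str(source), str(target))
    return source.name, checksum
```

With this layout, adding the same content twice (even under different original names) leaves a single file in the store, so no index file is needed to detect duplicates. The check-then-copy step is not atomic, so two users adding the same file in parallel can still race, but they would write identical content to the same name.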