December 9th, 2005


Geek call for help!

hazelchaz (better known as Chaz Boston-Baden) has a fabulous collection of convention and SoCal fan-gathering pictures.

Or rather, he had a fabulous collection.

A while back, his web provider's disks went belly-up, and munched it.

Now don't start on backups. We've all made mistakes like that. Hell, some of us have made mistakes like that in our professional lives. It happens.

He paid money for data recovery, and got a lot back, but not all of it is intact. Here are a pair of examples:
 

They're both offset errors. When the file was recovered, a block of null data got crammed into the middle. In the first case, it's easily repairable by just slicing the chunks out and reassembling them. In the second, the break happened in the middle of an image block, and it wonked up the colors.

I've exhausted my binary file skillz identifying the offset error, determining that the junk actually is a pile of zeros, and showing that by cleaning up the bad data, the color information in the really wonky images is still there. I haven't been able to manually remove the appropriate number of zeros to patch the file back together. All the JPEG fix utilities I've found either just repair corrupted headers or are tailored to "restore" image quality lost when an image is over-compressed. I haven't found anything that can analyze the data stream and identify junk data.

If you know any serious JPEG code wonks, please pass this along. If there's a way to programmatically identify the bad data and stitch it back together, help would be much appreciated.

johno figured the rest out...

That thing that I mentioned for hazelchaz?

I can't believe it didn't bite me in the nose.

So here's how to fix those corrupted photos.

Download XVI32 (or, if you're not on Windows, some other hex editor).

Drag a corrupted file over to the XVI32 window.

Search for a big string of zeros (I usually go for about 10 blocks of zeros in a row, though you might find a few false matches before you find the big block).
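If you'd rather not eyeball the search, here's a rough sketch of the same hunt in Python. The function name `find_zero_runs` and the default threshold of 10 are my own placeholders, not anything XVI32 provides. It just lists every run of zero bytes at least that long, so you can spot the false matches and the real big block.

```python
import re

def find_zero_runs(data: bytes, min_len: int = 10) -> list[tuple[int, int]]:
    """Return (offset, length) for every run of at least min_len zero bytes."""
    pattern = re.compile(rb"\x00{%d,}" % min_len)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]

# Hypothetical usage on a file read from disk:
#   with open("corrupted.jpg", "rb") as f:
#       for offset, length in find_zero_runs(f.read()):
#           print(f"{length} zeros at offset {offset:#x}")
```

The big junk block should stand out as the run that is far longer than the rest.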

Bookmark the first zero in the run so you can get back to it.

Block-mark the second block of zeros in the run. Scroll to the end of the string of zeros and block-mark the last block. Delete the block and save the file. It'll still be corrupted, but the offset will be reduced greatly.
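The delete step above can be sketched in code too. This is a guess at automating it, not a tested recovery tool: it assumes the junk is the single longest run of zeros in the file (my assumption, to dodge the false matches), keeps the first zero of that run, and hands back its offset as the "bookmark". The name `collapse_longest_zero_run` is made up for illustration.

```python
import re

def collapse_longest_zero_run(data: bytes, min_len: int = 10):
    """Cut the longest zero run down to its first zero byte.

    Returns (new_data, bookmark), where bookmark is the offset of the
    zero that was kept, or (data, None) if no run is long enough.
    """
    runs = list(re.finditer(rb"\x00{%d,}" % min_len, data))
    if not runs:
        return data, None
    # Assumption: the junk block is the longest run of zeros in the file.
    longest = max(runs, key=lambda m: m.end() - m.start())
    bookmark = longest.start()
    # Keep everything up to and including the first zero, then skip the rest.
    return data[:bookmark + 1] + data[longest.end():], bookmark
```

Write the result to a new file rather than over the original, so a bad guess costs nothing.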

So here's where johno found the final steps.

Go back to the bookmark. You'll see a 4-block pattern working its way backwards up the file. It's going to match xx:9x:10:00 (this is why you saved the last 00, to help you recognize the pattern).

Block-mark the bookmarked 00. Work backwards in the file. The 9x will always be the same 90-something (I've seen 94, 95, and 96 in files). The xx will decrement. The smallest repeat I've seen is two of the pattern. Block-mark the first xx in the sequence and delete the block. Save the file. It should be intact.
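Here's a sketch of that last step in Python, under some loud assumptions: that each "block" is one byte, that the xx drops by exactly one per repeat as you walk backwards, and that `bookmark` is the offset of the single 00 you kept. The name `strip_pattern_tail` is hypothetical. If the bytes at the bookmark don't match xx:9x:10:00, it leaves the data alone.

```python
def strip_pattern_tail(data: bytes, bookmark: int) -> bytes:
    """Remove the xx:9x:10:00 groups that end at the bookmarked zero."""

    def is_group(start: int) -> bool:
        # One group is: any byte, a 90-something byte, 0x10, 0x00.
        return (
            start >= 0
            and start + 3 < len(data)
            and data[start + 1] & 0xF0 == 0x90
            and data[start + 2] == 0x10
            and data[start + 3] == 0x00
        )

    first = bookmark - 3  # start of the group that ends on the bookmark
    if not is_group(first):
        return data  # pattern not found, so change nothing

    # Walk backwards while the 9x byte stays the same and xx drops by one
    # (the "drops by one" part is an assumption about what decrement means).
    while (
        is_group(first - 4)
        and data[first - 3] == data[first + 1]
        and data[first - 4] == data[first] - 1
    ):
        first -= 4

    # Delete from the first xx through the bookmarked 00.
    return data[:first] + data[bookmark + 1:]
```

Check the output in an image viewer before trusting it. If the picture is still shifted, the pattern in that file probably doesn't follow these assumptions and it's back to the hex editor.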

I've got a few I cleaned up in my ChazRecovery Gallery.