Friday, 8 June 2007

Editing JPEG photos, thumbnails and your privacy - edited out bits may still be visible

Many know this, but some don't, so I thought it worth a reminder. If you crop a JPEG image (typically a photo taken with a digital camera) to cut out part of the photo, e.g. people who don't want their picture on a website, or you obscure or edit it to try to hide someone's identity, be warned that there's a gotcha. Sometimes the original photograph can still be recovered from the JPEG, at least as a small thumbnail, in its full unedited, unexpurgated glory.

Now I don't really know the ins and outs of how it works and I haven't researched it fully, but it seems that JPGs often contain what's known as EXIF metadata, including a small thumbnail copy of the original image. If you edit a JPEG file using photo editing or graphics software, sometimes the original thumbnail is still preserved and saved along with the edited final image - and can be extracted from it.

UPDATED with quickie explanation: You take a digital snapshot. Behind the scenes, a small thumbnail version of the original pic usually gets automatically created and saved as part of what's known as EXIF data, which is stored with the image as standard. You then edit the original pic e.g. to crop out someone else in the photo, and save that. Well, that also saves the thumbnail with the same file, behind the scenes. Sometimes the original thumbnail info in the EXIF gets updated to reflect your new edited pic. But other times it doesn't get updated - it still shows a thumbnail of the original snapshot. So the large (edited) pic and the thumbnail may no longer match. If you get that situation you might be able to edit the thumbnail too to match, but I don't know how.
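For the curious, here is a rough Python sketch of how such a leftover thumbnail could be pulled out of a JPEG, using only the standard library. It is a sketch under assumptions, not a full EXIF reader: the function names `iter_segments` and `extract_exif_thumbnail` are my own (not from any library), and it assumes the EXIF block sits in an APP1 segment before the image data, that the thumbnail is described in the second image directory (IFD1), and that its offset and length tags (0x0201 and 0x0202) are stored as 4-byte values.

```python
import struct
from typing import Iterator, Optional, Tuple


def iter_segments(jpeg: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (marker, payload) for each header segment before the scan data."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        yield marker, jpeg[pos + 4:pos + 2 + length]
        pos += 2 + length


def extract_exif_thumbnail(jpeg: bytes) -> Optional[bytes]:
    """Return the embedded EXIF thumbnail bytes, or None if none is found."""
    for marker, payload in iter_segments(jpeg):
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            tiff = payload[6:]  # EXIF data is a small TIFF structure
            break
    else:
        return None

    if tiff[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        return None

    try:
        # IFD0 describes the main image; it ends with a pointer to IFD1.
        (ifd0,) = struct.unpack(endian + "I", tiff[4:8])
        (count0,) = struct.unpack(endian + "H", tiff[ifd0:ifd0 + 2])
        link = ifd0 + 2 + 12 * count0
        (ifd1,) = struct.unpack(endian + "I", tiff[link:link + 4])
        if ifd1 == 0:
            return None  # no thumbnail directory at all

        (count1,) = struct.unpack(endian + "H", tiff[ifd1:ifd1 + 2])
        offset = length = None
        for i in range(count1):
            start = ifd1 + 2 + 12 * i
            # Assumption: both tags hold a single 4-byte (LONG) value.
            tag, _type, _n, value = struct.unpack(
                endian + "HHII", tiff[start:start + 12]
            )
            if tag == 0x0201:    # offset of the thumbnail JPEG data
                offset = value
            elif tag == 0x0202:  # length of the thumbnail JPEG data
                length = value
    except struct.error:
        return None  # truncated or malformed EXIF block

    if offset is None or length is None:
        return None
    return tiff[offset:offset + length]
```

To try it, you would read a file in binary mode (say a hypothetical `edited.jpg`), pass the bytes to `extract_exif_thumbnail`, and if the result isn't `None`, write it out as its own `.jpg` file and open it to see whether it matches the edited picture or the original snapshot.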

The famous nude thumbnail

This "bug" became well known in 2003 when Cat Schwartz posted JPEGs on her blog of just her face, after cropping out bits from the original photo. Unfortunately in the original full photos she was, well, mostly unclad, and someone managed to recover the thumbnails of the naked versions from the published edited photos (there are pics of the original versions around on the Net but I'm not going to link to them, yep I can be a spoilsport sometimes). UPDATE: Oh all right, as so many people have been good enough to stop by, here y'go:

UPDATE: the following links no longer work, so for an example of a cropped image vs. its thumbnail see the links above. You can check out some real life examples of extracted thumbnails which reveal the person's face or body, even after they were mosaiced out or tweaked - which can even reveal e.g. two entire people who had been cropped out. For a fuller lowdown and other example pics showing originals and extracted thumbnails side by side, see Hutta on embedded thumbnails (I was tempted to ask, is that like in Hutta the Jab? But I won't...).

How to get rid of the original thumbnails?

So the tip is, from a privacy and security point of view, if you want to protect your identity (or someone else's) by editing a JPG photo, you need to be very careful that a thumbnail of the original photo isn't still embedded in it, before you upload, publish or email the edited pic. I gather (but haven't tested it) that you can use EXIF editing software to get rid of the thumbnail, and that Adobe Photoshop's Save for Web function also strips out EXIF data automatically.
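As an illustration of what "getting rid of the thumbnail" could look like, here is a minimal Python sketch that drops the whole EXIF block from a JPEG, which takes any EXIF thumbnail with it. The function name `strip_exif` is my own, and this is not how Photoshop or any particular EXIF editor does it. It assumes the EXIF data lives in APP1 segments before the image data, and it only deals with EXIF: it also throws away harmless EXIF fields such as the camera settings and orientation, and it does nothing about previews that some software may store elsewhere in the file.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of the JPEG with its EXIF (APP1) segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = bytearray(b"\xff\xd8")
    pos = 2
    # Walk the header segments that come before the compressed image data.
    while pos + 4 <= len(jpeg) and jpeg[pos] == 0xFF:
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: stop and copy the rest verbatim
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment  # keep every non-EXIF segment untouched
        pos += 2 + length
    out += jpeg[pos:]  # image data and anything after it
    return bytes(out)
```

You would read the edited photo in binary mode, pass the bytes through `strip_exif`, and save the result under a new name, keeping the original file until you have checked that the stripped copy still opens properly.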

Again I've not tested this myself but I'm told by a reliable source that Blogger, in Old Blogger at least, removes EXIF data when you upload a picture file. Picasa Web, to which New Blogger images are now uploaded, probably preserves EXIF data on uploading, but I don't know if that would include the original thumbnail. Anyone know, or care to do more research on this?? JPGs may be better than PNGs or GIFs if you use Blogger, in terms of speed of loading for your users, but obviously you need to be careful about the "hidden thumbnail" possibility if it's important that only the edited image can be seen by visitors.

(This post was triggered by Robert Castelo's recent uploading of photos of the May 2007 Drupal event to Flickr, including ones with anonymous me in them (why I blog anonymously). I'd said it was OK to include the pics of me as long as he blurred out my face, and in fact he mosaiced out all of me, but it then occurred to me to wonder about the edited versions. No dodgy embedded thumbnails there, though, before anyone tries to look - thanks Robert! My secret identity is still safe, phew. And you can forget about trying to see more of the anonymous model in the Lara Croft "lookalike" pic, so there!)
