Document metadata is easy to understand, but can cause problems for those who aren’t aware of it.
Document metadata is additional information about the document that is not visible or presented anywhere within the document, but can be read by programs or users to glean additional useful details about the document itself.
Many common file formats now carry a lot of metadata as standard. This blog post was originally written in Word 2016, and taking a look at the metadata attached, I can see the title of this document, the names of everyone who edited it, the number of revisions it has been through, the program I used to edit it, and even how long I have spent with it open. There are plenty of other metadata slots that are unfilled, as well.
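To make that a little more concrete, here is a minimal sketch of how a program might read a few of those fields. It relies on the fact that a .docx file is a ZIP archive whose core properties live in docProps/core.xml. The function name read_core_properties and the choice of fields are my own for illustration, and the sketch only covers a handful of the available slots (editing time, for example, is stored in a separate part of the archive).

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative selection of fields; the keys on the left are hypothetical names.
FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "revision": "cp:revision",
}


def read_core_properties(path):
    """Return a dict of selected core metadata fields from a .docx file.

    `path` may be a file name or a binary file-like object.
    Fields that are absent from the document are returned as None.
    """
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    result = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        result[name] = element.text if element is not None else None
    return result
```

Calling read_core_properties("post.docx") (a hypothetical file name) would return a dictionary of whichever of those fields the document has filled in.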
Metadata can be extremely useful. People often tag music or images in large folders so they can search by tag. If I tag every photo as I take it with who appears in that photo, I can then go back and search for a single person, and see only photos in which they appear.
However, document metadata has caused problems in the past, typically when people are unaware of the additional information they are sharing. For instance, many phones and GPS-enabled cameras now automatically put location metadata into any photos they take. This can be useful if you want to know exactly where or when a photo was taken, but it can also be a privacy concern: many websites do not strip out this metadata when you upload to them, so photos uploaded online can provide a pinpoint map reference of where you are, something you might want left private.
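To show how directly that location data maps onto a point on the globe, here is a small sketch. Exif stores latitude and longitude as degrees, minutes and seconds, plus a reference letter (N, S, E or W). Converting that to the signed decimal form that mapping tools accept takes only a few lines. The function name dms_to_decimal and the sample values are made up for illustration.

```python
from fractions import Fraction


def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert degrees/minutes/seconds plus a hemisphere letter to decimal degrees.

    South and west are returned as negative values, following the usual
    convention for signed coordinates.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError("ref must be one of 'N', 'S', 'E', 'W'")
    # Fractions avoid rounding error until the final conversion to float.
    value = Fraction(degrees) + Fraction(minutes) / 60 + Fraction(seconds) / 3600
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical sample values, not taken from any real photo.
latitude = dms_to_decimal(51, 30, 0, "N")
longitude = dms_to_decimal(0, 7, 30, "W")
print(latitude, longitude)
```

Anyone who can read the tags from an uploaded photo can do the same arithmetic and drop the result straight into a map.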
People often don’t realise the amount of data they’re giving away with just a simple file. To demonstrate this, I’ve opened a random photo on my hard drive that I downloaded a while ago. Now this photo is just of a piece of computer hardware – no harm there, right? Well, taking a look at the metadata in this picture, I can see exactly where this photo was taken, that the photo was edited in Adobe Photoshop before being uploaded, and I can even see the precise details of the camera used, such as the camera model, f-stop, and that the flash wasn’t used to take this photo. I can see when this photo was taken, the name of the photographer (or whomever the photographer gave the photo to), and even that the photographer manually adjusted the white balance while taking the shot.
A stunning amount of information is given away with many files. While some of this data can seem harmless, it can work as a dangerous tool in social engineering, and always presents a privacy concern. I would recommend stripping out unwanted or unneeded metadata from files before uploading them or sending them somewhere public or insecure.
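As a rough illustration of what stripping can involve, the sketch below removes APP1 segments from a JPEG, which is where Exif data is normally stored. It is deliberately simplified: the function name strip_exif is my own, it assumes a well-formed file, and it leaves other segments alone, so metadata stored elsewhere in the file may remain. For anything important, a dedicated tool is a safer choice.

```python
SOI = b"\xff\xd8"      # start-of-image marker that opens every JPEG
APP1 = 0xE1            # marker code of the segment that normally holds Exif
SOS = 0xDA             # start of scan: compressed image data follows


def strip_exif(jpeg_bytes):
    """Return a copy of a JPEG byte string with its APP1 segments removed.

    Simplified sketch: it walks the header segments in order and assumes
    each one carries a two-byte big-endian length, as in a well-formed file.
    """
    if jpeg_bytes[:2] != SOI:
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("unexpected byte where a marker should start")
        marker = jpeg_bytes[pos + 1]
        if marker == SOS:
            # Everything from here on is image data; copy it untouched.
            out += jpeg_bytes[pos:]
            return bytes(out)
        # The length field counts itself (2 bytes) but not the marker.
        length = int.from_bytes(jpeg_bytes[pos + 2:pos + 4], "big")
        end = pos + 2 + length
        if marker != APP1:
            out += jpeg_bytes[pos:end]
        pos = end
    # No scan segment was found; keep whatever bytes are left over.
    out += jpeg_bytes[pos:]
    return bytes(out)
```

A typical use would be to read the file in binary mode, pass the bytes through strip_exif, and write the result to a new file rather than overwriting the original.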
To view the metadata of a file in Windows, right-click the file and select “Properties”, and then “Details”.