Multimedia Meta-data – More than meets the eye

It is time to step back from system administration tasks and return to the subject of digitized multimedia data processing.  The subject is a vast, ever-evolving beast.  Multimedia basically consists of some combination of visual and auditory information.  With digitized multimedia, additional data can be layered on top, which opens up any number of uses.

In this post, I’ll be focusing on image file meta-data.  There are several levels and flavors of meta-data which I won’t cover in detail here.  However, I’ll give some basics about device information, Geo-tagging, and depth-maps.

A good way to think of meta-data is to compare it to a library card catalog.  Library books are written media that hold a large quantity of information.  The card catalog holds basic information about each book, such as author, publication date, location, etc.  Essentially, this is how meta-data came to be.  As libraries became digitized, the paper card catalog transformed from index cards into meta-data in a database.

The purpose of meta-data is to provide an efficient way of finding relevant information and resources.  It does this by logically organizing the data by means of identification.  Let’s see how this applies to multimedia meta-data, specifically digital image meta-data.

If you were handed a photograph by itself, you would be limited in what you knew about the picture.  In contrast, if the photograph had a caption handwritten on the back of it, “Rockaway Beach, 1972”, you would now know when and where.

Digital image meta-data operates under the same premise.  However, it isn’t limited to the “human based” details that might be handwritten on the back of a photograph.  It’s not uncommon for the details to include how large the picture is, the color depth, the image resolution, when the image was created, or the shutter speed.  It would be tedious and just plain silly to record all of this manually.  With digitized images, the meta-data is created automatically.  The level of detail depends only on the stage at which the image was created or modified.
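To make the card catalog comparison concrete, here is a minimal sketch of image meta-data treated as catalog records.  The `ImageMetadata` class, its field names, and the sample entries are all hypothetical, made up for illustration; they do not follow any particular meta-data standard.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ImageMetadata:
    """Hypothetical catalog record for one image (not a real standard)."""
    filename: str
    width: int            # pixels
    height: int           # pixels
    bits_per_pixel: int   # color depth
    created: datetime
    caption: str = ""


# Made-up sample entries, standing in for a photo library.
catalog = [
    ImageMetadata("beach_scan.jpg", 1200, 800, 24,
                  datetime(1972, 7, 4, 15, 30), "Rockaway Beach"),
    ImageMetadata("lighthouse.jpg", 4000, 3000, 24,
                  datetime(2015, 6, 1, 9, 12)),
]


def taken_in_year(records, year):
    """Return the filenames of all images created in the given year."""
    return [r.filename for r in records if r.created.year == year]


print(taken_in_year(catalog, 1972))
```

The point is only that, once the details live in structured fields rather than in the pixels, finding an image becomes a simple lookup.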


Let’s take our Rockaway Beach photograph again, only this time it was taken with a digital camera.  When the camera created the image, it also created meta-data about the image.  Some of the elements of that meta-data are the Geo-tag and time stamp.  Instead of being handwritten, this information is now embedded as a layer in the image file itself.

Opening the image file on a computer, I can see that it was taken at Rockaway Beach this year.  However, the detail is even greater than that.  The Geo-tag tells me where on Rockaway Beach, typically within 25 meters.  Meanwhile, the time stamp tells me, down to the millisecond, when the picture was taken.  This is a ridiculous level of detail, but it’s there anyway because the digital process records it automatically and effortlessly.
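As a rough illustration of how a Geo-tag becomes a point on a map: EXIF-style GPS fields typically store latitude and longitude as degrees, minutes, and seconds (each a numerator/denominator pair) plus a hemisphere letter.  The sketch below converts such a triple to decimal degrees.  The function name `dms_to_decimal` and the coordinates are illustrative assumptions, not values read from a real photograph.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) rationals to decimal degrees.

    dms -- three (numerator, denominator) pairs
    ref -- hemisphere letter: "N", "S", "E" or "W"
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Made-up coordinates, for illustration only.
latitude = dms_to_decimal(((40, 1), (30, 1), (0, 1)), "N")
longitude = dms_to_decimal(((73, 1), (48, 1), (36, 1)), "W")
print(latitude, longitude)
```

Mapping software generally expects the decimal form, which is why this conversion is usually the first step after reading the tag.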

With the meta-data, I can also see what device took the photograph, along with several settings that photographers would understand.  These lend a hand to post-processing and image enhancement later.  The meta-data can even be used to automatically post-process the image for the best visual appearance.  That is significant.

The nitty-gritty of meta-data is that it doesn’t stop with just those elements.  There are consortiums that are constantly defining, refining, and ratifying the standards that meta-data is built on.  One such element is the depth-map.


A depth-map is an image layer that contains a gray-scale image with values that represent distance.  The distance reference point is the observer, typically the camera that took the photograph.  The varying shades of gray indicate how far from or close to the camera each pixel is.  The layer is a component of a standard known as eXtensible Device Metadata, or the XDM specification.  It is the result of a collaborative effort by Intel and Google starting in 2014.  As of version 1.01, XDM supports a variety of use cases including Depth, VR, and 360 photography.
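To show how a shade of gray can stand for a distance, here is a minimal sketch that assumes the simplest possible encoding: 8-bit gray values mapped linearly between a near and a far distance, with 0 as the nearest point.  That mapping, the function name `gray_to_distance`, and the near/far values are assumptions for illustration; a real depth-map declares its own encoding and range in its meta-data.

```python
def gray_to_distance(gray, near, far, max_value=255):
    """Map a gray value to a distance, assuming a linear encoding.

    Assumption: 0 means `near`, `max_value` means `far`, and everything
    in between is spaced evenly. Real files may use another encoding.
    """
    if not 0 <= gray <= max_value:
        raise ValueError("gray value out of range")
    normalized = gray / max_value
    return near + normalized * (far - near)


# Hypothetical scene: nearest object at 1 m, farthest at 5 m.
for gray in (0, 51, 255):
    print(gray, gray_to_distance(gray, near=1.0, far=5.0))
```

Applied to every pixel of the gray-scale layer, a mapping like this turns the flat image into a grid of distances from the camera.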

It might not be apparent, but the XDM spec isn’t just a single meta-data value for the depth-map; it defines a whole host of meta-data values.  With these values, the flat two-dimensional image can be rendered with a greater degree of detail.  The higher detail allows for more processing options.  To put it simply, the image can now be analyzed, manipulated, and rendered in ways a traditional image cannot.  This is the real magic behind depth-map meta-data.

For more details about the XDM specification, reference it from the website,

Traditional images are evolving into dynamic datasets that go far beyond the initial intent of cataloging and organizing.  Embedded information in the image not only tells us details about the image source, but also gives us the ability to process the image.  The phrase “more than meets the eye” has never been truer in the field of digitized photography.
