Digital, Thinking

When data about data matters most

05.10.11 | Gautam Ramdurai

I’ve been thinking about the music industry a lot lately. There are so many players in this space trying to figure out the new economics of this industry, yet so few seem to get it right. Some services such as iTunes, Spotify and Last.fm are truly exceptional, while others last a mere few months. And once in a while you come across something as beautiful as Planetary. How do you even conceive such a thing? I think it has to do with data.

Let’s talk data for a bit

When music went digital, it became easily replicable and easily transmissible – and hence it became abundant. I think this abundance is what makes re-imagining music the way Planetary does truly valuable. And I think the best way to create more services that provide such novelty is to start thinking of this industry as not the music business, but the data business. And when data becomes abundant, we have to broaden our vision and get into the metadata business.

The data pyramid

In college, my professor taught me a very simple way of looking at information – the data pyramid. At the bottom of the pyramid lies DATA – raw data. It could be words, numbers, images…anything. Then comes INFORMATION – this is defined data. This is the difference between 5.6 (just a datum) and 5.6 feet. Adding the data-type “feet” gives the number context. Once INFORMATION is given some sort of personal significance it becomes KNOWLEDGE. The statement “Neda is 5.6 feet tall” counts as knowledge: it is information with some significance to Neda. The topmost level is WISDOM or INTELLIGENCE – this is applied KNOWLEDGE. An example could be how Neda’s physician uses her height as a metric to prescribe medicine, or how her tailor uses it to sew the perfect dress.

At each level of the data pyramid more meaning is being added to data. And that meaning comes from increasing the amount and/or complexity of metadata. When data goes digital, it becomes easier to manipulate this metadata – finding new ways to scale up the data pyramid.

The dull definition of metadata is:

Metadata is data about data.

I personally prefer this one:

Metadata is information about a thing, apart from the thing itself.

As Rishad Tobaccowala points out – when this “thing” is some form of data and when that data is abundant, metadata becomes even more important.

Music and metadata

When music was sold on cassettes, records and CDs – the best way of discovering new music was to go to the record store. Store owners meticulously organized the music by genre, artist and date. Sometimes they would also act as curators and tastemakers – showing you which new albums were worthy of your attention.

MP3s made music extremely portable and transmissible. What if buying music meant just getting a file without any information associated with it – no filename, no song name, no artist or album name? Chris Anderson points out in his book FREE that you can easily download a song from a torrent site for free but choose iTunes because it guarantees you the right album name, the right song name and genre. What matters most when you listen to music? The file! But we choose to pay 99 cents for a file that we can easily get for free – we are paying for quality metadata.

Metadata for discovery

Your iTunes library is essentially a database. A database might seem like a mundane way of just stacking up files, but a closer look shows how databases are all about relationships and associations derived from metadata. This makes itself evident each time you sort your iTunes library based on one of the fields. The “jazz” genre is not just another field, but a connector that makes associations between three disparate music files.
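To make the “connector” idea concrete, here is a minimal sketch in Python. It assumes a library is just a list of records with a few metadata fields; the track titles, artist names and the `group_by` helper are all made up for illustration and have nothing to do with how iTunes actually stores its library.

```python
from collections import defaultdict

# Made-up library for illustration; titles and artists are placeholders.
library = [
    {"title": "Track A", "artist": "Artist 1", "genre": "jazz"},
    {"title": "Track B", "artist": "Artist 2", "genre": "jazz"},
    {"title": "Track C", "artist": "Artist 3", "genre": "jazz"},
    {"title": "Track D", "artist": "Artist 4", "genre": "rock"},
]

def group_by(tracks, field):
    """Collect track titles under each distinct value of a metadata field."""
    groups = defaultdict(list)
    for track in tracks:
        groups[track[field]].append(track["title"])
    return dict(groups)

by_genre = group_by(library, "genre")
print(by_genre["jazz"])  # the jazz tracks, linked only by their genre tag
```

The three jazz tracks share no artist and no title, yet the single “jazz” value is enough to tie them together. Grouping by a different field, such as `"artist"`, would draw a different set of connections over the same files.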

This property of establishing relationships between data points can be used as a tool for discovery in giant databases. This is exactly what services like Last.fm and Pandora do. They use connector metadata to suggest new music to you. For Last.fm it’s more about the artist and genre, while Pandora cares about the waveform and the music pattern.
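As a rough illustration of discovery through connector metadata, the sketch below ranks tracks you have not heard by how many metadata fields they share with a track you already like. This is a toy under stated assumptions, not how Last.fm or Pandora actually work: the catalog, the field names and the `recommend` function are all hypothetical.

```python
# Made-up catalog for illustration; every name here is a placeholder.
catalog = [
    {"title": "Track A", "artist": "Artist 1", "genre": "jazz"},
    {"title": "Track B", "artist": "Artist 1", "genre": "jazz"},
    {"title": "Track C", "artist": "Artist 2", "genre": "jazz"},
    {"title": "Track D", "artist": "Artist 3", "genre": "rock"},
]

def shared_fields(a, b, fields):
    """Count how many of the given metadata fields two tracks have in common."""
    return sum(1 for field in fields if a[field] == b[field])

def recommend(liked, catalog, fields=("artist", "genre")):
    """Suggest unheard tracks, strongest metadata overlap first."""
    liked_titles = {track["title"] for track in liked}
    scored = []
    for candidate in catalog:
        if candidate["title"] in liked_titles:
            continue  # skip tracks the listener already has
        # Best overlap with any liked track; 0 if nothing is liked yet.
        score = max(
            (shared_fields(candidate, track, fields) for track in liked),
            default=0,
        )
        if score > 0:
            scored.append((score, candidate["title"]))
    # Highest overlap first, ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]

print(recommend([catalog[0]], catalog))
```

With “Track A” as the only liked track, “Track B” (same artist and genre) ranks above “Track C” (same genre only), and “Track D” shares nothing and is left out. Swapping the `fields` tuple for other metadata changes what counts as “similar”, which is really the whole point.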

Discovery is one of the most important services that any data-dependent entity can provide today. Think about it: 10 years ago, music was what you found in the store. Now the concept of music is infinite. There was always more than enough music to go around the world – but the fact that it is now extremely accessible makes it daunting to explore. The same applies to books, movies, TV series, YouTube clips, articles and news items. Anything that has turned digital has turned abundant at the same time. This is the reason why we need more help than ever before to wade through the crap to get to the good stuff.

The Human Side

Metadata depends on how we interpret and manipulate it. We appropriate our music in countless ways – each of those appropriations could lead to a new service. For example, a lot of music buffs look to music blogs and sites like Pitchfork for their new music recommendations. Shuffler.fm takes that behavior and adds a metadata lens to it. Finding new ways to re-interpret metadata will have to be based in some core human trait. Ghostly does a great job of understanding the need for “work music” and the correlation between moods, colors and beats. There are plenty of other examples out there such as Musicovery and HypeM. The latest addition is a service that uses data from live shows – Songkick.

The best part about this is that creating new services around data is limited only by our imagination of metadata. For example, a lot of my friends listen to songs based on the weather outside – they have rainy songs, sunny songs, summer mixes and so on. I could start a new service that uses local weather data and plays you songs accordingly. Of course, the service would need a ridiculous name like Clou.dy or Sun.ny.
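Here is roughly what the core of such a service might look like, as a minimal sketch. Everything in it is an assumption made for illustration: the weather condition is passed in as a plain string instead of coming from a real weather feed, and the mood tags, the `CONDITION_TO_MOOD` mapping and the `playlist_for` function are hypothetical.

```python
# Hypothetical mapping from a weather condition to a mood tag.
CONDITION_TO_MOOD = {"rain": "rainy", "clear": "sunny", "snow": "cozy"}

# Made-up tracks, each tagged by hand with the moods a listener gave it.
tracks = [
    {"title": "Track A", "moods": ["rainy", "mellow"]},
    {"title": "Track B", "moods": ["sunny"]},
    {"title": "Track C", "moods": ["rainy"]},
]

def playlist_for(condition, tracks):
    """Return titles whose mood tags match the current weather condition."""
    mood = CONDITION_TO_MOOD.get(condition)
    if mood is None:
        return []  # unknown weather: nothing to suggest
    return [track["title"] for track in tracks if mood in track["moods"]]

print(playlist_for("rain", tracks))
```

The interesting part is not the code but the tag: “rainy” is metadata that no record label ships with the file. It only exists because a listener decided the song belongs to a rainy day.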

Pivoting on metadata

Planetary does exactly this – it takes musical data, extracts the metadata and finds an entirely new way of looking at it – and eventually visualizing it. It doesn’t answer the question “what else can I do with this music” but “how else can I imagine this data” – at least that’s how it seems to me.

The concept of metadata applies to all types of media – images, videos, films, TV series, cartography, house listings, classifieds, online dating etc.

In a world where everyday behavior is being digitized, data of any sort will become abundant. Once that happens, metadata becomes the key to organize, relate and discover. Metadata lies in the eyes of the beholder. If the interpretation, manipulation and use of it comes from core human understanding, what comes out of it will become a digital conduit to serve a human desire.

This, I think, is the key to engineering serendipity in an algorithm-driven digital world. Thoughts?

  • http://www.erictabone.com EricTabone

    G-Ram -

    / I’ve never seen that data pyramid before but it is fantastic. Thank you for my daily awesome.

    // Is ‘wisdom’ only attainable with external knowledge (or info or wisdom)? Just curious.

    /// I had not thought of metadata’s applicability in this way. Your music example explains it perfectly. (And to that point, your blogging is excellent – though I know too you learned from the best.) :-)

    //// Thanks also for Ghostly. Missed that.

    ///// I strongly disagree with the reasoning behind a 99 cent buy vs. stealing. (It may have been truer when FREE was first published, but I doubt that too.) Qualitatively, I’ve never once heard anyone make that case before; you can easily apply metadata changes to thousands of files in any music player in just 2 clicks. But a separate conversation, it doesn’t impact your argument one way or another.

    ////// Totally agree to the last statement. I’ve been thinking a lot about recommendation engines and ‘engineering serendipity’ over the last couple of months and your argument is spot on – spoken like a true engineer.

    Keep up the blogging, G! Great stuff.

  • http://twitter.com/matthewbonin Matthew Bonin

    Brilliant and engaging post…. thanks for bringing to light some new ways to enjoy music at the same time..

  • http://techbrahmana.blogspot.com Srirang G D

    First of all,  It is both awesome and unbelievable, that I am seeing the “Data Pyramid”. The last encounter with that was in the CS seminar hall during the valedictory function of TechnologiX ’07.. ! :)   On that note, kudos to Shri. You-know-who. ;)

    That apart, well written. It was a joy to read. You just got a new subscriber to your blog. :)

    Now, talking about the data itself, you have beautifully captured the essence of digitization of the world and the people in it. Specifically you explore one (music) aspect of that. As you (and very likely other readers) would have observed, more and more data about people is now getting digitized. Starting from Facebook profiles, to Foursquare check-ins, Quora questions and answers, playlists on last.fm and so on. If all of this data could be mined together, one can get a “real” or “almost real” 360-degree view of the users. After that we can have recommendations for just about everything (like the recommendations for music you talk about). I think this is where we are headed.

    Of course there are technical challenges, the biggest being correlating data from these disparate sets, and more than a few companies (including mine http://www.drumaroo.com) are trying to deal with these. Let’s see how we fare.

    Once again, nice post.

    Srirang

