Last week Gnip, the leading provider of social media data for enterprise applications, announced its partnership with Automattic to bring comments and blogs on the WordPress platform into its already powerful data stream.

For four years, Gnip has been the company behind the elusive social data stream businesses are hungry for. Though you may not be familiar with the name, Gnip provides data to eight of the nine largest social media monitoring firms, such as Radian6.

We had a chance to talk to President and COO Chris Moody about how Gnip is changing the way we look at data.

In your words, what is Gnip and how did it get started?

Gnip gives social data to businesses so they can monitor, listen and gain insight based on public social conversations.

Before our service, companies had to go to each individual channel and use its API to get data. Some publishers offer no access at all; others are technically limited, or only expose a portion of the content they receive. If you can live with those limitations, you can contact each publisher directly. The downside is that the APIs change often, and the subsets from each data source are hard to analyse as they arrive in different formats.

Gnip brought that data together and normalised it to make it easy to consume. So far we've had broad reach and a lot of success, and we're now coming up on our four-year anniversary.
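To illustrate the normalisation Moody describes, here is a minimal sketch. The payload shapes and field names below are invented for illustration (they are not Gnip's or any publisher's real formats): two publishers express the same concept differently, and a small adapter maps each into one common activity schema.

```python
# Hypothetical sketch of per-publisher normalisation: each adapter maps
# a publisher-specific payload (invented fields) into one shared schema.

def normalise_microblog(raw):
    """Map a simplified, hypothetical microblog-style payload."""
    return {
        "source": "microblog",
        "author": raw["user"]["screen_name"],
        "body": raw["text"],
        "posted": raw["created_at"],
    }

def normalise_blog_comment(raw):
    """Map a simplified, hypothetical blog-comment payload."""
    return {
        "source": "blog_comment",
        "author": raw["comment_author"],
        "body": raw["comment_content"],
        "posted": raw["date"],
    }

post = {"user": {"screen_name": "alice"}, "text": "hi", "created_at": "2012-01-01"}
comment = {"comment_author": "bob", "comment_content": "nice post", "date": "2012-01-02"}

activities = [normalise_microblog(post), normalise_blog_comment(comment)]
# Downstream consumers now analyse one format instead of one per publisher.
```

The payoff is on the consumer side: analytics code written against the shared schema keeps working when a new publisher is added, since only a new adapter is needed.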

In the past four years, have any major changes from publishers affected the quality and accessibility of online data?

Eighteen months ago, the industry changed for the better when one publisher decided to take a different approach. Twitter believed the business world should have access to its data, because businesses need good, complete data to work with.

Twitter developed its commercial product through Gnip. Though Twitter wants to serve the business market, it prefers to focus on the consumer side, so it partnered with us to provide its social data to businesses.

With no rate limit, the monitoring industry blossomed. A lot of decisions have been made on data from Twitter, and it's consumed in large volumes. Companies now want more data from across all social conversations. Twitter is good for certain business use cases, especially news or PR crisis management; with Twitter you get the facts, but not the deeper conversation.

You’ve now partnered with Automattic. How is this going to affect the analytics business?

Partnering with Automattic is a big step forward for us and for the industry. By having access to the blog and comment platforms they offer (like WordPress and Jetpack), Gnip can radically increase the data it provides. Blogs and comments show further engagement, and the data is new, deeper and richer. With this wider reach and an existing ecosystem, it's much easier to serve the business market.

Do you provide your services directly to brands?

Typically, we work with analytics providers, and they work with the brand to provide insight. We focus on getting the best and most comprehensive data, while the analytics companies turn it into results the brand sees in their systems.

With billions of social interactions, how do you deal with the age-old problem of spam?

Providing clean data is a challenge, but the bigger question is: what is spam? We aren't the best company to decide, so we leave it to each publisher to define what spam is in its own system.

What are your priorities for 2012?

Two things are of high priority for us.

  1. More coverage: Twitter is amazing, but it's not all of social conversation. Our ultimate (if unattainable) goal is to give a full picture, and we want to work with publishers to open up data that has never been available before.
  2. Improving our technology: This is a never-ending task, as we process over a billion social activities a day. A big technical challenge is getting the most data while consuming it in the best way possible and making it easier to absorb. Customers want to know everything, so we're working on enrichments that capture the whole brand conversation, like fully resolving URLs to make search better, which is a hard technical problem. With far greater access, we can enrich what we receive to make the data more consumable.
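
The URL-resolution enrichment mentioned above can be sketched in miniature. Shortened links often redirect through several hops before reaching the final page that is actually worth indexing; the hop table below stands in for real HTTP redirects, and all URLs in it are hypothetical.

```python
# Toy illustration of "fully resolving the URL": follow redirect hops
# (simulated here by a dict instead of real HTTP requests) until a
# final destination, a loop, or a hop limit is reached.

REDIRECTS = {
    "http://t.co/abc123": "http://bit.ly/xyz",
    "http://bit.ly/xyz": "http://example.com/full-article",
}

def resolve(url, max_hops=10):
    """Follow redirect hops; stop at the final URL, a loop, or the hop limit."""
    seen = set()
    for _ in range(max_hops):
        if url in seen or url not in REDIRECTS:
            return url
        seen.add(url)
        url = REDIRECTS[url]
    return url

print(resolve("http://t.co/abc123"))  # http://example.com/full-article
```

At Gnip's scale the hard part is not the loop itself but doing this for a billion activities a day, which is why Moody calls it a hard technical problem.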

We talked about your goals for this year but what is the future of analytics and data mining? Where will we see the most immediate growth?

Demand for social data seemed to come from marketing at first, so today the conversation is focused on real-time data (i.e. what is happening now). But as you can imagine, there is an enormous need for historical data too. It offers a chance to replay history and trace how a problem unfolded: if a company didn't set up searches at the time, it can go back and refilter the stream to extend coverage into the past. As businesses become more and more reliant on data, the need for historical data has increased.
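The refiltering idea is simple to sketch: replay an archived stream against a search that didn't exist at the time. The archive entries and the brand name below are invented for illustration.

```python
# Minimal sketch of historical refiltering: apply a new search term
# retroactively to an archived stream of activities (invented data).

ARCHIVE = [
    {"posted": "2011-06-01", "body": "loving the new AcmePhone"},
    {"posted": "2011-06-02", "body": "coffee time"},
    {"posted": "2011-07-15", "body": "AcmePhone battery issues?"},
]

def refilter(archive, term, since, until):
    """Replay the archive, keeping activities that match a term within a date range."""
    return [
        a for a in archive
        if since <= a["posted"] <= until and term.lower() in a["body"].lower()
    ]

matches = refilter(ARCHIVE, "acmephone", "2011-06-01", "2011-12-31")
# Two activities match, even though no live search was running back then.
```

This is what makes the predictive-analysis use case possible: models need the past, and refiltering recovers it.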

In some ways, historical data is the cutting edge of social media: if we're looking to build predictive analysis, then we need the past. Financial institutions are a good case in point; they are now trading on social data to perform more sophisticated analysis.

Beyond marketing, what unanticipated areas are emerging that may start using this vast amount of online data?

There are a couple of interesting areas. The most apparent is disaster relief, where emergency responders use data to react to challenges in the field.

As for businesses, most think of sales, but a more interesting use of data is supply chain management, where companies can see the impact in multiple areas. For instance, you could look internationally at political events, such as protests or war, to understand whether there will be a disruption of service, and then gauge the impact on parts supply, especially if you have manufacturing facilities in those areas. You could also look locally: retailers can see activity around their stores or in the lead-up to a release, spot products that are in high demand, and stock accordingly.