Google, Bing and Yahoo may not be the best of friends, but every once in a while they do get together in a high-profile way.

That was the case yesterday, when the search trio announced the launch of Schema.org, which seeks to add more structure to content on the web. Schema.org goes about this by offering new markup based on the W3C's draft HTML microdata specification, which in this case allows publishers to add data to their HTML tags to "help search engines better understand their websites".

There are specific Schema.org types for different kinds of content, ranging from movies to restaurants. As an example, a movie type would augment standard HTML to help search engines understand the content:

<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span>Director: <span itemprop="director">James Cameron</span> (born August 16, 1954)</span>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>

It's not difficult to see why Google, Bing and Yahoo would like this microdata. Spidering the world's content and making sense of it all is really, really tough. Even with the best technology in the world, finding semantic meaning in huge amounts of semi-structured and unstructured information is challenging.

But publishers, lured by the promise that microdata could boost their position in the SERPs, can provide the desired semantic context manually. In other words, the major search engines can get publishers to do their work for them.
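To make the point concrete, here is a minimal sketch of how little work a crawler needs to do once the publisher has supplied the context. It uses only Python's standard `html.parser` module. The class name `MicrodataParser`, the helper `extract_item` and the sample markup are illustrative assumptions, and the sketch handles only a single, non-nested item, which is far simpler than what a real search engine would need.

```python
from html.parser import HTMLParser

# Void elements have no closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MicrodataParser(HTMLParser):
    """Collects itemprop values from one non-nested itemscope (a simplification)."""

    def __init__(self):
        super().__init__()
        self.item = {"type": None, "properties": {}}
        self._stack = []  # one entry per open element: its itemprop name, or None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item["type"] = attrs.get("itemtype")
        prop = attrs.get("itemprop")
        if prop and "href" in attrs:
            # For links, the property value is the URL, not the link text.
            self.item["properties"][prop] = attrs["href"]
            prop = None
        if tag not in VOID_TAGS:
            self._stack.append(prop)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Only text sitting directly inside an element with an itemprop is kept.
        text = data.strip()
        if text and self._stack and self._stack[-1]:
            prop = self._stack[-1]
            props = self.item["properties"]
            props[prop] = props.get(prop, "") + text


def extract_item(html):
    """Return the item type and its properties found in the given HTML."""
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.item


SAMPLE = """
<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <span>Director: <span itemprop="director">James Cameron</span> (born August 16, 1954)</span>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
"""

if __name__ == "__main__":
    print(extract_item(SAMPLE))
```

Without the `itemprop` attributes, the same crawler would have to guess that "James Cameron" is a director rather than, say, an actor or a reviewer.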

But should publishers go along? There are several big problems with Schema.org worth considering:

  • Adding the markup could be a lot of work.

    Adding the markup to existing content will require great effort for publishers with a lot of content. This will be especially true for publishers using content management systems, as many would need to update their templates and administrative interfaces to support Schema.org markup.

  • It can and will be abused.

    Schema.org types may be more specific than meta tags, which search engines largely ignore today, but just like meta tags, you can be sure SEO spammers and black hats will look to take advantage of them. In the worst-case scenario, this would render Schema.org as worthless as meta tags for SEO purposes.

  • HTML is for display.

    HTML describes how content should look when rendered; HTML is not designed to describe the content itself. That's one of the things XML is designed for.

    The HTML microdata specification on which Schema.org is based may be a W3C draft, but there is still a strong argument that blurring the lines between HTML and XML is undesirable for just about everybody except search engines.
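For a sense of what the template changes mentioned in the first point involve, here is a minimal sketch of a template function that emits microdata from fields a CMS already stores. The function name `render_movie` and the field names `title`, `director` and `genre` are hypothetical; a real CMS would have its own templating layer and data model, and the effort required would depend on how that layer is structured.

```python
from html import escape


def render_movie(fields):
    """Render one movie record as HTML carrying microdata attributes.

    `fields` is assumed to be a dict with 'title', 'director' and 'genre'
    keys (hypothetical names standing in for existing CMS fields).
    """
    return (
        '<div itemscope itemtype="http://schema.org/Movie">\n'
        f'  <h1 itemprop="name">{escape(fields["title"])}</h1>\n'
        f'  <span>Director: <span itemprop="director">{escape(fields["director"])}</span></span>\n'
        f'  <span itemprop="genre">{escape(fields["genre"])}</span>\n'
        '</div>'
    )


if __name__ == "__main__":
    print(render_movie({
        "title": "Avatar",
        "director": "James Cameron",
        "genre": "Science fiction",
    }))
```

The field values are escaped so that stored content cannot break the surrounding markup.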

So should publishers completely ignore Schema.org? No. It will be interesting to see how Google, Bing and Yahoo use it in the next year, and what impact publishers who do adopt it see on their search rankings. In the meantime, most publishers should probably remain focused on the SEO basics.

Published 3 June, 2011 by Patricio Robles

Patricio Robles is a tech reporter at Econsultancy. Follow him on Twitter.

Comments (17)

Martin McAndrew, Online Marketing Manager at Redweb

It also doesn't seem to be supported by the W3C. I am also concerned about accessibility issues. Has anyone tried using the schemas with screen readers etc.?

about 7 years ago



This is just plain crazy in my opinion. Too many companies creating proprietary code for their own gains. Look what happened to FBML! FBML wasn't necessary and was made totally redundant.
Was this new coding 'standard' passed through the W3C?

I have never seen the web so fractured and no, I'm not joking.

about 7 years ago



A lot of work? Not at all. And the use of content management systems will make adding markup to existing content easier, not harder. Chances are, if you have a CMS full of movie data, you already have fields for title, director, and so on. All you have to do is code the CMS, in one place, to generate markup based on existing fields. In one easy step, your entire site will support Schema.org markup. And the big CMS products will certainly create modules to add this functionality automatically. Drupal already has one underway, a day after the announcement.

Subject to abuse? Definitely. I don't understand how this is any different than META tags. Relying on the content producer to embed meaningful metadata (or microdata) is just asking for it to be abused. How can this be any different?

about 7 years ago


bob makemoney

I agree with the concerns about accessibility and abuse which makes me think that it will mean maybe not a lot of work but putting more thought into design considerations. I'm not looking forward to the fallout of this collaboration.

about 7 years ago



Yes, I agree with you, schema will give publishers and web designers a lot of work.

about 7 years ago


Jon Henshaw

This seems like borderline FUD. There really isn't any comparison to META keywords and structured data. It wasn't that META keywords were simply abused, they went away because they were useless. Search engines developed the ability to determine keyword relevance through content analysis and other means. As for abuse, the search engines survived people using heading elements improperly, and they'll do just fine with structured data, as they have been since supporting rich snippets.

Structured data has been a long time coming. Microformats has been leading this fight for some time. I predicted several years ago that it was only a matter of time until search engines adopted Microformats, because it's elegant and just makes sense for both people and bots. That realization finally came true with the support of "rich snippets". Schema is simply the next step in the highly logical adoption of structured data in HTML.

I was also confused with your statement about HTML being for display. It's become standard to view CSS for display, not HTML. The latest version of HTML is actually designed to semantically structure data, with standards for displaying that data being adopted by browsers who create rendering engines for it. The people behind HTML have made great strides to remove anything remotely related to display, trying to deprecate any element and attribute related to that purpose.

HTML elements that contain copy are used to convey its importance, emphasis and semantic structure. I know that copy contained inside of an H1 is a primary heading. I know that copy contained in a LI is a list item. Structured data takes it one step further by describing the data within the copy, without changing its basic meaning or altering how it would be rendered in a browser. It's amazingly simple and clever, but also powerful in its utility.

A less elegant and more cumbersome solution would be to have an HTML page with copy that contains the data you want to communicate, and then an XML file with duplicate data that's structured just to talk to bots. That's the entire point of structured data in HTML, to combine it in a document that can be read by humans and machines.

I will concede that it will take some work on the part of web designers/developers to update their code. Especially if they're just now waking up to rich snippets, microformats, and other structured data formats.

about 7 years ago



It will be abused. There is no question about it. I expect that very soon there will be services or some software to help with it. And very soon Google, Yahoo and Bing will have to come up with a solution, something we saw with nofollow.

about 7 years ago


Christoph Koepernick

Assuming schemas will increase one's position in the SERPs, are we sure this will draw more visitors to our sites? Maybe Google is using this extra data to display it directly in the search results, as they are doing with star ratings and location information. Might schemas harm our sites, just as Google is making Qype and Yelp redundant with their local/places initiative?

about 7 years ago

Martin White, Senior User Experience Manager at Sainsbury's

Microformats make structure more semantic and content more accessible and portable. This is good for search engines and, more importantly, end users.

Retrospective integration of schemas will undoubtedly require more effort than early adoption in the development of a site. But the benefits in the form of rich snippets in SERPs are obvious. What's not clear is whether schemas spell the end of existing microformats such as hCard.

The abuse of technology in the internet is nothing new and it is the job of search engines to filter dubious sources when returning relevant results.

about 7 years ago


Andy Birchwood, Lead designer, Ecommerce at TUI UK & Ireland

"HTML is for display.
HTML describes how content should look when rendered"

This is incorrect. HTML is a markup language that semantically describes the purpose of content in a document. "This is a heading". "This is a list" etc. According to W3C web standards, the display is controlled by user agent stylesheets and custom stylesheets.

So adding attributes to html elements in no way breaks W3C guidelines and does not affect the display of HTML elements in any modern browser.

Martin, regarding your point about accessibility, the Schema.org attributes seem to overlap some of those used for ARIA markup. It will be interesting to see how screen reader vendors choose to interpret these extra roles, if at all, but again, if the browser doesn't recognise the attributes, it will ignore them rather than break the page.

about 7 years ago


Dean Marshall

I agree with others that HTML is a structural mark-up language and that display should be controlled by Cascading Style Sheets (CSS), but that is an aside from the article not the main point.

I think the article touches on an important point though - although I would tend to go much further about why it is bad for the website operator/owner.

Certainly given Google's ability to build a business out of other people's content - and their tearing up of the copyright rules in printing/publishing the dangers are real.

At the minute a webpage is basically a set of unstructured data - so Google sees words on the page and sends the human visitor to your site to check whether the topic of the page is what you are after.

If the webpage becomes a structured set of data - Google, Microsoft, Yahoo or some other 'Content aggregator' just lifts it from your page, aggregates it and uses it for their own ends - without sending you the human visitor.

Think Yellow Pages harvesting business addresses; or a publisher harvesting recipes (how difficult would it be to find your recipe being used, then prove it was *your* recipe and then assert your rights).

Just like those news aggregation services that set up after RSS became widely used - not always giving links to the sites whose news they scraped.

Sure Google dangles the prospect of more visits (for now), but I see real danger down the line.


about 7 years ago


Nick Nettleton, Director at Loft Digital


In this case, I think you are wrong on almost every point.

> Adding the markup could be a lot of work

Adding it to existing HTML templates should be fairly trivial for most sites. Certainly much easier than the alternative you suggest - building a whole new XML output layer, and maintaining it in parallel with the HTML layer.

> It can and will be abused

Everything is abused. That's the nature of an open web. But it's never a reason not to do it. The solution is to create practical ways for the crowd to filter out abuse, and economic disincentives for abusers. Report buttons and lengthy bans, for example.

> HTML is for display

Andy Birchwood is correct. Display is exactly what HTML is *not* for.

More importantly, I think the ideas behind Schema.org are great not just for publishers and search engines, but also for end-users and start-ups.

For example, search Google for "Chicken Tikka". Among the results, you'll see some recipes. Some of these have images, star ratings, and an indication of how long it takes to cook. Some of them don't.

The problem is, it's simply not practical (or desirable) for people who write bots to create and maintain a different semantic indexer for every site, based on how its HTML is arranged.

By introducing a Recipe schema, it becomes possible for anyone to write a bot to crawl and index recipes, and display them in ways that are useful for the user; and for anyone to publish recipes in a way that will be indexed.

Everyone wins.

about 7 years ago


Dawn Wentzell

Totally agree with Jon on his point about CSS being to control display while HTML is to structure the data. That always was and continues to be the purpose of HTML.

As for the amount of time it takes to add? Minor. I implemented the microdata for Books and Reviews on a WordPress site I own yesterday; it took me 5 hours to create the custom fields, taxonomies and post template, and only 1 hour to add in the microdata and test it. And I've never used microdata before (was always a microformats fan). Relatively speaking, it's not a large time investment.

about 7 years ago


Chris Graham

I agree with CWG, Andy, and Nick. We've just released a new version of ocPortal CMS with Schema.org support, and it's worked out very nicely. The difficulty was implementing HTML5 (required for this), not Schema.org itself.

about 7 years ago


William McKee

I'll chime in with Jon Henshaw et al. that this is borderline FUD. A good CMS will eliminate the issue around extra work. Search engines are already adapting to blackhat attempts to gain rank. And HTML is not at all about display; that's what CSS was created to do.

It's good to see the search community begin to take on this issue of creating more semantic markup in order to improve our Web experience. BTW, this work goes way beyond search engine ranking and into the original vision that Tim Berners-Lee had for the internet.

about 7 years ago


John Robinson

I think that given movies are probably the hardest type of web content to accurately index the schema COULD be a good thing, the only way to determine what video content is about is to watch it, and sometimes this means ALL of it as the subject may not be apparent in the first few minutes, this could only be done properly by humans (or movie watching robots) however will content developers and movie makers use this accurately... I doubt it

about 7 years ago



Another schema... this is additional work for web designers and developers. I don't think that it is going to be accepted by the web community.

about 6 years ago
