Head of Digital Experience at Royal Mail group
11 February 2009 08:07am
Can anyone advise me on this one? I have made some enquiries and got quite mixed messages!
I want to implement multiple and enlarged images against articles on the media site I manage. As we make revenue from banners, we want to refresh the banners when we show the different variations of the image, i.e. image 1 will show one set of banners and image 2 will show a new set. As a result, I was planning to create a new page for each image (as opposed to using Ajax), with the same text content (as I believe the BBC does). My questions are:
1) If the text content on the duplicate page is the same, will Google see this as duplicate content?
2) It was suggested that I use robots.txt to get around the duplicate content issue, but I guess this would then mean that the second image will not be indexed by Google? Is there any way to get round this?
Web Production Coordinator at Gaslight Media
11 February 2009 19:58pm
Hi Caroline,
If the image is different, the content is different (images are also content), and I don't think you have to worry about Google.
I don't think you need to use robots.txt at all. It's better to have two pages in the index, one ranked well and the other one not so well, than to only have one page indexed. The word 'penalty' might not be the best one to describe what is happening. Google just wants to make sure that the results given in the SERPs (Search Engine Result Pages) are unique, and that they don't serve results with the same content. So if it finds pages that it thinks have the same content, it will choose one as the original and simply not show the other one(s) in the SERPs.
E-Business Consultant at Dan Barker
12 February 2009 14:54pm
Hiya, Caroline,
I disagree slightly with Ove (sorry, Ove!). If the pages are totally identical other than the image, then they probably will be deemed duplicates & only one will show in SERPs. You have an opportunity to make the pages different & therefore rank for different keywords.
The simple way to do this is to use different content on each of the pages.
You probably have a lot of info associated with these images in your database. Automatically using different bits of that on each page could double the keywords you target & therefore increase your traffic.
Example: say that these images are related to articles. You could therefore use:
Different title tag for each page (e.g. 'Small Photo - Caroline Whyatt - 450 x 600 pixel photograph' vs 'Caroline Whyatt Large Photo - 4500 x 6000 jpg Image', each of which targets different keywords)
Different h1 tags (similar to above)
Different article snippet for each page (e.g. the opening paragraph of your article for one vs an article summary for the larger image)
Different image alt attributes
A few extra ways to get extra page views from your images:
Small image page: main call to action 'view large image', secondary 'read the full article'
Large image page: main call to action 'read the full article'
Article page: 'view large image'
New page: Gallery for each article, containing thumbnails of all images. Link to this both from the article & from all associated image pages.
Tagging: tag all your images & put tag lists/clouds on your image pages. Have automated gallery pages for each tag, e.g. the tag 'George Clooney' would link to a gallery page of all thumbnails tagged 'George Clooney'. Clicking a thumbnail would load the small image page, which would lead the user to the article or the large image page, etc.
Hope this is useful!
dan
--
http://www.barker.dj
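As a rough illustration of the suggestion above (a different title tag, h1, snippet and alt text for each image page, built automatically from data you already hold), here is a minimal Python sketch. The `ImageVariant` fields, the wording of the generated strings and the image URL are all assumptions made up for the example, not anything taken from the site in question.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class ImageVariant:
    """One image page for an article (all field names are hypothetical)."""
    subject: str      # who or what is pictured
    size_label: str   # e.g. "Small" or "Large"
    width: int
    height: int
    snippet: str      # opening paragraph for one page, summary for the other


def page_metadata(variant: ImageVariant) -> dict:
    """Build a distinct title, h1, alt text and snippet for one image page."""
    dims = f"{variant.width} x {variant.height}"
    return {
        "title": f"{variant.subject} {variant.size_label} Photo - {dims} pixels",
        "h1": f"{variant.size_label} photo of {variant.subject}",
        "alt": f"{variant.subject}, {variant.size_label.lower()} image ({dims})",
        "snippet": variant.snippet,
    }


def render_page_parts(meta: dict, image_url: str) -> str:
    """Render only the parts of the page that differ between variants."""
    title = escape(meta["title"])
    h1 = escape(meta["h1"])
    alt = escape(meta["alt"], quote=True)
    src = escape(image_url, quote=True)
    snippet = escape(meta["snippet"])
    return (
        f"<title>{title}</title>\n"
        f"<h1>{h1}</h1>\n"
        f'<img src="{src}" alt="{alt}">\n'
        f"<p>{snippet}</p>"
    )


small = ImageVariant("Caroline Whyatt", "Small", 450, 600, "Opening paragraph.")
large = ImageVariant("Caroline Whyatt", "Large", 4500, 6000, "Article summary.")

for variant, url in ((small, "/images/small.jpg"), (large, "/images/large.jpg")):
    print(render_page_parts(page_metadata(variant), url))
    print()
```

The point of the sketch is only that the varying text can come from one function fed with per-image data, so the two pages never end up with identical titles, headings or alt text.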
Web Production Coordinator at Gaslight Media
12 February 2009 15:52pm
Not a problem at all! I completely agree with the advice you're giving. It guarantees that the issue is taken care of, and it further optimizes the site.
Head of Digital Experience at Royal Mail group
15 February 2009 06:37am
Thanks for your advice. I have also been advised to put the duplicate HTML pages in a subdirectory and exclude that folder in robots.txt. But I am not sure whether the additional images will then get indexed?
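To make the subdirectory idea concrete, here is a small Python sketch that checks which URLs a robots.txt rule would block, using the standard library's `urllib.robotparser`. The folder names `/image-pages/` and `/images/` and the file names are hypothetical. The check only shows what a well-behaved crawler is allowed to fetch; it does not by itself say whether Google would discover or index an image that is only linked from blocked pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: the duplicate HTML pages live under /image-pages/,
# while the image files themselves are served from /images/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /image-pages/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, path: str,
                 user_agent: str = "Googlebot") -> bool:
    """Return True if the rules allow this user agent to fetch the path."""
    return parser.can_fetch(user_agent, path)


parser = build_parser(ROBOTS_TXT)
for path in (
    "/image-pages/article-1-large.html",  # duplicate HTML page
    "/images/article-1-large.jpg",        # the image file itself
    "/articles/article-1.html",           # the original article
):
    print(path, is_crawlable(parser, path))
```

Under these assumptions the rule blocks the HTML page in the excluded folder but not the image file, since the image sits outside that folder.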
Founder, Executive Director at Online Marketing Summit
22 February 2009 18:35pm
Caroline,
Just joined here, so a little late in the conversation... but I would add that DJ is pretty dead on with the duplicate content issues. On the additional images, I think the real question is: do you care if they get indexed? What's the main objective? I have access to a team of SEOs that I can get to answer some of these questions in detail if you need.
Good luck.
Aaron
SEO Analyst and Content Writer at Kaushalam Digital Pvt Ltd
16 July 2011 12:49pm
The best way to get around duplicate content is to use rel='canonical'. Here's a video and a brief explanation of how to use canonicalization: http://www.google.com/support/webmasters/bin/answer.py?answer=139394
Hope this helps.
Krinal
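For anyone wondering what the rel='canonical' suggestion looks like in practice, here is a minimal Python sketch that builds the tag for the head of each image page. The example.com URLs and the choice of the article page as the preferred version are assumptions for illustration only.

```python
from html import escape


def canonical_link(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element for a page's <head>."""
    href = escape(canonical_url, quote=True)
    return f'<link rel="canonical" href="{href}">'


# Hypothetical mapping: each image page names the article as the preferred URL.
ARTICLE_URL = "http://www.example.com/articles/article-1"
IMAGE_PAGES = [
    "http://www.example.com/articles/article-1/image-1",
    "http://www.example.com/articles/article-1/image-2",
]

for page_url in IMAGE_PAGES:
    print(page_url, "->", canonical_link(ARTICLE_URL))
```

Note the trade-off: pointing every image page at the article asks search engines to treat the article as the one version to show, so this fits the case where you do not need the image pages to rank separately. If you do want them to rank for their own keywords, varying the content on each page, as suggested earlier in the thread, is the alternative.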