The Brave Programmer - Blogging and coding
Not for the faint hearted
 


Google Does Not Penalise Duplicate Content.

2009/09/29 09:41 AM

"Duplicate content" is a topic that has been debated for a number of years now. The idea of a penalty for duplicate content has plagued SEO for just as long. The question being asked is: “Does Google, or any search engine, penalise duplicate content?”

 

Quite frankly, there is no such thing as a search engine penalty for duplicate content. At least not in the way that you think there is, or in the way that is being promoted around the web today.
 
There is no denying the fact that there is a significant problem with duplicate content, what with the huge problem of scraped sites and plagiarism. But that in and of itself might not be as bad as you think. Read my article on Benefiting from scraped content.
 

Legitimate Duplicate Content.

Some, or even a lot, of duplicate content is legitimate. The problem is that search engines by themselves have no way of determining the legitimacy of duplicate content. They have no directive or right to make a blanket decision on duplicate content.
 
In fact, scraped sites only account for a small part of duplicate content. Although plagiarism and scraped content are a problem, they do not make up the majority of duplicate content. Most duplicate content is legit.
 
The search engines will only penalize you for intentionally engaging in deceptive practices and trying to manipulate search engine results, and then only if you are caught.
 

Types of Duplicate Content

So then what are some of the types of duplicate content found on the web today? Here is a list of a few of the important ones.
 
1. URLs – Websites and web pages can be accessed via different URLs. Consider the following:
  • website.com/
  • website.com/?
  • website.com/index.html
  • website.com/Home.aspx
  • www.website.com/
  • www.website.com/?
  • www.website.com/index.html
  • www.website.com/Home.aspx
All of the above can and do point to the same page. This is a prime example of duplicate content and probably accounts for the majority of it. Google’s algorithm will recognize that they’re the same and will try to pick the right one, but it may not be the one the webmaster or site owner prefers.
2. RSS – Many sites and blogs syndicate their content. If you do this, then there is a great chance that your content will be syndicated to another site. RSS is a great way of distributing content. Although the result is considered duplicate content, it is not penalized by Google.
3. Article Marketing – There are many article marketing sites or ezines. These sites are set up for the distinct purpose of marketing and publishing valuable articles to the masses. 99% of the time, consent is given to reproduce the article, or the owner reproduces the article themselves.
4. Aggregators – Many sites use article aggregation as a form of promotion. Sites like Digg, Stumbleupon, Afrigator and news aggregators will often duplicate all or a portion of the original content. This is not malicious, but just a product of the type of service being offered.
5. E-commerce Product Pages – Many product listings and ecommerce sites have similar or identical product descriptions littered all over the internet. I mean, how many different descriptions can you have for product SKU123, Blue Ballpoint Pen, when there are literally thousands exactly the same out there?
6. Multiple Versions – Many sites offer multiple versions of the same article. One format that comes to mind is offering an article for print. This article will have a different format than that displayed on the web.
7. Quotations – Many websites, forums and newsgroups can and do quote all or part of an article for legitimate purposes.
8. Paid for or consented articles – Many people and webmasters actually pay for or buy content. Sometimes this content is sold to multiple sites, thereby creating duplicate content.
9. Plagiarised – This is obviously where most of the problem with duplicate content arises. Your content is stolen and used on other sites without your permission or any credit given to you as the original author or owner. Often the author is unaware of this, so it would be unfair to penalize the author. Google has stated that it does not and cannot penalize the original author.
 
This is by no means an exhaustive list, but of the types mentioned above, probably only point 9, plagiarism, is one where duplicate content will be penalized. Even that is not certain. Google and other search engines have only a few ways of telling what is original and what is a duplicate. One way is by date of publication, and even that is not 100% accurate.
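
To make point 1 of the list concrete, here is a minimal Python sketch of how the URL variants shown there could be collapsed into one preferred form. It is only an illustration, not how Google does it. The choice of the non-www host as the preferred one, the `DEFAULT_DOCUMENTS` set and the `normalize_url` name are all my own assumptions for a hypothetical site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: on this hypothetical site these file names all serve the
# same page as the directory they live in.
DEFAULT_DOCUMENTS = {"index.html", "home.aspx"}


def normalize_url(url: str) -> str:
    """Collapse common variants of a URL into one preferred form."""
    # Add a scheme if it is missing so that urlsplit can find the host.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)

    # Assumption: the non-www host is the preferred one.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat a default document as the directory that contains it.
    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail.lower() in DEFAULT_DOCUMENTS:
        path = head + "/"

    # A bare "?" yields an empty query, which urlunsplit leaves out.
    # The fragment is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "website.com/", "website.com/?",
    "website.com/index.html", "website.com/Home.aspx",
    "www.website.com/", "www.website.com/?",
    "www.website.com/index.html", "www.website.com/Home.aspx",
]
print({normalize_url(u) for u in variants})  # prints {'http://website.com/'}
```

All eight addresses end up as the same string, which is the kind of grouping a search engine has to do on its own when the site does not tell it which address is preferred.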
 

The Job of the Search Engine.

The search engines want to index and show to their users (the searchers) as much unique content as algorithmically possible. That’s their job, and they do it quite well considering what they have to work with.
 
There’s no doubt that duplicate content is a problem for search engines. But there is also no doubt that, contrary to popular belief, search engines, and Google in particular, are not out to get you.
 
Their job is to index sites, and not to be the duplicate content police.
 
Search engine penalties are reserved for pages and sites that are purposely attempting to trick the search engines in one form or another. The penalties perceived by users are generally of their own doing. By diluting the importance and uniqueness of a page or article, they only succeed in diluting their own results in the search engines.
 
From Google's official blog: “Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.”
 
In most cases Google does a good job of isolating duplicate content. Their ultimate goal is to present the best possible content to their users by filtering or grouping duplicated or similar content.
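
How exactly Google spots duplicated or similar content is not something I can tell you. As a rough illustration of the general idea only, here is a small Python sketch of a well-known textbook technique: break each text into overlapping runs of words (shingles) and compare the sets with the Jaccard index. The function names and the shingle size of 3 are my own choices, and nothing here is claimed to be Google's method.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word runs of the given size."""
    words = text.lower().split()
    if len(words) < size:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 3) -> float:
    """Similarity between 0.0 (nothing shared) and 1.0 (same shingles)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


original = "blue ballpoint pen with cap"
print(jaccard(original, original))                   # prints 1.0
print(jaccard(original, "red fountain pen no cap"))  # prints 0.0
```

Pages scoring close to 1.0 against each other could be grouped together so that only one of them is shown. Where to draw that line is a judgment call that this sketch does not try to make.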
 
Google does not care if you have even 60% duplicate content on your website. As long as each of your sites provides some unique content, and so offers enough unique value to the end user, Google will not penalize you at all.
 
The point is, if you want to be featured on Google, you need to create unique and valuable content. If there is duplication then you need to add extra value to that content.
 
If your content is being duplicated, trust Google to sort that out. If you can’t then it’s up to you to take the necessary action required.
 

Gainsayers.

Yet for all this there are still people and webmasters who don’t believe that Google is doing a good job. Some are just your average moaners, while others have legitimate complaints.
 
Yes, there have been cases, sometimes many, where a scraped site or an article site ranks higher than the original. Remember, neither Google nor their algorithm is perfect.
 
The bottom line is that the engines are actively seeking out lousy content and removing it from their main results. This is not a perfect science.
 
Why then do some article sites and scraper sites rank higher than the original article? Well, the short answer is I don’t know. But the fact of the matter is that they do. This point is better left for another article and discussion.
 

Conclusion.

Google does not penalise duplicate content in the way we usually think of a penalty. They are in the business of indexing sites and presenting the best possible search results. Google has put out webmaster guidelines concerning duplicate content. You would do well to read them.
There are a number of ways to deal with duplicate content.
  1. Create Unique Content – authoritative, informative and useful
  2. Utilise a separate database for specific web properties – avoid populating the same content from the same database on different sites
  3. Canonicalization – informing Google and other engines of your preferred URL using canonical tags
  4. Using Robots.txt to block content from the Index – guide the engines to the content that you want treated as the authority
  5. Use 301 (permanent) Redirection
  6. Consistent Interlinking
  7. Set a preferred domain in Google Webmaster Tools
  8. Utilize a search engine friendly CMS
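
Points 3, 4 and 5 of this list can be sketched in a few lines of Python. Treat it as a sketch under assumptions and not a recipe: the preferred URL, the `/print/` folder and all three function names are made up for the example, and on a real site the redirect would normally be configured in the web server or CMS.

```python
from html import escape
from urllib.robotparser import RobotFileParser


def canonical_link(preferred_url: str) -> str:
    """Point 3: the canonical tag to place in the <head> of each variant."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'


def redirect_response(requested_url: str, preferred_url: str):
    """Point 5: answer a duplicate URL with a 301 to the preferred one.

    Returns a (status, headers) pair; 200 means serve the page as usual.
    """
    if requested_url == preferred_url:
        return 200, {}
    return 301, {"Location": preferred_url}


def blocked_by_robots(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Point 4: check whether a robots.txt rule keeps crawlers off a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


# Hypothetical rule that hides the print versions of articles.
robots = "User-agent: *\nDisallow: /print/\n"

print(canonical_link("http://website.com/"))
print(redirect_response("http://www.website.com/index.html", "http://website.com/"))
print(blocked_by_robots(robots, "http://website.com/print/article.html"))
```

The canonical tag and the 301 both tell the engine which address you prefer. The tag leaves the duplicate reachable, while the redirect sends every visitor to the preferred address.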
This is such a vast topic it is best for me to offer further reading on the subject. Check out the Google Official blog  posts:

What are your thoughts and experiences on duplicate content? Leave your thoughts in the comments below.

New here, or perhaps you've been here a few times? Like this post? Why not subscribe to this blog and get the most up to date posts as soon as they are published.


7 comment(s) so far...



Re: Google Does Not Penalise Duplicate Content.

Very interesting and enlightening article, Robert.
I've gained some insight from reading this.
Thanks for sharing!

By Jimi Jones on  2009/09/29 01:15 PM

Re: Google Does Not Penalise Duplicate Content.

Thanks for the information Robert regarding duplicate content.

By AntonRSA on  2009/09/29 04:38 PM

Re: Google Does Not Penalise Duplicate Content.

@Jimi,
Thanks for the flattering comments. Much appreciated. If it helped you, then all is well, and my work is done. Seriously, if one person gains from my articles, I am happy.

By Robert Bravery on  2009/09/29 04:38 PM

Re: Google Does Not Penalise Duplicate Content.

It's a Pleasure Anton

By Robert Bravery on  2009/09/29 04:40 PM

Re: Google Does Not Penalise Duplicate Content.

That's blown away some misconceptions - I think I'll go back to throwing chicken giblets at the wall and interpreting the patterns!

By Kevin Tea on  2009/09/29 05:03 PM

Re: Google Does Not Penalise Duplicate Content.

@Kevin,

LOL, ROTHFLMHO
Yeah, with Google and SEO and PageRank and Search Results, you probably find more consistency with the giblets.

By Robert Bravery on  2009/09/29 05:25 PM

Re: Google Does Not Penalise Duplicate Content.

Good article. As I am still new to blogging, I found this article very educational.
Wishing you a scent-sational day!

By Patty Reiser on  2009/09/29 06:24 PM
 