Google Duplicate Content Penalty, is it real?
by Daniel Snyder on Apr 18, 2011 • 7:32 am • 43 Comments
Duplicate content raises a lot of concern in the world of webmasters and bloggers, and fear of the infamous ‘duplicate content penalty’ abounds. We are constantly being warned against producing and using duplicate content, and I agree: the use of duplicate content on your blog is not something I would ever encourage. After all, your blog is supposed to be YOURS, a unique piece of the internet where you can share your personality and be yourself. I regularly run into blogs that aren’t doing much but scraping content from other websites, and when I find these blogs I leave them just as quickly as I came. I’ve even written about how these so-called ‘bloggers’ aren’t really blogging at all (you can read my rant here, when a blog is not really a blog).
In this brief article I want to discuss the issue of duplicate content and its use on the web. First, let’s take a look at what duplicate content is. In fact, there are two types of duplicate content.
- The first is duplicate content that takes place within your own website. In other words, it’s the same information appearing on more than one URL on your domain. This content is not necessarily 100% identical; in many cases it is only ‘considerably similar’.
- The second form of duplicate content can be called ‘cross-domain duplicate content’. This is content identical to what is on your website appearing elsewhere on the internet, on other websites. This can be unintentional, or perhaps caused by scrapers (or maybe you are the one doing the scraping?).
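To give a feel for what ‘considerably similar’ might mean in practice, here is a minimal sketch that compares two pieces of text by their overlapping three-word sequences (often called shingles). This is only an illustration of one well-known similarity measure; it is an assumption for demonstration purposes and not a description of how Google actually detects duplicates. The function names and the shingle size are hypothetical choices.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size`."""
    words = text.lower().split()
    if len(words) < size:
        # Text too short for a full shingle: treat it as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a, b, size=3):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are trivially identical
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "the quick brown fox jumps over the lazy dog"
    rewritten = "the quick brown fox leaps over the lazy dog"
    print(jaccard_similarity(original, rewritten))
```

A score of 1.0 means the two texts share every shingle, and 0.0 means they share none. Where you would draw the line for ‘considerably similar’ is a judgment call, not something this sketch can decide for you.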
No such thing as a “duplicate content penalty”?!?!
To bring a sense of calm to this discussion, right off the bat I’m going to share an excerpt from Google’s Webmaster Central Blog, where they discuss the issue of duplicate content. Many webmasters ask the question, “Am I being penalized for duplicate content?” The official Google answer (though somewhat vague) is, “In most cases, having duplicate content on your site does not mean your site is penalized.” In fact, Google goes on record to say, and I quote, “Let’s put this to bed once and for all, folks: There’s no such thing as a ‘duplicate content penalty.’ At least, not in the way most people mean when they say that.”
Sources: Google Webmaster Blog, deftly dealing with duplicate content. Google Webmaster Blog, duplicate content due to scrapers.
Google also mentions that we should not worry about excerpts or snippets (when you’re quoting someone or another source) being treated as duplicate content. The Google algorithms are much more complex and smarter than we’d all like to believe. As people interested in SEO, our egos lead us to believe that we can outsmart Google, or that sometimes Google is penalizing us for things we didn’t even do. This way of looking at SEO is inherently wrong, since it fails to take responsibility for the things we CAN control, and that is all of the content on our own website. There was a recent debate about the Google honeymoon period, in which webmasters who see their keyword rankings suddenly drop blame it on a Google algorithm called the Google honeymoon. Blaming a drop in YOUR rankings on Google is ignorant, and a cop-out. But to stay on course and not go off on this tangent, you can read more here about my theories on the… google honeymoon period.
In essence, Google does not penalize a website for duplicate content. They are essentially saying: we aren’t going to drop you in the SERPs; what we are going to do is filter out the duplicate content so it doesn’t appear in the SERPs at all.
There are websites that have stolen my content, am I affected by a duplicate content penalty?
The biggest fear I think most bloggers have is that their original articles are being stolen and posted on other blogs, and that this is affecting their blog’s ranking. Not only that, but having their content stolen leaves them with a sense of being violated and somewhat powerless to do anything about it. Have no fear, Google is here! Google is smart, and they boldly claim that they are able to identify the original article. Google promises “you shouldn’t be very concerned about seeing negative effects on your site’s presence on Google if you notice someone scraping your content.”
Summarizing the duplicate content penalty
Based on the articles I’ve read from the Google Webmaster Blog, and various other webmasters’ SEO experiences with duplicate content, I would still want to encourage everyone to produce only original content. If you are a website owner building niche or affiliate sites and looking for content, at the very least you should be manually rewriting those articles. Put things into your own words, add your own style, add your own experience… wait, suddenly you realize you’ve done research and written a whole new article! Good job. There is a lot more that can be discussed surrounding duplicate content, including the issue of duplicate content within your own website. Google reports the duplicate content they’ve found on your site within Webmaster Tools, so if you are concerned you should take a look there and read up on what you can do to resolve those issues. The sources cited above contain some great resources for resolving duplicate content issues.
What’s your experience or take on the Google duplicate content penalty?
43 comments
Ryan says:
Apr 18, 2011
Hi Dan,
I disagree that Google will not drop you out of the SERPs for housing duplicate content; I have seen this on many occasions. I have also seen Google get it wrong about which is the original.
For instance, if a top SEO site were to copy your article here and post it, then Google is likely to trust their content over yours, resulting in you taking the hit!
Daniel Snyder says:
Apr 19, 2011
I think dropping a site out of the SERPs completely is a different thing from filtering out duplicate content or lowering its placement. I hear what you are saying; my point is that the site is not receiving a ‘site-wide’ penalty, but rather the specific duplicate content is not showing up in results. Your second point, about a big site stealing my content, is an issue of some concern for some small website owners, but I seriously doubt its likelihood. Thanks for your feedback. Besides Google’s own crawl date, I wonder if there are any other markers for identifying who published an article first.
Gabriele Maidecchi says:
Apr 18, 2011
In most cases you’re not even aware you’re duplicating content in your own blog.
WordPress includes a list of posts at more than one URL (the main page, the search, the archives), but this doesn’t mean you’ll be penalized for not tweaking the default installation to block such duplications.
There’s probably more than a little irrational fear behind content duplication; it’s a good thing we sometimes run into posts that can clear our minds a bit.
Daniel Snyder says:
Apr 19, 2011
True, and some content that Google considers duplicate isn’t necessarily duplicate at all. Using robots.txt to block Google’s access to pages that are duplicates (such as printer-friendly pages which duplicate HTML pages) is a good idea.
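For anyone who wants to check a robots.txt rule before relying on it, here is a minimal sketch using Python’s standard `urllib.robotparser` module. The `/print/` path, the `example.com` URLs, and the `is_crawlable` helper are all hypothetical, chosen only to illustrate blocking printer-friendly pages; your own site’s paths will differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of printer-friendly pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/articles/duplicate-content",
        "https://example.com/print/duplicate-content",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Keep in mind that a Disallow rule only asks crawlers not to fetch a page. This sketch checks the rule as Python’s parser reads it, which may not match every search engine’s interpretation in every edge case.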
Sean McGinnis says:
May 17, 2011
Another example of when duplicate content can creep in (especially with corporate & enterprise sites) is when developers include a “printable” version of each page. It is something to be careful of and manage. You can exclude these pages through appropriate robots.txt rules or by nofollowing the links to those pages.
Also, be aware of canonicalization issues such as the entire site being duplicated on both the www and non-www versions of the URL.
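As a rough illustration of the www versus non-www issue, the sketch below rewrites any URL to a single preferred host form, so that both variants map to the same address. The `canonical_url` function and its `prefer_www` parameter are hypothetical names for this example; in practice this kind of normalization is usually done with a server-side redirect rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, prefer_www=True):
    """Rewrite url so its host always uses (or never uses) the www prefix.

    Note: any username/password embedded in the URL is dropped.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare = host[4:] if host.startswith("www.") else host
    new_host = "www." + bare if prefer_www else bare
    if parts.port:
        new_host = "{}:{}".format(new_host, parts.port)
    # An empty path and "/" refer to the same page, so normalize to "/".
    path = parts.path or "/"
    return urlunsplit((parts.scheme, new_host, path, parts.query, parts.fragment))


if __name__ == "__main__":
    for url in ("http://example.com/post", "http://www.example.com/post"):
        print(canonical_url(url))
```

Both example URLs come out in the same form, which is the point: one piece of content, one address.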
Daniel Snyder says:
May 17, 2011
Good point Sean! Thanks for the tip for our readers here.
Dino Dogan says:
Apr 18, 2011
Good topic… there’s a lot of confusion about this one.
So dup. content doesn’t really matter… unless (and I could be wrong about this, it’s just my understanding)
If your site is crawled once a month and then the offending site steals your content and they are crawled once a day, then Google is likely to crawl the offender first and identify your content as native to the offending site.
This might be fine since highly ranked sites are unlikely to rip off content.
I mean… Huff Post is not going to rip off infocarnivore… at least I don’t think she would lol, I could be wrong.
Daniel Snyder says:
Apr 19, 2011
Hey Dino. There will always be confusion about Google’s methods! 🙂 We love to assume we know though, don’t we? I agree about crawl rate, but I also agree it is just as unlikely. I was wondering if there may be other markers to identify when an article was first published besides a search engine’s crawl date.
Dino Dogan says:
Apr 19, 2011
Good question…lets ask Dan 🙂
Daniel Snyder says:
Apr 19, 2011
Here’s the answer straight from Matt Cutts… How can I make sure Google knows my content is original?
A. Tatum says:
Apr 18, 2011
Great back story on this one. I’m sure Google has and is doing everything to kill the marketing systems that get people to sign-up for different types of “adsense like” programs and reproduce the same content over various domains with the promise of you making big money. I saw this being advertised heavily about a year ago, but I believe the main program is dead now.
Daniel Snyder says:
Apr 19, 2011
Yeah I still see people attempting that method, what a waste of time. I’m far more interested in making money ‘legitimately’ by producing valuable content and engaging my readers.
A. Tatum Jr says:
Apr 19, 2011
Definitely! They just make SEO bad for everyone.
Mariana says:
Apr 18, 2011
Maybe it is true that Google does not penalize duplicate content. However, webmasters and web content creators should always make their best effort to provide readers with original and useful content. Google is the most important search engine, it is true, but websites are not only meant for it but also, and mainly, for other human beings who need information.
Kristi Hines says:
Apr 18, 2011
Great post to debunk the myths!
I always figured the duplicate content penalty was enforced on sites really breaking rules, like businesses who create 10 websites with the same content on different domains, all linking to each other. But sites that just might have a WordPress setup issue were not such a big deal.
And I think, when it comes to content thieves, Google can tell which site is the originator (generally the one that is the most popular) and the one that is the thief (least popular, probably ad driven).
Daniel Snyder says:
Apr 19, 2011
Hi Kristi, thank you! One special thing about duplicate content is that if a page does duplicate other content, it only affects that page; it is not a site-wide issue. So though that page may be filtered out of the SERPs, the overall site is simply not penalized. Thanks for your visit.
stacey says:
Apr 18, 2011
Hi Daniel, can I say that this is one of the few places I come to read tech stuff and don’t feel intimidated, or like I don’t get it. You make things so easily digestible, even if some of it goes totally over my head. I’m really enjoying learning the stuff that you share. I rarely say much, but I’m learning heaps. Thanks
Daniel Snyder says:
Apr 19, 2011
Wow, thanks Stacey. Tremendous compliment! I hope I continue to deliver. 🙂 Have a great day. Any topics of special interest to you that you’d like me to write about?
John Soares says:
Apr 18, 2011
Following up on what you and Kristi said, I’ve also read that Google gives credit to the site that posts the content first.
autobazar says:
Apr 18, 2011
Thanks – where is article source?
Daniel Snyder says:
Apr 19, 2011
Links are in the article.
Gina says:
Apr 19, 2011
I don’t know. I think the line between duplicate content and non-duplicate content is getting thinner with each algorithm change.
A lot of high PR sites ranked in Google for years have lost a significant amount of traffic.
Article directories are cracking down on what they consider non-quality, promotional content.
The repercussions of perceived “duplicate content”, penalty or not, have affected sites far and wide.
Google is much like the sea. You throw a rock in one end, it causes waves elsewhere. Some will not be affected. Others will feel a Tsunami.
Daniel Snyder says:
Apr 19, 2011
Hey Gina, Google is very specific that duplicate content affects individual pages rather than whole sites. In addition, I’m going to go with what they are publicly telling us, rather than being skeptical. As secretive as Google is, they are also honest in their PR. The bottom line, though, is that Google does not ever encourage duplicate content, and goes to great lengths to educate people about avoiding it. There are various types of duplicate content too, so we shouldn’t be worried about them all. I for one do not scrape, or steal content and spin it. I just don’t agree with that method, and there is a risk of tracers in the original article. Thanks for your visit, and feedback!
Suresh Khanal@Bivori says:
Apr 19, 2011
Duplicate content is obviously not desirable, but I was often confused about whether the excerpt on the home page, which presents a few paragraphs from the post page, is also considered duplicate or not.
Thanks Daniel, you’ve cleared up much of the confusion.
Daniel Snyder says:
Apr 19, 2011
Excerpts generally won’t be considered duplicate. Google is smart enough to figure this out. 🙂
Thiru says:
Apr 26, 2011
Hi Dani
It’s been a long time since I visited infocarnivore. Hope everything is good. Great blog post about the content penalty for duplication. But you confused me. Finally, what’s your conclusion on it: will we be punished or not?
Daniel Snyder says:
Apr 26, 2011
Hey Thiru! Thanks for visiting man. My synopsis is simply that it is not a penalty, but rather could cause the particular page to be simply removed from the results. It will not affect you site-wide.
anuj@webtricksblog says:
Apr 26, 2011
This is perhaps the most thoughtful, useful article of this type I’ve seen. It goes beyond a lot of the surface-level ideas I’ve seen repeated over and over in other places. Nicely said 🙂
Daniel Snyder says:
Apr 26, 2011
Thanks man! Glad this article proved thought provoking for you.
Aybi @ kalamazoo dentists says:
Apr 27, 2011
For me, duplicating another’s content is just a matter of the dignity of the blog or website owners who plan to do it. Yes, we must do everything to get noticed online, but going to the extent of copying content from other sites is nothing to be proud of. Though there’s no penalty for that, I think a clean conscience is not worth the trade.
Daniel Snyder says:
Apr 28, 2011
Hi Aybi, thank you for dropping by here! I agree with you, blogging integrity is important to our online success.
Tia Peterson says:
Apr 28, 2011
I always hate reading posts about duplicate content concerns, because they do more harm than good.
People blame ghosts for doing things when they have no other answer, and that’s what has ended up happening with the duplicate content BS. For me, it really irks me when there is a concern about syndicated content (which bizchickblogs has some of, along with sites like Social Media Today) and the “penalty.”
I don’t know. I might be a jerk for saying this, but if everyone would focus on providing great content and real marketing, instead of trying to own page one, this whole topic would be a non-issue.
Daniel Snyder says:
Apr 29, 2011
Ha! Tia, well put. It’s true that great content will take care of itself, and there is too much effort to get to page one without putting the real work in. Success online is a lengthy process, and most people are too impatient to take the time and do it right. Unfortunately those people rarely succeed. Thanks for your visit.
waterpearls says:
Apr 29, 2011
Hi Daniel,
It is an interesting post, and it is good to produce original content. Duplicating content is not a good thing for bloggers.
This is a nice post about content penalty for duplication.
Toby Aletha says:
May 3, 2011
I work for a mid sized online retailer and I write articles to the website about featured products. The company also has a blog that I work on where I mostly put unique content. However, in some instances I duplicate the article from the main website to the blog website (different domains)
The ratio of unique content on the blog to duplicate is about 8/2. Will this hurt either or both pages? Thanks.
Daniel Snyder says:
May 3, 2011
My opinion is that it will not “hurt”, but rather Google will return the results from the page it finds the most relevant (i.e. the one it deems to provide the most authoritative content), so the duplicate content will not appear in the SERPs. In your case, it sounds like there is no loss, since the reader is going to be finding the information from the same source and the blog links to the site.
Lee Spaziano says:
May 5, 2011
I work for a mid sized online retailer and I write articles to the website about featured products. The company also has a blog that I work on where I mostly put unique content. However, in some instances I duplicate the article from the main website to the blog website (different domains)
The ratio of unique content on the blog to duplicate is about 8/2. Will this hurt either or both pages? Thanks.
Daniel Snyder says:
May 6, 2011
Hey Lee. I don’t know that you could say it will hurt you. But in any case it will never affect both; most likely the originally published article will be ranked in the SERPs and the duplicated article will not rank. There will not be a penalty (according to Google); this is how they handle this. If you are concerned about it, however, one option is to keep Google away from the duplicate page altogether. When you duplicate something, add the URL to your robots.txt file so that Google does not crawl that page.
Bingo Babe Jenny says:
May 13, 2011
This is definitely something I’ve been worried about. Not so much my site being penalised, but more so about scrapers outranking me or Google not detecting my site is the original source.
Marvin@Eat To Lose Weight says:
May 25, 2011
I am a little confused. If you have two pages on the same site that are considered duplicate content, will both drop in ranking or just the newest one?
Thanks
Daniel Snyder says:
May 27, 2011
That is uncertain and would be at the discretion of Google’s algorithm. It may be the newest one, or not. But I highly doubt that it will be both!