Duplicate content is not a new topic. We all know that it is not good for your site.
SEOmoz has categorized this problem into “issue” and “penalty” here. An “issue” arises when Google or other search engines don’t know how to index and rank the same piece of content on different domains. Rand Fishkin summarized how to tell whether you have an issue or are being penalized:
Penalties require a good bit of abuse to go into effect, but I’ve seen it happen, even on domains from respectable brands. The penalties really arise when you start copying hundreds or thousands of pages from other domains and don’t have a considerable amount of unique content of your own.
Recently, CN Reviews experienced one of these duplicate content problems, somewhere between an issue and a penalty. I want to share what happened in the hope that it helps you maintain a healthy blog.
Symptoms of sickness:
- CNReviews couldn’t rank first on Google for its own title tags. In fact, an RSS feed aggregator site called virtualreview.org ranked for all our title tags during that period.
- The rankings of some keywords dropped dramatically. For example, our ranking for “airport” fell sharply between May 16 and Jun 8.
Troubleshooting process:
- I started by “blaming” a plugin we had installed recently to create “sticky posts”, so I deactivated it.
- I signed up for Google Webmaster Tools and looked at CNReviews through the eyes of Google (bot). It is a two-step process: sign up here and upload a verification file to your site’s root directory. Soon I found a few hundred page URLs ending with “?wpcf7=json”. For example, we had a page at cnreviews.com?wpcf7=json which was exactly the same as the cnreviews homepage. According to WebTalk, this is a problem created by a WordPress plugin called “Contact Form 7”, which we have had installed since the blog launched.
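To make the check above repeatable, here is a minimal Python sketch that takes a list of URLs (for example, copied out of a Webmaster Tools report) and groups the ones that differ only by the “wpcf7” query parameter. The function names and the sample post URL are my own illustration, not anything from the plugin or from Google; only the “?wpcf7=json” suffix comes from what we saw.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_param(url, param="wpcf7"):
    """Return the URL with the given query parameter removed."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != param]
    # Treat "cnreviews.com" and "cnreviews.com/" as the same page.
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


def find_duplicates(urls, param="wpcf7"):
    """Group URLs that collapse to the same address once `param` is dropped."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_param(url, param)].append(url)
    # Only groups with more than one member are duplicate candidates.
    return {canonical: found
            for canonical, found in groups.items()
            if len(found) > 1}


if __name__ == "__main__":
    indexed = [
        "http://cnreviews.com/",
        "http://cnreviews.com?wpcf7=json",
        "http://cnreviews.com/some-post/",  # hypothetical post URL
    ]
    for canonical, found in find_duplicates(indexed).items():
        print(canonical, "<-", found)
```

This only compares addresses, not page bodies, so it flags candidates; you still need to open a couple of them to confirm the content really is identical.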
Solutions:
- I deactivated the Contact Form 7 plugin.
- I used the “Disallow” directive in robots.txt to block Googlebot from crawling the pages whose URLs contain “?wpcf7=json”. It is very easy to generate this robots.txt file once you get into Google Webmaster Tools and follow the instructions.
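For readers who want to sanity-check a rule before uploading it, here is a rough Python sketch. It assumes a wildcard rule of the form “Disallow: /*?wpcf7=json” (my guess at a suitable rule, not necessarily the exact line Webmaster Tools generated for us) and imitates the “*” and trailing “$” matching that Google documents for robots.txt. The helper names and the sample post URL are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed robots.txt entry (illustrative, not copied from our real file):
#   User-agent: *
#   Disallow: /*?wpcf7=json
RULE = "/*?wpcf7=json"


def rule_to_regex(rule):
    """Translate a Google-style robots.txt path rule into a regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Everything else is matched literally, starting at the first "/".
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    pattern = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_blocked(url, rule=RULE):
    """Return True if the URL's path plus query string matches the rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, like robots.txt prefix matching.
    return rule_to_regex(rule).match(target) is not None


if __name__ == "__main__":
    for url in [
        "http://cnreviews.com/",
        "http://cnreviews.com?wpcf7=json",
        "http://cnreviews.com/some-post/?wpcf7=json",  # hypothetical post URL
    ]:
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Note that wildcards are an extension honored by Google and some other crawlers, not part of the original robots.txt convention, so other bots may ignore a rule like this.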
So far, I think we have solved the problem, as searches for “airport” are going up again. But why does Google, such an intelligent search engine, index pages like this, when “?wpcf7=json” is only used by the plugin in its AJAX submission (POST) process? And why didn’t this issue surface as a problem earlier? I don’t know the answers from a technical standpoint, but the problem became visible after we got the traffic spike from our Sichuan Earthquake Donation Guide.
Lessons learned:
- Do more research on plugins before installing them.
- Monitor your metrics, especially when you have a spike in traffic; a larger data set tells you more stories. If you find something unusual, run some sample queries to see whether your rankings for past top keywords have dropped.
- Sign up for Google Webmaster Tools and check whether Google has indexed any duplicate content from your site.

The symptoms of sickness you have reported have nothing to do with the duplicate content issue (at least in this case). As a matter of fact, during the period you mentioned Google changed its algorithm and this affected the whole web. During that period, for instance, I lost around 300 visitors a day while my webpages were ranked low in the SERP (search engine results page). The same went for my keywords, which I could not find anywhere. At the same time “unknown” websites started appearing from nowhere high in the Google SERP. Now the situation is back to normal. It is said that Google does this to promote websites which use AdWords (if you pay, it is right to be listed higher). Other people instead say that this algorithm change prevents spam sites from being listed high in the SERP.
If you want to know more, here is an article to read:
http://www.webtlk.com/2008/06/12/google-dance-google-everflux/
greets and thanks for your backlink
Frank – webmaster of webtalk
@Frank, agreed that the Google Dance (or as you more accurately put it, the Google Everflux, great term BTW) could have affected us. But after removing the offending duplicate content our traffic has recovered, and we are now winning on exact matches of our blog title tags, which was not the case when Google had also indexed the “?wpcf7=json” versions of all of our content. Bottom line: Do NOT use Web Contact Form 7, and be very afraid of plug-ins that are bad for SEO. Let me say again: Web Contact Form 7 is BAD for SEO. Do not use!
What have you used to replace WCF7?