Duplicate content is everywhere online. With so much information on the web, it is sometimes inevitable that the same piece of information appears in more than one place. Likewise, it is common to build your content upon ideas and perspectives gathered from other online sources. In fact, 25% – 30% of the web consists of duplicate content. Having duplicate content will not doom your site. However, it can certainly affect your SEO negatively.
Before we go into detail on why duplicate content is bad for SEO, let’s look at how Google defines duplicate content.
What is duplicate content?
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin…”
Google is clear about its treatment of duplicate content: it does not penalize it. However, if the duplication is deemed manipulative or classified as spam, then it’s a different story. Even without a penalty, duplicate content does your SEO no good. Here are a few reasons why.
Why is it bad for SEO?
1) Duplicate content = Low value = Bad rankings
The first reason is straightforward. Google values content that is unique, informative and useful. From Google’s perspective, duplicate content provides little added value, so although there is no penalty, it is unlikely to rank well. Furthermore, if a page has a significant amount of repeated content, it may be classified as low-quality and demoted by Google’s Panda quality algorithm.
2) Confused Google = No-no for your visibility
Imagine you are Google. Would you want to show your users search results that contain multiple versions of the same content? Obviously not. To provide the best search experience, Google tries to avoid ranking several pages with repeated content. So, if there is duplicate content on your site, Google cannot tell which version you want ranked and is forced to pick one of them as the best result. When that happens, the visibility of every version suffers, and your web traffic drops.
3) Dilutes link equity (also colloquially called link juice)
Link equity is the ranking power that a page gets from inbound links (when another site links to it). When there are duplicates of the same content, not all the inbound links will be pointing to the same version. As the inbound links are distributed among different versions, link equity will be diluted. This will affect search rankings negatively as inbound links are a ranking factor considered by Google.
4) Takes up Google crawler resources
Google sends its search engine bots to scan (in SEO we call this ‘crawl’) your site periodically. How often they return depends on how often you publish new content. When crawler resources are spent discovering duplicate content and including it in Google’s database (we call this ‘indexing’), new content is not crawled. This harms your SEO because it delays the indexing of your new content. Delayed indexing means delayed listing on search engine results pages (SERPs). Without a listing on SERPs, you won’t get any search traffic.
Types of duplicate content on your own site
Most of the time, duplicates on your own site are created unintentionally. This can happen when people are unaware, careless or inconsistent while producing content. Here are two common types of duplicate content that you might already be guilty of!
1) Variations in URL structure
Such duplicate content is created when different versions of a page are available in different URL structures. Let me use an example to paint a clearer picture.
Same page, same content, various versions with different URL structures:
To visitors, these URLs are the same. Visitors expect to see the same content across all versions of the URL. However, to Google’s search engine bots, these are different URLs. When they discover the same content on these different URLs, they classify it as duplicate content.
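To make this concrete, here is a minimal Python sketch of how URL variations can be grouped under one canonical form. The example.com URLs, the `normalize` and `group_variants` helpers, and the normalization rules (prefer https, drop "www.", treat a trailing slash and /index.html as the same page) are all assumptions for illustration. This is not how Google itself processes URLs.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Collapse common URL variations into one hypothetical canonical form."""
    parts = urlsplit(url)
    # Host names are case-insensitive; assume the non-www host is preferred.
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Assume /index.html and a trailing slash point at the same page.
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    path = path.rstrip("/") or "/"
    # Assume https is the preferred scheme; keep the query, drop the fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


def group_variants(urls):
    """Map each canonical URL to the list of variants that collapse into it."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return dict(groups)


# Hypothetical URLs: the first four are variations of one page.
urls = [
    "http://example.com/shirts",
    "https://example.com/shirts/",
    "https://www.example.com/shirts",
    "https://WWW.Example.com/shirts/index.html",
    "https://example.com/shorts",
]
groups = group_variants(urls)
for canonical, variants in groups.items():
    print(canonical, "<-", len(variants), "variant(s)")
```

Any canonical URL with more than one variant in the result is a candidate for duplicate content caused by URL structure. The rules you actually need depend on how your own site is set up.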
2) Boilerplate content
Boilerplate content is content that can be reused in different contexts with little or no alteration. Don’t worry if the term is unfamiliar. Here’s an example to demonstrate what boilerplate content is!
Let’s say you own a sportswear business and one of your dry-fit T-shirts is available in 12 colors. If you created a page for each color, it is likely that your product description, washing tips, size details and so on will be repeated across all 12 pages without much change.
That repetition is boilerplate content, which counts as duplicate content.
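If you want a rough feel for how much two pages overlap, a short Python sketch like the one below can help. It compares two pages word by word using the standard library’s difflib. The page texts and the `similarity` helper are made up for illustration, and Google does not publish how it measures similarity, so treat the score as a rough signal only.

```python
from difflib import SequenceMatcher


def similarity(text_a: str, text_b: str) -> float:
    """Return a 0.0 to 1.0 word-level similarity score for two page texts."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


# Hypothetical product pages that differ only in the color name.
red_page = (
    "Dry-fit T-shirt in red. Lightweight fabric that wicks sweat. "
    "Machine wash cold. Sizes S to XL."
)
blue_page = (
    "Dry-fit T-shirt in blue. Lightweight fabric that wicks sweat. "
    "Machine wash cold. Sizes S to XL."
)

score = similarity(red_page, blue_page)
print(f"Similarity: {score:.2f}")
```

A score close to 1.0 means the two pages are almost entirely boilerplate. Where you draw the line is a judgment call for your own site.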
Remember, regardless of how the duplicate content on your site is created, intentionally or not, it is bad for your SEO. You can’t control duplicate content on other sites. However, you can stay aware of the possible duplicate content on your own site. Our next article will touch on some solutions you can adopt to get rid of (or at least reduce) your duplicate content, so watch this space for updates!
Suspect that your site might be ranking badly because of duplicate content? Or can’t seem to find the reason for it? We can help narrow down the problem so feel free to reach out to us!
Zachary is a Business Development Executive at Appiloque. In the after-hours, he serves as a Division Agent, taking back the city of New York when all else fails.