May 6th, 2010 by Dan
Having the same content appear in more than one place online can cause serious problems, but these are easy to remedy. Duplicate content can affect a website’s search engine optimisation because search robots specifically look for text similarities, and finding them can have a negative effect on how a page is ranked.
Occasionally, unscrupulous web villains copy a site’s content, using automated software to scramble the words before posting it on their own pages. These sites are usually full of advertisements, and as search engines are not fooled by pages which have no meaningful content, the copying sites will be marked down in search results or possibly banned.
Sometimes, duplicate content appears word for word on another page. This could occur within one’s own site, on another website where permission has been given to use the content, or it may have been stolen. In all of these instances, search robots will recognise that multiple pieces of the same content exist, but they will not necessarily know which is the original. They will therefore ‘throw away’ all but one for their search results, and the selected page could be any of the options. Fortunately, it is possible to mark the original page with what is called a ‘canonical’ tag which is understood by search robots.
When content is used on another site with permission, search robots recognise a link back to the original page as an indication that this is where the content came from. It is also helpful to add a ‘NoIndex’ tag (a robots meta tag containing ‘noindex’) to the page on the second website. This tells search robots not to index the copy at all, which leaves the original page as the one that appears in search results.
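As a rough illustration of the two tags described above, the following Python sketch checks a page for a canonical link and a robots ‘noindex’ directive. The example.com address, the sample pages and the names HeadTagScanner and scan are made up for this example, and real search robots will apply their own, more elaborate rules.

```python
from html.parser import HTMLParser


class HeadTagScanner(HTMLParser):
    """Records the canonical URL and any robots 'noindex' directive on a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None  # href of <link rel="canonical">, if present
        self.noindex = False   # True if <meta name="robots"> lists "noindex"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            rel_values = (attrs.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = (attrs.get("content") or "").lower()
            directives = [d.strip() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True


def scan(html):
    """Return (canonical_url, noindex_flag) for a piece of HTML."""
    scanner = HeadTagScanner()
    scanner.feed(html)
    return scanner.canonical, scanner.noindex


# Hypothetical pages: the original marks itself as canonical,
# while the copy on the second site asks not to be indexed.
original = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/article">'
    '</head></html>'
)
syndicated = (
    '<html><head>'
    '<meta name="robots" content="noindex, follow">'
    '</head></html>'
)

print(scan(original))
print(scan(syndicated))
```

Running a check like this over one’s own pages is a simple way to confirm that the original carries the canonical tag and that permitted copies carry ‘noindex’.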
Duplicate content within one’s own site, perhaps due to the way a content management system has been set up, can also be a problem. Any time search robots spend crawling duplicate pages is time not spent indexing the rest of the site. Important content may be missed altogether, and the pages chosen and placed in search results may not be the ones which are best for visitors. A plain page optimised for printing might be selected by the search engines, rather than a more attractive version with images and design more conducive to sales.
It is possible to prevent search robots from crawling certain pages at all, simply by adding a ‘Disallow’ rule to the site’s ‘robots.txt’ file. This is one solution for duplicate content on one’s own site. Multiple similar pages might not be necessary in the first place, though, and it is worth considering whether any of them could be discarded.
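To show how such a rule is interpreted, here is a small sketch using Python’s standard urllib.robotparser module. The /print/ path and the example.com addresses are assumptions chosen for illustration, standing in for wherever a site keeps its duplicate printer-friendly pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block all robots from the
# printer-friendly copies kept under /print/ (an assumed path).
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url, agent="*"):
    """Return True if the rules above allow the given robot to fetch the URL."""
    return parser.can_fetch(agent, url)


print(may_crawl("https://example.com/print/article"))
print(may_crawl("https://example.com/article"))
```

Testing rules in this way before publishing them helps to avoid blocking pages that should remain visible to search robots.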
Content that merely resembles another piece of text can also be picked up by search engines, as they are able to compare small sections, or ‘shingles’, of content. If too large a proportion of the shingles on a page match those of another page, this will be seen as duplicate content. Search engines do not judge occasional quotes negatively, though, if the content as a whole appears unique.
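The idea of comparing shingles can be sketched in a few lines of Python. This is only a simplified illustration: the three-word shingle size, the function names and the sample sentences are assumptions, and the methods and thresholds that search engines actually use are not public.

```python
def shingles(text, size=3):
    """Return the set of overlapping word sequences ('shingles') in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def shared_fraction(page, other, size=3):
    """Fraction of the shingles on `page` that also appear in `other`."""
    page_shingles = shingles(page, size)
    if not page_shingles:
        return 0.0  # text shorter than one shingle: nothing to compare
    other_shingles = shingles(other, size)
    return len(page_shingles & other_shingles) / len(page_shingles)


# Made-up sample text for illustration only.
page = "the quick brown fox jumps"
copy = "the quick brown fox jumps"
similar = "a quick brown fox sleeps"
unrelated = "completely different words appear here"

print(shared_fraction(page, copy))
print(shared_fraction(page, similar))
print(shared_fraction(page, unrelated))
```

A page that shares only a small fraction of its shingles with another, as with an occasional quote, scores low in this measure, while a word-for-word copy scores the maximum.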
At SearchEngineOptimization.co.uk, we take care always to use unique content, as this gives the best results for SEO.