Duplicate Content, Why Once is Enough
By [[User:Suzi Ziegler|Suzi Ziegler]] on 11 February 2011
In business, as at a dinner party, repeating yourself isn't a good thing. No one wants to sit through the same college-antics story for the tenth time. Search engines, like dinner guests, have limited patience for repetition.
Duplicate content mirrors other content exactly or substantially, whether it appears on the same domain or on another one. Most of the time, duplicate content is not meant to deceive viewers. Even so, it can diminish a visitor's experience and dilute the authenticity of a site.
In some cases, content is deliberately duplicated across domains to (heaven forbid) manipulate search engine rankings or garner more traffic. Deceptive practices like these make for a poor user experience and can turn off potential customers or clients. When a visitor sees the same content repeated within a set of search results, they are likely to leave the site and never return, an escape you don't get when a good friend repeats the same story constantly.
If your site contains multiple pages with largely identical content, there are a number of ways you can communicate your preferred URL to Google. This is called canonicalization.
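The most direct of those ways is the rel="canonical" link element discussed again near the end of this article. As a minimal sketch, using the same placeholder example.com addresses this article uses elsewhere, you would place one line in the <head> of each duplicate page:

  <!-- On a duplicate such as http://www.example.com/page/index.htm,
       point search engines at the preferred version of the page -->
  <link rel="canonical" href="http://www.example.com/page" />

Google treats the canonical element as a strong hint rather than a command, but in practice it lets the duplicates be consolidated under the one URL you name.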
Google does a thorough job of trying to index and show pages with distinct information. For example, if your site has a printer-friendly version and a regular version of a page, and neither is blocked with a noindex meta tag, Google will simply choose one of them to list. If Google perceives duplicate content with intent to manipulate rankings and deceive users, they'll make handslapping adjustments to the indexing and ranking of the site(s) involved. As a result, the ranking of one or both sites may suffer. Even worse, a site may be removed from Google's index entirely, in which case it will not appear in search results -- at all!
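The noindex meta tag mentioned here (and again in the syndication and placeholder tips below) is itself a single line of HTML in a page's <head>. A minimal sketch:

  <!-- Ask search engine crawlers to leave this page out of their index -->
  <meta name="robots" content="noindex">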
Things you can do to prevent duplicate content on your site
Building a website takes time, energy, and financial resources. When someone visits your site, you want them to find it easy to use, helpful, and informative. Ultimately, you'd like a visitor to leave your site impressed and informed by your content. Showing duplicate content on multiple pages diminishes the user's experience. There are specific things you can do when developing a site to avoid creating duplicate content.
- Use top-level domains ~ To help Google serve the most appropriate version of a document, use country-specific top-level domains whenever possible. Google is more likely to know that http://www.example.de contains Germany-focused content, for instance, than http://www.example.com/de or http://de.example.com.
- Use 301 redirects ~ If you've restructured your site, use 301 (permanent) redirects in your .htaccess file to smartly send users, Googlebot, and other spiders from your old URLs to the new ones; see the sketch after this list.
- Be consistent ~ Try to keep your internal linking consistent. For example, don't link to http://www.example.com/page/, http://www.example.com/page, and http://www.example.com/page/index.htm interchangeably.
- Syndicate carefully ~ If you syndicate your content on other sites, Google will always show the version it finds most appropriate for users in each given search, regardless of whether it's your preferred version. It is helpful to ensure that each site syndicating your content includes a link back to your original article. You can also ask those who use your syndicated material to add the noindex meta tag shown above to keep search engines from indexing their version of the content.
- Use Webmaster Tools ~ You can define how your site should be indexed by telling Google your preferred domain (for example, http://www.example.com or http://example.com).
- Minimize boilerplate repetition ~ Instead of including lengthy copyright text on the bottom of each page, include a very brief summary and then link to a page with more details. Separately, you can try the Parameter Handling tool in Webmaster Tools to control how Google treats URL parameters.
- Avoid publishing stubs ~ Users don't like finding empty pages, so avoid placeholders where possible. Don't publish incomplete pages or pages under construction. If you do create placeholder pages, use the noindex meta tag to block these pages from being indexed.
- Understand your content management system ~ Make sure you're familiar with how content is displayed on your website. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog entry may appear on the home page, on an archive page, and on a page of other entries with the same label.
- Minimize similar content ~ If you have many pages that are similar, consider expanding each page or consolidating the pages into one. For instance, if you have a travel site with separate pages for two cities, but the same information on both pages, you could either merge the pages into one page about both cities or you could expand each page to contain unique content about each city.
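To make the 301-redirect tip above concrete, here is a minimal sketch of an .htaccess rule for an Apache server; the old and new paths are placeholders, so adjust them to match your own restructuring:

  # Permanently send users, Googlebot, and other spiders
  # from the retired URL to its replacement
  Redirect 301 /old-page http://www.example.com/new-page

Because the redirect is flagged as permanent, search engines can transfer the old URL's signals to the new one instead of indexing both.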
Google no longer recommends blocking crawler access to duplicate content on your website, whether with a robots.txt file or other methods. If search engines can't crawl pages with duplicate content, they can't automatically detect that those URLs point to the same content and will effectively have to treat them as separate, unique pages. A better solution is to allow search engines to crawl these URLs, but mark them as duplicates using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. In cases where duplicate content leads to Google crawling too much of your website, you can also adjust the crawl rate settings in Webmaster Tools.
Remember, duplicate content on a website is not grounds for action unless it appears that the duplicate content is intended to deceive or manipulate search engine results.