 
| Header = Duplicate Content, Why Once Is Enough
| Subhead = What's wrong with a little repetition?
| Bitly = http://bit.ly/dupcontenet
| Date = February 14, 2011
}}
  
Search engines, like dinner guests, have limited patience for repetition. No one wants to hear the same "when I was in college" story for the umpteenth time, and search engines won't rank duplicate content favorably either.
  
 
'''What exactly is duplicate content?'''
  
Duplicate content mirrors other content exactly or substantially. It can be posted on the same domain or on another domain. Most of the time, duplicate content is not meant to deceive viewers, but it often dilutes the authenticity of an online business.
  
 
'''Why duplicate content is bad news'''
  
In some cases, content is deliberately duplicated across domains to manipulate search engine rankings or garner more traffic. Deceptive practices like these make for a poor user experience and turn off potential customers or clients. When a visitor sees the same content repeated within a set of search results, they are likely to leave a site and never return.
  
Google does a thorough job of indexing and showing pages with distinct information. If Google perceives duplicate content with intent to manipulate rankings and deceive users, it will adjust the indexing and ranking of the sites involved. As a result, the ranking of both sites may suffer -- or worse, both sites could be removed from Google's index altogether, meaning they will not appear in search results at all!
  
'''Things you can do to prevent duplicate content on your site'''
 
Building a website takes time, energy, and financial resources. You want people and search engines to find your site easily. Once your website is found, you'd like visitors to find it useful and informative. Showing duplicate content on multiple pages will diminish a user's experience.
 
Fortunately, there are specific things you can do when developing your website to avoid creating duplicate content.
:* Use top-level domains whenever possible to handle country-specific content. Many readers will know that <nowiki>http://www.mocksite.uk</nowiki> contains United Kingdom-centric content, but they may not recognize the same country focus in URLs like <nowiki>http://www.mocksite.com/uk</nowiki> or <nowiki>http://uk.mocksite.com</nowiki>. [http://en.wikipedia.org/wiki/Canonicalization Canonicalization] is the process of picking the best URL when several choices are available.
:* Keep your internal linking consistent. Don't use multiple variations, such as <nowiki>http://www.mocksite.com/page/</nowiki>, <nowiki>http://www.mocksite.com/page</nowiki>, and <nowiki>http://www.mocksite.com/page/index.htm</nowiki>, to link to the same place.
:* If your site has a printer version of a page that is identical to the viewable version, keep the two from competing as duplicates by adding a [http://en.wikipedia.org/wiki/Noindex noindex meta tag] to the printer version (see the example after this list).
:* Use [http://www.aboutus.org/Glossary/301-redirect 301 redirects] if you've restructured your site. A 301 [http://www.aboutus.org/Learn/Redirect-Alternative-Domains-for-More-Traffic redirect] set up in your .htaccess file permanently sends users, Googlebot, and other spiders to the page where you really want them (a sample rule follows this list).
:* If you have other people publishing your content on their sites, make sure each site publishes your content with a link back to the original article.
:: If there are multiple versions of your article out on the Web, and Google has included all of them in its index, Google can show any version higher in search results. It will show the version it has determined is most relevant to someone’s search, regardless of whether it’s the original version from your site or another version on a different site.
:: To avoid having other people’s versions of your article trump your own, you can ask website owners publishing your content to tell search engines not to index the article. They can use the [http://en.wikipedia.org/wiki/Noindex noindex meta tag] to accomplish this. This makes it more likely that your original version will show up in searches relevant to your article.
:* Use Google [https://www.google.com/accounts/ServiceLogin?service=sitemaps&passive=true&nui=1&continue=https://www.google.com/webmasters/tools/&followup=https://www.google.com/webmasters/tools/&hl=en Webmaster Tools] to define how your site is indexed. You can tell Google your preferred domain (for example, <nowiki>http://www.mocksite.com</nowiki> or <nowiki>http://mocksite.com</nowiki>).
:* Don't repeat lengthy boilerplate text, such as copyright notices, at the bottom of every page. Instead, include a brief summary and link to a separate page with the full details. You can also try [http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=147959 Google's parameter handling tool] to see how Google treats [http://www.ironspeed.com/articles/Using%20URL%20Parameters/Article.aspx URL parameters].
:* No one likes to land on an empty page, so try to avoid placeholders where possible. Don't publish incomplete pages or pages under construction. If you do create placeholder pages, use a [http://en.wikipedia.org/wiki/Noindex noindex meta tag] to block these pages from being indexed.
:* Make sure you're familiar with how content is displayed on your website. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog post may appear on your home page, on an archive page, and on a page with other entries that use the same label.
:* Lastly, if you have many pages that are similar, consider developing each page uniquely or consolidating the similar pages into one. For example, if you have a travel site with separate pages for Manhattan and Brooklyn, but much of the same information appears on both, you could merge the two pages or expand each page with unique content relevant to each borough.
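For reference, here is what the noindex meta tag mentioned in several of the tips above can look like. This is a minimal sketch for a printer-friendly page; the title and surrounding markup are placeholders, and the same tag works for placeholder pages and syndicated copies:

<pre>
<!-- In the <head> of the printer-friendly (or placeholder) page -->
<head>
  <title>Mock Page (printer version)</title>
  <!-- Asks search engine robots not to add this copy to their index -->
  <meta name="robots" content="noindex">
</head>
</pre>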
  
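Likewise, here is a sketch of what 301 redirects can look like in an Apache .htaccess file. The domain and paths are placeholders, and the rules assume Apache's mod_alias and mod_rewrite modules are available:

<pre>
# Permanently redirect one moved page to its new home
Redirect 301 /old-page.htm http://www.mocksite.com/new-page.htm

# Permanently redirect the bare domain to the www version,
# so spiders index only one version of each URL
RewriteEngine On
RewriteCond %{HTTP_HOST} ^mocksite\.com$ [NC]
RewriteRule ^(.*)$ http://www.mocksite.com/$1 [R=301,L]
</pre>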
Google no longer recommends blocking crawler access to duplicate content on your website with a robots.txt file. If search engines can't crawl pages with duplicate content, they can't automatically detect that these URLs point to the same content, and they will therefore have to treat them as separate, unique pages. A better solution is to let search engines crawl these URLs, but mark them as duplicates by using the [http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=139394 rel="canonical" link element], the [http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=147959 URL parameter handling tool], or [http://www.aboutus.org/Glossary/301-redirect 301 redirects]. In cases where duplicate content leads to too much crawling of your website, you can adjust the crawl rate settings in [https://www.google.com/accounts/ServiceLogin?service=sitemaps&passive=true&nui=1&continue=https://www.google.com/webmasters/tools/&followup=https://www.google.com/webmasters/tools/&hl=en Webmaster Tools].
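As a sketch of the first option above, the rel="canonical" link element is a single line placed in the <head> of each duplicate page, pointing search engines at the one URL you prefer (the URLs below are placeholders):

<pre>
<!-- On http://www.mocksite.com/page/index.htm and any other duplicate URL, -->
<!-- tell search engines which version of the page should be indexed -->
<link rel="canonical" href="http://www.mocksite.com/page">
</pre>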
 
  
Remember, duplicate content on a website is not grounds for punitive action from Google unless it appears that the duplicate content is intended to deceive or manipulate search engine results.
  
 
{{LearnBottomBio
