Would Google Call These Pages Duplicate Content?
-
Our Web store, http://www.audiobooksonline.com/index.html, has struggled with duplicate content issues for some time. One aspect of duplicate content is a page like this: http://www.audiobooksonline.com/out-of-publication-audio-books-book-audiobook-audiobooks.html.
When an audio book title goes out-of-publication we keep the page at our store and display a http://www.audiobooksonline.com/out-of-publication-audio-books-book-audiobook-audiobooks.html whenever a visitor attempts to visit a specific title that is OOP. There are several thousand OOP pages.
Would Google consider these OOP pages duplicate content?
-
I'm confused. When a book goes out of print, does the URL change to this long OOP html page? Or does that book's URL then redirect to this page? Or *(shudders) do you make the OOP page re-titled to whatever the OOP book's page was?
If it were me I'd do the first scenario here. It's essentially the same concept as a 404.
-
Yes that is duplicate content, you should make these pages return a 404 instead. or leave the content in place with a sold out banner or something.
Something I don't like is your index.htm on your home page, people how link to you are likely to link to http://www.audiobooksonline.com/ you will then get a 301 redirect to http://www.audiobooksonline.com/index.html
this will leak link juice, as all 301's leak link juice just the same as a link does, 155 if we go by the original published Google algorithm. Also your internal pages link to http://www.audiobooksonline.com and are once again redirected to http://www.audiobooksonline.com/index.html
-
yes larry that is fine. so long as it is a single URL with a single HTML file on it, there is no duplicate issues. If you want to clarify I would suggest (if you aren't an SEOmoz pro member) to use a sitemap generator to ensure it isn't crawling multiple pages... But if that page is only listed once (and from what you are saying here that should be the case) then you have no duplicate content issues.
It's just the same as linking to one page from every page on your website. A redirect doesn't work much differently (although it does drop a small amount of linkjuice.)
You might consider no-crawling that OOP page anyway if you're still concerned. Not sure why you would need that one indexed in the first place.
Good luck to you!
-
We use only one URL for the OOP pages. It is 301 redirected from the each unique OOP title's page. Based on what you said, I am understanding that this is fine. Correct?
-
Hi Larry
Couple of questions - is that the only URL for the OOP pages, or are there other versions of the page and/or URL that exist?
If there are multiple pages, then that is definitely duplicate content. However, that can quite easily be fixed. If you add this code to the head tag of all those OOP pages, it will prevent Google from indexing the pages (thus not seeing them as duplicate):
That way you can keep the page for the user but not have to worry about duplicate content. I would do this anyway even if there is only one version of the page, as the page is thin on content as it is.
If you are displaying that image on other URLs that used to have products on them, but have gone OOP, then those multiple URLs and pages would be duplicate. Again, if you add the above code into the head text, it removes the problem. You could also 301 redirect the URL of the product page to the OOP page. For example, if you had a page for a product called: http://www.audiobooksonline.com/examplerecord.html that is now OOP, you could put in a 301 redirect to the http://www.audiobooksonline.com/out-of-publication-audio-books-book-audiobook-audiobooks.html. page and it wouldn't be duplicate. You can learn more about redirection here.
Hope this helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Why does Google's search results display my home page instead of my target page?
Why does Google's search results display my home page instead of my target page?
Technical SEO | | h.hedayati6712365410 -
Duplicate Content
I am trying to get a handle on how to fix and control a large amount of duplicate content I keep getting on my Moz Reports. The main area where this comes up is for duplicate page content and duplicate title tags ... thousands of them. I partially understand the source of the problem. My site mixes free content with content that requires a login. I think if I were to change my crawl settings to eliminate the login and index the paid content it would lower the quantity of duplicate pages and help me identify the true duplicate pages because a large number of duplicates occur at the site login. Unfortunately, it's not simple in my case because last year I encountered a problem when migrating my archives into a new CMS. The app in the CMS that migrated the data caused a large amount of data truncation Which means that I am piecing together my archives of approximately 5,000 articles. It also means that much of the piecing together process requires me to keep the former app that manages the articles to find where certain articles were truncated and to copy the text that followed the truncation and complete the articles. So far, I have restored about half of the archives which is time-consuming tedious work. My question is if anyone knows a more efficient way of identifying and editing duplicate pages and title tags?
Technical SEO | | Prop650 -
Duplicate Content Mystery
Hi Moz community! I have an ongoing duplicate mystery going on here and I'm hoping someone here can answer my question. We have an Ecommerce site that has a variety of product pages and category pages. There are Rel canonicals in place, along with parameters in GWT, and there are also URL rewrites. Here are some scenarios, maybe you can give insight as to what’s exactly going on and how to fix it. All the duplicates look to be coming from category pages specifically. For example:
Technical SEO | | Ecom-Team-Access
This link re-writes: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html?cat=407&color=152&price=20- To: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html The rel canonical tag looks like this: http://www.incipio.com/cases/tablet-cases/amazon-kindle-cases-sleeves.html" /> The CONTENT is different, but the URLs are the same. It thinks that the product category view is the same as the all products view, even though there is a canonical in there telling it which one is the original. Some of them don’t have anything to do with each other. Take a look: Link identified as duplicate: http://www.incipio.com/cases/smartphone-cases/htc-smartphone-cases/htc-windows-phone-8x-cases.html?color=27&price=20- Link this is a duplicate of: http://www.incipio.com/cases/macbook-cases/macbook-pro-13in-cases.html Any idea as to what could be happening here?0 -
Why wont google Index this page?
A week ago i accidentally changed this page settings in my CMS to "disable & dont index" as i was going to replace this page with another, but this didnt happen, but i forgot to switch the settings back! http://www.over50choices.co.uk/funeral-planning/funeral-plans Anyhow in an effort to get it back up quickly i submitted in GWTs but its still not indexed. When i use several SEO on page checking tools it has the Meta Title data as "Form" and not the correct title. Any ideas please? Yours frustrated Ash
Technical SEO | | AshShep10 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in specific: Google continuously crawls websites and stores each page it finds (let's call it "page directory") Google's "page directory" is a cache so it isn't the "live" version of the page Google has separate storage called "the index" which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory" These returned pages are given ranks based on the algorithm The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory". The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a url in the "page directory", and the entries in the "index" contain these urls. Since Google's "page directory" is a cache, would the urls be the same as the live website (and would the keywords in the "index" point to these urls)? For example if webpage is found at wwww.website.com/page1, would the "page directory" store this page under that url in Google's cache? The reason I want to discuss this is to know the effects of changing a pages url by understanding how the search process works better.
Technical SEO | | reidsteven750 -
Duplicate content with same URL?
SEOmoz is saying that I have duplicate content on: http://www.XXXX.com/content.asp?ID=ID http://www.XXXX.com/CONTENT.ASP?ID=ID The only difference I see in the URL is that the "content.asp" is capitalized in the second URL. Should I be worried about this or is this an issue with the SEOmoz crawl? Thanks for any help. Mike
Technical SEO | | Mike.Goracke0 -
Is there an easier way from the server to prevent duplicate page content?
I know that using either 301 or 302 will fix the problem of duplicate page content. My question would be; is there an easier way of preventing duplicate page content when it's an issue with the URL. For example: URL: http://example.com URL: http://www.example.com My guess would be like it says here, that it's a setting issue with the server. If anyone has some pointers on how to prevent this from occurring, it would be greatly appreciated.
Technical SEO | | brianhughes2 -
Duplicate Page Content Lists the same page twice?
When checking my crawl diagnostics this morning I see that I have the error Duplicate page content. It lists the exact same url twice though and I don't understand how to fix this. It's also listed under duplicate page title. Personal Assistant | Virtual Assistant | Charlotte, NC http://charlottepersonalassistant.com/110 Personal Assistant | Virtual Assistant | Charlotte, NC http://charlottepersonalassistant.com/110 Does this have anything to do with a 301 redirect here? Why does it have http;// twice? Thanks all! | http://www.charlottepersonalassistant.com/ | http://http://charlottepersonalassistant.com/ |
Technical SEO | | eidna220