Duplicate content warning: Same page but different urls???
-
Hi guys i have a friend of mine who has a site i noticed once tested with moz that there are 80 duplicate content warnings, for instance
Page 1 is http://yourdigitalfile.com/signing-documents.html
the warning page is http://www.yourdigitalfile.com/signing-documents.html
another example
Page 1 http://www.yourdigitalfile.com/
same second page http://yourdigitalfile.com
i noticed that the whole website is like the nealry every page has another version in a different url?, any ideas why they dev would do this, also the pages that have received the warnings are not redirected to the newer pages you can go to either one???
thanks very much
-
Thanks Tim. Do you have any examples of what those problems might be? With such a large catalog managing those rel canonical tags will be difficult (I don't even know if the store allows them, it's a hosted store solution and little code customization is allowed).
-
Hi there AspenFasteners, in this instance rather than a .HTAccess rule I would suggest applying a rel canonical tag which points to the page you deem as the original master source.
Using the robots to try and hide things could potentially cause you more issues as your categories may struggle to be indexed correctly.
-
We have a similar problem, but much more complex to handle as we have a massive catalog of 80,000 products and growing.
The problem occurs legitimately because our catalog is so large that we offer different navigation paths to the same content.
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm
(If you look at the "You are here" breadcrumb trail, you will see the subtle differences in the navigation paths, with 8314.htm, the user went through Home > Screws, with 8315.htm, via Home > Security Fasteners > Screws).
Our hosted web store does not offer us htaccess, so I am thinking of excluding the redundant navigation points via robots.txt.
My question: is there any reason NOT to do this?
-
Oh ok
The only reason i was thinking it is duplicate content is the warnings i got on the moz crawl, see below.
75 Duplicate Page Content
6 4xx Client Error
5 Duplicate Page Title
44 Missing Meta Description Tag
5 Title Element is Too Short
I have found over 80 typos, grammatical errors, punctuation errors and incorrect information which was leading me to believe the quality of the work and their attention to detail was rather bad, which is why i thought this was a possibility.
Thanks again for your time its really appreciated
-
I wouldn't say that they have created two pages, it is just that because you have two versions of the domain and not set a preferred version that you are getting it indexing twice. .HTaccess changes are under the hood of the website and could have simply been an oversight.
-
Hey Tim
Thanks for your answer. It's really weird, other than lazyness on the devs part not to remove old or previous versions of pages?, have you any idea why they would create multiple versions of the same page with different url's?? is there any legit reason like ones severs mobile or something??
Just wondering thanks for replying
-
OK, so in this instance the only issue you have is that you need to choose your preferred start point - www or non www.
I would add a bit of code to your htaccess file to point to your preferred choice. I personally prefer a www. domain. Something like the below would work.
RewriteCond %{HTTP_HOST} ^example.com$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]As your site is already indexed I would also for the time being and as more of a safety measure add canonicals to the pages that point to the www. version of your site.
Also if you have a Google Search Console account, you can select your prefered domain prefix in there. this will again help with your indexation.
Hopefully I have covered most things.
Cheers
Tim
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Site build in the 80% of canonical URLs - What is the impact on visibility?
Hey Everyone, I represent international wall decorations store where customer can freely choose a pattern to be printed on a given material among a few milions of patterns. Due to extreme large number of potential URL combinations we struggle with too many URL adressess for a months now (search console notifications). So we finally decided to reduce amount of products with canonical tag. Basing on users behavior, our business needs and monthly search volume data we selected 8 most representative out of 40 product categories and made them canonical toward the rest. For example: If we chose 'Canvas prints' as our main product category, then every 'Framed canvas' product URL points rel=canonical tag toward its equivalent URL within 'Canvas prints' category. We applied the same logic to other categories (so "Vinyl wall mural - Wild horses running" URL points rel=canonical tag to "Wall mural - Wild horses running" URL, etc). In terms of Googlebot interpretation, there are really tiny differences between those Product URLs, so merging them with rel=canonical seems like a valid use. But we need to keep those canonicalised URLs for users needs, so we can`t remove them from a store as well as noindex does not seem like an good option. However we`re concerned about our SEO visibility - if we make those changes, our site will consist of ~80% canonical URLs (47,5/60 millions). Regarding your experience, do you have advices how should we handle that issue? Regards
White Hat / Black Hat SEO | | _JediMindBender
JMB0 -
Somebody took an article from my site and posted it on there own site but gave it credit back to my site is this duplicate content?
Hey guys, This question may sound a bit drunk, but someone copied our article and re-posted it on their site the exact article, however the article was credited to our site and the original author of the article had approved the other site could do this. We created the article first though, Will this still be regarded as duplicate content? The owner of the other site has told us it wasn't because they credited it. Any advice would be awesome Thanks
White Hat / Black Hat SEO | | edward-may0 -
One page sites
HI Guys, I need help with a one page site What is the best method to getting the lower pages indexed? Linking back to the site(Deeplinking) is looking impossible. Will this hurt my SEO? Are there any other tips on one page websites that you can recommend?
White Hat / Black Hat SEO | | Johnny_AppleSeed0 -
Separate Servers for Humans vs. Bots with Same Content Considered Cloaking?
Hi, We are considering using separate servers for when a Bot vs. a Human lands on our site to prevent overloading our servers. Just wondering if this is considered cloaking if the content remains exactly the same to both the Bot & Human, but on different servers. And if this isn't considered cloaking, will this affect the way our site is crawled? Or hurt rankings? Thanks
White Hat / Black Hat SEO | | Desiree-CP0 -
Looking for a Way to Standardize Content for Thousands of Pages w/o Getting Duplicate Content Penalties
Hi All, I'll premise this by saying that we like to engage in as much white hat SEO as possible. I'm certainly not asking for any shady advice, but we have a lot of local pages to optimize :). So, we are an IT and management training course provider. We have 34 locations across the US and each of our 34 locations offers the same courses. Each of our locations has its own page on our website. However, in order to really hone the local SEO game by course topic area and city, we are creating dynamic custom pages that list our course offerings/dates for each individual topic and city. Right now, our pages are dynamic and being crawled and ranking well within Google. We conducted a very small scale test on this in our Washington Dc and New York areas with our SharePoint course offerings and it was a great success. We are ranking well on "sharepoint training in new york/dc" etc for two custom pages. So, with 34 locations across the states and 21 course topic areas, that's well over 700 pages of content to maintain - A LOT more than just the two we tested. Our engineers have offered to create a standard title tag, meta description, h1, h2, etc, but with some varying components. This is from our engineer specifically: "Regarding pages with the specific topic areas, do you have a specific format for the Meta Description and the Custom Paragraph? Since these are dynamic pages, it would work better and be a lot easier to maintain if we could standardize a format that all the pages would use for the Meta and Paragraph. For example, if we made the Paragraph: “Our [Topic Area] training is easy to find in the [City, State] area.” As a note, other content such as directions and course dates will always vary from city to city so content won't be the same everywhere, just slightly the same. It works better this way because HTFU is actually a single page, and we are just passing the venue code to the page to dynamically build the page based on that venue code. So they aren’t technically individual pages, although they seem like that on the web. If we don’t standardize the text, then someone will have to maintain custom text for all active venue codes for all cities for all topics. So you could be talking about over a thousand records to maintain depending on what you want customized. Another option is to have several standardized paragraphs, such as: “Our [Topic Area] training is easy to find in the [City, State] area. Followed by other content specific to the location
White Hat / Black Hat SEO | | CSawatzky
“Find your [Topic Area] training course in [City, State] with ease.” Followed by other content specific to the location Then we could randomize what is displayed. The key is to have a standardized format so additional work doesn’t have to be done to maintain custom formats/text for individual pages. So, mozzers, my question to you all is, can we standardize with slight variations specific to that location and topic area w/o getting getting dinged for spam or duplicate content. Often times I ask myself "if Matt Cutts was standing here, would he approve?" For this, I am leaning towards "yes," but I always need a gut check. Sorry for the long message. Hopefully someone can help. Thank you! Pedram1 -
href="#" and href="javascript.void()" links. Is there a difference SEO wise?
I am currently working a site re-design and we are looking at if href="#" and href="javascript.void()" have an impact on the site? We were initially looking at getting the links per page down but I am thinking that rel=nofollow is the best method for this. Anyone had any experience with this? Thanks in advanced
White Hat / Black Hat SEO | | clickermediainc0 -
Moving content to a clean URL
Greetings My site was seriously punished in the recent penguin update. I foolishly got some bad out sourced spammy links built and I am now paying for it 😞 I am now thinking it best to start fresh on a new url, but I am wondering if I can use the content from the flagged site on the new url. Would this be flagged as duplicate content, even if i took the old site down? your help is greatly appreciated Silas
White Hat / Black Hat SEO | | Silasrose0 -
Tricky Decision to make regarding duplicate content (that seems to be working!)
I have a really tricky decision to make concerning one of our clients. Their site to date was developed by someone else. They have a successful eCommerce website, and the strength of their Search Engine performance lies in their product category pages. In their case, a product category is an audience niche: their gender and age. In this hypothetical example my client sells lawnmowers: http://www.example.com/lawnmowers/men/age-34 http://www.example.com/lawnmowers/men/age-33 http://www.example.com/lawnmowers/women/age-25 http://www.example.com/lawnmowers/women/age-3 For all searches pertaining to lawnmowers, the gender of the buyer and their age (for which there are a lot for the 'real' store), these results come up number one for every combination they have a page for. The issue is the specific product pages, which take the form of the following: http://www.example.com/lawnmowers/men/age-34/fancy-blue-lawnmower This same product, with the same content (save a reference to the gender and age on the page) can also be found at a few other gender / age combinations the product is targeted at. For instance: http://www.example.com/lawnmowers/women/age-34/fancy-blue-lawnmower http://www.example.com/lawnmowers/men/age-33/fancy-blue-lawnmower http://www.example.com/lawnmowers/women/age-32/fancy-blue-lawnmower So, duplicate content. As they are currently doing so well I am agonising over this - I dislike viewing the same content on multiple URLs, and though it wasn't a malicious effort on the previous developers part, think it a little dangerous in terms of SEO. On the other hand, if I change it I'll reduce the website size, and severely reduce the number of pages that are contextually relevant to the gender/age category pages. In short, I don't want to sabotage the performance of the category pages, by cutting off all their on-site relevant content. My options as I see them are: Stick with the duplicate content model, but add some unique content to each gender/age page. This will differentiate the product category page content a little. Move products to single distinct URLs. Whilst this could boost individual product SEO performance, this isn't an objective, and it carries the risks I perceive above. What are your thoughts? Many thanks, Tom
White Hat / Black Hat SEO | | SoundinTheory0