How do I fix apparent duplicates?
-
I'm auditing a site and would appreciate help explaining, and fixing, why the Content Drilldown report in Google Analytics shows what appear to be duplicate pages. (See image.)
I'm wondering whether I've really got my head around the rel=canonical tag, because the page I'd consider a duplicate, "page/", has a canonical tag pointing to "~/page.html".
This is the tag from the page Locations/:
<link rel="canonical" href="http://www.domain.com/Locations.html" />
So I'm unsure why both versions of the page are generating views. Shouldn't the canonical tag work like a 301 redirect?
I'm unsure how the pages using the path page/ are generating so many views because I have not been able to find them and they are not indexed by Google.
Unfortunately, the site is built on a proprietary CMS I'm not familiar with.
-
Hi Paul
I appreciate your explanation of when to use Canonical tags. I had previously thought they were limited to redirecting www.domain.com to domain.com.
I understand your solution to the dupes problem and will be searching SEOMoz's resources for how to write rewrite rules and, for that matter, regex-based Search & Replace filters in Analytics.
It's not the first time you've provided a high-quality answer to a question of mine. I very much appreciate your contribution to my growing knowledge and to the SEOMoz community.
Best
Nic
-
A canonical tag is fundamentally different from a 301-redirect, Nic. There's nothing about a canonical tag that stops a visitor from being able to visit that URL. A 301-redirect actually forwards the visitor to the target page as if the initial page doesn't even exist so there's no physical way for a visitor to land on it.
Put another way, the source page of a 301-redirected URL doesn't even exist as far as the search engines are concerned (and eventually they'll actually drop the original URL altogether).
The canonical tag serves a very specific purpose. When two pages must continue to be reachable at two different URLs but the page content is essentially identical (e.g. a product page sorted by size or colour), a canonical tag suggests that the search engines should consolidate the ranking value in the primary URL. That's it.
In the case of the /contact+us.html and /contact+us/ pages - that page should only be reachable at one or the other URL. There's no reason or value to the user for the page to be reachable at the second address. The correct way to deal with this is to use a rewrite rule to 301-redirect all the page/ versions of the site's pages to the page.html versions (assuming that's what you've decided should be the canonical).
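To make the page/ to page.html mapping concrete, here is a minimal Python sketch of the decision such a rewrite rule would encode. In practice this logic would live in the web server's rewrite configuration, not in Python. The function name `redirect_target` is hypothetical, and the sketch assumes every page/ URL has a matching page.html, which may not hold for every URL on the site.

```python
import re
from typing import Optional

# Matches a path that ends in a trailing slash, e.g. "/Locations/".
# The site root "/" does not match, because at least one non-slash
# character is required before the final slash.
_TRAILING_SLASH = re.compile(r"^(/.*[^/])/$")


def redirect_target(path: str) -> Optional[str]:
    """Return the .html path a page/ URL should 301-redirect to.

    Returns None when the path needs no redirect.
    Assumption: every page/ URL has a page.html counterpart.
    """
    match = _TRAILING_SLASH.match(path)
    if match is None:
        return None
    return match.group(1) + ".html"


print(redirect_target("/Locations/"))      # /Locations.html
print(redirect_target("/contact+us/"))     # /contact+us.html
print(redirect_target("/Locations.html"))  # None
print(redirect_target("/"))                # None
```

Returning None for paths that are already canonical keeps the rule from redirecting a page to itself.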
The only time to use canonical tags instead of redirects in a case like this is if it is technically impossible to implement the rewrites (a shared server that doesn't allow access to the .htaccess file for example). But this is sub-optimal and would still leave you with the same Analytics dupe page problem you're currently running into.
So what to do about the dupes in Analytics, given the site wasn't configured with the rewrites? You can write a custom Search and Replace filter for the site's profile that uses regex to merge both versions of each page into a single line. You'll absolutely want to do this in a new profile created just for this purpose though, keeping the original unfiltered profile for reference and historical data.
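As a rough illustration of what such a search-and-replace does to each request URI, here is a Python sketch. The pattern and replacement are assumptions based on the page/ and page.html URLs discussed in this thread, and the exact regex syntax the Analytics filter accepts may differ from Python's, so test the expression before relying on it.

```python
import re

# Search pattern: any path ending in a trailing slash (but not the root "/").
SEARCH = r"^(/.*[^/])/$"
# Replacement: the captured path with ".html" appended.
REPLACE = r"\1.html"


def merge_uri(uri: str) -> str:
    """Rewrite a page/ URI to its page.html form; leave other URIs unchanged."""
    return re.sub(SEARCH, REPLACE, uri)


for uri in ["/Locations/", "/Locations.html", "/contact+us/", "/"]:
    print(uri, "->", merge_uri(uri))
# /Locations/ -> /Locations.html
# /Locations.html -> /Locations.html
# /contact+us/ -> /contact+us.html
# / -> /
```

With both versions rewritten to the same string, the report has only one line per page to count views against.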
Note that this will only affect data collected from the date of creation of the new profile/filter. It's not retroactive. If you want to combine results for these pages for the existing data, you'll need to dump it to Excel and use a formula to combine the dupes.
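If a spreadsheet formula gets unwieldy, the same combining step can be sketched in Python on the exported rows. The page paths below come from this thread, but the pageview figures are made up for illustration, and the helper names `canonical_path` and `combine_dupes` are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def canonical_path(path: str) -> str:
    """Map a page/ path to its page.html form (assumed to be the canonical)."""
    return re.sub(r"^(/.*[^/])/$", r"\1.html", path)


def combine_dupes(rows: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum pageviews for rows that resolve to the same canonical path."""
    totals: Dict[str, int] = defaultdict(int)
    for path, pageviews in rows:
        totals[canonical_path(path)] += pageviews
    return dict(totals)


# Hypothetical export rows: (page path, pageviews).
exported = [
    ("/Locations/", 120),
    ("/Locations.html", 340),
    ("/contact+us/", 15),
    ("/contact+us.html", 60),
]

print(combine_dupes(exported))
# {'/Locations.html': 460, '/contact+us.html': 75}
```

Only additive metrics such as pageviews can be summed this way; averages and rates would need to be recalculated from their underlying totals.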
Hope that all makes sense?
Paul