Duplicate Content/Missing Meta Description | Pages DO NOT EXIST!
-
Hello all,
For the last few months, Moz has been showing us that our site has roughly 2,000 duplicate content errors. I took care of the pages that were actually duplicate content using best practices (301 redirects, canonicalization, etc.). What remained after these fixes were errors for pages that we have never created.
Our homepage is www.primepay.com. An example of pages that are being shown as duplicate content is http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/payroll/online-payroll with a referring page of http://primepay.com/blog/%5BLink%20to%20-%20http:/www.primepay.com/en/payrollservices/payroll/payroll/online-payroll. Some of these are even now showing up as 403 and 404 errors.
The only real page on our site within that URL strand is primepay.com/payroll or primepay.com/payroll/online-payroll. Therefore, I am not sure where Moz is getting these pages from.
Another issue we are having with duplicate content is that Moz is showing old campaign URLs tacked on to our blog page, e.g. http://primepay.com/blog?title=&page=2&utm_source=blog&utm_medium=blogCTA&utm_campaign=IRSblogpost&qt-blog_tabs=1.
As of this morning, our duplicate content went from 2,000 to 18,000. I exported all of our crawl diagnostics data and looked to see what the referring pages were, and even they are not pages that we have created. When you click on these links, they take you to a random point in time from the homepage of our blog; some dating back to 2010.
I checked our crawl stats in both Google's and Bing's Webmaster Tools, and there are no duplicate content or 400-level errors being reported from their crawls. My team is truly at a loss trying to resolve this issue, and any help with this matter would be greatly appreciated.
-
Thanks Dirk. Very insightful tip about not using campaign tracking to check internal links. There was an old blog post that had anchor text with campaign tracking that was causing many SEO issues. As for the latter part, we still don't know why a string of gibberish can be placed after /blog/ (and also on our locations page) and still load a page. Our team's web developer is looking further into this issue. If anyone has any more advice on the matter, it would be greatly appreciated.
-
Hey there
Dirk pretty much hit upon the issue, which I'll reiterate with a visual. If you enter any gibberish /blog URL (like this: http://primepay.com/blog/jglkjglkjg) in the browser, it returns a 200 OK, but it should return a 404 --> http://screencast.com/t/cStpPB5zE
Otherwise pages that are really broken will look to crawlers like they are supposed to exist.
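To make that check repeatable, here is a minimal sketch in Python that requests a random, non-existent path under a section and reports whether the server answers 200 for it. The function names (`fetch_status`, `has_soft_404`) and the `no-such-page-` prefix are made up for illustration, and the section URL is assumed to end in a slash so the random slug is appended under it.

```python
import uuid
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


def fetch_status(url, timeout=10):
    """Return the final HTTP status code for url (redirects are followed)."""
    request = Request(url, headers={"User-Agent": "soft-404-check"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # urllib raises on 4xx/5xx responses; the status code is on the error.
        return error.code


def has_soft_404(section_url, fetch=fetch_status):
    """Probe a random path under section_url (which should end with '/').

    Returns True when the server answers 200 for a page that cannot exist.
    """
    probe = urljoin(section_url, "no-such-page-" + uuid.uuid4().hex)
    return fetch(probe) == 200
```

Calling `has_soft_404("http://primepay.com/blog/")` would make one live request. The `fetch` argument is only there so the logic can be exercised without touching the network.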
-
You shouldn't use campaign tracking to check internal links - you have to use event tracking. Check http://cutroni.com/blog/2010/03/30/tracking-internal-campaigns-with-google-analytics/ . Apart from the reporting issue, it's also generating a huge number of URLs that need to be crawled by Googlebot, which just wastes its time (most of these tagged URLs have a correct canonical version). You mention these tags are old, but they are still present on a lot of pages.
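To see how many crawled URLs collapse to the same page once the campaign tags are ignored, a rough sketch like the following can help. It assumes that every query parameter starting with `utm_` is campaign tracking and that everything else should be kept; the function names are hypothetical.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_campaign_params(url):
    """Return url with every utm_* query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def group_by_untagged_url(urls):
    """Map each untagged URL to the list of crawled URLs that reduce to it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_campaign_params(url)].append(url)
    return dict(groups)
```

Run over the URL list from a crawl export, any group with more than one entry is a set of tagged variants of a single page.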
For cases like this, it's better to check with a local tool like Screaming Frog, which gives you a much better view of which pages are generating these links.
The other issue you have is probably related to a few pages that have a badly formatted (relative) URL in a link. The way your site is configured, it just renders a page on your site for that URL, so the bots crawl your site over and over again, each time encountering the same bad relative link and each time adding the bad formatting to the URL. It's an endless loop; the best way to avoid it is to use absolute internal links rather than relative links. Not sure if it's the only one, but one of the pages with this error is http://primepay.com/blog/7-ways-find-right-payroll-service-your-company - it contains a link to
[Your payroll service is no different.]([Link to - http://www.primepay.com/en/payrollservices/] "Your payroll service is no different.")
This page should generate a 404 but is generating a 200 and the loop starts here.
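The loop can be reproduced with Python's `urljoin`, which resolves a link against the page it appears on in roughly the way a crawler does. This is only a sketch: the relative href `payroll/online-payroll` and the starting URL are assumed for illustration (they are not taken from the site's markup), and `follow_link` is a made-up helper. It relies on the site answering 200 for every generated URL and serving the same link again.

```python
from urllib.parse import urljoin


def follow_link(start_url, href, hops):
    """Resolve href against each resulting URL in turn, as a crawler would
    if every generated URL returned a page containing the same link."""
    urls = [start_url]
    for _ in range(hops):
        urls.append(urljoin(urls[-1], href))
    return urls


start = "http://primepay.com/en/payrollservices/payroll/online-payroll"

# Relative href: each hop adds another path segment.
for url in follow_link(start, "payroll/online-payroll", 3):
    print(url)

# Absolute href: every hop lands on the same URL, so there is no loop.
for url in follow_link(start, "http://primepay.com/payroll/online-payroll", 3):
    print(url)
```

With the relative href the path gains one more `payroll/` segment per hop, which is the pattern in the URLs Moz reported. With the absolute href the result never changes, and returning a real 404 for the first bad URL would stop the crawl there in either case.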
Again - with Screaming Frog, you can generate a crawl path report for each of these bad URLs, which shows you exactly on which page the error is generated.
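If you only have an export with a URL and a referring page per row, the same idea can be approximated by walking the referrers backwards. This is a hypothetical sketch, not how Screaming Frog builds its report; `referrer_of` is assumed to be a dictionary you have built yourself from the export, mapping each URL to the page that linked to it.

```python
def crawl_path(url, referrer_of):
    """Walk referrers back from url and return the path from the earliest
    known page down to url."""
    path = [url]
    seen = {url}
    while path[-1] in referrer_of:
        parent = referrer_of[path[-1]]
        if parent in seen:
            # Referrer cycle: stop rather than loop forever.
            break
        path.append(parent)
        seen.add(parent)
    path.reverse()
    return path
```

The first entry in the returned list is the page to inspect for the malformed link.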
Hope this helps,
Dirk
-
Example:
http://primepay.com/blog/hgehergreg
Status:
My site as an example:
https://caseo.ca/blog/hgehergreg
If I put random gibberish in this URL, it should display a 404 page and not the blog page.
-
Getting you some help for direct advice on your problem, but wanted to leave a comment about the tool itself. When you are looking at the Moz crawl tool, it only updates once a week, so if there hasn't been that long between the last crawl and when you did the work, it won't be updated. Here's more info.