Abnormal crawl issues appearing in my Moz results
-
I have been asked to look at a site for a friend and was more than surprised to see 16.9k crawl issues appear in the dashboard. Of these, 6,238 are duplicate page content and 5,878 are duplicate page titles.
What on earth is going on? I have spoken to the web developer, as it appears there is a dev site somewhere, and this is his response:
[Can I stress that Google determines which site was in the index first and then removes other sites it sees as having duplicate content. Our dev sites appearing in the search index would not affect your ranking due to duplicate content as Google would see your site as the first site with the content]
As I cannot make contact with him, I am scratching my head. Surely a dev site should be noindexed? It sounds as though he is saying that it's OK because Google will treat the main site as the first site with the content...
Very confused! Help needed, Moz community.
Many thanks,
Sarah
-
Thanks again Dirk. I like your direct and knowledgeable responses. I have sent a LinkedIn connection request!
Many thanks,
Sarah
-
Hi Sarah,
Googlebot will follow these links as well and discover these "useless" pages (they are of course not useless from a human perspective, but they don't add value for bots, and they will be considered duplicates). Duplicates are no reason for "punishment", so you could just let them be. Personally I would put a nofollow on these links or add a "noindex" tag to the login page. Normally you shouldn't use nofollow on internal links, but login pages are an exception to this. See also https://searchenginewatch.com/sew/news/2298312/matt-cutts-you-dont-have-to-nofollow-internal-links : "Of course, there are always exceptions to the rule, and things like login pages can be the exception. He said it doesn’t hurt to put the nofollow link for a link pointing to a login page, or things like terms and conditions or other “useless” pages. However, it doesn’t hurt at all for those pages to be crawled by Google."
For the practical part - if you add an additional question to a question which has been marked as answered, only the people who have already answered will see it. To be on the safe side, it's better to open a new question if you want other people to have a look at it.
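If it helps to verify what is actually on the pages, here is a minimal sketch that reads the robots meta tag and the rel attribute of each link from a page's HTML. It uses only Python's standard library; the class name, the helper and the sample markup (including the /login URL and its parameter) are hypothetical and not taken from the site in question.

```python
from html.parser import HTMLParser


class RobotsHintParser(HTMLParser):
    """Collects robots meta directives and whether each link is nofollowed."""

    def __init__(self):
        super().__init__()
        self.meta_robots = []  # directives found in <meta name="robots">
        self.links = []        # (href, is_nofollow) pairs for <a> tags

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.meta_robots += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "a" and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            self.links.append((attrs["href"], "nofollow" in rel))


def inspect(html):
    """Parse an HTML string and return the populated parser."""
    parser = RobotsHintParser()
    parser.feed(html)
    return parser


# Hypothetical markup: a link to the login page, and the login page's head.
product_page = '<a href="/login?item=123" rel="nofollow">Add to wish list</a>'
login_page = '<head><meta name="robots" content="noindex, follow"></head>'

print(inspect(product_page).links)
print("noindex" in inspect(login_page).meta_robots)
```

Either signal on its own would be enough for the login page; the sketch only reports what is present and does not decide which one to use.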
Hope this helps,
Dirk
-
Hello Dirk, thank you for that great answer. We have since been doing a bit more digging of our own, and before we go back to the web developer we want to check what should be happening with the links that we are finding duplicated. The Duplicate Page issues are coming from login-page links that carry information about where the user was redirected from.
For example, if the visitor is not logged in and wants to add an item to their wish list, they are redirected to the login page with the item code and intended action in the URL, so that they can continue on to the desired page once logged in.
The Moz crawler is seeing these pages as having duplicate content because they are all the same apart from a piece of information in the URL. Should we be blocking these duplications? Are they a risk to us? What should we be doing?
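To illustrate why a crawler counts these as duplicates, here is a rough sketch that groups crawled URLs by their address without the query string. The domain and the parameter names (action, item) are made up for the example, since the real URLs are not shown here.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def group_by_page(urls):
    """Map each query-less URL to the crawled variants that share it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_query(url)].append(url)
    return dict(groups)


# Hypothetical crawl output; domain and parameter names are assumptions.
crawled = [
    "https://www.example.com/login?action=wishlist&item=1001",
    "https://www.example.com/login?action=wishlist&item=1002",
    "https://www.example.com/login?action=review&item=1001",
    "https://www.example.com/products/1001",
]

for page, variants in group_by_page(crawled).items():
    if len(variants) > 1:
        print(f"{page}: {len(variants)} URL variants of the same page")
```

Every variant serves the same login form, so each one is a separate URL with identical content as far as a crawler is concerned.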
I have also added this as a new question - I am quite new to this community thing so wasn't sure which was the best way to ask the question.
Many thanks again,
Sarah
-
Moz only indexes pages its crawler is able to find. This implies that your production site has links to your development site.
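One way to track those links down is to scan the HTML of production pages for anchors that point at the dev hostname. The following is only a sketch under assumptions: the hostnames www.example.com and dev.example.com and the sample markup are hypothetical, and fetching the pages is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def links_to_host(html, page_url, host):
    """Return absolute link targets on the page whose hostname equals host."""
    collector = LinkCollector()
    collector.feed(html)
    absolute = [urljoin(page_url, href) for href in collector.hrefs]
    return [url for url in absolute if urlsplit(url).hostname == host]


# Hypothetical production page with one leftover link to the dev site.
sample_html = (
    '<a href="/about">About</a> '
    '<a href="https://dev.example.com/about">About (old link)</a>'
)

print(links_to_host(sample_html, "https://www.example.com/", "dev.example.com"))
```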
I don't really agree with what your dev is saying - he should correct these links first and put a noindex on the dev pages. Alternatively, put a password on the dev site so that it is only accessible to people who have the password. If a lot of users link to your dev site, it could become more important than your main site. Google will try to choose the most appropriate site, but you have no guarantee that it will choose the right version. In any case, that's not the type of risk you should be willing to take.
Once this is done, you can request removal of these pages via Search Console.
Once all pages are removed from the index, you can adapt the robots.txt to block Google and other bots. Do this only after all pages are removed - otherwise Google will be blocked from crawling the pages and will never see the noindex directive.
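The ordering matters because a bot that obeys robots.txt never fetches a disallowed page, so it cannot read a noindex tag on it. A small sketch with Python's standard urllib.robotparser shows the effect; the robots.txt contents and the dev URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt variants for the dev site.
blocking_rules = "User-agent: *\nDisallow: /\n"
open_rules = "User-agent: *\nDisallow:\n"

dev_url = "https://dev.example.com/products/1001"

# While crawling is blocked, the page (and its noindex tag) is never fetched.
print("blocked robots.txt, crawlable:", can_crawl(blocking_rules, dev_url))
print("open robots.txt, crawlable:", can_crawl(open_rules, dev_url))
```

So the noindex has to be crawlable first, and the Disallow rule added only once the pages have dropped out of the index.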
Dirk
Related Questions
-
Why did Moz crawl our development site?
In our Moz Pro account we have one campaign set up to track our main domain. This week Moz threw up around 400 new crawl errors, 99% of which were meta noindex issues. What happened was that somehow Moz found the development/staging site and decided to crawl that. I have no idea how it was able to do this - the robots.txt is set to disallow all and there is password protection on the site. It looks like Moz ignored the robots.txt, but I still don't have any idea how it was able to do a crawl - it should have received a 401 Unauthorized and not gone any further. How do I a) clean this up without going through and manually ignoring each issue, and b) stop this from happening again? Thanks!
Moz Pro | | MultiTimeMachine0 -
Are there tools to discover duplicate content issues with the other websites?
We have issues with users copy-pasting content from other sources into our site. The only way I know to find out, is to manually (!!) copy a snippet of their text into google, to see if I get results from other sites. I have been googling for tools to help automate this process, but without luck. Can you recommend any?
Moz Pro | | betternow0 -
Issue: 301 (Permanent Redirect) with my wordpress
I see in my campaigns in my SEOmoz Pro account that I have two URLs with the issue 301 (Permanent Redirect). What should I do about this in WordPress? Here are the errors:
Page Title | URL | Redirects to | Page Authority | Linking Root Domains
http://www.simplymadrid.org/blog/ | http://www.simplymadrid.org/blog/ | http://www.simplymadrid.org/blog | 14 | 1
http://www.simplymadrid.org/es | http://www.simplymadrid.org/es | http://www.simplymadrid.org/es/ | 24 | 2
Moz Pro | | sandyallain0 -
Why does SEO Moz say we are ranked lower than appears on Google Searches?
We are currently ranked 18th for the Sydney Vet keyword in the SEOmoz ranking tools, but in organic Google search we are ranked third. This search was conducted without Google's personalised results feature. Is this just an error? Or does it have something to do with Google Places not being counted in the SEOmoz ranking tools? Any help would be much appreciated.
Moz Pro | | Peter.Huxley590 -
Moz crawling
Hi Everyone! I'm new to the SEOMoz and wanted to find out if there is a way to decrease the waiting time for the campaign crawl. I have made a lot of changes based on the first crawl and would like to see how these are reflected on the reports, but can't until the next crawl is performed. Any help would be greatly appreciated.
Moz Pro | | coremediadesign0 -
Google Peru rank and SEO Moz
Hello, I want to track my Google Peru rank, but the tool tells me that I'm not in the top 50 and I'm sure that I am. How do I get a real ranking?
Moz Pro | | Kuna0 -
How long does it take for a link to appear in ose ?
Hi, so how long does it take for OSE to index a link? Say, from a PR 9 site like Yahoo. Cheers, vishal
Moz Pro | | vishalkhialani0 -
On page optimisation tool issues
When viewing my campaign and looking at the on-page optimisation tool, I have a few issues. It seems to only show the keywords I want rankings for and how optimised my homepage is for those keywords. Is there any way I can get it to permanently analyse specific keywords for specific pages? My homepage isn't optimised for some keywords on my list because I have optimised other pages for them, and because the tool is looking at my homepage it gives a really low grade, which looks really bad and frustrates me because I can't work this out. Any help greatly appreciated.
Moz Pro | | CompleteOffice1