Mozbot Cannot Crawl Entire Domain
-
I'm trying to crawl Redken.com in Moz Analytics and the Search Diagnostics is only crawling 4 pages. The domain shows a "select your country" prompt the first time you visit, and it seems as though the bot is not getting beyond that (i.e., not clicking on "USA") and is therefore not crawling the rest of the domain. There is no country-specific URL other than redken.com.
I've tried entering both "redken.com" and "www.redken.com" as the URL, but no luck.
Any tips?
-
It's caused by the way you have built your site. If you go to redken.com, you get the choice of country. If you select "USA" you're redirected with a 302 to redken.com/USA, then with a 302 to redken.com/?country=USA, then with a 302 back to redken.com. I guess for browsers you store the choice somewhere (a cookie?). A simple bot like Mozbot (I see the same with Screaming Frog) does not keep that state, so it just ends up back where it started, at redken.com, which starts the same loop again.
So only 4 URLs can be crawled. The other countries are on different URLs, so they will not be included in the crawl.
Googlebot is smarter and acts more like a real browser, so it will crawl the site, but Mozbot can't do that.
rgds
Dirk
Update: I actually forgot one redirect. redken.com is first redirected with a 302 to redken.com/international.
PS: The site is horribly slow as well, and the redirect chain is certainly not helping.
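To make the loop easier to see, here is a minimal sketch in Python that simulates it. This is a toy model, not Mozbot's actual logic: the redirect map is reconstructed from the hops described above, the idea that /?country=USA stores the choice in a cookie is only a guess, and "/products" is an invented placeholder for "the rest of the site".

```python
from collections import deque

MAX_HOPS = 10  # safety limit on redirects followed for a single request

# Hypothetical model of the site. Paths are relative to www.redken.com.
SETS_COOKIE = {"/?country=USA"}  # assumption: this response stores the country
REDIRECTS = {"/USA": "/?country=USA", "/?country=USA": "/"}
LINKS = {"/international/": ["/USA"], "/": ["/products"]}  # "/products" is made up


def fetch(path, has_cookie):
    """Return (redirect_target, links) for one simulated request."""
    if path == "/" and not has_cookie:
        # No stored country: the home page sends you to the country selector.
        return "/international/", []
    if path in REDIRECTS:
        return REDIRECTS[path], []
    return None, LINKS.get(path, [])


def crawl(start, keep_cookies):
    """Breadth-first crawl; returns the set of paths that were requested."""
    seen, queue, has_cookie = set(), deque([start]), False
    while queue:
        path = queue.popleft()
        if path in seen:
            continue
        # Follow the redirect chain for this request, hop by hop.
        for _ in range(MAX_HOPS):
            seen.add(path)
            if keep_cookies and path in SETS_COOKIE:
                has_cookie = True
            target, links = fetch(path, has_cookie)
            if target is None:
                queue.extend(links)  # landed on a real page: queue its links
                break
            path = target
    return seen
```

In this model, crawl("/", keep_cookies=False) only ever requests /, /international/, /USA and /?country=USA, because the home page keeps bouncing the stateless client back to the selector. With keep_cookies=True the same crawl gets past the selector and reaches the placeholder "/products" page.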
-
Well, I just noticed that the website is in Flash! I believe none of the crawl bots are able to crawl Flash websites.
It seems that if I try to access redken.com, it redirects me to the Flash version (/international).
Actually, now I can't recreate that. Super weird. Is there something "special" going on with automatic redirects? Look into that.
-
Thanks for the response!
These are the pages it crawled.
| http://redken.com |
| http://www.redken.com/ |
| http://www.redken.com/international/ |
| http://www.redken.com/USA |
| http://www.redken.com/?country=USA |
Robots.txt looks clean, nothing that should have stopped it from crawling more.
-
Hi there.
Which pages are those 4 pages? Is your robots.txt blocking it for some reason maybe?