Mozbot Cannot Crawl Entire Domain
-
I'm trying to crawl Redken.com in Moz Analytics, and Search Diagnostics is only crawling 4 pages. The domain shows a "select your country" prompt the first time you visit, and it seems as though the bot is not getting beyond that (i.e., not clicking on "USA") and is therefore not crawling the rest of the domain. There is no country-specific URL other than redken.com.
I've tried entering both "redken.com" and "www.redken.com" as the URL, but no luck.
Any tips?
-
It's caused by the way you have built your site. If you go to redken.com, you get a choice of country. If you select "USA", you're redirected with a 302 to redken.com/USA, then with a 302 to redken.com/?country=USA, then with a 302 to redken.com. I guess for browsers you store the choice somewhere (a cookie?). A simple bot, however (like Moz's, and I see the same with Screaming Frog), just ends up back where it started, at redken.com, which starts the same loop again.
So only 4 URLs can be crawled. The other countries are on different URLs, so they will not be included in the crawl.
Googlebot is smarter and acts more like a real browser, so it will crawl the site, but Mozbot can't do that.
rgds
Dirk
Update: I actually forgot one redirect. redken.com is first redirected with a 302 to redken.com/international.
PS: The site is horribly slow as well, and the redirect chain is certainly not helping.
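To make the loop concrete, here is a minimal Python sketch of the redirect chain described above. It does not touch the real site: the paths mirror the redirects listed in this post, while the cookie name and the `respond` and `follow` helpers are hypothetical, since the cookie is only a guess about how the country choice is stored.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model of the redirects described in this thread.
# The cookie name is an assumption made purely for illustration.
COOKIE = "country"


def respond(path: str, cookies: Dict[str, str]) -> Tuple[int, Optional[str], Dict[str, str]]:
    """Return (status, redirect target, cookies to set) for one request."""
    if path == "/":
        if COOKIE in cookies:
            return 200, None, {}            # country is known: serve the home page
        return 302, "/international", {}    # country unknown: send to the selector
    if path == "/international":
        return 200, None, {}                # the country selector page
    if path == "/USA":
        return 302, "/?country=USA", {}
    if path == "/?country=USA":
        return 302, "/", {COOKIE: "USA"}    # store the choice, then go home
    return 404, None, {}


def follow(start: str, keep_cookies: bool, max_hops: int = 10) -> List[str]:
    """Follow redirects from `start` and return every path visited."""
    cookies: Dict[str, str] = {}
    visited = [start]
    path = start
    for _ in range(max_hops):
        status, target, new_cookies = respond(path, cookies)
        if keep_cookies:
            cookies.update(new_cookies)     # browser-like client remembers the choice
        if status != 302 or target is None:
            break
        visited.append(target)
        path = target
    return visited


if __name__ == "__main__":
    print(follow("/USA", keep_cookies=False))  # client that drops cookies
    print(follow("/USA", keep_cookies=True))   # client that keeps cookies
```

In this model the client that drops cookies ends up back on the country selector after visiting only four paths, while the client that keeps them reaches the home page. Whether the real site works this way would need to be confirmed by inspecting the actual response headers.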
-
Well, I just noticed that the website is in Flash! I believe none of the crawl bots are able to crawl Flash websites.
It seems that if I try to access redken.com, it redirects me to the Flash version (/international).
Actually, now I can't recreate that. Super weird. Is there something "special" going on with automatic redirects? Look into that.
-
Thanks for the response!
These are the pages it crawled.
| http://redken.com |
| http://www.redken.com/ |
| http://www.redken.com/international/ |
| http://www.redken.com/USA |
| http://www.redken.com/?country=USA |
Robots.txt looks clean, nothing that should have stopped it from crawling more.
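For anyone who wants to double-check a robots.txt rather than eyeball it, here is a small sketch using Python's standard `urllib.robotparser`. The rules in `SAMPLE_ROBOTS` are made up for illustration and are not redken.com's actual file; `is_allowed` is a hypothetical helper name. Moz's crawler identifies itself as rogerbot, so that is the user agent tested.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; paste the real file's text here.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.redken.com/USA", "http://www.redken.com/admin/login"):
        print(url, is_allowed(SAMPLE_ROBOTS, "rogerbot", url))
```

If every URL you care about comes back allowed, robots.txt is probably not what is stopping the crawl.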
-
Hi there.
Which pages are those 4 pages? Is your robots.txt blocking it for some reason maybe?