Mozbot Cannot Crawl Entire Domain
-
I'm trying to crawl Redken.com in Moz Analytics, and the Search Diagnostics is only crawling 4 pages. The domain shows a "select your country" prompt the first time you visit, and it seems the bot is not getting beyond that (i.e., it is not clicking on "USA") and is therefore not crawling the rest of the domain. There is no country-specific URL other than redken.com.
I've tried entering both "redken.com" and "www.redken.com" as the URL, but no luck.
Any tips?
-
It's caused by the way you have built your site. If you go to redken.com, you get the choice of country. If you select "USA", you're redirected with a 302 to redken.com/USA, then with a 302 to redken.com/?country=USA, then with a 302 back to redken.com. I guess for browsers you store the choice somewhere (a cookie?). A simple bot, however (like Moz's, and I see the same with Screaming Frog), just ends up back where it started, at redken.com, which starts the same loop again.
So only 4 URLs can be crawled. The other countries are on different URLs, so they will not be included in the crawl.
Googlebot is smarter and acts more like a real browser, so it will crawl the site, but Mozbot can't do that.
rgds
Dirk
Update: I actually forgot one redirect. redken.com is first redirected with a 302 to redken.com/international.
PS: The site is horribly slow as well, and the redirect chain is certainly not helping.
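To illustrate the loop described above, here is a minimal Python sketch. It is a toy model, not how Mozbot or Screaming Frog is actually implemented. Only the redirect hops come from this thread; the `fetch` function, the `country` cookie, and the `redken.com/products` page are hypothetical and exist only to show the difference between a client that keeps cookies and one that does not.

```python
from collections import deque


def fetch(url, cookies):
    """Toy stand-in for an HTTP request: ("redirect", target) or ("page", links)."""
    if url == "redken.com":
        if cookies.get("country") == "USA":
            # Assumed: with the cookie set, the real homepage is served.
            return "page", ["redken.com/products"]  # hypothetical inner page
        return "redirect", "redken.com/international"
    if url == "redken.com/international":
        return "page", ["redken.com/USA"]  # the country chooser
    if url == "redken.com/USA":
        return "redirect", "redken.com/?country=USA"
    if url == "redken.com/?country=USA":
        cookies["country"] = "USA"  # guess: the choice is stored in a cookie
        return "redirect", "redken.com"
    return "page", []


def crawl(start, keep_cookies, max_hops=5):
    """Breadth-first crawl; returns every URL discovered, in order."""
    jar = {}
    seen = [start]
    queue = deque([start])
    while queue:
        url = queue.popleft()
        for _ in range(max_hops):
            # A cookie-less bot gets a fresh, empty cookie store on every request.
            kind, result = fetch(url, jar if keep_cookies else {})
            if kind == "page":
                break
            url = result
            if url not in seen:
                seen.append(url)
        else:
            continue  # too many redirects: give up on this URL
        for link in result:
            if link not in seen:
                seen.append(link)
                queue.append(link)
    return seen


simple_bot = crawl("redken.com", keep_cookies=False)
browser_like = crawl("redken.com", keep_cookies=True)
print(len(simple_bot), simple_bot)
print(len(browser_like), browser_like)
```

In this model the cookie-less crawl never gets past the four URLs in the redirect chain, while the cookie-keeping crawl reaches the (made-up) inner page.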
-
Well, I just noticed that the website is in Flash! I believe none of the crawl bots are able to crawl Flash websites.
It seems that if I try to access redken.com, it redirects me to the Flash version (/international).
Actually, now I can't recreate that. Super weird. Is there something "special" going on with automatic redirects? Look into that.
-
Thanks for the response!
These are the pages it crawled.
| http://redken.com |
| http://www.redken.com/ |
| http://www.redken.com/international/ |
| http://www.redken.com/USA |
| http://www.redken.com/?country=USA |
Robots.txt looks clean, nothing that should have stopped it from crawling more.
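For anyone who wants to double-check a robots.txt rather than eyeball it, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content below is a made-up sample, not the real redken.com file, and the `is_allowed` helper is just an illustrative name. `rogerbot` is the user agent Moz's crawler goes by.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration, NOT the real redken.com file.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


allowed = is_allowed(SAMPLE_ROBOTS, "rogerbot", "http://www.redken.com/USA")
blocked = is_allowed(SAMPLE_ROBOTS, "rogerbot", "http://www.redken.com/admin/login")
print(allowed, blocked)
```

If every crawled URL comes back as allowed, robots.txt is unlikely to be what stops the crawl.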
-
Hi there.
Which pages are those 4 pages? Is your robots.txt blocking it for some reason maybe?