When rogerbot tries to crawl my site it gets a 404. Why?
-
When rogerbot tries to crawl my site, it requests http://website.com. My website then tries to redirect to http://www.website.com, but the request returns a 404 and the site ends up not getting crawled. Rogerbot also gets a 404 when trying to read my robots.txt file for some reason. We allow the rogerbot user agent, so I'm unsure what's happening here. Is something odd happening when the site is accessed without the 'www' that is causing the 404? Any insight is helpful here.
Thanks,
-
Hey Dan,
So that's the problem. Our site is up and I can manually navigate to anything, including the robots.txt file. I've done this multiple times throughout the day and on different days as well, and I've manually triggered Moz crawls at different times, so I've ruled out an outage.
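A browser follows redirects silently and sends its own headers, so a manual check may not show what a crawler sees. Below is a minimal Python sketch for tracing the non-www to www redirect one hop at a time. The user-agent token "rogerbot" is a simplified assumption (the real crawler sends a longer string), and `fetch_once` and `trace` are hypothetical helper names, not part of any Moz tool.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

# Assumption: a simplified user-agent token, not the crawler's full string.
USER_AGENT = "rogerbot"
REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_once(url):
    """Make one request and return (status, Location header or None)."""
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    try:
        with opener.open(req, timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Unfollowed redirects and 4xx/5xx responses both arrive here.
        return err.code, err.headers.get("Location")


def trace(url, fetch=fetch_once, max_hops=5):
    """Follow redirects manually and return a list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace("http://website.com/robots.txt")` and printing the hops would show whether the 404 comes from the first request, from the redirect target, or from a malformed `Location` header.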
-
The robots.txt 404 could be a temporary outage, but it's a bit hard to tell without being able to see the actual site and robots.txt. Try checking that the site is up and that you can access the robots.txt, then request a new Moz crawl...
I do have one client who insists on blocking everything and then allowing specific crawlers, and allowing rogerbot seems to have worked fine to date.
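For the "block everything, then allow specific crawlers" setup, the rules themselves can be checked offline with Python's standard `urllib.robotparser`. This is only a sketch: the robots.txt content below is an assumed example, not the actual file from either site, and `allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed example rules: allow rogerbot, block every other crawler.
ROBOTS_TXT = """\
User-agent: rogerbot
Allow: /

User-agent: *
Disallow: /
"""


def allowed(robots_txt, agent, url):
    """Return True if the given rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)
```

This only confirms that the rules permit rogerbot. It says nothing about whether the server actually returns the file, so a 404 on robots.txt still has to be investigated at the server or redirect level.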