When rogerbot tries to crawl my site it gets a 404. Why?
-
When rogerbot tries to crawl my site, it requests http://website.com. My website then tries to redirect to http://www.website.com, but the request ends in a 404 and the site doesn't get crawled. It also gets a 404 when trying to read my robots.txt file for some reason. We allow the rogerbot user agent, so I'm unsure what's happening here. Is there something weird going on when the site is accessed without the 'www' that is causing the 404? Any insight is helpful here.
Thanks,
-
Hey Dan,
So that's the problem. Our site is up and I can manually navigate to anything, including the robots.txt file. I've done this multiple times throughout the day and on different days as well, and I've manually triggered Moz crawls at different times, so I've ruled out an outage.
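A manual check in a browser usually sends a browser user agent and often starts from the www address, so it may not reproduce what the crawler sees. Here is a rough sketch of how the non-www request could be traced hop by hop while identifying as rogerbot. The function name `trace_redirects` and the bare `"rogerbot"` user agent string are assumptions for illustration (the crawler's real user agent string is longer), and this is not how Moz itself crawls:

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Assumption: a simplified stand-in for the crawler's full user agent string.
USER_AGENT = "rogerbot"

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, user_agent=USER_AGENT, max_hops=5):
    """Return a list of (url, status) pairs, following redirects by hand."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
        else:
            conn = http.client.HTTPConnection(parts.netloc, timeout=10)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path, headers={"User-Agent": user_agent})
            resp = conn.getresponse()
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        hops.append((url, status))
        if status in REDIRECT_CODES and location:
            # Location may be relative, so resolve it against the current URL.
            url = urljoin(url, location)
        else:
            break
    return hops


# Example usage (performs real network requests, so it is left commented out):
# for hop_url, status in trace_redirects("http://website.com/robots.txt"):
#     print(status, hop_url)
```

If the last hop shows a 404 on a URL that differs from the one you open in the browser (for example, a doubled path or a wrong host in the `Location` header), the redirect rule is the likely culprit rather than an outage.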
-
The robots.txt 404 could be a temporary outage, but it's a bit hard to tell without being able to see the actual site and robots.txt. Try checking that the site is up and that you can access the robots.txt, then request a new Moz crawl...
I do have one client who insists on blocking everything and then allowing specific crawlers, and allowing rogerbot seems to have worked fine to date.
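For reference, a block-everything-then-allow setup like that can be sanity-checked offline with Python's standard `urllib.robotparser`. The robots.txt content below is a hypothetical example, not the actual file from either site, and `is_allowed` is just an illustrative helper name:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow rogerbot, block every other crawler.
ROBOTS_TXT = """\
User-agent: rogerbot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "rogerbot", "http://www.website.com/"))
print(is_allowed(ROBOTS_TXT, "SomeOtherBot", "http://www.website.com/"))
```

This only confirms that the rules say what you intend. It can't help if the crawler never receives the file because the robots.txt request itself returns a 404.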