When rogerbot tries to crawl my site it gets a 404. Why?
-
When rogerbot tries to crawl my site, it requests http://website.com. My website then tries to redirect to http://www.website.com, but throws a 404 and ends up not getting crawled. It also throws a 404 when trying to read my robots.txt file, for some reason. We allow the rogerbot user agent, so I'm unsure what's happening here. Is there something odd going on when accessing my site without the 'www' that is causing the 404? Any insight is helpful here.
Thanks,
-
Hey Dan,
So that's the problem. Our site is up, and I can manually navigate to anything, including the robots.txt file. I've done this multiple times throughout the day, and on different days as well, and manually triggered different Moz crawls at different times, so I've ruled out an outage.
-
The robots.txt 404 could be a temporary outage, but it's a bit hard to tell without being able to see the actual site and robots.txt. Try checking that the site is up and that you can access the robots.txt, then request a new Moz crawl...
I do have one client who insists on blocking everything and then allowing specific crawlers, and allowing rogerbot seems to have worked fine to date.
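In case it helps the debugging, here's a rough sketch (Python, standard library only; website.com is the placeholder domain from the question) of walking the status/Location hops you'd see from `curl -sIL http://website.com` to spot where the chain breaks:

```python
# Sketch: given the status/Location hops recorded from `curl -sIL`,
# report where a redirect chain breaks. Pure function, so you can
# paste hops in by hand; website.com is a placeholder domain.

def diagnose_chain(hops):
    """hops: list of (status_code, location_or_None) tuples, in request order.
    Returns a short verdict string."""
    for i, (status, location) in enumerate(hops):
        if status in (301, 302, 307, 308):
            if not location:
                return f"hop {i}: redirect with no Location header"
            continue  # redirect looks sane, move to the next hop
        if status == 200:
            return f"hop {i}: OK, chain resolves"
        return f"hop {i}: chain breaks with HTTP {status}"
    return "chain never resolved (redirect loop?)"

# The asker's symptom: non-www 301s, then the www target itself 404s.
print(diagnose_chain([(301, "http://www.website.com/"), (404, None)]))
# -> hop 1: chain breaks with HTTP 404
```

If that's what you see, the redirect is fine and the www destination is what's broken, which would explain both the page 404s and the robots.txt 404.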
Related Questions
-
GWT giving me 404 errors based on an old, deleted sitemap
I'm getting a bunch of 404 crawl errors in Google Webmaster Tools because we just moved our site to a new platform with a new URL structure. We 301 redirected all the relevant pages. We submitted a new sitemap and then deleted all the sitemaps with the old website URL structure. However, Google keeps crawling the OLD URLs and reporting back the 404 errors. It says that the website is linking to these 404 pages via an old, outdated sitemap (which, if you go to it, shows a 404 as well, so it's not as if Google is reading these old sitemaps now). Instead, it's as if Google has cached the old sitemap but continues to use it to crawl these non-existent pages. Any thoughts?
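One way to sanity-check this from your end is to confirm that every URL in the old sitemap actually has a 301 target; anything left over is what Google will keep reporting as a 404. A minimal sketch, with made-up URLs:

```python
# Sketch (hypothetical URLs): find old-sitemap URLs that have no 301 target,
# i.e. the ones Google will keep reporting as 404s.

def unredirected(old_sitemap_urls, redirect_map):
    """redirect_map: {old_url: new_url} for every 301 you set up."""
    return [url for url in old_sitemap_urls if url not in redirect_map]

old_urls = ["/old/widget-1", "/old/widget-2", "/old/about"]
redirects = {"/old/widget-1": "/products/widget-1", "/old/about": "/about"}
print(unredirected(old_urls, redirects))  # -> ['/old/widget-2']
```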
Technical SEO | Santaur
-
Our stage site got crawled and we got an unnatural inbound links warning. What now?
live site: www.mybarnwoodframes.com
stage site: www.methodseo.net
We recently finished a redesign of our site to improve our navigation. Our developer insisted on hosting the stage site on her own server with a separate domain while she worked on it. However, somebody left the site turned on one day and Google crawled the entire thing. Now we have 4,320 pages of 100% identical duplicate content with this other site. We were upset, but didn't think it would have any serious repercussions until we got two orders from customers through the stage site one day. It turns out the second site was ranking pretty decently for a duplicate site with 0 links. The worst was yet to come, however: during the 3 months of the redesign, our rankings on our live site dropped and we suffered a 60% drop in organic search traffic. On May 22, 2013, the day of the Penguin 2.0 release, we received an unnatural inbound links warning. Google Webmaster Tools shows 4,320 of our 8,000 links coming from the stage site domain to our live site, so we figure that was the cause of the warning. We finished the redesign around May 14th and took down the stage site, but it is still showing up in the search results and the 4,320 links are still showing up in our Webmaster Tools.
1. Are we correct to assume that it was the stage site that caused the unnatural links warning?
2. Do you think the stage site caused the drop in traffic? After doing a link audit, I can't find any large number of horrendously bad links coming to the site.
3. Now that the stage site has been taken down, how do we get it out of Google's indexes? Will it drop out over time, or do we need to do something on our end for it to be delisted?
4. Once it's delisted, the links coming from it should go away. In the meantime, however, should we disavow all of the links from the stage site? Do we need to file a reconsideration request, or should we just be patient and let them go away naturally?
5. Do you think our rankings will ever recover?
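On question 4, if you do decide to disavow: Google's disavow file format accepts whole domains via the `domain:` prefix, so covering all 4,320 stage-site links takes one line. A sketch, using the stage domain from the question:

```python
# Sketch: build a Google disavow file. The "domain:" prefix disavows
# every link from that domain, so one line covers the whole stage site.
# methodseo.net is the stage domain mentioned in the question.

def disavow_lines(domains, comment):
    """Return the text of a disavow file: a leading comment, then one
    domain: line per domain."""
    lines = [f"# {comment}"]
    lines += [f"domain:{d}" for d in domains]
    return "\n".join(lines)

print(disavow_lines(["methodseo.net"], "links from crawled staging site"))
# -> # links from crawled staging site
#    domain:methodseo.net
```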
Technical SEO | gallreddy
-
Site getting referral traffic from itself
Good morning from zero degrees C freezing fog in Wetherby, UK. On this site, http://www.collegeofphlebology.com, I ran a referral report via Google Analytics and was surprised to see referral traffic being counted from its own URL; illustration here: http://i216.photobucket.com/albums/cc53/zymurgy_bucket/referral-anomoly.jpg So my question is, please: how can a site get referral traffic from its own URL? Thanks so much, David
Technical SEO | Nightwing
-
Redirecting the .com of our site
Hey guys, a company I consult for has a different site for its users depending on geography. Example: when a visitor goes to www.company.com, a user from the EU gets redirected to http://eu.company.com, a user from the US goes to http://us.company.com, and so on. I have two questions: 1. Will having a redirect on the .com influence rankings on each specific sub-site? I suspect it will affect the .com, since it will simply not get indexed, but I'm not sure whether it affects the subdomains. 2. The content on these sub-sites is not different (I'm still trying to figure out why they are using the subdomains). Will they get penalized for duplicate content? Thanks!
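For what it's worth, here's a sketch of the geo routing described (the subdomains are from the question; the country-to-region mapping is a made-up stub). Since Googlebot crawls mostly from US IPs, whatever the US branch serves is effectively what gets crawled and indexed:

```python
# Sketch of the geo routing described in the question. EU_COUNTRIES is a
# made-up stub, not an exhaustive list; the subdomains come from the question.

EU_COUNTRIES = {"DE", "FR", "IT", "ES", "NL"}  # stub only

def regional_url(country_code):
    """Map a visitor's country code to the regional sub-site."""
    if country_code in EU_COUNTRIES:
        return "http://eu.company.com"
    # Unknown regions fall through to one default, which is also what a
    # crawler with no geo signal would typically receive.
    return "http://us.company.com"

print(regional_url("DE"))  # -> http://eu.company.com
print(regional_url("US"))  # -> http://us.company.com
```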
Technical SEO | FDSConsulting
-
301 redirecting old content from one site to updated content on a different site
I have a client with two websites. Here are some details; sorry I can't be more specific! Their older site -- specific to one product -- has a very high DA and about 75K visits per month, 80% of which come from search engines. Their newer site -- focused generally on the brand -- is their top priority. The content here is much better. The vast majority of visits are from referrals (mainly social channels and an email newsletter) and direct traffic. Search traffic is relatively low, though. I really want to boost search traffic to site #2, and I'd like to piggyback off some of the search traffic from site #1. Here's my question: if a particular article on site #1 (that ranks very well) needs to be updated, what's the risk/reward of updating the content on site #2 instead and 301 redirecting the original post to the newer post on site #2? Part 2: there are dozens of posts on site #1 that can be improved and updated. Is there an extra risk (or diminishing returns) associated with doing this across many posts? Hope this makes sense. Thanks for your help!
Technical SEO | djreich
-
Way to find how many sites within a given set link to a specific site?
Hi, Does anyone have an idea on how to determine how many sites within a list of 50 sites link to a specific site? Thanks!
Technical SEO | SparkplugDigital
-
If you add a nofollow to a time-sensitive link, will it get picked up as a broken 404 link in the WMT report?
We have a client who publishes deals that are time-sensitive. Links to the deals expire, so Google's crawlers are picking them up and finding a 404. If I nofollow them, will the 404s still get picked up and reported in WMT? The same question applies to SEOmoz Pro.
Technical SEO | Red_Mud_Rookie
-
Site Structure question
When deciding the site structure for an e-commerce site, is it better to keep everything flat, like mysite.com/widget.html, or to use categories, like mysite.com/Gifts/widget.html?
Technical SEO | DavidKonigsberg