Help - we're blocking SEOmoz crawlers
-
We have a fairly stringent blacklist, and by the looks of our crawl reports we've begun unintentionally blocking the SEOmoz crawler.
Can you guys let me know the user-agent string and anything else I need to make sure your crawlers are whitelisted?
Cheers!
-
Hi Keri,
Still testing, though I see no reason why this shouldn't work, so I will close the QA ticket.
Cheers!
-
Hi! Did this work for you, or would you like our help team to lend a hand?
-
We maintain a blacklist of crawlers (and other agents) to control server load, so I'm just looking for the user-agent string I can add to the whitelist. This one should do the trick:
Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)
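For anyone wiring up something similar, here is a rough sketch of a whitelist check sitting in front of a blacklist. It is only an illustration: `ALLOWED_TOKENS`, `BLOCKED_TOKENS` and `is_blocked` are made-up names, and case-insensitive substring matching is an assumption about how the filter works, not a description of any particular server or firewall.

```python
# Hypothetical token lists; replace with whatever your own blacklist contains.
ALLOWED_TOKENS = ["rogerbot"]
BLOCKED_TOKENS = ["bot", "crawler", "spider"]


def is_blocked(user_agent: str) -> bool:
    """Return True if a request with this user agent should be blocked.

    The allow list is checked first, so a whitelisted crawler gets through
    even when it also matches a generic blacklist token such as "bot".
    """
    ua = user_agent.lower()
    if any(token in ua for token in ALLOWED_TOKENS):
        return False
    return any(token in ua for token in BLOCKED_TOKENS)


ROGERBOT_UA = (
    "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; "
    "http://www.seomoz.org/dp/rogerbot)"
)

print(is_blocked(ROGERBOT_UA))             # whitelisted crawler
print(is_blocked("SomeOtherCrawler/2.0"))  # matches the "crawler" token
```

Checking the whitelist before the blacklist is the important part here; with the order reversed, the generic "bot" token would catch rogerBot first.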
-
Still way too early for me ;-). I block specific robots rather than excluding all but a few.
I have not tried the following (but think/hope it will work) - this should block all robots, but allow SeoMoz and Google:
User-agent: *
Disallow: /
User-agent: rogerbot
Disallow:
User-agent: Google
Disallow:
You would already have something like this in your robots.txt (unless your block occurs on a network/firewall level).
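If you want to sanity-check rules like these before deploying them, here is a minimal sketch using Python's standard `urllib.robotparser`. Assumptions: the rules are laid out one directive per line, `/any/page` and `SomeOtherBot` are placeholder names I made up, and Python's parser only approximates how a real crawler reads the file (for instance, it matches the `Google` token against `Googlebot` by substring, which other parsers may not do).

```python
from urllib.robotparser import RobotFileParser

# The rules from the post, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: rogerbot
Disallow:

User-agent: Google
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str = "/any/page") -> bool:
    """Ask Python's robots.txt parser whether `agent` may fetch `url`."""
    return parser.can_fetch(agent, url)


# Pass the short crawler token, not the full access-log string.
for agent in ("rogerbot", "Googlebot", "SomeOtherBot"):
    print(agent, "allowed" if allowed(agent) else "blocked")
```

An empty `Disallow:` means "nothing is disallowed", which is why the two named crawlers get through while everything else falls under the `*` group.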
-
Thanks Gerd, though it looks like your robots.txt is a disallow rule, whereas I'm looking to let it through.
I'll give this one a try: Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)
-
I have it as "rogerbot"
<code>User-agent: rogerbot
Disallow: /</code>
Access-log: Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; http://www.seomoz.org/dp/rogerbot)
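To show how the short robots.txt name relates to the longer access-log string, here is a small sketch that pulls the `name/version` product tokens out of a user-agent string. The `PRODUCT_TOKEN` pattern and the `product_tokens` helper are hypothetical, written just for this illustration, and the pattern is a simplification rather than a full user-agent grammar.

```python
import re

ACCESS_LOG_UA = (
    "Mozilla/5.0 (compatible; rogerBot/1.0; UrlCrawler; "
    "http://www.seomoz.org/dp/rogerbot)"
)

# Hypothetical pattern: a name, a slash, then a version starting with a digit.
PRODUCT_TOKEN = re.compile(r"([A-Za-z][\w.-]*)/(\d[\w.]*)")


def product_tokens(user_agent: str) -> list[str]:
    """Return the lowercased product names found in a user-agent string."""
    return [name.lower() for name, _version in PRODUCT_TOKEN.findall(user_agent)]


tokens = product_tokens(ACCESS_LOG_UA)
print(tokens)
print("rogerbot" in tokens)
```

Lowercasing matters because the log shows `rogerBot` while the robots.txt entry is written as `rogerbot`.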
Related Questions
-
301 Re-direct help
Hello Mozzers, I have a technical question that perhaps someone has experience with and can help with. I currently have 2 e-commerce websites: SITE-A.COM (original site) & SITE-B.COM (new site). SITE-B.COM is the newer site that has a lot of new products, new features and great content, and is very user friendly. We are thinking about funneling all of our visitors and traffic to SITE-B.com since it is the better experience for the users ... the question is this: If we want to 301 redirect all traffic from Site-A.com to Site-B.com ... where do we initiate those redirect requests? Would it be on the server for Site-A.com? If so, would I have to keep that server up and running forever if I don't want to lose the redirects? Also, how do I do this properly without violating Google's guidelines? Any help is appreciated. Thanks
Technical SEO | | Prime850 -
Godaddy and Soft 404's
Hello, We've found that a website we manage has a list of not-found URLs in Google Webmaster Tools which are "soft 404s" according to Google. I went to the hosting company, GoDaddy, to explain and to see what they could do. As far as I can see, GoDaddy's servers are responding with a 200 HTTP status code, meaning that the page exists and was served properly. They have sort of disowned this as their problem. Their server is not serving up a true 404 response. This is a WordPress site. 1) Has anyone seen this problem before with GoDaddy? Is it a GoDaddy problem? 2) Do you know a way to sort this issue? When I use the command site:mydomain.co.uk the number of URLs indexed is about right except for 2 or 3 "soft URLs". So I wonder why Webmaster Tools reports so many, yet I can't see them all in the index?
Technical SEO | | AL123al0 -
Pagination Help
Hi Moz Community, I've recently started helping a new site with their overall health and I have some pagination issues. It's an ecommerce site and they currently don't have any pagination in place except for these tags: Prev 1 2 3 ... 66 Next I understand what these are doing (leading visitors to the previous, next or last page), but do these do anything for search crawlers, or does the site need to have an option of:
1. rel=next/rel=prev
2. canonical leading to the view all page (the view all page takes a long time to load)
Thanks for your help. -Reed
Technical SEO | | IceIcebaby0 -
Why aren't certain links showing in SEOMOZ?
Hi, I have been trying to understand our page rank and the domains that are linking to us. When I look at the list of linking domains, I see some bigger ones are missing and I don't know why. For example, we are in the Yahoo Directory with a link to trophycentral.com, but SEOMOZ is not showing the link. If SEOMOZ is not seeing it, my guess is Google is not either, which concerns me. There are several other high page rank domains also not showing. Anyone have any idea why? Thanks! BTW, our domain is trophycentral.com
Technical SEO | | trophycentraltrophiesandawards0 -
Redirection help to retrieve broken links
Hi, after my hosting company updated my Joomla website, they lost thousands of pages of content. I am now searching for all the broken links and redoing the content to get my links back, but I am having a problem understanding how to redirect these links. For example, I have now managed to retrieve this page http://www.in2town.co.uk/news/have-your-say/liberal-dem-leader-says-he-will-be-the-next-prime-minister-what-do-you-think but the old URL for this page was http://www.in2town.co.uk/Have-Your-Say/Liberal-Dem-Leader-says-He-Will-be-The-Next-Prime-Minister-What-Do-You-Think/menu-id-4953 I do not have the unfriendly URL for this page, so what I am trying to find out is how to tell Google that the above page is now http://www.in2town.co.uk/news/have-your-say/liberal-dem-leader-says-he-will-be-the-next-prime-minister-what-do-you-think in my Joomla site. If anyone could please explain how to do this with Joomla 1.5, you will make me very happy, as I will then be able to retrieve some of my lost links.
Technical SEO | | ClaireH-1848860 -
Site 'filtered' by Google in early July.... and still filtered!
Hi, Our site got demoted by Google all of a sudden back in early July. You can view the site here: http://alturl.com/4pfrj and you may read the discussions I posted in Google's forums here: http://www.google.com/support/forum/p/Webmasters/thread?tid=6e8f9aab7e384d88&hl=en http://www.google.com/support/forum/p/Webmasters/thread?tid=276dc6687317641b&hl=en Those discussions chronicle what happened, and what we've done since. I don't want to make this a long post by retyping it all here, hence the links. However, we've made various changes (as detailed), such as getting rid of duplicate content (use of noindex on various pages etc), and ensuring there is no hidden text (we made an unintentional blunder there through use of a 3rd party control which used CSS hidden text to store certain data). We have also filed reconsideration requests with Google and been told that no manual penalty has been applied. So the problem is down to algorithmic filters which are being applied. So... my reason for posting here is simply to see if anyone here can help us discover if there is anything we have missed? I'd hope that we've addressed the main issues and that eventually our Google ranking will recover (ie. filter removed.... it isn't that we 'rank' poorly, but that a filter is bumping us down, to, for example, page 50).... but after three months it sure is taking a while! It appears that a 30 day penalty was originally applied, as our ranking recovered in early August. But a few days later it dived down again (so presumably Google analysed the site again, found a problem and applied another penalty/filter). I'd hope that might have been 30 or 60 days, but 60 days have now passed.... so perhaps we have a 90 day penalty now. OR.... perhaps there is no time frame this time, simply the need to 'fix' whatever is constantly triggering the filter (that said, I 'feel' like a time frame is there, especially given what happened after 30 days). 
Of course the other aspect that can always be worked on (and oft-mentioned) is the need for more and more original content. However, we've done a lot to increase this and think our Guide pages are pretty useful now. I've looked at many competitive sites which list in Google and they really don't offer anything more than we do..... so if that is the issue it sure is puzzling if we're filtered and they aren't. Anyway, I'm getting wordy now, so I'll pause. I'm just asking if anyone would like to have a quick look at the site and see what they can deduce? We have of course run it through SEOMoz's tools and made use of the suggestions. Our target pages generally rate as an A for SEO in the reports. Thanks!
Technical SEO | | Go2Holidays0 -
What to do about "blocked by meta-robots"?
The crawl report tells me "Notices are interesting facts about your pages we found while crawling". One of these interesting facts is that my blog archives are "blocked by meta robots". Articles are not blocked, just the archives. What is a "meta" robot? I think it's just normal (since the articles need only be crawled once) but want a second opinion. Should I care about this?
Technical SEO | | GPN0 -
Slashes In Url's
If your CMS has created two URLs for the same piece of content that look like the following, www.domianname.com/stores and www.domianname.com/stores/, will this be seen as duplicate content by Google? Your tools seem to pick it up as errors. Does one of the URLs need a 301 to the other to clear this up, or is it not a major problem? Thanks.
Technical SEO | | gregster10000