The "webmaster" disallowed all ROBOTS to fight spam! Help!!
-
One of the companies I do work for has a Magento site. I am simply the SEO guy, and they run the website through developers who hold access to their systems VERY tightly. Using Google Webmaster Tools, I saw that the robots.txt file was blocking ALL robots.
I immediately e-mailed them and received a long reply about foreign robots and scrapers slowing down the website. They told me I would have to provide a list of only the good robots to allow in robots.txt.
Please correct me if I'm wrong, but isn't obeying robots.txt optional? Won't a bad scraper or bot still bog down the site? Shouldn't that be handled in .htaccess or something different?
I'm not new to SEO but I'm sure some of you who have been around longer have run into something like this and could provide some suggestions or resources I could use to plead my case!
If I'm wrong, please help me understand how we can meet both needs: allowing good bots to visit the site while keeping the 'bad' ones out. Their claim is that the site is bombarded by tons and tons of bots that have slowed down performance.
Thanks in advance for your help!
-
Thanks for the suggestions!! I'll keep you updated.
-
You can get a list of good robots from Robotstxt.org: http://www.robotstxt.org/db.html.
I'd recommend creating an edited version of the robots.txt file yourself, specifically allowing Googlebot and others. Then send that along with a link to the robotstxt.org site.
You may need to get the business owners involved. IT exists to enable the business, not strap it down so it can't move.
-
What you could do is add Allow statements for the different Googlebots and the bots of other search engines. This will probably keep the developers happy, since they can still shut other bots out. (I doubt it would actually work, though, and I definitely don't think robots.txt should be the way to keep spammers away, but that says more about the quality of the development ;-)).
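For anyone who wants to sanity-check such a file before sending it to the developers, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the `www.example.com` URL are made up for illustration and are not the site's real file. Python's parser only approximates how each search engine reads robots.txt, so treat the result as a rough check.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist-style robots.txt: named search engine bots are
# allowed everywhere, every other user agent is disallowed.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: Bingbot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(agent, url="https://www.example.com/some-product.html"):
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # "SomeScraperBot" is an invented name standing in for an unlisted bot.
    for agent in ("Googlebot", "Bingbot", "SomeScraperBot"):
        print(agent, is_allowed(agent))
```

Keep in mind this only describes what well-behaved bots will do. A scraper that ignores robots.txt is unaffected by any of these rules.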
-
Yes, there are a ton of bad bots one may want to block. Can you show us the robots.txt file? If they aren't blocking legit search engine bots, you're probably okayish. If they are actually blocking all bots, you have cause for concern.
Can you give us a screenshot from GWT?
I use a program called Screaming Frog daily. It's off-the-shelf software, not malicious; I just want to crawl and gather metadata. I can tell it to disregard robots.txt, and it will crawl a site until it hits something password protected. There's not much any robots.txt can do about it, as it can also spoof user agents.
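Since the user agent string can be faked, one server-side check that doesn't rely on it is the reverse-then-forward DNS lookup that Google documents for verifying Googlebot. Below is a rough sketch of that idea in Python. The function name `is_verified_googlebot` and the injectable lookup parameters are my own, the suffix list may be incomplete, and `socket.gethostbyname_ex` only handles IPv4, so take it as an illustration rather than production code.

```python
import socket

# Hostname suffixes expected for genuine Googlebot reverse DNS records.
# Assumption: this list may not cover every Google crawler.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Check an IP that claims to be Googlebot using two DNS lookups.

    The lookup functions are parameters so the logic can be exercised
    without network access.
    """
    try:
        # Step 1: reverse DNS, IP address -> hostname.
        host = reverse_lookup(ip)[0].lower().rstrip(".")
    except OSError:
        return False
    if not host.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        # Step 2: forward DNS, hostname -> list of IP addresses.
        addresses = forward_lookup(host)[2]
    except OSError:
        return False
    # The forward lookup must lead back to the original IP.
    return ip in addresses
```

A check like this would sit in the server or firewall layer, alongside rate limiting, rather than in robots.txt.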