The "webmaster" disallowed all ROBOTS to fight spam! Help!!
-
One of the companies I do work for has a Magento site. I am simply the SEO guy, and they run the website through some developers who hold access to their systems VERY tightly. Using Google Webmaster Tools I saw that the robots.txt file was blocking ALL robots.
I immediately emailed them and received a long reply about foreign robots and scrapers slowing down the website. They told me I would have to provide a list of only the good robots to allow in robots.txt.
Please correct me if I'm wrong, but isn't robots.txt optional? Won't a bad scraper or bot still bog down the site? Shouldn't that be handled in .htaccess or somewhere else?
I'm not new to SEO but I'm sure some of you who have been around longer have run into something like this and could provide some suggestions or resources I could use to plead my case!
If I'm wrong, please help me understand how we can meet both needs: allowing bots to visit the site while keeping the 'bad' ones out. Their claim is that the site is bombarded by tons and tons of bots that have slowed down performance.
Thanks in advance for your help!
-
Thanks for the suggestions!! I'll keep you updated.
-
You can get a list of good robots from the database at Robotstxt.org: http://www.robotstxt.org/db.html.
I'd recommend creating an edited version of the robots.txt file yourself that explicitly allows Googlebot and the others. Then send that along with a link to the robotstxt.org site.
You may need to get the business owners involved. IT exists to enable the business, not strap it down so it can't move.
-
What you could do is add Allow statements for the different Googlebots and the bots of the other search engines. That will probably keep the developers happy, since they can still shut other bots out (although I doubt it would actually work, and I definitely don't think robots.txt should be the way to keep spammers away, but that says more about the quality of development ;-)).
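For what it's worth, here is a rough sketch of what such an allowlist could look like, checked with Python's standard `urllib.robotparser` before you send it over. The crawler tokens in `ALLOWED_CRAWLERS`, the helper names, and the example.com URLs are my own placeholders, not anything from the site in question, and the list of search engines is deliberately incomplete.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical allowlist: example crawler tokens only, not a complete list.
ALLOWED_CRAWLERS = ["Googlebot", "Bingbot"]


def build_robots_txt(allowed):
    """Build a robots.txt that allows the named crawlers and disallows everyone else."""
    groups = [f"User-agent: {name}\nAllow: /\n" for name in allowed]
    # Catch-all group: any crawler not named above is asked to stay out.
    groups.append("User-agent: *\nDisallow: /\n")
    return "\n".join(groups)


def is_allowed(robots_txt, user_agent, url):
    """Return what a well-behaved crawler with this token would conclude."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


ROBOTS_TXT = build_robots_txt(ALLOWED_CRAWLERS)

if __name__ == "__main__":
    print(ROBOTS_TXT)
    for agent in ("Googlebot", "Bingbot", "SomeScraper"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.com/products/widget")
        print(agent, "allowed" if verdict else "disallowed")
```

Keep in mind this only shows how a crawler that honors robots.txt would read the file. It says nothing about bots that ignore it.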
-
Yes, there are a ton of bad bots one may want to block. Can you show us the robots.txt file? If they aren't blocking legit search engine bots, you're probably okayish. If they are actually blocking all bots, you have cause for concern.
Can you give us a screenshot from GWT?
I use a program called Screaming Frog daily. It's not malicious, just off the shelf. I only want to crawl and gather metadata. I can tell it to disregard robots.txt, and it will crawl a site until it hits something password protected. There's not much any robots.txt can do about that, since it can also spoof user agents.
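To make the point concrete, here is a small sketch of the decision a crawler makes on its own side. This is not how Screaming Frog is implemented; `should_fetch` and its `respect_robots` switch are made-up names for illustration. The check is voluntary: when it is switched off, the robots.txt file is never consulted.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay out, like the one described above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def should_fetch(robots_txt, user_agent, url, respect_robots=True):
    """Return True if this hypothetical crawler would go ahead and request the URL."""
    if not respect_robots:
        # The crawler simply skips the check; robots.txt cannot enforce anything.
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/category/product1"
    # A polite crawler such as a search engine bot backs off.
    print(should_fetch(BLOCK_ALL, "Googlebot", url))
    # A crawler configured to ignore robots.txt carries on regardless.
    print(should_fetch(BLOCK_ALL, "AnyCrawler", url, respect_robots=False))
```

So a block-all robots.txt turns away exactly the bots that follow the rules, which are the search engines, while anything that ignores the file still has to be handled on the server.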