Google Crawler Error / restricting crawling

Flipmedia112

Hi

On a Magento instance we manage there is an advanced search. As part of the ongoing enhancement of the instance, we altered the advanced search options so that there are fewer of them and they are more relevant.

The issue is that Google has crawled and catalogued the advanced search with the now-removed options in the query string. Google keeps crawling these out-of-date advanced searches, and each stale search now returns a 500 error.

Currently Google is attempting to crawl these pages twice a day.

I have implemented the following to stop this:

1. Requested that the URL be removed via Webmaster Tools, selecting the directory option and using the URI:

http://www.domian.com/catalogsearch/advanced/result/

2. Added Disallow to robots.txt

Disallow: /catalogsearch/advanced/result/*
Disallow: /catalogsearch/advanced/result/

3. Added rel="nofollow" to the links on the site that point to the advanced search.
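
The robots.txt rules in step 2 can be sanity-checked locally. The following is only a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line and the `?name=example` query string are assumptions for illustration, since the post shows only the two Disallow lines and not the actual crawled URLs. Python's parser does plain prefix matching and does not treat `*` as a wildcard, so here it is the second rule that blocks the path; Google documents a trailing `*` as redundant, so the two rules should behave the same for Googlebot.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the "User-agent: *" group line is an assumption;
# the two Disallow rules are the ones quoted in step 2.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalogsearch/advanced/result/*
Disallow: /catalogsearch/advanced/result/
"""


def is_blocked(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above disallow `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical stale search URL; the real query strings are not shown in the post.
stale_url = "http://www.domian.com/catalogsearch/advanced/result/?name=example"
# The advanced search form itself, which the rules should leave crawlable.
form_url = "http://www.domian.com/catalogsearch/advanced/"

print(is_blocked(stale_url))
print(is_blocked(form_url))
```

A check like this only confirms what the rules say. It does not show whether Google has already re-fetched the updated robots.txt.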

Below is a list of the links it is crawling or attempting to crawl: 12 links, crawled twice a day, each resulting in a 500 status.

Can anything else be done?

Cyrus-Shepard

Seems like you've done everything right. You could also add a meta robots "NOINDEX, FOLLOW" tag to those pages.
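
If the tag is added, one way to confirm it is actually present in the rendered HTML is a small check like the one below. This is a rough sketch using Python's standard `html.parser`; the sample markup and the helper name `robots_directives` are made up for illustration. Keep in mind that a crawler can only read this tag on a page it is allowed to fetch and that returns a normal response, so it may not be seen on URLs that are blocked in robots.txt or that return a 500.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None for bare attributes, so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        content = attr_map.get("content", "")
        # Split "NOINDEX, FOLLOW" into normalized lowercase directives.
        self.directives += [
            part.strip().lower() for part in content.split(",") if part.strip()
        ]


def robots_directives(html: str) -> list:
    """Hypothetical helper: return the meta robots directives found in `html`."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Made-up sample page head, not taken from the site in question.
sample = (
    '<html><head><meta name="robots" content="NOINDEX, FOLLOW">'
    "</head><body></body></html>"
)
print(robots_directives(sample))  # ['noindex', 'follow']
```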

I'd also double-check the "linked from" referrers in Webmaster Tools, just to make sure you haven't missed any live, followed links pointing to those pages.

When did you submit the removal request, and what is its status (approved, denied, pending)? Another question: are those pages in Google's index?
