Google Crawler Error / restricting crawling
-
Hi
On a Magento instance we manage there is an advanced search. As part of the ongoing enhancement of the instance, we altered the advanced search options so that there are fewer of them and they are more relevant.
The issue is that Google has crawled and indexed the advanced search with the now-removed options in the query string. Google keeps crawling these out-of-date advanced searches, and the stale searches now return a 500 error.
Currently Google is attempting to crawl these pages twice a day.
I have implemented the following to stop this:
1. Requested that the URL be removed via Webmaster Tools, selecting the directory option and using the URI:
http://www.domian.com/catalogsearch/advanced/result/
2. Added Disallow to robots.txt
Disallow: /catalogsearch/advanced/result/*
Disallow: /catalogsearch/advanced/result/
3. Added rel="nofollow" to the links in the site that point to the advanced search.
Below is a list of the links it is crawling or attempting to crawl: 12 links, each crawled twice a day and each resulting in a 500 status.
Can anything else be done?
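One way to sanity-check the Disallow rule from step 2 is to run it through a robots.txt parser. The following is a minimal sketch using Python's standard `urllib.robotparser`. The `User-agent: *` line and the `?name=shirt` query string are assumptions made for illustration and do not come from the actual site. Python's parser does plain prefix matching and does not implement Google's `*` wildcard, so only the prefix form of the rule is tested here.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content: the prefix rule from step 2 under a
# catch-all user-agent group (the group line is an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalogsearch/advanced/result/
"""


def is_blocked(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above disallow `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return not parser.can_fetch(user_agent, url)


# Hypothetical stale search URL; the query string is made up.
stale = "http://www.domian.com/catalogsearch/advanced/result/?name=shirt"
home = "http://www.domian.com/"

print(is_blocked(stale))
print(is_blocked(home))
```

Because the rule is matched as a path prefix, the variant ending in `*` should be redundant for Google: a trailing wildcard matches the same URLs as the bare prefix.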
-
Seems like you've done everything right. You could also add a meta robots "NOINDEX, FOLLOW" tag to those pages.
I'd also double-check the "linked from" referrers in Webmaster Tools, just to make sure you haven't missed any live followed links pointing to those pages.
When did you submit the removal request, and what is its status (approved, denied, pending)? Another question: are those pages in Google's index?
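The meta robots suggestion above could look roughly like the sketch below. It is written in Python only for illustration; it is not Magento code, and the helper names `robots_directive`, `meta_tag` and `robots_header` are hypothetical. It emits the "noindex, follow" directive either as a meta tag or as the equivalent `X-Robots-Tag` response header for any path under the advanced search result directory.

```python
from typing import Optional, Tuple

# Path prefix taken from the question; everything under it gets the directive.
SEARCH_RESULT_PREFIX = "/catalogsearch/advanced/result/"


def robots_directive(path: str) -> Optional[str]:
    """Return the robots directive for `path`, or None if unrestricted."""
    if path.startswith(SEARCH_RESULT_PREFIX):
        return "noindex, follow"
    return None


def meta_tag(path: str) -> str:
    """Build the <meta> tag for the page head, or an empty string."""
    directive = robots_directive(path)
    if directive is None:
        return ""
    return f'<meta name="robots" content="{directive}">'


def robots_header(path: str) -> Optional[Tuple[str, str]]:
    """Build the equivalent X-Robots-Tag response header, if any."""
    directive = robots_directive(path)
    if directive is None:
        return None
    return ("X-Robots-Tag", directive)


print(meta_tag("/catalogsearch/advanced/result/"))
print(robots_header("/catalogsearch/advanced/result/"))
```

A crawler has to fetch a page successfully to read either form of the directive, so it has no effect on URLs that are blocked in robots.txt or that respond with a 500 status.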
Related Questions
-
Unsolved Google Search Console Still Reporting Errors After Fixes
Hello, I'm working on a website that was too bloated with content. We deleted many pages and set up redirects to newer pages. We also resolved an unreasonable number of 400 errors on the site. I also removed several ancient sitemaps that listed content deleted years ago that Google was crawling. According to Moz and Screaming Frog, these errors have been resolved. We've submitted the fixes for validation in GSC, but the validation repeatedly fails. What could be going on here? How can we resolve these errors in GSC?
Technical SEO | | tif-swedensky0 -
Redirect of https:// to http:// without SSL. Possible or not?!
Good afternoon, smart dudes : ) I am here to ask for your help. I posted this question on the Google help forum and Stack Overflow, but it looks like people do not know the correct answer...
QUESTION: We used to have a secured site, but recently purchased separate reservation software that provides SSL (it takes clients to a separate secured website) where they can fill out the reservation form. We cancelled our SSL (we just think it's a waste to pay $100 for securing plain text). Now I have so many links pointing to our secured site and I have no idea how to fix it! How do I redirect https://www.mysite.com to http://www.mysite.com? I would also like to mention that I already have a redirect from the non-www to the www domain (not sure if that matters):
RewriteEngine on
RewriteCond %{HTTP_HOST} ^mysite.com$ [NC]
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
As I already mentioned, we do not have SSL! None of the 301 redirect codes I found online work (you have to have SSL for the site to be redirected from https to http; currently I get an error: can't establish a secured connection to the server). Is there anything I can do? Or do I have to purchase SSL again?
Technical SEO | | JennaD140 -
Google not pulling my favicon
Several sites use Google's favicon service to load favicons instead of loading them from the website itself. Our favicon is not being pulled from our site correctly; instead it shows the default "world" image. The address used to pull a favicon is https://plus.google.com/_/favicon?domain=www.example.com. When I post on G+ or see other sites that use that service to pull favicons, ours isn't displayed, even though it shows up in Chrome, Firefox, IE, etc. and we have the correct meta on all pages of our site. Any idea why this is happening? Or how to "ping" Google to update it?
Technical SEO | | FedeEinhorn0 -
Google webmaster errors
If you know what these Google Webmaster Tools errors mean, and you can explain them to me in simple English and tell me how I can locate the problem, I would really appreciate it!
Server error | Soft 404 | Access denied | Not found | Not followed
I have many of these errors; is it harming SEO? Yoseph
Technical SEO | | Joseph-Green-SEO0 -
Crawl rate
Hello, In Google WMT my site has the following message: "Your site has been assigned special crawl rate settings. You will not be able to change the crawl rate." Why would this be? A bit of background - this site was hammered by Penguin or maybe Panda but seems to be dragging itself back up (maybe), but it has dropped from several thousand visitors/day to 100 or so. Cheers, Ian
Technical SEO | | jwdl0 -
Have a client that migrated their site; went live with noindex/nofollow and for last two SEOMoz crawls only getting one page crawled. In contrast, G.A. is crawling all pages. Just wait?
Client site is 15 + pages. New site had noindex/nofollow removed prior to last two crawls.
Technical SEO | | alankoen1230 -
How is my competition causing bad crawl errors and links on my site
We have a competitor with whom we are in a legal dispute at the moment, and they are using underhanded tactics to cause us to have bad links and crawl errors. I do not know how they are doing it or how to stop it. The crawl errors we are getting include the site having two URLs joined together, for example www.testsite.com/www.testsite.com; other errors are for pages that we do not even have, pages that are spelt wrong, or pages that have a dot after the page name. We have been told by a number of people in our field that this has also happened to them, and I would like to know how they are doing it so we can have this stopped. Since they have been doing this, our traffic has gone down by half.
Technical SEO | | ClaireH-1848860 -
Google Website Optimizer
So if you are A/B testing two pages, index.html and indexB.html, shouldn't I nofollow indexB.html? It has all the same content, just a different design.
Technical SEO | | tylerfraser0