Disallow: /search/ in robots but soft 404s are still showing in GWT and Google search?
-
Hi guys, I've already added the following syntax in robots.txt to prevent search engines in crawling dynamic pages produce by my website's search feature: Disallow: /search/. But soft 404s are still showing in Google Webmaster Tools. Do I need to wait(it's been almost a week since I've added the following syntax in my robots.txt)? Thanks, JC
-
You could also look at using the meta robots = noindex tag on /search/ pages, rather than just blocking it in robots.txt, as this will remove existing URLs from the index.
-
Glad to help
-
Thanks a lot Dan!
-
That is a good recommendation but ultimately search engines will make a final decision on crawl frequency. Take a look at your 'Crawl Stats' on GWTs and this will give you an idea of how often your site is crawled.
-
Is the time issue related in crawl frequency of the URLs in my sitemap?
Thanks Dan, appreciate it.
-
You will probably need to wait a little longer - it depends how often your site usually gets crawled and indexed.
However, robots.txt does not always stop search engines from indexing your pages. It will stop them crawling a page on your site but it tells them that they can still index that page. If they find links from external sites then the URL may still appear in the SERP.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google indexing staging / development site that is redirected...
Hi Moz Fans! - Please help. We had a acme.stagingdomain.com while a site was in development, when it went live it redirected (302) to acmeprofessionalservices.com (real names redacted!!) no known external links to staging site although staging site url has been emailed from Google Apps(!!!) now found that staging site is in the index even though it redirects to the proper public site. and some (but not all) of the pages are in the index too. They all redirect to the proper public site when visited. It is convenient to have a redirect from the staging site to the new one for the team, Chrome etc. remember frequently visited sites. Be a shame to lose that. Yes, these pages can be removed using webmaster tools.
Technical SEO | | mozroadjan
But how did they get in the index to start with? And if we're building a new site, and a customer has an existing site is there a danger of duplicate content etc. penalties caused by the staging site? We had a similar incident recently when a PDF that was not linked anywhere on the site appeared in the index. The link had been emailed through Google Apps, and visited in Chrome, but that was it. So 3 questions. Why is the staging site still in the index despite the redirects? How did they get in the index in the first place? Will the new staging site affect the rank of the existing site, eg. duplicate content penalties?0 -
404s in GWT - Not sure how they are being found
We have been getting multiple 404 errors in GWT that look like this: http://www.example.com/UpdateCart. The problem is that this is not a URL that is part of our structure, it is only a piece. The actual URL has a query string on the end, so if you take the query string off, the page does not work. I can't figure out how Google is finding these pages. Could it be removing the query string? Thanks.
Technical SEO | | Colbys0 -
Correct linking to the /index of a site and subfolders: what's the best practice? link to: domain.com/ or domain.com/index.html ?
Dear all, starting with my .htaccess file: RewriteEngine On
Technical SEO | | inlinear
RewriteCond %{HTTP_HOST} ^www.inlinear.com$ [NC]
RewriteRule ^(.*)$ http://inlinear.com/$1 [R=301,L] RewriteCond %{THE_REQUEST} ^./index.html
RewriteRule ^(.)index.html$ http://inlinear.com/ [R=301,L] 1. I redirect all URL-requests with www. to the non www-version...
2. all requests with "index.html" will be redirected to "domain.com/" My questions are: A) When linking from a page to my frontpage (home) the best practice is?: "http://domain.com/" the best and NOT: "http://domain.com/index.php" B) When linking to the index of a subfolder "http://domain.com/products/index.php" I should link also to: "http://domain.com/products/" and not put also the index.php..., right? C) When I define the canonical ULR, should I also define it just: "http://domain.com/products/" or in this case I should link to the definite file: "http://domain.com/products**/index.php**" Is A) B) the best practice? and C) ? Thanks for all replies! 🙂
Holger0 -
Website disappeared from Google organic keyword searches.
We have an auto repair company as a client www.autorepairauroratilden.com who for the better part of a year their website had ruled the 1st page organic Google search results. Their website, Blogs, Facebook, and Twitter all came up on page one for their keyword searches. On May 13th, it all came to a screeching halt. The website is nowhere to be found for any of their keywords (example: brake repair Aurora.) There are a couple of blogs on page 2 but it’s nothing like it was prior to May 13th. On May 12th we published 5 branded websites for this client – Chrysler, Ford, Honda, Jeep, and Toyota, all on separate URL’s. All the page titles, keywords, and descriptions were specifically branded to the individual websites as were all the keywords. Since the beginning of June we’ve taken down the 5 branded websites and we’ve gone through our keywords on the auto repair website. The website was last crawled on June 11th. We still do not have any page 1 placement or for that matter any page placement. I checked 10 pages out. We have a 2nd auto repair client that has been running their website as well as their 5 branded websites a couple of months longer than this client and we’ve had no problems with any of their websites and keyword search results. How do we fix this?
Technical SEO | | markindenver0 -
Internal links - is div on click still no followed by google?
Hi Mozzers Does anyone know if are still no followed by Google From a UX perspective, making a container div clickable will work well, but i don't want this link to absorb any link juice as text within the div would make much better anchor text, so i would rather that link was receiving the juice. Is the above the best approach to this issue of UX vs SEO? Many thanks Justin
Technical SEO | | JustinTaylor880 -
Http:// vs http://www.
Why is it that when I run an "On Page Optimization Keyword Report" for my website I get a different score when using http://www.tandmkitchens.com vs http://tandmkitchens.com. My keyword is "Kitchen Remodeling" http://www.tandmkitchens.com scores an A http://tandmkitchens.com scores a B It's the same page yet one url scores higher than the other. Any help! Thanks
Technical SEO | | fun52dig
Gary0 -
Google search result going to a page that I did not put on my site
Hi, I am seeing a very strange result in google for my site. When doing a search for the term "london reflexology" my site comes up 18th in the results. But when I click the link or check the URL it shows up as: http://www.reflexologyonline.co.uk/reflexologyonline.php?Action=Webring This is not right at all. It looks like some sort of cloaking but I am not sure. I am new to SEO and I do not know why goole is showing this URL that does not exist on my site and of witch the content is totally wrong. Can anyone please help with this? See the 2 linked images for more details. It seems to me the site might be hacked or something to that effect. Please help.... jyJdP.png 71Mf4.png
Technical SEO | | RupDog0 -
What Google uses in search result descriptions
Recently, Google has started including certain information from our web pages in their search results description that is a bit puzzling. For example if you google 'Wedding Band Raleigh' the description they are using for our site's (GigMasters) page begins with the text 'Results 1 - 10 of 1005' Not sure why they are pulling that information. That is in on the page but its not high up on the page or marked with any special h1, h2, or h3 tag. We do have that information inside of a div which we have named 'Results'. Maybe that's why? Did we inadvertently use some sort of Google rich snippet or schema.org naming convention?! Any insight would be hugely appreciated.
Technical SEO | | gigmasters0