Should I use noindex or robots to remove pages from the Google index?
-
I have a Magento site and just realized we have about 800 review pages indexed. The /review directory is disallowed in robots.txt but the pages are still indexed.
From my understanding robots means it will not crawl the pages BUT if the pages are still indexed if they are linked from somewhere else.
I can add the noindex tag to the review pages but they wont be crawled.
https://www.seroundtable.com/google-do-not-use-noindex-in-robots-txt-20873.html
Should I remove the robots.txt and add the noindex? Or just add the noindex to what I already have?
-
Thanks, Logan!
-
Rhys,
Your web dev team is confused. You cannot de-index by simply disallowing them in your robots.txt file. Google will still index anything they find (that doesn't have a noindex tag) from a link, this is the reason you often see search results that say "A description for this result is not available because of this site's robots.txt" as the description.
Here's a quote from Google regarding the subject: "You should not use robots.txt as a means to hide your web pages from Google Search results." - https://support.google.com/webmasters/answer/6062608?hl=en
-
Hi all,
Sorry to jump in here but I've been told the opposite by our web dev team. We're removing indexed 404s at the moment, and our web dev team said we simply need to add robots.txt to the pages and they'll be de-indexed. If this incorrect? I thought I'd need to add a noindex tag but was argued down...
Cheers,
Rhys
-
Hi there. Good question and one that comes up a lot.
You need to do the following:
- Put the noindex on those pages
- Remove the block in robots.txt
- Monitor these pages falling out of the index
- Once they are all out, then put the block back in place
You both want them to a) drop out and b) then not be crawled, so the above will take care of that for you.
Hope that helps!
John
-
Thanks.
That is what I figured just wanted to double check.
-
Hi Tyler,
Yes, remove the robots.txt disallow for that section and add a noindex tag. Noindex is the only sure-fire way to de-index URLs, but the crawlers need to be allowed to crawl those pages to see the tag.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Any way to force a URL out of Google index?
As far as I know, there is no way to truly FORCE a URL to be removed from Google's index. We have a page that is being stubborn. Even after it was 301 redirected to an internal secure page months ago and a noindex tag was placed on it in the backend, it still remains in the Google index. I also submitted a request through the remove outdated content tool https://www.google.com/webmasters/tools/removals and it said the content has been removed. My understanding though is that this only updates the cache to be consistent with the current index. So if it's still in the index, this will not remove it. Just asking for confirmation - is there truly any way to force a URL out of the index? Or to even suggest more strongly that it be removed? It's the first listing in this search https://www.google.com/search?q=hcahranswers&rlz=1C1GGRV_enUS753US755&oq=hcahr&aqs=chrome.0.69i59j69i57j69i60j0l3.1700j0j8&sourceid=chrome&ie=UTF-8
Intermediate & Advanced SEO | | MJTrevens0 -
In the google index but search redirects to homepage
Hi everyone, thanks for reading i have a website "www.gardeners.scot" and have the following pages listed in google site: command http://www.gardeners.scot/garden-landscaping-Edinburgh.htm & http://www.gardeners.scot/garden-maintenance-Edinburgh.htm however when a user searches for "garden landscaping Edinburgh" or "garden maintenance Edinburgh" we are in the rankings but google search links these phrases to the home page not to their targeted pages. the site is about a year old have checked the robots.txt, sitemap.xml & .htaccess files but can see anything wrong there. any ideas out there?
Intermediate & Advanced SEO | | livingphilosophy0 -
Google Indexing of Images
Our site is experiencing an issue with indexation of images. The site is real estate oriented. It has 238 listings with about 1190 images. The site submits two version (different sizes) of each image to Google, so there are about 2,400 images. Only several hundred are indexed. Can adding Microdata improve the indexation of the images? Our site map is submitting images that are on no-index listing pages to Google. As a result more than 2000 images have been submitted but only a few hundred have been indexed. How should the site map deal with images that reside on no-index pages? Do images that are part of pages that are set up as "no-index" need a special "no-index" label or special treatment? My concern is that so many images that not indexed could be a red flag showing poor quality content to Google. Is it worth investing in correcting this issue, or will correcting it result in little to no improvement in SEO? Thanks, Alan
Intermediate & Advanced SEO | | Kingalan10 -
Google Is Indexing My Internal Search Results - What should i do?
Hello, We are using a CMS/E-Commerce platform which isn't really built with SEO in mind, this has led us to the following problem.... a large number of internal (product search) search result pages, which aren't "search engine friendly" or "user friendly", are being indexed by google and are driving traffic to the site, generating our client revenue. We want to remove these pages and stop them from being indexed, replacing them with static category pages - essentially moving the traffic from the search results to static pages. We feel this is necessary as our current situation is a short-term (accidental) win and later down the line as more pages become indexed we don't want to incur a penalty . We're hesitant to do a blanket de-indexation of all ?search results pages because we would lose revenue and traffic in the short term, while trying to improve the rankings of our optimised static pages. The idea is to really move up our static pages in Google's index, and when their performance is strong enough, to de-index all of the internal search results pages. Our main focus is to improve user experience and not have customers enter the site through unexpected pages. All thoughts or recommendations are welcome. Thanks
Intermediate & Advanced SEO | | iThinkMedia0 -
Link Removal Request Sent to Google, Bad Pages Gone from Index But Still Appear in Webmaster Tools
| On June 14th the number of indexed pages for our website on Google Webmaster tools increased from 676 to 851 pages. Our ranking and traffic have taken a big hit since then. The increase in indexed pages is linked to a design upgrade of our website. The upgrade was made June 6th. No new URLS were added. A few forms were changed, the sidebar and header were redesigned. Also, Google Tag Manager was added to the site. My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline. My developer submitted a page removal request to Google via Webmaster tools around June 20th. Now when a Google search is done for site:www.nyc-officespace-leader.com 851 results display. Would these extra pages cause a drop in ranking? My developer issued a link removal request for these pages around June 20th and the number in the Google search results appeared to drop to 451 for a few days, now it is back up to 851. In Google Webmaster Tools it is still listed as 851 pages. My ranking drop more and more everyday. At the end of displayed Google Search Results for site:www.nyc-officespace-leader.comvery strange URSL are displaying like:www.nyc-officespace-leader.com/wp-content/plugins/... If we can get rid of these issues should ranking return to what it was before?I suspect this is an issue with sitemaps and Robot text. Are there any firms or coders who specialize in this? My developer has really dropped the ball. Thanks everyone!! Alan |
Intermediate & Advanced SEO | | Kingalan10 -
Indexing Dynamic Pages
http://www.oreillyauto.com/site/c/search/Wiper+Blade/03300/C0047.oap?make=Honda&model=Accord&year=2005&vi=1430764 How is O'Reilly getting this page indexed? It shows up in organic results for [2005 honda accord windshield wiper size].
Intermediate & Advanced SEO | | Kingof50 -
Google: How to See URLs Blocked by Robots?
Google Webmaster Tools says we have 17K out of 34K URLs that are blocked by our Robots.txt file. How can I see the URLs that are being blocked? Here's our Robots.txt file. User-agent: * Disallow: /swish.cgi Disallow: /demo Disallow: /reviews/review.php/new/ Disallow: /cgi-audiobooksonline/sb/order.cgi Disallow: /cgi-audiobooksonline/sb/productsearch.cgi Disallow: /cgi-audiobooksonline/sb/billing.cgi Disallow: /cgi-audiobooksonline/sb/inv.cgi Disallow: /cgi-audiobooksonline/sb/new_options.cgi Disallow: /cgi-audiobooksonline/sb/registration.cgi Disallow: /cgi-audiobooksonline/sb/tellfriend.cgi Disallow: /*?gdftrk Sitemap: http://www.audiobooksonline.com/google-sitemap.xml
Intermediate & Advanced SEO | | lbohen0 -
Indexing specified entry pages
Hi,We are currently working on location based info.Basically, when someone searches from Florida they will get specific Florida results and when they search from California they will specific California results.How does this location based info affect crawling and indexing?Lets say we have location info for googlebot, sometimes they crawl from a New York ip address, sometimes they do it from Texas and sometimes from California. In this case google will index 3 different pages with 3 different prices and a bit different text, and I'm afraid they might see these as some kind of cloaking or suspicious movement because we serve different versions of the page. What's the best way to handle this?
Intermediate & Advanced SEO | | SEODinosaur0