Need only tens of pages indexed out of hundreds: is it okay to proceed with robots.txt for Google?
-
Hi all,
We have two subdomains with hundreds of pages, of which only 50 important pages need to be indexed. Unfortunately, the CMS behind these subdomains is very old and does not support deploying a "noindex" tag at page level. So we are planning to block the entire sites via robots.txt and allow only the 50 pages we need. But we are not sure this is the right approach, as Google has been suggesting to rely mostly on "noindex" rather than robots.txt. Please suggest whether we can proceed with the robots.txt file.
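If it helps to picture it, the approach described would look something like the sketch below: a robots.txt that disallows the whole subdomain and then allows only the pages that matter. The paths are hypothetical placeholders, not actual URLs; for Googlebot, the more specific Allow rules take precedence over the broader Disallow. One caveat worth remembering is that robots.txt controls crawling rather than indexing, so a blocked URL can still appear in the index (without a snippet) if other sites link to it.

```
# Minimal sketch for one subdomain: block everything, allow only the pages that matter.
# The paths below are hypothetical placeholders.
User-agent: *
Disallow: /
Allow: /important-page-1
Allow: /important-page-2
Allow: /products/key-landing-page/
```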
Thanks
-
Hi vtmoz,
Given the limitations you've described, I'd give noindex in robots.txt a try.
I've run some experiments and found that the noindex rule in robots.txt works. It won't immediately remove those pages from the index, but it will stop them from showing in search results. I'd suggest using that rule with care.
Also, run some experiments of your own. My first test would be adding only one or two pages, starting with the ones that cause the most trouble by being indexed (for example, due to undesired traffic or ranking on undesired search terms). Hope it helps.
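For reference, the rule I'm describing would look roughly like the sketch below. Bear in mind that Noindex in robots.txt has never been an officially documented Google directive, so treat it as an experiment rather than a guarantee (the paths are hypothetical examples):

```
# Unofficial, undocumented directive: test on one or two pages before relying on it.
# Paths are hypothetical examples.
User-agent: Googlebot
Noindex: /unwanted-page-1
Noindex: /old-category/
```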
Best of luck!
GR
Related Questions
-
Meta robots on every page rather than robots.txt for blocking crawlers? How will pages get indexed if we block crawlers?
Hi all, The suggestion to use the meta robots tag rather than the robots.txt file is to make sure pages do not get indexed if their hyperlinks are available anywhere on the internet. I don't understand how the pages will be indexed if the entire site is blocked. Even though page links are available, will Google really index those pages? One of our sites has been blocked via the robots.txt file, but internal links to it have been available on the internet for years and have not been indexed. So technically the robots.txt file is quite enough, right? Please clarify and guide me if I'm wrong. Thanks
Algorithm Updates | vtmoz
-
Do we need to maintain consistency in page title suffixes?
Hi all, We usually use "brand & primary keyword" as the suffix across all pages on the website, like "vertigo tiles". Do we need to maintain this suffix across all page titles? What if we change it according to the page? Will Google look down on the site for not maintaining a consistent page title suffix like the one I mentioned? Thanks
Algorithm Updates | vtmoz
-
Adding non-important folders to the disallow list in the robots.txt file
Hi all, If we have many non-important folders, like /category/ in the blog, these multiply the number of links. They are strictly for users who access them very rarely, not for bots. Can we add such folders to the disallow list in robots.txt to stop link juice passing through them, so internal linking is minimised to an extent? Can we add any such paths or pages to the disallow list? Is this purely technical, or is there any penalty? Thanks, Satish
Algorithm Updates | vtmoz
-
Google indexing https sites by default now, where's the Moz blog about it!
Hello and good morning / happy Friday! Last night an article from, of all places, VentureBeat titled "Google Search starts indexing and letting users stream Android apps without matching web content" was sent to me, and as I read it I got a bit giddy, since we had just implemented a full sitewide HTTPS cert rather than a cart-only SSL. I then quickly searched for other sources to see if this was indeed true, and the writing on the walls seems to indicate so.
Google Webmaster Blog: http://googlewebmastercentral.blogspot.in/2015/12/indexing-https-pages-by-default.html
http://www.searchenginejournal.com/google-to-prioritize-the-indexing-of-https-pages/147179/
http://www.tomshardware.com/news/google-indexing-https-by-default,30781.html
https://hacked.com/google-will-begin-indexing-httpsencrypted-pages-default/
https://www.seroundtable.com/google-app-indexing-documentation-updated-21345.html
I found it a bit ironic to read about this on mostly unsecured sites. I wanted to hear about the 8 key rules that Google will factor in when ranking/indexing HTTPS pages from now on, and see what you all felt about this. Google will now begin to index HTTPS equivalents of HTTP web pages, even when the former don't have any links to them. However, Google will only index an HTTPS URL if it meets these conditions:
It doesn't contain insecure dependencies.
It isn't blocked from crawling by robots.txt.
It doesn't redirect users to or through an insecure HTTP page.
It doesn't have a rel="canonical" link to the HTTP page.
It doesn't contain a noindex robots meta tag.
It doesn't have on-host outlinks to HTTP URLs.
The sitemap lists the HTTPS URL, or doesn't list the HTTP version of the URL.
The server has a valid TLS certificate.
One rule that confuses me a bit is: "It doesn't redirect users to or through an insecure HTTP page." Does this mean that if you just moved over to HTTPS from HTTP your site won't pick up the HTTPS boost, since most sites in general have HTTP redirects to HTTPS? Thank you!
Algorithm Updates | Deacyde
-
Pages fluctuating +/- 70 positions in Google SERPs?
I've got some pages that appear somewhere around #25 in Google. Every now and then, a page just disappears from the top 100 results for a few days (even up to a week) and then comes back. I've got other pages that rank around #8, fall to about #75 for a while, and then come back. But while a page may be gone from the top 100 results in the US, it still ranks at about the same place everywhere else in the world (+/- 10 positions). I've seen this happen in the past, but never this often. What gives?!
Algorithm Updates | sbrault74
-
Home page rank for keyword
Hi Mozzers, I have traded from my website balloon.co.uk for over 10 years. For a long while the site ranked first for the word 'balloon' across the UK on google.co.uk (first out of 41 million results). Around the time Penguin launched, the site began to drop, and it currently sits on about page 5. What's confusing is that for a search on 'balloons' ('s' on the end of balloon) it ranks 2nd in Birmingham, where I'm based. That's 2nd in the organic results rather than a map local search. But if I search 'balloon' from Birmingham, my contact page ranks 5th (http://www.balloon.co.uk/contact.htm) and the home page ranks nowhere. So it's gone from ranking 1st nationally to ranking nowhere, with my contact page ranking above the home page (which is a generic-word domain). Any ideas?
Algorithm Updates | balloon.co.uk
-
What happened on September 17 on Google?
According to MozCast (http://mozcast.com/) and to my own stats, Google had a pretty strong algorithm update on September 17. Personally, I have experienced a drop of about 10% in traffic coming from Google across most of my main e-commerce site, virtualsheetmusic.com. Does anyone know more about that update? Any ideas about what changed? Thank you in advance for any thoughts! Best, Fab.
Algorithm Updates | | fablau1 -
Drop in traffic from Google, but no change in the rankings
I have seen a 20% drop in traffic from Google last week (after April 29th). However, when I try to analyze the rankings of the keywords in the Google results that send me traffic, they seem to be the same. Today (6th March) traffic has fallen further again, with not much (if any) visible change in the rankings. Any ideas on what the reason for this could be? I have not made any changes to the website recently.
Algorithm Updates | raghavkapur