Have you ever seen a page get indexed even though its website is blocked by robots.txt?
-
Hi all,
We use the robots.txt file and meta robots tags to block bots from crawling a website or individual pages. Robots.txt is mostly used at the site level, with the expectation that none of the pages will get indexed. But there is a caveat: Google can index any page of a website even if the site is blocked by robots.txt, because the crawler may find a link to the page somewhere else on the internet, as stated here in the last paragraph. I wonder whether this really happens and whether anyone has seen such pages get indexed.
And if we use meta tags at the page level, do we also need to block the pages in the robots.txt file? Can we use both techniques at the same time?
Thanks
-
Hi vtmoz,
The most reliable way to prevent a page from being indexed is to use a meta robots tag with a _noindex_ parameter.
Using robots.txt on top of that helps save server resources and keeps Google from crawling any new page that does not have the meta robots tag. And yes, it's very common to see pages indexed even when the robots.txt file blocks the entire website.
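To make the crawl-blocking part concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `is_crawl_allowed` helper, and the example.com URL are made up for illustration. Note that it only tells you whether a crawler may fetch a URL, which is a separate question from whether the URL can end up in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the whole site for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no network call is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL only; any path on the site is disallowed by the rules above.
    url = "https://example.com/private/page.html"
    print("Crawl allowed:", is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

A blocked URL like this one is never fetched, so a crawler that respects robots.txt also never gets to read any meta robots tag on that page.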
If what you are looking for is to remove pages from the index, follow these steps:
- Allow the whole website (or at least the specific pages/section) to be crawlable in the robots.txt
- Add the robots meta tag with the "noindex,follow" parameters
- Wait several weeks; 6 to 8 weeks is a fairly good time. Or just follow up on those pages
- Once you get the result (all your desired pages de-indexed), re-block those pages with robots.txt
- DO NOT remove the meta robots tag.
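As a rough way to follow up on the pages while you wait, the sketch below checks whether a page's HTML carries a robots meta tag with `noindex`. It uses Python's standard `html.parser`; the `RobotsMetaParser` class, the `has_noindex` helper, and the sample HTML are hypothetical names and data for illustration, not part of any SEO tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    # Sample page source, made up for this example.
    sample = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print("noindex present:", has_noindex(sample))
```

In practice you would feed it the HTML of each page you want de-indexed, and only re-block a page in robots.txt after confirming the tag is there and the page has dropped out of the index.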
Hope it helps.
Best of luck.
GR.