How can a page be indexed without being crawled?
-
Hey Moz fans,
In the Google getting started guide it says:
"Note: Pages may be indexed despite never having been crawled: the two processes are independent of each other. If enough information is available about a page, and the page is deemed relevant to users, search engine algorithms may decide to include it in the search results despite never having had access to the content directly. That said, there are simple mechanisms such as robots meta tags to make sure that pages are not indexed."
How can this happen? I don't really get the point.
Thank you -
The pleasure is all mine, my friend. You are most welcome. The Moz SEO community is an indispensable asset and weapon in any SEO's inventory, in my opinion. We learn a great deal here while helping others. I am really thankful to each and every one here in the Moz community. Long live Moz and Mozzers. YOU ROCK!!
-
Oh man, you always come to me with great ideas.
I never thought about that.
Thank you very much, keep rocking! -
Yes, of course, my friend. Google has to crawl the page to see the page-level meta robots tag, but to date I have not seen any page in Google's index that has been blocked using both the robots.txt file and a page-level meta robots tag. Password protecting the page through .htaccess would be overkill if you just want Google not to index it. If you want Google to remove a particular page from its index, you can do that from your Webmaster Tools account. Here you go for more: https://support.google.com/webmasters/answer/1663419?hl=en
Good luck to you, my friend.
Best regards,
Devanur Rafi
-
Thank you guys,
Devanur, you've got the point, but let me correct you on one thing.
You can't tell Google to remove a page from its index using only the meta robots tag, because Google can't read the meta tag until it crawls the page.
So the only solution looks like .htaccess password protection.
Anyway, thanks for your efforts. -
I'm also thinking of sitemaps, but I'm not really sure Google trusts them enough to list links from them that it hasn't crawled.
-
Hi friend,
If a page has been blocked using the robots.txt file, Google will not crawl and index it by following links within the website. But what if a reference to that page is found on a third-party website? In that case, link discovery still happens, and the page will be indexed without a description snippet. Such pages show the following text in place of a description on the search results pages:
"A description for this result is not available because of this site's robots.txt – learn more"
So in order to completely stop Google from indexing a page, you should block it by implementing a page-level meta robots tag (noindex).
Here you go for more: https://support.google.com/webmasters/answer/156449?hl=en
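To make the difference between the two mechanisms concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not a description of how Google actually works: the function name indexing_outcome, the returned labels, and the example.com URLs are all hypothetical. It only models the logic discussed in this thread: a robots.txt block means the page is never fetched, so a meta robots tag on it cannot be read.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for token in attr.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def indexing_outcome(robots_txt, url, html, user_agent="Googlebot"):
    """Hypothetical helper: a simplified model, not Google's real pipeline."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())

    if not rules.can_fetch(user_agent, url):
        # The page is never fetched, so any meta robots tag in its HTML
        # stays invisible. The bare URL can still be discovered through
        # links on other sites and listed without a description.
        return "not crawled: URL may still be indexed from external links"

    parser = MetaRobotsParser()
    parser.feed(html)
    if "noindex" in parser.directives:
        # The page was fetched, so the noindex directive could be read.
        return "crawled: noindex seen, page kept out of the index"
    return "crawled: page may be indexed"


robots = "User-agent: *\nDisallow: /private/\n"
page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

print(indexing_outcome(robots, "https://example.com/private/a.html", page))
print(indexing_outcome(robots, "https://example.com/public/a.html", page))
```

In this model the first call reports the robots.txt case described above (the noindex tag is never seen), while the second call reaches the tag because the page is allowed to be crawled.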
Please feel free to post back if you have any other queries in this regard.
Best regards,
Devanur Rafi