How can a Page indexed without crawled?
-
Hey moz fans,
In the google getting started guide it says**"
Note: **Pages may be indexed despite never having been crawled: the two processes are independent of each other. If enough information is available about a page, and the page is deemed relevant to users, search engine algorithms may decide to include it in the search results despite never having had access to the content directly. That said, there are simple mechanisms such as robots meta tags to make sure that pages are not indexed.
"How can it happen, I dont really get the point.
Thank you -
Pleasure is all mine my friend. You are most welcome. Moz SEO community is an indispensable asset and weapon in any SEO's inventory in my opinion. We learn a great deal here while helping others. I am really thankful to each and everyone here on Moz community. Long live Moz and Mozzers. YOU ROCK!!
-
Ov man, you always come tome with great ideas I never thought about that .
Thank you very much stay rock! -
Yes, of course my friend, Google has to crawl the page to see the page-level meta robots tag but till date I have not seen any page in Google's index that has been blocked using the robots.txt file and page-level meta robots tag. Password protecting your .htaccess file would be an overkill if you just want Google not to index a page. If you want Google to remove any particular page from its index, you can get it done from webmaster tools account. Here you go for more: https://support.google.com/webmasters/answer/1663419?hl=en
Good Luck to you my friend.
Best regards,
Devanur Rafi
-
Thank you guyz,
Devanur You've got the point let me correct you at one point.
You can't say google that remove my index just using meta robots tag, because It can't read the meta tag till it crawl.
So only solution looks like .htaccess password protect.
Anyway thanks for your efforts. -
I'm also thinking site maps, but I'm not really sure if they trust them that much to list links in it that they haven't crawled.
-
Hi friend,
If a page has been blocked using Robots.txt file, then Google will not crawl and index the page from within the website but what if a reference of that page is found on a third-party website? In cases like this, link discovery will happen and the page will be indexed without a Description snippet and such pages will have the following text in the place of a description in the search results pages:
"A description for this result is not available because of this site's robots.txt – learn more"
So inorder to completely stop Google from crawling and indexing a page, you should should block the page by implementing, page-level meta robots tag.
Here you go for more: https://support.google.com/webmasters/answer/156449?hl=en
Please feel free to post back if you have any other queries in this regards.
Best regards,
Devanur Rafi
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
No Index No follow instead of Rel canoncical on product pages
Hi all, we handle our product pages no with rel canonical now, we have 1 url that is indexed http://www.prams.net/cam-combi-family the other colours have different urls like http://www.prams.net/cam-combi-family-3-in-1-pram-reversible-seat-car-seat-grey-d which canonicalize to the indexed page. Google still crawls all those pages. For crawl budget reasons we want to use "no index, no follow" instead on these pages (the pages for the other colours)? Google would then crawl fewer pages more often? Does this make sense? Are their any downsides doing it? Thanks in advance Dieter
Intermediate & Advanced SEO | | Storesco1 -
Can we talk a bit more about cannibalisation? Will Google pick one page and disregard others.
Hi all. I work for an e-commerce site called TOAD Diaries and we've been building some landing pages recently. Our most generic page was for '2017 Diaries'. Take a look here. Initial results are encouraging as this page is ranking top page for a lot of 'long tail' search queries, e.g) '2017 diaries a4', '2017 diaries a5', '2017 diaries week to view' etc. Interesting it doesn't even rank top 50 for the 'head term'... '2017 diaries'. **And our home page outranks it for this search term. **Yet it seems clear that this page is considered relevant and quality by Google it ranks just fine for the long tails. Question: Does this mean Google 'chosen' our home page over the 2017-page landing page? And that's why the 2017-page effectively doesn't rank for it's 'head term'? (I can't see this as many times a website will rank multiple times such as amazon) But any thoughts would be greatly appreciated. Also, what would you do in this scenario? Work on home-page to try to push it up for that term and not worry about the landing page? Any suggestions or thoughts would be greatly appreciated. Hope that makes sense. Do shout if not. Thanks in advance. Isaac.
Intermediate & Advanced SEO | | isaac6630 -
Pages with Duplicate Page Content (with and without www)
How can we resolve pages with duplicate page content? With and without www?
Intermediate & Advanced SEO | | directiq
Thanks in advance.0 -
Do internal links from non-indexed pages matter?
Hi everybody! Here's my question. After a site migration, a client has seen a big drop in rankings. We're trying to narrow down the issue. It seems that they have lost around 15,000 links following the switch, but these came from pages that were blocked in the robots.txt file. I was wondering if there was any research that has been done on the impact of internal links from no-indexed pages. Would be great to hear your thoughts! Sam
Intermediate & Advanced SEO | | Blink-SEO0 -
Google Is Indexing The Wrong Page For My Keyword
For a long time (almost 3 mounth) google indexing the wrong page for my main keyword.
Intermediate & Advanced SEO | | Tiedemann_Anselm
The problem is that each time google indexed another page each time for a period of 4-7 days, Sometimes i see the home page, sometimes a category page and sometimes a product page.
It seems though Google has not yet decided what his favorite / better page for this keyword. This is the pages google index: (In most cases you can find the site on the second or third page) Main Page: http://bit.ly/19fOqDh Category Page: http://bit.ly/1ebpiRn Another Category: http://bit.ly/K3MZl4 Product Page: http://bit.ly/1c73B1s All links I get to the website are natural links, therefore in most cases the anchor we got is the website name. In addition I have many links I get from bloggers that asked to do a review on one of my products, I'm very careful about that and so I'm always checking the blogger and their website only if it is something good, I allowed it. also i never ask for a link back (must of the time i receive without asking), and as I said, most of their links are anchor with my website name. Here some example of links that i received from bloggers: http://bit.ly/1hF0pQb http://bit.ly/1a8ogT1 http://bit.ly/1bqqRr8 http://bit.ly/1c5QeC7 http://bit.ly/1gXgzXJ Please Can I get a recommendation what should you do?
Should I try to change the anchor of the link?
Do I need to not allow bloggers to make a review on my products? I'd love to hear what you recommend,
Thanks for the help0 -
Getting Pages Requiring Login Indexed
Somehow certain newspapers' webpages show up in the index but require login. My client has a whole section of the site that requires a login (registration is free), and we'd love to get that content indexed. The developer offered to remove the login requirement for specific user agents (eg Googlebot, et al.). I am afraid this might get us penalized. Any insight?
Intermediate & Advanced SEO | | TheEspresseo0 -
Can I Use Cross Domain Canonical For Duplicate Categories & Product Pages?
I want to fix issue regarding duplicate categories & product pages on my multiple eCommerce websites. http://www.vistastores.com/patio-umbrellas-fiberbuilt-umbrellas-llc-7gcrw-teal.html - Want to rank with this... http://www.vistapatioumbrellas.com/patio-umbrellas-fiberbuilt-umbrellas-llc-7gcrw-teal.html - Duplicate one! http://www.vistastores.com/patio-umbrellas - Want to rank with this... http://www.vistapatioumbrellas.com/patio-umbrellas - Duplicate one!
Intermediate & Advanced SEO | | CommercePundit0 -
Can you use more than one meta robots tag per page?
If you want to add both "noindex, follow" and "noopd" should you add two meta robots tags or is there a way to combine both into one?
Intermediate & Advanced SEO | | nicole.healthline0