Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
NoIndex/NoFollow pages showing up when doing a Google search using "Site:" parameter
-
We recently launched a beta version of our new website on a subdomain of our existing site. The existing site is www.fonts.com, with the beta living at new.fonts.com. We do not want Google to crawl the new site until it's out of beta, so we have added a noindex, nofollow robots meta tag to all pages.
However, one of our team members noticed that Google is displaying results from new.fonts.com for a "site:new.fonts.com" search (see attached screenshot). Is it possible that Google is indexing the content despite the noindex, nofollow tags? We have double-checked the syntax and it seems correct, apart from the trailing "/". I know Google still crawls noindexed pages; however, the fact that they're showing up in search results using the site: search syntax is unsettling.
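For anyone wanting to rule out the markup itself, here is a minimal sketch in Python that extracts the robots directives from a page's HTML using only the standard library. The class and function names are made up for illustration, and it only inspects the markup you pass in; it says nothing about what Google has actually crawled or indexed.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> are also routed here by
        # HTMLParser's default handle_startendtag implementation.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # "googlebot" is included as well, since that name targets Google only.
        if attr_map.get("name", "").strip().lower() in ("robots", "googlebot"):
            for part in attr_map.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    without_slash = '<head><meta name="robots" content="noindex, nofollow"></head>'
    with_slash = '<head><meta name="robots" content="noindex, nofollow" /></head>'
    # Both forms parse to the same directives, so the trailing "/" is harmless.
    print(robots_directives(without_slash) == robots_directives(with_slash))
```

Running it against the saved source of a few beta pages should show both noindex and nofollow if the tag is being emitted as intended.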
Any thoughts would be appreciated!
-
Thanks, appreciate you taking the time to write out a response!
-
Thank you for your reply. I will get this information over to the dev team!
-
Hi Chris
If Google sees a link to the page, it may still list the URL in its index, even though it saw the noindex tag when it got there and so did not index the content.
The rationale is that Google sees a link from your main site with some anchor text and indexes the link based on that anchor text. It can't use the page itself because you told it not to, but it still has some information about the page from your anchor text.
Here is a direct quote from Matt Cutts:
"Our highest duty has to be to our users, not to an individual webmaster. When a user does a navigational query and we don’t return the right link because of a NOINDEX tag, it hurts the user experience (plus it looks like a Google issue). If a webmaster really wants to be out of Google without even a single trace, they can use Google’s url removal tool."
REF: http://www.mattcutts.com/blog/google-noindex-behavior/
You can block access to the test site (which is what we do) via .htaccess (if you're on a Linux server running Apache) and use Google's URL removal tool to strip out the currently indexed pages.
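If the beta is not served by Apache, the same idea (require a login before anything is served) can be applied at the application layer. The following is only a sketch under assumptions: it swaps the .htaccess approach for HTTP Basic auth in a Python WSGI middleware, and the function name, credentials, and realm are hypothetical. Nothing in this thread says what stack new.fonts.com runs on.

```python
import base64
import hmac


def require_basic_auth(app, username, password, realm="beta"):
    """Wrap a WSGI app so every request must carry HTTP Basic credentials."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    expected = ("Basic " + token).encode("utf-8")

    def guarded(environ, start_response):
        supplied = environ.get("HTTP_AUTHORIZATION", "").encode("utf-8")
        # Constant-time comparison avoids leaking how much of the header matched.
        if hmac.compare_digest(supplied, expected):
            return app(environ, start_response)
        # Crawlers and anonymous visitors get a 401 and never see page content.
        start_response(
            "401 Unauthorized",
            [
                ("WWW-Authenticate", f'Basic realm="{realm}"'),
                ("Content-Type", "text/plain"),
            ],
        )
        return [b"Authentication required\n"]

    return guarded
```

Wrapping the beta application as `require_basic_auth(beta_app, "dev", "secret")`, where `beta_app` stands for whatever WSGI app serves the site, means a crawler only ever receives the 401 response, so there is no page content for it to index.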
I hope that helps.
-
If you have nofollow on all the pages, there is a chance this is happening because Google can't follow any links to your pages to crawl them and pick up the noindex tag.
Try changing your tags to noindex, follow.