Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we're not completely removing the content (many posts will still be possible to view), we have locked both new posts and new replies.
Google has deindexed a page it thinks is set to 'noindex', but is in fact still set to 'index'
-
A page on our WordPress powered website has had an error message thrown up in GSC to say it is included in the sitemap but set to 'noindex'. The page has also been removed from Google's search results.
Page is https://www.onlinemortgageadvisor.co.uk/bad-credit-mortgages/how-to-get-a-mortgage-with-bad-credit/
Looking at the page code, plus using Screaming Frog and Ahrefs crawlers, the page is very clearly still set to 'index'. The SEO plugin we use has not been changed to 'noindex' the page.
I have asked for it to be reindexed via GSC but I'm concerned why Google thinks this page was asked to be noindexed.
Can anyone help with this one? Has anyone seen this before, been hit with this recently, got any advice...?
-
@effectdigital and @jasongmcmahon, did you ever get to the bottom of this? If so, what caused it and what was the long-term fix? GSC and Google seem to be behaving in a peculiar way.
We had a similar issue with this page: https://www.simplyadverse.co.uk/bad-credit-mortgage, but after several cache clears and re-indexing/fix requests it indexed fine.
We now have a page on another similar site that is stubbornly refusing to index. It's a new site and, other than a simple domain homepage, all pages had "noindex" on them while under development.
Several pages on the site behaved like this at launch, with GSC saying the page was marked as "noindex" but submitted in the sitemap. Yet when you check whether indexing is possible, GSC says it's fine (we'd removed noindex and set up the sitemap). All crawling tools say it's fine and all other pages are now indexed, but this page won't index despite repeated attempts over a couple of weeks: https://simplysl.co.uk/buy-to-let/
Other than the fact that they're all mortgage-related sites/pages, I can't fathom why one page would be troublesome while all the others index OK despite having the same setup and indexing process. Any ideas?
-
Thanks, I'll take a look
-
Thanks for going into so much detail, much appreciated.
We've asked Google to reindex it and 'validate the fix', even though we can't find anything to fix!
-
Hi there, check that caching isn't the issue at the server and CMS levels. Other than that, reindex the page via GSC.
-
This is really weird. Really really weird!
As you say, your site's source code seems to confirm that it is set to index. If we look here, we can plainly see that the coding syntax for a no-index directive is "noindex" (all one word).
Let's look at your source code:
Yep, everything seems fine there! But what if a script is modifying your source code and including the directive - and Google's picking up on that?
If we look at the modified source code which I rendered and saved to a file here:
... we can see, there are no problems here either:
Wow - that's really unhelpful!
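For anyone who wants to repeat the source-code check without eyeballing the markup, here is a minimal sketch in Python. It assumes you have already saved the raw or rendered HTML as a string; the class name, the function name and the sample markup are all made up for illustration and are not taken from the page in question.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # A content value such as "noindex, follow" is split into tokens.
            tokens = attr.get("content", "").split(",")
            self.directives += [t.strip().lower() for t in tokens]


def has_noindex(html):
    """Return True if any robots meta tag in the HTML asks for noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in parser.directives or "none" in parser.directives


# Hypothetical markup, for illustration only.
sample_ok = '<html><head><meta name="robots" content="index, follow"></head></html>'
sample_blocked = '<html><head><meta name="ROBOTS" content="NOINDEX, follow"></head></html>'

print(has_noindex(sample_ok))       # False
print(has_noindex(sample_blocked))  # True
```

Running the same function over both the raw source and the rendered (script-modified) source covers the two checks described above.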
Let's see what happens if we specifically search Google's live index for the URL:
Interestingly, when we search Google's index for this page, we get this page returned instead.
It makes sense that Google would return that URL if it couldn't return the main URL, as one is nested inside the other. If everything was healthy, we'd see Google listing both URLs instead of just one of them. Even if you edit my index query to remove the trailing slash, you still only get the nested URL (not the one you want to be showing, which is one level higher up).
Another thought I had was, hmm maybe this is a canonical tag gone rogue. That bore no fruit either, as this page (which you want to index, yet won't) canonicals to this page - and both of those URLs are exactly the same. As such, it's obvious that we can't blame the canonical tag either! I even viewed the modified source to see if it got altered, no dice (the canonical tag is just fine)
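The canonical comparison can be scripted too. This is only a rough sketch, assuming the HTML is already in a string; the helper names and the example.com URLs are hypothetical. Note that it deliberately treats a trailing-slash difference as a mismatch, since those are different URLs.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = {k: (v or "") for k, v in attrs}
        rels = attr.get("rel", "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href", "").strip()


def normalise(url):
    """Lower-case scheme and host; keep path and query exactly as given."""
    parts = urlsplit(url)
    return (parts.scheme.lower(), parts.netloc.lower(), parts.path or "/", parts.query)


def is_self_canonical(page_url, html):
    """True if the page has no canonical tag or the tag points at page_url itself."""
    parser = CanonicalParser()
    parser.feed(html)
    if parser.canonical is None:
        return True  # nothing is pointing elsewhere
    return normalise(parser.canonical) == normalise(page_url)


# Hypothetical page and markup, for illustration only.
page = "https://www.example.com/guides/some-page/"
html_same = '<link rel="canonical" href="https://www.example.com/guides/some-page/">'
html_other = '<link rel="canonical" href="https://www.example.com/guides/">'

print(is_self_canonical(page, html_same))   # True
print(is_self_canonical(page, html_other))  # False
```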
Maybe the XML file is telling Google not to index the URL?
Nope - that's fine too! No problems there...
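Worth noting that an XML sitemap has no way to say "don't index this"; the useful check is simply whether the URL is listed. A small sketch of that check, assuming a standard sitemaps.org `urlset` file already loaded as text (the function name and sample entries are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text}


# Hypothetical sitemap content, for illustration only.
sample_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.example.com/</loc></url>"
    "<url><loc>https://www.example.com/guides/some-page/</loc></url>"
    "</urlset>"
)

urls = sitemap_urls(sample_sitemap)
print("https://www.example.com/guides/some-page/" in urls)  # True
print("https://www.example.com/missing/" in urls)           # False
```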
Could the robots.txt file be interfering?
No! Darn it, that's not the problem
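If you want to test a robots.txt file against a URL yourself, Python's standard library has a parser for it. A minimal sketch follows, with made-up rules and URLs. Bear in mind this parser is only an approximation: its Allow/Disallow precedence may not match Googlebot's exactly, so treat the result as a hint rather than proof.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="Googlebot"):
    """Check a URL against robots.txt rules supplied as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules, for illustration only.
sample_robots = """User-agent: *
Disallow: /wp-admin/
"""

print(is_allowed(sample_robots, "https://www.example.com/guides/some-page/"))    # True
print(is_allowed(sample_robots, "https://www.example.com/wp-admin/options.php"))  # False
```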
I know that a no-index or blocking directive can also be sent through the HTTP header (usually via X-robots). Let's check the response header of your URL out:
Nothing there that really raises my eyebrow. The XSS protection header is enabled and set to block, but to be honest that shouldn't affect Google's crawling at all. Anyone correct me if I am wrong, but defending your site against cross-site scripting (XSS) attacks doesn't impede crawling, right?
Fudge it. Let's fling it through Google's PageSpeed Insights tool. Usually that will tell you if something is being blocked and why...
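The header check boils down to looking for an `X-Robots-Tag` response header that carries `noindex` (or `none`), optionally scoped to a named bot. Here is a rough sketch, assuming you have the response headers as a list of name/value pairs. The function name and the sample headers are hypothetical, and I'm assuming the XSS header in question is `X-XSS-Protection`.

```python
def header_blocks_indexing(headers, user_agent="googlebot"):
    """Return True if an X-Robots-Tag header asks user_agent not to index."""
    for name, value in headers:
        if name.lower() != "x-robots-tag":
            continue
        value = value.strip().lower()
        prefix, sep, rest = value.partition(":")
        # A leading "botname:" scopes the directives to one crawler.
        # "unavailable_after:" is a directive, not a bot name.
        if sep and "," not in prefix and prefix.strip() != "unavailable_after":
            if prefix.strip() != user_agent:
                continue
            value = rest
        directives = {d.strip() for d in value.split(",")}
        if "noindex" in directives or "none" in directives:
            return True
    return False


# Hypothetical response headers, for illustration only.
clean = [("Content-Type", "text/html"), ("X-XSS-Protection", "1; mode=block")]
blocked = clean + [("X-Robots-Tag", "noindex, nofollow")]
other_bot = clean + [("X-Robots-Tag", "otherbot: noindex")]

print(header_blocks_indexing(clean))      # False
print(header_blocks_indexing(blocked))    # True
print(header_blocks_indexing(other_bot))  # False
```

As the sample shows, an XSS protection header on its own never trips this check, because it is not a robots directive.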
Nothing useful still!
Google's mobile-friendly tool gives us some semi-interesting information:
But it doesn't say the page can't be loaded. It only says some resources which the page pulls in can't be loaded! And guess what? They're all external things on other websites (other than a few theme related bits, but nothing IMO that should stop the whole page loading).
Let's try DeepCrawl's indexability checker (they make amazing software by the way... expensive though):
Sir... there is NO GOOD REASON why your URL shouldn't be indexed. I am 99.9% certain you have encountered a legit Google bug. Post about it here. Only Google can help you at this juncture