Google has deindexed a page it thinks is set to 'noindex', but is in fact still set to 'index'
-
GSC has flagged a page on our WordPress-powered website with an error saying it is included in the sitemap but set to 'noindex'. The page has also been removed from Google's search results.
Page is https://www.onlinemortgageadvisor.co.uk/bad-credit-mortgages/how-to-get-a-mortgage-with-bad-credit/
Looking at the page code, plus using Screaming Frog and Ahrefs crawlers, the page is very clearly still set to 'index'. The SEO plugin we use has not been changed to 'noindex' the page.
I have asked for it to be reindexed via GSC, but I'm concerned about why Google thinks this page was asked to be noindexed.
Can anyone help with this one? Has anyone seen this before, been hit with this recently, got any advice...?
-
@effectdigital and @jasongmcmahon, did you ever get to the bottom of this? If so, what caused it and what was the long-term fix? GSC and Google seem to be behaving in a peculiar way.
We had a similar issue with this page: https://www.simplyadverse.co.uk/bad-credit-mortgage, but after several cache clears and re-indexing/fix requests it indexed fine.
We now have a page on another similar site that is stubbornly refusing to index. It's a new site and, other than a simple domain homepage, all pages had "noindex" on them while under development.
Several pages on the site behaved like this at launch: GSC said the page was marked as "noindex" but submitted in the sitemap, yet when you check whether indexing is possible, GSC says it's fine (we'd removed noindex and set up the sitemap). All crawling tools say it's fine and all other pages are now indexed, but this page won't index despite repeated attempts over a couple of weeks: https://simplysl.co.uk/buy-to-let/
Other than they're all mortgage related sites/pages, I can't fathom why one page would be troublesome and all others index OK despite having the same setup and indexing process, any ideas?
-
Thanks, I'll take a look
-
Thanks for going into so much detail, much appreciated.
We've asked Google to reindex it and 'validate the fix', even though we can't find anything to fix!
-
Hi there, check that caching isn't the issue at server and CMS levels. Other than that, reindex the page via GSC.
-
This is really weird. Really really weird!
As you say, your site's source code seems to confirm that it is set to index. If we look here, we can plainly see that the coding syntax for a no-index directive is "noindex" (all one word).
Let's look at your source code:
Yep, everything seems fine there! But what if a script is modifying your source code and including the directive - and Google's picking up on that?
If we look at the modified source code which I rendered and saved to a file here:
... we can see, there are no problems here either:
Wow - that's really unhelpful!
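For anyone who wants to script the same check against both the raw and the rendered source, here is a minimal sketch using only Python's standard library. The class and function names are my own, and the sample markup is hypothetical rather than copied from the page in question. It simply collects whatever directives appear in `robots` or `googlebot` meta tags:

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k: (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            for token in attr.get("content", "").split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical markup for illustration only.
sample = '<html><head><meta name="robots" content="index, follow"></head></html>'
print("noindex" in robots_directives(sample))
```

Feed it the saved "view source" HTML and the saved rendered HTML in turn; if "noindex" only turns up in the rendered copy, a script is probably injecting it.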
Let's see what happens if we specifically search Google's live index for the URL:
Interestingly, when we search Google's index for this page, we get this page returned instead.
It makes sense that Google would return that URL if it couldn't return the main URL, as one is nested inside of the other. If everything was healthy, we'd see Google listing both URLs instead of just one of them. Even if you edit my index query to remove the trailing slash, you still only get the nested URL (not the one you want to be showing, which is at a slightly higher-up level)
Another thought I had was, hmm maybe this is a canonical tag gone rogue. That bore no fruit either, as this page (which you want to index, yet won't) canonicals to this page - and both of those URLs are exactly the same. As such, it's obvious that we can't blame the canonical tag either! I even viewed the modified source to see if it got altered, no dice (the canonical tag is just fine)
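The canonical comparison can be sketched in the same way. This is only an illustration under my own assumptions: the helper names are made up, the sample markup is hypothetical, and the comparison is a simple one on scheme, host, path and query (it does not follow redirects or resolve relative URLs):

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = {k: (v or "") for k, v in attrs}
        rels = attr.get("rel", "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href", "").strip()


def is_self_canonical(page_url, html):
    """True if the page canonicals to itself, False if elsewhere, None if no tag."""
    parser = CanonicalParser()
    parser.feed(html)
    if not parser.canonical:
        return None
    a, b = urlsplit(page_url), urlsplit(parser.canonical)
    return (a.scheme, a.netloc.lower(), a.path, a.query) == (
        b.scheme, b.netloc.lower(), b.path, b.query)


url = "https://www.onlinemortgageadvisor.co.uk/bad-credit-mortgages/how-to-get-a-mortgage-with-bad-credit/"
# Hypothetical markup for illustration only.
sample = '<head><link rel="canonical" href="%s"></head>' % url
print(is_self_canonical(url, sample))
```

A result of `False` would point at a canonical tag sending Google elsewhere, which is not what I found here.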
Maybe the XML file is telling Google not to index the URL?
Nope - that's fine too! No problems there...
Could the robots.txt file be interfering?
No! Darn it, that's not the problem
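If you'd rather test robots.txt rules offline than eyeball them, Python ships with `urllib.robotparser`. The sketch below is a rough approximation only: the rules shown are hypothetical (not the site's real file), the function name is mine, and Python's parser does plain prefix matching, so it will not mirror Google's wildcard handling exactly:

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url):
    """Return True if the given robots.txt text lets Googlebot fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical rules, typical of a WordPress install.
rules = """\
User-agent: *
Disallow: /wp-admin/
"""

print(googlebot_allowed(rules, "https://example.com/buy-to-let/"))
print(googlebot_allowed(rules, "https://example.com/wp-admin/options.php"))
```

Bear in mind that a robots.txt block stops crawling rather than setting "noindex", so even a positive hit here would not fully explain the GSC message.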
I know that a no-index or blocking directive can also be sent through the HTTP header (usually via X-robots). Let's check the response header of your URL out:
Nothing there that really raises my eyebrow. This is enabled and set to block, but to be honest that shouldn't affect Google's crawling at all. Anyone correct me if I am wrong, but defending your site against cross-site scripting (XSS) attacks doesn't impede crawling right?
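To make the header check repeatable, here is a rough sketch that scans response headers for an `X-Robots-Tag` carrying "noindex" (or "none", which Google treats as noindex plus nofollow). The function name and the sample headers are my own inventions, and I'm assuming the XSS header being described above is `X-XSS-Protection`; either way, only `X-Robots-Tag` is inspected:

```python
# Directives that carry their own "name: value" colon, so the text before
# the first colon is not a crawler name.
VALUED = ("unavailable_after", "max-snippet", "max-image-preview", "max-video-preview")


def header_blocks_indexing(headers, bot="googlebot"):
    """headers: iterable of (name, value) pairs from an HTTP response."""
    for name, value in headers:
        if name.strip().lower() != "x-robots-tag":
            continue
        value = value.strip().lower()
        prefix, sep, rest = value.partition(":")
        if sep and prefix.strip() not in VALUED and "," not in prefix:
            # "googlebot: noindex" form: applies only to the named crawler.
            if prefix.strip() != bot:
                continue
            value = rest
        tokens = {t.strip() for t in value.split(",")}
        if tokens & {"noindex", "none"}:
            return True
    return False


# Hypothetical response headers for illustration only.
sample_headers = [
    ("Content-Type", "text/html; charset=UTF-8"),
    ("X-XSS-Protection", "1; mode=block"),
]
print(header_blocks_indexing(sample_headers))
```

You can pass it the header pairs from whatever HTTP client you use; ideally fetch with a Googlebot user agent too, in case the server responds differently to crawlers.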
Fudge it. Let's fling it through Google's Page-Speed Insights tool. Usually that will tell you if something is being blocked and why...
Nothing useful still!
Google's mobile-friendly tool gives us some semi-interesting information:
But it doesn't say the page can't be loaded. It only says some resources which the page pulls in can't be loaded! And guess what? They're all external things on other websites (other than a few theme related bits, but nothing IMO that should stop the whole page loading).
Let's try DeepCrawl's indexability checker (they make amazing software by the way... expensive though):
Sir... there is NO GOOD REASON why your URL shouldn't be indexed. I am 99.9% certain you have encountered a legit Google bug. Post about it here. Only Google can help you at this juncture