Homepage not indexed - seems to defy explanation
-
Hey folks
Hoping to get some more eyes on a specific problem I am seeing with a client's site.
Site: http://www.ukjuicers.com
We have checked everything we can think of and the usual suspects here are not present:
- Canonical URL is in place
- Site is shown as indexed in search console
- No Crawl, DNS, Connectivity or server errors
- No robots.txt blocking - verified in search console
- No robots meta tags or directives
- Fetch as Google works
- Fetch & render works
- site command returns all other pages
- info command does not return the homepage
- homepage is cached and cache has been updated since this issue started: http://webcache.googleusercontent.com/search?q=cache:www.ukjuicers.com
- homepage is indexed in Yahoo and Bing
- all variations redirect to the www.ukjuicers.com domain (.co.uk, .com, www, sans www etc)
The only issue I found after some extensive digging was that the HTTP and HTTPS versions of the site were both available and each specified itself as the canonical version: the http site used canonicals with http and the https site used canonicals with https. So the canonical was in conflict with itself, exacerbating the problem it is there to solve.
The HTTPS site is not indexed though. We have set it up in webmaster tools, and the web developer has now set redirects so that all versions, even the https ones, 301 redirect to the http://www.ukjuicers.com page, so these canonical issues have been ironed out.
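For anyone wanting to double check the redirect setup described above, here is a rough sketch in Python. It assumes the preferred URL is http://www.ukjuicers.com/ and that the variants are the http/https, www/non-www and .com/.co.uk combinations mentioned in the list; the exact hostnames and the `observed` mapping are illustrative assumptions, not something taken from the live site. You would fill `observed` with whatever HTTP client you like, with redirect following switched off.

```python
from itertools import product

# Assumption: this is the single URL every variant should 301 to.
PREFERRED = "http://www.ukjuicers.com/"


def variant_urls():
    """Build the scheme/host combinations to test (hostnames are assumed)."""
    schemes = ("http", "https")
    hosts = (
        "www.ukjuicers.com",
        "ukjuicers.com",
        "www.ukjuicers.co.uk",
        "ukjuicers.co.uk",
    )
    return [f"{scheme}://{host}/" for scheme, host in product(schemes, hosts)]


def misdirected(observed, preferred=PREFERRED):
    """Return the variants that do not 301 straight to the preferred URL.

    `observed` maps each URL to a (status_code, location_header) tuple
    recorded from the first response, without following redirects.
    """
    bad = []
    for url in variant_urls():
        if url == preferred:
            continue  # the preferred URL itself should answer 200, not redirect
        status, location = observed.get(url, (None, None))
        if status != 301 or location != preferred:
            bad.append(url)
    return bad
```

An empty list back from `misdirected` means every variant sends a single 301 to the preferred URL. Anything listed is either not redirecting, using a different status code, or pointing somewhere else.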
But... it's still not indexing the homepage.
The practical implications of this are quite scary - the site used to be somewhere between 1st and 4th for keywords like 'juicers', 'juicer' etc. Now they are bottom of page 1 or top of page 2 with an internal page. They were jostling with the big boys (Amazon, Argos, John Lewis etc) but now they are right at the bottom of the second page.
It's a strange one - I have seen all manner of technical problems over the years but this one seems to defy sensible explanation. The next step is to do a full technical SEO audit of the site, but I am always of the opinion that with many eyes all bugs are shallow, so if anyone has any input or experience with odd indexation problems like this I would love to get your input.
Cheers
Marcus -
Glad you figured it out. I honestly didn't think it would have been the canonicals. I'm a little surprised that the bots didn't just choose not to respect the suggestion as opposed to blanking your site from the index. Didn't think that was even a possibility from incorrect canonicals. Good to know for the future though in case anything like this comes up with anyone else's site.
-
Yep - it's back. Looks like resolving the canonical issue fixed it. Seems it was a usual suspect after all.
-
Yep - bit of a weird one but in the end looks like the canonicals were the issue. Thanks for taking a look though man - super appreciated.
-
Hey Bernadette - thanks for the feedback. Site is back in the index now, looks like the canonicals were the culprit but the owners are keen for no future issues so I will dig in and take a look at these points. Cheers!
-
Hey folks
24 hours after we identified and fixed the canonical issue the site is now indexed again so it does look like it was indeed a canonical conundrum. Both the HTTP and HTTPS sites were claiming to be the canonical version so in some respects creating a conflict. We removed this conflict and it is now indexed.
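In case it helps anyone hitting the same thing, the conflict can be spotted with a few lines of Python. This is only a sketch: the HTML snippets below are made-up stand-ins for what each protocol version served, and the class and function names are my own. It pulls the canonical link out of each version's HTML and flags a conflict when the versions do not agree on one target.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"].strip())


def find_canonicals(html):
    """Return all canonical hrefs declared in an HTML string."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonicals


def canonical_targets(pages):
    """pages maps fetched URL -> HTML; returns the distinct canonical targets."""
    targets = set()
    for html in pages.values():
        targets.update(find_canonicals(html))
    return targets


# Hypothetical markup illustrating the situation described above.
pages = {
    "http://www.ukjuicers.com/":
        '<head><link rel="canonical" href="http://www.ukjuicers.com/"></head>',
    "https://www.ukjuicers.com/":
        '<head><link rel="canonical" href="https://www.ukjuicers.com/"></head>',
}
targets = canonical_targets(pages)
if len(targets) > 1:
    print("Conflicting canonicals:", sorted(targets))
```

More than one distinct target across the versions of the same page is the conflict we had. After the fix every version should report the same single URL.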
Thanks for the extra eyes folks - appreciated, and if anyone ever needs another pair of eyes to look at a problem, give me a shout.
Cheers
Marcus -
Hey Marcus. You just need some links from high-authority websites like Moz :) People say you're indexed, so case closed, job done :)
-
I just noticed that clicking anywhere on the slider, even out to the sides where it appears to be just white space, takes you to another page. At first I didn't realize what I was clicking that got me to the next page. When I do Ctrl+A on the page, the full width of the slider images shows highlighted in blue, but the area to the side of those images, outside of those bounds, is linked too. I'm wondering if Google sees this as cloaking and kicked out the homepage as a result.
*I did see that AGM pointed out it's indexed now, but that's not to say this wasn't the cause of the original de-indexing.
-
As of this writing it looks like the page is indexed. By searching site:ukjuicers.com it comes up in the search results with about 861 other results. Not sure if there is anything you changed to get things working again but it seems to be in their index now.
-
I took a look at all of the usual suspects as well... which amounts to pretty much everything that everyone else mentioned, but I was intrigued by this issue and thought maybe another set of eyes might notice something that was off. Nothing was wrong in the page source from what I saw, no issues crawling it myself and I didn't see any penalties. Normally I'd think that if your homepage wasn't appearing for branded organic searches then a penalty was levied against you, but when that is the case the homepage is still normally findable in a site: operator search. Maybe it is related to all the backlinks that were lost/deleted in the past month, but I'm not sure why that would be the case unless removing the homepage from the index was a Penguin response to link issues... but I was under the impression that Penguin was devaluing the link source, not the link recipient, and deleting/removing links seems to be a preferred method of handling Penguin-related issues. So if there is a relationship between Penguin and your homepage being deindexed then I am not sure at all why, nor am I certain how to fix it, as I'm not seeing anything in particular that screams "linking issue" at me (though I only did a fairly cursory inspection of things).
So I am stumped. Whenever the issue is figured out I would love to know how/why this came to be.
-
Marcus, I know this is frustrating. I've checked several things, and looked at many of the possibilities that you've already brought up. I don't have access to the Google Search Console, so I cannot comment about any of that data. I'm assuming that you don't have a manual action on the site or any other messages from Google.
What I've seen in the past is issues with schema markup, especially when it comes to reviews and how they're handled on sites. I'm not saying that this is the issue--but I've seen issues that Google has had with these (especially because there is the word "hidden" there in the code). So, you might look into that some more.
The issue could also be related to links--look at the links to the site's home page to see if there is an issue with low quality links pointing to that page or other unnatural links.
If someone has copied the page, added a canonical tag, and then added a "meta noindex tag" to their page, it's possible that they could have taken your page out of the index. This has happened before.
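If you come across a suspected copy, a quick way to test for that pattern is sketched below in Python. It assumes the copy's canonical tag points at your URL (my reading of the scenario, not something confirmed for this site), and the function and class names are hypothetical. It only inspects HTML you have already downloaded.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Records the canonical target and whether a robots meta tag says noindex."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = {key: (value or "") for key, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href", "").strip() or None
        elif tag == "meta" and attr.get("name", "").lower() in ("robots", "googlebot"):
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True


def looks_like_hijack(copy_html, our_url):
    """True if the copied page canonicalises to our URL and is also noindexed."""
    signals = HeadSignals()
    signals.feed(copy_html)
    same_target = (signals.canonical or "").rstrip("/") == our_url.rstrip("/")
    return same_target and signals.noindex
```

Running `looks_like_hijack(html_of_copy, "http://www.ukjuicers.com/")` against each scraped copy you find would narrow down which ones are worth reporting. A `True` result is only a hint that the pattern is present, not proof that it caused a de-indexing.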
-
Unfortunately you're not Amazon, so maybe you must try harder ;)
Or force the main page to be indexed with some software or an indexer website, then wait a while.
I'm thinking about some negative SEO aimed at your main page, but so far I can't see any symptoms.
-
This is a strange one then.... very strange.
Just performed a site: search and like you said it is not showing up as indexed. There is normally something technical to explain an issue like this, but I cannot see anything after looking at your site robots and source code.
-
Hey Krzysztof
Yeah, the page has little textual content but... neither does the Amazon homepage. Ultimately the page is a jump-in point for all the products and the content suits that. Certainly, I could understand Google not liking the page, but would that not result in a reduced rank rather than a complete removal like this?
On the dodgy links front they have never done anything on that front - so anything there would be surprising (or just incidental cruft that is out there on scraper sites and the like).
Super odd.
-
Yep - super odd. 15 years or so in this game and never seen anything quite like this. Transient drops but usually it boiled down to some simple technical error or more often user error cough no index / robots.txt cough
-
Hey - the real issue here is the page is just not indexed. It's not there. Not that another page is a more suitable or preferential result. Ultimately that was the best page for a user to jump in at... The page is not even returned in a brand search so... can't see how any other page could be more suitable for that kind of search.
-
Hi Marcus
The only thing I think could be the issue is the number of words on the main page. Mostly I see images and words from menus and links, not main content. Digging deeper (an SEO audit) can help.
This could be Penguin too, but to know the answer a full link analysis is needed. After a quick glance I see some unnatural links, but not in large numbers. Maybe they have footprints that are not visible at first (same IP, C class, content with link etc).
-
You're not kidding, this does defy explanation. When did it drop out of the index?
In all honesty, I don't have a solution, you've already checked everything I would have. I'm mostly commenting so I can keep up with this issue and see how it unfolds. Very curious to see if anyone can identify what's happening here.
-
Hmmm, is it a case of Google simply feeling the homepage is not as engaging and relevant to your users in terms of search, and putting more emphasis on the product pages, which it chooses to feature instead?
I often find that for key terms our product pages almost always rank higher than the homepage, unless it's a brand-only search.
Secondly, is this a recent change? Could the most recent Penguin update have simply resulted in your competitors getting a boost, whereas the previous algo was holding them back, which has resulted in your position slide?