Google Index Status Falling Fast - What should I be considering?
-
Hi Folks,
I'm working on an ecommerce site and have seen a month-on-month fall in Index Status that has continued since late 2015. According to Webmaster Tools, only around 80% of our pages are now indexed.
I don't seem to have any bad links or server issues. I'm in the early stages of working through the site, updating content and tags, but I've yet to see the fall slow down.
If anybody has tips on where to look for issues, or insight into how to resolve this, I would really appreciate it.
Thanks everybody!
Tim
-
Hi dude, thank you so much for taking time to look at this site. It is really kind of you. I will be taking a look at all the points raised over the next week to see what we can achieve. Thanks, Tim
-
Thank you for taking so much time to look at our site. I really appreciate it. I will dig in to the points to see what we can achieve. Thanks again, Tim
-
Thanks dude, I will take a look at this. Really appreciate you taking time to respond.
-
Hi Tim,
I agree with Laura on the canonical tags. I've worked on several large Magento sites and I've never seen any issue with the way Magento handles it - by canonicalizing product URLs to the root directory.
In fact, I actually prefer this way over assigning a product to a 'primary' category and using that as the canonical.
As Laura said, a reduction in the total number of indexed pages might actually be a big positive here! More pages indexed doesn't automatically mean better. If the pages dropping out of the index are low-quality or duplicate pages, that's a really good thing.
I did find some issues with your robots.txt file:
- Disallow: /media/ - should be removed because it's blocking images from being crawled (this is a default Magento thing and they should remove it!)
- Disallow: /? - this basically means that any URL containing a ? will not be crawled, and with the way pagination is set up on the site, any page after page 1 is not being crawled.
This could be impacting how many product pages you have indexed - which would definitely be a bad thing! You would obviously want your product pages to be crawled and indexed.
Solution: I would leave Disallow: /? in robots.txt because it stops product filter URLs from being crawled, but I would add the following line:
Allow: */?p=
This line will allow your paginated pages to be crawled, which will also allow products linked from those pages to be crawled.
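Before deploying the change, it's worth running the proposed rules through a parser that follows Google's wildcard handling and checking which URL types come out crawlable. Below is only a rough sketch, not your actual file: it assumes the rest of your Magento defaults stay as they are, and it uses the third-party protego package (pip install protego), so double-check its API against the docs before relying on it.

```python
from protego import Protego

# Proposed rules from the suggestion above (sketch only -- keep the rest of
# your real robots.txt as it is, minus the /media/ disallow).
proposed_rules = """
User-agent: *
Disallow: /?
Allow: */?p=
"""

rp = Protego.parse(proposed_rules)

# URL types worth checking (paths here are illustrative): a paginated
# category page, a filtered category page, a product page, a product image.
test_urls = [
    "https://www.symectech.com/epos-systems/?limit=32&p=2",
    "https://www.symectech.com/epos-systems/?limit=32",
    "https://www.symectech.com/pole-mounting-kit-94614.html",
    "https://www.symectech.com/media/catalog/product/example.jpg",
]

for url in test_urls:
    verdict = "crawlable" if rp.can_fetch(url, "Googlebot") else "blocked"
    print(url, "->", verdict)
```

Whatever the parser reports, the thing to confirm is that product pages, images and paginated category pages come out crawlable while the filter-only URLs stay blocked - if they don't, adjust the patterns before touching the live file.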
Hope this helps!
Cheers,
David
-
I would be interested in seeing examples of where this has happened. Were the canonical tags added after the URLs were already indexed or were the canonicals in place when the site launched?
-
However, the canonical is only an advisory tag. I've had a few cases where people relied on their canonical tags on a site with multiple product URL types (as above, with the category-path URL and the plain product URL) and with many references to those different URLs elsewhere, both on-site and off-site, and both versions ended up indexed, which is not always ideal. Making the change also means that reporting tools such as Screaming Frog show only the true URLs on the site, and it saves crawl budget because the crawler doesn't have to fetch both the category-generated URL and the canonical URL.
Whilst it's not a major issue, it's something I would look at changing.
-
If I understand you correctly, you are referring to the following two URLs:
https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html
https://www.symectech.com/pole-mounting-kit-94614.html
Both of these have the same canonical referenced, which is https://www.symectech.com/pole-mounting-kit-94614.html.
It doesn't matter what actually shows in the address bar. For the purposes of indexation, what matters is what is referenced in the canonical tag.
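If you want to double-check that for any product URL variant yourself, a quick script beats viewing source by hand. This is only a rough sketch using Python's standard library (a plain regex check, so it won't catch canonicals injected by JavaScript or unusual markup):

```python
import re
import urllib.request

def get_canonical(url):
    """Fetch a page and return the href of its rel=canonical tag, or None."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    html = urllib.request.urlopen(req).read().decode("utf-8", errors="ignore")
    # Rough regex match; an HTML parser would be more robust.
    tag = re.search(r'<link[^>]*rel=["\']canonical["\'][^>]*>', html, re.IGNORECASE)
    if not tag:
        return None
    href = re.search(r'href=["\']([^"\']+)["\']', tag.group(0))
    return href.group(1) if href else None

for url in [
    "https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html",
    "https://www.symectech.com/pole-mounting-kit-94614.html",
]:
    print(url, "->", get_canonical(url))
```

Both of those should print the same .../pole-mounting-kit-94614.html canonical; if any variant ever prints something different, that page is worth a closer look.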
-
What I've suggested would avoid these duplicate URLs. Here are some actual examples: going via a tier-two category, I get the following product URL:
https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html
With a canonical of:
https://www.symectech.com/pole-mounting-kit-94614.html
Yet when going from https://www.symectech.com/epos-systems/?limit=32&p=2 (a tier-one category) I get the canonical URL.
So if a product is listed in multiple tier-two categories, that's multiple URLs for the same product. With the suggestion I made, there would only be one variation of each product URL (the canonical).
-
A reduction in the number of pages indexed does not necessarily mean something is wrong. In fact, it could mean that something is right, especially if your rankings are improving.
How are you determining that only 80% of pages are indexed? Can you provide a specific URL that is not being indexed?
If you made changes to your canonical tags, robots.txt, or meta robots tags, these could all cause a reduction in the number of pages being indexed.
-
The canonicals appear to be set up correctly, and I would not advise listing the product URLs as their canonicals in the category as suggested above. That will create duplicate URLs with the same content, which is exactly what canonical tags are designed to avoid.
-
Just going through Laura's list as a checklist for ones that are applicable:
- Have you checked your robots.txt file or page-level meta robots tag to see if you are blocking or noindexing anything?
Nothing that I can see that's causing a major issue.
- Is it a large site? If so, check for issues that may affect crawl budget.
The main thing I can see is that the product URLs and canonicals are different - is there any way of listing the product URLs as their canonical versions in the categories?
-
Sorry for the delay in responding. The website is symectech.com
We fixed various issues earlier this year, including a noindex issue, but our index status is continuing to fall. However, rankings seem to be improving week on week according to Moz. Thanks.
Tim
-
Just to echo what Laura has said, if you can share a URL that would be great so we can help you get to the source of the problem.
Try running a tool like Screaming Frog (https://www.screamingfrog.co.uk/seo-spider/) to check the issues Laura has mentioned above, as doing a lot of those checks by hand can be quite time-consuming.
Also, have you seen a drop in rankings alongside your pages falling out of the index?
-
Any chance you can share the URL? That would make it much easier for someone to help in this forum. Without the URL, I can offer a few diagnostic questions.
- Has the number of pages on the site remained the same while pages are being removed from the index? Or have you added more content, but the percentage in the index has decreased?
- Have you checked your robots.txt file or page-level meta robots tag to see if you are blocking or noindexing anything?
- Have you submitted an XML sitemap? If so, check the XML sitemap to make sure what's being submitted should be indexed. It's possible to submit a sitemap that includes noindexed pages, especially with some automated tools (there's a quick way to script this check sketched after this list).
- Is it a large site? If so, check for issues that may affect crawl budget.
- Have you changed any canonical tags?
- Have you used the Fetch as Google tool to diagnose a specific URL?
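On the XML sitemap point, here is a rough sketch of how that check could be scripted rather than eyeballing every URL. It's only a sketch: it assumes a single, uncompressed sitemap file rather than a sitemap index, it only looks at the meta robots tag and the X-Robots-Tag header, and the sitemap URL in it is a placeholder rather than your real one.

```python
import re
import urllib.request
import xml.etree.ElementTree as ET

# Placeholder - swap in the sitemap URL you actually submit to Google.
SITEMAP_URL = "https://www.example.com/sitemap.xml"
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(sitemap_url):
    """Yield every <loc> entry from a (non-index, uncompressed) XML sitemap."""
    xml_bytes = urllib.request.urlopen(sitemap_url).read()
    root = ET.fromstring(xml_bytes)
    for loc in root.findall(".//sm:loc", NS):
        yield loc.text.strip()

def is_noindexed(url):
    """True if a page is noindexed via its meta robots tag or X-Robots-Tag header."""
    req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    resp = urllib.request.urlopen(req)
    if "noindex" in (resp.headers.get("X-Robots-Tag") or "").lower():
        return True
    html = resp.read().decode("utf-8", errors="ignore")
    meta = re.search(
        r'<meta[^>]*name=["\']robots["\'][^>]*content=["\']([^"\']*)["\']',
        html, re.IGNORECASE)
    return bool(meta and "noindex" in meta.group(1).lower())

for url in sitemap_urls(SITEMAP_URL):
    if is_noindexed(url):
        print("Submitted in sitemap but noindexed:", url)
```

Anything this prints is a URL you are asking Google to index via the sitemap while telling it not to index on the page itself - worth fixing one way or the other.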