Google Index Status Falling Fast - What should I be considering?
-
Hi Folks,
Working on an ecommerce site. I have found a month on month fall in the Index Status continuing since late 2015. This has resulted in around 80% of pages indexed according to Webmaster.
I do not seem to have any bad links or server issues. I am in the early stages of working through, updating content and tags but am yet to see a slowing of the fall.
If anybody has tips on where to look for to issues or insight to resolve this I would really appreciate it.
Thanks everybody!
Tim
-
Hi dude, thank you so much for taking time to look at this site. It is really kind of you. I will be taking a look at all the points raised over the next week to see what we can achieve. Thanks, Tim
-
Thank you for taking so much time to look at our site. I really appreciate it. I will dig in to the points to see what we can achieve. Thanks again, Tim
-
Thanks dude, I will take a look at this. Really appreciate you taking time to respond.
-
Hi Tim,
I agree with Laura on the canonical tags. I've worked on several large Magento sites and I've never seen any issue with the way Magento handles it - by canonicalizing product URLs to the root directory.
In fact, I actually prefer this was over assigning a product to a 'primary' category and using that as the canonical.
As Laura said, a reduction in the total number of indexed pages might actually be a really big positive here! More pages indexed does not mean it's better. If they are low quality/duplicate pages that have been removed from index, that's a really good thing.
I did find some issues with your robots.txt file:
- Disallow: /media/ - should be removed because it's blocking images from being crawled (this is a default Magento thing and they should remove it!)
- Disallow: /? - this basically means that any URLs containing a ? will not be crawled and with the way pagination is setup on the site, this means that any pages after 1 are not being crawled.
This could be impacting how many product pages you have indexed - which would definitely be a bad thing! You would obviously want your product pages to be crawled and indexed.
Solution: I would leave Disallow: /? in robots.txt because it stops a product filter URLs being crawled, but I would add the following line:
Allow: */?p=
This line will allow your paginated pages to be crawled, which will also allow products linked from those pages to be crawled.
Hope this helps!
Cheers,
David
-
I would be interested in seeing examples of where this has happened. Were the canonical tags added after the URLs were already indexed or were the canonicals in place when the site launched?
-
However, the canonical is only an advisory tag. I've had few cases where people have relied on their canonical tag when their site has numerous product url types (as above with category in the url and just product url) which has many references to these different urls elsewhere (onsite and offsite) and they are now indexed as both versions, which is not always ideal. It also means that reporting tools such as Screaming Frog only show the true URLs on the site. It's also saving crawl budget as it doesn't have to crawl the category produced url and the canonical url.
Whilst it's not a major issue, it's something I would look at changing.
-
If I understand you correctly, you are referring to the following two URLs:
https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html
https://www.symectech.com/pole-mounting-kit-94614.html
Both of these have the same canonical referenced, which is https://www.symectech.com/pole-mounting-kit-94614.html.
It doesn't matter what actually shows in the address box. For the purposes of indexation, what matters is what is referenced in the canonical tag.
.
-
What I've suggested will be avoiding these duplicate urls? Here's some actual examples, going via a tier two category I get the following product url:
https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html
With a canonical of:
https://www.symectech.com/pole-mounting-kit-94614.html
Yet when going from https://www.symectech.com/epos-systems/?limit=32&p=2 (a tier 1 category) I get the canonical url.
So if there are products listed in multiple tier two categories then that's multiple urls for the same product. With the suggestion I made, there would only be one variation of this product url (the canonical)
-
A reduction in the number of pages indexed does not necessarily mean something is wrong. In fact, it could mean that something is right, especially if your rankings are improving.
How are you determining that only 80% of pages are indexed? Can you provide a specific URL that is not being indexed?
If you made changes to your canonical tag, robots.txt , or meta robots tag, these could all cause a reduction in the number of pages being indexed.
-
The canonicals appear to be set up correctly, and I would not advise listing the product URLs as their canonicals in the category as suggested above. That will create duplicate URLs with the same content, which is exactly what canonical tags are designed to avoid.
-
Just going through Laura's list as a checklist for ones that are applicable:
- Have you checked your robots.txt file or page-level meta robots tag to see if you are blocking or noindexing anything?
Nothing that I can see, that's causing a major issue.
- Is it a large site? If so, check for issues that may affect crawl budget.
The main thing I can see is that the product urls and canonicals are different, is there anyway of listing the product urls as their canonical versions in the category?
-
<a name="_GoBack"></a>Sorry for the delay in response. Website is symectech.com
We have fixed various issues including a noindex issue earlier this year but our index status is continuing to fall. However, the ranking seems to be improving week on week according to MOZ. Thanks.
Tim
-
Just to echo what Laura has said, if you can share a URL that would be great so we can help you get to the source of the problem.
Try running a tool like screamingfrog (https://www.screamingfrog.co.uk/seo-spider/) to check the issues above that Laura has mentioned, as doing a lot of those by hand can be quite time consuming.
Also, do you have a drop in rankings with your pages falling out the index?
-
Any chance you can share the URL? That would make it much easier for someone to help in this forum. Without the URL, I can offer a few diagnostic questions.
- Have the number of pages on the site remained the same and pages are being removed from the index? Or have you added more content, but the percentage in the index has decreased?
- Have you checked your robots.txt file or page-level meta robots tag to see if you are blocking or noindexing anything?
- Have you submitted an XML sitemap? If so, check the XML sitemap to make sure what's being submitted should be indexed. It's possible to submit a sitemap that includes noindexed pages, especially with some automated tools.
- Is it a large site? If so, check for issues that may affect crawl budget.
- Have you changed any canonical tags?
- Have you used the Fetch as Google tool to diagnose a specific URL?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My website is my name. Overnight it went from being the number one google search to not showing up at all when you google my name. Why would this happen?
I built my website via square space. It is my name. If you google my name it was the number one hit. Suddenly 2 weeks ago it doesn't show up AT ALL. I went through square spaces SEO check list, secured my site etc. Still doesn't show up. Why would this happen all of the sudden and What can I do? Thank you!
Intermediate & Advanced SEO | | Jbark0 -
Sitemap indexing
Hi everyone, Here's a duplicate content challenge I'm facing: Let's assume that we sell brown, blue, white and black 'Nike Shoes model 2017'. Because of technical reasons, we really need four urls to properly show these variations on our website. We find substantial search volume on 'Nike Shoes model 2017', but none on any of the color variants. Would it be theoretically possible to show page A, B, C and D on the website and: Give each page a canonical to page X, which is the 'default' page that we want to rank in Google (a product page that has a color selector) but is not directly linked from the site Mention page X in the sitemap.xml. (And not A, B, C or D). So the 'clean' urls get indexed and the color variations do not? In other words: Is it possible to rank a page that is only discovered via sitemap and canonicals?
Intermediate & Advanced SEO | | Adriaan.Multiply1 -
Would you consider this thin content?
Just wondering what the community thinks about the following URLS and whether they are essentially thin content that should be handled through a canonical, noindex or a parameter filtering system: https://www.adversetdisplay.co.uk/products/3x1-popup-exhibition-stand https://www.adversetdisplay.co.uk/products/3x2-popup-exhibition-stand https://www.adversetdisplay.co.uk/products/3x3-popup-exhibition-stand https://www.adversetdisplay.co.uk/products/3x4-popup-exhibition-stand https://www.adversetdisplay.co.uk/products/3x5-popup-exhibition-stand
Intermediate & Advanced SEO | | ColinDocherty0 -
Https & http urls in Google Index
Hi everyone, this question is a two parter: I am now working for a large website - over 500k monthly organic traffic. The site currently has both http and https urls in Google's index. The website has not formally converted to https. The https began with an error and has evolved unchecked over time. Both versions of the site (http & https) are registered in webmaster tools so I can clearly track and see that as time passes http indexation is decreasing and https has been increasing. The ratio is at about 3:1 in favor of https at this time. Traffic over the last year has slowly dipped, however, over the last two months there has been a steady decline in overall visits registered through analytics. No single page appears to be the culprit, this decline is occurring across most pages of the website, pages which traditionally draw heavy traffic - including the home page. Considering that Google is giving priority to https pages, could it be possible that the split is having a negative impact on traffic as rankings sway? Additionally, mobile activity for the site has steadily increased both from a traffic and a conversion standpoint. However that traffic has also dipped significantly over the last two months. Looking at Google's mobile usability error's page I see a significant number of errors (over 1k). I know Google has been testing and changing mobile ranking factors, is it safe to posit that this could be having an impact on mobile traffic? The traffic declines are 9-10% MOM. Thank you. ~Geo
Intermediate & Advanced SEO | | Geosem0 -
VisitSweden indexing error
Hi all Just got a new site up about weekend travel for VisitSweden, the official tourism office of Sweden. Everything went just fine except som issues with indexing. The site can be found here at weekend.visitsweden.com/no/ For some weird reason the "frontpage" of the site does not get indexed. What I have done myself to find the issue: Added sitemaps.xml Configured and added site to webmaster tools Checked 301s so they are not faulty By doing a simple site:weekend.visitsweden.com/no/ you can see that the frontpage is simple not in the index. Also by doing a cache:weekend.visitsweden.com/no/ I see that Google tries to index the page without the trailing /no/ for some reason. http://webcache.googleusercontent.com/search?q=cache:http://weekend.visitsweden.com/no/ Any smart ideas to get this fixed or where to start looking? All help greatly appreciated Kind regards Fredrik
Intermediate & Advanced SEO | | Resultify0 -
Google and JavaScript
Hey there! Recent announcements at Google to encourage webmasters to let Google crawl Java Script http://www.googlewebmastercentral.blogspot.com/2014/05/understanding-web-pages-better.html http://googlewebmastercentral.blogspot.com/2014/05/rendering-pages-with-fetch-as-google.html We have always put JS and CSS behind robots.txt, but now considering taking them out of robots. Any opinions on this?
Intermediate & Advanced SEO | | CleverPhD0 -
Why are some pages indexed but not cached by Google?
The question is simple but I don't understand the answer. I found a webpage that was linking to my personal site. The page was indexed in Google. However, there was no cache option and I received a 404 from Google when I tried using cache:www.thewebpage.com/link/. What exactly does this mean? Also, does it have any negative implication on the SEO value of the link that points to my personal website?
Intermediate & Advanced SEO | | mRELEVANCE0 -
Need to duplicate the index for Google in a way that's correct
Usually duplicated content is a brief to fix. I find myself in a little predicament: I have a network of career oriented websites in several countries. the problem is that for each country we use a "master" site that aggregates all ads working as a portal. The smaller nisched sites have some of the same info as the "master" sites since it is relevant for that site. The "master" sites have naturally gained the index for the majority of these ads. So the main issue is how to maintain the ads on the master sites and still make the nische sites content become indexed in a way that doesn't break Google guide lines. I can of course fix this in various ways ranging from iframes(no index though) and bullet listing and small adjustments to the headers and titles on the content on the nisched sites, but it feels like I'm cheating if I'm going down that path. So the question is: Have someone else stumbled upon a similar problem? If so...? How did you fix it.
Intermediate & Advanced SEO | | Gustav-Northclick0