Google Index Status Falling Fast - What should I be considering?
-
Hi Folks,
I'm working on an ecommerce site. I have found a month-on-month fall in the Index Status, continuing since late 2015. As a result, only around 80% of pages are now indexed according to Webmaster Tools.
I do not seem to have any bad links or server issues. I am in the early stages of working through the site, updating content and tags, but have yet to see the fall slow down.
If anybody has tips on where to look for issues, or insight into how to resolve this, I would really appreciate it.
Thanks everybody!
Tim
-
Hi dude, thank you so much for taking time to look at this site. It is really kind of you. I will be taking a look at all the points raised over the next week to see what we can achieve. Thanks, Tim
-
Thank you for taking so much time to look at our site. I really appreciate it. I will dig in to the points to see what we can achieve. Thanks again, Tim
-
Thanks dude, I will take a look at this. Really appreciate you taking time to respond.
-
Hi Tim,
I agree with Laura on the canonical tags. I've worked on several large Magento sites and I've never seen any issue with the way Magento handles it - by canonicalizing product URLs to the root directory.
In fact, I actually prefer this way over assigning a product to a 'primary' category and using that as the canonical.
As Laura said, a reduction in the total number of indexed pages might actually be a really big positive here! More pages indexed does not mean it's better. If they are low quality/duplicate pages that have been removed from index, that's a really good thing.
I did find some issues with your robots.txt file:
- Disallow: /media/ - should be removed because it's blocking images from being crawled (this is a default Magento thing and they should remove it!)
- Disallow: /? - this basically means that any URLs containing a ? will not be crawled, and with the way pagination is set up on the site, this means that any pages after page 1 are not being crawled.
This could be impacting how many product pages you have indexed - which would definitely be a bad thing! You would obviously want your product pages to be crawled and indexed.
Solution: I would leave Disallow: /? in robots.txt because it stops product filter URLs being crawled, but I would add the following line:
Allow: */?p=
This line will allow your paginated pages to be crawled, which will also allow products linked from those pages to be crawled.
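One way to sanity-check robots.txt rules like these before deploying them is to run sample paths through a small matcher. The sketch below is only an approximation of Google's documented matching behaviour (`*` as a wildcard, a trailing `$` as an end anchor, the longest matching pattern wins, and Allow wins a tie); it is not Google's actual parser. The function names and the wildcard spellings `/*?` and `/*?p=` are assumptions made for illustration, not copied from the live robots.txt file.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` (path plus query string) may be crawled.

    `rules` is a list of ("allow" | "disallow", pattern) pairs.
    The longest matching pattern wins; on a tie, allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            is_allow = kind.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# Hypothetical wildcard spellings of the two rules discussed above.
rules = [("disallow", "/*?"), ("allow", "/*?p=")]

print(is_allowed("/epos-systems/?p=2", rules))             # True
print(is_allowed("/epos-systems/?limit=32&p=2", rules))    # False
print(is_allowed("/pole-mounting-kit-94614.html", rules))  # True
```

Under these assumptions, a paginated URL where `p` is not the first parameter (such as /epos-systems/?limit=32&p=2) is still blocked, because the `?p=` pattern never matches it. It is worth testing the real URLs from the site and confirming the result in Search Console before relying on this.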
Hope this helps!
Cheers,
David
-
I would be interested in seeing examples of where this has happened. Were the canonical tags added after the URLs were already indexed or were the canonicals in place when the site launched?
-
However, the canonical is only an advisory tag. I've had a few cases where people have relied on the canonical tag when their site has several URL types per product (as above, one with the category in the URL and one with just the product URL), with many references to these different URLs elsewhere (onsite and offsite), and both versions have ended up indexed, which is not always ideal. Linking straight to the canonical URL also means that reporting tools such as Screaming Frog only show the true URLs on the site. It also saves crawl budget, as Google doesn't have to crawl both the category-based URL and the canonical URL.
Whilst it's not a major issue, it's something I would look at changing.
-
If I understand you correctly, you are referring to the following two URLs:
https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html
https://www.symectech.com/pole-mounting-kit-94614.html
Both of these have the same canonical referenced, which is https://www.symectech.com/pole-mounting-kit-94614.html.
It doesn't matter what actually shows in the address box. For the purposes of indexation, what matters is what is referenced in the canonical tag.
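To check which canonical a given page actually declares, the tag can be pulled out of the page source. The following is a minimal sketch using only the Python standard library; the class and function names are made up for illustration, and the sample markup is a stand-in rather than the real page source.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated values and may be missing.
        rel = (attr.get("rel") or "").lower().split()
        if "canonical" in rel and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html: str):
    """Return the first canonical URL declared in `html`, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


# Illustrative markup only; the URL is the canonical quoted above.
sample = (
    '<html><head><link rel="canonical" '
    'href="https://www.symectech.com/pole-mounting-kit-94614.html">'
    "</head><body></body></html>"
)
print(find_canonical(sample))
```

Running this against the source of both product URLs should return the same value if the canonicals are set up as described.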
-
Won't what I've suggested avoid these duplicate URLs? Here are some actual examples. Going via a tier two category, I get the following product URL:
https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html
With a canonical of:
https://www.symectech.com/pole-mounting-kit-94614.html
Yet when going from https://www.symectech.com/epos-systems/?limit=32&p=2 (a tier 1 category) I get the canonical url.
So if there are products listed in multiple tier two categories, then that's multiple URLs for the same product. With the suggestion I made, there would only be one variation of this product URL (the canonical).
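To see how many URL variations each product has, crawl data can be grouped by canonical. This is a rough sketch that assumes you already have a mapping of crawled URL to declared canonical (for example from a crawler export); the function name and the input shape are assumptions for illustration.

```python
from collections import defaultdict


def duplicate_variants(canonical_by_url: dict[str, str]) -> dict[str, list[str]]:
    """Map each canonical URL to the other URLs that point to it."""
    variants = defaultdict(list)
    for url, canonical in canonical_by_url.items():
        if url != canonical:  # skip the self-referencing canonical page
            variants[canonical].append(url)
    return {canonical: sorted(urls) for canonical, urls in variants.items()}


# The two product URLs from the example above.
crawl = {
    "https://www.symectech.com/epos-systems/customer-displays/pole-mounting-kit-94591.html":
        "https://www.symectech.com/pole-mounting-kit-94614.html",
    "https://www.symectech.com/pole-mounting-kit-94614.html":
        "https://www.symectech.com/pole-mounting-kit-94614.html",
}

for canonical, urls in duplicate_variants(crawl).items():
    print(canonical, "has", len(urls), "extra variation(s)")
```

Products with a long list of variations are the ones that would benefit most from linking straight to the canonical URL.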
-
A reduction in the number of pages indexed does not necessarily mean something is wrong. In fact, it could mean that something is right, especially if your rankings are improving.
How are you determining that only 80% of pages are indexed? Can you provide a specific URL that is not being indexed?
If you made changes to your canonical tag, robots.txt, or meta robots tag, these could all cause a reduction in the number of pages being indexed.
-
The canonicals appear to be set up correctly, and I would not advise listing the product URLs as their canonicals in the category as suggested above. That will create duplicate URLs with the same content, which is exactly what canonical tags are designed to avoid.
-
Just going through Laura's list as a checklist for ones that are applicable:
- Have you checked your robots.txt file or page-level meta robots tag to see if you are blocking or noindexing anything?
Nothing that I can see that's causing a major issue.
- Is it a large site? If so, check for issues that may affect crawl budget.
The main thing I can see is that the product URLs and canonicals are different. Is there any way of listing the product URLs as their canonical versions in the category?
-
Sorry for the delay in response. The website is symectech.com
We have fixed various issues, including a noindex issue earlier this year, but our index status is continuing to fall. However, the rankings seem to be improving week on week according to Moz. Thanks.
Tim
-
Just to echo what Laura has said, if you can share a URL that would be great so we can help you get to the source of the problem.
Try running a tool like Screaming Frog (https://www.screamingfrog.co.uk/seo-spider/) to check the issues Laura has mentioned above, as doing a lot of those checks by hand can be quite time consuming.
Also, have you seen a drop in rankings as your pages fall out of the index?
-
Any chance you can share the URL? That would make it much easier for someone to help in this forum. Without the URL, I can offer a few diagnostic questions.
- Has the number of pages on the site remained the same while pages are being removed from the index? Or have you added more content, but the percentage in the index has decreased?
- Have you checked your robots.txt file or page-level meta robots tag to see if you are blocking or noindexing anything?
- Have you submitted an XML sitemap? If so, check the XML sitemap to make sure what's being submitted should be indexed. It's possible to submit a sitemap that includes noindexed pages, especially with some automated tools.
- Is it a large site? If so, check for issues that may affect crawl budget.
- Have you changed any canonical tags?
- Have you used the Fetch as Google tool to diagnose a specific URL?
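The sitemap and meta robots checks above can be combined into one pass. The sketch below is a rough illustration using only the Python standard library: it assumes the sitemap XML and the HTML for each page have already been downloaded, and all function names, example.com URLs, and sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace used by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall(".//sm:loc", SITEMAP_NS) if loc.text]


class RobotsMetaFinder(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() in ("robots", "googlebot"):
            content = attr.get("content") or ""
            self.directives.update(d.strip().lower() for d in content.split(","))


def is_noindexed(html: str) -> bool:
    """True if the page carries a noindex (or the equivalent 'none') directive."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives or "none" in finder.directives


def noindexed_in_sitemap(sitemap_xml: str, html_by_url: dict[str, str]) -> list[str]:
    """List sitemap URLs whose downloaded HTML is marked noindex."""
    return [
        url for url in sitemap_urls(sitemap_xml)
        if url in html_by_url and is_noindexed(html_by_url[url])
    ]


# Hypothetical sample data for illustration.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/a.html</loc></url>
  <url><loc>https://example.com/b.html</loc></url>
</urlset>"""
sample_pages = {
    "https://example.com/a.html": '<html><head><meta name="robots" content="noindex, follow"></head></html>',
    "https://example.com/b.html": "<html><head><title>B</title></head></html>",
}
print(noindexed_in_sitemap(sample_sitemap, sample_pages))
```

Any URL this reports is being submitted for indexing while also telling search engines not to index it, which is the mismatch described in the sitemap question above. Note that this only looks at meta tags; a noindex sent via an HTTP header would need a separate check.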