Index bloating issue
-
Hello,
In the last month, I noticed a huge spike in the number of pages indexed on my site, which I think is impacting my SEO quality score.
While I've only have about 90 pages on my site map, the number of pages indexed jumped to 446, with about 536 pages being blocked by robots. At first we thought this might be due to duplicate product pages showing up in different categories on my site, but we added something to our robot.txt file to not index those pages. But the number has not gone down. I've tried to consult with our hosting vendor, but no one seems to be concerned or have any idea why there was such a big jump in the last month.
Any insights or pointers would be so greatly appreciated, so that I can fix/improve my SEO as quickly as possible!
Thanks!
-
in order to determine if your website is hacked this is one of the best tools I know of both to find out and to remove the malware.
In order to determine rather not you have on-site SEO problems on a very technical and granular scale I would use
https://www.deepcrawl.com/ $80 a month you cannot go wrong
another amazing tool and it's free for the first 500 pages and if you want the added features which you do or more pages only about $150 a year is
-
Thank you. These are helpful suggestions.
-
A couple of things to note:
- As Robert mentioned, I would definitely make sure there is no longer an issue on your wordpress site relating to your previous hack.
- Robots.txt disallow does not stop pages from being indexed. It merely tells search engines to stop crawling that page from here out. The meta noindex tag is more applicable for noindexing pages that are already out there.
- I would check your search console crawl errors to see if there's a hefty spike in 404 errors as well, as it may be old spam pages you removed from the site.
- If these pages that are bloating your index are all still old spam filled pages from when you were hacked, you could start by using the search console's "remove url's" tool, which will remove all these url's from the index temporarily. For a more long term approach, instead of them giving off a 404 if they have been removed, making the server give off a "410" response would tell google they are gone forever, and thus they will be removed from the index as time goes on.
-
When I do the search for my main url - the results are clean. Just the pages to my site show up. And the index results for this site still bloated. However, for my wordpress site, which is a subdomain and on a different platform to my main site, there are some issues (it was hacked as Rob noted below). But we have since cleaned up the pages etc, reuploaded the site maps, etc. So I'm a little stumped on my main site (which wasn't hacked - that I'm aware of).
-
What do you see if you do a search for site:yoursite.com ?
-
Hello Julie,
This sounds like you might have a hacking issue on your website. You probably need someone to conduct a full code audit of your site to determine whether any files you have uploaded (plugins, for example) were contaminated. If a site is hacked, new pages can be added that are hidden from view and difficult to detect unless handled by a security specialist.
We recently brought on a new client who had this issue and discovered that his site had 1000's of pages dedicated to testosterone pills, etc. We had to go through GWT and the site logs to determine what new pages were created and it was a complete hack job.
In terms of fixing your SEO, the first step is to determine where/if the hack exists. Once that is decided, you have to clean up the site and restore the site's security.
I would be happy to help you with the next steps if you would like. I am always available!
Thanks and best of luck,
Rob
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My Website stopped being in the Google Index
Hi there, So My website is two weeks old, and I published it and it was ranking at about page 10 or 11 for a week maybe a bit longer. The last few days it dropped off the rankings, which I assumed was the google algorithm doing its thing but when I checked Google Search Console it says my domain is not in the index. 'This page is not in the index, but not because of an error. See the details below to learn why it wasn't indexed.' I click request indexing, then after a bit, it goes green saying it was successfully indexed. Then when I refresh the website it gives me the same message 'This page is not in the index, but not because of an error. See the details below to learn why it wasn't indexed.' Not sure why it says this, any ideas or help is appreciated cheers.
Technical SEO | | sydneygardening0 -
Redirect Chain Issue
I just found I'm having a redirect chain issue for http://ifixappliancesla.com (301 Moved Permanently). According to Moz, "Your page is redirecting to a page that is redirecting to a page that is redirecting to a page... and so on" These are the pages involved: 301 Moved Permanently
Technical SEO | | VELV
http://ifixappliancesla.com
https://ifixappliancesla.com https://www.ifixappliancesla.com/ This is what Yoast support told me: "The redirect adds the https and then the www, ending at: https://www.ifixappliancesla.com/. You want all variants of your site's domain to end up at: https://www.ifixappliancesla.com/ " - which is totally true. But I would also like not to have the redirect chain issue! Could you please give me an advise on how to properly redirect my pages so I don't have that issue anymore?0 -
Internal Linking issue
So i am working with a review company and I am having a hard time with something. We have created a category which lists and categorizes every one of our properties. For example a specific property in the category "restaurant" would be as seen below: /restaurant/mcdonalds /restaurant/panda-express And so on and so on. What I am noticing however is that our more obscure properties are not being linked to by any page. If I were to visit the page myurl.com/restaurant I would see 100+ pages of properties, however it seems like only the properties on the first few pages are being counted as having links. So far the only way I have been able to work around this issue is by creating a page and hiding it in our footer called "all restaurants". This page lists and links to every one of our properties. However it isn't exactly user friendly and I would prefer scrapers not to be able to scrape all properties at once! Anyway, any suggestions would be greatly appreciated.
Technical SEO | | HashtagHustler0 -
We need a bit of help from someone to fix the following issues causing speed issues on our website.
Hi We need a bit of help from someone to fix the following issues causing speed issues on our website.Does anyone know of someone that can help? Reduce server response time Optimize images Eliminate render-blocking JavaScript and CSS in above-the-fold content Avoid landing page redirects Leverage browser caching Minify CSS Minify JavaScript Minify HTML
Technical SEO | | Bev.Aquaspresso0 -
Homepage not indexed
Hi, I have a problem with my website. From my PC, when I search for site:nobelcom.com the homepage of the website doesn't appear, but on other PCs (different IPs) it is ok.
Technical SEO | | Silviu
Also any keywords that usually responded with homepage, now responds with other page. Does anyone know way this is happening. It happen before the Penguin update, and after a fetch like google and send to index, I had the homepage back on serps0 -
Webmaster internal links issue
Hi All, In webmaster > Internal links https://www.google.com/webmasters/tools/internal-links?hl=en&siteUrl= I get counts as in the image http://imgur.com/9bO5H0f is this logical and ok or should i work on finding why so many links and reduce them? Thanks Martin
Technical SEO | | mtthompsons0 -
Getting More Pages Indexed
We have a large E-commerce site (magento based) and have submitted sitemap files for several million pages within Webmaster tools. The number of indexed pages seems to fluctuate, but currently there is less than 300,000 pages indexed out of 4 million submitted. How can we get the number of indexed pages to be higher? Changing the settings on the crawl rate and resubmitting site maps doesn't seem to have an effect on the number of pages indexed. Am I correct in assuming that most individual product pages just don't carry enough link juice to be considered important enough yet by Google to be indexed? Let me know if there are any suggestions or tips for getting more pages indexed. syGtx.png
Technical SEO | | Mattchstick0