What does this mean about my site index?
-
How should I go about fixing this? See image.
-
How do I find out from Google Webmaster Tools which pages Google has indexed on our site?
-
It means your website is creating a lot of different URLs, but Google is deeming them low quality (perhaps duplicates or near-duplicates) and choosing not to index them.
I would look at these two options first:
- Prevent any unnecessary URLs from being created
- Restrict crawl access through robots.txt (see the sketch below)
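For the robots.txt option, a minimal sketch might look like the following; the blocked patterns here are placeholders, and you would swap in whatever URL patterns are actually generating the low-quality duplicates on your site:

```
# Hypothetical example: keep crawlers out of auto-generated, low-value URLs.
# Replace these patterns with the ones your site actually produces.
User-agent: *
Disallow: /*?sort=
Disallow: /internal-search/
```

Note that Google treats wildcards like the * above as an extension to the robots.txt standard, so test any pattern in Webmaster Tools' robots.txt tester before relying on it.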
You also need to figure out how many pages your site actually has. Should you have significantly more or significantly fewer than 3,400 URLs in the index?
If you should have more than 3,400 URLs, I'd suggest making multiple sitemaps based on site sections, tied together with a sitemap index (see the sketch below). This will let you see which sections are having problems with indexation.
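A sitemap index that splits the site by section might look something like this (the file names are hypothetical); Webmaster Tools then reports submitted vs. indexed counts per child sitemap, which is what makes per-section diagnosis possible:

```
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One child sitemap per site section -->
  <sitemap><loc>https://www.example.com/sitemap-products.xml</loc></sitemap>
  <sitemap><loc>https://www.example.com/sitemap-categories.xml</loc></sitemap>
  <sitemap><loc>https://www.example.com/sitemap-blog.xml</loc></sitemap>
</sitemapindex>
```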
Related Questions
-
How do I configure a multilingual site in Google Analytics? Why is it currently showing as Referral Traffic?
Hello all! Currently my multilingual site is showing as referral traffic. Is it because I have not added hreflang tags on the site? If so, and if I add hreflang tags on all sites, where will traffic from international sites then show in Google Analytics? And what type of configuration is required in Analytics? Thanks!
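(For reference, hreflang annotations are link elements in each page's head, with every language version listing all of its alternates; a minimal sketch with placeholder URLs and language codes:)

```
<!-- Hypothetical example: URLs and language codes are placeholders -->
<link rel="alternate" hreflang="en" href="https://www.example.com/en/" />
<link rel="alternate" hreflang="de" href="https://www.example.com/de/" />
<link rel="alternate" hreflang="x-default" href="https://www.example.com/" />
```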
Reporting & Analytics | pragnesh96390
-
Drop in indexation but increase in organic traffic
We've had a puzzling drop in indexed pages on our ecommerce website. My crawl returns just over 25k items. Until 19/6 we had about 23-24k pages indexed. Then we experienced a sudden drop from 19/6 to 26/6: from 23,400 to 18,999, losing 4.4k pages from one week to the next. At the same time, our organic traffic has not decreased; it has actually increased. However, it's only been a couple of weeks, so that may be coincidence. A few things have happened during the past few weeks:
- 31/5: we implemented pagination on category pages to avoid issues with duplicate content. Could it be that this led to a decrease in indexed pages three weeks later? However, I can only find about 1.5k pages in my crawl that are page 2+.
- 18-19/6: we had some website outages over the weekend. As a B2B business, we don't get much traffic over the weekend, so I can't see an impact on traffic. However, the following week indexation dropped by another 250 (then stayed the same this past week), so I don't think this was a factor.
- 21/6: we retired another website and migrated it to our main website. However, all pages were redirected to existing pages, so no new pages were created for the migration. This doesn't really explain a decrease in indexation, but may account for some of the increase in organic traffic; not all of it, though, as the retired website hardly got any organic traffic.
So, should we be worried? As our website is quite large, it would probably be quite difficult to pinpoint exactly which pages dropped off the index, but a loss of 19% of pages is quite significant. Then again, it doesn't appear to have negatively impacted organic traffic... Do you have any suggestions for what I should be looking at to find out what happened? Should I be worried at this point? I will definitely continue to keep an eye on how our organic traffic (and indexation) develops, but I am not sure if there is anything I can do at this point. I'd appreciate your advice on this, to make sure I am not missing something blindingly obvious. Thanks!
Reporting & Analytics | ViviCa10
-
Site relaunch and impact on SEO
I have some tough decisions to make about a web site I run. The site has been around for 20 years (September 1995, to be precise, is the date listed against the domain). Over the years, the effort I've expended on the site has come and gone, but I am about to throw a lot of time and effort back into it. The majority of the content on the site is pretty dated, isn't tremendously useful to the audience (since it's pretty old), and the site design and URL architecture isn't particularly SEO-friendly.
In addition, I have a database of thousands of vendors (for the specific industry this site serves). I don't know if it's a factor any more, but 100% of the links there have been populated by the vendors themselves specifically requesting inclusion (through a form we expose on the site). When the request is approved, the vendor link shows up on the appropriate pages for location (state) and segment of the industry. Though the links are all "opt-in" from vendors (we've never once added or imported any ourselves), I am sure this all looks like a terrible link farm to Google! And some vendors have asked us to remove their link for that reason 🙂
One final (very important) point. We have a relationship with a nationwide brand and have four very specific pages related to that brand on our site. Those pages are essential: they are by far the most visited pages and drive virtually all our revenue. The pages were put together with SEO in mind and the look and feel is very different to the rest of the site. The result is, effectively, a site-within-a-site. I need to carefully protect the performance of these pages. To put some rough numbers on this, the site had 475,000 page views over the last year, with about 320,000 of those being to these four pages (by the way, for the rest of the content "something happened" around May 20th of last year: traffic almost doubled overnight, even though there were no changes to our site). We have a Facebook presence and have put a little effort into that recently (increasing fans from about 10,000 last August to nearly 24,000 today, with a net gain of about 2,500 per month currently). I don't have any sense of whether that is a meaningful resource in the big picture.
So, that's the background. I want to totally revamp the broader site: much improved design, intentional SEO decisions, far better, current and active content, an active social media presence, and so on. I am also moving from one CMS to another (the target CMS/blog platform being WordPress). Part of me wants to do the following:
- Come up with a better plan for SEO and basically just throw out the old stuff and start again, with the exception of the four vendor pages I mentioned
- Implement redirection of the old URLs to new content (301s; see the sketch after this question)
- Just stop exposing the vendor pages (on the basis that many of the links are old/broken and I'm really not getting any benefit from them)
- Leave the four important pages exactly as they are (URL- and content-wise)
I am happy to rebuild the content afresh because I have a new plan around that for which I have some confidence. But I have some important questions:
- If I go with the approach above, is there any value from the old content/URLs that is worth retaining?
- How sure can I be there is no indirect negative effect on the four important pages? I really need to protect those pages.
- Is throwing away the vendor links simply all good, or could there be some hidden negative I need to know about? (Given many of the links are broken and go to crappy/small web sites, I'm hoping this is just a simple decision to make.)
- And one more uber-question: I want to take a performance baseline so that I can see where I started as I make changes and measure performance over time. Beyond the obvious metrics like number of visitors, time per page, page views per visit, etc., what metrics would be important to collect from the outset?
I am just at the start of this project and it is very important to me. Given the longevity of the site, I don't know if there is much worth retaining for that reason, even if the content changes radically. At a high level, I'm trying to decide what questions I need to answer before I set off on this path. Any suggestions would be very much appreciated. Thanks.
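(For reference, the 301 redirects mentioned above can be as simple as one Apache directive per retired URL; this is a minimal sketch with placeholder paths, not the site's actual URLs:)

```
# Hypothetical example (Apache mod_alias): both paths are placeholders.
Redirect 301 /old-section/old-page.html https://www.example.com/new-page/
```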
Reporting & Analytics | MarkWill0
-
Webmaster Tools: why does it show 486 pages submitted, but only 40 indexed?
I am confused about what a client account shows in WMT. The client account is http://multiview.com. It has a graph showing 486 pages submitted, but only 40 indexed. They recently re-launched (in April 2014), and the new site has about 40 pages indexed... so I am guessing that the 486 number relates to all the pages that are showing retrieval errors, i.e. 28 soft 404 errors, 10 access denied errors, 808 not found errors. Does this make sense as an explanation for such a gap between 486 and 40?
Reporting & Analytics | DianeDP0
-
WMT and 'Links To Your Site'
Has anyone else found that links from years ago are, almost continually, being added to the 'Links To Your Site' list when they weren't previously reflected? I'm seeing links that were added to directories in 2008 (by whoever was doing the SEO then) only showing up in the last week or so, when these links weren't in the list a few months ago. I don't suppose there's much I can do; it's just annoying in that it adds more people to contact to have nonsense removed.
Reporting & Analytics | Martin_S0
-
Has anyone noticed a drop in results using the site: operator?
I set our site's preferred domain back on January 28. We had both a www and a non-www domain being indexed. Since then, I've seen the number of results for our site from the site: operator decline dramatically. I'm not sure if this is a good thing or a bad thing, so I'm trying to see if it's unique to our site. My gut is that the numbers are probably leveling out to where they should be and the duplicates are falling out, but I would think that as the number of results for non-www declines, the number of results for www would increase. Any thoughts? Is anyone else seeing fluctuations in results using site: ? Lisa
Reporting & Analytics | Aggie0
-
Google Analytics Site Search to new sub-domain
Hi Mozzers, I'm setting up Google's Site Search tracking on a website. However, this isn't for search terms; this will be for people filling in a form and using the POST action to land on a results page. This is similar to what is outlined at http://support.google.com/analytics/bin/answer.py?hl=en&answer=1012264 ('Setting Up Site Search for POST-Based Search Engines'). However, my approach is different, as my results appear on a sub-domain of the top-level domain, e.g.:
- user is on www.domain.com/page.php
- user fills in the form and submits
- user gets taken to results.domain.com/results.php
The issue is with the suggested code provided by Google. Firstly, I don't use query strings on my results page, so I would have to create an artificial page, which shouldn't be a problem. But what I don't know is how the tracking will work across a sub-domain without the _gaq.push(['_setDomainName', '.domain.com']); code. Can this be added in? Can I also add Custom Variables? Does anyone have experience of using Site Search across a sub-domain, perhaps to track quote form values? Many thanks!
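(For reference, a rough sketch of the classic ga.js setup this question describes; the account ID, domain, and form value below are placeholders, not the asker's actual code:)

```
// Hypothetical sketch for classic ga.js tracking across a sub-domain.
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-XXXXXXX-1']);   // placeholder property ID
_gaq.push(['_setDomainName', '.domain.com']); // share the GA cookie across sub-domains

// On results.domain.com/results.php, instead of the normal _trackPageview,
// record a virtual pageview carrying the submitted form value as a query
// string, so Site Search (query parameter "q") can pick it up even though
// the real URL has no query string.
var formValue = 'example-form-value';         // placeholder for the POSTed value
_gaq.push(['_trackPageview', '/results.php?q=' + encodeURIComponent(formValue)]);
```

Custom variables could be pushed onto the same queue (e.g. _gaq.push(['_setCustomVar', ...])), as long as they are set before the _trackPageview call.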
Reporting & Analytics | panini0
-
I have recently switched from an HTML site to a WordPress-based one...
Obviously all my indexed links have changed. How can I avoid losing page rank?
Reporting & Analytics | HeadlandMachinery0