How do I know which pages of my site are not indexed by Google?
-
Hi
In my Google Webmaster Tools, under Crawl -> Sitemaps, it shows 1117 pages submitted but only 619 indexed.
Is there any way I can find which pages are not indexed, and why?
It has been like this for a while.
I also have a manual action (partial) message: "Unnatural links to your site--impacts links", and under "Affects" it says "Some incoming links".
Is that the reason Google does not index some of my pages?
Thank you
Sina
-
Thank you very much for the detailed answer.
Is there any way I can find out when I got the Manual Action (Partial)?
There is no date.
-
Hi Sina,
For your first question, make sure you have Google Webmaster Tools set up (which I gather you do, as you have received a 'low quality/spam links' message from them). I should add that dealing with an 'unnatural link profile' warning from Google is a whole other project, and a super important one to boot, so get on top of that as well! Open Site Explorer is a perfect place to start: use it to crawl the links and profile your entire linking domain profile. From there you can filter through the options to identify the linking domains that may be causing that warning from Google. You will need to clean these up to ensure solid indexing of your site pages and for the rest of these steps to work and be effective.
Now, to look at the indexing issue you asked about. Once you log in to Webmaster Tools and click into the domain from the main panel, you will see a section called SITEMAPS on the right of the dashboard (3rd on the right). Click on the TITLE of this section and you will land on the SITEMAPS report. There is a wealth of information here from Google about the indexing health of your site.
There are 3 steps Google needs to have completed for a page to show up in search, and knowing them will help you figure out the information you are looking for:
- Crawling
- Indexing
- Ranking (what you see in the SERP results pages using search terms or Google operators for site review).
In order to see any results at all, you need to ensure you have a sitemap.xml file built, loaded and submitted to Google. It also needs to be configured properly and have no errors so that it processes correctly. This is the only way you will get a clear snapshot of what Google has indexed based on your XML file. It will tell you how many pages you have in their index, but it will not identify which ones. If you don't have any indexed at all, it will state that.
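Since the report gives a count but not the list, one way to narrow things down is to pull the submitted URLs out of the sitemap and compare them with URLs you have confirmed as indexed yourself. The following is only a minimal sketch: the sample sitemap, the example.com URLs and the `CONFIRMED_INDEXED` list are made-up placeholders, and the "indexed" list is something you would have to collect by hand (Google does not hand it to you here).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value listed under <url> in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def not_indexed(submitted, indexed):
    """Return submitted URLs missing from a hand-collected list of indexed URLs.

    Trailing slashes are ignored so /about and /about/ count as the same page.
    """
    seen = {url.rstrip("/") for url in indexed}
    return [url for url in submitted if url.rstrip("/") not in seen]


# Hypothetical sitemap content, standing in for your real sitemap.xml.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/about</loc></url>
  <url><loc>http://www.example.com/contact</loc></url>
</urlset>"""

# Hypothetical list of URLs you have personally confirmed as indexed.
CONFIRMED_INDEXED = ["http://www.example.com", "http://www.example.com/about/"]

if __name__ == "__main__":
    for url in not_indexed(sitemap_urls(SAMPLE_SITEMAP), CONFIRMED_INDEXED):
        print("not found in index:", url)
```

With a real site you would read the XML from your own sitemap file instead of `SAMPLE_SITEMAP`; the comparison itself stays the same.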
It's also time to look at your robots.txt and .htaccess files to ensure they are configured and installed properly. This would normally be another troubleshooting step, but seeing as you have an unnatural link profile, you may want to take these steps first. Ensure you don't have any noindex meta tags set site-wide as well.
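Both of those checks can be scripted roughly as below. This is a sketch using only the Python standard library; the robots.txt rules, the HTML snippets and the example.com URLs are invented for illustration, and the standard-library robots.txt parser does not necessarily interpret every rule exactly the way Googlebot does, so treat the result as a hint rather than a verdict.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def blocked_by_robots(robots_txt, url, agent="Googlebot"):
    """Return True if the given robots.txt text disallows `url` for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, url)


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(html):
    """Return True if the page carries a noindex robots meta directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical robots.txt and page markup for illustration only.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"
SAMPLE_PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'

if __name__ == "__main__":
    print(blocked_by_robots(SAMPLE_ROBOTS, "http://www.example.com/private/page.html"))
    print(has_noindex(SAMPLE_PAGE))
```

Running the same two functions over each URL that appears in the sitemap but not in the index is a quick way to rule these causes in or out.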
So, once you log in to Webmaster Tools (the dashboard for the site you are referring to), under SITEMAPS you will see a box showing the number of pages submitted and the number of pages indexed, along with any errors and warnings you are currently getting (link warnings will be here too!). This will give you some important information which you can log in an Excel file later.
Here is where you will most likely see that unnatural link alert from Google as well.
Now you have Google's 'indexed pages' view. Now you have to dig a little.
----- GOOGLE OPERATORS ----
Once you have some data from Google Webmaster Tools as mentioned above, you can go to Google.com (or the Google index you want to check, like .ca or others) and use Google search operators to see specifically which URLs and pages have been indexed by the engine. There are a few different ones you can use, listed below. I found a great resource on this and copied it in.
Domain search with the site: operator
(site:google.com)
This should return results only from the specified domain. Be careful if your site is on a subdomain (or multiple subdomains); "www" is a subdomain.
Domain search with the inurl: operator
(inurl:google.com)
This should return results that contain the specified domain. They may not come only from the site in question, though! It is possible for other sites to contain your domain name in their URLs (whois.domaintools.com may have such URLs, etc.).
Domain search with the site: and inurl: operators
(site:google.com inurl:google.com)
This way you limit the results to your domain only, and it seems to generate more "reliable" results than the site: operator alone.
Domain and path/query search with the site: and inurl: operators
(site:google.com inurl:/somepath/somedirectory/)
(site:google.com inurl:?this=that&rabbits=lunch)
This way you limit the results to your domain only and focus on a specific directory/folder or set of parameters, etc.
Domain and file type search with the site: and filetype: operators
(site:google.com filetype:html)
This limits the results to those from your domain and to a specific type of file. Please note: the filetype: operator may not show all files of that type. It may only work for URLs that end in that extension, so if you serve content as HTML but without .html in the filename, those pages will not show in the results!
Domain and path/query search with the site:, inurl: and inurl: operators
(site:google.com inurl:google.com inurl:/somepath/somedirectory/)
(site:google.com inurl:google.com inurl:?this=that&rabbits=lunch)
This lets you start limiting the results to specific parts of your site if you need to.
Make sure that your site pages also don't include noindex or nofollow meta tags in the <head> section. These would tell Google not to index the pages or follow the links on them.
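Going back to the search operators listed above: if you have a lot of directories to check, the query strings can be generated instead of typed by hand. This is just a sketch; `site_query` and `search_url` are hypothetical helper names, example.com stands in for your domain, and the generated links are meant to be opened manually in a browser, one at a time.

```python
from urllib.parse import urlencode


def site_query(domain, inurl=None, filetype=None):
    """Build a query such as 'site:example.com inurl:/blog/'."""
    parts = [f"site:{domain}"]
    if inurl:
        parts.append(f"inurl:{inurl}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


def search_url(query):
    """Return a Google search link for the query, to open by hand."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Hypothetical directories you want to spot-check.
    for path in ["/blog/", "/products/"]:
        query = site_query("example.com", inurl=path)
        print(query, "->", search_url(query))
```

Checking one directory at a time this way makes it easier to see which sections of the site are thin in the index.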
Ensure that your .htaccess file has the proper redirects in place if you find you have duplicate content. Ensure you are 301 redirecting the non-www to the www version of your site and pages (or vice versa), whichever you prefer to have indexed by Google, to ensure clean indexing of the site. This will make sure you don't have indexing problems site-wide in search.
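To see which of your URLs would be affected by such a redirect, it can help to compute the preferred version of each one and compare. The sketch below only illustrates the mapping a non-www to www (or the reverse) 301 is supposed to produce; it is not the redirect itself, `canonical_url` is a hypothetical helper, and example.com is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url, prefer_www=True):
    """Return the URL rewritten onto the preferred host (www or non-www)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    target = "www." + bare if prefer_www else bare
    return urlunsplit((parts.scheme, target, parts.path, parts.query, parts.fragment))


def needs_redirect(url, prefer_www=True):
    """Return True if the URL is not already on the preferred host."""
    return canonical_url(url, prefer_www) != url


if __name__ == "__main__":
    for url in ["http://example.com/page?x=1", "http://www.example.com/"]:
        print(url, "->", canonical_url(url), needs_redirect(url))
```

Any sitemap URL for which `needs_redirect` is true points at the version you are asking Google not to index, so it is worth fixing in the sitemap as well.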
TO NOTE
---- SERVER LOG FILES ----
Please make sure that you request log files from your hosting company too. If your host doesn't give you access to server log files for your traffic, switch! Log these and keep an eye on them as well for the information you need. This process is not a fast or easy one and does require some work. Don't get lazy. This is a crucial step to keep an eye on.
What I recommend next is to start keeping log files, if you aren't already, and reviewing them on a weekly or monthly basis (whichever is easier). The reason is that once you get indexed by Google, you always want to keep an eye on what is indexed and what isn't (dropped or de-indexed pages). This can also help identify problems (or penalties) from Google early if you see trends developing day over day or week over week.
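As a starting point for that kind of log review, the sketch below counts requests per URL from anything identifying itself as Googlebot. It assumes the common Apache/Nginx "combined" log format, which your host may or may not use, and the sample lines are invented. Keep in mind the user-agent string can be faked, so this shows claimed Googlebot traffic, not verified Googlebot traffic.

```python
import re
from collections import Counter

# Matches the request, status and final quoted field (user agent) of a
# combined-format access log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<agent>[^"]*)"$'
)


def googlebot_hits(log_lines):
    """Count requests per path whose user agent mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.search(line.strip())
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits


# Hypothetical log lines for illustration only.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Oct/2014:13:55:36 +0000] "GET /about HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [11/Oct/2014:09:12:01 +0000] "GET /about HTTP/1.1" 200 2326 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [11/Oct/2014:09:15:44 +0000] "GET /contact HTTP/1.1" 200 1180 '
    '"-" "Mozilla/5.0 (Windows NT 6.1; rv:32.0) Gecko/20100101 Firefox/32.0"',
]

if __name__ == "__main__":
    for path, count in googlebot_hits(SAMPLE_LOG).most_common():
        print(count, path)
```

Pages that are in your sitemap but never show up in these counts week after week are not being crawled, which is an earlier failure than not being indexed.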
Hope this helps point you in the right direction. Remember, don't be lazy here.
Exhaust all options to identify your problems! Cheers,
Rob
-
Based on the manual action message from Google, I would guess that one of the possible reasons is that the unindexed pages have bad links pointing towards them. So Google is thinking that those pages are not "quality."
I would also check that all pages are included in your XML sitemap at a minimum, and in your HTML sitemap if you have one. I'd also check the <head> section of all pages to make sure that no pages are set to "noindex." Lastly, you may have duplicate content. If two pages have the exact same text with only minor keyword-based variations, for example, then Google will often index only one of the two pages.