Tool to calculate the number of pages in Google's index?
-
When working with a very large site, are there any tools that will help you calculate the number of links in the Google index? I know you can use site:www.domain.com to see all the links indexed for a particular url. But what if you want to see the number of pages indexed for 100 different subdirectories (i.e. www.domain.com/a, www.domain.com/b)? is there a tool to help automate the process of finding the number of pages from each subdirectory in Google's index?
-
A good way to know how many pages are indexed in Google is to check the Top Landing Pages in Google Analytics,
It gives you a more interesting information than, IMHO, Google WBT itself, because those pages are the pages that actually are indexed and because of that receive traffic.
-
Google Webmaster Tools allows you to see your sitemap, along with the count of URLs in the sitemap and Google's index.
You can also download your sitemap from Google WMT. I am not aware of any data Google offers to help identify which URLs are not indexed. It sounds like this is the information Michelle is seeking.
If you just needed to know the # of URLs for each subdirectory, you could submit a sitemap for each one and then Google will show the URLs in Index for each given sitemap.
-
Is this your site? If so you should be able to find this information in Google Webmaster tools. Upload your sitemap and Google will tell you how many of the pages in your sitemap are in there index.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Will google be able to crawl all of the pages given that the pages displayed or the info on a page varies according to the city of a user?
So the website I am working for asks for a location before displaying the product pages. There are two cities with multiple warehouses. Based on the users' location, the product pages available in the warehouse serving only in that area are shown. If the user skips location, default warehouse-related product pages are shown. The APIs are all location-based.
Intermediate & Advanced SEO | | Airlift0 -
Domain Authority Dropped and Indexed Pages Went Down on Google?
Hi there, We run an e-commerce site on Shopify. Our Domain Authority was 28 at the start of our campaign in May of this year. We also had 610 indexed pages on Google. We did some SEO work which included: Renaming Images for SEO Adding in alt tags Optimizing the meta title to "Product Name - Keyword - Brand Name" for products Optimizing meta descriptions Transition of Hubspot blog to Shopify (it was on a subdomain at Hubspot previously) Fixing some 404s Resubmitting site map after the changes Now it is almost at the 3-month mark and it looks like our Domain Authority has gone down 4 points to 24. The # of indexed pages has gone to down to 555. We made sure all our SEO updates weren't spammy or keyword-stuffed, but took a natural and helpful-sounding approach. We followed guidelines. So there shouldn't be any penalty right? I checked site traffic and it does not coincide with the drop. Our site traffic remains steady. I also looked at "site:" as well as conducted some test searches for the important pages (i.e. main pages, blog pages, and product pages) and they still come up on Google. So could it only be non-important pages being deindexed? My questions are: Why did both the Domain Authority and # of indexed pages go down? Is there any way to see which pages were deindexed? I checked Google Search Console, but couldn't find it. Thank you!
Intermediate & Advanced SEO | | kindalpaca70 -
How to get a large number of urls out of Google's Index when there are no pages to noindex tag?
Hi, I'm working with a site that has created a large group of urls (150,000) that have crept into Google's index. If these urls actually existed as pages, which they don't, I'd just noindex tag them and over time the number would drift down. The thing is, they created them through a complicated internal linking arrangement that adds affiliate code to the links and forwards them to the affiliate. GoogleBot would crawl a link that looks like it's to the client's same domain and wind up on Amazon or somewhere else with some affiiiate code. GoogleBot would then grab the original link on the clients domain and index it... even though the page served is on Amazon or somewhere else. Ergo, I don't have a page to noindex tag. I have to get this 150K block of cruft out of Google's index, but without actual pages to noindex tag, it's a bit of a puzzler. Any ideas? Thanks! Best... Michael P.S., All 150K urls seem to share the same url pattern... exmpledomain.com/item/... so /item/ is common to all of them, if that helps.
Intermediate & Advanced SEO | | 945010 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
Google is indexing the wrong page
Hello, I have a site I am optimizing and I cant seem to get a particular listing onto the first page due to the fact google is indexing the wrong page. I have the following scenario. I have a client with multiple locations. To target the locations I set them up with URLs like this /<cityname>-wedding-planner.</cityname> The home page / is optimized for their port saint lucie location. the page /palm-city-wedding-planner is optimized for the palm city location. the page /stuart-wedding-planner is optimized for the stuart location. Google picks up the first two and indexes them properly, BUT the stuart location page doesnt get picked up at all, instead google lists / which is not optimized at all for stuart. How do I "let google know" to index the stuart landing page for the "stuart wedding planner" term? MOZ also shows the / page as being indexed for the stuart wedding planner term as well but I assume this is just a result of what its finding when it performs its searches.
Intermediate & Advanced SEO | | mediagiant0 -
Google's Structured Data Testing Tool? No Data
I'm stumped as to why some of the pages on my website return no data from Google's Structured Data Testing Tool while other pages work fine and return the appropriate data. My home page http://www.parkseo.net returns no data while many inner pages do. http://www.parkseo.net Returns No Data http://www.parkseo.net/citation-submission.html Does Return Data. I have racked my brains out trying to figure out why some pages return data and others don't. Any help on this issue would be greatly appricated. Cheers!
Intermediate & Advanced SEO | | YMD
Gary Downey0 -
Google indexing issue?
Hey Guys, After a lot of hard work, we finally fixed the problem on our site that didn't seem to show Meta Descriptions in Google, as well as "noindex, follow" on tags. Here's my question: In our source code, I am seeing both Meta descriptions on pages, and posts, as well as noindex, follow on tag pages, however, they are still showing the old results and tags are also still showing in Google search after about 36 hours. Is it just a matter of time now or is something else wrong?
Intermediate & Advanced SEO | | ttb0 -
Ranking with other pages not index
The site ranks on page 4-5 with other page like privacy, about us, term pages. I encounter this problem allot in the last weeks; this usually occurs after the page sits 1-2 months on page 1 for the terms. I'm thinking of to much use the same anchor as a primary issue. The sites in questions are 1-5 pages microniche sites. Any suggestions is appreciated. Thank You
Intermediate & Advanced SEO | | m3fan0