Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
Indexed pages
- 
 I've just started a site audit and am trying to determine the number of pages on a client site, and whether more pages are being indexed than actually exist. I've used four tools and got four very different answers:
- Google Search Console: 237 indexed pages
- Google search using the site: command: 468 results
- Moz site crawl: 1013 unique URLs
- Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but it should only cut off at 500)
 Can anyone shed any light on why they differ so much? And where does the truth lie?
- 
 Another option, if the site uses a CMS, is to create a sitemap of the content pages, posts, etc. Personally, I'm with Krzysztof Furtak on SF. Screaming Frog rocks. It'll find most pages, except perhaps orphan pages, as there is no link for it to follow to discover them. If it's really important to get as many pages as possible, I'd do the following (I've put an asterisk (*) next to the ones that some people may think are a tad extreme):
- Run a Screaming Frog crawl
- Grab a sitemap from your CMS
- Check any server-based analytics (AWStats etc.)
- Check your access_log file and parse out the URLs in there (*)
- Run site: queries, with and without www, and also using * as a subdomain (use something like Moz's toolbar to export)
- As Krzysztof suggests, Scrapebox would extract data too, but be careful scraping, as you may get an IP slap (*)
- Export crawl data from Moz and a tool such as Deep Crawl
- Throw the pages from all of these into Excel and de-dupe
- Once you have a de-duped list, as an optional last step, go back to Screaming Frog, enter list mode (I have the paid version, not sure if it's possible with the free one) and run a crawl over all the de-duped URLs to get status codes etc.
 If you're going to do this sort of thing a fair bit, buy a Screaming Frog license. It's an awesome tool and can be useful in a multitude of situations.
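 If Excel gets unwieldy, the access_log and de-dupe steps above could also be scripted. The following is only a rough Python sketch under some assumptions: the log is in Common/Combined Log Format, the site lives at the placeholder https://www.example.com, and a trailing slash is treated as the same page (which may not be true on every site). The function names are made up for illustration.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Matches the request and status in a Common/Combined Log Format line,
# e.g. "GET /about HTTP/1.1" 200
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" (\d{3})')


def urls_from_access_log(lines, base="https://www.example.com"):
    """Yield absolute URLs for logged requests that returned a 200 status."""
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and match.group(2) == "200":
            yield base + match.group(1)


def normalize(url):
    """Lower-case scheme and host, drop the fragment, strip a trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"  # keep "/" for the home page
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, "")
    )


def merge_sources(*sources):
    """Combine several iterables of URLs into one sorted, de-duplicated list."""
    return sorted(
        {normalize(url) for source in sources for url in source if url.strip()}
    )
```

 You would feed it one list per source (crawl export, sitemap, log file, site: export) and paste the merged list into Screaming Frog's list mode for the final status-code check.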
- 
 The site: command is handy for asking Google what pages it knows about. However, if Muzzmoz wants to know the number of pages on a site, you'd need more than this. Also, re: your different ways of querying, I like to use site:*.domain.com. This can show other subdomains too, which may otherwise be missed.
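 To make the subdomain point concrete, here is a small Python sketch. It only builds the query strings to paste into Google and then tallies hostnames from whatever result URLs you manage to export; it does not query Google itself. The function names and example.com are placeholders.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_queries(domain):
    """Return site: queries for the bare domain, the www host and any subdomain."""
    bare = domain.lower()
    if bare.startswith("www."):
        bare = bare[len("www."):]
    return [f"site:{bare}", f"site:www.{bare}", f"site:*.{bare}"]


def hosts_seen(result_urls):
    """Count how many exported result URLs belong to each hostname."""
    return Counter(urlsplit(url).hostname for url in result_urls)
```

 Any hostname in the tally that you did not expect (a blog or shop subdomain, say) is a candidate for its own crawl.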
- 
 OK, so run a site: query on a site with fewer than 1000 pages and go to the last results page. In almost all cases you'll see a different number there from the estimate shown on the first page.
- 
 I always prefer to check manually using the site: operator, because it shows how many pages Google currently has indexed for the domain. There will be a difference between the index status in Search Console and the current index, as Search Console only updates its data after a few days. The number of indexed URLs is almost always significantly smaller than the number of crawled URLs, because the total indexed excludes URLs identified as duplicates or non-canonical, and those that contain a meta noindex tag. Also, check which version of your site is the indexed (preferred) one. You can read more about this here: https://support.google.com/webmasters/answer/2642366?hl=en
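 As a rough illustration of why the crawled count is bigger than the indexed count, a crawl export can be split into pages that look indexable and pages that don't. This Python sketch assumes each crawled page is a dict with hypothetical keys url, status, canonical and meta_robots (rename them to match your crawler's export). It only mirrors the exclusions listed above, not Google's actual duplicate detection.

```python
def is_indexable(page):
    """Return True if the page is a 200, not noindexed, and self-canonical."""
    if page["status"] != 200:
        return False
    if "noindex" in (page.get("meta_robots") or "").lower():
        return False
    # A missing canonical is treated as pointing at the page itself.
    canonical = page.get("canonical") or page["url"]
    return canonical == page["url"]


def split_crawl(pages):
    """Split crawled pages into (indexable URLs, excluded URLs)."""
    indexable = [p["url"] for p in pages if is_indexable(p)]
    excluded = [p["url"] for p in pages if not is_indexable(p)]
    return indexable, excluded
```

 The length of the first list is a more sensible figure to compare against Search Console than the raw crawl total.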
- 
 Hi. The most accurate number is from Screaming Frog (the free version if you have fewer than 500 pages, the paid version if you have more). Google indexes what it wants, and only what it considers good enough to show in its index. If some pages are similar, have quality issues, are blocked by robots, etc., then it won't show them all. BTW, don't assume the number in GSC or from a site: search is right. Check it manually, because it can say 468 when in fact there are only 200. Moz can include "historical" pages that no longer exist, and it doesn't take quality issues into account. The truth is in Screaming Frog: the most accurate number. If you crawled with the Google user agent, that number is the maximum that can appear in Google's index. If you crawled with the Screaming Frog user agent and robots turned off, you'll see a bigger number (but Google won't show those pages because of the blocks). If you want to check what's indexed, use a tool like Scrapebox. First get all the URLs (maybe without images if you don't care about them), then check which are indexed with Scrapebox. Whatever isn't indexed may have some issues.
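 The two comparisons described here (what Googlebot is allowed to crawl, and which crawled URLs are missing from the index) can be sketched in Python with the standard library's robots.txt parser. This is only an approximation: urllib.robotparser does not handle wildcard rules exactly the way Google does, and the function names are invented for the example. The list of indexed URLs is assumed to come from whatever index checker you use.

```python
from urllib.robotparser import RobotFileParser


def crawlable_by(user_agent, robots_txt, urls):
    """Return the URLs that robots.txt allows the given user agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def not_indexed(crawled, indexed):
    """Return crawled URLs that are absent from the indexed list, sorted."""
    return sorted(set(crawled) - set(indexed))
```

 Running not_indexed on the Googlebot-crawlable list gives you the pages worth inspecting for quality, duplication or noindex problems.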