Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What tool do you use to check for URLs not indexed?
- 
					
					
					
					
 What is your favorite tool for getting a report of URLs that are not cached/indexed in Google & Bing for an entire site? Basically I want a list of URLs not cached in Google and a separate list for Bing. Thanks, Mark 
- 
					
					
					
					
 I've had good results using Google Search Console for checking which URLs are indexed. It's pretty straightforward and gives a clear overview of any indexing issues. 
- 
					
					
					
					
 I can work on building this tool if there's enough interest. 
- 
					
					
					
					
 I generally just use Xenu's Link Sleuth (if you have thousands of pages) to list out all the URLs you have, and I then check them manually. However, I have not come across an automated tool yet. If anyone is aware of one, I'd like to know as well. 
- 
					
					
					
					
 This post from Distilled mentions that the SEOTools for Excel plugin has an "Indexation Checker":
 https://www.distilled.net/blog/seo/awesome-examples-of-how-to-use-seotools-for-excel/
 Alas, after downloading and installing it, it appears this feature was removed... 
- 
					
					
					
					
 Unless I'm missing something, there doesn't seem to be a way to get Google to show more than 100 results on a page. Our site has about 8,000 pages, and I don't relish the idea of manually exporting 80 SERPs. 
- 
					
					
					
					
 Annie Cushing from Seer Interactive made an awesome list of all the must-have tools for SEO. You can get it from her link (http://bit.ly/tools-galore). In the list there is a tool called ScrapeBox, which is great for this. The software has many other uses too; it is also useful for sourcing potential link partners. 
- 
					
					
					
					
 I would suggest using the Website Auditor from Advanced Web Ranking. It can parse 10,000 pages, and it will tell you a lot more than just whether a page is indexed by Google or not. 
- 
					
					
					
					
 Hmm... I thought there was a way to pull those SERP URLs into Google Docs using a function of some sort? 
- 
					
					
					
					
 I don't think you need any tool for this. You can go directly to google.com and search: site:www.YourWebsiteName.com or site:www.YourWebsiteName.com/directory. I think this is the best option to check whether your website is crawled by Google or not. 
- 
					
					
					
					
 I do something similar but use Advanced Web Ranking: use site:www.domain.com as your phrase, run it to retrieve 1,000 results, and generate a Top Site Report in Excel to get the indexed list. Also remember that you can do it on sub-directories (or partial URL paths) as a way to get more than 1,000 pages from the site. In general I run it once with site:www.domain.com, then identify the most frequent sub-directories, add those as additional phrases to the project, and run it a second time, i.e.: site:www.domain.com, site:www.domain.com/dir1, site:www.domain.com/dir2, etc. Still not definitive, but I think it does give an indication of where the value is. 
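 As a rough illustration of the "identify the most frequent sub-directories" step, here is a minimal Python sketch. It assumes you already have a list of the site's URLs (from a crawler, say) and that they all share one host; the function name `site_queries` and the sample URLs are hypothetical, and the sketch only builds the query strings, it does not run them anywhere.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_queries(urls, top_n=2):
    """Build site: queries for the main host and its most frequent top-level directories."""
    hosts = Counter()
    dirs = Counter()
    for url in urls:
        parts = urlsplit(url)
        hosts[parts.netloc] += 1
        segments = [s for s in parts.path.split("/") if s]
        # Only treat the first segment as a directory when something follows it.
        if len(segments) > 1:
            dirs[(parts.netloc, segments[0])] += 1
    # Start with the whole-site query for the most common host.
    queries = [f"site:{host}" for host, _ in hosts.most_common(1)]
    # Then add one query per frequent sub-directory.
    for (host, first), _ in dirs.most_common(top_n):
        queries.append(f"site:{host}/{first}")
    return queries


# Hypothetical crawl output, used only for illustration.
crawled = [
    "https://www.domain.com/dir1/a.html",
    "https://www.domain.com/dir1/b.html",
    "https://www.domain.com/dir1/c.html",
    "https://www.domain.com/dir2/d.html",
    "https://www.domain.com/dir2/e.html",
    "https://www.domain.com/about.html",
]
print(site_queries(crawled))
```

 Each resulting string can then be added as a phrase in whatever rank-tracking tool you use, as described above.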
- 
					
					
					
					
 David Kauzlaric has, in my opinion, the best answer. If Google hasn't indexed it and you've investigated your Google webmaster account, then there isn't anything better out there as far as I'm concerned. It's by far the simplest, quickest and easiest way to identify a SERP result.
 re: David Kauzlaric: "We built an internal tool to do it for us, but basically you can do this manually. Go to Google, type in "site:YOURURLHERE" without the quotes. You can check a certain page, a site, a subdomain, etc... of course if you have thousands of URLs this method is not ideal, but it can be done." Cheers! 
- 
					
					
					
					
 I concur, Xenu is an extremely valuable tool for me that I use daily. Also, once you get a list of all the URLs on your site, you can compare the two lists in Excel (the two lists being the Xenu page list for your site and the list of pages that have been indexed by Google). 
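 If you would rather not do the comparison in Excel, the same idea can be sketched in Python. This is only an illustration: it assumes both lists are plain lists of full URLs (scheme included), and the helper names `normalize` and `not_indexed` are made up for this example. Normalizing first avoids false mismatches caused by trailing slashes, host capitalization, or http vs https.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path (+ query), ignoring scheme, host case and trailing slash."""
    parts = urlsplit(url.strip())
    key = parts.netloc.lower() + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def not_indexed(crawled_urls, indexed_urls):
    """Return the crawled URLs that do not appear in the indexed list."""
    indexed = {normalize(u) for u in indexed_urls}
    return [u for u in crawled_urls if normalize(u) not in indexed]


# Hypothetical data: a crawler export and a list of URLs found via site: searches.
crawled = [
    "https://www.example.com/",
    "https://www.example.com/a/",
    "https://www.example.com/b",
]
indexed = [
    "https://www.example.com",
    "https://WWW.example.com/a",
]
print(not_indexed(crawled, indexed))
```

 Keep in mind the result is only as complete as the indexed list you feed in, so treat it as a list of pages to investigate rather than a definitive answer.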
- 
					
					
					
					
 Nice solution Kieran! I use the same method to compare the URL list from the Screaming Frog output with the URL Found column from my keyword ranking tool. Of course, it doesn't catch all pages that might be indexed. The intention is not really to get a complete list, more to "draught" out pages that need work. 
- 
					
					
					
					
 I agree, this is not automated but so far, from what we know, looks like a nice and clean option. Thanks. 
- 
					
					
					
					
 Saw this and tried the following, which isn't automated but is one way of doing it:
- First install SEO Quake plugin
- Go to Google
- Turn off Google Instant (http://www.google.com/preferences)
- Go to Advanced search set the number of results you want displayed (estimate the number of pages on your site)
- Then run your site:www.example.com search query
- Export this to CSV
- Import to Excel
- Once imported, do a Text to Columns conversion (on the Data tab) using ; as the delimiter (this is the CSV delimiter)
- This gives you a formatted list.
- Then import your sitemap.xml into another TAB in Excel
- Run a vlookup between the URL tabs to flag which are on sitemap or vice versa.
 Not exactly automated but does the job. 
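 The last few steps (splitting the semicolon-delimited CSV, loading sitemap.xml, and the vlookup) could also be scripted. The following Python sketch uses only the standard library. It assumes the export has a header row with a column named "Url" (a guess; check the actual header in your export) and that the sitemap follows the standard sitemaps.org format. The function names and sample data are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace used by standard sitemap.xml files.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


def exported_urls(csv_text, column="Url"):
    """Read the URL column from a semicolon-delimited SERP export."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=";")
    return {row[column].strip() for row in reader}


def flag_sitemap(xml_text, csv_text, column="Url"):
    """Pair each sitemap URL with True/False: was it found in the export?"""
    found = exported_urls(csv_text, column)
    return [(url, url in found) for url in sitemap_urls(xml_text)]


# Tiny made-up inputs standing in for sitemap.xml and the exported CSV.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/a</loc></url>
</urlset>"""
EXPORT = "Url;Title\nhttps://www.example.com/;Home\n"

for url, in_serp in flag_sitemap(SITEMAP, EXPORT):
    print(url, "found" if in_serp else "not found")
```

 This does an exact string match, the same as a vlookup would, so URLs that differ only by a trailing slash or by http vs https will show as not found unless you normalize them first.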
- 
					
					
					
					
 Curious about this question also, it would be very useful to see a master list of all URLs on our site that are not indexed by Google so that we can take action to see what aspects of the page are lacking and what we need for it to get indexed. 
- 
					
					
					
					
 I usually just use Xenu's Link Sleuth (if you have thousands of pages) to list out all the URLs you have and I would then manually check them, but I haven't come across an automated tool yet. If anyone knows any, I'd love to know as well. 
- 
					
					
					
					
 Manual is a no-go for large sites. If someone knows a tool like this, it would be cool to know which one and where to find it. Or... this would make a cool SEOmoz Pro tool. 
- 
					
					
					
					
 My bad - you are right that it doesn't display the actual URLs. So I guess the best thing you can do is site:examplesite.com and see what comes up. 
- 
					
					
					
					
 That will tell you the number indexed, but it still doesn't tell you which of those URLs are or are not indexed. I think we all wish it would! 
- 
					
					
					
					
 I would use Google Webmaster Tools as you can see how many URLs are indexed based on your sitemap. Once you have that, you can compare it to your total list. The same can be done with Bing. 
- 
					
					
					
					
 Yeah I do it manually now so was looking for something more efficient. 
- 
					
					
					
					
 We built an internal tool to do it for us, but basically you can do this manually. Go to Google and type in "site:YOURURLHERE" without the quotes. You can check a certain page, a site, a subdomain, etc. Of course, if you have thousands of URLs this method is not ideal, but it can be done. 